Directional Venn diagram

venndir(
  setlist,
  overlap_type = c("detect", "concordance", "each", "overlap", "agreement"),
  sets = NULL,
  set_colors = NULL,
  setlist_labels = NULL,
  legend_labels = NULL,
  draw_legend = TRUE,
  legend_font_cex = 1,
  proportional = FALSE,
  show_labels = "Ncs",
  main = NULL,
  return_items = TRUE,
  show_items = c(NA, "none", "sign item", "sign", "item"),
  max_items = 3000,
  show_zero = FALSE,
  font_cex = c(1, 1, 0.7),
  fontfamily = "Arial",
  show_label = NA,
  display_counts = TRUE,
  poly_alpha = 0.6,
  alpha_by_counts = FALSE,
  label_style = c("basic", "fill", "shaded", "shaded_box", "lite", "lite_box"),
  label_preset = "none",
  template = c("wide", "tall"),
  unicode = TRUE,
  big.mark = ",",
  curate_df = NULL,
  venn_jp = NULL,
  inside_percent_threshold = 0,
  item_cex = 1,
  item_style = c("default", "text", "gridtext"),
  item_buffer = -0.15,
  item_degrees = 0,
  sign_count_delim = " ",
  padding = c(3, 2),
  r = 2,
  center = c(0, -0.15),
  segment_distance = 0.1,
  segment_buffer = -0.1,
  show_segments = TRUE,
  sep = "&",
  do_plot = TRUE,
  verbose = FALSE,
  debug = 0,
  circle_nudge = NULL,
  lwd = 1,
  rotate_degrees = 0,
  ...
)

Arguments

setlist

list of named vectors, whose names represent set items, and whose values represent direction using values c(-1, 0, 1).
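As a minimal sketch of the expected input, the hand-built list below uses hypothetical set and item names. Each vector is named by item, and each value is one of -1, 0, or 1.

```r
# Hypothetical signed setlist: names are items, values are directions
setlist <- list(
  set_A = c(gene1 = 1, gene2 = -1, gene3 = 1),
  set_B = c(gene2 = -1, gene3 = -1, gene4 = 1)
)

# Confirm every value is a valid direction
all_valid <- all(unlist(setlist) %in% c(-1, 0, 1))
print(all_valid)

# With the venndir package loaded, this list could be passed directly:
# vo <- venndir(setlist)
```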

overlap_type

character value indicating the type of overlap logic:

  • "each" records each combination of signs;

  • "overlap" disregards the sign and returns any matching item overlap;

  • "concordance" represents counts for full agreement, or "mixed" for any inconsistent overlapping direction;

  • "agreement" represents full agreement in direction as "agreement", and "mixed" for any inconsistent direction.
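The four options above can be illustrated with a small base R sketch. The helper classify_signs() is hypothetical and written only for this illustration; it is not the venndir implementation, and the label text it produces may differ from what the package displays.

```r
# Hypothetical helper, not part of venndir: label one item's signs across sets
classify_signs <- function(signs, overlap_type) {
  same_sign <- length(unique(signs)) == 1
  switch(overlap_type,
    each = paste(signs, collapse = " "),
    overlap = "overlap",
    concordance = if (same_sign) as.character(signs[1]) else "mixed",
    agreement = if (same_sign) "agreement" else "mixed",
    stop("unknown overlap_type"))
}

# Two hypothetical sets sharing three items; gene3 disagrees in direction
set_A <- c(gene1 = 1, gene2 = -1, gene3 = 1)
set_B <- c(gene1 = 1, gene2 = -1, gene3 = -1)
shared <- intersect(names(set_A), names(set_B))

# Rows are shared items, columns are overlap types
types <- c("each", "overlap", "concordance", "agreement")
result <- sapply(types, function(type) {
  sapply(shared, function(item) {
    classify_signs(c(set_A[[item]], set_B[[item]]), type)
  })
})
print(result)
```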

sets

integer index with optional subset of sets in setlist for the Venn diagram. This option is useful when defining consistent set_colors for all entries in setlist.

set_colors

character vector of R colors, or default NULL to use categorical colors defined by colorjam::rainbowJam(). It will generate colors for every element in setlist even when a subset is defined with sets.

setlist_labels

character vector with optional custom labels to display in the Venn diagram. This option is intended when the names(setlist) are not suitable for display, but should still be maintained as the original names.

legend_labels

character vector with optional custom labels to display in the Venn legend. This option is intended when the names(setlist) are not suitable for a legend, but should still be maintained as the original names. The legend labels are typically single-line entries and should have relatively short text length.

draw_legend

logical passed to render_venndir(), and stored in the Venndir metadata.

legend_font_cex

numeric scalar, default 1, used to adjust the relative size of fonts with venndir_legender() when draw_legend=TRUE. This value is stored in metadata for persistence.

proportional

logical (default FALSE) indicating whether to draw proportional Venn circles, also known as an Euler diagram. Proportional circles are not guaranteed to represent all possible overlaps. Proportional circles are determined by calling eulerr::euler(). Use shape="ellipse" for euler() to provide elliptical shapes.

show_labels

character string to define the labels to display, and where they should be displayed. The definition uses a single letter to indicate each type of label to display, using UPPERCASE to display the label outside the Venn shape, and lowercase to display the label inside the Venn shape. The default "Ncs" displays _N_ame (outside), _c_ount (inside), and _s_igned count (inside).

The label types are defined below:

  • _N_ame: "n" or "N" - the set name, by default it is displayed.

  • _O_verlap: "o" or "O" - the overlap name, by default it is hidden because these labels can be very long, and the overlap should already be evident in the Venn diagram.

  • _c_ount: "c" or "C" - overlap count, independent of the sign

  • _p_ercentage: "p" or "P" - overlap percentage, by default hidden, but available as an option

  • _s_igned count: "s" or "S" - the signed overlap count, tabulated based upon overlap_type ("each", "concordance", "agreement", etc.)

  • _i_tems: "i" only, by default hidden. When enabled, item labels defined by show_items are spread across the specific Venn overlap region.
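The letter convention described above can be sketched in base R. The helper describe_show_labels() is hypothetical and exists only to illustrate how a string such as "Ncs" reads; it is not a venndir function.

```r
# Hypothetical helper for illustration only; not a venndir function
describe_show_labels <- function(show_labels) {
  label_types <- c(n = "name", o = "overlap", c = "count",
    p = "percentage", s = "signed count", i = "items")
  # Split the string into single-letter codes
  codes <- strsplit(show_labels, "")[[1]]
  data.frame(
    code = codes,
    label = unname(label_types[tolower(codes)]),
    # UPPERCASE means outside the Venn shape, lowercase means inside
    # (items are documented as "i" only, so "I" is not meaningful)
    position = ifelse(codes == toupper(codes), "outside", "inside"),
    stringsAsFactors = FALSE
  )
}

label_info <- describe_show_labels("Ncs")
print(label_info)
```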

main

character string used as a plot title, default NULL will render no title. When provided, it is rendered using gridtext::richtext_grob() which enables some Markdown-style formatting. The title is stored in venndir@metadata$main for persistence.

return_items

logical (default TRUE) indicating whether to return items in the overlap data. When FALSE item labels also cannot be displayed in the figure. The main reason not to return items is to conserve memory, for example if setlist is extremely large.

show_items

character used to define the item label, only used when the show_labels entry includes "i" which enables item display inside the Venn diagram.

  • "item": shows only the item labels

  • "sign": shows only the sign of each item

  • "sign item": shows the sign and item together (or "item sign" will show the item, then the sign).

max_items

numeric (default 3000) indicating the maximum number of item labels to display when enabled.

show_zero

logical (default FALSE) indicating whether empty overlaps are labeled with a count of zero (0). When show_zero=TRUE the zero count label is displayed, otherwise no count label is shown.

font_cex

numeric vector recycled and applied in order:

  1. Set label

  2. Overlap count label

  3. Signed count label

The default c(1, 1, 0.7) defines the signed count label slightly smaller than other labels.

  • When one value is provided, it is multiplied by c(1, 1, 0.7) to adjust font sizes altogether, keeping relative sizes.

  • When two values are provided, they are multiplied by c(1, 1, 0.7) using the second value twice.

  • When three values are provided, they are used as-is without change.
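The recycling rules above can be sketched as a small base R function. The name expand_font_cex() is hypothetical; this is one reading of the documented behavior, not the package's internal code.

```r
# Hypothetical helper illustrating the documented font_cex rules
expand_font_cex <- function(font_cex) {
  base_cex <- c(1, 1, 0.7)
  n <- length(font_cex)
  if (n == 1) {
    # One value scales all three labels, keeping relative sizes
    font_cex * base_cex
  } else if (n == 2) {
    # Two values: the second value is used twice
    font_cex[c(1, 2, 2)] * base_cex
  } else {
    # Three values are used as-is
    font_cex[1:3]
  }
}

one_value <- expand_font_cex(1.2)
two_values <- expand_font_cex(c(1.5, 1))
three_values <- expand_font_cex(c(1, 2, 3))
print(rbind(one_value, two_values, three_values))
```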

fontfamily

character string to define the fontfamily. The fontfamily must match a recognized font for the given output device, and this font must be capable of producing UTF-8 / Unicode characters, in order to print up arrow and down arrow. You may review systemfonts::system_fonts() for a listing of fonts recognized by ragg devices, which seems to have the best overall font capabilities. When it does not work, either use unicode=FALSE, or check the output from Sys.getlocale() to ensure the setting is capable of using UTF-8 (for example "C" may not be sufficient). Using the package ragg appears to be more consistently successful for rasterized output than base R output, for example: ragg::agg_png(), ragg::agg_tiff(), ragg::agg_jpeg() produce substantially higher quality output, and with more successful usage of system fonts, than png(), tiff(), and jpeg(). Similarly, for PDF output, consider cairo_pdf() or Cairo::CairoPDF() instead of using pdf().

poly_alpha

numeric (default 0.6) value between 0 and 1, for alpha transparency of the polygon fill color. This value is ignored when alpha_by_counts=TRUE.

  • poly_alpha=1 is completely opaque (no transparency)

  • poly_alpha=0.8 is 80% opaque
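To see what an alpha value means for a fill color, base R can apply the same kind of transparency to any color. This uses grDevices::adjustcolor() purely as an illustration; how venndir applies poly_alpha internally is not shown here.

```r
# Apply 60% opacity to a hypothetical fill color
fill_color <- "red"
fill_with_alpha <- grDevices::adjustcolor(fill_color, alpha.f = 0.6)

# The last two hex digits encode the alpha channel
print(fill_with_alpha)
```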

alpha_by_counts

logical indicating whether to apply alpha transparency to the Venn polygon fill based upon the counts contained in each polygon. When TRUE the poly_alpha is ignored.

label_style

character string indicating the style for labels. Label color is adjusted for contrast with the background color, which is determined from the label fill color and either the device background color or the Venn overlap fill color. The pre-defined label styles are:

  • "basic" no background shading

  • "fill" an opaque colored background

  • "shaded" a partially transparent colored background

  • "lite" a partially transparent light background

  • "box" adds a dark border around the label region, as in "shaded_box" and "lite_box"

label_preset

character deprecated in favor of show_labels. This argument is passed to venndir_label_style().

template

character (default "wide") describing the default layout for counts and signed counts. The value is stored in venndir@metadata$template for persistence.

  • "wide" - main counts on the left, right-justified; signed counts on the right, left-justified. This option is preferred for small numbers, and less-crowded diagrams.

  • "tall" - main counts, center-justified; signed counts below main counts, center-justified. This option is recommended for large numbers (where there are 1000 or more items in a single overlap region), or for crowded diagrams.

unicode

logical (default TRUE) indicating whether to display Unicode arrows for signed overlaps. Passed to curate_venn_labels(). Use unicode=FALSE if the signed label is not displayed properly. The most common causes: (1) the R console (terminal) is not configured to allow Unicode (UTF-8 or UTF-16) characters; (2) the display font does not contain Unicode characters in the font set.
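One way to choose the unicode setting is to check the session locale first. The sketch below uses base R l10n_info(); the variable names are hypothetical, and a UTF-8 locale does not guarantee that the chosen font contains arrow glyphs.

```r
# Check whether the current locale reports UTF-8 support
utf8_ok <- isTRUE(l10n_info()[["UTF-8"]])

# Fall back to plain-text signs when UTF-8 is not available
use_unicode <- utf8_ok
print(use_unicode)

# With the venndir package loaded and a setlist defined:
# vo <- venndir(setlist, unicode = use_unicode)
```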

big.mark

character (default ",") passed to format() to augment numeric labels.
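For reference, this is how big.mark affects base R format() on its own, using a hypothetical count:

```r
# Format a hypothetical overlap count with and without a thousands separator
overlap_count <- 1234567
with_mark <- format(overlap_count, big.mark = ",")
without_mark <- format(overlap_count, big.mark = "")
print(c(with_mark, without_mark))
```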

curate_df

data.frame or NULL passed to curate_venn_labels(), used to customize the formatting of signed overlaps.

venn_jp

NULL or optional JamPolygon which contains one polygon for each setlist, intended to allow custom shapes to be used. Otherwise get_venn_polygon_shapes() is called.

inside_percent_threshold

numeric (default 0) indicating the percent area that a Venn overlap region must contain in order for the count label to be displayed inside the region, otherwise the label is displayed outside the region. Values are expected to range from 0 to 100.

item_cex

numeric default 1, used to define baseline font size (single value), or exact font cex values (multiple values).

  • When a single value is provided, each set of items is used to define a font scaling, based upon the relative area of the overlap polygon to the max item polygon area, and the number of items in each polygon. These values are multiplied by item_cex and by item_cex_factor to produce the final adjustment.

  • When multiple values are provided, they are recycled to the number of polygons that contain items, and applied in order. There is no further adjustment by polygon area, nor number of labels. These values are multiplied by item_cex_factor.

item_style

character string (default "default") indicating the style to display item labels when they are enabled.

  • "default" detects whether item labels contain "<br>" for newlines, and uses "gridtext" if that is the case, otherwise it uses "text" which is markedly faster.

  • "text" option is substantially faster, but does not allow markdown.

  • "gridtext": substantially slower for a large number of labels, but enables use of limited markdown by calling gridtext::richtext_grob(). Mostly useful for venn_meme().

item_buffer

numeric value (default -0.15) indicating the buffer adjustment applied to Venn overlap regions before arranging item labels. Passed to label_fill_JamPolygon() via render_venndir(). Negative values are recommended, so they shrink the region.

sign_count_delim

character string used as a delimiter between the sign and counts, when overlap_type is not "overlap".

padding

numeric padding in units "mm" (default c(3, 2)) for overlap count, and signed overlap count labels, in order.

r

numeric radius in units "mm" used for rounded rectangle corners for labels. Only visible when label_style includes a background fill ("lite", "shaded", "fill"), or "box".

center

numeric coordinates relative to the plot bounding box, default c(0, -0.15) uses a center point in the middle (x=0) and slightly down (y=-0.15) from the plot center. It is used to place labels outside the diagram. In short, labels are placed by drawing a line from this center point, outward through the Venn overlap region to be labeled. The label is positioned outside the polygon region by segment_distance. The default c(0, -0.15) ensures that labels tend to be at the top of the plot, and not on the left/right side of the plot. This argument is passed along to label_outside_JamPolygon().

segment_distance

numeric value indicating the distance between outside labels and the outer edge of the Venn diagram region. Larger values place labels farther away, while also shrinking the relative size of the Venn diagram.

sep

character used as a delimiter between set names, the default is "&".

do_plot

logical (default TRUE) indicating whether to generate the figure.

  • When do_plot=TRUE it calls render_venndir() to create grid objects to be displayed. Arguments in ... are passed to render_venndir(). To hide the display, use do_draw=FALSE. To prevent calling grid::grid.newpage(), so the plot can be drawn inside another active display device, use do_newpage=FALSE.

  • When do_plot=FALSE the returned Venndir object can be passed to render_venndir() to render the figure. The same points apply regarding do_draw and do_newpage, which are arguments of render_venndir().

verbose

logical indicating whether to print verbose output.

debug

numeric optional internal debug.

circle_nudge

list of numeric x,y vectors. Not yet re-implemented after the version 0.0.30.900 update.

rotate_degrees

numeric value in degrees, allowing rotation of the Venn diagram. Not yet re-implemented after version 0.0.30.900.

...

additional arguments are passed to render_venndir().

Value

Venndir object with slots:

  • "jps": JamPolygon which contains each set polygon, and each overlap polygon defined for the Venn diagram.

  • "label_df": data.frame which contains the coordinates for each Venn set, and Venn overlap label.

  • "setlist": list as input to venndir(). This entry may be empty.

When do_plot=TRUE this function also calls render_venndir(), and returns the grid graphical objects (grobs) in the attributes:

  • "gtree": a grid::gTree object suitable for drawing with grid::grid.draw(attr(vo, "gtree"))

  • "grob_list": a list of grid object components used to build the complete diagram, they can be plotted individually, or assembled with do.call(grid::gList, grob_list). The grid::gList can be assembled into a gTree with: grid::grobTree(gList=do.call(grid::gList, grob_list))

  • "viewport": the grid::viewport that holds important context for the graphical objects, specifically the use of coordinate grid::unit measure "snpc", which maintains a fixed aspect ratio.
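The assembly steps described for "grob_list" can be sketched with the grid package alone. The grob_list below is a hypothetical stand-in built from two simple grobs; a real one would come from attr(vo, "grob_list"). The viewport uses "snpc" units only to show how a fixed aspect ratio can be expressed, and is not a copy of the viewport venndir creates.

```r
library(grid)

# Hypothetical stand-in for attr(vo, "grob_list")
grob_list <- list(
  circleGrob(x = 0.5, y = 0.5, r = 0.3),
  textGrob("A", x = 0.5, y = 0.5)
)

# Assemble the components into a gList, then into a single gTree
glist <- do.call(gList, grob_list)
square_vp <- viewport(width = unit(1, "snpc"), height = unit(1, "snpc"))
gtree <- gTree(children = glist, vp = square_vp)

# Draw to a null PDF device so no file is written
pdf(NULL)
grid.newpage()
grid.draw(gtree)
invisible(dev.off())
```

Drawing the gTree once renders all of its children inside the square viewport, which is the same pattern as calling grid::grid.draw() on the "gtree" attribute.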

See also

Other venndir core: render_venndir(), signed_overlaps(), textvenn(), venn_meme()

Examples

setlist <- make_venn_test(100, 3, do_signed=FALSE);

setlist <- make_venn_test(100, 3, do_signed=TRUE);
vo <- venndir(setlist)

jamba::sdim(vo);
#>          rows cols      class
#> jps        10      JamPolygon
#> label_df   21   51 data.frame
#> setlist     3            list
#> metadata   15            list

# custom set labels
vo <- venndir(setlist,
   setlist_labels=paste("set", LETTERS[1:3]))


# custom set labels with Markdown custom colors
vo <- venndir(setlist,
   setlist_labels=paste0("Set <span style='color:blue'>", LETTERS[1:3], "</span>"))


# custom set and legend labels
vo <- venndir(setlist,
   setlist_labels=paste0("set<br>", LETTERS[1:3]),
   legend_labels=paste("Set", LETTERS[1:3]))


# custom set and legend labels
# proportional
# Set Name is inside with show_labels having lowercase "n"
vo <- venndir(setlist,
   proportional=TRUE,
   show_labels="ncs",
   label_style="lite box",
   setlist_labels=paste0("Set: ", LETTERS[1:3]),
   legend_labels=paste("Set", LETTERS[1:3]))