Calculate signed, directional overlaps across sets
signed_overlaps(
setlist,
overlap_type = c("detect", "each", "overlap", "concordance", "agreement"),
return_items = FALSE,
return_item_labels = return_items,
sep = "&",
trim_label = TRUE,
include_blanks = TRUE,
keep_item_order = FALSE,
verbose = FALSE,
warn = FALSE,
...
)
setlist
list of named vectors, whose names represent set items, and whose values represent direction using values c(-1, 0, 1).
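As a minimal illustration of the expected input, a signed setlist can be built by hand in base R. The gene and set names below are hypothetical, chosen only for this sketch.

```r
# Hypothetical signed setlist: names are the items, values are directions.
setlist <- list(
   set_A = c(geneA = 1, geneB = -1, geneC = 1),
   set_B = c(geneB = -1, geneC = -1, geneD = 1))

# Items are stored in the names of each vector.
items_per_set <- lapply(setlist, names)

# Directions are expected to be restricted to -1, 0, 1.
all_valid <- all(unlist(setlist) %in% c(-1, 0, 1))

print(items_per_set)
print(all_valid)
```

Here geneB agrees in direction across both sets, while geneC does not, which is the kind of difference the overlap_type options are meant to summarize.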
overlap_type
character value indicating the type of overlap logic:
"each" records each combination of signs;
"overlap" disregards the sign and returns any matching item overlap;
"concordance" represents counts for full agreement, or "mixed" for any inconsistent overlapping direction;
"agreement" represents full agreement in direction as "agreement", and "mixed" for any inconsistent direction.
return_items
logical indicating whether to return the items within each overlap set.
return_item_labels
logical indicating whether to return the directional label associated with each item. A directional label combines the directions from setlist for each item.
sep
character used as a delimiter between set names; the default is "&".
trim_label
logical indicating whether to trim the directional label. For example, instead of returning "0 1 -1" it will return "1 -1", because the overlap name already indicates the sets involved.
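The trimming described above can be sketched in base R. The helper trim_direction_label() is hypothetical and written only for illustration; it is not the package's internal code.

```r
# Illustrative only: drop "0" entries from a space-delimited directional label.
trim_direction_label <- function(x) {
   parts <- strsplit(x, " ", fixed = TRUE)
   vapply(parts, function(p) {
      paste(p[p != "0"], collapse = " ")
   }, character(1))
}

trimmed <- trim_direction_label(c("0 1 -1", "1 0 0", "-1 -1"))
print(trimmed)
```

Labels that contain no "0" entries pass through unchanged.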
include_blanks
logical indicating whether each set overlap should be represented at least once even when no items are present in the overlap. Setting include_blanks=TRUE is useful because it guarantees that all possible combinations of overlaps are represented consistently in the output.
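To make the idea of "all possible combinations" concrete, the following base R sketch enumerates every set membership pattern for three sets and fills in zero counts for overlaps that were not observed. It considers set membership only, not direction, and the observed counts are hypothetical values used only for illustration.

```r
set_names <- c("set_A", "set_B", "set_C")
sep <- "&"

# All 0/1 membership patterns, minus the pattern where no set is involved.
membership <- expand.grid(rep(list(c(0, 1)), length(set_names)))
membership <- membership[rowSums(membership) > 0, , drop = FALSE]

# Name each overlap by joining the participating set names with sep.
overlap_names <- unname(apply(membership, 1, function(i) {
   paste(set_names[i == 1], collapse = sep)
}))

# Hypothetical observed counts; unobserved overlaps are kept with count 0.
observed <- c("set_A" = 25, "set_A&set_B" = 7)
counts <- setNames(rep(0, length(overlap_names)), overlap_names)
counts[names(observed)] <- observed
print(counts)
```

With three sets there are 2^3 - 1 = 7 overlaps, so the result always has seven entries regardless of which overlaps contain items.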
keep_item_order
logical default FALSE, to determine whether items will be stored and displayed in the order they are provided. Note: keep_item_order=TRUE enables the following behaviors:
   Any character vector input will retain the order in which items appear.
   Any factor vector input will sort items using factor levels, which maintains the factor level order.
   Any named vector will use the character vector of names, keeping the order in which they appear in the vector.
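The three ordering rules can be sketched in base R as follows. The helper ordered_items() is hypothetical; it mirrors the rules as documented above rather than the package's own implementation.

```r
# Hypothetical helper sketching the three documented ordering rules.
ordered_items <- function(x) {
   if (!is.null(names(x))) {
      # named vector: the names are the items, kept in order of appearance
      return(names(x))
   }
   if (is.factor(x)) {
      # factor: items sorted by factor level order
      return(as.character(sort(x)))
   }
   # character vector: items kept in order of appearance
   as.character(x)
}

chr_items <- ordered_items(c("D", "A", "B"))
fct_items <- ordered_items(factor(c("A", "B", "D"), levels = c("D", "B", "A")))
nm_items <- ordered_items(c(geneZ = 1, geneA = -1))

print(chr_items)
print(fct_items)
print(nm_items)
```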
verbose
logical indicating whether to print verbose output.
warn
logical default FALSE, whether to print warnings during import in the event that input data is coerced to another type.
...
additional arguments are passed to list2imsigned().
data.frame with columns intended to support venndir(), but which may be more widely useful:
"sets" - character vector with sets and overlap names.
one column named for the overlap_type, with corresponding values:
   "overlap" - this column is always included.
   "concordance" - includes 1 (concordant) and -1 (discordant).
   "agreement" - includes "agreement" and "mixed".
   "each" - includes sign values -1 and 1.
"overlap" - integer vector with overlap values, where 0 and 1 indicate which sets contained these items. This column is always included, even when overlap_type is something else.
"num_sets" - integer number of sets represented in the overlap.
"count" - integer number of items in the overlap.
one column for each set name represented in the "sets" column, intended to help filter by each set. Values will be 0 or 1.
"overlap_label" - represents only the non-0 elements from "overlap", for convenience.
"items" - when return_items=TRUE this column will contain a list (in AsIs format) of character vectors, with the items.
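The per-set columns and "num_sets" make the result easy to filter with ordinary data.frame indexing. The sketch below uses a hand-built stand-in for the returned data.frame, with values copied from the two-set "overlap" example on this page; the column types in the stand-in are an assumption.

```r
# Hand-built stand-in for the signed_overlaps() output (column types assumed).
so <- data.frame(
   sets = c("set_A", "set_B", "set_A&set_B"),
   overlap = c("1 0", "0 1", "1 1"),
   num_sets = c(1L, 1L, 2L),
   count = c(25L, 9L, 7L),
   set_A = c(1L, 0L, 1L),
   set_B = c(0L, 1L, 1L),
   overlap_label = c("1", "1", "1 1"),
   stringsAsFactors = FALSE)

# Rows that involve set_A, using the per-set indicator column.
so_A <- so[so$set_A == 1, , drop = FALSE]

# Total number of items in set_A across all of its overlaps.
total_A <- sum(so_A$count)

# Rows shared by two or more sets.
so_shared <- so[so$num_sets >= 2, , drop = FALSE]

print(total_A)
print(so_shared$sets)
```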
This is the core function for summarizing overlaps that include signed directionality. It is intended for situations where two sets may share items, but where the signed direction associated with those items may or may not also be shared.
One motivating example is biological data, where a subset of genes, proteins, or regions of the genome may be regulated up or down, and this direction is relevant to understanding the biological process. Two experiments may identify similar genes, proteins, or regions of the genome, but they may not regulate them in the same direction. This function is intended to help summarize item overlaps alongside the directionality of each item.
The directional counts can be summarized in slightly different ways, defined by the argument overlap_type:
overlap_type="detect" - default behavior: each vector in setlist is handled independently:
   a vector with no names will use the vector values as items after converting them to character;
   a named vector with character or factor values will use the vector names as items, and character values as item values;
   a named vector with numeric or integer values will use vector names as items, and will convert numeric values to sign().
overlap_type="each" - this option returns all possible directions individually counted.
overlap_type="concordance" - this option returns the counts for each consistent direction, for example "up-up-up" would be counted, and "down-down-down" would be counted, but any mixture of "up" and "down" would be summarized and counted as "mixed". For 3-way overlaps there are 8 possible directions, whose individual labels are difficult to place in the Venn diagram and are not altogether meaningful. Note that this option is the default for venndir().
overlap_type="overlap" - this option only summarizes overlaps without regard to direction. This option returns standard Venn overlap counts.
overlap_type="agreement" - this option groups all directions that agree and returns them as "agreement"; all others are returned as "mixed".
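The difference between the four explicit options can be sketched by labeling the directions of a single item that is shared across sets. The helper label_direction() is hypothetical and exists only for this illustration; its output labels follow the example output shown on this page.

```r
# Hypothetical helper: `signs` holds one direction per set containing the item.
label_direction <- function(signs, overlap_type) {
   agrees <- length(unique(signs)) == 1
   joined <- paste(signs, collapse = " ")
   switch(overlap_type,
      each = joined,
      concordance = if (agrees) joined else "mixed",
      agreement = if (agrees) "agreement" else "mixed",
      overlap = paste(rep(1, length(signs)), collapse = " "),
      stop("unknown overlap_type: ", overlap_type))
}

each_mixed <- label_direction(c(-1, 1), "each")
conc_same <- label_direction(c(1, 1), "concordance")
conc_mixed <- label_direction(c(-1, 1), "concordance")
agree_same <- label_direction(c(-1, -1), "agreement")
plain <- label_direction(c(-1, 1), "overlap")

# Number of distinct up/down combinations for an item shared by three sets.
n_directions <- nrow(expand.grid(rep(list(c(-1, 1)), 3)))

print(c(each_mixed, conc_same, conc_mixed, agree_same, plain))
print(n_directions)
```

The count of 8 combinations for three sets is what makes "each" labels crowded in a 3-way diagram, and why collapsing disagreements to "mixed" can be easier to read.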
Note that overlap_type="agreement" and overlap_type="concordance" will not convert numeric values to sign(), so if the input contains numeric values such as 1.2435 they should probably be converted to sign() before calling signed_overlaps(), for example:
signed_overlaps(lapply(setlist, sign))
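The input handling described for overlap_type="detect", together with the sign() conversion above, can be sketched in base R. The helper detect_items() and the example values are hypothetical; this mirrors the documented rules and is not the package's internal code.

```r
# Hypothetical helper mirroring the three documented input rules.
detect_items <- function(x) {
   if (is.null(names(x))) {
      # unnamed vector: values become the items, with no direction attached
      return(list(items = as.character(x), values = NULL))
   }
   if (is.numeric(x)) {
      # named numeric or integer vector: values are converted with sign()
      return(list(items = names(x), values = sign(unname(x))))
   }
   # named character or factor vector: values are kept as character
   list(items = names(x), values = as.character(unname(x)))
}

unnamed <- detect_items(c("geneA", "geneB"))
numeric_in <- detect_items(c(geneA = 1.2435, geneB = -2.1, geneC = 0))
character_in <- detect_items(c(geneA = "up", geneB = "down"))

# Explicit conversion before calling with "agreement" or "concordance".
setlist_num <- list(
   set_A = c(geneA = 1.2435, geneB = -2.1),
   set_B = c(geneB = -0.7, geneC = 3))
setlist_sign <- lapply(setlist_num, sign)

print(numeric_in$values)
print(setlist_sign)
```

Because sign() keeps the names of its input, the converted list can be passed along in place of the original.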
Other venndir core: render_venndir(), textvenn(), venn_meme(), venndir()
setlist <- make_venn_test(100, 2, do_signed=FALSE);
setlist <- make_venn_test(1e6, 3, do_signed=FALSE);
# so is a data.frame
so <- signed_overlaps(setlist, verbose=TRUE);
#> ## (17:23:35) 19Nov2024: signed_overlaps(): Processing input setlist.
#> ## (17:23:36) 19Nov2024: signed_overlaps(): Processing overlap_type='detect'
#> ## (17:23:36) 19Nov2024: signed_overlaps(): Creating other data types.
#> ## (17:23:36) 19Nov2024: signed_overlaps(): Creating overlap vector.
#> ## (17:23:37) 19Nov2024: signed_overlaps(): Creating concordance vector.
#> ## (17:23:37) 19Nov2024: signed_overlaps(): Creating split names.
#> ## (17:23:37) 19Nov2024: signed_overlaps(): Creating final vector.
#> ## (17:23:37) 19Nov2024: signed_overlaps(): Splitting by observed directions per overlap.
#> ## (17:23:37) 19Nov2024: signed_overlaps(): Creating labels for each split.
#> ## (17:23:37) 19Nov2024: signed_overlaps(): Processing include_blanks=TRUE
#> ## (17:23:37) 19Nov2024: signed_overlaps(): Sorting rows by overlap count then set.
so
#>                                      sets overlap num_sets  count set_A set_B
#> set_A|1 0 0                         set_A   1 0 0        1  29914     1     0
#> set_B|0 1 0                         set_B   0 1 0        1 378895     0     1
#> set_C|0 0 1                         set_C   0 0 1        1  74287     0     0
#> set_A&set_B|1 1 0             set_A&set_B   1 1 0        2  27516     1     1
#> set_A&set_C|1 0 1             set_A&set_C   1 0 1        2   5479     1     0
#> set_B&set_C|0 1 1             set_B&set_C   0 1 1        2  68631     0     1
#> set_A&set_B&set_C|1 1 1 set_A&set_B&set_C   1 1 1        3   4995     1     1
#>                         set_C overlap_label
#> set_A|1 0 0                 0             1
#> set_B|0 1 0                 0             1
#> set_C|0 0 1                 1             1
#> set_A&set_B|1 1 0           0           1 1
#> set_A&set_C|1 0 1           1           1 1
#> set_B&set_C|0 1 1           1           1 1
#> set_A&set_B&set_C|1 1 1     1         1 1 1
# detect overlap_type
attr(signed_overlaps(setlist, "detect"), "overlap_type")
#> [1] "overlap"
setlist <- make_venn_test(100, 2, do_signed=TRUE);
# detect overlap_type
attr(signed_overlaps(setlist, "detect"), "overlap_type")
#> [1] "concordance"
# straight overlap counts
signed_overlaps(setlist, "overlap");
#>                        sets overlap num_sets count set_A set_B overlap_label
#> set_A|1 0             set_A     1 0        1    25     1     0             1
#> set_B|0 1             set_B     0 1        1     9     0     1             1
#> set_A&set_B|1 1 set_A&set_B     1 1        2     7     1     1           1 1
# each directional overlap count
signed_overlaps(setlist, "each");
#>                          sets  each overlap num_sets count set_A set_B
#> set_A|-1 0              set_A  -1 0     1 0        1    13     1     0
#> set_A|1 0               set_A   1 0     1 0        1    12     1     0
#> set_B|0 -1              set_B  0 -1     0 1        1     4     0     1
#> set_B|0 1               set_B   0 1     0 1        1     5     0     1
#> set_A&set_B|-1 -1 set_A&set_B -1 -1     1 1        2     2     1     1
#> set_A&set_B|-1 1  set_A&set_B  -1 1     1 1        2     1     1     1
#> set_A&set_B|1 1   set_A&set_B   1 1     1 1        2     4     1     1
#>                   overlap_label
#> set_A|-1 0                   -1
#> set_A|1 0                     1
#> set_B|0 -1                   -1
#> set_B|0 1                     1
#> set_A&set_B|-1 -1         -1 -1
#> set_A&set_B|-1 1           -1 1
#> set_A&set_B|1 1             1 1
# concordance overlap counts
signed_overlaps(setlist, "concordance");
#>                          sets concordance overlap num_sets count set_A set_B
#> set_A|-1 0              set_A        -1 0     1 0        1    13     1     0
#> set_A|1 0               set_A         1 0     1 0        1    12     1     0
#> set_B|0 -1              set_B        0 -1     0 1        1     4     0     1
#> set_B|0 1               set_B         0 1     0 1        1     5     0     1
#> set_A&set_B|-1 -1 set_A&set_B       -1 -1     1 1        2     2     1     1
#> set_A&set_B|1 1   set_A&set_B         1 1     1 1        2     4     1     1
#> set_A&set_B|mixed set_A&set_B       mixed     1 1        2     1     1     1
#>                   overlap_label
#> set_A|-1 0                   -1
#> set_A|1 0                     1
#> set_B|0 -1                   -1
#> set_B|0 1                     1
#> set_A&set_B|-1 -1         -1 -1
#> set_A&set_B|1 1             1 1
#> set_A&set_B|mixed         mixed
# agreement overlap counts
signed_overlaps(setlist, "agreement");
#>                              sets agreement overlap num_sets count set_A set_B
#> set_A|agreement             set_A agreement     1 0        1    25     1     0
#> set_B|agreement             set_B agreement     0 1        1     9     0     1
#> set_A&set_B|agreement set_A&set_B agreement     1 1        2     6     1     1
#> set_A&set_B|mixed     set_A&set_B     mixed     1 1        2     1     1     1
#>                       overlap_label
#> set_A|agreement           agreement
#> set_B|agreement           agreement
#> set_A&set_B|agreement     agreement
#> set_A&set_B|mixed             mixed
# test to ensure factor input is handled properly
inputlist <- list(setA=factor(c("A", "B", "D")),
setB=factor(c("A", "C", "E", "F")))
signed_overlaps(inputlist, return_items=TRUE)
#>                    sets overlap num_sets count setA setB overlap_label   items
#> setA|1 0           setA     1 0        1     2    1    0             1    B, D
#> setB|0 1           setB     0 1        1     3    0    1             1 C, E, F
#> setA&setB|1 1 setA&setB     1 1        2     1    1    1           1 1       A
# check to verify
signed_overlaps(inputlist, return_items=TRUE)$items
#> $`setA|1 0`
#> [1] "B" "D"
#>
#> $`setB|0 1`
#> [1] "C" "E" "F"
#>
#> $`setA&setB|1 1`
#> [1] "A"
#>
# test specific factor level order
inputlist <- list(
setA=factor(c("A", "B", "D"), levels=c("D", "B", "A")),
setB=factor(c("A", "C", "E", "F")))
signed_overlaps(inputlist, return_items=TRUE)
#>                    sets overlap num_sets count setA setB overlap_label   items
#> setA|1 0           setA     1 0        1     2    1    0             1    B, D
#> setB|0 1           setB     0 1        1     3    0    1             1 C, E, F
#> setA&setB|1 1 setA&setB     1 1        2     1    1    1           1 1       A