Collapse an incidence matrix using row groups, for example when converting probe-level, transcript-level, or peptide-level data to gene-level data.
Usage
collapse_im(
  im,
  row_groups = NULL,
  logic = c("majority-hit", "majority"),
  verbose = FALSE,
  ...
)
Arguments
- im
numeric matrix with columns for each set.
- row_groups
character or factor with row groupings.
- logic
character logic to use, default "majority-hit".
  - "majority-hit": uses the majority winner among non-zero values.
  - "majority": uses the majority winner including non-zero and zero.
- ...
additional arguments are ignored.
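The two `logic` options can be illustrated on the values of one set within one row group. This is a minimal base R sketch, not venndir's implementation: `majority_value()` and its `include_zero` argument are hypothetical names, and the tie rule (the first value in sorted order wins) is an assumption, since the behavior of `collapse_im()` on ties is not described here.

```r
# Hypothetical helper (not part of venndir): majority winner of one vector.
majority_value <- function(x, include_zero = FALSE) {
   # "majority-hit" style: drop zeros whenever any non-zero value exists
   if (!include_zero && any(x != 0)) {
      x <- x[x != 0]
   }
   counts <- table(x)
   # Assumption: ties resolve to the first value in sorted order
   as.numeric(names(counts)[which.max(counts)])
}

# One hit among three measurements
x <- c(-1, 0, 0)
hit_winner <- majority_value(x)                       # -1, like "majority-hit"
all_winner <- majority_value(x, include_zero = TRUE)  # 0, like "majority"
```

With one hit among three measurements, the "majority-hit" style keeps the hit, while the "majority" style reports zero because zero is the most frequent value.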
Details
This function is a simple converter for incidence matrix data, taking the "majority-hit" for each row grouping by default. The most common scenario is to group rows by gene, in order to summarize the observed changes at gene level, when the original data may contain multiple possible measurements for each gene.
The default logic assumes that any observed statistical hit for a gene is sufficient evidence to implicate that gene as a "hit", even if other potential measurements for the same gene did not meet the statistical criteria used, as relevant to the platform technology.
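To see how often this assumption matters for a given dataset, it can help to count how many measurements per row group are hits. The following is a base R sketch using the data from the Examples section of this page; it does not call `collapse_im()`, and the names `hit_counts`, `group_sizes`, and `any_hit` are illustrative only.

```r
# Example data copied from the Examples section of this page.
im <- cbind(
   A = c(-1, -1, 0, 1, 1, 1, -1, 0, 0, 1, 1, 0),
   B = c(-1, -1, -1, 1, 1, 0, -1, 0, 0, 1, 1, 1),
   C = c(-1, -1, -1, 1, 1, 0, -1, 0, 0, 0, 0, 0))
row_groups <- rep(c("a", "b", "c"), c(6, 3, 3))

# Count non-zero measurements ("hits") per row group and set.
# rowsum() needs numeric input, so the logical matrix is multiplied by 1.
hit_counts <- rowsum((im != 0) * 1, group = row_groups)

# Number of measurements available per row group.
group_sizes <- table(row_groups)

# A row group is implicated in a set when at least one measurement is a hit.
any_hit <- hit_counts > 0
```

Group "b" has three measurements and only one hit in each set, yet `any_hit` is `TRUE` for all three sets. This is the case where the default "majority-hit" logic and the "majority" logic give different answers.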
See also
Other venndir conversion: counts2setlist(), im2list(), im_value2list(), list2im_opt(), list2im_value(), overlaplist2setlist(), signed_counts2setlist()
Examples
im <- cbind(
   A=c(-1, -1, 0, 1, 1, 1, -1, 0, 0, 1, 1, 0),
   B=c(-1, -1, -1, 1, 1, 0, -1, 0, 0, 1, 1, 1),
   C=c(-1, -1, -1, 1, 1, 0, -1, 0, 0, 0, 0, 0))
row_groups <- rep(c("a", "b", "c"), c(6, 3, 3))
# default logic returns the majority non-zero value when present
new_im <- collapse_im(im, row_groups)
new_im
#>    A  B  C
#> a  1 -1 -1
#> b -1 -1 -1
#> c  1  1  0
# majority logic will prioritize "0" when it is the majority
# (not recommended for most gene-based data)
new_im2 <- collapse_im(im, row_groups, logic="majority")
new_im2
#>    A  B  C
#> a  1 -1 -1
#> b  0  0  0
#> c  1  1  0
# more detail
imdf <- data.frame(im, row_groups,
   new_im[match(row_groups, rownames(new_im)), ])
split(imdf, imdf$row_groups)
#> $a
#>      A  B  C row_groups A.1 B.1 C.1
#> a   -1 -1 -1          a   1  -1  -1
#> a.1 -1 -1 -1          a   1  -1  -1
#> a.2  0 -1 -1          a   1  -1  -1
#> a.3  1  1  1          a   1  -1  -1
#> a.4  1  1  1          a   1  -1  -1
#> a.5  1  0  0          a   1  -1  -1
#>
#> $b
#>      A  B  C row_groups A.1 B.1 C.1
#> b   -1 -1 -1          b  -1  -1  -1
#> b.1  0  0  0          b  -1  -1  -1
#> b.2  0  0  0          b  -1  -1  -1
#>
#> $c
#>     A B C row_groups A.1 B.1 C.1
#> c   1 1 0          c   1   1   0
#> c.1 1 1 0          c   1   1   0
#> c.2 0 1 0          c   1   1   0
#>