Optimized conversion of list to incidence matrix
Arguments
- setlist
list of vectors
- empty
default single value used for empty/missing entries, the default empty=0 uses zero for entries not present. Another alternative is NA. Providing a character value will convert the output to a character matrix, be warned.
- do_sparse
logical indicating whether to coerce the output to sparse matrix class "CsparseMatrix" from the Matrix package. The default is FALSE as of version 0.0.33.900, since the most common use case requires a regular matrix. For extremely large data, consider using a sparse matrix.
- ...
additional arguments are ignored.
Value
A matrix object with numeric values 0 and 1 when do_sparse=FALSE
(default). When do_sparse=TRUE and the Matrix package is available,
a Matrix object of class "CsparseMatrix" with logical values.
Details
This function rapidly converts a list of vectors into
an incidence matrix whose rownames are items, and colnames
are the names of the input list. The default
do_sparse=FALSE returns a matrix with numeric values 0 and 1.
When do_sparse=TRUE the output is a logical sparse matrix of
class ngCMatrix from the Matrix package.
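The conversion described above can be sketched in base R. This is only a minimal illustration of the idea, not the venndir implementation: the helper name list2im_sketch is hypothetical, and its empty argument is assumed to behave like the empty argument documented above.

```r
# Hypothetical helper for illustration only; not part of venndir.
list2im_sketch <- function(setlist, empty=0) {
   # Collect unique items in order of first appearance (rows are not sorted).
   items <- unique(unlist(setlist, use.names=FALSE))
   # Start with a matrix filled with the 'empty' value:
   # one row per item, one column per list element.
   im <- matrix(empty,
      nrow=length(items),
      ncol=length(setlist),
      dimnames=list(items, names(setlist)))
   # Mark each item present in each set with 1.
   for (j in seq_along(setlist)) {
      im[match(setlist[[j]], items), j] <- 1
   }
   im
}

setlist <- list(A=c("one", "two", "three"),
   b=c("two", "one", "four", "five"))
im <- list2im_sketch(setlist)
im_na <- list2im_sketch(setlist, empty=NA)
```

Here im has rows in order of first appearance (one, two, three, four, five), and im_na holds NA instead of 0 for items absent from a set.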
Note that the rows in the output matrix are not sorted, since the step of sorting item names may take several seconds when working with a list whose vectors contain millions of items. For sorted rows, the best remedy is to run this function, then re-order the rows by rowname afterward.
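Re-ordering rows afterward can be done with ordinary matrix indexing. The sketch below builds a small incidence matrix by hand, with the same values as the first example on this page, so it runs without venndir. The variable names im and im_sorted are illustrative.

```r
# Incidence matrix with rows in order of first appearance,
# built by hand to match the first example output on this page.
im <- matrix(c(1, 1, 1, 0, 0,
               1, 1, 0, 1, 1),
   ncol=2,
   dimnames=list(c("one", "two", "three", "four", "five"),
      c("A", "b")))

# Re-order rows alphabetically by rowname; drop=FALSE keeps
# the result a matrix even if there is only one row or column.
im_sorted <- im[order(rownames(im)), , drop=FALSE]
rownames(im_sorted)
#> [1] "five"  "four"  "one"   "three" "two"
```

The same indexing applies to the output of list2im_opt(). Note that order() on character values depends on the locale for mixed-case or non-ASCII names.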
See also
Other venndir conversion:
counts2setlist(),
im2list(),
im_value2list(),
list2im_value(),
overlaplist2setlist(),
signed_counts2setlist()
Examples
setlist <- list(A=c("one", "two", "three"),
b=c("two", "one", "four", "five"));
list2im_opt(setlist);
#> A b
#> one 1 1
#> two 1 1
#> three 1 0
#> four 0 1
#> five 0 1
list2im_opt(setlist, do_sparse=TRUE);
#> 5 x 2 sparse Matrix of class "ngCMatrix"
#> A b
#> one | |
#> two | |
#> three | .
#> four . |
#> five . |