Optimized conversion of list to incidence matrix
Arguments
- setlist
list of vectors
- empty
default single value used for empty/missing entries, the default empty=0 uses zero for entries not present. Another alternative is NA. Providing a character value will convert the output to a character matrix, be warned.
- do_sparse
logical indicating whether to coerce the output to sparse matrix class "CsparseMatrix" from the Matrix package. The default is FALSE as of version 0.0.33.900, since the most common use case requires a regular matrix. For extremely large data, consider using a sparse matrix.
- ...
additional arguments are ignored.
Value
matrix object with values c(0, 1) when do_sparse=FALSE (default). When do_sparse=TRUE, it returns a Matrix object of class "CsparseMatrix" with logical values, only when the Matrix package is available.
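For readers unfamiliar with the sparse output class, the following is a minimal sketch of how a pattern matrix of this kind can be built directly with Matrix::sparseMatrix(). It is an illustration only and is not taken from the list2im_opt() source; the names items, i, j, and im_sparse are hypothetical, and it assumes the Matrix package is installed.

```r
library(Matrix)

setlist <- list(A=c("one", "two", "three"),
   b=c("two", "one", "four", "five"))

# Unique items in order of first appearance (not sorted)
items <- unique(unlist(setlist, use.names=FALSE))

# Row index of each item, and column index of the set it belongs to
i <- match(unlist(setlist, use.names=FALSE), items)
j <- rep(seq_along(setlist), lengths(setlist))

# Omitting the x argument produces a pattern (TRUE/absent) sparse matrix
im_sparse <- sparseMatrix(i=i, j=j,
   dims=c(length(items), length(setlist)),
   dimnames=list(items, names(setlist)))
im_sparse
```

Only the positions of present entries are stored, which is why this representation can help with extremely large data.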
Details
This function rapidly converts a list of vectors into an incidence matrix whose rownames are items, and colnames are the names of the input list. The default output do_sparse=FALSE is a matrix class with numeric values 0 and 1. When do_sparse=TRUE the output is a logical matrix class ngCMatrix from the Matrix package.
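As a rough illustration of the dense 0/1 incidence matrix described above, the base R sketch below builds one by hand. It is not the optimized implementation used by list2im_opt(), only a small equivalent for understanding the output shape; the names items and im are hypothetical.

```r
setlist <- list(A=c("one", "two", "three"),
   b=c("two", "one", "four", "five"))

# Unique items in order of first appearance (rows are not sorted)
items <- unique(unlist(setlist, use.names=FALSE))

# Start with a matrix filled with the "empty" value, here 0
im <- matrix(0,
   nrow=length(items),
   ncol=length(setlist),
   dimnames=list(items, names(setlist)))

# Mark each item present in each set with 1
for (set_name in names(setlist)) {
   im[setlist[[set_name]], set_name] <- 1
}
im
```

Rows are items and columns are the list names, matching the layout shown in Examples.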
Note that the rows in the output matrix are not sorted, since the step of sorting item names may take several seconds when working with a list whose vectors contain millions of items. For sorted rows, the best remedy is to run this function, then re-order the rows by rowname afterward.
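One possible way to re-order rows afterward is sketched below. To keep the snippet runnable without venndir, the matrix im is typed in by hand as a stand-in for the result of list2im_opt(setlist) shown in Examples; the name im_sorted is hypothetical.

```r
# Stand-in for the output of list2im_opt(setlist) from the Examples section
im <- matrix(c(1, 1, 1, 0, 0,
   1, 1, 0, 1, 1),
   ncol=2,
   dimnames=list(c("one", "two", "three", "four", "five"),
      c("A", "b")))

# Re-order rows alphabetically by rowname;
# drop=FALSE keeps a matrix even when there is only one column
im_sorted <- im[order(rownames(im)), , drop=FALSE]
im_sorted
```

The same indexing pattern should also apply to a sparse Matrix object, since it supports row subsetting in the same way.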
See also
Other venndir conversion: counts2setlist(), im2list(), im_value2list(), list2im_value(), overlaplist2setlist(), signed_counts2setlist()
Examples
setlist <- list(A=c("one", "two", "three"),
b=c("two", "one", "four", "five"));
list2im_opt(setlist);
#>       A b
#> one   1 1
#> two   1 1
#> three 1 0
#> four  0 1
#> five  0 1
list2im_opt(setlist, do_sparse=TRUE);
#> 5 x 2 sparse Matrix of class "ngCMatrix"
#>       A b
#> one   | |
#> two   | |
#> three | .
#> four  . |
#> five  . |