convert list to incidence matrix

list2im(x, keepCounts = FALSE, emptyValue = 0, verbose = FALSE, ...)

Arguments

x

list of vectors

keepCounts

boolean indicating whether to return the number of occurrences of each item in each vector, rather than only 1 for presence.

emptyValue

any single value to use for blank entries, by default 0. Use emptyValue=NA to return NA for missing entries.

verbose

boolean indicating whether to print verbose output.

...

additional arguments are ignored.

Value

numeric matrix whose rownames are the vector items of the input list, and whose colnames are the list names.

Details

This function converts a list of vectors into an incidence matrix, where the rows are the vector items and the columns are the list names. It uses an object class from the arules package called arules::transactions, which offers highly efficient methods for converting between list and matrix. The transactions class is itself an enhanced data matrix: it stores data using a sparse matrix object type from the Matrix package, and also associates a data.frame with both the rows and the columns of the matrix to provide additional row and column annotation as needed.
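
To make the row and column layout concrete, the following is a minimal base-R sketch of the same kind of conversion. It is not the arules-based implementation described above; the helper name list2im_sketch is hypothetical, and it assumes the input list has unique, non-empty names.

```r
# Hypothetical helper, not part of the package: a base-R sketch of the
# list-to-incidence-matrix conversion, without the arules dependency.
# Assumes x is a list with unique, non-empty names.
list2im_sketch <- function(x, emptyValue = 0) {
   # sorted unique items across all vectors become the rownames
   items <- sort(unique(as.character(unlist(x))))
   im <- matrix(emptyValue,
      nrow = length(items),
      ncol = length(x),
      dimnames = list(items, names(x)))
   for (nm in names(x)) {
      # mark each item present in this list element
      im[unique(as.character(x[[nm]])), nm] <- 1
   }
   im
}

L1 <- list(A = c("C", "A", "B", "A"),
   D = c("D", "E", "F", "D"),
   A123 = c(1:8, 3, 5),
   T = LETTERS[7:9])
im <- list2im_sketch(L1)
im_na <- list2im_sketch(L1, emptyValue = NA)
dim(im)
```

Each list name becomes a column and each distinct item becomes a row; the emptyValue argument in this sketch only sets the fill value for cells where the item is absent.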

Performance benchmarks showed that converting a list to a matrix was fast, and also that the resulting matrix was substantially smaller (5-20 times) than that of comparable methods producing a data matrix.
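
As a rough illustration of why sparse storage from the Matrix package can be compact, the sketch below compares the size of a sparse and a dense incidence matrix built from simulated data. It is not a reproduction of the benchmarks mentioned above; the dimensions and the random sets are assumptions chosen only for demonstration.

```r
library(Matrix)

# Simulated data (an assumption for illustration): 200 sets,
# each containing 20 items drawn from 2000 possible items.
set.seed(1)
n_items <- 2000
n_sets <- 200
i <- unlist(lapply(seq_len(n_sets), function(j) sample(n_items, 20)))
j <- rep(seq_len(n_sets), each = 20)

# sparse incidence matrix: only the non-empty cells are stored
sparse_im <- sparseMatrix(i = i, j = j, x = 1, dims = c(n_items, n_sets))
# dense equivalent: every cell is stored, including zeros
dense_im <- as.matrix(sparse_im)

object.size(sparse_im)
object.size(dense_im)
```

The size difference depends on how sparse the data are, so the ratio seen here should not be read as a general figure.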

When keepCounts=TRUE, counts are applied only to entries with multiple instances, which helps keep this step relatively fast.
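
The idea of updating only the entries with multiple instances can be sketched in base R as follows. This is an illustration under stated assumptions, not the function's actual code: it builds a 0/1 matrix first, then overwrites only the cells for duplicated items.

```r
# Hypothetical sketch, not the package's code: start from a 0/1 incidence
# matrix, then overwrite only the cells whose item occurs more than once.
x <- list(A = c("C", "A", "B", "A"),
   D = c("D", "E", "F", "D"))
items <- sort(unique(unlist(x)))
im <- sapply(x, function(v) as.numeric(items %in% v))
rownames(im) <- items

for (nm in names(x)) {
   v <- x[[nm]]
   # items appearing more than once in this vector
   dupes <- unique(v[duplicated(v)])
   if (length(dupes) > 0) {
      # tabulate only the duplicated items, leaving other cells untouched
      counts <- table(v[v %in% dupes])
      im[names(counts), nm] <- as.numeric(counts)
   }
}
im
```

Because most items typically occur once per vector, only a small number of cells need to be tabulated and rewritten.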

Examples

L1 <- list(A=c("C","A","B","A"),
   D=c("D","E","F","D"),
   A123=c(1:8,3,5),
   T=LETTERS[7:9]);
# Default behavior is to make items unique
list2im(L1);
#> Warning: removing duplicated items in transactions
#>   A D A123 T
#> 1 0 0    1 0
#> 2 0 0    1 0
#> 3 0 0    1 0
#> 4 0 0    1 0
#> 5 0 0    1 0
#> 6 0 0    1 0
#> 7 0 0    1 0
#> 8 0 0    1 0
#> A 1 0    0 0
#> B 1 0    0 0
#> C 1 0    0 0
#> D 0 1    0 0
#> E 0 1    0 0
#> F 0 1    0 0
#> G 0 0    0 1
#> H 0 0    0 1
#> I 0 0    0 1

# Option to report the counts
list2im(L1, keepCounts=TRUE);
#> Warning: removing duplicated items in transactions
#>   A D A123 T
#> 1 0 0    1 0
#> 2 0 0    1 0
#> 3 0 0    2 0
#> 4 0 0    1 0
#> 5 0 0    2 0
#> 6 0 0    1 0
#> 7 0 0    1 0
#> 8 0 0    1 0
#> A 2 0    0 0
#> B 1 0    0 0
#> C 1 0    0 0
#> D 0 2    0 0
#> E 0 1    0 0
#> F 0 1    0 0
#> G 0 0    0 1
#> H 0 0    0 1
#> I 0 0    0 1