Paste a list of vectors into a character vector, with values delimited by default with a comma.
Usage
cPaste(
x,
sep = ",",
doSort = FALSE,
makeUnique = FALSE,
na.rm = FALSE,
keepFactors = FALSE,
checkClass = TRUE,
useBioc = TRUE,
useLegacy = FALSE,
honorFactor = TRUE,
verbose = FALSE,
...
)
cPasteS(
x,
sep = ",",
doSort = TRUE,
makeUnique = FALSE,
na.rm = FALSE,
keepFactors = FALSE,
checkClass = TRUE,
useBioc = TRUE,
...
)
cPasteSU(
x,
sep = ",",
doSort = TRUE,
makeUnique = TRUE,
na.rm = FALSE,
keepFactors = FALSE,
checkClass = TRUE,
useBioc = TRUE,
...
)
cPasteUnique(
x,
sep = ",",
doSort = FALSE,
makeUnique = TRUE,
na.rm = FALSE,
keepFactors = FALSE,
checkClass = TRUE,
useBioc = TRUE,
...
)
cPasteU(
x,
sep = ",",
doSort = FALSE,
makeUnique = TRUE,
na.rm = FALSE,
keepFactors = FALSE,
checkClass = TRUE,
useBioc = TRUE,
...
)
Arguments
- x
list of vectors
- sep
character delimiter used to paste multiple values together
- doSort
logical indicating whether to sort each vector using mixedOrder().
- makeUnique
logical indicating whether to make each vector in the input list unique before pasting its values together.
- na.rm
logical indicating whether to remove NA values from each vector in the input list. When na.rm is TRUE and a list element contains only NA values, the resulting string will be "".
- keepFactors
logical only used when useLegacy=TRUE and doSort=TRUE; indicating whether to preserve factors, keeping factor level order. When keepFactors=TRUE, if any list element is a factor, all elements are converted to factors. Note that this step combines overall factor levels, and non-factors will be ordered using base::order() instead of jamba::mixedOrder() (for now).
- checkClass
logical, default TRUE, whether to check the class of each vector in the input list.
When TRUE, it confirms the class of each element in the list before processing, to prevent conversion which may otherwise lose information.
For all cases when a known vector is split into a list, checkClass=FALSE is preferred since there is only one class in the resulting list elements. This approach is faster, especially for large input lists, 10000 or more.
When checkClass=FALSE it assumes all entries can be coerced to character, which is fastest, but does not preserve factor levels due to the R coercion methods used by unlist().
- useBioc
logical indicating whether this function should try to use S4Vectors::unstrsplit() when the Bioconductor package S4Vectors is installed, otherwise it will use a less efficient mapply() operation.
- useLegacy
logical indicating whether to enable the previous legacy process used by cPaste().
- honorFactor
logical passed to mixedSorts(), whether any factor vector should be sorted in factor level order. When honorFactor=FALSE then even factor vectors are sorted as if they were character vectors, ignoring the factor levels.
- verbose
logical indicating whether to print verbose output.
- ...
additional arguments are passed to mixedOrder() when doSort=TRUE.
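The class check described for checkClass matters because of how base R's unlist() treats factors. The following is a minimal base-R sketch of that coercion, not the internal code of cPaste(); the lists all_factors and mixed are hypothetical inputs made up for illustration, and whether cPaste() reaches exactly this code path is an assumption.

```r
# Hypothetical inputs for illustration only (not part of jamba)
all_factors <- list(a=factor(c("x", "y")), b=factor("z"))
mixed <- list(a=c("x", "y"), b=factor("z"))

# When every element is a factor, unlist() returns a factor
# whose levels are the union of the input levels
flat_factors <- unlist(all_factors)
class(flat_factors)
levels(flat_factors)

# When classes are mixed, unlist() treats the factor as its integer codes,
# so the label "z" is replaced by the code "1"
flat_mixed <- unlist(mixed)
flat_mixed

# Converting each element to character first keeps the labels
flat_safe <- unlist(lapply(mixed, as.character))
flat_safe
```

In the mixed case the factor label is silently lost, which is the kind of conversion a class check can guard against.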
Details
- cPaste() concatenates vector values using a delimiter.
- cPasteS() sorts each vector using mixedSort().
- cPasteU() applies uniques() to retain unique values per vector.
- cPasteSU() applies mixedSort() and uniques().
This function is essentially a wrapper for S4Vectors::unstrsplit()
except that it also optionally applies uniqueness to each vector
in the list, and sorts values in each vector using mixedOrder().
Sorting and uniqueness are applied to the unlisted vector of
values, which is substantially faster than any apply family function
equivalent. The uniqueness is performed by uniques(), which itself
will use S4Vectors::unique() if available.
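The idea of working on the unlisted vector can be sketched in base R as follows. This is an illustration under stated assumptions, not the jamba implementation: the input x is hypothetical, base order() stands in for mixedOrder(), duplicated() stands in for uniques(), and the final vapply() step stands in for S4Vectors::unstrsplit().

```r
# Hypothetical input list for illustration
x <- list(A=c("b", "a", "b"), B=c("d", "c"), C=character(0))

# Flatten once, and record which list element each value came from
values <- unlist(x, use.names=FALSE)
group <- rep(seq_along(x), lengths(x))

# One vectorized pass: drop duplicates within each group
# (duplicated() is a stand-in for uniques(); an assumption)
keep <- !duplicated(data.frame(group, values))
values <- values[keep]
group <- group[keep]

# One vectorized pass: sort by group, then by value
# (base order() is a stand-in for mixedOrder(); an assumption)
idx <- order(group, values)
values <- values[idx]
group <- group[idx]

# Paste each group; the explicit factor levels keep empty elements as ""
pieces <- split(values, factor(group, levels=seq_along(x)))
result <- vapply(pieces, paste, character(1), collapse=",")
names(result) <- names(x)
result
```

The duplicate removal and the sort each run once over the flat vector, instead of once per list element, which is where the speed advantage described above would come from.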
See also
Other jam list functions:
heads(),
jam_rapply(),
list2df(),
mergeAllXY(),
mixedSorts(),
rbindList(),
relist_named(),
rlengths(),
sclass(),
sdim(),
uniques(),
unnestList()
Examples
L1 <- list(CA=LETTERS[c(1:4,2,7,4,6)], B=letters[c(7:11,9,3)]);
cPaste(L1);
#> CA B
#> "A,B,C,D,B,G,D,F" "g,h,i,j,k,i,c"
cPaste(L1, doSort=TRUE);
#> CA B
#> "A,B,B,C,D,D,F,G" "c,g,h,i,i,j,k"
## The sort can be done with convenience function cPasteS()
cPasteS(L1);
#> CA B
#> "A,B,B,C,D,D,F,G" "c,g,h,i,i,j,k"
## Similarly, makeUnique=TRUE and cPasteU() are the same
cPaste(L1, makeUnique=TRUE);
#> CA B
#> "A,B,C,D,G,F" "g,h,i,j,k,c"
cPasteU(L1);
#> CA B
#> "A,B,C,D,G,F" "g,h,i,j,k,c"
## Change the delimiter
cPasteSU(L1, sep="; ")
#> CA B
#> "A; B; C; D; F; G" "c; g; h; i; j; k"
# test mix of factor and non-factor
L2 <- c(
list(D=factor(letters[1:12],
levels=letters[12:1])),
L1);
L2;
#> $D
#> [1] a b c d e f g h i j k l
#> Levels: l k j i h g f e d c b a
#>
#> $CA
#> [1] "A" "B" "C" "D" "B" "G" "D" "F"
#>
#> $B
#> [1] "g" "h" "i" "j" "k" "i" "c"
#>
cPasteSU(L2, keepFactors=TRUE);
#> D CA B
#> "l,k,j,i,h,g,f,e,d,c,b,a" "A,B,C,D,F,G" "c,g,h,i,j,k"
# tricky example with mix of character and factor
# and factor levels are inconsistent
# end result: factor levels are defined in order they appear
L <- list(entryA=c("miR-112", "miR-12", "miR-112"),
entryB=factor(c("A","B","A","B"),
levels=c("B","A")),
entryC=factor(c("C","A","B","B","C"),
levels=c("A","B","C")),
entryNULL=NULL)
L;
#> $entryA
#> [1] "miR-112" "miR-12" "miR-112"
#>
#> $entryB
#> [1] A B A B
#> Levels: B A
#>
#> $entryC
#> [1] C A B B C
#> Levels: A B C
#>
#> $entryNULL
#> NULL
#>
cPaste(L);
#> entryA entryB entryC
#> "miR-112,miR-12,miR-112" "A,B,A,B" "C,A,B,B,C"
#> entryNULL
#> ""
cPasteU(L);
#> entryA entryB entryC entryNULL
#> "miR-112,miR-12" "A,B" "C,A,B" ""
# by default keepFactors=FALSE; in the output below, factor values follow
# factor level order (B, A, C), not alphabetical order
cPasteS(L);
#> entryA entryB entryC
#> "miR-12,miR-112,miR-112" "B,B,A,A" "B,B,A,C,C"
#> entryNULL
#> ""
cPasteSU(L);
#> entryA entryB entryC entryNULL
#> "miR-12,miR-112" "B,A" "B,A,C" ""
# keepFactors=TRUE will keep unique factor levels in the order they appear
# this is the same behavior as unlist(L[c(2,3)]) on a list of factors
cPasteSU(L, keepFactors=TRUE);
#> entryA entryB entryC entryNULL
#> "miR-12,miR-112" "B,A" "B,A,C" ""
levels(unlist(L[c(2,3)]))
#> [1] "B" "A" "C"
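The examples above do not exercise na.rm. A short sketch follows, assuming the jamba package is installed; the list L3 is a hypothetical input. Per the na.rm description, the element that contains only NA values is expected to become "". No output is shown here because it was not run as part of this page.

```r
# Hypothetical input: the second element contains only NA values
L3 <- list(A=c("a", NA, "b"),
   allNA=c(NA_character_, NA_character_))

res <- NULL
# Run only when jamba is available
if (requireNamespace("jamba", quietly=TRUE)) {
   # na.rm=TRUE removes NA values from each vector before pasting
   res <- jamba::cPaste(L3, na.rm=TRUE)
   print(res)
}
```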