Paste a list of vectors into a character vector, with values delimited by default with a comma.
Usage
cPaste(
  x,
  sep = ",",
  doSort = FALSE,
  makeUnique = FALSE,
  na.rm = FALSE,
  keepFactors = FALSE,
  checkClass = TRUE,
  useBioc = TRUE,
  useLegacy = FALSE,
  honorFactor = TRUE,
  verbose = FALSE,
  ...
)
cPasteS(
  x,
  sep = ",",
  doSort = TRUE,
  makeUnique = FALSE,
  na.rm = FALSE,
  keepFactors = FALSE,
  checkClass = TRUE,
  useBioc = TRUE,
  ...
)
cPasteSU(
  x,
  sep = ",",
  doSort = TRUE,
  makeUnique = TRUE,
  na.rm = FALSE,
  keepFactors = FALSE,
  checkClass = TRUE,
  useBioc = TRUE,
  ...
)
cPasteUnique(
  x,
  sep = ",",
  doSort = FALSE,
  makeUnique = TRUE,
  na.rm = FALSE,
  keepFactors = FALSE,
  checkClass = TRUE,
  useBioc = TRUE,
  ...
)
cPasteU(
  x,
  sep = ",",
  doSort = FALSE,
  makeUnique = TRUE,
  na.rm = FALSE,
  keepFactors = FALSE,
  checkClass = TRUE,
  useBioc = TRUE,
  ...
)
Arguments
- x
list of vectors.
- sep
character delimiter used to paste multiple values together.
- doSort
logical indicating whether to sort each vector using mixedOrder().
- makeUnique
logical indicating whether to make each vector in the input list unique before pasting its values together.
- na.rm
logical indicating whether to remove NA values from each vector in the input list. When na.rm is TRUE and a list element contains only NA values, the resulting string will be "".
- keepFactors
logical only used when useLegacy=TRUE and doSort=TRUE; indicates whether to preserve factors, keeping factor level order. When keepFactors=TRUE, if any list element is a factor, all elements are converted to factors. Note that this step combines overall factor levels, and non-factors will be ordered using base::order() instead of jamba::mixedOrder() (for now).
- checkClass
logical, default TRUE, whether to check the class of each vector in the input list.
When TRUE, it confirms the class of each element in the list before processing, to prevent conversion which may otherwise lose information.
Whenever a known vector is split into a list, checkClass=FALSE is preferred since there is only one class in the resulting list elements. This approach is faster, especially for large input lists of 10000 or more elements.
When checkClass=FALSE it assumes all entries can be coerced to character, which is fastest, but does not preserve factor levels due to the R coercion methods used by unlist().
- useBioc
logical indicating whether this function should try to use S4Vectors::unstrsplit() when the Bioconductor package S4Vectors is installed; otherwise it will use a less efficient mapply() operation.
- useLegacy
logical indicating whether to enable the previous legacy process used by cPaste().
- honorFactor
logical passed to mixedSorts(), whether any factor vector should be sorted in factor level order. When honorFactor=FALSE, even factor vectors are sorted as if they were character vectors, ignoring the factor levels.
- verbose
logical indicating whether to print verbose output.
- ...
additional arguments are passed to mixedOrder() when doSort=TRUE.
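The na.rm and checkClass arguments described above can be sketched as follows. This is only an illustration: it assumes the jamba package is installed, and the input lists and the names res_na and res_split are hypothetical, made up for this example.

```r
# Hypothetical inputs, made up for illustration only.
if (requireNamespace("jamba", quietly=TRUE)) {
   x <- list(a=c("A", NA, "B"), b=c(NA_character_, NA_character_))
   # With na.rm=TRUE, NA values are dropped before pasting; an element
   # holding only NA values is documented to become "".
   res_na <- jamba::cPaste(x, na.rm=TRUE)
   print(res_na)

   # A single character vector split into a list has one class throughout,
   # so the per-element class check can be skipped.
   v <- c("b", "a", "c", "a")
   g <- c("g1", "g1", "g2", "g2")
   res_split <- jamba::cPaste(split(v, g), checkClass=FALSE)
   print(res_split)
}
```

Skipping the class check is only appropriate when every list element is known to share one class, as is the case after split() on a single vector.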
Details
- cPaste() concatenates vector values using a delimiter.
- cPasteS() sorts each vector using mixedSort().
- cPasteU() applies uniques() to retain unique values per vector.
- cPasteSU() applies mixedSort() and uniques().
This function is essentially a wrapper for S4Vectors::unstrsplit(), except that it can also apply uniqueness to each vector in the list and sort the values in each vector using mixedOrder().
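The core pasting step can be illustrated with a minimal sketch. It is not the cPaste() source; it assumes only that S4Vectors::unstrsplit() accepts a list of character vectors and a sep argument, and it falls back to base R when S4Vectors is not installed. The input list x and the name res are hypothetical.

```r
# Hypothetical input, made up for illustration only.
x <- list(CA=c("A", "B"), B=c("g", "h", "i"))

if (requireNamespace("S4Vectors", quietly=TRUE)) {
   # Bioconductor route: one pasted string per list element.
   res <- S4Vectors::unstrsplit(x, sep=",")
} else {
   # Base R fallback, used here only so the sketch runs without Bioconductor.
   res <- vapply(x, paste, character(1), collapse=",")
}
print(res)
```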
The sorting and uniqueness are applied to the unlisted vector of values, which is substantially faster than any apply family function equivalent. The uniqueness is performed by uniques(), which itself will use S4Vectors::unique() if available.
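The idea of sorting and de-duplicating the unlisted values in one pass, rather than looping over list elements, can be sketched in base R. This is an assumption-laden illustration and not the jamba implementation: it uses base::order() with method="radix" in place of mixedOrder(), and duplicated() in place of uniques(). The variable names are hypothetical.

```r
# Same input as L1 in the Examples section.
x <- list(CA=LETTERS[c(1:4,2,7,4,6)], B=letters[c(7:11,9,3)])

# Flatten the list once, tracking which element each value came from.
vals <- unlist(x, use.names=FALSE)
grp <- rep(seq_along(x), lengths(x))

# One vectorized sort: by group first, then by value within group.
o <- order(grp, vals, method="radix")
vals <- vals[o]
grp <- grp[o]

# One vectorized pass to drop repeated values within each group.
keep <- !duplicated(data.frame(grp, vals))

# Split back by group (keeping empty groups) and paste each group.
res <- vapply(
   split(vals[keep], factor(grp[keep], levels=seq_along(x))),
   paste, character(1), collapse=",")
names(res) <- names(x)
print(res)
```

For this input the result matches the cPasteSU(L1) output shown in the Examples, apart from the delimiter used there.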
See also
Other jam list functions: heads(), jam_rapply(), list2df(), mergeAllXY(), mixedSorts(), rbindList(), relist_named(), rlengths(), sclass(), sdim(), uniques(), unnestList()
Examples
L1 <- list(CA=LETTERS[c(1:4,2,7,4,6)], B=letters[c(7:11,9,3)]);
cPaste(L1);
#> CA B
#> "A,B,C,D,B,G,D,F" "g,h,i,j,k,i,c"
# CA B
# "A,B,C,D,B,G,D,F" "g,h,i,j,k,i,c"
cPaste(L1, doSort=TRUE);
#> CA B
#> "A,B,B,C,D,D,F,G" "c,g,h,i,i,j,k"
# CA B
# "A,B,B,C,D,D,F,G" "c,g,h,i,i,j,k"
## The sort can be done with convenience function cPasteS()
cPasteS(L1);
#> CA B
#> "A,B,B,C,D,D,F,G" "c,g,h,i,i,j,k"
# CA B
# "A,B,B,C,D,D,F,G" "c,g,h,i,i,j,k"
## Similarly, makeUnique=TRUE and cPasteU() are the same
cPaste(L1, makeUnique=TRUE);
#> CA B
#> "A,B,C,D,G,F" "g,h,i,j,k,c"
cPasteU(L1);
#> CA B
#> "A,B,C,D,G,F" "g,h,i,j,k,c"
# CA B
# "A,B,C,D,G,F" "g,h,i,j,k,c"
## Change the delimiter
cPasteSU(L1, sep="; ")
#> CA B
#> "A; B; C; D; F; G" "c; g; h; i; j; k"
# CA B
# "A; B; C; D; F; G" "c; g; h; i; j; k"
# test mix of factor and non-factor
L2 <- c(
   list(D=factor(letters[1:12],
      levels=letters[12:1])),
   L1);
L2;
#> $D
#> [1] a b c d e f g h i j k l
#> Levels: l k j i h g f e d c b a
#>
#> $CA
#> [1] "A" "B" "C" "D" "B" "G" "D" "F"
#>
#> $B
#> [1] "g" "h" "i" "j" "k" "i" "c"
#>
cPasteSU(L2, keepFactors=TRUE);
#> D CA B
#> "l,k,j,i,h,g,f,e,d,c,b,a" "A,B,C,D,F,G" "c,g,h,i,j,k"
# tricky example with mix of character and factor
# and factor levels are inconsistent
# end result: factor levels are defined in order they appear
L <- list(entryA=c("miR-112", "miR-12", "miR-112"),
   entryB=factor(c("A","B","A","B"),
      levels=c("B","A")),
   entryC=factor(c("C","A","B","B","C"),
      levels=c("A","B","C")),
   entryNULL=NULL)
L;
#> $entryA
#> [1] "miR-112" "miR-12" "miR-112"
#>
#> $entryB
#> [1] A B A B
#> Levels: B A
#>
#> $entryC
#> [1] C A B B C
#> Levels: A B C
#>
#> $entryNULL
#> NULL
#>
cPaste(L);
#> entryA entryB entryC
#> "miR-112,miR-12,miR-112" "A,B,A,B" "C,A,B,B,C"
#> entryNULL
#> ""
cPasteU(L);
#> entryA entryB entryC entryNULL
#> "miR-112,miR-12" "A,B" "C,A,B" ""
# keepFactors=FALSE by default; note the factor entries below follow the
# combined factor level order (B, A, C) rather than alphabetical order
cPasteS(L);
#> entryA entryB entryC
#> "miR-12,miR-112,miR-112" "B,B,A,A" "B,B,A,C,C"
#> entryNULL
#> ""
cPasteSU(L);
#> entryA entryB entryC entryNULL
#> "miR-12,miR-112" "B,A" "B,A,C" ""
# keepFactors=TRUE will keep unique factor levels in the order they appear
# this is the same behavior as unlist(L[c(2,3)]) on a list of factors
cPasteSU(L, keepFactors=TRUE);
#> entryA entryB entryC entryNULL
#> "miR-12,miR-112" "B,A" "B,A,C" ""
levels(unlist(L[c(2,3)]))
#> [1] "B" "A" "C"