Paste a list of vectors into a character vector, with values delimited by default with a comma.
cPaste(
  x,
  sep = ",",
  doSort = FALSE,
  makeUnique = FALSE,
  na.rm = FALSE,
  keepFactors = FALSE,
  checkClass = TRUE,
  useBioc = TRUE,
  useLegacy = FALSE,
  honorFactor = TRUE,
  verbose = FALSE,
  ...
)
x: input list of vectors.
sep: character delimiter used to paste multiple values together.
doSort: logical indicating whether to sort each vector using mixedOrder().
makeUnique: logical indicating whether to make each vector in the input list unique before pasting its values together.
na.rm: logical indicating whether to remove NA values from each vector in the input list. When na.rm is TRUE and a list element contains only NA values, the resulting string will be "".
keepFactors: logical, only used when useLegacy=TRUE and doSort=TRUE, indicating whether to preserve factors, keeping factor level order. When keepFactors=TRUE, if any list element is a factor, all elements are converted to factors. Note that this step combines the overall factor levels, and non-factors will be ordered using base::order() instead of jamba::mixedOrder() (for now).
useBioc: logical indicating whether this function should try to use S4Vectors::unstrsplit() when the Bioconductor package S4Vectors is installed; otherwise it will use a less efficient mapply() operation.
useLegacy: logical indicating whether to enable the previous legacy process used by cPaste().
honorFactor: logical passed to mixedSorts(), indicating whether any factor vector should be sorted in factor level order. When honorFactor=FALSE, even factor vectors are sorted as if they were character vectors, ignoring the factor levels.
...: additional arguments are passed to mixedOrder() when doSort=TRUE.
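The difference that honorFactor describes can be illustrated with base R alone. The sketch below is only an illustration: it uses base sort() as a stand-in for mixedSorts(), and the variable names are made up for this example.

```r
# Illustration only: base R sort(), not jamba::mixedSorts()
f <- factor(c("A", "B", "A", "B"), levels = c("B", "A"))

# Level order: "B" precedes "A" because the factor levels say so
by_level <- as.character(sort(f))

# Character order: the factor levels are ignored
by_char <- sort(as.character(f))

print(by_level)
print(by_char)
```

The first result follows the factor levels (B before A), which corresponds to honorFactor=TRUE; the second follows plain character order, which corresponds to honorFactor=FALSE.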
character vector with the same names and in the same order as the input list x.
This function is essentially a wrapper for S4Vectors::unstrsplit(), except that it also optionally applies uniqueness to each vector in the list, and sorts values in each vector using mixedOrder().
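When S4Vectors is not installed, the useBioc argument describes a fallback to a mapply() operation. A minimal base R sketch of what such a fallback might look like is shown here; paste_list_mapply() is a hypothetical name used for illustration, and this is not necessarily the code that cPaste() runs internally.

```r
# Hypothetical helper for illustration; not part of jamba
paste_list_mapply <- function(x, sep = ",") {
  # Collapse each list element into one string; the names of x carry over
  mapply(function(v) paste(v, collapse = sep), x)
}

L1 <- list(CA = LETTERS[c(1:4, 2, 7, 4, 6)], B = letters[c(7:11, 9, 3)])
res_mapply <- paste_list_mapply(L1)
print(res_mapply)
```

This sketch applies no sorting or uniqueness, so it corresponds only to the default cPaste(L1) call.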
Sorting and uniqueness are applied to the unlisted vector of values, which is substantially faster than any equivalent apply-family function. Uniqueness is performed by uniques(), which itself will use S4Vectors::unique() if available.
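The idea of working on one unlisted vector instead of looping over list elements can be sketched in base R as follows. This is an assumption-laden illustration, not the jamba implementation: flat_paste() is a hypothetical name, the input is assumed to be a list of character vectors, and base::order() is used in place of mixedOrder().

```r
# Hypothetical helper, base R only; assumes a list of character vectors
flat_paste <- function(x, sep = ",", doSort = FALSE, makeUnique = FALSE) {
  # One flat vector of values plus a parallel index of the owning element
  values <- as.character(unlist(x, use.names = FALSE))
  group <- rep(seq_along(x), lengths(x))
  if (makeUnique) {
    # Drop repeated (group, value) pairs in a single vectorized pass
    keep <- !duplicated(data.frame(group, values))
    values <- values[keep]
    group <- group[keep]
  }
  if (doSort) {
    # Plain base::order(), not the alphanumeric mixedOrder() used by jamba
    o <- order(group, values)
    values <- values[o]
    group <- group[o]
  }
  # Keep every element index as a level so empty elements yield ""
  pieces <- split(values, factor(group, levels = seq_along(x)))
  out <- vapply(pieces, paste, character(1), collapse = sep)
  names(out) <- names(x)
  out
}

L1 <- list(CA = LETTERS[c(1:4, 2, 7, 4, 6)], B = letters[c(7:11, 9, 3)])
res_flat <- flat_paste(L1, doSort = TRUE, makeUnique = TRUE)
print(res_flat)
```

The uniqueness and sorting steps each run once over the whole flat vector; only the final paste step visits the list elements individually.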
Other jam string functions: asSize(), breaksByVector(), cPasteSU(), cPasteS(), cPasteUnique(), cPasteU(), fillBlanks(), formatInt(), gsubOrdered(), gsubs(), makeNames(), mixedOrder(), mixedSortDF(), mixedSorts(), mixedSort(), mmixedOrder(), nameVectorN(), nameVector(), padInteger(), padString(), pasteByRowOrdered(), pasteByRow(), sizeAsNum(), tcount(), ucfirst(), uniques()
Other jam list functions: cPasteSU(), cPasteS(), cPasteUnique(), cPasteU(), heads(), jam_rapply(), list2df(), mergeAllXY(), mixedSorts(), rbindList(), relist_named(), rlengths(), sclass(), sdim(), uniques(), unnestList()
L1 <- list(CA=LETTERS[c(1:4,2,7,4,6)], B=letters[c(7:11,9,3)]);
cPaste(L1);
#>                CA                 B
#> "A,B,C,D,B,G,D,F"   "g,h,i,j,k,i,c"
#                CA                 B
# "A,B,C,D,B,G,D,F"   "g,h,i,j,k,i,c"
cPaste(L1, doSort=TRUE);
#>                CA                 B
#> "A,B,B,C,D,D,F,G"   "c,g,h,i,i,j,k"
#                CA                 B
# "A,B,B,C,D,D,F,G"   "c,g,h,i,i,j,k"
## The sort can be done with convenience function cPasteS()
cPasteS(L1);
#>                CA                 B
#> "A,B,B,C,D,D,F,G"   "c,g,h,i,i,j,k"
#                CA                 B
# "A,B,B,C,D,D,F,G"   "c,g,h,i,i,j,k"
## Similarly, makeUnique=TRUE and cPasteU() are the same
cPaste(L1, makeUnique=TRUE);
#>            CA             B
#> "A,B,C,D,G,F" "g,h,i,j,k,c"
cPasteU(L1);
#>            CA             B
#> "A,B,C,D,G,F" "g,h,i,j,k,c"
#            CA             B
# "A,B,C,D,G,F" "g,h,i,j,k,c"
## Change the delimiter
cPasteSU(L1, sep="; ")
#>                 CA                  B
#> "A; B; C; D; F; G" "c; g; h; i; j; k"
#                 CA                  B
# "A; B; C; D; F; G" "c; g; h; i; j; k"
# test mix of factor and non-factor
L2 <- c(
   list(D=factor(letters[1:12],
      levels=letters[12:1])),
   L1);
L2;
#> $D
#> [1] a b c d e f g h i j k l
#> Levels: l k j i h g f e d c b a
#>
#> $CA
#> [1] "A" "B" "C" "D" "B" "G" "D" "F"
#>
#> $B
#> [1] "g" "h" "i" "j" "k" "i" "c"
#>
cPasteSU(L2, keepFactors=TRUE);
#>                         D                        CA                         B
#> "l,k,j,i,h,g,f,e,d,c,b,a"             "A,B,C,D,F,G"             "c,g,h,i,j,k"
# tricky example with mix of character and factor
# and factor levels are inconsistent
# end result: factor levels are defined in order they appear
L <- list(entryA=c("miR-112", "miR-12", "miR-112"),
   entryB=factor(c("A","B","A","B"),
      levels=c("B","A")),
   entryC=factor(c("C","A","B","B","C"),
      levels=c("A","B","C")),
   entryNULL=NULL)
L;
#> $entryA
#> [1] "miR-112" "miR-12" "miR-112"
#>
#> $entryB
#> [1] A B A B
#> Levels: B A
#>
#> $entryC
#> [1] C A B B C
#> Levels: A B C
#>
#> $entryNULL
#> NULL
#>
cPaste(L);
#>                   entryA                   entryB                   entryC
#> "miR-112,miR-12,miR-112"                "A,B,A,B"              "C,A,B,B,C"
#>                entryNULL
#>                       ""
cPasteU(L);
#>           entryA           entryB           entryC        entryNULL
#> "miR-112,miR-12"            "A,B"          "C,A,B"               ""
# by default keepFactors=FALSE, which means factors are sorted as characters
cPasteS(L);
#>                   entryA                   entryB                   entryC
#> "miR-12,miR-112,miR-112"                "B,B,A,A"              "B,B,A,C,C"
#>                entryNULL
#>                       ""
cPasteSU(L);
#>           entryA           entryB           entryC        entryNULL
#> "miR-12,miR-112"            "B,A"          "B,A,C"               ""
# keepFactors=TRUE will keep unique factor levels in the order they appear
# this is the same behavior as unlist(L[c(2,3)]) on a list of factors
cPasteSU(L, keepFactors=TRUE);
#>           entryA           entryB           entryC        entryNULL
#> "miR-12,miR-112"            "B,A"          "B,A,C"               ""
levels(unlist(L[c(2,3)]))
#> [1] "B" "A" "C"