Paste data.frame rows into an ordered factor
Arguments
- x
data.frame
- sep
character separator to use between columns.
- na.rm
logical indicating whether to remove NA values, or include them as "NA".
- condenseBlanks
logical indicating whether to condense blank or empty values without including an extra delimiter between columns.
- includeNames
logical indicating whether to prefix each value with its column name, using sepName as the delimiter.
- keepOrder
logical indicating whether non-factor columns should order factor levels based upon the existing order of unique items. This option is intended for a data.frame whose columns are already sorted in proper order, but where columns are not factor with appropriate factor levels. Note that even when keepOrder=TRUE all existing factor columns will honor the order of factor levels already present in those columns.
- byCols
integer or character passed to mixedSortDF(). This argument defines the order of columns sorted by mixedSortDF(), and does not affect the order of columns pasted. Columns are always pasted in the same order they appear in x. This argument byCols was previously passed via ... but is added here to make this connection more direct.
- na.last
logical passed to base::factor() to determine whether NA values are first or last in factor level order.
- ...
additional arguments are passed to jamba::pasteByRow(), and to jamba::mixedSortDF().
Details
This function is an extension to jamba::pasteByRow() which
pastes rows from a data.frame into a character vector. This
function defines factor levels by running jamba::mixedSortDF(unique(x))
and calling jamba::pasteByRow() on the result. Therefore the
original order of the input x is maintained while the factor
levels are based upon the appropriate column-based sort.
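The approach described above can be sketched in base R. This is only an illustration under stated assumptions, not the function's actual implementation: order() stands in for jamba::mixedSortDF(), paste() stands in for jamba::pasteByRow(), and the data.frame and variable names are hypothetical.

```r
# Hypothetical input whose rows are not in sorted order
x <- data.frame(
   group=c("B", "A", "B", "A"),
   time=c("t2", "t2", "t1", "t1"),
   stringsAsFactors=FALSE)

# Paste each row into one string, keeping the input row order
row_strings <- do.call(paste, c(x, sep="_"))

# Sort the unique rows by column (stand-in for mixedSortDF(unique(x))),
# then paste the sorted rows to define the factor levels
ux <- unique(x)
ux_sorted <- ux[do.call(order, unname(as.list(ux))), , drop=FALSE]
level_strings <- do.call(paste, c(ux_sorted, sep="_"))

# Values follow the input order; levels follow the column-based sort
row_factor <- factor(row_strings, levels=level_strings)
```

Here the values of row_factor stay in the order of the rows of x, while its levels follow the sorted unique rows. The stand-ins do not reproduce the NA handling or alphanumeric ordering of the jamba functions.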
Note that the ... additional arguments are
passed to jamba::mixedSortDF() to customize the column-based
sort order, used to define factor levels. A good way to test the
order of factors is to run jamba::mixedSortDF(unique(x)) with
appropriate arguments, and confirm the rows are ordered as expected.
Note also that jamba::mixedSortDF() uses jamba::mixedSort()
which itself performs alphanumeric sort in order to keep
values in proper numeric order where possible.
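To show why alphanumeric sorting matters for factor level order, the sketch below compares a plain character sort with a simplified alphanumeric sort in base R. It assumes values made of a text prefix followed by one integer, and it is not the algorithm used by jamba::mixedSort(); the variable names are hypothetical.

```r
# Hypothetical values with a text prefix followed by an integer
v <- c("Hour10", "Hour2", "Hour1")

# A plain character sort compares character by character,
# so "Hour10" is placed before "Hour2"
plain_sorted <- sort(v, method="radix")

# Simplified alphanumeric sort: order by text prefix, then numeric value
v_prefix <- sub("[0-9]+$", "", v)
v_number <- as.integer(sub("^[^0-9]*", "", v))
alnum_sorted <- v[order(v_prefix, v_number)]
```

With this input, plain_sorted places "Hour10" between "Hour1" and "Hour2", while alnum_sorted keeps the hours in numeric order.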
See also
Other jam string functions:
asSize(),
breaksByVector(),
fillBlanks(),
formatInt(),
gsubOrdered(),
gsubs(),
makeNames(),
nameVector(),
nameVectorN(),
padInteger(),
padString(),
pasteByRow(),
sizeAsNum(),
tcount(),
ucfirst()
Examples
f <- LETTERS
# columns B and C are recycled to match the 6 rows of column A
df <- data.frame(A=f[rep(1:3, each=2)],
   B=c(NA, f[3]),
   C=c(NA, NA, f[2]))
df
#>   A    B    C
#> 1 A <NA> <NA>
#> 2 A    C <NA>
#> 3 B <NA>    B
#> 4 B    C <NA>
#> 5 C <NA> <NA>
#> 6 C    C    B
# note that output is consistent with mixedSortDF()
jamba::mixedSortDF(df)
#>   A    B    C
#> 2 A    C <NA>
#> 1 A <NA> <NA>
#> 4 B    C <NA>
#> 3 B <NA>    B
#> 6 C    C    B
#> 5 C <NA> <NA>
jamba::pasteByRowOrdered(df)
#> 1 2 3 4 5 6
#> A A_C B_B B_C C C_C_B
#> Levels: A_C A B_C B_B C_C_B C
jamba::mixedSortDF(df, na.last=FALSE)
#>   A    B    C
#> 1 A <NA> <NA>
#> 2 A    C <NA>
#> 3 B <NA>    B
#> 4 B    C <NA>
#> 5 C <NA> <NA>
#> 6 C    C    B
jamba::pasteByRowOrdered(df, na.last=FALSE)
#> 1 2 3 4 5 6
#> A A_C B_B B_C C C_C_B
#> Levels: A A_C B_B B_C C C_C_B
jamba::mixedSortDF(df, byCols=c(3, 2, 1))
#>   A    B    C
#> 6 C    C    B
#> 3 B <NA>    B
#> 2 A    C <NA>
#> 4 B    C <NA>
#> 1 A <NA> <NA>
#> 5 C <NA> <NA>
jamba::pasteByRowOrdered(df, byCols=c(3, 2, 1))
#> 1 2 3 4 5 6
#> A A_C B_B B_C C C_C_B
#> Levels: C_C_B B_B A_C B_C A C
df1 <- data.frame(group=rep(c("Control", "ABC1"), each=6),
   time=rep(c("Hour2", "Hour10"), each=3),
   rep=paste0("Rep", 1:3))
# default will sort each column alphanumerically
pasteByRowOrdered(df1)
#> 1 2 3 4
#> Control_Hour2_Rep1 Control_Hour2_Rep2 Control_Hour2_Rep3 Control_Hour10_Rep1
#> 5 6 7 8
#> Control_Hour10_Rep2 Control_Hour10_Rep3 ABC1_Hour2_Rep1 ABC1_Hour2_Rep2
#> 9 10 11 12
#> ABC1_Hour2_Rep3 ABC1_Hour10_Rep1 ABC1_Hour10_Rep2 ABC1_Hour10_Rep3
#> 12 Levels: ABC1_Hour2_Rep1 ABC1_Hour2_Rep2 ABC1_Hour2_Rep3 ... Control_Hour10_Rep3
# keepOrder=TRUE will honor existing order of character columns
pasteByRowOrdered(df1, keepOrder=TRUE)
#> 1 2 3 4
#> Control_Hour2_Rep1 Control_Hour2_Rep2 Control_Hour2_Rep3 Control_Hour10_Rep1
#> 5 6 7 8
#> Control_Hour10_Rep2 Control_Hour10_Rep3 ABC1_Hour2_Rep1 ABC1_Hour2_Rep2
#> 9 10 11 12
#> ABC1_Hour2_Rep3 ABC1_Hour10_Rep1 ABC1_Hour10_Rep2 ABC1_Hour10_Rep3
#> 12 Levels: Control_Hour2_Rep1 Control_Hour2_Rep2 ... ABC1_Hour10_Rep3