make unique vector names
makeNames(
x,
unique = TRUE,
suffix = "_v",
renameOnes = FALSE,
doPadInteger = FALSE,
startN = 1,
numberStyle = c("number", "letters", "LETTERS"),
useNchar = NULL,
renameFirst = TRUE,
keepNA = TRUE,
...
)
character vector to be used when defining names. All other vector types will be coerced to character prior to use.
argument which is ignored, included only for compatibility with base::make.names. All results from makeNames() are unique.
character separator between the original entry and the version, if necessary.
logical whether to rename single, unduplicated, entries.
logical whether to pad integer values to a consistent number of digits, based upon all suffix values needed. This output allows for more consistent sorting of names. To define a fixed number of digits, use the useNchar parameter.
integer used only when numberStyle is "number": the suffix number given to the first entry to be renamed. For example, use startN=0 to make zero-based suffix values.
character style for version numbering
Use integer numbers to represent each duplicated entry.
Use lowercase letters to represent each duplicated entry. The 27th entry uses the pattern "aa" to represent two 26-base digits. When doPadInteger=TRUE, a zero is still used to pad the resulting version numbers, again to allow easy sorting of text values, but also because there is no letter equivalent for the number zero. It is usually best to change the suffix to "_" or "" when using "letters".
Use uppercase letters to represent each duplicated entry, with the same rules as applied to "letters".
integer or NULL, number of digits to use when padding integer values with leading zeros, only relevant when doPadInteger=TRUE.
logical whether to rename the first entry in a set of duplicated entries. If FALSE then the first entry in a set will not be versioned, even when renameOnes=TRUE.
logical whether to retain NA values using the string "NA".
If keepNA is FALSE, then NA values will remain NA, thus causing some names to become <NA>, which can cause problems with some downstream functions which assume all names are either NULL or non-NA.
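The "letters" and "LETTERS" numbering styles described above count with letters only, so the 27th entry becomes "aa". The following is a minimal sketch of that counting scheme in base R. The helper int_to_letters() is hypothetical, written only for illustration, and is not the implementation used by makeNames().

```r
# Hypothetical helper (not part of jamba): convert positive integers to
# letter labels with no zero digit, so that 1 -> "a", 26 -> "z", 27 -> "aa".
int_to_letters <- function(n, letters_set = letters) {
  vapply(n, function(i) {
    out <- character(0)
    while (i > 0) {
      r <- (i - 1) %% 26               # zero-based position of the last letter
      out <- c(letters_set[r + 1], out)
      i <- (i - 1) %/% 26              # carry the remaining higher "digits"
    }
    paste(out, collapse = "")
  }, character(1))
}

int_to_letters(c(1, 26, 27, 28, 52, 53))
#> [1] "a"  "z"  "aa" "ab" "az" "ba"
```

Passing letters_set = LETTERS gives the uppercase equivalent.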
character vector of unique names
This function extends the basic goal of make.names, which is intended to make syntactically valid names from a character vector. This makeNames function makes names unique, and offers configurable methods to handle duplicate names. By default, any duplicated entries receive a suffix _v# where # is a running count of entries observed, starting at 1. The make.names function, by contrast, renames the second observed entry starting at .1, leaving the original entry unchanged. Optionally, makeNames can rename all entries with a numeric suffix, for consistency.
For example:
A, A, A, B, B, C
becomes:
A_v1, A_v2, A_v3, B_v1, B_v2, C
Also, makeNames always allows "_".
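For comparison, the same input can be passed to the base R functions. This short example uses only base::make.unique and base::make.names, and shows that both leave the first entry unversioned and start numbering the second entry at .1. The second input, c("a b", "a b"), is an illustrative value chosen here to show that make.names also alters characters which are not syntactically valid.

```r
x <- c("A", "A", "A", "B", "B", "C")

# base R leaves the first entry unchanged and starts at ".1"
make.unique(x)
#> [1] "A"   "A.1" "A.2" "B"   "B.1" "C"
make.names(x, unique = TRUE)
#> [1] "A"   "A.1" "A.2" "B"   "B.1" "C"

# make.names additionally replaces invalid characters such as spaces
make.names(c("a b", "a b"), unique = TRUE)
#> [1] "a.b"   "a.b.1"
```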
This makeNames function is similar to make.unique, which also converts a vector into a unique vector by adding suffix values. However, the make.unique function is intended to allow repeated operations, which recognize duplicated entries and continually increment the suffix number. This makeNames function currently does not handle repeat operations. The recommended approach to work around pre-existing versioned names is to remove suffix values prior to running this function. One small distinction from make.unique is that makeNames does version the first entry in a set.
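The repeat-operation behavior of make.unique, and the suggested approach of removing suffix values first, can be sketched as follows. This is one possible approach, not a recommendation from the package itself: the pattern "_v[0-9]+$" assumes the default suffix "_v" with numberStyle "number", and would need to be adjusted for other settings. The final call runs only when the jamba package is installed.

```r
# make.unique recognizes its own suffixes when applied repeatedly
once <- make.unique(c("a", "a"))
twice <- make.unique(c(once, "a"))
twice
#> [1] "a"   "a.1" "a.2"

# Assumed pattern: default suffix "_v" followed by an integer
versioned <- c("A_v1", "A_v2", "B")
base_names <- sub("_v[0-9]+$", "", versioned)
base_names
#> [1] "A" "A" "B"

# Re-version after appending a new entry, if jamba is available
if (requireNamespace("jamba", quietly = TRUE)) {
  print(jamba::makeNames(c(base_names, "A")))
}
```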
Other jam string functions: asSize(), breaksByVector(), cPasteSU(), cPasteS(), cPasteUnique(), cPasteU(), cPaste(), fillBlanks(), formatInt(), gsubOrdered(), gsubs(), mixedOrder(), mixedSortDF(), mixedSorts(), mixedSort(), mmixedOrder(), nameVectorN(), nameVector(), padInteger(), padString(), pasteByRowOrdered(), pasteByRow(), sizeAsNum(), tcount(), ucfirst(), uniques()
V <- rep(LETTERS[1:3], c(2,3,1));
makeNames(V);
#> [1] "A_v1" "A_v2" "B_v1" "B_v2" "B_v3" "C"
makeNames(V, renameOnes=TRUE);
#> [1] "A_v1" "A_v2" "B_v1" "B_v2" "B_v3" "C_v1"
makeNames(V, renameFirst=FALSE);
#> [1] "A" "A_v1" "B" "B_v1" "B_v2" "C"
exons <- makeNames(rep("exon", 3), suffix="");
makeNames(rep(exons, c(2,3,1)), numberStyle="letters", suffix="");
#> [1] "exon1a" "exon1b" "exon2a" "exon2b" "exon2c" "exon3"
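As a further illustration of why doPadInteger=TRUE helps with sorting, the following base R sketch compares unpadded and zero-padded suffix values. The names are built by hand with paste0() and sprintf() to mimic padded output. They are not produced by makeNames(), and method="radix" is used only to make the sort order independent of locale.

```r
# Hand-built names for illustration, not makeNames() output
plain  <- paste0("A_v", 1:10)
padded <- paste0("A_v", sprintf("%02d", 1:10))

# Without padding, "A_v10" sorts before "A_v2"
sort(plain, method = "radix")[1:3]
#> [1] "A_v1"  "A_v10" "A_v2"

# With padding, text order matches numeric order
sort(padded, method = "radix")[1:3]
#> [1] "A_v01" "A_v02" "A_v03"
```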