order alphanumeric values keeping numeric values in proper order
Source: R/jamba-mixedSort.R
mixedOrder.Rd
Usage
mixedOrder(
x,
...,
blanksFirst = TRUE,
na.last = NAlast,
keepNegative = FALSE,
keepInfinite = FALSE,
keepDecimal = FALSE,
ignore.case = TRUE,
useCaseTiebreak = TRUE,
honorFactor = FALSE,
returnDebug = FALSE,
returnType = c("order", "rank"),
NAlast = TRUE,
verbose = FALSE,
debug = FALSE
)
Arguments
- x
input vector
- ...
additional parameters are sent to mixedOrder().
- blanksFirst
logical whether to order blank entries before entries containing a value.
- na.last
logical whether to move NA entries to the end of the sort. When na.last=TRUE then NA values will always be last, even following blanks and infinite values. When na.last=FALSE then NA values will always be first, even before blanks and negative infinite values.
- keepNegative
logical whether to keep '-' associated with adjacent numeric values, in order to sort them as negative values. Note that keepNegative=TRUE also forces keepDecimal=TRUE, and enables matching of scientific notation such as -1.23e-10 as a numeric value. When keepNegative=FALSE the dash "-" is treated as a common delimiter.
- keepInfinite
logical whether to allow "Inf" in the input x to be considered a numeric infinite value. Note that "-Inf" is only treated as a negative infinite value when keepNegative=TRUE. Also note that "Inf" is only recognized as infinite when it appears between non-character delimiters, and not as part of a larger character string like "Information". Be careful with keepInfinite=TRUE when sorting gene symbols: there are gene symbols like "Inf3" which should not be sorted as infinite. Lastly, infinite values are sorted at the end, notably after all character values, which differs from some mixed sorting algorithms.
- keepDecimal
logical whether to keep the decimal in numbers, sorting as a true number and not as a version number. By default keepDecimal=FALSE, which means "v1.200" will be ordered after "v1.30", since it considers "1.200" effectively as 1 and 200, and "1.30" effectively as 1 and 30. When keepDecimal=TRUE, the numeric sort orders "v1.200" before "v1.30".
- ignore.case
logical whether to ignore uppercase and lowercase characters when defining the sort order.
- useCaseTiebreak
logical indicating whether to break ties when ignore.case=TRUE, using mixed case as a tiebreaker.
- honorFactor
logical indicating whether to honor the order of levels if the input x is a factor. The default honorFactor=FALSE is to maintain consistent legacy behavior. The purpose of this function is to enable alphanumeric sorting, which is not the purpose of sorting by factor levels.
- returnDebug
logical indicating whether to include additional debug info as attributes.
- returnType
character string to define the return type, either "order" or "rank" as described under Value.
- NAlast
logical DEPRECATED in favor of na.last for consistency with other base R functions.
- verbose
logical whether to print verbose output.
- debug
logical indicating whether to return intermediate data useful only for debugging purposes.
Value
integer vector of orders derived from x,
or when returnType="rank" an integer vector of ranks allowing ties.
The rank is therefore valid for use in chains, such as multiple
columns of a data.frame.
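As a hedged sketch of what "use in chains" can look like: ranks computed for several columns may be passed together to base order(), so that ties in the first column are broken by the second. The data.frame `df` and its column names are hypothetical, and the call is guarded in case jamba is not installed.

```r
# Hypothetical data: a gene symbol and a sample label per row
df <- data.frame(
   gene = c("miR-12", "miR-2", "miR-12", "miR-2"),
   sample = c("s10", "s2", "s9", "s1"),
   stringsAsFactors = FALSE)

if (requireNamespace("jamba", quietly = TRUE)) {
   # One integer rank per column; tied values are allowed
   gene_rank <- jamba::mixedOrder(df$gene, returnType = "rank")
   sample_rank <- jamba::mixedOrder(df$sample, returnType = "rank")
   # Base order() sorts by gene rank first, then by sample rank within ties
   df_sorted <- df[order(gene_rank, sample_rank), , drop = FALSE]
   print(df_sorted)
}
```

In practice mixedSortDF() is probably the more convenient route for this case; the sketch only shows why a tie-aware rank is needed for multi-column ordering.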
Details
This function is a refactor of gtools mixedorder() which was
the source of inspiration for this function, thanks to Gregory R. Warnes!
This function was designed to improve efficiency for large vectors,
and to handle special cases slightly differently. It was driven by the
need to sort gene symbols and miRNA symbols in numeric order, for example:
- test set: miR-12, miR-1, miR-122, miR-1b, miR-1a, miR-2
- sort: miR-1, miR-12, miR-122, miR-1a, miR-1b, miR-2
- gtools::mixedsort: miR-122, miR-12, miR-2, miR-1, miR-1a, miR-1b
- mixedSort: miR-1, miR-1a, miR-1b, miR-2, miR-12, miR-122
By default this function does not treat negative numbers as negative; it treats '-' as a delimiter unless keepNegative=TRUE.
When keepNegative=TRUE this function also recognizes scientific
notation, for example "1.23e-2" will be treated as numeric 0.0123.
Note that keepNegative=TRUE also forces keepDecimal=TRUE.
When keepDecimal=TRUE this function maintains numeric values that
include one ".".
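To make the version-versus-decimal distinction concrete, here is a small base R sketch of the two readings of "1.200" and "1.30". The helper `version_fields()` is hypothetical and only illustrates the idea; it is not how mixedOrder() is implemented. The guarded call at the end shows the corresponding mixedOrder() arguments.

```r
# Hypothetical helper: read "1.200" as the integer fields 1 and 200
version_fields <- function(s) {
   as.integer(strsplit(s, ".", fixed = TRUE)[[1]])
}
version_fields("1.200")
version_fields("1.30")

# As version fields, 200 > 30, so "1.200" sorts after "1.30"
version_fields("1.200")[2] > version_fields("1.30")[2]

# As decimal numbers, 1.2 < 1.3, so "1.200" sorts before "1.30"
as.numeric("1.200") < as.numeric("1.30")

# Scientific notation parses as an ordinary numeric value
as.numeric("1.23e-2")

if (requireNamespace("jamba", quietly = TRUE)) {
   v <- c("v1.30", "v1.200")
   # Default: version-style ordering
   print(v[jamba::mixedOrder(v)])
   # keepDecimal=TRUE: decimal ordering
   print(v[jamba::mixedOrder(v, keepDecimal = TRUE)])
}
```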
This function is the core of a family of mixedSort functions:
- mixedSort()
Applies mixedOrder() to an input vector.
- mixedSorts()
Applies mixedOrder() to a list of vectors, returning the list where each vector is independently sorted.
- mixedSortDF()
Applies mixedOrder() to each column of a data.frame or comparable object, optionally specifying the order of columns used during the sort.
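A minimal sketch of how the three wrappers might be called, assuming jamba is installed and using default arguments only. The input values and the data.frame `df` are hypothetical, and no particular output is claimed here.

```r
# Hypothetical inputs
x <- c("miR-12", "miR-1", "miR-2")
x_list <- list(a = x, b = c("chr10", "chr2", "chr1"))
df <- data.frame(symbol = x, value = c(3, 1, 2), stringsAsFactors = FALSE)

if (requireNamespace("jamba", quietly = TRUE)) {
   # Vector in, sorted vector out
   print(jamba::mixedSort(x))
   # List in, each element sorted independently
   print(jamba::mixedSorts(x_list))
   # data.frame in, rows sorted using its columns (default column order)
   print(jamba::mixedSortDF(df))
}
```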
Extra thanks to Gregory R. Warnes for the gtools mixedorder()
that proved to be so useful it ultimately inspired this function.
See also
gtools::mixedorder(), gtools::mixedsort()
Other jam sort functions:
mixedSort(),
mixedSortDF(),
mixedSorts(),
mmixedOrder()
Examples
x <- c("miR-12","miR-1","miR-122","miR-1b", "miR-1a","miR-2");
mixedOrder(x);
#> [1] 2 5 4 6 1 3
x[mixedOrder(x)];
#> [1] "miR-1" "miR-1a" "miR-1b" "miR-2" "miR-12" "miR-122"
mixedSort(x);
#> [1] "miR-1" "miR-1a" "miR-1b" "miR-2" "miR-12" "miR-122"
order(x);
#> [1] 2 1 3 5 4 6
x[order(x)];
#> [1] "miR-1" "miR-12" "miR-122" "miR-1a" "miR-1b" "miR-2"
sort(x);
#> [1] "miR-1" "miR-12" "miR-122" "miR-1a" "miR-1b" "miR-2"
## Complex example including NA, blanks, and infinite "Inf"
x <- c("Inf",
"+Inf12",
NA,
"-Inf14",
"-",
"---",
"Jnf12",
"Hnf12",
"--",
"Information");
## By default, strings are sorted as-is, "Hnf" before "Inf" before "Jnf"
## blanks are first, NA values are last
x[mixedOrder(x)];
#> [1] "-" "--" "---" "Hnf12" "+Inf12"
#> [6] "-Inf14" "Inf" "Information" "Jnf12" NA
## blanks are last, but before NA values which are also last
x[mixedOrder(x, blanksFirst=FALSE)];
#> [1] "Hnf12" "+Inf12" "-Inf14" "Inf" "Information"
#> [6] "Jnf12" "-" "--" "---" NA
## Recognize infinite, but not the negative sign
## Now infinite values are at the end, ordered by the number that follows.
x[mixedOrder(x, blanksFirst=FALSE, keepInfinite=TRUE)]
#> [1] "Hnf12" "Information" "Jnf12" "+Inf12" "-Inf14"
#> [6] "Inf" "-" "--" "---" NA
## Now also recognize negative infinite values,
## which puts "-Inf14" at the very beginning.
x[mixedOrder(x, blanksFirst=FALSE, keepInfinite=TRUE, keepNegative=TRUE)]
#> [1] "-Inf14" "Hnf12" "Information" "Jnf12" "+Inf12"
#> [6] "Inf" "-" "--" "---" NA
# test factor level order
factor1 <- factor(c("Cnot9", "Cnot8", "Cnot10"))
sort(factor1)
#> [1] Cnot10 Cnot8 Cnot9
#> Levels: Cnot10 Cnot8 Cnot9
mixedSort(factor1)
#> [1] Cnot8 Cnot9 Cnot10
#> Levels: Cnot10 Cnot8 Cnot9
factor1[mixedOrder(factor1)]
#> [1] Cnot8 Cnot9 Cnot10
#> Levels: Cnot10 Cnot8 Cnot9
factor1[mixedOrder(factor1, honorFactor=TRUE)]
#> [1] Cnot10 Cnot8 Cnot9
#> Levels: Cnot10 Cnot8 Cnot9