R/jamba-mixedSort.R
mixedOrder.Rd
order alphanumeric values keeping numeric values in proper order
mixedOrder(
x,
...,
blanksFirst = TRUE,
na.last = NAlast,
keepNegative = FALSE,
keepInfinite = FALSE,
keepDecimal = FALSE,
ignore.case = TRUE,
useCaseTiebreak = TRUE,
honorFactor = FALSE,
returnDebug = FALSE,
returnType = c("order", "rank"),
NAlast = TRUE,
verbose = FALSE,
debug = FALSE
)
x: input vector
...: additional parameters are sent to mixedOrder().
blanksFirst: logical, whether to order blank entries before entries containing a value.
na.last: logical, whether to move NA entries to the end of the sort.
When na.last=TRUE, NA values will always be last, even following blanks and infinite values.
When na.last=FALSE, NA values will always be first, even before blanks and negative infinite values.
keepNegative: logical, whether to keep '-' associated with adjacent numeric values, in order to sort them as negative values.
Note that keepNegative=TRUE also forces keepDecimal=TRUE, and enables matching of scientific notation such as -1.23e-10 as a numeric value.
When keepNegative=FALSE, the dash "-" is treated as a common delimiter.
keepInfinite: logical, whether to allow "Inf" in the input x to be considered a numeric infinite value.
Note that "-Inf" is only treated as a negative infinite value when keepNegative=TRUE.
Also note that "Inf" is only recognized as infinite when it appears between non-character delimiters, and not as part of a larger character string like "Information".
Be careful with keepInfinite=TRUE when sorting gene symbols, because there are gene symbols like "Inf3" which should not be sorted as infinite.
Lastly, infinite values are sorted at the end, notably after all character values, which differs from some mixed sorting algorithms.
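The delimiter rule for "Inf" can be approximated with a base R regular expression. This is a minimal sketch and not the pattern used inside mixedOrder(): the helper name looksInfinite() is hypothetical, and it assumes that any non-letter character counts as a delimiter.

```r
# Hypothetical helper, not part of jamba: flags strings in which "Inf"
# stands alone between non-letter characters or string boundaries.
# Assumption: "non-character delimiter" means "any non-letter".
looksInfinite <- function(x) {
  grepl("(^|[^[:alpha:]])Inf([^[:alpha:]]|$)", x)
}

symbols <- c("Inf", "Information", "Inf3", "+Inf12", "Hnf12")
flagged <- looksInfinite(symbols)
names(flagged) <- symbols
flagged
```

Under this assumption "Inf3" is flagged while "Information" is not, which is the gene-symbol hazard described above.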
keepDecimal: logical, whether to keep the decimal in numbers, sorting as a true number and not as a version number.
By default keepDecimal=FALSE, which means "v1.200" will be ordered after "v1.30", since it considers "1.200" effectively as 1 and 200, and "1.30" effectively as 1 and 30.
When keepDecimal=TRUE, the numeric sort orders "v1.200" before "v1.30".
ignore.case: logical, whether to ignore uppercase and lowercase characters when defining the sort order.
useCaseTiebreak: logical, indicating whether to break ties when ignore.case=TRUE, using mixed case as a tiebreaker.
honorFactor: logical, indicating whether to honor the order of levels if the input x is a factor.
The default honorFactor=FALSE is to maintain consistent legacy behavior.
The purpose of this function is to enable alphanumeric sorting, which is not the purpose of sorting by factor levels.
returnDebug: logical, indicating whether to include additional debug info as attributes.
returnType: character string to define the return type, either "order" or "rank".
NAlast: logical, DEPRECATED in favor of na.last for consistency with other base R functions.
verbose: logical, whether to print verbose output.
debug: logical, indicating whether to return intermediate data useful only for debugging purposes.
integer vector of orders derived from x, or when returnType="rank" an integer vector of ranks allowing ties.
The rank is therefore valid for use in chains, such as multiple columns of a data.frame.
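As an illustration of chaining ranks across columns, the sketch below sorts a small data.frame by two columns using returnType="rank" with base order(). The data and column names are made up for the example, and the code only runs when the jamba package is installed.

```r
# Sketch: chain mixedOrder() ranks across two columns of a data.frame.
# The data.frame and its column names are hypothetical example data.
if (requireNamespace("jamba", quietly = TRUE)) {
  df <- data.frame(
    family = c("miR-12", "miR-2", "miR-12", "miR-2"),
    region = c("chr10", "chr9", "chr2", "chr10"),
    stringsAsFactors = FALSE
  )
  # ranks allow ties, so the second column can break ties in the first
  familyRank <- jamba::mixedOrder(df$family, returnType = "rank")
  regionRank <- jamba::mixedOrder(df$region, returnType = "rank")
  dfSorted <- df[order(familyRank, regionRank), , drop = FALSE]
  print(dfSorted)
}
```

For routine use, mixedSortDF() covers the same need for a whole data.frame without building the ranks by hand.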
This function is a refactor of gtools::mixedorder(), which was the source of inspiration for this function, thanks to Gregory R. Warnes!
This function was designed to improve the efficiency for large vectors,
and to handle special cases slightly differently. It was driven by the
need to sort gene symbols and miRNA symbols in numeric order, for example:
input: miR-12, miR-1, miR-122, miR-1b, miR-1a, miR-2
sort: miR-1, miR-12, miR-122, miR-1a, miR-1b, miR-2
gtools::mixedsort: miR-122, miR-12, miR-2, miR-1, miR-1a, miR-1b
mixedSort: miR-1, miR-1a, miR-1b, miR-2, miR-12, miR-122
By default, this function does not consider negative numbers as negative; instead it treats '-' as a delimiter, unless keepNegative=TRUE.
When keepNegative=TRUE this function also recognizes scientific notation, for example "1.23e-2" will be treated as numeric 0.0123.
Note that keepNegative=TRUE also forces keepDecimal=TRUE.
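The difference between treating '-' as a delimiter and treating it as a sign can be sketched with base R regular expressions. The patterns below are illustrative assumptions and are not the expressions used internally by mixedOrder().

```r
# Illustration only: two ways to tokenize the same string.
x <- "sample-1.23e-2"

# '-' as a delimiter: alternate runs of digits and non-digits
delimTokens <- regmatches(x, gregexpr("[0-9]+|[^0-9]+", x))[[1]]

# '-' as a sign: one signed number with optional decimal and exponent
signedPattern <- "-?[0-9]+(\\.[0-9]+)?([eE][-+]?[0-9]+)?"
signedToken <- regmatches(x, regexpr(signedPattern, x))
signedValue <- as.numeric(signedToken)

delimTokens
signedToken
signedValue
```

In the first case the digits 1, 23 and 2 are separate integer fields; in the second the whole token "-1.23e-2" is read as a single negative number.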
When keepDecimal=TRUE this function maintains numeric values that include one ".".
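The two interpretations of a decimal point can be compared in base R using the "v1.200" and "v1.30" values from the keepDecimal argument description. This is a sketch of the idea only and does not reproduce how mixedOrder() is implemented.

```r
# Illustration only: version-style versus decimal-style ordering.
versions <- c("v1.200", "v1.30")
digits <- sub("^v", "", versions)

# Version-style: split on "." and compare each field as an integer
fields <- do.call(rbind, lapply(strsplit(digits, ".", fixed = TRUE), as.integer))
versionOrder <- versions[order(fields[, 1], fields[, 2])]

# Decimal-style: read the whole token as one number (1.2 and 1.3)
decimalOrder <- versions[order(as.numeric(digits))]

versionOrder  # "v1.30"  "v1.200"
decimalOrder  # "v1.200" "v1.30"
```

The version-style result corresponds to the default keepDecimal=FALSE, and the decimal-style result corresponds to keepDecimal=TRUE.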
This function is the core of a family of mixedSort functions:
mixedSort(): applies mixedOrder() to an input vector.
mixedSorts(): applies mixedOrder() to a list of vectors, returning the list where each vector is independently sorted.
mixedSortDF(): applies mixedOrder() to each column of a data.frame or comparable object, optionally specifying the order of columns used during the sort.
Extra thanks to Gregory R. Warnes for the gtools::mixedorder()
that proved to be so useful it ultimately inspired this function.
gtools::mixedorder(), gtools::mixedsort()
Other jam sort functions: mixedSortDF(), mixedSorts(), mixedSort(), mmixedOrder()
Other jam string functions: asSize(), breaksByVector(), cPasteSU(), cPasteS(), cPasteUnique(), cPasteU(), cPaste(), fillBlanks(), formatInt(), gsubOrdered(), gsubs(), makeNames(), mixedSortDF(), mixedSorts(), mixedSort(), mmixedOrder(), nameVectorN(), nameVector(), padInteger(), padString(), pasteByRowOrdered(), pasteByRow(), sizeAsNum(), tcount(), ucfirst(), uniques()
x <- c("miR-12","miR-1","miR-122","miR-1b", "miR-1a","miR-2");
mixedOrder(x);
#> [1] 2 5 4 6 1 3
x[mixedOrder(x)];
#> [1] "miR-1" "miR-1a" "miR-1b" "miR-2" "miR-12" "miR-122"
mixedSort(x);
#> [1] "miR-1" "miR-1a" "miR-1b" "miR-2" "miR-12" "miR-122"
order(x);
#> [1] 2 1 3 5 4 6
x[order(x)];
#> [1] "miR-1" "miR-12" "miR-122" "miR-1a" "miR-1b" "miR-2"
sort(x);
#> [1] "miR-1" "miR-12" "miR-122" "miR-1a" "miR-1b" "miR-2"
## Complex example including NA, blanks, and infinite "Inf"
x <- c("Inf",
"+Inf12",
NA,
"-Inf14",
"-",
"---",
"Jnf12",
"Hnf12",
"--",
"Information");
## By default, strings are sorted as-is, "Hnf" before "Inf" before "Jnf"
## blanks are first, NA values are last
x[mixedOrder(x)];
#> [1] "-" "--" "---" "Hnf12" "+Inf12"
#> [6] "-Inf14" "Inf" "Information" "Jnf12" NA
## blanks are last, but before NA values which are also last
x[mixedOrder(x, blanksFirst=FALSE)];
#> [1] "Hnf12" "+Inf12" "-Inf14" "Inf" "Information"
#> [6] "Jnf12" "-" "--" "---" NA
## Recognize infinite, but not the negative sign
## Now infinite values are at the end, ordered by the number that follows.
x[mixedOrder(x, blanksFirst=FALSE, keepInfinite=TRUE)]
#> [1] "Hnf12" "Information" "Jnf12" "+Inf12" "-Inf14"
#> [6] "Inf" "-" "--" "---" NA
## Now also recognize negative infinite values,
## which puts "-Inf14" at the very beginning.
x[mixedOrder(x, blanksFirst=FALSE, keepInfinite=TRUE, keepNegative=TRUE)]
#> [1] "-Inf14" "Hnf12" "Information" "Jnf12" "+Inf12"
#> [6] "Inf" "-" "--" "---" NA
# test factor level order
factor1 <- factor(c("Cnot9", "Cnot8", "Cnot10"))
sort(factor1)
#> [1] Cnot10 Cnot8 Cnot9
#> Levels: Cnot10 Cnot8 Cnot9
mixedSort(factor1)
#> [1] Cnot8 Cnot9 Cnot10
#> Levels: Cnot10 Cnot8 Cnot9
factor1[mixedOrder(factor1)]
#> [1] Cnot8 Cnot9 Cnot10
#> Levels: Cnot10 Cnot8 Cnot9
factor1[mixedOrder(factor1, honorFactor=TRUE)]
#> [1] Cnot10 Cnot8 Cnot9
#> Levels: Cnot10 Cnot8 Cnot9