Convert contrast to short-form comp, convert comp to contrast
Source: R/jam_contrast2comp.R (contrast2comp.Rd)
Convert contrast to short-form comp, and convert comp back to original contrast.
Usage
contrast2comp(
  contrast_names,
  contrast_delim = "-",
  contrast_factor_delim = "_",
  comp_factor_delim = ":",
  add_attr = FALSE,
  abbreviate = FALSE,
  verbose = FALSE,
  ...
)

comp2contrast(
  comps,
  contrast_delim = "-",
  contrast_factor_delim = "_",
  comp_factor_delim = ":",
  factor_order = NULL,
  add_attr = FALSE,
  verbose = FALSE,
  ...
)

names_contrast2comp(
  contrast_names,
  contrast_delim = "-",
  contrast_factor_delim = "_",
  comp_factor_delim = ":",
  add_attr = FALSE,
  verbose = FALSE,
  ...
)

names_comp2contrast(
  comps,
  contrast_delim = "-",
  contrast_factor_delim = "_",
  comp_factor_delim = ":",
  factor_order = NULL,
  add_attr = FALSE,
  verbose = FALSE,
  ...
)
Arguments
- contrast_names
  character vector of statistical contrasts.
- contrast_delim
  character string delimiter between groups, typically "-" to indicate subtraction of group means.
- contrast_factor_delim
  character string delimiter between design factors in a contrast.
- comp_factor_delim
  character string delimiter between design factors in a comp.
- add_attr
  logical indicating whether to add attributes to the output, containing the input values provided.
- abbreviate
  logical indicating whether to abbreviate factors, by calling shortest_unique_abbreviation(). Note that this option prevents output from being reversible, since the abbreviated term will not match the original factor level.
- verbose
  logical indicating whether to print verbose output; for much more verbose output use verbose=2.
- ...
  additional arguments are ignored.
- factor_order
  integer, list of integer vectors, or NULL. When supplied as an integer vector, it is converted to a list and expanded to the length() of the input. The integer values are used by comp2contrast() to force the order of factor comparisons for two-way and higher-order contrasts.
Details
These functions are intended to reduce the number of characters
required to represent a statistical contrast.
contrast2comp() converts long to short form, and
comp2contrast() converts short to long form.
- "contrast": the fully-defined contrast
- "comp": equivalent abbreviated form, a short comparison
Note that one goal is to reduce characters in Excel worksheet names,
which are currently limited to 31 characters. Also note that the ":" delimiter
is not permitted in Excel sheet names, thus save_sestats() uses
semicolon ";". This limitation may warrant using a different default
delimiter between factors, such as comma ",", pipe "|",
or forward-slash "/".
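The sheet-name constraint above can be sketched with base R alone. This is a minimal illustration, not the logic used by save_sestats(): the comp values are copied from the Examples on this page, and the variable names sheet_names and fits_excel are assumptions introduced only for this sketch.

```r
# comps as produced by contrast2comp() in the Examples section
comps <- c(
   "CellA:Treated-Control",
   "CellA-CellB:Treated-Control",
   "CellA-CellB:Treated-Control:Mut-WT")

# ":" is not permitted in Excel sheet names, so substitute ";"
sheet_names <- gsub(":", ";", comps, fixed=TRUE)

# Excel sheet names are limited to 31 characters
fits_excel <- nchar(sheet_names) <= 31

data.frame(sheet_names,
   nchar=nchar(sheet_names),
   fits_excel)
```

The same substitution can be obtained directly by passing comp_factor_delim=";" to contrast2comp(), as shown in the Examples.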
Assumptions
The key assumption is that an experimental group name is a
character string composed of its factor levels, with a delimiter
between factors. For example:
- CellA_Treated is interpreted as "CellA" and "Treated"
- CellA_Control is interpreted as "CellA" and "Control"
A contrast therefore:
- CellA_Treated-CellA_Control can be re-written CellA:Treated-Control
Factors must be in identical order for all groups, and there
must be no empty factor levels.
- Do not use: "CellA_Treated_Time0", "CellA_Time0".
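The group-name assumption can be checked informally with base R. The following is a minimal sketch under the default "_" factor delimiter; it is not the validation performed by the package, and the variable names are illustrative only.

```r
# group names that follow the assumption: factor levels joined by "_"
groups <- c("CellA_Treated", "CellA_Control", "CellB_Treated")

# split each group name into its factor levels
levels_list <- strsplit(groups, "_", fixed=TRUE)

# every group should have the same number of factor levels
n_factors <- lengths(levels_list)
all(n_factors == n_factors[1])

# groups with a missing factor level break the assumption
bad_groups <- c("CellA_Treated_Time0", "CellA_Time0")
bad_n <- lengths(strsplit(bad_groups, "_", fixed=TRUE))
all(bad_n == bad_n[1])
```

The first check returns TRUE and the second returns FALSE, because "CellA_Time0" has two factor levels while "CellA_Treated_Time0" has three.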
Finally, the overall assumption is that contrasts are composed of reasonable comparisons between factor levels, with no more factors being compared than the depth of the contrast. For example, a one-way contrast can compare one factor, a two-way contrast can compare two factors, and so on. In most cases where the assumptions above are broken, the output should be the same as the input, with no change.
When using groups_to_sedesign(), the output contrasts should all
meet these requirements, therefore all contrasts can be
converted to "comp" form for plot labels, and converted
back to "contrasts" as needed.
Delimiters can be customized, however they must all be
single-character values, avoiding ()[] which are reserved.
For example, sometimes factors are separated by "." such as
in the contrast "A.B-C.B". In this case, use:
contrast2comp("A.B-C.B", contrast_factor_delim=".")
The corresponding conversion back to contrast would be:
comp2contrast("A-C:B", contrast_factor_delim=".")
Design goals for conversion to short form comp
1. "comp" should be interchangeable with "contrast"
   - use contrast2comp() and comp2contrast()
2. when a contrast cannot be abbreviated, comp will use contrast
   - see examples
   - when more factors are being compared than the contrast order, the function will leave the contrast as-is
   - Consider "CellA_Treated-CellB_Control". Both "CellA-CellB" and "Treated-Control" are compared in a one-way contrast, therefore it cannot be abbreviated.
3. "comp" shall not create any whitespace
   - factors will be delimited with ":"
   - factor levels will be delimited with "-"
   - other potential delimiters "*", "+" already have meaning in formula context.
4. "comp" shall not use parentheses "()", where possible
   - the goal is to reduce characters
   - parentheses are not necessary for balanced contrasts
   - unbalanced contrasts (see point 2) will retain the original syntax
5. the order of factors should be maintained in "comp"
   - goal is to reproduce the original correct group name in contrast form
   - the original group name is necessary for the design matrix
Worked examples
One-way contrast
- contrast: CellA_Treated-CellA_Control
- comment: CellA is unchanged, Treated-Control is changed
- comp: CellA:Treated-Control
Two-way contrast
- contrast: (CellA_Treated-CellA_Control)-(CellB_Treated-CellB_Control)
- comment: CellA-CellB is changed, Treated-Control is changed
- comp: CellA-CellB:Treated-Control
- note: when converting comp CellA-CellB:Treated-Control back to contrast, two forms are mathematically equivalent:
# form 1
(CellA_Treated-CellA_Control)-(CellB_Treated-CellB_Control)
# form 2
(CellA_Treated-CellB_Treated)-(CellA_Control-CellB_Control)
# both are equivalent
CellA_Treated - CellB_Treated - CellA_Control + CellB_Control
These two forms can be controlled in comp2contrast() with argument factor_order.
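The equivalence of the two forms can be confirmed numerically. The sketch below uses base R only; the group means are hypothetical values chosen for illustration, and eval_contrast() is a helper defined here, not a function from the package.

```r
# hypothetical group means, chosen only for illustration
group_means <- c(
   CellA_Treated=5,
   CellA_Control=2,
   CellB_Treated=4,
   CellB_Control=3)

# hypothetical helper: evaluate a contrast string as arithmetic
# on the named group means
eval_contrast <- function(contrast, means) {
   eval(parse(text=contrast), envir=as.list(means))
}

form1 <- "(CellA_Treated-CellA_Control)-(CellB_Treated-CellB_Control)"
form2 <- "(CellA_Treated-CellB_Treated)-(CellA_Control-CellB_Control)"

# both forms evaluate to the same value
eval_contrast(form1, group_means)
eval_contrast(form2, group_means)
```

With these values both forms evaluate to 2, which matches 5 - 4 - 2 + 3 from the expanded expression.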
Three-way contrast (it happens rarely, but does happen)
- contrast: ((CellA_Treated_Mut-CellA_Control_Mut)-(CellB_Treated_Mut-CellB_Control_Mut))-((CellA_Treated_WT-CellA_Control_WT)-(CellB_Treated_WT-CellB_Control_WT))
- comment: CellA-CellB, Treated-Control, Mut-WT are changed
- comp: CellA-CellB:Treated-Control:Mut-WT
One-way contrast with additional unchanged factors
- contrast: CellA_Treated_WT-CellA_Control_WT
- comment: CellA, WT are unchanged, Treated-Control is changed
- comp: CellA:Treated-Control:WT
Unbalanced one-way contrast
- contrast: CellA_Treated-CellB_Control
- comment: CellA-CellB and Treated-Control are changed
- comp: CellA_Treated-CellB_Control
Mis-directed two-way contrast
- contrast: (CellA_Treated-CellA_Control)-(CellB_Control-CellB_Treated)
- comment: CellA-CellB are changed, Treated-Control/Control-Treated are changed
- Note: The Treated-Control and Control-Treated do not agree in direction. The output is partially abbreviated, and maintains the original direction to prevent loss of information.
- comp: (CellA:Treated-Control)-(CellB:Control-Treated)
See also
Other jam experiment design: check_sedesign(), contrast_colors_by_group(), contrast_names_to_sedesign(), contrasts_to_factors(), contrasts_to_venn_setlists(), draw_oneway_contrast(), draw_twoway_contrast(), filter_contrast_names(), groups_to_sedesign(), plot_sedesign(), sedesign_to_factors(), validate_sedesign()
Examples
contrast_names <- c(
   "CellA_Treated-CellA_Control",
   "CellB_Treated-CellB_Control",
   "CellB_Treated-CellA_Control",
   "(CellA_Treated-CellA_Control)-(CellB_Treated-CellB_Control)",
   "(CellB_Treated-CellB_Control)-(CellA_Treated-CellA_Control)",
   "(CellA_Treated-CellB_Treated)-(CellA_Control-CellB_Control)"
);
contrast2comp(contrast_names)
#> [1] "CellA:Treated-Control" "CellB:Treated-Control"
#> [3] "CellB_Treated-CellA_Control" "CellA-CellB:Treated-Control"
#> [5] "CellB-CellA:Treated-Control" "CellA-CellB:Treated-Control"
contrast2comp(contrast_names, abbreviate=TRUE)
#> [1] "CellA:T-C" "CellB:T-C" "CellB_T-CellA_C" "CellA-CellB:T-C"
#> [5] "CellB-CellA:T-C" "CellA-CellB:T-C"
contrast2comp(contrast_names, comp_factor_delim=";")
#> [1] "CellA;Treated-Control" "CellB;Treated-Control"
#> [3] "CellB_Treated-CellA_Control" "CellA-CellB;Treated-Control"
#> [5] "CellB-CellA;Treated-Control" "CellA-CellB;Treated-Control"
comps <- contrast2comp(contrast_names)
data.frame(contrast_names,
   nchar_contrasts=nchar(contrast_names),
   comps,
   nchar_comps=nchar(comps))
#> contrast_names nchar_contrasts
#> 1 CellA_Treated-CellA_Control 27
#> 2 CellB_Treated-CellB_Control 27
#> 3 CellB_Treated-CellA_Control 27
#> 4 (CellA_Treated-CellA_Control)-(CellB_Treated-CellB_Control) 59
#> 5 (CellB_Treated-CellB_Control)-(CellA_Treated-CellA_Control) 59
#> 6 (CellA_Treated-CellB_Treated)-(CellA_Control-CellB_Control) 59
#> comps nchar_comps
#> 1 CellA:Treated-Control 21
#> 2 CellB:Treated-Control 21
#> 3 CellB_Treated-CellA_Control 27
#> 4 CellA-CellB:Treated-Control 27
#> 5 CellB-CellA:Treated-Control 27
#> 6 CellA-CellB:Treated-Control 27
# compare conversion back to contrast
data.frame(contrast_names,
   comps=comps,
   contrast_again=comp2contrast(comps),
   changed=contrast_names != comp2contrast(comps))
#> contrast_names
#> 1 CellA_Treated-CellA_Control
#> 2 CellB_Treated-CellB_Control
#> 3 CellB_Treated-CellA_Control
#> 4 (CellA_Treated-CellA_Control)-(CellB_Treated-CellB_Control)
#> 5 (CellB_Treated-CellB_Control)-(CellA_Treated-CellA_Control)
#> 6 (CellA_Treated-CellB_Treated)-(CellA_Control-CellB_Control)
#> comps
#> 1 CellA:Treated-Control
#> 2 CellB:Treated-Control
#> 3 CellB_Treated-CellA_Control
#> 4 CellA-CellB:Treated-Control
#> 5 CellB-CellA:Treated-Control
#> 6 CellA-CellB:Treated-Control
#> contrast_again changed
#> 1 CellA_Treated-CellA_Control FALSE
#> 2 CellB_Treated-CellB_Control FALSE
#> 3 CellB_Treated-CellA_Control FALSE
#> 4 (CellA_Treated-CellB_Treated)-(CellA_Control-CellB_Control) TRUE
#> 5 (CellB_Treated-CellA_Treated)-(CellB_Control-CellA_Control) TRUE
#> 6 (CellA_Treated-CellB_Treated)-(CellA_Control-CellB_Control) FALSE
# factors can be ordered by contrast
contrasts2 <- comp2contrast(comps,
   factor_order=list(1:2, 1:2, 1:2,
      2:1, 2:1, 1:2))
# compare conversion back to contrast
data.frame(contrast_names,
   comps=comps,
   contrasts2,
   changed=contrast_names != contrasts2)
#> contrast_names
#> 1 CellA_Treated-CellA_Control
#> 2 CellB_Treated-CellB_Control
#> 3 CellB_Treated-CellA_Control
#> 4 (CellA_Treated-CellA_Control)-(CellB_Treated-CellB_Control)
#> 5 (CellB_Treated-CellB_Control)-(CellA_Treated-CellA_Control)
#> 6 (CellA_Treated-CellB_Treated)-(CellA_Control-CellB_Control)
#> comps
#> 1 CellA:Treated-Control
#> 2 CellB:Treated-Control
#> 3 CellB_Treated-CellA_Control
#> 4 CellA-CellB:Treated-Control
#> 5 CellB-CellA:Treated-Control
#> 6 CellA-CellB:Treated-Control
#> contrasts2 changed
#> 1 CellA_Treated-CellA_Control FALSE
#> 2 CellB_Treated-CellB_Control FALSE
#> 3 CellB_Treated-CellA_Control FALSE
#> 4 (CellA_Treated-CellA_Control)-(CellB_Treated-CellB_Control) FALSE
#> 5 (CellB_Treated-CellB_Control)-(CellA_Treated-CellA_Control) FALSE
#> 6 (CellA_Treated-CellB_Treated)-(CellA_Control-CellB_Control) FALSE
# note change in direction for two-way contrasts
# Treated-Control and Control-Treated
contrast_diff <- "(CellA_Treated-CellA_Control)-(CellB_Control-CellB_Treated)";
comp_diff <- contrast2comp(contrast_diff)
# partially abbreviated comp
comp_diff
#> [1] "(CellA:Treated-Control)-(CellB:Control-Treated)"
# it is converted back to original form
comp2contrast(comp_diff)
#> [1] "(CellA_Treated-CellA_Control)-(CellB_Control-CellB_Treated)"
data.frame(contrast_diff,
   nchar_contrasts=nchar(contrast_diff),
   comp_diff,
   nchar_comps=nchar(comp_diff))
#> contrast_diff nchar_contrasts
#> 1 (CellA_Treated-CellA_Control)-(CellB_Control-CellB_Treated) 59
#> comp_diff nchar_comps
#> 1 (CellA:Treated-Control)-(CellB:Control-Treated) 47
# evaluate the rare three-way contrast
# build contrasts programmatically from the two-way contrast,
# appending the "_Mut" or "_WT" factor level to each group
contrast_names_3way <- c(
   contrast_names[4],
   gsub("([a-zA-Z])([-)])", "\\1_Mut\\2", contrast_names[4]),
   gsub("([a-zA-Z])([-)])", "\\1_WT\\2", contrast_names[4]),
   paste0("(",
      gsub("([a-zA-Z])([-)])", "\\1_Mut\\2", contrast_names[4]),
      ")-(",
      gsub("([a-zA-Z])([-)])", "\\1_WT\\2", contrast_names[4]),
      ")"))
# the same contrasts written out explicitly, plus an alternate ordering
# of the three-way contrast; this replaces the vector above
contrast_names_3way <- c(
   paste0("(CellA_Treated-CellA_Control)-",
      "(CellB_Treated-CellB_Control)"),
   paste0("(CellA_Treated_Mut-CellA_Control_Mut)-",
      "(CellB_Treated_Mut-CellB_Control_Mut)"),
   paste0("(CellA_Treated_WT-CellA_Control_WT)-",
      "(CellB_Treated_WT-CellB_Control_WT)"),
   paste0("((CellA_Treated_Mut-CellB_Treated_Mut)-",
      "(CellA_Control_Mut-CellB_Control_Mut))-",
      "((CellA_Treated_WT-CellB_Treated_WT)-",
      "(CellA_Control_WT-CellB_Control_WT))"),
   paste0("((CellA_Treated_Mut-CellA_Control_Mut)-",
      "(CellB_Treated_Mut-CellB_Control_Mut))-",
      "((CellA_Treated_WT-CellA_Control_WT)-",
      "(CellB_Treated_WT-CellB_Control_WT))"))
comp_3way <- contrast2comp(contrast_names_3way);
data.frame(contrast_names_3way,
   nchar_contrasts=nchar(contrast_names_3way),
   comp_3way,
   nchar_comps=nchar(comp_3way));
#> contrast_names_3way
#> 1 (CellA_Treated-CellA_Control)-(CellB_Treated-CellB_Control)
#> 2 (CellA_Treated_Mut-CellA_Control_Mut)-(CellB_Treated_Mut-CellB_Control_Mut)
#> 3 (CellA_Treated_WT-CellA_Control_WT)-(CellB_Treated_WT-CellB_Control_WT)
#> 4 ((CellA_Treated_Mut-CellB_Treated_Mut)-(CellA_Control_Mut-CellB_Control_Mut))-((CellA_Treated_WT-CellB_Treated_WT)-(CellA_Control_WT-CellB_Control_WT))
#> 5 ((CellA_Treated_Mut-CellA_Control_Mut)-(CellB_Treated_Mut-CellB_Control_Mut))-((CellA_Treated_WT-CellA_Control_WT)-(CellB_Treated_WT-CellB_Control_WT))
#> nchar_contrasts comp_3way nchar_comps
#> 1 59 CellA-CellB:Treated-Control 27
#> 2 75 CellA-CellB:Treated-Control:Mut 31
#> 3 71 CellA-CellB:Treated-Control:WT 30
#> 4 151 CellA-CellB:Treated-Control:Mut-WT 34
#> 5 151 CellA-CellB:Treated-Control:Mut-WT 34
# compare to input
contrasts2_3way <- comp2contrast(comp_3way);
# mathematically correct contrasts but in different order from input
data.frame(contrast_names_3way,
   contrasts2_3way,
   changed=contrast_names_3way != contrasts2_3way);
#> contrast_names_3way
#> 1 (CellA_Treated-CellA_Control)-(CellB_Treated-CellB_Control)
#> 2 (CellA_Treated_Mut-CellA_Control_Mut)-(CellB_Treated_Mut-CellB_Control_Mut)
#> 3 (CellA_Treated_WT-CellA_Control_WT)-(CellB_Treated_WT-CellB_Control_WT)
#> 4 ((CellA_Treated_Mut-CellB_Treated_Mut)-(CellA_Control_Mut-CellB_Control_Mut))-((CellA_Treated_WT-CellB_Treated_WT)-(CellA_Control_WT-CellB_Control_WT))
#> 5 ((CellA_Treated_Mut-CellA_Control_Mut)-(CellB_Treated_Mut-CellB_Control_Mut))-((CellA_Treated_WT-CellA_Control_WT)-(CellB_Treated_WT-CellB_Control_WT))
#> contrasts2_3way
#> 1 (CellA_Treated-CellB_Treated)-(CellA_Control-CellB_Control)
#> 2 (CellA_Treated_Mut-CellB_Treated_Mut)-(CellA_Control_Mut-CellB_Control_Mut)
#> 3 (CellA_Treated_WT-CellB_Treated_WT)-(CellA_Control_WT-CellB_Control_WT)
#> 4 ((CellA_Treated_Mut-CellB_Treated_Mut)-(CellA_Control_Mut-CellB_Control_Mut))-((CellA_Treated_WT-CellB_Treated_WT)-(CellA_Control_WT-CellB_Control_WT))
#> 5 ((CellA_Treated_Mut-CellB_Treated_Mut)-(CellA_Control_Mut-CellB_Control_Mut))-((CellA_Treated_WT-CellB_Treated_WT)-(CellA_Control_WT-CellB_Control_WT))
#> changed
#> 1 TRUE
#> 2 TRUE
#> 3 TRUE
#> 4 FALSE
#> 5 TRUE
# custom factor order produces the same contrasts as input
contrasts2_3way_v2 <- comp2contrast(comp_3way,
   factor_order=list(c(2,1,3), c(2,1,3), c(2,1,3),
      c(1,2,3), c(2,1,3)));
data.frame(contrast_names_3way,
   contrasts2_3way_v2,
   changed=contrast_names_3way != contrasts2_3way_v2);
#> contrast_names_3way
#> 1 (CellA_Treated-CellA_Control)-(CellB_Treated-CellB_Control)
#> 2 (CellA_Treated_Mut-CellA_Control_Mut)-(CellB_Treated_Mut-CellB_Control_Mut)
#> 3 (CellA_Treated_WT-CellA_Control_WT)-(CellB_Treated_WT-CellB_Control_WT)
#> 4 ((CellA_Treated_Mut-CellB_Treated_Mut)-(CellA_Control_Mut-CellB_Control_Mut))-((CellA_Treated_WT-CellB_Treated_WT)-(CellA_Control_WT-CellB_Control_WT))
#> 5 ((CellA_Treated_Mut-CellA_Control_Mut)-(CellB_Treated_Mut-CellB_Control_Mut))-((CellA_Treated_WT-CellA_Control_WT)-(CellB_Treated_WT-CellB_Control_WT))
#> contrasts2_3way_v2
#> 1 (CellA_Treated-CellA_Control)-(CellB_Treated-CellB_Control)
#> 2 (CellA_Treated_Mut-CellA_Control_Mut)-(CellB_Treated_Mut-CellB_Control_Mut)
#> 3 (CellA_Treated_WT-CellA_Control_WT)-(CellB_Treated_WT-CellB_Control_WT)
#> 4 ((CellA_Treated_Mut-CellB_Treated_Mut)-(CellA_Control_Mut-CellB_Control_Mut))-((CellA_Treated_WT-CellB_Treated_WT)-(CellA_Control_WT-CellB_Control_WT))
#> 5 ((CellA_Treated_Mut-CellA_Control_Mut)-(CellB_Treated_Mut-CellB_Control_Mut))-((CellA_Treated_WT-CellA_Control_WT)-(CellB_Treated_WT-CellB_Control_WT))
#> changed
#> 1 FALSE
#> 2 FALSE
#> 3 FALSE
#> 4 FALSE
#> 5 FALSE