Define experimental contrasts from sample groups

groups2contrasts(
  iFactors,
  groupColumns = NULL,
  iSamples = NULL,
  iDesign = NULL,
  factorOrder = NULL,
  omitGrep = "[-,]",
  maxDepth = 2,
  currentDepth = 1,
  factorSep = "_",
  contrastSep = "-",
  renameFirstDepth = TRUE,
  returnDesign = FALSE,
  removePairs = NULL,
  makeUnique = TRUE,
  addContrastNamesDF = NULL,
  preControlTerms = NULL,
  verbose = FALSE,
  ...
)

Arguments

iFactors

vector of sample groups with one entry per sample, or data.frame whose columns are experimental factors and whose rows are samples.

groupColumns

character vector or NULL, to define an optional subset of colnames when iFactors is a data.frame.

iSamples

character vector or NULL, optionally used to subset the sample identifiers used in subsequent steps. Note that only groups and contrasts that contain samples will be defined.

iDesign

numeric design matrix or NULL, an alternative way to define the sample-to-group mapping.

factorOrder

integer vector, optionally used to define the order of factor contrasts when there are multiple experimental factors. It can be helpful to force a secondary factor to be compared before a primary factor, especially in two-way contrasts. Note that factorOrder refers to the columns (factors), not to the factor levels (column values).

omitGrep

character grep pattern used to exclude secondary factors from contrasts, mainly used internally by this function.

maxDepth

integer value, the maximum factor "depth" at which to define contrasts. For example, maxDepth=2 will define two-way contrasts, and maxDepth=1 will define only one-way contrasts.

currentDepth

integer value used internally by groups2contrasts() for iterative operations.

factorSep, contrastSep

character values used as delimiters in factor names and contrast names, respectively.

renameFirstDepth

logical used internally for iterative calls to groups2contrasts().

returnDesign

logical indicating whether to return the design matrix (iDesign) and contrast matrix (iContrasts) in addition to the contrastNames data.frame.

removePairs

list of vectors of factor levels that should not be compared, or NULL to include all comparisons. When a vector contains two values, contrasts that compare those two factor levels are removed. When a vector contains only one value, contrasts in which that factor level is unchanged are removed, which is relevant when there are two or more factors.

makeUnique

logical indicating whether to make output contrasts unique.

addContrastNamesDF

data.frame or NULL, optionally used to append to the calculated contrastNames data.frame, useful to add custom contrasts.

preControlTerms

character vector or NULL, optionally used to help define factor order, for example preControlTerms=c("WT") would help order "WT" before "KO" when defining control factor levels, so the resulting contrasts would become "KO-WT". This vector should contain the factor levels that should be used as the preferred control term in each contrast, where the earlier terms are preferred.

verbose

logical indicating whether to print verbose output.

...

additional arguments are ignored.

Value

list containing: iDesign, the numeric design matrix; iContrasts, the numeric contrast matrix; and contrastNames, a data.frame showing the full factor breakdown, including the column "contrastName" which contains a text contrast suitable for use in limma::makeContrasts(). When returnDesign=FALSE, only the contrastNames data.frame is returned.

Details

This function is intended to define statistical contrasts that compare one factor at a time. For two-factor designs, it will create two-way contrasts, defined as the contrast of pairwise contrasts.
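The phrase "contrast of pairwise contrasts" can be illustrated with plain arithmetic on contrast coefficients. The sketch below uses only base R and does not call groups2contrasts(); the helper pairwise() is hypothetical and exists only for this illustration. The group names are borrowed from the Examples.

```r
# Group names, assumed to match the columns of a design matrix.
groups <- c("WT_Control", "WT_Treated", "KO_Control", "KO_Treated")

# Hypothetical helper (not part of any package): coefficients for "a - b".
pairwise <- function(a, b) {
  coef <- setNames(numeric(length(groups)), groups)
  coef[a] <- 1
  coef[b] <- -1
  coef
}

# One-way contrasts: the treatment effect within each genotype.
ko_effect <- pairwise("KO_Treated", "KO_Control")
wt_effect <- pairwise("WT_Treated", "WT_Control")

# Two-way contrast: the difference between the two pairwise contrasts,
# written as (KO_Treated-KO_Control)-(WT_Treated-WT_Control).
two_way <- ko_effect - wt_effect
two_way
```

The resulting coefficients (1, -1, -1, 1) agree with the two-way column of iContrasts shown in the Examples.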

Input can be a character vector of group names, where by default each factor is separated by an underscore "_". An example might be:

iFactors <- c("Control_Wildtype", "Control_Knockout", "Treated_Wildtype", "Treated_Knockout")

In that case, there are two factors. The first factor contains factor levels c("Control", "Treated"), and the second factor contains factor levels c("Wildtype", "Knockout").

Input can also be a data.frame (or a compatible table-like object, including data.table and tibble). Each column is considered a factor. From the example above, we can create a data.frame using jamba::rbindList(); see the Examples for more detail.

jamba::rbindList(strsplit(iFactors, "_"))
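For readers without jamba installed, a rough base R equivalent of the split is sketched below. It is only an illustration of how a group vector breaks into factor columns, not the code used inside groups2contrasts(). The column names "Treatment" and "Genotype" are assumptions chosen for this example.

```r
iFactors <- c("Control_Wildtype", "Control_Knockout",
   "Treated_Wildtype", "Treated_Knockout")

# Split each group name at the factor separator "_".
parts <- strsplit(iFactors, "_", fixed=TRUE)

# Bind the pieces into one row per group, then convert to data.frame.
factor_df <- as.data.frame(do.call(rbind, parts), stringsAsFactors=FALSE)
colnames(factor_df) <- c("Treatment", "Genotype")  # assumed names
rownames(factor_df) <- iFactors

# Factor levels in order of first appearance.
lapply(factor_df, unique)
```

Each column of factor_df then holds the levels of one experimental factor, matching the two factors described above.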

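Lastly, if the input is a named vector, or a data.frame with rownames, the names or rownames appear to be used as sample identifiers; in the Examples, the names of iGroups become the rownames of iDesign.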

This function will change any "-" in a factor name to "." prior to detecting valid contrasts. Note that groups2contrasts() does not call base::make.names(), because that function converts characters to "." too aggressively. If data must comply with the rules used by base::make.names(), run that function prior to calling groups2contrasts().
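The difference between the two approaches can be seen with a small base R sketch. This mimics the documented behavior with gsub() and is not the internal code of groups2contrasts(); the group names below are made up for illustration.

```r
# Hypothetical group names containing "-" and "+".
groupNames <- c("WT_5-FU", "KO_Dex+", "WT_Control")

# Replace only "-" with ".", leaving every other character unchanged.
dashOnly <- gsub("-", ".", groupNames, fixed=TRUE)
dashOnly

# base::make.names() converts every character that is not valid
# in a syntactic R name, so "+" is also changed to ".".
strict <- make.names(groupNames)
strict
```

Here dashOnly keeps "KO_Dex+" intact while strict changes it to "KO_Dex.", which shows why the narrower substitution preserves more of the original names.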

Examples

# first define a vector of sample groups
iGroups <- jamba::nameVector(paste(rep(c("WT", "KO"), each=6),
   rep(c("Control", "Treated"), each=3),
   sep="_"));
iGroups <- factor(iGroups, levels=unique(iGroups));
iGroups;
#>  WT_Control_v1 WT_Control_v2 WT_Control_v3 WT_Treated_v1 WT_Treated_v2
#>     WT_Control    WT_Control    WT_Control    WT_Treated    WT_Treated
#>  WT_Treated_v3 KO_Control_v1 KO_Control_v2 KO_Control_v3 KO_Treated_v1
#>     WT_Treated    KO_Control    KO_Control    KO_Control    KO_Treated
#>  KO_Treated_v2 KO_Treated_v3
#>     KO_Treated    KO_Treated
#> Levels: WT_Control WT_Treated KO_Control KO_Treated
iDesignL <- groups2contrasts(iGroups, returnDesign=TRUE);
iDesignL$iDesign;
#>               WT_Control WT_Treated KO_Control KO_Treated
#> WT_Control_v1          1          0          0          0
#> WT_Control_v2          1          0          0          0
#> WT_Control_v3          1          0          0          0
#> WT_Treated_v1          0          1          0          0
#> WT_Treated_v2          0          1          0          0
#> WT_Treated_v3          0          1          0          0
#> KO_Control_v1          0          0          1          0
#> KO_Control_v2          0          0          1          0
#> KO_Control_v3          0          0          1          0
#> KO_Treated_v1          0          0          0          1
#> KO_Treated_v2          0          0          0          1
#> KO_Treated_v3          0          0          0          1
iDesignL$iContrasts;
#>             Contrasts
#> Levels       KO_Control-WT_Control KO_Treated-WT_Treated WT_Treated-WT_Control
#>   WT_Control                    -1                     0                    -1
#>   WT_Treated                     0                    -1                     1
#>   KO_Control                     1                     0                     0
#>   KO_Treated                     0                     1                     0
#>             Contrasts
#> Levels       KO_Treated-KO_Control
#>   WT_Control                     0
#>   WT_Treated                     0
#>   KO_Control                    -1
#>   KO_Treated                     1
#>             Contrasts
#> Levels       (KO_Treated-KO_Control)-(WT_Treated-WT_Control)
#>   WT_Control                                               1
#>   WT_Treated                                              -1
#>   KO_Control                                              -1
#>   KO_Treated                                               1
# now you can visualize the samples used in each contrast
iDesignL$iDesign %*% iDesignL$iContrasts;
#>                Contrasts
#>                 KO_Control-WT_Control KO_Treated-WT_Treated
#>   WT_Control_v1                    -1                     0
#>   WT_Control_v2                    -1                     0
#>   WT_Control_v3                    -1                     0
#>   WT_Treated_v1                     0                    -1
#>   WT_Treated_v2                     0                    -1
#>   WT_Treated_v3                     0                    -1
#>   KO_Control_v1                     1                     0
#>   KO_Control_v2                     1                     0
#>   KO_Control_v3                     1                     0
#>   KO_Treated_v1                     0                     1
#>   KO_Treated_v2                     0                     1
#>   KO_Treated_v3                     0                     1
#>                Contrasts
#>                 WT_Treated-WT_Control KO_Treated-KO_Control
#>   WT_Control_v1                    -1                     0
#>   WT_Control_v2                    -1                     0
#>   WT_Control_v3                    -1                     0
#>   WT_Treated_v1                     1                     0
#>   WT_Treated_v2                     1                     0
#>   WT_Treated_v3                     1                     0
#>   KO_Control_v1                     0                    -1
#>   KO_Control_v2                     0                    -1
#>   KO_Control_v3                     0                    -1
#>   KO_Treated_v1                     0                     1
#>   KO_Treated_v2                     0                     1
#>   KO_Treated_v3                     0                     1
#>                Contrasts
#>                 (KO_Treated-KO_Control)-(WT_Treated-WT_Control)
#>   WT_Control_v1                                               1
#>   WT_Control_v2                                               1
#>   WT_Control_v3                                               1
#>   WT_Treated_v1                                              -1
#>   WT_Treated_v2                                              -1
#>   WT_Treated_v3                                              -1
#>   KO_Control_v1                                              -1
#>   KO_Control_v2                                              -1
#>   KO_Control_v3                                              -1
#>   KO_Treated_v1                                               1
#>   KO_Treated_v2                                               1
#>   KO_Treated_v3                                               1
# you can adjust the order of factor levels per comparison
groups2contrasts(as.character(iGroups))$contrastName
#> [1] KO_Control-WT_Control
#> [2] KO_Treated-WT_Treated
#> [3] WT_Treated-WT_Control
#> [4] KO_Treated-KO_Control
#> [5] (KO_Treated-KO_Control)-(WT_Treated-WT_Control)
#> 6 Levels: KO_Control-WT_Control ... (KO_Treated-WT_Treated)-(KO_Control-WT_Control)
# make "WT" the first control term
groups2contrasts(as.character(iGroups), preControlTerms=c("WT"), factorOrder=2:1)$contrastName
#> [1] WT_Treated-WT_Control
#> [2] KO_Treated-KO_Control
#> [3] KO_Control-WT_Control
#> [4] KO_Treated-WT_Treated
#> [5] (KO_Treated-WT_Treated)-(KO_Control-WT_Control)
#> 6 Levels: WT_Treated-WT_Control ... (KO_Treated-KO_Control)-(WT_Treated-WT_Control)
# prevent comparisons of WT to WT, or KO to KO
groups2contrasts(as.character(iGroups), removePairs=list(c("WT"), c("KO")))
#>                                                 factor_v1       factor_v2
#> KO_Control-WT_Control                               KO,WT         Control
#> KO_Treated-WT_Treated                               KO,WT         Treated
#> (KO_Treated-WT_Treated)-(KO_Control-WT_Control)     KO,WT Treated,Control
#>                                                                                    contrastName
#> KO_Control-WT_Control                                                     KO_Control-WT_Control
#> KO_Treated-WT_Treated                                                     KO_Treated-WT_Treated
#> (KO_Treated-WT_Treated)-(KO_Control-WT_Control) (KO_Treated-WT_Treated)-(KO_Control-WT_Control)
#>                                                                            contrastString
#> KO_Control-WT_Control                                   factor_v1:KO,WT;factor_v2:Control
#> KO_Treated-WT_Treated                                   factor_v1:KO,WT;factor_v2:Treated
#> (KO_Treated-WT_Treated)-(KO_Control-WT_Control) factor_v1:KO,WT;factor_v2:Treated,Control
# input as a data.frame with ordered factor levels
iFactors <- data.frame(Genotype=factor(c("WT","WT","KO","KO"), levels=c("WT","KO")),
   Treatment=factor(c("Treated","Control"), levels=c("Control","Treated")))
iFactors;
#>   Genotype Treatment
#> 1       WT   Treated
#> 2       WT   Control
#> 3       KO   Treated
#> 4       KO   Control
groups2contrasts(iFactors)
#>                                                 Genotype       Treatment
#> KO_Control-WT_Control                              KO,WT         Control
#> KO_Treated-WT_Treated                              KO,WT         Treated
#> WT_Treated-WT_Control                                 WT Treated,Control
#> KO_Treated-KO_Control                                 KO Treated,Control
#> (KO_Treated-KO_Control)-(WT_Treated-WT_Control)    KO,WT Treated,Control
#>                                                                                    contrastName
#> KO_Control-WT_Control                                                     KO_Control-WT_Control
#> KO_Treated-WT_Treated                                                     KO_Treated-WT_Treated
#> WT_Treated-WT_Control                                                     WT_Treated-WT_Control
#> KO_Treated-KO_Control                                                     KO_Treated-KO_Control
#> (KO_Treated-KO_Control)-(WT_Treated-WT_Control) (KO_Treated-KO_Control)-(WT_Treated-WT_Control)
#>                                                                           contrastString
#> KO_Control-WT_Control                                   Genotype:KO,WT;Treatment:Control
#> KO_Treated-WT_Treated                                   Genotype:KO,WT;Treatment:Treated
#> WT_Treated-WT_Control                              Genotype:WT;Treatment:Treated,Control
#> KO_Treated-KO_Control                              Genotype:KO;Treatment:Treated,Control
#> (KO_Treated-KO_Control)-(WT_Treated-WT_Control) Genotype:KO,WT;Treatment:Treated,Control
# Again remove WT-WT and KO-KO contrasts
groups2contrasts(iFactors, removePairs=list(c("WT"), c("KO")))
#>                                                 Genotype       Treatment
#> KO_Control-WT_Control                              KO,WT         Control
#> KO_Treated-WT_Treated                              KO,WT         Treated
#> (KO_Treated-WT_Treated)-(KO_Control-WT_Control)    KO,WT Treated,Control
#>                                                                                    contrastName
#> KO_Control-WT_Control                                                     KO_Control-WT_Control
#> KO_Treated-WT_Treated                                                     KO_Treated-WT_Treated
#> (KO_Treated-WT_Treated)-(KO_Control-WT_Control) (KO_Treated-WT_Treated)-(KO_Control-WT_Control)
#>                                                                           contrastString
#> KO_Control-WT_Control                                   Genotype:KO,WT;Treatment:Control
#> KO_Treated-WT_Treated                                   Genotype:KO,WT;Treatment:Treated
#> (KO_Treated-WT_Treated)-(KO_Control-WT_Control) Genotype:KO,WT;Treatment:Treated,Control