Convert limma eBayes fit to data.frame with annotated hits
Source: R/jam_secontrasts.R
Usage
ebayes2dfs(
lmFit3 = NULL,
lmFit1 = NULL,
lmFit4 = NULL,
define_hits = TRUE,
adjp_cutoff = 0.05,
p_cutoff = NULL,
fold_cutoff = 1.5,
int_adjp_cutoff = adjp_cutoff,
int_p_cutoff = p_cutoff,
int_fold_cutoff = fold_cutoff,
mgm_cutoff = NULL,
ave_cutoff = NULL,
confint = FALSE,
use_cutoff_colnames = TRUE,
rename_headers = TRUE,
return_fold = TRUE,
merge_df = FALSE,
include_ave_expr = FALSE,
include_group_means = TRUE,
transform_means = c("none", "exp2signed", "10^"),
rowData_df = NULL,
collapse_by_gene = FALSE,
rename_contrasts = FALSE,
sep = " ",
int_grep = "[(].+-.+-.+[)]|-.+-",
trim_colnames = c("t", "B", "F", "sca.t"),
posthoc_test = c("none", "DEqMS"),
verbose = FALSE,
...
)
Arguments
- lmFit3
object returned by limma::eBayes().
- lmFit1
object returned by limma::lmFit(), optional.
- lmFit4
object returned by posthoc_test="DEqMS" in run_limma_replicate().
- define_hits
logical indicating whether to define hits using the statistical thresholds.
- adjp_cutoff, p_cutoff, fold_cutoff, mgm_cutoff, ave_cutoff
numeric values representing the appropriate statistical threshold, or NULL when a threshold should not be applied.
- int_adjp_cutoff, int_p_cutoff, int_fold_cutoff
numeric thresholds to apply only to interaction contrasts.
- confint
logical passed to limma::topTable(), which defines whether to return confidence intervals for each log2 fold change.
- use_cutoff_colnames
logical indicating whether to include the abbreviated statistical thresholds in the "hit" colname, when define_hits=TRUE.
- rename_headers
logical indicating whether to rename the statistical colnames returned by limma::topTable() so that they include the contrast name.
- return_fold
logical indicating whether to return an additional column with the signed fold change, see log2fold_to_fold().
- merge_df
logical indicating whether to merge the final data.frame list into one data.frame.
- include_ave_expr
logical indicating whether to retain the column "AveExpr". This column can be misleading, especially if the mgm (max group mean) threshold is used when determining statistical hits. This column is mainly useful in reviewing limma output, since it uses the "AveExpr" values to apply its moderated variance statistic.
- include_group_means
logical indicating whether to include each group mean along with the relevant contrast. These values are helpful, in that they should exactly represent the reported logFC value. Sometimes it is helpful and comforting to see the exact values used in that calculation.
- rowData_df
data.frame representing optional rowData annotation to be retained in the resulting stat data.frame. This argument is usually defined using rowData_colnames in se_contrast_stats(), which uses corresponding columns from rowData(se).
- collapse_by_gene
logical indicating whether to apply collapse_stats_by_gene, which chooses one "best" exemplar per gene when there are multiple rows that represent the same gene.
- rename_contrasts
logical (inactive) which will in future allow for automated renaming of contrasts.
- sep
character string used as a delimiter in certain output colnames.
- int_grep
character string used to recognize contrasts which are considered "interaction contrasts". The default pattern recognizes any contrasts that contain multiple fold changes, recognized by the presence of more than one hyphen "-" in the contrast name.
- verbose
logical indicating whether to print verbose output.
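The signed fold change returned when return_fold=TRUE can be illustrated with a small base R sketch. The helper name signed_fold() is hypothetical and is not the jamses implementation of log2fold_to_fold(); it assumes the common convention that a negative log2 fold change is reported as a negative normal-space fold change.

```r
# Hypothetical helper for illustration only; not the jamses
# implementation of log2fold_to_fold().
signed_fold <- function(log2fold) {
   # 2^|log2fold| gives the normal-space magnitude (always >= 1)
   fold <- 2^abs(log2fold)
   # restore the direction of change for negative log2 fold changes
   ifelse(log2fold < 0, -fold, fold)
}

logfc <- c(1, -1, 0, log2(1.5))
folds <- signed_fold(logfc)
print(folds)
```

Under this convention a log2 fold change of -1 is reported as -2, so a two-fold decrease reads the same way as a two-fold increase.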
Value
list with one data.frame per contrast defined in
the input lmFit3 object. When define_hits=TRUE there
will be one column per statistical threshold, named "hit"
followed by an abbreviation of the statistical thresholds
which were applied.
When merge_df=TRUE the returned data will be one
data.frame object.
Details
This function is called by run_limma_replicate() as
an extension to limma::topTable(), that differs in that
it is performed for each contrast in the input lmFit3 object.
By default the column names include the contrast name, so that each data.frame
is self-describing.
When define_hits=TRUE, then statistical thresholds are applied
to define a set of statistical hits. The thresholds available include:
- adjp_cutoff: applied to "adj.P.Val" for adjusted P-value.
- p_cutoff: applied to "P.Value" for raw, unadjusted P-value.
- fold_cutoff: normal space fold change, applied to "logFC" by using log2(fold_cutoff).
- mgm_cutoff: max group mean, applied to the highest group mean value involved in each specific contrast.
- ave_cutoff: applied to "AveExpr" which represents the mean value across all sample groups.
Note that mgm_cutoff requires input lmFit1 which stores the
group mean values used in the limma workflow.
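As a rough illustration of how adjp_cutoff and fold_cutoff combine, the following base R sketch applies both thresholds to a mock table with limma::topTable() style colnames. The data values, the inclusive comparisons (<= and >=), and the -1/0/1 encoding of the hit column are assumptions made for this example, and may differ from what ebayes2dfs() does internally.

```r
# Mock statistics with topTable()-style colnames; values are invented
stats_df <- data.frame(
   row.names = c("geneA", "geneB", "geneC", "geneD"),
   logFC = c(1.2, -0.9, 0.3, 2.0),
   P.Value = c(0.001, 0.002, 0.0005, 0.2),
   adj.P.Val = c(0.01, 0.02, 0.004, 0.4))

adjp_cutoff <- 0.05
fold_cutoff <- 1.5

# adjusted P-value threshold (inclusive comparison is an assumption)
meets_adjp <- stats_df$adj.P.Val <= adjp_cutoff
# normal-space fold threshold converted to log2 before comparing to logFC
meets_fold <- abs(stats_df$logFC) >= log2(fold_cutoff)

# assumed encoding: 1 for up, -1 for down, 0 for not a hit
stats_df$hit <- ifelse(meets_adjp & meets_fold, sign(stats_df$logFC), 0)
print(stats_df)
```

Here geneC passes the adjusted P-value threshold but not the fold threshold, and geneD passes the fold threshold but not the adjusted P-value threshold, so neither is marked as a hit.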
Note also that there are optional arguments specific to interaction
contrasts, which in this context are assumed to be a
"fold change of fold changes" style of contrast, for example:
(groupA-groupB)-(groupC-groupD). The purpose of distinct interaction
thresholds is to enable reasonable data mining, sometimes with
somewhat more lenient thresholds for interaction contrasts.
For example, one may use adjp_cutoff=0.01 and int_adjp_cutoff=0.05,
or fold_cutoff=2 and int_fold_cutoff=1.5.
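The default int_grep pattern can be tested directly with grepl() to see which contrast names would be treated as interaction contrasts. The sketch below uses the default pattern from the Usage section; the per-contrast threshold selection with ifelse() is only an assumption about how such a choice might be made, not the function's internal code.

```r
# Default pattern from the Usage section
int_grep <- "[(].+-.+-.+[)]|-.+-"

contrast_names <- c(
   "groupA-groupB",
   "(groupA-groupB)-(groupC-groupD)")

# TRUE when the contrast name contains more than one hyphen
is_interaction <- grepl(int_grep, contrast_names)

# Illustrative thresholds, matching the example in the text above
adjp_cutoff <- 0.01
int_adjp_cutoff <- 0.05

# choose the threshold for each contrast (assumed logic)
use_adjp <- ifelse(is_interaction, int_adjp_cutoff, adjp_cutoff)
names(use_adjp) <- contrast_names
print(use_adjp)
```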
By default, rename_headers=TRUE causes colnames to include the
contrast, for example renaming colname "logFC" to "logFC contrastA".
This change helps reinforce the source of the statistical results,
and allows the data.frame results to be merged together using
base::merge().
Indeed, merge_df=TRUE will cause all data.frame results to be
merged into one large data.frame, using jamba::mergeAllXY().
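The effect of renamed headers on merging can be sketched in base R. The example below builds two mock per-contrast tables, appends the contrast name to each statistical colname using sep=" ", and merges them on a shared key. The key column name "probes", the helper make_stats(), and the data values are hypothetical; base::merge() with Reduce() is used here in place of jamba::mergeAllXY().

```r
# Hypothetical helper that mimics rename_headers=TRUE on a mock table
make_stats <- function(contrast, logfc, adjp, sep = " ") {
   df <- data.frame(
      probes = c("p1", "p2", "p3"),
      logFC = logfc,
      adj.P.Val = adjp)
   # append the contrast name to every statistical column
   stat_cols <- setdiff(colnames(df), "probes")
   colnames(df)[match(stat_cols, colnames(df))] <- paste(stat_cols,
      contrast, sep = sep)
   df
}

df_list <- list(
   contrastA = make_stats("contrastA", c(1.1, -0.4, 0.2), c(0.01, 0.30, 0.60)),
   contrastB = make_stats("contrastB", c(0.3, -1.5, 0.1), c(0.40, 0.02, 0.90)))

# unique colnames per contrast allow a clean merge on the shared key
merged_df <- Reduce(function(x, y) merge(x, y, by = "probes", all = TRUE),
   df_list)
print(colnames(merged_df))
```

Without the renaming step, both tables would carry identical "logFC" and "adj.P.Val" colnames, and base::merge() would disambiguate them with ".x" and ".y" suffixes that no longer identify the contrast.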
See also
Other jamses stats:
format_hits(),
handle_na_values(),
hit_array_to_list(),
process_sestats_to_hitim(),
run_limma_replicate(),
save_sestats(),
se_contrast_stats(),
sestats_to_dfs(),
sestats_to_df(),
voom_jam()