Convert limma eBayes fit to data.frame with annotated hits
Source: R/jam_secontrasts.R
ebayes2dfs.Rd
Usage
ebayes2dfs(
  lmFit3 = NULL,
  lmFit1 = NULL,
  lmFit4 = NULL,
  define_hits = TRUE,
  adjp_cutoff = 0.05,
  p_cutoff = NULL,
  fold_cutoff = 1.5,
  int_adjp_cutoff = adjp_cutoff,
  int_p_cutoff = p_cutoff,
  int_fold_cutoff = fold_cutoff,
  mgm_cutoff = NULL,
  ave_cutoff = NULL,
  confint = FALSE,
  use_cutoff_colnames = TRUE,
  rename_headers = TRUE,
  return_fold = TRUE,
  merge_df = FALSE,
  include_ave_expr = FALSE,
  include_group_means = TRUE,
  transform_means = c("none", "exp2signed", "10^"),
  rowData_df = NULL,
  collapse_by_gene = FALSE,
  rename_contrasts = FALSE,
  sep = " ",
  int_grep = "[(].+-.+-.+[)]|-.+-",
  trim_colnames = c("t", "B", "F", "sca.t"),
  posthoc_test = c("none", "DEqMS"),
  verbose = FALSE,
  ...
)
Arguments
- lmFit3
object returned by limma::eBayes().
- lmFit1
object returned by limma::lmFit(), optional.
- lmFit4
object returned when posthoc_test="DEqMS" is used in run_limma_replicate().
- define_hits
logical indicating whether to define hits using the statistical thresholds.
- adjp_cutoff, p_cutoff, fold_cutoff, mgm_cutoff, ave_cutoff
numeric values representing the appropriate statistical threshold, or NULL when a threshold should not be applied.
- int_adjp_cutoff, int_p_cutoff, int_fold_cutoff
numeric thresholds to apply only to interaction contrasts.
- confint
logical passed to limma::topTable(), which defines whether to return confidence intervals for each log2 fold change.
- use_cutoff_colnames
logical indicating whether to include the abbreviated statistical thresholds in the "hit" colname, when define_hits=TRUE.
- rename_headers
logical indicating whether to rename the statistical colnames returned by limma::topTable() so that the colnames include the contrast name.
- return_fold
logical indicating whether to return an additional column with the signed fold change, see log2fold_to_fold().
- merge_df
logical indicating whether to merge the final data.frame list into one data.frame.
- include_ave_expr
logical indicating whether to retain the column "AveExpr". This column can be misleading, especially if the mgm (max group mean) threshold is used when determining statistical hits. This column is mainly useful in reviewing limma output, since limma uses the "AveExpr" values to apply its moderated variance statistic.
- include_group_means
logical indicating whether to include each group mean along with the relevant contrast. These values are helpful because they should exactly account for the reported logFC value. It can be reassuring to see the exact values used in that calculation.
- rowData_df
data.frame representing optional rowData annotation to be retained in the resulting stat data.frame. This argument is usually defined using rowData_colnames in se_contrast_stats(), which uses corresponding columns from rowData(se).
- collapse_by_gene
logical indicating whether to apply collapse_stats_by_gene, which chooses one "best" exemplar per gene when there are multiple rows that represent the same gene.
- rename_contrasts
logical, currently inactive; in future it will allow automated renaming of contrasts.
- sep
character string used as a delimiter in certain output colnames.
- int_grep
character string used to recognize contrasts which are considered "interaction contrasts". The default pattern recognizes any contrast that contains multiple fold changes, identified by the presence of more than one hyphen "-" in the contrast name.
- verbose
logical indicating whether to print verbose output.
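The return_fold argument refers to a signed fold change. As a minimal sketch of the usual convention, and not the actual code of log2fold_to_fold(), a log2 fold change can be converted so that negative values become negative normal-space fold changes. The helper name signed_fold_sketch is hypothetical, and the package function may handle edge cases differently.

```r
# Hypothetical helper: convert log2 fold change to signed normal-space fold.
# Assumption: positive log2 values map to 2^x, negative values to -(2^abs(x)).
signed_fold_sketch <- function(x) {
  ifelse(x >= 0, 2^x, -2^(-x))
}

log2fc <- c(1, -1, 0, -2)
signed_fold <- signed_fold_sketch(log2fc)
print(data.frame(log2fc, signed_fold))
```

Under this convention a log2 fold change of -1 is reported as -2 rather than 0.5, which keeps up and down changes symmetric.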
Value
list with one data.frame per contrast defined in the input lmFit3 object. When define_hits=TRUE there will be one column per statistical threshold, named "hit" followed by an abbreviation of the statistical thresholds which were applied.
When merge_df=TRUE the returned data will be one data.frame object.
Details
This function is called by run_limma_replicate() as an extension to limma::topTable(). It differs in that it is applied to each contrast in the input lmFit3 object.
By default the colnames include the contrast name, so that each data.frame is self-describing.
When define_hits=TRUE, statistical thresholds are applied to define a set of statistical hits. The thresholds available include:
- adjp_cutoff: applied to "adj.P.Val" for adjusted P-value.
- p_cutoff: applied to "P.Value" for raw, unadjusted P-value.
- fold_cutoff: normal space fold change, applied to "logFC" by using log2(fold_cutoff).
- mgm_cutoff: max group mean, applied to the highest group mean value involved in each specific contrast.
- ave_cutoff: applied to "AveExpr" which represents the mean value across all sample groups.
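The thresholds above can be illustrated with a small base R sketch. It is not the internal code of ebayes2dfs(): the toy values are made up, the inclusive comparisons (>= and <=) are an assumption, and so is the encoding of hits as 1 (up), -1 (down) and 0 (not a hit).

```r
# Toy topTable-like results for one contrast (illustrative values only)
stats_df <- data.frame(
  row.names = c("gene1", "gene2", "gene3", "gene4"),
  logFC = c(1.2, -0.9, 0.3, -2.0),
  P.Value = c(0.001, 0.004, 0.2, 0.03),
  adj.P.Val = c(0.01, 0.03, 0.4, 0.08))

adjp_cutoff <- 0.05
fold_cutoff <- 1.5

# fold_cutoff is in normal space, so logFC is compared to log2(fold_cutoff)
passes_fold <- abs(stats_df$logFC) >= log2(fold_cutoff)
passes_adjp <- stats_df$adj.P.Val <= adjp_cutoff

# Assumption: directional hit flag, 1 = up, -1 = down, 0 = not a hit
stats_df$hit <- sign(stats_df$logFC) * (passes_fold & passes_adjp)
print(stats_df)
```

Here gene4 has a large fold change but fails the adjusted P-value threshold, so it is not flagged; a row must pass every active threshold to count as a hit.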
Note that mgm_cutoff requires input lmFit1 which stores the group mean values used in the limma workflow.
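To make the max group mean idea concrete, here is a minimal sketch using a hypothetical matrix of group means. How ebayes2dfs() extracts group means from lmFit1 is not shown here; the matrix, the group names and the simple strsplit() parsing of a two-group contrast are all assumptions for illustration.

```r
# Hypothetical group means on log2 scale (rows = genes, columns = groups)
group_means <- matrix(
  c(8.1, 7.9, 3.2,   # groupA
    8.4, 6.5, 3.0,   # groupB
    2.1, 9.0, 3.5),  # groupC
  nrow = 3,
  dimnames = list(
    c("gene1", "gene2", "gene3"),
    c("groupA", "groupB", "groupC")))

contrast <- "groupA-groupB"
mgm_cutoff <- 5

# Assumption: a simple two-group contrast can be split on the hyphen
contrast_groups <- strsplit(contrast, "-", fixed = TRUE)[[1]]

# Max group mean uses only the groups involved in this contrast
mgm <- apply(group_means[, contrast_groups, drop = FALSE], 1, max)
passes_mgm <- mgm >= mgm_cutoff
print(data.frame(mgm, passes_mgm))
```

Note that gene2 has its highest mean in groupC, but that group is not part of the contrast, so only groupA and groupB contribute to its max group mean.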
Note also there are optional arguments specific to interaction contrasts, which in this context are assumed to be a "fold change of fold changes" style of contrast, for example: (groupA-groupB)-(groupC-groupD). The purpose of distinct interaction thresholds is to enable reasonable data mining, sometimes with somewhat more lenient thresholds for interaction contrasts. For example, one may use adjp_cutoff=0.01 and int_adjp_cutoff=0.05, or fold_cutoff=2 and int_fold_cutoff=1.5.
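As a rough illustration of how interaction contrasts can be recognized, the default int_grep pattern from the Usage section can be tested with base R grepl(). The contrast names below are examples, and the step that picks a per-contrast threshold is a sketch of the idea rather than the function's internal logic.

```r
# Default pattern from the Usage section
int_grep <- "[(].+-.+-.+[)]|-.+-"

# Example contrast names (illustrative)
contrast_names <- c(
  "groupA-groupB",
  "groupC-groupD",
  "(groupA-groupB)-(groupC-groupD)")

# A contrast with more than one hyphen is treated as an interaction contrast
is_interaction <- grepl(int_grep, contrast_names)

# Sketch: choose the more lenient threshold for interaction contrasts
adjp_cutoff <- 0.01
int_adjp_cutoff <- 0.05
use_adjp_cutoff <- ifelse(is_interaction, int_adjp_cutoff, adjp_cutoff)

print(data.frame(contrast_names, is_interaction, use_adjp_cutoff))
```

Group names that themselves contain hyphens would also match this pattern, so a custom int_grep may be needed in that situation.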
By default, rename_headers=TRUE causes colnames to include the contrast, for example renaming colname "logFC" to "logFC contrastA". This change helps reinforce the source of the statistical results, and allows the data.frame results to be merged together using base::merge().
Indeed, merge_df=TRUE will cause all data.frame results to be merged into one large data.frame, using jamba::mergeAllXY().
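The benefit of contrast-specific colnames can be sketched with base R alone. This example uses Reduce() with base::merge() as a stand-in for jamba::mergeAllXY(), and the "gene" key column, the helper add_contrast_sketch() and the toy values are hypothetical; the actual key columns in the output may differ.

```r
# Two toy per-contrast result tables (illustrative values only)
df_a <- data.frame(
  gene = c("g1", "g2", "g3"),
  logFC = c(1.2, -0.9, 0.3),
  adj.P.Val = c(0.01, 0.03, 0.4))
df_b <- data.frame(
  gene = c("g1", "g2", "g3"),
  logFC = c(0.1, 1.8, -1.1),
  adj.P.Val = c(0.7, 0.002, 0.04))

# Hypothetical helper: append the contrast name to each statistical colname
add_contrast_sketch <- function(df, contrast, sep = " ", key = "gene") {
  is_stat <- !colnames(df) %in% key
  colnames(df)[is_stat] <- paste(colnames(df)[is_stat], contrast, sep = sep)
  df
}

dfs <- list(
  contrastA = add_contrast_sketch(df_a, "contrastA"),
  contrastB = add_contrast_sketch(df_b, "contrastB"))

# Stand-in for jamba::mergeAllXY(): merge all tables on the shared key
merged_df <- Reduce(function(x, y) merge(x, y, by = "gene", all = TRUE), dfs)
print(colnames(merged_df))
```

Without the renaming step, both tables would carry identical "logFC" and "adj.P.Val" colnames, and merge() would have to disambiguate them with generic suffixes.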
See also
Other jamses stats: format_hits(), handle_na_values(), hit_array_to_list(), process_sestats_to_hitim(), run_limma_replicate(), save_sestats(), se_contrast_stats(), sestats_to_dfs(), sestats_to_df(), voom_jam()