Curate Ingenuity IPA colnames
curateIPAcolnames(
jDF,
ipaNameGrep = c("^Name$", "^ID$", "Canonical Pathways", "Upstream Regulator",
"Diseases or Functions Annotation", "Diseases . Functions", "My Lists",
"Ingenuity Toxicity Lists", "My Pathways"),
geneGrep = c("Molecules in Network", "Target molecules", "Molecules", "Symbol"),
geneCurateFrom = c(" [(](complex|includes others)[)]", "^[,]+|[,]+$"),
geneCurateTo = c("", ""),
convert_ipa_slash = TRUE,
ipa_slash_sep = ":",
verbose = TRUE,
...
)
jDF: data.frame from one Ingenuity IPA enrichment test.
ipaNameGrep: vector of regular expression patterns used to recognize the column containing the name of the enriched entity, for example the biological pathway, network, or disease category.
geneGrep: vector of regular expression patterns used to recognize the column containing genes, that is, the molecules tested for enrichment which were found in the enriched entity.
geneCurateFrom, geneCurateTo: vectors of patterns and replacements, respectively, used to curate values in the gene column. These replacement rules are used to ensure that genes are delimited consistently, with no leading or trailing delimiters.
verbose: logical indicating whether to print verbose output.
...: additional arguments are ignored.
This function curates the colnames observed in Ingenuity IPA enrichment data. IPA enrichment data includes multiple types of enrichment tests, each with slightly different column headers, and this function makes those colnames more consistent.
This function will rename the first recognized gene colname to "geneNames" for consistency with downstream analyses.
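The renaming step can be sketched in base R as follows. This is a minimal illustration, not the function's actual implementation: the example table is made up, and it assumes that "first recognized" means the first match when the geneGrep patterns are tried in order.

```r
# Hypothetical IPA-style table; the column names and values are illustrative only.
jDF <- data.frame(
   `Ingenuity Canonical Pathways` = c("Pathway A", "Pathway B"),
   `-log(p-value)` = c(3.5, 2.1),
   Molecules = c("ACTB,GAPDH", "TP53,MDM2"),
   check.names = FALSE,
   stringsAsFactors = FALSE)

# Default patterns from the usage section
geneGrep <- c("Molecules in Network", "Target molecules", "Molecules", "Symbol")

# Try each pattern in order and collect matching colnames
# (assumption: pattern order decides which colname is "first").
geneMatches <- unlist(lapply(geneGrep, function(p) {
   grep(p, colnames(jDF), value = TRUE)
}))
geneColname <- head(geneMatches, 1)

# Rename only when a gene colname was recognized
if (length(geneColname) == 1) {
   colnames(jDF)[colnames(jDF) == geneColname] <- "geneNames"
}
colnames(jDF)
```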
The values in the recognized gene colname are curated using geneCurateFrom and geneCurateTo for multiple pattern-replacement substitutions. This mechanism is used to ensure consistent delimiters and values used for each enrichment table.
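As a rough sketch of how paired pattern-replacement vectors behave, the default geneCurateFrom and geneCurateTo values can be applied one after another with base R gsub(). The gene strings below are made up for illustration, and the loop is an assumption about the mechanism rather than the function's own code.

```r
# Default curation rules from the usage section
geneCurateFrom <- c(" [(](complex|includes others)[)]", "^[,]+|[,]+$")
geneCurateTo <- c("", "")

# Hypothetical gene column values with IPA-style suffixes and stray commas
genes <- c("ACTB,Hsp90 (complex),GAPDH,",
   ",TP53,MIR21 (includes others)")

# Apply each pattern-replacement pair in order
for (i in seq_along(geneCurateFrom)) {
   genes <- gsub(geneCurateFrom[i], geneCurateTo[i], genes)
}
genes
```

The first rule drops the " (complex)" and " (includes others)" suffixes, and the second removes leading or trailing commas, so every value ends up as a plain comma-delimited gene list.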
Any colname matching "-log.*p.value" is considered -log10 P-value, and is converted to normal P-values for consistency with downstream analyses.
Any recognized P-value column is renamed to "P-value" for consistency with downstream analyses.
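The conversion and renaming can be illustrated with a small base R sketch. The table is hypothetical, and the sketch only handles the first colname matching "-log.*p.value". How the function recognizes other P-value columns is not shown here.

```r
# Hypothetical table with a -log10 P-value column; values are illustrative only.
jDF <- data.frame(
   Name = c("Pathway A", "Pathway B", "Pathway C"),
   `-log(p-value)` = c(3, 2, 1),
   check.names = FALSE,
   stringsAsFactors = FALSE)

# Recognize the -log10 P-value colname with the pattern described above
logColname <- head(grep("-log.*p.value", colnames(jDF), value = TRUE), 1)

if (length(logColname) == 1) {
   # Convert -log10(P) back to P, so that 3 becomes 0.001
   jDF[[logColname]] <- 10^(-jDF[[logColname]])
   # Rename for consistency with downstream analyses
   colnames(jDF)[colnames(jDF) == logColname] <- "P-value"
}
jDF
```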
When the recognized P-value column contains a range, for example "0.00017-0.0023", the lower P-value is chosen. In that case, the higher P-value is stored in a new column "max P-value". P-value ranges are reported in the disease category analysis by Ingenuity IPA, after collating individual pathways by disease category and storing the range of enrichment P-values.
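A minimal base R sketch of the range handling is shown below, using the example range from the text. It makes two assumptions that are not confirmed by this page: values are written in plain decimal notation, since a value such as "1E-5" would need a more careful split than a literal "-", and a row with a single P-value reuses that value in "max P-value".

```r
# Hypothetical P-value column: one range and one single value
pvalueRaw <- c("0.00017-0.0023", "0.004")

# Split each entry on the literal "-" delimiter and convert to numeric
# (assumption: plain decimal notation, no scientific notation)
pvalueParts <- lapply(strsplit(pvalueRaw, "-", fixed = TRUE), as.numeric)

# Keep the lower value as "P-value", and the higher value as "max P-value"
# (assumption: single values are reused for both columns)
pvalueDF <- data.frame(
   check.names = FALSE,
   `P-value` = sapply(pvalueParts, min),
   `max P-value` = sapply(pvalueParts, max))
pvalueDF
```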
Other jam import functions:
importIPAenrichment()