Import Ingenuity IPA data

This document describes steps recommended for using Ingenuity Pathway Analysis (IPA) enrichment data.

Ingenuity IPA enrichment results should be exported from the IPA app:

  • Open an IPA pathway analysis result.

  • Click "Export All" at the top-right of the menu bar.

  • Choose either “Text .txt” or “Excel”.

    • The Excel file must be in .xlsx format.
  • Save each enrichment result to a separate file.

This workflow demonstrates the import process using two IPA enrichment files used by Reese et al. 2019 (https://doi.org/10.1016/j.jaci.2018.11.043) to compare enrichment results in newborns with those in older children.

Import IPA data

To import an IPA text file, use importIPAenrichment(). It works the same when importing Excel .xlsx format.

newborn_txt <- system.file("extdata",
   "Newborns-IPA.txt",
   package="multienrichjam");
newborn_dfl <- importIPAenrichment(newborn_txt);

The result is a list, named by the IPA analysis. Each element contains one data.frame with analysis results. Shown below is a summary of results, with number of rows and columns, created with jamba::sdim().

sdim(newborn_dfl);
#>                            rows cols      class
#> Canonical Pathways          113    8 data.frame
#> Upstream Regulators         117    7 data.frame
#> Diseases and Bio Functions  444    8 data.frame
#> Tox Functions                15    8 data.frame
#> Networks                      8    8 data.frame
#> Tox Lists                    19    7 data.frame
#> Analysis Ready Molecules     41    3 data.frame
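For readers who want to see what this kind of summary reports, a rough base R equivalent can be sketched as follows. The list `example_dfl` is a made-up stand-in that is much smaller than real IPA output, and the sketch is not how jamba::sdim() is implemented.

```r
# Hypothetical stand-in for an imported IPA result:
# a named list of data.frames
example_dfl <- list(
   "Canonical Pathways"=data.frame(
      Name=c("a", "b", "c"),
      pvalue=c(0.01, 0.2, 0.5)),
   "Analysis Ready Molecules"=data.frame(
      ID=c("x", "y"),
      Symbol=c("X", "Y"),
      Note=c("", "")))

# One row per list element, with its dimensions and first class
dim_summary <- data.frame(
   rows=vapply(example_dfl, nrow, integer(1)),
   cols=vapply(example_dfl, ncol, integer(1)),
   class=vapply(example_dfl, function(i) class(i)[1], character(1)))
print(dim_summary)
```

The row names of `dim_summary` come from the names of the list, which is why naming each imported analysis makes the summary easier to read.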

In multienrichjam, you may want to analyze multiple IPA analyses. The example below uses lapply() to import multiple IPA files.

newborn_txt <- system.file("extdata",
   "Newborns-IPA.txt",
   package="multienrichjam");
olderchildren_txt <- system.file("extdata",
   "OlderChildren-IPA.txt",
   package="multienrichjam");

ipa_files <- c(Newborns=newborn_txt,
   OlderChildren=olderchildren_txt)

ipa_l <- lapply(ipa_files, importIPAenrichment);

A summary of the list of lists is shown below, using jamba::ssdim():

ssdim(ipa_l);
#> $Newborns
#>                            rows cols      class
#> Canonical Pathways          113    8 data.frame
#> Upstream Regulators         117    7 data.frame
#> Diseases and Bio Functions  444    8 data.frame
#> Tox Functions                15    8 data.frame
#> Networks                      8    8 data.frame
#> Tox Lists                    19    7 data.frame
#> Analysis Ready Molecules     41    3 data.frame
#> 
#> $OlderChildren
#>                            rows cols      class
#> Canonical Pathways          237    8 data.frame
#> Upstream Regulators         338    8 data.frame
#> Diseases and Bio Functions  500    8 data.frame
#> Tox Functions               118    8 data.frame
#> Networks                     10    8 data.frame
#> Tox Lists                    36    7 data.frame
#> Analysis Ready Molecules    162    3 data.frame

Analyze IPA enrichments from one enrichment test

IPA performs multiple types of analysis, and we recommend using one type for multienrichjam, starting with “Canonical Pathways”.

Other data available for use:

  • “Canonical Pathways”: IPA curated pathways (most common).
  • “Upstream Regulators”: IPA curated regulators that are predicted to have ‘upstream’ effects in cell signaling.
  • “Diseases and Bio Functions”: IPA curated disease-associated pathways, which include category and sub-category annotations.
  • “Tox Functions”: IPA curated toxicity-associated pathways, which also include category and sub-category annotations.

“Analysis Ready Molecules” is a data.frame that contains the IPA gene cross-reference. It records the identifier you supplied for each gene alongside the gene that IPA recognized for its analysis.

  • The default revert_ipa_xref=TRUE will convert IPA gene symbol to your gene symbol as provided to IPA.
  • If you provided microarray or platform identifiers, such as Affymetrix '1007_s_at' or Agilent 'ID A_14_P109686', you may try revert_ipa_xref=FALSE, which will retain the IPA gene symbol.
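Conceptually, reverting the cross-reference is a lookup from the IPA symbol back to the identifier that was supplied. The sketch below illustrates the idea in base R. The table `xref`, its column names `user_id` and `ipa_symbol`, and the gene names are hypothetical, and this is not the code used by importIPAenrichment().

```r
# Hypothetical cross-reference table; real IPA column names may differ
xref <- data.frame(
   user_id=c("probe_001", "OLDSYM1", "TAP2"),
   ipa_symbol=c("GENE1", "NEWSYM1", "TAP2"),
   stringsAsFactors=FALSE)

# Genes reported by IPA for one pathway, as a comma-delimited string
ipa_genes <- "GENE1,NEWSYM1,TAP2"

# Lookup vector: names are IPA symbols, values are the supplied identifiers
ipa_to_user <- setNames(xref$user_id, xref$ipa_symbol)

ipa_split <- strsplit(ipa_genes, ",")[[1]]
user_genes <- unname(ipa_to_user[ipa_split])

# Keep the IPA symbol for anything missing from the cross-reference
is_missing <- is.na(user_genes)
user_genes[is_missing] <- ipa_split[is_missing]

paste(user_genes, collapse=",")
```

When the supplied identifiers are platform probe IDs, a lookup like this would replace readable gene symbols with probe IDs, which is the situation where keeping the IPA symbol may be preferable.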

Extract ‘Canonical Pathways’ from each IPA result:

## Take only the Ingenuity Canonical Pathways
enrichList_canonical <- lapply(ipa_l, function(i){
   i[["Canonical Pathways"]];
});
sdim(enrichList_canonical);
#>               rows cols      class
#> Newborns       113    8 data.frame
#> OlderChildren  237    8 data.frame

Convert to enrichResult (optional)

Each data.frame can be converted to enrichResult. This step is not strictly necessary, but it may be useful with functions related to clusterProfiler, for example ggtangle::cnetplot().

It may also be useful as a way to review the conversion.

## Convert data.frame to enrichResult
## multienrichjam::enrichDF2enrichResult
er_canonical <- lapply(enrichList_canonical, function(i){
   enrichDF2enrichResult(i,
      keyColname="Name",
      pvalueColname="P-value",
      geneColname="geneNames",
      geneRatioColname="Ratio",
      pvalueCutoff=1)
});
sdim(er_canonical);
#>               rows cols        class
#> Newborns       113   12 enrichResult
#> OlderChildren  237   12 enrichResult
kable_coloring(
   head(as.data.frame(er_canonical[[1]])),
   caption="First 6 rows of enrichment data",
   row.names=FALSE) %>%
   kableExtra::column_spec(column=seq_len(ncol(er_canonical[[1]])),
      border_left="1px solid #DDDDDD",
      extra_css="white-space: nowrap;")
First 6 rows of enrichment data
ID | Ingenuity Canonical Pathways | -log(p-value) | zScore | GeneRatio | geneID | pvalue | geneNames.ipa | Description | p.adjust | Count | setSize
Role of Macrophages, Fibroblasts and Endothelial Cells in Rheumatoid Arthritis | Role of Macrophages, Fibroblasts and Endothelial Cells in Rheumatoid Arthritis | 0.405 | NaN | 0.00321 | TNFSF13B | 0.3935501 | TNFSF13B | Role of Macrophages, Fibroblasts and Endothelial Cells in Rheumatoid Arthritis | 0.3935501 | 1 | 312
Neuroinflammation Signaling Pathway | Neuroinflammation Signaling Pathway | 0.406 | NaN | 0.00322 | CASP8 | 0.3926449 | CASP8 | Neuroinflammation Signaling Pathway | 0.3926449 | 1 | 311
Sirtuin Signaling Pathway | Sirtuin Signaling Pathway | 0.428 | NaN | 0.00344 | HIST1H1D | 0.3732502 | HIST1H1D | Sirtuin Signaling Pathway | 0.3732502 | 1 | 291
G-Protein Coupled Receptor Signaling | G-Protein Coupled Receptor Signaling | 0.447 | NaN | 0.00362 | PRKAR2B | 0.3572728 | PRKAR2B | G-Protein Coupled Receptor Signaling | 0.3572728 | 1 | 276
Protein Ubiquitination Pathway | Protein Ubiquitination Pathway | 0.461 | NaN | 0.00377 | TAP2 | 0.3459394 | TAP2 | Protein Ubiquitination Pathway | 0.3459394 | 1 | 265
Signaling by Rho Family GTPases | Signaling by Rho Family GTPases | 0.478 | NaN | 0.00397 | RDX | 0.3326596 | RDX | Signaling by Rho Family GTPases | 0.3326596 | 1 | 252
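The numeric columns in the table above appear to be related by simple arithmetic. The sketch below checks this for the first row only. It is an observation from the displayed values, not a statement about how enrichDF2enrichResult() derives each column.

```r
# Values copied from the first row of the table above
neg_log10_p <- 0.405
count <- 1
set_size <- 312

# The pvalue column is consistent with 10^-x of "-log(p-value)"
pvalue <- 10^(-neg_log10_p)
round(pvalue, 7)

# GeneRatio is consistent with Count / setSize,
# rounded to 3 significant digits
gene_ratio <- signif(count / set_size, 3)
gene_ratio
```

The same check can be repeated for the other rows by substituting their values.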

Run multiEnrichMap()

Given the list of enrichment results, we can run multiEnrichMap():

mem_canonical <- multiEnrichMap(er_canonical,
   enrichBaseline=1,
   p_cutoff=0.05,
   topEnrichN=10)

Output is a list containing summary results.

kable_coloring(
   sdim(mem_canonical),
   caption="sdim(mem_canonical)") %>%
   kableExtra::column_spec(column=seq_len(4),
      border_left="1px solid #DDDDDD",
      extra_css="white-space: nowrap;")
sdim(mem_canonical)
                   rows  cols  class         class_v2
enrichList            2        list          NA
enrichLabels          2        character     NA
colorV                2        character     NA
geneHitList           2        list          NA
geneHitIM            68     2  matrix        array
memIM                22    11  matrix        array
geneIM               22     2  matrix        array
enrichIM             11     2  matrix        array
multiEnrichDF        11    11  data.frame    NA
multiEnrichResult    11    13  enrichResult  NA
thresholds            5        list          NA
headers               9        list          NA
enrichIMcolors       11     2  matrix        array
enrichIMdirection    11     2  matrix        array
enrichIMgeneCount    11     2  matrix        array
geneIMcolors         22     2  matrix        array
geneIMdirection      22     2  matrix        array

Mem Plot Folio

mem_plot_folio() represents a key step in the analysis workflow. Several downstream results depend directly on the options chosen here:

Pathway clusters are defined by analyst parameters:

  • The number of pathway clusters.
  • The relative weight of the gene-pathway incidence matrix.
  • The method used for clustering.

Mem Plot Folio then provides a series of visualizations:

  1. Enrichment P-value heatmap, often drawn as a dot plot

  2. Gene-pathway heatmap, clustered by column and by row

  3. Cnet cluster plots

    1. The first plot labels pathway clusters by LETTERS (“A”, “B”, “C”, “D”, etc.)
    2. The second plot labels clusters by the top n pathway names
    3. The third plot is the same as the second, but hides the gene labels.
  4. Cnet exemplar plots

    • Includes 1 exemplar pathway per cluster.
    • Includes 2 exemplars per cluster.
    • Includes 3 exemplars per cluster.
  5. Cnet per cluster

    • One plot for each pathway cluster

Only the first four plots are shown by using do_which=c(1:4).

mem_canonical_plots <- mem_plot_folio(mem_canonical,
   pathway_column_split=4,
   column_cex=0.7,
   node_factor=1,
   use_shadowText=TRUE,
   label_factor=1.2,
   do_which=c(1:4),
   verbose=TRUE,
   main="Canonical Pathways");
#> ##  (13:52:35) 04Nov2025:   mem_plot_folio(): Gene-pathway heatmap (pre-emptive) 
#> ##  (13:52:35) 04Nov2025:   mem_plot_folio(): plot_num 1: Enrichment P-value Heatmap

Mem plot folio showing the first four plots

#> ##  (13:52:36) 04Nov2025:   mem_plot_folio(): Gene-pathway heatmap 
#> ##  (13:52:36) 04Nov2025:   mem_plot_folio(): plot_num 2: Gene-Pathway Heatmap

Mem plot folio showing the first four plots

#> ##  (13:52:37) 04Nov2025:   mem_plot_folio(): Defined 4 pathway clusters. 
#> ##  (13:52:37) 04Nov2025:   mem_plot_folio(): Preparing Cnet collapsed 
#> ##  (13:52:38) 04Nov2025:   mem_plot_folio(): subsetCnetIgraph() 
#> ##  (13:52:38) 04Nov2025:   mem_plot_folio(): plot_num 3: Cnet collapsed with gene and cluster labels

Mem plot folio showing the first four plots

#> ##  (13:52:39) 04Nov2025:   mem_plot_folio(): plot_num 4: Cnet collapsed with gene and set labels

Mem plot folio showing the first four plots

The returned object, mem_canonical_plots, is a list of the graphical objects.

Customizing Mem Plots

Cnet Cluster Plot

The Cnet Cluster Plot is often the focus of manuscript figures. The typical workflow is demonstrated below.

# generate the data
mpf4 <- mem_plot_folio(mem_canonical,
   do_which=c(4),
   do_plot=FALSE)

# extract the cnet
cnet <- mpf4$cnet_collapsed_set;

# plot with jam_igraph()
jam_igraph(cnet,
   node_factor=2,
   use_shadowText=TRUE,
   label_factor_l=list(nodeType=c(Gene=2, Set=1)))

Cnet cluster network extracted from mem_plot_folio() to use for custom figures.

ShinyCat for Custom Cnet Layout

The R-shiny Cnet Adjustment Tool ShinyCat is intended to help polish the Cnet plot layout when making a final figure.

The R-shiny app uses several functions:

Make sure to assign the output to a variable, or to click “Save RData” from within the R-shiny app. For example:

output_env <- launch_shinycat(g=cnet)

The output is stored in an environment called output_env.

# obtain the output data
adj_cnet <- output_env$adj_cnet;

Then the new Cnet plot can be plotted, for example:

# plot with jam_igraph()
jam_igraph(adj_cnet,
   node_factor=2,
   use_shadowText=TRUE,
   label_factor_l=list(nodeType=c(Gene=2, Set=1)))

Enrichment P-value Heatmap

mem_enrichment_heatmap() produces a heatmap of enrichment tests versus pathways, with -log10(P-value) shown in each cell.

It is also provided by:
mem_plot_folio(mem, do_which=1)

Argument p_cutoff sets the P-value threshold. By default it inherits the threshold stored in the data provided. Cells are shaded only when the P-value is below the threshold, making it clear which entries are significant. Every P-value above this threshold is not colored and is displayed as white, even when the P-value is less than 1.

mem_enrichment_heatmap(mem_canonical,
   p_cutoff=0.05);

Enrichment heatmap shown as a dotplot to indicate the number of genes involved.
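The effect of p_cutoff on the heatmap values can be illustrated with a small base R sketch. The matrix `pvals` is hypothetical, and setting failed entries to zero is only one plausible way to represent an uncolored cell. It is not necessarily how mem_enrichment_heatmap() handles them internally.

```r
# Hypothetical P-value matrix: pathways in rows, enrichment tests in columns
pvals <- matrix(c(0.001, 0.2, 0.04, 0.8),
   nrow=2,
   dimnames=list(
      c("Pathway 1", "Pathway 2"),
      c("Newborns", "OlderChildren")))
p_cutoff <- 0.05

# Heatmap values are -log10(P-value)
heat_values <- -log10(pvals)

# Entries above the threshold are blanked, so they would be drawn as white
heat_values[pvals > p_cutoff] <- 0
heat_values
```

In this example "Pathway 2" has P-values below 1 in both columns, yet neither cell would be colored because both exceed the threshold.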

The same data can be plotted as a heatmap.

mem_enrichment_heatmap(mem_canonical,
   style="heatmap",
   p_cutoff=0.05);

Enrichment heatmap showing the heatmap style, without dot plot.

Argument color_by_column=TRUE applies the color gradient to each column, using the colorV colors defined by multiEnrichMap().

memhm <- mem_enrichment_heatmap(mem_canonical,
   style="heatmap",
   color_by_column=TRUE);

Enrichment heatmap, colorized by column, showing an alternative style.

Gene-Pathway Heatmap

mem_gene_path_heatmap() produces a heatmap of the pathway-gene incidence matrix. This heatmap is the core of multienrichjam.

It is also provided by:
mem_plot_folio(mem, do_which=2)

The function will estimate the number of pathway clusters, but can be customized:

  • column_split=3 will produce 3 pathway clusters.
  • row_split=10 will produce 10 gene clusters.

Colors across the top of the heatmap indicate enrichment P-values.

Colors on the left of the heatmap indicate which genes were present in each enrichment test. When directional gene hits are provided, the left of the heatmap will also indicate directionality.

hm <- mem_gene_path_heatmap(mem_canonical,
   column_cex=0.5,
   row_cex=0.6);
ComplexHeatmap::draw(hm,
   merge_legends=TRUE)

Gene-pathway heatmap drawn specifically with mem_gene_path_heatmap().
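For readers unfamiliar with the term, an incidence matrix simply marks which genes belong to which pathways. The sketch below builds one in base R from hypothetical gene lists. The object names and the gene assignments are illustrative only and are not taken from the IPA example data.

```r
# Hypothetical pathway gene lists
pathway_genes <- list(
   "Pathway 1"=c("CASP8", "TAP2"),
   "Pathway 2"=c("TAP2", "RDX", "PRKAR2B"))

all_genes <- sort(unique(unlist(pathway_genes)))

# Genes in rows, pathways in columns; 1 marks membership
gene_pathway_im <- vapply(pathway_genes, function(genes) {
   as.numeric(all_genes %in% genes)
}, numeric(length(all_genes)))
rownames(gene_pathway_im) <- all_genes
gene_pathway_im
```

Pathways that share many genes have similar columns in this matrix, which is what allows the heatmap to group them into clusters.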

As a follow-up analysis, you can pull out each pathway cluster from the heatmap itself, using heatmap_column_order():

hm_sets <- heatmap_column_order(hm);
hm_sets;
#> $A
#>           Glioma Signaling   Growth Hormone Signaling 
#>         "Glioma Signaling" "Growth Hormone Signaling" 
#> 
#> $B
#>       Hepatic Cholestasis   Fc Epsilon RI Signaling          p70S6K Signaling 
#>     "Hepatic Cholestasis" "Fc Epsilon RI Signaling"        "p70S6K Signaling" 
#> 
#> $C
#>    mTOR Signaling   HIPPO signaling 
#>  "mTOR Signaling" "HIPPO signaling"
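Because the result is a plain named list, it can be summarized with base R. The sketch below re-creates the printed list by hand as `hm_sets_example` (omitting the element names inside each vector) and converts it to a table with one row per pathway. The object and column names are chosen for illustration.

```r
# Re-create the list printed above by hand, for illustration
hm_sets_example <- list(
   A=c("Glioma Signaling", "Growth Hormone Signaling"),
   B=c("Hepatic Cholestasis", "Fc Epsilon RI Signaling", "p70S6K Signaling"),
   C=c("mTOR Signaling", "HIPPO signaling"))

# Number of pathways per cluster
lengths(hm_sets_example)

# Long-format table with one row per pathway
cluster_df <- data.frame(
   cluster=rep(names(hm_sets_example), lengths(hm_sets_example)),
   pathway=unlist(hm_sets_example, use.names=FALSE))
cluster_df
```

A table like this can be convenient for supplementary tables that report which cluster each pathway was assigned to.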

Full Cnet plot

The Concept network (Cnet) plot shows every pathway-gene relationship.

The helper function mem2cnet() creates a Cnet network from the mem_canonical output:

#cnet <- mem_canonical$multiCnetPlot1b;
cnet <- mem_canonical %>% 
   mem2cnet()
   # memIM2cnet() %>%
   # fixSetLabels() %>%
   # removeIgraphBlanks() %>%
   # relayout_with_qfr();

withr::with_par(list(mar=c(1, 1, 1, 1)+0.1), {
   jam_igraph(cnet,
      use_shadowText=TRUE,
      node_factor=0.5,
      vertex.label.cex=0.6);
   mem_legend(mem_canonical);
})

Full Cnet plot, for all pathways and genes.

Extract the largest connected subnetwork.

cnet_largest_sub <- subset_igraph_components(cnet, keep=1)

jam_igraph(cnet_largest_sub,
   use_shadowText=TRUE,
   label_factor=0.5,
   node_factor=0.5);

Cnet plot showing the largest connected sub-network.

Subset Cnet by Cluster

Subset the pathway nodes with subsetCnetIgraph(), using hm_sets defined above.

cnet_sub <- subsetCnetIgraph(cnet,
   repulse=3.5,
   includeSets=unlist(hm_sets[c("A")]));
jam_igraph(cnet_sub,
   node_factor=1,
   use_shadowText=TRUE,
   label_dist_factor=3,
   label_factor=1.3);
mem_legend(mem_canonical);

Cnet plot showing a specific Cnet cluster.

Subset Cnet Options

Subset the pathway nodes with subsetCnetIgraph(), using a custom subset of pathways.

Alternatively, subset by other network attributes:

  • minSetDegree=6: pathways with at least 6 genes
  • minGeneDegree=2: genes present in 2 or more pathways (not used here).

Other useful defaults:

  • remove_singlets=TRUE: remove singlet nodes with no connections.
  • force_relayout=TRUE: re-calculate the layout.
  • do_reorder=TRUE: re-order nodes by color.
  • spread_labels=TRUE: re-position labels away from incoming edges.
  • remove_blanks=FALSE: optionally remove blank colors from pie nodes.

cnet3 <- multienrichjam::subsetCnetIgraph(cnet,
   repulse=5,
   minSetDegree=6,
   minGeneDegree=1);
jam_igraph(cnet3,
   node_factor=0.7,
   use_shadowText=TRUE);
mem_legend(mem_canonical);

Subset Cnet plot using a specific set of pathways.

Multi-Enrichment Map

The “Multi Enrichment Map” itself can be viewed using mem2emap().

This network connects pathways when they meet a Jaccard overlap coefficient threshold based upon the shared genes between the pathways.

The default 0.2 is stored in the ‘Mem’ object mem_canonical.

emap <- mem2emap(mem_canonical)

jam_igraph(emap,
   node_factor=2,
   use_shadowText=TRUE)
title(main="overlap=0.2")

Multi-enrichment network created using mem2emap(), using the default overlap threshold 0.2.
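The Jaccard coefficient used to connect pathways is the number of shared genes divided by the number of genes in the union of the two pathways. The sketch below computes it in base R for hypothetical gene sets. The names `pathway_genes`, `jaccard()`, and `set_pairs` are illustrative, and whether the package compares with `>=` or `>` at the threshold is an assumption here.

```r
# Hypothetical gene sets for three pathways
pathway_genes <- list(
   P1=c("A", "B", "C", "D"),
   P2=c("C", "D", "E"),
   P3=c("X", "Y"))

# Jaccard coefficient: shared genes divided by the union of genes
jaccard <- function(x, y) {
   length(intersect(x, y)) / length(union(x, y))
}

overlap <- 0.2

# All pairs of pathways, one pair per row
set_pairs <- t(combn(names(pathway_genes), 2))
scores <- apply(set_pairs, 1, function(p) {
   jaccard(pathway_genes[[p[1]]], pathway_genes[[p[2]]])
})

# Pathway pairs that would be connected at this threshold
edges <- data.frame(
   from=set_pairs[, 1],
   to=set_pairs[, 2],
   jaccard=scores)
edges[edges$jaccard >= overlap, ]
```

Here P1 and P2 share 2 of 5 distinct genes, so they would be connected at a threshold of 0.2, while P3 shares no genes and would remain unconnected.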

You can provide the Jaccard overlap threshold directly, with argument overlap. Values should be between 0 and 1.

A reasonable threshold can be estimated with mem_find_overlap(), which determines an intermediate level of connectivity, and should be a solid starting point for future adjustments.

use_overlap <- mem_find_overlap(mem_canonical);

emap2 <- mem2emap(mem_canonical,
   overlap=use_overlap)

jam_igraph(emap2,
   node_factor=3,
   use_shadowText=TRUE)
title(main=paste0("overlap=", use_overlap))

Multi-enrichment network shown after using alternative overlap threshold.

Notice there are distinct subnetworks, called “components”, which are not connected to each other.

You can pull out a component with subset_igraph_components(). Components are ordered by size, largest to smallest, so you can keep the largest using argument keep=1, or the second largest with keep=2, and so on.

We also call two other helper functions:

  1. removeIgraphBlanks()

    • removes blank colors from multi-color nodes, such as pie nodes, or colored rectangle nodes.
    • It helps show only the remaining colors without the whitespace.
  2. relayout_with_qfr()

    • Fruchterman-Reingold layout, with argument repulse used to adjust the spacing between nodes.
    • Also updates other useful attributes, and spreads the node labels to reduce label overlaps.

## You can alternatively pull out any other component
g_sub <- subset_igraph_components(emap2, keep=1);

## Re-apply network layout, and remove blank colors
g_sub <- relayout_with_qfr(repulse=3.5,
   removeIgraphBlanks(g_sub))

## Plot
jam_igraph(g_sub,
   node_factor=3,
   label_factor=2,
   use_shadowText=TRUE)

Network plot showing the largest connected sub-network of the multi-enrichment network.

jam_igraph() to plot igraph

jam_igraph() is a customized alternative to igraph::plot.igraph(), with these benefits:

  • edge_bundling="connections" (default) improves the rendering of edges by bundling edges from node clusters, so they are drawn with a bezier curve

  • use_shadowText=TRUE (optional) will draw labels with a contrasting border to improve legibility of text labels

  • rescale=FALSE (default) keeps the network layout aspect ratio instead of scaling the coordinates to fit the size of the plot window. It also properly scales the node and edge sizes.

  • convenient resizing:

    • label_factor: adjusts label.cex by a multiplier
    • node_factor: adjusts node.size by a multiplier
    • edge_factor: adjusts edge.width by a multiplier
    • label_dist_factor re-scales the label.dist values by a multiplier

Simple resizing

Consider the following changes, demonstrated below:

  • node_factor=2: nodes 2x larger
  • edge_factor=2: edges 2x wider
  • label_factor=1.2: labels 20% larger
  • use_shadowText=TRUE: shadow text labels
  • label_dist_factor=5: label distance 5x farther from node center

jam_igraph(cnet3,
   node_factor=2,
   edge_factor=2,
   label_factor=1.2, 
   label_dist_factor=5,
   use_shadowText=TRUE)

Network plot created using jam_igraph() as an enhanced alternative to the default igraph plot function.

Colored edges

Edges can be colorized using the colors of the connecting nodes, a visual enhancement inspired by the Gephi network visualization tool. This process is performed using color_edges_by_nodes().

jam_igraph(color_edges_by_nodes(cnet3, alpha=0.7),
   edge_bundling="connections",
   # edge_factor=2,
   # node_factor=2,
   label_factor=1.2, 
   label_dist_factor=5,
   use_shadowText=TRUE)

Network plot is shown using edges colorized based upon the colors for the connected nodes.
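To give a sense of what coloring an edge by its nodes can mean, the sketch below blends two node colors by averaging their RGB channels and applying transparency. The helper `blend_colors()` is hypothetical, and simple channel averaging is an assumption made for illustration. color_edges_by_nodes() may blend colors differently.

```r
# Hypothetical helper: blend two colors by averaging their RGB channels
blend_colors <- function(col1, col2, alpha=0.7) {
   # col2rgb() returns a matrix with rows red, green, blue
   rgb_mean <- unname(rowMeans(col2rgb(c(col1, col2))))
   rgb(rgb_mean[1], rgb_mean[2], rgb_mean[3],
      alpha=alpha * 255,
      maxColorValue=255)
}

# An edge between a red node and a blue node
edge_color <- blend_colors("red", "blue")
edge_color
```

The result is a hex color string with an alpha channel, which can be supplied wherever R accepts a color.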