MultiEnrichment Heatmap of Genes and Pathways
mem_gene_path_heatmap(
mem,
genes = NULL,
sets = NULL,
min_gene_ct = 1,
min_set_ct = 1,
min_set_ct_each = 4,
column_fontsize = NULL,
column_cex = 1,
row_fontsize = NULL,
row_cex = 1,
row_method = "binary",
column_method = "binary",
enrich_im_weight = 0.3,
gene_im_weight = 0.5,
gene_annotations = c("im", "direction"),
annotation_suffix = c(im = "hit", direction = "dir"),
simple_anno_size = grid::unit(6, "mm"),
cluster_columns = NULL,
cluster_rows = NULL,
cluster_row_slices = TRUE,
cluster_column_slices = TRUE,
name = NULL,
p_cutoff = mem$p_cutoff,
p_floor = 1e-10,
row_split = NULL,
column_split = NULL,
auto_split = TRUE,
column_title = LETTERS,
row_title = letters,
row_title_rot = 0,
colorize_by_gene = TRUE,
na_col = "white",
rotate_heatmap = FALSE,
colramp = "Reds",
column_names_max_height = grid::unit(18, "cm"),
column_names_rot = 90,
show_gene_legend = FALSE,
show_pathway_legend = TRUE,
show_heatmap_legend = 8,
use_raster = FALSE,
seed = 123,
verbose = FALSE,
...
)
mem: list object created by multiEnrichMap(). Specifically, the object is expected to contain colorV, enrichIM, memIM, and geneIM.
genes: character vector of genes to include in the heatmap; all other genes are excluded.
sets: character vector of sets (pathways) to include in the heatmap; all other sets are excluded.
min_gene_ct: minimum number of occurrences of each gene across the pathways; genes with fewer occurrences are excluded.
min_set_ct: minimum number of genes required for each set; sets with fewer genes are excluded.
min_set_ct_each: minimum number of genes required for each set in at least one enrichment test; sets that do not reach this count in any single test are excluded.
column_fontsize, row_fontsize: numeric value passed as fontsize to ComplexHeatmap::Heatmap() to define a specific fontsize for column and row labels. When NULL, the nrow/ncol of the heatmap are used to infer a reasonable starting fontsize, which can be adjusted with column_cex and row_cex.
row_method, column_method: character string of the distance method to use for row and column clustering. The clustering is performed by amap::hcluster().
enrich_im_weight: numeric value between 0 and 1 (default 0.3), the relative weight of the enrichment -log10 P-value matrix versus the overall gene-pathway incidence matrix when clustering pathways.
When enrich_im_weight=0, only the gene-pathway incidence matrix is used for pathway clustering.
When enrich_im_weight=1, only the pathway significance (-log10 P-value) is used for pathway clustering.
The default enrich_im_weight=0.3 balances the enrichment P-value matrix with the gene-pathway incidence matrix.
gene_im_weight: numeric value between 0 and 1 (default 0.5), the relative weight of the mem$geneIM gene incidence matrix versus the overall gene-pathway incidence matrix when clustering genes.
When gene_im_weight=0, only the gene-pathway incidence matrix is used for gene clustering.
When gene_im_weight=1, only the gene incidence matrix (mem$geneIM) is used for gene clustering.
The default gene_im_weight=0.5 balances the gene incidence matrix with the gene-pathway incidence matrix, giving each matrix equal weight (since values are typically all 0 or 1).
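The weighting described for enrich_im_weight and gene_im_weight can be illustrated with a minimal base R sketch. It is not the function's internal code: the helper combine_weighted() and the toy matrices are hypothetical, the -log10 P-values are rescaled to the range 0 to 1 as an assumption, and stats::dist() with stats::hclust() stands in for amap::hcluster().

```r
# Toy gene-pathway incidence matrix with pathways as rows (hypothetical data).
path_gene_im <- matrix(c(
   1, 1, 0, 0, 1,
   1, 1, 0, 0, 0,
   0, 0, 1, 1, 0,
   0, 1, 1, 1, 0), nrow = 4, byrow = TRUE,
   dimnames = list(paste0("set", 1:4), paste0("gene", 1:5)))

# Toy enrichment P-values, pathways as rows, enrichment tests as columns.
enrich_p <- matrix(c(
   1e-5, 0.5,
   1e-3, 0.2,
   0.04, 1e-4,
   0.01, 1e-6), nrow = 4, byrow = TRUE,
   dimnames = list(paste0("set", 1:4), c("testA", "testB")))

# Hypothetical helper: scale each matrix by its weight and bind the columns,
# so that weight=0 leaves only the incidence matrix contributing to distances
# and weight=1 leaves only the signal matrix.
combine_weighted <- function(x_signal, x_incidence, weight) {
   stopifnot(weight >= 0, weight <= 1)
   stopifnot(identical(rownames(x_signal), rownames(x_incidence)))
   cbind(weight * x_signal, (1 - weight) * x_incidence)
}

# Assumption: rescale -log10 P-values to 0..1 so both matrices are comparable.
enrich_score <- -log10(enrich_p)
enrich_score <- enrich_score / max(enrich_score)

# Cluster pathways using the combined matrix (Euclidean distance here).
combined <- combine_weighted(enrich_score, path_gene_im, weight = 0.3)
hc <- hclust(dist(combined))
pathway_order <- hc$labels[hc$order]
```

The same pattern would apply to gene clustering, with mem$geneIM in place of the enrichment matrix and genes as rows.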
gene_annotations: character string indicating which annotation(s) to display alongside the gene axis of the heatmap. By default it uses "im", "direction", and "direction" is removed when mem$geneIMdirection is not available.
"im" displays the gene incidence matrix mem$geneIM using categorical colors defined by mem$colorV.
"direction" displays the gene directionality mem$geneIMdirection using colors defined by colorjam::col_div_xf(1.2).
When no values are given, the gene annotation is not displayed.
When two values are given, the annotations are displayed in the order they are provided.
annotation_suffix: character vector named by values permitted by gene_annotations, with optional suffix to add to the annotation labels. For example it may be helpful to add "hit" or "dir" to distinguish the enrichment labels.
name: character value passed to ComplexHeatmap::Heatmap(), used as a label above the heatmap color legend.
p_cutoff: numeric value of the enrichment P-value cutoff, above which P-values are not colored, and are therefore white. The enrichment P-values are displayed as an annotated heatmap at the top of the main heatmap. Any cell that has a color meets at least the minimum P-value threshold. This value by default is taken from input mem, using mem$p_cutoff, for consistency with the input multienrichment analysis.
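As a rough illustration of the cutoff, the base R sketch below converts a toy P-value matrix to -log10 scores and blanks every entry above p_cutoff. The data are hypothetical, NA is used here only to stand for "no color", and the use of p_floor as a lower bound on P-values before the transform is an assumption, since p_floor is not described in this page.

```r
# Toy enrichment P-values (hypothetical data): sets as rows, tests as columns.
enrich_p <- matrix(c(0.001, 0.2, 0.04, 1e-15), nrow = 2,
   dimnames = list(c("set1", "set2"), c("testA", "testB")))

p_cutoff <- 0.05
# Assumption: P-values smaller than p_floor are treated as p_floor,
# which caps the -log10 score at 10.
p_floor <- 1e-10

# Convert to -log10 scale after applying the floor.
score <- -log10(pmax(enrich_p, p_floor))

# Entries above the cutoff receive no color, represented here by NA.
score[enrich_p > p_cutoff] <- NA
```

With these toy values, only the set2/testA entry (P = 0.2) is blanked, and the set2/testB entry is capped by the floor.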
column_split, row_split: optional arguments passed to ComplexHeatmap::Heatmap() to split the heatmap by columns or rows, respectively.
When row_split is NULL and auto_split=TRUE, it will determine an appropriate number of clusters based upon the number of rows. To turn off row split, use row_split=0 or row_split=1, or use row_split=NULL with auto_split=FALSE; likewise for column_split.
When row_split or column_split are supplied as a named vector, the names are aligned with sets to be displayed in the heatmap, and will use the intersect() of the two.
When data is clustered, setting cluster_row_slices=FALSE and cluster_column_slices=FALSE causes the dendrogram to be broken into separate pieces.
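The alignment of a named split vector with the displayed sets can be sketched in base R as follows. The set names and group labels are hypothetical, and the variable names are for illustration only; the sketch shows the intersect() step, not the function's internal code.

```r
# Sets that would be displayed in the heatmap (hypothetical names).
sets_displayed <- c("Apoptosis", "Cell cycle", "DNA repair", "Glycolysis")

# User-supplied named split vector; names are set names, values are groups.
# "Not shown" is not displayed, and "Apoptosis" has no group assigned.
column_split <- c(
   `Cell cycle` = "A",
   `DNA repair` = "A",
   Glycolysis = "B",
   `Not shown` = "B")

# Keep only sets present in both, in the order of the displayed sets.
keep_sets <- intersect(sets_displayed, names(column_split))
column_split_use <- column_split[keep_sets]
```

In this toy case "Apoptosis" drops out because it has no entry in the split vector, which is worth checking when a set unexpectedly disappears from a heatmap.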
column_title: optional character string with title to display above the heatmap.
row_title: optional character string with title to display beside the heatmap. Note when row_split is defined, the row_title is applied to each heatmap section.
row_title_rot: numeric value indicating the rotation of row_title text, where 0 is not rotated, and 90 is rotated 90 degrees.
colorize_by_gene: logical indicating whether to color the main heatmap body using the colors from geneIM, which represents each enrichment in which a given gene is involved. Colors are blended using colorjam::blend_colors(), using colors from mem$colorV, applied to mem$geneIM.
na_col: character string indicating the color to use for NA or missing values. Typically this argument is only used when colorize_by_gene=TRUE, where entries with no color are recognized as NA by ComplexHeatmap::Heatmap().
rotate_heatmap: logical indicating whether the entire heatmap should be rotated so that pathway names are displayed as rows, and genes as columns. Notes on how arguments are applied to rows and columns:
Column arguments applied to rows: column_split, column_title, cluster_columns, column_fontsize, column_cex are applied to rows since they refer to pathway data.
Row arguments applied to columns: row_split, row_title, cluster_rows, row_fontsize, row_cex are applied to columns since they refer to gene data.
Arguments applied directly to columns: column_method, column_title_rot are applied directly to heatmap columns since they refer to the output heatmap options.
Arguments applied directly to rows: row_method, row_title_rot are applied directly to heatmap rows since they refer to the output heatmap options.
colramp: character name of color, color gradient, or a vector of colors, anything compatible with input to jamba::getColorRamp().
seed: numeric value passed to set.seed() to allow reproducible results, typically with clustering operations.
verbose: logical indicating whether to print verbose output.
...: additional arguments are passed to ComplexHeatmap::Heatmap() for customization. However, if ... causes an error, the same ComplexHeatmap::Heatmap() function is called without ..., which is intended to allow overloading ... for different functions.
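The fallback behavior described for ... follows a common pattern that can be sketched with tryCatch() in base R. The helper call_with_fallback() and the toy function draw_fn() are hypothetical stand-ins; this is an illustration of the pattern, not the code used inside mem_gene_path_heatmap().

```r
# Hypothetical helper: call fn with the required arguments plus ...,
# and retry with only the required arguments if the first call errors.
call_with_fallback <- function(fn, ..., required = list()) {
   tryCatch(
      do.call(fn, c(required, list(...))),
      error = function(e) do.call(fn, required))
}

# Toy stand-in for a plotting function that accepts one optional argument.
draw_fn <- function(x, scale = 1) x * scale

# The extra argument is recognized, so it is used.
ok <- call_with_fallback(draw_fn, scale = 2, required = list(x = 10))

# The extra argument is not recognized and causes an error,
# so the call is repeated without it.
fallback <- call_with_fallback(draw_fn, unknown_arg = TRUE,
   required = list(x = 10))
```

One consequence of this design is that a misspelled argument in ... may be dropped silently, so it can help to confirm that customizations actually took effect in the resulting heatmap.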
Heatmap object defined in ComplexHeatmap::Heatmap(), with two additional attributes:
"caption" - a character string with important clustering settings.
"draw_caption" - a function that will draw the caption in the bottom-left corner of the heatmap, calling ComplexHeatmap::grid.textbox(). This function should be called with no parameters, for example:
attr(hm, "draw_caption")()
In addition, the returned object can be interrogated with two helper functions that report the row and column clusters, and the exact order of labels as they appear in the heatmap.
jamba::heatmap_row_order() - returns a list of vectors of rownames in the order they appear in the heatmap, with list names defined by row split.
jamba::heatmap_column_order() - returns a list of vectors of colnames in the order they appear in the heatmap, with list names defined by column split.
This function takes the mem list output from multiEnrichMap() and creates a gene-by-pathway incidence matrix heatmap, using ComplexHeatmap::Heatmap().
It uses three basic sources of data to annotate the heatmap:
mem$memIM - the gene-set incidence matrix
mem$geneIM - the gene incidence matrix by dataset
mem$enrichIM - the pathway enrichment P-value matrix by dataset
It will try to estimate a reasonable number of column and row splits in the dendrogram, based solely upon the number of columns and rows. These guesses can be controlled with arguments column_split and row_split, respectively.
When genes and pathways are filtered by min_gene_ct, min_set_ct, and min_set_ct_each, the order of operations is as follows:
min_set_ct_each, min_set_ct - these pathway filters are applied before filtering genes, so that all genes are still present when the number of genes per pathway is counted.
min_gene_ct - genes are filtered after pathway filtering, so that pathways not deemed "significant" based upon the required number of genes are removed first. Only after those pathways are removed can the number of occurrences of each gene be judged appropriately.
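The order of operations can be illustrated on a toy incidence matrix with base R. This is a sketch under stated assumptions: the matrix is hypothetical, genes are rows and sets are columns, and only min_set_ct is shown for the pathway step, since min_set_ct_each would need per-enrichment gene counts that are not included here.

```r
# Toy gene-by-set incidence matrix (hypothetical data).
memIM <- matrix(c(
   1, 1, 0,
   1, 0, 0,
   1, 1, 0,
   0, 0, 1), ncol = 3, byrow = TRUE,
   dimnames = list(paste0("gene", 1:4), c("setA", "setB", "setC")))

min_set_ct <- 2
min_gene_ct <- 2

# Step 1: filter sets by the number of genes, while all genes are present.
set_ct <- colSums(memIM > 0)
im_sets <- memIM[, set_ct >= min_set_ct, drop = FALSE]

# Step 2: filter genes by occurrences across the remaining sets only.
gene_ct <- rowSums(im_sets > 0)
filtered <- im_sets[gene_ct >= min_gene_ct, , drop = FALSE]
```

Here setC is removed first because it contains one gene, and gene2 and gene4 are removed afterward because each occurs in fewer than two of the remaining sets.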
Other jam plot functions: adjust_polygon_border(), grid_with_title(), jam_igraph(), mem_enrichment_heatmap(), mem_legend(), mem_multienrichplot(), mem_plot_folio(), plot_layout_scale()