MultiEnrichment Heatmap of Genes and Pathways
Usage
mem_gene_path_heatmap(
mem,
genes = NULL,
sets = NULL,
min_gene_ct = 1,
min_set_ct = 1,
min_set_ct_each = 4,
column_fontsize = NULL,
column_cex = 1,
row_fontsize = NULL,
row_cex = 1,
row_method = "binary",
column_method = "binary",
enrich_im_weight = 0.3,
gene_im_weight = 0.5,
gene_annotations = c("im", "direction"),
annotation_suffix = c(im = "hit", direction = "dir"),
simple_anno_size = grid::unit(6, "mm"),
cluster_columns = NULL,
cluster_rows = NULL,
cluster_row_slices = TRUE,
cluster_column_slices = TRUE,
name = NULL,
p_cutoff = mem$p_cutoff,
p_floor = 1e-10,
row_split = NULL,
column_split = NULL,
auto_split = TRUE,
column_title = LETTERS,
row_title = letters,
row_title_rot = 0,
colorize_by_gene = TRUE,
na_col = "white",
rotate_heatmap = FALSE,
colramp = "Reds",
column_names_max_height = grid::unit(180, "mm"),
column_names_rot = 90,
show_gene_legend = FALSE,
show_pathway_legend = TRUE,
show_heatmap_legend = 8,
use_raster = FALSE,
seed = 123,
verbose = FALSE,
...
)
Arguments
- mem
list object created by multiEnrichMap(). Specifically the object is expected to contain colorV, enrichIM, memIM, geneIM.
- genes
character vector of genes to include in the heatmap, all other genes will be excluded.
- sets
character vector of sets (pathways) to include in the heatmap, all other sets will be excluded.
- min_gene_ct
minimum number of occurrences of each gene across the pathways, all other genes are excluded.
- min_set_ct
minimum number of genes required for each set, all other sets are excluded.
- min_set_ct_each
minimum number of genes required for each set, required for at least one enrichment test.
- column_fontsize, row_fontsize
numeric passed as fontsize to ComplexHeatmap::Heatmap() to define a specific fontsize for column and row labels. When NULL, the nrow/ncol of the heatmap are used to infer a reasonable starting point fontsize, which can be adjusted with column_cex and row_cex.
- row_method, column_method
character string of the distance method to use for row and column clustering. The clustering is performed by amap::hcluster().
- enrich_im_weight
numeric value between 0 and 1 (default 0.3), the relative weight of the enrichment -log10 P-value matrix and the overall gene-pathway incidence matrix when clustering pathways.
When enrich_im_weight=0, only the gene-pathway incidence matrix is used for pathway clustering.
When enrich_im_weight=1, only the pathway significance (-log10 P-value) is used for pathway clustering.
The default enrich_im_weight=0.3 balances the enrichment P-value matrix with the gene-pathway incidence matrix.
- gene_im_weight
numeric value between 0 and 1 (default 0.5), the relative weight of the mem$geneIM gene incidence matrix and the overall gene-pathway incidence matrix when clustering genes.
When gene_im_weight=0, only the gene-pathway incidence matrix is used for gene clustering.
When gene_im_weight=1, only the gene incidence matrix (mem$geneIM) is used for gene clustering.
The default gene_im_weight=0.5 balances the gene incidence matrix with the gene-pathway incidence matrix, giving each matrix equal weight (since values in both are typically 0 or 1).
- gene_annotations
character vector indicating which annotation(s) to display alongside the gene axis of the heatmap. By default it uses "im" and "direction"; "direction" is removed when mem$geneIMdirection is not available.
"im" displays the gene incidence matrix mem$geneIM using categorical colors defined by mem$colorV.
"direction" displays the gene directionality mem$geneIMdirection using colors defined by colorjam::col_div_xf(1.2).
When no values are given, the gene annotation is not displayed.
When two values are given, the annotations are displayed in the order they are provided.
- annotation_suffix
character vector named by values permitted by gene_annotations, with optional suffix to add to the annotation labels. For example it may be helpful to add "hit" or "dir" to distinguish the enrichment labels.
- name
character value passed to ComplexHeatmap::Heatmap(), used as a label above the heatmap color legend.
- p_cutoff
numeric value of the enrichment P-value cutoff, above which P-values are not colored, and are therefore white. The enrichment P-values are displayed as an annotated heatmap at the top of the main heatmap. Any cell that has a color meets at least the minimum P-value threshold. This value by default is taken from input mem, using mem$p_cutoff, for consistency with the input multienrichment analysis.
- column_split, row_split
optional arguments passed to ComplexHeatmap::Heatmap() to split the heatmap by columns or rows, respectively.
When row_split is NULL and auto_split=TRUE, it will determine an appropriate number of clusters based upon the number of rows. To turn off row split, use row_split=NULL or row_split=0 or row_split=1; likewise for column_split.
When row_split or column_split are supplied as a named vector, the names are aligned with sets to be displayed in the heatmap, and will use the intersect() of the two. When data is clustered, cluster_row_slices=FALSE and cluster_column_slices=FALSE such that the dendrogram will be broken into separate pieces.
- column_title
optional character string with title to display above the heatmap.
- row_title
optional character string with title to display beside the heatmap. Note when row_split is defined, the row_title is applied to each heatmap section.
- row_title_rot
numeric value indicating the rotation of row_title text, where 0 is not rotated, and 90 is rotated 90 degrees.
- colorize_by_gene
logical indicating whether to color the main heatmap body using the colors from geneIM, which represents each enrichment in which a given gene is involved. Colors are blended using colorjam::blend_colors(), using colors from mem$colorV, applied to mem$geneIM.
- na_col
character string indicating the color to use for NA or missing values. Typically this argument is only used when colorize_by_gene=TRUE, where entries with no color are recognized as NA by ComplexHeatmap::Heatmap().
- rotate_heatmap
logical indicating whether the entire heatmap should be rotated so that pathway names are displayed as rows, and genes as columns. When enabled, arguments referring to columns and rows are flipped, so "column" arguments will continue to affect pathways/sets, and "row" arguments will continue to affect genes. This includes column_method and row_method as of 0.0.90.900.
Exceptions:
row_title_rot is only applied to rows, due to its purpose.
column_names_rot is only applied to columns, also due to its purpose.
- colramp
character name of color, color gradient, or a vector of colors, anything compatible with input to jamba::getColorRamp().
- column_names_max_height
grid::unit passed to ComplexHeatmap::Heatmap(). When supplied as numeric it is converted to units in "mm".
- column_names_rot
numeric passed to ComplexHeatmap::Heatmap().
- show_gene_legend, show_pathway_legend
logical indicating whether to show the gene IM and pathway IM legends, respectively.
The gene IM legend is FALSE by default, since it only describes the color used for each column, and is somewhat redundant with the pathway IM legend.
The pathway IM legend displays the color scale including the range of enrichment P-values colorized.
- show_heatmap_legend
numeric or logical (default 8) with the maximum number of labels to use for the heatmap color legend. The heatmap color legend includes all the possible blended colors based upon the gene IM data.
When logical, TRUE is converted to 8 by default.
When there are more legend items than show_heatmap_legend, the color legend will only display singlet colors.
- use_raster
logical passed to ComplexHeatmap::Heatmap() (default FALSE, per the usage above), indicating whether to rasterize the heatmap body. If a rasterized heatmap appears too blurry, use FALSE, which will render each heatmap cell individually. For very large heatmaps this can create a very large PDF file size, and may introduce visual artifacts if the output dimensions are smaller than 1 cell per pixel.
- seed
numeric value passed to set.seed() to define a reproducible random seed.
- verbose
logical indicating whether to print verbose output.
- ...
additional arguments are passed to ComplexHeatmap::Heatmap() for customization. However, if ... causes an error, the same ComplexHeatmap::Heatmap() function is called without ..., which is intended to allow overloading ... for different functions.
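To make the enrich_im_weight description above more concrete, here is a rough base R sketch of how a weight between 0 and 1 can blend two matrices before clustering. It is not the package's implementation: the toy matrices, the scaling of -log10 P-values to the range 0 to 1, and the helper name blend_for_clustering() are all assumptions made for illustration, and stats::hclust() stands in for amap::hcluster().

```r
# Toy pathway-by-enrichment matrix of -log10 P-values (hypothetical values)
enrich_log10p <- matrix(
   c(5, 0,
     4, 1,
     0, 6),
   nrow = 3, byrow = TRUE,
   dimnames = list(c("setA", "setB", "setC"), c("test1", "test2")))

# Toy pathway-by-gene incidence matrix (1 = gene belongs to the pathway)
set_gene_im <- matrix(
   c(1, 1, 0, 0,
     1, 1, 1, 0,
     0, 0, 1, 1),
   nrow = 3, byrow = TRUE,
   dimnames = list(c("setA", "setB", "setC"), paste0("gene", 1:4)))

# Hypothetical helper: scale the P-value matrix to [0, 1], weight the two
# blocks by `weight` and `1 - weight`, then bind them column-wise.
blend_for_clustering <- function(enrich_m, im, weight = 0.3) {
   stopifnot(weight >= 0, weight <= 1,
      identical(rownames(enrich_m), rownames(im)))
   enrich_scaled <- enrich_m / max(enrich_m, 1e-10)
   cbind(enrich_scaled * weight, im * (1 - weight))
}

blended <- blend_for_clustering(enrich_log10p, set_gene_im, weight = 0.3)

# Euclidean distance with hclust() is an assumption for this sketch;
# the function itself clusters with amap::hcluster().
pathway_hc <- stats::hclust(stats::dist(blended))
```

With weight = 0 the P-value block is all zeros, so only the incidence matrix influences the distances; with weight = 1 the reverse holds, which matches the two extremes described for enrich_im_weight.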
Value
Heatmap object defined in ComplexHeatmap::Heatmap(), with
additional attributes:
- "caption" - a character string with a caption that describes the data dimensions and clustering parameters.
- "caption_legendlist" - a ComplexHeatmap::Legends object suitable to be included with Heatmap legends using draw(hm, annotation_legend_list=caption_legendlist), or drawn with grid::grid.draw(caption_legendlist).
- "draw_caption" - a function that will draw the caption in the bottom-left corner of the heatmap by default, to be called with attr(hm, "draw_caption")() or draw_caption().
In addition, the returned object can be interrogated with two helper functions that help define the row and column clusters, and the exact order of labels as they appear in the heatmap.
- jamba::heatmap_row_order() - returns a list of vectors of rownames in the order they appear in the heatmap, with list names defined by row split.
- jamba::heatmap_column_order() - returns a list of vectors of colnames in the order they appear in the heatmap, with list names defined by column split.
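A call that uses the returned attributes and helper functions might look like the sketch below. The wrapper name draw_gene_path_heatmap() is hypothetical, mem is assumed to be the output of multiEnrichMap(), the function is assumed to be exported by the multienrichjam package, and the argument values are arbitrary. The wrapper is only defined here, not called, because it needs those packages and a real mem object.

```r
# Hypothetical wrapper; `mem` is assumed to come from multiEnrichMap()
draw_gene_path_heatmap <- function(mem) {
   hm <- multienrichjam::mem_gene_path_heatmap(mem,
      min_gene_ct = 2,
      min_set_ct_each = 4)
   # Draw the heatmap, then add the caption stored as an attribute
   ComplexHeatmap::draw(hm)
   attr(hm, "draw_caption")()
   # Return labels in display order, grouped by row and column split
   list(
      rows = jamba::heatmap_row_order(hm),
      columns = jamba::heatmap_column_order(hm))
}
```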
Details
This function takes the mem list output from
multiEnrichMap() and creates a gene-by-pathway incidence
matrix heatmap, using ComplexHeatmap::Heatmap().
It uses three basic sources of data to annotate the heatmap:
- mem$memIM: the gene-set incidence matrix
- mem$geneIM: the gene incidence matrix by dataset
- mem$enrichIM: the pathway enrichment P-value matrix by dataset
It will try to estimate a reasonable number of column and row
splits in the dendrogram, based solely upon the number of
columns and rows. These guesses can be overridden with arguments
column_split and row_split, respectively.
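When column_split or row_split is supplied as a named vector, the Arguments section notes that its names are aligned with the displayed sets using intersect(). The base R sketch below pictures that alignment. The set names and split labels are hypothetical, and the code only mirrors the described behaviour, not the function's internal code.

```r
# Sets that would be displayed in the heatmap (hypothetical names)
sets <- c("Apoptosis", "Cell cycle", "DNA repair", "Glycolysis")

# User-supplied split, named by set; "Lipid synthesis" is not displayed,
# and "Apoptosis" has no split label
column_split <- c(
   "Cell cycle" = "growth",
   "DNA repair" = "growth",
   "Glycolysis" = "metabolism",
   "Lipid synthesis" = "metabolism")

# Keep only sets present in both, preserving the order of `sets`
shared_sets <- intersect(sets, names(column_split))
aligned_split <- column_split[shared_sets]
aligned_split
```

Only the three sets found in both vectors remain, so a set without a split label, or a label without a displayed set, drops out of the alignment.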
When pathways are filtered by min_gene_ct, min_set_ct,
and min_set_ct_each, the order of operations is as follows:
- min_set_ct_each, min_set_ct - these filters are applied before filtering genes, in order to ensure all genes are present from the start.
- min_gene_ct - genes are filtered after pathway filtering, in order to remove pathways which were not deemed "significant" based upon the required number of genes. Only after those pathways are removed can the number of occurrences of each gene be judged appropriately.
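The order of operations above can be sketched on a toy incidence matrix in base R. The matrix and thresholds are hypothetical, and min_set_ct_each is left out because it needs per-enrichment gene counts. This illustrates the filtering order only and is not the function's own code.

```r
# Toy gene-by-set incidence matrix (rows = genes, columns = sets)
mem_im <- matrix(
   c(1, 1, 0,
     1, 0, 0,
     1, 1, 0,
     0, 1, 1),
   nrow = 4, byrow = TRUE,
   dimnames = list(paste0("gene", 1:4), c("setA", "setB", "setC")))

min_set_ct <- 2   # each set needs at least 2 genes
min_gene_ct <- 2  # each gene must occur in at least 2 remaining sets

# Step 1: filter sets (columns) while all genes are still present
keep_sets <- colSums(mem_im > 0) >= min_set_ct
im_sets <- mem_im[, keep_sets, drop = FALSE]

# Step 2: filter genes (rows) by occurrences across the remaining sets
keep_genes <- rowSums(im_sets > 0) >= min_gene_ct
im_final <- im_sets[keep_genes, , drop = FALSE]
im_final
```

Here gene4 occurs in two sets before filtering, but one of them (setC) fails min_set_ct, so gene4 is counted only once afterwards and is removed. Filtering genes first would have kept it.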
See also
Other jam plot functions:
adjust_polygon_border(),
grid_with_title(),
jam_igraph(),
mem_enrichment_heatmap(),
mem_legend(),
mem_multienrichplot(),
mem_plot_folio(),
plot_layout_scale()