Heatmap for SummarizedExperiment data
Usage
heatmap_se(
   se,
   sestats = NULL,
   hm_name = NULL,
   hm_title = NULL,
   rows = NULL,
   row_type = "rows",
   column_type = "samples",
   data_type = "expression",
   correlation = FALSE,
   assay_name = NULL,
   contrast_names = NULL,
   contrast_suffix = "",
   cutoff_name = NULL,
   alt_sestats = NULL,
   alt_assay_name = assay_name,
   alt_contrast_names = NULL,
   alt_contrast_suffix = "",
   alt_cutoff_name = NULL,
   isamples = colnames(se),
   normgroup_colname = NULL,
   centerby_colnames = NULL,
   controlSamples = NULL,
   control_label = "",
   controlFloor = NA,
   naControlAction = c("na", "row", "floor", "min"),
   naControlFloor = 0,
   top_colnames = NULL,
   top_annotation = NULL,
   top_annotation_name_gp = grid::gpar(),
   rowData_colnames = NULL,
   left_annotation = NULL,
   left_annotation_name_gp = grid::gpar(),
   left_annotation_name_rot = 90,
   right_annotation = NULL,
   simple_anno_size = grid::unit(8, "mm"),
   legend_title_gp = grid::gpar(fontsize = 10),
   legend_labels_gp = grid::gpar(fontsize = 10),
   legend_grid_cex = 1,
   row_names_gp = NULL,
   row_split = NULL,
   row_subcluster = NULL,
   row_title_rot = 0,
   sample_color_list = NULL,
   legend_at = NULL,
   legend_labels = NULL,
   subset_legend_colors = TRUE,
   row_cex = 0.8,
   column_cex = 1,
   row_anno_fontsize = 11,
   useMedian = FALSE,
   show_row_names = NULL,
   show_row_dend = length(rows) < 2000,
   mark_rows = NULL,
   mark_labels_gp = grid::gpar(),
   column_title = character(0),
   apply_hm_column_title = FALSE,
   hm_title_buffer = 0,
   show_heatmap_legend = TRUE,
   show_top_legend = TRUE,
   show_left_legend = TRUE,
   legend_border_color = "black",
   show_top_annotation_name = TRUE,
   show_left_annotation_name = TRUE,
   row_label_colname = NULL,
   cluster_columns = FALSE,
   cluster_column_slices = FALSE,
   cluster_rows = function(x, ...) {
      amap::hcluster(jamba::rmNA(naValue = 0, x), ...,
         method = "euclidean", link = "ward")
   },
   cluster_row_slices = FALSE,
   column_names_gp = NULL,
   column_split = NULL,
   column_split_sep = ",",
   color_max = 3,
   color_floor = 0,
   lens = 2,
   rename_contrasts = TRUE,
   rename_alt_contrasts = TRUE,
   use_raster = TRUE,
   verbose = FALSE,
   debug = FALSE,
   ...
)
Arguments
- se
SummarizedExperiment by default, or one of the following:
  - SummarizedExperiment with accessor functions rowData(), colData(), and assays(). It will use values(rowRanges()) if no slot rowData exists.
  - SingleCellExperiment with accessor functions rowData(), colData(), and assays(). It will use values(rowRanges()) if no slot rowData exists.
  - Seurat object, which is coerced to SingleCellExperiment and handled accordingly.
  - ExpressionSet or compatible object with accessor functions featureData(), phenoData(), and assayData().
- sestats
one of the following types of data:
  - list output from se_contrast_stats(), which specifically contains hit_array as a 3-dimensional array of hits with dimensions "Cutoffs", "Contrasts", "Signal".
  - numeric matrix intended to represent an incidence matrix, where a value 0 indicates absence, and non-zero indicates presence. This format is useful for supplying any incidence matrix, such as gene-by-pathway (for example Github package "jmw86069/multienrichjam" provides mem$memIM with gene-by-pathway matrix), or gene-by-class (see Github package "jmw86069/pajam" for examples using ProteinAtlas protein classification, including membrane-bound, secreted, transcription factors, etc.), or any incidence matrix defined by Github "jmw86069/venndir" function list2im_value() or list2im() which converts input to a Venn diagram into an incidence matrix.
When sestats is supplied, data is converted to incidence matrix, then columns are matched with contrast_names. All rows with non-zero entry in those columns are included in the heatmap. When rows is also supplied, then the intersection of incidence matrix rows and rows is displayed in the heatmap.
Note that alt_sestats does not subset rows displayed in the heatmap.
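The row selection described for an incidence matrix can be sketched in base R. This is only an illustration of the described behavior, not the package's internal code; the matrix im, its gene names, and the contrast names are hypothetical.

```r
# Hypothetical incidence matrix: rows are genes, columns are contrasts,
# with non-zero values marking statistical hits (filled column by column)
im <- matrix(c(1, 0, -1, 0,
               0, 0,  1, 0),
   ncol=2,
   dimnames=list(paste0("gene", 1:4), c("B-A", "C-A")))

# Restrict to the contrasts of interest
contrast_names <- "B-A"

# Rows with at least one non-zero entry in the selected columns
hit_rows <- rownames(im)[
   rowSums(im[, contrast_names, drop=FALSE] != 0) > 0]

# When rows is also supplied, only the intersection is displayed
rows <- c("gene1", "gene2", "gene3")
show_rows <- intersect(rows, hit_rows)
print(show_rows)
```

Here gene3 is a hit in both contrasts and gene1 only in "B-A", so both are kept, while gene2 is dropped because it has no hit in the selected contrast.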
- hm_name
character string, or NULL (default) which uses the data_type value. Note that the legend title always uses data_type, which is also used for hm_name when hm_name=NULL. The hm_name is most useful to customize because this string is used as the prefix for grid graphical components, for example seen with ComplexHeatmap::list_components(). When two heatmaps or a HeatmapList is drawn, the names can be used to define specific grid regions of each heatmap. If the heatmaps share the same hm_name then the regions will also have identical names and cannot be addressed distinctly.
- hm_title
character string, or NULL (default) which generates a heatmap title using the dimensions, assay_name, data_type, and a string which describes the data centering. When provided as a character string, it is used as-is. (In future this value may accept variable names.)
- rows
character vector of rownames(se) to define a specific set of rows to display. When sestats is supplied, then the intersection of rows with genes defined by sestats is displayed. Note that rows are required to be in rownames(se), all other rows are dropped.
- row_type
character string used in the title of the heatmap which indicates how many rows are displayed. For example "1,234 genes detected above background" or "1,234 DEGs by limma-voom". When row_type="" or row_type=NULL this information is not included in the heatmap title.
- column_type
character string used in the title of the heatmap which indicates how many columns are displayed. For example "12 samples" or "12 biological replicates". When column_type="" or column_type=NULL this information is not included in the heatmap title.
- data_type
character string used as title of the heatmap color gradient legend, for example "expression" indicates the data contains gene expression measurements. Notes:
  - The prefix "centered" is automatically prepended whenever the data is also centered for the heatmap. Set centerby_colnames=FALSE to display data that is not centered.
  - The prefix "correlation of" is automatically prepended when correlation=TRUE, which displays correlation of whatever data is included in the heatmap.
- correlation
logical indicating whether to calculate sample correlation, and plot a sample-by-sample correlation heatmap. This option is included here since many of the same arguments are required for data centering, and sample annotations. Note that color_max is forced to a maximum value of 1.0, representing the maximum correlation value.
- assay_name
character string indicating the name in assays(se) to use for data to be displayed in the heatmap.
When multiple assay_name values are supplied, the first assay_name that matches names(assays(se)) will be used in the heatmap. In this way, multiple assay_name values can be supplied to define statistical hits in sestats, which calls hit_array_to_list() to combine hits across assay_name entries; but only the first assay_name found in se is used for the heatmap values.
When there is only one value for assayNames(se), then assay_name will default to this value, instead of acting like it couldn't possibly know what was intended. Haha.
Lastly, assay_name can be a numeric index, helpful in case assays(se) contains no names - not recommended but it can happen.
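The "first matching assay_name" rule can be illustrated with a short base R sketch. The assay names below are hypothetical, and this is not the package's internal code.

```r
# Hypothetical assay names, as would be returned by names(assays(se))
available_assays <- c("counts", "log2cpm")

# Requested assay names, in order of preference
assay_name <- c("quantile_log2cpm", "log2cpm", "counts")

# intersect() keeps the order of its first argument,
# so the first requested name that exists in se is used
use_assay_name <- head(intersect(assay_name, available_assays), 1)
print(use_assay_name)
```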
- contrast_names
character vector of contrasts in sestats$hit_array to use for the heatmap. When contrast_names=NULL then all contrasts are displayed, which is the default.
- contrast_suffix
character string with optional suffix to append to the end of each contrast name label for sestats hit incidence matrix beside the heatmap. This suffix may be useful when comparing two methods for the same set of contrast names, with sestats and alt_sestats.
- cutoff_name
character or integer index used to define the specific statistical cutoffs to use from sestats$hit_array. This argument is passed to hit_array_to_list() as cutoff_names.
- alt_sestats, alt_assay_name, alt_contrast_names, alt_contrast_suffix
arguments analogous to those described above for sestats which are used when alt_sestats is supplied.
- isamples
character vector of colnames(se) used to visualize a subset of samples used for the data centering step. Note that data centering uses all columns supplied in se, and after centering, the subset of columns defined in isamples is displayed in the heatmap. This distinction makes it possible to center data by some control group, then optionally not display the control group data.
- normgroup_colname
character vector of colnames in colData(se) used during data centering. When supplied, samples are centered independently within each normgroup grouping. These values are equivalent to using centerby_colnames.
- centerby_colnames
either:
  - character vector of colnames in colData(se) used during data centering. When supplied, samples are centered independently within each centerby grouping. It is typically used for things like cell lines, to center each cell line by a time point control, or untreated control.
  - NULL to perform centering across all columns in se.
  - FALSE to disable centering.
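Centering within independent groups can be sketched as follows. This is a minimal base R illustration using the row mean, not the implementation in jamma::centerGeneData(); the helper center_by_group() and the example data are hypothetical.

```r
# Hypothetical helper: center each row within each group of columns
center_by_group <- function(x, groups) {
   # x: numeric matrix; groups: one group label per column of x
   x_centered <- x
   for (g in unique(groups)) {
      k <- which(groups == g)
      # subtract the per-row mean of this group from its own columns
      x_centered[, k] <- x[, k, drop=FALSE] -
         rowMeans(x[, k, drop=FALSE], na.rm=TRUE)
   }
   x_centered
}

x <- matrix(c(1, 3, 10, 14,
              2, 2, 20, 24),
   nrow=2, byrow=TRUE,
   dimnames=list(c("geneA", "geneB"), c("s1", "s2", "s3", "s4")))
cell_line <- c("HeLa", "HeLa", "K562", "K562")

xc <- center_by_group(x, cell_line)
print(xc)
```

Each cell line is centered on its own mean, so the large baseline difference between the two cell lines no longer dominates the values.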
- controlSamples
character optional vector of samples to use as the reference during data centering. Note that samples are still centered within each normgroup and centerby grouping, and within that grouping samples are centered to the controlSamples which are present in that grouping. Any center group for which no samples are defined in controlSamples will use all samples in that center group. Typically, controlSamples is used to define a specific group as the reference for centering, so changes are displayed relative to that group. Make sure to define control_label to include an appropriate label in the heatmap title.
- control_label
character string used in heatmap title to describe the control used during data centering, relevant when controlSamples is also supplied. Recommended format: "versus Wildtype" or "vs. Wildtype". The heatmap title will include data centering and control_label in this format: "centered within {centerby_colnames}, {control_label}", for example "centered within Genotype/Time, versus Vehicle".
- controlFloor, naControlAction, naControlFloor
passed to jamma::centerGeneData() to customize data centering.
  - controlFloor imposes an optional noise floor to control group mean/median values, so the summary value during centering is at least controlFloor. Useful for defining an effective noise floor for a platform technology.
  - naControlAction defines the action taken only when values for all control samples are NA.
  - naControlFloor is a numeric value used when naControlAction="floor", which causes the group reference value to use the value provided in naControlFloor.
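The effect of controlFloor and naControlFloor can be sketched in base R. This is an illustration under stated assumptions, not the code in jamma::centerGeneData(): the helper center_to_control() is hypothetical, it uses the row mean, it only covers the "floor" action for all-NA controls, and it assumes controlFloor is applied after that substitution.

```r
# Hypothetical helper: center rows to the mean of the control samples
center_to_control <- function(x, controlSamples,
   controlFloor=NA, naControlFloor=0) {
   ctrl <- rowMeans(x[, controlSamples, drop=FALSE], na.rm=TRUE)
   # rowMeans() returns NaN when every control value is NA;
   # the "floor" action substitutes naControlFloor as the reference
   ctrl[is.nan(ctrl)] <- naControlFloor
   # optional noise floor: the reference is at least controlFloor
   if (!is.na(controlFloor)) {
      ctrl <- pmax(ctrl, controlFloor)
   }
   x - ctrl
}

x <- matrix(c( 2,  4, 9,
              NA, NA, 5),
   nrow=2, byrow=TRUE,
   dimnames=list(c("geneA", "geneB"), c("ctl1", "ctl2", "trt1")))

xc <- center_to_control(x,
   controlSamples=c("ctl1", "ctl2"),
   controlFloor=5)
print(xc)
```

For geneA the control mean is 3, which is raised to the floor of 5, so the treated sample is reported as 4 above the reference rather than 6.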
- top_colnames
one of the following types:
  - character vector of colnames to use from colData(se) as annotations to display in top_annotation above the heatmap.
  - NULL, will call choose_annotation_colnames() to detect reasonable colnames: columns with more than one unique value; columns with at least one duplicated value.
  - FALSE will hide the top_colnames, which also occurs when colData(se) is empty.
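The two detection rules listed for top_colnames=NULL can be sketched in base R. This is not the code of choose_annotation_colnames(); the helper pick_annotation_colnames() and the sample table are hypothetical.

```r
# Hypothetical helper applying the two rules described above:
# more than one unique value, and at least one duplicated value
pick_annotation_colnames <- function(df) {
   keep <- vapply(df, function(x) {
      length(unique(x)) > 1 && any(duplicated(x))
   }, logical(1))
   names(df)[keep]
}

sample_df <- data.frame(
   sample_id=c("s1", "s2", "s3", "s4"),
   group=c("A", "A", "B", "B"),
   species=c("human", "human", "human", "human"))

picked <- pick_annotation_colnames(sample_df)
print(picked)
```

Here sample_id is skipped because every value is unique, and species is skipped because it has only one value, leaving group as the informative annotation.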
- top_annotation
specific heatmap annotation as defined by ComplexHeatmap::HeatmapAnnotation(). When supplied, the top_colnames described above is not used.
- top_annotation_name_gp
grid::gpar object to customize the annotation name displayed beside the top annotation.
- rowData_colnames
character vector of colnames in rowData(se) to use for heatmap annotations displayed on the left side of the heatmap. Specific colors can be included in sample_color_list as a named list of color vectors or color functions. The names of this list must match colnames to be displayed, otherwise ComplexHeatmap::Heatmap() will define its own color function.
- left_annotation
specific heatmap annotation as defined by ComplexHeatmap::rowAnnotation(). When supplied, the rowData_colnames and sestats row annotations are not displayed. In order to supply custom row annotations and not lose the automatic left annotations described above, supply the custom row annotations as right_annotation.
- left_annotation_name_gp
grid::gpar object to customize the annotation name displayed beside the left annotation.
- left_annotation_name_rot
numeric rotation of left annotation label, in degrees, where 0 indicates normal text, and 90 is rotated vertically.
- right_annotation
specific heatmap annotation as defined by ComplexHeatmap::HeatmapAnnotation(). This element is created automatically when mark_rows is supplied.
- simple_anno_size
grid::unit size used to define heatmap annotation sizes (height or width of each line) for any simple annotations.
- legend_title_gp
grid::gpar to customize the legend title fonts, applied to each legend: top annotation, left annotation, main heatmap.
- legend_labels_gp
grid::gpar to customize the legend label fonts, applied to each legend: top annotation, left annotation, main heatmap.
- legend_grid_cex
numeric multiplied to adjust the relative size of each legend grid unit, applied to each relevant metric.
- row_names_gp
gpar to define custom row name settings. When "fontsize" is not defined, the automatic font size calculation is added to the row_names_gp supplied.
- row_split
is used to define heatmap split by row, ultimately passed to ComplexHeatmap::Heatmap() argument row_split. However, the input type can vary:
  - integer number of row splits based upon row clustering. If row_split is greater than the number of rows, it will be set to the number of rows.
  - character value or values in colnames of rowData(se) to split using row annotation in se.
  - data.frame whose rownames() must contain all rows to be displayed in the heatmap. This argument is passed directly to ComplexHeatmap::Heatmap() to apply the split appropriately.
  - character or factor vector named by rownames(se) with another custom row split, passed directly to ComplexHeatmap::Heatmap() argument row_split, with proper order for rows being displayed.
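An integer row_split can be pictured as cutting the row dendrogram into that many groups. The base R sketch below uses hclust() and cutree() as stand-ins, which is an assumption for illustration and not necessarily how the split is computed internally; the data are simulated.

```r
set.seed(1)
# Simulated data: 8 rows, 5 columns
x <- matrix(rnorm(40), nrow=8,
   dimnames=list(paste0("row", 1:8), NULL))

# A requested split larger than the number of rows is capped
requested_split <- 12
capped_split <- min(requested_split, nrow(x))

# Cut a hierarchical clustering of rows into 3 groups
hc <- hclust(dist(x), method="ward.D2")
split_groups <- cutree(hc, k=3)
print(table(split_groups))
```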
- row_subcluster
integer or character vector representing one or more elements returned by row_split to use as a drill-down sub-cluster heatmap. This argument is experimental, and is intended to make it easy to "drill down" into specific row clusters.
The process internally creates a full heatmap using all arguments as defined, then extracts the jamba::heatmap_row_order() which contains row split data in a list of rownames vectors. The list elements that match row_subcluster are extracted and used again for a subsequent heatmap, and are displayed in the same order in which they appear in the original full heatmap - which means cluster_rows=FALSE is defined at this point. However row_split is retained for this subset of rows, to indicate the original row split annotation.
Note that row_subcluster must match the names() returned by jamba::heatmap_row_order() for the full heatmap, or should include a numeric index for the list element or elements to use.
In principle this process would be run in two stages: First, view a heatmap with row_split=6, then re-run the same heatmap with row_subcluster=4 to see cluster number 4 from the full heatmap.
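The drill-down selection can be sketched with a plain list shaped like the output of jamba::heatmap_row_order(), that is, a list of rowname vectors named by row split. The list contents here are hypothetical, and this is only an illustration of the selection step.

```r
# Hypothetical row order from a full heatmap with three row splits
hro <- list(
   "1"=c("row_01", "row_07"),
   "2"=c("row_03"),
   "3"=c("row_05", "row_02", "row_09"))

# Select sub-clusters by name (a numeric index would also work)
row_subcluster <- c("1", "3")
sub_hro <- hro[row_subcluster]

# Rows for the second heatmap, kept in their original order
subset_rows <- unlist(unname(sub_hro))
# Retain the original split label for each selected row
subset_split <- rep(names(sub_hro), lengths(sub_hro))
print(data.frame(row=subset_rows, split=subset_split))
```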
- row_title_rot
numeric value indicating text rotation in degrees to use for row titles.
- sample_color_list
named list of color vectors or color functions, where names correspond to colnames in either colData(se) or rowData(se), and which are passed to corresponding left or top annotation functions. When colors are not defined, ComplexHeatmap::Heatmap() will define colors using its own internal function.
- legend_at, legend_labels
numeric and character, respectively, to define custom values for the heatmap color gradient legend.
  - When legend_at is supplied, it is used as provided.
  - When legend_labels is supplied, it is used only when its length equals length(legend_at), in which case it is used as provided.
  - When centerby_colnames=FALSE and the matrix data does not contain negative values, legend_at uses integers from 0 to color_max, to avoid presenting a color legend with unnecessary negative values. However, when color_max <= 1 it uses pretty(c(0, color_max)), removing extraneous values, then ensuring the maximum value is color_max. For example when color_max=0.85, the legend_at is likely to be c(0, 0.2, 0.4, 0.6, 0.8, 0.85).
  - When centerby_colnames is not FALSE, and/or data contains negative values, the legend_at is symmetric above and below zero. When color_max <= 1 the label is created using pretty(c(-color_max, color_max)), as described above, so -color_max and color_max are used as the minimum and maximum values. When color_max > 1 the legend_at uses integer steps.
  - When color_max <= 1 the legend_labels are presented as-is with no transformation.
  - When color_max > 1 the legend_labels are transformed with exp2signed(x) which is the inverse of log2(1 + x). This inverse transform displays normal space values; in the case of centered data, the values represent normal space fold changes. For example legend_at=c(-2, -1, 0, 1, 2) would result in legend_labels=c("-4", "-2", "1", "2", "4").
  - When correlation=TRUE the legend_labels by default use legend_at, following rules for color_max <= 1 above. Otherwise, legend_labels values are inverse transformed from log2(1 + x) in order to display normal space fold change values.
To override any of this behavior, supply both legend_at and corresponding legend_labels.
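The legend rules can be sketched in base R. Both helpers below are hypothetical and only reproduce the examples given in the text; in particular fold_labels() mirrors the example labels c("-4", "-2", "1", "2", "4") and is not the package's exp2signed().

```r
# Hypothetical helper: legend positions for a given color_max
legend_breaks <- function(color_max, centered=TRUE) {
   lower <- if (centered) -color_max else 0
   if (color_max <= 1) {
      at <- pretty(c(lower, color_max))
      # drop values at or beyond the limits, then add the exact limits
      at <- at[at > lower & at < color_max]
      at <- c(lower, at, color_max)
   } else {
      # integer steps for larger ranges
      at <- seq(from=ceiling(lower), to=floor(color_max), by=1)
   }
   at
}

# Hypothetical helper: signed fold change labels from log2 values
fold_labels <- function(at) {
   fc <- ifelse(at < 0, -2^abs(at), 2^abs(at))
   as.character(fc)
}

print(legend_breaks(0.85, centered=FALSE))
print(legend_breaks(3))
print(fold_labels(c(-2, -1, 0, 1, 2)))
```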
- subset_legend_colors
logical indicating whether to subset colors shown in the color key defined by sample_color_list, which is useful when the heatmap only represents a subset of categorical color values.
  - When subset_legend_colors == TRUE, the color key will only include colors shown in the top_annotation.
  - When subset_legend_colors == FALSE all colors defined in sample_color_list will be included for each relevant column.
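The effect of subset_legend_colors=TRUE can be pictured with a named color vector. This is a base R sketch with hypothetical colors and values, not the package's internal code.

```r
# Hypothetical colors defined for all possible values
group_colors <- c(Control="grey60", Low="gold", High="firebrick")

# Values actually present in the samples being displayed
shown_values <- c("Control", "High", "High")

# Keep only the colors that appear in the annotation
legend_colors <- group_colors[names(group_colors) %in% shown_values]
print(legend_colors)
```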
- row_cex, column_cex
numeric values used to adjust the row and column name font size, relative to the automatic adjustment that is already done based upon the number of rows and columns being displayed.
- row_anno_fontsize
numeric base font size for row annotation labels. This value is only used when left_annotation_name_gp is not supplied. Note these labels appear underneath row annotations, alongside column labels, and therefore they are also multiplied by column_cex so these labels are adjusted together.
- useMedian
logical passed to jamma::centerGeneData() during data centering.
- show_row_names, show_row_dend
logical indicating whether to display row names, and row dendrogram, respectively. With more than 2,000 rows this step can become somewhat slow.
- mark_rows
character vector of values in rownames(se) that should be labeled using ComplexHeatmap::anno_mark() in call-out style. Usually this argument is used when show_row_names=FALSE, hiding the row labels, but is not required. Values in mark_rows are intersected with rows displayed in the heatmap, therefore only matching entries will be labeled.
- mark_labels_gp
grid::gpar to customize the font used by labels when mark_rows is supplied.
- column_title
character optional title to include at the top of the heatmap. It can include a single value, or multiple values representing each column_split in the order they appear.
Note: This argument is ignored when apply_hm_column_title=TRUE.
When column_title=character(0) (default) or column_title="", the ComplexHeatmap::Heatmap() uses its usual default behavior, which is to assign column_title using column_split values when they are being used.
- apply_hm_column_title
logical (default FALSE) whether to apply the heatmap title to column_title. This option makes it convenient to display the title atop the heatmap without additional effort, however it hides any other column_title created by using column_split. When using both column_split and apply_hm_column_title=TRUE it may be useful to call heatmap_column_group_labels().
- hm_title_buffer
numeric number of whitespace lines to add to the heatmap title (attr(hm, "hm_title")), between the title and the heatmap below it. This whitespace can be useful when also calling heatmap_column_group_labels(), to provide enough space to draw the additional annotations.
- show_heatmap_legend, show_left_legend, show_top_legend
logical indicating whether each legend should be displayed. Sometimes there are too many annotations, and the color legends can overwhelm the figure. Note that show_left_legend is applied in a specific order, with these rules:
  - show_left_legend is extended to at least length 2, then values are used in order for: sestats, rowData_colnames, in order, using whichever is defined.
  - If sestats is defined, the first value in show_left_legend is used for this annotation, then the remaining values are used for rowData_colnames. Setting the first show_left_legend value to FALSE will ensure the legend for sestats is not displayed.
  - If rowData_colnames is defined, then the remaining values in show_left_legend are recycled for all columns in rowData_colnames, and applied in order. In this way, individual columns can have the legend displayed or hidden.
  - If alt_sestats is defined, the legend is always hidden, in favor of showing only the legend for sestats without duplicating this legend.
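The recycling rules for show_left_legend can be sketched in base R for the case where both sestats and rowData_colnames are defined. The column names are hypothetical, and this is an illustration of the described rules rather than the package's code.

```r
show_left_legend <- c(FALSE, TRUE, FALSE)
# Hypothetical rowData columns shown as left annotations
rowData_colnames <- c("Class", "Biotype", "Chromosome", "Strand")

# Extend to at least length 2
if (length(show_left_legend) < 2) {
   show_left_legend <- rep(show_left_legend, length.out=2)
}

# The first value applies to the sestats annotation
show_sestats_legend <- show_left_legend[1]

# Remaining values are recycled across rowData_colnames, in order
show_rowData_legend <- rep(show_left_legend[-1],
   length.out=length(rowData_colnames))
names(show_rowData_legend) <- rowData_colnames

print(show_sestats_legend)
print(show_rowData_legend)
```

With this input the sestats legend is hidden, and the rowData legends alternate between shown and hidden.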
- legend_border_color
character color used as the border color for the various legend colors. Note this argument recognizes only the first color provided, and does not recycle different colors across the various legend borders.
- show_top_annotation_name, show_left_annotation_name
logical indicating whether to display the annotation name beside the top and left annotations, respectively.
- row_label_colname
character string used as a row label, where this value is a colname in rowData(se). It is useful when rownames are some identifier that is not user-friendly, and where another column in the data may provide a more helpful label, for example "SYMBOL" to display gene symbol instead of accession number.
- cluster_columns, cluster_rows
logical indicating whether to cluster columns or rows, respectively, by hierarchical clustering; or function with a specific function that produces hclust or dendrogram output, given a numeric matrix. Note that cluster_rows default will replace NA values with zero (0) to avoid errors with missing data, and uses amap::hcluster() by default which is a one-step compiled process to perform distance calculation and hierarchical clustering.
- column_names_gp
gpar to define custom column name settings. When "fontsize" is not defined, the automatic font size calculation is added to the column_names_gp supplied.
- column_split
character or integer vector used to define heatmap column split.
- column_split_sep
character string used as delimiter when column_split defines multiple split levels.
- color_max
numeric value passed to colorjam::col_div_xf() which defines the upper limit of color gradient used in the heatmap.
- color_floor
numeric value passed to colorjam::col_div_xf() argument floor which defines the minimum non-zero numeric value for a color to be applied. This option is available to prevent coloring values below the color_floor which can be useful in some circumstances.
- lens
numeric value passed to colorjam::col_div_xf() to control the intensity of color gradient applied to the numeric range.
- rename_contrasts, rename_alt_contrasts
logical (default TRUE) whether to rename long contrast names in sestats and alt_sestats using contrast2comp().
- use_raster
logical passed to ComplexHeatmap::Heatmap() to determine whether heatmaps should be converted to raster images, which effectively turns each heatmap panel into a single graphical object. Recommend use_raster=TRUE and also installing R package magick which greatly enhances speed and quality of rasterized heatmap output. When magick is not available, it may be best to use use_raster=FALSE. When use_raster=FALSE each pixel square of a heatmap is its own graphical object. For heatmaps with very large dimensions, having each pixel as an object can make the heatmap extremely large in memory, and sometimes pixels can overlap others because the minimum pixel size of the output graphics device does not reflect the actual size of each pixel.
- verbose
logical indicating whether to print verbose output.
- debug
logical indicating debug mode, data is returned in a list:
  - hm object ComplexHeatmap::Heatmap
  - top_annotation object ComplexHeatmap::HeatmapAnnotation for columns
  - left_annotation object ComplexHeatmap::HeatmapAnnotation for rows
  - hm_title object character string with the heatmap title.
- ...
additional arguments are passed to supporting functions.
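As described for cluster_rows, a custom function can be supplied as long as it returns hclust or dendrogram output given a numeric matrix. The sketch below is a base R alternative, assuming stats::hclust() with "ward.D2" linkage is acceptable; it is not claimed to give the same result as the default amap::hcluster() call, and the function name cluster_rows_base() is hypothetical.

```r
# Hypothetical custom clustering function for cluster_rows
cluster_rows_base <- function(x, ...) {
   # replace missing values with zero, as the default function does
   x[is.na(x)] <- 0
   stats::hclust(stats::dist(x, method="euclidean"),
      method="ward.D2")
}

set.seed(2)
m <- matrix(rnorm(30), nrow=6,
   dimnames=list(paste0("row", 1:6), paste0("s", 1:5)))
m[2, 3] <- NA

hc <- cluster_rows_base(m)
print(hc$labels[hc$order])
```

It would then be supplied as heatmap_se(se, cluster_rows=cluster_rows_base).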
Details
Note: Still a work in progress. This function is the basis for the majority of heatmaps created for Omics data.
This function is a bold attempt to simplify the intricate task of creating an expression heatmap, using ComplexHeatmap::Heatmap(), given a SummarizedExperiment object.
It attempts to enable:
- selection of assays(se) to use in the heatmap
- use of rowData(se) or colData(se) to produce row and column annotations, respectively.
- re-use of defined colors for annotations, see platjam::design2colors()
- define and adjust heatmap color gradient and scale
- data centering by row: versus all columns, or specific controls, optionally within independent centering groups
- filtering rows to show only the statistical hits
- display annotation of statistical hits beside the heatmap
- split rows or columns using rowData(se) and colData(se), respectively
- heatmap title to display key options used, for easy reference
Additional Features
- data centering can be disabled with centerby_colnames=FALSE.
- alternative hits can be displayed using alt_sestats. It does not subset heatmap rows, it inherits rows from sestats.
- display a subset of columns after row centering, useful to hide the control group for certain figures.
- option to display correlation heatmap, using the same data centering, then calculates Pearson correlation across sample columns.
- labels and legend grids can be customized to exact sizes with grid::gpar() and grid::unit() definitions, for manuscript figures.
- mark annotations option to label a subset of rows
- row subclusters can be visualized using row_subcluster to drill down into specific subclusters from hierarchical clustering, k-means clustering, or any row_split.
Data Centering
The intent is to display expression values from assays(se), centered across all columns, or with customization defined by centerby_colnames and normgroup_colname. The resulting centered data can be subsetted by argument isamples, which occurs after centering in order to decouple the centering step from the display of resulting data. To subset samples involved in centering itself, either subset the input se data, or supply controlSamples to define a subset of samples used as the baseline in centering. See jamma::centerGeneData() for more details.
Paired data, also called repeated measures data, can be visualized by including the pairing as centerby_colnames so that centering is calculated within each pairing subgroup. In this case if also using controlSamples to define a "time zero" or "baseline", then all baseline samples will have exactly zero, if there is only one replicate per pairing group at the baseline. In this case, it may be useful to create the full heatmap once to confirm the centering is performing as intended, then create a second heatmap using isamples to show only the non-baseline samples - thus removing the large chunk of values with 0.
Note: data centering can be disabled with centerby_colnames=FALSE.
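The paired-data case can be sketched in base R: center within each pairing group against its baseline sample, then display only the non-baseline columns. The sample names and values are hypothetical, and this illustrates the idea rather than the package's internal code.

```r
# Hypothetical paired design: two patients, baseline (t0) and follow-up (t1)
x <- matrix(c(5, 7, 6, 9), nrow=1,
   dimnames=list("geneA", c("p1_t0", "p1_t1", "p2_t0", "p2_t1")))
patient <- c("p1", "p1", "p2", "p2")
controlSamples <- c("p1_t0", "p2_t0")

# Center within each patient, relative to that patient's baseline
xc <- x
for (p in unique(patient)) {
   k <- colnames(x)[patient == p]
   ctrl <- intersect(k, controlSamples)
   xc[, k] <- x[, k, drop=FALSE] - rowMeans(x[, ctrl, drop=FALSE])
}

# Baseline columns are exactly zero, so display only the other samples
isamples <- setdiff(colnames(x), controlSamples)
print(xc[, isamples, drop=FALSE])
```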
Heatmap Title
A heatmap title is returned as an attribute attr(hm, "hm_title"), which describes:
- total rows displayed, with row_type indicating the measured entity (gene, probe, DEGs, etc.)
- total columns displayed, with column_type indicating the sampled entity (samples, total replicates, etc.)
- the assay_name for the data being displayed
- relevant options for data centering, for example "global-centered" (by default) or "Centered within Cell Line, versus Wildtype"
To include the heatmap title:
ComplexHeatmap::draw(hm, column_title=attr(hm, "hm_title"))
Top and Left Annotations
The top heatmap annotations use colData(se) with user-supplied top_colnames or by auto-detecting those colnames that apply to multiple colnames(se).
Colors can be supplied using argument sample_color_list, as described below.
An incidence matrix of statistical hits can be displayed on the left of the heatmap, using arguments sestats and alt_sestats. These arguments can accept either the output of se_contrast_stats(), or they can be a numeric matrix with values c(-1, 0, 1), indicating statistical hits down, no change, and up, respectively. The contrasts can optionally be subset with contrast_names, which corresponds to columns in the matrix if supplied in that format.
When sestats is supplied, it will subset all heatmap rows to include only rows with at least one non-zero value in the incidence matrix. If argument rows is supplied, then all rownames(se) matching rows are displayed, regardless of statistical hits.
For comparison across other sestats results, argument alt_sestats is treated similar to sestats except that the heatmap is not subset based upon these values. That means the heatmap will be subset to match hits defined by sestats but not alt_sestats.
The alt_sestats incidence matrix is displayed to the far left of the sestats incidence matrix. For clarity, it can be useful to define alt_contrast_suffix to add a suffix to each contrast label, for example if sestats represents limma hits, use contrast_suffix=" limma", and if alt_sestats represents limma-voom hits, use alt_contrast_suffix=" limmavoom".
Argument rowData_colnames can be supplied, which enables display of rowData(se) annotations in the left_annotation of the heatmap. Colors can be supplied using argument sample_color_list.
Argument sample_color_list is a list named by each annotation column to be displayed as top or left annotation. Each list element is either:
- a character vector of R colors named by character value, or
- a function defined by circlize::colorRamp2() to be applied for numeric column values. In this case the breaks used to define the color function are used to define the color legend.
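A sample_color_list with both element types might look like the sketch below. The column names group, Class, and Score, and all color choices, are hypothetical; the numeric color function is only added when the circlize package is installed.

```r
# Named color vectors for categorical annotation columns
sample_color_list <- list(
   group=c(groupA="gold", groupB="firebrick", groupC="dodgerblue"),
   Class=c(A="grey30", B="grey70"))

# Color function for a hypothetical numeric column "Score";
# the breaks also define the values shown in the color legend
if (requireNamespace("circlize", quietly=TRUE)) {
   sample_color_list$Score <- circlize::colorRamp2(
      breaks=c(0, 5, 10),
      colors=c("white", "orange", "red"))
}

print(names(sample_color_list))
```

The list names must match the annotation colnames being displayed, for example heatmap_se(se, rowData_colnames="Class", sample_color_list=sample_color_list).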
The function platjam::design2colors() can be used to create sample_color_list starting with a data.frame of annotations, and will soon be moved into this package.
A custom left_annotation can be supplied, but this method currently prevents the other annotations described above from being displayed. To display automated annotations with rowData_colnames and custom row annotations, supply custom annotations with right_annotation. Note that annotations must be supplied in exact row order, which is usually easiest when supplying rows with a specific set of rows.
Compatible Input Formats
Data provided in se is expected to be SummarizedExperiment, however other Bioconductor data types are accepted that provide accessor functions: featureData(), phenoData(), and assayData(), including for example the "MethyLumiSet" class.
Note that matrix input is currently not supported, however it can be converted to SummarizedExperiment like this:
# x is a numeric matrix with rownames and colnames
se <- SummarizedExperiment::SummarizedExperiment(
   assays=list(data=x),
   rowData=data.frame(Gene=rownames(x)),
   colData=data.frame(Sample=colnames(x)))
See also
Other jamses heatmaps: detect_heatmap_components(), heatmap_column_group_labels()
Examples
se <- make_se_test(nrow=1000, ngroups=4, nreps=8)
# optionally define factor levels to force the order of labels
SummarizedExperiment::rowData(se)$Class <- factor(
   sample(head(LETTERS, 5), size=nrow(se), replace=TRUE))
# basic heatmap
hm <- heatmap_se(se, rowData_colnames="Class")
# draw by printing hm, or call draw() to add useful options
ComplexHeatmap::draw(hm,
   column_title=attr(hm, "hm_title"),
   merge_legends=TRUE)
# define specific colors
sample_color_list <- list(
   group=colorjam::group2colors(
      unique(SummarizedExperiment::colData(se)$group)),
   Class=colorjam::group2colors(
      unique(SummarizedExperiment::rowData(se)$Class)))
heatmap_se(se,
   rowData_colnames="Class",
   sample_color_list=sample_color_list)
# split rows by "Class"
heatmap_se(se,
   rowData_colnames="Class",
   row_split="Class",
   sample_color_list=sample_color_list)
# let's have some fun now
hm2 <- heatmap_se(se,
   column_split=c("group"),
   column_title_rot=90,
   row_split=c("Class"),
   rowData_colnames=c("Class"),
   cluster_row_slices=FALSE,
   sample_color_list=sample_color_list)
hm2drawn <- ComplexHeatmap::draw(hm2, merge_legends=TRUE)
# as an example, extract the row order
# technically you should use hm2drawn, but usually hm2 is enough
hro <- jamba::heatmap_row_order(hm2drawn)
jamba::sdim(hro)
#> rows class
#> A 196 character
#> B 210 character
#> C 199 character
#> D 215 character
#> E 180 character
lapply(hro, head, 7)
#> $A
#> row_0620 row_0676 row_0417 row_0582 row_0858 row_0730 row_0847
#> "row_0620" "row_0676" "row_0417" "row_0582" "row_0858" "row_0730" "row_0847"
#>
#> $B
#> row_0546 row_0935 row_0996 row_0091 row_0172 row_0636 row_0225
#> "row_0546" "row_0935" "row_0996" "row_0091" "row_0172" "row_0636" "row_0225"
#>
#> $C
#> row_0966 row_0152 row_0324 row_0863 row_0535 row_0342 row_0450
#> "row_0966" "row_0152" "row_0324" "row_0863" "row_0535" "row_0342" "row_0450"
#>
#> $D
#> row_0959 row_0934 row_0162 row_0336 row_0911 row_0274 row_0603
#> "row_0959" "row_0934" "row_0162" "row_0336" "row_0911" "row_0274" "row_0603"
#>
#> $E
#> row_0331 row_0657 row_0930 row_0200 row_0378 row_0497 row_0133
#> "row_0331" "row_0657" "row_0930" "row_0200" "row_0378" "row_0497" "row_0133"
#>
# (the names will differ from values when `row_labels` are customized)
# center by control samples (groupA in this example)
# - controlSamples
# - control_label
hm2 <- heatmap_se(se,
controlSamples=rownames(subset(
SummarizedExperiment::colData(se), group %in% "groupA")),
control_label="vs groupA",
column_split=c("group"),
column_title_rot=90,
row_split=c("Class"),
rowData_colnames=c("Class"),
cluster_row_slices=FALSE,
sample_color_list=sample_color_list)
hm2drawn <- ComplexHeatmap::draw(hm2,
column_title=attr(hm2, "hm_title"),
merge_legends=TRUE)
# add "callout" labels for a subset of rows
mark_rows <- c(sample(jamba::heatmap_row_order(hm2drawn)[[1]], size=5),
sample(jamba::heatmap_row_order(hm2drawn)[[1]], size=3));
# turn off ComplexHeatmap warning when using RStudio
ComplexHeatmap::ht_opt(message=FALSE)
hm3 <- heatmap_se(se,
mark_rows=mark_rows,
controlSamples=rownames(
subset(SummarizedExperiment::colData(se), group %in% "groupA")),
control_label="vs groupA",
column_split=c("group"),
column_title_rot=90,
row_split=c("Class"),
rowData_colnames=c("Class"),
cluster_row_slices=FALSE,
sample_color_list=sample_color_list)
ComplexHeatmap::draw(hm3,
column_title=attr(hm3, "hm_title"),
merge_legends=TRUE)
# sestats can accept list, incidence matrix, hit_array, or sestats
# this example defines a random set of hits
sestats_list <- list(
contrast1=setNames(sample(c(1, -1), replace=TRUE, size=50),
sample(rownames(se), size=50)),
contrast2=setNames(sample(c(1, -1), replace=TRUE, size=50),
sample(rownames(se), size=50)))
hm4 <- heatmap_se(se,
controlSamples=rownames(
subset(SummarizedExperiment::colData(se), group %in% "groupA")),
control_label="vs groupA",
sestats=sestats_list,
column_split=c("group"),
row_split=c("Class"),
rowData_colnames=c("Class"),
cluster_row_slices=FALSE,
sample_color_list=sample_color_list)
ComplexHeatmap::draw(hm4,
column_title=attr(hm4, "hm_title"),
merge_legends=TRUE)
# it doesn't take much effort to run stats really quick
sedesign <- groups_to_sedesign(SummarizedExperiment::colData(se)[, "group", drop=FALSE])
contrast_names(sedesign) <- jamba::vigrep("-groupA", contrast_names(sedesign))
sestats <- se_contrast_stats(se=se,
fold_cutoff=4,
sedesign=sedesign, assay_name="counts")
hm4s <- heatmap_se(se,
controlSamples=rownames(
subset(SummarizedExperiment::colData(se), group %in% "groupA")),
control_label="vs groupA",
sestats=sestats,
column_split=c("group"),
row_split=6,
rowData_colnames=c("Class"),
cluster_row_slices=FALSE,
sample_color_list=sample_color_list)
ComplexHeatmap::draw(hm4s,
column_title=attr(hm4s, "hm_title"),
merge_legends=TRUE)
# for fun, "drill down" into cluster 4
hm4s_4 <- heatmap_se(se,
controlSamples=rownames(
subset(SummarizedExperiment::colData(se), group %in% "groupA")),
control_label="vs groupA",
sestats=sestats,
column_split=c("group"),
row_split=6,
row_subcluster=4,
rowData_colnames=c("Class"),
cluster_row_slices=FALSE,
sample_color_list=sample_color_list)
#> Warning: The heatmap has not been initialized. You might have different results
#> if you repeatedly execute this function, e.g. when row_km/column_km was
#> set. It is more suggested to do as `ht = draw(ht); row_order(ht)`.
ComplexHeatmap::draw(hm4s_4,
column_title=attr(hm4s_4, "hm_title"),
merge_legends=TRUE)
# sestats can be provided as an incidence matrix
if (jamba::check_pkg_installed("venndir")) {
# convert sestats to list
sestats_hitlist <- hit_array_to_list(sestats)
# convert sestats hitlist to incidence matrix
# - for fun, use only the first two contrasts
sestats_hitim <- venndir::list2im_value(sestats_hitlist[1:2])
print(head(sestats_hitim));
# convert sestats_list to signed incidence matrix
sestats_im <- venndir::list2im_value(sestats_list)
print(head(sestats_im, 10));
# if the list contains only items (no direction), use venndir::list2im_opt()
hm5 <- heatmap_se(se,
controlSamples=rownames(
subset(SummarizedExperiment::colData(se), group %in% "groupA")),
control_label="vs groupA",
sestats=sestats_hitim,
column_split=c("group"),
rowData_colnames=c("Class"),
cluster_row_slices=FALSE,
sample_color_list=sample_color_list)
ComplexHeatmap::draw(hm5,
column_title=attr(hm5, "hm_title"),
merge_legends=TRUE)
}
#> groupB-groupA groupC-groupA
#> row_0022 -1 -1
#> row_0030 -1 0
#> row_0066 1 0
#> row_0075 1 0
#> row_0080 -1 0
#> row_0087 1 0
#> contrast1 contrast2
#> row_0154 -1 0
#> row_0149 1 0
#> row_0267 -1 0
#> row_0839 -1 0
#> row_0753 1 0
#> row_0818 1 0
#> row_0032 1 0
#> row_0545 -1 0
#> row_0052 -1 0
#> row_0851 -1 0
# customize column label fonts using column_names_gp
column_bold <- ifelse(
SummarizedExperiment::colData(se)$group %in% "groupA",
2, 1);
hm6 <- heatmap_se(se,
controlSamples=rownames(
subset(SummarizedExperiment::colData(se), group %in% "groupA")),
control_label="vs groupA",
column_names_gp=grid::gpar(col=sample_color_list$group[
as.character(SummarizedExperiment::colData(se)$group)],
font=column_bold),
column_split=c("group"),
row_split=c("Class"),
rowData_colnames=c("Class"),
cluster_row_slices=FALSE,
sample_color_list=sample_color_list)
ComplexHeatmap::draw(hm6,
column_title=attr(hm6, "hm_title"),
merge_legends=TRUE)
# with correlation=TRUE, any heatmap becomes a sample correlation heatmap
hm6corr <- heatmap_se(se,
correlation=TRUE,
apply_hm_column_title=TRUE,
controlSamples=rownames(
subset(SummarizedExperiment::colData(se), group %in% "groupA")),
control_label="vs groupA",
column_names_gp=grid::gpar(col=sample_color_list$group[
as.character(SummarizedExperiment::colData(se)$group)],
font=rep(c(1, 2, 1), c(3, 5, 24))),
column_split=c("group"),
sample_color_list=sample_color_list)
ComplexHeatmap::draw(hm6corr,
merge_legends=TRUE)
## Final heatmap:
# 1. Applies heatmap title automatically.
# 2. Hides the top_colnames
# 3. Adds fancy grouped labels above the heatmap.
#
# apply_hm_column_title=TRUE
# convenient way to define a title,
# but it does not also display column_split labels
#
# hm_title_buffer=3
# convenient way to insert some whitespace lines
#
# heatmap_column_group_labels()
# adds to a drawn heatmap - it must already be drawn
#
SummarizedExperiment::colData(se)$Genotype <- rep(c("WT", "KO"), each=16);
SummarizedExperiment::colData(se)$Treatment <- rep(c("Control", "Dex"), each=8);
hm7 <- heatmap_se(se,
apply_hm_column_title=TRUE,
hm_title_buffer=3,
controlSamples=rownames(
subset(SummarizedExperiment::colData(se), group %in% "groupA")),
control_label="vs groupA",
sestats=sestats_list,
top_colnames=FALSE,
column_split=c("group"),
row_split=c("Class"),
rowData_colnames=c("Class"),
cluster_row_slices=FALSE,
sample_color_list=sample_color_list)
hm7_drawn <- ComplexHeatmap::draw(hm7,
merge_legends=TRUE)
# now add fancy labels
heatmap_column_group_labels(
hm_group_list=c("Treatment", "Genotype"),
se=se,
hm_drawn=hm7_drawn)
# Note: this step does not work consistently inside the RStudio plot pane.
# In that case call dev.new(), re-run the draw() step above to create hm7_drawn,
# then repeat the heatmap_column_group_labels() step.
#
# adjust the height of labels with argument y_offset_lines
# with positive values (upward), or negative values (downward).