Import genome coverage matrix files
coverage_matrix2nmat(
  x = NULL,
  filename = NULL,
  signal_name = NULL,
  target_name = "target",
  background = 0,
  smooth = FALSE,
  target_is_single_point = FALSE,
  signal_is_categorical = FALSE,
  mat_grep = "[-0-9]+:[-0-9]+",
  upstream_grep = "^[-]",
  downstream_grep = "^[^-]",
  target_grep = "^0$",
  verbose = FALSE,
  ...
)
data.frame or compatible object containing genome coverage data, or a character file path. When x is not supplied, filename is used to import data. When x is a filename, it is used to populate filename, then data is imported into x.
character path to a genome coverage file. When x is supplied, this argument is ignored. When filename is used, only the first file is imported.
The name of the signal regions, used only when printing the object. When signal_name is NULL, it is derived from names(filename) if available, then from basename(filename), or is set to "signal" when only x is supplied.
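The fallback order described above can be sketched in a few lines of base R. The helper name derive_signal_name() is hypothetical and is not part of platjam; it only mirrors the documented order and may differ from the package's internal logic.

```r
# Hypothetical helper (not part of platjam): mirrors the documented
# fallback order for signal_name.
derive_signal_name <- function(filename = NULL, signal_name = NULL) {
  # An explicit signal_name always wins
  if (!is.null(signal_name)) {
    return(signal_name)
  }
  # Only x was supplied, so there is no filename to derive a name from
  if (is.null(filename)) {
    return("signal")
  }
  # Only the first file is used
  filename <- filename[1]
  # Prefer names(filename) when a non-empty name is present
  if (!is.null(names(filename)) && nzchar(names(filename))) {
    return(names(filename))
  }
  # Otherwise fall back to the file name without its directory
  basename(filename)
}

name_from_names <- derive_signal_name(c(H3K4me1 = "/data/h3k4me1_coverage.matrix"))
name_from_path <- derive_signal_name("/data/h3k4me1_coverage.matrix")
name_default <- derive_signal_name()
```

Under these assumptions, passing a named vector is the simplest way to control the label without setting signal_name directly.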
The name of the target regions. It is only used for printing the object.
numeric value containing the background value in the matrix.
logical whether to apply smoothing on rows.
logical values (target_is_single_point and signal_is_categorical) indicating whether the target region is a single point, and whether the signal matrix is categorical, respectively.
character regular expression pattern used to identify colnames which contain coverage data. The default pattern expects the format "-200:-100".
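As a quick illustration of what the default mat_grep pattern selects, the sketch below applies it to a made-up vector of column names. The column names are assumptions for illustration; the package's own matching step may differ in detail.

```r
# Default pattern from the function signature
mat_grep <- "[-0-9]+:[-0-9]+"

# Hypothetical column names, for illustration only
cols <- c("Gene ID", "-200:-100", "-100:0", "0:100", "100:200")

# TRUE for columns whose names look like "start:end" coordinate ranges
is_coverage <- grepl(mat_grep, cols)
coverage_cols <- cols[is_coverage]
```

The pattern is not anchored, so any column name that merely contains a "number:number" substring would also match. A stricter pattern can be supplied through mat_grep if the input file has such columns.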
character regular expression pattern used to identify upstream colnames from values that match mat_grep. The default assumes any region beginning with "-" is negative and upstream of the central target region.
character regular expression pattern used to identify downstream colnames from values that match mat_grep. The default assumes all colnames which are not upstream are therefore downstream.
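The default upstream and downstream patterns can be checked against a few example column names. This is a minimal sketch with made-up column names, and it applies the patterns directly to the names; it is not the package's internal code.

```r
# Hypothetical coverage column names that already matched mat_grep
coverage_cols <- c("-200:-100", "-100:0", "0:100", "100:200")

# Default patterns from the function signature
upstream_grep <- "^[-]"     # name begins with "-"
downstream_grep <- "^[^-]"  # name begins with anything other than "-"

upstream_cols <- grep(upstream_grep, coverage_cols, value = TRUE)
downstream_cols <- grep(downstream_grep, coverage_cols, value = TRUE)
```

With these defaults every coverage column falls into exactly one of the two groups, because a name either begins with "-" or it does not.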
character regular expression pattern used to identify a colname referring to the target, which by default can only be "0". Otherwise, no target region is defined.
logical indicating whether to print verbose output.
additional arguments are ignored.
normalizedMatrix numeric matrix, where additional metadata is stored in the object attributes. See EnrichedHeatmap::as.normalizedMatrix() for more details about the metadata. The rownames are defined by the first colname which does not match mat_grep, which by default is "Gene ID", otherwise rownames are NULL.
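The rowname rule can be sketched with a small data.frame. The data values and column names below are made up for illustration, and the steps are an assumption about how such a split could be done, not the package's actual implementation.

```r
# Hypothetical input: one identifier column plus two coverage columns
df <- data.frame(
  "Gene ID" = c("geneA", "geneB"),
  "-100:0" = c(1.5, 0),
  "0:100" = c(2, 3.25),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

mat_grep <- "[-0-9]+:[-0-9]+"
is_coverage <- grepl(mat_grep, colnames(df))

# Numeric matrix built from the coverage columns only
m <- as.matrix(df[, is_coverage, drop = FALSE])

# Rownames come from the first column that does not match mat_grep;
# when there is no such column, rownames are left as NULL
id_col <- which(!is_coverage)
if (length(id_col) > 0) {
  rownames(m) <- as.character(df[[id_col[1]]])
} else {
  rownames(m) <- NULL
}
```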
This function imports a genome coverage data matrix and returns an object of class normalizedMatrix, compatible for use by the package "EnrichedHeatmap".
There is a conversion function, EnrichedHeatmap::as.normalizedMatrix(); however, this function does not call it and instead defines the attributes directly. In future, this function may change to call that function.
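To make "defining the attributes directly" concrete, the sketch below tags a plain numeric matrix by hand. The attribute names follow what EnrichedHeatmap's normalizedMatrix objects are understood to carry; treat them as an assumption and check them against the EnrichedHeatmap::as.normalizedMatrix() documentation. The values are made up, and this is not the code used by coverage_matrix2nmat().

```r
# Hypothetical 2 x 4 coverage matrix, for illustration only
m <- matrix(
  c(0, 1, 2, 3, 4, 5, 6, 7),
  nrow = 2,
  dimnames = list(
    c("geneA", "geneB"),
    c("-200:-100", "-100:0", "0:100", "100:200")
  )
)

# Attribute names are assumed from EnrichedHeatmap conventions
attr(m, "upstream_index") <- 1:2          # columns upstream of the target
attr(m, "target_index") <- integer(0)     # no target columns in this sketch
attr(m, "downstream_index") <- 3:4        # columns downstream of the target
attr(m, "extend") <- c(200, 200)          # bases covered upstream, downstream
attr(m, "smooth") <- FALSE
attr(m, "signal_name") <- "signal"
attr(m, "target_name") <- "target"
attr(m, "target_is_single_point") <- FALSE
attr(m, "background") <- 0
attr(m, "signal_is_categorical") <- FALSE

# Tag the matrix with the class expected by EnrichedHeatmap
class(m) <- c("normalizedMatrix", "matrix")
```

Setting attributes by hand avoids a dependency on the conversion function, at the cost of having to keep the attribute set in step with EnrichedHeatmap if it changes.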
Other jam coverage heatmap functions: get_nmat_ceiling(), nmathm_row_order(), nmatlist2heatmaps(), validate_heatmap_params(), zoom_nmatlist(), zoom_nmat()
Other jam import functions: deepTools_matrix2nmat(), frequency_matrix2nmat(), import_lipotype_csv(), import_metabolomics_niehs(), import_nanostring_csv(), import_nanostring_rcc(), import_nanostring_rlf(), import_proteomics_PD(), import_proteomics_mascot(), import_salmon_quant(), process_metab_compounds_file()
## Small example files are included for testing
## (alternative file: "tss_coverage.matrix")
cov_file <- system.file("data", "h3k4me1_coverage.matrix", package="platjam");
## system.file() returns "" when the file is not found
if (nzchar(cov_file)) {
   nmat <- coverage_matrix2nmat(cov_file);
   jamba::printDebug("signal_name: ",
      attr(nmat, "signal_name"));
   if (suppressPackageStartupMessages(require(EnrichedHeatmap))) {
      color <- "red3";
      signal_name <- attr(nmat, "signal_name");
      ## cluster rows into k partitions using log-transformed coverage
      k <- 6;
      set.seed(123);
      partition <- kmeans(log10(1+nmat), centers=k)$cluster;
      ## coverage heatmap, split by partition
      EH <- EnrichedHeatmap(log10(1+nmat),
         split=partition,
         pos_line=FALSE,
         use_raster=TRUE,
         col=jamba::getColorRamp(color, n=10),
         top_annotation=HeatmapAnnotation(
            lines=anno_enriched(gp=grid::gpar(col=colorjam::rainbowJam(k)))
         ),
         axis_name_gp=grid::gpar(fontsize=8),
         name=signal_name,
         column_title=signal_name
      );
      ## narrow color stripe showing the partition of each row
      PHM <- Heatmap(partition,
         use_raster=TRUE,
         col=structure(colorjam::rainbowJam(k),
            names=as.character(seq_len(k))),
         name="partition",
         show_row_names=FALSE,
         width=grid::unit(3, "mm"));
      draw(PHM + EH, main_heatmap=2);
   }
}
#> ## (12:31:34) 21Sep2023: signal_name: h3k4me1_coverage.matrix