Add gaps between GRangesList regions

addGRLgaps(
  grl,
  strandSpecific = TRUE,
  gapname = "gap",
  suffix = "_v",
  newValues = list(feature_type = "gap"),
  default_feature_type = "exon",
  feature_type_colname = "feature_type",
  doSort = TRUE,
  verbose = FALSE,
  ...
)

Arguments

grl

GRangesList object

strandSpecific

logical indicating whether the gaps are calculated per strand, see getGRLgaps().

gapname, suffix

character vector supplying the name to assign to new gap GRanges elements, using jamba::makeNames() with suffix as described to define non-duplicated names. If gapname is NULL then no names are assigned to new gap GRanges entries, however when the input gr GRanges object has names, the concatenation of gaps causes names "" to be assigned to all gap GRanges elements, which are duplicated for multiple gaps.

newValues

list of values to add to the resulting gap GRanges, whose names become colnames(grl@unlistData), and whose values are used to populate each column. By default a colname "feature_type" is added, with value "gap" added to each row. When newValues is NULL then no values are added to the gaps GRanges. For every entry in names(newValues), if the colname does not exist, it is created and populated with default_feature_type as a default value, prior to adding gaps.

default_feature_type, feature_type_colname

character values, indicating the type and colname to populate in values(grl@unlistData) prior to adding gaps. When feature_type_colname is NULL, no action is taken. When feature_type_colname does not exist in colnames(values(grl@unlistData)), it is created with values in default_feature_type. Also any colname defined by newValues that does not already exist is also created and populated with default_feature_type.

doSort

logical indicating whether to sort the resulting GRanges objects. When doSort=FALSE the gaps are added to the end of each input grl GRanges object. Note that the GrangesList object is not sorted, only the GRanges objects within the GRangesList are sorted.

verbose

logical indicating whether to print verbose output.

...

additional arguments are passed to getGRLgaps().

Value

GRangesList object, sorted per GRangesList element when doSort=TRUE. When newValues is supplied, the values for gaps GRanges elements will be assigned, otherwise any column values present in gr will be NA for gaps elements. The names of gaps elements are assigned using gapname then are made unique using jamba::makeNames(), unless gapname is NULL.

Details

This function adds gaps between each GRanges region separately for each GRangesList element, where there is a gap between two GRanges for the same seqnames. When strandSpecific=TRUE the gaps are determined per strand.

This function is a wrapper around getGRLgaps(), which is then concatenated to the input gr GRanges object using S4Vectors::pc(). When the input grl GRanges has column S4Vectors::values() then the gaps GRanges object will have NA values used by default. To supply values, use the newValues argument, which assigns name-value pairs.

See also

Examples

gr <- GenomicRanges::GRanges(seqnames=rep(c("chr1","chr2"), c(3,2)), ranges=IRanges::IRanges(start=c(100, 300, 400, 300, 700), end=c(199, 450, 500, 600, 800)), strand=rep(c("+","-"), c(3,2)), feature_type=rep("exon", 5)); names(gr) <- jamba::makeNames(rep("exon", length(gr))); gr;
#> GRanges object with 5 ranges and 1 metadata column: #> seqnames ranges strand | feature_type #> <Rle> <IRanges> <Rle> | <character> #> exon_v1 chr1 100-199 + | exon #> exon_v2 chr1 300-450 + | exon #> exon_v3 chr1 400-500 + | exon #> exon_v4 chr2 300-600 - | exon #> exon_v5 chr2 700-800 - | exon #> ------- #> seqinfo: 2 sequences from an unspecified genome; no seqlengths
addGRgaps(gr);
#> GRanges object with 7 ranges and 1 metadata column: #> seqnames ranges strand | feature_type #> <Rle> <IRanges> <Rle> | <character> #> exon_v1 chr1 100-199 + | exon #> gap_v1 chr1 200-299 + | gap #> exon_v2 chr1 300-450 + | exon #> exon_v3 chr1 400-500 + | exon #> exon_v4 chr2 300-600 - | exon #> gap_v2 chr2 601-699 - | gap #> exon_v5 chr2 700-800 - | exon #> ------- #> seqinfo: 2 sequences from an unspecified genome; no seqlengths
grl <- GenomicRanges::split(gr, GenomicRanges::seqnames(gr)); grl;
#> GRangesList object of length 2: #> $chr1 #> GRanges object with 3 ranges and 1 metadata column: #> seqnames ranges strand | feature_type #> <Rle> <IRanges> <Rle> | <character> #> exon_v1 chr1 100-199 + | exon #> exon_v2 chr1 300-450 + | exon #> exon_v3 chr1 400-500 + | exon #> ------- #> seqinfo: 2 sequences from an unspecified genome; no seqlengths #> #> $chr2 #> GRanges object with 2 ranges and 1 metadata column: #> seqnames ranges strand | feature_type #> <Rle> <IRanges> <Rle> | <character> #> exon_v4 chr2 300-600 - | exon #> exon_v5 chr2 700-800 - | exon #> ------- #> seqinfo: 2 sequences from an unspecified genome; no seqlengths #>
addGRLgaps(grl);
#> GRangesList object of length 2: #> $chr1 #> GRanges object with 4 ranges and 1 metadata column: #> seqnames ranges strand | feature_type #> <Rle> <IRanges> <Rle> | <character> #> exon_v1 chr1 100-199 + | exon #> gap_v1 chr1 200-299 + | gap #> exon_v2 chr1 300-450 + | exon #> exon_v3 chr1 400-500 + | exon #> ------- #> seqinfo: 2 sequences from an unspecified genome; no seqlengths #> #> $chr2 #> GRanges object with 3 ranges and 1 metadata column: #> seqnames ranges strand | feature_type #> <Rle> <IRanges> <Rle> | <character> #> exon_v4 chr2 300-600 - | exon #> gap_v2 chr2 601-699 - | gap #> exon_v5 chr2 700-800 - | exon #> ------- #> seqinfo: 2 sequences from an unspecified genome; no seqlengths #>
addGRLgaps(grl, strandSpecific=FALSE);
#> GRangesList object of length 2: #> $chr1 #> GRanges object with 4 ranges and 1 metadata column: #> seqnames ranges strand | feature_type #> <Rle> <IRanges> <Rle> | <character> #> exon_v1 chr1 100-199 + | exon #> exon_v2 chr1 300-450 + | exon #> exon_v3 chr1 400-500 + | exon #> gap_v1 chr1 200-299 * | gap #> ------- #> seqinfo: 2 sequences from an unspecified genome; no seqlengths #> #> $chr2 #> GRanges object with 3 ranges and 1 metadata column: #> seqnames ranges strand | feature_type #> <Rle> <IRanges> <Rle> | <character> #> exon_v4 chr2 300-600 - | exon #> exon_v5 chr2 700-800 - | exon #> gap_v2 chr2 601-699 * | gap #> ------- #> seqinfo: 2 sequences from an unspecified genome; no seqlengths #>