Add gaps between GRanges regions

addGRgaps(
  gr,
  strandSpecific = TRUE,
  gapname = "gap",
  suffix = "_v",
  newValues = list(feature_type = "gap"),
  default_feature_type = "exon",
  feature_type_colname = "feature_type",
  doSort = TRUE,
  ...
)

Arguments

gr

GRanges object

strandSpecific

logical indicating whether the gaps are calculated per strand, see getGRgaps().

gapname, suffix

character values used to name the new gap GRanges elements. gapname supplies the base name, and jamba::makeNames() is called with suffix to make the names unique. If gapname is NULL, no names are assigned to the new gap GRanges entries. Note that when the input gr GRanges object has names, concatenating unnamed gaps assigns the empty name "" to every gap element, so those names are duplicated when there are multiple gaps.

newValues

list of values to add to the resulting gap GRanges. The list names become column names in the metadata columns of the output, and each value populates that column for the gap rows. By default, a column "feature_type" is added with value "gap" in each gap row. When newValues is NULL, no values are added to the gaps GRanges.

doSort

logical indicating whether to sort the resulting GRanges object. When doSort=FALSE the gaps are added to the end of the gr input GRanges object.

...

additional arguments are passed to getGRgaps().

Value

GRanges object containing the input ranges and the gaps between them, sorted when doSort=TRUE. When newValues is supplied, its values are assigned to the gap elements; any other columns present in gr are NA for the gap elements. Names of gap elements are assigned from gapname and made unique using jamba::makeNames(), unless gapname is NULL.

Details

This function adds a gap range wherever two consecutive GRanges regions on the same seqnames are separated by positions that no region covers. When strandSpecific=TRUE the gaps are determined per strand.

This function is a wrapper around getGRgaps(), whose result is concatenated to the input gr GRanges object using base::c(). When the input gr has metadata columns in S4Vectors::values(), the gaps GRanges object has NA values in those columns by default. To supply values, use the newValues argument, which assigns name-value pairs.
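
The concatenation step described above can be sketched with GenomicRanges alone. This is an illustration only, not the getGRgaps() implementation: it assumes gaps can be approximated by subtracting the input ranges from their per-seqnames, per-strand span, and the object names gr_span, gr_gaps, and gr_with_gaps are hypothetical.

```r
library(GenomicRanges)

gr <- GRanges(
   seqnames=rep(c("chr1", "chr2"), c(3, 2)),
   ranges=IRanges(
      start=c(100, 300, 400, 300, 700),
      end=c(199, 450, 500, 600, 800)),
   strand=rep(c("+", "-"), c(3, 2)))

# Span covered by the ranges on each seqnames/strand combination
gr_span <- range(gr)

# Positions inside each span that no input range covers.
# setdiff() on GRanges is strand-aware by default (assumed stand-in for getGRgaps()).
gr_gaps <- setdiff(gr_span, gr)

# Populate the same metadata column on both objects before concatenating,
# comparable to newValues=list(feature_type="gap")
gr$feature_type <- "exon"
gr_gaps$feature_type <- "gap"

# Concatenate, then sort, analogous to doSort=TRUE
gr_with_gaps <- sort(c(gr, gr_gaps))
gr_with_gaps
```

Setting feature_type on both objects before c() avoids relying on how mismatched metadata columns are filled, which may differ between Bioconductor versions.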

Examples

gr <- GenomicRanges::GRanges(
   seqnames=rep(c("chr1","chr2"), c(3,2)),
   ranges=IRanges::IRanges(
      start=c(100, 300, 400, 300, 700),
      end=c(199, 450, 500, 600, 800)),
   strand=rep(c("+","-"), c(3,2)));
gr;
#> GRanges object with 5 ranges and 0 metadata columns:
#>       seqnames    ranges strand
#>          <Rle> <IRanges>  <Rle>
#>   [1]     chr1   100-199      +
#>   [2]     chr1   300-450      +
#>   [3]     chr1   400-500      +
#>   [4]     chr2   300-600      -
#>   [5]     chr2   700-800      -
#>   -------
#>   seqinfo: 2 sequences from an unspecified genome; no seqlengths

getGRLgaps(GenomicRanges::split(gr, GenomicRanges::seqnames(gr)))
#> GRangesList object of length 2:
#> $chr1
#> GRanges object with 1 range and 0 metadata columns:
#>       seqnames    ranges strand
#>          <Rle> <IRanges>  <Rle>
#>   [1]     chr1   200-299      +
#>   -------
#>   seqinfo: 2 sequences from an unspecified genome; no seqlengths
#>
#> $chr2
#> GRanges object with 1 range and 0 metadata columns:
#>       seqnames    ranges strand
#>          <Rle> <IRanges>  <Rle>
#>   [1]     chr2   601-699      -
#>   -------
#>   seqinfo: 2 sequences from an unspecified genome; no seqlengths
#>

getGRgaps(gr);
#> GRanges object with 2 ranges and 0 metadata columns:
#>       seqnames    ranges strand
#>          <Rle> <IRanges>  <Rle>
#>   [1]     chr1   200-299      +
#>   [2]     chr2   601-699      -
#>   -------
#>   seqinfo: 2 sequences from an unspecified genome; no seqlengths
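
A call to addGRgaps() itself, using only arguments listed in the usage section, might look like the sketch below. It assumes the package that provides addGRgaps() is already attached and skips the calls otherwise; the result names gr_default and gr_unsorted are hypothetical, and no output is shown.

```r
library(GenomicRanges)

gr <- GRanges(
   seqnames=rep(c("chr1", "chr2"), c(3, 2)),
   ranges=IRanges(
      start=c(100, 300, 400, 300, 700),
      end=c(199, 450, 500, 600, 800)),
   strand=rep(c("+", "-"), c(3, 2)))

# Run only when addGRgaps() is available in the session (assumption:
# the providing package has been attached beforehand)
if (exists("addGRgaps", mode="function")) {
   # Default arguments: gaps are named from gapname and
   # given feature_type "gap" through newValues
   gr_default <- addGRgaps(gr)

   # Append gaps to the end without sorting, and without gap names
   gr_unsorted <- addGRgaps(gr,
      doSort=FALSE,
      gapname=NULL)
}
```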