Annotate GRangesList from GRangesList objects

annotateGRLfromGRL(
  GRL1,
  GRL2,
  annoName1 = "name",
  annoName2 = "name",
  grlOL = NULL,
  add_grl_name = FALSE,
  returnType = c("GRL", "GR"),
  splitColname = annoName1,
  verbose = FALSE,
  ...
)

Arguments

GRL1

GRangesList query

GRL2

GRangesList subject, used to add annotations to GRL1

annoName1

character value giving the colname of values(GRL1) to use as the name, or "name" to use names(GRL1).

annoName2

character value giving the colname of values(GRL2) to use as the name, or "name" to use names(GRL2).

grlOL

optional overlap result from GenomicRanges::findOverlaps() for these same GRangesList objects, used to save time by not running GenomicRanges::findOverlaps() a second time.

add_grl_name

logical indicating whether to add the names of each GRangesList object to the output object, useful for tracking the annotations to the source data.

returnType

character value indicating whether to return a GRangesList ("GRL") or GRanges ("GR") object.

splitColname

character value used internally to indicate how to split the resulting annotated GRanges data back into a GRangesList. This value should almost always be identical to annoName1, which splits the resulting GRanges back into the same structure as the input GRangesList.

verbose

logical indicating whether to print verbose output.

...

additional arguments are passed to annotateGRfromGR(). To customize the aggregation functions, supply numShrinkFunc or stringShrinkFunc as described in annotateGRfromGR().

Value

GRangesList object with the same length and lengths as the input GRL1, with annotation columns added from GRL2.

Details

This function extends annotateGRfromGR() for the special case of GRangesList objects. It requires that both GRangesList objects have identical length, and assumes both are in equivalent order. It then restricts all overlapping annotations to those where the query and subject come from the same original GRangesList index.
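The same-index restriction described above can be sketched with GenomicRanges functions alone. This is a minimal illustration of the idea and not necessarily how annotateGRLfromGRL() is implemented internally. The data and the object names (query_grl, subject_grl, kept_hits) are made up for the example.

```r
library(GenomicRanges)

# Two small GRangesList objects in the same order (illustrative data)
gr <- GRanges(
   seqnames=rep("chr1", 4),
   ranges=IRanges(start=c(100, 300, 150, 320), width=100),
   strand=rep("+", 4),
   gene_name=c("GeneA", "GeneA", "GeneB", "GeneB"))
query_grl <- split(gr[, 0], gr$gene_name)
subject_grl <- split(gr, gr$gene_name)

# Flatten each list, and record the list index of every range
query_gr <- unlist(query_grl, use.names=FALSE)
subject_gr <- unlist(subject_grl, use.names=FALSE)
query_index <- rep(seq_along(query_grl), lengths(query_grl))
subject_index <- rep(seq_along(subject_grl), lengths(subject_grl))

# Find all overlaps, then keep only hits where query and subject
# belong to the same list index
all_hits <- findOverlaps(query_gr, subject_gr)
same_index <- query_index[queryHits(all_hits)] ==
   subject_index[subjectHits(all_hits)]
kept_hits <- all_hits[same_index]
```

In this toy data GeneA and GeneB ranges overlap each other on chr1, so all_hits contains cross-gene hits that kept_hits drops.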

This function is particularly useful following an operation on a GRangesList object that otherwise removes all annotations in values(GRL1), for example GenomicRanges::reduce() or GenomicRanges::flank(). This function can be used to re-annotate the resulting features using the original GRangesList object.
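As a sketch of that workflow, the example below shows GenomicRanges::reduce() dropping the annotation column, followed by a re-annotation call. The data and object names are illustrative. The final call assumes the package that provides annotateGRLfromGRL() is attached, so it is guarded and skipped otherwise.

```r
library(GenomicRanges)

# Illustrative annotated GRangesList with overlapping ranges per gene
gr <- GRanges(
   seqnames=rep("chr1", 4),
   ranges=IRanges(start=c(100, 150, 400, 420), width=100),
   strand=rep("+", 4),
   gene_name=c("GeneA", "GeneA", "GeneB", "GeneB"))
grl_orig <- split(gr, gr$gene_name)

# reduce() merges overlapping ranges within each list element;
# the result no longer carries the gene_name column
grl_reduced <- reduce(grl_orig)
colnames(mcols(unlist(grl_reduced, use.names=FALSE)))

# Re-annotate the reduced ranges using the original object.
# Assumption: annotateGRLfromGRL() is available in the session.
if (exists("annotateGRLfromGRL")) {
   grl_annotated <- annotateGRLfromGRL(grl_reduced, grl_orig)
}
```

Because reduce() preserves the list order and length, the reduced object and the original satisfy the identical-length, equivalent-order requirement.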

Note that annotations are added at the level of individual GRanges entries, equivalent to values(GRL1@unlistData). This function does not currently apply annotations at the GRangesList level, thus it does not use values(GRL2) if they exist.
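The two annotation levels can be told apart as follows. This is a small sketch with made-up data; the gene_type column is hypothetical and exists only to show a list-level value.

```r
library(GenomicRanges)

gr <- GRanges(
   seqnames=rep("chr1", 4),
   ranges=IRanges(start=c(100, 300, 500, 700), width=50),
   strand=rep("+", 4),
   gene_name=c("GeneA", "GeneA", "GeneB", "GeneB"))
grl <- split(gr, gr$gene_name)

# List-level annotation: one row per GRangesList element
# (gene_type is a hypothetical column for illustration)
mcols(grl) <- S4Vectors::DataFrame(gene_type=c("coding", "noncoding"))

# Range-level annotation: one row per individual range,
# the same data as values(grl@unlistData)
range_values <- mcols(unlist(grl, use.names=FALSE))

# List-level annotation, not used by annotateGRLfromGRL()
list_values <- mcols(grl)
```

Here range_values has one row per range and holds gene_name, while list_values has one row per list element and holds gene_type.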

See also

Examples

gr12 <- GenomicRanges::GRanges(
   seqnames=rep(c("chr1", "chr2", "chr1"), c(3,3,3)),
   ranges=IRanges::IRanges(
      start=c(100, 200, 400, 500, 300, 100, 200, 400, 600),
      width=c(100,150,50, 50,50,100, 50,200,50)
   ),
   strand=rep(c("+", "-", "+"), c(3,3,3)),
   gene_name=rep(c("GeneA", "GeneB", "GeneC"), each=3)
)

# Now split into GRangesList
grl1 <- GenomicRanges::split(gr12[,0],
   GenomicRanges::values(gr12)$gene_name);
grl2 <- GenomicRanges::split(gr12,
   GenomicRanges::values(gr12)$gene_name);

# The first object is a GRangesList with no annotations
grl1;
#> GRangesList object of length 3:
#> $GeneA
#> GRanges object with 3 ranges and 0 metadata columns:
#>       seqnames    ranges strand
#>          <Rle> <IRanges>  <Rle>
#>   [1]     chr1   100-199      +
#>   [2]     chr1   200-349      +
#>   [3]     chr1   400-449      +
#>   -------
#>   seqinfo: 2 sequences from an unspecified genome; no seqlengths
#>
#> $GeneB
#> GRanges object with 3 ranges and 0 metadata columns:
#>       seqnames    ranges strand
#>          <Rle> <IRanges>  <Rle>
#>   [1]     chr2   500-549      -
#>   [2]     chr2   300-349      -
#>   [3]     chr2   100-199      -
#>   -------
#>   seqinfo: 2 sequences from an unspecified genome; no seqlengths
#>
#> $GeneC
#> GRanges object with 3 ranges and 0 metadata columns:
#>       seqnames    ranges strand
#>          <Rle> <IRanges>  <Rle>
#>   [1]     chr1   200-249      +
#>   [2]     chr1   400-599      +
#>   [3]     chr1   600-649      +
#>   -------
#>   seqinfo: 2 sequences from an unspecified genome; no seqlengths
#>

# The second object is a GRangesList with annotation,
# assumed to be in the same order
grl2;
#> GRangesList object of length 3:
#> $GeneA
#> GRanges object with 3 ranges and 1 metadata column:
#>       seqnames    ranges strand |   gene_name
#>          <Rle> <IRanges>  <Rle> | <character>
#>   [1]     chr1   100-199      + |       GeneA
#>   [2]     chr1   200-349      + |       GeneA
#>   [3]     chr1   400-449      + |       GeneA
#>   -------
#>   seqinfo: 2 sequences from an unspecified genome; no seqlengths
#>
#> $GeneB
#> GRanges object with 3 ranges and 1 metadata column:
#>       seqnames    ranges strand |   gene_name
#>          <Rle> <IRanges>  <Rle> | <character>
#>   [1]     chr2   500-549      - |       GeneB
#>   [2]     chr2   300-349      - |       GeneB
#>   [3]     chr2   100-199      - |       GeneB
#>   -------
#>   seqinfo: 2 sequences from an unspecified genome; no seqlengths
#>
#> $GeneC
#> GRanges object with 3 ranges and 1 metadata column:
#>       seqnames    ranges strand |   gene_name
#>          <Rle> <IRanges>  <Rle> | <character>
#>   [1]     chr1   200-249      + |       GeneC
#>   [2]     chr1   400-599      + |       GeneC
#>   [3]     chr1   600-649      + |       GeneC
#>   -------
#>   seqinfo: 2 sequences from an unspecified genome; no seqlengths
#>

annotateGRLfromGRL(grl1, grl2);
#> GRangesList object of length 3:
#> $GeneA
#> GRanges object with 3 ranges and 1 metadata column:
#>           seqnames    ranges strand |           X
#>              <Rle> <IRanges>  <Rle> | <character>
#>   grl1_v1     chr1   100-199      + |       GeneA
#>   grl1_v2     chr1   200-349      + |       GeneA
#>   grl1_v3     chr1   400-449      + |       GeneA
#>   -------
#>   seqinfo: 2 sequences from an unspecified genome; no seqlengths
#>
#> $GeneB
#> GRanges object with 3 ranges and 1 metadata column:
#>           seqnames    ranges strand |           X
#>              <Rle> <IRanges>  <Rle> | <character>
#>   grl1_v4     chr2   500-549      - |       GeneB
#>   grl1_v5     chr2   300-349      - |       GeneB
#>   grl1_v6     chr2   100-199      - |       GeneB
#>   -------
#>   seqinfo: 2 sequences from an unspecified genome; no seqlengths
#>
#> $GeneC
#> GRanges object with 3 ranges and 1 metadata column:
#>           seqnames    ranges strand |           X
#>              <Rle> <IRanges>  <Rle> | <character>
#>   grl1_v7     chr1   200-249      + |       GeneC
#>   grl1_v8     chr1   400-599      + |       GeneC
#>   grl1_v9     chr1   600-649      + |       GeneC
#>   -------
#>   seqinfo: 2 sequences from an unspecified genome; no seqlengths
#>