Assign exon names to GRangesList
assignGRLexonNames( GRL, geneSymbolColname = "geneSymbol", exonNameColname = paste0(geneSymbolColname, "Exon"), suffix = "_exon", renameOnes = FALSE, filterTwoStrand = TRUE, checkDisjoin = c("warn", "none", "stop"), assignGRLnames = TRUE, verbose = FALSE, ... )
GRL |
|
---|---|
geneSymbolColname |
|
exonNameColname |
|
suffix |
|
renameOnes |
|
filterTwoStrand |
|
checkDisjoin |
|
assignGRLnames |
|
verbose | logical indicating whether to print verbose output. |
... | additional arguments are ignored. |
GRangesList object with additional columns indicating the exon name.
This function takes a GRangesList object with an annotated gene symbol column, and defines exon numbers for each distinct (non-adjacent) range per gene. When multiple ranges overlap, one exon number is applied to them all, and disjoint ranges are denoted using a letter suffix.
For example the exon labels below:
|======|......|======|=======|======|......|======|=======|
.exon1.........exon2a.exon2b..exon2c........exon3a..exon3b.
The full name for each feature will become:
Gene_exon1
Gene_exon2a
Gene_exon2b
Gene_exon2c
Gene_exon3a
Gene_exon3b
The reasoning, is to preserve the gene symbol for readability, but to number exons to indicate the numbered contiguous exon, with suffix to indicate the sub-section of each exon.
Other jam GRanges functions:
addGRLgaps()
,
addGRgaps()
,
annotateGRLfromGRL()
,
annotateGRfromGR()
,
closestExonToJunctions()
,
combineGRcoverage()
,
exoncov2polygon()
,
findOverlapsGRL()
,
flattenExonsBy()
,
getFirstStrandedFromGRL()
,
getGRLgaps()
,
getGRcoverageFromBw()
,
getGRgaps()
,
grl2df()
,
jam_isDisjoint()
,
make_ref2compressed()
,
sortGRL()
,
spliceGR2junctionDF()
,
stackJunctions()
Other jam RNA-seq functions:
closestExonToJunctions()
,
combineGRcoverage()
,
defineDetectedTx()
,
detectedTxInfo()
,
exoncov2polygon()
,
flattenExonsBy()
,
getGRcoverageFromBw()
,
groups2contrasts()
,
internal_junc_score()
,
makeTx2geneFromGtf()
,
make_ref2compressed()
,
prepareSashimi()
,
runDiffSplice()
,
sortSamples()
,
spliceGR2junctionDF()
gene1 <- GenomicRanges::GRanges(seqnames="chr1", ranges=IRanges::IRanges( start=c(100, 200, 400, 500), width=c(100, 50, 50, 150)), strand="+", geneSymbol="Gene1"); gene2 <- GenomicRanges::GRanges(seqnames="chr1", ranges=IRanges::IRanges( start=c(600, 900, 1150, 1200), width=c(200, 100, 100, 50)), strand="-", geneSymbol="Gene2"); gene3 <- GenomicRanges::GRanges(seqnames="chr1", ranges=IRanges::IRanges( start=c(1500), width=c(75)), strand="+", geneSymbol="Gene3"); GRL <- GenomicRanges::GRangesList(list(gene1, gene2, gene3)) assignGRLexonNames(GRL)#> GRangesList object of length 3: #> $Gene1 #> GRanges object with 4 ranges and 2 metadata columns: #> seqnames ranges strand | geneSymbol geneSymbolExon #> <Rle> <IRanges> <Rle> | <character> <character> #> Gene1_exon1a chr1 100-199 + | Gene1 Gene1_exon1a #> Gene1_exon1b chr1 200-249 + | Gene1 Gene1_exon1b #> Gene1_exon2 chr1 400-449 + | Gene1 Gene1_exon2 #> Gene1_exon3 chr1 500-649 + | Gene1 Gene1_exon3 #> ------- #> seqinfo: 1 sequence from an unspecified genome; no seqlengths #> #> $Gene2 #> GRanges object with 4 ranges and 2 metadata columns: #> seqnames ranges strand | geneSymbol geneSymbolExon #> <Rle> <IRanges> <Rle> | <character> <character> #> Gene2_exon3 chr1 600-799 - | Gene2 Gene2_exon3 #> Gene2_exon2 chr1 900-999 - | Gene2 Gene2_exon2 #> Gene2_exon1b chr1 1150-1249 - | Gene2 Gene2_exon1b #> Gene2_exon1a chr1 1200-1249 - | Gene2 Gene2_exon1a #> ------- #> seqinfo: 1 sequence from an unspecified genome; no seqlengths #> #> $Gene3 #> GRanges object with 1 range and 2 metadata columns: #> seqnames ranges strand | geneSymbol geneSymbolExon #> <Rle> <IRanges> <Rle> | <character> <character> #> Gene3_exon1 chr1 1500-1574 + | Gene3 Gene3_exon1 #> ------- #> seqinfo: 1 sequence from an unspecified genome; no seqlengths #>