Assign exon names to GRangesList
assignGRLexonNames( GRL, geneSymbolColname = "geneSymbol", exonNameColname = paste0(geneSymbolColname, "Exon"), suffix = "_exon", renameOnes = FALSE, filterTwoStrand = TRUE, checkDisjoin = c("warn", "none", "stop"), assignGRLnames = TRUE, verbose = FALSE, ... )
| GRL |
|
|---|---|
| geneSymbolColname |
|
| exonNameColname |
|
| suffix |
|
| renameOnes |
|
| filterTwoStrand |
|
| checkDisjoin |
|
| assignGRLnames |
|
| verbose | logical indicating whether to print verbose output. |
| ... | additional arguments are ignored. |
GRangesList object with additional columns indicating the exon name.
This function takes a GRangesList object with an annotated gene symbol column, and defines exon numbers for each distinct (non-adjacent) range per gene. When multiple ranges overlap, one exon number is applied to them all, and disjoint ranges are denoted using a letter suffix.
For example the exon labels below:
|======|......|======|=======|======|......|======|=======|
.exon1.........exon2a.exon2b..exon2c........exon3a..exon3b.
The full name for each feature will become:
Gene_exon1
Gene_exon2a
Gene_exon2b
Gene_exon2c
Gene_exon3a
Gene_exon3b
The reasoning, is to preserve the gene symbol for readability, but to number exons to indicate the numbered contiguous exon, with suffix to indicate the sub-section of each exon.
Other jam GRanges functions:
addGRLgaps(),
addGRgaps(),
annotateGRLfromGRL(),
annotateGRfromGR(),
closestExonToJunctions(),
combineGRcoverage(),
exoncov2polygon(),
findOverlapsGRL(),
flattenExonsBy(),
getFirstStrandedFromGRL(),
getGRLgaps(),
getGRcoverageFromBw(),
getGRgaps(),
grl2df(),
jam_isDisjoint(),
make_ref2compressed(),
sortGRL(),
spliceGR2junctionDF(),
stackJunctions()
Other jam RNA-seq functions:
closestExonToJunctions(),
combineGRcoverage(),
defineDetectedTx(),
detectedTxInfo(),
exoncov2polygon(),
flattenExonsBy(),
getGRcoverageFromBw(),
groups2contrasts(),
internal_junc_score(),
makeTx2geneFromGtf(),
make_ref2compressed(),
prepareSashimi(),
runDiffSplice(),
sortSamples(),
spliceGR2junctionDF()
gene1 <- GenomicRanges::GRanges(seqnames="chr1", ranges=IRanges::IRanges( start=c(100, 200, 400, 500), width=c(100, 50, 50, 150)), strand="+", geneSymbol="Gene1"); gene2 <- GenomicRanges::GRanges(seqnames="chr1", ranges=IRanges::IRanges( start=c(600, 900, 1150, 1200), width=c(200, 100, 100, 50)), strand="-", geneSymbol="Gene2"); gene3 <- GenomicRanges::GRanges(seqnames="chr1", ranges=IRanges::IRanges( start=c(1500), width=c(75)), strand="+", geneSymbol="Gene3"); GRL <- GenomicRanges::GRangesList(list(gene1, gene2, gene3)) assignGRLexonNames(GRL)#> GRangesList object of length 3: #> $Gene1 #> GRanges object with 4 ranges and 2 metadata columns: #> seqnames ranges strand | geneSymbol geneSymbolExon #> <Rle> <IRanges> <Rle> | <character> <character> #> Gene1_exon1a chr1 100-199 + | Gene1 Gene1_exon1a #> Gene1_exon1b chr1 200-249 + | Gene1 Gene1_exon1b #> Gene1_exon2 chr1 400-449 + | Gene1 Gene1_exon2 #> Gene1_exon3 chr1 500-649 + | Gene1 Gene1_exon3 #> ------- #> seqinfo: 1 sequence from an unspecified genome; no seqlengths #> #> $Gene2 #> GRanges object with 4 ranges and 2 metadata columns: #> seqnames ranges strand | geneSymbol geneSymbolExon #> <Rle> <IRanges> <Rle> | <character> <character> #> Gene2_exon3 chr1 600-799 - | Gene2 Gene2_exon3 #> Gene2_exon2 chr1 900-999 - | Gene2 Gene2_exon2 #> Gene2_exon1b chr1 1150-1249 - | Gene2 Gene2_exon1b #> Gene2_exon1a chr1 1200-1249 - | Gene2 Gene2_exon1a #> ------- #> seqinfo: 1 sequence from an unspecified genome; no seqlengths #> #> $Gene3 #> GRanges object with 1 range and 2 metadata columns: #> seqnames ranges strand | geneSymbol geneSymbolExon #> <Rle> <IRanges> <Rle> | <character> <character> #> Gene3_exon1 chr1 1500-1574 + | Gene3 Gene3_exon1 #> ------- #> seqinfo: 1 sequence from an unspecified genome; no seqlengths #>