Convert PSL alignment to data.frame

psl2df(psl, verbose = FALSE, ...)

Arguments

psl

file or other connection compatible with base::readLines(), containing data in PSL alignment format.

verbose

logical indicating whether to print verbose output.

...

additional arguments are ignored.

Details

This function takes PSL alignment format, as produced by BLAT, and converts it to a data.frame. It exists because current methods of converting PSL to other formats lose useful information. For example, conversion to BED12 format discards the query alignment coordinates and stores only the reference coordinates.

To supply text as input, wrap it in a text connection with base::textConnection().
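
As a minimal sketch of what that wrapping does, the following uses only base R and does not call psl2df(). The variable names and the single PSL-like record are illustrative assumptions, not data from a real alignment. A character vector wrapped in textConnection() can be read with readLines() in the same way as a file.

```r
# Hypothetical single-record PSL line (21 tab-delimited fields), for illustration only.
psl_text <- c(
  "10\t0\t0\t0\t0\t0\t0\t0\t+\tq1\t10\t0\t10\tt1\t100\t5\t15\t1\t10,\t0,\t5,"
)

# Wrap the character vector in a text connection and read it like a file.
con <- textConnection(psl_text)
lines <- readLines(con)
close(con)

# Split the first line on tabs to recover the individual PSL fields.
fields <- strsplit(lines, "\t", fixed = TRUE)[[1]]
length(fields)
```

Each element of the character vector becomes one line of input, so a multi-record PSL can be supplied as a vector with one record per element.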

See also

Other jam data import functions: import_juncs_from_bed()

Examples

psls <- c(
  "2252\t0\t0\t0\t0\t0\t5\t56285\t+\tQuery_Sequence\t2252\t0\t2252\tReference_Sequence\t99095\t1992\t60529\t6\t227,86,71,79,77,1712,\t0,227,313,384,463,540,\t1992,6931,11020,39871,45861,58817,",
  "664\t0\t0\t0\t1\t3\t1\t1\t+\tQuery_Sequence\t2252\t1291\t1958\tReference_Sequence\t99095\t58842\t59507\t2\t657,7,\t1291,1951,\t58842,59500,"
)
psl2df(textConnection(psls), verbose=TRUE)
#> ## (19:08:42) 09Mar2021: psl2df(): Detected no psLayout header line.
#> ## (19:08:42) 09Mar2021: length(pslLines):2
#>   match mis-match rep.match Ns Qgapcount Qgapbases Tgapcount Tgapbases strand
#> 1  2252         0         0  0         0         0         5     56285      +
#> 2   664         0         0  0         1         3         1         1      +
#>            Qname Qsize Qstart Qend              Tname Tsize Tstart  Tend
#> 1 Query_Sequence  2252      0 2252 Reference_Sequence 99095   1992 60529
#> 2 Query_Sequence  2252   1291 1958 Reference_Sequence 99095  58842 59507
#>   blockcount            blockSizes                qStarts
#> 1          6 227,86,71,79,77,1712, 0,227,313,384,463,540,
#> 2          2                657,7,             1291,1951,
#>                              tStarts
#> 1 1992,6931,11020,39871,45861,58817,
#> 2                       58842,59500,
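
The blockSizes, qStarts and tStarts columns hold comma-terminated lists, which is where the query coordinates that BED12 drops are kept. The sketch below shows one possible way to expand them into per-block coordinates with base R. The helper parse_psl_list() is hypothetical and not part of this package, the input strings are copied from the second row of the output above, and the end-coordinate arithmetic assumes a "+" strand alignment.

```r
# Hypothetical helper (not part of the package): split a comma-terminated
# PSL list such as "657,7," into an integer vector.
parse_psl_list <- function(x) {
  as.integer(strsplit(x, ",", fixed = TRUE)[[1]])
}

# Values copied from row 2 of the example output above.
block_sizes <- parse_psl_list("657,7,")
q_starts <- parse_psl_list("1291,1951,")
t_starts <- parse_psl_list("58842,59500,")

# One row per alignment block; ends are start + block size
# (assumes "+" strand, as in the example).
blocks <- data.frame(
  qStart = q_starts,
  qEnd = q_starts + block_sizes,
  tStart = t_starts,
  tEnd = t_starts + block_sizes
)
blocks
```

The last block ends at 1958 on the query and 59507 on the reference, which agrees with the Qend and Tend values in the same row.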