GCGTTGGTGGCATAGTGGTGAGCATAGCTGCCTTCCAAGCAGTTATGGGAG
Line 1 is the read identifier, which describes the machine, flowcell, cluster, grid coordinate, end and barcode for the read. Except for the barcode information, read identifiers will be identical for corresponding entries in the R1 and R2 fastq files.
Line 2 is the sequence reported by the machine.
Line 3 is always '+' from GSAF (it can optionally include a sequence description).
Line 4 is a string of ASCII-encoded base quality scores, one character per base in the sequence. For each base, an integer quality score = -10 log10(probability the base is wrong) is calculated, then 33 is added to give a number in the ASCII printable character range.
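The quality encoding described above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, assuming the offset of 33 stated in the text and rounding to the nearest integer; the function names are made up for this example and do not come from any particular library.

```python
import math

# Offset added to the integer quality score, as described in the text.
PHRED_OFFSET = 33


def error_prob_to_char(p):
    """Encode an error probability as a single quality character.

    Assumes the score is rounded to the nearest integer.
    """
    q = round(-10 * math.log10(p))
    return chr(q + PHRED_OFFSET)


def decode_qualities(quality_line):
    """Turn line 4 of a fastq record into a list of integer quality scores."""
    return [ord(c) - PHRED_OFFSET for c in quality_line]


def quality_to_error_prob(q):
    """Invert the score: probability that the base call is wrong."""
    return 10 ** (-q / 10)


if __name__ == "__main__":
    # 'I' is ASCII 73, so its quality score is 73 - 33 = 40.
    for ch, q in zip("I5!", decode_qualities("I5!")):
        print(ch, q, quality_to_error_prob(q))
```

A score of 40 corresponds to a 1 in 10,000 chance that the base is wrong, and a score of 20 to a 1 in 100 chance.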