2019NCSS

Based on the principles of RNA virus biology and nanopore sequencing, we used Minimap2 to map the sequencing reads to the reference sequences. Reads that match the reference sequence directly are RNA positive-strand fragments. Reads that match the reference sequence only through complementary base pairing (that is, as its reverse complement) are RNA negative-strand fragments. The remaining reads are treated as sequencing noise and are filtered out.
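The strand assignment described above can be sketched from Minimap2's SAM output. This is a minimal illustration, not the authors' pipeline: it assumes the alignments were written in SAM format, and it relies only on the standard SAM FLAG bits (0x4 for unmapped, 0x10 for reverse-complement alignment, 0x100 and 0x800 for secondary and supplementary records). The function names classify_read and count_strands are hypothetical.

```python
def classify_read(flag: int) -> str:
    """Classify one alignment record by its SAM FLAG field."""
    if flag & 0x4:       # read did not map to the reference: treated as noise
        return "noise"
    if flag & 0x10:      # mapped as the reverse complement: negative strand
        return "negative"
    return "positive"    # mapped in the forward orientation: positive strand


def count_strands(sam_lines):
    """Count positive-strand, negative-strand, and noise reads in SAM text lines."""
    counts = {"positive": 0, "negative": 0, "noise": 0}
    for line in sam_lines:
        if line.startswith("@"):      # skip SAM header lines
            continue
        fields = line.rstrip("\n").split("\t")
        flag = int(fields[1])         # FLAG is the second SAM column
        if flag & 0x900:              # assumption: count each read once, so
            continue                  # skip secondary/supplementary records
        counts[classify_read(flag)] += 1
    return counts
```

Skipping secondary and supplementary records is a design choice made here so that each read contributes once; the source does not say how such records were handled.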

The red part represents the reference sequence, and the purple part represents the matched part of the reads.
Using the mapping results, we set Q20 as the minimum base quality and filtered the bases at each position accordingly. We then counted the bases at each site to determine the type and probability of mutation at that site. For example, suppose 100 reads cover site x, of which 30 carry A, 10 carry C, and 60 carry U. Here U is taken to be the reference base at position x, so the site has a U-A variation and a U-C variation. The U-A mutation probability is A/(A+U) = 30/90 = 0.333333, and the U-C mutation probability is C/(C+U) = 10/70 = 0.142857.
Finally, all site probabilities were averaged and normalized.
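The per-site calculation above can be written out as a short sketch. The helper names count_bases and mutation_probabilities, and the (base, quality) input layout, are assumptions made for illustration; the reference base is passed in explicitly rather than inferred. The averaging and normalization step is not sketched because the text does not specify how it was done.

```python
from collections import Counter

MIN_BASE_QUALITY = 20  # the Q20 threshold named in the text


def count_bases(calls, min_quality=MIN_BASE_QUALITY):
    """Count bases at one site, keeping calls with quality >= min_quality.

    `calls` is an iterable of (base, phred_quality) pairs (assumed layout).
    """
    return Counter(base for base, qual in calls if qual >= min_quality)


def mutation_probabilities(counts, ref_base):
    """Return {"ref-alt": alt / (alt + ref)} for every non-reference base seen."""
    ref_n = counts.get(ref_base, 0)
    probs = {}
    for base, n in counts.items():
        if base == ref_base or n == 0:
            continue
        probs[f"{ref_base}-{base}"] = n / (n + ref_n)
    return probs


# Worked example from the text: 30 A, 10 C, and 60 U at site x, reference base U.
site_x = {"A": 30, "C": 10, "U": 60}
probs = mutation_probabilities(site_x, ref_base="U")
print({k: round(v, 6) for k, v in probs.items()})
# {'U-A': 0.333333, 'U-C': 0.142857}
```

Note that each probability is computed against the reference count alone, as in the text, so the values for one site do not sum to the overall non-reference fraction.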
In the histogram, the x-axis shows the type of base variation, and the y-axis shows the number of sites at which that type of variation occurs. Within each mutation type, the left column is the positive strand and the right column is the negative strand.
Within each column, different mutation probability intervals are shown in different colors.