Polymorphism frequency in ananassae?

Posted: Thu Feb 07, 2013 1:44 am
by cjones
Most of my students have one or more sites in their projects in which one or more reads have a high-quality discrepancy from the consensus. Given the size of the homologous flanking regions I'm inclined to call these polymorphisms, but there are a lot more of them than seems reasonable. Has this been a common observation in these ananassae finishing projects?

Re: Polymorphism frequency in ananassae?

Posted: Thu Feb 07, 2013 2:08 am
by wleung
Many of the genuine high-quality discrepancies I have seen in the D. ananassae projects fall in regions that match known D. ananassae transposons. The consensus sequences in these repetitive regions are marked with blue "repeat" tags. Consequently, some of the high-quality discrepancies could be attributed to reads that have been placed into the wrong copy of the transposon. While there might be only a single copy of the transposon in your project, the read might have been derived from a different copy of the transposon elsewhere in the genome. Note that if a read is part of a transposon, we would expect it to be highly similar to other copies of that transposon in the genome.

Hence I would suggest checking both the read and its mate pair to verify that you are confident in the placement of both reads in your assembly. In addition, you can search the read (and its mate pair) against the whole genome assembly (e.g. using FlyBase BLAST or BLAT in the official UCSC Genome Browser) to see if there are other locations in the genome where the reads would match better.
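
If you end up checking many reads, the comparison can be scripted. The following is only a rough sketch, assuming you have saved the search results in standard BLAST tabular format (the 12-column "outfmt 6" layout). The function names and the scaffold names in the example are made up for illustration and are not part of any GEP tool.

```python
from collections import defaultdict


def parse_tabular_hits(lines):
    """Group BLAST tabular (outfmt 6) hits by query read name."""
    hits = defaultdict(list)
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        f = line.split("\t")
        hits[f[0]].append({
            "subject": f[1],           # sequence the read matched
            "pident": float(f[2]),     # percent identity
            "length": int(f[3]),       # alignment length
            "sstart": int(f[8]),
            "send": int(f[9]),
            "bitscore": float(f[11]),
        })
    return hits


def reads_with_better_placement(hits, expected_subject):
    """Return reads whose best hit outside expected_subject scores strictly higher.

    Simplification: a second hit elsewhere on the same subject sequence
    is not treated as an alternative placement.
    """
    flagged = {}
    for read, read_hits in hits.items():
        here = [h["bitscore"] for h in read_hits if h["subject"] == expected_subject]
        elsewhere = [h for h in read_hits if h["subject"] != expected_subject]
        if not elsewhere:
            continue
        best_elsewhere = max(elsewhere, key=lambda h: h["bitscore"])
        if best_elsewhere["bitscore"] > max(here, default=float("-inf")):
            flagged[read] = best_elsewhere
    return flagged


# Hypothetical example: readA matches scaffold_9 better than scaffold_1.
example = [
    "readA\tscaffold_1\t98.5\t700\t10\t0\t1\t700\t1000\t1699\t0.0\t1200",
    "readA\tscaffold_9\t100.0\t700\t0\t0\t1\t700\t500\t1199\t0.0\t1290",
    "readB\tscaffold_1\t99.9\t650\t1\t0\t1\t650\t3000\t3649\t0.0\t1190",
]
flagged = reads_with_better_placement(parse_tabular_hits(example), "scaffold_1")
print(sorted(flagged))
```

A read flagged this way is only a candidate for misplacement; you would still want to inspect the read and its mate pair in the assembly before moving anything.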

In addition, for projects with major misassemblies, high-quality discrepancies are a key tool for distinguishing different repeat copies from each other. In particular, regions where multiple reads each show multiple high-quality discrepancies would likely indicate a collapsed repeat. Please refer to Chris' talk on polymorphisms for more information on how you can use the restriction digests to distinguish polymorphisms from misassemblies.
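
To make the "multiple reads with multiple discrepancies" pattern concrete, here is a minimal sketch that bins discrepancies along the consensus and reports bins where several reads each carry several of them. It assumes you have written down the discrepancies yourself as (read name, consensus position) pairs. The window size and both thresholds are arbitrary illustrative values, and this is not how Consed itself reports discrepancies.

```python
from collections import defaultdict


def candidate_collapsed_repeats(discrepancies, window=500, min_reads=2, min_per_read=2):
    """Return (start, end, reads) for windows that look like collapsed repeats.

    discrepancies: iterable of (read_name, consensus_position) pairs.
    A window is reported when at least min_reads reads each have at
    least min_per_read high-quality discrepancies inside it.
    """
    counts = defaultdict(lambda: defaultdict(int))
    for read, pos in discrepancies:
        counts[pos // window][read] += 1  # fixed, non-overlapping bins

    flagged = []
    for w in sorted(counts):
        reads = sorted(r for r, n in counts[w].items() if n >= min_per_read)
        if len(reads) >= min_reads:
            flagged.append((w * window, (w + 1) * window - 1, reads))
    return flagged


# Hypothetical data: r1 and r2 each have two discrepancies in the first window.
observed = [("r1", 120), ("r1", 180), ("r2", 130), ("r2", 410), ("r3", 900)]
print(candidate_collapsed_repeats(observed))
```

Because the bins are fixed, a cluster that straddles a bin boundary can be missed, so treat the output as a list of places to look at more closely rather than a verdict.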