What is close enough to the end to ignore?

Ask questions about sequence improvement / finishing D. mojavensis projects here.
dpaetkau
Posts: 29
Joined: Fri Jun 05, 2009 6:18 pm

What is close enough to the end to ignore?

Post by dpaetkau » Tue Feb 07, 2012 11:43 pm

In the answers to the Drosophila Homework Exercise, it says that we would not normally call for reads close to the ends because another overlapping contig will take care of these. What counts as "close to the end"? How many bases should one ignore? What is the standard tag or comment that you should put on these problems to show that you have looked at them? Also, is there something in the naming that shows two contigs are adjacent, so that you could check the neighboring contig and fix the problem with the reads from that contig? (We did this when we were annotating, but I am not sure how to do it with finishing.)

wleung
Posts: 182
Joined: Sun Feb 04, 2007 7:41 pm
Location: Washington University in St. Louis

Re: What is close enough to the end to ignore?

Post by wleung » Wed Feb 08, 2012 1:03 pm

Links to the genome browser with the estimated position of each clone are available at the Drosophila ananassae dot chromosome page on the GEP wiki. In general, you can ignore problem regions in the first and last 2 kb of the fosmid clone. You can add a comment tag to the problematic regions at the fosmid ends to indicate that you have inspected the region and that the problems were ignored because they are close to the fosmid clone ends.

> Also, is there something in the naming that shows two contigs are adjacent
The fosmid clone names are based on their well position, so you cannot infer the clone position in the genomic assembly from the clone names.
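
For anyone who wants to apply the 2 kb rule of thumb to a list of problem regions, here is a minimal sketch in Python. It is only an illustration, not part of any GEP tool: the function name `near_clone_end`, the 1-based inclusive coordinates, and the choice to count a region as "at the end" only when it lies entirely inside the margin are all assumptions made for this example.

```python
# Hypothetical helper illustrating the "first and last 2 kb" rule of thumb.
# Assumptions (not from the thread): coordinates are 1-based and inclusive,
# and a region only counts as "near the end" if it lies ENTIRELY inside
# the margin at one end of the clone.

END_MARGIN_BP = 2000  # the 2 kb margin mentioned in the reply above


def near_clone_end(start, end, clone_length, margin=END_MARGIN_BP):
    """Return True if region [start, end] sits within `margin` bp of a clone end."""
    if not (1 <= start <= end <= clone_length):
        raise ValueError("region must satisfy 1 <= start <= end <= clone_length")
    in_first_margin = end <= margin
    in_last_margin = start > clone_length - margin
    return in_first_margin or in_last_margin


# Example with an illustrative 40,000 bp clone (the length is made up).
clone_length = 40_000
for region in [(150, 300), (1900, 2100), (38_500, 38_600)]:
    status = "can ignore (add comment tag)" if near_clone_end(*region, clone_length) else "needs attention"
    print(region, status)
```

A region that straddles the 2 kb boundary is reported as needing attention here; that is a conservative choice for the sketch, so adjust it if your project treats overlapping regions differently.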
