Region of high sequence quality following poor quality

Ask questions about sequence improvement / finishing D. mojavensis projects here.
jstamm
Posts: 30
Joined: Mon Aug 06, 2007 8:28 pm

Region of high sequence quality following poor quality

Post by jstamm » Thu Feb 14, 2008 7:34 pm

This was an interesting one...

I have a student working on fosmid 240-J07. There is a gap between contigs 3c and 4 of the original Consed assembly. He ordered a new read to span the gap: 08XBAB-240J07_t2.b3. A short sequence at the beginning of the read matches well with the end of contig 3c bordering the gap. This is followed by a region of poor quality, and then by a very long, very high quality region that runs all the way to the end of the read. However, phredphrap designates these bases as X's in Consed, probably because of the poor quality region in the middle. Is it okay to manually edit the base calls in the high quality region so he can close the gap? Or is there another way to treat this?

(I would post a picture of the read, but I don't know how to do that.)
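
For anyone who wants to describe a read like this without a screenshot, here is a rough sketch of how the long high quality stretch could be located from the per-base phred scores. It assumes the quality values have already been pulled out of the read's quality file into a plain list of integers; the function name and the cutoff of 20 are just illustrative choices, not anything Consed or phredphrap uses internally.

```python
def longest_high_quality_run(quals, min_q=20):
    """Return (start, end) of the longest run of scores >= min_q.

    The interval is half-open, so the run covers quals[start:end].
    min_q=20 is an assumed cutoff (phred 20 is roughly a 1% error rate).
    """
    best_start, best_len = 0, 0
    start = None  # start index of the run currently being scanned
    for i, q in enumerate(quals):
        if q >= min_q:
            if start is None:
                start = i
            run_len = i - start + 1
            if run_len > best_len:
                best_start, best_len = start, run_len
        else:
            start = None  # a low quality base ends the current run
    return best_start, best_start + best_len
```

The returned coordinates are positions in the read, so they can be compared directly with where the X's begin in Consed.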

cshaffer
Posts: 211
Joined: Sun Feb 04, 2007 10:29 pm
Location: Washington University in St Louis

editing bases

Post by cshaffer » Fri Feb 15, 2008 9:31 pm

Sorry this is late. I thought I had posted the answer to this, but it didn't post for some reason.

Yes, feel free to edit the bases and do the join if you want.

You can also edit the bases and then re-run phredphrap. You don't even have to do a force join; if the edits are good, it should all fall together.

I am a little curious, though, because X's are usually restricted to regions that get tagged as vector sequence by one of the subroutines in phredphrap (it does make mistakes, though). You should have the student make sure he is not looking at a sequencing reaction that runs from the fosmid insert off into the fosmid vector backbone. I would have the student run BLAST 2 Sequences from the NCBI BLAST web page and compare the read to the fosmid vector. If you need the vector sequence, it is posted on the wiki (the file is called "singleVectorForRestrictionDigest.fasta").
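
If the web page is not handy, a quick local sanity check along the same lines can be sketched in Python. This is only a crude stand-in for BLAST, not a replacement for it: it reports what fraction of the read's k-mers also occur in the vector on either strand. It assumes the read and vector sequences have already been loaded as plain strings; the function names and the default k of 12 are hypothetical choices for illustration.

```python
def reverse_complement(seq):
    """Reverse complement of a DNA string; any non-ACGT base becomes N."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(comp.get(base, "N") for base in reversed(seq.upper()))


def kmer_set(seq, k):
    """All distinct substrings of length k in seq."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def vector_fraction(read, vector, k=12):
    """Fraction of the read's k-mers found in the vector on either strand."""
    read = read.upper()
    read_kmers = [read[i:i + k] for i in range(len(read) - k + 1)]
    if not read_kmers:
        return 0.0  # read shorter than k, nothing to compare
    vector_kmers = kmer_set(vector, k) | kmer_set(reverse_complement(vector), k)
    hits = sum(1 for kmer in read_kmers if kmer in vector_kmers)
    return hits / len(read_kmers)
```

A value near 1 for the X'd-out portion of the read would suggest the reaction ran into the vector backbone, while a value near 0 would suggest insert sequence. Either way, the BLAST alignment is the better evidence.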

jstamm
Posts: 30
Joined: Mon Aug 06, 2007 8:28 pm

You were right

Post by jstamm » Mon Feb 18, 2008 10:20 pm

The student got suspicious when the read wouldn't close the gap, and looked at the vector sequence. Because that particular contig was complemented, he actually sequenced in the wrong direction and got the vector sequence instead. On the bright side, he now knows with certainty which direction he should go.
