fixed high quality discrepancies keep reappearing

Ask questions about sequence improvement / finishing D. mojavensis projects here.
dpaetkau
Posts: 31
Joined: Fri Jun 05, 2009 6:18 pm

Post by dpaetkau » Mon Feb 20, 2012 11:58 am

I have several students who are having the same trouble. They fix their high quality discrepancies (usually a single base or a section of low quality) by converting the bases to low quality. The chromatograms show multiple peaks, a wide peak that looks like two peaks, or very low peaks. This works great and they save the file. When they open the new ace file and run the list of problems, the high quality discrepancies show up again. Shouldn't the problem area go away? This only happens with some of the students. Any hints?

wleung
Posts: 185
Joined: Sun Feb 04, 2007 7:41 pm
Location: Washington University in St. Louis

Re: fixed high quality discrepancies keep reappearing

Post by wleung » Mon Feb 20, 2012 1:15 pm

This problem usually occurs when you try to change the quality of a pad (*). Changing the discrepant pad to low quality will appear to resolve the high quality discrepancy. However, the edits you have made to the pad will disappear the next time you start Consed. This happens because the pad does not actually exist in the underlying phd file (which stores all the edits you have made to a read). The pads are inserted by Consed when it aligns the reads in a region for display. Hence, while you can edit the pad within the Aligned Reads window, the changes you have made to the pad are not stored by Consed.

If you have determined there are multiple peaks in a region that have been miscalled by phred (e.g. based on the distance between the peaks within a trace), I would edit the pad to the base directly. Alternatively, you can add a comment tag on the pad to indicate that the consensus is correct.
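To illustrate why an edit to a pad has nowhere to be stored, here is a minimal Python sketch. It is not Consed code and it does not read real ace or phd files; the function name `padded_to_unpadded` and the example read are hypothetical. It only shows the idea described above: each real base in the padded (aligned) read maps to a position in the unpadded read, while a pad maps to nothing.

```python
from typing import List, Optional

PAD = "*"  # the pad character shown in the aligned reads


def padded_to_unpadded(padded_read: str) -> List[Optional[int]]:
    """Map each column of a padded read to its index in the unpadded read.

    Pads get None because they have no corresponding base (and therefore
    no quality value) in the unpadded sequence.
    """
    mapping: List[Optional[int]] = []
    next_index = 0
    for ch in padded_read:
        if ch == PAD:
            mapping.append(None)        # nothing to attach an edit to
        else:
            mapping.append(next_index)  # a real base with its own quality
            next_index += 1
    return mapping


if __name__ == "__main__":
    # Hypothetical aligned read with one pad in the third column.
    print(padded_to_unpadded("AC*GT"))
```

In this sketch, the third column of `AC*GT` maps to `None`, so a quality change made there has no base to be attached to when the read is saved.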

dpaetkau
Posts: 31
Joined: Fri Jun 05, 2009 6:18 pm

Re: fixed high quality discrepancies keep reappearing

Post by dpaetkau » Wed Feb 22, 2012 11:05 pm

First, let me say that students are adding a tag to the problem areas and taking care of them that way. Thank you for that suggestion.

However, because I would like to understand what is going on, I will elaborate. The problem areas are at the beginning of the sequence. Often the student will change all bases to the left of the cursor to low quality (if the read is a forward read), or all bases to the right if it is a bottom strand read. This usually extends the low quality region a little farther into the read, and it gets rid of the high quality discrepancy. The sequence is poor, with multiple peaks at each base, so we are not just randomly changing it to low quality. Every time they open Consed, the high quality discrepancy is still there. If they go to the site, they can see the changes that they made, but the problem still shows up in their list. Any ideas?

wleung
Posts: 185
Joined: Sun Feb 04, 2007 7:41 pm
Location: Washington University in St. Louis

Re: fixed high quality discrepancies keep reappearing

Post by wleung » Thu Feb 23, 2012 2:51 pm

I am not sure I understand the problem. If possible, please provide screenshots of the problem areas.

Basically, any base that you have changed to low quality should no longer be considered a high quality discrepancy. The only exception is pads: changing a pad to low quality has no effect, so Consed will still consider discrepant pads to be high quality discrepancies.

If the discrepant regions are located at the beginning of a read, then the discrepant region most likely would correspond to the vector sequence.

For the D. ananassae subclones, the insert-vector junction is an EcoRI cut site (GAATTC). For reads in the forward strand, the cut site is preceded by the sequence GTGTGGTG. For reads in the reverse strand, the cut site is followed by the sequence CACCACAC.
[Image: dana_vector_sequences.png, "Unclipped D. ananassae vector"]

In that case, you can change all the discrepant bases to X's.

[Image: dana_vector_clipped.png, "Clipped D. ananassae vector"]
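As a rough illustration of the junction pattern described above, the following Python sketch locates the EcoRI site next to its flanking sequence in a plain read string and replaces the bases on the vector side with X's. This is only a sketch under stated assumptions: the function `mask_vector` is hypothetical, the example reads are made up, it is not how Consed performs the edit, and the thread does not say whether the GAATTC site itself should be masked (here it is left unmasked).

```python
FORWARD_FLANK = "GTGTGGTG"  # precedes the EcoRI site in forward-strand reads
REVERSE_FLANK = "CACCACAC"  # follows the EcoRI site in reverse-strand reads
ECORI_SITE = "GAATTC"


def mask_vector(read: str, forward: bool = True) -> str:
    """Replace the presumed vector bases of a read with X's.

    Assumption: everything up to and including the flank is vector; the
    EcoRI site itself is kept. If the junction is not found, the read is
    returned unchanged.
    """
    seq = read.upper()
    if forward:
        pos = seq.find(FORWARD_FLANK + ECORI_SITE)
        if pos == -1:
            return read
        end = pos + len(FORWARD_FLANK)  # first base of the EcoRI site
        return "X" * end + read[end:]
    pos = seq.rfind(ECORI_SITE + REVERSE_FLANK)
    if pos == -1:
        return read
    start = pos + len(ECORI_SITE)       # first base of the flank
    return read[:start] + "X" * (len(read) - start)


if __name__ == "__main__":
    # Made-up reads containing the junction sequences quoted in the thread.
    print(mask_vector("TTGTGTGGTGGAATTCACGT", forward=True))
    print(mask_vector("ACGTGAATTCCACCACACTT", forward=False))
```

Searching for the flank and the cut site together, rather than GAATTC alone, reduces the chance of matching an EcoRI site that happens to occur inside the insert.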
