Premature Stop codons

Ask questions about annotation of D. erecta, D. mojavensis, and D. grimshawi projects here.
dpaetkau
Posts: 31
Joined: Fri Jun 05, 2009 6:18 pm

Premature Stop codons

Post by dpaetkau » Thu Dec 11, 2014 7:45 pm

Hello Wilson,

We are having multiple troubles with premature stop codons showing up in many of our gene model checkers. Especially the last exon of many genes. I thought I knew how to do this and when I check the phases between the two exons, they are right.

So here is another example of even weirder trouble. This is an internal one. It says there are premature stop codons and when my student clicks on the magnifying glass to see what the problem is, she is taken to the stop codon at the beginning of the whole contig (which is not part of her gene).

Details are:
Contig: dbiarmipes_3Lcontrol_Jan2014_contig65
GENE: Gyc76C
Exons:
38824-38991, 39319-39516, 39640-39705, 40322-40534, 40603-40757, 40813-41035, 41092-41263, 41330-41494, 41613-41903, 42645-42810, 42876-43182, 43255-43483, 43563-43727, 43811-45505

Pictures:
[Attachment: Screen Shot 2014-12-11 at 2.46.06 PM.png]
[Attachment: Screen Shot 2014-12-11 at 2.45.51 PM.png]

wleung
Posts: 185
Joined: Sun Feb 04, 2007 7:41 pm
Location: Washington University in St. Louis

Re: Premature Stop codons

Post by wleung » Fri Dec 12, 2014 3:33 am

> We are having multiple troubles with premature stop codons... Especially the last exon of many genes.

As a reminder, the "Coding Exon Coordinates" field should not include the stop codon. If the issue persists, please verify that the splice sites of adjacent exons have compatible phase (see description below).


> So here is another example of even weirder trouble. This is an internal one.

The problem arises because the total length of the extracted coding region (4213 bp) is not a multiple of 3. This is typically caused by incompatible splice donor and acceptor sites, which introduce a frame shift and lead to in-frame stop codons.
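The length check can be reproduced directly from the coordinates posted above. This is only a minimal sketch in Python, assuming the coordinates are 1-based and inclusive; the names `EXONS` and `cds_length` are made up for illustration and are not part of the Gene Model Checker.

```python
# Coding exon coordinates for Gyc76C as submitted
# (assumed to be 1-based, inclusive contig coordinates).
EXONS = [
    (38824, 38991), (39319, 39516), (39640, 39705), (40322, 40534),
    (40603, 40757), (40813, 41035), (41092, 41263), (41330, 41494),
    (41613, 41903), (42645, 42810), (42876, 43182), (43255, 43483),
    (43563, 43727), (43811, 45505),
]


def cds_length(exons):
    """Return the total number of nucleotides covered by the exons."""
    return sum(end - start + 1 for start, end in exons)


total = cds_length(EXONS)
remainder = total % 3
print(total, remainder)  # 4213 1
```

A non-zero remainder means that at least one splice junction (or the start or end of the model) shifts the reading frame.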

From the Gene Model Checker output, we see that the first in-frame stop codon appears in the 9th CDS (CDS_9). In most cases, this means that the splice donor site of the 8th CDS is incompatible with the splice acceptor site of the 9th CDS. Examination of the splice donor site for the 8th CDS (CDS 8_13790_2) shows that the gene model extends 4 nucleotides (41491-41494) beyond the exon boundary supported by most of the gene predictions and the RNA-Seq coverage data.
[Attachment: Splice_Junction_CDS_8_13790_2.png]
Based on the results of the BLASTX alignment against contig65, CDS 8_13790_2 is in frame +1 and CDS 9_13790_0 is in frame +3. (Both alignments matched the entire length of their respective CDS.)
[Attachment: blastx_Gyc76C_CDS_8_9.png]
The splice acceptor site for CDS 9_13790_0 is in phase 0 relative to frame +3. This means that CDS 8_13790_2 must have a phase 0 splice donor site. The GT at 41491-41492 is in phase 0 relative to frame +1, so it is compatible with the phase 0 splice acceptor site. In contrast, the GT at 41495-41496 is a phase 1 donor site relative to frame +1, which is incompatible with the phase 0 splice acceptor site. The extra nucleotide contributed by this donor site would introduce a frame shift in CDS 9_13790_0, leading to the in-frame stop codons detected by the Gene Model Checker.
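The phase arithmetic for these two candidate donor sites can be checked with a short Python sketch. It assumes 1-based contig coordinates, forward-strand frames +1 to +3 (a codon in frame f starts wherever position minus f is divisible by 3), and the convention that a donor and an acceptor are compatible when their phases sum to 0 or 3. The function names are hypothetical and are not taken from any GEP tool.

```python
def donor_phase(gt_start, frame):
    """Bases (0-2) left after the last complete codon of an exon that
    ends immediately before the GT at gt_start, read in the given frame."""
    exon_end = gt_start - 1
    return (exon_end - frame + 1) % 3


def acceptor_phase(exon_start, frame):
    """Bases (0-2) at the start of an exon that precede its first
    complete codon, read in the given frame."""
    return (frame - exon_start) % 3


def compatible(donor, acceptor):
    """Assumed rule: the two partial codons must add up to 0 or 3 bases."""
    return (donor + acceptor) % 3 == 0


acc = acceptor_phase(41613, frame=3)       # CDS 9 starts at 41613 -> phase 0
upstream = donor_phase(41491, frame=1)     # GT at 41491-41492 -> phase 0
downstream = donor_phase(41495, frame=1)   # GT at 41495-41496 -> phase 1

print(compatible(upstream, acc))    # True
print(compatible(downstream, acc))  # False
```

Under these assumptions, only the GT at 41491-41492 keeps CDS 9_13790_0 in frame.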

Consequently, while the splice donor site at 41495-41496 is supported by the SGP gene predictor, the TopHat splice junctions, and the splice site prediction, this splice site candidate is not part of CDS 8_13790_2 in the D. biarmipes ortholog. CDS 8_13790_2 should instead end at 41490, immediately before the GT at 41491-41492.

One possible explanation for the splice junction is that there could be an alternate isoform of Gyc76C in D. biarmipes where CDS 9_13790_0 becomes a terminal CDS (translated in frame +2). However, because the available RNA-Seq evidence is weak (the TopHat splice junctions are supported by only 2-13 spliced RNA-Seq reads), there is insufficient evidence to postulate a novel isoform in D. biarmipes.
