D biarmipes contig 31 exon 14

Ask questions about annotation of D. erecta, D. mojavensis, and D. grimshawi projects here.
cmackinnon
Posts: 14
Joined: Fri Jun 18, 2010 2:14 pm


Post by cmackinnon » Wed Apr 03, 2013 2:47 am

The Gene Record Finder shows that there are 3 isoforms of the zfh2 gene in contig 31. Isoform PC has an exon 14, but the blastx report only shows a *, which I assume is because exon 14 is very small and blastx only detects the stop codon? I have looked for overlaps at the end of exon 13 and the start of exon 15 to see if I could find a region containing sequence I could use in the Small Exons Finder, but I have not been successful. Could you guide me on how to figure out the CDS for exon 14?

wleung
Posts: 185
Joined: Sun Feb 04, 2007 7:41 pm
Location: Washington University in St. Louis

Re: D biarmipes contig 31 exon 14

Post by wleung » Wed Apr 03, 2013 5:45 am

The basic strategy is to use the placement of the adjacent exons to narrow down the region where the small exon could be located, and then to examine other sources of evidence (e.g., RNA-seq, sequence similarity to untranslated regions) to identify the small exon. In this case, CDS 14_1629_0 lies between CDS 13_1629_0 and CDS 15_1629_0 in D. melanogaster. Note that these CDSs do not overlap in the D. melanogaster model.

blastx alignments of the individual CDSs against contig31 indicate that the alignment to CDS 13_1629_0 ends at 22,482 and the alignment to CDS 15_1629_0 begins at 24,697.
blastx_zfh_cds_alignments.png
Examination of the region at the end of the blastx alignment to CDS 13_1629_0 shows a phase 0 donor site at 22,483-22,484. These results impose two constraints on the placement of CDS 14_1629_0: the CDS must lie in the region between the two adjacent CDSs (i.e., 22,483-24,696), and it must have a phase 0 acceptor site so that it is compatible with the phase 0 donor site of CDS 13_1629_0.
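To make the two constraints concrete, here is a minimal Python sketch. The function names are made up for illustration and are not part of any GEP tool. It assumes the usual convention that the donor phase plus the acceptor phase must add up to 0 or 3 so that the reading frame is preserved across the intron.

```python
def search_window(prev_cds_end, next_cds_start):
    """Return the inclusive, 1-based interval between two flanking CDS alignments."""
    if next_cds_start - prev_cds_end < 2:
        raise ValueError("no room between the flanking CDSs")
    return prev_cds_end + 1, next_cds_start - 1


def required_acceptor_phase(donor_phase):
    """Acceptor phase that keeps the reading frame intact after a donor of the given phase.

    Assumes donor phase + acceptor phase must equal 0 or 3.
    """
    if donor_phase not in (0, 1, 2):
        raise ValueError("phase must be 0, 1 or 2")
    return (3 - donor_phase) % 3


# Coordinates taken from the blastx alignments described above.
start, end = search_window(22482, 24697)
print(start, end)                   # 22483 24696
print(required_acceptor_phase(0))   # 0
```

The window still spans a little over 2 kb, which is why additional evidence such as RNA-seq is needed to pin down the exon.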

Because the RNA-seq reads are derived from processed mRNAs, the reads map to both translated and untranslated regions. Based on the D. melanogaster model, the exon zfh2:14 is actually larger (110 bp) than its coding portion, and the 3' untranslated region is immediately downstream of the stop codon in 14_1629_0. Looking at the RNA-seq data and TopHat junction predictions in the GEP UCSC Genome Browser, we found a region at around 24,500 that has high RNA-seq read coverage and multiple TopHat splice junctions.
locate_zfh2_cds_14_1629.png
Examination of this region shows that the three bases immediately after the phase 0 splice acceptor site form a stop codon (i.e., 24,481-24,483).
splice_acceptor_CDS_14.png
A blastn search of the exon zfh2:14 against contig31 also shows sequence similarity beginning at 24,481.
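If you want to list candidate sites like this yourself, the pattern to look for is an AG acceptor immediately followed by a stop codon. Below is a rough Python sketch of that scan. The function name and the toy sequence are made up for illustration; the sequence is not the actual contig31 sequence. Any hits would still need to be checked against the RNA-seq and TopHat evidence.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def stop_only_acceptors(seq, offset=1):
    """Return coordinates of stop codons that directly follow an 'AG' acceptor.

    seq    -- plus-strand sequence of the search window
    offset -- coordinate of the first base of seq (1-based by default)
    Each hit is the coordinate of the first base of the stop codon.
    """
    seq = seq.upper()
    hits = []
    for i in range(len(seq) - 4):
        # 'AG' at i..i+1 is the acceptor; the next three bases would be the whole CDS.
        if seq[i:i + 2] == "AG" and seq[i + 2:i + 5] in STOP_CODONS:
            hits.append(offset + i + 2)
    return hits


# Toy example: the acceptor 'AG' is at positions 8-9, the stop codon 'TAA' at 10-12.
print(stop_only_acceptors("CCTTTTCAGTAAGGT"))  # [10]
```

Pass the window's start coordinate as offset to get contig coordinates instead of positions within the window.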
