Wed Apr 09, 2014 6:51 pm

I have one group annotating a gene that, in D. melanogaster, has a first exon of two amino acids. When the students try to find the exon in D. biarmipes, there are a dozen or so matches within 1 or 2 kb using the small exon finder. None has the exact sequence, but all start with M. Is there any way to find this possible exon? I will note that this is for one isoform; the other isoforms use an exon 2 kb away from the 2nd exon. All of the other exons have been found.

Re: Very short first exon

Thu Apr 10, 2014 12:23 am

In general, I would recommend using the RNA-Seq data (particularly the TopHat junctions and the Alignment Summary tracks) to help you identify small coding exons. The Annotation Strategy Guide contains an example (starting on page 17) that illustrates how you can use the RNA-Seq data to identify a small coding exon of unc-13.

In most cases, the small coding exon is part of a larger transcribed exon. Consequently, you can also try to map the entire transcribed exon against the contig sequence with blastn to reduce the list of possible candidates. The sequence for each transcribed exon is available through the "Transcript Details" tab on the Gene Record Finder.
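
If you prefer to run the blastn search from a script rather than a web form, something like the following Python sketch might help. It assumes NCBI BLAST+ is installed locally and that you have saved the transcribed exon and the contig as FASTA files. The file names, the E-value cutoff, the example coordinates, and the helper names (build_blastn_command, parse_tabular_hits, hits_near) are all hypothetical choices for illustration, not part of the Gene Record Finder or of any official workflow.

```python
import subprocess
from typing import List, NamedTuple


class Hit(NamedTuple):
    subject_start: int        # start of the match on the contig
    subject_end: int          # end of the match on the contig
    percent_identity: float
    evalue: float


def build_blastn_command(exon_fasta: str, contig_fasta: str) -> List[str]:
    """Build a blastn call comparing one transcribed exon to one contig.

    "-task blastn" selects the more sensitive algorithm (the default is
    megablast). The E-value cutoff of 1e-2 is an arbitrary assumption.
    """
    return [
        "blastn", "-task", "blastn",
        "-query", exon_fasta,
        "-subject", contig_fasta,
        "-evalue", "1e-2",
        "-outfmt", "6",  # tab-separated output, 12 standard columns
    ]


def run_blastn(exon_fasta: str, contig_fasta: str) -> str:
    """Run blastn (requires BLAST+ on the PATH) and return its tabular output."""
    cmd = build_blastn_command(exon_fasta, contig_fasta)
    return subprocess.run(cmd, capture_output=True, text=True, check=True).stdout


def parse_tabular_hits(text: str) -> List[Hit]:
    """Parse "-outfmt 6" lines: qseqid sseqid pident length mismatch gapopen
    qstart qend sstart send evalue bitscore."""
    hits = []
    for line in text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.split("\t")
        hits.append(Hit(int(fields[8]), int(fields[9]),
                        float(fields[2]), float(fields[10])))
    return hits


def hits_near(hits: List[Hit], anchor: int, max_distance: int) -> List[Hit]:
    """Keep hits with either end within max_distance bases of anchor,
    e.g. the start coordinate of the second exon on the contig."""
    return [
        h for h in hits
        if min(abs(h.subject_start - anchor),
               abs(h.subject_end - anchor)) <= max_distance
    ]
```

For example, hits_near(parse_tabular_hits(run_blastn("exon1.fasta", "contig.fasta")), 6000, 2000) would keep only the matches within 2 kb of a second exon that starts at position 6000 on the contig (made-up file names and coordinates). Any surviving match should still be checked against the RNA-Seq tracks before you commit to it.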

