D. biarmipes question

Ask questions about annotation of D. erecta, D. mojavensis, and D. grimshawi projects here.
cmackinnon
Posts: 14
Joined: Fri Jun 18, 2010 2:14 pm

D. biarmipes question

Post by cmackinnon » Wed Sep 19, 2012 3:48 pm

D. biarmipes contig 31 is rated a level 4. Briefly, what should I anticipate in terms of annotation challenges for a level 4 project?

wleung
Posts: 185
Joined: Sun Feb 04, 2007 7:41 pm
Location: Washington University in St. Louis

Re: D. biarmipes question

Post by wleung » Wed Sep 19, 2012 8:02 pm

Because D. biarmipes is evolutionarily more distant from D. melanogaster than D. erecta is, we anticipate that the D. biarmipes projects will be more difficult to annotate than the D. erecta projects. The difficulty levels of the D. biarmipes projects range from 4 to 6, with 4 being the easiest. D. biarmipes projects with a difficulty rank of 4 generally contain only 1 or 2 genes, each with a small number of isoforms. In addition, the blastx alignments suggest that most of the exons are highly conserved relative to the putative D. melanogaster ortholog.

As a reminder, the difficulty level is just a preliminary estimate. You can examine the project using the GEP UCSC Genome Browser Mirror and download the annotation package from the GEP Data Repository to obtain a more accurate assessment of the difficulty of the D. biarmipes projects.

cmackinnon
Posts: 14
Joined: Fri Jun 18, 2010 2:14 pm

Re: D. biarmipes question

Post by cmackinnon » Thu Sep 20, 2012 1:36 pm

Thank you! Cosmid31 has one gene [zfh2] with three isoforms. There are about 16 exons total [haven't carefully counted them yet].

Another question: which blastx report should I use? The blastx information obtained by clicking on the exon in the Genome Browser predicts a better ending nucleotide coordinate for the first exon than NCBI blastx does. Will I miss important info if I only use the blastx from the genome browser or should I still do an NCBI blastx as a double check of the coordinates?

wleung
Posts: 185
Joined: Sun Feb 04, 2007 7:41 pm
Location: Washington University in St. Louis

Re: D. biarmipes question

Post by wleung » Thu Sep 20, 2012 2:32 pm

> Will I miss important info if I only use the blastx from the genome browser or should I still do an NCBI blastx as a double check of the coordinates?

Yes, you should still double-check the coordinates. Because the blastx alignment track on the genome browser is created by searching the contig sequence against the set of full-length D. melanogaster proteins, the alignment blocks often do not correspond to the precise locations of the individual exons. BLAST is only aware of matches, mismatches, and indels; it does not take splice site signals (e.g., donor and acceptor site sequences) into account when it generates the alignment. The blastx track on the genome browser is best treated as a reasonable estimate of the most likely ortholog within a region of the contig.
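To illustrate the splice site point, here is a minimal Python sketch of the kind of check that BLAST does not perform: looking at the two bases on either side of a candidate exon for the canonical AG acceptor and GT donor signals. The function names and the toy sequence are hypothetical (they are not part of any GEP or NCBI tool), and the sketch assumes plus-strand, 1-based inclusive coordinates. Keep in mind that the first coding exon has no acceptor to check, the last one has no donor, and non-canonical sites do occur.

```python
# Hypothetical helpers for illustration only; not part of any GEP or NCBI tool.
# Coordinates are assumed to be 1-based, inclusive, on the plus strand.

def flanking_splice_signals(contig, exon_start, exon_end):
    """Return the two bases just upstream and just downstream of an exon."""
    # Two bases immediately before the exon (candidate acceptor site).
    acceptor = contig[max(0, exon_start - 3):exon_start - 1].upper()
    # Two bases immediately after the exon (candidate donor site).
    donor = contig[exon_end:exon_end + 2].upper()
    return acceptor, donor


def has_canonical_signals(contig, exon_start, exon_end):
    """Return (acceptor_is_AG, donor_is_GT) for the candidate exon."""
    acceptor, donor = flanking_splice_signals(contig, exon_start, exon_end)
    return acceptor == "AG", donor == "GT"


# Toy sequence (made up): intron in lower case, exon in upper case.
toy_contig = "ccccag" + "ATGGCC" + "gtaaaa"

# Exon placed at 7-12: flanked by "ag" and "gt".
print(has_canonical_signals(toy_contig, 7, 12))
# Same exon with the end shifted by one base, as an alignment block might be.
print(has_canonical_signals(toy_contig, 7, 11))
```

An alignment block whose end is off by even one base fails the donor check here, which is why the block boundaries from the browser track should be confirmed against the sequence.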

Consequently, I would recommend using BLAST 2 Sequences with the blastx program to double-check the coordinates and map the individual exons. Note that if the exons are small, you may need to increase the E-value threshold in order to detect them. In addition, you can use the RNA-seq data (coverage, TopHat, etc.) to help identify exons that have small coding regions but are immediately adjacent to an untranslated region.
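As a rough illustration of why small exons need a more permissive E-value threshold, the sketch below uses the standard relationship between bit score and E-value, E = m * n * 2^(-S'), where m and n are the query and subject lengths and S' is the bit score. The function names, lengths, scores, and threshold are all made up for illustration, and the sketch ignores the effective-length corrections that BLAST applies, so the numbers will not match an actual BLAST report.

```python
# Illustrative sketch only; the values below are hypothetical and BLAST's
# effective-length corrections are deliberately ignored.

def expected_hits(bit_score, query_len, subject_len):
    """Approximate E-value from a bit score: E = m * n * 2**(-S')."""
    return query_len * subject_len * 2.0 ** (-bit_score)


def is_reported(bit_score, query_len, subject_len, evalue_threshold):
    """True if the alignment's E-value is at or below the chosen threshold."""
    return expected_hits(bit_score, query_len, subject_len) <= evalue_threshold


# A short exon can only align a few residues, so its bit score is low and
# its E-value is high compared with a long, well-conserved exon.
short_exon_e = expected_hits(bit_score=12, query_len=1000, subject_len=1000)
long_exon_e = expected_hits(bit_score=60, query_len=1000, subject_len=1000)
print(short_exon_e, long_exon_e)

# With a strict threshold the short exon is dropped; raising the threshold
# lets it through (along with more spurious matches, which then need review).
print(is_reported(12, 1000, 1000, evalue_threshold=1e-5))
print(is_reported(12, 1000, 1000, evalue_threshold=1000))
```

The trade-off is that a higher threshold also admits more chance matches, so the position of each small-exon hit should still agree with the neighboring exons and the RNA-seq evidence.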

cmackinnon
Posts: 14
Joined: Fri Jun 18, 2010 2:14 pm

Re: D. biarmipes question

Post by cmackinnon » Mon Sep 24, 2012 5:58 pm

Thanks again. I needed an explanation for making them do both!
