D. erecta mRNA

Ask questions about annotation of D. erecta, D. mojavensis, and D. grimshawi projects here.
drevie
Posts: 67
Joined: Sun Feb 04, 2007 10:23 pm
Location: California Lutheran University, Thousand Oaks, CA

D. erecta mRNA

Post by drevie » Wed Apr 11, 2012 7:15 pm

While running blastn on our fosmids, we often found matches to D. erecta mRNA. There isn't an mRNA track on the genome browser (Wash U version). Are there plans to add such a track? Right now there are only D. yakuba tracks.

wleung
Posts: 185
Joined: Sun Feb 04, 2007 7:41 pm
Location: Washington University in St. Louis

Re: D. erecta mRNA

Post by wleung » Wed Apr 11, 2012 8:33 pm

Most of the D. erecta mRNA sequences are actually gene predictions that have not been experimentally confirmed. In particular, the RefSeq D. erecta mRNA records typically have the "XM" prefix, and the corresponding GenBank records show that these are predictions from GLEAN-R (see the features section of a sample GenBank record here). The GLEAN pipeline combines evidence from sequence alignments and multiple gene predictors to create the final gene models (see the manuscript by Elsik et al. for details on GLEAN and the paper "Evolution of genes and genomes on the Drosophila phylogeny" by the Drosophila 12 Genomes Consortium for details on the GLEAN-R gene models).

Because most of the D. erecta mRNA records have not been experimentally confirmed (as denoted by the XM/XP prefixes in the accession numbers), incorporating the D. erecta RNA data into the genome browser could propagate errors in the GLEAN-R models into our final gene annotations. Consequently, if we decide to include the predicted D. erecta mRNA sequences as an evidence track, we would need to label them as gene predictions and not as real mRNAs.

In contrast, the D. yakuba tracks are based on actual RNA-Seq data from D. yakuba. The D. yakuba transcripts were assembled using Cufflinks and Oases and then mapped against the D. erecta fosmids. Consequently, even though these predicted transcripts are not as good as full-length cDNAs, they are supported by experimental evidence. Note that the modENCODE project did not generate RNA-Seq data for D. erecta, which is why we have to do the cross-species mapping.
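
As an aside on the XM/XP prefixes mentioned above: if you want to separate predicted records from the rest of a blastn hit list, something like the following rough sketch might help. It relies only on the RefSeq accession prefix convention (XM_, XR_, XP_ for model predictions; NM_, NR_, NP_ for known records). The function names and the sample accessions are made up for illustration and are not real D. erecta records.

```python
# Rough sketch: sort blastn hit accessions by RefSeq prefix.
# Assumption: accessions follow the RefSeq "two letters + underscore" convention.
# The accessions used in the example at the bottom are hypothetical placeholders.

PREDICTED_PREFIXES = {
    "XM_": "predicted mRNA",
    "XR_": "predicted non-coding RNA",
    "XP_": "predicted protein",
}

KNOWN_PREFIXES = {
    "NM_": "mRNA",
    "NR_": "non-coding RNA",
    "NP_": "protein",
}


def classify_accession(accession: str) -> str:
    """Return 'predicted', 'known', or 'other' based on the accession prefix."""
    prefix = accession.strip()[:3].upper()
    if prefix in PREDICTED_PREFIXES:
        return "predicted"
    if prefix in KNOWN_PREFIXES:
        return "known"
    # Anything else (e.g. a non-RefSeq GenBank accession) is left unclassified.
    return "other"


def split_hits(accessions):
    """Split a list of accessions into (predicted, everything_else)."""
    predicted = [a for a in accessions if classify_accession(a) == "predicted"]
    everything_else = [a for a in accessions if classify_accession(a) != "predicted"]
    return predicted, everything_else


if __name__ == "__main__":
    # Hypothetical accessions, for illustration only.
    hits = ["XM_000000001.1", "NM_000000002.3", "XP_000000003.1", "AB000004.1"]
    predicted, everything_else = split_hits(hits)
    print("Gene predictions:", predicted)
    print("Other records:", everything_else)
```

A prefix check like this only tells you how RefSeq labels the record. It says nothing about whether a particular gene model is correct, so hits flagged as predictions would still need to be checked against the other evidence tracks.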

drevie
Posts: 67
Joined: Sun Feb 04, 2007 10:23 pm
Location: California Lutheran University, Thousand Oaks, CA

Re: D. erecta mRNA

Post by drevie » Thu Apr 12, 2012 6:07 pm

Thanks for the info. I didn't look closely at the mRNA matches they got, and was surprised they were there. Since they are only predictions, I don't see why they are even in the databases.
