BLASTp search for final submission item

Ask questions about annotation of D. erecta, D. mojavensis, and D. grimshawi projects here.
cjones
Posts: 99
Joined: Sun Feb 04, 2007 10:19 pm
Location: Moravian College, Bethlehem PA

Post by cjones » Tue Apr 23, 2013 1:33 am

In the section "Have you annotated all the genes?" students are instructed to perform a BLASTP search against the non-redundant protein database (presumably restricted to Dmel proteins), using each "region" of their fosmid that any of the gene prediction programs identified as a potential coding region. Must this be done on an "exon"-by-"exon" basis (calling each of those predicted coding regions an "exon"), or can the entire "protein" be submitted to BLASTP in a single search? I would think the single "protein" run would work just as well and save time and effort, but perhaps the search parameters would behave differently depending on the length of the query.

(And why a BLASTP search? If there were no stop codons in the "exon" then there would be 6 possible peptides and there's no way of knowing which one the gene predictor "meant" -- wouldn't a BLASTx search make more sense? Of course, if there were anything there the initial BLASTx search should have found it, so why do this at all? My head hurts now....)
Chris Jones
Assoc. Prof. of Biology
Moravian College
Bethlehem PA

wleung
Posts: 185
Joined: Sun Feb 04, 2007 7:41 pm
Location: Washington University in St. Louis

Re: BLASTp search for final submission item

Post by wleung » Tue Apr 23, 2013 2:02 am

The primary purpose of the blastp search of the Genscan predictions against the nr database is to identify genes that are found in other species but not in D. melanogaster. Consequently, the blastp search should be against all of the non-redundant protein database (i.e. not limited to D. melanogaster). You can use the entire Genscan predicted protein as the query when you search the nr database.
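
For anyone who prefers the command line to the NCBI web form, the search described above can be sketched roughly as follows. This is only an illustration, assuming the NCBI BLAST+ `blastp` program is installed. The file names and the E-value cutoff are hypothetical placeholders, not values from the annotation report. The point is that the whole predicted protein is the query and the database is all of nr, with no species restriction.

```python
import shlex


def build_blastp_command(query_fasta, out_path, evalue=1e-5):
    """Build the argument list for a remote blastp search against all of nr.

    query_fasta: FASTA file holding one entire Genscan-predicted protein
                 (hypothetical file name, chosen for illustration).
    evalue:      assumed cutoff; the thread does not specify one.
    """
    return [
        "blastp",
        "-query", query_fasta,   # whole predicted protein, not exon by exon
        "-db", "nr",             # full non-redundant database, all species
        "-remote",               # run the search on NCBI's servers
        "-evalue", str(evalue),
        "-outfmt", "6",          # tab-separated hit table
        "-out", out_path,
    ]


if __name__ == "__main__":
    cmd = build_blastp_command("genscan_prediction_1.fasta",
                               "genscan_prediction_1.nr.tsv")
    # Print the command so it can be inspected or pasted into a shell.
    print(shlex.join(cmd))
```

Note that no organism filter is added to the command, since limiting the search to D. melanogaster would defeat its purpose.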

If there are significant matches and these matches have been experimentally confirmed (e.g. RefSeq proteins with the NP_ prefix), then we will use these matches as the basis to construct a gene model.
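
As a rough illustration of that screening step, the sketch below pulls the NP_ hits out of a BLAST+ tabular result (`-outfmt 6`, whose default columns put the subject ID second and the E-value eleventh). The function name and the E-value cutoff are assumptions made for this example, not part of the annotation protocol, and any hit it reports still needs to be inspected by hand.

```python
def np_hits(tabular_text, max_evalue=1e-5):
    """Return (query, NP_ accession, evalue) for each qualifying hit.

    tabular_text: contents of a BLAST+ "-outfmt 6" result file.
    max_evalue:   assumed significance cutoff (hypothetical value).
    """
    hits = []
    for line in tabular_text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        if len(fields) < 12:
            continue  # not a standard 12-column hit line
        query, subject, evalue = fields[0], fields[1], float(fields[10])
        # Subject IDs may be bare ("NP_...") or pipe-delimited ("ref|NP_...|").
        accessions = [part for part in subject.split("|")
                      if part.startswith("NP_")]
        if accessions and evalue <= max_evalue:
            hits.append((query, accessions[0], evalue))
    return hits
```

Hits with other prefixes (for example XP_, which marks computationally predicted RefSeq proteins) are dropped by this filter.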

We decided to add this search to the annotation report because we have previously identified a few genes in other Drosophila species that are conserved among many species (from honey bee to human) but are not found in D. melanogaster.
