checklist issue: BLAST check
-
- Posts: 99
- Joined: Sun Feb 04, 2007 10:19 pm
- Location: Moravian College, Bethlehem PA
The finishing checklist includes an item instructing us to "Run BLAST (check for contamination from vector, host)" -- are we actually supposed to use the entire 40 kb fosmid consensus sequence? It takes forever and the proverbial day, and not surprisingly there are lots of hits to various Drosophila species. Are we missing something here in running this search?
Chris Jones
Assoc. Prof. of Biology
Moravian College
Bethlehem PA
-
- Posts: 211
- Joined: Sun Feb 04, 2007 10:29 pm
- Location: Washington University in St Louis
BLAST
I am not sure why your BLAST searches are taking so long. They should only take about 3 or 4 minutes unless NCBI is really backed up.
In screening for contamination you are looking for exact hits to vector sequence, not merely highly similar hits, so these are pretty easy to screen for. As for host contamination, there are times when an E. coli transposon will jump into a fosmid during propagation. You really do need to make sure you are screening for these things, but they are rare, and the length and level of similarity need to be quite high, so again it's pretty easy to screen the hit list by eye to see if you have any contamination.
If NCBI is really going slow you can install a local copy of BLAST on your Macs. The searches should take anywhere from 3 to 10 minutes depending on the database and how old your Macs are. I do not have an easy-to-install package, but it is an option if anyone is interested.
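For anyone who does try a local install, here is a rough sketch of how a local search could be scripted. It assumes the NCBI BLAST+ command-line tools (the `blastn` program) are installed and on your PATH, which may not match the BLAST version discussed in this thread. The file names and the database name `UniVec` are placeholders, and the database would have to be set up locally first.

```python
import shutil
import subprocess


def build_blastn_command(query_fasta, database, output_path):
    """Return the argument list for a tabular blastn search (BLAST+ assumed)."""
    return [
        "blastn",
        "-query", query_fasta,   # consensus sequence exported from Consed
        "-db", database,         # name of a locally formatted BLAST database
        "-outfmt", "6",          # tab-separated hit table, one hit per line
        "-out", output_path,
    ]


def run_local_blast(query_fasta, database, output_path):
    """Run blastn locally; raises if the program is missing or the search fails."""
    if shutil.which("blastn") is None:
        raise RuntimeError("blastn was not found on PATH; is BLAST+ installed?")
    subprocess.run(
        build_blastn_command(query_fasta, database, output_path),
        check=True,
    )
```

Calling `run_local_blast("fosmid.fasta", "UniVec", "hits.tsv")` would write the hit table to `hits.tsv`; all three names are made up for the example.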
-
- Posts: 211
- Joined: Sun Feb 04, 2007 10:29 pm
- Location: Washington University in St Louis
getting consensus sequences out of Consed
One way is to go to the Aligned Reads window and select "Export Consensus sequence" from the File menu. This will give you a save dialog box where you can save the consensus sequence of the contig you are viewing.
You can use the saved file to search the non-redundant database. Remember you are looking for possible contamination. Real contamination will be long stretches of EXACT matches.
The likelihood of some kind of contamination is extremely low given the history of these sequences. The length of the match would have to be quite long (say 500-1000 bp) before I started to worry about contamination.
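If you would rather not screen the hit list by eye, the rule of thumb above can be sketched in a few lines of Python. This assumes the hits were saved in BLAST's tab-separated tabular format (BLAST+ `-outfmt 6`, where the second, third, and fourth columns are the subject ID, percent identity, and alignment length). The function name and the default cutoffs are only illustrative: 500 bp comes from the low end of the range mentioned above, and 100% identity stands in for "exact match".

```python
def find_suspect_hits(lines, min_length=500, min_identity=100.0):
    """Return (subject, identity, length) for long, exact-match hits.

    `lines` is any iterable of rows from a tabular BLAST report.
    """
    suspects = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment headers
        fields = line.split("\t")
        subject = fields[1]
        identity = float(fields[2])  # percent identity of the alignment
        length = int(fields[3])      # alignment length in bp
        if length >= min_length and identity >= min_identity:
            suspects.append((subject, identity, length))
    return suspects
```

With a saved report you could call it as `find_suspect_hits(open("hits.tsv"))`, where `hits.tsv` is a made-up file name. Long but imperfect matches to other Drosophila sequences fall below the identity cutoff and are ignored, and so are short exact matches.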