contigs that don't show up on assemblyview

Ask questions about sequence improvement / finishing D. mojavensis projects here.
ksaville
Posts: 45
Joined: Mon Aug 06, 2007 8:30 pm

contigs that don't show up on assemblyview

Post by ksaville » Thu Mar 06, 2008 8:29 pm

We have a couple of projects that have 1 or 2 contigs that don't show up in assembly view. Is there a reason for this?


Also, one of these has only one read, and in this read there are lots of Ns. But from the traces they look like Ts to us. Can we change these to Ts?

cshaffer
Posts: 211
Joined: Sun Feb 04, 2007 10:29 pm
Location: Washington University in St Louis

small contigs

Post by cshaffer » Fri Mar 07, 2008 6:02 am

When you have lots and lots of reads, you tend to accumulate a few junk reads with little or no quality data. Most of these reads end up in their own contig, or get matched with a few other low quality reads to make a low quality contig.

Consed has some settings that tell assembly view not to display these low quality contigs.
You can change the various cut-off values if you want. In assembly view, click the "what to show" menu and select In/exclude contigs.

You can see the default settings are to not show contigs with fewer than 10 reads or contigs shorter than 1kb. I think the genome center uses values of 10 reads and 500 bases.
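
To make the cut-off logic concrete, here is a rough Python sketch of that kind of filter. This is not Consed's code; the `Contig` class, the `is_shown` function, and the example contigs are all made up for illustration, and it assumes a contig is hidden if it fails either cut-off.

```python
from dataclasses import dataclass


@dataclass
class Contig:
    # Hypothetical record; Consed stores contigs differently.
    name: str
    num_reads: int
    length_bp: int


def is_shown(contig, min_reads=10, min_length_bp=1000):
    """Return True if the contig passes both cut-offs.

    Defaults mirror the values described above (10 reads, 1 kb).
    Assumption: failing either cut-off hides the contig.
    """
    return contig.num_reads >= min_reads and contig.length_bp >= min_length_bp


# Made-up example contigs.
contigs = [
    Contig("Contig1", num_reads=1, length_bp=650),
    Contig("Contig2", num_reads=4, length_bp=1800),
    Contig("Contig7", num_reads=250, length_bp=42000),
]

shown_default = [c.name for c in contigs if is_shown(c)]
shown_relaxed = [c.name for c in contigs if is_shown(c, min_reads=1, min_length_bp=500)]

print("default cut-offs:", shown_default)
print("relaxed cut-offs:", shown_relaxed)
```

With the default values only the large contig passes; lowering both cut-offs brings the small contigs back into view, which is what changing the values in the In/exclude contigs dialog does for you.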

Most of these contigs end up being low quality data that just gets put together because a few bases match. Since they are mostly low quality data, they can be ignored.

However, sometimes the data is pretty good but the basecaller made a lot of mistakes, usually calling all of one base "N" instead of the correct base. In these cases feel free to edit the "N"s to the proper basecall and then use search for string to see if you can put the read where it belongs. Sometimes this can help with read depth or with covering a region of low quality.
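
The idea of fixing the Ns and then searching for the string can be sketched in Python as below. This only illustrates the logic and is not how Consed does it; the function names and the toy sequences are made up, and it assumes you have already checked the traces and confirmed that every N really is the same base.

```python
def reverse_complement(seq):
    """Reverse-complement a DNA string (N stays N)."""
    return seq.translate(str.maketrans("ACGTN", "TGCAN"))[::-1]


def find_all(haystack, needle):
    """Return every 0-based start position of needle in haystack."""
    positions = []
    start = haystack.find(needle)
    while start != -1:
        positions.append(start)
        start = haystack.find(needle, start + 1)
    return positions


def edit_ns(read, base):
    """Replace every N with one base.

    Only valid when the traces show that all the Ns are that base.
    """
    return read.replace("N", base)


# Toy data, invented for this example.
consensus = "GGCATTTTACGGATCCA"
read = "CANNNNACGG"

edited = edit_ns(read, "T")
forward_hits = find_all(consensus, edited)
reverse_hits = find_all(consensus, reverse_complement(edited))

print("edited read: ", edited)
print("forward hits:", forward_hits)
print("reverse hits:", reverse_hits)
```

Before the edit the read matches nowhere because of the Ns; after the edit the exact-match search finds one placement on the forward strand. Real reads will rarely match exactly over their whole length, so in practice you would search with a shorter high quality stretch of the read.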
