D grimshawi project

From GEP Wiki

Use this page to record any useful tidbits or helpful advice specific to the D. grimshawi finishing and annotation project.

Early work on the Agencourt assemblies of the D. grimshawi genome identified two scaffolds as being likely derived from the dot chromosome.  Both of these scaffolds contain a high density of putative orthologs of D. melanogaster dot chromosome genes.

These scaffolds were analyzed and the list of all fosmids mapping to them was extracted. This list was cross-referenced against the list of clones available for order from the Drosophila Genome Resource Center. The available fosmids that mapped to the two scaffolds were analyzed and a set of overlapping fosmids was ordered. However, not all fosmids were viable and only 40 fosmids were returned. This set of fosmids covers about 85% of the two scaffolds; the remaining 15% or so was not covered by any available clone.


Viewing the Two Scaffolds in Genome Browsers

You can view the two D. grimshawi scaffolds in their entirety on either the UCSC browser or the FlyBase browser.

However, you should be aware that the two browsers use slightly different assemblies.

The UCSC browser displays the original Agencourt Arachne assembly, while the FlyBase browser displays the CAF-1 assembly, in which minor updates have been made to reconcile the results of various assembly programs. The consequence of the CAF-1 update is that the same regions of the D. grimshawi dot chromosome do not have the same scaffold numbers in the two browsers, and the coordinates no longer match precisely. You can use the links below to view these regions (the UCSC version has the locations of the fosmids mapped onto the scaffolds; the FlyBase version does not):


Contig          UCSC              FlyBase
Small contig    scaffold_24861    scaffold_14592
Large contig    scaffold_25011    scaffold_14822

Warning! The locations of the fosmids on the UCSC browser are only approximate. To find the exact location of any fosmid on this browser, run a BLAT search with the fosmid's end sequences.
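
When moving between the two browsers it can help to keep the scaffold name correspondence in one place. The following is a minimal sketch that only encodes the two rows of the table above; the names `UCSC_TO_FLYBASE` and `other_browser_name` are hypothetical helpers, not part of any GEP, UCSC, or FlyBase tool. It translates scaffold names only, because the coordinates differ between the two assemblies and cannot be converted this way.

```python
# Scaffold names copied from the table above.
# Keys: names in the UCSC browser (Agencourt Arachne assembly).
# Values: names in the FlyBase browser (CAF-1 assembly).
UCSC_TO_FLYBASE = {
    "scaffold_24861": "scaffold_14592",  # small contig
    "scaffold_25011": "scaffold_14822",  # large contig
}

# Reverse lookup, built from the same two rows.
FLYBASE_TO_UCSC = {fb: ucsc for ucsc, fb in UCSC_TO_FLYBASE.items()}


def other_browser_name(scaffold):
    """Return the matching scaffold name in the other browser.

    Only the scaffold NAME is translated. Coordinates are not
    interchangeable between the two assemblies.
    """
    if scaffold in UCSC_TO_FLYBASE:
        return UCSC_TO_FLYBASE[scaffold]
    if scaffold in FLYBASE_TO_UCSC:
        return FLYBASE_TO_UCSC[scaffold]
    raise KeyError(f"{scaffold} is not one of the two dot chromosome scaffolds")
```

For example, `other_browser_name("scaffold_25011")` gives the FlyBase name of the large contig.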


The Fosmids

The 40 fosmids have each been duplicated under a slightly different name to facilitate data tracking. As a result, each project name is sent out only once, but each fosmid has two names. The table below cross-references the name of each fosmid in the CAF-1 assembly with the two names it has within the GEP system.


Original name    GEP name 1    GEP name 2
39967600K09 DGA01K09 DGB09K01
41358300D10 DGA02D10 DGB10D02
41361600P06 DGA03P06 DGB06P03
41358000D11 DGA04D11 DGB11D04
41378100E01 DGA05E01 DGB01E05
39956400H06 DGA06H06 DGB06H06
41359300O19 DGA07O19 DGB19O07
41362400A10 DGA08A10 DGB10A08
39956300F16 DGA10F16 DGB16F10
41358300F05 DGA11F05 DGB05F11
41354400M09 DGA12M09 DGB09M12
41354400G20 DGA13G20 DGB20G13
41363600B11 DGA14B11 DGB11B14
41358400M06 DGA15M06 DGB06M15
39967000F05 DGA16F05 DGB05F16
41354300O01 DGA17O01 DGB01O17
41353800N23 DGA18N23 DGB23N18
39955800A15 DGA19A15 DGB15A19
41358400P16 DGA20P16 DGB16P20
41354400E24 DGA21E24 DGB24E21
41377600C17 DGA22C17 DGB17C22
41363700F17 DGA23F17 DGB17F23
39994800E20 DGA25E20 DGB20E25
41359800H21 DGA26H21 DGB21H26
41359700P17 DGA27P17 DGB17P27
41361200N19 DGA28N19 DGB19N28
41361600M16 DGA29M16 DGB16M29
41378400P08 DGA30P08 DGB08P30
41364300I18 DGA31I18 DGB18I31
41354500C21 DGA32C21 DGB21C32
41361600I22 DGA33I22 DGB22I33
41361200C04 DGA34C04 DGB04C34
41364400P10 DGA35P10 DGB10P35
41353000M13 DGA36M13 DGB13M36
41378000K14 DGA37K14 DGB14K37
41354100L22 DGA38L22 DGB22L38
39967700J11 DGA41J11 DGB11J41
41361700O21 DGA42O21 DGB21O42
41378200A19 DGA43A19 DGB19A43
41359200G24 DGA45G24 DGB24G45
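
Every row in the table above follows the same pattern, which can be useful when checking a name by hand. The last three characters of the original name (for example K09) reappear in GEP name 1 after a two-digit serial number, and GEP name 2 is GEP name 1 with the serial number and the two-digit number from the original name swapped. The sketch below encodes that pattern; it is inferred from the table rather than taken from a documented GEP rule, and the function names are hypothetical.

```python
import re

# Pattern inferred from the table: "DGA" + 2-digit serial + letter + 2-digit number.
GEP_NAME_1 = re.compile(r"^DGA(\d{2})([A-Z])(\d{2})$")

# Pattern inferred from the table: 8 digits + letter + 2-digit number.
ORIGINAL_NAME = re.compile(r"^\d{8}([A-Z]\d{2})$")


def name2_from_name1(name1):
    """Derive GEP name 2 from GEP name 1 by swapping the two numbers."""
    match = GEP_NAME_1.match(name1)
    if match is None:
        raise ValueError(f"not a GEP name 1: {name1!r}")
    serial, letter, number = match.groups()
    return f"DGB{number}{letter}{serial}"


def suffix_of_original(original_name):
    """Return the trailing letter-plus-number part of an original name."""
    match = ORIGINAL_NAME.match(original_name)
    if match is None:
        raise ValueError(f"unexpected original name: {original_name!r}")
    return match.group(1)


def row_is_consistent(original_name, name1, name2):
    """Check one table row against the inferred naming pattern."""
    return (
        name1.endswith(suffix_of_original(original_name))
        and name2_from_name1(name1) == name2
    )
```

For the first row of the table, `row_is_consistent("39967600K09", "DGA01K09", "DGB09K01")` returns `True`. The serial number itself cannot be derived from the original name, so the table remains the authoritative cross-reference.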