
Finding Elusive Exons in D. biarmipes

Posted: Thu Nov 01, 2012 2:47 am
by akleinschmit
My students and I are having a hard time finding the following D. melanogaster ortholog exons in D. biarmipes, as outlined in the Excel screenshots below (D.bia_DOT_Contig10). We have looked for predicted splice sites, RNA-seq data, and conservation with other Drosophila species, and we have also tried the Small Exon Finder.

We are wondering whether the D.bia ortholog of CG2219 may have lost isoform C, or whether we have missed a critical cue when searching for the exon. We are also not sure what to make of the missing exons in the D.bia ortholog of CG33978.

Any suggestions on where to pick up on the project would be greatly appreciated. Thank you :)

Re: Finding Elusive Exons in D. biarmipes

Posted: Thu Nov 01, 2012 3:43 pm
by wleung
In both cases, the exons are likely to be too small or too weakly conserved to be detected using BLAST searches. Given the lack of expression data, conservation and computational predictions, we will use conservation of exon length and biological constraints to help us pick the best candidate.

Missing exon for CG2219

I agree with your assessment that the C isoform of CG2219 probably does not exist in D. biarmipes. However, we can construct a gene model that represents the best candidate for the putative ortholog of the C isoform of CG2219.

In D. melanogaster, exon 2_1569_1 in the C isoform overlaps with exon 3_1569_1 in the A and B isoforms. Consequently, we expect to find exon 2_1569_1 between the donor site of the previous exon (4_1569_0) and the donor site of the next exon (3_1569_1). Ideally, exon 2_1569_1 should be placed within exon 3_1569_1. The other constraint is that we need a phase 1 acceptor because the donor site of 4_1569_1 is in phase 2.
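
As a quick illustration of the phase constraint above: the bases left over at the donor plus the bases preceding the first complete codon at the acceptor must together make up one whole codon (or nothing at all). A minimal Python sketch of that rule follows; the function names are my own and are not part of any GEP tool.

```python
def required_acceptor_phase(donor_phase: int) -> int:
    """Return the acceptor phase that completes the codon split by the donor.

    donor_phase is the number of bases between the last complete codon
    and the splice donor (0, 1, or 2).
    """
    if donor_phase not in (0, 1, 2):
        raise ValueError("phase must be 0, 1, or 2")
    return (3 - donor_phase) % 3


def phases_compatible(donor_phase: int, acceptor_phase: int) -> bool:
    """True when donor and acceptor phases sum to 0 or 3."""
    return (donor_phase + acceptor_phase) % 3 == 0


# The case in this post: a phase 2 donor needs a phase 1 acceptor.
print(required_acceptor_phase(2))
print(phases_compatible(2, 1))
```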

We find that there are only three stop codons in this region of interest:

The only phase 1 acceptor relative to frame -2 is found at:
41637-41636 (single base followed by stop codon)

The two candidates found relative to frame -3 are at:
[Image: acceptor candidates for CG2219 (Dbia_contig10_acceptor_candidates.png)]
None of these candidates are supported by expression data, sequence conservation, or computational predictions. Consequently, we will err on the side of picking the exon with the longest open reading frame: the open reading frame in frame -3 with the splice acceptor site at 41651-41650.
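
The "longest open reading frame" tie-breaker can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the sequence is a made-up toy string (not the contig10 sequence), it is assumed to be already oriented in the coding direction (that is, reverse-complemented for a minus-strand gene), and each candidate is given as a hypothetical 0-based index of the first exon base together with its acceptor phase.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def orf_codons(seq: str, exon_start: int, acceptor_phase: int) -> int:
    """Count complete codons from the first complete codon after the
    acceptor up to (not including) the first in-frame stop codon, or
    until the end of the sequence."""
    first_codon = exon_start + acceptor_phase
    codons = 0
    for i in range(first_codon, len(seq) - 2, 3):
        if seq[i:i + 3] in STOP_CODONS:
            break
        codons += 1
    return codons


def pick_longest(seq: str, candidates):
    """Return the (exon_start, acceptor_phase) pair with the longest ORF."""
    return max(candidates, key=lambda c: orf_codons(seq, c[0], c[1]))


# Toy data only: three hypothetical phase 1 acceptor candidates.
TOY_SEQ = "CAGGATGCCTAAGCAGTTTGCCGGCTGA"
TOY_CANDIDATES = [(3, 1), (13, 1), (16, 1)]

for cand in TOY_CANDIDATES:
    print(cand, orf_codons(TOY_SEQ, *cand))
print("picked:", pick_longest(TOY_SEQ, TOY_CANDIDATES))
```

In a real annotation the candidate list would come from scanning the genome browser for AG dinucleotides in the right phase, and this rule is only a fallback when no expression or conservation evidence is available.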

Re: Finding Elusive Exons in D. biarmipes

Posted: Thu Nov 01, 2012 4:05 pm
by wleung
Missing exons in CG33978

Given that exons 3_1556_1 and 2_1566_1 overlap with each other in D. melanogaster, I would try to map the larger exon (3_1556_1) first. From your previous analysis of the adjacent exons, we know that we need this exon to have a phase 1 acceptor and a phase 0 donor and that the exon should be approximately 20 amino acids in length.

Using the Small Exon Finder, we find that there are two CDSs that satisfy these criteria within the region that is bounded by the adjacent exons (4_1556_0 and 1_1556_0):
8027-8087 and 8837-8897
[Image: small exon finder D. biarmipes CG33978 (small_exon_finder_Dbia_CG33978.png)]
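
To see why both hits fit the criteria, here is a small arithmetic check in Python. It is a sketch of the length constraint only (it is not how the Small Exon Finder itself is implemented), and it assumes the coordinates above are inclusive and cover the full CDS between the acceptor and the donor.

```python
def complete_codons(start: int, end: int, acceptor_phase: int, donor_phase: int):
    """Return the number of complete codons in an exon with inclusive
    coordinates start..end, or None if the length does not fit the phases."""
    length = end - start + 1
    inner = length - acceptor_phase - donor_phase
    if inner < 0 or inner % 3 != 0:
        return None
    return inner // 3


# Both candidates need a phase 1 acceptor and a phase 0 donor.
for start, end in [(8027, 8087), (8837, 8897)]:
    print(start, end, complete_codons(start, end, 1, 0))
```

Each span is 61 bases long: one base to finish the codon split by the previous donor, followed by 20 complete codons, which matches the expected size of approximately 20 amino acids.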
We know that 2_1566_1 has the same donor site as 3_1556_1, but it has an earlier phase 1 acceptor site and an expected size of 6 amino acids. We will therefore use the genome browser to scan for potential candidates within both of these regions.

We found multiple candidates that satisfy both criteria in both of these regions. However, the best splice acceptor candidate appears to be at 8871-8872 (in frame +3) based on the conservation of exon size with D. melanogaster.
[Image: best small exon candidate CG33978 D. biarmipes (Dbia_contig10_CG33978_best_candidate.png)]
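
For anyone who wants to double-check the reading frame of a candidate like this by hand, a short Python sketch follows. It rests on two assumptions of mine: that 8871-8872 are the two bases of the AG acceptor, so the exon begins at 8873, and that the browser labels frames so that frame +1 codons begin at coordinate 1.

```python
def forward_frame(first_codon_base: int) -> int:
    """Frame (+1, +2, or +3) of a codon starting at a 1-based coordinate,
    assuming frame +1 codons start at coordinate 1."""
    return (first_codon_base - 1) % 3 + 1


def first_complete_codon(acceptor_last_base: int, acceptor_phase: int) -> int:
    """Coordinate of the first complete codon on the plus strand, given the
    last base of the AG acceptor and the acceptor phase."""
    exon_start = acceptor_last_base + 1
    return exon_start + acceptor_phase


codon_start = first_complete_codon(8872, 1)
print(codon_start, "frame +%d" % forward_frame(codon_start))
```

Under those assumptions the first complete codon starts at 8874, which lands in frame +3, the same frame as the 8837-8897 candidate found with the Small Exon Finder.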

Re: Finding Elusive Exons in D. biarmipes

Posted: Thu Nov 01, 2012 5:32 pm
by akleinschmit
Thank you Wilson :) That explanation was wonderful (I hadn't considered using the information held within the Dmel ortholog coordinates to help us locate the exons)!