GENSCAN 1.0    Date run: 16-Jun-106    Time: 19:15:11

Sequence Pan_troglodytes_annotation_chunk_6 : 92926 bp : 37.47% C+G : Isochore 1 ( 0 - 43 C+G%)

Parameter matrix: HumanIso.smat

Predicted genes/exons:

Gn.Ex Type S .Begin ...End .Len Fr Ph I/Ac Do/T CodRg P.... Tscr..
----- ---- - ------ ------ ---- -- -- ---- ---- ----- ----- ------

 1.01 Init +   3927   3988   62  2  2   92   36   134 0.941   7.37
 1.02 Intr +   4295   4782  488  2  2   -4   58   251 0.141   4.63
 1.03 Term +  32450  32922  473  0  2   98   33   360 0.767  25.61
 1.04 PlyA +  33658  33663    6                               1.05

 2.04 PlyA -  33984  33979    6                               1.05
 2.03 Term -  39050  39042    9  2  0  131   43     0 0.557  -2.98
 2.02 Intr -  40145  39946  200  0  2   56   93   174 0.799  12.85
 2.01 Init -  42084  42015   70  0  1   62   89    27 0.801   1.46
 2.00 Prom -  52533  52494   40                              -3.75

 3.00 Prom +  52584  52623   40                              -6.75
 3.01 Init +  58969  58982   14  0  2  104   39    13 0.757  -2.67
 3.02 Intr +  59502  59755  254  0  2  122   98   132 0.940  13.75
 3.03 Intr +  63568  63807  240  2  0   93   80   193 0.996  15.60
 3.04 Intr +  65961  66168  208  1  1   96   64   166 0.960  12.21
 3.05 Intr +  68207  69743 1537  2  1   48   16  1534 0.649 130.73
 3.06 Term +  79333  79395   63  0  0   94   38    40 0.023  -3.49
 3.07 PlyA +  79573  79578    6                               1.05

 4.02 PlyA -  81420  81415    6                               1.05
 4.01 Sngl -  85953  85666  288  0  0   55   32   280 0.901  14.44
 4.00 Prom -  89332  89293   40                             -10.05

 5.03 PlyA -  89902  89897    6                               1.05
 5.02 Term -  90253  90017  237  1  0   44   40   279 0.702  13.88
 5.01 Intr -  90956  90594  363  2  0   60   27   530 0.635  38.56
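The rows of the predicted genes/exons table can be read by splitting on whitespace. The following is a minimal sketch, not part of GENSCAN itself: the names Feature, parse_row and is_consistent are hypothetical and introduced only for illustration. It assumes that exon rows (Init, Intr, Term, Sngl) carry all thirteen columns and that signal rows (Prom, PlyA) carry only Gn.Ex through Len followed by Tscr, as in the table above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Feature:
    """One row of the GENSCAN table (hypothetical container, not a GENSCAN type)."""
    gene: int
    exon: int
    kind: str            # Init, Intr, Term, Sngl, Prom or PlyA
    strand: str          # "+" or "-"
    begin: int
    end: int
    length: int
    frame: Optional[int] = None
    phase: Optional[int] = None
    i_ac: Optional[int] = None
    do_t: Optional[int] = None
    cod_rg: Optional[int] = None
    prob: Optional[float] = None
    tscr: Optional[float] = None


def parse_row(line: str) -> Feature:
    """Parse one whitespace-separated table row."""
    tok = line.split()
    gene, exon = tok[0].split(".")
    base = dict(gene=int(gene), exon=int(exon), kind=tok[1], strand=tok[2],
                begin=int(tok[3]), end=int(tok[4]), length=int(tok[5]))
    if tok[1] in ("Prom", "PlyA"):
        # Assumption: signal rows list only Tscr after Len.
        return Feature(**base, tscr=float(tok[6]))
    return Feature(**base, frame=int(tok[6]), phase=int(tok[7]),
                   i_ac=int(tok[8]), do_t=int(tok[9]), cod_rg=int(tok[10]),
                   prob=float(tok[11]), tscr=float(tok[12]))


def is_consistent(f: Feature) -> bool:
    """Check Len against Begin/End and Ph against Len modulo 3."""
    span_ok = abs(f.end - f.begin) + 1 == f.length
    phase_ok = f.phase is None or f.phase == f.length % 3
    return span_ok and phase_ok


# A few rows copied from the table above.
ROWS = [
    "1.01 Init + 3927 3988 62 2 2 92 36 134 0.941 7.37",
    "2.03 Term - 39050 39042 9 2 0 131 43 0 0.557 -2.98",
    "2.00 Prom - 52533 52494 40 -3.75",
    "3.05 Intr + 68207 69743 1537 2 1 48 16 1534 0.649 130.73",
]

features = [parse_row(r) for r in ROWS]
for f in features:
    print(f.gene, f.exon, f.kind, f.strand, f.length, is_consistent(f))
```

Because Begin and End are both numbered on the input strand, Begin is larger than End for features on the - strand, which is why the length check uses the absolute difference.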
Predicted peptide sequence(s):

>Pan_troglodytes_annotation_chunk_6|GENSCAN_predicted_peptide_1|340_aa
MAWRGRGGLGSEGAGSPGALDLEGQLFAPPHPLERRAPSREHDENLEIAFRASDLAGPAA
ISSTVLGASASLCGVGWHTLASSFPSVPQPFGGRGGGGGLCADPGAGNPTLRLRKPRKSS
PSLSQTLDSSAPGLRRHSEKLLVNLIPEAVPGRGGGGGLTHRLGSSESDATWAQDLSGLE
SELGSFEDGLAALEIWRSDATMRTHTRGAPSVFFIYLLCFVSAYITDENPEVMIPFTNAN
YDSHPMLYFSRAEVAELQRRAASSHEHIAARLTEAVHTMLSSPLEYLPPWDPKDYSARWN
EIFGNNLGALAMFCVLYPENIEARDMAKDYMERMAAQPSW

>Pan_troglodytes_annotation_chunk_6|GENSCAN_predicted_peptide_2|92_aa
MNTKIQGIREERRPGAGGILQRPGLCGISNDLAPDYGISSCAVQNSHVDTGKEKERWMRE
TVVSVAEMCSLNGHGRRRRRSKKAILKIPKGT

>Pan_troglodytes_annotation_chunk_6|GENSCAN_predicted_peptide_3|771_aa
MVMTKLVKDAPWDEVPLAHSLVGFATAYDFLYNYLSKTQQEKFLEVIANASGYMYETSYR
RGWGFQYLHNHQPTNCMALLTGSLVLMNQGYLQEAYLWTKQVLTIMEKSLVLLREVTDGS
LYEGVAYGSYTTRSLFQYMFLVQRHFNINHFGHPWLKQHFAFMYRTILPGFQRTVAIADS
NYNWFYGPESQLVFLDKFVMRNGSANWLADQIRRNRVVEGPGTPSKGQRWCTLHTEFLWY
DASLKSVPPPDFGTPTLHYFEDWGVVTYGSALPAEINRSFLSFKSGKLGGRAIYDIVHRN
KYKDWIKGWRNFNAGHEHPDQNSFTFAPNGVPFITEALYGPKYTFFNNVLMFSPAVSKSC
FSPWVGQVTEDCSSKWSKYKHDLAASCQGRVVAAEEKNGVVFIRGEGVGAYNPQLNLKNV
QRNLILLHPQLLLLVDQIHLGEESPLETAASFFHNVDVPFEETVVDGVHGAFIRQRDGLY
KMYWMDDTGYSKKATFASVTYPRGYPYNGTNYVNVTMHLRSPITRAAYLFIGPSIDVQSF
TVHGDSQQLDVFIATSKHAYATYLWTGEATGQSAFAQVIADHHKILFDRNSVIKSSIVPE
VKDYAAIVEQNLQHFKPVFQLLEKQILSRVRNTASFRKTAERLLRFSDKRQTEEAIDRIF
AISQQQQQQSKSKKNRRAGKRYKFVDAVPDIFAQIEVNEKKIRQKAQILAQKELPIDEDE
EMKDLLDFADVTYEKHKNGGLIKGRFGQARMVVEVNGVIVKRFQSADQCFI

>Pan_troglodytes_annotation_chunk_6|GENSCAN_predicted_peptide_4|95_aa
MVKKKISDSESDDSKSKKKTDAADKPRGFARGLDPERIIGATDSSGELMFLMKWKDSDEA
DLVLAKEASMKCPQIVIAFYEERLTWHSCPEDEAQ

>Pan_troglodytes_annotation_chunk_6|GENSCAN_predicted_peptide_5|199_aa
SEEPGDRELEAGEQNPGAPGEEGTPGQRLEPLLHDHQDLRAQIFTNTVDNARIVLQINAC
FAAVDFSVKYETELAMCQSVESDIHGVHKVIDDTNVTRLQLETEIKALKKQLLFMKNHEE
EMEQLNGILLHLESELAQTQAEGQHQAQEYEALLNIKVKLEAEIATYNNLLEDGEDFNLG
DALDNSNSMQTIQKTPPAQ

Explanation

Gn.Ex : gene number, exon number (for reference)
Type  : Init = Initial exon (ATG to 5' splice site)
        Intr = Internal exon (3' splice site to 5' splice site)
        Term = Terminal exon (3' splice site to stop codon)
        Sngl = Single-exon gene (ATG to stop)
        Prom = Promoter (TATA box / initiation site)
        PlyA = poly-A signal (consensus: AATAAA)
S     : DNA strand (+ = input strand; - = opposite strand)
Begin : beginning of exon or signal (numbered on input strand)
End   : end point of exon or signal (numbered on input strand)
Len   : length of exon or signal (bp)
Fr    : reading frame (a forward strand codon ending at x has frame x mod 3)
Ph    : net phase of exon (exon length modulo 3)
I/Ac  : initiation signal or 3' splice site score (tenth bit units)
Do/T  : 5' splice site or termination signal score (tenth bit units)
CodRg : coding region score (tenth bit units)
P     : probability of exon (sum over all parses containing exon)
Tscr  : exon score (depends on length, I/Ac, Do/T and CodRg scores)

Comments

The SCORE of a predicted feature (e.g., exon or splice site) is a log-odds measure of the quality of the feature based on local sequence properties. For example, a predicted 5' splice site with score > 100 is strong; 50-100 is moderate; 0-50 is weak; and below 0 is poor (more than likely not a real donor site).

The PROBABILITY of a predicted exon is the estimated probability under GENSCAN's model of genomic sequence structure that the exon is correct. This probability depends in general on global as well as local sequence properties, e.g., it depends on how well the exon fits with neighboring exons. It has been shown that predicted exons with higher probabilities are more likely to be correct than those with lower probabilities.
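
The score bands for 5' splice sites given in the Comments can be applied to the Do/T column of Init and Intr exons, where that column holds the donor-site score. The sketch below is illustrative only. The names to_bits, donor_strength, summarize and MIN_PROB are hypothetical, the handling of the boundary values (0, 50 and 100) is an assumption because the stated ranges overlap at their endpoints, and the probability cutoff of 0.5 is an arbitrary example value, not a threshold recommended by GENSCAN.

```python
def to_bits(score: int) -> float:
    """Convert a score in tenth-bit units to bits."""
    return score / 10


def donor_strength(score: int) -> str:
    """Band a 5' splice site score using the ranges quoted in the Comments.

    Assumption: 100 and 50 count as "moderate" and 0 counts as "weak".
    """
    if score > 100:
        return "strong"
    if score >= 50:
        return "moderate"
    if score >= 0:
        return "weak"
    return "poor"


# (Gn.Ex, Type, Do/T, P) copied from Init/Intr rows of the table.
EXONS = [
    ("1.01", "Init", 36, 0.941),
    ("1.02", "Intr", 58, 0.141),
    ("3.02", "Intr", 98, 0.940),
    ("3.05", "Intr", 16, 0.649),
]

MIN_PROB = 0.5  # hypothetical cutoff, chosen only for this example


def summarize(exons, min_prob=MIN_PROB):
    """Return (Gn.Ex, donor band, passes probability cutoff) per exon."""
    return [(label, donor_strength(do_t), p >= min_prob)
            for label, _kind, do_t, p in exons]


for label, band, keep in summarize(EXONS):
    print(label, band, "keep" if keep else "low probability")
```

Banding the local donor score separately from the exon probability reflects the distinction drawn above: the score depends on local sequence properties, while the probability also depends on how well the exon fits with its neighbors.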