GENSCAN 1.0    Date run: 13-Mar-108    Time: 11:34:24

Sequence panTro2_dna : 113620 bp : 37.94% C+G : Isochore 1 ( 0 - 43 C+G%)

Parameter matrix: HumanIso.smat

Predicted genes/exons:

Gn.Ex Type S .Begin ...End .Len Fr Ph I/Ac Do/T CodRg P.... Tscr..
----- ---- - ------ ------ ---- -- -- ---- ---- ----- ----- ------

 1.03 PlyA -   1018   1013    6                               1.05
 1.02 Term -   4627   3701  927  1  0   44   41   691 0.647  51.16
 1.01 Init -   5434   5069  366  1  0  104   87   335 0.981  32.07
 1.00 Prom -  15261  15222   40                              -8.15

 2.00 Prom +  23920  23959   40                              -3.95
 2.01 Init +  27519  27607   89  2  2   38  101    47 0.096   1.16
 2.02 Intr +  43684  43814  131  1  2   26   49   112 0.050   0.52
 2.03 Intr +  70135  70318  184  2  1  137   17   168 0.089  12.52
 2.04 Intr +  70753  71044  292  1  1   -7  -15   476 0.097  24.01
 2.05 Intr +  83244  83370  127  2  1   69  106   154 0.958  14.63
 2.06 Intr +  87034  87192  159  2  0   41   93   157 0.986  10.54
 2.07 Term +  88179  89698 1520  1  2  130   39   710 0.889  59.97
 2.08 PlyA +  90779  90784    6                               1.05
Predicted peptide sequence(s):

Predicted coding sequence(s):

>panTro2_dna|GENSCAN_predicted_peptide_1|430_aa
MEELVTEKEAEESHRPDSVSLLTFILLLTLTILTIWLFKYCRVHFLHETGLAMICGLIVG
VILRYGTPGTRGRDKLLNCTQEDQAFSTLVVDVSGKFFEYTLKREISPGKINSVKQNDML
GKSVGIFLGIFSGCFTMGAVTGVVTALVTKFTKLDCFPLLETALFFLMSWSTFLLAEACG
FTGVVAVLFCGITQAHYTFNNLLVESRSRSKQLFEAENFIFSCMVLALFTFQKHVFSPVF
IIGAFVAVFLGRAAHIYPLSFFLSLGRRHKIGWNFQHTMMFSGLRGAVAFALAICDTASY
ARQMTFTTTPFIVFFTIWIIGGGTTPMLSWLNIRVSIKEPSKEDRNEHHWQYFRVGVDPD
QDPPPNNDSFQVLQGDSPDSARGNWTKQESTWIFRLWYSFDHNYLKPILTHSGSPLTTTL
PPGGDTAAPH

>panTro2_dna|GENSCAN_predicted_CDS_1|1293_bp
atggaggagctcgtcactgagaaggaggcggaagagagccaccggccagacagtgtgagc
ctgctcaccttcatcctgctgctcacgctcaccatcctcaccatatggctcttcaagtac
tgccgggtgcactttctgcatgagaccgggctggccatgatctgtgggctcatcgttggg
gtgatcctgaggtatggtacccctggcaccaggggccgtgacaaattactcaactgcact
caagaagatcaggccttcagcactttagtagtggatgtcagcggtaaattcttcgaatac
actctgaaaagagaaatcagccctggcaagatcaacagcgtaaagcagaatgacatgcta
gggaagtcagttggcatttttctaggtatatttagtggctgttttaccatgggagctgtg
actggtgttgtgactgctttagtgaccaagtttaccaaactggactgctttcccctgctg
gagacggcgctcttcttcctcatgtcctggagcacgtttctcttggcagaagcctgcgga
tttacaggcgttgtagctgtccttttctgtggaatcacacaagctcattacaccttcaac
aatctgttggtggaatcaagaagtcgaagcaagcagctctttgaggcagagaacttcatc
ttctcctgcatggtcctggcgctatttaccttccagaagcacgttttcagccctgttttc
atcattggagcttttgttgctgtcttcctgggcagagccgcccatatctacccgctctcc
ttcttcctcagcttgggcagaaggcataagattggctggaattttcaacacacgatgatg
ttttcaggcctcaggggagcagtggcatttgcgttggccatctgtgacacggcatcctat
gctcgccagatgacgttcaccaccacgcctttcatcgtgttcttcaccatctggatcatt
ggaggaggcacgacacccatgttgtcatggcttaatatcagagttagcatcaaggagccc
tccaaagaggaccgcaacgaacaccactggcagtacttcagagttggtgttgaccccgat
caagatccaccacccaacaatgacagctttcaagtcttacaaggggacagcccagattct
gccagaggaaactggacaaaacaggagagcacatggatattcaggctgtggtacagcttt
gatcacaattacctgaagcccatcctcacacacagcggctccccgctaaccaccactctc
ccgcctggtggagacacagcggctccccactaa

>panTro2_dna|GENSCAN_predicted_peptide_2|833_aa
MQPRTYVDLKLAFCDCNGKQIKQEEDVGRCLRTECQRARGCGSPGPFVPHTGASVCLKAR
QISKAINRDRSAGGLAFGEGKAAATLGASVSREEERPSQGSSHWRGIRRVRRRSTPETKA
ASQAGGGRGNGAQKQSKRAGVSGGGKGCGEGASQIPEMPEFLEDPSVLTKDKLKSELVAN
NVTLPAGEQRKDVYVQLYLQHLTARNRPPLPAGTNSKGPPDFSSDEEREPTPKATKKTDK
PRQEDKDDLDVTELTNEDLLDQLVKYGVNPGPIVGTTRKLYEKKLLKLREQGTESRSSTP
LPTISSSAENTRQNGSNDSDRYSDNEEGKKKEHKKVKSTRDTVPFSELGTTPSGGGFFQG
ISFPEISTRPPLGSTELQAAKKVHTSKGDLPREPLVATNLPGRGQLQKLASERNLFISCK
SSHDRCLEKSSSSSSQPEHSAMLVSTAASPSLIKETTTGYYKDIVENICGREKSGIQPLC
PERSHISDQSPLSSKRKALEESESSQLISPPLAQAIRDYVNSLLVQGGVGSLPGTSNSMP
PLDVENIQKRIDQSKFQETEFLSPPRKVPRLSEKSVEERDSGSFLAFQNIPGSELMSSFA
KTVVSHSLTTLGLEVAKQSQHDKIHASELSFPFHESILKVIEEEWQQVDRQLPSLACKYP
VSSREATQILSVPKVDDEILGFISEATPLGGIQAASTESCNQQLDLALCRAYEAAASALQ
IATHTAFVAKAMQADISQAAQILSSDPSRTHQALGILSKTYDAASYICEAAFDEVKMAAH
TMGNSTVGRRYLWLKDCKINLASKNKLASTPFKGGTLFGGEVCKVIKKRGNKH

>panTro2_dna|GENSCAN_predicted_CDS_2|2502_bp
atgcagccaagaacatatgttgacctaaaactggcattctgtgattgtaatggaaaacag
ataaagcaggaggaagatgtaggaagatgcctgaggactgaatgccaaagggccaggggg
tgtgggtcaccaggaccctttgttcctcacactggggctagtgtttgcctcaaagctaga
cagatcagtaaagctattaaccgggataggtcagcaggtgggctagcgtttggggaaggc
aaggctgcggctactcttggagcttcagtgtcccgggaggaagaaaggcccagccaaggg
tcctcacactggcgtggaattcggcgcgttcgtaggcgatcgaccccagagacgaaagct
gcttctcaagctgggggagggagaggaaacggcgcacaaaagcagagcaagcgcgccggc
gtgagcggcggcggcaaaggctgtggggagggggcttcgcagatccccgagatgccggag
ttcctggaagacccctcggtcctgacaaaagacaagttgaagagtgagttggtcgccaac
aatgtgacgctgccggccggggagcagcgcaaagacgtgtacgtccagctctacctgcag
cacctcacggctcgcaaccggccgccgctccccgccggcaccaacagcaaggggcccccg
gacttctccagtgacgaagagcgcgagcccaccccgaaagccacaaaaaaaactgataaa
cccagacaagaagataaagatgatctagatgtaacagagctcactaatgaagatcttttg
gatcagcttgtgaaatacggagtgaatcctggtcctattgtgggaacaaccaggaagcta
tatgagaaaaagcttttgaaactgagggaacaaggaacagaatcaagatcttctactcct
ctgccaacaatttcttcttcagcagaaaatacaaggcagaatggaagtaatgattctgac
agatacagtgacaatgaagaaggaaagaagaaagaacacaagaaagtgaagtccactagg
gatactgttcctttttctgaacttggaactactccctctggtggtggattttttcagggt
atttcttttcctgaaatctccacccgtcctcctttgggcagtaccgaactacaggcagct
aagaaagtacatacttctaagggagacctacctagggagcctcttgttgccacaaacttg
cctggcaggggacagttgcagaagttagcctctgaaaggaatttgtttatttcatgcaag
tctagccatgataggtgtttagagaaaagttcttcgtcatcttctcagcctgaacacagt
gccatgttggtctctactgcagcttctccttcactgattaaagaaaccaccactggttac
tataaagacatagtagaaaatatttgcggtagagagaaaagtggaattcaaccattatgt
cctgagaggtcccatatttcagatcaatcgcctctctccagtaaaaggaaagcactagaa
gagtctgagagctcacaactaatttctccgccacttgcccaggcaatcagagactatgtc
aattctctgttggtccagggtggggtaggtagtttgcctggaacttctaactctatgccc
ccactggatgtagaaaacatacagaagagaattgatcagtctaagtttcaagaaactgaa
ttcctgtctcctccacgaaaagtccctagactgagtgagaagtcagtggaggaaagggat
tcaggttcctttttggcatttcagaacatacctggatccgaactgatgtcttcttttgcc
aaaactgttgtctctcattctctcactaccttaggtctagaagtggctaagcaatcacag
catgataaaatacatgcctcagaactatcttttcccttccatgaatctattttaaaagta
attgaagaagaatggcagcaagttgacaggcagctgccttcactggcatgcaaatatcca
gtttcttccagggaggcaacacagatattatcagttccaaaagtagatgatgaaatccta
gggtttatttctgaagccactccactaggaggtattcaagcagcctccactgagtcttgc
aatcagcagttggacttagcactctgtagagcatatgaagctgcagcatcagcattgcag
attgcaacccacactgcctttgtagctaaggctatgcaggcagacattagtcaagctgca
cagattcttagctcagatcctagtcgtacccaccaagcgcttgggattctgagcaaaaca
tatgatgcagcctcatatatttgtgaagctgcatttgatgaagtgaagatggctgcccat
accatgggaaattccactgtaggtcgtcgatacctctggctgaaggattgcaaaattaat
ttagcttctaagaataagctggcttccactccctttaaaggtggaacattatttggagga
gaagtatgcaaagtaattaaaaagcgtggaaataaacactag

Explanation

Gn.Ex : gene number, exon number (for reference)
Type  : Init = Initial exon (ATG to 5' splice site)
        Intr = Internal exon (3' splice site to 5' splice site)
        Term = Terminal exon (3' splice site to stop codon)
        Sngl = Single-exon gene (ATG to stop)
        Prom = Promoter (TATA box / initiation site)
        PlyA = poly-A signal (consensus: AATAAA)
S     : DNA strand (+ = input strand; - = opposite strand)
Begin : beginning of exon or signal (numbered on input strand)
End   : end point of exon or signal (numbered on input strand)
Len   : length of exon or signal (bp)
Fr    : reading frame (a forward strand codon ending at x has frame x mod 3)
Ph    : net phase of exon (exon length modulo 3)
I/Ac  : initiation signal or 3' splice site score (tenth bit units)
Do/T  : 5' splice site or termination signal score (tenth bit units)
CodRg : coding region score (tenth bit units)
P     : probability of exon (sum over all parses containing exon)
Tscr  : exon score (depends on length, I/Ac, Do/T and CodRg scores)

Comments

The SCORE of a predicted feature (e.g., exon or splice site) is a
log-odds measure of the quality of the feature based on local sequence
properties. For example, a predicted 5' splice site with
score > 100 is strong; 50-100 is moderate; 0-50 is weak; and
below 0 is poor (more than likely not a real donor site).

The PROBABILITY of a predicted exon is the estimated probability under
GENSCAN's model of genomic sequence structure that the exon is correct.
This probability depends in general on global as well as local sequence
properties, e.g., it depends on how well the exon fits with neighboring
exons. It has been shown that predicted exons with higher probabilities
are more likely to be correct than those with lower probabilities.
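
The column definitions above can be checked against the exon rows of this report. The following Python sketch is illustrative only and is not part of GENSCAN: the tuple layout and the function names (exon_length, exon_phase, donor_strength, check_rows, cds_length_by_gene) are assumptions made for this example. It recomputes Len from Begin/End, recomputes Ph as Len modulo 3, sums exon lengths per gene for comparison with the CDS sizes in the FASTA headers, and applies the 5' splice site score bands from the Comments. The Comments leave the boundaries ambiguous, so here a score of exactly 50 or 100 is treated as moderate and exactly 0 as weak.

```python
from collections import defaultdict

# Exon rows copied from the table in this report:
# (Gn.Ex, Type, Begin, End, Len, Ph, Do/T)
EXONS = [
    ("1.02", "Term", 4627, 3701, 927, 0, 41),
    ("1.01", "Init", 5434, 5069, 366, 0, 87),
    ("2.01", "Init", 27519, 27607, 89, 2, 101),
    ("2.02", "Intr", 43684, 43814, 131, 2, 49),
    ("2.03", "Intr", 70135, 70318, 184, 1, 17),
    ("2.04", "Intr", 70753, 71044, 292, 1, -15),
    ("2.05", "Intr", 83244, 83370, 127, 1, 106),
    ("2.06", "Intr", 87034, 87192, 159, 0, 93),
    ("2.07", "Term", 88179, 89698, 1520, 2, 39),
]


def exon_length(begin, end):
    """Length in bp; Begin > End on the minus strand, so take abs()."""
    return abs(end - begin) + 1


def exon_phase(length):
    """Net phase of an exon: exon length modulo 3."""
    return length % 3


def donor_strength(score):
    """Band a 5' splice site score (tenth-bit units) per the Comments.

    Boundary handling (50 and 100 -> moderate, 0 -> weak) is an
    assumption; the report text does not say which side they fall on.
    """
    if score > 100:
        return "strong"
    if score >= 50:
        return "moderate"
    if score >= 0:
        return "weak"
    return "poor"


def check_rows(exons):
    """Return Gn.Ex labels whose Len or Ph disagree with Begin/End."""
    bad = []
    for gn_ex, _type, begin, end, length, phase, _do_t in exons:
        if exon_length(begin, end) != length or exon_phase(length) != phase:
            bad.append(gn_ex)
    return bad


def cds_length_by_gene(exons):
    """Sum exon lengths per gene number (the part of Gn.Ex before '.')."""
    totals = defaultdict(int)
    for gn_ex, _type, _begin, _end, length, _phase, _do_t in exons:
        totals[gn_ex.split(".")[0]] += length
    return dict(totals)


if __name__ == "__main__":
    print("inconsistent rows:", check_rows(EXONS))
    for gene, bp in cds_length_by_gene(EXONS).items():
        # One codon is the stop codon, so peptide length is bp // 3 - 1.
        print(f"gene {gene}: CDS {bp} bp, peptide {bp // 3 - 1} aa")
    for gn_ex, kind, *_rest, do_t in EXONS:
        # Do/T is a donor (5' splice site) score only for Init and Intr exons;
        # for Term exons it is the termination signal score.
        if kind in ("Init", "Intr"):
            print(gn_ex, do_t, donor_strength(do_t))
```

For this report the per-gene sums come to 1293 bp and 2502 bp, matching the predicted CDS headers, and dividing by 3 and subtracting the stop codon gives the 430 aa and 833 aa peptide lengths.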