GENSCAN 1.0    Date run: 11-Nov-105    Time: 10:30:27

Sequence Unknown : 170934 bp : 43.30% C+G : Isochore 2 (43 - 51 C+G%)

Parameter matrix: HumanIso.smat

Predicted genes/exons:

Gn.Ex Type S .Begin ...End .Len Fr Ph I/Ac Do/T CodRg P.... Tscr..
----- ---- - ------ ------ ---- -- -- ---- ---- ----- ----- ------

 1.02 PlyA -   1029   1024    6                               1.05
 1.01 Sngl -   9160   8738  423  1  0   79   49   394 0.877  30.90
 1.00 Prom -  14680  14641   40                              -3.46

 2.00 Prom +  35754  35793   40                              -5.26
 2.01 Sngl +  50467  50901  435  0  0   55   42   448 0.399  33.17
 2.02 PlyA +  53128  53133    6                               1.05

 3.00 Prom +  55104  55143   40                              -9.26
 3.01 Init +  56584  56644   61  0  1   67   92   -35 0.145  -3.60
 3.02 Term +  72008  72360  353  0  2   49   46   478 0.852  34.35
 3.03 PlyA +  73098  73103    6                               1.05

 4.00 Prom +  78208  78247   40                              -2.96
 4.01 Init + 112457 112613  157  1  1   53   34   165 0.901   5.67
 4.02 Term + 112685 112950  266  0  2   77   40   149 0.843   4.67
 4.03 PlyA + 114211 114216    6                               1.05

 5.04 PlyA - 114334 114329    6                               1.05
 5.03 Term - 119129 118774  356  0  2   70   54   525 0.982  41.96
 5.02 Intr - 120003 119595  409  0  1   11   82   198 0.380   5.34
 5.01 Init - 121497 121246  252  0  0   46   81   116 0.468   3.83
 5.00 Prom - 123905 123866   40                              -7.06

 6.04 PlyA - 124122 124117    6                               1.05
 6.03 Term - 125834 125095  740  0  2  101   42   650 0.996  55.33
 6.02 Intr - 128460 128314  147  1  0  119   54   104 0.578  10.31
 6.01 Init - 131271 131187   85  0  1   91   57   120 0.600   8.33
 6.00 Prom - 131563 131524   40                              -7.76

 7.10 PlyA - 131684 131679    6                              -1.95
 7.09 Term - 133089 132284  806  1  2  148   44   936 0.999  88.59
 7.08 Intr - 134624 134486  139  2  1   82  103   142 0.930  15.14
 7.07 Intr - 135131 134944  188  0  2  103    9   127 0.087   5.71
 7.06 Intr - 144109 144060   50  0  2  109   78    -8 0.051  -1.38
 7.05 Intr - 147074 146991   84  1  0  108   87    67 0.083   7.54
 7.04 Intr - 148901 148861   41  2  2   82   92    48 0.088   1.62
 7.03 Intr - 153997 153840  158  2  2   19   53   263 0.976  15.63
 7.02 Intr - 154613 154544   70  2  1  -11  104   155 0.689   5.95
 7.01 Init - 157828 157820    9  1  0   84   57    34 0.558  -1.28
 7.00 Prom - 160198 160159   40                              -5.36

 8.05 PlyA - 160224 160219    6                               1.05
 8.04 Term - 166490 166240  251  0  2   66   48   265 0.977  16.17
 8.03 Intr - 168226 167450  777  2  0   43   58   679 0.473  51.85
 8.02 Intr - 169524 168911  614  2  2   26   76   140 0.076  -1.37
 8.01 Intr - 170307 170052  256  1  1  100  -18   571 0.295  44.20
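The rows of the exon table are whitespace-separated, so they can be read into records with a few lines of Python. What follows is a minimal sketch and not part of GENSCAN: the names Feature, parse_row and coding_length are hypothetical, and the field positions are assumed from the column header (13 fields for exon rows, 7 for Prom and PlyA rows). It checks that Len agrees with Begin and End, that Ph equals Len mod 3, and that the exon lengths of gene 3 add up to the 414 bp reported for its predicted CDS.

```python
from dataclasses import dataclass
from typing import List, Optional

# Row types that carry the full set of exon columns (assumed from the header).
EXON_TYPES = {"Init", "Intr", "Term", "Sngl"}


@dataclass
class Feature:
    gene: int
    index: int
    kind: str
    strand: str
    begin: int
    end: int
    length: int
    phase: Optional[int] = None   # Ph column, exon rows only
    prob: Optional[float] = None  # P column, exon rows only
    score: Optional[float] = None  # Tscr column


def parse_row(line: str) -> Feature:
    """Parse one whitespace-separated table row into a Feature."""
    f = line.split()
    gene, index = f[0].split(".")
    feat = Feature(int(gene), int(index), f[1], f[2],
                   int(f[3]), int(f[4]), int(f[5]))
    if feat.kind in EXON_TYPES:
        # Exon rows: Fr, Ph, I/Ac, Do/T, CodRg, P, Tscr follow Len.
        feat.phase = int(f[7])
        feat.prob = float(f[11])
        feat.score = float(f[12])
    else:
        # Prom and PlyA rows carry only Tscr after Len.
        feat.score = float(f[6])
    return feat


def coding_length(features: List[Feature], gene: int) -> int:
    """Sum the lengths of all coding exons predicted for one gene."""
    return sum(x.length for x in features
               if x.gene == gene and x.kind in EXON_TYPES)


# Rows for gene 3, copied from the table.
rows = [
    "3.00 Prom + 55104 55143 40 -9.26",
    "3.01 Init + 56584 56644 61 0 1 67 92 -35 0.145 -3.60",
    "3.02 Term + 72008 72360 353 0 2 49 46 478 0.852 34.35",
    "3.03 PlyA + 73098 73103 6 1.05",
]
features = [parse_row(r) for r in rows]

for x in features:
    # Begin and End are inclusive coordinates on the input strand.
    assert abs(x.end - x.begin) + 1 == x.length
    if x.phase is not None:
        assert x.phase == x.length % 3

print(coding_length(features, 3))  # 61 + 353 = 414

# Keep only exons with P >= 0.5 (the cutoff is an arbitrary illustration).
confident = [x for x in features if x.prob is not None and x.prob >= 0.5]
print([(x.gene, x.index) for x in confident])
```

The same length check does not hold exactly for gene 8, whose exons sum to 1898 bp against a reported CDS of 1899 bp; its predicted CDS begins with an n and its peptide with an X.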
Predicted peptide sequence(s):

Predicted coding sequence(s):

>Unknown|GENSCAN_predicted_peptide_1|140_aa
MSAYAFYVQTCREEHRKKNPEVPVNFAEFSKKYSERWKTMSGKDKSKFDEITKADKMRYD
QEMKDYGPAKGAKKKKDPNASKRPLSGFFLFCSELGPKIKSTNPTISIRDMAKKLGEMWN
NLNGSEKQPYITKAAKLKEK

>Unknown|GENSCAN_predicted_CDS_1|423_bp
atgtctgcttatgccttctatgtgcagacgtgcagagaagaacataggaagaaaaaccca
gaggtccctgtcaattttgcagaattttccaagaagtactctgagaggtggaagacaatg
tctgggaaagataaatctaaatttgatgaaataacaaaggcagataaaatgcgctatgat
caggaaatgaaggattatggaccagctaagggagccaagaagaagaaggatcctaatgcc
tccaaaaggccactgtctggattcttcctgttctgttcagaattaggccccaagatcaaa
tctacaaaccccaccatctctattagagacatggcaaaaaagctgggtgagatgtggaat
aacttaaatggcagtgaaaagcagccctacatcactaaggcggcaaagctgaaggagaag
tag

>Unknown|GENSCAN_predicted_peptide_2|144_aa
MELQEIQLKEAKHIAEEADRKYEEVPRKLVIIEGDLECTEERAELAESRCQETDEQIRLM
DQNLKCLSDAEEKYSQKEDIYEEEIKILTDKLKEAETRAEFTERLVAKLEKTTDDLEYKL
KCNKEENLCTQRMLYQTLLDLNEM

>Unknown|GENSCAN_predicted_CDS_2|435_bp
atggaactccaggaaatccaactcaaagaagctaagcatattgcagaagaggcagatagg
aaatatgaagaggtgcctcgtaagttggtgatcattgaaggagacttggaatgcacagag
gaacgagctgagctggcagagtcccgctgccaagagacagatgagcagatcagactgatg
gaccagaacctgaagtgtctaagtgatgctgaagaaaaatactctcaaaaagaagacata
tatgaggaagagatcaagattctcactgacaaactcaaggaggcagagacccgtgctgag
tttactgagagattggtagccaagctggaaaagacaactgatgacttggaatataaactg
aaatgcaacaaagaggagaacctctgtacacaaaggatgctgtaccagactctgcttgac
ctgaatgagatgtag

>Unknown|GENSCAN_predicted_peptide_3|137_aa
MGQPHSTKCILSHLLQLVVTGFRGCCDDQNKGRFDGPEAQEEACSGKRTYQELLVNQNPI
VQPLASRRLTRNLYKCIKKAMKQKQLRRGVKEVQKFVNRGEKGIMVLAEDTLPIEVYCHL
PVMCEDRNLAYVSIPLR

>Unknown|GENSCAN_predicted_CDS_3|414_bp
atgggtcagccccattctaccaaatgtattctttcccatttgctccagctggtggtaaca
ggtttccgcggctgctgcgatgaccaaaataaaggtagattcgatgggcccgaggctcag
gaggaggcgtgctccgggaagcgcacctaccaagagctgctggttaaccagaaccccatc
gtgcagcccctggcttctcgccgcctcacgcggaacctctacaaatgcatcaagaaagcc
atgaagcagaagcagcttcggcgtggggtgaaagaggttcagaaatttgtcaacagagga
gaaaaagggatcatggttttggcagaagacacactgcccattgaggtatactgccatctt
ccagtcatgtgtgaggaccgaaatctggcctatgtctctatccctctaagatga

>Unknown|GENSCAN_predicted_peptide_4|140_aa
MLLLSRPGPAQVHPEARSLQAREEPGCCSLRALPGAQEAGETERRRWVKNSEGSRRSCGI
SSRSCRSLAGRGQALVPPARSPQPQQSRSSRGRRCPSDSIARADLFPSLGQLTPAHPKGK
TPSVPQNGKIPVHTLPEPCC

>Unknown|GENSCAN_predicted_CDS_4|423_bp
atgctgctcctttcccgccccggaccggcgcaggtccatccagaggcgcgaagcctgcag
gccagggaagagcccggatgctgctcccttcgtgctctgcctggggcccaggaagcagga
gaaacggaacggcggcgttgggtcaagaactcagaggggtcgcgaaggtcctgcggcatc
tcctctcgcagttgccgcagcctagccggccgggggcaggcgctggtgccccccgcccgc
tccccgcagccccagcagagccggagttcccgcggccgccgctgcccgagcgactcgatc
gcccgagccgacctcttcccaagccttggacagctgacccctgcgcatcctaaaggaaag
accccatctgttcctcagaatgggaaaattcccgtgcatactttgccagaaccgtgttgc
tga

>Unknown|GENSCAN_predicted_peptide_5|338_aa
MLELWTGPVRPTRERGGSWVSGRRQMACSARPGPHAGHVRQRHLSLPRLLPLKIRRSSSS
ASRRAPGAKLSGKEKGAESDERGKTSGNLGVSYSHSSCGPSYGSQNFSAPYSPYALNQEA
DVSGGYPQCAPAVYSGNLSSSMVQHHHHHQGYAGGAVGSPQYIHHSYGQEHQSLALATYN
NSLSPLHASHQEACRSPASETSSPAQTFDWMKVKRNPPKTGKVGEYGYLGQPNAVRTNFT
TKQLTELEKEFHFNKYLTRARRVEIAASLQLNETQVKIWFQNRRMKQKKREKEGLLPISP
ATPPGNDEKAEESSEKSSSSPCVPSPGSSTSDTLTTSH

>Unknown|GENSCAN_predicted_CDS_5|1017_bp
atgttggagctgtggaccggccctgtgaggcccacgcgggagcgcggaggctcctgggtc
tcggggcgccggcagatggcctgcagcgcacggcccggcccccacgctggccacgtccgc
cagcgtcatctctctctcccccgcctcctacccctaaaaatccggcggtcgagtagctcc
gcatcccggagggctcccggtgcaaaactgagtgggaaagagaaaggggcggagagcgac
gagagagggaagacttccgggaacctgggggtgtcctactcccactcgagttgtggtcca
agctatggctcacagaacttcagtgcgccttacagcccctacgcgttaaatcaggaagca
gacgtaagtggtgggtacccccagtgcgctcccgctgtttactctggaaatctctcatct
tccatggtccagcatcaccaccaccaccagggttatgctgggggcgcggtgggctcgcct
caatacattcaccactcatatggacaggagcaccagagcctggccctggctacgtataat
aactccttgtcccctctccacgccagccaccaagaagcctgtcgctcccctgcatcggag
acatcttctccagcgcagacttttgactggatgaaagtcaaaagaaaccctcccaaaaca
gggaaagttggagagtacggctacctgggtcaacccaacgcggtgcgcaccaacttcact
accaagcagctcacggaactggagaaggagttccacttcaacaagtacctgacgcgcgcc
cgcagggtggagatcgctgcatccctgcagctcaacgagacccaagtgaagatctggttc
cagaaccgccgaatgaagcaaaagaaacgtgagaaggagggtctcttgcccatctctccg
gccaccccgccaggaaacgacgagaaggccgaggaatcctcagagaagtccagctcttcg
ccctgcgttccttccccggggtcttctacctcagacactctgactacctcccactga

>Unknown|GENSCAN_predicted_peptide_6|323_aa
MGPGQAIRLEARRILRMAAGAPGGCGSSVLFPTTEPGSKHPRCPRIRKVWEAVQAAREHS
LYLNQPAPNQSDKDKKKESLEIADGSGGGSRRLRTAYTNTQLLELEKEFHFNKYLCRPRR
VEIAALLDLTERQVKVWFQNRRMKHKRQTQCKENQNSEGKCKSLEDSEKVEEDEEEKTLF
EQALSVSGALLEREGYTFQQNALSQQQAPNGHNGDSQSFPVSPLTSNEKNLKHFQHQSPT
VPNCLSTMGQNCGAGLNNDSPEALEVPSLQDFSVFSTDSCLQLSDAVSPSLPGSLDSPVD
ISADSLDFFTDTLTTIDLQHLNY

>Unknown|GENSCAN_predicted_CDS_6|972_bp
atggggcctggccaggctattcgcctggaagctcggcgaattctcaggatggcggctggg
gctccaggcggctgcggcagctctgtgctgtttcccacaacagaacccggaagcaaacat
ccccggtgcccaaggatcaggaaggtgtgggaggcagttcaggctgccagggagcactcg
ctgtatctaaaccaaccagcccccaaccagagcgacaaggacaagaagaaggaatccctg
gaaatcgccgatggcagcggcgggggatcgcggcgcctgagaactgcttacaccaacaca
cagcttctagagctggaaaaagaatttcatttcaacaagtacctttgcagaccccgaagg
gtggagattgcagcgctgctggatttgactgagagacaagtgaaagtgtggtttcagaac
cggaggatgaagcacaagaggcagacccagtgcaaggaaaaccaaaacagcgaagggaaa
tgtaaaagccttgaggactccgagaaagtagaggaggacgaggaagagaagacgctcttt
gagcaagcccttagcgtctctggggcccttctggagagagaaggctacacttttcagcaa
aatgccctctctcagcagcaggctcccaatggacacaatggcgactcccaaagtttccca
gtctcgcctttaaccagcaatgagaaaaatctgaaacattttcagcaccagtcacccact
gttcccaactgcttgtcaacaatgggccagaactgtggagctggcctaaacaatgacagt
cctgaggcccttgaggtcccctctttgcaggactttagcgttttctccacagattcctgc
ctgcagctttcagatgcagtttcacccagtttgccaggttccctcgacagtcccgtagat
atttcagctgacagcttagacttttttacagacacactcaccacaatcgacttgcagcat
ctgaattactaa

>Unknown|GENSCAN_predicted_peptide_7|514_aa
MLLSPLGLKGKEPVVYPWMKKIHVSAVNPSYNGGEPKRSRTAYTRQQVLELEKEFHFNRY
LTRRRRIEIAHTLCLSERQGFCESRERAASGETAPGHRKRRFPERLELETIPGSEFKHLS
RGGVKGRNHAQPPAAAISLRGAAIGGGVSRDRGGVPMCALTGVKPLSECAIKIVKQRDAK
SDLLRQLGDLRWLPLPGSQRNASNNPTPANAAKSPLLNSPTVAKQIFPWMKESRQNTKQK
TSSSSSGESCAGDKSPPGQASSKRARTAYTSAQLVELEKEFHFNRYLCRPRRVEMANLLN
LTERQIKIWFQNRRMKYKKDQKGKGMLTSSGGQSPSRSPVPPGAGGYLNSMHSLVNSVPY
EPQSPPPFSKPPQGTYGLPPASYPASLPSCAPPPPPQKRYTAAGAGAGGTPDYDPHAHAL
QGNGSYGTPHIQGSPVFVGGSYVEPMSNSGPALFGLTHLPHAASGAMDYGGAGPLGSGHH
HGPGPGEPHPTYTDLTGHHPSQGRIQEAPKLTHL

>Unknown|GENSCAN_predicted_CDS_7|1545_bp
atgctgctgagcccgctgggcctgaagggcaaggagcccgtggtgtacccctggatgaag
aagatccatgtcagcgccgttaaccccagttataacggaggggagcctaagcgctctcga
accgcctacacccggcagcaggtcttggagctggagaaggagttccacttcaatcgctac
ctgacccggcggcgccgcatcgagatcgcccacacgctctgtttgtctgagcgccagggt
ttctgcgagtccagggagcgcgccgcgtccggggaaacagcgcctggacaccggaaaagg
cgattccctgagcgcctggagttggagacaattcctggttcagaatttaaacatctttct
aggggaggagtgaaggggaggaatcatgcacagcctcctgcagctgctataagtctgcgc
ggggcggccattggcggcggagtgtcacgtgaccgcgggggcgtgccaatgtgcgccctc
acgggtgtcaaacccctgtcagagtgtgcgatcaagatcgtgaaacaacgcgatgcaaaa
agcgacctactacgacagctcggcgatctacggtggctacccctaccaggcagccaacgg
aatgccagcaacaaccctacccctgccaacgcggccaagagccccctgctcaactcaccc
acagtggccaaacaaatcttcccctggatgaaagagtctcgacaaaacacaaagcagaaa
accagcagctccagctcaggcgaaagctgcgctggcgacaagagcccgccggggcaggct
tcgtccaagcgcgcgcgcacggcctacacgagcgcgcagctggtggagctggagaaagag
ttccacttcaaccgctacctgtgccggccgcgccgggtggagatggccaatctgctgaac
ctcactgagcgccagatcaagatctggttccagaatcgccgcatgaagtacaaaaaggat
cagaagggcaagggcatgctaacgtcatcggggggccagtctccaagtcgcagccccgtg
ccccccggagccggtggctatctgaactctatgcattcgctggtcaacagcgtcccgtat
gagccccagtcgcccccgcccttctccaagcccccccagggtacctacgggctgcccccc
gcctcctaccctgcgtccctgcccagctgcgcacccccgccacccccacagaagcgctac
acggcggcaggggcgggcgcagggggcacccccgactatgacccgcacgctcatgccctg
cagggcaacggcagctatgggaccccacacatacagggaagccccgtcttcgtggggggc
agctatgtggagcccatgagcaactccgggccagccctctttggtctaactcacctcccc
cacgctgcctcgggcgccatggactatgggggtgccgggccgctgggcagcggccaccac
cacgggccggggcctggggagccgcaccccacctacacggaccttaccggccaccatcct
tctcagggaagaattcaggaagcacccaagctcacccacctgtga

>Unknown|GENSCAN_predicted_peptide_8|632_aa
XAVYGSHGRRGRQTYTRYQTLELEKEFHFNRYLTRRRRIEIANALCLTERQIKIWFQNRR
MKWKKENKLINSTQPSGEDSEAKAGEKEARGTKGAPTEPSLSVPRVQPSVEEKQPQPRRV
NLGAGGTLQPGEPLGSLRLPAPDSRSQPPIAPASPPQQRSGPNEGPRAMGAARVLASRRA
RGACNAGPPAGRLHLPLSYVCDLLESFGAVENNYRDTRHCQGMLPLSPSFYSWRSKQDPS
NPPHGSPDPKQPLWLQGAWAHSLGSRRTLRLGARLARGTGPCDEDFPGWILHEFTSRGHQ
AGFTTGQQKHVIRSRTPYLGAYVGGNQVHVPVISIIHHKLCKGAIDAQTTASHKSSTHIK
KQMSSYFVNSFCGRYPNGPDYQLHNYGDHSSVSEQFRDSASMHSGRYGYGYNGMDLSVGR
SGSGHFGSGERARSYAASASAAPAEPRYSQPATSTHSPPPDPLPCSAVAPSPGSDSHHGG
KNSLSNSSGASANAGSTHISSREGVGTASGAEEDAPASSEQASAQSEPSPAPPAQPQIYP
WMRKLHISHDNIGGPEGKRARTAYTRYQTLELEKEFHFNRYLTRRRRIEIAHALCLSERQ
IKIWFQNRRMKWKKDNKLKSMSMAAAGGAFRP

>Unknown|GENSCAN_predicted_CDS_8|1899_bp
ngtgctgtgtatgggagccatgggcgccgaggccgccagacctacacgcgctaccagaca
ctggagctggagaaggagttccacttcaaccgctacctgacacggcgccgccgcatcgag
atcgccaacgcgctctgcctcaccgagcgccagatcaagatctggttccagaaccgccgc
atgaagtggaaaaaggaaaacaagctcatcaattccacgcagcccagcggggaggactca
gaggcaaaagcgggcgagaaggaagctcgaggaacaaagggggccccaacagagcccagt
ctctcggtcccgcgtgtgcaaccgtcagtggaagagaagcagcctcagccgaggcgagtt
aacctgggcgcgggtggaacattacagcccggggagcccctgggctccctcaggctcccg
gctccggattcccgctcccagcctccgatagcgcccgcgtcgccgccacagcagcgttca
ggacccaacgaggggcccagggccatgggagccgccagagtcctggcttccagacgtgcc
aggggcgcctgcaacgccgggcctccagcggggagactccacttgcctctcagctatgtt
tgtgatttacttgagtctttcggagccgtggaaaataactacagagatactaggcattgc
caaggaatgcttccattatcgccctcattttactcctggaggtctaagcaagaccccagt
aacccgccccacggatccccagaccccaagcagcctctgtggctgcagggagcctgggcc
cactcgctgggttcaaggagaaccctccgacttggggcccggctggctcgaggaaccgga
ccctgtgatgaagattttccaggctggatactgcacgagtttacctctagaggtcatcag
gcaggatttacgactggacaacaaaagcacgtgattcgaagtcgtaccccatatttgggt
gcctacgtaggagggaaccaagtacatgtcccagtcatttccataattcatcataaattg
tgcaagggtgctatagacgcacaaacgaccgcgagccacaaatcaagcacacatatcaaa
aaacaaatgagctcttattttgtaaactcattttgcggtcgctatccaaatggcccggac
taccagttgcataattatggagatcatagttccgtgagcgagcaattcagggactcggcg
agcatgcactccggcaggtacggctacggctacaatggcatggatctcagcgtcggccgc
tcgggctccggccactttggctccggagagcgcgcccgcagctacgctgccagcgccagc
gcggcgcccgccgagcccaggtacagccagccggccacgtccacgcactctcctccgccc
gatccgctgccctgctccgccgtggccccctcgcccggcagcgacagccaccacggcggg
aaaaactccctgagcaactccagcggcgcctcggccaacgccggcagcacccacatcagc
agcagagagggggttggcacggcgtccggagccgaggaggacgcccctgccagcagcgag
caggcgagtgcgcagagcgagccgagcccggcgccgcccgcccaaccccagatctacccc
tggatgcgcaagctgcacataagtcatgacaacataggcggcccggaaggcaaaagggcc
cggacggcctacacgcgctaccagaccctggagctggagaaggagttccacttcaaccgt
tacctgacccgcagaaggaggattgaaatagcacatgctctttgcctctccgagagacaa
attaaaatctggttccaaaaccggagaatgaagtggaaaaaagataataagctgaaaagc
atgagcatggccgcggcaggaggggccttccgtccctga

Explanation

Gn.Ex : gene number, exon number (for reference)
Type  : Init = Initial exon (ATG to 5' splice site)
        Intr = Internal exon (3' splice site to 5' splice site)
        Term = Terminal exon (3' splice site to stop codon)
        Sngl = Single-exon gene (ATG to stop)
        Prom = Promoter (TATA box / initiation site)
        PlyA = poly-A signal (consensus: AATAAA)
S     : DNA strand (+ = input strand; - = opposite strand)
Begin : beginning of exon or signal (numbered on input strand)
End   : end point of exon or signal (numbered on input strand)
Len   : length of exon or signal (bp)
Fr    : reading frame (a forward strand codon ending at x has frame x mod 3)
Ph    : net phase of exon (exon length modulo 3)
I/Ac  : initiation signal or 3' splice site score (tenth bit units)
Do/T  : 5' splice site or termination signal score (tenth bit units)
CodRg : coding region score (tenth bit units)
P     : probability of exon (sum over all parses containing exon)
Tscr  : exon score (depends on length, I/Ac, Do/T and CodRg scores)

Comments

The SCORE of a predicted feature (e.g., exon or splice site) is a log-odds measure of the quality of the feature based on local sequence properties. For example, a predicted 5' splice site with score > 100 is strong; 50-100 is moderate; 0-50 is weak; and below 0 is poor (more than likely not a real donor site).

The PROBABILITY of a predicted exon is the estimated probability under GENSCAN's model of genomic sequence structure that the exon is correct. This probability depends in general on global as well as local sequence properties, e.g., it depends on how well the exon fits with neighboring exons. It has been shown that predicted exons with higher probabilities are more likely to be correct than those with lower probabilities.
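
The donor-site guideline in the Comments can be written as a small lookup. This is only a sketch under stated assumptions: the function name classify_donor is hypothetical, and the text does not say which band the boundary values belong to, so here a score of exactly 100 or 50 is treated as moderate and a score of exactly 0 as weak. The Do/T column is a 5' splice site score only for Init and Intr exons; for Term and Sngl exons it is a termination signal score, so the function returns None for those.

```python
from typing import Optional


def classify_donor(exon_type: str, do_t: int) -> Optional[str]:
    """Label a Do/T score using the donor-site bands from the Comments.

    Boundary handling (100 and 50 -> moderate, 0 -> weak) is an assumption.
    """
    if exon_type not in ("Init", "Intr"):
        # Do/T is a termination signal score here, not a donor site score.
        return None
    if do_t > 100:
        return "strong"
    if do_t >= 50:
        return "moderate"
    if do_t >= 0:
        return "weak"
    return "poor"


# (Gn.Ex, Type, Do/T) values copied from the exon table.
examples = [
    ("7.02", "Intr", 104),
    ("3.01", "Init", 92),
    ("7.07", "Intr", 9),
    ("8.01", "Intr", -18),
    ("7.09", "Term", 44),
]
for gn_ex, kind, score in examples:
    print(gn_ex, kind, score, classify_donor(kind, score))
```

A strong donor score does not by itself make an exon reliable: the P column also reflects how well the exon fits with its neighbours, so the two values are best read together.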