GENSCAN 1.0    Date run: 2-Apr-107    Time: 22:00:30

Sequence chunk2-7 : 80323 bp : 40.96% C+G : Isochore 1 ( 0 - 43 C+G%)

Parameter matrix: HumanIso.smat

Predicted genes/exons:

Gn.Ex Type S .Begin ...End .Len Fr Ph I/Ac Do/T CodRg P.... Tscr..
----- ---- - ------ ------ ---- -- -- ---- ---- ----- ----- ------

 1.01 Sngl +   6288   7220  933  2  0   84   54   957 0.999  88.10
 1.02 PlyA +   8927   8932    6                                1.05

 2.03 PlyA -  10182  10177    6                                1.05
 2.02 Term -  18802  17553 1250  2  2    9   32   960 0.152  73.25
 2.01 Init -  36038  35879  160  2  1   91   72   262 0.757  24.93
 2.00 Prom -  36186  36147   40                              -17.34

 3.00 Prom +  36292  36331   40                              -17.61
 3.01 Sngl +  36374  37033  660  1  0   77   48   782 0.516  69.02
 3.02 PlyA +  37598  37603    6                                1.05

 4.06 PlyA -  40034  40029    6                                1.05
 4.05 Term -  41098  40661  438  1  0  -17   53   401 0.861  20.39
 4.04 Intr -  41527  41213  315  1  0  -28   61   530 0.932  34.04
 4.03 Intr -  41899  41631  269  2  2  111   42   287 0.697  22.73
 4.02 Intr -  77852  77657  196  2  1   31   57   136 0.140   2.87
 4.01 Init -  78129  77992  138  0  0   49   33   283 0.999  17.09

Predicted peptide sequence(s):

>chunk2-7|GENSCAN_predicted_peptide_1|310_aa
MNALTKVKLINKLNEGEVQLGVADKVSWHSEYKDSAWIFLGGLPYELTEGDIICVFSQYG
EIVNINLVRDKKTGKSKGFCFLCYEDQRSTILAVDNFNGIKIKGRTIQVDHVSNYQAPKN
SEEMDDATKELQKGSYGPHTYSSSSSESSEDEQPIKKHKKEKKEKKKRKKEKEKTDRLVQ
AEQPSSSSPTSKTVKEKDDPGPKKHSSKNSERTQKSQPRERRKLPEARTAYYGGAEDLER
ELKKEKRKHEHKSSSRKEAREETTGDRDRGQSSDTLSSPYNWRSEGRSHRTRSRSQDKSH
RHKKARRSWE

>chunk2-7|GENSCAN_predicted_peptide_2|469_aa
MPWCHDEGEDSPGDTIACKASLAYARAIVEYESSDVVMHGELAAAVDGRQRGEVFCFPSV
LMASSAKSAEMPTISKTLNPTPDPHQEYLDPRITIALFEIGSHSPSSWGSLPSLKNSSHQ
VTEQQTAQKFNNLLKEIKDILKNMAGFEEKITEAKELFEETNIPEDVSAHKENIRGLDKI
NEMLSTNLPVSLAPEKEDNEKKQEMILETNITEDVSAHKENIRGLDKINEMLSTNLSLSL
APEKEDNEKKQEMIMENQNSENTVQVFARDLVNRLEEKKVLNETQQSQEKAKNRLNVQEE
TMKIRNNMEQLLQEAEHWSKQHTELSKLIKSYQKSQKDISETLGNNGVDFQTQPNNEVSA
KHELEEQVKKLSHDTYSLQLMAALLENECQILQQRVEILKELHHQKQGTLQEKPIQINYK
QDKKNQKPSEAKKVEMYKQNKQEMKGTFQKKDRSCRSLDACLNKKACNT

>chunk2-7|GENSCAN_predicted_peptide_3|219_aa
MGSPTLCPSMKGTPLPHTILCLDLAGRNLTDYLMKILTQCGYSFTATVMQEIVCDIKKKL
CCIPLDFEQETAMVGSSSSLEKSYKLPNGQVITISNKWFCCPEALFQTSFVGMESCGIHE
TTFNSIMKSDVDIYKDLYANAVLSGSTTMYPSITNRMQKEITALAPSAMKIKITAPPECK
YSVWIRGSILASLSTFQQMWISKQEYNKSGPSIVHGKCF

>chunk2-7|GENSCAN_predicted_peptide_4|451_aa
MPKARSLAPRAARALLAASRLERWAQPRLTLERRSASAERLGAEAWAGGLLRSLPGWGAE
LRSRPPAFGPHAPGLVRGLAALGPSPNESAAPLRLRIRRKSLLLPLKGWSLNSMSFTTRS
TTSSTNYWSLGSVQLPSYVAQLVSSVVSVYAGAGGSGFRISVSHSTSFWGGLGDLVGIGD
IQNEKETMQGLNDCLASYLDRTIEDLRVQIFASTVDSACIILQIDKAHITADDFRVKCET
ELAMCQSVESDIHGLRKSTDDTNVTQLQLEAEIEALKEELLFMKKTHEEEVKGLQAQIAS
SGLTMETEESTKQSAEIGASEIMLMELRHTLQSLEINLNSMRNLKARLENSLREVETRYA
MQMEQLNRVQLHLKLKLAQTWAEGQHQVQEYEALLNIKIKLEAEITTYHHLLEDEEGFNP
GDALDSSNSIQSIQKTTTHRIVDSIGGPPGV

Explanation

Gn.Ex : gene number, exon number (for reference)
Type  : Init = Initial exon (ATG to 5' splice site)
        Intr = Internal exon (3' splice site to 5' splice site)
        Term = Terminal exon (3' splice site to stop codon)
        Sngl = Single-exon gene (ATG to stop)
        Prom = Promoter (TATA box / initiation site)
        PlyA = poly-A signal (consensus: AATAAA)
S     : DNA strand (+ = input strand; - = opposite strand)
Begin : beginning of exon or signal (numbered on input strand)
End   : end point of exon or signal (numbered on input strand)
Len   : length of exon or signal (bp)
Fr    : reading frame (a forward strand codon ending at x has frame x mod 3)
Ph    : net phase of exon (exon length modulo 3)
I/Ac  : initiation signal or 3' splice site score (tenth bit units)
Do/T  : 5' splice site or termination signal score (tenth bit units)
CodRg : coding region score (tenth bit units)
P     : probability of exon (sum over all parses containing exon)
Tscr  : exon score (depends on length, I/Ac, Do/T and CodRg scores)

Comments

The SCORE of a predicted feature (e.g., exon or splice site) is a
log-odds measure of the quality of the feature based on local sequence
properties. For example, a predicted 5' splice site with score > 100 is
strong; 50-100 is moderate; 0-50 is weak; and below 0 is poor (more than
likely not a real donor site).

The PROBABILITY of a predicted exon is the estimated probability under
GENSCAN's model of genomic sequence structure that the exon is correct.
This probability depends in general on global as well as local sequence
properties, e.g., it depends on how well the exon fits with neighboring
exons. It has been shown that predicted exons with higher probabilities
are more likely to be correct than those with lower probabilities.
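
The column definitions and the 5' splice site score bands above can be checked against the exon rows of this run. The following is a minimal sketch, not part of GENSCAN: the Exon class and every function name are hypothetical, the rows are copied by hand from the table, and the handling of the band edges (exactly 0, 50, or 100) is an assumption, since the comment text does not say which band a boundary score falls in. The frame check is applied only to the two forward-strand single-exon genes, assuming End is the last base of the stop codon.

```python
from dataclasses import dataclass


@dataclass
class Exon:
    """One exon row from the table (hypothetical container, not a GENSCAN type)."""
    gn_ex: str
    kind: str        # Init, Intr, Term, or Sngl
    strand: str      # "+" or "-"
    begin: int
    end: int
    length: int      # Len column
    frame: int       # Fr column
    phase: int       # Ph column
    do_t: int        # Do/T column (tenth bit units)


# Exon rows copied from the table; Prom and PlyA rows are omitted.
EXONS = [
    Exon("1.01", "Sngl", "+", 6288, 7220, 933, 2, 0, 54),
    Exon("2.02", "Term", "-", 18802, 17553, 1250, 2, 2, 32),
    Exon("2.01", "Init", "-", 36038, 35879, 160, 2, 1, 72),
    Exon("3.01", "Sngl", "+", 36374, 37033, 660, 1, 0, 48),
    Exon("4.05", "Term", "-", 41098, 40661, 438, 1, 0, 53),
    Exon("4.04", "Intr", "-", 41527, 41213, 315, 1, 0, 61),
    Exon("4.03", "Intr", "-", 41899, 41631, 269, 2, 2, 42),
    Exon("4.02", "Intr", "-", 77852, 77657, 196, 2, 1, 57),
    Exon("4.01", "Init", "-", 78129, 77992, 138, 0, 0, 33),
]


def span_length(exon: Exon) -> int:
    """Length implied by Begin/End; both are numbered on the input strand."""
    return abs(exon.end - exon.begin) + 1


def net_phase(exon: Exon) -> int:
    """Ph column: exon length modulo 3."""
    return exon.length % 3


def forward_frame(exon: Exon) -> int:
    """Fr column for a forward-strand exon whose last codon ends at End."""
    return exon.end % 3


def classify_donor(score: int) -> str:
    """Band a 5' splice site score; edge handling is an assumption."""
    if score > 100:
        return "strong"
    if score >= 50:
        return "moderate"
    if score >= 0:
        return "weak"
    return "poor"


if __name__ == "__main__":
    for exon in EXONS:
        checks = [span_length(exon) == exon.length, net_phase(exon) == exon.phase]
        if exon.kind == "Sngl" and exon.strand == "+":
            checks.append(forward_frame(exon) == exon.frame)
        # Do/T is a 5' splice site score only for Init and Intr exons.
        donor = classify_donor(exon.do_t) if exon.kind in ("Init", "Intr") else "n/a"
        print(exon.gn_ex, exon.kind, "consistent" if all(checks) else "MISMATCH", donor)
```

Only Init and Intr exons are banded, because for Term and Sngl rows the Do/T column holds a termination signal score, for which the comment text gives no bands.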