GENSCANW output for sequence chunk2_3




GENSCAN 1.0	Date run:  2-Apr-107	Time: 21:54:16

Sequence Pan : 101300 bp : 38.84% C+G : Isochore 1 ( 0 - 43 C+G%)

Parameter matrix: HumanIso.smat

Predicted genes/exons:


Gn.Ex Type S .Begin ...End .Len Fr Ph I/Ac Do/T CodRg P.... Tscr..
----- ---- - ------ ------ ---- -- -- ---- ---- ----- ----- ------

 1.01 Init +   1661   1986  326  1  2   62   36   272 0.830  14.25
 1.02 Intr +   4302   4423  122  0  2  106   55     0 0.406  -2.28
 1.03 Intr +   5669   5852  184  0  1   82   21   138 0.577   4.42
 1.04 Intr +   7263   7393  131  0  2   49   79    62 0.340   0.82
 1.05 Intr +   8546   8741  196  0  1   18   31   152 0.583   0.05
 1.06 Intr +   8774   9053  280  2  1   84   49   119 0.786   4.16
 1.07 Intr +   9112   9258  147  0  0   45   44   119 0.632   2.71
 1.08 Term +  10390  10623  234  0  0   84   45   117 0.688   2.34
 1.09 PlyA +  15177  15182    6                               1.05

 2.06 PlyA -  16115  16110    6                               1.05
 2.05 Term -  22698  22416  283  2  1   65   29   208 0.965   6.51
 2.04 Intr -  25508  25208  301  0  1   75   94   329 0.997  27.07
 2.03 Intr -  27023  26860  164  1  2   90   95    97 0.992   9.30
 2.02 Intr -  28311  28227   85  1  1   87   69   114 0.987   7.36
 2.01 Init -  32564  32546   19  2  1   89   94    20 0.927   3.05
 2.00 Prom -  34543  34504   40                              -4.95

 3.02 PlyA -  36191  36186    6                               1.05
 3.01 Sngl -  50135  49083 1053  2  0   61   39   815 0.742  70.79
 3.00 Prom -  52207  52168   40                              -7.45

 4.00 Prom +  52240  52279   40                              -8.45
 4.01 Init +  73337  73419   83  1  2   73   81    63 0.564   4.67
 4.02 Intr +  84140  84382  243  2  0   37   51   227 0.029   9.59
 4.03 Term +  84521  84896  376  2  1  -41   43   279 0.472   3.73
 4.04 PlyA +  86384  86389    6                               1.05



Predicted peptide sequence(s):


>Pan|GENSCAN_predicted_peptide_1|539_aa
MLRRSRAWALTGGAGWPGPSVVGGCLLLLPSHLSGSEIGAWLRVSMEEAPSLKPFLASPS
APLVPPPPVCSQDLLALWHWGLEDTFSGQALKMAFEIEDRVVTCGWPPRGMSPPKFHCPV
PATRRPCVHSTLSTHSHRFLKSHCPELVAASSQSLGVGSVIPSYRRGRCGHADGEPAAHG
SSVGEEKGAPGPLGPPAQPPAANACSTWGFSGELTVSWRGWQTPKVPQAFPEHLTKTVHL
CTHGAGLTSVTLESGKRVSYHTFLSLWIHMTWNLPWATQMLFILLKMREPPLFWTLAILG
TAPWTTSGKHAPNLSVSEDRWPLGMPCGENQGRGPHSEVLELGRCMLCGWQLVPLETEAC
WAEPSCLVVSLLLALLIPVPHSLPTLSSGRGTGKCADAHERWGKPMTVPGRVQRPVRGWL
PRQAPRGARRFVCVEGALVVMCPTGGGTGPYCPIPSGSGRRHFPLKETNNRKGRLPKRLR
PHTWCCYSLEYSSYSLRCTEDTWTEDQVRLSYRVSVSPNVTNTHPFLSSPTPIPFWVLT

>Pan|GENSCAN_predicted_peptide_2|283_aa
MAKDVQGSVDEGVSEGLPTLQSTSSTNAPPDDDDRLENVQYPYQLYIAPSTSSTERPSPN
GPDRPFQCPTCGVRFTRIQNLKQHMLIHSGIKPFQCDRCGKKFTRAYSLKMHRLKHEGKR
CFRCQICSATFTSFGEYKHHMRVSRHIIRKPRIYECKTCGAMFTNSGNLIVHLRSLNHEA
SELANYFQSSDFLVPDYLNQEQEETLVQYDLGEHGFESNSSVQMPVISQVSSTQNCESTF
PLGSLGGLAEKEEEVPEQPKSSACAEATRDDPPKSELSSITIE

>Pan|GENSCAN_predicted_peptide_3|350_aa
MGVKTFTHSSSSHSQEMLGKLNMLRNDGHFCDITIRVQDKIFRAHKVVLAACSDFFRTKL
VGQAEDENKNVLDLHHVTVTGFIPLLEYAYTATLSINTENIIDVLAAASYMQMFSVASTC
SEFMKSSILWNTPNSQPEKGLDAGQENNSNCNFTSRDGSISPVSSECSVVERTIPVCRES
RRKRKSYIVMSPESPVKCGTQTSSPQVLNSSASYSENRNQPVDSSLAFPWTFPFGIDRRI
QPEKVKQAENTRTLELPGPSETGRRMADYVTCESTKTTLPLGTEEDVRVKVERLSDEEVH
EEVSQPVSASQSSLSDQQTVPGSEQVQEDLLISPQSSSIGIMPSVFFLVL

>Pan|GENSCAN_predicted_peptide_4|233_aa
MQSLTVAPSILRHLGFWLKGFTSAGGNQELAMQIFGVLKELMTQHVHTYGLIMGGSNRSA
EAQKLANGINITVATPGRLLYHMQNIPGFMYKNLQCLVIDEADRILDVGVDDDKANTTVD
GLEQGYVVCPSEKRFLLLFTFLKKNRKKLVVFFSSCMSVKYHYELLNYIDLPVLAIHGKQ
KQNKRTTTFFQFCNTDLGTHCVWMWWQEDWTFLKSTGLFTMTIRMTLRNIFIV


Explanation

Gn.Ex : gene number, exon number (for reference)
Type  : Init = Initial exon (ATG to 5' splice site)
        Intr = Internal exon (3' splice site to 5' splice site)
        Term = Terminal exon (3' splice site to stop codon)
        Sngl = Single-exon gene (ATG to stop)
        Prom = Promoter (TATA box / initiation site)
        PlyA = poly-A signal (consensus: AATAAA)
S     : DNA strand (+ = input strand; - = opposite strand)
Begin : beginning of exon or signal (numbered on input strand)
End   : end point of exon or signal (numbered on input strand)
Len   : length of exon or signal (bp)
Fr    : reading frame (a forward strand codon ending at x has frame x mod 3)
Ph    : net phase of exon (exon length modulo 3)
I/Ac  : initiation signal or 3' splice site score (tenth bit units)
Do/T  : 5' splice site or termination signal score (tenth bit units)
CodRg : coding region score (tenth bit units)
P     : probability of exon (sum over all parses containing exon)
Tscr  : exon score (depends on length, I/Ac, Do/T and CodRg scores)
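
The column definitions above can be turned into a small parser. The sketch
below is illustrative only and is not part of GENSCAN: the names parse_row
and is_consistent are hypothetical, and it assumes rows are whitespace
separated, that exon rows carry 13 fields, and that promoter and poly-A rows
carry only Gn.Ex, Type, S, Begin, End, Len and Tscr. It also checks two
relations stated in the legend: Len spans Begin to End inclusive, and Ph is
the exon length modulo 3.

```python
from typing import Optional

EXON_TYPES = {"Init", "Intr", "Term", "Sngl"}
SIGNAL_TYPES = {"Prom", "PlyA"}


def parse_row(line: str) -> Optional[dict]:
    """Parse one table row; return None for headers, rules and blank lines."""
    fields = line.split()
    if len(fields) < 7 or fields[1] not in EXON_TYPES | SIGNAL_TYPES:
        return None
    gene, _, exon = fields[0].partition(".")
    row = {
        "gene": int(gene),
        "exon": int(exon),
        "type": fields[1],
        "strand": fields[2],
        "begin": int(fields[3]),
        "end": int(fields[4]),
        "len": int(fields[5]),
        "tscr": float(fields[-1]),  # Tscr is always the last column
    }
    if fields[1] in EXON_TYPES:
        if len(fields) != 13:  # assumption: exon rows have all 13 columns
            return None
        row.update(
            fr=int(fields[6]),
            ph=int(fields[7]),
            i_ac=int(fields[8]),
            do_t=int(fields[9]),
            codrg=int(fields[10]),
            p=float(fields[11]),
        )
    return row


def is_consistent(row: dict) -> bool:
    """Check Len against Begin/End and, for exons, Ph against Len mod 3."""
    span = abs(row["end"] - row["begin"]) + 1  # Begin > End on the - strand
    if row["len"] != span:
        return False
    return "ph" not in row or row["ph"] == row["len"] % 3


row = parse_row(" 2.05 Term -  22698  22416  283  2  1   65   29   208 0.965   6.51")
print(row["type"], row["len"], is_consistent(row))  # Term 283 True
```

Signal rows (Prom, PlyA) come back without the fr, ph, i_ac, do_t, codrg and
p keys, so callers can tell exons from signals by testing for "p".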

Comments

The SCORE of a predicted feature (e.g., exon or splice site) is a
log-odds measure of the quality of the feature based on local sequence
properties. For example, a predicted 5' splice site with
score > 100 is strong; 50-100 is moderate; 0-50 is weak; and
below 0 is poor (more than likely not a real donor site).
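
The score bands above can be written as a small lookup. This is a sketch
under stated assumptions: the function name classify_donor_score is
hypothetical, the input is taken to be the Do/T value as printed for an Init
or Intr exon (for Term and Sngl rows that column holds the termination
signal score instead), and the handling of the exact boundary values 0, 50
and 100 is a choice made here, since the text does not say which band they
belong to.

```python
def classify_donor_score(score: int) -> str:
    """Map a 5' splice site score to the qualitative bands in the comments."""
    if score > 100:
        return "strong"
    if score >= 50:  # assumption: 50 and 100 count as moderate
        return "moderate"
    if score >= 0:   # assumption: 0 counts as weak
        return "weak"
    return "poor"


# Do/T values of exons 1.01 and 2.03 from the table above
print(classify_donor_score(36), classify_donor_score(95))  # weak moderate
```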

The PROBABILITY of a predicted exon is the estimated probability under
GENSCAN's model of genomic sequence structure that the exon is correct.
This probability depends in general on global as well as local sequence
properties, e.g., it depends on how well the exon fits with neighboring
exons.  It has been shown that predicted exons with higher probabilities
are more likely to be correct than those with lower probabilities.
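
One common use of the P column is to keep only exons above a chosen
probability. The sketch below is illustrative: the cutoff of 0.5 and the
names MIN_P and confident_exons are assumptions made for this example, not
values recommended by GENSCAN. The three tuples are the P values of gene 4
copied from the table above.

```python
MIN_P = 0.5  # hypothetical cutoff chosen for illustration only

# (Gn.Ex, Type, P) for the exons of predicted gene 4
gene4 = [
    ("4.01", "Init", 0.564),
    ("4.02", "Intr", 0.029),
    ("4.03", "Term", 0.472),
]


def confident_exons(exons, min_p=MIN_P):
    """Return the exons whose probability is at least min_p."""
    return [exon for exon in exons if exon[2] >= min_p]


for gn_ex, exon_type, p in confident_exons(gene4):
    print(gn_ex, exon_type, p)  # 4.01 Init 0.564
```

With this cutoff only the initial exon of gene 4 is retained, which marks the
rest of that gene model as one to treat with caution.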