GAZE: A generic framework for the integration of gene-prediction data by dynamic programming

Kevin L. Howe; Tom Chothia; Richard Durbin

doi:10.1101/gr.149502

Kevin L. Howe, Tom Chothia, Richard Durbin*

*Corresponding author for this work

Computer Science

Research output: Contribution to journal › Article › peer-review

68 Citations (Scopus)

Abstract

We describe a method (implemented in a program, GAZE) for assembling arbitrary evidence for individual gene components (features) into predictions of complete gene structures. Our system is generic in that both the features themselves, and the model of gene structure against which potential assemblies are validated and scored, are external to the system and supplied by the user. GAZE uses a dynamic programming algorithm to obtain the highest scoring gene structure according to the model and posterior probabilities that each input feature is part of a gene. A novel pruning strategy ensures that the algorithm has a run-time effectively linear in sequence length. To demonstrate the flexibility of our system in the incorporation of additional evidence into the gene prediction process, we show how it can be used to both represent nonstandard gene structures (in the form of trans-spliced genes in Caenorhabditis elegans), and make use of similarity information (in the form of Expressed Sequence Tag alignments), while requiring no change to the underlying software. GAZE is available at http://www.sanger.ac.uk/Software/analysis/GAZE.
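The dynamic programming step described in the abstract can be illustrated with a minimal Python sketch. It is not the GAZE implementation: the `Feature` type, the `best_structure` function, the `BEGIN`/`END` feature kinds and the `allowed` table (a toy stand-in for a user-supplied gene-structure model with distance limits) are all hypothetical names chosen for illustration. The sketch finds only the highest-scoring chain of features; it omits the posterior probabilities and the pruning strategy mentioned in the abstract, so it runs in quadratic rather than effectively linear time.

```python
from typing import NamedTuple


class Feature(NamedTuple):
    pos: int      # coordinate on the sequence
    kind: str     # e.g. "BEGIN", "start", "donor", "acceptor", "stop", "END"
    score: float  # evidence score supplied by the user (hypothetical scale)


def best_structure(features, allowed):
    """Return (score, path) for the highest-scoring legal chain of features.

    `allowed` maps (source_kind, target_kind) -> (min_dist, max_dist).
    A chain must start at a "BEGIN" feature and finish at an "END" feature.
    """
    feats = sorted(features, key=lambda f: f.pos)
    neg_inf = float("-inf")
    best = [neg_inf] * len(feats)   # best[i]: best score of a chain ending at i
    back = [None] * len(feats)      # back[i]: predecessor index for traceback

    for i, tgt in enumerate(feats):
        if tgt.kind == "BEGIN":
            best[i] = tgt.score
            continue
        for j in range(i):          # naive scan of all upstream features
            src = feats[j]
            rule = allowed.get((src.kind, tgt.kind))
            if rule is None or best[j] == neg_inf:
                continue
            min_dist, max_dist = rule
            if min_dist <= tgt.pos - src.pos <= max_dist:
                candidate = best[j] + tgt.score
                if candidate > best[i]:
                    best[i], back[i] = candidate, j

    ends = [i for i, f in enumerate(feats)
            if f.kind == "END" and best[i] > neg_inf]
    if not ends:
        return neg_inf, []

    i = max(ends, key=lambda k: best[k])
    score, path = best[i], []
    while i is not None:            # follow back-pointers to recover the chain
        path.append(feats[i])
        i = back[i]
    return score, path[::-1]
```

Because the features and the transition table are both passed in as data, changing the model (for example, adding a new feature kind and the transitions that involve it) needs no change to the function itself, which is the sense of "generic" the abstract describes.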

Original language	English
Pages (from-to)	1418-1427
Number of pages	10
Journal	Genome Research
Volume	12
Issue number	9
DOIs	https://doi.org/10.1101/gr.149502
Publication status	Published - Sept 2002

ASJC Scopus subject areas

Genetics
Genetics (clinical)

Access to Document

https://doi.org/10.1101/gr.149502

Cite this

@article{d31a8b717e734772a3a89acd37b9af09,

title = "GAZE: A generic framework for the integration of gene-prediction data by dynamic programming",

abstract = "We describe a method (implemented in a program, GAZE) for assembling arbitrary evidence for individual gene components (features) into predictions of complete gene structures. Our system is generic in that both the features themselves, and the model of gene structure against which potential assemblies are validated and scored, are external to the system and supplied by the user. GAZE uses a dynamic programming algorithm to obtain the highest scoring gene structure according to the model and posterior probabilities that each input feature is part of a gene. A novel pruning strategy ensures that the algorithm has a run-time effectively linear in sequence length. To demonstrate the flexibility of our system in the incorporation of additional evidence into the gene prediction process, we show how it can be used to both represent nonstandard gene structures (in the form of trans-spliced genes in Caenorhabditis elegans), and make use of similarity information (in the form of Expressed Sequence Tag alignments), while requiring no change to the underlying software. GAZE is available at http://www.sanger.ac.uk/Software/analysis/GAZE.",

author = "Howe, {Kevin L.} and Tom Chothia and Richard Durbin",

year = "2002",

month = sep,

doi = "10.1101/gr.149502",

language = "English",

volume = "12",

pages = "1418--1427",

journal = "Genome Research",

issn = "1088-9051",

publisher = "Cold Spring Harbor Laboratory Press",

number = "9",

}

TY - JOUR

T1 - GAZE

T2 - A generic framework for the integration of gene-prediction data by dynamic programming

AU - Howe, Kevin L.

AU - Chothia, Tom

AU - Durbin, Richard

PY - 2002/9

Y1 - 2002/9

N2 - We describe a method (implemented in a program, GAZE) for assembling arbitrary evidence for individual gene components (features) into predictions of complete gene structures. Our system is generic in that both the features themselves, and the model of gene structure against which potential assemblies are validated and scored, are external to the system and supplied by the user. GAZE uses a dynamic programming algorithm to obtain the highest scoring gene structure according to the model and posterior probabilities that each input feature is part of a gene. A novel pruning strategy ensures that the algorithm has a run-time effectively linear in sequence length. To demonstrate the flexibility of our system in the incorporation of additional evidence into the gene prediction process, we show how it can be used to both represent nonstandard gene structures (in the form of trans-spliced genes in Caenorhabditis elegans), and make use of similarity information (in the form of Expressed Sequence Tag alignments), while requiring no change to the underlying software. GAZE is available at http://www.sanger.ac.uk/Software/analysis/GAZE.

AB - We describe a method (implemented in a program, GAZE) for assembling arbitrary evidence for individual gene components (features) into predictions of complete gene structures. Our system is generic in that both the features themselves, and the model of gene structure against which potential assemblies are validated and scored, are external to the system and supplied by the user. GAZE uses a dynamic programming algorithm to obtain the highest scoring gene structure according to the model and posterior probabilities that each input feature is part of a gene. A novel pruning strategy ensures that the algorithm has a run-time effectively linear in sequence length. To demonstrate the flexibility of our system in the incorporation of additional evidence into the gene prediction process, we show how it can be used to both represent nonstandard gene structures (in the form of trans-spliced genes in Caenorhabditis elegans), and make use of similarity information (in the form of Expressed Sequence Tag alignments), while requiring no change to the underlying software. GAZE is available at http://www.sanger.ac.uk/Software/analysis/GAZE.

UR - http://www.scopus.com/inward/record.url?scp=0036708637&partnerID=8YFLogxK

U2 - 10.1101/gr.149502

DO - 10.1101/gr.149502

M3 - Article

C2 - 12213779

AN - SCOPUS:0036708637

SN - 1088-9051

VL - 12

SP - 1418

EP - 1427

JO - Genome Research

JF - Genome Research

IS - 9

ER -
