Abstract
Computational approaches to de novo protein tertiary structure prediction, including those based on the preeminent "fragment-assembly" technique, have failed to scale up fully to larger proteins (of the order of 100 residues and above). A number of limiting factors are thought to contribute to the scaling problem over and above the simple combinatorial explosion, but the key ones relate to the lack of exploration of properly diverse protein folds, and an acute form of "deception" in the energy function whereby low-energy conformations do not reliably equate with native structures. In this paper, solutions to both of these problems are investigated through a multi-stage memetic algorithm incorporating the successful Rosetta method as a local search routine. It is found that specialised genetic operators significantly add to structural diversity and this translates well to reaching low energies. The use of a generalised stochastic ranking procedure for selection enables the memetic algorithm to handle and traverse deep energy wells that can be considered deceptive, which further adds to the ability of the algorithm to obtain a much-improved diversity of folds. The results should translate to a tangible improvement in the performance of protein structure prediction algorithms in blind experiments such as CASP, and potentially to a further step towards the more challenging problem of predicting the three-dimensional shape of large proteins.
Original language | English |
---|---|
Pages (from-to) | 577-607 |
Number of pages | 30 |
Journal | Evolutionary Computation |
Volume | 24 |
Issue number | 4 |
Early online date | 23 Feb 2016 |
DOIs | |
Publication status | Published - 1 Dec 2016 |
Bibliographical note
The pdf file is the post-print version of the article, accepted for publication by Evolutionary Computation journal.Keywords
- Protein structure prediction
- fragment-assembly
- memetic algorithms