TY - JOUR
T1 - Analyzing energy landscapes for folding model proteins
AU - Johnston, Roy
PY - 2006
Y1 - 2006/1/1
N2 - A new benchmark 20-bead HP model protein sequence (on a square lattice), which has 17 distinct but degenerate global minimum (GM) energy structures, has been studied using a genetic algorithm (GA). The relative probabilities of finding particular GM conformations are determined and related to the theoretical probability of generating these structures using a recoil growth constructor operator. It is found that for longer successful GA runs, the GM probability distribution is generally very different from the constructor probability, as other GA operators have had time to overcome any initial bias in the originally generated population of structures. Structural and metric relationships (e.g., Hamming distances) between the 17 distinct GM are investigated and used, in conjunction with data on the connectivities of the GM and the pathways that link them, to explain the GM probability distributions obtained by the GA. A comparison is made of searches where the sequence is defined in the normal (forward) and reverse directions. The ease of finding mirror image solutions is also compared. Finally, this approach is applied to rationalize the ease or difficulty of finding the GM for a number of standard benchmark HP sequences on the square lattice. It is shown that the relative probabilities of finding particular members of a set of degenerate global minima depend critically on the topography of the energy landscape in the vicinity of the GM, the connections and distances between the GM, and the nature of the operators used in the chosen search method. (c) 2006 American Institute of Physics.
UR - http://www.scopus.com/inward/record.url?scp=34547555116&partnerID=8YFLogxK
U2 - 10.1063/1.2198537
DO - 10.1063/1.2198537
M3 - Article
SN - 1089-7690
VL - 124
SP - 204714
JO - Journal of Chemical Physics
JF - Journal of Chemical Physics
IS - 20
ER -