Learning Biophysically-Motivated Parameters for Alpha Helix Prediction
- Blaise Gassend
- Charles W O'Donnell
- Bill Thies
- Andrew Lee
- Marten van Dijk
- Srinivas Devadas
Background: Our goal is to develop a state-of-the-art protein secondary structure predictor with an intuitive and biophysically-motivated energy model. We treat structure prediction as an optimization problem, using parameterizable cost functions representing biological “pseudoenergies”. Machine learning methods are applied to estimate the parameter values so that the model correctly predicts known protein structures.
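To illustrate what treating prediction as an optimization over pseudoenergies can look like, the following is a minimal sketch and not the model described in this paper. It assumes a hypothetical two-state labeling (H for helix, C for coil), a hypothetical per-residue helix energy table `helix_energy`, and a hypothetical `cap_penalty` charged at each helix boundary; the lowest-energy labeling is found by dynamic programming.

```python
from typing import Dict, List


def predict_helices(seq: str, helix_energy: Dict[str, float], cap_penalty: float) -> str:
    """Return the H/C labeling of seq with minimal total pseudoenergy.

    Hypothetical energy model (an assumption for illustration only):
    a residue labeled H contributes helix_energy[residue] (0.0 if absent),
    a residue labeled C contributes 0.0, and every H<->C transition
    between neighbors contributes cap_penalty.
    """
    if not seq:
        return ""
    states = "HC"

    def local(state: str, residue: str) -> float:
        return helix_energy.get(residue, 0.0) if state == "H" else 0.0

    def switch(prev: str, cur: str) -> float:
        return cap_penalty if prev != cur else 0.0

    # cost[s]: minimal energy of any labeling of the prefix that ends in state s
    cost = {s: local(s, seq[0]) for s in states}
    back: List[Dict[str, str]] = []
    for residue in seq[1:]:
        new_cost: Dict[str, float] = {}
        pointers: Dict[str, str] = {}
        for s in states:
            best_prev = min(states, key=lambda p: cost[p] + switch(p, s))
            new_cost[s] = cost[best_prev] + switch(best_prev, s) + local(s, residue)
            pointers[s] = best_prev
        back.append(pointers)
        cost = new_cost

    # Trace back from the cheapest final state.
    state = min(states, key=lambda s: cost[s])
    labels = [state]
    for pointers in reversed(back):
        state = pointers[state]
        labels.append(state)
    return "".join(reversed(labels))


# Toy parameters (made up): alanine favors helix, glycine disfavors it.
toy_energy = {"A": -1.0, "G": 1.0}
prediction = predict_helices("GGAAAAGG", toy_energy, cap_penalty=1.5)
```

In this toy setting, learning would amount to choosing the entries of `helix_energy` and `cap_penalty` so that the minimum-energy labeling matches known structures.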
Results: Focusing on the prediction of alpha helices in proteins, we show that a model with 302 parameters can achieve a Qα value of 77.6% and an SOVα value of 73.4%. Such performance numbers are among the best for techniques that do not rely on external databases (such as multiple sequence alignments). Further, it is easier to extract biological significance from a model with so few parameters.
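As a rough illustration of the per-residue measure, the sketch below assumes that Qα is the percentage of residues whose helix/non-helix state is predicted correctly; this definition is an assumption here, and the function name `q_alpha` and the H/C string encoding are hypothetical. The segment-overlap measure SOVα is more involved and is not sketched.

```python
def q_alpha(predicted: str, observed: str) -> float:
    """Percentage of residues whose helix/non-helix state matches.

    Assumes both strings label each residue with 'H' for helix and any
    other character for non-helix (an illustrative encoding).
    """
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have the same length")
    if not observed:
        raise ValueError("sequences must be non-empty")
    correct = sum((p == "H") == (o == "H") for p, o in zip(predicted, observed))
    return 100.0 * correct / len(observed)


# One of eight residues is misclassified in this made-up example.
score = q_alpha("CCHHHHCC", "CHHHHHCC")
```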
Conclusion: The method presented shows promise for the prediction of protein secondary structure. Biophysically-motivated elementary free energies can be learned using SVM techniques to construct an energy cost function whose predictive performance rivals that of state-of-the-art methods. This method is general and can be extended beyond the all-alpha case described here.