Using Computational Intelligence Models for Additional Insight into Protein Structure
Abstract
Proteins are large, complex molecules with crucial roles in the functioning of living organisms. Understanding the mechanisms by which proteins achieve their structures and substructures, as well as those involved in conformational transitions, may contribute to a deeper comprehension of the underlying biological processes. This paper investigates a new machine learning perspective on analyzing protein conformational transitions and introduces a new formalization of the problem, with the more general goal of uncovering interesting patterns in these transitions. The study is the starting point of ongoing research that aims to obtain a better understanding of protein structures and, implicitly, functions by investigating computational intelligence methods for analyzing and deducing protein conformational transitions.

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
When the article is accepted for publication, I, as the author and representative of the coauthors, hereby agree to transfer to Studia Universitatis Babes-Bolyai, Series Informatica, all rights, including those pertaining to electronic forms and transmissions, under existing copyright laws, except for the following, which the author specifically retains: the right to make further copies of all or part of the published article for my use in classroom teaching; the right to reuse all or part of this material in a review or in a textbook of which I am the author; the right to make copies of the published work for internal distribution within the institution that employs me.