Constructing Unrooted Phylogenetic Trees with Reinforcement Learning
Abstract
With the development of sequencing technologies, more and more amounts of sequence data are available. This poses additional challenges, such as processing them is usually a complex and time-consuming computational task. During the construction of phylogenetic trees, the relationship between the sequences is examined, and an attempt is made to represent the evolutionary relationship. There are several algorithms for this problem, but with the development of computer science, the question arises as to whether new technologies can be exploited in these areas of computational biology.
In the following publication, we investigate whether the reinforced learning model of machine learning can generate accurate phylogenetic trees based on the distance matrix.
References
[2] Chor, B., and Tuller, T. Maximum likelihood of evolutionary trees: hardness and approximation. Bioinformatics 21, suppl 1 (2005), i97–i106.
[3] Elias, I., and Lagergren, J. Fast neighbor joining. In International Colloquium on Automata, Languages, and Programming (2005), Springer, pp. 1263–1274.
[4] Evans, J., Sheneman, L., and Foster, J. Relaxed neighbor joining: a fast distance-based phylogenetic tree construction method. Journal of molecular evolution 62, 6 (2006), 785–792.
[5] Felsenstein, J., and Felenstein, J. Inferring phylogenies, vol. 2. Sinauer associates Sunderland, MA, 2004.
[6] Guadarrama, S., Korattikara, A., Ramirez, O., Castro, P., Holly, E., Fishman, S., Wang, K., Gonina, E., Wu, N., Kokiopoulou, E., Sbaiz, L., Smith, J., Bartok, G., Berent, J., Harris, C., Vanhoucke, V., ´and Brevdo, E. TF-Agents: A library for reinforcement learning in tensorflow. https://github.com/tensorflow/agents, 2018. [Online; accessed 06-April-2021].
[7] Hochreiter, S., and Schmidhuber, J. Long short-term memory. Neural computation 9, 8 (1997), 1735–1780.
[8] Jafari, R., Javidi, M. M., and Rafsanjani, M. K. Using deep reinforcement learning approach for solving the multiple sequence alignment problem. SN Applied Sciences 1, 6 (2019), 1–12.
[9] Kalantari, J., Nelson, H., and Chia, N. The unreasonable effectiveness of inverse reinforcement learning in advancing cancer research. In Proceedings of the AAAI Conference on Artificial Intelligence (2020), vol. 34, pp. 437–445.
[10] Mailund, T., and Pedersen, C. N. Quickjoin—fast neighbour-joining tree reconstruction. Bioinformatics 20, 17 (2004), 3261–3262.
[11] Mircea, I.-G., Bocicor, I., and Czibula, G. A reinforcement learning based approach to multiple sequence alignment. In International Workshop Soft Computing Applications (2016), Springer, pp. 54–70.
[12] Mnih, V., Kavukcuoglu, K., Silver, D., Rusu, A. A., Veness, J., Bellemare, M. G., Graves, A., Riedmiller, M., Fidjeland, A. K., Ostrovski, G., et al. Human-level control through deep reinforcement learning. nature 518, 7540 (2015), 529–533.
[13] Olsen, G. The ”newick’s 8: 45” tree format standard. World-Wide-Web Reference. http://evolution.genetics.washington.edu/phylip/newick doc.html (1990).
[14] Rice, P., Longden, I., and Bleasby, A. Emboss: the european molecular biology open software suite. Trends in genetics 16, 6 (2000), 276–277.
[15] Robinson, D. F., and Foulds, L. R. Comparison of phylogenetic trees. Mathematical biosciences 53, 1-2 (1981), 131–147.
[16] Saitou, N., and Nei, M. The neighbor-joining method: a new method for reconstructing phylogenetic trees. Molecular biology and evolution 4, 4 (1987), 406–425.
[17] Sarich, V. M. Pinniped phylogeny. Systematic Zoology 18, 4 (1969), 416–422.
[18] Simonsen, M., Mailund, T., and Pedersen, C. N. Rapid neighbour-joining. In International Workshop on Algorithms in Bioinformatics (2008), Springer, pp. 113–122.
[19] Stoye, J., Evers, D., and Meyer, F. Rose: generating sequence families. Bioinformatics (Oxford, England) 14, 2 (1998), 157–163.
[20] Studier, J. A., and Keppler, K. J. A note on the neighbor-joining algorithm of saitou and nei. Molecular biology and evolution 5, 6 (1988), 729–731.
[21] Sukumaran, J., and Holder, M. T. Dendropy: a python library for phylogenetic computing. Bioinformatics 26, 12 (2010), 1569–1571.
[22] Sutton, R. S., and Barto, A. G. Reinforcement learning: An introduction. MIT press, 2018.
[23] Suvorov, A., Hochuli, J., and Schrider, D. R. Accurate inference of tree topologies from multiple sequence alignments using deep learning. Systematic biology 69, 2 (2020), 221–233.
[24] Wheeler, T. J. Large-scale neighbor-joining with ninja. In International Workshop on Algorithms in Bioinformatics (2009), Springer, pp. 375–389.
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
When the article is accepted for publication, I, as the author and representative of the coauthors, hereby agree to transfer to Studia Universitatis Babes-Bolyai, Series Informatica, all rights, including those pertaining to electronic forms and transmissions, under existing copyright laws, except for the following, which the author specifically retain: the right to make further copies of all or part of the published article for my use in classroom teaching; the right to reuse all or part of this material in a review or in a textbook of which I am the author; the right to make copies of the published work for internal distribution within the institution that employs me.