Constructing Unrooted Phylogenetic Trees with Reinforcement Learning

P. Liptak; A. Kiss

doi:10.24193/subbi.2021.1.03

P. Liptak ELTE Eotvos Lorand University, Faculty of Informatics, Budapest, Hungary
A. Kiss ELTE Eotvos Lorand University, Faculty of Informatics, Budapest, Hungary

DOI: https://doi.org/10.24193/subbi.2021.1.03

Abstract

With the development of sequencing technologies, more and more amounts of sequence data are available. This poses additional challenges, such as processing them is usually a complex and time-consuming computational task. During the construction of phylogenetic trees, the relationship between the sequences is examined, and an attempt is made to represent the evolutionary relationship. There are several algorithms for this problem, but with the development of computer science, the question arises as to whether new technologies can be exploited in these areas of computational biology.

In the following publication, we investigate whether the reinforced learning model of machine learning can generate accurate phylogenetic trees based on the distance matrix.

References

[1] Bhattacharjee, A., and Bayzid, M. S. Machine learning based imputation techniques for estimating phylogenetic trees from incomplete distance matrices. BMC genomics 21, 1 (2020), 1–14.
[2] Chor, B., and Tuller, T. Maximum likelihood of evolutionary trees: hardness and approximation. Bioinformatics 21, suppl 1 (2005), i97–i106.
[3] Elias, I., and Lagergren, J. Fast neighbor joining. In International Colloquium on Automata, Languages, and Programming (2005), Springer, pp. 1263–1274.
[4] Evans, J., Sheneman, L., and Foster, J. Relaxed neighbor joining: a fast distance-based phylogenetic tree construction method. Journal of molecular evolution 62, 6 (2006), 785–792.
[5] Felsenstein, J., and Felenstein, J. Inferring phylogenies, vol. 2. Sinauer associates Sunderland, MA, 2004.
[6] Guadarrama, S., Korattikara, A., Ramirez, O., Castro, P., Holly, E., Fishman, S., Wang, K., Gonina, E., Wu, N., Kokiopoulou, E., Sbaiz, L., Smith, J., Bartok, G., Berent, J., Harris, C., Vanhoucke, V., ´and Brevdo, E. TF-Agents: A library for reinforcement learning in tensorflow. https://github.com/tensorflow/agents, 2018. [Online; accessed 06-April-2021].
[7] Hochreiter, S., and Schmidhuber, J. Long short-term memory. Neural computation 9, 8 (1997), 1735–1780.
[8] Jafari, R., Javidi, M. M., and Rafsanjani, M. K. Using deep reinforcement learning approach for solving the multiple sequence alignment problem. SN Applied Sciences 1, 6 (2019), 1–12.
[9] Kalantari, J., Nelson, H., and Chia, N. The unreasonable effectiveness of inverse reinforcement learning in advancing cancer research. In Proceedings of the AAAI Conference on Artificial Intelligence (2020), vol. 34, pp. 437–445.
[10] Mailund, T., and Pedersen, C. N. Quickjoin—fast neighbour-joining tree reconstruction. Bioinformatics 20, 17 (2004), 3261–3262.
[11] Mircea, I.-G., Bocicor, I., and Czibula, G. A reinforcement learning based approach to multiple sequence alignment. In International Workshop Soft Computing Applications (2016), Springer, pp. 54–70.
[12] Mnih, V., Kavukcuoglu, K., Silver, D., Rusu, A. A., Veness, J., Bellemare, M. G., Graves, A., Riedmiller, M., Fidjeland, A. K., Ostrovski, G., et al. Human-level control through deep reinforcement learning. nature 518, 7540 (2015), 529–533.
[13] Olsen, G. The ”newick’s 8: 45” tree format standard. World-Wide-Web Reference. http://evolution.genetics.washington.edu/phylip/newick doc.html (1990).
[14] Rice, P., Longden, I., and Bleasby, A. Emboss: the european molecular biology open software suite. Trends in genetics 16, 6 (2000), 276–277.
[15] Robinson, D. F., and Foulds, L. R. Comparison of phylogenetic trees. Mathematical biosciences 53, 1-2 (1981), 131–147.
[16] Saitou, N., and Nei, M. The neighbor-joining method: a new method for reconstructing phylogenetic trees. Molecular biology and evolution 4, 4 (1987), 406–425.
[17] Sarich, V. M. Pinniped phylogeny. Systematic Zoology 18, 4 (1969), 416–422.
[18] Simonsen, M., Mailund, T., and Pedersen, C. N. Rapid neighbour-joining. In International Workshop on Algorithms in Bioinformatics (2008), Springer, pp. 113–122.
[19] Stoye, J., Evers, D., and Meyer, F. Rose: generating sequence families. Bioinformatics (Oxford, England) 14, 2 (1998), 157–163.
[20] Studier, J. A., and Keppler, K. J. A note on the neighbor-joining algorithm of saitou and nei. Molecular biology and evolution 5, 6 (1988), 729–731.
[21] Sukumaran, J., and Holder, M. T. Dendropy: a python library for phylogenetic computing. Bioinformatics 26, 12 (2010), 1569–1571.
[22] Sutton, R. S., and Barto, A. G. Reinforcement learning: An introduction. MIT press, 2018.
[23] Suvorov, A., Hochuli, J., and Schrider, D. R. Accurate inference of tree topologies from multiple sequence alignments using deep learning. Systematic biology 69, 2 (2020), 221–233.
[24] Wheeler, T. J. Large-scale neighbor-joining with ninja. In International Workshop on Algorithms in Bioinformatics (2009), Springer, pp. 375–389.