Can You Guess the Title? Generating Emoji Sequences for Movies
Emojis play an important role in present-day written and typed communication, where their primary function is to supplement words with emotional cues. Although emojis are interpreted, and therefore used, differently across cultures, a small set of them has a clear meaning and strong sentiment polarity. In this work we study how to map natural language text to emoji sequences; more precisely, we automatically assign emojis to movie subtitles and scripts. The proposed pipeline has two steps: first, the most relevant words are extracted from the movie subtitle, and then these words are mapped to emojis. Three mapping methods are proposed: a lexical matching-based approach, a word embedding-based approach, and a combination of the two. To demonstrate the viability of the approach, we list some of the generated emoji sequences for a randomly selected subset of movies, and also show the deficiencies of the method in generating guessable sequences. Evaluation is performed via quizzes completed by human participants.
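The two-step pipeline described in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and not the authors' implementation: keyword extraction is approximated by stopword-filtered word frequency, lexical matching by exact comparison with an emoji name, the embedding step by cosine similarity over invented two-dimensional toy vectors, and the combined approach by trying the lexical match first and falling back to the embedding match. The names `EMOJI_NAMES`, `TOY_VECTORS`, `STOPWORDS` and all function names are hypothetical.

```python
from collections import Counter
import math
import re

# Hypothetical toy data: the emoji inventory and the 2-d "embeddings"
# are invented for illustration and do not come from the paper.
EMOJI_NAMES = {"ship": "🚢", "heart": "💙", "ocean": "🌊"}
TOY_VECTORS = {
    "ship": (1.0, 0.0),
    "boat": (0.9, 0.1),
    "heart": (0.0, 1.0),
    "love": (0.1, 0.9),
    "ocean": (0.7, 0.3),
}
STOPWORDS = {"the", "a", "and", "of", "is", "in", "to", "i", "you"}


def top_words(text, k):
    """Return the k most frequent non-stopword tokens.

    Assumption: plain frequency stands in for the relevance measure.
    """
    tokens = [t for t in re.findall(r"[a-z]+", text.lower())
              if t not in STOPWORDS]
    return [word for word, _ in Counter(tokens).most_common(k)]


def lexical_match(word):
    """Return the emoji whose name equals the word, or None."""
    return EMOJI_NAMES.get(word)


def cosine(u, v):
    """Cosine similarity of two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def embedding_match(word, threshold=0.8):
    """Return the emoji whose name vector is closest to the word vector.

    Returns None if the word has no vector or no emoji name reaches the
    (assumed) similarity threshold.
    """
    if word not in TOY_VECTORS:
        return None
    best_name, best_sim = None, threshold
    for name in EMOJI_NAMES:
        sim = cosine(TOY_VECTORS[word], TOY_VECTORS[name])
        if sim >= best_sim:
            best_name, best_sim = name, sim
    return EMOJI_NAMES[best_name] if best_name is not None else None


def combined_match(word):
    """Try the lexical match first, then fall back to the embeddings."""
    return lexical_match(word) or embedding_match(word)


def emoji_sequence(text, k=3):
    """Map the k most frequent words of the text to emojis, skipping misses."""
    candidates = (combined_match(w) for w in top_words(text, k))
    return [emoji for emoji in candidates if emoji is not None]


if __name__ == "__main__":
    subtitle = "The boat, the boat! I love the ocean and the boat. Love is love."
    print(emoji_sequence(subtitle))
```

In this toy run, "boat" has no lexical match and is resolved through the embedding fallback, whereas "ocean" is matched lexically. A realistic setup would replace the toy vectors with pretrained word embeddings and the frequency count with a proper keyword-extraction measure.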
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
When the article is accepted for publication, I, as the author and representative of the coauthors, hereby agree to transfer to Studia Universitatis Babes-Bolyai, Series Informatica, all rights, including those pertaining to electronic forms and transmissions, under existing copyright laws, except for the following, which the author specifically retains: the right to make further copies of all or part of the published article for my use in classroom teaching; the right to reuse all or part of this material in a review or in a textbook of which I am the author; and the right to make copies of the published work for internal distribution within the institution that employs me.