{"id":533,"date":"2014-12-02T00:42:37","date_gmt":"2014-12-01T22:42:37","guid":{"rendered":"https:\/\/www.cs.ubbcluj.ro\/~bufny\/?p=533"},"modified":"2022-01-10T00:33:30","modified_gmt":"2022-01-09T22:33:30","slug":"a-study-regarding-inter-domain-linked-documents-similarity-and-their-consequent-bounce-rate","status":"publish","type":"post","link":"https:\/\/www.cs.ubbcluj.ro\/~bufny\/a-study-regarding-inter-domain-linked-documents-similarity-and-their-consequent-bounce-rate\/","title":{"rendered":"A Study Regarding Inter Domain Linked Documents Similarity and their Consequent Bounce Rate"},"content":{"rendered":"<p>published in Studia Universitatis Babe\u015f-Bolyai, Seria Informatica, Vol. LIX, No. 1, pp. 83-91, 2014.<\/p>\n<p><strong>Cite as<\/strong><\/p>\n<pre class=\"nums:false wrap:on highlight:false\">D. Halita, D. Bufnea, \"A Study Regarding Inter Domain Linked Documents Similarity and their Consequent Bounce Rate\", in Studia Universitatis Babe\u015f-Bolyai, Seria Informatica, Vol. LIX, No. 1, pp. 83-91, 2014<\/pre>\n<p><strong>Full paper<\/strong><\/p>\n<p><img decoding=\"async\" style=\"border: none; vertical-align: text-bottom;\" src=\"https:\/\/www.cs.ubbcluj.ro\/~bufny\/wp-content\/uploads\/pdf.png\" alt=\"\" \/> <a href=\"https:\/\/www.cs.ubbcluj.ro\/~bufny\/wp-content\/uploads\/2014\/12\/Studia_2014_Halita_Bufnea.pdf\" target=\"_blank\" rel=\"noopener\">A Study Regarding Inter Domain Linked Documents Similarity and their Consequent Bounce Rate<\/a><\/p>\n<p><strong>Authors<\/strong><\/p>\n<p>Diana Halita, Darius Bufnea<br \/>\nDepartment of Computer Science, Faculty of Mathematics and Computer Science,<br \/>\nBabe\u015f-Bolyai University of Cluj-Napoca<\/p>\n<p><strong>Abstract<\/strong><\/p>\n<p>The main objective of linking inter domain documents is to offer the reader access to supplementary, semantically related information. 
However, links between web domains are sometimes created artificially, especially when the goal is to abusively increase the page rank of the destination domain. This paper presents a study regarding inter domain linked documents similarity and their consequent bounce rate. To this end, we conducted a series of experiments that outline how the behavior of similarity functions correlates with a website&#8217;s bounce rate. The method presented here could be used to identify improperly placed outgoing links within a web site, such as ads or spam links. Based on that, a search engine could refine its SERP results by downgrading any website that falls into this category.<\/p>\n<p><strong>Key words<\/strong><\/p>\n<p>bounce rate, document similarity, page ranking, identify improper placed links<\/p>\n<p><strong>BibTeX bib file<\/strong><\/p>\n<p><img decoding=\"async\" style=\"border: none; vertical-align: text-bottom;\" src=\"https:\/\/www.cs.ubbcluj.ro\/~bufny\/wp-content\/uploads\/bib.png\" alt=\"\" \/> <a href=\"https:\/\/www.cs.ubbcluj.ro\/~bufny\/wp-content\/uploads\/halita-bufnea-2014.bib\" target=\"_blank\" rel=\"noopener\">halita-bufnea-2014.bib<\/a><\/p>\n<pre class=\"lang:tex url:wp-content\/uploads\/halita-bufnea-2014.bib nums:false\"><\/pre>\n<p><strong>EndNote enw file<\/strong><\/p>\n<p><img decoding=\"async\" style=\"border: none; vertical-align: text-bottom;\" src=\"https:\/\/www.cs.ubbcluj.ro\/~bufny\/wp-content\/uploads\/enw.png\" alt=\"\" \/> <a href=\"https:\/\/www.cs.ubbcluj.ro\/~bufny\/wp-content\/uploads\/halita-bufnea-2014.enw\" target=\"_blank\" rel=\"noopener\">halita-bufnea-2014.enw<\/a><\/p>\n<p><strong>References<\/strong><\/p>\n<ol>\n<li>L. Becchetti, C. Castillo, D. Donato, <em>Link-Based Characterization and Detection of Web Spam<\/em>, 2<sup>nd<\/sup> International Workshop on Adversarial Information Retrieval on the Web, AIRWeb, Seattle, USA, August 2006, pp. 1-8.<\/li>\n<li>L. Becchetti, C. Castillo, D. Donato, S. Leonardi, R. 
Baeza-Yates, <em>Link Analysis for Web Spam Detection: Link-based and Content Based Techniques<\/em>, ACM Transactions on the Web (TWEB), Volume 2, Issue 1, New York, USA, February 2008, pp. 1-41.<\/li>\n<li>A. Farahat, M. Bailey, <em>How Effective is Targeted Advertising?<\/em>, Proceedings of the 21<sup>st<\/sup> World Wide Web Conference 2012, Lyon, France, April 16-20, 2012, pp. 111-120.<\/li>\n<li>Z. Gy\u00f6ngyi, H. Garcia-Molina, P. Berkhin, J. Pedersen, <em>Link Spam Detection Based on Mass Estimation<\/em>, 32<sup>nd<\/sup> International Conference on Very Large Data Bases (VLDB), Seoul, Korea, 2006, pp. 439-450.<\/li>\n<li>A. Huang, <em>Similarity Measures for Text Document Clustering<\/em>, Proceedings of the New Zealand Computer Science Research Student Conference, Hamilton, New Zealand, 2008, pp. 49-56.<\/li>\n<li>J. Leskovec, A. Rajaraman, J. Ullman, <em>Mining of Massive Datasets<\/em>, Cambridge University Press, 2010.<\/li>\n<li>M. Najork, <em>Detecting Spam Web Pages through Content Analysis<\/em>, International World Wide Web Conference Committee, Edinburgh, Scotland, 2006, pp. 83-92.<\/li>\n<li>N. Spirin, J. Han, <em>Survey on Web Spam Detection: Principles and Algorithms<\/em>, ACM SIGKDD Explorations Newsletter, Volume 13, Issue 2, December 2011, pp. 50-64.<\/li>\n<li>D. Zhou, C. Burges, T. Tao, <em>Transductive Link Spam Detection<\/em>, Proceedings of the 3<sup>rd<\/sup> International Workshop on Adversarial Information Retrieval on the Web, AIRWeb, New York, USA, ACM Press, 2007, pp. 21-28.<\/li>\n<\/ol>\n","protected":false},"excerpt":{"rendered":"<p>published in Studia Universitatis Babe\u015f-Bolyai, Seria Informatica, Vol. LIX, No. 1, pp. 83-91, 2014. Cite as D. Halita, D. 
Bufnea, &#8220;A Study Regarding Inter Domain Linked Documents Similarity and their Consequent Bounce Rate&#8221;, in Studia Universitatis Babe\u015f-Bolyai, Seria Informatica, Vol.&hellip; <a href=\"https:\/\/www.cs.ubbcluj.ro\/~bufny\/a-study-regarding-inter-domain-linked-documents-similarity-and-their-consequent-bounce-rate\/\" class=\"more-link\">Continue Reading <span class=\"meta-nav\">&rarr;<\/span><\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":[],"categories":[110],"tags":[132,133,131],"_links":{"self":[{"href":"https:\/\/www.cs.ubbcluj.ro\/~bufny\/wp-json\/wp\/v2\/posts\/533"}],"collection":[{"href":"https:\/\/www.cs.ubbcluj.ro\/~bufny\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.cs.ubbcluj.ro\/~bufny\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.cs.ubbcluj.ro\/~bufny\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.cs.ubbcluj.ro\/~bufny\/wp-json\/wp\/v2\/comments?post=533"}],"version-history":[{"count":13,"href":"https:\/\/www.cs.ubbcluj.ro\/~bufny\/wp-json\/wp\/v2\/posts\/533\/revisions"}],"predecessor-version":[{"id":2230,"href":"https:\/\/www.cs.ubbcluj.ro\/~bufny\/wp-json\/wp\/v2\/posts\/533\/revisions\/2230"}],"wp:attachment":[{"href":"https:\/\/www.cs.ubbcluj.ro\/~bufny\/wp-json\/wp\/v2\/media?parent=533"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.cs.ubbcluj.ro\/~bufny\/wp-json\/wp\/v2\/categories?post=533"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.cs.ubbcluj.ro\/~bufny\/wp-json\/wp\/v2\/tags?post=533"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}