%A Bodo, Zalan %A Csato, Lehel %D 2017 %T A Hybrid Approach for Scholarly Information Extraction %K %X Metadata extraction from documents forms an essential part of web or desktop search systems. Similarly, digital libraries that index scholarly literature need to find and extract the title, the list of authors and other publication-related information from an article. We present a hybrid approach for metadata extraction, combining classification and clustering to extract the desired information without the need for a conventional labeled dataset for training. An important asset of the proposed method is that the resulting clustering parameters can be used in other problems, e.g. document layout analysis. %U https://www.cs.ubbcluj.ro/~studia-i/journal/journal/article/view/10 %J Studia Universitatis Babeș-Bolyai Informatica %0 Journal Article %R 10.24193/subbi.2017.2.01 %P 5-16 %V 62 %N 2 %@ 2065-9601 %8 2017-12-15