Evaluating Large Language Models for Diacritic Restoration in Romanian Texts: A Comparative Study (2025)

Automatic diacritic restoration is crucial for text processing in languages that rely heavily on diacritical marks, such as Romanian. This study evaluates the performance of several large language models (LLMs) in restoring diacritics in Romanian texts. Using a comprehensive corpus, we tested models including OpenAI’s GPT-3.5, GPT-4, GPT-4o, Google’s Gemini 1.0 Pro, Meta’s Llama 2 and…

LLM Output Compliance with Handcrafted Linguistic Features: An Experiment (2025)

Can we control the writing style of large language models (LLMs) by specifying desired linguistic features? We address this question by investigating the impact of handcrafted linguistic feature (HLF) instructions on LLM-generated text. Our experiment evaluates various state-of-the-art LLMs using prompts incorporating HLF statistics derived from corpora of CNN articles and Yelp reviews. We…