TF1-EN-3M: Three Million Synthetic Moral Fables for Training Small, Open Language Models (2025)

arXiv.org
Authors: Mihai Nadǎş, Laura Dioşan, Andrei Piscoran, Andreea Tomescu
Abstract: Moral stories are a time-tested vehicle for transmitting values, yet modern NLP lacks a large, structured corpus that couples coherent narratives with explicit ethical lessons. We close this gap with TF1-EN-3M, the first open dataset of three million English-language fables generated exclusively by…

PyResolveMetrics: A Standards-Compliant and Efficient Approach to Entity Resolution Metrics (2024)

International Conference on Computer Supported Education
Authors: Andrei Olar, L. Dioşan
Abstract: Entity resolution, the process of discerning whether multiple data refer to the same real-world entity, is crucial across various domains, including education. Its quality assessment is vital due to the extensive practical applications in fields such as analytics, personalized learning or academic…