The race to imbue artificial intelligence with the nuances of the Estonian language is intensifying, marked by efforts to create specialized benchmarks and train open-source models on authentic Estonian data. Despite advances, AI-generated Estonian still sounds artificial, and researchers are aiming for language that mirrors natural human speech. This pursuit involves leveraging powerful computing resources and grappling with the inherent limitations of a "small language" with constrained data.
The Push for Genuine Estonian in AI
Researchers are actively working to enhance how large language models (LLMs) comprehend and generate Estonian. A significant undertaking involves training open-source models to speak Estonian more fluently and to grasp cultural subtleties. This initiative aims to move beyond the current artificial and "wooden" output, striving for AI-generated Estonian that sounds like actual human conversation.
In conversations with AI, "Estonian often sounds artificial and clumsy," notes Kairit Sirts, Associate Professor in Natural Language Processing at the University of Tartu.
The endeavor is backed by substantial computational power: the models are being trained on the LUMI supercomputer in Finland, described as the fastest in Northern Europe.
Benchmarking Estonian Language Models
The availability of benchmarks for evaluating LLMs specifically for the Estonian language has been limited. To address this gap, a new benchmark has been developed, utilizing seven diverse datasets sourced directly from native Estonian materials. This benchmark aims to provide a comprehensive evaluation of various LLMs on Estonian tasks.

The evaluation included:
Six base models
Twenty-six instruction-tuned models
A comparison between open-source and commercial models
This effort distinguishes itself by using native Estonian sources rather than machine translation, with the aim of producing more reliable test material. Both human evaluators and AI judges were employed; the Claude 3.7 Sonnet model showed strong agreement with human ratings and outperformed the others in the evaluation. The benchmark tests a range of competencies, including general knowledge, domain-specific expertise, grammar, vocabulary, summarization, and contextual understanding.
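To make the idea of "agreement with human ratings" concrete, the following is a minimal sketch of one common way to quantify it, Cohen's kappa, which corrects raw agreement for agreement expected by chance. The article does not say which agreement measure the benchmark authors used, so the choice of kappa is an assumption here, and the function name `cohen_kappa` and the ratings below are invented purely for illustration.

```python
from collections import Counter


def cohen_kappa(human, judge):
    """Cohen's kappa for two equal-length lists of categorical ratings.

    Illustrative helper, not taken from the benchmark itself.
    """
    if len(human) != len(judge) or not human:
        raise ValueError("rating lists must be non-empty and of equal length")
    n = len(human)
    # Fraction of items on which the two raters gave the same label.
    observed = sum(h == j for h, j in zip(human, judge)) / n
    # Agreement expected by chance, from each rater's label frequencies.
    human_counts = Counter(human)
    judge_counts = Counter(judge)
    expected = sum(human_counts[c] * judge_counts[c] for c in human_counts) / (n * n)
    if expected == 1.0:
        # Both raters used one identical label throughout.
        return 1.0
    return (observed - expected) / (1 - expected)


# Hypothetical data: 1 = "answer acceptable", 0 = "answer not acceptable".
human_ratings = [1, 0, 1, 1, 0, 1, 0, 0]
judge_ratings = [1, 0, 1, 0, 0, 1, 0, 1]
kappa = cohen_kappa(human_ratings, judge_ratings)
print(f"kappa = {kappa:.2f}")
```

A kappa of 1 means perfect agreement and 0 means agreement no better than chance; an AI judge whose scores track human ratings closely would sit nearer the top of that range.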
Addressing the "Small Language" Challenge
A key challenge is working with a "small language" for which data resources are limited. Despite these constraints, the ambition is to maintain and expand competence in large language models within the Estonian research community. This involves intricate data collection strategies, quality assessment, and the creation of benchmarking frameworks to improve how well open LLMs handle Estonian.
The broader applications of these efforts extend to:
Chatbots
Text summarizers
Content aggregators
Question answering systems
Background: The LLM Landscape
Large language models (LLMs) represent a significant leap in language technology, capable of understanding and generating human-like text. Their development has spurred global interest in adapting them to specific languages and cultures. The efforts in Estonia reflect a broader trend of national initiatives that aim to ensure linguistic diversity is not lost amid the rapid advancement of artificial intelligence. The Institute of Computer Science at the University of Tartu, Tallinn University of Technology, and the Institute of the Estonian Language are key players in this field.