Artificial intelligence continues to reshape how organizations process data, interact with customers, and analyze vast amounts of textual information. However, as the integration of language models into critical systems deepens, a significant technical barrier remains: the tendency of these models to generate plausible but entirely false information. Addressing this challenge requires rigorous academic research and specialized technical focus. At the forefront of this effort in Europe is David Dukić, a researcher whose work at the University of Zagreb Faculty of Electrical Engineering and Computing provides critical insights into making artificial intelligence more reliable.
Understanding the Hallucination Problem in Artificial Intelligence
In the context of language models, a “hallucination” occurs when the system outputs information that is factually incorrect, fabricated, or unsupported by its training data or the provided context. Unlike obvious errors, hallucinations are particularly dangerous because the output is typically grammatically flawless, contextually coherent, and presented with high confidence. For enterprises relying on artificial intelligence for legal document analysis, medical data processing, or financial reporting, these hallucinations represent a critical liability.
The root cause of this issue lies in the fundamental architecture of large language models. These systems operate as probabilistic text generators, predicting the most likely next token (a word or word fragment) from statistical patterns learned during training. They do not possess a factual database or an inherent understanding of truth. When faced with a knowledge gap, the model will still attempt to complete the pattern, producing fabricated facts. Solving this requires shifting the focus from simply scaling up model size to improving the mechanisms of information extraction and verification.
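To make the "always completes the pattern" point concrete, here is a minimal sketch of greedy next-token selection. The vocabulary and the logit values are invented for illustration and do not come from any real model; the point is only that the selection step returns a token even when the probability distribution is nearly flat.

```python
import math


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def pick_next_token(vocab, logits):
    """Greedy decoding: return the most probable token and its probability."""
    probs = softmax(logits)
    best = max(range(len(vocab)), key=lambda i: probs[i])
    return vocab[best], probs[best]


# Hypothetical four-word vocabulary and made-up logits.
vocab = ["Zagreb", "Split", "Rijeka", "Osijek"]

# Case 1: the scores clearly favour one token.
confident = pick_next_token(vocab, [6.0, 1.0, 0.5, 0.2])

# Case 2: a "knowledge gap" where all scores are almost equal.
# A token is still returned, even though the model has no real preference.
unsure = pick_next_token(vocab, [1.01, 1.0, 0.99, 0.98])

print(confident)
print(unsure)
```

In both cases the function returns a fluent-looking answer. Nothing in the decoding step itself distinguishes a well-supported prediction from a guess, which is why verification has to be added around the model.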
The Research Focus of David Dukić
David Dukić has dedicated his doctoral research to natural language processing (NLP), specifically targeting how artificial intelligence interacts with and extracts meaning from textual data. Conducting his doctoral studies under the supervision of Professor Jan Šnajder at the University of Zagreb Faculty of Electrical Engineering and Computing, Dukić concentrated on improving how systems recognize semantic categories within text.
This involves training models to accurately identify and categorize entities such as people, geographical locations, organizations, and specific events, as well as underlying sentiments. While general-purpose language models can perform these tasks at a basic level, Dukić’s research targets high-precision extraction: the exact localization of specific information within complex documents. This granular approach is essential for building systems that can not only summarize a text but also reliably pull structured data from unstructured language without introducing fabricated elements.
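One way to make "without introducing fabricated elements" concrete is a grounding check: every extracted entity carries character offsets, and the text at those offsets must match what the extractor claims to have found. The sketch below is illustrative only. The `Entity` class, the `locate` and `filter_grounded` helpers, the labels, and the sample sentence are all hypothetical and are not taken from Dukić's or TakeLab's code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entity:
    label: str  # semantic category, e.g. "PER" or "ORG"
    text: str   # surface form claimed by the extractor
    start: int  # character offset where the span begins
    end: int    # character offset just past the end of the span


def locate(label, text, document):
    """Build an Entity by locating `text` in `document`.

    Raises ValueError (from str.index) if the text does not occur.
    """
    start = document.index(text)
    return Entity(label, text, start, start + len(text))


def filter_grounded(entities, document):
    """Keep only entities whose offsets point at their claimed text."""
    return [e for e in entities if document[e.start:e.end] == e.text]


document = "David Dukić studied at the University of Zagreb."
candidates = [
    locate("PER", "David Dukić", document),
    locate("ORG", "University of Zagreb", document),
    # A fabricated entity: "Split" does not appear at these offsets.
    Entity("LOC", "Split", 0, 5),
]

grounded = filter_grounded(candidates, document)
labels = [e.label for e in grounded]
print(labels)
```

The fabricated location is dropped because the characters at its offsets spell something else. A check like this cannot tell whether a label is correct, but it does guarantee that every extracted string was actually present in the source.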
Analyzing the Croatian Media Space
A significant portion of this research relies on robust, real-world datasets. Dukić’s work is closely tied to the TakeLab Retriever platform, a sophisticated tool developed by the TakeLab research group at the University of Zagreb Faculty of Electrical Engineering and Computing. This platform provides an in-depth analytical capability for the Croatian media space, drawing from an impressive and continuously updated database of approximately 12 million news articles.
Working with a dataset of this scale in a specific language like Croatian presents unique challenges and opportunities. Unlike English, which dominates the training data of most commercial language models, Croatian requires specialized modeling to handle its complex morphology and syntax. By training and testing systems on the TakeLab Retriever dataset, researchers can evaluate how well artificial intelligence performs in low-resource linguistic environments. This helps ensure that advances in NLP are not limited to English-speaking markets but also apply to Croatian and to other languages of the broader Balkan region.
Why Smaller, Specialized Models Outperform Massive Systems
One of the most compelling findings from Dukić’s research challenges the prevailing industry assumption that bigger is always better. While state-of-the-art commercial language models demonstrate impressive general capabilities, they often fall short in highly specialized, precision-dependent tasks. The research indicates that for specific applications—such as the precise extraction and localization of semantic information—smaller, fine-tuned models consistently outperform massive, generalized language models.
This finding has significant implications for developers and organizations deploying artificial intelligence. Large language models require immense computational resources for inference, leading to high latency and operational costs. Smaller models, fine-tuned on task-specific data such as the 12-million-article Croatian dataset, can achieve higher accuracy and precision on the target task while operating at a fraction of the cost. They are less prone to hallucinations in their specific domain because fine-tuning aligns their parameters tightly with the desired output structure, leaving less room for the model to generate creative but incorrect text.
The Mechanics of Fine-Tuning for Precision
Fine-tuning involves taking a pre-trained model and further training it on a narrow, high-quality dataset labeled for a specific task. In the context of semantic extraction, this means providing the model with thousands of examples where entities are explicitly highlighted and categorized. The model learns the exact boundaries of the information it needs to extract. Because the training objective is narrowly constrained, the model is far less likely to produce output outside those constraints, and therefore far less likely to hallucinate.
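A common way to encode "explicitly highlighted and categorized" entities in training data is BIO tagging, where each token is marked as the Beginning of an entity, Inside one, or Outside all entities. The source does not say which labeling scheme Dukić's models use, so BIO is an assumption here, and the tokens, tags, and `bio_to_spans` helper below are hypothetical. The sketch shows how per-token labels translate into exact entity boundaries.

```python
def bio_to_spans(tokens, tags):
    """Convert parallel token and BIO-tag lists into (label, text) spans.

    An "I-" tag that does not continue an open span of the same label is
    treated as outside, which is one of several possible conventions.
    """
    spans = []
    label, words = None, []

    def close_span():
        # Emit the span collected so far, if any.
        if label is not None:
            spans.append((label, " ".join(words)))

    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            close_span()
            label, words = tag[2:], [token]
        elif tag.startswith("I-") and label == tag[2:]:
            words.append(token)
        else:
            close_span()
            label, words = None, []
    close_span()
    return spans


# Hypothetical labeled example of the kind used for fine-tuning.
tokens = ["David", "Dukić", "works", "at", "the", "University", "of", "Zagreb"]
tags = ["B-PER", "I-PER", "O", "O", "O", "B-ORG", "I-ORG", "I-ORG"]

spans = bio_to_spans(tokens, tags)
print(spans)
```

Because a tagger of this kind only assigns labels to tokens that already exist in the input, it has no mechanism for emitting text that was not there. This is one reason a narrowly scoped extraction model leaves less room for fabrication than free-form generation.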
Future Directions in Mitigating Model Hallucinations
David Dukić plans to pursue postdoctoral research abroad this autumn, a step that takes his work on the hallucination problem to an international setting. Postdoctoral research allows for the exchange of methodologies between the academic environment of the University of Zagreb Faculty of Electrical Engineering and Computing and leading global AI laboratories.
Efforts to reduce hallucinations currently focus on several promising approaches. These include retrieval-augmented generation (RAG), where the model anchors its outputs to external, verified documents retrieved at query time, and the development of better uncertainty quantification metrics. Uncertainty quantification allows a model to explicitly signal when it does not know an answer, rather than guessing. Precise entity extraction provides a foundation for making these RAG systems function effectively, as the system must first accurately identify what information is missing before it can retrieve it.
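As a rough illustration of uncertainty quantification, the sketch below uses the normalized entropy of a probability distribution over candidate answers and abstains when the distribution is too flat. Entropy is only one simple uncertainty measure and is not claimed to be the method used in the research described here. The function names, the candidate answers, the probabilities, and the 0.5 threshold are all assumptions chosen for the example.

```python
import math


def normalized_entropy(probs):
    """Shannon entropy scaled to [0, 1]; 1 means a uniform distribution."""
    n = len(probs)
    if n < 2:
        return 0.0
    entropy = -sum(p * math.log(p) for p in probs if p > 0)
    return entropy / math.log(n)


def answer_or_abstain(candidates, probs, max_entropy=0.5):
    """Return the top candidate, or None when uncertainty is too high.

    `max_entropy` is an arbitrary illustrative threshold; in practice it
    would be tuned on held-out data.
    """
    if normalized_entropy(probs) > max_entropy:
        return None  # explicit "I don't know" instead of a guess
    best = max(range(len(candidates)), key=lambda i: probs[i])
    return candidates[best]


# Hypothetical candidate answers with made-up probabilities.
years = ["2019", "2020", "2021"]
sure = answer_or_abstain(years, [0.90, 0.05, 0.05])
unsure = answer_or_abstain(years, [0.34, 0.33, 0.33])

print(sure)
print(unsure)
```

In a RAG setting, an abstention like the second result could serve as the trigger for retrieving supporting documents instead of returning an answer directly.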
Practical Takeaways for AI Practitioners
The insights derived from the research conducted at the University of Zagreb Faculty of Electrical Engineering and Computing offer actionable lessons for technology leaders and developers working with language models today.
- Evaluate your use case critically: Determine whether you need a generalist conversational agent or a specialized information extraction tool. If your goal is precise data parsing, a large language model may be the wrong tool for the job.
- Invest in domain-specific fine-tuning: Allocate resources to curate high-quality, domain-specific training data. A smaller model trained on your specific terminology and document structure will yield more reliable results than a massive model relying on its pre-trained, generalized knowledge.
- Implement strict validation pipelines: Even with fine-tuned models, automated outputs must be subject to programmatic validation. If a model is tasked with extracting dates or monetary values, downstream scripts should verify that these outputs conform to expected formats and logical constraints.
- Consider the linguistic context: Ensure that the models you deploy are capable of handling the specific linguistic nuances of your operational region. Generic multilingual models often underperform compared to models specifically trained on local language datasets, such as the Croatian media corpus utilized by TakeLab.
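The validation step from the list above can be sketched with the Python standard library. This is a minimal example under stated assumptions: extracted dates are expected in ISO `YYYY-MM-DD` form, amounts are plain decimal strings, and the bounds (`earliest`, `latest`, `max_amount`) are hypothetical business rules that would differ per application.

```python
from datetime import date
from decimal import Decimal, InvalidOperation


def validate_date(value, earliest=date(1990, 1, 1), latest=None):
    """Accept only real calendar dates in ISO format within a plausible range."""
    try:
        parsed = date.fromisoformat(value)
    except (TypeError, ValueError):
        return False  # wrong type, wrong format, or impossible date
    if latest is None:
        latest = date.today()
    return earliest <= parsed <= latest


def validate_amount(value, max_amount=Decimal("1e9")):
    """Accept only finite, non-negative decimal amounts below a ceiling."""
    try:
        amount = Decimal(value)
    except (InvalidOperation, TypeError, ValueError):
        return False  # not parseable as a number
    # is_finite() is checked first so NaN and Infinity are never compared.
    return amount.is_finite() and Decimal("0") <= amount <= max_amount


# Hypothetical model outputs to check; `latest` is fixed for reproducibility.
cutoff = date(2024, 1, 1)
checks = {
    "good_date": validate_date("2023-05-17", latest=cutoff),
    "impossible_date": validate_date("2024-02-30", latest=cutoff),
    "future_date": validate_date("2031-01-01", latest=cutoff),
    "good_amount": validate_amount("12.50"),
    "negative_amount": validate_amount("-5"),
    "garbage_amount": validate_amount("twelve"),
}
print(checks)
```

Checks like these catch malformed or implausible values but cannot confirm that a well-formed value is the right one, so they complement rather than replace evaluation against labeled data.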
Conclusion
As artificial intelligence becomes deeply embedded in enterprise data pipelines, the tolerance for hallucinations will rapidly decrease. The research led by David Dukić at the University of Zagreb Faculty of Electrical Engineering and Computing highlights a clear path forward: prioritizing precision, leveraging specialized datasets, and recognizing the superior performance of fine-tuned, smaller models for specific tasks. By moving beyond the allure of massive, generalized language models and focusing on targeted, reliable information extraction, the AI industry can build systems that businesses can genuinely trust.