Serbia launches LORYA to turn cultural heritage into AI-ready language data
The system aims to improve access to underrepresented languages in AI while supporting research and education through digitised cultural archives.
Serbia has launched LORYA, a new platform that uses AI-supported document processing to convert books, newspapers, manuscripts, and other written heritage materials into clean, structured, machine-readable data for research, education, and language technologies.
Developed by the UN Development Programme, the Mathematical Institute of the Serbian Academy of Sciences and Arts, and the National Library of Serbia, with support from France and Japan, the project aims not only to preserve written cultural heritage but also to address a broader AI problem: the scarcity of underrepresented languages, scripts, and historical texts in digital training data.
The distinction matters. While many digitisation initiatives focus mainly on preservation and access, LORYA is also designed to prepare historical material for computational use. In practice, that means converting complex printed and handwritten documents into reusable data that can better support language technologies and future AI systems.
The platform focuses on books, newspapers, manuscripts, and other archival sources, including materials that traditional OCR systems often struggle to process. Its ability to work with handwritten, multi-script, and visually complex documents makes it especially relevant for collections that have remained difficult to digitise in a meaningful way.
That gives the project a wider significance beyond Serbia. As AI systems continue to depend on large volumes of digital text, many smaller or historically under-digitised languages remain poorly represented in training datasets. By transforming cultural heritage into structured digital resources, LORYA frames preservation not only as an archival task but also as part of a broader effort to make AI development more linguistically inclusive.
The project has also been released as open-source software and recognised as a Digital Public Good, suggesting that it is meant to serve as more than a national pilot. Interest from UNDP teams in Iraq and Nepal indicates that the model could be adapted in other contexts where cultural heritage, language diversity, and digital capacity intersect.
Seen in that light, LORYA is not simply a heritage digitisation tool. It is also an attempt to connect cultural preservation with public-interest AI development, while arguing that historical texts, minority languages, and local knowledge systems should not remain on the margins of the AI era.
