Towards Automated Ontology Generation from Unstructured Text: A Multi-Agent LLM Approach
Automatically generating formal ontologies from unstructured natural language remains a central challenge in knowledge engineering. While large language models (LLMs) show promise, it remains unclear which architectural design choices drive generation quality and why current approaches fail. We present a controlled experimental study using domain-specific insurance contracts to investigate these questions. We first establish a single-agent LLM baseline, identifying key failure modes such as poor Ontology Design Pattern compliance, structural redundancy, and ineffective iterative repair. We then introduce a multi-agent architecture that decomposes ontology construction into four artifact-driven roles: Domain Expert, Manager, Coder, and Quality Assurer. We evaluate performance across architectural quality (via a panel of heterogeneous LLM judges) and functional usability (via competency question driven SPARQL evaluation with complementary retrieval augmented generation based assessment). Results show that the multi-agent approach significantly improves structural quality and modestly enhances queryability, with gains driven primarily by front-loaded planning. These findings highlight planning-first, artifact-driven generation as a promising and more auditable path toward scalable automated ontology engineering.
Full article excerpt tap to expand
Computer Science > Artificial Intelligence arXiv:2604.23090 (cs) [Submitted on 25 Apr 2026] Title:Towards Automated Ontology Generation from Unstructured Text: A Multi-Agent LLM Approach Authors:Abid Talukder, Maruf Ahmed Mridul, Oshani Seneviratne View a PDF of the paper titled Towards Automated Ontology Generation from Unstructured Text: A Multi-Agent LLM Approach, by Abid Talukder and Maruf Ahmed Mridul and Oshani Seneviratne View PDF HTML (experimental) Abstract:Automatically generating formal ontologies from unstructured natural language remains a central challenge in knowledge engineering. While large language models (LLMs) show promise, it remains unclear which architectural design choices drive generation quality and why current approaches fail. We present a controlled experimental study using domain-specific insurance contracts to investigate these questions. We first establish a single-agent LLM baseline, identifying key failure modes such as poor Ontology Design Pattern compliance, structural redundancy, and ineffective iterative repair. We then introduce a multi-agent architecture that decomposes ontology construction into four artifact-driven roles: Domain Expert, Manager, Coder, and Quality Assurer. We evaluate performance across architectural quality (via a panel of heterogeneous LLM judges) and functional usability (via competency question driven SPARQL evaluation with complementary retrieval augmented generation based assessment). Results show that the multi-agent approach significantly improves structural quality and modestly enhances queryability, with gains driven primarily by front-loaded planning. These findings highlight planning-first, artifact-driven generation as a promising and more auditable path toward scalable automated ontology engineering. Subjects: Artificial Intelligence (cs.AI) Cite as: arXiv:2604.23090 [cs.AI] (or arXiv:2604.23090v1 [cs.AI] for this version) https://doi.org/10.48550/arXiv.2604.23090 Focus to learn more arXiv-issued DOI via DataCite (pending registration) Submission history From: Oshani Seneviratne [view email] [v1] Sat, 25 Apr 2026 00:58:17 UTC (1,382 KB) Full-text links: Access Paper: View a PDF of the paper titled Towards Automated Ontology Generation from Unstructured Text: A Multi-Agent LLM Approach, by Abid Talukder and Maruf Ahmed Mridul and Oshani SeneviratneView PDFHTML (experimental)TeX Source view license Current browse context: cs.AI < prev | next > new | recent | 2026-04 Change to browse by: cs References & Citations NASA ADSGoogle Scholar Semantic Scholar export BibTeX citation Loading... BibTeX formatted citation × loading... Data provided by: Bookmark Bibliographic Tools Bibliographic and Citation Tools Bibliographic Explorer Toggle Bibliographic Explorer (What is the Explorer?) Connected Papers Toggle Connected Papers (What is Connected Papers?) Litmaps Toggle Litmaps (What is Litmaps?) scite.ai Toggle scite Smart Citations (What are Smart Citations?) Code, Data, Media Code, Data and Media Associated with this Article alphaXiv Toggle alphaXiv (What is alphaXiv?) Links to Code Toggle CatalyzeX Code Finder for Papers (What is CatalyzeX?) DagsHub Toggle DagsHub (What is DagsHub?) GotitPub Toggle Gotit.pub (What is GotitPub?) Huggingface Toggle Hugging Face (What is Huggingface?) ScienceCast Toggle ScienceCast (What is ScienceCast?) Demos Demos Replicate Toggle Replicate (What is Replicate?) Spaces Toggle Hugging Face Spaces (What is Spaces?) Spaces Toggle TXYZ.AI (What…
This excerpt is published under fair use for community discussion. Read the full article at arXiv.org.