Abstract
Public procurement accounts for over $13 trillion in spending annually, yet data about public buyers and suppliers remains fragmented, inconsistent, and difficult to link across jurisdictions. This paper presents a practical industrial solution, developed by Spend Network within the European project enRichMyData, to semantically enrich and reconcile procurement data at scale. The proposed pipeline combines large language models (LLMs) with knowledge graphs (KGs) to create and maintain a canonical register of public sector entities. It supports multilingual, cross-border integration and is designed to serve both public transparency and commercial applications. The pipeline has been evaluated on a manually curated benchmark of 1,000 procurement-related entities and demonstrates high precision and scalability in real-world settings.