Entity linking over tabular data using the Human-In-The-Loop technique
Keywords: Entity linking, Data enrichment, Tabular data, Human-In-The-Loop
Entity linking is a core task in Natural Language Processing (NLP) and Information Retrieval (IR) [Shen et al 2021]. It associates spans of text (known as mentions) with the corresponding entries in a knowledge graph or database. This lets a system determine which real-world entity a mention refers to in a given context, particularly when the mention is ambiguous. For instance, the mention "Paris" may refer to either "Paris, France" or "Paris, Texas", and entity linking resolves such ambiguity [Yin et al 2019].
For structured data such as tables, entity linking means identifying the cells that mention entities and associating each of them with the appropriate entry in an external knowledge graph. This makes the table more meaningful and easier to interpret, because each linked cell is enriched with external information and context. For instance, in a table of movies, a cell containing "The Matrix" would be linked to the corresponding entry in a knowledge graph, which provides additional details about the film, such as its director and release date.
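As an illustration of the idea above, the following minimal sketch links a table cell to a toy in-memory knowledge graph. It is not the approach implemented in alligator: the `KG` dictionary, its identifiers, the `link_cell` function, the string-similarity scoring, and the type bonus are all assumptions made for this example.

```python
from difflib import SequenceMatcher

# Hypothetical toy knowledge graph: identifier -> label, type, attributes.
KG = {
    "kg:the_matrix_film": {
        "label": "The Matrix",
        "type": "film",
        "director": "The Wachowskis",
        "release_year": 1999,
    },
    "kg:matrix_math": {"label": "Matrix", "type": "mathematical object"},
}


def similarity(a, b):
    """Case-insensitive string similarity in [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def link_cell(mention, expected_type=None, threshold=0.6):
    """Return the best-matching entity id for a cell, or None.

    expected_type is an assumed hint taken from the column (for example
    "film" for a column of movie titles); matching candidates get a bonus.
    """
    best_id, best_score = None, 0.0
    for entity_id, entity in KG.items():
        score = similarity(mention, entity["label"])
        if expected_type is not None and entity["type"] == expected_type:
            score += 0.2  # illustrative bonus for agreeing with the column type
        if score > best_score:
            best_id, best_score = entity_id, score
    return best_id if best_score >= threshold else None


# Enrich one row of a movie table with attributes from the linked entry.
row = {"title": "The Matrix"}
entity_id = link_cell(row["title"], expected_type="film")
if entity_id is not None:
    row["director"] = KG[entity_id]["director"]
    row["release_year"] = KG[entity_id]["release_year"]
```

A real system would retrieve candidates from a large knowledge graph and use richer context, such as the other cells in the same row and column, rather than label similarity alone.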
The aim of this work is to extend the existing solution available at https://github.com/roby-avo/alligator with support for human feedback, thereby improving its effectiveness.
Work to be done:
- Human Feedback Integration: Define a process by which human feedback can be incorporated into the entity linking process. The integration should be easy for annotators to use and should process their input efficiently.
- Solution Evaluation: Evaluate the solution on a domain-specific dataset. The assessment should measure the effectiveness of the extended system and the improvement in entity linking that human feedback brings.
- Future Directions: Based on the evaluation results, outline potential future improvements and directions for further research and development.
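One possible shape for the human feedback integration described in the list above is sketched below. It is only an illustration under stated assumptions, not alligator's API: the `FeedbackStore` class, the `resolve` function, the `ask_human` callback, and the score-margin rule for deciding when to ask are all hypothetical. The linker's ranked candidates are accepted automatically when the top score clearly beats the runner-up; otherwise a human is asked, and the decision is remembered for later occurrences of the same mention in the same column.

```python
from dataclasses import dataclass, field


@dataclass
class FeedbackStore:
    """Remembers human decisions keyed by (column, lowercased mention)."""
    decisions: dict = field(default_factory=dict)

    def record(self, column, mention, entity_id):
        self.decisions[(column, mention.lower())] = entity_id

    def lookup(self, column, mention):
        return self.decisions.get((column, mention.lower()))


def resolve(column, mention, candidates, store, ask_human, min_margin=0.1):
    """Pick an entity for one cell and report where the decision came from.

    candidates: list of (entity_id, score) pairs produced by the linker.
    ask_human:  callback (mention, ranked_candidates) -> chosen entity_id.
    Returns (entity_id, source), where source is one of
    "feedback", "automatic", "human", or "unlinked".
    """
    remembered = store.lookup(column, mention)
    if remembered is not None:
        return remembered, "feedback"  # reuse an earlier human decision
    if not candidates:
        return None, "unlinked"
    ranked = sorted(candidates, key=lambda c: c[1], reverse=True)
    runner_up = ranked[1][1] if len(ranked) > 1 else 0.0
    if ranked[0][1] - runner_up >= min_margin:
        return ranked[0][0], "automatic"  # the linker is confident enough
    choice = ask_human(mention, ranked)  # ambiguous: defer to the annotator
    store.record(column, mention, choice)
    return choice, "human"


# Usage with a stand-in annotator that always picks "Paris, Texas".
store = FeedbackStore()
asked = []


def annotator(mention, ranked):
    asked.append(mention)
    return "kg:paris_texas"


ambiguous = [("kg:paris_france", 0.55), ("kg:paris_texas", 0.50)]
first = resolve("city", "Paris", ambiguous, store, annotator)
second = resolve("city", "Paris", ambiguous, store, annotator)
```

Asking only for low-margin cells keeps the annotator's workload small, and reusing stored decisions avoids repeating the same question. Both choices are design assumptions that the evaluation step would need to validate.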
[Shen et al 2021] Shen, W., Li, Y., Liu, Y., Han, J., Wang, J., & Yuan, X. (2021). Entity linking meets deep learning: Techniques and solutions. IEEE Transactions on Knowledge and Data Engineering.
[Yin et al 2019] Yin, X., Huang, Y., Zhou, B., Li, A., Lan, L., & Jia, Y. (2019). Deep entity linking via eliminating semantic ambiguity with BERT. IEEE Access, 7, 169434-169445.