Advances in Entity Recognition and Data Integration

Entity recognition and data integration are advancing quickly, driven by large language models and new methods built on them. Researchers are exploring prompt-based learning, in-context clustering, and few-shot prompting to improve both the accuracy and the efficiency of entity recognition. These methods have shown promising results in several domains, including medical texts, historical documents, and cultural news texts. In parallel, dynamic frameworks that combine embeddings with clustering are making it easier to integrate common data elements across heterogeneous datasets.

Noteworthy papers in this area include:

Segmenting France Across Four Centuries, which introduces a new dataset for analyzing long-term land use and land cover evolution.

Research on Medical Named Entity Identification Based On Prompt-Biomrc Model and Its Application in Intelligent Consultation System, which presents a novel approach to medical entity recognition using prompt learning.

In-context Clustering-based Entity Resolution with Large Language Models, which proposes a scalable and effective method for entity resolution using in-context clustering.
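To make the embedding-and-clustering idea mentioned above concrete, here is a minimal sketch in Python. It is not the method of any paper listed here. The character n-gram "embedding", the greedy threshold clustering, the function names (embed, cosine, group_elements), and the threshold value are all assumptions chosen so the example runs with the standard library alone. A real system would presumably use a learned embedding model and a more robust clustering algorithm.

```python
from collections import Counter
from math import sqrt


def embed(text: str, n: int = 3) -> Counter:
    """Toy embedding (assumption): counts of character n-grams.

    Stands in for a learned embedding model, which this sketch does not use.
    """
    padded = f" {text.lower().strip()} "
    return Counter(padded[i:i + n] for i in range(len(padded) - n + 1))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def group_elements(names: list[str], threshold: float = 0.5) -> list[list[str]]:
    """Greedy clustering (assumption): each name joins the first cluster
    whose seed vector is at least `threshold` similar, else starts a new one."""
    clusters: list[tuple[Counter, list[str]]] = []
    for name in names:
        vec = embed(name)
        for seed, members in clusters:
            if cosine(vec, seed) >= threshold:
                members.append(name)
                break
        else:
            # No existing cluster was close enough; this name seeds a new one.
            clusters.append((vec, [name]))
    return [members for _, members in clusters]


if __name__ == "__main__":
    # Hypothetical data-element names from two datasets.
    for group in group_elements(["Patient Age", "patient age", "zip"]):
        print(group)
```

Names that differ only in letter case map to the same vector and land in one group, while names with no shared n-grams stay apart. The threshold controls how aggressively near-duplicates are merged and would need tuning on real data.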

Sources

Segmenting France Across Four Centuries

Research on Medical Named Entity Identification Based On Prompt-Biomrc Model and Its Application in Intelligent Consultation System

A Dynamic Framework for Semantic Grouping of Common Data Elements (CDE) Using Embeddings and Clustering

In-context Clustering-based Entity Resolution with Large Language Models: A Design Space Exploration

Evaluating Named Entity Recognition Models for Russian Cultural News Texts: From BERT to LLM

On Entity Identification in Language Models

Token and Span Classification for Entity Recognition in French Historical Encyclopedias

TransClean: Finding False Positives in Multi-Source Entity Matching under Real-World Conditions via Transitive Consistency
