Surgical Intelligence and Data Curation

Surgical research is moving toward more intelligent, context-aware systems, with a focus on improving patient safety and reducing preventable medical errors. Recent work highlights the importance of comprehensive datasets and robust models for surgical analysis and risk detection, and explores multimodal large language models and vision-language models to address challenges in surgical training, real-time decision support, and workflow analysis. Noteworthy papers in this area include:

CAT-SG introduces a large dynamic scene graph dataset for fine-grained understanding of cataract surgery, enabling more accurate recognition of surgical phases and techniques.

Visual-Semantic Knowledge Conflicts presents a dataset of synthetic operating-room images for studying visual-semantic knowledge conflicts, aiming to improve surgical risk perception in multimodal large language models.

Sanitizing Manufacturing Dataset Labels proposes a vision-language-based framework for sanitizing and refining labels in manufacturing image datasets, improving dataset quality for training robust machine learning models.

SurgVisAgent develops a multimodal agentic model for versatile surgical visual enhancement that dynamically identifies distortion categories and severity levels in endoscopic images.
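To make the label-sanitization idea concrete, the sketch below shows the general pattern of using an off-the-shelf vision-language model to score an image against its recorded label and a set of candidate labels, flagging likely mislabels for review. This is not the published framework: the CLIP checkpoint, the candidate label set, the margin threshold, and the flag_suspect_label helper are all illustrative assumptions.

```python
# Minimal sketch (assumptions noted above) of vision-language label checking:
# score each image against candidate labels and flag samples whose recorded
# label is clearly outscored by another candidate.
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

# Model choice and label set are illustrative, not from the paper.
model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

candidate_labels = ["scratch", "dent", "missing part", "no defect"]  # hypothetical classes

def flag_suspect_label(image_path: str, recorded_label: str, margin: float = 0.1):
    """Return (is_suspect, scores) by comparing the recorded label against candidates."""
    image = Image.open(image_path).convert("RGB")
    inputs = processor(text=candidate_labels, images=image,
                       return_tensors="pt", padding=True)
    with torch.no_grad():
        probs = model(**inputs).logits_per_image.softmax(dim=-1).squeeze(0)
    scores = dict(zip(candidate_labels, probs.tolist()))
    best = max(scores, key=scores.get)
    # Flag when another candidate beats the recorded label by more than the margin.
    is_suspect = best != recorded_label and \
        scores[best] - scores.get(recorded_label, 0.0) > margin
    return is_suspect, scores
```

Flagged samples could then be routed to human review or relabeled automatically; the actual refinement policy in the cited work is not reproduced here.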

Sources

CAT-SG: A Large Dynamic Scene Graph Dataset for Fine-Grained Understanding of Cataract Surgery

Visual-Semantic Knowledge Conflicts in Operating Rooms: Synthetic Data Curation for Surgical Risk Perception in Multimodal Large Language Models

Sanitizing Manufacturing Dataset Labels Using Vision-Language Models

SurgVisAgent: Multimodal Agentic Model for Versatile Surgical Visual Enhancement
