Remote sensing image segmentation is moving toward larger, more diverse datasets and toward frameworks that combine computer vision with natural language processing. Recent work has focused on large-scale datasets that cover single-object, multi-object, and non-object scenarios, which is expected to improve the generalization and real-world applicability of referring segmentation models. There is also growing interest in open-vocabulary semantic segmentation, which assigns a semantic label to each pixel using categories supplied as textual descriptions rather than a fixed label set. These directions have led to new benchmarks and evaluation metrics, including ones that assess performance across different viewing angles and sensor modalities. Notable papers include:
- A paper introducing a large-scale referring remote sensing image segmentation dataset and a novel framework that achieves state-of-the-art performance.
- A paper proposing a cost aggregation approach with optimal transport for open-vocabulary semantic segmentation, which notably improves the performance of existing models.
- A paper presenting a benchmark for multi-angle segmentation across aerial and ground perspectives, which enables evaluation across different viewing angles and sensor modalities.
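
To make the open-vocabulary idea above concrete, here is a minimal sketch of per-pixel labeling from text, assuming pixel and text embeddings that already live in a shared space (for example, from a CLIP-style encoder). The function name `open_vocab_segment`, the array shapes, and the toy data are illustrative assumptions and are not taken from any of the papers listed; the cost-aggregation and optimal-transport steps those papers describe are not reproduced here.

```python
import numpy as np


def open_vocab_segment(pixel_emb: np.ndarray, text_emb: np.ndarray) -> np.ndarray:
    """Assign each pixel the index of its most similar text embedding.

    pixel_emb: (H, W, D) per-pixel features (hypothetical encoder output).
    text_emb:  (K, D) one embedding per class description.
    Returns an (H, W) integer label map.
    """
    eps = 1e-8  # guards against division by zero for all-zero vectors
    p = pixel_emb / (np.linalg.norm(pixel_emb, axis=-1, keepdims=True) + eps)
    t = text_emb / (np.linalg.norm(text_emb, axis=-1, keepdims=True) + eps)
    # Cosine similarity between every pixel and every class: shape (H, W, K).
    similarity = p @ t.T
    return similarity.argmax(axis=-1)


# Toy data: three classes with orthogonal embeddings, and a 2x2 "image"
# whose pixel features are slightly perturbed copies of those embeddings.
class_names = ["building", "road", "water"]  # illustrative labels only
text_emb = np.eye(3)
target = np.array([[0, 1], [2, 1]])
pixel_emb = text_emb[target] + 0.05

labels = open_vocab_segment(pixel_emb, text_emb)
print([[class_names[k] for k in row] for row in labels])
```

Because the class list is just a set of text embeddings, new categories can be added at inference time by appending rows to `text_emb`, which is the property that distinguishes open-vocabulary segmentation from fixed-label segmentation.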