Advances in Semantic Segmentation and Self-Supervised Learning

The field of computer vision is moving toward more robust and efficient methods for semantic segmentation and self-supervised learning. Researchers are exploring ways to improve the accuracy of semantic segmentation models when fine-grained annotations are unavailable; one promising direction is training on coarse annotations together with regularization methods that sharpen the alignment of boundaries between classes. Another area of focus is self-supervised learning, where researchers are working to mitigate structural degradation in dense representations and to improve performance on dense prediction tasks. Notable papers in this area include:
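A common way to train on coarse annotations is to leave a band of pixels around object boundaries unlabelled and exclude them from the loss, so the model is never penalized where the coarse mask is unreliable. The sketch below is a minimal NumPy illustration of that idea, not the method of any paper above; the `IGNORE` label value and the helper name are assumptions.

```python
import numpy as np

IGNORE = 255  # assumed label value marking unannotated pixels

def masked_cross_entropy(logits, labels, ignore=IGNORE):
    """Per-pixel cross-entropy that skips pixels without coarse labels.

    logits: (H, W, C) raw class scores; labels: (H, W) integer labels,
    where `ignore` marks the unannotated band around class boundaries.
    """
    # Numerically stable log-softmax over the class axis.
    z = logits - logits.max(axis=-1, keepdims=True)
    log_probs = z - np.log(np.exp(z).sum(axis=-1, keepdims=True))
    mask = labels != ignore
    if not mask.any():
        return 0.0
    # Gather the log-probability of the labelled class at each kept pixel.
    picked = log_probs[mask, labels[mask]]
    return float(-picked.mean())

# Toy example: a 2x2 image, 3 classes, one pixel left unannotated.
logits = np.zeros((2, 2, 3))
logits[0, 0, 1] = 5.0                     # confident, correct prediction
labels = np.array([[1, 0], [IGNORE, 2]])  # pixel (1, 0) carries no label
loss = masked_cross_entropy(logits, labels)
```

Because the ignored pixel contributes nothing to the loss, its logits can take any value without affecting training, which is exactly what makes coarse masks usable.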

  • A paper that proposes a regularization method for models with an encoder-decoder architecture, which improves boundary recall when trained on coarse annotations.
  • A paper that introduces a Dense representation Structure Estimator (DSE) to evaluate the quality of dense representations without annotations, and demonstrates its effectiveness for model selection and regularization.
  • A paper that presents a novel approach to salient object detection, which achieves near-supervised accuracy without pixel-level labels by leveraging reliable pseudo-masks and optimal transport alignment.
  • A paper that introduces FINDER, a framework for analyzing generic classification problems on noisy datasets, which produces state-of-the-art results in several challenging domains.
  • A paper that diagnoses and prevents partial prototype collapse in prototypical self-supervised learning, by introducing a fully decoupled training strategy that learns prototypes and encoders under separate objectives.
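Optimal transport alignment, mentioned in the saliency bullet above, is typically computed with entropic regularization via Sinkhorn iterations. The sketch below is a generic, self-contained illustration of that primitive under uniform marginals, not the paper's specific pipeline; the function name and toy cost matrix are assumptions.

```python
import numpy as np

def sinkhorn(cost, reg=0.1, n_iter=200):
    """Entropic optimal transport plan between two uniform distributions.

    cost: (n, m) pairwise cost matrix. Returns an (n, m) transport plan
    whose row sums equal 1/n and column sums equal 1/m.
    """
    n, m = cost.shape
    K = np.exp(-cost / reg)                # Gibbs kernel
    a, b = np.ones(n) / n, np.ones(m) / m  # uniform marginals
    u, v = np.ones(n), np.ones(m)
    for _ in range(n_iter):                # alternating marginal scaling
        u = a / (K @ v)
        v = b / (K.T @ u)
    return u[:, None] * K * v[None, :]

# Toy alignment: 2 cluster centroids vs. 2 saliency prototypes.
# Matching i<->i is cheap, so mass concentrates on the diagonal.
cost = np.array([[0.0, 1.0],
                 [1.0, 0.0]])
plan = sinkhorn(cost)
```

The regularization strength `reg` trades off sharpness of the matching against numerical stability: smaller values approach a hard assignment but make the kernel `K` prone to underflow.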

Sources

Semantic segmentation with coarse annotations

Exploring Structural Degradation in Dense Representations for Self-supervised Learning

Split-Fuse-Transport: Annotation-Free Saliency via Dual Clustering and Optimal Transport Alignment

FINDER: Feature Inference on Noisy Datasets using Eigenspace Residuals

Why Prototypes Collapse: Diagnosing and Preventing Partial Collapse in Prototypical Self-Supervised Learning

H-SPLID: HSIC-based Saliency Preserving Latent Information Decomposition
