The field of computer vision is moving toward weakly-supervised learning and innovative data annotation techniques that improve model performance while reducing manual labeling effort. Recent research has focused on methods that learn from pseudo-labels, noisy labels, and limited annotations, enabling accurate models to be trained with minimal human supervision. Notable papers propose frameworks for generating high-quality pseudo-labels, refining label assignments, and transferring knowledge across datasets. These advances have significant implications for applications such as object detection, visual grounding, and animal pose estimation.

Noteworthy papers include Weakly-Supervised Affordance Grounding Guided by Part-Level Semantic Priors, which reports a marked improvement in affordance learning, and D2AF, a robust annotation framework for visual grounding that overcomes dataset size limitations by enriching the quantity and diversity of referring expressions. In addition, Auto-Labeling Data for Object Detection presents a viable alternative to standard labeling: pre-trained vision-language foundation models are configured to generate application-specific pseudo ground-truth labels.
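To make the auto-labeling idea concrete, the sketch below shows one common way to generate pseudo ground-truth boxes with an off-the-shelf open-vocabulary detector. It is a minimal illustration, not the pipeline from any of the cited papers: it assumes the OWL-ViT model available through Hugging Face transformers, and the prompt list, confidence threshold, and file path are all hypothetical.

```python
# Minimal pseudo-labeling sketch: query a pre-trained vision-language detector
# (OWL-ViT via Hugging Face transformers) with application-specific phrases and
# keep confident detections as pseudo ground-truth boxes. Prompts, threshold,
# and paths are illustrative assumptions, not values from the cited papers.
import torch
from PIL import Image
from transformers import OwlViTProcessor, OwlViTForObjectDetection

processor = OwlViTProcessor.from_pretrained("google/owlvit-base-patch32")
model = OwlViTForObjectDetection.from_pretrained("google/owlvit-base-patch32")
model.eval()

# Application-specific vocabulary: each phrase is an open-vocabulary query.
prompts = ["a forklift", "a safety helmet", "a pallet"]  # hypothetical classes

def pseudo_label(image_path, score_threshold=0.3):
    """Return [(label, score, [x0, y0, x1, y1]), ...] pseudo-annotations."""
    image = Image.open(image_path).convert("RGB")
    inputs = processor(text=[prompts], images=image, return_tensors="pt")
    with torch.no_grad():
        outputs = model(**inputs)
    # Rescale predicted boxes back to the original image resolution.
    target_sizes = torch.tensor([image.size[::-1]])  # (height, width)
    results = processor.post_process_object_detection(
        outputs, threshold=score_threshold, target_sizes=target_sizes
    )[0]
    return [
        (prompts[label], score.item(), [round(v, 1) for v in box.tolist()])
        for score, label, box in zip(
            results["scores"], results["labels"], results["boxes"]
        )
    ]

# Example (hypothetical file): annotations = pseudo_label("warehouse_001.jpg")
```

High-confidence detections produced this way can then be written out in a standard annotation format and used as pseudo ground truth to train a conventional detector on the target domain, with the score threshold trading label coverage against label noise.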