Advancements in Image Captioning and Outdoor Monitoring

Image captioning research is moving toward more accurate and descriptive captions, with a focus on self-correction and attention-guided approaches. Researchers are also exploring retrieval-based object and relation prompts to improve captioning performance. In outdoor monitoring, there is growing interest in uncertainty-aware multimodal fusion frameworks for detecting early abnormal health status, alongside work on improving visual geo-localization for drones across weather conditions. Notable papers include SC-Captioner, which proposes a reinforcement learning framework for self-correcting image caption models, and WeatherPrompt, which introduces a multi-modality learning paradigm for weather-invariant representations. RORPCap and AGIC show promising results in image captioning, while DeepLight and DUAL-Health show potential in lightning prediction and outdoor health monitoring, respectively.

Sources

SC-Captioner: Improving Image Captioning with Self-Correction by Reinforcement Learning

AGIC: Attention-Guided Image Captioning to Improve Caption Relevance

RORPCap: Retrieval-based Objects and Relations Prompt for Image Captioning

Lightning Prediction under Uncertainty: DeepLight with Hazy Loss

Dynamic Uncertainty-aware Multimodal Fusion for Outdoor Health Monitoring

WeatherPrompt: Multi-modality Representation Learning for All-Weather Drone Visual Geo-Localization
