Advances in Human-Centric AI

The field of human-centric AI is rapidly advancing, with a focus on developing models that can understand and predict human behavior, intentions, and emotions. Recent research has produced frameworks and datasets that enable more accurate forecasting of human navigation, hand movements, and gaze. Notably, multimodal sensing and the fusion of visual, auditory, and other sensory cues have improved the performance of many of these systems. In parallel, progress on power-efficient autonomous mobile robots and socially-aware embodied navigation models has significant implications for real-world applications.

Several noteworthy papers illustrate these advances. EgoCogNav proposes a multimodal egocentric navigation framework that predicts perceived path uncertainty and forecasts trajectories and head motion. SFHand introduces a streaming framework for language-guided 3D hand forecasting that achieves state-of-the-art results, outperforming prior work by a significant margin. SocialNav presents a foundation model for socially-aware navigation that delivers strong gains in both navigation performance and social compliance. GazeProphetV2 takes a multimodal approach to VR gaze prediction, combining temporal gaze patterns, head movement data, and visual scene information to predict gaze behavior in virtual reality environments.
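
To make the multimodal-fusion idea behind systems like GazeProphetV2 concrete, the sketch below shows a generic late-fusion gaze predictor in PyTorch. It is a minimal illustration only: the module names, feature dimensions, and the GRU/MLP choices are assumptions for exposition, not the architecture published in any of the papers above.

```python
# Illustrative sketch only: generic late fusion of gaze history, head motion,
# and scene features for next-gaze prediction. All dimensions are assumed.
import torch
import torch.nn as nn


class MultimodalGazePredictor(nn.Module):
    def __init__(self, gaze_dim=2, head_dim=6, scene_dim=512, hidden=128):
        super().__init__()
        # Temporal encoders for past gaze points and head-motion signals.
        self.gaze_rnn = nn.GRU(gaze_dim, hidden, batch_first=True)
        self.head_rnn = nn.GRU(head_dim, hidden, batch_first=True)
        # Projection for precomputed visual scene features (e.g. CNN embeddings).
        self.scene_proj = nn.Linear(scene_dim, hidden)
        # Fusion head maps the concatenated cues to a future 2D gaze point.
        self.fusion = nn.Sequential(
            nn.Linear(3 * hidden, hidden), nn.ReLU(), nn.Linear(hidden, gaze_dim)
        )

    def forward(self, gaze_seq, head_seq, scene_feat):
        # gaze_seq: (B, T, 2), head_seq: (B, T, 6), scene_feat: (B, scene_dim)
        _, g = self.gaze_rnn(gaze_seq)
        _, h = self.head_rnn(head_seq)
        s = self.scene_proj(scene_feat)
        fused = torch.cat([g[-1], h[-1], s], dim=-1)
        return self.fusion(fused)  # predicted next gaze point, shape (B, 2)


# Usage with random tensors standing in for real VR session data.
model = MultimodalGazePredictor()
pred = model(torch.randn(4, 30, 2), torch.randn(4, 30, 6), torch.randn(4, 512))
print(pred.shape)  # torch.Size([4, 2])
```

The design choice illustrated here is late fusion: each modality is encoded separately and the resulting embeddings are concatenated before the prediction head, which is one common way to combine temporal and static cues.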

Sources

EgoCogNav: Cognition-aware Human Egocentric Navigation

SFHand: A Streaming Framework for Language-guided 3D Hand Forecasting and Embodied Manipulation

Gaze Beyond the Frame: Forecasting Egocentric 3D Visual Span

SkillSight: Efficient First-Person Skill Assessment with Gaze

MHB: Multimodal Handshape-aware Boundary Detection for Continuous Sign Language Recognition

GazeProphetV2: Head-Movement-Based Gaze Prediction Enabling Efficient Foveated Rendering on Mobile VR

Power-Efficient Autonomous Mobile Robots

Modular Deep Learning Framework for Assistive Perception: Gaze, Affect, and Speaker Identification

SocialNav: Training Human-Inspired Foundation Model for Socially-Aware Embodied Navigation
