Advancements in Human Motion Understanding and Action Segmentation

Research on human motion understanding and action segmentation is moving quickly, with recent work focused on retrieving and analyzing complex human behavior across varied scenarios. One clear shift is toward open-vocabulary approaches, which detect human-object interactions beyond a predefined set of classes. Interest is also growing in unsupervised and semi-supervised methods for temporal action segmentation, which learn from unlabeled or partially labeled data. Together, these advances could improve the robustness and generalization of systems for autonomous driving, human-computer interaction, and related applications. Noteworthy papers include:

  • A novel context-aware motion retrieval framework that enables scalable retrieval of human behavior and its context through text queries, outperforming state-of-the-art models by up to 27.5% in accuracy.
  • An end-to-end open-vocabulary HOI detector that integrates interaction-aware prompts and concept calibration, significantly outperforming state-of-the-art models on several datasets.
  • A novel approach for unsupervised skeleton-based temporal action segmentation, which utilizes a sequence-to-sequence temporal autoencoder and latent skeleton sequences to discover semantically meaningful action clusters.
  • A two-step approach for action discovery, which leverages known annotations to guide the temporal and semantic granularity of unknown action segments, demonstrating considerable improvements over existing baselines.
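
To make the idea of unsupervised temporal action segmentation concrete, the sketch below clusters per-frame feature vectors and merges consecutive frames with the same cluster label into segments. It is only an illustration under stated assumptions, not the method of any paper listed here: a tiny deterministic k-means stands in for the learned latent representations, the 2-D toy features are made up, and the function names (`kmeans`, `labels_to_segments`, and the helpers) are hypothetical.

```python
from itertools import groupby


def assign_to_nearest(features, centroids):
    """Label each frame with the index of its nearest centroid (squared Euclidean)."""
    labels = []
    for f in features:
        dists = [sum((a - b) ** 2 for a, b in zip(f, c)) for c in centroids]
        labels.append(dists.index(min(dists)))
    return labels


def update_centroids(features, labels, centroids):
    """Recompute each centroid as the mean of its frames; keep empty clusters unchanged."""
    new = []
    for k, old in enumerate(centroids):
        members = [f for f, l in zip(features, labels) if l == k]
        if members:
            new.append([sum(col) / len(members) for col in zip(*members)])
        else:
            new.append(list(old))
    return new


def kmeans(features, k, iters=20):
    """Tiny deterministic k-means (assumed stand-in): centroids start at evenly spaced frames."""
    n = len(features)
    centroids = [list(features[(i * n) // k]) for i in range(k)]
    labels = assign_to_nearest(features, centroids)
    for _ in range(iters):
        centroids = update_centroids(features, labels, centroids)
        new_labels = assign_to_nearest(features, centroids)
        if new_labels == labels:
            break
        labels = new_labels
    return labels


def labels_to_segments(labels):
    """Merge runs of identical frame labels into (start, end_exclusive, label) segments."""
    segments = []
    start = 0
    for label, run in groupby(labels):
        length = len(list(run))
        segments.append((start, start + length, label))
        start += length
    return segments


# Toy 2-D "latent" features (made up): three frames near (0, 0), then three near (5, 5).
frames = [[0.0, 0.1], [0.1, 0.0], [0.0, 0.0], [5.0, 5.1], [5.1, 5.0], [5.0, 5.0]]
frame_labels = kmeans(frames, k=2)
segments = labels_to_segments(frame_labels)
```

On the toy input, the six frames fall into two clusters and therefore two contiguous segments. Real skeleton sequences would need learned features and some temporal smoothing, since frame-wise clustering alone tends to over-segment.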

Sources

Context-based Motion Retrieval using Open Vocabulary Methods for Autonomous Driving

Open-Vocabulary HOI Detection with Interaction-aware Prompt and Concept Calibration

Skeleton Motion Words for Unsupervised Skeleton-Based Temporal Action Segmentation

Looking into the Unknown: Exploring Action Discovery for Segmentation of Known and Unknown Actions
