Human-object interaction (HOI) detection is advancing rapidly, with current work focused on improving recognition accuracy and modeling complex interactions. Recent research emphasizes incorporating scene information, multi-task learning, and graph neural networks to enhance HOI detection. New datasets and frameworks, such as those for online HOI generation and perception, are also driving progress. Noteworthy papers include:
- Improvement of Human-Object Interaction Action Recognition Using Scene Information and Multi-Task Learning Approach, which proposes a method that improves human action recognition performance by considering information about fixed objects in the environment and adopting a multi-task learning approach.
- Explicit Multimodal Graph Modeling for Human-Object Interaction Detection, which proposes a multimodal graph network framework that explicitly models the HOI task in a four-stage graph structure.