Physics-Aware Perception in Vision and Graphics

The field of computer vision is moving toward incorporating physical reasoning and geometry into its models, enabling more accurate and robust perception. This is evident in the development of physically-grounded visual backbones, such as Φeat, and in the integration of geometric priors into photometric stereo networks. Another significant trend is improved efficiency and accuracy in depth estimation and visual odometry, with a focus on real-time deployment and robustness under adverse conditions. Notable papers in this area include:

- DPVO-QAT++, which combines heterogeneous quantization-aware training with CUDA kernel fusion to reduce the memory footprint and processing time of deep patch visual odometry
- GeoUniPS, which leverages geometric priors for universal photometric stereo under limited multi-illumination cues
- SOMA, which enhances feature gradients for affine-flow matching in SAR-optical registration
- RTS-Mono, a real-time self-supervised monocular depth estimation method for real-world deployment
- WeSTAR, which improves the generalization of depth estimation foundation models via weakly-supervised adaptation with regularization
- SEC-Depth, which introduces a self-evolution contrastive learning framework for robust depth estimation
- RoMa v2, a new matching architecture for dense feature matching
- MOMNet, a multi-order matching network for alignment-free depth super-resolution
- Lite Any Stereo, which achieves efficient zero-shot stereo matching

Together, these papers demonstrate the progress being made in physics-aware perception and its applications in vision and graphics.
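Several of the depth papers above rely on self-supervised training, in which a photometric reprojection loss between a target frame and warped neighboring frames stands in for ground-truth depth. As a minimal sketch of one standard ingredient, the per-pixel minimum reprojection loss (used to reduce the impact of occlusions), here in an illustrative NumPy formulation that is not taken from any of the listed papers:

```python
import numpy as np

def min_reprojection_loss(target, warped_sources):
    """Per-pixel minimum L1 photometric error over warped source frames.

    Taking the minimum over source views (rather than the average) lets each
    pixel be supervised by the source frame in which it is least occluded.
    `target` and each element of `warped_sources` are H x W x 3 arrays.
    This is a simplified sketch: real pipelines typically mix in an SSIM
    term and an edge-aware smoothness regularizer as well.
    """
    # Per-pixel L1 error against each warped source frame (H x W each).
    errors = [np.abs(target - w).mean(axis=-1) for w in warped_sources]
    # Per-pixel minimum across sources, then average over the image.
    return np.minimum.reduce(errors).mean()

# Toy usage: one source warps perfectly, the other is completely wrong;
# the min-reprojection loss ignores the bad source at every pixel.
target = np.ones((4, 4, 3))
good_warp = np.ones((4, 4, 3))
bad_warp = np.zeros((4, 4, 3))
loss = min_reprojection_loss(target, [good_warp, bad_warp])  # 0.0
```

In practice the warped sources come from differentiably resampling neighboring frames using the predicted depth and relative camera pose, so minimizing this loss trains the depth network without depth labels.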

Sources

Φeat: Physically-Grounded Feature Representation

DPVO-QAT++: Heterogeneous QAT and CUDA Kernel Fusion for High-Performance Deep Patch Visual Odometry

Geometry Meets Light: Leveraging Geometric Priors for Universal Photometric Stereo under Limited Multi-Illumination Cues

SOMA: Feature Gradient Enhanced Affine-Flow Matching for SAR-Optical Registration

RTS-Mono: A Real-Time Self-Supervised Monocular Depth Estimation Method for Real-World Deployment

Enhancing Generalization of Depth Estimation Foundation Model via Weakly-Supervised Adaptation with Regularization

Learning Depth from Past Selves: Self-Evolution Contrast for Robust Depth Estimation

RoMa v2: Harder Better Faster Denser Feature Matching

Multi-Order Matching Network for Alignment-Free Depth Super-Resolution

Lite Any Stereo: Efficient Zero-Shot Stereo Matching
