Research on spatial reasoning and human mobility prediction is advancing rapidly, with a focus on models that accurately predict human movement and reason about spatial relationships. Recent work applies Vision-Language Models (VLMs) to both tasks with promising results. Notably, combining reinforcement learning with visual map feedback has produced measurable gains in next-location prediction, and frameworks that adapt to disaster scenarios show potential for forecasting mobility in emergency situations. Synthetic datasets and curriculum learning have also proven effective for improving the robustness and generalization of spatial language models. Overall, the field is moving toward more sophisticated, human-like models that reason about spatial relationships and predict movement with high accuracy.

Noteworthy papers:
- Eyes Will Shut: proposes a vision-based next GPS location prediction model trained with reinforcement learning from visual map feedback, achieving state-of-the-art performance.
- Predicting Human Mobility in Disasters via LLM-Enhanced Cross-City Learning: introduces DisasterMobLLM, a framework that leverages LLMs to model mobility intention, achieving a 32.8% improvement in Acc@1.
- 3D-R1: proposes a foundation model that enhances the reasoning capabilities of 3D VLMs, delivering an average improvement of 10% across various 3D scene benchmarks.
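To make the reinforcement-learning-from-map-feedback idea concrete, here is a minimal, self-contained sketch, not the implementation from Eyes Will Shut: it assumes a toy setup with a discretized grid of candidate next moves, a linear policy over hand-crafted features standing in for a VLM over rendered map tiles, and a REINFORCE-style update driven by a negative-distance reward.

```python
import math
import random

# Toy sketch of RL-from-map-feedback for next-location prediction.
# Assumptions (not from the paper): candidate next moves on a small grid,
# a linear softmax policy over hand-crafted features, and a REINFORCE update
# with a negative-distance reward standing in for visual map feedback.

GRID = [(dx, dy) for dx in range(-2, 3) for dy in range(-2, 3)]  # candidate moves

def features(history, move):
    """Toy features: interaction of the last displacement with the candidate move."""
    (x1, y1), (x2, y2) = history[-2], history[-1]
    last = (x2 - x1, y2 - y1)
    return [1.0, last[0] * move[0], last[1] * move[1]]

def sample_move(weights, history):
    """Softmax policy over candidate moves; returns sampled index and probabilities."""
    scores = [sum(w * f for w, f in zip(weights, features(history, m))) for m in GRID]
    mx = max(scores)
    exps = [math.exp(s - mx) for s in scores]
    z = sum(exps)
    probs = [e / z for e in exps]
    idx = random.choices(range(len(GRID)), probs)[0]
    return idx, probs

def reward(pred, true_next):
    """Feedback signal: negative Euclidean distance to the true next point."""
    return -math.dist(pred, true_next)

def train(trajectories, epochs=50, lr=0.05):
    weights = [0.0, 0.0, 0.0]
    for _ in range(epochs):
        for traj in trajectories:
            history, true_next = traj[:-1], traj[-1]
            idx, probs = sample_move(weights, history)
            move = GRID[idx]
            pred = (history[-1][0] + move[0], history[-1][1] + move[1])
            r = reward(pred, true_next)
            # REINFORCE: gradient of log pi(a) is f(a) - sum_m pi(m) f(m).
            for m_idx, m in enumerate(GRID):
                coef = (1.0 if m_idx == idx else 0.0) - probs[m_idx]
                f = features(history, m)
                for k in range(len(weights)):
                    weights[k] += lr * r * coef * f[k]
    return weights

if __name__ == "__main__":
    # Straight-line toy trajectories: the policy should learn to continue in the same direction.
    data = [[(i, 0), (i + 1, 0), (i + 2, 0), (i + 3, 0)] for i in range(5)]
    print("learned weights:", train(data))
```

In the systems summarized above, the policy would instead be a VLM conditioned on a rendered map of the trajectory, and the reward would be derived from the visual map itself rather than a hand-coded distance.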