The field of e-commerce search and recommendation is advancing rapidly through the integration of multimodal learning, as researchers leverage textual and visual data jointly to improve search and recommendation quality. One key direction is end-to-end multimodal recommendation, in which the multimodal encoders and the recommendation model are optimized jointly, enabling real-time parameter updates and tighter alignment with downstream objectives. Another focus is dense retrieval, a critical component of e-commerce search engines, where novel frameworks such as multi-objective reinforcement learning and dynamic modality-balanced multimodal representation learning are being proposed to address challenges like modality imbalance and noise in multimodal data.

Noteworthy papers in this area include LEMUR, which proposes a large-scale end-to-end multimodal recommender system, and MOON2.0, which introduces a dynamic modality-balanced multimodal representation learning framework. TaoSearchEmb and CroPS likewise contribute to more effective and efficient search systems.
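To make the two recurring ideas concrete, the sketch below illustrates (a) fusing text and image embeddings with learned per-item modality weights, in the spirit of modality-balanced representation learning, and (b) scoring query-item pairs with an in-batch contrastive objective, as is common in dense retrieval. This is a minimal NumPy illustration under assumed shapes and names; it is not the actual LEMUR, MOON2.0, TaoSearchEmb, or CroPS implementation, and in an end-to-end system the encoders and gate parameters would be updated by backpropagation through this loss.

```python
import numpy as np

rng = np.random.default_rng(0)

def l2_normalize(x, axis=-1):
    # unit-normalize embeddings so dot products are cosine similarities
    return x / (np.linalg.norm(x, axis=axis, keepdims=True) + 1e-9)

def fuse(text_emb, image_emb, gate_logits):
    # dynamic modality balancing: softmax over per-item gate logits
    # yields a learned weight for each modality (illustrative, not MOON2.0's exact gate)
    w = np.exp(gate_logits) / np.exp(gate_logits).sum(axis=1, keepdims=True)
    fused = w[:, :1] * text_emb + w[:, 1:] * image_emb
    return l2_normalize(fused)

def in_batch_contrastive_loss(query_emb, item_emb, temperature=0.05):
    # similarity matrix: diagonal entries are the positive (query, item) pairs,
    # off-diagonal entries serve as in-batch negatives
    logits = query_emb @ item_emb.T / temperature
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -float(np.mean(np.diag(log_probs)))

# toy batch: 4 items with 8-dim text and image embeddings (hypothetical sizes)
text = l2_normalize(rng.normal(size=(4, 8)))
image = l2_normalize(rng.normal(size=(4, 8)))
gates = rng.normal(size=(4, 2))

items = fuse(text, image, gates)
# queries simulated as noisy views of their matching items
queries = l2_normalize(items + 0.1 * rng.normal(size=items.shape))
loss = in_batch_contrastive_loss(queries, items)
print(f"contrastive loss: {loss:.4f}")
```

In a real end-to-end pipeline, the gate logits would come from a small network over the item features, so the balance between modalities adapts per item rather than being fixed by hand.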