The field of music transcription and retrieval is moving toward more efficient and effective models, with a focus on sparse attention mechanisms and lightweight architectures. Recent work shows that these approaches can reach state-of-the-art performance while reducing computational cost and memory usage. There is also growing interest in timbre-aware separation and retrieval, with novel associative memory mechanisms and contrastive learning frameworks being proposed. These advances could benefit applications such as music generation, instrument retrieval, and singing voice conversion. Noteworthy papers include "Efficient Transformer-Based Piano Transcription With Sparse Attention Mechanisms," which uses sparse attention to make piano transcription more efficient, and "Contrastive timbre representations for musical instrument and synthesizer retrieval," which applies contrastive learning to instrument and synthesizer retrieval.
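To make the contrastive-learning idea concrete, here is a minimal NumPy sketch of an InfoNCE-style objective of the kind such frameworks typically build on. The exact loss, encoder, and pairing strategy of the cited paper are not given here; this only illustrates the general mechanism of pulling embeddings of matched timbres together while pushing mismatched ones apart.

```python
import numpy as np

def info_nce_loss(anchors, positives, temperature=0.1):
    """InfoNCE-style contrastive loss on L2-normalized embeddings.

    anchors, positives: (N, D) arrays; row i of each forms a positive
    pair (e.g. two clips of the same instrument), while every other
    row in `positives` serves as a negative for anchor i.
    """
    a = anchors / np.linalg.norm(anchors, axis=1, keepdims=True)
    p = positives / np.linalg.norm(positives, axis=1, keepdims=True)
    logits = a @ p.T / temperature               # (N, N) cosine similarities
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_probs))          # correct pair sits on the diagonal

# Toy check: matched pairs should score a lower loss than random pairs.
rng = np.random.default_rng(0)
emb = rng.normal(size=(8, 16))
aligned = info_nce_loss(emb, emb)                        # identical positive pairs
shuffled = info_nce_loss(emb, rng.normal(size=(8, 16)))  # unrelated "positives"
```

In a retrieval setting, the trained encoder maps a query sound and a library of instrument or synthesizer patches into this embedding space, and retrieval reduces to nearest-neighbor search over cosine similarity.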