Advances in Music Information Retrieval and Generation

Research in music information retrieval and generation is evolving rapidly toward models that analyze, generate, and interpret music with greater nuance. Recent work takes multimodal approaches, combining audio, symbolic, and textual representations to capture the complexity of music. New datasets and benchmarks, such as those for music autotagging and sheet music reasoning, enable more comprehensive evaluation of model performance. Integrating music theory and cognitive insights into model design has improved performance and interpretability in tasks such as music genre classification and chord estimation. Graph neural networks and other advanced machine learning techniques have also shown promise on complex music analysis tasks.

Noteworthy papers in this area include AnalysisGNN, which introduces a unified framework for music analysis using graph neural networks, and PianoBind, which proposes a multimodal joint embedding model for pop-piano music. In addition, the WildScore benchmark has been introduced for evaluating multimodal symbolic music reasoning, and the PianoVAM dataset has been released for multimodal piano performance analysis.

Sources

CoComposer: LLM Multi-agent Collaborative Music Composition

Algorithms for Collaborative Harmonization

From Discord to Harmony: Decomposed Consonance-based Training for Improved Audio Chord Estimation

Music Genre Classification Using Machine Learning Techniques

Synthesizing Sheet Music Problems for Evaluation and Reinforcement Learning

PianoBind: A Multimodal Joint Embedding Model for Pop-piano Music

WildScore: Benchmarking MLLMs in-the-Wild Symbolic Music Reasoning

Exploring Situated Stabilities of a Rhythm Generation System through Variational Cross-Examination

Optical Music Recognition of Jazz Lead Sheets

AnalysisGNN: Unified Music Analysis with Graph Neural Networks

Benchmarking Music Autotagging with MGPHot Expert Annotations vs. Generic Tag Datasets

PianoVAM: A Multimodal Piano Performance Dataset
