Advances in Music Information Retrieval and Generation

Music information retrieval and generation is a rapidly evolving field centered on new methods for music modeling, transcription, and generation. Recent work applies deep learning techniques, such as transformer architectures and graph neural networks, to music analysis and generation tasks. There is growing interest in models that learn from large-scale datasets and generate high-quality music samples, alongside efforts to improve transcription accuracy, particularly for child-centered audio recordings and automatic lyrics transcription.

Several papers stand out. The correlation-permutation approach to speech-music encoder model merging builds unified audio models from independently trained encoders. LiLAC is notable for its lightweight, modular architecture for musical audio generation with fine-grained controls. Other papers, including those introducing the Fretting-Transformer and SonicVerse models, report significant advances in music transcription and captioning.
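To give a feel for permutation-based encoder merging, here is a minimal, hypothetical sketch (not the cited paper's actual method): hidden units of two independently trained encoders are matched by correlating their activations on shared probe inputs, one encoder's weights are permuted into the other's unit ordering, and the aligned weights are averaged. All names and dimensions below are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-ins for two independently trained single-layer encoders
# (hidden_dim=8, input_dim=4). Encoder B is a permuted, slightly
# perturbed copy of A, so a good alignment should exist.
W_a = rng.normal(size=(8, 4))
W_b = W_a[rng.permutation(8)] + 0.01 * rng.normal(size=(8, 4))

# Shared probe inputs; correlate hidden activations across the models.
X = rng.normal(size=(256, 4))
H_a, H_b = X @ W_a.T, X @ W_b.T
corr = np.corrcoef(H_a.T, H_b.T)[:8, 8:]  # 8x8 cross-correlation matrix

# Match each unit of A to its most correlated unit of B. Argmax per row
# suffices for this near-permutation toy case; real methods solve a
# proper assignment problem over the correlation matrix.
cols = corr.argmax(axis=1)
W_b_aligned = W_b[cols]

# Merge by averaging in the aligned space.
W_merged = 0.5 * (W_a + W_b_aligned)
print(np.abs(W_merged - W_a).max())  # small residual: merge stays close to A
```

Without the permutation step, naively averaging `W_a` and `W_b` would mix unrelated units and destroy both encoders' features; alignment is what makes the average meaningful.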

Sources

A correlation-permutation approach for speech-music encoders model merging

LiLAC: A Lightweight Latent ControlNet for Musical Audio Generation

Enabling automatic transcription of child-centered audio recordings from real-world environments

Reimagining Dance: Real-time Music Co-creation between Dancers and AI

Set theoretic solution for the tuning problem

Fretting-Transformer: Encoder-Decoder Model for MIDI to Tablature Transcription

SLEEPING-DISCO 9M: A large-scale pre-training dataset for generative music modeling

An Open Research Dataset of the 1932 Cairo Congress of Arab Music

Evolving music theory for emerging musical languages

Refining music sample identification with a self-supervised graph neural network

Adaptive Accompaniment with ReaLchords

SonicVerse: Multi-Task Learning for Music Feature-Informed Captioning

Exploiting Music Source Separation for Automatic Lyrics Transcription with Whisper

Diff-TONE: Timestep Optimization for iNstrument Editing in Text-to-Music Diffusion Models

Versatile Symbolic Music-for-Music Modeling via Function Alignment
