Diffusion Models for Generative Tasks

Generative modeling is increasingly built on diffusion models across a range of tasks, including image and audio generation, music synthesis, and material simulation. These models produce high-quality results while offering interpretability and control over the generation process. Researchers are also exploring diffusion transformers, hierarchical architectures, and self-supervised pre-training to improve both efficiency and sample quality; a minimal sketch of the denoising loop these methods share appears after the list below. Some noteworthy papers in this area include:

  • ProGress, which introduces a generative music framework combining Schenkerian analysis with graph diffusion for structured music generation.
  • Audio Palette, which presents a diffusion transformer with multi-signal conditioning for Foley synthesis with fine-grained acoustic control (a generic sketch of guidance-based conditioning also follows the list).
  • Hierarchical Koopman Diffusion, which achieves both one-step sampling and interpretable generative trajectories for image generation.
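
For readers unfamiliar with the mechanics these papers build on, here is a minimal sketch of standard DDPM ancestral sampling, the denoising loop underlying most diffusion models. The `toy_denoiser` is a placeholder for a trained noise-prediction network, and the schedule values are illustrative defaults, not taken from any paper above.

```python
import numpy as np

# Minimal DDPM-style ancestral sampling sketch (Ho et al., 2020 formulation).
# Illustrative only: the "denoiser" is a placeholder standing in for a trained
# network that predicts the noise eps added at timestep t.

T = 1000
betas = np.linspace(1e-4, 0.02, T)   # linear noise schedule (common default)
alphas = 1.0 - betas
alpha_bars = np.cumprod(alphas)      # cumulative product: \bar{alpha}_t

def toy_denoiser(x, t):
    """Placeholder for a trained eps-prediction network."""
    return np.zeros_like(x)          # a real model returns predicted noise

def sample(shape, rng):
    x = rng.standard_normal(shape)   # start from pure Gaussian noise x_T
    for t in reversed(range(T)):
        eps_hat = toy_denoiser(x, t)
        # Posterior mean: subtract the predicted noise, rescaled by the schedule.
        coef = betas[t] / np.sqrt(1.0 - alpha_bars[t])
        mean = (x - coef * eps_hat) / np.sqrt(alphas[t])
        if t > 0:
            x = mean + np.sqrt(betas[t]) * rng.standard_normal(shape)
        else:
            x = mean                 # no noise is added at the final step
    return x

rng = np.random.default_rng(0)
print(sample((8, 8), rng).shape)     # e.g., an 8x8 single-channel "image"
```

Fast samplers like the one-step generation in Hierarchical Koopman Diffusion replace this thousand-step loop with far fewer network evaluations, which is what makes them attractive in practice.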
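Where controllability is concerned, as in Audio Palette, classifier-free guidance is one widely used mechanism for steering a diffusion model with a conditioning signal. The sketch below shows only this generic technique; the specific multi-signal conditioning in Audio Palette may differ, and `toy_denoiser` is again a hypothetical placeholder.

```python
import numpy as np

# Classifier-free guidance: a common mechanism by which diffusion models
# expose user control. This is a generic sketch of the technique, NOT the
# multi-signal conditioning used by Audio Palette.

def toy_denoiser(x, t, cond):
    """Placeholder for a trained eps-prediction network.

    cond=None requests the unconditional prediction (such networks are
    trained with conditioning randomly dropped so they can serve both roles).
    """
    return np.zeros_like(x)  # a real model returns predicted noise

def guided_eps(x, t, cond, guidance_scale=3.0):
    """Blend conditional and unconditional predictions at each sampling step.

    guidance_scale = 1.0 recovers the plain conditional model; larger values
    push samples harder toward the condition, trading diversity for adherence.
    """
    eps_uncond = toy_denoiser(x, t, None)
    eps_cond = toy_denoiser(x, t, cond)
    return eps_uncond + guidance_scale * (eps_cond - eps_uncond)

x = np.zeros((8, 8))
print(guided_eps(x, t=500, cond="footsteps-on-gravel").shape)  # (8, 8)
```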

Sources

  • Generative Models for Helmholtz Equation Solutions: A Dataset of Acoustic Materials
  • ProGress: Structured Music Generation via Graph Diffusion and Hierarchical Music Analysis
  • Massive Activations are the Key to Local Detail Synthesis in Diffusion Transformers
  • Audio Palette: A Diffusion Transformer with Multi-Signal Conditioning for Controllable Foley Synthesis
  • Hierarchical Koopman Diffusion: Fast Generation with Interpretable Diffusion Trajectory
  • Advancing End-to-End Pixel Space Generative Modeling via Self-supervised Pre-training
  • Machine Learning-Based Ultrasonic Weld Characterization Using Hierarchical Wave Modeling and Diffusion-Driven Distribution Alignment
