Advances in Molecular Representation and Generation

Research on molecular representation and generation is advancing rapidly, with a focus on predicting polymer properties, generating novel molecules, and improving the accuracy of molecular models. Recent work explores multi-view representations, graph neural networks, and diffusion-based models to improve materials property prediction and the generation of valid molecules. There is also growing interest in frameworks for knowledge editing and molecule-text alignment, which can enhance the performance of molecular language models and support the discovery of new materials and drugs.

Notable papers in this area include:

Benchmarking GNNs for OOD Materials Property Prediction with Uncertainty Quantification, which presents a benchmark framework for evaluating graph neural networks on out-of-distribution (OOD) materials property prediction with uncertainty quantification.

Chain-of-Generation: Progressive Latent Diffusion for Text-Guided Molecular Design, which proposes a training-free, multi-stage latent diffusion framework for text-guided molecular design.

ChemFixer: Correcting Invalid Molecules to Unlock Previously Unseen Chemical Space, which introduces a framework for correcting invalid molecules into valid ones, expanding the diversity of potential drug candidates.

MiAD: Mirage Atom Diffusion for De Novo Crystal Generation, which demonstrates a simple yet powerful technique that lets diffusion models change the state of atoms in a crystal during generation.

Sources

Multi-View Polymer Representations for the Open Polymer Prediction

Benchmarking GNNs for OOD Materials Property Prediction with Uncertainty Quantification

Chain-of-Generation: Progressive Latent Diffusion for Text-Guided Molecular Design

RTMol: Rethinking Molecule-text Alignment in a Round-trip View

MolEdit: Knowledge Editing for Multimodal Molecule Language Models

ChemFixer: Correcting Invalid Molecules to Unlock Previously Unseen Chemical Space

The Tokenization Bottleneck: How Vocabulary Extension Improves Chemistry Representation Learning in Pretrained Language Models

MiAD: Mirage Atom Diffusion for De Novo Crystal Generation

Tokenisation over Bounded Alphabets is Hard
