Power systems research is moving toward new methods for generating synthetic data, driven by the need to address privacy concerns and to improve the accuracy of data-driven energy system models. Recent studies evaluate the trade-offs between privacy and utility in synthetic data generation, with particular emphasis on generative models such as diffusion models and GANs. These models have shown promising results in generating high-fidelity synthetic data for applications including power grid optimization and carbon emissions reduction. Notably, integrating domain-specific knowledge into these models has been identified as a key factor in improving their performance.
Notable papers in this area include: Evaluating Privacy-Utility Tradeoffs in Synthetic Smart Grid Data, which highlights the potential of structured generative models for developing privacy-preserving data-driven energy systems; Domain-Constrained Diffusion Models to Synthesize Tabular Data, which proposes a guided diffusion model that integrates domain constraints directly into the generative process; CausalDiffTab, which introduces a diffusion-based generative model specifically designed to handle mixed-type tabular data; and PGLib-CO2, which provides an open-source extension to the widely adopted PGLib-OPF test case library, enriching standard network cases with CO2 and CO2-equivalent emission intensity factors.
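To make the idea of an emission intensity factor concrete, the following minimal sketch shows how such a factor might be combined with a generator dispatch to estimate emissions. It is an illustration only: the `Generator` class, the `hourly_emissions` function, the generator names, and all numeric values are hypothetical placeholders, not data or an API from PGLib-CO2 or PGLib-OPF.

```python
from dataclasses import dataclass


@dataclass
class Generator:
    name: str
    dispatch_mw: float          # active power output in MW (hypothetical value)
    intensity_t_per_mwh: float  # emission intensity in tCO2/MWh (placeholder, not from PGLib-CO2)


def hourly_emissions(generators):
    """Return total emissions in tonnes of CO2 for one hour at the given dispatch."""
    return sum(g.dispatch_mw * g.intensity_t_per_mwh for g in generators)


# Hypothetical three-unit fleet; the intensities are round numbers chosen for readability.
fleet = [
    Generator("coal_1", 100.0, 1.0),
    Generator("gas_1", 200.0, 0.5),
    Generator("wind_1", 50.0, 0.0),
]

total = hourly_emissions(fleet)
# System-average intensity: total emissions divided by total energy produced in the hour.
avg_intensity = total / sum(g.dispatch_mw for g in fleet)
print(f"{total:.1f} tCO2 per hour, {avg_intensity:.3f} tCO2/MWh on average")
```

With per-generator factors attached to a network case, the same weighted sum could in principle serve as an objective or a constraint in an optimal power flow formulation, although how PGLib-CO2 is meant to be used in that way is not described here.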