Abstract
Recent advances in generative AI offer promising solutions for synthetic data generation, but they often rely on large datasets for effective training. To address this limitation, we propose a novel generative model that learns from limited data by incorporating physical constraints into a Variational Autoencoder (VAE) framework. Specifically, we extend the VAE with a physics-based generator that captures the underlying dynamics, while unmodeled dynamics are learned by a latent Gaussian Process VAE (GPVAE) component. We further introduce a regularization term that balances the physical model against the data-driven discrepancy, promoting both interpretability and fidelity to real-world observations. We evaluate the proposed method on both real and simulated data, demonstrating that the Physics-Informed GPVAE (PIGPVAE) outperforms state-of-the-art methods in terms of diversity and accuracy of the generated samples, even under small-data conditions. Additionally, we show that PIGPVAE can produce realistic samples beyond the observed distribution, highlighting its robustness and usefulness under distribution shifts.