Diffusion-Augmented LSTM for Improved Generalization in Cross-Site Solar Power Forecasting

DEMİR, AHMET; NECATİ, ATABERK

doi:10.1155/etep/6970304

DEMİR A., NECATİ A.

INTERNATIONAL TRANSACTIONS ON ELECTRICAL ENERGY SYSTEMS, vol.2026, no.1, 2026 (SCI-Expanded, Scopus)

Publication Type: Article / Full Article
Volume: 2026 Issue: 1
Publication Date: 2026
DOI Number: 10.1155/etep/6970304
Journal Name: INTERNATIONAL TRANSACTIONS ON ELECTRICAL ENERGY SYSTEMS
Journal Indexes: Science Citation Index Expanded (SCI-EXPANDED), Scopus
Eskişehir Osmangazi University Affiliated: Yes

Abstract

Accurate predictions of photovoltaic (PV) power output are essential for operating renewable energy systems effectively under fluctuating environmental conditions. This study investigates whether diffusion-based data augmentation can improve the generalization of a deliberately simple LSTM forecasting backbone for solar power prediction. Two public datasets were used: a short-term cross-site dataset from two Indian PV plants and a long-term multiyear dataset from the Desert Knowledge Australia (DKA) Solar Center. In both cases, the data were preprocessed and converted into supervised learning sequences. The main methodological contribution of this study is the use of a denoising diffusion probabilistic model (DDPM) as a training-data augmentation mechanism, rather than the proposal of a new forecasting architecture. The DDPM generates synthetic training sequences intended to improve the forecaster's generalization across sites. For each dataset, we compare a baseline LSTM trained only on real data with an augmented LSTM trained on a mixture of real and DDPM-generated sequences. Specifically, in the Indian dataset, the model forecasts the plant-level daily energy yield at 15-min resolution, whereas in the DKA dataset, it predicts active power. Quantitative assessments on both datasets confirm that the diffusion-augmented model achieves statistically significant improvements. For the cross-site task, root-mean-squared error (RMSE) was reduced from 13.01 ± 0.85 to 12.02 ± 0.18 kWh (p = 0.003). For the long-term seasonal task, RMSE decreased from 44.71 ± 11.71 to 36.78 ± 0.17 kW, indicating a strong practical improvement and much greater stability. Statistical significance was confirmed for MAE (p = 0.003) and MAPE (p < 0.001). These findings indicate that diffusion-based augmentation can improve the generalization and stability of the selected LSTM predictor under cross-site and seasonal shifts. The results should be interpreted as evidence for the augmentation strategy under a fixed backbone, not as a claim of a state-of-the-art forecasting architecture.
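
As a rough illustration of two steps the abstract mentions, the sketch below slices a series into supervised learning sequences and applies the standard closed-form DDPM forward (noising) step to those windows. It is not the authors' implementation: the function names, the lookback of 24 steps, the 1000-step linear noise schedule, and the toy sine series are all assumptions chosen for illustration, and the learned reverse (denoising) network that would actually generate synthetic sequences is omitted.

```python
import numpy as np


def make_windows(series, lookback, horizon=1):
    """Slice a 1-D series into (input window, target) pairs."""
    series = np.asarray(series, dtype=np.float64)
    n = len(series) - lookback - horizon + 1
    if n <= 0:
        raise ValueError("series too short for the requested lookback/horizon")
    X = np.stack([series[i:i + lookback] for i in range(n)])
    y = np.stack([series[i + lookback:i + lookback + horizon] for i in range(n)])
    return X, y


def linear_beta_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Linear variance schedule (assumed values, not taken from the paper)."""
    return np.linspace(beta_start, beta_end, num_steps)


def forward_noise(x0, t, alpha_bar, rng):
    """Sample x_t from q(x_t | x_0) = N(sqrt(abar_t) * x0, (1 - abar_t) * I).

    Returns the noised sample and the noise that was added; the latter is
    the usual regression target for a DDPM denoising network.
    """
    eps = rng.standard_normal(x0.shape)
    a = alpha_bar[t]
    return np.sqrt(a) * x0 + np.sqrt(1.0 - a) * eps, eps


# Toy stand-in for a normalized PV signal (hypothetical data).
rng = np.random.default_rng(0)
series = np.sin(np.linspace(0.0, 8.0 * np.pi, 200))

X, y = make_windows(series, lookback=24, horizon=1)

betas = linear_beta_schedule(1000)
alpha_bar = np.cumprod(1.0 - betas)  # cumulative product of (1 - beta_t)

# Noise every window to an intermediate diffusion step.
x_t, eps = forward_noise(X, t=500, alpha_bar=alpha_bar, rng=rng)
```

In an augmentation setup of the kind described, sequences sampled from the trained reverse process would be concatenated with the real windows before training the LSTM; the mixing ratio is a design choice the abstract does not specify.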