Over the past half-century, numerous optimization models have been developed to help hydropower producers determine optimal schedules at different scheduling levels. In a hydro-dominated power system such as Norway's, hydro scheduling optimization tools are critically important for the efficient use of water resources, where even a tiny increase in energy conversion efficiency matters. This project does not intend to improve or replace any existing optimization tools. Instead, it focuses on how these tools are used for advanced decision support.
At present, no matter how sophisticated the optimization tools are, hydropower operators must manually set up the executive commands before running the optimization models. The choice of commands is usually based on the operators’ personal experience or on the model developers’ general suggestion, i.e., the default setting. This manual setup limits the power of the optimization tools. Their full value (e.g., increased profit or reduced computational time) could be exploited if the commands were determined dynamically according to the operating and market conditions of the hydro systems. However, selecting the optimal commands by hand is not feasible, given the thousands of complex constraints and coupling variables and the millions of input data points.
The primary objective of this project is to facilitate the decision-making process for hydropower producers by realizing the automatic setup of executive commands before running the optimization tools. This automation is achieved by integrating machine learning (ML) techniques with a comprehensive understanding of the hydro systems and optimization models.
To achieve the primary objective, we followed the Gantt chart and the planned main activities. In 2020, we developed the algorithms and tested ML models on hydro test systems. Since 2021, the methodologies have been applied to real-world hydro systems. After more than 70 rounds of iteratively generating datasets, testing ML models, and validating the results, we achieved the following goals:
- All participating industrial partners identify their real-world hydro scheduling problems and provide historical data on operating and market conditions covering 11 months to 8 years, i.e., inflow, market price, initial water level, and end water value for each reservoir.
- SINTEF determines the commands needed to solve each specific problem, calculates the best weighted result, and generates the datasets for each partner. A dataset includes not only the historical input data for a given hydro system but also the command setting that gives the best solution, i.e., the highest weighted value of the normalized objective value, calculation time, or non-physical spill.
- NTNU investigates pre-processing techniques to reduce the number of features (i.e., the number of input columns) and prepares the datasets for the ML models. Each dataset is then split into a training set (75% of the entire dataset) and a testing set (25%). To handle the imbalanced class distribution, we use the “imbalanced-learn” library to make the class distribution in the two subsets roughly the same, either by generating new, synthetic cases or by reducing the presence of the majority class. Different supervised learning models, such as Random Forest Classifier, k-Nearest Neighbours, Multi-layer Perceptron, Support Vector Machines, Naive Bayes, AdaBoost Classifier, and histogram-based gradient boosting, are tested on the training dataset. Balanced accuracy is used to compare the ML models for each dataset. The TPOT library is used to search for the best combination of pre-processing, ML model, and their parameters. We then use the selected ML model to predict the command setting for the testing dataset.
- SINTEF uses the testing datasets to evaluate performance. We run the scheduling tool with the default setting and with the command setting predicted by ML, and then calculate the percentage difference between the two results. The results demonstrate that performance on the highest-weighted purpose can be improved by using the command predicted by ML. For example, 4 of the 6 industrial partners require avoiding non-physical spill. Compared with the default setting, the command predicted by ML prevents 25% to 100% of the non-physical spill.
- SINTEF develops an interactive Jupyter notebook in which all industrial partners can test their own hydro systems, understand the ML models, and visualize the results. It has been merged with the SINTEF AI lab. The notebook greatly enhances the interaction between industrial partners and R&D researchers, provides a solid basis for developing other intelligent decision-making processes, and can be used for educational purposes by professors and students at NTNU.
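
The labelling step, in which each historical case is assigned the command setting with the highest weighted value, can be illustrated with a minimal sketch. It is not the project's implementation: the function name `best_command`, the min-max normalization, the weighted-sum form, the weights, and the example numbers are all assumptions made for illustration. The sketch also assumes that a higher objective value is better, while lower calculation time and lower non-physical spill are better.

```python
import numpy as np


def best_command(objective, calc_time, spill, weights):
    """Return the index of the candidate command setting with the highest
    weighted score, together with all scores (hypothetical helper)."""

    def normalize(values, higher_is_better):
        # Min-max scale to [0, 1]; flip the scale when lower values are better.
        values = np.asarray(values, dtype=float)
        span = values.max() - values.min()
        if span == 0:
            return np.ones_like(values)
        scaled = (values - values.min()) / span
        return scaled if higher_is_better else 1.0 - scaled

    score = (
        weights["objective"] * normalize(objective, True)
        + weights["time"] * normalize(calc_time, False)
        + weights["spill"] * normalize(spill, False)
    )
    return int(np.argmax(score)), score


# Three candidate command settings for one historical case (made-up numbers).
best, score = best_command(
    objective=[100.0, 101.0, 100.5],   # objective value of each run
    calc_time=[60.0, 120.0, 90.0],     # calculation time in seconds
    spill=[5.0, 0.0, 2.0],             # non-physical spill
    weights={"objective": 0.2, "time": 0.2, "spill": 0.6},
)
print(best, score)
```

With these illustrative weights, avoiding non-physical spill dominates, so the second candidate (index 1) is selected even though it is the slowest to compute.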
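
The model-comparison step can be sketched with scikit-learn as follows. This is a simplified illustration under stated assumptions, not the project's pipeline: the data are synthetic (generated with `make_classification`), only three of the listed classifiers are compared, all hyperparameters are left near their defaults, and the resampling with imbalanced-learn and the TPOT search are omitted. The 75%/25% split and the use of balanced accuracy follow the description above.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import balanced_accuracy_score
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier

# Synthetic, imbalanced stand-in for a partner dataset (not project data).
X, y = make_classification(
    n_samples=400, n_features=10, n_informative=5,
    weights=[0.8, 0.2], random_state=0,
)

# 75% training / 25% testing; stratify keeps the class proportions similar.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# A small subset of the model families named in the text.
candidates = {
    "random_forest": RandomForestClassifier(n_estimators=50, random_state=0),
    "knn": KNeighborsClassifier(),
    "naive_bayes": GaussianNB(),
}

# Compare candidates by cross-validated balanced accuracy on the training set.
cv_scores = {
    name: cross_val_score(
        model, X_train, y_train, cv=5, scoring="balanced_accuracy"
    ).mean()
    for name, model in candidates.items()
}
best_name = max(cv_scores, key=cv_scores.get)

# Refit the selected model and predict the held-out testing set.
best_model = candidates[best_name].fit(X_train, y_train)
test_score = balanced_accuracy_score(y_test, best_model.predict(X_test))
print(best_name, round(test_score, 3))
```

Balanced accuracy averages the recall over all classes, so a model that always predicts the majority command setting scores no better than chance. Plain accuracy would not penalize that behaviour on an imbalanced dataset.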