Using deep machine learning techniques, this project explored whether more accurate predictions of solar electricity generation could reduce the amount of "spinning reserve" required. This would reduce carbon emissions and costs to end-users, as well as increase the amount of solar generation the grid can handle.
Benefits
Not required.
Learnings
Outcomes
A fully operational PV Nowcasting service running on two ML models:
1. PVNet for 0-6 hours
2. A blend of PVNet and National_xg for 6-8 hours, and National_xg alone from 8 hours onward.
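The blend between the two models is horizon-dependent. As a minimal sketch of that general approach (not the production weighting, which is not specified here), the snippet below cross-fades linearly from PVNet to National_xg between 6 and 8 hours:

```python
import numpy as np

def blend_forecasts(pvnet_mw: np.ndarray, national_xg_mw: np.ndarray,
                    horizons_hours: np.ndarray) -> np.ndarray:
    """Blend two national forecasts as a function of forecast horizon.

    Minimal sketch (not the production weighting): PVNet alone up to 6 hours,
    National_xg alone beyond 8 hours, linear cross-fade between 6 and 8 hours.
    """
    # Weight on PVNet: 1 up to 6 h, falling linearly to 0 at 8 h.
    w_pvnet = np.clip((8.0 - horizons_hours) / 2.0, 0.0, 1.0)
    return w_pvnet * pvnet_mw + (1.0 - w_pvnet) * national_xg_mw

# Example: half-hourly horizons from 0 to 8 hours, with illustrative values in MW.
horizons = np.arange(0, 8.5, 0.5)
pvnet = np.linspace(5000, 3000, horizons.size)
national_xg = np.linspace(5200, 2800, horizons.size)
blended = blend_forecasts(pvnet, national_xg, horizons)
```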
Accuracy improved by approximately 30% over the previous OCF model for the GSP and National forecasts (4-8 hours), resulting in forecasts approximately 40% more accurate than the Balancing Mechanism Reporting Service (BMRS) model and over 40% more accurate than the PEF forecast (for 0-8 hours).
Probabilistic forecasts for all horizons:
1. Backtest runs for DRS project
2. UI including a new Delta view, dashboard view and probabilistic display
3. UI speedup, with query times reduced from 20 seconds to under 1 second
The most significant of these was the achievement of the target set by NESO of a 20% reduction in mean absolute error (MAE). A reduction of this size is extremely large in renewable forecasting and was the result of numerous machine learning improvements.
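For reference, the snippet below is a minimal sketch of how such a reduction would be measured: MAE is the average magnitude of the forecast error, and the target is met when the new model's MAE over a common evaluation period is at least 20% lower than the baseline's. The values shown are illustrative, not project results.

```python
import numpy as np

def mae(actual_mw: np.ndarray, forecast_mw: np.ndarray) -> float:
    """Mean absolute error: the average magnitude of the forecast error."""
    return float(np.mean(np.abs(actual_mw - forecast_mw)))

# Illustrative values only.
actual = np.array([4200.0, 5100.0, 4800.0, 3900.0])
baseline_forecast = np.array([4600.0, 4700.0, 5200.0, 3500.0])
new_forecast = np.array([4350.0, 4950.0, 5000.0, 3750.0])

reduction = 1 - mae(actual, new_forecast) / mae(actual, baseline_forecast)
print(f"MAE reduction vs baseline: {reduction:.0%}")  # target: >= 20%
```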
A literature review of NWP ensembles was carried out to identify common trends and methodologies for creating solar PV forecasts from ensemble weather data. The research showed that incorporating NWP ensemble data improved deterministic forecast accuracy by 5% and reduced large forecasting errors by up to 12%. The ensemble data has not yet been integrated into the operational PV Nowcasting service, as the extension focused only on researching and developing a methodology for combining ensemble forecasts into a probabilistic forecast.
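A common way to turn NWP ensemble members into a probabilistic PV forecast, and the general shape of the methodology investigated here, is to run the PV model once per ensemble member and take empirical quantiles of the resulting forecasts. The sketch below illustrates this; run_pv_model is a hypothetical stand-in for a model such as PVNet, not an actual OCF API.

```python
import numpy as np

def ensemble_to_probabilistic(nwp_members: list, run_pv_model) -> dict:
    """Turn NWP ensemble members into a probabilistic PV forecast.

    Minimal sketch of the general methodology: run the PV model once per
    ensemble member, then take empirical quantiles across the member forecasts.
    `run_pv_model` is a hypothetical callable mapping one member's weather
    data to a PV generation forecast (MW) per forecast step.
    """
    member_forecasts = np.stack([run_pv_model(member) for member in nwp_members])
    return {
        "p10": np.quantile(member_forecasts, 0.10, axis=0),
        "p50": np.quantile(member_forecasts, 0.50, axis=0),
        "p90": np.quantile(member_forecasts, 0.90, axis=0),
    }
```

Each quantile then gives a plausible lower, central, and upper PV generation trajectory for every forecast horizon.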
Lastly, the forecast from Open Climate Fix was delivered completely open and documented. Resilience was significantly increased over the project duration, resulting in over 99.5% availability. This resilience was implemented for NESO, with all the infrastructure constructed as code to allow replicability.
Lessons Learnt
In the course of WP1 and WP2, the project identified the following lessons:
1. Never underestimate the importance of cleaning and checking data in advance: Several approaches to loading data were tried, from loading on-the-fly to pre-preparing batches, and automatic and visual tests of the data were instituted to ensure the various data sources were always lined up correctly. The most effective approach was to pre-prepare thousands of batches and save each batch as a separate file on disk. This allowed rapid loading of these files during training, resulting in a 12x increase in training speed compared to loading data on-the-fly (a minimal sketch of this approach follows this list).
2. Having infrastructure as code allows the main production service to run uninterrupted: Code that can easily instantiate infrastructure is very useful for managing environments efficiently and bringing the algorithm into productive use. The Terraform software tool was used, which made spinning environments up (and down) easy and repeatable. Being able to spin up new environments allowed the project to test new features in development environments while the main production service kept running uninterrupted.
3. Using microservices to “start simple and iterate” accelerates development: A microservice architecture allowed the project to upgrade individual components as benefits were identified, independently of changing other components’ behaviour. This was very useful when building out the prototype service, as it allowed the project team to start with a simple architecture - even a trivial forecast model - and iteratively improve the functionality of the components. For example, starting out with a single PV data provider allowed the project to get a prototype working, and in WP3 an additional PV provider will be onboarded.
4. Data processing may take longer than expected: While it was initially planned to extend our dataset back to 2016 for all data sources during WP2, it turned out that data processing took much longer than expected. This did not have a direct impact on project deliverables but is something to consider in further ML research.
5. Data validation is important: For both ML training and inference, using clear and simple data validation builds trust in the data. This helps build a reliable production system and keeps software bugs to a minimum.
6. Engaging specialist UX/UI skills is important: By acknowledging that UX and UI design is a specialised area and incorporating those skills, a UI was developed that is easier to use and conveys information effectively. This was validated over WP3 through working with the end users (control room engineers).
7. Building our own hardware demonstrates value for money but may pose other challenges for a small team: Two computers with a total of six GPUs were built during the project, and it was estimated that using on-premises hardware instead of the cloud for data-heavy, GPU-heavy machine learning R&D can significantly reduce direct costs. However, the time required for a small team to put together all the components was significant (approximately 25 person-days in total). While the total costs would still be lower, appropriate resource planning should be considered for any future hardware upgrades.
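As referenced in lesson 1, the sketch below shows the pre-prepared-batches approach in minimal form, assuming NumPy-format batches and a hypothetical dataset.make_batch helper; the project's actual pipeline uses its own batch format and tooling.

```python
from pathlib import Path
import numpy as np

BATCH_DIR = Path("prepared_batches")  # hypothetical location

def prepare_batches(dataset, n_batches: int) -> None:
    """Do the expensive alignment/processing once and save each batch to disk."""
    BATCH_DIR.mkdir(exist_ok=True)
    for i in range(n_batches):
        batch = dataset.make_batch(i)  # hypothetical: aligns NWP, satellite and PV data
        np.savez_compressed(BATCH_DIR / f"batch_{i:05d}.npz", **batch)

def load_batches():
    """During training, read the prepared files back, which is far faster than
    re-processing the raw data on-the-fly each epoch."""
    for path in sorted(BATCH_DIR.glob("batch_*.npz")):
        with np.load(path) as data:
            yield {key: data[key] for key in data.files}
```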
In the course of WP3 and WP1 (extension), the project identified the following lessons:
1. Merging code promptly when performing frontend testing is of utmost importance: Leaving code unmerged until after frontend testing proved time-consuming, and this should be factored in when planning tests.
2. Large Machine Learning models are harder to productionise: Large ML models proved difficult to productionise due to their substantial compute, memory, latency and operational requirements, which increase deployment cost and system complexity. The size of these models introduces inference latency, hardware dependencies and scalability constraints, making them difficult to integrate into real-time, cost-sensitive production environments. Ongoing advances in model efficiency, compression techniques, specialised hardware and Machine Learning Operations (MLOps) tooling are expected to significantly reduce these barriers, enabling more practical and scalable deployment of large models in the future.
3. Machine Learning training always takes longer than expected: Even with a pre-existing model, making the data pipelines work correctly took time. It is important to allocate enough time when planning ML training activities.
4. Security and authentication are hard: Ensuring robust authentication and security measures were in place was harder than envisaged. It may be easier to use pre-built packages or to contract third-party providers to support the process.
5. Separate National Forecast Model: PVLive's estimate of national solar generation does not equal the sum of PVLive's GSP generation estimates. This motivated the project to build a separate National forecast rather than summing the GSP forecasts.
6. Investment is needed to take open-source contributions to the next level: Time and resources are needed to engage with open-source contributors and develop an active community. We may want to consider hiring an additional resource to support this activity.
Lessons from WP2 (extension) were as follows:
1. Storage disks of expensive cloud machines accrue costs while idle: GCP/AWS machines were used for R&D, and they were often paused when not in immediate use because some GPU machines cost significant amounts per hour. It was discovered that costs still accrue for the disks (storage) of paused machines. There is no golden rule for balancing pausing machines (so they can be restarted quickly) against creating a cloud machine from scratch each time, but it is a trade-off worth being aware of.
2. Maintaining active communication with NESO was challenging: High turnover in the NESO forecasting team, in particular, affected communication on the project. This improved over the duration of the project, with more active and easier communication observed in the latter phase.
3. Reproducibility on cloud vs local servers: When results differ between cloud and local servers, it can be tricky to determine the cause. Verbose logging, saving intermediate data, and maintaining consistent package versions and setup on both machines helped. One particular bug involved results differing because multiple CPUs were used locally but only one CPU was used in the cloud (a small environment-logging sketch follows this list).
4. Protection of production data: Two environments in the cloud, “development” and “production,” are maintained to protect the “production” data. This setup allowed developers to access the “development” environment, where changes do not affect the live service. Although maintaining two environments increases costs, it was considered worthwhile.
5. Probabilistic forecasts: Some unreleased open-source packages were used to implement one of the probabilistic forecasts. The advantage of using this code before its release was noted, but it also means more thorough checks for bugs are required, which can take more time.
6. Leap years and clock changes: Clock changes caused no problems because UTC is used throughout, but small pieces of code broke on the leap-year day and needed more testing and checking.
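As noted in lesson 3, one simple practice that helped diagnose cloud-versus-local differences was logging the runtime environment alongside each run. The sketch below is a minimal example of such logging; the package list is illustrative.

```python
import logging
import os
import platform
import sys
from importlib.metadata import version

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("environment")

def log_environment(packages=("numpy", "pandas", "torch")) -> None:
    """Log the details that most often explain cloud-vs-local differences."""
    log.info("Python %s on %s", sys.version.split()[0], platform.platform())
    log.info("CPU count: %s", os.cpu_count())  # a multi-CPU vs single-CPU mismatch caused one bug
    for pkg in packages:
        try:
            log.info("%s==%s", pkg, version(pkg))
        except Exception:
            log.info("%s not installed", pkg)

log_environment()
```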
Further lessons from the NWP ensemble model development work package (extension) were as follows:
1. Clear project goals are important from the outset: Clarifying project goals early, even in research projects with high uncertainty, prevents scope creep and wasted effort. It ensures everyone is aligned on what success looks like from the outset.
2. Schedule enough time for obtaining data: Obtaining ECMWF data proved challenging due to its substantial size (over 100GB per day) and the considerable time it takes to download, which can range from over 60 hours for a month's archive to as much as 120 hours depending on the archival system's workload. It is therefore important to schedule enough time to obtain data in the early stages of project planning.
3. Optimising Data for Efficiency: Streamlining the number of variables used in the model can significantly cut data download times and storage requirements. A feature-importance study pinpointed 6 out of 12 variables that collectively hold most of the information PVNet needs for forecasting, enabling both download time and storage to be halved by providing regular deterministic data for the remaining variables during inference (a sketch of variable subsetting follows this list).
4. Importance of Visualisation for Ensemble Data: The multifaceted nature of ensemble data makes it difficult to quickly grasp. Various visualisation techniques are essential for comprehending general trends, spread, and areas of uncertainty.
5. National ensemble-based forecasts are achievable within reasonable timeframes: Running inference for 50 forecast versions for a single Grid Supply Point (GSP) took approximately 5 seconds, following an initial setup period of under 6 minutes. This suggested that a national ensemble-based forecast (approximately 5 seconds for each of the roughly 300 GSPs, plus setup) could be produced in around 25 minutes.
6. Investing in foundational tools streamlined the entire research and development process: Ensuring enough time is allocated to developing the right tools, such as ocf-data-sampler or new training models, accelerates future development and improves overall efficiency.
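As mentioned in lesson 3, the sketch below illustrates selecting a subset of NWP variables with xarray before saving, which is one way to realise the download and storage savings described; the dataset path and variable names are placeholders, not the exact set identified by the feature-importance study.

```python
import xarray as xr

# Placeholder names: the real study identified 6 of 12 NWP variables that
# carry most of the information PVNet needs for forecasting.
SELECTED_VARIABLES = [
    "downward_shortwave_radiation",
    "cloud_cover_total",
    "temperature_2m",
    "cloud_cover_low",
    "cloud_cover_high",
    "visibility",
]

# Keeping only the selected variables roughly halves storage and download
# time compared with retaining all 12 variables for every ensemble member.
ds = xr.open_dataset("ecmwf_ensemble.zarr", engine="zarr")  # hypothetical path
ds_subset = ds[SELECTED_VARIABLES]
ds_subset.to_zarr("ecmwf_ensemble_subset.zarr", mode="w")
```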