Currently a significant number of assets cannot participate in Frequency Response services because it is difficult to demonstrate frequency response delivery separately from the delivery of other services. This reduces competition in the markets, so the cost paid for response services is higher than it would be if these assets could participate.
This project will investigate analysis techniques and develop an algorithm to validate Response delivery from a large number of these assets which are unable to use conventional metering solutions.
This should enable service providers to participate in Dynamic Response markets with these assets, using data processing to separate dynamic response service delivery from other energy recorded by the meter.
Benefits
- Significant increase in assets able to participate in services (initially estimated at 443 MW, rising to 672-1126 MW in 2025), greatly increasing competition and liquidity and ultimately lowering Response procurement costs (currently approximately £20m/month)
- Reduced barriers to entry supporting overall frequency market ambitions
- A globally industry-leading solution based on data processing innovation, which could later be adapted to unlock capacity in other markets
- Reduced requirement and dependency on constrained new connections with benefits to ESO and wider industry
Learnings
Outcomes
The final outcome of the project is a proof-of-concept implementation of the process, algorithmic model and scoring system that enable ESO to identify gaming in the submission of post-event baselines.
The resulting tool, written in Python, flags suspicious unit behaviour to ESO, highlighting points for further investigation. Specific features include:
- A data collation module for fetching and processing necessary data
- Algorithms for running gaming checks
- A metric that aggregates scores across all checks into a single value, indicating the likelihood of gaming
- Demos with visualisations of checks and scores
The finalised checks to be implemented included:
- Correlation between baseline and ideal response
An ideal response is how the unit should respond to frequency deviations under its contract. The baseline should be independent of the ideal response, so a correlation between the two could indicate gaming.
- Difference between reported and expected baseline
While this project is about allowing participation from providers for whom submitting fixed baselines an hour ahead is difficult, we can still check whether the difference between reported and expected baselines is characteristic or unusual, as an indicator of potential gaming.
- Difference between reported and expected active power
This check uses anomaly detection methods to identify subtle changes in behaviour.
- Difference between reported active power and metered active power
Reported active power (PM) and metered active power (OM) should be equal most of the time (minor differences may be due to measurement error). Significant, consistent or regularly-occurring differences could indicate gaming behaviour.
- Unavailability checks
These checks use anomaly detection methods to identify cases where declarations of unavailability indicate systemic, deliberate gaming.
- Correlation with market prices
Prices in the wholesale electricity market should not affect delivery. However, units may be incentivised to alter their response delivery to take advantage of wholesale prices, so this correlation should be considered alongside the other gaming checks.
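Two of the checks listed above can be illustrated with a minimal sketch. This is not the project's implementation: the linear response curve, the `full_delivery_dev_hz` and `tolerance_mw` parameters, the function names, and the mapping of each check to a 0 to 1 score are all assumptions made for illustration.

```python
import math


def ideal_response(freq_hz, contracted_mw, nominal_hz=50.0, full_delivery_dev_hz=0.5):
    """Hypothetical linear response curve (an assumption, not a contract curve).

    Positive output means the unit should export more (low frequency).
    """
    response = []
    for f in freq_hz:
        fraction = (nominal_hz - f) / full_delivery_dev_hz
        fraction = max(-1.0, min(1.0, fraction))  # saturate at full delivery
        response.append(contracted_mw * fraction)
    return response


def pearson(x, y):
    """Pearson correlation coefficient; returns 0.0 if either series is constant."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("series must have equal length of at least 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def baseline_correlation_score(baseline_mw, freq_hz, contracted_mw):
    """Score in [0, 1]: absolute correlation between baseline and ideal response."""
    r = pearson(baseline_mw, ideal_response(freq_hz, contracted_mw))
    return min(1.0, abs(r))  # clamp to guard against floating-point overshoot


def power_mismatch_score(reported_mw, metered_mw, tolerance_mw=0.1):
    """Score in [0, 1]: fraction of samples where |PM - OM| exceeds the tolerance."""
    if len(reported_mw) != len(metered_mw) or not reported_mw:
        raise ValueError("series must be non-empty and of equal length")
    exceed = sum(
        1 for pm, om in zip(reported_mw, metered_mw) if abs(pm - om) > tolerance_mw
    )
    return exceed / len(reported_mw)


# Synthetic example data (invented for illustration only).
freq = [50.0, 49.9, 50.1, 49.8, 50.2]
flat_baseline = [5.0] * len(freq)
tracking_baseline = [-r for r in ideal_response(freq, contracted_mw=10.0)]

flat_score = baseline_correlation_score(flat_baseline, freq, 10.0)
tracking_score = baseline_correlation_score(tracking_baseline, freq, 10.0)
mismatch = power_mismatch_score([1.0, 1.0, 1.0, 1.0], [1.0, 1.05, 2.0, 1.0])
```

In this sketch a flat baseline scores zero, while a baseline that mirrors the ideal response scores close to one. Taking the absolute value of the correlation treats both directions as suspicious, which is a simplification.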
The method uses a combination of the above checks. The output Aggregated Gaming Scores are values between 0 and 1 representing, for each date and time, how suspicious the unit's behaviour is based on its data. Depending on data collation time (e.g., API connection strength), checks take 1-2 minutes to run. The Aggregated Gaming Score computation takes less than 5 minutes.
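The source does not specify how the individual check scores are combined. As a hedged illustration only, the sketch below assumes a weighted mean of per-check scores and a fixed flagging threshold. The function names, check names, weights and the 0.7 threshold are all hypothetical.

```python
def aggregate_gaming_score(check_scores, weights=None):
    """Combine per-check scores (each in [0, 1]) into one value in [0, 1].

    Assumption: a weighted mean. Equal weights are used when none are given.
    """
    if not check_scores:
        raise ValueError("at least one check score is required")
    if weights is None:
        weights = {name: 1.0 for name in check_scores}
    weighted_sum = 0.0
    total_weight = 0.0
    for name, score in check_scores.items():
        if not 0.0 <= score <= 1.0:
            raise ValueError(f"score for {name!r} must lie in [0, 1]")
        weight = weights.get(name, 0.0)  # checks without a weight are ignored
        if weight < 0:
            raise ValueError(f"weight for {name!r} must be non-negative")
        weighted_sum += weight * score
        total_weight += weight
    if total_weight == 0:
        raise ValueError("weights must not all be zero")
    return weighted_sum / total_weight


def flag_periods(scores_by_time, threshold=0.7):
    """Return the timestamps whose aggregated score meets the threshold."""
    return [t for t, s in sorted(scores_by_time.items()) if s >= threshold]


# Synthetic per-period check scores (invented for illustration only).
periods = {
    "2024-01-01T00:00": {"baseline_correlation": 0.1, "power_mismatch": 0.2},
    "2024-01-01T00:30": {"baseline_correlation": 0.9, "power_mismatch": 0.8},
}
aggregated = {t: aggregate_gaming_score(s) for t, s in periods.items()}
flagged = flag_periods(aggregated)
```

A weighted mean keeps the result within 0 and 1 whenever the inputs are, and gives a single place to adjust weights or thresholds if later work shows that some checks are more informative than others.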
Lessons Learnt
- Future projects could test the solutions in practice (and train the models) in our Dynamic Response markets, i.e. with historical rather than synthetic data. This would inform revisions such as weighting some checks more than others and identifying thresholds above which we consider gaming more likely, reflecting where events are confirmed or discarded. This could give ESO more confidence in the robustness of the checks and ultimately reduce barriers to participation in our Dynamic Response services for units with variable baselines.
- Other projects could explore applying similar techniques to other ancillary services. Such projects would need to consider different data flows and IT integration, but the fundamental principles of gaming detection and the solutions reviewed in this project could be applicable to other services.
- Future projects could consider the applicability of the solution developed in this project to different use cases. The principal use case considered in this project was a battery providing an ancillary service while also serving a variable load. We could compare the effectiveness of the solution across use cases to determine whether refinements are required, including whether different checks should be run, or weighted differently, for different use cases.