Following a major gas outage incident in the Stannington area of Sheffield, Northern Powergrid gained permission from DESNZ to collect disaggregated consumption data from smart meters located in this area. This proposal is to undertake a data modelling exercise to compare the disaggregated and aggregated consumption data, ElectraLink data, registers of embedded generation and known locations of electric vehicles to train machine learning models and prediction algorithms.
The analysis will cover 14 months of consumption data, from November 2022 to the end of December 2023, collected from 1500 smart meters within the Stannington suburb of Sheffield. The proposal will look at identifying low carbon technologies (LCTs) in disaggregated smart meter data and use the results to investigate whether LCTs can also be identified in aggregated data.
Benefits
This is a Low TRL Research Project and is not expected to have any immediate benefits.
Learnings
Outcomes
The limited number of known LCT installations within the Stannington area precluded the use of data-driven methods, that is, methods in which the model (or, more usually, the parameters of a pre-selected model such as logistic regression) used to classify whether a given LCT type is present in a given half-hourly (HH) series is learnt from the data.
Instead, a heuristic approach was taken, in which a set of rules is developed from a priori knowledge of the system.
EV identification at premises level was found to be relatively straightforward. However, a lack of ground truth, such as registered charging points, precluded formal validation. When the EV-flagged premises were used as a guide, many charging points were visible via Google Street View.
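The rules used for EV identification are not given in this summary. As an illustration only, a rule of this kind might look for repeated blocks of sustained high consumption in the HH series. The sketch below is a minimal example under that assumption: the function names, the 3.0 kWh threshold (chosen because a 7 kW charger would draw about 3.5 kWh in half an hour), the minimum run length and the minimum number of events are all hypothetical and are not taken from the study.

```python
from typing import Optional, Sequence


def count_charge_events(
    hh_kwh: Sequence[Optional[float]],
    threshold_kwh: float = 3.0,  # assumed: just below a 7 kW charger's 3.5 kWh per half hour
    min_periods: int = 2,        # assumed: at least one hour of sustained draw
) -> int:
    """Count runs of consecutive half-hours at or above the threshold.

    Missing readings (None) break a run, so gaps never create an event.
    """
    events = 0
    run = 0
    for value in hh_kwh:
        if value is not None and value >= threshold_kwh:
            run += 1
            continue
        # The run (if any) has ended; keep it only if it was long enough.
        if run >= min_periods:
            events += 1
        run = 0
    # Account for a run that reaches the end of the series.
    if run >= min_periods:
        events += 1
    return events


def flag_ev(
    hh_kwh: Sequence[Optional[float]],
    min_events: int = 3,  # assumed: require repeated events before flagging
    **kwargs: float,
) -> bool:
    """Flag a premises as a likely EV site if enough charge-like events occur."""
    return count_charge_events(hh_kwh, **kwargs) >= min_events
```

Requiring several events rather than one is a design choice intended to avoid flagging a premises on a single unusual evening; the right values would need to be tuned against known installations.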
The Solar PV heuristic is based on the intuition that properties with PV will consume less electricity on sunny days than on cloudy days. This was found to be the case, and a clear seasonal envelope is visible in the consumption data. This gives hope that image-based classification techniques may prove fruitful. However, the task will be complicated by missing and irregular data.
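The intuition behind the PV heuristic can be sketched as a simple comparison of daytime consumption on sunny and cloudy days. The example below is illustrative only: it assumes each day has already been reduced to a daytime consumption total and a sunny/cloudy label from some external weather source, and the function names and the 0.7 ratio threshold are hypothetical rather than values from the study. Days with missing consumption are skipped.

```python
from statistics import mean
from typing import Iterable, Optional, Tuple

# (daytime consumption in kWh or None if missing, True if the day was sunny)
DayRecord = Tuple[Optional[float], bool]


def sunny_cloudy_ratio(days: Iterable[DayRecord]) -> Optional[float]:
    """Return mean sunny-day consumption divided by mean cloudy-day consumption.

    Returns None when either group is empty or the cloudy mean is not positive.
    """
    records = list(days)  # allow generators to be passed in
    sunny = [kwh for kwh, is_sunny in records if kwh is not None and is_sunny]
    cloudy = [kwh for kwh, is_sunny in records if kwh is not None and not is_sunny]
    if not sunny or not cloudy:
        return None
    cloudy_mean = mean(cloudy)
    if cloudy_mean <= 0:
        return None
    return mean(sunny) / cloudy_mean


def flag_pv(days: Iterable[DayRecord], max_ratio: float = 0.7) -> bool:
    """Flag likely PV when sunny-day consumption is well below cloudy-day consumption.

    max_ratio is an assumed, untuned threshold.
    """
    ratio = sunny_cloudy_ratio(days)
    return ratio is not None and ratio <= max_ratio
```

In practice the comparison would probably need to be made within the same season, since day length and heating demand also change consumption between sunny and cloudy days.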
The detection of ASHP is frustrated by the low levels of adoption (<1% of premises) and differences in operation (low-slow vs high-fast).
Both the ASHP and PV heuristics would benefit from learning from, and testing against, additional HH data from known installations (ideally also premises with battery storage).
Overall, the heuristics presented here demonstrate the feasibility of detecting LCTs from consumption data. It is hoped, as more data become available, that they will serve as useful benchmarks against which the performance of more advanced ML techniques can be measured.
Aggregation does mask some signals, although EV usage is still clearly identifiable at feeder and substation level. Disaggregated consumption data at HH level therefore allows NPG to build better models for LCT identification.
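The masking effect of aggregation can be illustrated with a small synthetic example. The sketch below sums per-premises HH series into a feeder total and measures how far a block of extra load stands above the rest of the series. All values are invented for illustration (20 premises with a flat 0.25 kWh baseline, one of which adds either 3.5 kWh or 0.3 kWh per half hour for four periods), and the function names are hypothetical; none of this is data from the study.

```python
from statistics import mean
from typing import List, Sequence


def aggregate_feeder(premises: Sequence[Sequence[float]]) -> List[float]:
    """Sum per-premises half-hourly series into one feeder-level series."""
    if not premises:
        return []
    n = len(premises[0])
    if any(len(p) != n for p in premises):
        raise ValueError("all premises series must have the same length")
    return [sum(p[t] for p in premises) for t in range(n)]


def block_contrast(series: Sequence[float], start: int, end: int) -> float:
    """Ratio of mean load inside periods [start, end) to mean load outside."""
    inside = list(series[start:end])
    outside = list(series[:start]) + list(series[end:])
    if not inside or not outside:
        raise ValueError("block must be non-empty and leave some periods outside")
    outside_mean = mean(outside)
    if outside_mean <= 0:
        raise ValueError("baseline outside the block must be positive")
    return mean(inside) / outside_mean


# Synthetic illustration only: invented values, not study data.
base = [0.25] * 12
ev_home = [0.25 + (3.5 if 4 <= t < 8 else 0.0) for t in range(12)]
small_home = [0.25 + (0.3 if 4 <= t < 8 else 0.0) for t in range(12)]

feeder_with_ev = aggregate_feeder([ev_home] + [base] * 19)
feeder_with_small = aggregate_feeder([small_home] + [base] * 19)

ev_contrast = block_contrast(feeder_with_ev, 4, 8)
small_contrast = block_contrast(feeder_with_small, 4, 8)
```

With these invented numbers the large block lifts the feeder total by about 70% while the small block lifts it by about 6%, which is the sense in which a large, sustained load can remain visible after aggregation while smaller signals fade into the baseline.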
Potential Future Projects:
- Further investigation of “willing volunteer” consumption profiles to refine the interpretation of profiles
- Clustering profile analysis of non-domestic customers (not subject to the same data confidentiality constraint)
- Use Stannington data to generate artificial consumption profiles for higher penetrations of LCT and ToU tariffs with impacts on substation load
- Potential for use of Agent Based Models to investigate consumer herding impacts
- Potential collaboration on Demand Forecasting Project
Lessons Learnt
A general lesson from the study is that gathering large volumes of HH consumption data from smart meters retrospectively requires significant effort on the part of the party requesting the data. The inherent limitations of the DCC service and of smart meter behaviour need to be accommodated.
Acquisition of HH metered export data would increase the effectiveness of Solar PV LCT identification at premises and feeder/substation level. Longer period records (at least three years) will be required to account for the effects of year-to-year variability in cloud cover on generation.