A large amount of useful data about the energy sector is published through mandatory reports, innovation trials and consumer tools. However, datasets are often published on standalone webpages with limited descriptions, making it very difficult for both incumbents and innovators to discover, search and understand them.
This project will review all data held by WPD to ascertain the extent to which it can be shared with third parties. Use cases will then be developed using the data that can be shared, and the dataset with the highest value will be processed, standardised and published so that the identified use cases can be fully realised.
Benefits
Maximising the visibility and value of data could:
- Decarbonise the energy system faster through easier identification of capacity for the connection of LCT assets.
- Provide a better ‘whole of system’ view promoting better system security.
- Enhance third-party interactions promoting better flexibility response behaviour and greater opportunities for flexibility revenue.
- Facilitate the realisation of new commercial opportunities through the creation of new markets with new players.
- Reduce customer bills through a more strategic deployment of community-owned LCTs.
- Optimise procurement regarding asset location, size, or function.
- Better identify whether flexibility or network build is the better solution where constraints exist.
Learnings
Outcomes
The outcomes from this project include:
· Learning Reports generated at the end of each of the six work packages.
· Our Data Process Team is now using the Data Playbook developed during Work Package 6 to formalise the policy and process required to fully adopt the Data Sharing Assessment Tool developed during Work Package 3. This will enable third parties to request previously unshared WPD datasets.
· The Energy Networks Association (ENA), with the help of WPD, used the Data Playbook developed by the project in Work Package 6 to create the ENA Data Triage Playbook. This, in turn, was used to develop the Energy Data Request Tool, which provides a single point for data requests and standardises networks’ approach to servicing them.
· Additionally, due to the success of the Work Package 5 Data Science Challenge, we are looking to set up additional challenges to engage the data science community with energy problems.
Lessons Learnt
Work Package 1 - Data Discovery & Classification
The primary takeaway from this work package is that WPD holds a wealth of valuable information across the organisation, the value of which will be far easier to maximise if barriers to access are lowered. Some of the decisions taken to make DNOs more secure have a negative impact on their ability to collaborate with external teams and adopt remote working.
Work Package 2 - Use Case Development
Within Work Package 2 the POD team engaged with a large number of stakeholders across the business and industry. We found that both WPD and external stakeholders were keen to participate in our workshops and were able to offer some very valuable insight. In fact, given the distributed nature of WPD staff and external stakeholders, we believe that providing virtual events may have increased the diversity and number of participants.
The workshops have shown that WPD data has significant value to stakeholders outside of the usual groups. From sharing data more effectively within WPD to making more information available to external parties, there is significant value to be extracted.
Work Package 3 - Data Openness Assessment
Within Work Package 3, the POD team has engaged with stakeholders across the business to review and test the developed tool with a number of different datasets.
Within this work package the project team tested the Data Sharing Assessment Tool with real datasets to understand whether it meets the needs of end users. This has been critical for identifying additional needs, finding where the tool does not work as planned or as the user expected, and checking that the tool delivers valid recommendations.
The concept of Presumed Open is relatively new for the energy sector and does not always come naturally to an organisation that operates critical national infrastructure and is naturally risk-averse. In our experience, the individuals who have assisted with the triage of datasets have all of the knowledge and skills needed to identify and sensitively mitigate issues; however, the risk-averse nature of the energy sector creates doubt and the worry that they have ‘missed something’.
For the Data Sharing Assessment Tool and the Presumed Open Data project to gain traction and truly become Business As Usual, it will be essential to ramp up engagement with stakeholders across the organisation, providing feedback on completed assessments and additional training where required.
Work Package 5 – Data Science Challenge
The learnings from the data science challenge have suggested some key points for development of future challenges:
· Allowing participants to use external data would produce much more diverse solutions and perhaps identify useful datasets for the challenge. This must be considered carefully, since it allows performance to be driven by data accessibility and could lead to an uneven playing field for teams with fewer resources or less access. One option is to split the challenge into two tracks, one with and one without restrictions.
· Time for participants to work on a task should be considered. Participants often take part in the challenge in their own time and hence longer gaps between submissions can be advantageous but also require longer commitments. One compromise could be to reduce the number of tasks but make the test set much larger to allow more robust assessment.
· It is beneficial to choose a realistic problem which can be easily framed as a well-defined competition problem. However, this is not always trivial. More realistic problems are easier to understand but are not always easy to score, and well-defined competitions are not always realistic which can cause confusion. A balance must be sought when developing future challenges.
· Make sure the data is as clean as possible. Although this challenge used high-quality data, even the small number of erroneous values caused some difficulty and added pre-processing time for some participants.
· Consider skill scores rather than absolute metrics to reduce biases and variations in the tasks.
· A practice challenge is essential. It helps to work out bugs in the submission process and helps clarify the challenge requirements to the participants.
· Regular engagement with the teams is well-received. The LinkedIn Forum and emailing list helped facilitate information sharing, allowed teams to discuss with each other and clarify minor points from the challenge, and ensured that information was fairly shared with all participants.
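The skill-score recommendation above can be illustrated with a minimal sketch. The project report does not specify a formula, so the following assumes a common convention: skill = 1 − error(model) / error(baseline), with mean absolute error against a naive persistence forecast as the baseline. All function names and the sample figures are hypothetical, for illustration only.

```python
# Minimal sketch of a skill score (assumed convention, not the project's tool):
# scores are relative to a naive baseline, so they are comparable across tasks
# with different scales. Above 0 beats the baseline; 1 would be a perfect model.

def mae(actual, predicted):
    """Mean absolute error between two equal-length sequences."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def skill_score(actual, model_forecast, baseline_forecast):
    """1 - MAE(model) / MAE(baseline); hypothetical helper for illustration."""
    baseline_error = mae(actual, baseline_forecast)
    if baseline_error == 0:
        raise ValueError("baseline already perfect; skill score undefined")
    return 1 - mae(actual, model_forecast) / baseline_error

# Illustrative half-hourly demand values (kW); the persistence baseline
# simply repeats the previous observation.
actual = [5.0, 6.0, 7.0, 6.5]
model = [5.2, 5.9, 6.8, 6.6]
persistence = [4.0, 5.0, 6.0, 7.0]

print(round(skill_score(actual, model, persistence), 3))  # → 0.829
```

Because the baseline error cancels out task-specific scale and difficulty, a skill score of 0.3 on an easy task and 0.3 on a hard task indicate comparable improvement over naive forecasting, which is what makes it useful for ranking teams across heterogeneous tasks.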