Diagnosing the NHS: SynÆ
NHS England recognise that data is useful. They collect a wide variety of data with the aim of using it to improve things for everyone - staff, patients, etc. Much of this data can be published openly, which extends the possibility of innovation beyond any internal analytics or development team. Open Prescribing is a good example of this. A key aspect of open data is that it must be anonymous - it cannot contain any personally identifiable data. So what happens to those datasets that might be extremely valuable but cannot be sufficiently anonymised to be published as open data?
ODI Leeds and NHS England will be working together to explore the potential of 'synthetic data'. This is data that has been created to follow the patterns identified in a real dataset but contains no personal data, making it suitable for release as open data. Synthetic data is also great for building and prototyping ideas - because the synthetic data works exactly like the real data, any tools or services can be switched over to the real data with few or no problems. Specifically for this project, we'd like to release synthetic data about A&E admissions.
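To make the idea of "following the patterns in a real dataset" concrete, here is a minimal sketch in Python. It is not the method used in this project: it assumes a deliberately simple approach (counting how often each value appears in each column, then sampling new rows from those frequencies), and the records, column names and function names are all made up for illustration.

```python
import random
from collections import Counter

# Entirely made-up toy records standing in for a "real" dataset.
# These are NOT real A&E data; the column names are hypothetical.
real_records = [
    {"age_band": "18-24", "arrival_hour": 23},
    {"age_band": "18-24", "arrival_hour": 2},
    {"age_band": "65-84", "arrival_hour": 10},
    {"age_band": "65-84", "arrival_hour": 11},
    {"age_band": "65-84", "arrival_hour": 10},
    {"age_band": "25-44", "arrival_hour": 18},
]


def fit_marginals(records):
    """Count how often each value appears in each column."""
    columns = records[0].keys()
    return {col: Counter(r[col] for r in records) for col in columns}


def sample_synthetic(marginals, n, seed=0):
    """Draw n new rows, sampling each column independently
    in proportion to the observed frequencies."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        row = {}
        for col, counts in marginals.items():
            values = list(counts.keys())
            weights = list(counts.values())
            row[col] = rng.choices(values, weights=weights, k=1)[0]
        rows.append(row)
    return rows


marginals = fit_marginals(real_records)
synthetic = sample_synthetic(marginals, n=5)
for row in synthetic:
    print(row)
```

Because each column is sampled independently, this sketch keeps the overall frequencies but loses any relationship between columns (for example, between age and time of arrival), which is one way the richness of the data can be lost. It also makes no privacy guarantee on its own, since rare values in the real data can still appear in the output.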
This is where you can help.
We want your feedback and opinions about releasing synthetic data. Maybe you think that even if A&E data can be successfully anonymised it shouldn't be released. Maybe you'd like to look at the details of how we've created the synthetic dataset to check the methodology and ensure that the richness of the data remains. We ask that you check out the resources below - read Tom's full blog post for the scope of the project, then add your thoughts to the open collab document - and register your interest for a face-to-face collaboration meeting (date TBC for Q1 in 2019).
Project launch blog post
Tom Forth writes about the potential of synthetic data for anonymised data analysis.
Exploring methods for creating synthetic data
A technical guest post from Jonny Pearson at NHS England, exploring methods for creating synthetic data.
A collaborative Google Doc that anyone can contribute to. We're seeking your feedback about synthetic data.
A collaborative Word Online document that anyone can contribute to. (For people who cannot access Google Docs)
Tuesday 5 March 2019