Defining a national core reference dataset will turn 'puppy-like enthusiasm' for open data into bankable economic growth
The world is changing, yet again – this time it's all about data: how we produce it, crunch it and use it to make our lives better. Britain has huge advantages in this new world: we have the largest coherent public sector datasets, which will be the backbone of the data-driven society, and we are pioneering a scientific edge in using them. We also have political consensus about open data: the last government was committed to it and the coalition government increased the scope of its vision and commitment. Everyone appreciates how important this is.
But enthusiasm and vision are only the start. As Dermot Joyce wrote here, releasing lots of stuff is not enough – not even when it's 9,500 datasets. It has to be the right stuff, and published the right way. My basic recommendation – that we should have a clear, auditable national data strategy – may not sound very radical, but we currently don't have such a strategy, and I believe it's the essential next step.
This is a big challenge. Britain traditionally has the boffins, the bright ideas. We generate the excitement. But we don't mint the money. In fact, we tend to act, as Saul Klein has put it, like world philanthropists of talent. So our challenge is to turn our open-data visions, and all our wonderful puppy-like enthusiasm, into bankable economic growth, right here – and quickly too.
Not surprisingly, there are some people who resist the idea of a defined national strategy. With understandable anarchic spirit, they just want to throw open the window and toss out all the data that's near to hand. In many ways it's a noble and attractive ambition. But it won't do. We can't just say to government departments: Please publish whatever you can. We can't just say to our trading funds: Please give it all away. We can't just sit around our committee tables and say: Hey presto! We've decided to be open!
So this is the foundation of my recommendations: that we define a national core reference dataset. It should be designed strategically, combining the basic data by which we define and understand the nation and the data that is most useful for driving social and economic gain. This national core reference data will then be the backbone of public sector information and the backbone of a data revolution led by Britain.
In my full recommendations, I talk about a twin-track approach, where we can count on the core data published to a high standard and, simultaneously, the rest published quickly as well. And with that twin track, we should have the highest ambition to turn all our data from imperfect to best quality – but never let imperfection slow us down.
My other main recommendations are:
Citizens need have no fears about their privacy being compromised. This is of the highest importance. We can gain the value of data and retain confidentiality if, first, we ensure all case-level open data is anonymised, and even then only made available in technological "safe haven" environments when the data is especially sensitive; and, second, we spread responsibility to the end-user of data: there must be heavier consequences available, and they must be rigorously applied when the rules are broken.
Data is of no value unless it can be put to good use. This means we need to produce more data scientists, and there should be plenty of them employed throughout government. And we need to invest in basic data science, as well as partnerships between academia and business, to make sure that the science can be applied to real-world opportunities.
A 'mixed economy' of public data
We should expect companies to be open, for example, in publishing all clinical trials of medicines, as indeed the pharmas are starting to do. And we should expect companies in public-private partnerships to be more open in their data policies.
Within the government data machine, we need to keep our in-house innovation up to date and work with experts to gather and analyse data that tracks the real value of public sector information and shows how it can be increased. To make sure this happens, the Data Strategy Board that I chair is fully committed to our partnership with the newly founded Open Data Institute.
My last recommendation is that government must eat its own lunch: it must formally embed structured data in how it develops, monitors and adapts public policy.
Finally, I want to re-emphasise my main challenge to government: to design a clear National Data Strategy founded on a national core reference dataset with a visible, predictable, auditable implementation plan.
I realise this will be unpalatable to some: it places a high expectation on government to improve its delivery systems; it asks government to move from enthusiasm to predictable, practical application; it means streamlining and clarifying the channels for driving change; and it asks government to think more broadly about the whole landscape of public sector information.
But the prize is massive. The prize is better government and significant economic growth and huge social benefit. I trust the government, which has cross-party, cross-sector support for this – as, by the way, we have demonstrated in our extensive and transparent consultation process with both experts and the wider public – to take on with enthusiasm this vital challenge of defining and delivering a visible, truly world-leading strategy for public sector information.