Northernlands 2 - Responsible data collaboration
This transcript comes from the captions associated with the video above. It is "as spoken".
Hi, my name is Stian Westlake and I'd like to talk today about
the idea of a data moonshot for better and more prolific data
collaboration. I've got three particular interests in this.
Not long ago I wrote a book about the intangible economy and what
needs to make it work. I recently worked as a government
adviser on science and innovation policy, but most importantly I
recently joined the Royal Statistical Society as their
chief executive. I'm going to first of all kick off by talking
a little bit about the economics of data and how that affects
what we might do. If you really don't like economics, this is a
great time, this being a video conference, to pop out and make
yourself a cup of tea. I will probably only take about 2 minutes,
so it will probably be a weak cup of tea, but you're forewarned.
The story that I tell in 'Capitalism Without Capital'
is a story of how, once upon a time in the world economy,
the capital in that economy, the things we used to invest in,
used to be physical things you could see and touch, like machines
or buildings or plants. That used to represent about 15% of world
GDP per year.
It was the sinews of the economy 40 years ago. Over the last 40
years that's changed and what now makes the world economy work is
intangible assets. Things you can't see or touch like R&D,
software, brands, supply chains and of course, data.
Data and other intangible assets are kind of funky from an
economic point of view. They behave differently from other
types of investments for two reasons that are going to be
important to us. One is that they have spillovers, so if
you're a business and you invest in some data, and that data is
widely published, other people may benefit from it as well as
you. You can't... your competitors may benefit from it.
There is a kind of challenge there: if you're relying on
self-interested businesses to do your investment, that won't get you
all the way. The other thing about intangible assets
generally, and data in particular, is they have what economists
like to call synergies. They're really good when you combine
them together, which is exactly why data collaboration is so
important. If you have a little bit of information, information
about a small number of customers or a
small amount of epidemiological
information, it's not hugely useful. If you can combine that
with lots of similar data, or lots of different data,
suddenly those things can become much more valuable, and that's a
general property of these intangible assets. So we're
moving from an economy where you could just build your factory
and get on with things to an economy where the capital that
we all depend on has these characteristics that it will be
under produced if you just rely on businesses to do their thing,
and it really matters how you combine it, those synergies.
Into this kind of problem typically steps government, and
when there are investments that have a lot of
these so-called spillovers, you typically expect
government to step up to the plate and invest in some of them.
And that's what's traditionally happened with, for example, R&D.
The British government funds about £10 billion worth of R&D
a year, business backs £20 billion.
If government didn't do that funding, the UK economy would be
doing even less well than it
currently is. If we think about something like education and
training, another really important intangible, in Britain
the government spends £90 billion a year on that. So in some
areas our politicians, our political culture, have got the
idea that investing in these intangibles is something really
important to do. Something that is worth spending taxpayers' money
on if we want a thriving economy
and society. But that kind of message hasn't quite got through
that much on data. Now, obviously there is a tradition
of governments publishing and providing data. Things like the
Ordnance Survey and the tradition of National Statistics
are really old, in some cases centuries old, but the scale at
which that goes on, the scale at which government sees it being
worth publishing data, is still pretty small compared to the
funding of general research, let alone funding education. And
it's something that isn't quite
in the mindset of how we talk about politics. So when people talk,
when we talk about the purpose of the money that we spend on
say the Office for National Statistics or providing health
data, we see it from a functional point of view. But
often we lack the bigger picture about how real investment in
data and really making that data sing can have a transformative
effect on society. So I guess my pitch when I talk to people in
politics or to anyone who is a citizen and who cares about
these things is to say that if you kind of want to follow the
current UK government's advice that we need to build, build,
build, then data is a really important place to start. And if
we're thinking of moonshots, if we're thinking of big, ambitious
technical projects that can improve the world, data is a
neglected area. But it's an area that deserves a lot more love
and a lot more credit, and probably a lot more public
funding and government
attention. One of the places where the RSS has
been vocal about this before is through our data manifesto.
Our kind of manifesto for what should happen in the data world.
We published the most recent version last year, and this was
kind of a product of the input of our many members who are
statisticians and who work with
data. And it made, in my opinion, some really important
arguments that I think should form part of what you
might call a data moonshot.
The first and kind of most obvious thing is this idea that
we should be investing much more in our data infrastructure as a
country. We're very happy at the moment, or
governments are very happy, to spend billions of pounds
producing academic research papers, some of which are really
important, some of which perhaps maybe only get read a few dozen
times and are sealed away in an academic journal somewhere.
We should be thinking much more about how we invest
in data in the same way.
There's a really political message here. If you look at something
like Anna Powell-Smith's Missing Numbers project or her thoughts
about the government data graveyard, there are loads of
things that governments, the UK government, should be
informing citizens about, should be holding itself
to account for, which simply aren't
published. So I think the first thing we
would call for is:
whenever government does something, whenever the law is
changed, whenever something important is going on, we
should be pushing for data to be made available, data to be made
open about that.
But I think this is not just a kind of government
probity and government accountability issue. This is
also about producing bigger datasets that other people can
use and make valuable research contributions to.
If I think about things which, you know, I would call
moonshots in a sense, really important infrastructural data
projects that create lots of new opportunities, I think of
something like, if I were in Leeds for this event today, down
the road in Bradford, the Act Early North project, which brings
together an absolutely unique set of data on early years
performance in Bradford, the youngest city in the UK,
which is going to be an absolute treasure trove for public health
researchers and anyone who cares about tackling deprivation
and child poverty, and more generally about improving the
world. That was, I'm told by the people who run it,
incredibly hard to get funding for because it didn't look like
a real academic project, but it wasn't a public health project.
Thanks to some farsighted work by the council and local NHS,
it did get funding, it cobbled it together and it created
something really impressive. But we have a kind of system where
funding academic papers, as Ben Goldacre, one of the earlier
speakers today, said, funding academic papers is easier,
funding data tools is harder, so I think pushing
on those kinds of big projects is really important. Another
example from my time in government. I was involved in the
design of an R&D project called the Industrial Strategy
Challenge Fund and the idea of that was to put together pretty
big collaborations between business and researchers,
typically in the kind of hundreds of millions of pounds
range. The idea was, they would improve society. They would
generate economic growth and would be a good thing for the
government to do. Lots of the bids for those projects were
from kind of industrial companies, people who
manufacture things which are all really valuable. There were few
to none that were about building really valuable data resources,
whether that's digital twins or public health data sets.
And I think this is an area where we really want to see
governments using more, being more inspired and thinking
about how they can invest and work with businesses and local
government to generate really valuable data resources
and to be more ambitious.
Those points all kind of come back to this economic idea
of spillovers. The reason why we want government spending money on
this is because the private sector will invest to a certain extent,
but there are public benefits:
these data assets are public goods.
But I also mentioned that intangibles, and data
specifically, have this kind of idea of synergies. They're
especially valuable when you combine them, and I guess there
the call to action for governments is to say, well, how
do we make those ideal combinations happen?
How do we create a world where collaboration works well,
happens a lot and is positive? I guess there's kind
of a few really interesting things that I look at that I
think we ought to be doing more
of. One is the effective and privacy-respectful merging of
datasets. So the work that the Oxford Evidence-Based Medicine
DataLab have been doing on the OpenSAFELY platform, which has
seemed to be providing some really valuable insights into
the progress of the COVID epidemic, would be one example of
that. How do you bring together
extremely confidential medical information about people in a
way that you can analyze it with respect for confidentiality?
The idea that this is a really valuable investment that our
research funders, that our government, should be backing
seems really important if we want to make those synergies
work, that's absolutely essential. I guess
the second thing, and you know ODI Leeds is a great place to
be saying this, is more hubs and programs for building networks
to share data. I am a longstanding fan of the work
that ODI Leeds has been pioneering in this area and I
was thrilled that they did some work with me when I was in
government. I think ODI Leeds have really blazed a trail here.
I think other projects like DataKind UK, which brings together
data scientists and charities who often have under-utilized
data sources or don't know how to collect the right data is
another great example.
So the idea of backing hubs, of creating more of them,
seems like something that maybe looks very touchy-feely, but
it's hugely important if we want more collaboration in this area.
And then I guess the issue which we've heard a bit about
already today is how data ethics and standards interact with all
of this, and I think if I'm
pitching this message at government, the most valuable
thing that they should be looking at is how do we create,
further and codify the results of a national
conversation on data ethics. Thinking with my kind of
general science policy hat on,
this is something where there are success stories in the past in
other fields, so if we think about embryology for example,
which has been, and in many places in the world still is, an
incredibly controversial field, the UK government in the 1990s
was actually pretty forward thinking there and really
fostered a national conversation, got ethicists involved, in some
cases funded research, but also brought together discussions,
with the result that it created a relatively stable consensual
ethical framework for how that ran, which was
kind of to the advantage of the UK, not just in preventing bad
things, but actually in giving stability so that good things
could be done.
Clearly there's been a lot of talk and a lot of investment in
organizations that look at data ethics. But this is, I
think, something that we just need to keep on pushing on and
keep on supporting.
The other side of this is perhaps the international side,
and you know this conference, the collaboration between the
Netherlands and the UK is a really great example.
Great to see the Netherlands reaching out to us in this way.
I would hope that the UK government could start reaching
out as well and playing a more active role in international
efforts to set standards. I think, as a British person, that's
something where a relatively small investment could yield big
benefits both for the world as a whole and for Britain specifically.
And then, I guess,
a further dimension to this: in the briefing for this
session I was asked how we could make data work for the
many, not for the few, but when I think about it, I think perhaps we
should turn that on its head and say how can we make data work
for minorities, wherever they may be in society, rather than just
data that serves the interests of the majority, and particularly at a
time when the Black Lives Matter protests have made
systematic discrimination so salient in society, the need for
granular, specific data that casts light on discrimination and
highlights injustices seems more urgent than ever.
I think this is an area where
there've been huge historical oversights in the public
data that we've been collecting, and obviously there is a
separate issue about the potential for discrimination by algorithms
that we use. This strikes me as a huge opportunity where the
right kind of intervention could make a huge difference.
So I guess where I would... it seems to me really
clear that there is a moonshot opportunity on data, an
opportunity for us as a society to invest more, to build more of
a collaborative around this, and to build norms
where we use data well and come together effectively around it.
I've talked about this to some extent from a public policy point
of view, but the public policy is not going to write itself.
The politicians are not going to do this themselves, for all the
fine words that occasionally we hear about how they really like
Bayesian statistics and Monte Carlo simulations, to quote one
minister from last weekend. This needs a movement, and it will
essentially have to be a movement comprised of
people like you who work with data who know data and who know
the contribution that it can make to a good society. From that
point of view, it is really fortunate that data analysis and
data analysts are cool at the moment, are in demand, whether
that's in business or in government or in wider society.
Because it gives you and gives us a platform to make those
demands, to point out how investment is really important,
how this needs to be done in the right collaborative and the
right ethical framework.
And the time to do this is very much now, so this is a
chance to make your voices heard,
to really drive
these moonshot projects that can change the world and to push
for better understanding and
more collaboration. So I think this is exciting to be
here and I look forward to seeing what we can do next.
Chief Executive, Royal Statistical Society
Stian is the new Chief Executive of the Royal Statistical Society. Before this, he served as policy adviser to three UK science and innovation ministers, and spent eight years at Nesta, the UK's national foundation for innovation, where he ran the organisation's think-tank. He co-wrote Capitalism Without Capital (2017), a book about the knowledge economy, which was selected as a Book of the Year by the Financial Times, The Economist and Marginal Revolution, and Tomorrow, Interrupted, a book about productivity growth, which will come out in 2021.
Northernlands 2 is a collaboration between ODI Leeds and The Kingdom of the Netherlands, the start of activity to create, support, and amplify the cultural links between The Netherlands and the North of England. It is with their generous and vigorous support, and the support of other energetic organisations, that Northernlands can be delivered.