Northernlands 2 - Responsible data collaboration

Description

Transcript

This transcript comes from the captions associated with the video above. It is "as spoken".

Hi, my name is Stian Westlake and I'd like to talk today about

the idea of a data moonshot for better and more prolific data

collaboration. I've got three particular interests in this.

Not long ago I wrote a book about the intangible economy and what

needs to make it work. I recently worked as a government

adviser on science innovation policy, but most importantly I

recently joined the Royal Statistical Society as their

chief executive. I'm going to first of all kick off by talking

a little bit about the economics of data and how that affects

what we might do. If you really don't like economics, this is a

great time, this being a video conference, to pop out and make

yourself a cup of tea. I will probably only take about 2 mins

so it will probably be a weak cup of tea, but you're forewarned.

The story that I tell in 'Capitalism Without Capital'

is a story of how, in the world economy once upon a time,

the capital in that economy, the things we used to invest in

used to be physical things you could see and touch like machines

or buildings or plants. That used to represent about 15% of world

GDP per year.

It was the sinews of the economy 40 years ago. Over the last 40

years that's changed and what now makes the world economy work is

intangible assets. Things you can't see or touch like R&D,

software, brands, supply chains and of course, data.

Data and other intangible assets are kind of funky from an

economic point of view. They behave differently from other

types of investments for two reasons that are going to be

important to us. One is that they have spillovers, so if

you're a business and you invest in some data, and that data is

widely published, other people may benefit from it as well as

you. You can't... your competitors may benefit from it.

There is a kind of challenge there if you're relying on self

interested businesses to do your investment, that won't get you

all the way. The other thing about intangible assets

generally, and data in particular, is they have what economists

like to call synergies. They're really good when you combine

them together, which is exactly why data collaboration is so

important. If you have a little bit of information - information

about a small number of customers or a

small amount of epidemiological

information, it's not hugely useful. If you can combine that

with lots of similar data, or lots of different data,

suddenly those things can become much more valuable, and that's a

general property of these intangible assets. So we're

moving from an economy where you could just build your factory

and get on with things to an economy where the capital that

we all depend on has these characteristics that it will be

under produced if you just rely on businesses to do their thing,

and it really matters how you combine it, those synergies.

Into this kind of problem, typically steps government and

when there are investments that have a lot of

these so called spillovers, you typically expect

government to step up to the plate and invest in some of them.

And that's what's traditionally happened with, for example, R&D.

The British government funds about £10 billion worth of R&D

a year, business backs 20 billion.

If government didn't do that funding, the UK economy would be

doing even less well than it

currently is. If we think about something like education and

training, another really important intangible, in Britain

the government spends £90 billion a year on that. So in some

areas our politicians, our political culture, have got the

idea that investing in these intangibles is something really

important to do. Something that is worth spending taxpayers' money

on if we want a thriving economy

and society. But that kind of message hasn't quite got through

that much on data. Now, obviously there is a tradition

of governments publishing and providing data. Things like the

Ordnance Survey and the tradition of National Statistics

are really old. In some cases centuries old, but the scale at

which that goes on, the scale at which government sees it being

worth publishing data is still pretty small compared to the

funding of general research, let alone funding education. And

it's something that isn't quite

in the mindset of how we talk about politics. So when people talk,

when we talk about the purpose of the money that we spend on

say the Office for National Statistics or providing health

data, we see it from a functional point of view. But

often we lack the bigger picture about how real investment in

data and really making that data sing can have a transformative

effect on society. So I guess my pitch when I talk to people in

politics or to anyone who is a citizen and who cares about

these things is to say that if you kind of want to follow the

current UK government's advice that we need to build, build,

build, then data is a really important place to start. And if

we're thinking of moonshots, if we're thinking of big, ambitious

technical projects that can improve the world, data is a

neglected area. But it's an area that deserves a lot more love

and a lot more credit, and probably a lot more public

funding and government

attention. One of the places where the RSS has

been vocal about this before is through our data manifesto.

Our kind of manifesto for what should happen in the data world.

We published the most recent version last year, and this was

kind of a product of the input of our many members who are

statisticians and who work with

data. And it made, in my opinion, some really important

arguments that I think should form part of what you

might call a data moonshot.

The first and kind of most obvious thing is this idea that

we should be investing much more in our data infrastructure as a

country in the same... We're very happy at the moment, or

governments are very happy, to spend billions of pounds

producing academic research papers, some of which are really

important, some of which perhaps maybe only get read a few dozen

times and are sealed away in an academic journal somewhere.

We should be thinking much more about how we invest

in data in the same way.

There's a really political message here. If you look at something

like Anna Powell-Smith's missing numbers project or her thoughts

about the government data graveyard, there are loads of

things that governments, the UK government, should be

informing citizens about, should be holding itself to

account for, which simply aren't

published. So I think the first thing we

would call for is that

whenever government does something, whenever the law is

changed, whenever something important is going on, we

should be pushing for data to be made available, data to be made

open about that.

But I think this is not just a kind of government

probity and government accountability issue. This is

also about producing bigger datasets that other people can

use and make valuable research contributions to.

If I think about things which you know I would call

moonshots in a sense. Really important infrastructural data

projects that create lots of new opportunities. I think of

something like, if I were in Leeds for this event today, down

the road in Bradford, the Act Early North Project, which brings

together an absolutely unique set of data on early years

performance in Bradford, the youngest city in the UK,

which is going to be an absolute treasure trove for public health

researchers and anyone who cares about tackling deprivation

and child poverty, and more generally about improving the

world. That was, I'm told by the people who run it,

incredibly hard to get funding for because it didn't look like

a real academic project, but it wasn't a public health project.

Thanks to some farsighted work by the council and local NHS,

it did get funding, it cobbled it together and it created

something really impressive. But we have a kind of system where

funding academic papers, as Ben Goldacre, one of the earlier

speakers today, said, funding academic papers is easier,

funding data tools is harder, so I think pushing

on those kind of big projects is really important. Another

example from my time in government: I was involved in the

design of an R&D project called the Industrial Strategy

Challenge Fund and the idea of that was to put together pretty

big collaborations between business and researchers,

typically in the kind of hundreds of millions of pounds

range. The idea was, they would improve society. They would

generate economic growth and would be a good thing for the

government to do. Lots of the bids for those projects were

from kind of industrial companies, people who

manufacture things which are all really valuable. There were few

to none that were about building really valuable data resources

whether that's digital twins or public health data sets.

And I think this is an area where we really want to see

governments using more, being more inspired and thinking

about how they can invest and work with businesses and local

government to generate really valuable data resources

and to be more ambitious.

Those points all kind of come back to this economic idea

of spillovers. The reason why we want government spending money on

this is because the private sector will invest to a certain extent,

but there are public benefits;

these data assets are public goods.

But I also mentioned that intangibles and data

specifically have this kind of idea of synergies. They're

especially valuable when you combine them, and I guess there

the call to action for governments is to say, well, how

do we make those ideal combinations happen?

How do we create a world where collaboration works well,

happens a lot and is positive? I guess there's kind

of a few really interesting things that I look at that I

think we ought to be doing more

of. One is the effective and privacy respectful merging of

datasets. So the work that the Oxford Evidence-Based Medicine

DataLab have been doing on the OpenSAFELY platform, which has

seemed to be providing some really valuable insights into

the progress of the Covid epidemic, would be one example of

that. How do you bring together

extremely confidential medical information about people in a

way that you can analyze it with respect for confidentiality?

The idea that this is a really valuable investment that our

research funders, that our government, should be backing

seems really important if we want to make those synergies

work, that's absolutely essential. I guess

the second thing, and you know ODI Leeds is a great place to

be saying this, is more hubs and programs for building networks

to share data. I am a longstanding fan of the work

that ODI Leeds has been pioneering in this area and I

was thrilled that they did some work with me when I was in

government. I think ODI Leeds have really blazed a trail here.

I think other projects like DataKind UK, which brings together

data scientists and charities who often have underutilized

data sources or don't know how to collect the right data is

another great example.

So the idea of backing hubs, of creating more of them,

seems like something that maybe looks very touchy feely, but

it's hugely important if we want more collaboration in this area.

And then I guess the issue, which we've heard a bit about

already today, is how data ethics and standards interact with all

of this, and I think if I'm

pitching this message at government, the most valuable

thing that they should be looking at is how do we create

and further and codify the results of a national

conversation on data ethics. Thinking with my kind of

general science policy hat on,

this is something where there are success stories in the past in

other fields, so if we think about embryology for example,

which has been, and in many places in the world still is, an

incredibly controversial field, the UK government in the 1990s

was actually pretty forward thinking there and really

fostered a national conversation, got ethicists involved, in some

cases funded research, but also brought together discussions,

with the result that it created a relatively stable consensual

ethical framework for how that ran, which was

kind of to the advantage of the UK, not just in preventing bad

things, but actually in giving stability so that good things

could be done.

Clearly there's been a lot of talk and a lot of investment in

organizations that look at data ethics. But this is, I

think, something that we just need to keep on pushing on and

keep on supporting.

The other side of this is perhaps the international side,

and you know this conference, the collaboration between the

Netherlands and the UK is a really great example.

Great to see the Netherlands reaching out to us in this way.

I would hope that the UK government could start reaching

out as well and playing a more active role in international

efforts to set standards. I think as a British person that's

something where a relatively small investment could yield big

benefits both for the world as a whole and for Britain specifically.

And then I guess

a further dimension to this: in the briefing for this

session I was asked how we could make data work for the

many, not for the few, but when I think about it, I think perhaps we

should turn that on its head and say how can we make data work

for minorities wherever they may be in society rather than just

data that serves the interests of the majority, and particularly at a

time when the Black Lives Matter protests have made

systematic discrimination so salient in society, the need for

granular specific data that casts light on discrimination and

highlights injustices seems more urgent than ever.

I think this is an area where

there've been huge historical oversights in the part of public

data that we've been collecting, and obviously there is a

separate issue about the potential for discrimination by algorithms

that we use. This strikes me as a huge opportunity where the

right kind of intervention could make a huge difference.

So I guess where I would... it seems to me really

clear that there is a moonshot opportunity on data, an

opportunity for us as a society to invest more, to build more of

a collaborative around this, and to build norms

where we use data well and come together effectively around it.

I've talked about this to some extent from a public policy point

of view, but the public policy is not going to write itself.

The politicians are not going to do this themselves for all the

fine words that occasionally we hear about how they really like

Bayesian statistics and Monte Carlo simulations, to quote one

Minister from last weekend. This needs a movement and it will

essentially have to be a movement comprised of

people like you who work with data who know data and who know

the contribution that it can make to a good society. From that

point of view, it is really fortunate that data analysis and

data analysts are cool at the moment, are in demand, whether

that's in business or in government or in wider society.

Because it gives you and gives us a platform to make those

demands to point out how investment is really important,

how this needs to be done in the right collaborative and the

right ethical framework.

And the time to do this is very much now, so this is a

chance to make your voices heard,

to really drive

these moonshot projects that can change the world and to push

for better understanding and

more collaboration. So I think this is exciting to be

here and I look forward to seeing what we can do next.

  • Stian Westlake

    Chief Executive, Royal Statistical Society

    © Stian Westlake 2019

    Stian is the new Chief Executive of the Royal Statistical Society. Before this, he served as policy adviser to three UK science and innovation ministers, and spent eight years at Nesta, the UK's national foundation for innovation, where he ran the organisation's think-tank. He co-wrote Capitalism Without Capital (2017), a book about the knowledge economy, which was selected as a Book of the Year by the Financial Times, The Economist and Marginal Revolution, and Tomorrow, Interrupted, a book about productivity growth, which will come out in 2021.

Sponsors

Northernlands 2 is a collaboration between ODI Leeds and The Kingdom of the Netherlands, the start of activity to create, support, and amplify the cultural links between The Netherlands and the North of England. It is with their generous and vigorous support, and the support of other energetic organisations, that Northernlands can be delivered.

  • Kingdom of the Netherlands