Open Data with Hadley Beeman (and Glyn Wintle)
Not many people have heard of LinkedGov. It began as a conversation over coffee. Hadley is talking today about open government data. There wasn’t a lot of data out there, and what there was needed to be cleaned. Hadley’s background is in collaborative technologies.
She started LinkedGov to spread the load of the work needed to make the data clean and useful. Working with Glyn Wintle, the Technology Strategy Board and others, she found out that a lot of companies are trying to use Government data. The Technology Strategy Board offered to help kick things off.
They’re talking about all Government data, from transport stats to the stats relating to hospitals, Council Tax, benefits etc. It’s all the data collected by and for Government, all of it non-personally identifiable.
Why aren’t we there yet?
It comes down to two basic problems:
- A lot of the data has codes that make no sense to outsiders. The first challenge was to build a way to ask civil servants what the data means only once, then use the answer over and over. Glyn explains how difficult this problem was before they worked this out, a technical person
- Typos and other problems with the data. For example, marking something as closed that hasn’t been since 2009. Formatting issues. They have data in RDF, 20% in PDF, a lot of CSV and Excel (definitely their favourite option).
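The "ask once, reuse over and over" idea from the first problem could look something like the sketch below. This is only an illustration, not LinkedGov's actual implementation: the `CodeBook` class and the codes `CLS` and `OPN` are invented here.

```python
# Hypothetical sketch: every name and code below is invented for illustration.
from typing import Dict, Set


class CodeBook:
    """Stores the meaning of each code once so it can be reused on every dataset."""

    def __init__(self) -> None:
        self.meanings: Dict[str, str] = {}
        self.unanswered: Set[str] = set()

    def define(self, code: str, meaning: str) -> None:
        # A civil servant answers once; the answer is kept for later reuse.
        self.meanings[code] = meaning
        self.unanswered.discard(code)

    def decode(self, code: str) -> str:
        if code in self.meanings:
            return self.meanings[code]
        # Unknown code: remember it, so the question is only queued once.
        self.unanswered.add(code)
        return code


book = CodeBook()
book.define("CLS", "closed")

rows = ["CLS", "OPN", "CLS", "OPN"]
decoded = [book.decode(code) for code in rows]
questions = sorted(book.unanswered)  # codes still waiting for an explanation
```

Because unknown codes go into a set, a code that appears thousands of times still produces a single question for the person who knows what it means.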
Linking and Linked Data
They use Google Refine to go through data. There’s a lot that technology can do to wade through badly formed data. There’s also a lot that a human can do. She shows an example of two Councils that present data in very different formats. They often get a human to make the connection between data sets that a computer would dismiss as wildly different. Another example is high streets. They combine Royal Mail data with other data to make something that’s very useful.
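As a rough sketch of that human-in-the-loop step, suppose a person has decided which column headings from two councils mean the same thing, and the computer then applies that decision to every row. The headings, supplier and amounts here are made up for illustration and are not taken from the talk.

```python
# Hypothetical sketch: council headings and values are invented for illustration.
from typing import Dict, List

# A person, not the computer, decides that these headings mean the same thing.
HUMAN_COLUMN_MAP: Dict[str, str] = {
    "Supplier Name": "supplier",
    "Paid To": "supplier",
    "Amount (GBP)": "amount",
    "Value": "amount",
}


def to_shared_schema(row: Dict[str, str]) -> Dict[str, str]:
    """Rename known headings to shared names; keep unknown headings unchanged."""
    return {HUMAN_COLUMN_MAP.get(key, key): value.strip() for key, value in row.items()}


council_a = [{"Supplier Name": "Acme Ltd ", "Amount (GBP)": "120.00"}]
council_b = [{"Paid To": "Acme Ltd", "Value": "75.50"}]

combined: List[Dict[str, str]] = [to_shared_schema(r) for r in council_a + council_b]
total = sum(float(r["amount"]) for r in combined)
```

Once both councils share the same field names, their rows can be queried together, which is what makes the combined data more useful than either set alone.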
Underneath everything is a great big web of linked data. Each council has this. We end up with a highly complex graph. How do you wade through it? When cleaning up data you usually want to get a non-technical person to do it. They used Google Refine, but it’s designed for technical users who are used to how the tool works. LinkedGov devised a way to allow non-technical people to clean up data. This greatly increases the number of people who can work on the data.
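To give a feel for what a "web of linked data" means, here is a toy graph stored as subject, predicate, object triples, with a simple wildcard query over it. The identifiers and predicate names are invented for this sketch; a real system would more likely use RDF tooling than a Python set.

```python
# Hypothetical sketch: the identifiers and predicates below are invented.
from typing import List, Optional, Set, Tuple

Triple = Tuple[str, str, str]

graph: Set[Triple] = {
    ("council:A", "hasDataset", "dataset:spending-a"),
    ("council:B", "hasDataset", "dataset:spending-b"),
    ("dataset:spending-a", "mentions", "supplier:acme"),
    ("dataset:spending-b", "mentions", "supplier:acme"),
}


def match(s: Optional[str] = None, p: Optional[str] = None,
          o: Optional[str] = None) -> List[Triple]:
    """Return triples matching the given parts; None acts as a wildcard."""
    return sorted(
        t for t in graph
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    )


# Which councils have a dataset that mentions the same supplier?
datasets = {subject for subject, _, _ in match(p="mentions", o="supplier:acme")}
councils = sorted(s for s, _, obj in match(p="hasDataset") if obj in datasets)
```

Even in this tiny example, the answer comes from following links across two data sets, which hints at why the full graph becomes hard to wade through.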
Most interestingly, LinkedGov uses games to get information out of data. Their aim is to use games as tools to get as many people as possible involved in the project. LinkedGov is a volunteer-driven website; in order for it to survive, everybody putting something in has to get much more out of it.
For the civil servants letting us know what the codes mean, the payoff is that their data gets worked on by others. They get access to their own data.
Hadley shows an example of the queries the site puts together. This was the motivation for civil servants to contribute to the site. Otherwise they’d have no reason: exposing this data meant more work, potentially less money, and perhaps showing where they’re not doing their job properly. But being able to query the data (once worked on by volunteers) meant they’d comply.
This was an overview of what they’re up to. Any questions?
Q. What volunteers are you looking for?
A. They need a lot of help designing games. Interfaces, the framework. They want them to be mobile friendly. The alpha of LinkedGov is designed to provoke developers to help make things better. Glyn says “It’s a crap product, please help us improve it”.
Q. Have you looked into Google’s visuals API?
A. There’s a lot of potential for this, but up until now they’ve been looking at the backend.
Q. Do you have any plans to help Government enter new data that they’re not at the moment?
A. LinkedGov is the “project that shouldn’t need to exist”. There are bits of Government where there are very, very smart people and they’re willing to help. However, it takes effort to convince senior managers that this is a good idea.
Q. Is there any evidence that data geeks respond to scoring?
A. They’ve done a lot of work establishing the motivations of the different people involved. For the data geeks, they’re hoping that points will make it more fun, but they’re also putting in a motivation pathway so that it’s a better alternative to doing it yourself. There’s usually someone out there who’s interested.
Q. How are you linking with Data.gov?
A. Hadley went to speak to them. They were really happy to help but they have no money. They’re focussed on transparency, whereas LinkedGov is helping people to make money from the data. Different aims, but it’s the same data so they help each other. One point: most of this data is Crown Copyright; however, 99.99% of data coming out of Government is under the Open Government Licence (OGL).