I started the previous post in this mini-series with an assumption: “…working on the assumption that publishing this [local government spending] data is a good thing.” That post attracted several comments, fortunately none challenging the assumption. So, learning from that experience, I am going to start this post with another assumption: publishing Local Authority data, such as local spending data, as ‘Linked Data’ is also a good thing. Those new to this mini-series can check back to the previous post for my reasoning behind the assertion.
In this post I am going to be concentrating more on the How than the Why Bother.
To help with this I am going to use some of the excellent work that Stuart Harrison at Lichfield District Council has done in this area as examples. Take a look at the spending data part of their site: spending.lichfielddc.gov.uk/. On the surface, navigating your way around the site, looking at council spend by type, subject, month, and supplier, is the kind of experience a user would expect. Great for a website displaying information about a single council.
However, it is more than a web site. Inspection of the Download data tab shows that you can get your hands on the source data in csv format. Here is one line, representing a line of expenditure, from that data:
"http://statistics.data.gov.uk/id/local-authority/41UD","Lichfield District Council","2010-04-06","7747","http://spending.lichfielddc.gov.uk/spend/8605670","120.00","BRISTOW & SUTOR","401","Revenue Collection","Supplies & Services","Bailiff Fees",""
… which represents the data displayed on the human-readable page for that payment.
Looking through the csv, you can pick out the strings of characters for information such as the date, supplier name, department name etc. In addition you can pick out a couple of URIs:
- http://statistics.data.gov.uk/id/local-authority/41UD – The UK Government identifier for Lichfield DC
- http://spending.lichfielddc.gov.uk/spend/8605670 – Lichfield’s identifier for this payment
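For those who would rather let code do the picking, here is a minimal Python sketch that reads the sample line above and pulls out the fields that look like URIs. It deliberately avoids naming the columns, since the header row is not shown here; the test for "looks like a URI" is simply my assumption that such fields start with http:// or https://.

```python
import csv
import io

# The sample line of expenditure from the download, with plain double quotes.
ROW = (
    '"http://statistics.data.gov.uk/id/local-authority/41UD",'
    '"Lichfield District Council","2010-04-06","7747",'
    '"http://spending.lichfielddc.gov.uk/spend/8605670","120.00",'
    '"BRISTOW & SUTOR","401","Revenue Collection","Supplies & Services",'
    '"Bailiff Fees",""'
)

# csv.reader handles the quoting, so the ampersands and the empty
# trailing field come through as ordinary strings.
FIELDS = next(csv.reader(io.StringIO(ROW)))

# Assumption: a field is an identifier if it starts with an http(s) scheme.
URIS = [f for f in FIELDS if f.startswith(("http://", "https://"))]

for uri in URIS:
    print(uri)
```

Everything else in the row stays a plain string, which is exactly the point: in the csv, only these two fields can be followed anywhere.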
In the context of csv, that is all these URIs are: identifiers. However, because they are http URIs, you can click through to the address to get more information. If you do that with your web browser you get a human-readable representation of the data. These sites also provide access to the same data, formatted in RDF, for use by developers.
You can see that data by adding ‘.rdf’ to the end of the address, thus: http://spending.lichfielddc.gov.uk/spend/8605670.rdf and then selecting the ‘view source’ option of your browser for the page of gobbledegook that you get back.
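A developer would probably do the same thing in code rather than in a browser. The sketch below is only that, a sketch: `rdf_url` and `fetch_rdf` are names I have made up for illustration, the ‘.rdf’ suffix convention is the one described above for these sites and may not hold elsewhere, and the UTF-8 decoding is an assumption about the response.

```python
from urllib.parse import urlsplit, urlunsplit
from urllib.request import urlopen


def rdf_url(resource_uri: str) -> str:
    """Return the address of the RDF document for a resource URI.

    Follows the '.rdf' suffix convention described in the post;
    any fragment on the URI is dropped.
    """
    parts = urlsplit(resource_uri)
    path = parts.path.rstrip("/")
    if not path.endswith(".rdf"):
        path += ".rdf"
    return urlunsplit((parts.scheme, parts.netloc, path, parts.query, ""))


def fetch_rdf(resource_uri: str, timeout: float = 10.0) -> str:
    """Download the RDF for a resource (assumes the site is reachable)."""
    with urlopen(rdf_url(resource_uri), timeout=timeout) as response:
        # Assumption: the document is served as UTF-8.
        return response.read().decode("utf-8")


print(rdf_url("http://spending.lichfielddc.gov.uk/spend/8605670"))
```

Calling `fetch_rdf` with the same URI would return the gobbledegook as a string, ready to hand to an RDF parser, provided the site still answers at that address.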
Inspecting the RDF, you will see that most things, except descriptive labels and financial values, are now identified as URIs such as http://spending.lichfielddc.gov.uk/subjective/bailiff-fees and http://spending.lichfielddc.gov.uk/invoice/7747. Again, if you follow those links, you will get a human-readable representation of that resource, and the RDF behind it by adding a ‘.rdf’ suffix.
The eagle-eyed, inspecting the RDF-XML for Lichfield payment number 8605670, will have noticed a couple of things. Firstly, a liberal sprinkling of elements with names like payment:expenditureCategory or payment:payment. These come from the Payments Ontology as published on data.gov.uk as the recommended way of encoding spending, and other payment associated data, in RDF.
Secondly, you may have spotted that there is no date, or supplier name or identifier. That is because those pieces of information are attributes associated with a payment – invoice number 7747 in this case.
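To make those two observations concrete, here is a small Python sketch that separates the properties that point at other resources from the ones that hold plain values. The RDF/XML in it is a hand-written illustration, not a copy of what Lichfield serves: the two URIs and the property names payment:expenditureCategory and payment:payment come from the text above, while the namespace address and the payment:netAmount property are my assumptions about the Payments Ontology.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# Illustrative only: a cut-down stand-in for the real document.
SAMPLE = """<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:payment="http://reference.data.gov.uk/def/payment#">
  <rdf:Description rdf:about="http://spending.lichfielddc.gov.uk/spend/8605670">
    <payment:expenditureCategory
        rdf:resource="http://spending.lichfielddc.gov.uk/subjective/bailiff-fees"/>
    <payment:payment
        rdf:resource="http://spending.lichfielddc.gov.uk/invoice/7747"/>
    <payment:netAmount>120.00</payment:netAmount>
  </rdf:Description>
</rdf:RDF>"""


def split_properties(rdf_xml: str):
    """Return (links, literals) for the described resources.

    links maps a property name to the URI it points at;
    literals maps a property name to its plain text value.
    """
    links, literals = {}, {}
    root = ET.fromstring(rdf_xml)
    for description in root.iter(f"{{{RDF_NS}}}Description"):
        for prop in description:
            name = prop.tag.split("}", 1)[-1]  # drop the namespace part
            target = prop.get(f"{{{RDF_NS}}}resource")
            if target is not None:
                links[name] = target
            else:
                literals[name] = (prop.text or "").strip()
    return links, literals


LINKS, LITERALS = split_properties(SAMPLE)
print(LINKS)
print(LITERALS)
```

To find the date or the supplier you would follow the payment link to invoice 7747 and read them from there, which is the linked-resource way of looking at it. For anything beyond a quick look, a proper RDF library would be a better choice than an XML parser, because the same RDF can be written as XML in several different ways.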
Zooming out from the data for a moment, and looking at the human-readable form, you will see that most things, like spend type, invoice number, and supplier name, are clickable links, which take you through to relevant information about those things: address details and payments for a supplier, all payments for a category, etc. This intuitive, natural navigation style often comes as a positive consequence of thinking about data as a set of linked resources instead of the traditional rows and columns that we are used to. Another great example of this effect can be found on a site such as the BBC Wildlife Finder. That is not to say that you could not have created such a site without even considering Linked Data; of course you could. However, data modelled as a set of linked resources almost self-describes the ideal navigation paths for a user interface to display it to a human.
The Linked Data practice of modelling data, such as spending data, as a set of linked resources and identifying those resources with URIs [which if looked up will provide information about that resource] is equally applicable to those outside of an individual authority. By being able to consume that data, whilst understanding the relationships within it and having confidence in the authority and persistence of the identifiers within it, a developer can approach the task of aggregating, comparing, and using that data in their applications more easily.
So, how do I (as a local authority) get my data from its raw flat csv format into RDF with suitable URIs, and produce a site like Lichfield’s? The simple answer is that you may not have to: others may help you do some, if not all, of it. With help from organisations such as esd-toolkit, OpenlyLocal, and SpotlightOnSpend, and with projects such as the xSpend project we are working on with LGID, much of the conversion [from csv], data formatting, and aggregation is being addressed; maybe not as quickly or completely as we would like, but it is. As for a human-readable web view of your data, you may be able to copy Stuart by taking up the offer of a free Talis Platform Store and then running your own web server with his code, which he hopes to share as open source. Alternatively, it might be worth waiting for others to aggregate your data and provide a way for your citizens to view it.
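For the curious, the core of such a conversion can be sketched in a few lines of Python. This is not Stuart’s code, nor what any of the organisations above actually run. The dictionary keys, the `slug` and `row_to_triples` helpers, and the namespace address for the Payments Ontology are all my assumptions; the URI patterns simply mirror the Lichfield examples quoted earlier.

```python
import re

BASE = "http://spending.lichfielddc.gov.uk"
# Assumed namespace for the Payments Ontology.
PAYMENT = "http://reference.data.gov.uk/def/payment#"
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"


def slug(label: str) -> str:
    """Turn a label such as 'Bailiff Fees' into 'bailiff-fees'."""
    return re.sub(r"[^a-z0-9]+", "-", label.lower()).strip("-")


def escape(text: str) -> str:
    """Escape backslashes and double quotes for an N-Triples literal."""
    return text.replace("\\", "\\\\").replace('"', '\\"')


def row_to_triples(row: dict) -> list:
    """Build N-Triples lines for one line of expenditure.

    The keys of `row` are hypothetical names for the csv columns.
    """
    line_uri = row["line_uri"]
    invoice_uri = f"{BASE}/invoice/{row['invoice_number']}"
    category_uri = f"{BASE}/subjective/{slug(row['category'])}"
    label = escape(row["category"])
    return [
        f"<{line_uri}> <{PAYMENT}payment> <{invoice_uri}> .",
        f"<{line_uri}> <{PAYMENT}expenditureCategory> <{category_uri}> .",
        f'<{category_uri}> <{RDFS_LABEL}> "{label}" .',
    ]


EXAMPLE_ROW = {
    "line_uri": "http://spending.lichfielddc.gov.uk/spend/8605670",
    "invoice_number": "7747",
    "category": "Bailiff Fees",
}
TRIPLES = row_to_triples(EXAMPLE_ROW)
print("\n".join(TRIPLES))
```

The mechanics are the simple part; deciding what the URIs should look like, and who gets to mint them, is where the real work lies.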
As easy as that then! Well, not quite: there are some issues about URI naming and creation, and how you bring the data together, that still need addressing by those engaged in this. But that is for Part 3…
This post was also published on the Nodalities Blog.