Friday night – nothing on the TV – I know! I’ll browse through the Protection of Freedoms Bill, currently passing through the UK Parliament. Sad I know, but interesting.
Let’s scroll back in time a bit to November 19th 2010 and a government press conference introduced by a video from Prime Minister David Cameron. The headline story was about the publishing of government spending and contract data, but towards the end of this 109-second short he said the following:
… the most exciting is a new right to data. Which will let people request streams of government information and use it for social or commercial purposes. Take all this together and we really can make this one of the most open, accountable and transparent governments there is. Let me end by saying this. You are going to have so much information about what we do, how much of your money we spend doing it, and what the outcome is. So use it, exploit it, hold us to account. Together we can set a great example of what a modern democracy ought to look like. (my emphasis)
Obviously, to realise this Right to Data there needs to be some legislation, which brings me to the Protection of Freedoms Bill. This is one of those bills which covers all sorts of issues, from rules for the destruction of fingerprints and DNA profiles, CCTV camera regulations and the detention of terrorist suspects to freedom of information and data protection. Zooming in on the parts that deal with the release and publication of datasets held by public authorities, we find a set of clauses that amend the Freedom of Information Act 2000.
After some amendments which allow for datasets and provision in electronic form we get this: “the public authority must, so far as reasonably practicable, provide the information to the applicant in an electronic form which is capable of re-use.” Unfortunately there is no definition of the term re-use. It could be argued that a PDF of some tables in an MS Word document could be re-used, whereas I believe the spirit of the legislation should be made more explicit by identifying non-proprietary data formats. I know this would be a tricky job for the parliamentary draftsmen: we would not want to restrict it to named formats such as XML and CSV, which could age and be replaced by something better that then could not be used because it had not been mentioned in the legislation. Even so, I believe that just using the term ‘re-use’ is far too woolly and open to [mis]interpretation.
What is [not] a dataset
This is one of the areas that raises most concern for me. Check out this wording from the Bill:
I am OK with (a) – data collected as part of an authority doing its job – and (c) – don’t change the data you have collected – publishing that raw data is important. However, (b) specifically excludes data that is the product of analysis. Presumably analysis of collected data is one significant way that an authority measures the outcomes of its efforts. Understanding that analysis will help us understand the subsequent decisions and actions the authority takes. I assume that there may be some specific reasons that underpin this blanket exclusion of analysis data. If there are, they should be identified, instead of generally throttling the output of useful data that would go a long way towards helping with Mr Cameron’s stated ambition for us to be able to see “what the outcome is” of the spending of public money.
Release of datasets for re-use
This is a whole new section (11A) to be added to the 2000 Act to cover the release of datasets. It covers ownership, copyright, and/or database right of the information to be published and states that it should be published under “the licence specified by the Secretary of State in a code of practice issued under section 45”. Section 45 basically puts into the hands of the Secretary of State the definition of the licence(s) data should be published under. As of today the Open Government Licence for public sector information is what is wanted to keep the publishing of information open. However, what is there to stop a future Secretary of State, who has a less open outlook, from replacing it with far more restrictive licences? Do we not need some form of presumption of openness attached to the Secretary of State’s powers as part of this change in legislation?
On the topic of presumptions of openness, the wording of this bill contains phrases such as “unless the authority is satisfied that it is not appropriate for the dataset to be published” and “where reasonably practicable”. It is clear that many in the public sector are not as enthusiastic about publishing data as the current government, and such vague phrases may well be used unreasonably by some to justify throttling the stream of information. They could easily be used to build in a bureaucratic decision hurdle for each dataset to jump, proving its appropriateness and practicality, before publication. I am sure that it would not be beyond a parliamentary draftsman’s skill to produce wording that means that all datasets will be published unless a specific objection is raised for an individual dataset, on grounds of excessive effort or data protection.
Data published by an authority should be published under a scheme; the following applies here:
How should we interpret “any up-dated version held by the authority of such a dataset”? My interpretation is that once a dataset has been published it shall continue to be published as it changes. The precedent for this is spending data – having published authority spending for January 2011, authorities should be automatically publishing it for February and following months. But what if, in response to a request, an authority publishes the contents of a spreadsheet used to track the amount of salt applied to roads in its area during winter 2010-11 and then uses a different spreadsheet for the following winter? Does the output of that new spreadsheet constitute a new dataset, or an update to its predecessor? From the wording in the Bill it is not clear.
Who does it cover?
I probably need a bit of help here from those who understand the public sector better than I do, but I am suspicious that references to the organisations listed in Schedule 1 and “the wider public sector” do not cast the net wide enough to cover some of the data that is relevant to our daily lives but relates to services delivered on behalf of authorities by third parties. For example, I am aware that recently a large city was not able to inform citizens of their rubbish collection schedules because that data was considered commercially restricted by its service provider.
So in summary, I welcome the commitment to a right to data being realised by streams of government information about what it does, how much of our money is spent doing it, and what the outcomes are. However, I am sceptical as to how effective the measures in the current Protection of Freedoms Bill will be in delivering them, especially in the light of very recent comments made by the Prime Minister highlighting the “enemies of enterprise” in Whitehall and town halls across the country, attacking what he called the “mad” bureaucracy that holds back entrepreneurs. Those enemies are just the people who might take the wording of this bill as ammunition in their cause.
Whilst being concerned about this topic, I have been wondering why few are commenting on it. Are the majority just taking the press conference statements by David Cameron, and his fellow Ministers, as indications of a battle won, or am I missing something? I promote Sir Tim Berners-Lee’s 5 Star Data as the steps towards a Web of Linked Data – if we don’t get the publishing of public sector data to at least 3 star standard (Available as machine-readable structured data – in non-proprietary format), many of the current ambitions may remain just that, ambitions. That would be a massive missed opportunity.
So are we getting a right to data? – or just some provisions to extend the Freedom of Information Act a bit further in the dataset direction? I’m not sure.
Personal note: As you may tell from the above, I am no expert on the interpretation of parliamentary legislation, and I have left several unanswered questions hanging in this post. Any help in clarifying my thinking, confirming or disproving my assumptions, or answering some of those questions, will be gratefully received in comments to this post or your own posted thoughts.
This post was also published on the Nodalities Blog.