A recent blog post by Peter Krantz has sparked some interesting dialog on whether governments publishing open data for citizens and application developers need to deploy an Application Programming Interface (API) to their data.
The full post can be viewed here. It provides a nice set of considerations for governments looking at standing up an API to serve data to those that want to build civic apps.
Standing up an API to serve open data is a non-trivial undertaking, and governments would be wise to consider the risks and costs associated with doing it.
I’d also suggest that those in government evaluating the best ways to serve open data to end consumers review the details provided in the Open Data Handbook.
The Handbook (published by the Open Knowledge Foundation) provides a nice summary of the various ways that governments can serve open data, and provides a thorough listing of pros and cons for each:
The data should be available as a complete set. If you have a register which is collected under statute, the entire register should be available for download. A web API or similar service may also be very useful, but they are not a substitute for bulk access.
In the discussion of whether to deploy an open data API, it’s important to keep in mind that APIs do not obviate the need for governments to provide data in bulk for download and use by consumers. Even if an agency offers an API (or several), there is a legitimate argument for ensuring that the data is also available in bulk, and there are a number of benefits to providing it in this format.
Providing data in bulk is often an effective way to have outside consumers validate data quality. I’m a lurker on a number of mailing lists for transit agency data, and it’s common on many of these lists for users to remark on data inaccuracies or areas where holes exist. To the credit of these transit agencies, the response to such input is almost always a quick adjustment to the data set.
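As a rough illustration of the kind of check an outside consumer might run against a bulk download, here is a minimal Python sketch that scans a CSV file for rows with missing values. The sample data, the column names and the `find_incomplete_rows` helper are all assumptions made up for this example; they are not taken from any particular agency’s feed.

```python
import csv
import io


def find_incomplete_rows(csv_text, required_fields):
    """Return (line_number, missing_fields) for each row lacking a required value."""
    problems = []
    reader = csv.DictReader(io.StringIO(csv_text))
    for row in reader:
        # Treat absent, empty, or whitespace-only values as missing.
        missing = [f for f in required_fields if not (row.get(f) or "").strip()]
        if missing:
            # line_num counts lines read so far, including the header line.
            problems.append((reader.line_num, missing))
    return problems


# Hypothetical sample, loosely modeled on a transit stop listing.
SAMPLE = """stop_id,stop_name,stop_lat,stop_lon
1001,Main St & 1st Ave,39.9526,-75.1652
1002,,39.9550,-75.1700
1003,Market St Station,,-75.1800
"""

if __name__ == "__main__":
    for line_number, missing in find_incomplete_rows(SAMPLE, ["stop_name", "stop_lat"]):
        print(f"line {line_number}: missing {', '.join(missing)}")
```

Run against the sample, this flags the stop with no name and the stop with no latitude. A check like this is only possible because the consumer has the whole data set in hand, which is part of the argument for bulk access.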
Having said that, there are some compelling reasons why governments might want to consider an API for high-value data, to complement any bulk downloads that are available.
Chief among them – it’s something that developers have come to expect.
It’s a mashup world and the web is now programmable – over the past several years, with the proliferation of so many web-based services behind REST APIs, many developers have come to expect (and even take for granted) that services and data are available through APIs.
In his post on the pros and cons of APIs, Krantz notes:
“[With bulk downloads…] This means entrepreneurs can get all your data, load it into their own system and design their API according to their use case. Also, high loads will hit their own infrastructure without affecting other apps.”
There are legions of entrepreneurs and app developers who would rather see high loads hit someone else’s infrastructure than their own. In fact, that’s often the appeal of API-based services: you can leave the tricky and complicated business of scaling to someone else and focus on how an application should work.
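To make the trade-off concrete, here is a minimal Python sketch of the bulk-download path Krantz describes: load the whole file into your own store, then write the one query your application needs. The table layout, the sample permit data and the function names are hypothetical, and an in-memory SQLite database stands in for whatever system a developer would actually run.

```python
import csv
import io
import sqlite3

# Hypothetical bulk file; a real one would be downloaded from the agency.
SAMPLE_CSV = """permit_id,ward,issued
P-001,2,2024-02-03
P-002,5,2024-01-20
P-003,2,2024-01-12
"""


def load_bulk_csv(conn, csv_text):
    """Load every row of the bulk file into a local table; return the row count."""
    conn.execute(
        "CREATE TABLE permits (permit_id TEXT PRIMARY KEY, ward TEXT, issued TEXT)"
    )
    rows = [
        (r["permit_id"], r["ward"], r["issued"])
        for r in csv.DictReader(io.StringIO(csv_text))
    ]
    conn.executemany("INSERT INTO permits VALUES (?, ?, ?)", rows)
    conn.commit()
    return len(rows)


def permits_by_ward(conn, ward):
    """A query shaped around one app's use case: permits in a ward, oldest first."""
    cur = conn.execute(
        "SELECT permit_id, issued FROM permits WHERE ward = ? ORDER BY issued",
        (ward,),
    )
    return cur.fetchall()


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    print(load_bulk_csv(conn, SAMPLE_CSV), "rows loaded")
    print(permits_by_ward(conn, "2"))
```

Everything here, from the storage to the refresh schedule to the capacity planning, becomes the developer’s job. That is the work many developers would rather hand to a hosted API.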
If one of the objectives of a government open data program is to attract outside developers to use government data, then government officials must be aware of this strong sentiment in the developer community.
There are a number of good options available for governments that want to provide access to data through an API but are wary of some of the risks associated with doing so.
One of the reasons that CouchDB – the open source document-oriented database, stewarded by the Apache Software Foundation – is so popular in the world of open data is that it has a powerful REST API baked into it.
The rise of cloud-based hosting services for CouchDB like IrisCouch and Cloudant, and innovative services built on CouchDB (like Max Ogden’s DataCouch) make this an appealing option for governments that want to provide API access to their data.
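To show what a built-in REST API means in practice, here is a minimal Python sketch that reads documents from a CouchDB database over plain HTTP using the `_all_docs` endpoint with `include_docs=true`. The host, the `transit_stops` database name and the helper functions are assumptions for illustration; a hosted service would have its own base URL and would likely require authentication, which this sketch leaves out.

```python
import json
from urllib.parse import quote, urlencode
from urllib.request import urlopen


def all_docs_url(base_url, db, limit=10):
    """Build the URL for CouchDB's _all_docs endpoint, asking for full documents."""
    query = urlencode({"include_docs": "true", "limit": limit})
    return f"{base_url.rstrip('/')}/{quote(db, safe='')}/_all_docs?{query}"


def docs_from_response(body):
    """Pull the documents out of an _all_docs JSON response body."""
    payload = json.loads(body)
    return [row["doc"] for row in payload.get("rows", []) if "doc" in row]


def fetch_docs(base_url, db, limit=10):
    """Fetch documents over HTTP; needs a reachable CouchDB server."""
    with urlopen(all_docs_url(base_url, db, limit)) as resp:
        return docs_from_response(resp.read().decode("utf-8"))


if __name__ == "__main__":
    # Hypothetical local server and database name.
    print(all_docs_url("http://localhost:5984", "transit_stops", limit=2))
```

No separate API layer sits between the client and the database here: the same HTTP interface that CouchDB uses for everything else is what an application developer would call.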
In addition, services like CKAN, Socrata, Junar and others provide an easy way for governments to provide both bulk access to their data and an open API for developers.
This also strikes me as an area where governments can share knowledge – larger governments with more robust IT infrastructures and larger budgets may already have wrestled with many of the challenges of standing up and managing an API. This knowledge could be invaluable to other governments just getting started on their open data programs.
There also might be opportunities for governments to collaborate and even share infrastructure for open data APIs.
Bulk access to open government data is critical, but the appeal of APIs to application developers is a reality that every government developing an open data program must consider.
APIs aren’t the only way (or even the most important way) of serving open data to end users and consumers – but governments should carefully evaluate ways that open data can be served through easy-to-use, well-documented interfaces for application developers.
To API or not to API – that is the question. Hopefully we are moving toward a place where government officials will have good answers for this important question.