Realtime Open Data

December 15, 2014

I’ve been thinking a lot lately about data being collected about cities through remote sensor networks.

It’s never been easier to build DIY sensors, and some cities are starting to look seriously at how sensor data can inform better policy decisions and better investment of public resources.

It strikes me that this is a very relevant issue for those in the open data movement, as the data generated by urban sensor networks is likely to be mashed up with publicly available data from cities on crime, land use, service requests and a host of other things to drive better decision making. There’s a natural connection between the kinds of data we find in open data portals and the kind of data that is generated by emerging sensor networks.

It also strikes me that most municipal open data portals are not well suited to provide access to realtime data – the kinds of data that sensor networks are really good at generating.

Current State of Open Data Portals

Pretty much every modern open data portal provides a way to programmatically access data that is housed in it – data is accessed via an API by making an HTTP request (with the required information – e.g., authentication – in the request) and getting a response back (typically in either JSON or XML format). This data access paradigm fits well with the way that most of the data in municipal open data portals is updated – usually not more frequently than daily.
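As a rough illustration of that request/response pattern, here is a minimal Python sketch. The portal URL, the dataset name, the bearer-token style of authentication and the query filters are all assumptions made up for the example; real portals each define their own.

```python
import json
import urllib.parse
import urllib.request


def build_request(base_url: str, dataset: str, token: str, **filters) -> urllib.request.Request:
    """Build a GET request for a dataset on a hypothetical portal API."""
    query = urllib.parse.urlencode(filters)
    url = f"{base_url}/{dataset}.json" + (f"?{query}" if query else "")
    # Assumption: the portal accepts a bearer token in the Authorization header.
    headers = {"Authorization": f"Bearer {token}", "Accept": "application/json"}
    return urllib.request.Request(url, headers=headers)


def fetch_records(request: urllib.request.Request, timeout: float = 10.0) -> list:
    """Send the request and decode the JSON body of the response."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `fetch_records(build_request("https://data.example.gov/api", "service_requests", "my-token", limit=10))` would make one request and return one snapshot of the data, which is all this paradigm offers.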

If data updates happen frequently – or if a data consumer wants to check whether data has changed since the last time it was accessed – a consumer application can poll the API for changes at set intervals. This approach works acceptably well for data that doesn’t change all that often, but it is far from acceptable for data that does (or could) change more frequently. In fact, the closer data updates get to realtime, the less optimal this approach becomes, because it places a heavier burden on consumers (who must poll the API for data changes more frequently) and on the data portal itself (which must handle and respond to more frequent requests from API consumers).
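To make the cost of polling concrete, here is a small Python sketch of a polling loop. It is not taken from any particular portal: `fetch` stands in for whatever function calls the API, and comparing a hash of each response is just one assumed way of detecting a change.

```python
import hashlib
import json
import time
from typing import Callable, Iterator


def fingerprint(records: list) -> str:
    """Hash a JSON-serializable payload so two polls can be compared cheaply."""
    encoded = json.dumps(records, sort_keys=True).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


def poll_for_changes(
    fetch: Callable[[], list],
    interval_seconds: float,
    max_polls: int,
    sleep: Callable[[float], None] = time.sleep,
) -> Iterator[list]:
    """Call fetch() up to max_polls times, yielding only payloads that changed."""
    last_seen = None
    for attempt in range(max_polls):
        records = fetch()  # every poll is a full request, changed or not
        current = fingerprint(records)
        if current != last_seen:
            last_seen = current
            yield records
        if attempt < max_polls - 1:
            sleep(interval_seconds)
```

Every pass through the loop costs a request on both sides, even when nothing has changed. Shortening `interval_seconds` to get closer to realtime only multiplies those wasted requests.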

Other – more efficient – approaches to accessing data can be used when data updates occur more frequently. These approaches – like server-sent events and Websockets (which are both part of the HTML5 specification), or registering a callback URL (or Webhook) – benefit both the data consumer and the data producer.
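For a sense of what the push model looks like on the wire, here is a minimal Python sketch of a parser for the server-sent events text format, where `event:` and `data:` lines are grouped into one event by a trailing blank line. It is deliberately simplified (it ignores the `id` and `retry` fields), and the event name in the usage note is invented for illustration.

```python
from typing import Iterable, Iterator


def parse_sse(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one dict per event from the lines of a server-sent events stream."""
    event_type, data_lines = "message", []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line ends the current event; events with no data are dropped.
            if data_lines:
                yield {"event": event_type, "data": "\n".join(data_lines)}
            event_type, data_lines = "message", []
            continue
        if line.startswith(":"):
            continue  # comment line, often sent as a keep-alive
        field, _, value = line.partition(":")
        if value.startswith(" "):
            value = value[1:]  # a single leading space after the colon is ignored
        if field == "event":
            event_type = value
        elif field == "data":
            data_lines.append(value)
        # Simplification: other fields such as "id" and "retry" are ignored here.
```

Fed the lines `event: flight_update`, `data: {"status": "delayed"}` and a blank line, the parser yields a single event. The consumer opens one long-lived connection and does nothing until the server has something to say.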

Getting to Realtime

The closest thing I can identify to a realtime open data API is one that we built in the City of Philadelphia for flight information from the Philadelphia International Airport. This API uses data from the airport flight information system and is updated every three minutes (about the same frequency as data is updated on the Airport’s website and on flight information displays in the airport terminals). It provides a simple REST API for making standard HTTP calls for data on specific flights, and was also designed with a Websocket endpoint to allow realtime connections.

Another interesting realtime data project from Chicago is ClearStreets (a project of Open City, which has built a number of powerful civic apps for the City of Chicago) that shows the realtime position of plows as they clear the streets after heavy snow.

Even more exciting is the OpenSensors project, a platform that supports data aggregation from remote sensor networks – the project hosts open data projects at no cost and allows anyone to subscribe to data feeds from these open sensor network projects.

I think these examples show how municipal open data portals can move in the direction of supporting realtime data, and – perhaps more importantly – how governments can begin to understand the coming importance of providing ways for data consumers to use realtime methods for accessing data.

Practical First Steps

It can be tempting to think of the need for realtime data as being closely coupled with the use of sensors. But even in places where sensor networks are not yet built out (or even planned), there are lots of opportunities for open data to become closer to realtime.

Crime incidents, parking citations, 311 service requests, road closures, permit and license issuance – these are all activities that occur every hour of every day as a part of municipal operations. And yet the data that is generated by these activities is still largely consumed through open data portals in a fashion that best fits data which is updated only periodically.

Wouldn’t it be useful if data consumers could subscribe to a specific topic or channel (like Service Requests or Building Permits) for a specific neighborhood, register a callback URL and then receive a push of JSON representing the specific event when it occurred? No more wasteful polling for changes that consume resources on both the client and data portal side – just send me information on an event I care about when it occurs.
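Here is a minimal Python sketch of what that subscribe-and-push flow could look like on the portal side. The `WebhookHub` class, the topic names and the shape of the JSON payload are all hypothetical; a real service would also need retries, authentication and persistent storage for subscriptions.

```python
import json
import urllib.request
from typing import Callable, Optional


def post_json(url: str, body: bytes) -> None:
    """Deliver one event to a subscriber's callback URL with an HTTP POST."""
    request = urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}, method="POST"
    )
    with urllib.request.urlopen(request, timeout=5) as response:
        response.read()


class WebhookHub:
    """In-memory registry of callback URLs keyed by (topic, neighborhood)."""

    def __init__(self, send: Optional[Callable[[str, bytes], None]] = None):
        self.send = send or post_json
        self.subscriptions = {}

    def subscribe(self, topic: str, neighborhood: str, callback_url: str) -> None:
        self.subscriptions.setdefault((topic, neighborhood), set()).add(callback_url)

    def publish(self, topic: str, neighborhood: str, event: dict) -> int:
        """Push the event to every matching subscriber; return how many were sent."""
        payload = {"topic": topic, "neighborhood": neighborhood, "event": event}
        body = json.dumps(payload).encode("utf-8")
        urls = sorted(self.subscriptions.get((topic, neighborhood), ()))
        for url in urls:
            self.send(url, body)
        return len(urls)
```

A consumer would call `subscribe("service_requests", "some-neighborhood", "https://example.org/hooks/311")` once, and the portal would call `publish(...)` each time a matching event occurred. No requests flow in either direction between events.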

In some instances, the barriers to making more realtime data available from governments are related to technology – some legacy systems may not make it practical to expose data in this way. But as cities start producing more and more data – particularly as remote sensor networks become more common – the demand for ways to consume data in more appropriate ways will increase.

Will municipal open data portals be able to keep up with this demand? We’ll see.

Cities, civic hacking, Open Data

4 responses to “Realtime Open Data”

bencodeforamerica

December 15, 2014 at 5:14 pm

I’m currently working on a realtime feed for Open311 requests across multiple cities (finally updating open311status.org). It’s somewhat bittersweet though, because very little 311 data actually updates in realtime; instead it seems to be batch-published at certain intervals. It’s nice to get that data pushed when it is published, but it’s somewhat less exciting than a true realtime stream.

…which is to wonder at a more granular level what level of integration is required for realtime data. Open311 is often bolted on to another system, which means it itself is doing batch-polling. So perhaps a requirement of realtime is a deeper integration of the public API with the backend data and processing system.
mheadd

December 15, 2014 at 5:24 pm

Woah – really looking forward to seeing the updates to open311status.org. Very cool!

“…perhaps a requirement of realtime is a deeper integration of the public API with the backend data and processing system.”

Completely agree. That’s pretty common, not just with 311 systems but with other stuff as well. But I wonder how much an increased demand for data that is realtime will help to foster this change. If people start using your realtime feed for Open311, will it help to highlight the fact that the underlying data push is done in batches?

Will be interesting to see.
Andrew Turner (@ajturner)

December 15, 2014 at 8:27 pm

Looking forward to more realtime data coming online. It’s definitely a chicken / egg problem. Twitter realtime visualizations are prevalent because the data are so easily accessible. There is tremendous potential to re-apply that same energy to municipal data, if we can make it as easy and attractive to work with.

This week we launched “Stream Services” – this should enable any GIS department to start publishing sensor/vehicle feeds via WebSockets. We’re hoping this can be yet another enticement for organizations to start streaming data, not just publishing it.

http://resources.arcgis.com/en/help/arcgis-rest-api/index.html#/Stream_Services/02r300000288000000/
bigfleet

December 30, 2014 at 3:45 pm

The City of Charlotte’s Code for America fellowship in 2014 generated http://www.citygram.org/ which, in its final incarnation, would find this output to a webhook very natural. The maintenance and sustainability of that program has fallen to me (and the Charlotte brigade) so it’s a volunteer effort, but we are excited about Citygram and its future!

About Me

I am the former Chief Data Officer for the City of Philadelphia. I also served as Director of Government Relations at Code for America, and as Director of the State of Delaware’s Government Information Center. For about six years, I served in the General Services Administration’s Technology Transformation Services (TTS), and helped pioneer their work with state and local governments. I also led platform evangelism efforts for TTS’ cloud platform, which supports over 30 critical federal agency systems.

Civic Innovations