On Data Standards for Cities

Creating open data standards for cities is really, really hard. It’s also really, really important.

Data standardization across cities is a critical milestone that must be reached to advance the open data movement and to fully realize the potential benefits of openly publishing government data. More and more people are starting to recognize the importance of this milestone, and more and more energy will be devoted to creating new standards for city data in the months and years ahead.

The best example of what is possible when governments publish open data that conforms to a specific standard is the General Transit Feed Specification (GTFS). Developed by Google in partnership with the Tri-County Metropolitan Transportation District of Oregon (TriMet), GTFS is a data specification that is used by dozens of transit and transportation authorities across the country, and it has all of the qualities that open data advocates hope to replicate in other data standards for cities.

Transit authorities that publish GTFS data see an immediate, tangible benefit because their transit information becomes available in Google Transit. Making this information more widely available benefits both transit agencies and transit riders, but the immediacy with which transit agencies can see this benefit makes GTFS particularly valuable. Data standardization is an easier sell to government officials when tangible benefits are quickly realized.

The GTFS standard is relatively easy to use – it’s a collection of zipped, comma-delimited text files. This is a pretty low bar for transit agencies being asked to produce GTFS data, and it’s an eminently usable format for consumers of GTFS data. In fact, the ease of use of GTFS has spawned a cottage industry of transit applications in cities across the country, and the format continues to serve as the bedrock set of information for transit app developers.
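
Because GTFS is just comma-delimited text, a minimal Node.js sketch of consuming one of its files fits in a few lines. The stop rows below are invented for illustration, and the naive comma split ignores the quoted fields a real parser would need to handle:

    // Invented sample rows from a hypothetical stops.txt file in a GTFS feed.
    var sample = [
      'stop_id,stop_name,stop_lat,stop_lon',
      '1001,Main St & 1st Ave,39.9526,-75.1652',
      '1002,Main St & 2nd Ave,39.9531,-75.1660'
    ].join('\n');

    // Naive parse: split rows on newlines and fields on commas (a real
    // parser would also handle quoted fields that contain commas).
    var lines = sample.split('\n');
    var headers = lines[0].split(',');
    var stops = lines.slice(1).map(function (line) {
      var fields = line.split(',');
      var stop = {};
      headers.forEach(function (header, i) { stop[header] = fields[i]; });
      return stop;
    });

    console.log(stops[0].stop_name); // "Main St & 1st Ave"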

And perhaps most importantly, GTFS has given open data advocates a benchmark to use to advance other data standardization efforts. In many ways, GTFS made standards like Open311 possible.

So if data standardization is the future, and we’ve got at least one really good example to demonstrate the benefits to stakeholders and advance the concept, then what’s next? What’s the next data standard that will be adopted by multiple governments?

For the past year or so, there has been widespread interest in developing a shared data standard for food safety inspection data. On its face, this seems like a good candidate data source to standardize across cities. Most cities (certainly all large cities) conduct regular inspections of establishments that serve food to the public. This information can be (but is not always) fairly succinct – usually a letter grade or numerical ranking – and can easily be delivered to an end user on a number of different platforms and channels. For many reasons, focusing on food safety inspection data as the next best data set to standardize across cities makes a lot of sense.

Just recently, the joint efforts of several different groups culminated in an announcement by the City of San Francisco and Yelp to deliver standardized food safety inspection data through the Yelp platform.

I was involved in the discussions about a data standard for food safety inspections, though the City I work for will not be adopting the newly developed standard (at least not yet). The process of developing the new food safety inspections data standard was illuminating. There are some important lessons we can take away from this work – lessons we can put to use as we work to identify additional municipal data sets for standardization.

For me, the biggest lesson from the work that went into standardizing food safety inspection data is learning to recognize when applying a data standard might obscure important differences in how data is collected, or in what the data means. By way of example, a data standard like GTFS does not obscure differences in the underlying data across different jurisdictions. A transit schedule broken down to its essence is about location and time – when will my bus be at a specific stop on a specific route. There is nothing inherently different about this information from jurisdiction to jurisdiction. Time and place mean the same thing everywhere.

But this is not always the case with food safety inspection data – particularly when this data is distilled into digestible (pun intended) scores or rankings. The methods for conducting food safety inspections can vary widely from city to city, and those differences mean the resulting data can look very different depending on where it comes from.

Daniel E. Ho, a professor at Stanford University, conducted an in-depth study of the restaurant inspection systems in New York City and San Diego and found that differences in how inspection regimes are implemented can produce data that varies significantly when compared across cities.

“While San Diego, for example, has a single violation for vermin, New York records separate violations for evidence of rats or live rats; evidence of mice or live mice; live roaches; and flies — each scored at 5, 6, 7, 8 or 28 points, depending on the evidence. Thirty ‘fresh mice droppings in one area’ result in 6 points, but 31 droppings result in 7 points.”
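
A toy sketch makes the comparability problem concrete. The point values below are invented (loosely echoing the 30-versus-31-droppings example in the quote) and are not either city’s actual methodology, but they show how two cities can publish identically formatted scores that mean very different things:

    // Invented scoring rules -- illustrative only, not either city's actual
    // inspection methodology.
    function scoreSanDiego(inspection) {
      // One generic vermin violation with a single flat point value.
      return inspection.verminObserved ? 6 : 0;
    }

    function scoreNewYork(inspection) {
      // A separate mouse-evidence violation scored by dropping count.
      if (!inspection.verminObserved) return 0;
      return inspection.mouseDroppings <= 30 ? 6 : 7;
    }

    var sameRestaurant = { verminObserved: true, mouseDroppings: 31 };
    console.log(scoreSanDiego(sameRestaurant)); // 6
    console.log(scoreNewYork(sameRestaurant));  // 7 -- same facts, different score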

There also appears to be some debate in the medical community about the effectiveness of simplified grading for food establishments – i.e., using a letter grade or a numerical score. As noted in Professor Ho’s report – “…a single indicator has not been developed that summarizes all the relevant factors into one measure of [food] safety.”

All that said, if we’re going to advance the work of creating data standards across cities, we need to identify the right data sets to standardize. These candidate data sets should have the same qualities as GTFS – immediate, tangible benefits to data producers and data users, and ease of use – without the less desirable qualities of food safety inspection data, which obscures differences in data collection and data quality across jurisdictions.

Lately, I’ve been trying to advance the idea that data about the locations where flu shots are administered (or any other form of inoculation) could be standardized across cities. I’ve gotten some great input from data advocates and from other cities, like Chicago and Baltimore.

I’m hoping to continue pushing this idea in the months ahead, leading up to the next flu season. If this most recent flu season has shown us anything, it’s that data matters – I think there could be enormous benefit in having cities use a standard data format for this information before the onset of the next really bad flu season.

But whether it’s flu shot locations or some other data set, the future of open data lies in building standards that multiple cities and governments can adhere to. This is the next great milestone in the open data movement.

Advancing the movement toward this goal will be the most important work of the open data community in the months and years ahead.

[Note – photo courtesy of the San Diego International Airport]


Building an Open311 Application with Node.js and CouchDB

Lots of work is being done to finalize the next version of the Open311 API spec (officially referred to as GeoReport V2).

Almost a year ago I launched TweetMy311 – a service that lets people report non-emergency service requests using a smart phone and Twitter. Since then, a lot has changed – not only with the Open311 specification but with the tools available to build powerful Twitter-based applications.
Node.js
In the last several months, I’ve spent a lot of time learning about and working with Node.js. Some of the things I did in the initial version of TweetMy311 (written in PHP) are so much easier to do in Node.js that I’ve decided to completely rewrite the application to use Node. In addition, since I initially launched TweetMy311, CouchDB (the NoSQL database on which the app is built) has also seen a lot of enhancements.

I’m expecting the overhaul I’m currently working on to make the application code a lot more efficient and easier to understand. Once this overhaul is complete, I intend to release a big chunk of it as open source software, so that anyone who wants to build a powerful Node.js/CouchDB-based civic app can do so.

It’s also exciting to see new cities get on board the Open311 bandwagon. The City of Boston is now supporting Open311 and has started to issue API keys to developers.

As part of my work to overhaul TweetMy311, I’ve developed a neat little Node.js library for interacting with the Open311 API. Since I just started working with the Boston implementation, I thought it would be helpful to walk through a quick example for others interested in doing the same.

If you want to run this example for yourself, you’ll need to have Node.js installed, specifically the latest version – v0.4.2. If you have the Node Package Manager installed, you can simply do:

npm install open311

Once you’ve done this, you should be able to run the following script:
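
A sketch of the idea is below. Note that the constructor options, the serviceList method name, and the Boston endpoint URL are illustrative assumptions rather than the module’s documented interface – check the module’s README for the real details:

    // Illustrative sketch only: the options, method name, and endpoint URL
    // here are assumptions, not the module's documented API.
    var open311 = require('open311');

    var boston = new open311.Open311({
      endpoint: 'https://mayors24.cityofboston.gov/open311/v2/', // placeholder URL
      format: 'xml'
    });

    // GeoReport v2 defines a "service list" call that enumerates the request
    // types (potholes, graffiti, and so on) that a city accepts.
    boston.serviceList(function (err, services) {
      if (err) throw err;
      console.log(services);
    });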

This will print the list of service request types supported by the endpoint.

This is just a quick example of how to make the most basic of API calls with the Node.js Open311 module. You can use this module to build fully featured Open311 applications.

I’ll be doing some more blogging in the weeks ahead as the rewrite of TweetMy311 continues, and work on this phase of the GeoReport V2 spec is concluded.

Stay tuned!


The Key to Open Gov Success: Common Standards

There is a really good post on the state of open government in Canada and the use of specific data licenses by Canadian cities over on David Eaves’ blog.

His post raises an important issue for the open government movement, one that I believe will ultimately determine its success or failure – the adoption of common standards by multiple governments in support of open government. This is something I’ve touched on before.

Eaves’ recent post discusses the importance of common licensing standards for open data. Equally important, in my mind, are other standards like those being developed for Open311, and standards for data formats (like GTFS).

One of the intended outcomes of the open government movement is the development of applications built on top of open data and open APIs. One of the primary advantages for governments from this type of “civic development” stems from the fact that (with rare exception) governments are not in direct competition with each other, and face common challenges.

This means that solutions built to address issues in one jurisdiction or municipality can potentially provide a benefit in other municipalities. That is the theoretical underpinning for efforts like Civic Commons.

But for this to work, there must be mutually agreed upon standards for things like data formats, APIs and data licensure to name just a few. Crafting and adopting these standards is work. Hard work. And making this even more difficult is the fact that there are those who would benefit from the absence of such standards – software vendors and other service providers.

Without painting all such vendors with the same brush (there are some notable exceptions), the absence of standards allows vendors to lock customers into their particular solutions, and provides an opportunity for them to sell the same solution over and over to different governments.

I’m not against capitalism (far from it), but governments need to get wise to the fact that common standards for data and APIs are what will ultimately help deliver on the promise of open government.

They also need to recognize that there are those who do not wish to see such standards adopted, or open government succeed.


Open311 Goes Big Time

This was a big week for the Open311 initiative. Federal CIO Vivek Kundra joined Gov 2.0 rock star Mayor Gavin Newsom from the City of San Francisco to announce a national initiative to adopt a common standard for a 311 API.

The number of supporters for the initiative is growing, and I think it’s high time that developers started getting in on the act.

There isn’t a publicly available sandbox (that I am aware of) for developers to use to develop Open311 applications. The Open311 website, however, has some detailed information on the API standard as well as some sample XML responses that the API will provide.

Based on this information, I’ve started working on a set of PHP classes for interacting with the Open311 API. It’s still rough, and it will obviously undergo many changes as more information on the API is developed, and public test infrastructure is set up.

Still, it’s a start (and it was fun to write!) – I’m hopeful that others will help develop this set of classes. Hit me up at mheadd [at] voiceingov.org if interested.


Coming Full Circle on 311

Tomorrow in New York City, developers, project managers, public policy specialists and others will come together to discuss an open standard specification for 311 services. One of the primary motivators for this discussion is the work that has been done by the District of Columbia, which has deployed an open API for reporting 311 requests.

The idea behind the Open311 project is that both citizens and government are better served by having a uniform standard for 311 APIs like the District of Columbia’s. An open standard will better facilitate the development of applications and services that make submitting 311 requests easier and more convenient for citizens, and more cost effective for governments. From the Open311 website:

Open311 is not meant to refer to a specific app or any one incarnation of 311 services. Instead Open311 intends to be a specification of an open platform for 311 services…Once this core standard is defined, new user interfaces and custom workflows can be created by anyone and shared between cities to provide distributed innovation.

To understand where 311 services may go as a result of efforts like Open311, or of more independent efforts to deploy 311 APIs, it helps to understand how 311 services operate today.

311 Today

Most municipal 311 services are centered around a call center operation. The call center is staffed with personnel that a citizen can speak with to report an issue. 311 call center personnel are trained and typically have a predefined script that they use when interacting with a caller. This script ensures that they collect all required information from the caller and enter it into the 311 system.

Like many call centers, 311 centers may utilize some limited routing logic or an ACD to send a citizen to a specific agent or group of agents based on the issue they want to report. Also, like most call centers, the largest cost component (or certainly one of the big ones) is likely to be labor costs. There are other costs worth noting as well, some of them borne by citizens – like the cost of waiting in queue when all agents are occupied with other callers.

Interestingly, if you look at some of the details of the responses from the DC 311 API, you can see the call center roots of this service pretty clearly.

The response from the DC 311 API for the abandoned bicycle service type, for example, shows many of the elements you would expect to see in a call center script, including prompts and the characteristics of the data entry fields the agent would use.
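
An illustrative sketch of that kind of structure – with invented field names and values, not the verbatim DC response – might look like this:

    // Illustrative only -- invented names, not the verbatim DC API response.
    var abandonedBicycleService = {
      service_code: 'ABANDONED-BICYCLE', // hypothetical service type code
      attributes: [
        {
          prompt: 'Where is the bicycle located?', // agent-script style prompt
          datatype: 'string',
          required: true,
          max_length: 100 // a constraint on the agent's data entry field
        },
        {
          prompt: 'Is the bicycle locked to public property?',
          datatype: 'singlevaluelist',
          values: ['Yes', 'No', 'Unknown']
        }
      ]
    };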

311 Tomorrow?

By deploying a 311 API, municipalities can encourage the development of new applications and different interfaces for submitting 311 service requests. This benefits citizens because there will potentially be more options to use when they need to contact 311. It can benefit governments because it can facilitate the submission of 311 requests without the need for one of the most cost-intensive components – 311 operators. Governments might also benefit from better information when requests are submitted – 311 applications that run on location-aware devices can easily submit geographic coordinates that may be more precise than an address spoken by a person.

A good example of the kind of innovation that can be fostered by deploying 311 APIs is the winner of the most recent round of the Apps for Democracy Contest — Social DC 311 — a combination iPhone/Facebook application.

But amid all of the potential for new applications and slick new interfaces enabled by 311 APIs and the development of an Open311 standard, there is another (less obvious) platform on which innovative, cutting-edge applications can be built.

The ordinary telephone.

Why Phones Matter

Simply stated, phones matter in providing government services because almost all citizens have them (landline telephone penetration rates are somewhere close to 95 percent nationally, and cell phone penetration rates are at about 85 percent). Moreover, almost all citizens that have them understand how to use them, and have some experience navigating IVR or touch tone menu systems. There is no learning curve to ascend before a service request can be submitted.

Telephones are the most ubiquitous communications device on the planet, and they do not suffer from the uneven distribution rates of other consumer communications products. 311 service was built around the telephone, so it’s a natural interface for new applications.

As stated above, there are plenty of examples of 311 systems that use some automation through IVR and other technologies to route calls to agents. But, at the end of the routing process, a citizen talks to an agent and gives information about a service request to another human.

There are enormous efficiencies that could be gained by being more aggressive with IVR-based automation. Admittedly, not every call (or every caller) is right for an IVR system – that’s OK. One of the things IVR systems do is help entities more appropriately allocate scarce resources. Citizens that can serve themselves will do so using an IVR. Those that can’t (or won’t) will drop out to an agent and submit their request the old fashioned way. The finite number of agents on hand to take calls get focused on “high need” callers that require human assistance.
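
The triage logic is simple enough to sketch – the function and field names below are invented for illustration:

    // Toy triage sketch: let callers who can self-serve do so, and reserve
    // the finite pool of agents for callers who need human assistance.
    function routeCall(caller, agentsAvailable) {
      if (caller.completedIvrFlow) {
        return 'request submitted via IVR'; // no agent time consumed
      }
      if (caller.optedOut || caller.needsHelp) {
        return agentsAvailable > 0 ? 'connect to agent' : 'hold in queue';
      }
      return 'continue IVR prompts';
    }

    console.log(routeCall({ needsHelp: true }, 3)); // "connect to agent"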

Back To The Future

As I stated in a previous post, there has simply never been a more varied and powerful array of tools available for developers to build phone applications than exists today. Open standards and open APIs (like those listed below) have removed the barriers between phone application development and traditional web development. What’s even more exciting is the potential introduced by the increasing ubiquity of VoIP, which is blurring the lines between traditional telephony and other communications channels.

Governments don’t need expensive proprietary software or hardware, or a team of highly trained developers to build and deploy a high-volume IVR phone application. With choice has come ease of use and downward pressure on costs. Some of the newest and most innovative platforms around for building phone applications are listed below:

Want to build a portable phone app that conforms to open standards from the W3C? VoiceXML and CCXML are your ticket – these standards are supported on a large (and growing) number of platforms.

Want to build an open source solution that leverages your in house skills in PHP or Ruby? Stand up an Asterisk server and get cracking with PHPAGI or Adhearsion.

Need to connect to the PSTN but don’t want to invest in expensive specialized hardware? Call up a VoIP provider and deploy some SIP trunks.

The sea change in telephone application development over the last several years means more developers have the skills and tools to build sophisticated phone applications. It also means that governments need not be locked into any one platform or vendor with unique expertise in phone systems.

As the Open311 dialog moves forward, and as more and more municipalities begin deploying 311 APIs, it will be interesting to see what develops. Whatever awaits those of us that are interested in what is to come, I hope that there will be some phones involved.


Open Gov: A Means to an End

With all of the activity and excitement taking place around the country focused on new Government 2.0 and open government initiatives, it’s easy for those involved to get lost in the technology. Those of us that love technology and work with it for a living can get lost pretty quickly in the minutiae of implementing a new solution.

A perfect example of this in my mind is the recently released iPhone App developed by the City of Boston for submitting municipal complaints. When asked why the city chose to develop an iPhone application, a senior advisor to the Mayor said:

“We chose the iPhone mostly because of its sex appeal – because it’s new and it’s hot.”

Don’t get me wrong, I love my iPhone and I think it’s exciting that state and local governments are developing applications for it, to make it easier for citizens to interact with their governments. I salute the City of Boston’s initiative in developing an application that makes it easier to submit municipal service requests. But most of the people that live in Boston don’t own iPhones. Most of the cell phone owners in Boston don’t have an iPhone either – so why choose the iPhone as a platform for a publicly funded application?

The city might have been better off developing an application that worked on more mobile devices. This could have been a web-based application that worked in the micro browsers that come with older cell phones as well as the more powerful browser software that ships with iPhones, G1 phones and other advanced mobile devices. They might have even developed a voice/DTMF interface for people (like my Mom) that use their cell phones the old-fashioned way. If they had, a lot more people might have been able to use the new service.

The point is that the goal of Gov 2.0 initiatives should not be the deployment of the “hottest” applications on the platforms with the most “sex appeal.” Gov 2.0 initiatives, and all of the exciting new technologies they bring to the table, are good for one thing – helping governments do their jobs more efficiently. That’s it.

As more governments jump on the Gov 2.0 bandwagon, it will be important for public officials to remain focused on the goals of their governments, their agencies and their offices – this will require an intimate understanding of the mission of government and a well-developed set of metrics to help determine if Gov 2.0 technologies are helping governments more efficiently achieve their goals.

With this in mind, it was extremely gratifying to see Beth Noveck (of Wiki Government fame, who leads President Barack Obama’s open-government initiative) say the following:

Q: How will you measure the impact of these [open government] innovations?

A: Developing recommendations on transparency and open government has to include a process for developing metrics. We can talk about the number of data feeds we’ve released, or the number of people who’ve participated in rule making [but] we really have to look at transparency and participation to a specific end. So if our goal is improving the quality of American education or increasing accessibility and affordability of health care, we really have to look at those as the metrics and ask ourselves, “How does driving innovation into the way the public sector works help us to ultimately do the job better of making those hard policy decisions?”

Here’s hoping that those involved in Gov 2.0 and open government initiatives around the country take the time needed at the inception of their projects to ask the questions: “What exactly are we trying to achieve here?” and “How will we measure our performance so that we know we’re making progress toward our goal?”

Or, when in doubt, ask – what would Beth Noveck do?


Measuring Gov 2.0 Performance

“If you can not measure it, you can not improve it.”

- Lord Kelvin

There is a lot of exciting news lately coming from state and local governments about innovative new uses for social networking and Gov 2.0 tools. Even the smallest burgs and hamlets in our fair nation are on Twitter, and even the lowliest first-term legislator has a Facebook page – sometimes before they have an office assignment.

But before governments go too far down the road of building Gov 2.0 tools into their business processes, it may be worth exploring if conventional performance measures are adequate to measure if (and by how much) Gov 2.0 tools are improving the job being done by governments. As Lord Kelvin said – “To measure is to know.”

In the late 1990s and early 2000s, many governments implemented new e-Government services for their citizens and reorganized service delivery around Internet-based functionality. Government performance measures were infused with terms like unique hits, click-throughs and the like to more adequately track performance through this new channel.

Does the advent of Gov 2.0 and the increased use of social networking tools warrant a re-examination of the ways that governments evaluate how good a job they are doing? How do you measure customer satisfaction when a government interacts with a citizen via Twitter, or leaves a comment on a blog or Facebook page?

More importantly, how do governments measure (and capture) the cost savings that may be brought about by using social networking tools and approaches like “Wiki-Government”?

Some things to think about as Gov 2.0 gets more mature, and more widely used.
