5 Minute REST API

Inspired by one of Waldo Jaquith’s recent projects to create an easy solution for converting data into a REST API, I wanted to try doing the same thing with a bit of a different approach.

Below is a quick screencast demonstrating how to take a CSV file and convert it into a powerful, flexible REST API.

I use the following tools in this demo, some of which I have mentioned before:

I haven’t used CouchDB for a project in a while, but I still wonder at its power and utility – for my money, there is no easier way to create a REST API than with CouchDB.
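As a rough illustration of the CSV-to-CouchDB idea (not the exact steps from the screencast), the sketch below turns CSV text into the JSON payload that CouchDB's `_bulk_docs` endpoint accepts. The function name `csvToBulkDocs`, the sample columns and the naive comma splitting (no quoted fields) are all assumptions made for this example.

```javascript
// Hypothetical helper: convert simple CSV text (no quoted fields or
// embedded commas) into a payload for CouchDB's _bulk_docs endpoint.
function csvToBulkDocs(csvText) {
  var lines = csvText.trim().split(/\r?\n/);
  var headers = lines[0].split(',').map(function (h) { return h.trim(); });
  var docs = lines.slice(1).map(function (line) {
    var values = line.split(',');
    var doc = {};
    headers.forEach(function (header, i) {
      doc[header] = (values[i] || '').trim();
    });
    return doc;
  });
  // _bulk_docs expects an object with a "docs" array.
  return { docs: docs };
}

// Sample data (made up for this example).
var sampleCsv = 'name,ward\nCity Hall,5\nLove Park,5';
console.log(JSON.stringify(csvToBulkDocs(sampleCsv)));
```

The resulting JSON could then be sent as the body of a POST to a database's `_bulk_docs` URL with a `Content-Type: application/json` header, after which each row is available as a document through CouchDB's REST API.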

What I think this example demonstrates is that there is an abundance of tools that can be used to create powerful APIs, making it easier and more practical to build solutions with open data.

Find a tool or approach that best fits your technical skills and get hacking!

Don’t Hang Any Pictures

A few lessons learned and insights gained serving as a municipal Chief Data Officer – I hope this will be of value to anyone interested in starting an open data program.


Helping to create an open data culture in government is a process, not an outcome. It takes time. Though it may sometimes seem easier to shame people into releasing data, or to find ways around seemingly unreasonable objections, getting public officials to “own” the process of opening up their data is a key part of building this culture. But…

Don’t listen to people that tell you to “take pride in how far you’ve come in such a short time.” This kind of attitude assumes that your job is almost done and you can be satisfied in what you’ve accomplished. You should focus not on how much you have already done, but on how much you have left to do. You have lots and lots more work left. Trust me.

When appointed or elected officials release open data, or adopt innovative change, it’s critical to do what you can to make sure they get recognized for it – do this regardless of their party or political affiliation. Transparency is more important than politics. But…

Don’t let people feel like they’re doing you a favor, or going beyond the call of duty by releasing open data. If your city has an executive order covering open data then they are following a directive issued by the Mayor. If your city has an open data statute then they are following the law. You don’t get special credit for following the law.

Always be respectful of elected or appointed officials and understand that as important as you believe your job is, theirs is equally important. Not everyone will believe in what you’re trying to do as strongly as you do, but they will recognize if your actions are sincere. Find ways to frame open data and government innovation in ways that support the mission of their departments or agencies. Show them how their jobs can be made easier by embracing change. But…

Don’t become close or chummy with elected or appointed officials. It makes it more difficult to have the hard conversations when they’re needed. Some confrontation and uncomfortable meetings, where you remind people of their obligations under an open data policy, are inevitable. You should not seek out a fight if it can be avoided, but you also shouldn’t shrink from one if it can’t.

Take pride in the fact that you are a public servant. I believe strongly in the value of public service and that spending time in government working to make change happen is important. I rearranged my life and moved my family to a new state for the opportunity to serve a city that I have come to love. But…

Don’t hang any pictures in your office. Don’t get comfortable. Recognize that your time in public service is temporary. The office I currently occupy looks much like it did 519 days ago when I walked into it. The contents of the cabinets and desk drawers are those that the previous occupant left behind. There are no decorations, few items with sentimental value, nothing to suggest permanence.

You should be a little nervous – at all times – that you may have pushed too hard and the city fathers and mothers will grow weary and decide that your services are no longer needed.

If you don’t feel that way, you’re probably not pushing hard enough.

[Note – photo courtesy of Flickr user Theen Moy]

Turning Government Web Content Into APIs

Inspired by a recent Open City project that repurposes data on sewage in the Chicago River, I wanted to work through a quick example of turning a web page that houses useful information on water quality in Philly into an API.

Here is a quick screencast showing how easy it is to take information rendered as part of a standard HTML page and turn it into a useful JSON API.

It goes without saying that as more and more data producers in government learn about how data can be repurposed by others – both inside and outside of government – we’ll see more and more valuable information exposed as structured data and APIs. Viva #opendata!

Until then, there are some very capable tools available for turning web content into powerful, reusable APIs.

Urban Storytelling with Open Data

One of the most important qualities of open data is that it can provide unparalleled insight into how a city works.

Open data empowers urban storytelling – the process of identifying a trend, or some important characteristic of an urban area and then presenting that information in a compelling way for others.

When I speak about open data, one of the things I emphasize is that the tools for creating stories with data are becoming more powerful and widespread every day. With the right data, it’s not hard to tell an important or compelling story about a city. To prove my point, I decided to “put my money where my mouth is” at a recent civic hacking meetup in the Northern Liberties section of Philadelphia.

The City of Philadelphia and the Philadelphia Police Department recently released a hugely important dataset – major crime incidents occurring in the city. This data is available either as a static download (updated with the latest information each morning) or through a set of APIs. I decided to use this data set, and a tool that’s been on my list to experiment with for a while – Torque, from CartoDB – to tell an urban story.

The code I wrote to tell my story is on GitHub, and you can see the data visualization I built here.

Below is a screencast of the steps I took to build this visualization – as I have said many times, the amount of effort required to build a compelling civic application (provided the right data is available) is trivial. See if you agree after watching this short video.

The technology is elegant. The visualization is cool. But the story itself is rather sad – there are way too many homicides in Philadelphia.

Our Mayor and other city leaders are working hard to change this. Just recently, Mayor Nutter articulated the frustrations of mayors from across the country who are calling for tighter restrictions on the guns that cause so many of the homicides in Philadelphia and elsewhere.

I think part of the change that Mayor Nutter and other mayors are calling for is showing people the magnitude of the problem. This is what a compelling visualization can do – it can reach someone the way that a written story or even a one-on-one conversation often cannot.

The problem of violence that plagues urban areas isn’t just a challenge for our city leaders. It’s a challenge for all of us – we all need to be a part of making positive change happen. One way to reach people and motivate them to be a part of the change is to build a compelling visualization – to tell a compelling urban story. With the right tools, almost anyone can do it.

This is the power of open data.

From “Zero to Civic” in 5 Minutes

One of the most powerful things about open government data is all of the options it enables.

Open data is the single most important ingredient in civic apps, and it can also power visualizations, mashups and countless other civic uses. The very best open data typically lends itself to several different kinds of uses.

When paired with the increasingly sophisticated tools and APIs available to developers, building powerful and useful civic applications with open government data has never been easier.

This post is meant to provide a quick overview of using open government data to build a civic application – to demonstrate that someone with moderate coding ability, even if they’ve never built a civic application before, can go from “zero to civic” in just minutes.

The screencast below demonstrates how to take an open data set for health inspections from the City of Louisville, KY and build a simple but easy to use text messaging application using open source and free tools.


Building Multichannel Transit Apps with Tropo

This post is the third in a series about building an open source transit data application using GTFS data from the Delaware Transit Corporation.

In the first post, I described how to download the State of Delaware’s transit data and populate a MySQL database with it.

In the previous post, I walked through a process of setting up stored procedures for querying the transit data and setting up a LAMP application environment.

Now we’re ready to write code for our transit app!

Choosing a Platform

One of the most underappreciated developments that has accompanied the increasing amount of government data available in open formats is the vast array of new tools now available for developers to use. I’ve talked about this a lot in the past but it bears repeating – it has never been easier to build sophisticated, multi-channel communication applications than it is now.

The number of options open to developers is truly exciting, but there are some platforms that rise above the rest in terms of ease of use and in what they enable developers to do. For this project, I will use the Tropo WebAPI platform.

The Tropo WebAPI has a number of advantages that will come in handy for our transit app project (and any other projects you’ve got in the works). You can write a Tropo app in one of several popular scripting and web development languages – Ruby, Python, PHP, C# and JavaScript (Node.js). There are libraries available for each language that make it easy to build Tropo apps and to integrate with the Tropo API. (Disclaimer – I’ve worked on several of these libraries.)

In addition, the real magic that Tropo brings to the table is the ability to serve users on multiple communication channels (phone, IM, SMS, Twitter) from a single code base. This is especially important for an application meant to service transit riders. These users may not have the luxury of sitting in front of a desktop computer in order to look up information on a bus route or schedule. They are much more likely to be traveling and using some sort of phone or mobile device. The Tropo WebAPI is perfect for our needs.

Vivek Kundra, the former CIO of the District of Columbia and current CIO of the United States, has described the effort by governments to release data in open formats as “the democratization of data” – these efforts make previously hard to get, or hard to use data available for everyone.

I like to describe platforms like Tropo and the various libraries that are available to use with it as “the democratization of application development” – these tools make building powerful communication apps simple for anyone who understands web development.

Building our Transit App

Before we can build our application, we need to decide what it will do.

For our purposes, this has already been determined by the stored procedures we built in the last post. Our transitdata database has 2 stored procedures – one to return the nearest bus stops to a specific address or location, and one to return the next bus departure times from a specific bus stop.

However, this series of posts is meant to inspire readers to build their own applications – now that you have transit data in a powerful relational database like MySQL you can query it any way you like. In addition, the SQL scripts and steps developed for this series of posts can certainly be used with the data from any other transit agency that uses the GTFS format. There are lots. Use your imagination – build whatever you find useful.

So now that we have some idea of what we want our application to do, we need to select a development language. It will probably come as no surprise that for this example I’m going to use the PHP scripting language and the PHP Library for the Tropo WebAPI. PHP is a good match for Linux, Apache and MySQL – all technologies we used in the previous entries in this series of blog posts.

If you want some more detailed information on building PHP applications that run on the Tropo WebAPI platform, you can review a separate series of blog posts on this issue here.

To get the PHP Library for the Tropo WebAPI, you can download it and unpack it on your web server, or simply clone the GitHub repo.

Once you do that, you can grab the code for our demo application from GitHub as well.

In order to test this application, you’ll need to sign up for a free Tropo account – you can do that here. Once you are signed up, go to the Applications section in your Tropo account and set up a new WebAPI application that points to the location of our PHP script on your web server. You can see more detailed information on setting up a Tropo account here.


Note – You’ll also need an API key from Google Maps for geocoding addresses – get one here. Change the following line in the application to include your Google API key:

define("MAPS_API_KEY", "your-api-key-goes-here");

Once your Tropo account and application are set up, you can add as many different contact methods as you like – your Tropo application is automatically provisioned a Skype number, a SIP number and an iNUM.

To illustrate how our transit app will work, I’ve gone ahead and assigned a Jabber IM name to my app – findthebus@tropo.im. Add this to your friends/user list in Google chat and you can use the app I’ve set up. Here’s what it looks like in my IM client:


As you can see, my first IM to findthebus@tropo.im sends the address of a building in Downtown Wilmington (actually, a building I used to work in). The app responds with the three closest bus stops and the distance (in miles) to each.

I then send the number of the bus stop I am interested in. The app responds with the next three buses to leave that stop, the route served by each and the number of minutes before each departs.
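The “minutes before each departs” figure comes down to simple time arithmetic on GTFS departure times. In the demo app that work is done by the stored procedure; the JavaScript below is only a sketch of the same idea, and the helper names `minutesUntilDeparture` and `nextDepartures` and the sample times are assumptions made for illustration.

```javascript
// Hypothetical helper: minutes from "now" until a GTFS departure_time.
// GTFS times are "HH:MM:SS" and may exceed 24:00:00 for trips that run
// past midnight on the same service day.
function minutesUntilDeparture(departureTime, nowTime) {
  function toSeconds(hms) {
    var parts = hms.split(':').map(Number);
    return parts[0] * 3600 + parts[1] * 60 + parts[2];
  }
  var diff = toSeconds(departureTime) - toSeconds(nowTime);
  return Math.floor(diff / 60);
}

// Keep only departures that have not left yet and take the next three.
function nextDepartures(departureTimes, nowTime) {
  return departureTimes
    .map(function (t) {
      return { time: t, minutes: minutesUntilDeparture(t, nowTime) };
    })
    .filter(function (d) { return d.minutes >= 0; })
    .sort(function (a, b) { return a.minutes - b.minutes; })
    .slice(0, 3);
}

// Sample times (made up for this example).
var times = ['08:15:00', '07:50:00', '08:05:00', '08:40:00', '25:10:00'];
console.log(nextDepartures(times, '08:00:00'));
```

Comparing times as seconds since the start of the service day, rather than as clock times, is what lets values such as 25:10:00 sort correctly after late-evening departures.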

How cool is that!

I could very easily make this application more sophisticated, so that it delivers content tailored to specific channels (e.g., IM vs. phone), but I want to keep things simple for now.

In the next blog post of this series, we will introduce some additional tools, including Google Maps and the new hotness in cloud telephony – Phono.

Stay tuned!

Democratizing Transit Data with Open Source Software

Democratizing government data will help change how government operates—and give citizens the ability to participate in making government services more effective, accessible, and transparent.

Peter Orszag, OMB Director

This post is a continuation in a series on building a transit data application using GTFS data recently released by the State of Delaware.

If you missed my first post, go back and check it out. You can get a MySQL database loaded up with all of the Delaware GTFS data in just a couple of minutes. Once you do that, you’ll be ready to follow along.

Continuing our work from the last post, in this post we’ll finish building out our database and set up an environment to run a web application – for the purposes of the demo app I’m building for this series, I’ll assume you have a standard LAMP setup to work with.

Finish the Database Setup

In the last post, we downloaded the GTFS data from the State of Delaware, unzipped it and loaded it into a MySQL database. Now, we need to set up some stored procedures so that we can extract data from our MySQL database and present it to an end user.

You can see the stored procedures I created for this demo application on GitHub. To load them into our shiny new database, simply run:

  ~$ wget http://gist.github.com/raw/632306/
  ~$ mysql -u user_name -p transitdata < dartfirststate_de_us_procs.sql

That’s it!

If you look at these procedures, you’ll see that they are set up to answer two different questions from users. The first one – getDepartureTimesAndRoutesByStopID – will query our database and get a set of routes and departure times by the ID of a transit stop. The other – GetClosestStopsByLocation – accepts a lat/lon and returns the stop ID and name of the transit stops closest to the requesting location.

In practice, you can see these two procedures working in tandem – the latter procedure would be used by someone wishing to find the transit stop closest to their present location. The former would provide information on the next buses to reach that stop, the routes they serve and the scheduled departure time from that location.
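To make the “closest stops” idea concrete, here is a small JavaScript sketch that ranks stops by great-circle distance using the haversine formula. This is one common approach and not necessarily the calculation used in the stored procedure on GitHub. The field names follow the GTFS stops.txt columns, while the function names and the sample stops are made up for illustration.

```javascript
// Great-circle distance in miles between two lat/lon points (haversine formula).
function distanceInMiles(lat1, lon1, lat2, lon2) {
  var R = 3958.8; // mean Earth radius in miles
  var toRad = function (deg) { return deg * Math.PI / 180; };
  var dLat = toRad(lat2 - lat1);
  var dLon = toRad(lon2 - lon1);
  var a = Math.sin(dLat / 2) * Math.sin(dLat / 2) +
          Math.cos(toRad(lat1)) * Math.cos(toRad(lat2)) *
          Math.sin(dLon / 2) * Math.sin(dLon / 2);
  return 2 * R * Math.asin(Math.sqrt(a));
}

// Return the `limit` stops nearest to the given location, closest first.
function closestStops(stops, lat, lon, limit) {
  return stops
    .map(function (s) {
      return {
        stop_id: s.stop_id,
        stop_name: s.stop_name,
        miles: distanceInMiles(lat, lon, s.stop_lat, s.stop_lon)
      };
    })
    .sort(function (a, b) { return a.miles - b.miles; })
    .slice(0, limit);
}

// Sample stops (made-up IDs, names and coordinates).
var sampleStops = [
  { stop_id: 'A', stop_name: 'Stop A', stop_lat: 39.7459, stop_lon: -75.5466 },
  { stop_id: 'B', stop_name: 'Stop B', stop_lat: 39.7559, stop_lon: -75.5466 },
  { stop_id: 'C', stop_name: 'Stop C', stop_lat: 39.8459, stop_lon: -75.5466 },
  { stop_id: 'D', stop_name: 'Stop D', stop_lat: 39.7469, stop_lon: -75.5466 }
];
console.log(closestStops(sampleStops, 39.7459, -75.5466, 3));
```

Doing this ranking inside the database, as the stored procedure does, avoids shipping every stop to the application just to sort them.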

There are certainly many more potential queries that could be used to extract valuable information from the GTFS data in our database, but these two should suffice for our demo application. Also, both are pretty well suited for use from a text messaging (SMS) application, which is what we’ll build in the last post in this series.

Setting up the Application Environment

I assume for this series of posts that you have access to a LAMP server. This server should be hosted somewhere where it can receive HTTP posts from a third party platform (this is required in order to build an SMS application).

While it is not a requirement that you code your transit application in PHP, I will do so in this series. Feel free to use the development language of your choice in building your own application – just about every web development language can work with MySQL.

Before we start writing code, let’s finish a few last items. First, let’s create a user for our web application – remember to give this user only the privileges they need. For our demo application, the web app user only needs to EXECUTE stored procedures. So, we want to do this at the MySQL shell:

mysql> GRANT EXECUTE ON transitdata.* TO username@'localhost' IDENTIFIED BY 'password'; 

Be sure to replace the ‘username’ and ‘password’ above with values of your choosing. Now, let’s put our database access credentials in a safe and convenient place.

When writing a web application, I prefer not to store this information in my code (as a config item or declared constant). Instead, I like to keep this information in my Apache configuration.

If you’re using Apache on Ubuntu, you can typically just store this information in your VirtualHost file (located in /etc/apache2/sites-available/). Use the Apache SetEnv directive to set the values you want to store:

SetEnv TRANSIT_DB_HOST localhost
SetEnv TRANSIT_DB_USER username
SetEnv TRANSIT_DB_PASS password
SetEnv TRANSIT_DB_NAME transitdata

Again, be sure to replace the ‘username’ and ‘password’ above with the values used when creating your MySQL user. Once you have entered these values into your VirtualHost file, save it and reload Apache:

 ~$ sudo /etc/init.d/apache2 reload

Now we’re all set to start writing code!

In the next post we’ll build a simple, yet powerful PHP-based SMS application that anyone with a cell phone can use to find a transit location nearest to them in the State of Delaware, and find out the departure times / routes from that location.

Stay tuned!

How to Build an Open Transit Data Application

Earlier this year, I had the chance to work with one of my state’s Senators to draft and pass a bill requiring the state’s transit agency to publish all of its route, schedule and fare information in an open format for use by third parties.

This bill was signed into law by the Governor a few months ago, and the data is now available (in GTFS format) on the Delaware Transit Agency’s web site.

My primary goal in working to get this law enacted was to raise awareness within my state about the potential for open government data to spur civic coding and the development of useful applications at little or no cost to the government. Now that my state actually publishes some open data (Hells to the yeah!), I think the next step for me is to provide some guidance on how to get started using it to build civic applications.

Hopefully, this will show others how easy it is and get them to try their hand at building a civic application.

(Note, transit data is an especially rich source for developing civic applications. For some background and more detail on this, see this post.)

In the next several posts, I’ll document one process for developing an open source transit data application using GTFS data from the Delaware Transit Agency. I’ll be sharing code and some examples that will help you get started if you feel like trying your hand at building a civic application.

Let’s get started!

Getting the Data

Now that the Delaware Transit Agency has published all of their route and schedule information, anyone that wants to use it can simply download it.

This zip file contains a collection of text files that conform to the GTFS specification – for a detailed description of file contents, go here. If you want to build a transit app with GTFS data, I recommend spending a little time becoming familiar with the layout of these files, and getting a sense of what the data represents.

Setting up a Database

In order to use this data as part of an application, we’re probably going to need to get it into a database so that we can manipulate it and run queries against it. An easy way to do this is to import it into a MySQL database instance.

MySQL is a powerful open source database that is used in scores of different web applications, and it’s a solid choice for building a transit data application. In addition, the MySQL LOAD DATA INFILE statement is a powerful and easy way to populate a database with information from a text file (or multiple files).

I’ve created a SQL script to load Delaware transit data into a MySQL database. You can get this script from GitHub – it’s pretty simple, and you should feel free to modify it as your own personal preferences or requirements dictate. Just fork the Gist.

Combining this script with a couple of minutes on the command line will give you a MySQL database with all of the transit data loaded up and ready to use. The steps below assume that you have MySQL installed and running.

To install MySQL:
~$ sudo apt-get install mysql-server

To see if MySQL is running:
~$ pgrep mysql

Create a temporary location for the GTFS files:
~$ mkdir /tmp/dartfirst_de_us

Download the GTFS files from the Delaware Transit Agency website:
~$ wget http://www.dartfirststate.com/information/routes/

Unzip the individual text files to our temporary location:
~$ unzip dartfirststate_de_us.zip -d /tmp/dartfirst_de_us/

Get the SQL script for loading GTFS files into MySQL from GitHub:
~$ wget http://gist.github.com/raw/

Invoke MySQL and pass in the SQL script (make sure you change ‘user_name’ to a valid MySQL user name):
~$ mysql -u user_name -p < dartfirststate_de_us.sql

That’s it!

Now, all of the data from the original text files has been loaded into a MySQL database called transitdata. You can start to construct queries to retrieve information from these tables to support the functionality for your application.

In the next post, I’ll walk through a few basic queries that can extract useful information from these tables. We’ll also lay the groundwork for a really cool mobile application that I will deploy for use by the public when this series of posts is complete.

Stay tuned!

NoSQL Telephony with Tropo and CouchDB

In the last two posts, I’ve provided a basic overview of how to create cloud telephony applications using the Tropo platform and CouchDB.


In the first post of this series, I walked through a quick install of CouchDB and provided information on getting a Tropo account set up. In the second post, we created a simple auto attendant Tropo script in PHP that populates a CouchDB database with a call record for each inbound call that is transferred.

I’ll conclude the series with information on how to retrieve information from a CouchDB instance for use in a cloud telephony application, and talk about design documents. This post will also introduce the reader to the concepts of CouchDB Views and Show Functions – powerful tools that can be harnessed to create truly cutting edge cloud phone apps.

First, let’s create a CouchDB database to hold our call settings.

Creating a Call Settings Database

As mentioned in the previous CouchDB posts, you can create a new call settings database using curl from the command line, or using the Futon GUI.

$ curl -X PUT http://your_new_couchdb_ip:5984/call_settings

You should see a response from CouchDB like this:

{"ok":true}


You can add a record to the call settings database the same way. This time, however, we’ll append a document ID to the URL for our CouchDB database, in this case ‘1000’ – this is the extension that a caller to our cloud telephony app will dial. We’ll use the document ID and the CouchDB REST API to get all of the settings we’ll need to conduct the transfer – these settings can be seen in the document structure below (feel free to add others to meet your needs or preferences).

$ curl -X PUT http://your_new_couchdb_ip:5984/call_settings/1000 -d '{"first_name":"Joe","last_name":"Blow","phone":"17777777777","title":"Master of Disaster","ring_tone":"audio/ring.wav"}'

You should see a response from CouchDB like this:


Let’s add a few more documents to our call settings database (replacing the telephone numbers below with real ones that you want callers to transfer to) and then view all of the documents that we have created.

$ curl -X PUT http://your_new_couchdb_ip:5984/call_settings/2000 -d '{"first_name":"Harry","last_name":"Smith","phone":"18888888888","title":"President of the World","ring_tone":"audio/ring.wav"}'

$ curl -X PUT http://your_new_couchdb_ip:5984/call_settings/3000 -d '{"first_name":"Martin","last_name":"Scorsese","phone":"19999999999","title":"The Departed","ring_tone":"audio/ring.wav"}'

You can view all of the documents in a CouchDB database using the HTTP GET method:

$ curl -X GET http://your_new_couchdb_ip:5984/call_settings/_all_docs

You should see a response from CouchDB like this:


Now we need to modify our Tropo PHP script to retrieve the settings we want to use with each transferred call.

Note, for now we’ll keep the logic simple – if a caller enters an extension that does not exist we’ll get a specific HTTP response back from CouchDB – something in the 400 class of responses. If this happens, we’ll just end the call – in the real world you’d want to do something a little more friendly, but you can sort that out when you build your own cloud telephony application. 😉

Modifying the Tropo Script

So, our new Tropo script looks like this:

Note that the getPhoneNumberByExtension() method no longer returns a hard coded phone number – it uses the 4-digit extension entered by the caller to access our CouchDB database through the REST API. The response from CouchDB is a document in JSON format, which we can easily parse using PHP’s handy json_decode() function.
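The lookup logic described here (fetch the document for an extension, give up on anything other than a successful response, otherwise pull out the phone number) can be sketched independently of Tropo. The JavaScript below is not the PHP from the Tropo script; `phoneFromCouchResponse` is a hypothetical helper that takes the status code and body of the CouchDB response and returns either a phone number or null.

```javascript
// Hypothetical helper: turn a CouchDB HTTP response into a phone number.
// statusCode and body would come from a GET to
// http://your_new_couchdb_ip:5984/call_settings/<extension>
function phoneFromCouchResponse(statusCode, body) {
  if (statusCode !== 200) {
    return null; // e.g. 404 when no document exists for the extension
  }
  var doc;
  try {
    doc = JSON.parse(body);
  } catch (e) {
    return null; // body was not valid JSON
  }
  return doc && typeof doc.phone === 'string' ? doc.phone : null;
}

// A found document and a missing one (bodies abbreviated for the example).
console.log(phoneFromCouchResponse(200, '{"_id":"1000","phone":"17777777777"}'));
console.log(phoneFromCouchResponse(404, '{"error":"not_found","reason":"missing"}'));
```

Returning null for every failure case keeps the calling code simple: it either transfers the call or ends it, which matches the deliberately minimal behavior described above.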

I’ve also modified the value of the $callLog variable to correctly capture some of the variables exposed in the Tropo environment (i.e., the session ID of the call, and the caller ID – see this thread for more information).

So now we have a working cloud telephony application built on Tropo that uses CouchDB to get its call settings, and also to write a call record for billing, reconciliation, etc.

As cool as this is, there is still a lot more we can do with CouchDB in our cloud telephony apps. Note the constants declared at the top of the Tropo script – the last two are blank; one for a design document name, and one for a show function.


Let’s talk about those concepts now, and explore how they could be used in a cloud telephony application.

Getting more out of CouchDB – Design Documents, Map/Reduce and Show Functions

As the title of this post suggests, we’re building cloud-based phone applications without SQL. CouchDB doesn’t use SQL – instead it uses a Map/Reduce framework to index documents in a database.

Map functions can be used to emit a key-value listing of documents in a CouchDB database. Reduce functions are used to aggregate the key-value pairs emitted by a Map function. Map/Reduce functions (or Views) live inside of a special document in a CouchDB database called a “design document”, which has a document ID prefixed with “_design/”.

For example, suppose we have a special design document in our database called “_design/extensions” with a View called “getExtensions” – our View is made up of a Map function and (optionally) a Reduce function. Let’s assume our View has only a Map function to return data on extensions with valid phone numbers to transfer a caller to.

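function(doc) {
  // Skip documents without a phone field; emit only 11-digit numbers starting with 1.
  if (doc.phone && doc.phone.length == 11 && doc.phone.substr(0,1) == '1') {
    emit(doc._id, doc.phone);
  }
}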

Our Map function (which is written in JavaScript, and stored in our design document) has one parameter – doc. This function is called for each document in our database, and the doc parameter represents the document itself. As can be seen, we simply examine each document in the database to see if it has a valid phone number (11 digits, starting with 1).
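For readers who have not seen one, the sketch below builds a design document that holds this View. CouchDB stores view functions as strings under a `views` key; the variable names here are assumptions, and the validation logic mirrors the Map function above with an added guard for documents that lack a phone field.

```javascript
// Same validation logic as the Map function above, written as a
// JavaScript function so it can be serialized into the design document.
var mapFn = function (doc) {
  if (doc.phone && doc.phone.length == 11 && doc.phone.substr(0, 1) == '1') {
    emit(doc._id, doc.phone);
  }
};

// CouchDB stores view functions as strings inside the design document.
var designDoc = {
  _id: '_design/extensions',
  views: {
    getExtensions: { map: mapFn.toString() }
  }
};

console.log(JSON.stringify(designDoc, null, 2));
```

The printed JSON could then be saved with a PUT to the `_design/extensions` document in the call_settings database, in the same way the call settings documents were created earlier.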

Views are accessed using a specific URI structure (do note, however, that the REST API for querying Views can change significantly between CouchDB versions), and the response is a set of key-value pairs formatted as JSON.


$ curl -X GET http://your_new_couchdb_ip:5984/call_settings/_design/extensions/_view/getExtensions

You should see a response from CouchDB like this:


You can check to see if your Map function is working properly by adding a document with an invalid phone number.

$ curl -X PUT http://your_new_couchdb_ip:5984/call_settings/4000 -d '{"first_name":"Richard","last_name":"Kimble","phone":"4444444","title":"The Fugitive","ring_tone":"audio/ring.wav"}'

Accessing the getExtensions view will return the same results as before, as the phone number for the new document does not pass validation. Using design documents and Views, cloud telephony developers can use CouchDB to build grammars for user input which will significantly enhance the usability of the sample application we’ve used during the last few posts.
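If you want to check a Map function before loading it into CouchDB, you can exercise it locally. The harness below is only a stand-in that mimics how `emit` collects rows; it is not how CouchDB itself is implemented, and `runMap` is a hypothetical helper. The sample documents reuse the extensions and phone numbers from this post.

```javascript
// Local stand-in for CouchDB's view engine: run a map function (given as
// source text) over an array of documents and collect whatever it emits.
function runMap(mapSource, docs) {
  var rows = [];
  var emit = function (key, value) { rows.push({ key: key, value: value }); };
  // Build the function with `emit` in scope, as CouchDB provides it.
  var fn = new Function('emit', 'return (' + mapSource + ');')(emit);
  docs.forEach(function (doc) { fn(doc); });
  return rows;
}

var mapSource =
  "function (doc) {" +
  "  if (doc.phone && doc.phone.length == 11 && doc.phone.substr(0, 1) == '1') {" +
  "    emit(doc._id, doc.phone);" +
  "  }" +
  "}";

var callSettings = [
  { _id: '1000', phone: '17777777777' },
  { _id: '2000', phone: '18888888888' },
  { _id: '3000', phone: '19999999999' },
  { _id: '4000', phone: '4444444' } // too short, so it should not be emitted
];

var viewRows = runMap(mapSource, callSettings);
console.log(viewRows);
```

Only extensions 1000, 2000 and 3000 appear in the output, which is the same filtering the getExtensions View performs on the server.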

But there is even more potential with another piece of functionality in CouchDB – Show functions. Show functions also live in design documents, alongside Views. Show functions allow a developer to return specifically formatted content from a CouchDB instance, not just data in JSON format.

A basic Show function that can be used to return information from our CouchDB database in the format of an SRGS grammar might look like this.

function(doc, req) {
 var grammar = '<?xml version="1.0"?><grammar xmlns="http://www.w3.org/2001/06/grammar">';
 grammar += '<rule id="R_1"><one-of>';
 grammar += '<item>' + doc.phone + '</item>';
 grammar += '</one-of></rule></grammar>';
 return grammar;
}

Like Views, Show Functions are accessed using a specific URI structure.


Note that the Show function above is different from the Map function discussed earlier in that it takes two parameters – doc and req. As before, the doc parameter represents the document the function is called against. The req parameter represents the HTTP request itself, including any query parameters sent with it, which can be used inside the function to render output. So a Show function can be accessed using the above URL with an optional parameter as well, like so.
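As a sketch of how req might be used, the variation below reads an optional query parameter from `req.query` and falls back to a default when it is absent. The parameter name `rule` and the variable names are assumptions made for illustration; the function is called locally with mock objects rather than through CouchDB.

```javascript
// Sketch of a Show function that reads an optional query parameter.
// The "rule" parameter name is an assumption for illustration.
var showGrammar = function (doc, req) {
  var ruleId = (req.query && req.query.rule) || 'R_1';
  var grammar = '<?xml version="1.0"?><grammar xmlns="http://www.w3.org/2001/06/grammar">';
  grammar += '<rule id="' + ruleId + '"><one-of>';
  grammar += '<item>' + doc.phone + '</item>';
  grammar += '</one-of></rule></grammar>';
  return grammar;
};

// Simulate a request without and with the optional parameter.
var mockDoc = { _id: '1000', phone: '17777777777' };
console.log(showGrammar(mockDoc, { query: {} }));
console.log(showGrammar(mockDoc, { query: { rule: 'extensions' } }));
```

When served by CouchDB, the second call would correspond to requesting the Show function for document 1000 with `?rule=extensions` appended to the URL.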



I hope this series of posts has provided a helpful overview of CouchDB, with an emphasis on how it can be used to build cloud telephony applications.

Cloud telephony platforms like Tropo, CloudVox, CallFire and others provide enormous flexibility to developers in building and deploying sophisticated cloud telephony applications.

Pair these tools with CouchDB and you’ve got a powerful combination for building full featured, easy to maintain cloud-based phone apps.