Earlier this year, I had the chance to work with one of my state’s Senators to draft and pass a bill requiring the state’s transit agency to publish all of it’s route, schedule and fare information in an open format for use by third parties.
My primary goal in working to get this law enacted was to raise awareness within my state about the potential for open government data to spur civic coding and the development of useful applications at little or no cost to the government. Now that my state actually publishes some open data (Hells to the yeah!), I think the next step for me is to provide some guidance on how to get started using it to build civic applications.
Hopefully, this will show others how easy it is and get them to try their hand at building a civic application.
(Note, transit data is an especially rich source for developing civic applications. For some background and more detail on this, see this post.)
In the next several posts, I’ll document one process for developing an open source transit data application using GTFS data from the Delaware Transit Agency. I’ll be sharing code and some examples that will help you get started if you feel like trying your hand at building a civic application.
Let’s get started!
Getting the Data
Now that the Delaware Transit Agency has published all of their route and schedule information, anyone that wants to use it can simply download it.
This zip file contains a collection of text files that conform to the GTFS specification – for a detailed description of file contents, go here. If you want to build a transit app with GTFS data, I recommend spending a little time becoming familiar with the layout of these files, and getting a sense of what the data represents.
Setting up a Database
In order to use this data as part of an application, we’re probably going to need to get it into a database so that we can manipulate it and run queries against it. An easy way to do this is to import it into a MySQL database instance.
MySQL is a powerful open source database that is used in scores of different web applications and its a solid choice for building a transit data application. In addition, the MySQL LOAD DATA INFILE statement is a powerful and easy way to populate a database with information from a text file (or multiple files).
I’ve created a SQL script to load Delaware transit data into a MySQL database. You can get this script from GitHub – it’s pretty simple, and you should feel free to modify it as your own personal preferences or requirements dictate. Just fork the Gist.
Combining this script with a couple of minutes on the command line will give you a MySQL database with all of the transit data loaded up and ready to use. The steps below assume that you have MySQL installed and running.
To install MySQL:
~$ sudo apt-get install mysql-server
To see if MySQL is running:
~$ pgrep mysql
Create a temporary location for the GTFS files:
~$ mkdir /tmp/dartfirst_de_us
Download the GTFS files from the Delaware Transit Agency website:
~$ wget http://www.dartfirststate.com/information/routes/
Unzip the individual text files to our temporary location:
~$ unzip dartfirststate_de_us.zip -d /tmp/dartfirst_de_us/
Get the SQL script for loading GTFS files into MySQL from GitHub:
~$ wget http://gist.github.com/raw/
Invoke MySQL and pass in the SQL script (make sure you change ‘user_name’ to a valid MySQL user name):
~$ mysql -u user_name -p < dartfirststate_de_us.sql
Now, all of the data from the original text files has been loaded into a MySQL database called transitdata. You can start to construct queries to retrieve information from these tables to support the functionality for your application.
In the next post, I’ll walk through a few basic queries that can extract useful information from these tables. We’ll also lay the groundwork for a really cool mobile application that I will deploy for use by the public when this series of posts is complete.