Better Licensing For Open Data

It’s really interesting to see so many governments start to use GitHub as a platform for sharing both code and data. One of the things I find interesting, though, is how infrequently governments use standard licenses with their data and app releases on GitHub.

Why no licenses?

I’m as guilty as anyone of pushing government data and apps to GitHub without proper terms of use, or a standard license. Adding these to a repo can be a pain – more often than not, I used to find my self rooting around in older repos looking for a set of terms that I could include in a repo I wanted to create and copying it. This isn’t a terrible way ensure that terms of use for government data and apps stay consistent, but I think we can do better.

Before leaving the City of Philadelphia, I began experimenting with a new approach. I created a stand-alone repository for our most commonly used set of terms & conditions. Then, I added the license to a new project as a submodule. With this approach, we can ensure that every time a set of terms & conditions is included with a repo containing city data or apps that the language is up to date and consistent with what is being used in other repos.

Adding the terms of use to a new repo before making it public is easy:

~$ git submodule add git://github.com/CityOfPhiladelphia/terms-of-use.git license

This adds a new subdirectory in the parent repo named ‘license’ that contains a reference to the repo holding the license language. Any user cloning the repo to use the data or app, simply does (for purposes of demonstration, using this rep):

~$ git clone https://github.com/CityOfPhiladelphia/phl-polling-loctions
~$ git submodule init
~$ git submodule update

The user can run git submodule update any time to get the very latest license language, which can change from time to time.

Github is an amazing platform for governments to use in sharing open data and fostering collaboration through releasing applications as open source projects.

I think it also provides some powerful facilities for associating licenses and terms & conditions with these releases – something every open source project needs to be sustainable and successful.

One comment

  1. Stephane Guidoin (@Hoedic) · September 22, 2014

    We (Open North with some data.gov folks) are currently starting within the OGP open data working group around open data standardization. Licence (and metadata) is a major aspect since it prevents discoverability of the data and its possible usage.

    Most of our analysis will be based on open data portals (which usually have metadata support even if frequently not used as it should). Hosting on Github adds a layer of complexity since it is not designed to support metadata (although there is a convention for the licence).

    That’s why when I am asked about using GitHub for open data portal, I am hesitant. There are some obvious positive aspects but there are some obvious drawback. Lot of works still needed to make open data as accessible as we would expect!!

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s