How Open is Open (Part Two)

The O’Reilly Where Conference (née Where 2.0) is this week in San Francisco, and this second installment looks back at five years of public record requests, appeals, administrative appeals, lawsuits, and public/private whining and cajoling. Actually, this post will just reference the high/low lights. Before proceeding, an incredibly important caveat is in order: as with society at large, the public sector is filled with passionate, caring and committed individuals. Their mission is often hard to uphold amid changing tastes and politicking, but on the whole they do a formidable job of keeping things together. But of course the public sector is not a single-headed beast. It serves many masters at different levels of government, most of which the public doesn’t understand.

This post is several years in the making and reads a bit like a poor man’s law review. It isn’t meant as a stand-alone document. It lays out the details of gaining access to transit data, but is best consumed alongside my talk at the Where Conference. Tomorrow the trilogy will conclude with the presentation itself, for those unable to attend.

Public Transit + Data: Background

Google Transit launched in late 2007 with about 15 transit agencies participating in the US, and the list quickly expanded as more and more transit agencies put their data into the GTFS (General Transit Feed Specification, formerly known as Google Transit Feed Specification) format. There are currently more than 400 agencies worldwide that provide their transit data to Google. But despite their willingness to share with one private party, these agencies did not treat everyone equally: public agencies played favorites. Only about 20% of them make their data available to others.

In 2005, Google began developing the GTFS standard in conjunction with Portland, Oregon’s TriMet transit agency, using TriMet’s format as a basis for the standard. GTFS contains the schedule and route information for a transit system, and specifies in detail exactly how data should be formatted. Part of the format requires that the data be made available in a “feed,” or a web address at which the data can be collected at regular intervals. Routes and schedules are constantly changing, and providing a feed allows websites using the data to automatically stay up to date.
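For readers who have never opened one, here is a minimal sketch of what consuming such a feed might look like. A GTFS feed is typically distributed as a zip archive of comma-separated text files such as stops.txt and routes.txt. The stop and route values below are made up for illustration, and the helper names (build_sample_feed, read_table) are hypothetical; a real consumer would download the archive from the agency's feed address at regular intervals instead of building it in memory.

```python
import csv
import io
import zipfile


def build_sample_feed() -> bytes:
    """Build a tiny, made-up GTFS-style zip in memory (illustrative data only)."""
    stops = (
        "stop_id,stop_name,stop_lat,stop_lon\n"
        "S1,Main St,45.5231,-122.6765\n"
        "S2,Oak Ave,45.5300,-122.6800\n"
    )
    routes = (
        "route_id,route_short_name,route_long_name,route_type\n"
        "R1,10,Main St Local,3\n"
    )
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("stops.txt", stops)
        zf.writestr("routes.txt", routes)
    return buf.getvalue()


def read_table(feed_bytes: bytes, name: str) -> list:
    """Return the rows of one text file in the feed as a list of dicts."""
    with zipfile.ZipFile(io.BytesIO(feed_bytes)) as zf:
        with zf.open(name) as raw:
            # utf-8-sig tolerates a byte-order mark at the start of the file
            text = io.TextIOWrapper(raw, encoding="utf-8-sig", newline="")
            return list(csv.DictReader(text))


if __name__ == "__main__":
    feed = build_sample_feed()
    for stop in read_table(feed, "stops.txt"):
        print(stop["stop_id"], stop["stop_name"], stop["stop_lat"], stop["stop_lon"])
```

Because every agency's archive follows the same file and column names, the same few lines can in principle read any agency's feed, which is what makes access to the feed itself so valuable.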


Although the standardized format is easy for developers to use, some transit agencies can face significant labor and financial costs converting their data into GTFS. With archaic computer systems, or even no computer systems, there can be numerous technical hurdles to overcome. The Southeastern Pennsylvania Transportation Authority said, in response to our FOIL request: “This project of eventually making available to the riding public in GTFS format is a huge undertaking, involving countless hours of his staff at costs of hundreds of thousands of dollars.” Other agencies had little trouble converting their data to GTFS. Bay Area Rapid Transit’s conversion took just 1% of one employee’s time.

The Rise of GTFS

New York City’s Metropolitan Transportation Authority (MTA) had collaborated with Google Transit for at least a year before the partnership was publicly unveiled. While the press release announcing the partnership emphasized that there was “no cost to the taxpayer,” a copy of the agreement obtained by our lawyers has redacted information in the fees section. Additionally, other organizations seeking to obtain MTA schedule data in a proprietary format were told license fees are 10% of net revenues. It is hard to believe Google agreed to these same terms.

Whether these differences were due to technical competence, legacy systems, financial resources or simply the attitudes of the agencies, what was clear is that once the GTFS data was created, agencies perceived they had a valuable public resource they sought to control. Not all agencies supported the notion of transit data as a public resource. Some claimed legacy systems, vendor relationships, their interpretation of public record laws or public safety/security were valid reasons to deny access.

Freedom of Information Law (FOIL): To be foiled with a FOIL

Most of the time, agencies cooperated – many even proceeded to post their data publicly with Google or on the agency website. Several responded to our FOIL requests by posting the data online for everyone to access. For example, our FOIL request to Troy, MI prompted the attorney representing them to investigate whether the feed could be posted online. Within a few weeks, it was. Denials and subsequent appeals to TransLink, of British Columbia, and GO Transit (metropolitan Toronto) effectively forced these organizations to simply do what was expected, at tremendous expense.

Numerous transit agencies wouldn’t give us the data for legal reasons. Some had restrictive licensing agreements that made many valuable uses of the data impossible. The worst was NYC’s Department of Information Technology and Telecommunications (DoITT), which made some data available under a license that required, among other things, that no part of the data be “in any way made available over the internet.” Our lawyers had to engage in lengthy litigation to get access without the restrictions of the license.

The Kansas City Area Transportation Authority wanted to maintain strict control over data, and insisted that we sign a license agreement to get access to it. Our lawyers proposed an agreement, which was roundly rejected. Apparently they could only provide us the data based on the same agreement that KCATA had with Google, and in addition needed the ability to monitor the end product to ensure quality. KCATA now releases the feed to the general public.

San Diego’s MTS offered to license the data to us for $1,500 a year. They said that this was reasonable because we “will most likely manipulate this data and market it for sale using similar Google applications [sic], such as the iPhone.”

Another common issue we encountered had to do with accessing a URL rather than the underlying data. The usually confused and overworked transit agency respondent would occasionally stumble upon the relevant provision: FOIL does not cover data that keeps changing. One of the major limitations of freedom of information laws is their age (the US federal FOIA statute was last updated in 1996). Many state-level regulations did not foresee electronic records, and consequently never considered data that is regularly updated. Our attorneys worked around this by requesting a snapshot of the feed every month, each time suggesting it might be easier for the agency to provide access to the feed. Most agencies eventually relented.

Our lawyers engaged in a lengthy battle with the Southeastern Pennsylvania Transportation Authority over the release of the data they were developing for Google Transit. In the end, we lost on the grounds that data in development does not “exist.” The appeals officer apparently did not understand that there were still data files that could be turned over, whether or not they were “complete.” Whatever the merits of the decision, SEPTA released the data to the public within a few weeks.

Rather than legal reasons, stupidity and incompetence were the biggest difficulties faced by our lawyers when trying to get the data. Many agencies sent printed copies of transit schedules, maps and rider information in response to our request for electronic data.

Dealing with the Duluth, MN Transit Authority was particularly maddening. While they gave us access to the data after 10 months of back and forth, they refused to give us access to the feed on the grounds that it was a security issue.

Our lawyer’s argument – that the data was publicly available and could not possibly be a security concern – was met with a description of how to navigate the website in order to find route information, and a smug statement that surely this must fulfill our request, now would we please stop bothering them. They also mailed us many, many printed copies of the schedules. Eventually the Duluth Transit Authority came to its senses, and the feed is now available publicly online.

We also encountered difficulties when agencies made deals with private companies to provide services. NextBus Information Services, a now-defunct subsidiary of NextBus, sought to commercialize further (as in, double-dip) data it had received as a byproduct of a government contract. By providing the information gathered through the placement of GPS tracking devices on publicly owned vehicles only on their own website, NextBus effectively prevented any public requests of this information. That is, since the agencies never even see the data that they authorize NextBus to collect, it is absolutely proprietary and access cannot be obtained through FOIA.

NYC Transit: We Don’t Have No Stinking Datas

Because New York City Transit runs one of the world’s largest public transportation networks, and given the indelible imprint of 9/11, our experience was not surprising. Or was it? It seems that common sense was thrown out with the bathwater, preventing any kind of civil discourse. How could obtaining such basic information in electronic format (locations of entrances, schedules, station amenities) do more harm than good? Such was the cooperative attitude of New York City transit and mapping agencies. A choice quote from the Chief of Counterterrorism at NYPD:

The challenge of protecting the New York City Subways and other transportation systems is a daunting one. I agree with Asst. Chief Colgan that “the release of the data underlying the DoITT map – including the NYCT data related to transit entrance and exit points – would undermine the NYPD’s efforts to keep our citizens and visitors safe.” [...] It would be very useful to a terrorist group to have precise information and details regarding the location of subway entrances and exits. “It could help terrorists determine what amount of explosive material is needed for maximum effectiveness as well as the timing and efficacy of secondary explosive devices.”

To be clear, safety and security are everybody’s business, and I couldn’t live with myself for unknowingly making a terrorist act easier to achieve. But the above paragraph is representative of a political knee-jerk reaction without any regard for what knowledgeable people understand to be true. There are costs to restricting access. What happens in the wake of a terrorist event, or during a flood or extended power outage?

LIRR: No Schedules for You!

Our attorneys began their efforts to obtain schedule data from the Long Island Rail Road (LIRR) in New York for another client, Will James, formerly of the blog onNYTurf, but their experience highlights some of the absurdities and incompetence that we faced when dealing with transit agencies. Collecting the schedule data from the LIRR seemed like it would be a relatively straightforward and simple task. The LIRR and several other websites had schedule lookup applications, and there should have been little difficulty in sending the data that those applications relied on. But the LIRR was apparently confused about what this meant. The first request our lawyers sent was met with a demand for clarification.

Our second, with a blanket denial that such scheduling data exists. In no uncertain terms the LIRR denied having train schedules.

Since we could not believe that the LIRR did not have any schedules, our lawyers appealed, pointing out that they must have arrival and departure times for their trains, and furthermore gave examples of the web applications that make use of the data in the format we were seeking. And the LIRR, true to form, responded by sending 54 printed copies of their train schedules.

Seeing that these efforts to communicate were going nowhere, our lawyers filed a petition in New York State Supreme Court, demanding the production of the data that they had requested so many times. It didn’t take long for someone at the LIRR to wake up to the fact that our lawyers weren’t asking for very much. Figuring, apparently, that it would be easier to just give them what they wanted rather than fight it out in court, the LIRR handed over the data shortly thereafter.

Don’t DoITT!

Most exciting was litigating against New York City’s Metropolitan Transportation Authority and Department of Information Technology and Telecommunications (DoITT). Our initial FOIL request was made in January 2008, when we asked for all of the data underlying the NYCityMap on the DoITT website. In response, DoITT asked for a more detailed description on behalf of their technical staff. We narrowed our request to the entrance and exit points for various transit services in the Shapefile format that the website used. DoITT’s next response was puzzling, and a distinct change from the seeming cooperation that their previous response had foretold. They denied the request, citing the exemption to New York’s FOIL for information that, “if disclosed could endanger the life or safety of any person.” We expressed our confusion as to how revealing the locations of subway stops would endanger the life and safety of any person, and they essentially responded with ‘because we said so,’ citing the “level of detail” of the information.

When our lawyers petitioned the court for the release of the data, DoITT came back at us with guns blazing. DoITT essentially argued that, in their opinion, releasing even the entrance and exit points would “significantly undermine the Department’s efforts to keep the transit system safe.” Since transit systems are often a terrorist target, releasing this information would be helpful to the terrorists, who, apparently, are well versed in GIS yet haven’t figured out how to buy a Metrocard and visit the subways themselves.

In support of their opinion that releasing the locations of the subways is dangerous, John Colgan, the commanding officer of the New York City Police Department’s Counterterrorism Bureau, claimed in an affidavit that revealing the exact locations of subway entrances “would seriously undermine the safety and security of the City transit system, and could endanger the lives of our citizens and visitors,” and that “limiting disclosure of data underlying the city map was not only reasonable, but essential to maintaining the security of the City’s transportation infrastructure.” His opinion was slightly more specific than the “because we said so” response of DoITT. The information we were seeking could apparently “help determine the amount of explosive material that would be needed for maximum effectiveness, as well as the timing and efficacy of secondary explosive devices.” As compared to what could be collected from a visual observation of the public areas of the transit system, “having the exact locations and other relational data … would significantly enhance the ability of someone to devise and carry out such a plot.”

William Morange, the Director of Security for the MTA, essentially summarizes Colgan’s affidavit and says that he agrees with his opinion. In his only bit of independent thought, he strangely says that the “exact locations” are not publicly available. This assertion is mind-boggling in the ignorance that it demonstrates (especially in that it comes from such a highly placed public security figure), as the exact locations in question are available to anyone walking around the city, and can be measured very precisely with a $100 handheld GPS unit.

In their brief on the issue, our lawyers argued again that the data we were seeking is publicly available, such as on the websites that already have the data, detailed schematics of the subway system available online, and maps within the subway stations themselves.

Given how vehemently they had rebuffed us so far, we were surprised by their next response. They agreed to give us the data that we had been asking for all along. Apparently they were not aware that the data we were requesting was publicly available, and once they figured it out, they no longer had any objection to producing the GIS information.

After their capitulation, Judge Marilyn Shafer ordered DoITT to produce the entry and exit points of the subways, bus stops, and PATH train, with the sensitive data redacted, as we had requested all along. After we had spent nearly a year asking for the data, Judge Shafer agreed that information which is “physically viewable by the public anyway” should be available under FOIL.

Oh, That’s What You Meant

Our next confrontation with a NYC agency was over access to the MTA’s transit data. Our lawyers’ first request was met with a denial that the MTA had the information. When the MTA and Google announced the launch of MTA routes on Google Transit a few weeks later, we knew that they had the data, and we again submitted a request.

In a phone conversation, their counsel said that the information was not in the FOIL office’s possession. This is, of course, the point of having a FOIL office – to get information from elsewhere in the government. We pointed out that no such exemption to FOIL existed, and they said a decision on our request would be forthcoming in another three weeks. As the FOIL statute prescribed a 10-day period for them to respond, we went to court. It didn’t take long for the MTA to send us the data, without any more protest.

Our most recent confrontation with the MTA was not as successful as the others. We were trying to get maps of the subway stations in a level of detail that showed the locations of the Metrocard machines and attendant booths. Our lawyers’ first request was ignored for a while, and when they followed up they were told that there were no maps that could be given out, for security reasons. We appealed, and the MTA found maps of about 100 stations that had been prepared to show proposed changes to the subway stations. They refused to release the maps for the remaining 368 stations on the grounds that the remaining stations had not been mapped in a similar manner, and the plans for them showed the locations of critical infrastructure and raised security concerns.

Our lawyers appealed, taking them to court on the grounds that they have the blueprints of the subway stations, and are capable of redacting them. However, it turned out that the 100 maps that had already been released had taken more than four employees over nine months to create. To similarly redact the blueprints for the remaining stations would take years, and, they argued, was the equivalent of having to create new documents because of the difficulty of removing the sensitive information, which is exempt under FOIL. The judge agreed, and found that the MTA had fully complied with the FOIL request. As our only grounds for appeal were that we should be entitled to pay to have the blueprints redacted, we decided to let the matter rest, for now.

The MTA, like many agencies, has embraced openness: in a turn unimaginable only a few years ago, the MTA maintains a developer program and actively engages with this community. In a public sense it is exciting to know that the tide has turned and the public sector sees its role as un-encumbering individuals or groups who wish to take advantage of a public resource. The corollary is one of enormous frustration that private parties with resources and clout are able to co-opt the public sector, playing favorites at taxpayer expense.

