When The Guardian broke the story about the NSA demanding ‘telephony metadata’ from Verizon, a new word was introduced into the public lexicon. In the world of maps and geographic data, metadata gets a bad rap. It’s generally perceived as a pain in the ass, something that must be tended to, like a perpetually leaky bike tire or a messy room. At Urban Mapping, we’ve always viewed this differently. Poor documentation, or an inability to readily know things about data, especially geographic information, is costly.
Metadata dates to the dawn of library science at the Library of Alexandria in Egypt, where Callimachus in the third century BC conceived of the first bibliographic system (the Pinakes). It developed as a way to catalog key elements of written works and make them easily searchable. Unfortunately, this also had the unintended consequence of divorcing metadata from data.
The next significant development in metadata came nearly two millennia later. In 1595, Johan van der Does of Leiden University published Nomenclator, the first definitive publication of library holdings. While this index was fairly crude, it had taken the better part of two thousand years to arrive. Next up was Melvil Dewey, who created the Dewey Decimal System to organize all knowledge into ten main classes (each subdivided into ten divisions, and each division into ten sections). Decimal notation meant the hierarchy could be extended indefinitely. Other systems followed, such as the Universal Decimal Classification and the Library of Congress Classification.
Fast-forward a few decades. Libraries, archives, bookstores and other repositories of knowledge were filled to the brim with card catalogs and the like. Tremendous human effort was dedicated to manually drafting, distributing and maintaining indices of their collections. What to do with these massive storehouses of index cards? Thankfully, the information age helped solve the real estate problem by creating digital archives of metadata that could be searched. Beginning in the late 1960s, OCLC took the lead in centralizing the digitization and storage of metadata for all types of content. Metadata had become completely divorced from that which it describes. Metadata and data had undergone a bifurcation, simultaneously advancing society and holding it back. If I were interested in, say, a book of musical scores and wanted to know more about “New York, New York,” I might be SOL. Searching by “Sinatra” might help me get what I want, but the limits of searching metadata remained a function of what had been indexed.
Then the Internet happened. Moore’s Law took root, processing power went exponential and storage costs dropped like rocks. The cost of purging data became greater than the cost of keeping it. Bookstores, newspapers, libraries and anything dead-tree-oriented faced irrelevance. After a long separation, metadata was reunited with data, but the reunion did not come from the world of library science. Google Books and Amazon’s Search Inside are great examples of bringing content together with metadata, allowing users to simultaneously perform full-text searches and query metadata.
Contrast this with the world of geospatial data, where metadata remained off to the side, completely divorced from content. This is effectively not much more advanced or useful than the days of card catalogs. This is why we take it so seriously at Urban Mapping. In the next few months we’ll be unveiling a significant improvement over current geospatial metadata, but more on that later…
OK, so what does this have to do with domestic surveillance, you may be thinking? With the NSA demanding telephony metadata from Verizon and President Obama assuring Americans that nobody is listening to their phone calls, what exactly is the big deal? Below is a list of what could be provided. On the left is the demand, the center column is a list of derivative information (meta-metadata?) that could be compiled based on logs from Verizon, and the right column indicates what this could mean.
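To make the idea of derivative information concrete, here is a minimal Python sketch of how a few facts about a subscriber might be compiled from call logs alone, with no call audio involved. Everything in it is an assumption for illustration: the `CallRecord` fields, the `summarize` helper, the phone numbers and the tower names are invented, and none of it reflects the actual format of Verizon’s logs.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class CallRecord:
    # Hypothetical fields; the real log format is not described in this post.
    caller: str
    callee: str
    start: datetime
    duration_s: int
    cell_tower: str


def summarize(records, subscriber):
    """Derive simple facts about one subscriber from call metadata alone."""
    # Only outgoing calls placed by the subscriber are considered here.
    own = [r for r in records if r.caller == subscriber]
    contacts = Counter(r.callee for r in own)
    towers = Counter(r.cell_tower for r in own)
    # Calls that started between midnight and 5 a.m.
    late_night = sum(1 for r in own if r.start.hour < 5)
    return {
        "top_contact": contacts.most_common(1)[0][0] if contacts else None,
        "usual_tower": towers.most_common(1)[0][0] if towers else None,
        "late_night_calls": late_night,
        "total_talk_time_s": sum(r.duration_s for r in own),
    }


# Invented sample data, for illustration only.
SAMPLE = [
    CallRecord("555-0100", "555-0111", datetime(2013, 6, 3, 9, 15), 120, "tower-A"),
    CallRecord("555-0100", "555-0111", datetime(2013, 6, 3, 23, 40), 600, "tower-A"),
    CallRecord("555-0100", "555-0122", datetime(2013, 6, 4, 2, 5), 45, "tower-B"),
    CallRecord("555-0133", "555-0100", datetime(2013, 6, 4, 12, 0), 300, "tower-C"),
]

summary = summarize(SAMPLE, "555-0100")
print(summary)
```

Even this toy summary hints at who someone talks to most, roughly where they tend to be and when they are awake, which is the kind of inference the center and right columns describe.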
The bottom line in Mapland is a mantra we’ve had in place for years: One person’s metadata is another person’s data. The Verizon-NSA issue is case study #1 in how metadata plays a critical role in surfacing actionable insights. In this case, political and security considerations are paramount, but it very clearly illustrates the concept. Metadata has gone from wonky to pedestrian in a matter of days, and hopefully future uses of the term will no longer put it in quotes.