We’re pleased to announce our latest update to neighborhood boundaries, the product that put us on the (map!) in 2006. This release features increased coverage in over 370 cities across 7 new countries. This brings global coverage to more than 127,000 neighborhoods across 40 countries. Our sights are now set on our Q2 update which will include new attributes and expanded coverage.
Good things come during the holidays, like updated neighborhood data from Urban Mapping! This quarter we continued to expand coverage to include over 120,000 neighborhoods in 34 countries. Equally important is responding to evolving user needs. We’ve expanded our Local Relevance attribute from a boolean to an indexed value. New this quarter is an Editorial Sensitivity attribute. These new attributes mean developers have even more flexibility in creating applications tailored to customer needs.
Local Relevance is an indexed value based on a variety of metrics of social/cultural significance and popularity. Using a mix of log data and social media, Urban Mapping created an attribute that measures importance, allowing developers to whittle down a comprehensive set of neighborhoods to something more manageable. For example, in the five boroughs of New York City our database counts over 300 neighborhoods. For one kind of application, cultural and historical detail (think Alphabet City and Hell’s Kitchen) will be critical. For a mobile-based social networking application, dividing NYC into (say) 30 of the most important neighborhoods could be sufficient.
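To make the "whittling down" concrete, here is a minimal sketch of ranking neighborhoods by an indexed relevance value and keeping the top few. The `local_relevance` field name, its scale, and the scores below are made up for illustration; they are not actual Urban Mapping values.

```python
from typing import NamedTuple

class Neighborhood(NamedTuple):
    name: str
    local_relevance: int  # hypothetical index: higher means more prominent

def most_relevant(neighborhoods, limit):
    """Return the `limit` highest-scoring neighborhoods, best first."""
    ranked = sorted(neighborhoods, key=lambda n: n.local_relevance, reverse=True)
    return ranked[:limit]

# Made-up scores, for illustration only.
nyc = [
    Neighborhood("Hell's Kitchen", 82),
    Neighborhood("Alphabet City", 61),
    Neighborhood("SoHo", 95),
    Neighborhood("Harlem", 90),
]
top_two = [n.name for n in most_relevant(nyc, 2)]
```

A history-minded application would skip the cutoff and keep the full list, while a mobile application might pass `limit=30`.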
Editorial Sensitivity can be used to address scenarios where a publisher might want to “tone down” their property to cater to a given audience. As great as we think the “Funk Zone” in Santa Monica is, some publishers might feel differently!
Holidays bring good things of all kinds, including product announcements! This week we are live at The Kelsey Group’s Leading in Local conference in San Francisco. We’re announcing several online advertising products that provide increased geographic context, a better user experience and increased monetization. What are these, you may ask?
- GeoMods – a geotargeting tool that helps online advertisers increase the effectiveness of campaigns. With access to Urban Mapping’s geographic warehouse of on-demand data, GeoMods provides a long tail of geo-expansion terms. With GeoMods, advertisers and agencies can perform truly hyperlocal geotargeting with accuracy far greater than traditional IP geotargeting. Try the demo!
- Neighborhoods & Transit – Leverage Mapfluence to incorporate neighborhoods and transit data, a key one-two punch of local! The usage-based model allows publishers to tag business listings, housing or other content to provide increased user relevance. Developers can show neighborhoods and transit information on a map or index their content to create an enriched hyperlocal experience.
- GeoLookup – Given the location of a mobile device, how can a user understand location? Maps are a good way, but often context is sufficient. Through reverse geocoding, developers can drill down and return place names associated with location. This is especially valuable with social/mobile applications where GeoLookup can serve as a filter to enforce user privacy.
To learn more, check out urbanmapping.com/adtech and register as a developer today!
(or, A Cautionary Tale of Geocoding)
So in this new era of open data, data is free, right? If you’ve never tried using open data, it’s harder than you might think.
For the Tableau Customer Conference last month, we thought it would be fun to show off some data that was relevant to the conference location: Washington, D.C.
Our original idea was to associate D.C. restaurant health code violation scores with buildings, giving a simple and reasonable sense of which buildings might not have the best food facilities. Building outlines or “footprints” are available from the D.C. government and OpenStreetMap, tagged with restaurant names. Footprints should not be confused with parcel boundaries, which are tied to property lines. OSM data can be obtained many ways; the easiest might be Metro Extracts.
Enter data quality challenges. Geocoding is complicated and “free” data isn’t ever really free.
When we downloaded the restaurant health code violation data, it quickly became apparent that its geographic component suffered from a fundamental problem that kept us from spatially linking the latitude and longitude of restaurants with the building outline data: the geocoder used to obtain the coordinates placed about half of the restaurants on the street, and the rest were located somewhere within the property (i.e., parcel) boundary. This was a problem because we wanted to tie the actual restaurant location to a polygon, not a point.
Before I explain further, I need to provide some terminology disambiguation. Within Tableau, geocoding means displaying data geographically. However, in common geospatial parlance, geocoding has a very specific meaning: a geocoder is a tool that is used to derive latitude and longitude from a human-readable street address. In short, geocoding makes address data geographic, but unless you understand the assumptions made in the geocoding process, your geocoding may not be useful.
While we are talking terminology, a composite geocoder is a geocoder that will try to find a point to assign to an address, but if it fails, will fall back on whatever part of the address it understands. For example, if you provide an invalid address in Bismarck, North Dakota, the composite geocoder will return a point in the center of Bismarck, North Dakota. If you give it a nonsensical street name and spell Bismarck like “Bismark”, it may return a point in the center of North Dakota. Rather than raising a flag, it gives you an answer, albeit a less accurate one. Failing all else, your geocoder will return a value of NULL, which, if interpreted by a mapping client as zeros, will be drawn at the latitude and longitude coordinates (0,0), aka Null Island.
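The fallback behavior can be sketched in a few lines. This is a toy illustration, not how any particular geocoder is implemented: the lookup tables, the street address, and the rough coordinates are all hypothetical stand-ins for a real reference dataset.

```python
from typing import NamedTuple, Optional

class Geocode(NamedTuple):
    lat: float
    lon: float
    precision: str  # "address", "city", or "state"

# Hypothetical reference tables; coordinates are rough, illustrative values.
ADDRESSES = {("100 Main St", "Bismarck", "ND"): (46.806, -100.784)}
CITIES = {("Bismarck", "ND"): (46.808, -100.784)}
STATES = {"ND": (47.5, -100.5)}

def composite_geocode(street: str, city: str, state: str) -> Optional[Geocode]:
    """Try the most precise match first, then fall back to coarser ones."""
    if (street, city, state) in ADDRESSES:
        return Geocode(*ADDRESSES[(street, city, state)], "address")
    if (city, state) in CITIES:
        return Geocode(*CITIES[(city, state)], "city")
    if state in STATES:
        return Geocode(*STATES[state], "state")
    return None  # nothing matched at any level

def mappable(results):
    """Keep only real matches so a missing result is never drawn at (0, 0)."""
    return [r for r in results if r is not None]
```

Carrying a `precision` label alongside each coordinate is one way to raise the flag the geocoder itself does not: downstream code can drop or review anything coarser than "address" instead of plotting a city centroid as if it were a storefront.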
Finally, rooftop geocoding returns a latitude and longitude that will land on the building that matches the address. Not all geocoders are designed to do this – for example, it is unnecessary or even disadvantageous for a geocoder built for navigation and routing to resolve addresses to rooftops. It just matters that you get there, not whether it’s a valid address. Most so-called rooftop geocoders match the parcel, but may drop the point in the parking lot rather than on the building. With a little extra geospatial wizardry, you can assign a parcel-level geocode to a building footprint. However, a geocode to the middle of the street tells you nothing about which building or parcel the point belongs to.
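As a rough sketch of that "extra geospatial wizardry", one approach is to find the parcel that contains the geocoded point and then pick the footprint that sits inside the same parcel. The version below assumes planar coordinates, simple polygons given as vertex lists, and one building per parcel; the parcel and building IDs are hypothetical. In practice you would more likely do this as a spatial join in a GIS.

```python
def point_in_polygon(x, y, polygon):
    """Ray-casting test: count edge crossings to the right of (x, y)."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        if (y1 > y) != (y2 > y):  # edge spans the horizontal line through y
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside

def vertex_mean(polygon):
    """Average of the vertices, used as a cheap representative point."""
    xs, ys = zip(*polygon)
    return sum(xs) / len(xs), sum(ys) / len(ys)

def footprint_for_point(x, y, parcels, footprints):
    """Return the ID of the footprint sharing a parcel with (x, y), or None."""
    for parcel in parcels.values():
        if point_in_polygon(x, y, parcel):
            for footprint_id, footprint in footprints.items():
                cx, cy = vertex_mean(footprint)
                if point_in_polygon(cx, cy, parcel):
                    return footprint_id
    return None

# Hypothetical geometry: one square parcel with a smaller building on it.
parcels = {"parcel-1": [(0, 0), (10, 0), (10, 10), (0, 10)]}
footprints = {"bldg-1": [(2, 2), (6, 2), (6, 6), (2, 6)]}

parking_lot_match = footprint_for_point(8, 8, parcels, footprints)
street_match = footprint_for_point(15, 5, parcels, footprints)
```

The point at (8, 8) lands in the parking lot but still resolves to `bldg-1` through its parcel. The point at (15, 5) falls outside every parcel, so there is nothing to resolve it to.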
Returning to our restaurant health code violation scores, it appears the data was geocoded with a composite geocoder that took a first pass through a rooftop geocoder, but failing that, assigned coordinates according to an address range of a street, etc. We would have had to invest significant time in data cleaning, re-geocoding, and manually placing points to assign all restaurant locations to a building footprint.
Instead we scrapped our efforts, and found other data. Sometimes life is too short for data cleaning. (Ed note: please read our companion post about the viz we did build).
The question of how to build the most precise geocoder isn’t an easy one, but it is one we think about a lot. Any solution would require an enormous amount of information about address and building configurations on the ground, but clearly it doesn’t make sense to drop one point at a time to generate hundreds of thousands of locations for civic engagement or business intelligence. In sum, the details of data matter, and dealing with them is less glamorous and more important than most everything else you will do when creating a (geographic) data visualization.
At last month’s Tableau Customer Conference in Washington D.C., we ran a hands-on mapping session that showed how to create dual axis maps while retaining two measures, build custom regions, create a viz with open data, and use additional data and services from Urban Mapping, Tableau’s official map provider, through Mapfluence, our mapping platform.
Because Mapfluence has a direct connection into Tableau that does an end-around WMS, high resolution imagery can be seamlessly integrated into Tableau, tiling only the portions of the image you need as a base layer under your viz. Any data viz guru will tell you that unnecessary levels of detail clutter your presentation and obscure your message, which is why it is better to avoid a complicated base map where a simpler one will do. Nevertheless, there are cases where high resolution imagery provides valuable context for your viz. For example, if you want to show parcel boundaries or building outlines in their spatial context, the benefits are obvious:
The above map is rendered by Mapfluence. It can display, customize and symbolize the features in the variety of ways you would style a filled map without any limitations on rendering boundaries. When you draw your filled map directly onto the base map, you are free to use all of your dimensions to visualize data.
With a little geospatial wizardry, you are not limited to the dimensions and measures associated with the geometries you are drawing. Using Mapfluence or geospatial software like QGIS, you can aggregate point level data to your custom geographic boundaries to create new dimensions and measures.
For the Tableau Customer Conference, we thought it would be fun to show off some data that was relevant to Washington D.C. After stumbling when trying to use open data describing restaurant health code violations (be sure to read the companion cautionary tale of open data), we found the District of Columbia produces good quality, up-to-date data on crime incidents since 2011.
Building The Viz
The source data contains location information (latitude-longitude pairs) and several associated attributes (crime type, description, etc). To maintain privacy, incidents are geocoded to the nearest block instead of the actual address, so on a map the points appear as a gridded mass of dots.
You could generate a kernel density estimation based on the point distribution (i.e., a heat map). However, this masks the aggregation at block level and could give your audience false impressions about the patterns in your data. Furthermore, we want to get a sense of overall crime rates throughout the city. Kernel density estimates are more useful for hot spot analyses of particular types of crime, and when you have an actual, non-aggregated location. When looking at crime rates overall, or any complicated and varied phenomenon, kernel densities are difficult to interpret:
When you aggregate points by census block, you are presenting the underlying data at its appropriate level of aggregation. As we can see, patterns emerge.
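The aggregation itself is just a count of points per block. Here is a minimal sketch that simplifies each block to an axis-aligned rectangle (real census blocks are irregular polygons, so a GIS or Mapfluence would do a proper point-in-polygon join). The block IDs and coordinates are hypothetical.

```python
from collections import Counter

# Hypothetical blocks as (min_x, min_y, max_x, max_y) rectangles.
BLOCKS = {
    "block-A": (0, 0, 10, 10),
    "block-B": (10, 0, 20, 10),
}

def block_for_point(x, y, blocks):
    """Return the ID of the first block whose rectangle contains (x, y)."""
    for block_id, (min_x, min_y, max_x, max_y) in blocks.items():
        # Half-open bounds so a point on a shared edge is counted once.
        if min_x <= x < max_x and min_y <= y < max_y:
            return block_id
    return None

def count_by_block(points, blocks):
    """Count incidents per block, ignoring points outside every block."""
    counts = Counter()
    for x, y in points:
        block_id = block_for_point(x, y, blocks)
        if block_id is not None:
            counts[block_id] += 1
    return counts

incidents = [(1, 1), (2, 9), (12, 3), (25, 5)]
counts = count_by_block(incidents, BLOCKS)
```

Here `counts` holds two incidents for `block-A` and one for `block-B`. The point at (25, 5) falls outside both blocks and is dropped.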
While the high resolution imagery allows us to see context block by block, it also complicates the viz. To show both, I varied the transparency with the number of incidents by block instead of using a color ramp. That way, you can see where the greatest number of incidents were reported, but you also see as much of the underlying imagery as possible for the additional context of roads, buildings, and other geographic markers. Finally, we wanted to introduce another dimension to the viz. Because Mapfluence contains over 10,000 on-demand variables in our data catalog, we decided to overlay line and station information for the DC Metro subway system. This allows for another dimension in the analysis.
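For the curious, the transparency trick boils down to mapping each block's incident count onto an opacity range. The sketch below uses a simple linear scale; the `min_alpha` and `max_alpha` defaults are arbitrary assumptions for illustration, not the settings used in our viz.

```python
def opacity_for_count(count, max_count, min_alpha=0.1, max_alpha=0.8):
    """Linearly map an incident count to a fill opacity between the two bounds."""
    if max_count <= 0:
        return min_alpha  # no incidents anywhere: keep every block faint
    share = min(max(count / max_count, 0.0), 1.0)  # clamp to [0, 1]
    return min_alpha + share * (max_alpha - min_alpha)

# Example: opacities for blocks with 0, 50, and 100 incidents.
alphas = [opacity_for_count(c, 100) for c in (0, 50, 100)]
```

Capping the maximum below 1.0 means even the busiest block never fully hides the imagery beneath it.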
To overcome some of the limitations in how Tableau deals with geographic polygons, we render and serve the census blocks from Mapfluence. This is effective because Mapfluence is designed as a web-based GIS, and geographic analysis and representation are second nature to us. However, it is also not ideal, as Tableau users like to play with all the data they can.
Working with Leigh Fonseca of Fonseca Data Science, we came to a very subtle but compelling solution: leave the heavy geo-lifting to Mapfluence, and allow Tableau to act as the reporting tool. In this way, the polygons act as a proxy for the underlying points that are available in the workbook, a la tooltip functionality! Importing data into Tableau allows users to create a dashboard on top of a custom base map that you can filter according to the underlying points.
Click the image above to explore the dashboard. You can click individual blocks or select an area to see a breakdown of the crime types compared to crime citywide. Crime definitions are explained here.
Although we do not purport to be criminologists, this viz highlights data that would be impossible to see in a spreadsheet. Here are some things we observed:
- Theft of property accounts for nearly all incidents in the area with the highest density of incidents, near the Columbia Heights Metro Station (14th Street NW and Irving): 533 out of 554 reported incidents.
- Car break-ins represent a greater proportion of incidents on the outskirts of DC than in the city center.
- There are approximately half as many burglaries in Northwest as in Southeast or Northeast.
- Relatively few crime incidents were reported on the Washington DC Mall compared to the city overall between January 2011 and August 2013: 45 thefts, plus 22 car break-ins, 6 robberies, 5 car thefts, 3 assaults with weapons, and 2 sex crimes.
There are plenty of additional ways to drill down into this data for the inquisitive data nerds out there. You could normalize by population per census block to see which crimes occur more frequently where the residential population is higher or lower. You could focus on visualizing the spatial patterns of a particular type of crime or set of crimes. Or you could visualize the changes in crime patterns over time using a time slider.
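If you try the population normalization, a per-1,000-residents rate is a common choice. Here is a minimal sketch with made-up numbers; the one wrinkle worth handling is blocks with no residents (parks, office districts), where a rate is undefined rather than zero.

```python
def incidents_per_1000(incidents, population):
    """Incidents per 1,000 residents, or None where nobody lives."""
    if population <= 0:
        return None  # a rate is undefined, not zero
    return 1000 * incidents / population

# Made-up blocks: (incident count, residential population).
rate_residential = incidents_per_1000(50, 2000)
rate_unpopulated = incidents_per_1000(12, 0)
```

Returning `None` instead of 0 keeps unpopulated blocks from looking like the safest places in the city.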
At Urban Mapping we’re excited about Tableau, and we’re excited about mapping in Tableau, and we suspect you are as well. We’ve taken a moment to showcase high resolution imagery in Tableau, and the potential for using data + mapping in Mapfluence to build dashboards in Tableau for exploring geographic phenomena. Please let us know what you think and be sure to learn more about our enhanced mapping solutions for Tableau.