Baltimore Buildings Import 2.0
After running into some technical and administrative roadblocks two years ago with the last import attempt, the project to import buildings and addresses for Baltimore City, Maryland, USA, is once again ready for primetime.
This project is the result of collaboration between the Baltimore City Mayor's Office of Information Technology (MOIT) and the OSM community. The project aims to add accurate building footprints and physical addresses to OpenStreetMap within Baltimore City, while respecting existing mapped features and following all OSM import guidelines.
License
MOIT has assured us that the data is public domain, free from all copyright.
Community Support
The local mapping community is very excited about this import. Since the Baltimore County import was completed in 2014, a gaping hole with little building coverage has remained over Baltimore City proper, and it needs to be filled. Local community support includes OSM users ElliottPlack, mpetroff, MDRoads, and RoadGeek_MD99, as well as interested parties who have contributed to issues on the GitHub repo. MOIT employee and participant Jim Garcia is perhaps the biggest supporter of the project. Additionally, a story about the project appeared in the local publication Technically Baltimore.
Scope
Baltimore covers an area of approximately 91 square miles[1]. Within the dataset, there are over 250,000 buildings and addresses.
Procedures
The broad view of the work performed is as follows.
- Get existing data
- Set aside conflicts for manual quality control checks
- Run code to prepare new data
  - Simplify geometries (detailed on the previous import page)
  - Perform address tagging translations
  - Conflate addresses to buildings if the match is 1:1
  - Create separate datasets for addresses and buildings that could not be conflated
- Validate
- Upload with scripts
Existing data
To preserve the work of other OSM editors, an extract from Geofabrik was used to compare existing OSM data with the data to be imported. Buildings that did not intersect existing buildings were separated from those that did; the former will be uploaded using scripting, while the latter will be manually compared to the existing building data with the OSM Tasking Manager.
Local mappers have traced many larger buildings along the waterfront and in popular places around the city; however, many of the rowhouses that encircle the downtown areas remain untraced.
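The split between scripted and manually reviewed buildings can be illustrated with a minimal sketch. It is not the project's code, which does this work with SQL queries in PostGIS. The sketch uses a bounding-box overlap test as a simplified stand-in for a true geometry intersection, and the names `BBox`, `boxes_intersect`, and `split_by_conflict` are hypothetical.

```python
from typing import NamedTuple


class BBox(NamedTuple):
    """Axis-aligned bounding box standing in for a building footprint."""
    min_x: float
    min_y: float
    max_x: float
    max_y: float


def boxes_intersect(a: BBox, b: BBox) -> bool:
    """Return True if the two boxes overlap or touch."""
    return (a.min_x <= b.max_x and b.min_x <= a.max_x
            and a.min_y <= b.max_y and b.min_y <= a.max_y)


def split_by_conflict(new_buildings, existing_buildings):
    """Split import candidates into (scripted_upload, manual_review).

    A candidate goes to manual review if it intersects any existing
    building; otherwise it is safe for the scripted upload.
    """
    scripted, manual = [], []
    for candidate in new_buildings:
        if any(boxes_intersect(candidate, e) for e in existing_buildings):
            manual.append(candidate)
        else:
            scripted.append(candidate)
    return scripted, manual


# Illustrative data only: one existing building and two import candidates.
existing = [BBox(0, 0, 10, 10)]
incoming = [BBox(5, 5, 15, 15), BBox(20, 20, 30, 30)]
scripted, manual = split_by_conflict(incoming, existing)
```

Here the first candidate overlaps the existing building and lands in `manual`, while the second is clear of it and lands in `scripted`. A real run over a quarter of a million buildings would rely on a spatial index rather than this pairwise loop.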
Example
In this example, editors have mapped many buildings on the campus of Johns Hopkins University. However, the neighborhood to the east has hardly any buildings mapped.
Manual Quality Control Checks
The only part of the process that is not automated is checking whether existing OSM data is of higher or lower quality than the data to be imported. Features that conflict with existing data will be separated from the import and manually reviewed.
Code
All procedural code is on GitHub at osmlab/bmorebuildings. The address-building/data-processing directory contains the script used to process the source data into (mostly) upload-ready data. The main script, process_data.py, loads the source data into a PostGIS database, then simplifies, separates, and conflates the data using SQL queries. It then uses ogr2osm with a translation file to perform address tagging translations and generate OSM files. Once generated, these files were run through the JOSM validator, and all errors were manually corrected.
Conflate Addresses
Addresses were separated into two categories: those that intersected with buildings, either existing or in the import, and those that didn't. Those that didn't intersect are assumed to be for empty lots and will be uploaded using scripting. Of the remainder, those that intersected with buildings in the existing map data will be manually reviewed. Those that intersected with building outlines from this import were handled in one of two ways, depending on whether a building outline contained one address point or several. If a building contained one address point, that address was merged with the building outline. If a building contained multiple address points, those points were kept as points and will be uploaded via scripting. Duplicate points with different apartment numbers were merged into one point, and the apartment numbers were dropped.
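The conflation rules above can be sketched in a few lines. This is an illustration, not the project's code, which performs the conflation with SQL in PostGIS. It assumes each address already carries the ID of the import building that contains it (or `None` for an empty lot), and that duplicates are collapsed before addresses per building are counted; the source does not state that order. The function name `conflate` and the record keys are hypothetical.

```python
from collections import defaultdict


def conflate(addresses):
    """Apply the conflation rules to a list of address records.

    Each record is a dict with 'building_id' (None for empty lots),
    'housenumber', 'street', and an optional 'unit'.
    Returns (merged, standalone): addresses to merge onto a building
    outline, keyed by building ID, and addresses to keep as points.
    """
    # Collapse duplicates that differ only by apartment number,
    # dropping the unit in the process.
    seen = set()
    deduped = []
    for addr in addresses:
        key = (addr["building_id"], addr["housenumber"], addr["street"])
        if key not in seen:
            seen.add(key)
            deduped.append({
                "building_id": addr["building_id"],
                "housenumber": addr["housenumber"],
                "street": addr["street"],
            })

    # Group the remaining addresses by containing building.
    by_building = defaultdict(list)
    standalone = []
    for addr in deduped:
        if addr["building_id"] is None:
            standalone.append(addr)  # assumed empty lot: keep as a point
        else:
            by_building[addr["building_id"]].append(addr)

    # 1:1 matches are merged onto the outline; the rest stay as points.
    merged = {}
    for building_id, addrs in by_building.items():
        if len(addrs) == 1:
            merged[building_id] = addrs[0]
        else:
            standalone.extend(addrs)
    return merged, standalone


# Illustrative records only; IDs and house numbers are made up.
sample = [
    {"building_id": "b1", "housenumber": "100", "street": "North Charles Street"},
    {"building_id": "b2", "housenumber": "200", "street": "East Pratt Street", "unit": "1A"},
    {"building_id": "b2", "housenumber": "200", "street": "East Pratt Street", "unit": "1B"},
    {"building_id": "b3", "housenumber": "300", "street": "East Pratt Street"},
    {"building_id": "b3", "housenumber": "302", "street": "East Pratt Street"},
    {"building_id": None, "housenumber": "400", "street": "East Pratt Street"},
]
merged, standalone = conflate(sample)
```

In this sample, `b1` and `b2` each end up with a single address and are merged, while the two addresses in `b3` and the empty-lot address remain standalone points.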
Address Tagging Translations
Translations are handled by a custom ogr2osm translation.
The script expands all abbreviations in the street names, including street types and directionals. It also adds the house number and other common address tags, concatenating everything properly. Finally, it checks for proper casing and corrects special cases such as street names with a special character. Because the City uses all-caps street name signs in most places, the import team worked with the community and Baltimore City to determine the best spelling for these special cases.
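As a rough sketch of what such a translation does, the following expands a leading directional and a trailing street type, fixes casing, and applies a special-case table. It is not the project's ogr2osm translation file. The lookup tables are small illustrative subsets, and the function names `expand_street_name` and `address_tags` are hypothetical.

```python
# Illustrative subsets only; a real translation needs far larger tables.
DIRECTIONALS = {"N": "North", "S": "South", "E": "East", "W": "West"}
STREET_TYPES = {
    "ST": "Street",
    "AVE": "Avenue",
    "RD": "Road",
    "BLVD": "Boulevard",
    "CT": "Court",
}
# Casing fixes for names that simple capitalization gets wrong.
SPECIAL_CASES = {"O'donnell": "O'Donnell"}


def expand_street_name(raw: str) -> str:
    """Expand abbreviations and normalize casing of an all-caps street name."""
    words = raw.strip().upper().split()
    last = len(words) - 1
    expanded = []
    for i, word in enumerate(words):
        if i == 0 and last > 0 and word in DIRECTIONALS:
            # Only a leading token is treated as a directional.
            expanded.append(DIRECTIONALS[word])
        elif i == last and i > 0 and word in STREET_TYPES:
            # Only a trailing token is treated as a street type.
            expanded.append(STREET_TYPES[word])
        else:
            expanded.append(word.capitalize())
    return " ".join(SPECIAL_CASES.get(word, word) for word in expanded)


def address_tags(housenumber, raw_street: str) -> dict:
    """Build the basic OSM address tags for one source record."""
    return {
        "addr:housenumber": str(housenumber),
        "addr:street": expand_street_name(raw_street),
    }


tags = address_tags(2900, "O'DONNELL ST")
```

Restricting the directional lookup to the first word and the type lookup to the last word keeps a name such as "N CHARLES ST" from having an interior token expanded by mistake. Names that begin with an abbreviation of another kind would still need their own entries in the special-case table.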
Uploading
For scripted uploads, the upload scripts from the Open Street Map Importing and Mechanical Edit Toolkit will be used. For data that needs to be manually reviewed, the OSM Tasking Manager will be used. Import accounts will be used.
Preview
A live preview of the formatted address data is available on CartoDB.