[Talk-transit] Re: GTFS compatibility

Hillsman, Edward hillsman at cutr.usf.edu
Wed Jun 30 14:25:31 BST 2010


Our center has a project to explore the use of OSM as a repository and tool for supporting multimodal trip planners (for example, bike to transit, ride the bus, walk or bike to final destination). We are keenly interested in the current discussion of transit and GTFS in OSM, because one of our tasks is to develop software to import from GTFS into OSM, and then update the import as a transit agency modifies its routes or stops, taking into account that OSM mappers may have found and corrected errors in what was uploaded (or may have introduced errors). I'm writing to share some of our experience and get your suggestions. We will make the software we develop in this project (for uploading, matching, and updating GTFS data in OSM) publicly available.

We think it should be relatively easy to upload a set of GTFS stops into an area where no one has mapped bus stops into OSM. Generating the route relations will be harder and we may not accomplish that as part of this project. And we think that updating such data will be relatively simple, because it can rely on tags identifying and cross-referencing the stops; software would look for changes, and manual work would be needed to reconcile them. The hard part is going to be designing the initial upload process to work in areas where OSM already includes some bus stops, but not all of them. In the state of Florida, where we are working, there are about 450 stops already in OSM, many in areas served by transit agencies with GTFS data. Obviously, we want to respect what has been mapped. Things that complicate the initial upload include:
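
As a rough illustration of the "software would look for changes" step, here is a minimal sketch that compares two versions of an agency's GTFS stops.txt and lists the stops that were added, removed, or changed. The column names (stop_id, stop_name, stop_lat, stop_lon) come from the GTFS specification; the function names and the coordinate tolerance are assumptions made for this example, not part of any existing tool.

```python
import csv
import io


def read_stops(text):
    """Parse the contents of a GTFS stops.txt into {stop_id: (name, lat, lon)}."""
    reader = csv.DictReader(io.StringIO(text))
    return {
        row["stop_id"]: (
            row["stop_name"],
            float(row["stop_lat"]),
            float(row["stop_lon"]),
        )
        for row in reader
    }


def diff_stops(old, new, tolerance_deg=1e-6):
    """Return (added, removed, changed) stop_id lists between two feed versions.

    tolerance_deg is a hypothetical threshold that ignores coordinate
    differences small enough to be rounding noise.
    """
    added = sorted(set(new) - set(old))
    removed = sorted(set(old) - set(new))
    changed = []
    for stop_id in sorted(set(old) & set(new)):
        old_name, old_lat, old_lon = old[stop_id]
        new_name, new_lat, new_lon = new[stop_id]
        moved = (
            abs(old_lat - new_lat) > tolerance_deg
            or abs(old_lon - new_lon) > tolerance_deg
        )
        if moved or old_name != new_name:
            changed.append(stop_id)
    return added, removed, changed
```

The resulting stop_id lists would then be looked up in OSM through whatever cross-reference tag the upload used, and each difference reconciled by hand against any edits mappers have made since.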

(1) Locational errors in the GTFS data. These are not systematic, and some are surprisingly large. One stop is more than 200 meters from its actual location, and only about 10 meters from a different stop that GTFS places within 10 meters of its own actual location (and that is mapped accurately in OSM). We came into this project knowing that there is locational error in GTFS. Now we are trying to figure out how to deal with it. The GTFS locations do match those appearing in Google Transit, by the way.
(2) Locational errors in the OSM data. These aren't systematic either but tend to be much smaller, except that in a few cases the stop has been recorded on the wrong side of the street, and a mapper in one city has recorded stops as nodes defining the street way rather than as points to the sides of the street.
(3) Incomplete and inconsistent tagging of the OSM stops. 
(4) The presence in an area of stops for multiple agencies, only one of which has GTFS data. Our campus has a shuttle bus circulator system with no GTFS data (they operate without a set schedule but with a target 10-minute headway, and frequency changes during the day and with the university class schedule). The area's main public transportation agency has several routes that pass through the campus, and has GTFS data. Most of the public-agency stops on campus, but not all, are also campus shuttle stops, and there are many more shuttle stops on campus than there are public-agency stops.
(5) Incomplete mapping of stops for each agency in OSM.
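
To make the matching problem concrete, here is a minimal sketch of a distance-based first pass that sorts each GTFS stop into "new", "match", or "review" according to how many existing OSM stops lie within a search radius. The 30-meter radius, the dictionary layout, and the function names are assumptions for illustration only. Given complications (1) and (2), distance alone can produce a wrong match (for example, the misplaced stop that sits 10 meters from a different stop), so anything other than "new" would still need a human check.

```python
import math


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two WGS84 points."""
    radius_m = 6371000.0  # mean Earth radius
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (
        math.sin(dphi / 2) ** 2
        + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    )
    return 2 * radius_m * math.asin(math.sqrt(a))


def classify(gtfs_stop, osm_stops, radius_m=30.0):
    """Classify one GTFS stop against existing OSM stops.

    Returns ("new", []) when no OSM stop is within radius_m,
    ("match", [osm_id]) when exactly one is, and
    ("review", [osm_ids nearest first]) when several are.
    radius_m is a hypothetical threshold, not a recommended value.
    """
    nearby = sorted(
        (
            haversine_m(gtfs_stop["lat"], gtfs_stop["lon"], o["lat"], o["lon"]),
            o["id"],
        )
        for o in osm_stops
    )
    nearby = [(dist, osm_id) for dist, osm_id in nearby if dist <= radius_m]
    if not nearby:
        return "new", []
    if len(nearby) == 1:
        return "match", [nearby[0][1]]
    return "review", [osm_id for _, osm_id in nearby]
```

A stricter version might also compare stop names or operator tags before accepting a "match", which would help where stops for several agencies share a location as in (4).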

At the moment, we are rethinking the whole idea of trying to match the GTFS stops to the OSM stops for the initial upload. One idea would be to screen all existing OSM stops in a GTFS area for tags indicating the operator (or no operator), tag all of them with a FIXME noting that an upload has occurred and may have produced duplicates, but otherwise leave them alone, and then upload the GTFS ones. I see problems with that, and in any case it should be done only if the uploader commits to working quickly to reconcile the two data sets in OSM. Given the surprisingly large locational errors in GTFS, I'm also uncomfortable with simply uploading it, because putting bad data into the system will create confusion. I suspect this is a problem with all uploads. We've certainly seen it with the TIGER street data.
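
The screen-and-flag idea could look roughly like the sketch below, which works on an in-memory list of stops rather than on the live OSM database. It assumes each existing stop is a dictionary with an "id" and a "tags" mapping, and that the OSM keys operator and fixme are used in their usual way; the function name and the wording of the note are hypothetical. It groups stops by operator (or no operator), appends the note to each stop's fixme tag, and leaves every other tag alone.

```python
from collections import defaultdict


def screen_and_flag(osm_stops, note):
    """Group existing stops by operator and add a fixme note to each.

    Returns (flagged, by_operator). flagged holds copies of the stops with
    the note added; the input list is not modified. by_operator maps each
    operator value (None when the tag is absent) to a list of stop ids.
    """
    by_operator = defaultdict(list)
    flagged = []
    for stop in osm_stops:
        tags = dict(stop["tags"])  # copy so the caller's data is untouched
        operator = tags.get("operator")
        existing = tags.get("fixme")
        # Keep any fixme text a mapper already left, and append ours.
        tags["fixme"] = f"{existing}; {note}" if existing else note
        flagged.append({"id": stop["id"], "tags": tags})
        by_operator[operator].append(stop["id"])
    return flagged, dict(by_operator)
```

The by_operator grouping gives a quick count of how many existing stops belong to the agency being uploaded, to other agencies, or to no named operator, which is the information needed to judge how much reconciliation work the upload would create.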

But we are still in the thinking-about-this stage, haven't made any decisions, and are looking for suggestions and comments (hence this posting). Until we get a much better handle on the initial upload problems, any actual uploading we do as part of the project will be limited to the area of our campus, where we know what is actually on the ground and can clean up anything we do. We'd definitely enjoy sharing work and ideas.

Ed Hillsman

Edward L. Hillsman, Ph.D.
Senior Research Associate
Center for Urban Transportation Research
University of South Florida
4202 Fowler Ave., CUT100
Tampa, FL  33620-5375
813-974-2977 (tel)
813-974-5168 (fax)
hillsman at cutr.usf.edu
http://www.cutr.usf.edu



On Tue, 29 Jun 2010 15:26:07 +0100 Joe Hughes <joe at headwayblog.com> wrote:
>I agree that it would be helpful to end up with something that allows
>straightforward conversions to and from the GTFS format.  GTFS is a
>CC-licensed specification [1] which is evolved by an open community
>process [2].  Also, the great majority of U.S. and Canadian transport
>data is already available to developers in GTFS format [3], which has
>led to a community of developers  creating apps which can consume and
>produce it [4].
>
>Incidentally, as someone who's been deeply involved in the development
>of the format, I'm happy to answer any questions, and generally help
>to get this substantial mass of transport data into OSM.
>
>Cheers,
>Joe
>
>Links:
>[1] http://code.google.com/transit/spec/transit_feed_specification.html
>[2] http://groups.google.com/group/gtfs-changes/
>[3] http://www.gtfs-data-exchange.com/
>[4] http://groups.google.com/group/transit-developers



