Open-source, crowdsourced bicycle data containing location and time information, gathered through a smartphone application, has been used by several metropolitan planning organizations (MPOs) in cities such as San Francisco, Atlanta, and Philadelphia.
Data at the ends of recorded trips is usually extraneous or suspect: most users do not set off immediately after they start tracking, nor do they stop tracking immediately upon reaching their destination. To clean the trip ends automatically, a polygon containing a given set of initial points is created using a convex hull algorithm. The polygon is then grown iteratively: each successive point from the GPS trace is added, a new polygon is created, and its centroid and maximum radius are calculated. Once the maximum radius threshold is met or the bearing (heading) of the bicycle data becomes consistent, the polygon shape is fixed and the trace is backtracked to its first intersection with the polygon. Points following the intersection with the polygon are flagged as trip ends and ignored in further processing. The centroid of the polygon becomes the effective origin or destination, depending on which end of the trip is being cleaned. The method can also be adapted to anonymize GPS trace data by relaxing the maximum radius constraint.
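The hull-growing step can be illustrated with a minimal Python sketch. It is not the implementation described above: it assumes planar coordinates in metres, uses the average of the hull vertices as the centroid, and stops only on the radius threshold, omitting the bearing-consistency test and the backtracking to the polygon intersection. The names `detect_trip_end`, `n_initial`, and `max_radius`, and the 25 m default, are hypothetical.

```python
from math import hypot


def convex_hull(points):
    """Andrew's monotone chain; returns hull vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]


def centroid_and_radius(hull):
    """Vertex-average centroid and the largest centroid-to-vertex distance."""
    cx = sum(x for x, _ in hull) / len(hull)
    cy = sum(y for _, y in hull) / len(hull)
    return (cx, cy), max(hypot(x - cx, y - cy) for x, y in hull)


def detect_trip_end(trace, n_initial=3, max_radius=25.0):
    """Return (count of leading points flagged as the trip end, hull centroid).

    Assumption: the radius threshold alone decides when to stop growing;
    the bearing test and the backtracking step are left out of this sketch.
    """
    if not trace:
        raise ValueError("trace must contain at least one point")
    n = min(n_initial, len(trace))
    hull = convex_hull(trace[:n])
    centroid, _ = centroid_and_radius(hull)
    for i in range(n, len(trace)):
        # Points inside the current hull cannot be vertices of the new one,
        # so re-hulling the old vertices plus the new point is sufficient.
        candidate = convex_hull(hull + [trace[i]])
        c, r = centroid_and_radius(candidate)
        if r > max_radius:
            break  # the rider has left the start cluster
        hull, centroid, n = candidate, c, i + 1
    return n, centroid


# Hypothetical trace: five points of jitter near the origin, then travel.
trace = [(0, 0), (2, 1), (1, 3), (3, 2), (2, 2), (60, 5), (120, 10), (180, 15)]
n_flagged, origin = detect_trip_end(trace)
cleaned = trace[n_flagged:]
```

Under the same assumptions, the destination end could be handled by running the function on the reversed trace. Raising `max_radius` flags a larger region around the origin, which corresponds to the anonymization use mentioned above.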
Further analysis of the GPS traces, such as speed distribution by bicycle facility type, requires an automatic method of snapping traces to a street network. A street network is created and each trace is analysed point by point. Points are first snapped to the nearest link, and a shortest path algorithm is then used to determine the path between two points. Following this initial pass over the GPS trace, spurs are removed if they are short or if their speed, acceleration, and bearing do not match the fitted path; detours are also analysed if the reported GPS accuracy is low.
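The initial pass (nearest-link snapping followed by shortest-path stitching) might look like the following Python sketch. It rests on several assumptions that are not part of the method as described: planar coordinates in metres, undirected links weighted by straight-line length, Dijkstra's algorithm as the shortest path algorithm, and each snapped point represented by the closer end node of its link. Spur and detour handling is not shown, and the function names and the toy network are hypothetical.

```python
import heapq
from math import hypot, inf


def point_segment_distance(p, a, b):
    """Distance from point p to the segment a-b in planar coordinates."""
    dx, dy = b[0] - a[0], b[1] - a[1]
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0:
        return hypot(p[0] - a[0], p[1] - a[1])
    # Clamp the projection parameter so the foot stays on the segment.
    t = ((p[0] - a[0]) * dx + (p[1] - a[1]) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))
    return hypot(p[0] - (a[0] + t * dx), p[1] - (a[1] + t * dy))


def nearest_link(p, nodes, links):
    """Return the (u, v) link whose segment lies closest to point p."""
    return min(links, key=lambda l: point_segment_distance(p, nodes[l[0]], nodes[l[1]]))


def shortest_path(nodes, links, source, target):
    """Dijkstra over undirected links; returns a node list, or None if unreachable."""
    adj = {n: [] for n in nodes}
    for u, v in links:
        w = hypot(nodes[u][0] - nodes[v][0], nodes[u][1] - nodes[v][1])
        adj[u].append((v, w))
        adj[v].append((u, w))
    dist, prev, heap = {source: 0.0}, {}, [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == target:
            break
        if d > dist.get(u, inf):
            continue  # stale heap entry
        for v, w in adj[u]:
            if d + w < dist.get(v, inf):
                dist[v], prev[v] = d + w, u
                heapq.heappush(heap, (d + w, v))
    if target not in dist:
        return None
    path = [target]
    while path[-1] != source:
        path.append(prev[path[-1]])
    return path[::-1]


def match_trace(trace, nodes, links):
    """Snap each point to its nearest link, then stitch nodes with shortest paths."""
    matched = []
    for p in trace:
        u, v = nearest_link(p, nodes, links)
        # Assumption: the closer end node stands in for the snapped position.
        node = min((u, v), key=lambda n: hypot(p[0] - nodes[n][0], p[1] - nodes[n][1]))
        if not matched:
            matched.append(node)
        elif node != matched[-1]:
            leg = shortest_path(nodes, links, matched[-1], node)
            if leg:
                matched.extend(leg[1:])
    return matched


# Hypothetical toy network and trace.
nodes = {"A": (0, 0), "B": (100, 0), "C": (200, 0), "D": (100, 100), "E": (200, 100)}
links = [("A", "B"), ("B", "C"), ("B", "D"), ("D", "E"), ("C", "E")]
trace = [(5, 3), (95, 4), (103, 60), (190, 98)]
route = match_trace(trace, nodes, links)
```

Representing snapped points by end nodes keeps the sketch short but loses the position along the link; a fuller version would keep the projected position so that speed and acceleration can be compared against the fitted path.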
With this cleaning and processing, the bicycle GPS traces become more valuable for higher-level analysis.