Dockless Open Data is a short technical guide from the city of Louisville, KY covering “how and why cities can convert MDS trip data to anonymized open data, while respecting rider privacy.”

The MDS standard does not support collecting personally identifiable information (PII); however, it does support collecting detailed trip data that could be combined with other data to re-identify individuals. This guide [1] describes a method for anonymizing data collected using MDS so that it cannot be used for this purpose. The resulting data sets can be published or released in response to open data requests without fear that individuals can be identified from the data.

Trip start and end times are binned into 15-minute increments. The geographic location data is both binned and fuzzed. It is first binned by truncating latitude and longitude to 3 decimal places. It is then “fuzzed” using a k-anonymity generalization function that groups similar trips together and replaces their individual origins and destinations with the prototypical origin and destination for that group. This fuzzing is applied only to origin/destination bin pairs with fewer than 5 trips. The guide describes the entire process in detail and provides and explains SQL and other sample code. The processed data can be seen on Louisville’s public dashboard (https://cdolabs-admin.carto.com/builder/f57ee92e-09c3-4efd-b7c0-3d561cc9e951/embed).
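The binning steps and the fewer-than-5 threshold described above can be illustrated with a minimal Python sketch. It is not the guide's own code (the guide uses SQL), and the field names (`start_time`, `start_lat`, and so on) and the helper names are assumptions made for illustration. The k-anonymity generalization function itself is not reproduced here; the sketch only identifies which origin/destination bin pairs would need fuzzing.

```python
from collections import Counter
from datetime import datetime
from decimal import Decimal, ROUND_DOWN

K = 5  # threshold from the guide: bin pairs with fewer than 5 trips get fuzzed


def bin_time(ts: datetime) -> datetime:
    """Round a timestamp down to the start of its 15-minute bin."""
    return ts.replace(minute=ts.minute - ts.minute % 15, second=0, microsecond=0)


def truncate(coord: float) -> float:
    """Truncate (not round) a coordinate to 3 decimal places."""
    # Decimal avoids float artifacts such as 38.253 * 1000 == 38252.999...
    return float(Decimal(str(coord)).quantize(Decimal("0.001"), rounding=ROUND_DOWN))


def bin_trip(trip: dict) -> dict:
    """Bin one trip record; the input field names are hypothetical."""
    return {
        "start_time": bin_time(trip["start_time"]),
        "end_time": bin_time(trip["end_time"]),
        "origin": (truncate(trip["start_lat"]), truncate(trip["start_lon"])),
        "destination": (truncate(trip["end_lat"]), truncate(trip["end_lon"])),
    }


def rare_pairs(binned: list, k: int = K) -> set:
    """Return origin/destination bin pairs that occur in fewer than k trips."""
    counts = Counter((t["origin"], t["destination"]) for t in binned)
    return {pair for pair, n in counts.items() if n < k}
```

Trips whose bin pair appears in the result of `rare_pairs` are the ones that would be passed on to the generalization step; all other trips can be published with their binned values unchanged.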

The guide also provides links to open data from six other localities, and a general description of how each anonymizes its published data.

Additional resources include the City of Louisville, KY's public data dashboard [2], which shows trip volumes and origin/destination (O/D) pairs anonymized through binning, and a short online article [3] describing the methods that the City of Chicago uses to protect privacy before releasing transportation network provider (TNP) and taxi data to the public.