Interestingly, there are currently no common measurement standards.
To bring light into the dark, we decided to share our approach for an automated map data testing pipeline.
This includes the collection of a ground truth dataset in a target area, as well as the comparison with serviced data.
This post demonstrates the pipeline setup to test parking availability predictions.
We chose Berlin-Mitte as a sample target area.
Special: We published all relevant data online, open and available to everyone (here and here).
Automating manual work

Many mapping companies still send out human surveyors to collect a ground truth and compare it with the content they provide.
This method is not only inaccurate and prone to errors, it is also slow and extremely expensive.
Thus, we decided to automate what’s possible: We collect geo-referenced imagery with a low-cost setup, analyze the data using computer vision and compute quality metrics on the results, all in one single data processing pipeline.
Collecting raw data

We mounted a smartphone in the windshield of our car and ran a simple custom app that captures a series of geo-referenced images.
Every image has a corresponding set of location and movement information.
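As a rough illustration of what such a per-image record could look like, here is a minimal sketch. The field names, the sanity filter and the sample values are all hypothetical; the actual schema of the capture app is not described in this post.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class GeoReferencedImage:
    """One captured frame plus the sensor readings taken at the same moment.

    All field names are hypothetical, chosen only for illustration.
    """
    image_path: str        # file name of the stored frame
    captured_at: datetime  # capture time (UTC)
    latitude: float        # WGS84 degrees
    longitude: float       # WGS84 degrees
    heading_deg: float     # direction of travel, 0 = north
    speed_mps: float       # vehicle speed in metres per second


def is_usable(frame: GeoReferencedImage, max_speed_mps: float = 14.0) -> bool:
    """Hypothetical sanity filter: valid coordinates and a plausible urban speed."""
    return (
        -90.0 <= frame.latitude <= 90.0
        and -180.0 <= frame.longitude <= 180.0
        and 0.0 <= frame.speed_mps <= max_speed_mps
    )


# Illustrative values only, not taken from the published dataset.
frame = GeoReferencedImage(
    image_path="frame_000123.jpg",
    captured_at=datetime(2019, 3, 13, 14, 5, tzinfo=timezone.utc),
    latitude=52.5200,
    longitude=13.3880,
    heading_deg=90.0,
    speed_mps=6.5,
)
print(is_usable(frame))
```

Keeping location and movement data in the same record as the image reference makes it straightforward to assign each detection to a road segment later on.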
Standard smartphone in the windshield mount

Below is a first glimpse of the input data. During the test drive, the camera was centered a little more towards the right side of the street.
We did not capture the left side of the street because the view was obstructed by other cars quite frequently.
The resulting limitation: In one-way streets where parking is possible, we did not cover the left side of the street.
Raw data (left), availability analysis (right), captured by smartphone behind the windshield

We collected data in Berlin-Mitte in the extended area around Friedrichstraße.
The location is interesting due to its great diversity: While the south is one of Berlin’s most frequented places, the south-eastern part of the covered area (Humboldt University) provides fewer parking options and is also less frequented.
In the north, we see mostly residential areas, where car ownership is generally high compared to the surrounding areas.
The test drive took place on Wednesday, 13th of March 2019 between 2 pm and 4.
This way, we were able to obtain a realistic snapshot of the parking situation in the covered area between (late) lunchtime and the beginning of the daily evening rush hour.
Covered area in Berlin Mitte

Extracting a parking availability ground truth

We fed the data into a custom module of our proprietary computer vision segmentation model.
We specifically designed it to capture the number of parked cars on the street sides (counter in the lower left corner).
The video below shows the result of the analysis for an exemplary road segment.
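The counter of parked cars becomes a ground truth value once it is related to the capacity of a parking area: the metric used later in this post is the number of open spots divided by the total capacity. A minimal sketch of that step, assuming the capacity per parking area is known from another source (the function name and the clamping behaviour are assumptions, not part of the pipeline described here):

```python
def availability_ratio(parked_cars: int, capacity: int) -> float:
    """Share of open spots in a parking area, between 0.0 and 1.0.

    Hypothetical helper: capacity is assumed to come from map data.
    """
    if capacity <= 0:
        raise ValueError("capacity must be positive")
    if parked_cars < 0:
        raise ValueError("parked_cars must not be negative")
    # Clamp at zero in case more cars are counted than the nominal capacity.
    open_spots = max(capacity - parked_cars, 0)
    return open_spots / capacity


print(availability_ratio(parked_cars=9, capacity=12))
```

Clamping is one way to handle over-counting (for example, cars parked outside marked spots); discarding such segments would be an equally valid choice.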
Automated counting of parked cars with computer vision

Quality metric

We wanted to compare the accuracy of our predictive parking model with the actual situation on the street.
First, a quick background on the idea of predictive parking: The goal is to predict whether an individual driver will be able to find a vacant parking spot at a specific parking option upon arrival.
If you want to learn more about this, check out this earlier post.
The AIPARK API (v. 0) returns a prediction value between 0 and 100.
In end-user services, this availability prediction is typically represented in the form of a color scheme with three easy categories:

green: high availability
yellow: intermediate availability
red: low availability

Similar color scheme as for traffic lights, Photo by Kai Pilger on Unsplash

In order to enable benchmarking of different services and to stick with the typical user experience, we decided to map the prediction values to the three categories using equally sized value ranges.
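The mapping to three equally sized ranges could be sketched as follows. This is an assumption about how such a discretization might look, not the actual implementation; in particular, the handling of values exactly on a boundary is a guess. The labels "low", "high" and "very_high" follow the ones used in the comparison function in this post, and their assignment to red, yellow and green is inferred from the order of the categories.

```python
# Inferred mapping from category label to display color (an assumption).
CATEGORY_COLORS = {"low": "red", "high": "yellow", "very_high": "green"}


def discretize_prediction(prediction: float) -> str:
    """Map a prediction between 0 and 100 to one of three equally wide categories."""
    if not 0 <= prediction <= 100:
        raise ValueError("prediction must be between 0 and 100")
    if prediction < 100 / 3:
        return "low"
    if prediction < 200 / 3:
        return "high"
    return "very_high"


for value in (10, 50, 90):
    label = discretize_prediction(value)
    print(value, label, CATEGORY_COLORS[label])
```

Using the same thresholds for every service is what makes the resulting accuracy figures comparable across providers.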
This function shows how it is determined whether a prediction is correct:

    def compare(prediction, actual_value):
        """
        prediction is a value between 0 and 100
        actual_value is the number of open spots / total capacity
        return value states if the prediction was true or false
        """
        d = discretize_prediction(prediction)
        # Compare strings with ==; "is" checks object identity, not equality.
        if inRange(actual_value, 0, 1/3) and d == "low":
            return True
        elif inRange(actual_value, 1/3, 2/3) and d == "high":
            return True
        elif inRange(actual_value, 2/3, 1) and d == "very_high":
            return True
        else:
            return False

Prediction accuracy may then simply be computed using this formula:

    accuracy = number of correct predictions / total number of predictions

Results

Running the analysis pipeline yields a prediction accuracy of 91.94% when applied to the full test data set of all 62 parking areas.
While the overall prediction accuracy is already remarkable, analyzing the remaining errors is more interesting:

For four parking areas, where the prediction was too positive (predicted availability is higher than ground truth), the mean category deviation is 0.8 at an estimated standard deviation of 0.
For those parking areas, where the prediction was too negative (predicted availability is worse than ground truth), the mean category deviation is -0.8 at an estimated standard deviation of 0.
In fact, the predicted category deviates by at most one from the ground truth availability category.
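For readers who want to run a similar error analysis on their own data, here is a minimal sketch. It assumes that "category deviation" means the signed difference between category indices and that the estimated standard deviation is the sample standard deviation; the post does not spell out either definition, so treat both as one possible reading. The toy data is made up.

```python
from statistics import mean, stdev

# Category order taken from the labels in compare(); the numeric index is an assumption.
CATEGORY_INDEX = {"low": 0, "high": 1, "very_high": 2}


def category_deviation(predicted: str, actual: str) -> int:
    """Signed distance in categories; positive means the prediction was too optimistic."""
    return CATEGORY_INDEX[predicted] - CATEGORY_INDEX[actual]


def spread(values):
    """Return (mean, sample standard deviation); stdev needs at least two values."""
    if not values:
        return None, None
    return mean(values), (stdev(values) if len(values) > 1 else None)


def summarize(pairs):
    """pairs is a list of (predicted_category, ground_truth_category) tuples."""
    deviations = [category_deviation(p, a) for p, a in pairs]
    return {
        "accuracy": sum(d == 0 for d in deviations) / len(deviations),
        "too_positive": spread([d for d in deviations if d > 0]),
        "too_negative": spread([d for d in deviations if d < 0]),
    }


# Made-up toy data, not the Berlin-Mitte measurements.
toy_pairs = [
    ("high", "high"),
    ("very_high", "high"),
    ("very_high", "low"),
    ("low", "high"),
    ("low", "low"),
]
summary = summarize(toy_pairs)
print(summary)

# Sanity check on the reported figure: 57 correct out of 62 areas rounds to 91.94 %.
print(round(57 / 62 * 100, 2))
```

Splitting the errors by sign shows whether a service tends to be too optimistic or too pessimistic, which a single accuracy number hides.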
Conclusion

With this post, we want to show how map quality testing for sophisticated features such as parking availability can be automated using a low-cost data collection setup and an AI-powered analysis pipeline.
In the next posts, we’ll introduce some more thoughts about data quality and automation.
Stay tuned!

Want to reproduce the results?

If you want to reproduce the results of this article or play around with the data, follow the instructions in the project on GitHub.
Results will be printed on the command line if you run compare_predictions.py.

Note: You will need to enter your AIPARK API key first at the bottom of the file.
Sign up here to retrieve your API key: https://studio.
Also, make sure that you are using Python 3 rather than Python 2.
Otherwise, you may end up with rounding errors.
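A likely source of these errors, assuming the category thresholds are written as 1/3 and 2/3 as in the compare function: in Python 2, the "/" operator applied to two integers performs floor division, so both thresholds evaluate to 0. The short sketch below reproduces that behaviour in Python 3 with the "//" operator.

```python
# Python 3: "/" is true division, "//" is floor division.
true_thresholds = (1 / 3, 2 / 3)

# Python 2 evaluates 1/3 and 2/3 on integers the way Python 3 evaluates "//".
floored_thresholds = (1 // 3, 2 // 3)

print(true_thresholds)
print(floored_thresholds)
```

With both thresholds collapsed to 0, the lower two ranges no longer separate anything, so the comparison would produce misleading results.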
Keen to check out the raw ground truth?

We shared the anonymized street-level footage here.
About the author

Torgen is CIO and Co-Founder at AIPARK, a Berlin-based tech company that provides live parking maps for developers.
AIPARK’s APIs extend the functionality of Connected Cars, Mobile Apps and Traffic Management systems.