Learning to Evaluate Perception Models Using Planner-Centric Metrics

by Jonah Philion, et al.

Variants of accuracy and precision are the gold standard by which the computer vision community measures progress of perception algorithms. One reason for the ubiquity of these metrics is that they are largely task-agnostic; in general, we seek to produce zero false negatives and zero false positives. The downside of these metrics is that, at worst, they penalize all incorrect detections equally without conditioning on the task or scene, and at best, heuristics need to be chosen to ensure that different mistakes count differently. In this paper, we propose a principled metric for 3D object detection specifically for the task of self-driving. The core idea behind our metric is to isolate the task of object detection and measure the impact the produced detections would have on the downstream task of driving. Without being hand-designed to do so, our metric penalizes many of the mistakes that other metrics penalize by design. In addition, our metric downweights detections based on additional factors, such as the distance from a detection to the ego car and the speed of the detection, in intuitive ways that other detection metrics do not. For human evaluation, we generate scenes in which standard metrics and our metric disagree and find that humans side with our metric 79% of the time. Our project page, including an evaluation server, can be found at https://nv-tlabs.github.io/detection-relevance.
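The core idea above, scoring detections by how much they change the downstream driving plan, can be illustrated with a minimal sketch. It assumes a planner that outputs, for each future timestep, a discrete probability distribution over candidate ego positions, and it compares the plan obtained from ground-truth objects with the plan obtained from detected objects using a KL divergence. The abstract does not specify the divergence or the planner interface, so the function name `planner_kl`, its inputs, and the choice of KL are assumptions made for illustration only.

```python
import math

def planner_kl(p_gt, p_det, eps=1e-12):
    """Hypothetical planner-centric score (illustrative, not the paper's code).

    p_gt  : list of per-timestep distributions produced by a planner that
            was given ground-truth objects (each a flat list of weights).
    p_det : same shape, produced by the same planner given detected objects.
    Returns the sum over timesteps of KL(p_gt[t] || p_det[t]).
    """
    total = 0.0
    for p_t, q_t in zip(p_gt, p_det):
        p_sum, q_sum = sum(p_t), sum(q_t)
        for p, q in zip(p_t, q_t):
            # Normalize so each timestep is a proper distribution.
            p, q = p / p_sum, q / q_sum
            if p > 0:
                # Clamp q to avoid log of zero (assumed smoothing choice).
                total += p * math.log(p / max(q, eps))
    return total

# Identical plans give a score of zero; diverging plans give a positive score.
same = planner_kl([[0.5, 0.5]], [[0.5, 0.5]])
diff = planner_kl([[0.5, 0.5]], [[0.9, 0.1]])
```

Under this sketch, a detection error that leaves the plan unchanged costs nothing, while an error that shifts the plan is penalized in proportion to the shift. That matches the task-conditioned behavior the abstract describes.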




On Offline Evaluation of 3D Object Detection for Autonomous Driving

Prior work in 3D object detection evaluates models using offline metrics...

A Quality Index Metric and Method for Online Self-Assessment of Autonomous Vehicles Sensory Perception

Perception is critical to autonomous driving safety. Camera-based object...

On the Metrics for Evaluating Monocular Depth Estimation

Monocular Depth Estimation (MDE) is performed to produce 3D information ...

Towards Driving-Oriented Metric for Lane Detection Models

After the 2017 TuSimple Lane Detection Challenge, its dataset and evalua...

Tightness-aware Evaluation Protocol for Scene Text Detection

Evaluation protocols play key role in the developmental progress of text...

The efficacy of Neural Planning Metrics: A meta-analysis of PKL on nuScenes

A high-performing object detection system plays a crucial role in autono...
