Benchmarking YOLOv4–YOLOv11 for Autonomous Driving: Small-Object Detection, Adverse Conditions and Confidence Calibration
Abstract
Autonomous vehicles rely on real-time object detection to perceive their surroundings and make safety-critical decisions. The You Only Look Once (YOLO) family of one-stage detectors is attractive for embedded platforms because it delivers high throughput; however, achieving high accuracy, fast inference and reliable confidence estimation simultaneously remains challenging. This study investigates how detection-head design (anchor-based vs. anchor-free), intersection-over-union (IoU) loss functions and post-processing strategies (standard non-maximum suppression (NMS) vs. NMS-free training) influence both accuracy and calibration in autonomous-driving scenarios. Experiments were conducted on the BDD100K validation split using a unified training recipe with 640×640 images, consistent data augmentations and identical hyper-parameters across eight configurations. Mean Average Precision (mAP), Expected Calibration Error (ECE), Brier score and end-to-end inference speed (frames per second, FPS) were measured, together with an error taxonomy for small objects. To further improve confidence reliability, a simple post-hoc temperature-scaling calibration was applied and evaluated. The results show that an anchor-free head with a Complete-IoU (CIoU) loss and NMS-free training achieves the best accuracy–efficiency trade-off, reducing ECE from 2.6% to 2.1% and increasing throughput to 97 FPS without sacrificing mAP. Temperature scaling further decreases ECE by approximately 0.5 percentage points and enlarges the precision–recall area for low-confidence detections. These findings demonstrate that carefully chosen architectural and post-processing design choices can significantly improve both the accuracy and the trustworthiness of YOLO-based detectors for autonomous vehicles.
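For readers unfamiliar with the post-hoc calibration step summarized above, the sketch below illustrates temperature scaling of detection confidences and the binned ECE metric on synthetic data. It is a minimal illustration under assumed conventions, not the paper's implementation: the function names, bin count, temperature grid and toy data are all assumptions.

```python
# Minimal sketch (illustrative, not the paper's code): post-hoc temperature
# scaling of per-detection confidence logits and a binned Expected
# Calibration Error (ECE) estimate. Bin count, temperature grid and the
# synthetic data are assumptions made for this example.
import numpy as np

def temperature_scale(logits, T):
    """Soften raw confidence logits by dividing by a temperature T, then apply a sigmoid."""
    return 1.0 / (1.0 + np.exp(-logits / T))

def expected_calibration_error(confidences, correct, n_bins=15):
    """Weighted average gap between mean confidence and empirical accuracy per confidence bin."""
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        mask = (confidences > lo) & (confidences <= hi)
        if mask.any():
            gap = abs(confidences[mask].mean() - correct[mask].mean())
            ece += mask.mean() * gap  # bin weight = fraction of detections in the bin
    return ece

def fit_temperature(logits, correct, candidates=np.linspace(0.5, 3.0, 26)):
    """Grid-search the temperature that minimises ECE on a held-out split."""
    return min(candidates,
               key=lambda T: expected_calibration_error(temperature_scale(logits, T), correct))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Synthetic stand-in for per-detection logits and true/false-positive labels;
    # the "true" temperature of 1.8 makes the raw confidences overconfident.
    logits = rng.normal(1.5, 2.0, size=5000)
    correct = (rng.uniform(size=5000) < temperature_scale(logits, 1.8)).astype(float)
    raw_conf = temperature_scale(logits, 1.0)
    T = fit_temperature(logits, correct)
    print("ECE before calibration:", expected_calibration_error(raw_conf, correct))
    print(f"ECE after calibration (T={T:.2f}):",
          expected_calibration_error(temperature_scale(logits, T), correct))
```

In this toy setup the fitted temperature lands near the value used to generate the labels and the binned ECE drops accordingly, mirroring the roughly 0.5-percentage-point reduction reported in the abstract only in spirit, not in magnitude.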