Abstract
The Operational Design Domain (ODD) of Level 4 (L4) autonomous driving confronts formidable challenges in urban mixed-traffic environments, primarily due to the high density of Vulnerable Road Users (VRUs) and unpredictable interaction behaviors. However, existing open-source datasets predominantly focus on structured scenarios such as highways or regulated intersections, leaving a critical gap in data representing chaotic, unstructured urban environments.
To address this, we propose an efficient, high-precision method for constructing drone-based datasets and establish the Vehicle-Vulnerable Road User Interaction Dataset (VRUD). Distinct from prior works, VRUD is collected from typical "Urban Villages" in Shenzhen, characterized by loose traffic supervision and extreme occlusion. The dataset comprises 4 hours of 4K/30Hz recording, containing 11,479 VRU trajectories and 1,939 vehicle trajectories. A key differentiator of VRUD is its composition: VRUs account for about 87% of all traffic participants, significantly exceeding the proportions in existing benchmarks.
Furthermore, unlike datasets that only provide raw trajectories, we extracted 4,002 multi-agent interaction scenarios based on a novel Vector Time to Collision (VTTC) threshold, supported by standard OpenDRIVE HD maps. This study provides valuable, rare edge-case resources for enhancing the safety performance of Automated Driving Systems (ADS) in complex, unstructured urban environments.
Authors
National Key Laboratory of Automotive Chassis Integration and Bionics, Jilin University · Heriot-Watt University · Durham University
Dataset Overview
The detection results of different types of targets in VRUD are represented by bounding boxes in distinct colors, and the tracking trajectories are denoted by thin lines in the corresponding colors.
Collection Sites
Data was collected from two irregular intersections near residential areas in an urban village in Shenzhen, China. The traffic participants are highly diverse: buses, ride-hailing vehicles, food delivery electric bikes, and pedestrians. The surroundings include bus stops, apartments, and snack streets, with no traffic surveillance cameras.
Irregular intersection with two-way single-lane traffic, snack streets and residential apartment complexes.
Two-way single-lane road with residential apartments, bus stops and roadside parking.
Trajectory Visualization
Each subplot visualizes all annotated trajectories: (a) cars, buses, and trucks; (b) pedestrians and cyclists; (c) motorcycles and tricycles. The comparison reveals that VRU trajectory patterns are significantly more disordered and scattered.
(a) Cars, buses, and trucks
(b) Pedestrians and cyclists
(c) Motorcycles and tricycles
Data Validation
Data accuracy was verified using a test vehicle equipped with RT inertial navigation equipment. A soft target vehicle performed chasing maneuvers; relative distance and velocity were compared against ground truth.
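The comparison described above can be summarized with a single error figure. The sketch below uses root-mean-square error (RMSE) as one common choice; the source does not state which metric was used, and the sample values are hypothetical placeholders, not VRUD measurements. It assumes the two series have already been time-aligned.

```python
import math


def rmse(estimates, references):
    """Root-mean-square error between two equally long, time-aligned series."""
    if len(estimates) != len(references) or not estimates:
        raise ValueError("series must be non-empty and equally long")
    squared = [(e - r) ** 2 for e, r in zip(estimates, references)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical relative distances in metres (illustrative only).
drone_distance = [10.0, 9.5, 9.1, 8.4]   # estimated from drone footage
rt_distance = [10.1, 9.4, 9.0, 8.6]      # reference from the RT equipment

error = rmse(drone_distance, rt_distance)
print(f"relative-distance RMSE: {error:.3f} m")
```

The same function applies unchanged to the velocity (Vx) comparison.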
Relative distance: drone vs. RT inertial navigation
Soft target vehicle velocity (Vx) comparison
Experimental setup with RT inertial navigation equipment
VTTC-Based Interaction Extraction
We introduce Vector Time to Collision (VTTC) as a surrogate safety measure to quantify interaction relevance. The upper quartile (Q3) of the VTTC distribution, 1.53 s, was adopted as the filtering threshold to retain as many complex scenarios as possible while eliminating non-interactive noise.
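As a rough illustration of quartile-based thresholding, the sketch below computes Q3 over a set of VTTC values and keeps the samples at or below it. The record layout (`target_id`, `vttc`), the sample values, and the "keep values at or below the threshold" rule are assumptions made for the example. The VTTC computation itself is not reproduced here.

```python
import statistics


def upper_quartile(values):
    """Q3 with linear interpolation between the closest ranks."""
    return statistics.quantiles(values, n=4, method="inclusive")[2]


def filter_interactions(samples, threshold):
    """Keep samples whose VTTC does not exceed the threshold (assumed rule)."""
    return [s for s in samples if s["vttc"] <= threshold]


# Hypothetical VTTC values in seconds (illustrative only).
samples = [
    {"target_id": 1, "vttc": 0.5},
    {"target_id": 2, "vttc": 1.0},
    {"target_id": 3, "vttc": 1.5},
    {"target_id": 4, "vttc": 2.0},
    {"target_id": 5, "vttc": 4.0},
]

q3 = upper_quartile([s["vttc"] for s in samples])
kept_by_q3 = filter_interactions(samples, q3)

# Applying the fixed 1.53 s threshold reported for VRUD instead:
kept_by_reported = filter_interactions(samples, 1.53)
```

Deriving the threshold from the data ties the filter to the empirical distribution, whereas a fixed value such as 1.53 s makes scenario extraction reproducible across recordings.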
Multi-agent interaction scenario. Ego vehicle (ID 3913) interacts with highlighted critical targets.
Positive correlation between VRU count, complexity, and mean VTTC.
Dataset Statistics
Categorical distribution and average velocity statistics. Pedestrians and motorcycles predominate; motorcycles show high traffic efficiency in unstructured environments.
Comparison with Existing Datasets
| Dataset | Length | Trajectories | Road User Types | HD Map | Sampling Frequency | Behavior Extraction |
|---|---|---|---|---|---|---|
| INTERACTION | 16.5 h | 40,054 | Pedestrian, bicycle, car | Lanelet2 | 10 Hz | No |
| inD | 10.0 h | 13,599 | Pedestrian, bicycle, car, bus | Lanelet2 | 25 Hz | No |
| SIND | 7.0 h | 13,248 | Car, bus, truck, bicycle, motorcycle, tricycle, pedestrian | Lanelet2 | 10 Hz | No |
| VRUD (ours) | 4.0 h | 12,888 | Car, bus, truck, bicycle, motorcycle, tricycle, pedestrian | OpenDRIVE | 30 Hz | Yes |
VRUD vs. inD, SIND, and INTERACTION: VRUs account for about 87% of traffic participants in VRUD, significantly exceeding their share in the other datasets.
Behavioral Characterization
The ego vehicle maintains a tactical velocity corridor (17.5–20.0 km/h) to manage latent conflicts. VTTC distribution consistently clusters around 0.7 s, serving as a proxy for interaction relevance. The 0.7 s threshold establishes a robust quantitative filter for extracting high-value, interaction-critical samples.
Empirical ego-speed distribution relative to VRU counts. Quasi-normal distribution highlights low-speed maneuvering in urban mixed-traffic.
Characterization of interaction intensity via velocity-VTTC coupling.
Contributions
- Large-scale, high-resolution dataset focusing on chaotic urban environments, featuring diverse VRU types and distinct mixed traffic characteristics.
- Comprehensive data processing pipeline with a standardized scenario library based on a novel VTTC threshold, supported by OpenDRIVE HD maps.
- Detailed statistical analysis of VRU behaviors, revealing unique interaction patterns to support downstream ADS tasks.
Data Format
- Trajectory data: Static (dimensions, position, category, heading) and dynamic (velocity, acceleration, yaw rate) attributes in CSV format at 30 Hz.
- Map data: OpenDRIVE HD maps, high-resolution aerial base maps, calibration and conversion coefficients. All geographic information is anonymized.
- Interaction behavior data: Scenario index with Ego-vehicle ID, temporal window, and relevant traffic participant identifiers.
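A minimal sketch of reading trajectory CSV data and grouping it by track is shown below. The column names (`track_id`, `frame`, `category`, `x`, `y`, `vx`, `vy`) and the sample rows are hypothetical; check the headers in the released files before adapting it. Only the 30 Hz sampling rate comes from the description above.

```python
import csv
import io
import math
from collections import defaultdict

FRAME_RATE_HZ = 30  # sampling rate stated for VRUD

# Hypothetical columns and rows; the real files may be laid out differently.
SAMPLE_CSV = """track_id,frame,category,x,y,vx,vy
1,0,pedestrian,0.0,0.0,1.2,0.0
1,1,pedestrian,0.04,0.0,1.2,0.0
2,0,motorcycle,5.0,2.0,3.0,4.0
"""


def load_tracks(csv_file):
    """Group rows by track ID, converting frames to seconds and (vx, vy) to speed."""
    tracks = defaultdict(list)
    for row in csv.DictReader(csv_file):
        tracks[int(row["track_id"])].append({
            "time_s": int(row["frame"]) / FRAME_RATE_HZ,
            "category": row["category"],
            "x": float(row["x"]),
            "y": float(row["y"]),
            "speed": math.hypot(float(row["vx"]), float(row["vy"])),
        })
    return dict(tracks)


tracks = load_tracks(io.StringIO(SAMPLE_CSV))
for track_id, states in tracks.items():
    print(track_id, states[0]["category"], len(states), "samples")
```

To read a real file, pass an open file handle (for example `open(path, newline="")`) instead of the in-memory sample.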
Get the Dataset
VRUD is fully open-source. Download, explore, and contribute.
📦 Download VRUD on GitHub