Profile Images and Annotations for Vehicle Re-identification Algorithms (PRIMAVERA)

Dataset

Description

This dataset contains 636,246 profile images of vehicles representing 13,963 unique vehicles. The data was collected by a set of roadside sensors over the course of three years. Each time a vehicle passed one of the sensors, a series of images was collected. The images were processed to detect and localize each vehicle, and a license plate reader collocated with the sensor was used to provide a unique ID for the vehicle. Actual license plate numbers have been obfuscated by replacing them with an arbitrary numerical ID for each vehicle.

After localizing the vehicle in each image, the original RGB image was rotated, scaled, and shifted to produce a new RGB image of size 234x234 pixels such that the outermost two wheels are located at predetermined pixel locations in the image. In this way, all vehicle images are aligned to one another. This registration process occasionally results in a portion of certain vehicles being cut off at the edges of the image.

The dataset has been partitioned into two sets called "training" and "validation". The two partitions share no vehicles, i.e., a vehicle present in one partition is guaranteed not to be present in the other. In this way, an algorithm can be validated against a set of new vehicles that were not seen during the training process. The training set contains 543,926 images from 64,440 vehicle passes representing 11,918 unique vehicles, while the validation set contains 92,320 images from 10,991 vehicle passes representing 2,045 unique vehicles. Vehicle images are organized by directories corresponding to unique vehicles.
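The guarantee that the two partitions share no vehicles can be checked after download. The following is a minimal sketch, assuming each partition lives under its own root directory with one subdirectory per vehicle; the root paths and the helper names `vehicle_dirs` and `shared_vehicles` are hypothetical and not part of the dataset.

```python
from pathlib import Path


def vehicle_dirs(partition_root):
    """Return the names of the per-vehicle subdirectories under one partition.

    Assumes (hypothetically) that every immediate subdirectory of
    `partition_root` corresponds to exactly one unique vehicle.
    """
    return {p.name for p in Path(partition_root).iterdir() if p.is_dir()}


def shared_vehicles(train_root, val_root):
    """Return directory names present in both partitions.

    For a correctly partitioned copy of the dataset this set should be empty.
    """
    return vehicle_dirs(train_root) & vehicle_dirs(val_root)
```

An empty result from `shared_vehicles` is consistent with the stated partitioning; a non-empty result would suggest a corrupted or merged copy.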
The file naming scheme is as follows: veh_{vehID}_tr_{passID}_{frameID}_{elevation}_{timeofday}.jpg where {vehID} is the vehicle ID (unique across the entire dataset), {passID} is an identifier for each tracked vehicle pass (unique across the entire dataset), {frameID} is the index of the frame within the given vehicle pass, starting at 0, {elevation} is a two-letter string indicating whether the sensor was elevated ("el") or at ground level ("gl"), and {timeofday} is a two-letter string indicating whether the image was captured during daytime ("dt") or nighttime ("nt").
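The naming scheme above can be parsed with a regular expression. This is a minimal sketch, not an official loader: the names `FILENAME_RE`, `ImageName`, and `parse_image_name` are hypothetical, and because the exact format of {vehID} and {passID} (for example, zero padding) is not specified here, both are kept as strings.

```python
import re
from pathlib import Path
from typing import NamedTuple

# Pattern for veh_{vehID}_tr_{passID}_{frameID}_{elevation}_{timeofday}.jpg
# Assumption: vehID and passID contain no underscores.
FILENAME_RE = re.compile(
    r"^veh_(?P<veh_id>[^_]+)_tr_(?P<pass_id>[^_]+)_(?P<frame_id>\d+)"
    r"_(?P<elevation>el|gl)_(?P<time_of_day>dt|nt)\.jpg$"
)


class ImageName(NamedTuple):
    veh_id: str       # vehicle ID, unique across the dataset
    pass_id: str      # vehicle pass ID, unique across the dataset
    frame_id: int     # frame index within the pass, starting at 0
    elevation: str    # "el" (elevated) or "gl" (ground level)
    time_of_day: str  # "dt" (daytime) or "nt" (nighttime)


def parse_image_name(path):
    """Parse the fields encoded in an image file name (directories are ignored)."""
    name = Path(path).name
    match = FILENAME_RE.match(name)
    if match is None:
        raise ValueError(f"unexpected file name: {name!r}")
    return ImageName(
        veh_id=match["veh_id"],
        pass_id=match["pass_id"],
        frame_id=int(match["frame_id"]),
        elevation=match["elevation"],
        time_of_day=match["time_of_day"],
    )
```

For a made-up name such as veh_12_tr_345_0_el_dt.jpg, `parse_image_name` yields vehicle ID "12", pass ID "345", frame index 0, an elevated sensor, and daytime capture. Grouping images by `pass_id` recovers the frames of a single vehicle pass, and grouping by `veh_id` recovers all passes of one vehicle.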
