Currently we support four datasets: nuScenes, Waymo Perception, Argoverse 2 Sensor, and OpenDV.
- Download the nuScenes dataset files to `{NUSCENES_TGZ_ROOT}` on your file system. After the dataset is downloaded, there will be some `*.tgz` files under the path `{NUSCENES_TGZ_ROOT}`.
- Since the TGZ format does not support random access to its content, we recommend converting these files to the ZIP format with the following commands:
```
mkdir -p {NUSCENES_ZIP_ROOT}
python src/dwm/tools/tar2zip.py -i {NUSCENES_TGZ_ROOT}/v1.0-trainval_meta.tgz -o {NUSCENES_ZIP_ROOT}/v1.0-trainval_meta.zip
python src/dwm/tools/tar2zip.py -i {NUSCENES_TGZ_ROOT}/v1.0-trainval01_blobs.tgz -o {NUSCENES_ZIP_ROOT}/v1.0-trainval01_blobs.zip
python src/dwm/tools/tar2zip.py -i {NUSCENES_TGZ_ROOT}/v1.0-trainval02_blobs.tgz -o {NUSCENES_ZIP_ROOT}/v1.0-trainval02_blobs.zip
...
python src/dwm/tools/tar2zip.py -i {NUSCENES_TGZ_ROOT}/v1.0-trainval10_blobs.tgz -o {NUSCENES_ZIP_ROOT}/v1.0-trainval10_blobs.zip
```
- Now `{NUSCENES_ZIP_ROOT}` is ready to update the nuScenes file system of your config file, for example.
- Prepare the HD map data.
- Optional. When the 3D box conditions are used for training, the 12 Hz metadata is recommended.
- Download the 12 Hz nuScenes metadata from Corner Case Scene Generation. After the metadata is downloaded, there will be an `interp_12Hz.tar` file.
- Extract and repack the 12 Hz metadata to `interp_12Hz_trainval.zip`, then update the file system and dataset name in the config.
```
python -m tarfile -e interp_12Hz.tar
cd data/nuscenes
python -m zipfile -c ../../interp_12Hz_trainval.zip interp_12Hz_trainval/
cd ../..
rm -rf data/
```
- Alternative to the previous step. If the download link is broken, you can also regenerate the 12 Hz annotations from the original nuScenes dataset following the instructions of ASAP.
- Download the text prompt annotations and update the config following the section "Text description for images".
There are two versions of the Waymo Perception dataset. This project uses version 1 (>= 1.4.2) because it is the only version that provides HD map annotations.
- Optional. Waymo Perception 1.x requires Protocol Buffers. If you want to avoid installing `waymo_open_dataset` and its dependencies, you need to compile the proto files yourself: install the protocol buffer compiler, then run the following commands. After compilation, `import waymo_open_dataset.dataset_pb2` works once `externals/waymo-open-dataset/src` is added to the environment variable `PYTHONPATH`.
```
cd externals/waymo-open-dataset/src
protoc --proto_path=. --python_out=. waymo_open_dataset/*.proto
protoc --proto_path=. --python_out=. waymo_open_dataset/protos/*.proto
```
- Download the Waymo Perception dataset (>= 1.4.2 for the HD map annotations) to `{WAYMO_ROOT}`. After the dataset is downloaded, there will be some `*.tfrecord` files under the paths `{WAYMO_ROOT}/training` and `{WAYMO_ROOT}/validation`.
- Then make information JSON files to support inner-scene random access:
```
PYTHONPATH=src python src/dwm/tools/dataset_make_info_json.py -dt waymo -i {WAYMO_ROOT}/training -o {WAYMO_ROOT}/training.info.json
PYTHONPATH=src python src/dwm/tools/dataset_make_info_json.py -dt waymo -i {WAYMO_ROOT}/validation -o {WAYMO_ROOT}/validation.info.json
```
- Now `{WAYMO_ROOT}` and its information JSON files are ready to update the Waymo dataset of your config file, for example.
- Download the text prompt annotations and update the config following the section "Text description for images".
- Download the Argoverse 2 Sensor dataset files to `{ARGOVERSE_ROOT}` on your file system. After the dataset is downloaded, there will be some `*.tar` files under the path `{ARGOVERSE_ROOT}`.
- Then make information JSON files to accelerate loading:
```
PYTHONPATH=src python src/dwm/tools/dataset_make_info_json.py -dt argoverse -i {ARGOVERSE_ROOT} -o {ARGOVERSE_ROOT}
```
- Now `{ARGOVERSE_ROOT}` is ready to update the Argoverse file system of your config file, for example.
- Download the text prompt annotations and update the config following the section "Text description for images".
- Download the OpenDV dataset video files to `{OPENDV_ORIGIN_ROOT}` on your file system, and the meta file, prepared in JSON format, to `{OPENDV_JSON_META_PATH}`. After the dataset is downloaded, there will be about 2K video files in `.mp4` and `.webm` format under the path `{OPENDV_ORIGIN_ROOT}`.
- Optional. It is recommended to transcode the original video files for better read and seek performance during training:
```
apt update && apt install -y ffmpeg
python src/dwm/tools/transcode_video.py -c src/dwm/tools/transcode_video.json -i {OPENDV_ORIGIN_ROOT} -o {OPENDV_ROOT}
```
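Transcoding helps seek performance mainly by enforcing a short, regular keyframe interval, so a random seek during training lands near a decodable frame. A sketch of an equivalent ffmpeg invocation assembled in Python; the concrete flags are illustrative assumptions, not the contents of `transcode_video.json`:

```python
def build_transcode_cmd(src, dst, gop=30, crf=23):
    """Assemble an ffmpeg command line that re-encodes a clip with a
    fixed GOP for fast seeking. The encoder and quality settings here
    are assumptions for illustration."""
    return [
        "ffmpeg", "-y", "-i", src,
        "-c:v", "libx264", "-crf", str(crf),
        "-g", str(gop), "-keyint_min", str(gop),  # force regular keyframes
        "-an",  # the audio track is not needed for training
        dst,
    ]
```

Such a command would then be run per file, e.g. with `subprocess.run(build_transcode_cmd(src, dst), check=True)`.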
- Now `{OPENDV_ORIGIN_ROOT}` (or `{OPENDV_ROOT}`) is ready to update the OpenDV file system config, and `{OPENDV_JSON_META_PATH}` is ready to update the dataset config.
Register an account and download the KITTI360 dataset. We only require the LiDAR data from KITTI360, so you only need to download the Raw Velodyne Scans, 3D Bounding Boxes, and Vehicle Poses. The scenes 2013_05_28_drive_0000_sync and 2013_05_28_drive_0002_sync are used for validation, while all other scenes are used for training. You may keep the files in their original ZIP format, as our code processes them automatically.
We generated the image captions for the nuScenes, Waymo, Argoverse, and OpenDV datasets with the DriveMLM model. The caption files are available here.
| Dataset | Downloads |
|---|---|
| nuScenes | mini, trainval |
| Waymo | trainval |
| Argoverse | trainval |
| OpenDV | all |
- Download the packages above and unzip them.
- You will get some JSON files such as `nuscenes_v1.0-trainval_caption_v2_train.json` (the text of the image captions) and `nuscenes_v1.0-trainval_caption_v2_times_train.json` (the moments of the frames selected for image caption annotation in each scene).
- Update the paths to those files in your config. Please note that the paths in the `image_description_settings` of all the datasets should point to your locally downloaded and extracted files.