TartanGround Dataset
Welcome to the TartanGround dataset documentation — a large-scale dataset for ground robot perception and navigation.
Quick Links
🌐 Dataset Webpage | 📄 Paper | 💻 GitHub Repository | 📊 Metadata
Installation
# 1. Create and activate conda environment
conda create -n tartanground python=3.9
conda activate tartanground

# 2. Clone repository with all submodules
git clone --recursive git@github.com:castacks/tartanairpy.git
cd tartanairpy

# 3. Ensure submodules are up to date
git submodule update --init --recursive

# 4. Install the package
pip install -e .
Note
Make sure you have git and conda installed on your system before proceeding.
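To confirm the setup, a minimal check is to import the toolkit and initialize it against your dataset root; this is only a sketch, and the path below is a placeholder for your local dataset location.

# Minimal installation check; '/path/to/tartanground/root' is a placeholder.
import tartanair as ta

ta.init('/path/to/tartanground/root')
print("tartanairpy is installed and initialized")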
Dataset Structure
The TartanGround dataset is organized hierarchically by environment and robot type. Each environment contains data for multiple robot platforms with comprehensive sensor modalities.
Directory Layout
TartanGround_Root/
├── AbandonedCable/
│   ├── AbandonedCable_rgb.pcd      # Global RGB point cloud
│   ├── AbandonedCable_sem.pcd      # Global semantic point cloud
│   ├── seg_label_map.json          # Semantic segmentation label map
│   ├── Data_omni/                  # Omnidirectional robot data
│   │   ├── P0000/
│   │   │   ├── image_lcam_front/
│   │   │   ├── depth_lcam_front/
│   │   │   ├── seg_lcam_front/
│   │   │   ├── imu/
│   │   │   ├── lidar/
│   │   │   ├── pose_lcam_front.txt
│   │   │   ├── P0000_metadata.json
│   │   │   ├── image_lcam_left/
│   │   │   └── ...
│   │   └── P00XX/
│   ├── Data_diff/                  # Differential drive robot data
│   │   ├── P1000/
│   │   └── P10XX/
│   └── Data_anymal/                # Quadrupedal robot data
│       ├── P2000/
│       └── P20XX/
├── AbandonedFactory/
│   └── (same structure as above)
└── ...
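The layout above maps directly onto simple traversal code. The sketch below counts trajectory folders per robot platform; it assumes only the Data_omni / Data_diff / Data_anymal folder names and the P-prefixed trajectory directories shown in the tree, and the root path is a placeholder.

# Count trajectory folders per environment and robot platform,
# assuming only the folder naming shown in the layout above.
from pathlib import Path

root = Path('/path/to/tartanground/root')   # placeholder dataset root
robot_dirs = ['Data_omni', 'Data_diff', 'Data_anymal']

for env_dir in sorted(p for p in root.iterdir() if p.is_dir()):
    for robot in robot_dirs:
        robot_path = env_dir / robot
        trajs = sorted(robot_path.glob('P*')) if robot_path.exists() else []
        print(f"{env_dir.name:25s} {robot:12s} {len(trajs):3d} trajectories")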
Robot Platforms
| Robot | Type | Trajectory IDs | Description |
|---|---|---|---|
| `omni` | Omnidirectional | P0000–P00XX | Holonomic movement in all directions |
| `diff` | Differential Drive | P1000–P10XX | Differential wheeled robot |
| `anymal` | Quadrupedal | P2000–P20XX | Legged robot for complex terrains |
Camera Configuration
Each robot is equipped with a six-camera stereo setup providing full 360° coverage (similar to the TartanAir-v2 modalities).
| Parameter | Value |
|---|---|
| Camera Positions | Front, Right, Left, Back, Top, Bottom |
| Field of View | 90° (each camera) |
| Resolution | 640×640 pixels |
| Stereo Configuration | Left (`lcam_*`) and right (`rcam_*`) camera at each position |
Available Camera Names
# Left cameras
['lcam_front', 'lcam_right', 'lcam_left', 'lcam_back', 'lcam_top', 'lcam_bottom']
# Right cameras
['rcam_front', 'rcam_right', 'rcam_left', 'rcam_back', 'rcam_top', 'rcam_bottom']
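Per-camera sensor folders combine a modality prefix with a camera name, following the `<modality>_<camera>` pattern visible in the directory layout (e.g. image_lcam_front, depth_lcam_front, seg_lcam_front). The snippet below simply enumerates those folder names; the assumption that the pattern holds for every camera is ours, not stated explicitly above.

# Enumerate per-camera data folders, assuming the <modality>_<camera>
# naming pattern from the directory layout (e.g. image_lcam_front).
modalities = ['image', 'depth', 'seg']
cameras = ['lcam_front', 'lcam_right', 'lcam_left', 'lcam_back', 'lcam_top', 'lcam_bottom',
           'rcam_front', 'rcam_right', 'rcam_left', 'rcam_back', 'rcam_top', 'rcam_bottom']
folders = [f"{m}_{c}" for m in modalities for c in cameras]
print(len(folders), folders[:3])  # 36 ['image_lcam_front', 'image_lcam_right', 'image_lcam_left']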
Sensor Modalities
The dataset provides multi-modal sensor data to support various robotic perception tasks:
- RGB Images (`image`): Color images for visual perception
- Depth Maps (`depth`): Accurate depth information for 3D scene understanding
- Semantic Segmentation (`seg`): Pixel-wise semantic labels
- IMU Data (`imu`): Inertial measurement unit data
- LiDAR Point Clouds (`lidar`): 3D point cloud data (32-beam simulated LiDAR)
- Robot Poses (`meta`): Ground-truth 6-DOF poses and metadata, including robot height
- Global Point Clouds:
  - RGB Point Clouds (`rgb_pcd`): Colored 3D representations of entire environments
  - Semantic Point Clouds (`sem_pcd`): Point clouds with semantic labels
- Segmentation Labels (`seg_labels`): Label mappings for semantic segmentation tasks
- ROS Bags (`rosbag`): Proprioceptive data with joint states (available for the `anymal` robot only)
Available Modalities
# Complete list of available modalities
['image', 'meta', 'depth', 'seg', 'lidar', 'imu', 'rosbag', 'sem_pcd', 'seg_labels', 'rgb_pcd']
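As a quick illustration of working with the pose data, the sketch below reads a per-camera pose file. It assumes the TartanAir convention of one `x y z qx qy qz qw` line per frame (the same ordering used for the `pose` field of the occupancy maps described later on this page); the file path is a placeholder.

# Load ground-truth poses for one trajectory (assumed format: one
# "x y z qx qy qz qw" line per frame, as in TartanAir).
import numpy as np

pose_file = '/path/to/tartanground/root/AbandonedCable/Data_omni/P0000/pose_lcam_front.txt'
poses = np.loadtxt(pose_file)          # shape: (num_frames, 7)
positions = poses[:, :3]

# Approximate path length as the sum of distances between consecutive poses
path_length = np.linalg.norm(np.diff(positions, axis=0), axis=1).sum()
print(f"{poses.shape[0]} poses, ~{path_length:.1f} m traveled")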
Note
The `rosbag` modality is only available for the `anymal` (quadrupedal) robot version.
Download Dataset
The TartanGround dataset can be downloaded using the tartanairpy Python toolkit. The repository includes ready-to-use examples in `examples/download_ground_example.py`.
Download Examples
Example 1 – Download all modalities for specific environments and robots:
import tartanair as ta
ta.init('/path/to/tartanground/root')
env = ["AbandonedFactory", "ConstructionSite", "Hospital"]
ta.download_ground(
    env = env,
    version = ['omni', 'diff', 'anymal'],
    traj = [],
    modality = [
        'image', 'meta', 'depth', 'seg', 'lidar', 'imu',
        'rosbag', 'sem_pcd', 'seg_labels', 'rgb_pcd'
    ],
    camera_name = ['lcam_front', 'lcam_right', 'lcam_left', 'lcam_back'],
    unzip = False
)
Example 2 – Download one trajectory from each environment (Omnidirectional robot only):
import tartanair as ta
ta.init('/path/to/tartanground/root')
ta.download_ground(
    env = [],
    version = ['omni'],
    traj = ['P0000'],
    modality = [],
    camera_name = ['lcam_front'],
    unzip = False
)
Example 3 – Download semantic occupancy data only:
import tartanair as ta
ta.init('/path/to/tartanground/root')
ta.download_ground(
    env = [],
    version = [],
    traj = [],
    modality = ['seg_labels', 'sem_pcd'],
    camera_name = [],
    unzip = False
)
Example 4 – Download entire dataset (~15 TB):
import tartanair as ta
ta.init('/path/to/tartanground/root')
ta.download_ground(
    env = [],
    version = [],
    traj = [],
    modality = [],
    camera_name = [],
    unzip = False
)
Multi-threaded Download
For faster downloads, use the multi-threaded version:
import tartanair as ta
ta.init('/path/to/tartanground/root')
env = ["AbandonedFactory", "ConstructionSite", "Hospital"]
ta.download_ground_multi_thread(
    env = env,
    version = ['omni', 'diff', 'anymal'],
    traj = [],
    modality = [
        'image', 'meta', 'depth', 'seg', 'lidar', 'imu',
        'rosbag', 'sem_pcd', 'seg_labels', 'rgb_pcd'
    ],
    camera_name = ['lcam_front', 'lcam_right', 'lcam_left', 'lcam_back'],
    unzip = False,
    num_workers = 8
)
Download Process Notes
Important Download Information
- Pre-download Analysis: The script lists all files and calculates total space requirements before downloading
- File Format: Data is downloaded as zip files
- Post-processing: Use `examples/unzip_ground_files.py` to extract the downloaded data (a generic alternative is sketched after this list)
- Multi-threading: Adjust the `num_workers` parameter based on your system capabilities and network bandwidth
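As a generic alternative to the provided unzip script, the sketch below extracts every downloaded archive with Python's standard library; it assumes the zip files sit under your dataset root and that their internal paths already match the dataset layout.

# Extract all downloaded .zip archives in place (generic alternative to
# examples/unzip_ground_files.py; the extraction location is an assumption).
import zipfile
from pathlib import Path

root = Path('/path/to/tartanground/root')   # placeholder dataset root
for zip_path in sorted(root.rglob('*.zip')):
    print(f"Extracting {zip_path} ...")
    with zipfile.ZipFile(zip_path) as zf:
        zf.extractall(zip_path.parent)      # extract next to the archive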
See also
Full Examples: Comprehensive download scripts are available in `examples/download_ground_example.py` in the GitHub repository.
Dataset Statistics
| Metric | Value |
|---|---|
| Total Environments | 63 diverse scenarios |
| Total Trajectories | 878 trajectories |
| Robot Platforms | 3 types (Omnidirectional, Differential, Quadrupedal) |
| Trajectory Distribution | 440 omni, 198 diff, 240 legged |
| Total Samples | 1.44 million samples |
| RGB Images | 17.3 million images |
| Dataset Size | ~16 TB |
| Samples per Trajectory | 600–8,000 samples |
| Camera Resolution | 640×640 pixels |
| Data Collection Frequency | 10 Hz |
Environment Categories
| Category | Description |
|---|---|
| Indoor | Indoor spaces with complex layouts and lighting |
| Nature | Natural environments with varied vegetation, natural terrain, and seasons |
| Rural | Countryside settings with varied topography |
| Urban | City environments with structured layouts |
| Industrial/Infrastructure | Construction sites and industrial facilities |
| Historical/Thematic | Heritage sites and specialized environments |
Semantic Occupancy Maps
The TartanGround dataset supports semantic occupancy prediction research by providing tools to generate local 3D occupancy maps with semantic class labels. These maps are essential for training and evaluating neural networks that predict semantic occupancy from sensor observations.
Use Cases
- Semantic Occupancy Prediction: Train networks to predict 3D semantic structure from 2D observations
- 3D Scene Understanding: Evaluate spatial reasoning capabilities of perception models
- Navigation Planning: Generate semantically-aware path planning datasets
- Multi-modal Fusion: Combine RGB, depth, and LiDAR data for 3D semantic mapping
Workflow Overview
The semantic occupancy map generation follows a two-step process:
Step 1: Generation - Extract local occupancy maps around each robot pose
Step 2: Visualization - Inspect and validate the generated occupancy maps
Step 1: Generate Semantic Occupancy Maps
Use the GPU-accelerated script to extract local semantic occupancy maps from global point clouds:
# Generate occupancy maps for a specific trajectory
python examples/subsample_semantic_pcd_gpu.py \
--root_dir /path/to/tartanground/dataset \
--env ConstructionSite \
--traj P0000
Key Parameters:
# Customize occupancy map properties:
#   --resolution       voxel size in meters
#   --x_bounds         local X bounds [min, max]
#   --y_bounds         local Y bounds [min, max]
#   --z_bounds         local Z bounds [min, max]
#   --subsample_poses  process every Nth pose
python examples/subsample_semantic_pcd_gpu.py \
    --root_dir /path/to/dataset \
    --env ConstructionSite \
    --traj P0000 \
    --resolution 0.1 \
    --x_bounds -20 20 \
    --y_bounds -20 20 \
    --z_bounds -3 5 \
    --subsample_poses 10
Output:
The script generates `.npz` files in `{trajectory}/sem_occ/` containing:
- occupancy_map: 3D voxel grid with semantic class IDs
- pose: Robot pose [x, y, z, qx, qy, qz, qw]
- bounds: Local coordinate bounds
- resolution: Voxel resolution in meters
- class_mapping: Semantic class ID to RGB color mapping
GPU Requirements
This script requires:
- NVIDIA GPU with CUDA support
- CuPy library for GPU acceleration
- Sufficient GPU memory (8 GB+ recommended for large environments)
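Before launching the generation script, a small CuPy check like the sketch below can confirm that a CUDA device is visible and report its memory; the 8 GB threshold mirrors the recommendation above and is otherwise an assumption.

# Verify that CuPy sees a CUDA device and report GPU memory.
import cupy as cp

num_devices = cp.cuda.runtime.getDeviceCount()
print(f"CUDA devices visible to CuPy: {num_devices}")

free_bytes, total_bytes = cp.cuda.Device(0).mem_info
print(f"GPU 0 memory: {free_bytes / 1e9:.1f} GB free / {total_bytes / 1e9:.1f} GB total")

if total_bytes < 8e9:
    print("Warning: less than 8 GB of GPU memory; large environments may not fit")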
Step 2: Visualize Occupancy Maps
Use the interactive viewer to inspect generated occupancy maps and verify their quality:
# Visualize occupancy maps with navigation controls
python examples/visualize_semantic_occ_local.py \
--root_dir /path/to/tartanground/dataset \
--env ConstructionSite \
--traj P0000 \
--skip_samples 100
Customization Options:
# Customize visualization appearance:
#   --skip_samples  show every Nth occupancy map
#   --point_size    point size for rendering
#   --background    background color
python examples/visualize_semantic_occ_local.py \
    --root_dir /path/to/dataset \
    --env ConstructionSite \
    --traj P0000 \
    --skip_samples 50 \
    --point_size 12.0 \
    --background white
Dataset Integration
File Structure:
After running the generation script, your dataset structure will include:
TartanGround_Root/
├── ConstructionSite/
│   ├── ConstructionSite_sem.pcd        # Global semantic point cloud (input)
│   ├── Data_omni/
│   │   ├── P0000/
│   │   │   ├── pose_lcam_front.txt     # Robot poses (input)
│   │   │   ├── sem_occ/                # Generated occupancy maps
│   │   │   │   ├── semantic_occupancy_000000.npz
│   │   │   │   ├── semantic_occupancy_000001.npz
│   │   │   │   └── ...
│   │   │   └── (other sensor data)
│   │   └── P0001/
│   └── seg_label_map.json              # Semantic class names
Loading Occupancy Maps in Python:
import numpy as np
# Load a single occupancy map
data = np.load('semantic_occupancy_000000.npz', allow_pickle=True)
occupancy_map = data['occupancy_map'] # 3D array with class IDs
pose = data['pose'] # Robot pose [x,y,z,qx,qy,qz,qw]
bounds = data['bounds'] # Local bounds [x_min,x_max,y_min,y_max,z_min,z_max]
resolution = data['resolution'] # Voxel size in meters
class_mapping = data['class_mapping'] # Class ID to RGB mapping
print(f"Occupancy map shape: {occupancy_map.shape}")
print(f"Resolution: {resolution}m per voxel")
print(f"Occupied voxels: {np.sum(occupancy_map > 0)}")
See also
Complete Examples: Full parameter documentation and advanced usage examples are available in the GitHub repository examples folder.
Citation
If you use the TartanGround dataset in your research, please cite:
@article{patel2025tartanground,
title={TartanGround: A Large-Scale Dataset for Ground Robot Perception and Navigation},
author={Patel, Manthan and Yang, Fan and Qiu, Yuheng and Cadena, Cesar and Scherer, Sebastian and Hutter, Marco and Wang, Wenshan},
journal={arXiv preprint arXiv:2505.10696},
year={2025}
}
Support and Contact
For technical issues, questions, or bug reports, please open an issue on the GitHub repository.
For applications and interesting dataset uses, visit the dataset webpage.