LeRobot Dataset Format¶
LeRobot is a standardized format for robotics datasets that enables easy sharing, reproduction, and benchmarking across different platforms.
Overview¶
LeRobot provides:
- Unified data structure for multi-modal robotics data
- Efficient storage using Parquet format
- Easy loading with Python API
- Hugging Face integration for dataset sharing
- Standardized metadata for reproducibility
Dataset Structure¶
dataset_name/
├── meta/
│ ├── info.json # Dataset metadata
│ ├── tasks.json # Task descriptions
│ └── stats.json # Statistics
├── episodes/
│ ├── episode_000000.parquet
│ ├── episode_000001.parquet
│ └── ...
└── videos/
├── observation.image/
│ ├── episode_000000.mp4
│ └── ...
└── observation.wrist_image/
└── ...
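The layout above can be scaffolded with a few lines of standard-library Python. This is a plain filesystem sketch following the tree shown, not a LeRobot API call:

```python
from pathlib import Path

def scaffold_dataset(root: str) -> Path:
    """Create the meta/, episodes/, and videos/ skeleton shown above."""
    base = Path(root)
    for sub in ("meta", "episodes",
                "videos/observation.image",
                "videos/observation.wrist_image"):
        (base / sub).mkdir(parents=True, exist_ok=True)
    return base

base = scaffold_dataset("data/my_dataset")
print(sorted(p.name for p in base.iterdir()))  # ['episodes', 'meta', 'videos']
```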
Quick Start¶
Installation¶
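LeRobot is distributed on PyPI; a typical install (exact extras may vary by version) is:

```shell
pip install lerobot
```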
Loading a Dataset¶
from lerobot.common.datasets.lerobot_dataset import LeRobotDataset

# Load from the Hugging Face Hub (cached under root)
dataset = LeRobotDataset(
    repo_id="lerobot/pusht",
    root="data/",
)

# Access a single frame (one timestep, not a whole episode)
frame = dataset[0]
print(frame.keys())
# dict_keys(['observation.image', 'observation.state', 'action', 'episode_index', 'frame_index', 'timestamp'])

# Get specific observations
image = frame['observation.image']   # torch.Tensor
state = frame['observation.state']   # torch.Tensor
action = frame['action']             # torch.Tensor
Creating a Dataset¶
from lerobot.common.datasets.lerobot_dataset import LeRobotDataset
import time
import torch

# Initialize an empty dataset
dataset = LeRobotDataset.create(
    repo_id="username/my_dataset",
    root="data/my_dataset",
    robot_type="franka",
    fps=30,
)

# Collect one episode (camera and robot are placeholders for your hardware interfaces)
num_steps = 100  # episode length
episode_data = {
    'observation.image': [],  # list of images
    'observation.state': [],  # list of robot states
    'action': [],             # list of actions
    'timestamp': [],          # list of timestamps
}

for step in range(num_steps):
    episode_data['observation.image'].append(camera.get_image())
    episode_data['observation.state'].append(robot.get_state())
    episode_data['action'].append(robot.get_action())
    episode_data['timestamp'].append(time.time())

dataset.add_episode(episode_data)

# Save to disk
dataset.save()

# Push to the Hugging Face Hub
dataset.push_to_hub()
Data Format Specification¶
Episode Structure¶
Each episode is stored as a Parquet file with columns:
| Column | Type | Description |
|---|---|---|
| `episode_index` | int | Episode number |
| `frame_index` | int | Frame number within episode |
| `timestamp` | float | Timestamp in seconds |
| `observation.*` | varies | Observations (images, states, etc.) |
| `action` | float[] | Action taken |
| `next.done` | bool | Episode termination flag |
| `next.success` | bool | Task success flag (optional) |
| `next.reward` | float | Reward (optional, for RL) |
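As a concrete illustration of this schema, a single row of an episode file might look like the following. This is a hand-written sketch following the column table above, not output produced by the library; the field values are hypothetical:

```python
FPS = 30  # matches the fps declared in meta/info.json

# One row (one frame) of an episode Parquet file
frame = {
    "episode_index": 0,
    "frame_index": 12,
    "timestamp": 12 / FPS,           # seconds since episode start
    "observation.state": [0.0] * 7,  # 7-DoF joint positions
    "action": [0.0] * 7,             # action vector
    "next.done": False,              # episode has not terminated
}

# timestamp is derived from frame_index and fps
assert abs(frame["timestamp"] - frame["frame_index"] / FPS) < 1e-9
```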
Observation Types¶
# Image observations (stored as video paths)
'observation.image': str # Path to frame in video
'observation.wrist_image': str
# State observations (stored as arrays)
'observation.state': float[] # Robot joint positions
'observation.velocity': float[] # Joint velocities
# Task information
'observation.goal': float[] # Goal state/position
'instruction': str # Natural language instruction
Action Format¶
import numpy as np

# Continuous actions
action = np.array([x, y, z, roll, pitch, yaw, gripper])

# Normalized to [-1, 1]
action = (action - action_min) / (action_max - action_min) * 2 - 1
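A quick sanity check of the normalization formula above, using plain NumPy and hypothetical joint limits:

```python
import numpy as np

action_min = np.array([-1.0, -1.0, 0.0])   # hypothetical lower limits
action_max = np.array([ 1.0,  1.0, 0.5])   # hypothetical upper limits
action     = np.array([ 0.0,  1.0, 0.25])  # raw action to normalize

# Map [min, max] -> [-1, 1], as in the formula above
norm = (action - action_min) / (action_max - action_min) * 2 - 1
print(norm)  # [0. 1. 0.]

# The inverse mapping recovers the original action
recovered = (norm + 1) / 2 * (action_max - action_min) + action_min
assert np.allclose(recovered, action)
```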
Metadata¶
info.json¶
{
  "fps": 30,
  "robot_type": "franka",
  "total_episodes": 1000,
  "total_frames": 50000,
  "video_codec": "h264",
  "shapes": {
    "observation.image": [3, 224, 224],
    "observation.state": [7],
    "action": [7]
  },
  "names": {
    "observation.state": ["q0", "q1", "q2", "q3", "q4", "q5", "q6"],
    "action": ["x", "y", "z", "roll", "pitch", "yaw", "gripper"]
  }
}
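The `shapes` and `names` entries should agree with each other. A small stdlib-only consistency check over the example `info.json` above can catch mismatches early:

```python
import json

info = json.loads("""
{
  "fps": 30,
  "shapes": {"observation.state": [7], "action": [7]},
  "names": {
    "observation.state": ["q0", "q1", "q2", "q3", "q4", "q5", "q6"],
    "action": ["x", "y", "z", "roll", "pitch", "yaw", "gripper"]
  }
}
""")

# Every named vector must have as many names as its declared shape
for key, names in info["names"].items():
    assert info["shapes"][key] == [len(names)], f"shape/name mismatch for {key}"
print("info.json shapes and names are consistent")
```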
tasks.json¶
{
  "tasks": [
    {
      "task_index": 0,
      "task_name": "pick_red_cube",
      "episodes": [0, 1, 2, 10, 11, 12]
    },
    {
      "task_index": 1,
      "task_name": "pick_blue_cube",
      "episodes": [3, 4, 5, 13, 14, 15]
    }
  ]
}
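`tasks.json` maps each task to its episodes; inverting it gives an episode-to-task lookup, which is handy when filtering a dataset by task. A stdlib-only sketch using the example file above:

```python
import json

tasks = json.loads("""
{"tasks": [
  {"task_index": 0, "task_name": "pick_red_cube",  "episodes": [0, 1, 2, 10, 11, 12]},
  {"task_index": 1, "task_name": "pick_blue_cube", "episodes": [3, 4, 5, 13, 14, 15]}
]}
""")

# Invert the mapping: episode index -> task name
episode_to_task = {
    ep: task["task_name"]
    for task in tasks["tasks"]
    for ep in task["episodes"]
}
print(episode_to_task[4])   # pick_blue_cube
print(episode_to_task[11])  # pick_red_cube
```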
Using with PyTorch¶
DataLoader Integration¶
from torch.utils.data import DataLoader
from lerobot.common.datasets.lerobot_dataset import LeRobotDataset

dataset = LeRobotDataset("lerobot/pusht")

# Create DataLoader
dataloader = DataLoader(
    dataset,
    batch_size=32,
    shuffle=True,
    num_workers=4,
    collate_fn=dataset.collate_fn,
)

# Training loop
for batch in dataloader:
    images = batch['observation.image']
    states = batch['observation.state']
    actions = batch['action']

    # Train model
    predicted_actions = model(images, states)
    loss = criterion(predicted_actions, actions)
    loss.backward()
Custom Transforms¶
from torchvision import transforms

transform = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.RandomCrop((200, 200)),
    transforms.ColorJitter(brightness=0.2),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

dataset = LeRobotDataset(
    "lerobot/pusht",
    image_transforms=transform,
)
Best Practices¶
Data Organization¶
- One task per dataset: Keep related episodes together
- Consistent structure: Same observations across all episodes
- Metadata completeness: Fill in all relevant metadata
- Video compression: Use H.264 for efficient storage
Performance Tips¶
# Use a video backend for fast image loading
dataset = LeRobotDataset(
    "lerobot/pusht",
    video_backend="pyav",  # faster than the default
)

# Pre-load episodes for faster access
dataset.preload_episodes(range(100))

# Use memory mapping for large datasets
dataset = LeRobotDataset(
    "lerobot/pusht",
    use_mmap=True,
)
Examples¶
Example Datasets¶
Browse available datasets:
from lerobot.common.datasets.lerobot_dataset import available_datasets

# List all available datasets
datasets = available_datasets()
for name in datasets:
    print(name)
# Output:
# lerobot/pusht
# lerobot/aloha_static
# lerobot/xarm_lift
# ...
Popular datasets:
- `lerobot/pusht`: 2D pushing task
- `lerobot/aloha_static`: Bimanual manipulation
- `lerobot/xarm_lift`: Object lifting
- `lerobot/koch_pick`: Pick and place
Next Steps¶
- Format Specification - Detailed format spec
- Usage Guide - Advanced usage patterns
- Examples - Complete examples
- VLA Training - Use LeRobot data for VLA
- IL Training - Use LeRobot data for IL