A Rerun catalog can feed training pipelines two ways: export recordings to a standard format, or stream them directly into a PyTorch DataLoader.
Export to a training format
The catalog exposes recordings as queryable DataFrames via DataFusion. You can time-align multi-rate sensor streams, extract the columns of interest, and write the result in whatever format your training pipeline expects.
See Export recordings to LeRobot datasets for a worked example.
Train directly from the catalog
The experimental rerun.experimental.dataloader module wraps a catalog as iterable or map-style PyTorch datasets, with no intermediate export step.
Sample space
Three things describe a dataset (see reference):
- DataSource: a catalog DatasetEntry with an optional segment filter; each registered RRD is one segment, typically one episode or trajectory
- index: the timeline that defines what "one sample" means (e.g. "frame_index" or "real_time")
- fields: a dict of Fields, each mapping a source column (an entity:Archetype:component triple) to a decoder
SampleIndex pre-computes the full sample space from lightweight per-segment index-range metadata: one query per segment, not a scan of the data.
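To make the idea of a pre-computed sample space concrete, here is a minimal sketch in plain Python. It is not the SampleIndex implementation: the metadata layout, the segment names, and the `locate` helper are all assumptions made for illustration. It only shows how per-segment index ranges are enough to map a global sample number to a segment and an index value without reading any row data.

```python
from bisect import bisect_right
from itertools import accumulate

# Hypothetical per-segment metadata: (segment_id, first_index, last_index),
# with both bounds inclusive. One such tuple per segment is all that is needed.
segment_ranges = [
    ("episode_a", 0, 99),
    ("episode_b", 10, 59),
    ("episode_c", 0, 249),
]

# Number of samples in each segment, and the running total after each one.
lengths = [last - first + 1 for _, first, last in segment_ranges]
cumulative = list(accumulate(lengths))  # [100, 150, 400]


def locate(global_index: int) -> tuple[str, int]:
    """Map a global sample index to (segment_id, index value in that segment)."""
    if not 0 <= global_index < cumulative[-1]:
        raise IndexError(global_index)
    # First segment whose running total exceeds the global index.
    seg = bisect_right(cumulative, global_index)
    offset = global_index - (cumulative[seg - 1] if seg else 0)
    segment_id, first, _ = segment_ranges[seg]
    return segment_id, first + offset


print(locate(0))    # ('episode_a', 0)
print(locate(100))  # ('episode_b', 10)
print(locate(399))  # ('episode_c', 249)
```

The total number of samples is `cumulative[-1]`, which is what a map-style dataset would report as its length under this scheme.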
For timestamp timelines, FixedRateSampling defines the sampling grid and the server handles drift between grid and real row positions via fill_latest_at.
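The following plain-Python sketch illustrates what sampling on a fixed-rate grid with a latest-at fill means. It does not call FixedRateSampling or the server: `fixed_rate_grid` and `latest_at` are hypothetical helpers, and the sketch assumes that latest-at means "the most recent row at or before the grid point".

```python
from bisect import bisect_right


def fixed_rate_grid(start_ns: int, end_ns: int, rate_hz: float) -> list[int]:
    """Evenly spaced timestamps from start_ns to end_ns (inclusive), in ns."""
    step_ns = round(1e9 / rate_hz)
    return list(range(start_ns, end_ns + 1, step_ns))


def latest_at(row_times: list[int], values: list, t: int):
    """Value of the most recent row at or before t, or None if there is none."""
    i = bisect_right(row_times, t)
    return values[i - 1] if i else None


# Rows logged at roughly 10 Hz with some jitter (timestamps in nanoseconds).
row_times = [0, 101_000_000, 199_000_000, 302_000_000]
values = ["a", "b", "c", "d"]

grid = fixed_rate_grid(0, 300_000_000, rate_hz=10.0)
samples = [latest_at(row_times, values, t) for t in grid]

print(grid)     # [0, 100000000, 200000000, 300000000]
print(samples)  # ['a', 'a', 'c', 'c']
```

Note that row "b", logged 1 ms after the second grid point, is never sampled in this toy data. That is the kind of drift between grid and row positions that a latest-at fill resolves deterministically.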
Decoders
Each Field has a ColumnDecoder (_decoders.py) that converts a raw Arrow column to a torch.Tensor:
- NumericDecoder: scalars and numeric lists
- ImageDecoder: JPEG/PNG blobs
- VideoFrameDecoder: compressed video (h264/h265/av1)
Windows
Field(window=(start, end)) returns a slice of values across an inclusive range relative to the current sample rather than a single value.
This is how action chunks and observation history are expressed.
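As a rough illustration of the inclusive, relative window semantics described above, the sketch below expands a (start, end) pair into the index values it covers. `window_indices` is a hypothetical helper, not part of the module, and it deliberately ignores what happens when a window runs past the start or end of a segment.

```python
def window_indices(current: int, window: tuple[int, int]) -> list[int]:
    """Index values covered by an inclusive (start, end) window around current."""
    start, end = window
    return list(range(current + start, current + end + 1))


# Observation history: the two previous frames plus the current one.
history = window_indices(10, (-2, 0))
print(history)  # [8, 9, 10]

# Action chunk: the current action plus the next three.
chunk = window_indices(10, (0, 3))
print(chunk)  # [10, 11, 12, 13]
```

A window of (0, 0) under this reading covers exactly one index value, the current sample.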
Dataset styles
- RerunIterableDataset: streaming with automatic shuffling and cross-worker and DDP partitioning
- RerunMapDataset: random access by global index; works with PyTorch samplers like DistributedSampler and WeightedRandomSampler
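For readers unfamiliar with what cross-worker and DDP partitioning has to achieve, here is one common scheme sketched in plain Python. It is an assumption for illustration only and may differ from what RerunIterableDataset does internally: every process shuffles the same list with the same seed, then each (rank, worker) pair keeps a disjoint strided slice. The `partition` function and its parameters are hypothetical.

```python
import random


def partition(segments, rank, world_size, worker_id, num_workers, seed, epoch):
    """Return the segments owned by one DataLoader worker on one DDP rank."""
    order = list(segments)
    # Same seed and epoch on every process gives the same shuffled order,
    # so the strided slices below never overlap.
    random.Random(seed + epoch).shuffle(order)
    shard = rank * num_workers + worker_id
    num_shards = world_size * num_workers
    return order[shard::num_shards]


segments = [f"segment_{i}" for i in range(10)]

# Two DDP ranks with two DataLoader workers each: four disjoint shards.
shards = [
    partition(segments, rank, 2, worker_id, 2, seed=0, epoch=0)
    for rank in range(2)
    for worker_id in range(2)
]

# Every segment appears exactly once across all shards.
covered = sorted(s for shard in shards for s in shard)
print(covered == sorted(segments))  # True
```

Changing `epoch` reshuffles the order while keeping the shards disjoint, which is why the epoch is folded into the seed in this sketch.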
See Train PyTorch models with Rerun for usage.