The Data Layer for Physical AI

The data primitives to build, understand, and improve your data loop. Designed for multi-rate, multimodal data, from the first recording to massive scale.

10K+ GitHub stars

3 supported languages

End-to-end: collection to training

End-to-end learning needs end-to-end data

End-to-end models learn from multi-rate, multimodal sequences. So every stage of the loop works on that same data: collect, refine, train, deploy. Rerun is one layer underneath all of it.
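To make "multi-rate" concrete: a camera at roughly 30 Hz and an IMU at 100 Hz rarely share exact timestamps, so joining them usually means taking, for each query time, the most recent sample from each stream. The sketch below shows that "latest-at" lookup using only the Python standard library. The stream names, rates, and the `latest_at` helper are hypothetical, and this is not the Rerun API.

```python
from bisect import bisect_right


def latest_at(timestamps, values, t):
    """Return the most recent value with timestamp <= t, or None if there is none.

    `timestamps` must be sorted in ascending order.
    """
    i = bisect_right(timestamps, t)
    return values[i - 1] if i > 0 else None


# Hypothetical streams on a shared timeline, in milliseconds.
camera_ts = [0, 33, 66, 100]                      # ~30 Hz
camera_frames = ["f0", "f1", "f2", "f3"]
imu_ts = [0, 10, 20, 30, 40, 50, 60, 70]          # 100 Hz
imu_samples = [f"imu{k}" for k in range(len(imu_ts))]


def aligned_row(t):
    """Build one row that joins both streams at query time t."""
    return {
        "t_ms": t,
        "camera": latest_at(camera_ts, camera_frames, t),
        "imu": latest_at(imu_ts, imu_samples, t),
    }


rows = [aligned_row(t) for t in (5, 35, 70)]
for row in rows:
    print(row)
```

Each row pairs the newest camera frame with the newest IMU sample at that time, which is one common way to present streams of different rates as a single table.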

Experiment loop

Teams win by iterating fast on data composition and modeling while scaling data and compute.

Collect

Teleop, fleet logs, human-data rigs, sim, and web video, all in one format.

Refine

Add columns and run CV transforms. Curate with dataframes or SQL.
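As an illustration of what "add columns" and "curate with SQL" can look like, here is a small sketch using Python's built-in `sqlite3` module. The `episodes` table, its columns, and the 5-second threshold are made up for the example, and nothing here is Rerun-specific.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE episodes ("
    "episode_id TEXT, source TEXT, duration_s REAL, success INTEGER)"
)
conn.executemany(
    "INSERT INTO episodes VALUES (?, ?, ?, ?)",
    [
        ("ep-001", "teleop", 42.0, 1),
        ("ep-002", "teleop", 3.5, 0),
        ("ep-003", "sim", 60.0, 1),
        ("ep-004", "fleet", 18.2, 1),
    ],
)

# "Add a column": derive a per-episode flag from an existing column.
conn.execute("ALTER TABLE episodes ADD COLUMN too_short INTEGER")
conn.execute("UPDATE episodes SET too_short = duration_s < 5.0")

# "Curate": keep successful episodes that are long enough to be useful.
curated = [
    row[0]
    for row in conn.execute(
        "SELECT episode_id FROM episodes "
        "WHERE success = 1 AND too_short = 0 "
        "ORDER BY episode_id"
    )
]
print(curated)
```

The derived column stays next to the raw metadata, so the same filter can be re-run or tightened later without recomputing it.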

Train

Stream dataset mixes straight to GPUs. No export jobs, no idle waiting.
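A "dataset mix" can be thought of as weighted sampling across several sample streams, batched on the fly instead of being exported to a new dataset first. The following is a minimal sketch of that idea in plain Python. The source names, weights, and the `mix_streams` and `batches` helpers are hypothetical and do not represent the Rerun dataloader.

```python
import itertools
import random


def mix_streams(sources, weights, seed=0):
    """Yield (source_name, sample) pairs, choosing a source per draw by weight."""
    rng = random.Random(seed)
    names = list(sources)
    iterators = {name: iter(sources[name]) for name in names}
    probs = [weights[name] for name in names]
    while True:
        name = rng.choices(names, weights=probs, k=1)[0]
        try:
            sample = next(iterators[name])
        except StopIteration:
            # Stop once any source runs dry; a real loader might restart it instead.
            return
        yield name, sample


def batches(stream, batch_size):
    """Group a stream into lists of at most batch_size items."""
    it = iter(stream)
    while True:
        batch = list(itertools.islice(it, batch_size))
        if not batch:
            return
        yield batch


# Stand-ins for endless sample streams; real samples would be tensors or frames.
sources = {
    "teleop": itertools.count(0),
    "sim": itertools.count(1000),
}
weights = {"teleop": 0.7, "sim": 0.3}

first_batches = list(itertools.islice(batches(mix_streams(sources, weights), 4), 3))
for batch in first_batches:
    print(batch)
```

Changing the mix is then a matter of editing `weights`, with no intermediate export step.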

Deploy

Evaluate rollouts and trace failures back to the data that caused them.

Rerun SDK

Open-source SDK that makes it easy to build the applications and compute you need to create physical intelligence.

df.select(...).where(...)

for batch in dataloader:

Rerun Hub

Data catalog and backend for large-scale storage, access, and streaming from object storage. Makes the Rerun SDK scale.

Two parts. One data layer.

The Rerun SDK is the open-source toolchain physical AI teams build on. Rerun Hub is the production backend that connects it to your data at scale.

Open source

Rerun SDK

Open-source SDK for logging, storing, querying, visualizing, and training on multi-rate, multimodal data. Includes the viewer, query library, visualization framework, CLI, and the file format the data layer is built on.

pip install rerun-sdk
rerun

Interactive viewer + visualization framework
Dataframe query library built in
Multi-rate, multimodal, spatial native
Python, Rust, and C++ APIs
Open source. Start in two minutes.

Commercial

Rerun Hub

The production backend for the Rerun data layer. Catalog, byte-range indexing, and retrieval that turn your object stores into a queryable, streamable foundation. Run transforms on the edge or close to the data.

Query across multiple object stores
Byte-range indexing for fast lookups
Stream dataset mixes directly to training
Catalog and metadata management
Deployed to your preferred cloud region
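For readers unfamiliar with byte-range indexing, the general technique is to record where each item lives inside a large blob as an (offset, length) pair, so a lookup reads only those bytes instead of the whole object. The sketch below demonstrates this with an in-memory blob. The record layout and helper names are hypothetical, and it is not a description of how Hub implements its index.

```python
import io
import json


def write_records(records):
    """Pack records into one blob and return (blob, index).

    The index maps each record key to its (offset, length) within the blob.
    """
    buf = io.BytesIO()
    index = {}
    for key, payload in records.items():
        data = json.dumps(payload).encode("utf-8")
        index[key] = (buf.tell(), len(data))
        buf.write(data)
    return buf.getvalue(), index


def read_record(blob_file, index, key):
    """Read a single record by seeking to its byte range."""
    offset, length = index[key]
    blob_file.seek(offset)
    return json.loads(blob_file.read(length))


records = {
    "ep-001": {"frames": 1200},
    "ep-002": {"frames": 87},
    "ep-003": {"frames": 4310},
}
blob, index = write_records(records)

# Only the bytes belonging to "ep-002" are read from the blob.
fetched = read_record(io.BytesIO(blob), index, "ep-002")
print(index["ep-002"], fetched)
```

Against object storage, the same (offset, length) pair would typically be sent as an HTTP `Range` request rather than a local seek.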

How they fit together

You log, query, visualize, transform, and train with the same open-source SDK and the same APIs either way. Hub doesn't change what you do. It changes what runs underneath.

Rerun SDK (open source) vs. Rerun Hub (commercial)

Your data
Rerun SDK: Local .rrd files on one machine
Rerun Hub: Your object storage, streamed at petabyte scale

The catalog
Rerun SDK: In-process, for local work
Rerun Hub: Persistent and managed, indexed across your object stores

Access
Rerun SDK: Just you
Rerun Hub: Your team, with SSO and authenticated share links

Running it
Rerun SDK: pip install, self-hosted
Rerun Hub: Run for you, single-tenant, in your cloud region of choice

Iterate faster on robot learning data

Build, understand, and improve your data loop on one set of primitives, from your first recording to massive scale. Start with the SDK, or talk to us about Hub.
