
Rerun Hub

Infrastructure that powers your data loop

The production backend for the Rerun data layer. Hub provides the catalog, byte-range indexing, and retrieval that turn your object stores into a queryable, streamable foundation. Run transforms on the edge or close to the data.

Petabyte-scale

Across your object storage, any bucket or region

Direct streaming

Byte-range reads, straight from your storage to compute

Built for physical data

Multi-rate, multimodal data, kept in its native shape

Hub sits under your whole loop

Build your loop your way with the open-source SDK. Hub handles the hard parts: consistency and scale for physical data.

Experiment loop

Teams win by iterating fast on data composition and modeling while scaling data and compute.

Collect
Refine
Train
Deploy

Enabled by Rerun SDK

The same open-source SDK drives every stage, on your own compute and tools. It connects to Hub over an open protocol.

Visualization
SQL / dataframe queries
Post-processing
GPU training

Rerun Hub

The catalog, schema management, byte-range indexing, and streaming that keep physical data consistent and queryable at scale.

[Diagram: Rerun Hub over object storage holding RRD files]
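
To make the idea of byte-range indexing concrete, here is a minimal sketch in plain Python. It is not Hub's implementation or API: the `ByteRange` class, the `index` layout, and the in-memory `stored_object` are all hypothetical stand-ins. It only illustrates the general technique of recording where each piece of data lives inside a stored object, so a reader can fetch those bytes alone, for example with a standard HTTP `Range` header.

```python
import io
from dataclasses import dataclass


@dataclass(frozen=True)
class ByteRange:
    """Location of one chunk inside a larger stored object (hypothetical layout)."""
    offset: int
    length: int

    def http_range_header(self) -> str:
        # HTTP Range end positions are inclusive, hence the -1.
        end = self.offset + self.length - 1
        return f"bytes={self.offset}-{end}"


def read_range(blob: io.BufferedIOBase, rng: ByteRange) -> bytes:
    """Read only the requested bytes instead of the whole object."""
    blob.seek(rng.offset)
    return blob.read(rng.length)


# Stand-in for one recording file in a bucket: three chunks stored back to back.
stored_object = io.BytesIO(b"camera--" + b"imu-" + b"joints")

# Hypothetical index: column name -> where its bytes live in the object.
index = {
    "camera": ByteRange(offset=0, length=8),
    "imu": ByteRange(offset=8, length=4),
    "joints": ByteRange(offset=12, length=6),
}

# Fetch just the IMU chunk, and build the header a remote read would send.
imu_bytes = read_range(stored_object, index["imu"])
imu_header = index["imu"].http_range_header()
```

Against real object storage, the same offset and length would go into a ranged GET request rather than a local seek, so compute pulls only the bytes a query needs.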

Everything Hub does with your data

Four capabilities on one catalog over your object storage, already handling petabytes of robot data today.

Query

Query into your recordings with SQL

Run any SQL or dataframe query across your catalog, down into the columns, time ranges, and values inside your recordings, not just their metadata.
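
As a rough illustration of querying values inside recordings rather than only their metadata, the sketch below uses Python's built-in `sqlite3` as a stand-in engine. The table names, column names, and sample rows are invented for the example and do not reflect Hub's actual schema, SQL dialect, or SDK.

```python
import sqlite3

# In-memory database standing in for a catalog (hypothetical schema).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE recordings (id TEXT PRIMARY KEY, robot TEXT);
    CREATE TABLE samples (recording_id TEXT, t_sec REAL, gripper_force REAL);
""")
conn.executemany(
    "INSERT INTO recordings VALUES (?, ?)",
    [("rec_a", "arm_1"), ("rec_b", "arm_2")],
)
conn.executemany(
    "INSERT INTO samples VALUES (?, ?, ?)",
    [
        ("rec_a", 0.5, 2.0),
        ("rec_a", 1.5, 9.5),
        ("rec_b", 0.5, 3.0),
        ("rec_b", 1.5, 4.0),
    ],
)

# Find recordings whose gripper force exceeded 5.0 between t=1s and t=2s.
# The filter reaches into a column, a time range, and the values themselves.
rows = conn.execute("""
    SELECT r.id, r.robot, MAX(s.gripper_force) AS peak_force
    FROM recordings AS r
    JOIN samples AS s ON s.recording_id = r.id
    WHERE s.t_sec BETWEEN 1.0 AND 2.0
    GROUP BY r.id, r.robot
    HAVING MAX(s.gripper_force) > 5.0
    ORDER BY r.id
""").fetchall()
```

Here only `rec_a` matches, because its force reading of 9.5 at 1.5 seconds falls inside the time window and above the threshold.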

Transform

Refine your data without copies

Add derived columns and evolve schemas without breaking history. You run the transforms with the SDK; Hub keeps the derived data and your raw recordings organized together.

Train

Train without an export step

Express a dataset mix as a query and stream it to your GPUs. The dataloader is column-aware and video-codec-aware, so you train directly on your recordings.

Share

Everyone works from the same data

One viewer, the same recordings, shared across the team. Explore, annotate, and trace a failure back to the data that caused it.

Your region. Your storage.

Single-tenant isolation

Your own isolated Hub deployment, run for you in the cloud region you choose.

Your data stays in your buckets

Any S3-compatible bucket, across regions. You decide where it lives.

Enterprise-ready with broad SSO support, self-managed storage options, and all the controls your security team expects. Designed for low friction at any scale: from empowering small research teams to facilitating secure cross-org data sharing.

Ready to scale your data layer?

Book a meeting to see Hub against your stack: your data, your storage, your training cluster.