The experiment loop

Refine: register, enrich, query

One converted file is a demo. A folder of them is a dataset, and once it's registered you stop opening files one at a time and start asking questions of all of them at once.

A real Rerun recording: the converted episode, one of ten in the local dataset. In this article the dataset grows derived layers (keyframes, per-frame blur scores, operator metadata), computed after the fact and registered alongside recordings like this one, without touching them.

Register once, open consistently

The demo's catalog.py step registers every recording with a catalog and registers a default blueprint with the dataset. pixi run serve starts a local catalog; the same code points at Rerun Hub unchanged once your data outgrows the laptop. Because the blueprint is registered with the dataset, every episode (yours, a teammate's, one from three months ago) opens with the same dashboard instead of an arbitrary, heuristically chosen layout. The catalog tracks each recording as a segment you can list, filter, view, and query.

Enrich with layers, never mutating raw data

Refining means adding derived data: a per-frame blur score, operator metadata, model outputs, quality verdicts. In Rerun these attach as layers: new data carrying the same recording id as their source, registered under a layer name. The raw recording is never modified, and a bad pass is fixed by re-registering. Anything you compute in the future lands the same way.
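The layering rule above can be sketched as a toy model in plain Python. This is not the Rerun API: ToyDataset, register, view, and the layer names are all hypothetical, chosen only to show that derived data shares the recording id of its source, that the raw recording stays untouched, and that a bad pass is replaced by registering the layer again.

```python
# Toy model of layers. NOT the Rerun API; every name here is hypothetical.
from dataclasses import dataclass, field


@dataclass
class ToyDataset:
    # (recording_id, layer_name) -> payload; the "base" layer holds raw data.
    layers: dict = field(default_factory=dict)

    def register(self, recording_id, payload, layer="base"):
        # Raw recordings are write-once in this sketch.
        if layer == "base" and (recording_id, "base") in self.layers:
            raise ValueError("raw recordings are never overwritten")
        # Re-registering a derived layer replaces the previous pass.
        self.layers[(recording_id, layer)] = payload

    def view(self, recording_id):
        # Everything registered for one recording, keyed by layer name.
        return {
            name: data
            for (rid, name), data in self.layers.items()
            if rid == recording_id
        }


dataset = ToyDataset()
dataset.register("episode_03", {"frames": ["f0", "f1", "f2"]})

# A first blur pass turns out to be wrong...
dataset.register("episode_03", {"blur": [0.9, 0.9, 0.9]}, layer="blur_scores")
# ...so it is fixed by registering the same layer again.
dataset.register("episode_03", {"blur": [0.12, 0.40, 0.08]}, layer="blur_scores")

episode = dataset.view("episode_03")
```

After the second pass, episode holds the untouched raw frames under "base" and only the corrected scores under "blur_scores".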

Query the whole dataset

With the dataset registered, questions become single queries instead of ten file-opens. The demo's notebooks/ show two flavors over the same data, side by side:

the DataFrame API (Python), in analyze_dataframe.ipynb: e.g. rank operators by success rate, or read a joint's value at each episode's final timestamp across the dataset;
SQL, in analyze_sql.ipynb: the same questions for people who'd rather write a query directly.

Results can be written back as tables that persist next to the dataset and show up in the viewer, so analysis becomes part of the data layer, not a notebook artifact someone loses. The query how-tos live in the Query & transform docs.
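To make the "rank operators by success rate" question concrete, here is a minimal sketch that uses Python's built-in sqlite3 module as a stand-in for the catalog. The episodes table, its columns, and the sample rows are invented for illustration; the demo's notebooks run against the registered dataset, not SQLite. The query result is also stored as its own table, loosely mirroring the idea of writing results back.

```python
import sqlite3

# In-memory database standing in for the catalog (an assumption of this sketch).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE episodes (episode_id TEXT, operator TEXT, success INTEGER)"
)
conn.executemany(
    "INSERT INTO episodes VALUES (?, ?, ?)",
    [
        ("ep_00", "alice", 1),
        ("ep_01", "alice", 1),
        ("ep_02", "bob", 0),
        ("ep_03", "bob", 1),
        ("ep_04", "carol", 0),
    ],
)

# One query over every episode; the result persists as a table of its own.
conn.execute(
    """
    CREATE TABLE operator_success AS
    SELECT operator,
           AVG(success) AS success_rate,
           COUNT(*)     AS episodes
    FROM episodes
    GROUP BY operator
    """
)

ranking = conn.execute(
    "SELECT operator, success_rate, episodes FROM operator_success "
    "ORDER BY success_rate DESC, operator"
).fetchall()
```

Here ranking is a list of (operator, success_rate, episodes) tuples, best operator first.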

Explore and extend

In the embedded viewer, pick a robot link and scrub, then imagine asking "across every episode, how often does this joint exceed its velocity limit?" That's one query.
Open the demo's notebooks/ (analyze_dataframe.ipynb and analyze_sql.ipynb) and read the DataFrame and SQL versions of the same question side by side.
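The velocity-limit question can be sketched in plain Python, independent of any Rerun API. The episode names, the sampled joint positions, and the 2.0 rad/s limit below are made up, and velocity is estimated with a simple finite difference between consecutive samples. "How often" is read here as the fraction of episodes with at least one violation, which is only one possible reading.

```python
# Hypothetical data: per-episode (time_s, joint_position_rad) samples.
episodes = {
    "ep_00": [(0.0, 0.00), (0.1, 0.05), (0.2, 0.10)],
    "ep_01": [(0.0, 0.00), (0.1, 0.30), (0.2, 0.35)],
    "ep_02": [(0.0, 0.10), (0.1, 0.10), (0.2, -0.20)],
}
VELOCITY_LIMIT = 2.0  # rad/s, an assumed limit for illustration


def exceeds_limit(samples, limit):
    """Return True if any finite-difference velocity exceeds the limit."""
    for (t0, q0), (t1, q1) in zip(samples, samples[1:]):
        if abs((q1 - q0) / (t1 - t0)) > limit:
            return True
    return False


# The "one query": which episodes ever break the limit, and what share is that?
offenders = sorted(
    name
    for name, samples in episodes.items()
    if exceeds_limit(samples, VELOCITY_LIMIT)
)
fraction = len(offenders) / len(episodes)
```

The same filter-then-aggregate shape carries over whether the question is written against a DataFrame or in SQL.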

The dataset can now describe its own quality. Next up is Train: filter the dataset by a derived signal and stream the survivors straight into training. The training set is a query.
