The experiment loop

Refine: register, enrich, query

One converted file is a demo. A folder of them is a dataset, and once it's registered you stop opening files one at a time and start asking questions of all of them at once.

A real Rerun recording: the converted episode, one of ten in the local dataset. In this article the dataset grows derived layers (keyframes, per-frame blur scores, operator metadata) computed after the fact and registered alongside recordings like this one, without touching them.

Register once, open consistently

The demo's catalog.py step registers every recording with a catalog and registers a default blueprint with the dataset. Running pixi run serve starts a local catalog; the same code points at Rerun Hub unchanged once your data outgrows the laptop. Because the blueprint is registered with the dataset, every episode (yours, a teammate's, one from three months ago) opens with the same dashboard instead of an arbitrary heuristic viewport. The catalog tracks each recording as a segment you can list, filter, view, and query.

Enrich with layers, never mutating raw data

Refining means adding derived data: a per-frame blur score, operator metadata, model outputs, quality verdicts. In Rerun these attach as layers: new data carrying the same recording id as their source, registered under a layer name. The raw recording is never modified, and a bad pass is fixed by re-registering the layer. Anything you compute in the future lands the same way.
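The layering idea can be modeled in a few lines of plain Python. This is a conceptual sketch only, not the Rerun API: ToyDataset, register_recording, register_layer, and blur_score are hypothetical names invented for illustration, and the "frames" are tiny made-up rows of pixel values. It shows the two properties described above: raw data is written once, and a bad pass is corrected by registering the same layer name again.

```python
from dataclasses import dataclass, field


@dataclass
class ToyDataset:
    """Conceptual stand-in for a registered dataset (NOT the Rerun API)."""

    # recording_id -> raw per-frame data, written once at registration
    raw: dict = field(default_factory=dict)
    # (recording_id, layer_name) -> derived per-frame data
    layers: dict = field(default_factory=dict)

    def register_recording(self, recording_id, frames):
        if recording_id in self.raw:
            raise ValueError(f"{recording_id} is already registered")
        # Stored as tuples so nothing downstream can edit the raw frames.
        self.raw[recording_id] = tuple(tuple(f) for f in frames)

    def register_layer(self, recording_id, layer_name, values):
        if recording_id not in self.raw:
            raise KeyError(f"unknown recording: {recording_id}")
        # Registering under an existing name replaces the earlier pass.
        self.layers[(recording_id, layer_name)] = list(values)


def blur_score(frame):
    """Hypothetical blur proxy: mean absolute neighbor difference (lower = blurrier)."""
    diffs = [abs(a - b) for a, b in zip(frame, frame[1:])]
    return sum(diffs) / len(diffs)


dataset = ToyDataset()
dataset.register_recording("episode_03", [[0, 9, 0, 9], [4, 5, 4, 5]])

# First pass has a bug: it scores only the first frame.
first_frame = dataset.raw["episode_03"][0]
dataset.register_layer("episode_03", "blur", [blur_score(first_frame)])

# Fix the pass and register it again under the same layer name.
scores = [blur_score(frame) for frame in dataset.raw["episode_03"]]
dataset.register_layer("episode_03", "blur", scores)
```

Keying layers by (recording id, layer name) is what makes the fix cheap in this model: the second register_layer call overwrites only the derived values, and dataset.raw is never written to after registration.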

Query the whole dataset

With the dataset registered, questions become single queries instead of ten file-opens. The demo's notebooks/ show two flavors over the same data, side by side:

  • the DataFrame API (Python), in analyze_dataframe.ipynb: e.g. rank operators by success rate, or read a joint's value at each episode's final timestamp across the dataset;
  • SQL, in analyze_sql.ipynb: the same questions for people who'd rather write a query directly.
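To make the two flavors concrete, here is a minimal sketch of "rank operators by success rate" written twice: once as SQL and once as a group-by in code. It does not use the demo's notebooks or the Rerun query API. The episodes table, its column names, and the operator names are all assumptions made up for this example, and SQLite from the Python standard library stands in for the catalog.

```python
import sqlite3
from collections import defaultdict

# Hypothetical per-episode metadata: (segment_id, operator, success).
episodes = [
    ("ep_00", "alice", 1),
    ("ep_01", "alice", 1),
    ("ep_02", "bob", 0),
    ("ep_03", "bob", 1),
    ("ep_04", "carol", 0),
    ("ep_05", "alice", 0),
]

# SQL flavor, run against an in-memory SQLite table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE episodes (segment_id TEXT, operator TEXT, success INTEGER)"
)
conn.executemany("INSERT INTO episodes VALUES (?, ?, ?)", episodes)
sql_ranking = conn.execute(
    """
    SELECT operator, AVG(success) AS success_rate, COUNT(*) AS n_episodes
    FROM episodes
    GROUP BY operator
    ORDER BY success_rate DESC, operator ASC
    """
).fetchall()
conn.close()

# The same question as a group-by in plain Python.
outcomes = defaultdict(list)
for _, operator, success in episodes:
    outcomes[operator].append(success)
py_ranking = sorted(
    ((op, sum(v) / len(v), len(v)) for op, v in outcomes.items()),
    key=lambda row: (-row[1], row[0]),  # best rate first, ties by name
)

for operator, rate, n in sql_ranking:
    print(f"{operator}: {rate:.2f} over {n} episodes")
```

Both versions produce the same ordering, which is the point of keeping the two notebooks side by side: the question is fixed, and only the notation changes.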

Results can be written back as tables that persist next to the dataset and show up in the viewer, so analysis becomes part of the data layer, not a notebook artifact someone loses. The query how-tos live in the Query & transform docs.

Explore and extend

  • In the embedded viewer, pick a robot link and scrub, then imagine asking "across every episode, how often does this joint exceed its velocity limit?" That's one query.
  • Open the demo's notebooks/ (analyze_dataframe.ipynb and analyze_sql.ipynb) and read the DataFrame and SQL versions of the same question side by side.
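As a rough sketch of what the velocity-limit question computes, the snippet below estimates joint velocity by finite differences and counts the steps that exceed a limit across every episode. It is plain Python rather than a catalog query, and everything in it is assumed for illustration: the joint_logs layout, the sample values, and the limit of 2.0 rad/s.

```python
# Hypothetical per-episode logs for one joint: timestamps (s) and positions (rad).
joint_logs = {
    "ep_00": {"t": [0.0, 0.5, 1.0, 1.5], "pos": [0.0, 0.5, 2.0, 2.5]},
    "ep_01": {"t": [0.0, 0.5, 1.0], "pos": [0.0, 0.25, 0.5]},
}
VELOCITY_LIMIT = 2.0  # rad/s, assumed for this example


def count_over_limit(times, positions, limit):
    """Return (steps over the limit, total steps) using finite differences."""
    over = total = 0
    for t0, t1, p0, p1 in zip(times, times[1:], positions, positions[1:]):
        velocity = (p1 - p0) / (t1 - t0)
        total += 1
        over += abs(velocity) > limit
    return over, total


# Aggregate across every episode instead of opening them one at a time.
over_total = (0, 0)
for log in joint_logs.values():
    over, total = count_over_limit(log["t"], log["pos"], VELOCITY_LIMIT)
    over_total = (over_total[0] + over, over_total[1] + total)

rate = over_total[0] / over_total[1]
print(f"{over_total[0]} of {over_total[1]} steps exceed the limit ({rate:.0%})")
```

With the dataset registered, the loop over episodes is what collapses into a single query; the per-step arithmetic stays the same.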

Next

The dataset can now describe its own quality. Train: filter it by a derived signal and stream the survivors straight into training. The training set is a query.