View Operations

Robotics datasets typically span many sensors and therefore many columns. To narrow down the content that later dataframe operations will see, you first create a view. A view can filter by episode, time, column name, and more. This example demonstrates several of these filters.

The dependencies in this example are contained in rerun-sdk[all].

Setup

The setup is deliberately simple: perform the imports and launch a local server for the demonstration. Then extract an initial view, observations, that we later refine before generating a dataframe.

from __future__ import annotations

from pathlib import Path

import pyarrow as pa
import rerun as rr
from datafusion import col

sample_5_path = Path(__file__).parents[4] / "tests" / "assets" / "rrd" / "sample_5"

server = rr.server.Server(datasets={"sample_dataset": sample_5_path})
CATALOG_URL = server.address()
client = rr.catalog.CatalogClient(CATALOG_URL)
dataset = client.get_dataset(name="sample_dataset")
observations = dataset.filter_contents("/observation/**")
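
The "/observation/**" argument above is a pattern over entity paths. As a rough illustration of how such a pattern narrows a view, the sketch below assumes that a trailing "/**" selects the named entity and everything beneath it. The helper matches_content_filter and the entity paths are hypothetical, written only for this illustration; this is not Rerun's own matcher.

```python
def matches_content_filter(pattern: str, entity_path: str) -> bool:
    """Hypothetical helper: approximate a trailing '/**' content filter.

    Assumption: '/a/**' selects '/a' itself and every path nested under it.
    Patterns without a trailing '/**' are treated as exact matches.
    """
    if pattern.endswith("/**"):
        prefix = pattern[: -len("/**")]
        return entity_path == prefix or entity_path.startswith(prefix + "/")
    return entity_path == pattern


# Made-up entity paths, for illustration only.
paths = [
    "/action/joint_velocity",
    "/language_instruction",
    "/observation",
    "/observation/camera/wrist",
    "/observation/joint_positions",
    "/observation_extra",
]

selected = [p for p in paths if matches_content_filter("/observation/**", p)]
print(selected)
```

Note that "/observation_extra" is not selected under this assumption: the match is on whole path segments, not on a raw string prefix.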

Filtering on episode and time

Limit the scope of the view so that only a subset of the data is ever considered on the client side. Here we pick a specific episode by id and a time range on the real_time index.

episode = "ILIAD_50aee79f_2023_07_12_20h_55m_08s"
start = 1689220508
end = start + 5
filtered_view = dataset.filter_segments(episode).filter_contents("/observation/**")
filtered_df = filtered_view.reader(index="real_time")
filtered_df = filtered_df.filter((col("real_time") >= start) & (col("real_time") < end))
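
To sanity-check the bounds used above, the following standalone sketch interprets start and end as Unix epoch seconds (an assumption about the real_time values) and mirrors the half-open window [start, end) expressed by the >= / < pair. It also compares start against the timestamp embedded in the episode id, assuming that timestamp is local wall-clock time at UTC-7. The helper in_window is hypothetical and only illustrates the comparison logic.

```python
from datetime import datetime, timedelta, timezone

start = 1689220508  # same value as in the example above
end = start + 5     # exclusive upper bound

# Assumption: the bounds are Unix epoch seconds.
start_utc = datetime.fromtimestamp(start, tz=timezone.utc)

# Assumption: the episode id embeds local wall-clock time at UTC-7.
local = start_utc.astimezone(timezone(timedelta(hours=-7)))
episode_stamp = local.strftime("%Y_%m_%d_%Hh_%Mm_%Ss")


def in_window(t: float) -> bool:
    """Hypothetical helper: half-open window [start, end)."""
    return start <= t < end


print(start_utc.isoformat())  # 2023-07-13T03:55:08+00:00
print(episode_stamp)          # 2023_07_12_20h_55m_08s
print(in_window(start), in_window(end))  # True False
```

Under these assumptions the start bound lines up with the timestamp at the end of the episode id, and a row stamped exactly at end is excluded.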

Querying static data

So far we have used real_time as the index. Some data, however, does not change over time (e.g. static transformations) and is therefore not aligned with any time index. To query this static data, pass None as the index.

instructions = dataset.filter_contents("/language_instruction/**").reader(index=None)

# Sort so that the documented output is deterministic
instructions = instructions.sort("/language_instruction:TextDocument:text")
instructions_tbl = pa.table(instructions)
instructions_tbl["/language_instruction:TextDocument:text"][0]