Convert existing data to Rerun

There are a variety of ways to convert data into an RRD. When file types are opened in the viewer, they go through our data loaders. We also offer a few command-line options for converting MCAP data directly into an RRD, as long as our built-in data loaders support all of your message types. However, the most general solution, which supports arbitrary message types, is the logging API.

Converting existing data to RRD

This guide covers the two recommended approaches: recording.log (row-oriented) and recording.send_columns (columnar). Both produce identical .rrd output.

Quick comparison

Rerun offers two APIs that we will use for conversion. Both produce identical .rrd files:

|  | recording.log | recording.send_columns |
|---|---|---|
| API style | Row-oriented: one entity per call | Columnar: many timestamps per call |
| Best for | Live streaming, prototyping, simple conversions | Batch conversion of large datasets |
| Performance | Lower throughput, no batch latency | ~3–10x faster for batch workloads |
| Typical use cases | Sensor streams, simple scripts | Bulk data conversion |
| Language support | Python, Rust, C++ | Python, Rust, C++ |
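
As a minimal sketch of the difference (hypothetical entity paths and timeline names; rr.Scalars stands in for real data, and exact signatures may vary between SDK versions):

import numpy as np
import rerun as rr

rec = rr.RecordingStream("rerun_example_api_comparison")
rec.save("comparison.rrd")

timestamps = np.arange(100, dtype=np.int64) * 1_000_000_000  # illustrative nanosecond stamps
values = np.sin(np.arange(100) / 10.0)

# recording.log: one set_time + log call per sample.
for t_ns, value in zip(timestamps, values):
    rec.set_time("sample_time", timestamp=np.datetime64(t_ns, "ns"))
    rec.log("signal/row_oriented", rr.Scalars(value))

# recording.send_columns: the entire series in a single call.
rec.send_columns(
    "signal/columnar",
    indexes=[rr.TimeColumn("sample_time", timestamp=timestamps.astype("datetime64[ns]"))],
    columns=rr.Scalars.columns(scalars=values),
)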

When to use which

Use recording.log when:

  • Your dataset is small and performance isn't critical
  • Implementation simplicity is the priority

Use recording.send_columns when:

  • You're doing batch conversion of large recorded datasets
  • You have high-frequency signals (transforms, IMU, joint states)

Here are timings from a real-world MCAP conversion with custom Protobuf messages (~21k messages total):

|  | recording.log | recording.send_columns |
|---|---|---|
| Video frames (2,363 msgs) | 0.12s | 0.01s |
| Transforms (16,505 msgs) | 0.84s | 0.08s |
| Other messages (2,354 msgs) | 0.09s | 0.01s |
| Total Rerun logging time | 1.33s | 0.10s |

Note: These are example timings from a specific dataset. Actual performance will vary. The relative speedup (10-13x here) is typical for the Rerun logging step of batch conversions.

Map to archetypes

Regardless of which API you use, the goal is to map your custom data into Rerun archetypes.

When writing your converter, the first question for each message type is: What is the proper Rerun archetype?

  • For example, a pose message maps to Transform3D, an image to Image, and a point cloud to Points3D.
  • For data that does not map cleanly to an existing archetype, you can use AnyValues for simple key-value pairs, or DynamicArchetype when you want to group related fields under a named archetype. Both appear in the dataframe view and are queryable, but neither specifies how the data should be visualized; see the sketch after this list.
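
A minimal sketch of both cases (entity paths and values are illustrative):

import rerun as rr

rec = rr.RecordingStream("rerun_example_archetype_mapping")
rec.save("mapping.rrd")

# A pose-like message maps cleanly onto Transform3D.
rec.log(
    "robot/base",
    rr.Transform3D(translation=[1.0, 0.0, 0.5], quaternion=rr.Quaternion(xyzw=[0.0, 0.0, 0.0, 1.0])),
)

# Leftover fields with no visual archetype can still be logged as key-value pairs.
rec.log("robot/status", rr.AnyValues(battery_pct=87.5, mode="autonomous"))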

Converter structure with recording.log

Each handler sets timestamps and logs directly:

First, we add a utility to manage logging timestamps (McapMessage is a small user-defined container, constructed in the loop below):

import numpy as np
import rerun as rr


def set_mcap_message_times(rec: rr.RecordingStream, msg: McapMessage) -> None:
    """
    Set both MCAP message timestamps on the recording stream.

    log_time_ns: when the message was logged by the recorder
    publish_time_ns: when the message was published
    """
    rec.set_time(timeline="message_log_time", timestamp=np.datetime64(msg.log_time_ns, "ns"))
    rec.set_time(timeline="message_publish_time", timestamp=np.datetime64(msg.publish_time_ns, "ns"))

Then we specify how to convert specific kinds of messages:

def compressed_video(rec: rr.RecordingStream, msg: McapMessage) -> bool:
    """Convert CompressedVideo messages to Rerun VideoStream."""
    if msg.proto_msg.DESCRIPTOR.name != "CompressedVideo":
        return False

    video_blob = rr.VideoStream(
        codec=msg.proto_msg.format,
        sample=msg.proto_msg.data,
    )
    set_mcap_message_times(rec, msg)
    rec.log(msg.topic, video_blob)
    return True
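
The remaining handlers follow the same pattern. For instance, a calibration handler might look like this (a hypothetical sketch; the exact field names depend on your Protobuf schema):

def camera_calibration(rec: rr.RecordingStream, msg: McapMessage) -> bool:
    """Convert CameraCalibration messages to a static Pinhole (hypothetical field names)."""
    if msg.proto_msg.DESCRIPTOR.name != "CameraCalibration":
        return False

    rec.log(
        msg.topic,
        rr.Pinhole(
            image_from_camera=np.asarray(msg.proto_msg.K, dtype=np.float32).reshape(3, 3),
            resolution=[msg.proto_msg.width, msg.proto_msg.height],
        ),
        static=True,  # calibration rarely changes, so log it once as static data
    )
    return True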

Finally, we loop over all messages and dispatch each one to the handlers:

from mcap.reader import make_reader
from mcap_protobuf.decoder import DecoderFactory

with open(path_to_mcap, "rb") as f:
    reader = make_reader(f, decoder_factories=[DecoderFactory()])
    for _schema, channel, message, proto_msg in reader.iter_decoded_messages():
        msg = McapMessage(
            topic=channel.topic,
            log_time_ns=message.log_time,
            publish_time_ns=message.publish_time,
            proto_msg=proto_msg,
        )
        if camera_calibration(rec, msg):
            continue
        if compressed_video(rec, msg):
            continue
        if transform_msg(rec, msg):
            continue
        if implicit_convert(rec, msg):
            continue
        print(f"Unhandled message on topic {msg.topic} of type {msg.proto_msg.DESCRIPTOR.name}")

Full working example: Converting MCAP Protobuf data using recording.log

Converter structure with recording.send_columns

Instead of logging directly, handlers extract data into collectors. Collectors accumulate data during iteration, and the accumulated columns are sent in bulk after the loop.

The ColumnCollector used below is a user-defined helper class (not part of the Rerun SDK) that accumulates time-indexed data and sends it via send_columns. See the full example for its implementation.
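
As a rough sketch of the idea (a hypothetical implementation; it assumes every appended field maps onto a parameter of the archetype's columns() class method):

from collections import defaultdict

import numpy as np
import rerun as rr


class ColumnCollector:
    """Hypothetical sketch: accumulate per-entity rows, then send them in one columnar call."""

    def __init__(self, entity_path: str, archetype: type, **_extra) -> None:
        # Extra kwargs (e.g. archetype_name for DynamicArchetype) are ignored in this sketch.
        self.entity_path = entity_path
        self.archetype = archetype
        self.indexes = defaultdict(list)  # timeline name -> list of nanosecond timestamps
        self.fields = defaultdict(list)  # component field name -> list of values

    def append(self, indexes: dict[str, int], **fields) -> None:
        for timeline, t_ns in indexes.items():
            self.indexes[timeline].append(t_ns)
        for name, value in fields.items():
            self.fields[name].append(value)

    def send(self, rec: rr.RecordingStream) -> None:
        time_columns = [
            rr.TimeColumn(timeline, timestamp=np.asarray(stamps, dtype=np.int64).astype("datetime64[ns]"))
            for timeline, stamps in self.indexes.items()
        ]
        rec.send_columns(
            self.entity_path,
            indexes=time_columns,
            columns=self.archetype.columns(**self.fields),
        )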

from mcap.reader import make_reader
from mcap_protobuf.decoder import DecoderFactory

# Per-entity collectors, plus bookkeeping for what has already been logged statically.
video_collectors: dict[str, ColumnCollector] = {}
transform_collectors: dict[str, ColumnCollector] = {}
dynamic_collectors: dict[str, ColumnCollector] = {}
logged_static_calibrations: set = set()
logged_static_transforms: set = set()

with open(path_to_mcap, "rb") as f:
    reader = make_reader(f, decoder_factories=[DecoderFactory()])
    for _schema, channel, message, proto_msg in reader.iter_decoded_messages():
        msg = McapMessage(
            topic=channel.topic,
            log_time_ns=message.log_time,
            publish_time_ns=message.publish_time,
            proto_msg=proto_msg,
        )

        # Static-only: camera calibration
        if camera_calibration(rec, msg, logged_static_calibrations):
            continue

        # Time-series: compressed video
        if frame := compressed_video(msg):
            entity_path = msg.topic
            if entity_path not in video_collectors:
                video_collectors[entity_path] = ColumnCollector(entity_path, rr.VideoStream)
                rec.log(entity_path, rr.VideoStream(codec=frame.codec), static=True)
            video_collectors[entity_path].append(
                indexes={"message_log_time": frame.log_time_ns, "message_publish_time": frame.publish_time_ns},
                sample=frame.data,
            )
            continue

        # Time-series: transforms (static transforms logged inside handler)
        if transforms := transform_msg(rec, msg, logged_static_transforms):
            for t in transforms:
                if t.entity_path not in transform_collectors:
                    transform_collectors[t.entity_path] = ColumnCollector(t.entity_path, rr.Transform3D)
                transform_collectors[t.entity_path].append(
                    indexes={"message_log_time": t.log_time_ns, "message_publish_time": t.publish_time_ns},
                    translation=t.translation,
                    quaternion=rr.Quaternion(xyzw=t.quaternion),
                    parent_frame=t.parent_frame,
                    child_frame=t.child_frame,
                )
            continue

        # Fallback: any unhandled message as DynamicArchetype
        archetype_name, entity_path, components = implicit_collect(msg)
        if entity_path not in dynamic_collectors:
            dynamic_collectors[entity_path] = ColumnCollector(
                entity_path, rr.DynamicArchetype, archetype_name=archetype_name
            )
        dynamic_collectors[entity_path].append(
            indexes={"message_log_time": msg.log_time_ns, "message_publish_time": msg.publish_time_ns},
            **components,
        )

# Send all collected time-series data using columnar API
send_collected_columns(rec, video_collectors, transform_collectors, dynamic_collectors)
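
With a collector shaped like the sketch above, the final flush helper could be as simple as this (hypothetical):

def send_collected_columns(rec: rr.RecordingStream, *collector_dicts: dict) -> None:
    # Flush every per-entity collector with one columnar call each.
    for collectors in collector_dicts:
        for collector in collectors.values():
            collector.send(rec)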

Full working example: Converting MCAP Protobuf data using recording.send_columns