ROS2 OpenTelemetry Integration Library

A production-grade integration library for instrumenting ROS2 (Robot Operating System 2) applications with OpenTelemetry distributed tracing and observability capabilities. This project provides a comprehensive toolchain for building, deploying, and monitoring ROS2 workspaces with native OpenTelemetry support for both C++ and Python nodes.

Signoz x Lichtblick visualization

Example

The repository also includes a real-world example built from a MoveIt2-based C++ RobotControl node and a Python TaskProducer node.

Prerequisites to build locally

Quick Start

# Clone repository
git clone https://github.com/szobov/ros-opentelemetry.git
cd ros-opentelemetry

To run in Docker:

# To run example telemetry setup
just docker-up-telemetry

# Run example
just docker-up-example

To run the example locally:

# Allow direnv to operate on environment variables
direnv allow .

just setup-conan

just build-locally

# Reload so the environment picks up the variables ROS2 requires
direnv reload

# Start the example telemetry stack (skip this step if it is still running from the previous example)
just docker-up-telemetry

# Run the local example with RViz
just run-example-locally

How to use this Library

Installation

There are two ways you can use this library.

Streamlined

Look at how the example in this repository is set up. Most of the magic happens inside build-locally.bash: it builds your project together with this library, installs dependencies using conan and uv, and then runs colcon in the virtualenv created by uv, so your Python nodes have access to PyPI packages. If you switch to this method, you can manage your dependencies through conanfile.txt and pyproject.toml without the installation hassle.

Classical

You install opentelemetry-sdk yourself and build it together with the library.

The tricky part is that there is no obvious way to install PyPI packages so that they are available in your ROS2 environment. In addition, for opentelemetry-cpp you need to follow its installation guide and build it yourself. If you choose this route, add ros_opentelemetry_py and/or ros_opentelemetry_cpp, plus ros_opentelemetry_interfaces, from src/ to your ROS2 workspace.

Instrumenting Code

After installation, you need to instrument your code.

OpenTelemetry provides extensive guidance on code instrumentation for both C++ and Python.

Traces

To integrate tracing into your node, add the following in C++:

#include "ros_opentelemetry_cpp/ros_opentelemetry_cpp.hpp"

std::string otlp_grpc_endpoint = "hostname-of-your-otel-collector:4317";
ros_opentelemetry_cpp::setup_tracer("robot_control", otlp_grpc_endpoint);

In Python:

from ros_opentelemetry_py import setup_tracer

# somewhere at the start of your node
if __name__ == "__main__":
    # Expects environment variable OTLP_ENDPOINT set
    setup_tracer("robot_task_producer")

That's the way you connect your tracers to the trace collector.
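
Because the Python setup_tracer reads its endpoint from OTLP_ENDPOINT, it can help to fail fast when that variable is missing or malformed. The following is a minimal sketch, assuming the endpoint uses the same host:port form as the C++ example above. resolve_otlp_endpoint is a hypothetical helper for illustration and is not part of this library:

```python
import os


def resolve_otlp_endpoint(env=None):
    """Return (host, port) parsed from OTLP_ENDPOINT, or raise ValueError.

    Hypothetical helper: the library's own handling of OTLP_ENDPOINT may differ.
    """
    env = os.environ if env is None else env
    raw = env.get("OTLP_ENDPOINT", "").strip()
    if not raw:
        raise ValueError("OTLP_ENDPOINT is not set")
    # Split on the last colon so the port is always the trailing component.
    host, sep, port = raw.rpartition(":")
    if not sep or not host or not port.isdigit():
        raise ValueError(f"OTLP_ENDPOINT must look like host:port, got {raw!r}")
    return host, int(port)
```

Calling such a check right before setup_tracer("robot_task_producer") turns a silently unexported trace into an immediate, readable startup error.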

After that, you can use standard OpenTelemetry tracers to trace your code. In C++:

#include <opentelemetry/trace/span.h>
#include <opentelemetry/trace/tracer.h>
#include <opentelemetry/trace/provider.h>
#include <opentelemetry/trace/scope.h>


auto tracer = opentelemetry::trace::Provider::GetTracerProvider()->GetTracer(
      "name_of_your_component");
auto span = tracer->StartSpan("handleActionOrServiceOrOtherCallback");
{
    // Make `span` the active span first, so spans started below become its children
    opentelemetry::trace::Scope scope(span);
    auto nested_span = tracer->StartSpan("nested_span");
    // your code
    nested_span->End();
}
span->End();

In Python:

from opentelemetry import trace

tracer: trace.Tracer = trace.get_tracer(__name__)

@tracer.start_as_current_span("method_of_your_node")
def method_of_your_node(self, params):
    ...

To connect your nodes, you need to propagate the Trace Context.

To do this, add ros_opentelemetry_interfaces as a dependency in the package.xml of your message package:

  <depend>ros_opentelemetry_interfaces</depend>

Update your CMakeLists.txt to link against it, and add a TraceMetadata field to your messages:

ros_opentelemetry_interfaces/TraceMetadata trace_metadata

Then inject the trace context into your messages (actions, services, or topics) on the sending side:

from ros_opentelemetry_py import inject_trace_context

example_msg = ExampleActionMessage.Goal()
example_msg.trace_metadata = inject_trace_context()

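Then extract it in the receiving node:

#include <opentelemetry/context/runtime_context.h>
#include "ros_opentelemetry_cpp/ros_opentelemetry_cpp.hpp"

const auto goal = goal_handle->get_goal();
// Rebuild the sender's context from the metadata carried in the message
auto extracted_ctx =
      ros_opentelemetry_cpp::extract_trace_context(&goal->trace_metadata);
// Keep the token alive for as long as the extracted context should stay active
[[maybe_unused]] auto ctx_token =
      opentelemetry::context::RuntimeContext::Attach(extracted_ctx);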

With this instrumentation in place, spans from the two nodes are joined into a single trace.
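
To make the idea of propagation more concrete, here is a stdlib-only sketch of the W3C Trace Context traceparent format, which OpenTelemetry uses by default to carry a trace ID and parent span ID between processes. How TraceMetadata actually stores this information is not shown in this README, so treat the string carrier and the helper names below as assumptions for illustration only:

```python
import re
import secrets

# W3C "traceparent": version-traceid-spanid-flags, all lowercase hex.
_TRACEPARENT = re.compile(
    r"^([0-9a-f]{2})-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$"
)


def make_traceparent(trace_id, span_id, sampled=True):
    """Encode IDs as a version-00 traceparent string (hypothetical helper)."""
    flags = 1 if sampled else 0
    return f"00-{trace_id:032x}-{span_id:016x}-{flags:02x}"


def parse_traceparent(header):
    """Return (trace_id, span_id, sampled), or None if the header is invalid."""
    match = _TRACEPARENT.match(header)
    if match is None:
        return None
    version, trace_hex, span_hex, flags_hex = match.groups()
    trace_id, span_id = int(trace_hex, 16), int(span_hex, 16)
    # Version "ff" and all-zero IDs are invalid per the specification.
    if version == "ff" or trace_id == 0 or span_id == 0:
        return None
    return trace_id, span_id, bool(int(flags_hex, 16) & 1)


# Sender: encode the current trace and span IDs into the outgoing message.
trace_id = secrets.randbits(128) or 1
parent_span_id = secrets.randbits(64) or 1
header = make_traceparent(trace_id, parent_span_id)

# Receiver: decode them and start child spans under the same trace ID.
parsed = parse_traceparent(header)
```

The receiver ends up with the same trace ID as the sender, which is what lets a tracing backend draw both nodes' spans as one trace.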

Logs

To connect traces to the logs, you can use traced loggers. In C++:

RCLCPP_ERROR_TRACED(this->get_logger(), "logger");

In Python:

from ros_opentelemetry_py import wrap_logger


self._traced_logger = wrap_logger(self.get_logger())
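
For the collector configuration in this section to link a log line to a trace, the message has to start with a [trace_id=... span_id=...] prefix in lowercase hex. The sketch below shows one way such a prefix could be produced. It is inferred from the collector regex rather than taken from the library, and with_trace_prefix is a hypothetical name:

```python
def with_trace_prefix(message, trace_id, span_id):
    """Prepend trace and span IDs in the form the collector regex expects.

    Hypothetical helper: the traced loggers in this library may format
    their output differently.
    """
    # An all-zero ID means there is no valid active span, so log unchanged.
    if trace_id == 0 or span_id == 0:
        return message
    return f"[trace_id={trace_id:032x} span_id={span_id:016x}] {message}"


line = with_trace_prefix("planning failed", 0xABCDEF, 0x1234)
```

Zero-padding to 32 and 16 hex characters matters, because the collector pattern only matches IDs of exactly those lengths.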

Then update how your otel-collector collects the logs. For example:

receivers:
  filelog:
    include: ["/opt/logs/**/*.log"]
    start_at: end
    multiline:
      line_start_pattern: '^\[\w+\] \[\d+\.\d+\] \[.*\]:'  # To support cases when we output multiline json
    operators:
      - type: regex_parser
        regex: '^\[(?P<level>\w+)\] \[(?P<timestamp>\d+\.\d+)\] \[(?P<source>[^\]]+)\]: (?P<message>.*)$'
        timestamp:
          parse_from: attributes.timestamp
          layout_type: epoch
          layout: "s.ns"
        severity:
          parse_from: attributes.level
      - type: regex_parser
        parse_from: attributes.message
        regex: '^(?:\[trace_id=(?P<trace_id>[0-9a-f]{32})\s+span_id=(?P<span_id>[0-9a-f]{16})\]\s*)?(?P<body>.*)$'

      - type: trace_parser
        trace_id:
          parse_from: attributes.trace_id     # 32-char lowercase hex
        span_id:
          parse_from: attributes.span_id      # 16-char lowercase hex

      - type: move
        from: attributes.body
        to: body

      - type: remove
        field: attributes.message
      - type: remove
        field: attributes.trace_id
      - type: remove
        field: attributes.span_id

This way, collected logs will be connected to the traces.
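
To check what the two regex_parser stages extract, the same patterns can be tried in Python, whose re module accepts the same (?P<name>...) named-group syntax. This is a sketch only: the sample log line is made up to match the patterns, and parse_log_line is a hypothetical helper:

```python
import re

# First stage: split the console line into level, timestamp, source, message.
LINE = re.compile(
    r"^\[(?P<level>\w+)\] \[(?P<timestamp>\d+\.\d+)\] "
    r"\[(?P<source>[^\]]+)\]: (?P<message>.*)$"
)
# Second stage: peel an optional trace prefix off the message.
TRACE_PREFIX = re.compile(
    r"^(?:\[trace_id=(?P<trace_id>[0-9a-f]{32})\s+"
    r"span_id=(?P<span_id>[0-9a-f]{16})\]\s*)?(?P<body>.*)$"
)


def parse_log_line(line):
    """Return the extracted fields as a dict, or None if the line does not match."""
    outer = LINE.match(line)
    if outer is None:
        return None
    inner = TRACE_PREFIX.match(outer["message"])
    return {
        "level": outer["level"],
        "timestamp": outer["timestamp"],
        "source": outer["source"],
        "trace_id": inner["trace_id"],  # None when the prefix is absent
        "span_id": inner["span_id"],
        "body": inner["body"],
    }


# Made-up sample line, shaped to fit the patterns above.
sample = (
    "[ERROR] [1700000000.123456789] [robot_control]: "
    "[trace_id=0123456789abcdef0123456789abcdef span_id=0123456789abcdef] "
    "planning failed"
)
record = parse_log_line(sample)
```

Lines without the prefix still parse; they simply carry no trace_id or span_id, which mirrors the optional group in the collector pattern.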

Architecture Overview

This library bridges the gap between ROS2's robotics-focused middleware and OpenTelemetry's observability ecosystem. It enables distributed tracing across heterogeneous ROS2 node graphs, supporting both ament_cmake (C++) and ament_python package types. The architecture leverages:

  • ROS2: Compatible with multiple ROS2 distributions with improved DDS middleware performance
  • OpenTelemetry SDK: Industry-standard observability instrumentation for traces, metrics, and logs
  • Conan 2.x: C++ dependency management with reproducible builds

The library is backend-agnostic and can integrate with any OpenTelemetry-compatible observability platform. Example configurations are provided for SigNoz, with Grafana support planned for future releases.

This project consists of three main packages: ros_opentelemetry_cpp, ros_opentelemetry_py, and ros_opentelemetry_interfaces.

Contributing

Contributions are welcome, but please note that this is an open-source project, and the maintainer reserves the right to accept or reject your changes. Please ensure just check passes before submitting pull requests.
