The relay's job: translate agent-to-agent activity into OTel spans, with no protocol changes visible to the agents themselves.
The persistence substrate is OTel spans. The wire format implemented today is A2A. The spec is shaped so other wire formats (and the Agent Channel coordination layer documented in channels-protocol.md, implemented in the channels/ package) map onto the same span / session / graph shape. Agents never read raw spans. Drop the relay between two existing agent peers and they coordinate normally, except every exchange is now a queryable trace.
This document is the v0.4 protocol shape. v0.4 retires the colony / multi-tenant framing from tracing.bootstrap() (see otel-a2a-relay#121) to align with the local-only substrate shape that luca adopted on 2026-05-12:
- <namespace>.colony resource attribute renamed to <namespace>.deployment. The old key encoded a per-colony multi-tenant deployment that is not a real use case for a local-only substrate.
- product_area parameter dropped. Was the "hard-boundary slice within a deployment" knob - a per-tenant routing concept that does not exist in the local-only shape. Phoenix project name is now derived from <deployment> alone.
- Bootstrap docstring de-coloned. No more "enterprise install" / "colony-defined" language.
v0.2 renamed wire-protocol attributes from a2a.* to o2r.* and added the tracing.bootstrap() entrypoint. v0.3 layered on data-legibility additions surfaced by real Phoenix sessions (see otel-a2a-relay#93):
- Sessions propagate via OpenInference's using_session(...) context manager, not just hand-set session.id attributes. Every relay-emitted span sits inside using_session(context_id) so any nested OpenInference-instrumented call inherits the session ID.
- Every span carries agent.role. Workers, validators, planners, orchestrators are no longer anonymous in per-role analysis.
- Every erroring relay span carries o2r.relay.failure_class. Coarse, machine-readable bucket (topology_violation, peer_disconnect, peer_404, timeout, peer_jsonrpc_error, unknown). Mirrors the Phoenix relay_failure_class annotation config that scripts/phoenix_bootstrap.py provisions.
- tracing.bootstrap() no longer auto-emits a tracing.session.start smoke span. It is opt-in via emit_readme_span=True. Default flow keeps the project list clean.
See "Phoenix-validated" below for which v0 claims survived and which were rewritten.
Phoenix-only OTLP/HTTP for v1. The relay process exports directly to Phoenix's collector port. No Tempo, no separate OTel Collector, no fan-out. Pluggability is a one-line endpoint swap, documented but not shipped.
One TracerProvider per relay process. Agent identity rides on span attributes, not on the Resource. (See finding 1.)
session.id = sha256("<repo>:<issue>")[:16] for GitHub-issue-rooted channels. Deterministic, stable across reconnects, no collector-side coordination. Other transports (Slack thread, Linear ticket) get their own deterministic derivation. The relay never mints session IDs server-side.
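The derivation above can be sketched in a few lines of Python. This is a minimal sketch, not the relay's own code: the function name derive_session_id is hypothetical, and it assumes the digest is hex-encoded and the key is the UTF-8 string "<repo>:<issue>".

```python
import hashlib


def derive_session_id(repo: str, issue: int) -> str:
    """Derive a deterministic session ID for a GitHub-issue-rooted channel."""
    # Assumption: key is "<repo>:<issue>" encoded as UTF-8, digest is hex.
    key = f"{repo}:{issue}".encode("utf-8")
    # First 16 hex characters of the SHA-256 digest.
    return hashlib.sha256(key).hexdigest()[:16]


# Placeholder repo and issue number, for illustration only.
session_id = derive_session_id("example-org/example-repo", 42)
```

Because the ID depends only on the repo and issue, any process can recompute it after a reconnect without asking a collector or the relay.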
One session groups many traces. Each trace is one agent's burst.
Phoenix's Agent Graph view consumes OpenInference's explicit graph attributes, not span links. The relay synthesizes them from the A2A peer relationship and attaches them to each root span:
- graph.node.id - the acting agent's identity for graph rendering. Default: <agent_id>. Use <agent_id>:<task_id> if you need per-task nodes.
- graph.node.parent_id - the upstream agent in the handoff chain (set on B's a2a.task root when triggered by A; unset on the originating burst).
Span links can still be emitted as a secondary signal for OTel-native consumers that read them (Tempo, Jaeger). They are not load-bearing for Phoenix and are treated as informational only.
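As an illustration of how the two graph attributes could be synthesized from a peer relationship, here is a small sketch. The helper name graph_attributes and its parameters are hypothetical; the sketch only builds the attribute dictionary and does not touch any tracing library.

```python
from typing import Dict, Optional


def graph_attributes(
    agent_id: str,
    upstream_agent_id: Optional[str] = None,
    task_id: Optional[str] = None,
) -> Dict[str, str]:
    """Build graph.node.* attributes for a root span."""
    # Default node id is the agent id; per-task nodes use "<agent_id>:<task_id>".
    node_id = f"{agent_id}:{task_id}" if task_id else agent_id
    attrs = {"graph.node.id": node_id}
    # The originating burst has no upstream agent, so parent_id stays unset.
    if upstream_agent_id is not None:
        attrs["graph.node.parent_id"] = upstream_agent_id
    return attrs


originating = graph_attributes("A")
handoff = graph_attributes("B", upstream_agent_id="A")
```

Here originating carries only graph.node.id, while handoff also names A as its parent, which matches the A-to-B handoff described above.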
Three traces, one session. Task ID T is A2A-native, minted by the relay when message/stream is called.
Root span a2a.client.send:
name: a2a.client.send
openinference.span.kind: AGENT
session.id: <S>
agent.id: A
agent.name: <from A's Agent Card>
graph.node.id: A
peer.agent.id: B
o2r.method: message/stream
o2r.task.id: T
rpc.system: jsonrpc
rpc.service: a2a
rpc.method: message/stream
Child span a2a.message.send for the actual outgoing payload:
name: a2a.message.send
openinference.span.kind: LLM # if the message is model output
# else AGENT
agent.id: A
input.value: <message.parts as JSON>
input.mime_type: application/json
Trace ends when the JSON-RPC call returns the task handle. Streaming continues in Trace 2.
Root span a2a.task. Lifetime = task lifetime.
name: a2a.task
openinference.span.kind: AGENT
session.id: <S>
agent.id: B
agent.name: <from B's Agent Card>
graph.node.id: B
graph.node.parent_id: A
o2r.task.id: T
o2r.task.state: working # updated to terminal state at span end
State changes are span events, not separate spans:
event: o2r.task.state_change { from: submitted, to: working, ts: ... }
event: o2r.task.state_change { from: working, to: completed, ts: ... }
Streaming chunks. Each SSE frame from B back to A is a span event on the task span:
event: a2a.message.stream_chunk
{ seq: 0, message.role: agent, parts: [...], final: false }
event: a2a.message.stream_chunk
{ seq: 1, ..., final: false }
event: a2a.message.stream_chunk
{ seq: 2, ..., final: true }
Rationale for events-not-spans on chunks: a 200-token streaming response is 200 spans of structural noise that obscure the Agent Graph. Events keep the chunk timeline queryable without polluting the trace tree. Phoenix renders span events inline.
The terminal completion message is upgraded to a child span so it shows up as a node in the trace tree and Phoenix's message timeline:
name: a2a.message.send
openinference.span.kind: LLM
parent: a2a.task
agent.id: B
output.value: <final message.parts as JSON>
output.mime_type: application/json
Span ends. o2r.task.state is set to terminal at end. Status code OK on completed, ERROR on failed or canceled.
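The state-change events and the end-of-span status rule can be modeled with plain dictionaries. The following is a sketch under stated assumptions: the helper names are hypothetical, events are represented as dicts rather than real OTel span events, and the terminal state set is limited to the three states named in the text.

```python
from typing import Any, Dict

# Terminal states named in the text above.
TERMINAL_STATES = ("completed", "failed", "canceled")


def state_change_event(from_state: str, to_state: str, ts: float) -> Dict[str, Any]:
    """Build one o2r.task.state_change span event as a plain dict."""
    return {
        "name": "o2r.task.state_change",
        "attributes": {"from": from_state, "to": to_state, "ts": ts},
    }


def status_for(terminal_state: str) -> str:
    """Map a terminal task state to the span status code."""
    if terminal_state not in TERMINAL_STATES:
        raise ValueError(f"not a terminal state: {terminal_state!r}")
    # OK on completed, ERROR on failed or canceled.
    return "OK" if terminal_state == "completed" else "ERROR"


events = [
    state_change_event("submitted", "working", ts=0.0),
    state_change_event("working", "completed", ts=1.5),
]
final_status = status_for(events[-1]["attributes"]["to"])
```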
Root span a2a.client.recv:
name: a2a.client.recv
openinference.span.kind: AGENT
session.id: <S>
agent.id: A
graph.node.id: A
graph.node.parent_id: B
o2r.method: tasks/get # or implicit on stream-close
o2r.task.id: T
A's downstream processing of the result lives as children here. Out of scope for the relay.
- Agent identity = span-level attributes. Every span the relay emits on agent X's behalf carries agent.id, agent.name, agent.role, and the graph attributes redundantly. Phoenix does not surface Resource attributes, so they cannot live there. The relay's own service.name is fine on the Resource because it never needs to be queryable per-span.
- agent.role is the broad role in the agent topology (relay, orchestrator, planner, validator, worker, deployer). Consumers may add a parallel agent.specialization for narrower per-worker analysis (designer, curator, science_writer, ...). The relay never inspects either; consumers query against them.
- Sessions ride on using_session(context_id) plus a redundant session.id span attribute. The redundant attribute makes Phoenix's Sessions tab work without OpenInference instrumentation in the loop; the context manager propagates the session to anything OpenInference-instrumented that runs inside.
- user.id rides on using_user(sender_id) plus a redundant user.id attribute. Phoenix's per-user filters and the User column in the Sessions tab consume it.
- Task = one root span, lifetime = task lifetime, state in events. Not one trace per task. Multiple tasks in the same session share session.id.
- Messages = mixed. Content-bearing initial sends and final completions are spans (LLM kind). Streaming chunks and protocol-level state pings are span events. Rule of thumb: if it would render as a node in the trace tree, it is a span. If it would render as a tick on a timeline, it is an event.
- Streaming and sync share span shape. Sync is the degenerate case where the task span opens, fires one stream_chunk event with final: true, ends with a child completion span. The relay does not branch on sync vs stream at the span layer.
- Topology = graph.node.* attributes, not span links. Span links may be emitted alongside but are not load-bearing.
- Failure classification = o2r.relay.failure_class on any erroring relay span. Stable, machine-readable bucket. Pairs with the relay_failure_class annotation config in Phoenix.
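To make the "streaming and sync share span shape" rule concrete, the sketch below builds the stream_chunk events for a response from a list of chunks. A sync reply is simply a one-element list. The function name and the dict representation of events are assumptions for illustration, not the relay's implementation.

```python
from typing import Any, Dict, List


def stream_chunk_events(chunks: List[List[Dict[str, Any]]]) -> List[Dict[str, Any]]:
    """Turn an ordered list of message parts into stream_chunk span events."""
    last = len(chunks) - 1
    events = []
    for seq, parts in enumerate(chunks):
        events.append(
            {
                "name": "a2a.message.stream_chunk",
                "attributes": {
                    "seq": seq,
                    "message.role": "agent",
                    "parts": parts,
                    # Only the last chunk is marked final.
                    "final": seq == last,
                },
            }
        )
    return events


# Streaming: three frames. Sync: the degenerate single-frame case.
streamed = stream_chunk_events([[{"text": "a"}], [{"text": "b"}], [{"text": "c"}]])
sync = stream_chunk_events([[{"text": "done"}]])
```

The same code path serves both cases, which is the point of not branching on sync vs stream at the span layer.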
The relay is consumer-agnostic. Anything reading or writing o2r-shaped spans is a "consumer," and the relay never knows or hardcodes which one. To stand up an OTel tracer that emits into the right Phoenix project with the right resource attributes, a consumer calls otel_a2a_relay.tracing.bootstrap(...) once per process.
from otel_a2a_relay.tracing import bootstrap

tracer = bootstrap(
    namespace="frob",        # required - logical system name (OTel service.namespace)
    deployment="acme",       # required - logical install identifier
    role="planner",          # required - this process's role (OTel service.name)
    deployment_env="prod",   # optional
    version="1.2.3",         # optional
    git_commit="deadbeef",   # optional
    extra_resource={...},    # optional - merged in last
)

Returns a configured Tracer. Side effects:
- Sets PHOENIX_PROJECT_NAME env var (if not already set) to the slugified <deployment>. The slug rule is: lowercase, [a-z0-9-], collapse separators. Phoenix's exporter reads the env var.
- Resource attributes record service.namespace=<namespace>, service.name=<role>, <namespace>.deployment=<deployment>, plus any optional fields and extra_resource. The relay never inspects or special-cases the namespace.
- When the caller passes emit_readme_span=True, emits one tracing.session.start span with a readme attribute (namespace=<x> deployment=<y> role=<r> version=<v>). Off by default. The smoke span has zero IO and zero session context, so a real flow's project list stays free of it. Tests and one-off probes opt in.
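The slug rule and the "if not already set" behavior from the first side effect could look roughly like this. It is a sketch only: slugify and set_project_name are hypothetical names, and the exact treatment of separators (any run of characters outside [a-z0-9] collapses to one hyphen, with leading and trailing hyphens trimmed) is an assumption about what "collapse separators" means.

```python
import os
import re
from typing import MutableMapping, Optional


def slugify(deployment: str) -> str:
    """Lowercase, keep [a-z0-9-], collapse separator runs into one hyphen."""
    # Assumption: every run of disallowed characters becomes a single "-".
    return re.sub(r"[^a-z0-9]+", "-", deployment.lower()).strip("-")


def set_project_name(
    deployment: str, environ: Optional[MutableMapping[str, str]] = None
) -> str:
    """Set PHOENIX_PROJECT_NAME unless the caller already set it."""
    env = os.environ if environ is None else environ
    # setdefault leaves an existing value untouched.
    env.setdefault("PHOENIX_PROJECT_NAME", slugify(deployment))
    return env["PHOENIX_PROJECT_NAME"]


fresh = set_project_name("Acme  Prod_East", environ={})
preset = set_project_name("Acme", environ={"PHOENIX_PROJECT_NAME": "keep-me"})
```

Passing a dict as environ keeps the sketch free of side effects on the real process environment.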
What's out of scope:
- Auto-detecting consumer identity from env / git / process name. The bootstrap requires explicit args so the consumer is the source of truth.
- Multi-tenant exporter routing. One process, one Phoenix project.
- Helpers for setting consumer-flow attributes (step, task_id, kind.in, kind.out, role, graph.node.*). Those live in the consumer's namespace and the consumer sets them on its own spans.
Run against Phoenix platform v15.4.0 via phoenix serve + the harness/ posting the worked example. What the harness confirmed:
- session.id flows correctly. Sessions tab groups traces as expected.
- openinference.span.kind drives the rendered span kind (AGENT / LLM).
- Span events render under their parent span without polluting the trace tree.
- Three-traces-per-session topology works, multiple AGENT-kind roots under one session.
- Span event attributes (state changes, stream-chunk seq + final) all preserved.
What the harness rewrote in v0:
- Resource attributes are not exposed by Phoenix. Agent identity must live on spans, not on the Resource. v0 specified the opposite. v0.1 fixes it. (v0.2 still puts caller identity on the Resource via bootstrap() because non-Phoenix OTel consumers do read it; the relay's own span emission keeps the redundant span-level identity for Phoenix.)
- Span links are not exposed by Phoenix. v0 used links for cross-trace topology. v0.1 uses graph.node.id / graph.node.parent_id instead.
What v0.2 changed:
- Wire-protocol attributes are now o2r.*, not a2a.*. The renamed keys describe this protocol's mechanics, not the consumer's flow. Specifically: o2r.task.id, o2r.task.state, o2r.task.state_change (event), o2r.message.text, o2r.message.reply_text, o2r.peer.target, o2r.relay.mode, o2r.relay.reject_reason, o2r.method. Span names (a2a.task, a2a.client.send, a2a.relay.forward, ...) keep the a2a.* prefix because they label A2A wire events.
- tracing.bootstrap() entrypoint added. See "Tracing bootstrap" above.
What v0.3 changed:
- agent.role is mandatory on every relay-emitted span: relay on the relay's own spans, plus the registered role of the sender on o2r.peer.sender_role and of the target on o2r.peer.target_role.
- o2r.relay.failure_class rides on every erroring relay span. Stable label set: topology_violation, peer_disconnect, peer_404, timeout, peer_jsonrpc_error, unknown.
- Sessions and users propagate via OpenInference's context managers. Every relay-emitted span sits inside using_session(context_id) and using_user(sender_id) so OpenInference auto-instrumentation picks them up without each handler restating them. The redundant session.id and user.id span attributes remain so Phoenix queries against them work without OpenInference in the loop.
- tracing.session.start smoke span is opt-in. Default behavior of bootstrap() no longer emits it.
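One way a consumer might keep o2r.relay.failure_class inside the stable label set is to normalize any candidate label before attaching it. This is a hypothetical helper, not something the relay is documented to ship; falling back to unknown for unrecognized labels is an assumption about how the catch-all bucket is meant to be used.

```python
# Stable label set from the v0.3 notes above.
FAILURE_CLASSES = frozenset(
    {
        "topology_violation",
        "peer_disconnect",
        "peer_404",
        "timeout",
        "peer_jsonrpc_error",
        "unknown",
    }
)


def normalize_failure_class(label: str) -> str:
    """Return a label from the stable set, defaulting to 'unknown'."""
    candidate = label.strip().lower()
    # Assumption: anything outside the stable set is bucketed as "unknown".
    return candidate if candidate in FAILURE_CLASSES else "unknown"


span_attributes = {"o2r.relay.failure_class": normalize_failure_class("Timeout")}
```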
If a future Phoenix release changes any of these behaviors, the spec gets revised again.
The canonical set of span attributes this protocol uses. This YAML block is parsed by scripts/emit_protocol_artifacts.py, which writes docs/generated/o2r-attributes.schema.json (JSON Schema) and docs/generated/o2r-semconv.yaml (OTel semantic-conventions shape). Doc is the source of truth; the generated files are committed for downstream tools that want a machine artifact.
# o2r-attributes
attributes:
  - id: agent.id
    type: string
    requirement: required
    brief: Stable per-agent identifier inside the deployment.
  - id: agent.name
    type: string
    requirement: required
    brief: Human-readable agent name.
  - id: agent.role
    type: string
    requirement: required
    brief: Broad role in the topology.
    enum: [relay, orchestrator, planner, validator, worker, deployer]
  - id: agent.specialization
    type: string
    requirement: optional
    brief: Narrower per-agent specialization. Consumer-defined.
  - id: session.id
    type: string
    requirement: required
    brief: Deterministic session identifier. sha256(repo:issue)[:16] for GitHub-rooted channels.
  - id: user.id
    type: string
    requirement: recommended
    brief: Sender identity. Propagated via OpenInference using_user().
  - id: graph.node.id
    type: string
    requirement: required
    brief: Topology node identifier for cross-trace graph rendering.
  - id: graph.node.parent_id
    type: string
    requirement: recommended
    brief: Parent node id in the topology graph.
  - id: o2r.task.id
    type: string
    requirement: required
    brief: Task identifier within the session.
  - id: o2r.task.state
    type: string
    requirement: required
    brief: Current task state.
  - id: o2r.task.state_change
    type: string
    requirement: optional
    brief: State transition recorded as a span event attribute.
  - id: o2r.message.text
    type: string
    requirement: optional
    brief: Content of the originating message.
  - id: o2r.message.reply_text
    type: string
    requirement: optional
    brief: Content of the completion reply.
  - id: o2r.peer.target
    type: string
    requirement: required
    brief: Target peer id for a relay forward.
  - id: o2r.peer.sender_role
    type: string
    requirement: required
    brief: Registered role of the sending peer.
  - id: o2r.peer.target_role
    type: string
    requirement: required
    brief: Registered role of the target peer.
  - id: o2r.relay.mode
    type: string
    requirement: optional
    brief: Relay routing mode in effect for this span.
  - id: o2r.relay.reject_reason
    type: string
    requirement: optional
    brief: Reason a relay rejected a message.
  - id: o2r.relay.failure_class
    type: string
    requirement: required
    brief: Coarse failure class on any erroring relay span.
    enum: [topology_violation, peer_disconnect, peer_404, timeout, peer_jsonrpc_error, unknown]
  - id: o2r.method
    type: string
    requirement: required
    brief: A2A JSON-RPC method name driving this span.

coily channel <verb> lives in the coily repo and talks A2A to a relay. Verbs: send, stream, tasks-get, tasks-cancel, view. Inherits the audit + gate wrapper pattern. The relay itself ships only a serve mode.
- v0.1 protocol doc + harness updated and re-validated against Phoenix. (This commit.)
- Implement the relay around the verified shape.
- coily channel ships after the relay has a serve mode reachable on the homelab.
Some endpoints on the relay are operational, not protocol. They affect routing decisions but do not change the wire shape between agents, and they do not change span shape. They live outside this doc and below the sequencing gate. Adding or changing them does not require a harness re-run.
Currently in this category:
- POST /peers - register a peer dynamically with {id, url, role}. Role is one of orchestrator | planner | validator | worker | deployer. Used by transient processes (a worker that boots, registers, takes one task, exits, deregisters) so the relay does not need a static env-var registry for choreographies that spin agents up and down.
- DELETE /peers/{id} - deregister.
- Star-topology enforcement - toggleable via OTEL_A2A_RELAY_STAR_ENFORCE=1. When on, message/send is rejected with JSON-RPC error -32010 if neither sender nor target carries the orchestrator role; an a2a.relay.reject span is emitted instead of a2a.task. Peers without registered roles are not enforced (preserves legacy A/B dogfood behavior). The enforcement is a property of the deployment (which choreography is running), not of the A2A spec.
Both are exercised by the examples/luca-flow/ demo. The wire format and span shape they produce are unchanged from v0.1.
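The star-topology rule described above reduces to a small decision function. The sketch below is illustrative only: star_check is a hypothetical name, roles are passed in directly rather than looked up in a peer registry, and it assumes that a missing role on either peer is enough to skip enforcement.

```python
from typing import Optional

# JSON-RPC error code named in the text for a star-topology rejection.
STAR_VIOLATION_CODE = -32010


def star_check(
    sender_role: Optional[str], target_role: Optional[str], enforce: bool
) -> Optional[int]:
    """Return the JSON-RPC error code to reject with, or None to allow."""
    if not enforce:
        return None
    # Assumption: an unregistered role on either side disables enforcement.
    if sender_role is None or target_role is None:
        return None
    # At least one end of the exchange must be the orchestrator.
    if "orchestrator" in (sender_role, target_role):
        return None
    return STAR_VIOLATION_CODE


worker_to_worker = star_check("worker", "validator", enforce=True)
via_orchestrator = star_check("orchestrator", "worker", enforce=True)
```

In a real deployment the enforce flag would presumably be read from OTEL_A2A_RELAY_STAR_ENFORCE, and a non-None result would correspond to emitting a2a.relay.reject instead of a2a.task.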