Lesson 08 · Effect AgentCore

Observability: tracing the agent

This closes the operate pillar. Effect has tracing built in — every Effect can be a span. We already saw it for free: the lesson-02 server logged an http.span line for every request. In this lesson we enrich the turn with our own spans (the model call, memory reads, each tool), export them over OpenTelemetry, connect them across services into one trace, and make them visible in the CloudWatch GenAI Observability dashboard — which, it turns out, takes several things lining up, not one.

01 Operate: you can't run what you can't see

The course has three pillars: local (lessons 02–04, the contract-complete server), deploy (lessons 05–07, into AgentCore Runtime with memory), and operate — the one we close here. A production agent turn is not a single function call. It touches the model, and it touches one or more tools. When a turn is slow, or returns something wrong, the only useful question is where: was it the model, or a tool?

Logs alone won't answer that. A flat stream of log lines tells you a turn happened; it doesn't tell you that the model took 1.8 s while the search tool took 30 ms. To operate an agent you need the turn decomposed into timed, nested sub-steps — a trace. The good news is that Effect already gives us most of this.

src/effect-agentcore/src/agent.ts · operate semantics from AWS Bedrock AgentCore docs

02 Tracing is already on

Go back to the lesson-02 server and read its log carefully. Effect's runtime instruments HTTP handling automatically — every request that flows through the HttpApi server becomes a span, emitted as an http.span log line with its duration and request attributes:

[INFO] Listening on http://0.0.0.0:8080
[INFO] http.span=2ms  Sent HTTP response { 'http.method': 'GET',  'http.url': '/ping',        'http.status': 200 }
[INFO] http.span=9ms  Sent HTTP response { 'http.method': 'POST', 'http.url': '/invocations', 'http.status': 200 }

We wrote no tracing code to get this. The http.span=2ms and http.span=9ms measurements, the http.method / http.url / http.status attributes — all of that is the runtime turning each request handler into a span on its own. Observability is on by default. Our job in this lesson is not to switch it on; it is to enrich it (add spans for the steps the framework can't see inside) and export it (send the spans somewhere we can browse them).

Why this matters

The POST /invocations span already exists and already measures the whole turn. But it is a single 9 ms bar — it can't tell you how much of that was the model and how much was the tool, because the framework doesn't know those sub-steps exist. Everything below is about giving the trace that detail.

real output from running src/agent.ts · Effect tracing surfaces this as http.span (effect docs)

03 Spans we add, and the attributes the dashboard reads

effect/unstable/ai's LanguageModel.generateText opens its own span and threads it through ProviderOptions.span, so a generation node appears in the tree for free. But "appears in the tree" is not enough for the GenAI dashboard: its built-in query filters on specific attributes and span kinds. So we add a small set of explicit spans that carry exactly what the dashboard reads.

First, the turn's root span — and the one attribute that makes Sessions work. The dashboard groups traces by session.id; the Python auto-instrument path lifts it from a request header, but we're a BYO Node container, so we stamp it ourselves from the body sessionId:

.handle("invoke", (ctx) =>
  Effect.gen(function* () {
    /* read memory · respond · append */
  }).pipe(
    Effect.withSpan("agent.invoke", {
      kind: "server",
      attributes: ctx.payload.sessionId
        ? { "session.id": ctx.payload.sessionId }
        : {},
    }),
  ),
);

Second, the model call as a CLIENT span carrying token usage. This matters more than it looks: the dashboard's Total tokens column sums gen_ai.usage.*_tokens only over spans whose kind is "CLIENT". The AI module's own generation span is internal, and our Bedrock provider doesn't surface usage, so we wrap the Converse call (the actual outbound request) in a CLIENT span and annotate it from res.usage:

bedrock.converse(toConverse(model, prompt, tools)).pipe(
  Effect.tap((res) =>
    Effect.annotateCurrentSpan({
      "gen_ai.system": "aws.bedrock",
      "gen_ai.operation.name": "chat",
      "gen_ai.request.model": model,
      "gen_ai.usage.input_tokens": res.usage.inputTokens,
      "gen_ai.usage.output_tokens": res.usage.outputTokens,
      "gen_ai.usage.total_tokens": res.usage.totalTokens,
    }),
  ),
  Effect.map(fromConverse),
  Effect.withSpan("bedrock.converse", { kind: "client" }),
);
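To make the span-kind rule concrete, here is a minimal sketch of a filter-then-sum over span kinds. The SpanRecord shape and the sumOverClientSpans helper are hypothetical names invented for this illustration; the dashboard's real query runs inside CloudWatch and is not reproduced here.

```typescript
// Hypothetical span shape for illustration; this is not the OTLP wire format.
type SpanKind = "internal" | "server" | "client";

interface SpanRecord {
  readonly name: string;
  readonly kind: SpanKind;
  readonly attributes: Readonly<Record<string, string | number>>;
}

// Sum one numeric attribute, counting only spans whose kind is "client".
function sumOverClientSpans(spans: readonly SpanRecord[], key: string): number {
  let total = 0;
  for (const span of spans) {
    if (span.kind !== "client") continue; // internal and server spans are skipped
    const value = span.attributes[key];
    if (typeof value === "number") total += value;
  }
  return total;
}

const recorded: readonly SpanRecord[] = [
  // Even if an internal span carried usage, a CLIENT-only filter would ignore it.
  { name: "LanguageModel.generateText", kind: "internal", attributes: { "gen_ai.usage.total_tokens": 842 } },
  { name: "bedrock.converse", kind: "client", attributes: { "gen_ai.usage.total_tokens": 842 } },
  { name: "memory.read", kind: "client", attributes: { "session.id": "s-1" } },
];

const clientTotal = sumOverClientSpans(recorded, "gen_ai.usage.total_tokens");
console.log(clientTotal);
```

Under this model, only the bedrock.converse span contributes, which is why the usage attributes have to land on the CLIENT span rather than on the AI module's internal one.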

Third, the AWS resource calls the framework can't see inside. Memory retrieval and the gateway tool hop each get a CLIENT span, so a turn's trace shows where the time went — including the round-trip to the Memory data plane:

agentcore.listEvents({ memoryId, sessionId, actorId, includePayloads: true }).pipe(
  /* … sort + flatten to turns … */
  Effect.withSpan("memory.read", {
    kind: "client",
    attributes: {
      "aws.service": "bedrock-agentcore",
      "gen_ai.operation.name": "memory.list_events",
      "session.id": sessionId,
    },
  }),
);

Each withSpan opens a child of whatever span is active, so the tree composes itself: agent.invoke under the framework's request span, then bedrock.converse, memory.read/memory.append, and gateway.searchExplainers beneath it. Effect.annotateCurrentSpan adds attributes to the span already in progress without opening a new one — which is how the token counts land on the CLIENT span after the response comes back.

The tool's far side gets a span too. The search Lambda behind the Gateway (lesson 06) wraps its work in a search.lambda SERVER span and exports it the same way — its own service.name (effect-docs-search), its own dashboard row. One Lambda wrinkle: the export must flush before the handler returns, because a Lambda freezes the instant it resolves. We get that for free by providing the tracer to the handler's Effect (Effect.provide(Tracing)): the layer is built, the span runs, and releasing the layer on completion flushes the batch — no background timer to miss.

spans in src/agent.ts · src/model.ts · src/memory.ts · attribute names from the OTEL GenAI semantic conventions · the dashboard's query reads resource.attributes.aws.service.type, session.id, kind="CLIENT", gen_ai.usage.*

04 Marking ≠ exporting: install the tracer

Here is the trap that makes a dashboard read 0 traces while the code looks complete. Effect.withSpan only marks a span; if no Tracer is installed, the framework's default tracer is a no-op and every span is dropped on the floor. The http.span log lines from section 02 are the logger printing, not an exporter shipping — they go nowhere over the wire. To export, you install a real tracer.

Effect 4 ships one inside the effect package itself — effect/unstable/observability — so there's no new dependency to bundle. But where does it send? This is the part the docs bury, and getting it wrong is what keeps the dashboard at 0. AgentCore runs no OTEL collector inside a bring-your-own container, and it injects no OTLP endpoint — that magic is exclusive to the Python auto-instrument path. A non-Python container posts OTLP/protobuf directly to the regional X-Ray endpoint, https://xray.<region>.amazonaws.com/v1/traces, SigV4-signed for service xray — the same signing we already do for the Gateway's MCP calls in lesson 06. X-Ray + Transaction Search then index the spans into the aws/spans log group the dashboard reads.

import * as OtlpTracer from "effect/unstable/observability/OtlpTracer";
import * as OtlpSerialization from "effect/unstable/observability/OtlpSerialization";
import { HttpClient, HttpClientRequest } from "effect/unstable/http";
import { AwsClient } from "aws4fetch";

// Decorate the HTTP client: SigV4-sign every request for `xray` before it goes out.
const SigningHttpClient = Layer.effect(HttpClient.HttpClient, Effect.gen(function* () {
  const base = yield* HttpClient.HttpClient;          // the undici client
  const resolveCreds = yield* Credentials.Credentials; // rotating role creds
  const region = yield* Config.string("AWS_REGION").pipe(Effect.orDie);
  return HttpClient.mapRequestEffect(base, (req) => Effect.gen(function* () {
    const creds = yield* resolveCreds;
    const aws = new AwsClient({ ...creds, service: "xray", region });
    const body = req.body._tag === "Uint8Array" ? req.body.body : new Uint8Array();
    const signed = yield* Effect.promise(() => aws.sign(req.url, { method: req.method, body }));
    return HttpClientRequest.setHeaders(req, headersOf(signed));
  }).pipe(Effect.orDie));
})).pipe(Layer.provide(NodeHttpClient.layerUndici), Layer.provide(Credentials.fromChain()));

const tracerLayer = (url: string, serviceName: string) =>
  OtlpTracer.layer({
    url,
    // serviceName is REQUIRED: without it OtlpTracer reads OTEL_SERVICE_NAME, and a missing
    // value raises a ConfigError that fails the layer build, so nothing is exported.
    // aws.service.type=gen_ai_agent is the attribute the dashboard FILTERS on.
    resource: { serviceName, attributes: { "aws.service.type": "gen_ai_agent" } },
  }).pipe(Layer.provide(OtlpSerialization.layerProtobuf), Layer.provide(SigningHttpClient));

The endpoint is chosen by Config the same way the model (lesson 03) and memory (lesson 07) backings are: when AGENT_OBSERVABILITY_ENABLED=true (set on the deployed runtime) and a region is known, default to the X-Ray URL; an explicit OTEL_EXPORTER_OTLP_ENDPOINT overrides it; locally (tests, pnpm dev) neither is set, so Layer.empty leaves the no-op tracer in place and spans are simply dropped. Tracer is a Context.Reference, so providing this layer on the server overrides the default for every span in a served turn:

const ServerLive = AgentApiLayer.pipe(
  HttpRouter.serve,
  Layer.provide(SearchLive),
  Layer.provide(TracingLive),      // ← installs the (signing) OTLP tracer
  Layer.provide(NodeHttpServer.layer(createServer, { port: 8080, host: "0.0.0.0" })),
);
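The endpoint choice described before the ServerLive wiring can be sketched as a pure function. This is an illustration only: TracingEnv and resolveOtlpEndpoint are hypothetical names, the environment is passed in as a plain object rather than read through Config, and treating an explicit endpoint as sufficient on its own (without the enabled flag) is an assumption.

```typescript
// Hypothetical helper; the course reads these values through Config instead.
interface TracingEnv {
  readonly AGENT_OBSERVABILITY_ENABLED?: string;
  readonly AWS_REGION?: string;
  readonly OTEL_EXPORTER_OTLP_ENDPOINT?: string;
}

// Returns the OTLP traces URL to export to, or undefined to keep the no-op tracer.
function resolveOtlpEndpoint(env: TracingEnv): string | undefined {
  // Assumption: an explicit endpoint always wins, even without the enabled flag.
  if (env.OTEL_EXPORTER_OTLP_ENDPOINT) return env.OTEL_EXPORTER_OTLP_ENDPOINT;
  if (env.AGENT_OBSERVABILITY_ENABLED === "true" && env.AWS_REGION) {
    return `https://xray.${env.AWS_REGION}.amazonaws.com/v1/traces`;
  }
  return undefined; // local runs: nothing is set, spans are dropped
}

const local = resolveOtlpEndpoint({});
const deployed = resolveOtlpEndpoint({ AGENT_OBSERVABILITY_ENABLED: "true", AWS_REGION: "us-east-1" });
const overridden = resolveOtlpEndpoint({
  AGENT_OBSERVABILITY_ENABLED: "true",
  AWS_REGION: "us-east-1",
  OTEL_EXPORTER_OTLP_ENDPOINT: "http://localhost:4318/v1/traces",
});
console.log(local, deployed, overridden);
```

An undefined result corresponds to the Layer.empty branch; any string would be handed to tracerLayer as its url.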

Two deliberate choices. We export traces only (OtlpTracer), not the combined logs+metrics layer, because X-Ray's intake is a traces endpoint. And the export needs the execution role to allow xray:PutTraceSegments (granted in the stack, §07); without it the signed POST returns 403 and the dashboard stays empty even though the code is correct, the same failure-stays-loud lesson as the Memory grant in lesson 07.

src/observability.ts · effect/unstable/observability/OtlpTracer (ships in the effect package) · SigV4 via aws4fetch (same as src/gateway.ts) · X-Ray OTLP endpoint per the observability docs

05 Reading a turn as a trace

Put it together and one POST /invocations becomes one trace, grouped under its session.id, with a child span for memory, the model call (carrying tokens), and each tool — the shape below.

POST /invocations                                          span: http request (root)
└─ agent.invoke { session.id }                              kind=server   9ms
   ├─ memory.read { session.id }                            kind=client   12ms
   ├─ LanguageModel.generateText (AI module)                              1.8s
   │  └─ bedrock.converse { gen_ai.usage.total_tokens=842 } kind=client
   ├─ gateway.searchExplainers { tool.name }                kind=client   30ms
   ├─ LanguageModel.generateText (AI module)                              0.6s → text
   └─ memory.append { session.id }                          kind=client   15ms

Aha

A slow turn is no longer a mystery. The trace shows immediately whether the cost was Bedrock latency (the bedrock.converse bar dominates), the search tool, or the memory round-trip. Because bedrock.converse is kind=client with gen_ai.usage attributes, the same span feeds the dashboard's Total tokens metric — the trace tree and the token chart are the same data, not two pipelines. And the AiError from lesson 03 stops being a string lesson 02 quietly returned as a 200 — it becomes a span with an error status you can see and alert on.
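Reading the tree for "where did the time go" amounts to comparing sibling durations. A minimal sketch, using the child durations from the tree above; TimedSpan and slowestSpan are hypothetical names for this illustration, not part of the course code.

```typescript
// Hypothetical shape: one entry per child span of agent.invoke.
interface TimedSpan {
  readonly name: string;
  readonly durationMs: number;
}

// Return the child span that took longest, or undefined for an empty turn.
function slowestSpan(spans: readonly TimedSpan[]): TimedSpan | undefined {
  let slowest: TimedSpan | undefined;
  for (const span of spans) {
    if (slowest === undefined || span.durationMs > slowest.durationMs) slowest = span;
  }
  return slowest;
}

// Durations copied from the example tree.
const turn: readonly TimedSpan[] = [
  { name: "memory.read", durationMs: 12 },
  { name: "LanguageModel.generateText", durationMs: 1800 },
  { name: "gateway.searchExplainers", durationMs: 30 },
  { name: "LanguageModel.generateText", durationMs: 600 },
  { name: "memory.append", durationMs: 15 },
];

const culprit = slowestSpan(turn);
console.log(culprit?.name, culprit?.durationMs);
```

For this turn the first model call dominates, which is the "Bedrock latency" case the paragraph describes.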

spans in src/agent.ts / model.ts / memory.ts · generation spans from effect/unstable/ai · AiError from lesson 03

06 Connecting the trace across services

§05's tree is one process. But a real turn crosses the network: the agent calls the Gateway (which calls the search Lambda) and the Memory data plane. Each of those is a separate service that, left alone, starts its own trace with a fresh random id — and you get three disconnected traces instead of one. A span only joins an existing trace if the trace context travels with the request. This is the step that turns "I emit spans" into "I have a distributed trace," and it's the one most easily missed.

The mechanism is HTTP headers. When a span is active, effect's HttpClient automatically injects W3C traceparent (and b3) on every outbound request — gated by the TracerPropagationEnabled reference, on by default. The receiving service decodes that header into a parent span and continues the trace. That's why Memory connects for free: the data-plane call rides HttpClient.
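For reference, the W3C traceparent header is a single dash-separated string: a two-hex version, a 32-hex trace id, a 16-hex parent span id, and two-hex flags. The sketch below formats and parses version 00 by hand; TraceContext, formatTraceparent and parseTraceparent are hypothetical names for illustration, and effect's own propagation code is not shown or implied.

```typescript
// Illustrative trace context: ids are lowercase hex strings.
interface TraceContext {
  readonly traceId: string; // 32 hex chars
  readonly spanId: string;  // 16 hex chars
  readonly sampled: boolean;
}

// Build a version-00 traceparent value.
function formatTraceparent(ctx: TraceContext): string {
  return `00-${ctx.traceId}-${ctx.spanId}-${ctx.sampled ? "01" : "00"}`;
}

// Parse a version-00 traceparent value; returns undefined when malformed.
function parseTraceparent(header: string): TraceContext | undefined {
  const match = /^00-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$/.exec(header.trim());
  if (!match) return undefined;
  const traceId = match[1] ?? "";
  const spanId = match[2] ?? "";
  const flags = match[3] ?? "00";
  // All-zero ids are invalid per the Trace Context specification.
  if (/^0+$/.test(traceId) || /^0+$/.test(spanId)) return undefined;
  return { traceId, spanId, sampled: (parseInt(flags, 16) & 1) === 1 };
}

const incoming = "00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01";
const parsedContext = parseTraceparent(incoming);
console.log(parsedContext);
```

The receiving side only needs these three fields to create a child span in the caller's trace.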

The leak: a raw fetch drops the context

The Gateway's MCP call is SigV4-signed with aws4fetch over the global fetch (lesson 06) — it never touches effect's HttpClient, so nothing is injected and the Gateway starts a brand-new trace. The fix is to propagate by hand: read the active span and add AWS's X-Ray header (AgentCore honours both traceparent and X-Amzn-Trace-Id). The idiomatic alternative is to route the call through a signing HttpClient layer so propagation is automatic again — same DI the Memory path gets.

// aws4fetch is a raw fetch — propagate the active trace ourselves so the Gateway
// (and the Lambda it invokes) JOIN this trace instead of starting a fresh one.
Effect.currentSpan.pipe(
  Effect.catch(() => Effect.succeed(undefined)),
  Effect.flatMap((span) => Effect.tryPromise(async () => {
    const trace = span && `Root=1-${span.traceId.slice(0, 8)}-${span.traceId.slice(8)};Parent=${span.spanId};Sampled=1`;
    await client.fetch(url, { method: "POST", headers: { ...mcp, ...(trace && { "X-Amzn-Trace-Id": trace }) }, body });
  })),
);

Propagation has two sides. The other side is adopting the context that arrives, so a service nests under its caller rather than rooting a new trace. AgentCore hands the container an X-Amzn-Trace-Id request header and the Lambda an _X_AMZN_TRACE_ID env var; both parse to the same OTEL id (X-Ray 1-{8hex}-{24hex} → 32-hex) and become the parent via Effect.withParentSpan:

const adoptXrayParent = <A, E, R>(self: Effect.Effect<A, E, R>) => Effect.gen(function* () {
  const request = yield* HttpServerRequest.HttpServerRequest;
  const parent = xrayParentSpan(request.headers["x-amzn-trace-id"]); // → Tracer.externalSpan
  return yield* (parent ? Effect.withParentSpan(self, parent) : self);
});
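The id conversion mentioned above (X-Ray 1-{8hex}-{24hex} to a 32-hex OTEL trace id) can be sketched as plain string handling. This is an assumption about what a helper like xrayParentSpan has to do internally, not its actual source; XrayParent, parseXrayHeader and toXrayHeader are hypothetical names, and the step that turns the result into a Tracer.externalSpan is left out.

```typescript
// Illustrative result of decoding an X-Amzn-Trace-Id value.
interface XrayParent {
  readonly traceId: string; // 32-hex OTEL trace id
  readonly spanId: string;  // 16-hex parent span id
  readonly sampled: boolean;
}

// Parse "Root=1-{8hex}-{24hex};Parent={16hex};Sampled=1"; unknown fields are ignored.
function parseXrayHeader(header: string | undefined): XrayParent | undefined {
  if (!header) return undefined;
  const fields = new Map<string, string>();
  for (const part of header.split(";")) {
    const eq = part.indexOf("=");
    if (eq > 0) fields.set(part.slice(0, eq).trim(), part.slice(eq + 1).trim());
  }
  const root = /^1-([0-9a-fA-F]{8})-([0-9a-fA-F]{24})$/.exec(fields.get("Root") ?? "");
  const parent = fields.get("Parent") ?? "";
  if (!root || !/^[0-9a-fA-F]{16}$/.test(parent)) return undefined;
  return {
    traceId: `${root[1] ?? ""}${root[2] ?? ""}`.toLowerCase(), // 8 + 24 = 32 hex chars
    spanId: parent.toLowerCase(),
    sampled: fields.get("Sampled") === "1",
  };
}

// Inverse direction: split the 32-hex id back into X-Ray's 8/24 layout.
function toXrayHeader(ctx: XrayParent): string {
  return `Root=1-${ctx.traceId.slice(0, 8)}-${ctx.traceId.slice(8)};Parent=${ctx.spanId};Sampled=${ctx.sampled ? "1" : "0"}`;
}

const xrayHeader = "Root=1-5759e988-bd862e3fe1be46a994272793;Parent=53995c3f42cd8ad8;Sampled=1";
const adopted = parseXrayHeader(xrayHeader);
console.log(adopted);
```

Because the request header and the Lambda's _X_AMZN_TRACE_ID variable share this field layout, one parser can serve both entry points.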

One last piece: an intermediary's own spans must be delivered too, or the tree breaks at it. The Gateway sits between our CLIENT span and the Lambda — it creates its own InvokeTool segment and parents the Lambda under that. Without delivering the Gateway's spans (§07), the Lambda's parent points at a span that isn't in aws/spans, so it shows up orphaned. With it delivered, the whole tree resolves:

agent.invoke { session.id }                                   kind=server
├─ memory.read → http.client → ListEvents                     (Memory service)
├─ LanguageModel.generateText
│  ├─ bedrock.converse { gen_ai.usage.total_tokens }          kind=client
│  └─ gateway.searchExplainers                                kind=client
│     ├─ AgentCore.Gateway.Initialize / ListTools …           (Gateway service)
│     └─ AgentCore.Gateway.InvokeTool                         (Gateway service)
│        └─ search.lambda                                     (effect-docs-search)
└─ memory.append → http.client ×2 → CreateEvent ×2            (Memory service)

Aha

Three services — agent, Gateway+Lambda, Memory — render as one trace, and the boundary you have to instrument by hand is exactly the one that doesn't go through effect's HttpClient. Propagation isn't automatic across an opaque AWS-managed hop: you inject context going out, adopt it coming in, and deliver the intermediary's own spans so nothing dangles.
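Why an undelivered intermediary breaks the tree can be shown with a toy model: a span is orphaned when its parent id is not among the stored spans. StoredSpan, findOrphans and the short span ids below are all hypothetical, chosen only to mirror the Gateway case from this section.

```typescript
// Hypothetical stored-span shape; real spans in aws/spans carry many more fields.
interface StoredSpan {
  readonly spanId: string;
  readonly parentSpanId?: string;
  readonly name: string;
}

// A span is an orphan when it names a parent that was never delivered.
function findOrphans(spans: readonly StoredSpan[]): StoredSpan[] {
  const known = new Set(spans.map((span) => span.spanId));
  return spans.filter(
    (span) => span.parentSpanId !== undefined && !known.has(span.parentSpanId),
  );
}

// Without the Gateway's own span, the Lambda's parent (g1) is missing.
const withoutGateway: readonly StoredSpan[] = [
  { spanId: "a1", name: "agent.invoke" },
  { spanId: "c1", parentSpanId: "a1", name: "gateway.searchExplainers" },
  { spanId: "l1", parentSpanId: "g1", name: "search.lambda" },
];

// Delivering the Gateway's InvokeTool span closes the gap.
const withGateway: readonly StoredSpan[] = [
  ...withoutGateway,
  { spanId: "g1", parentSpanId: "c1", name: "AgentCore.Gateway.InvokeTool" },
];

console.log(findOrphans(withoutGateway).map((span) => span.name));
console.log(findOrphans(withGateway).map((span) => span.name));
```

The first call reports search.lambda as dangling; the second reports nothing, which is the unbroken tree.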

propagation in src/gateway.ts · adoption (xrayParentSpan + adoptXrayParent) in Tracing.ts / agent.ts · auto-propagation: HttpClient + TracerPropagationEnabled (effect) · trace headers per the observability docs

07 The third leg: making spans visible (IaC)

Emitting OTLP isn't enough either. The dashboard reads a CloudWatch log group called aws/spans, and that group is only populated once CloudWatch Transaction Search is turned on — a one-time, account-global setting. Skip it and you get the exact symptom that opens this lesson: code emits spans correctly, dashboard shows 0/0. So the three legs are: (1) emit OTLP from the container (section 04), (2) enable Transaction Search at the account, (3) deliver each resource's service spans into aws/spans. Legs 2 and 3 are IaC.

Rather than a console click-path, the course ships them as Alchemy resources (authored the same way as the AgentCore ones — lesson 10 opens the hood). TransactionSearch does the three calls AWS prescribes: a resource policy letting X-Ray write aws/spans, then updateTraceSegmentDestination("CloudWatchLogs"), then the indexing sample rate. TraceDelivery wires a TRACES → XRAY delivery for each AgentCore resource whose own spans you need in the tree — the Memory store (its ListEvents/CreateEvent spans) and the Gateway (its InvokeTool segment, the missing link from §06 that the Lambda parents under):

// (a) the account-global toggle — without it, aws/spans stays empty
const transactionSearch = yield* TransactionSearch("TransactionSearch", {
  samplingPercentage: 100,
});
// (b) deliver each resource's own service spans into aws/spans, so the cross-service
//     tree (§06) renders unbroken; without the Gateway delivery, search.lambda dangles off a missing parent
const memoryTraces  = yield* TraceDelivery("MemoryTraces",  { resourceArn: memory.memoryArn });
const gatewayTraces = yield* TraceDelivery("GatewayTraces", { resourceArn: gateway.gatewayArn });

One more grant the deploy needs: the container signs its trace export with the runtime's execution role (§04), so that role's inline policy gains xray:PutTraceSegments, xray:PutTelemetryRecords, and cloudwatch:PutMetricData. Without them the signed POST to the X-Ray endpoint 403s and the dashboard stays empty even though every line of code above is correct — the same failure-stays-loud lesson as the Memory data-plane grant in lesson 07. The search Lambda needs the same X-Ray grant, but its role is auto-created by the framework and (being a plain async handler) has no Effect .bind() calls to collect permissions — so a small RolePolicy resource attaches the inline policy to the function's roleName after it's created.
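As a rough sketch, the three actions named above would sit in an inline policy document shaped like the one below. The constant name is hypothetical, and Resource "*" is an assumption made for this illustration; the stack's actual policy may scope or split the statements differently.

```typescript
// Hypothetical inline policy for the runtime's execution role.
// The actions come from the text above; Resource "*" is an assumption.
const traceExportPolicy = {
  Version: "2012-10-17",
  Statement: [
    {
      Effect: "Allow",
      Action: [
        "xray:PutTraceSegments",
        "xray:PutTelemetryRecords",
        "cloudwatch:PutMetricData",
      ],
      Resource: "*",
    },
  ],
} as const;

// IAM expects the document as a JSON string.
const traceExportPolicyJson = JSON.stringify(traceExportPolicy);
console.log(traceExportPolicyJson);
```

The search Lambda's role would carry the X-Ray actions from the same list, attached through the RolePolicy resource described above.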

Then one command ships all three: alchemy deploy rolls out the exporting container, flips Transaction Search on, and creates the delivery — the spans ride along. TransactionSearch.delete is a deliberate no-op: the toggle is account-global, so tearing down this stack must not break observability for anything else in the account. That closes the operate pillar: local, deploy, operate, all three done.

Alpha — the path was found the hard way

Getting here took reading the live runtime, not just the docs. The first assumption — that AgentCore injects an OTLP endpoint into any container — is false: it's true only for the Python auto-instrument path. A BYO container gets no collector and no endpoint; it must sign and post to X-Ray itself, which is what §04 does. The signed export is verified locally (the span batch reaches a SigV4-checked sink); the last mile — X-Ray accepting it under a real role — is account-specific. Pin your versions and watch the container log line (OTLP tracing ON …) plus aws/spans on a real runtime before relying on it.

That completes the operate story: the agent enriches spans, exports them to X-Ray over SigV4, connects them across services, and Transaction Search makes them visible. The appendix (lesson 10) opens the hood on how the given/ AgentCore resources you leaned on are authored against the control-plane SDK.

given/observability/Observability.ts · wired in alchemy.run.ts · Transaction Search + aws/spans from observability-configure