Lesson 07 · Effect AgentCore

Memory: learner progress across turns

Every POST /invocations so far has been stateless: the handler sees one prompt and forgets it the moment the response goes out. AgentCore Memory is the managed store that survives between separate invocations, keyed by a session id, so the Effect-docs agent can remember what a learner already asked. In this lesson we create a Memory resource and expose it to the agent as an Effect Layer chosen at the composition root, the same way lesson 03 swapped models and lesson 06 swapped the search backing. We then teach the handler to read prior memory at the start of a turn and write the new turn at the end.

01 · Why turns need memory

Picture a learner working through the course. They ask the agent:

POST /invocations  {"prompt":"what's the runtime contract?"}
→ "Two HTTP routes: GET /ping and POST /invocations, bound on 0.0.0.0:8080."

POST /invocations  {"prompt":"and how do I deploy it?"}
→ ??? deploy what?

The second question only makes sense because of the first. "It" is the runtime from a turn the agent no longer has. Our lesson-02 handler builds messages fresh from a single prompt on every request — when the response is sent, that array is garbage-collected. Statelessness is exactly what made lesson 02 easy to reason about, but it is also what loses the thread of a conversation.

Memory is the managed store that bridges separate invocations. It is not a variable in your process (containers come and go, and AgentCore may route two turns to two different instances) — it is a persistent resource AgentCore hosts on your behalf, addressed by a session id.
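The routing problem can be sketched in plain TypeScript. This is an illustration only: `Instance` is a hypothetical stand-in for one container and `sharedStore` for a store that outlives any container. Neither is AgentCore API.

```typescript
// Hypothetical stand-ins: `Instance` models one container, `sharedStore` models
// a managed store that outlives any single container. Not AgentCore API.
type Turn = { role: "user" | "assistant"; content: string };

class Instance {
  // Process-local state: invisible to every other instance.
  private local = new Map<string, Turn[]>();
  private shared: Map<string, Turn[]>;

  constructor(shared: Map<string, Turn[]>) {
    this.shared = shared;
  }

  rememberLocally(sessionId: string, turn: Turn): void {
    this.local.set(sessionId, [...(this.local.get(sessionId) ?? []), turn]);
  }

  rememberShared(sessionId: string, turn: Turn): void {
    this.shared.set(sessionId, [...(this.shared.get(sessionId) ?? []), turn]);
  }

  recallLocally(sessionId: string): Turn[] {
    return this.local.get(sessionId) ?? [];
  }

  recallShared(sessionId: string): Turn[] {
    return this.shared.get(sessionId) ?? [];
  }
}

const sharedStore = new Map<string, Turn[]>();
const instanceA = new Instance(sharedStore); // serves turn 1
const instanceB = new Instance(sharedStore); // serves turn 2

const firstTurn: Turn = { role: "user", content: "what's the runtime contract?" };
instanceA.rememberLocally("abc", firstTurn);
instanceA.rememberShared("abc", firstTurn);

// Turn 2 is routed elsewhere: the local Map is empty, the shared store is not.
const seenLocally = instanceB.recallLocally("abc").length;
const seenShared = instanceB.recallShared("abc").length;
```

In this sketch the second instance finds nothing in its own Map but does find the first turn in the shared store, which is the property the session-keyed Memory resource is meant to provide.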

handler that goes stateful: src/effect-agentcore/src/agent.ts

02 · The Memory resource

Like the Runtime and the Gateway, Memory is a resource the course ships in given/agentcore — Alchemy has no native AgentCore support, so Memory.ts is authored alongside Runtime.ts in the same Resource + Provider shape. You declare one in your stack:

import { Memory } from "./given/agentcore/index.ts";

const memory = yield* Memory("Memory", {
  eventExpiryDays: 30,        // how long a stored turn lives
});

Under the hood the provider's reconcile calls the control-plane createMemory (from @distilled.cloud/aws bedrock-agentcore-control, the same native-Effect control plane the Runtime uses). The resource outputs { memoryId, memoryArn, status } — memoryId is the handle the runtime data plane addresses.

Alpha — shapes verified against types, not the live service

given/agentcore/Memory.ts is implemented and exported (alongside Runtime.ts / Gateway.ts), and the stack wires it. But AgentCore is alpha: the resource is contract-verified (tsc against the pinned @distilled.cloud/aws@0.22.0) and not deploy-verified. The createMemory / deleteMemory request shapes, the eventExpiryDuration field, and the id-vs-arn outputs must be confirmed against the live service. Props ({ eventExpiryDays, executionRoleArn? }) are the current contract, not a frozen one.

authored beside given/agentcore/Runtime.ts · control plane: @distilled.cloud/aws bedrock-agentcore-control createMemory

03 · Wiring memory into the agent: a Layer swap

A declared resource is just a description until the agent uses it. Our course-authored Memory (like the Gateway) is not a framework Platform with a .bind() helper — so we wire it with Effect's own dependency injection, the exact pattern lessons 03 (the model) and 06 (the search backing) already used: one service interface, two interchangeable Layers, the choice made once at the composition root.

The contract is SessionMemory (given/agentcore/SessionMemory.ts) — read(sessionId) returns the prior turns, append(sessionId, turns) adds new ones. Two layers satisfy it: LocalMemory (an in-process Map — tests and pnpm dev, no AWS) and AgentCoreMemory.layer({ memoryId }) (the real store over the Memory data plane, src/memory.ts). Which one is wired is a deploy-time decision, read through Effect's Config from the env var the stack passes in:

// Same Config + Layer.unwrap shape as ModelLive (lesson 03) and SearchLive (lesson 06).
const MemoryLive = Layer.unwrap(
  Effect.map(Config.option(Config.string("MEMORY_ID")), (id): typeof LocalMemory =>
    Option.isSome(id)
      ? AgentCoreMemory.layer({ memoryId: id.value })  // cloud: the real store
      : LocalMemory,                                   // dev/tests: in-process Map
  ),
);
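Stripped of Effect, the decision MemoryLive makes is a single branch on one environment variable. The sketch below shows that branch in plain TypeScript; `Backing`, `selectBacking`, and the id `mem-example` are illustrative names, not the course's code, and treating only an unset variable as absent is an assumption of this sketch.

```typescript
// Plain-TypeScript stand-in for the Config + Layer.unwrap swap. The `Backing`
// type and `selectBacking` are illustrative names, not the course's code.
type Backing =
  | { kind: "local" }                          // in-process Map: dev and tests
  | { kind: "agentcore"; memoryId: string };   // managed store: deployed agent

const selectBacking = (env: Record<string, string | undefined>): Backing => {
  const memoryId = env["MEMORY_ID"];
  // Assumption: only an unset variable counts as absent in this sketch.
  return memoryId === undefined
    ? { kind: "local" }
    : { kind: "agentcore", memoryId };
};

// "mem-example" is a made-up id used only for illustration.
const devBacking = selectBacking({});
const cloudBacking = selectBacking({ MEMORY_ID: "mem-example" });
```

The handler never sees this branch. It only sees whichever implementation of the interface the branch produced.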

The handler then acquires the memory service once, in its init phase — the same place lesson 02 acquires the LanguageModel — and exercises it per request:

const model  = yield* LanguageModel.LanguageModel;   // INIT: once
const memory = yield* SessionMemory;                 // INIT: once — LocalMemory | AgentCoreMemory

// inside handle("invoke", ...)  (RUNTIME: per request):
const history  = yield* memory.read(sessionId);      // read prior turns
// ... run respond(prompt, history) ...
yield* memory.append(sessionId, [/* this turn */]);  // write new turns

So memory needs no new lifecycle: it is one more long-lived dependency acquired at init and called per turn, and the cloud-vs-local choice lives in the wiring (MemoryLive), not in the handler. The handler's memory.read and memory.append calls are identical whichever backing is wired, which is the whole point of programming to the SessionMemory interface.

contract: given/agentcore/SessionMemory.ts · layers: LocalMemory.ts + src/memory.ts (AgentCoreMemory)

04 · Threading it through `respond`

Where does sessionId come from? It is an optional field on the invocation request body — InvocationRequest is { prompt, sessionId?: string } — and the caller passes the same id for every turn of one conversation. When no sessionId is given, the turn is simply stateless. But that body sessionId is not the runtimeSessionId AgentCore's data plane carries — they look alike, travel together, and are constantly confused, so it is worth pinning the difference down:

sessionId vs runtimeSessionId — two ids, two jobs

sessionId — a field in the request body, defined by this course's InvocationRequest schema. It is the address into SessionMemory: memory.read(sessionId) / memory.append(sessionId, …). Same id on two turns ⇒ they share memory; different ids ⇒ separate conversations; no id ⇒ stateless. It is the application's notion of "which conversation," and it lives in the durable store.
runtimeSessionId — a field on AgentCore's data-plane invokeAgentRuntime call, defined by AWS (must be ≥ 33 chars). It is infrastructure routing: AgentCore uses it for microVM affinity, so turns with the same id tend to land on the same warm container instance. It never touches your memory store — the agent does not even read it.

They're independent on purpose. Memory lives in the durable store keyed by sessionId, so it survives even when AgentCore routes turn 2 to a different instance (a different runtimeSessionId microVM) — which is exactly why an in-process Map can't be the real backing (§1). The client (src/invoke.ts) makes the split explicit: it mints one stable sessionId for the whole conversation — reused every turn, and re-passable later via --session to resume — while the runtimeSessionId is minted fresh each process run, purely to keep one run's turns warm. Memory continuity comes from the body id, not the transport id: quit, come back with --session <id>, and the agent still remembers, even though that new run's runtimeSessionId (and very likely its container) is different.
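The lifetimes of the two ids can be sketched as follows. Everything here is hypothetical scaffolding: `startRun`, `RunIds`, and the counter-based `mintId` (a deterministic stand-in for a UUID generator) are not the client's code. Only the 33-character minimum comes from the text above.

```typescript
// Illustrative only: `RunIds`, `mintId`, and `startRun` are hypothetical.
// The 33-character minimum for runtimeSessionId is taken from the text.
const MIN_RUNTIME_SESSION_ID_LENGTH = 33;

type RunIds = { sessionId: string; runtimeSessionId: string };

// Deterministic stand-in for a UUID generator so the sketch is repeatable.
// Each id is 36 characters long, like a textual UUID.
let counter = 0;
const mintId = (): string => {
  counter += 1;
  return `${String(counter).padStart(8, "0")}-0000-4000-8000-000000000000`;
};

// One CLI process run: the transport id is always fresh, while the
// conversation id is reused when the caller resumes a past session.
const startRun = (resumeSessionId?: string): RunIds => {
  const runtimeSessionId = mintId();
  if (runtimeSessionId.length < MIN_RUNTIME_SESSION_ID_LENGTH) {
    throw new Error("runtimeSessionId must be at least 33 characters");
  }
  const sessionId = resumeSessionId ?? `cli-${mintId()}`;
  return { sessionId, runtimeSessionId };
};

const firstRun = startRun();
const resumedRun = startRun(firstRun.sessionId); // like passing --session
```

The resumed run carries the same sessionId, so it addresses the same memory, while its runtimeSessionId differs from the first run's.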

The handler reads the prior turns for that session, seeds them into the prompt before calling the model, and appends the new turn after the model answers. Crucially, this slots into the lesson-02 handler without rewriting respond's core shape — respond grows one optional history parameter (the memory-local Turn type, decoupled from the AI module) and maps it into the prompt:

// respond grows one optional parameter; its generateAnswer loop is unchanged.
export const respond = (
  prompt: string,
  history: ReadonlyArray<Turn> = [],
): Effect.Effect<string, never, LanguageModel.LanguageModel | ToolHandlers> =>
  generateAnswer(
    Prompt.make([
      { role: "system", content: SYSTEM },
      ...history.map(turnToMessage),                   // prior turns seed the prompt
      { role: "user", content: prompt },
    ]),
    MAX_TURNS,
  ).pipe(Effect.catch((e) => Effect.succeed(`The model call failed: ${e.message}`)));
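The history seeding itself is a plain array splice. The sketch below shows it without Effect or the AI module; `PromptMessage`, `buildMessages`, and this `turnToMessage` are assumed shapes for illustration, and the course's own Prompt.make and turnToMessage may differ.

```typescript
// Sketch of the history seeding. `PromptMessage`, `turnToMessage`, and
// `buildMessages` are assumed shapes; the course's versions may differ.
type Turn = { role: "user" | "assistant"; content: string };
type PromptMessage = { role: "system" | "user" | "assistant"; content: string };

const turnToMessage = (turn: Turn): PromptMessage => ({
  role: turn.role,
  content: turn.content,
});

const buildMessages = (
  system: string,
  prompt: string,
  priorTurns: ReadonlyArray<Turn> = [],
): PromptMessage[] => [
  { role: "system", content: system },
  ...priorTurns.map(turnToMessage), // prior turns sit between system and the new prompt
  { role: "user", content: prompt },
];

const stateless = buildMessages("You answer Effect questions.", "and how do I deploy it?");
const seeded = buildMessages("You answer Effect questions.", "and how do I deploy it?", [
  { role: "user", content: "what's the runtime contract?" },
  { role: "assistant", content: "Two HTTP routes: GET /ping and POST /invocations." },
]);
```

With no history the result is the lesson-02 shape (system plus user). With history, the earlier question and answer precede the new prompt, so "it" has something to refer to.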

.handle("invoke", (ctx) =>
  Effect.gen(function* () {
    const { prompt, sessionId } = ctx.payload;         // sessionId keys this conversation
    const history = sessionId
      ? yield* memory.read(sessionId).pipe(            // READ at turn start
          Effect.catchTag("MemoryError", () => Effect.succeed([])),
        )
      : [];
    // model provided explicitly (acquired at init); tool handlers come from SearchLive.
    const response = yield* respond(prompt, history).pipe(
      Effect.provideService(LanguageModel.LanguageModel, model),
    );
    if (sessionId) {
      yield* memory.append(sessionId, [                // WRITE at turn end
        { role: "user", content: prompt },
        { role: "assistant", content: response },
      ]).pipe(Effect.catchTag("MemoryError", () => Effect.void));
    }
    return { response };
  }).pipe(Effect.withSpan("agent.invoke")),            // lesson 08: the turn is a span
)

The tool loop, the AiError boundary, the { response } shape — all untouched. Memory is two calls wrapped around the turn we already had, each catchTag-guarded so a memory outage degrades to a stateless turn rather than failing the request.
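The degrade-to-stateless behaviour can be sketched synchronously, with try/catch standing in for the two catchTag guards. `MemoryError` here, along with `Store`, `isMemoryError`, and `runTurn`, are hypothetical names for this sketch. The real handler uses Effect's typed error channel instead of exceptions.

```typescript
// Synchronous stand-in for the two catchTag guards. All names are
// hypothetical; the real handler uses Effect, not try/catch.
type Turn = { role: "user" | "assistant"; content: string };

class MemoryError extends Error {
  readonly _tag = "MemoryError";
}

const isMemoryError = (e: unknown): boolean =>
  e instanceof Error && (e as { _tag?: unknown })._tag === "MemoryError";

type Store = {
  read: (sessionId: string) => Turn[];
  append: (sessionId: string, turns: Turn[]) => void;
};

const runTurn = (
  store: Store,
  answer: (prompt: string, priorTurns: Turn[]) => string,
  prompt: string,
  sessionId?: string,
): { response: string } => {
  let priorTurns: Turn[] = [];
  if (sessionId) {
    try {
      priorTurns = store.read(sessionId);
    } catch (e) {
      if (!isMemoryError(e)) throw e; // only memory failures degrade
    }
  }
  const response = answer(prompt, priorTurns);
  if (sessionId) {
    try {
      store.append(sessionId, [
        { role: "user", content: prompt },
        { role: "assistant", content: response },
      ]);
    } catch (e) {
      if (!isMemoryError(e)) throw e;
    }
  }
  return { response };
};

// A store that is fully down: every call raises MemoryError.
const brokenStore: Store = {
  read: () => { throw new MemoryError("read failed"); },
  append: () => { throw new MemoryError("append failed"); },
};

const degraded = runTurn(brokenStore, (p, prior) => `${prior.length} prior: ${p}`, "hi", "abc");
```

Even with both store calls failing, the turn still returns a response built from an empty history. Any error that is not tagged as a memory failure is re-thrown rather than swallowed.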

Alpha — data-plane shapes to confirm

This note covers the data-plane shapes themselves (the sessionId-vs-runtimeSessionId split is covered above). They have been deploy-verified against the live service: listEvents (read) round-trips cleanly, but createEvent (write) has an SDK gap. Its response omits event.payload while the Event schema in @distilled.cloud/aws@0.22.0 marks it required, so a successful write still raises SchemaError: Missing key at ["event"]["payload"] on response decode. The event is written; src/memory.ts catches that one decode failure and treats it as the success it is, while re-raising every other error. Two more live-verified quirks the code handles: listEvents returns events newest-first, so read sorts by eventTimestamp ascending to honour the chronological contract; and AgentCore rejects empty content.text with a ValidationException, so append drops empty or whitespace-only turns. Re-verify when the SDK pins a fixed shape.
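The two ordering and filtering quirks can be sketched as a pair of pure functions. `StoredEvent` is a simplified, hypothetical stand-in for the data-plane event (including its numeric timestamp), not the SDK's Event schema, and the helper names are illustrative.

```typescript
// Hypothetical shapes: `StoredEvent` is a simplified stand-in for the
// data-plane event, not the SDK's Event schema. The numeric timestamp is
// an assumption made to keep the sketch small.
type Turn = { role: "user" | "assistant"; content: string };
type StoredEvent = { eventTimestamp: number; turn: Turn };

// Events arrive newest-first, so sort ascending before handing turns back.
const toChronological = (events: ReadonlyArray<StoredEvent>): Turn[] =>
  [...events]
    .sort((a, b) => a.eventTimestamp - b.eventTimestamp)
    .map((e) => e.turn);

// Empty text is rejected by the service, so filter such turns out before writing.
const dropEmptyTurns = (turns: ReadonlyArray<Turn>): Turn[] =>
  turns.filter((t) => t.content.trim().length > 0);

const newestFirst: StoredEvent[] = [
  { eventTimestamp: 2000, turn: { role: "assistant", content: "Two HTTP routes." } },
  { eventTimestamp: 1000, turn: { role: "user", content: "what's the runtime contract?" } },
];
const ordered = toChronological(newestFirst);
const writable = dropEmptyTurns([
  { role: "user", content: "   " },
  { role: "assistant", content: "ok" },
]);
```

Copying the array before sorting keeps the caller's input untouched, which matters if the same event list is reused elsewhere.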

src/agent.ts — the invoke handler (read at turn start, write at end, both catchTag-guarded) · data plane: src/memory.ts (listEvents / createEvent)

05 · Read at start, write at end

Mental model

A turn is a read-modify-write on the session's memory. Read the prior turns, modify by running the model with that context, write the new turn back. The session id is the address; the conversation is the value at that address, growing one turn at a time.

Ordering matters, and it is not arbitrary. You write after a successful turn, not before. If the model call fails, the turn never produced a sound assistant message — writing a half-turn would corrupt the session, so the next read seeds the model with garbage. By writing only after respond succeeds, a failed turn leaves memory exactly as it was: the conversation can be retried cleanly.

This ties straight back to the typed-error boundary from lessons 02 and 03. respond recovers from the AiError with a graceful answer string, which the handler returns as { response }, so the append downstream always sees a well-formed turn rather than a throw blowing past it. The error channel is what makes "write only on success" enforceable rather than hopeful: the write is sequenced after the recovered value in the same Effect.gen, so it simply does not run if an earlier step short-circuits.
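The sequencing argument can be sketched with a thrown exception standing in for an Effect that short-circuits. `turn` and `sessions` are illustrative names, and the in-process Map is only a stand-in for the store.

```typescript
// Sketch of "write only on success". `turn` and `sessions` are illustrative
// names; a thrown exception stands in for an Effect that short-circuits.
type Turn = { role: "user" | "assistant"; content: string };

const sessions = new Map<string, Turn[]>();

const turn = (sessionId: string, prompt: string, answer: (p: string) => string): string => {
  const priorTurns = sessions.get(sessionId) ?? []; // read
  const response = answer(prompt);                  // modify: may throw
  sessions.set(sessionId, [                         // write: only reached on success
    ...priorTurns,
    { role: "user", content: prompt },
    { role: "assistant", content: response },
  ]);
  return response;
};

turn("abc", "first question", () => "first answer");

let failed = false;
try {
  turn("abc", "second question", () => {
    throw new Error("model unavailable");
  });
} catch {
  failed = true;
}

// The failed turn left the session exactly as the first turn wrote it.
const storedAfterFailure = sessions.get("abc")?.length ?? 0;
```

Because the write comes after the step that can fail, the failed second turn adds nothing, and the session still holds only the first question and answer.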

typed-error boundary reused from src/agent.ts (respond's catch of the AiError)

06 · Deploy + where this lands

alchemy deploy now provisions one more resource — the Memory — and passes its memoryId into the Runtime's environment as MEMORY_ID. That env var is what flips MemoryLive from the in-process Map to the real AgentCoreMemory store — the same env-driven swap as BEDROCK_MODEL (the model) and GATEWAY_URL (the search backing). No new deploy command, no new lifecycle: the resource graph grows by one node, and the container gets one more env var.

$ alchemy deploy
  ✓ AWS.BedrockAgentCore.Memory   Memory          created
  ✓ AWS.BedrockAgentCore.Runtime  EffectDocsAgent updated (MEMORY_ID set)

Driving it is src/invoke.ts — a small conversation REPL on effect/unstable/cli (Effect 4's native CLI module; the effect-3 @effect/cli package isn't published for the beta line). It reads AGENT_RUNTIME_ARN through Config, mints both ids from the platform Crypto service (not node:crypto), reuses one sessionId across turns, and takes --session to resume a past one:

const crypto = yield* Crypto.Crypto;                 // platform UUIDs (cryptographically secure)
// resume with --session, else a fresh conversation id; runtimeSessionId is always fresh
const sessionId        = resuming ? session.value : `cli-${yield* crypto.randomUUIDv4}`;
const runtimeSessionId = yield* crypto.randomUUIDv4; // ≥33 chars; microVM affinity

const ask = (prompt: string) => invokeAgentRuntime({
  agentRuntimeArn: arn,
  runtimeSessionId,                                        // transport: keep the VM warm
  payload: encode(JSON.stringify({ prompt, sessionId })),  // body: keys memory
});
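The body that carries sessionId is just UTF-8 encoded JSON. The round trip below is a sketch under the assumption that the client's `encode` is a TextEncoder call; `encodeBody`, `decodeBody`, and the id `cli-example` are hypothetical, and the real handler validates the payload through its own schema.

```typescript
// Assumption: `encode` in the client is a UTF-8 TextEncoder call. `encodeBody`
// and `decodeBody` are hypothetical helpers for illustration.
type InvocationRequest = { prompt: string; sessionId?: string };

const encodeBody = (body: InvocationRequest): Uint8Array =>
  new TextEncoder().encode(JSON.stringify(body));

const decodeBody = (bytes: Uint8Array): InvocationRequest => {
  const parsed: unknown = JSON.parse(new TextDecoder().decode(bytes));
  if (typeof parsed !== "object" || parsed === null) {
    throw new Error("body must be a JSON object");
  }
  const { prompt, sessionId } = parsed as Record<string, unknown>;
  if (typeof prompt !== "string") throw new Error("prompt must be a string");
  if (sessionId !== undefined && typeof sessionId !== "string") {
    throw new Error("sessionId must be a string when present");
  }
  // Omit the key entirely when no session was supplied: the turn is stateless.
  return sessionId === undefined ? { prompt } : { prompt, sessionId };
};

// "cli-example" is a made-up id used only for illustration.
const withSession = decodeBody(
  encodeBody({ prompt: "and how do I deploy it?", sessionId: "cli-example" }),
);
const withoutSession = decodeBody(encodeBody({ prompt: "hello" }));
```

The runtimeSessionId never appears in this body. It travels as a separate field on the transport call, which is why the handler can key memory without ever reading it.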

# interactive — prints the sessionId on exit so you can resume later
$ AGENT_RUNTIME_ARN=<arn> pnpm invoke
conversation cli-9f3… — type a message; "exit" or Ctrl-C to quit
…
session ended — resume it with:  pnpm invoke --session cli-9f3…

# resume that exact conversation: same sessionId ⇒ same memory, fresh runtimeSessionId
$ AGENT_RUNTIME_ARN=<arn> pnpm invoke --session cli-9f3…

This is the lesson's milestone: deploy. With MEMORY_ID set, the deployed agent is stateful across turns of a session — and across container instances, which the in-process Map never could (recall from §1: AgentCore may route two turns to two different containers). Without MEMORY_ID the agent still runs, falling back to the ephemeral LocalMemory. What it is not yet is observable — when a turn reads stale memory or a write fails into the catchTag, you have no signal. Lesson 08 fixes that: tracing turns, model calls, and memory reads/writes as spans.

caller passes the SAME sessionId for both turns
──────────────────────────────────────────────────────────────────────
Turn 1                              Turn 2
POST /invocations                   POST /invocations
sessionId = "abc"                   sessionId = "abc"
"what's the runtime contract?"      "and how do I deploy it?"
        │                                   │
        ▼                                   ▼
memory.read("abc") → []             memory.read("abc") → [turn 1's Q + A]
        │                                   │
        ▼                                   ▼  (model now SEES turn 1)
respond(...)                        respond("...deploy it?", prior)
        │                                   │
        ▼                                   ▼
memory.append("abc", turn 1)        memory.append("abc", turn 2)
        │                                   │
        └──────────► AgentCore Memory ◄─────┘
        (session "abc" persists between invocations)

deploy provisions Memory + sets MEMORY_ID via alchemy.run.ts + given/agentcore/Memory.ts