Lecture 19 - OpenClaw Case Study: The Agent Loop

Course: Agentic AI & GenAI

Why this lecture exists

The agent loop is the core execution engine of an agent system.

Once you understand the loop, the surrounding pieces become easier to reason about:

cron decides when to run
hooks decide where to intercept
tools decide what actions are available
sessions decide what state is visible
the agent loop decides how one complete run executes

This lecture uses OpenClaw's agent loop as the case study.

The short definition:

an agent loop is one complete, controlled execution of an AI agent from input to actions to final output, while preserving consistency, safety, streaming, and session state

It is not only an LLM call.

It is a stateful, tool-using, streaming workflow pipeline.

Learning objectives

By the end of this lecture you will be able to:

Explain the full lifecycle of an agent run.
Understand why OpenClaw serializes runs per session.
Describe the role of session locks, queues, streaming, tools, hooks, and persistence.
Explain how `agent`, `agent.wait`, CLI runs, cron, and hooks connect to the same core runtime.
Identify where failures, timeouts, compaction, and early exits happen.
Design an agent loop for your own production assistant.

1. What an agent loop really is

At a high level:

input -> validate -> prepare context -> run model -> call tools -> stream output -> finalize -> persist

That looks simple, but each step hides real engineering concerns.

The loop must answer:

which session is this run using?
what model and auth profile should execute it?
what tools are allowed?
who can write to the transcript?
how are partial replies streamed?
what happens if a tool fails?
what happens if context is too large?
what final output should the user see?
what gets saved for the next turn?

So a more accurate definition is:

the agent loop is a deterministic, serialized, observable execution pipeline for AI agents with tools and memory

2. The full lifecycle

OpenClaw's loop can be understood as this sequence:

1. Intake request
2. Validate parameters
3. Resolve session
4. Queue by session lane
5. Prepare workspace, skills, and bootstrap context
6. Acquire session write lock
7. Assemble prompt
8. Resolve model and auth
9. Run model with streaming
10. Execute tools when requested
11. Shape final reply
12. Persist transcript and metadata
13. Emit lifecycle end or error

The key point:

the loop has a beginning, middle, and end that the system can observe

That is why OpenClaw can support live UI updates, agent.wait, cron run history, hooks, and debugging.

3. Entry points

The same loop can start from several places.

Entry point	Example	Meaning
Gateway RPC	`agent`	enqueue an agent run and return quickly
Gateway RPC wait	`agent.wait`	wait for lifecycle completion of a specific run
CLI	`openclaw agent ...`	local command-line invocation
Cron	`openclaw cron run <job-id>`	scheduled job triggers an agent run
Hook	webhook, Gmail, message hook	event-driven trigger starts or modifies a run

The important design choice is that these entry points should converge into one runtime path.

That gives the product consistent behavior:

same tool policy
same session semantics
same logging
same streaming events
same failure handling

4. Intake and validation

At the boundary, the Gateway validates the request and resolves run metadata.

It needs to determine:

target agent
session key or session id
message body
model and thinking overrides
trace/verbose settings
delivery or caller context
timeout behavior

Then it can return an accepted response such as:

{
  "runId": "run_...",
  "acceptedAt": "2026-04-29T12:00:00Z"
}

This reflects an async-first design:

accepting a run is not the same as finishing a run

The Gateway can accept work, queue it, stream progress, and let clients wait separately.
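
The accept-then-run split can be sketched in a few lines. This is a minimal illustration, not OpenClaw's implementation: the `accept_run` helper and the in-memory `pending_runs` queue are hypothetical, and only the `runId` and `acceptedAt` field names come from the response shown above.

```python
import uuid
from collections import deque
from datetime import datetime, timezone

# Hypothetical in-memory queue; a real gateway would queue per session lane.
pending_runs = deque()

def accept_run(session_key: str, message: str) -> dict:
    """Validate, enqueue, and acknowledge a run without executing it."""
    if not session_key or not message.strip():
        raise ValueError("session_key and message are required")
    run_id = f"run_{uuid.uuid4().hex[:12]}"
    pending_runs.append(
        {"runId": run_id, "sessionKey": session_key, "message": message}
    )
    accepted_at = datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    # The caller gets an acknowledgement only; the run result arrives later.
    return {"runId": run_id, "acceptedAt": accepted_at}
```

The function returns before any model call happens, which is the point: the acknowledgement carries an identifier the client can later stream against or wait on.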

5. Queueing and serialization

One of the most important OpenClaw design rules:

only one agent loop should run per session lane at a time

Why this matters:

prevents two runs from writing the same transcript concurrently
avoids interleaved tool results
keeps conversation history deterministic
avoids duplicate or contradictory final replies

The mental model:

session A: run 1 -> run 2 -> run 3
session B: run 1 -> run 2
session C: run 1

Each session lane is serial.

Different session lanes can still make progress independently, subject to global concurrency controls.

This explains why isolated cron and subagent sessions matter: they let background work proceed without blocking the user's main session lane.
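
One way to picture session lanes is a lock per session key plus a global semaphore. The sketch below assumes a single asyncio process; the `SessionLanes` class and `demo_lanes` function are hypothetical names for illustration, not OpenClaw APIs.

```python
import asyncio
from collections import defaultdict

class SessionLanes:
    """Serialize runs per session key and cap total concurrency globally."""

    def __init__(self, max_concurrent: int = 2):
        self._lanes = defaultdict(asyncio.Lock)         # one lock per session lane
        self._global = asyncio.Semaphore(max_concurrent)

    async def run(self, session_key, job):
        async with self._lanes[session_key]:            # wait for earlier runs in this lane
            async with self._global:                    # then wait for a global slot
                return await job()

async def demo_lanes():
    lanes, log = SessionLanes(max_concurrent=2), []

    async def job(name, delay):
        log.append(f"start {name}")
        await asyncio.sleep(delay)
        log.append(f"end {name}")

    await asyncio.gather(
        lanes.run("A", lambda: job("A1", 0.05)),
        lanes.run("A", lambda: job("A2", 0.01)),        # same lane: must wait for A1
        lanes.run("B", lambda: job("B1", 0.01)),        # other lane: runs alongside A1
    )
    return log
```

Running `asyncio.run(demo_lanes())` shows A2 starting only after A1 ends, while B1 starts without waiting for session A.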

6. Session write locks

Queueing handles logical order.

Locks handle actual file/state mutation.

OpenClaw protects transcript writes with a process-aware session write lock.

That matters because multiple processes may exist:

Gateway process
CLI process
maintenance or doctor commands
test workers
automation scripts

The rule:

any transcript write, rewrite, compaction, or truncation must acquire the same session write lock

By default, the lock should not be reentrant. Code that intentionally nests the same lock must opt in explicitly.

This is a strong production lesson:

session state is shared mutable state, so it needs a real concurrency boundary
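
A cross-process, non-reentrant lock can be sketched with an exclusive lock file. This is an assumption-laden illustration: the `.lock` suffix, the `session_write_lock` and `append_line` helpers, and the `LockHeldError` type are all hypothetical, and the sketch fails fast instead of waiting and does not handle stale locks left by crashed processes.

```python
import os
import tempfile
from contextlib import contextmanager

class LockHeldError(RuntimeError):
    """Raised when another holder (or a nested call) already owns the lock."""

@contextmanager
def session_write_lock(transcript_path: str):
    """Hold an exclusive lock file next to the transcript while mutating it."""
    lock_path = transcript_path + ".lock"
    try:
        # O_EXCL makes creation atomic: it fails if the lock file already exists.
        fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        raise LockHeldError(f"session lock already held: {lock_path}") from None
    try:
        os.write(fd, str(os.getpid()).encode())  # record the owner for diagnostics
        yield
    finally:
        os.close(fd)
        os.remove(lock_path)

def append_line(transcript_path: str, line: str) -> None:
    """Every transcript write goes through the same lock."""
    with session_write_lock(transcript_path):
        with open(transcript_path, "a", encoding="utf-8") as f:
            f.write(line + "\n")
```

Because the lock is not reentrant, calling `append_line` while already holding `session_write_lock` for the same transcript raises `LockHeldError`, which surfaces accidental nesting early.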

7. Session and workspace preparation

Before the model is called, the loop prepares the run environment.

Typical preparation includes:

resolving workspace
creating workspace if needed
applying sandbox workspace root when sandboxed
loading a skills snapshot
resolving bootstrap context files
preparing environment variables
preparing the session manager

This is where "agent behavior" becomes more than a prompt.

The model sees the world through:

workspace
tools
skills
bootstrap context
session history
policy and runtime settings

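Bad preparation leads to confusing agent behavior later. For example, a stale skills snapshot or the wrong workspace root can produce behavior that looks like a model error but is really a setup error.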

8. Prompt assembly

The model does not see one string called "the prompt."

It sees assembled context.

OpenClaw-style prompt material includes:

base system prompt
+ skills prompt
+ bootstrap context
+ session history
+ per-run overrides
+ hook-injected context

The runtime must also account for:

model-specific token limits
compaction reserve tokens
truncation behavior
tool schemas
reasoning or thinking configuration

The important rule:

prompt assembly is runtime behavior, not just prompt writing

A production system should be able to answer:

what model saw which instructions?
which skill snapshot was active?
which bootstrap files were injected?
which hook changed the prompt?
why was context compacted?
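
To make those questions answerable, prompt assembly can record where each piece came from. The sketch below is a simplified illustration: the `AssembledPrompt` type, the `assemble_prompt` function, and the source labels are hypothetical, and token budgeting is left out.

```python
from dataclasses import dataclass, field

@dataclass
class AssembledPrompt:
    text: str
    sources: list = field(default_factory=list)  # ordered labels of included parts

def assemble_prompt(base, skills="", bootstrap_files=None, history=None,
                    overrides="", hook_context=None):
    """Join prompt parts in a fixed order and record where each part came from."""
    parts = [("base", base), ("skills", skills)]
    parts += [(f"bootstrap:{name}", body)
              for name, body in (bootstrap_files or {}).items()]
    parts += [(f"history:{i}", turn) for i, turn in enumerate(history or [])]
    parts.append(("overrides", overrides))
    parts += [(f"hook:{name}", body)
              for name, body in (hook_context or {}).items()]
    kept = [(label, body) for label, body in parts if body]  # drop empty parts
    return AssembledPrompt(
        text="\n\n".join(body for _, body in kept),
        sources=[label for label, _ in kept],
    )
```

Logging `sources` alongside the run id gives operators a direct answer to "which bootstrap files were injected?" and "which hook changed the prompt?" without re-deriving it from the final text.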

9. Model execution

In OpenClaw's architecture, the embedded agent runner handles model execution.

Conceptually, this phase does:

resolve provider and model
resolve auth profile
start the model request
subscribe to model and tool events
enforce abort and timeout behavior
return final payloads and usage metadata

This is the "thinking" phase, but it is still runtime-controlled.

During this phase, the model request may:

emit assistant text
request tools
stream reasoning chunks when supported
hit idle timeout
trigger model switch behavior
fail with provider or network errors

The loop wraps that uncertainty in a stable contract.

10. Streaming events

OpenClaw streams more than final text.

A useful event model is:

Stream	What it carries
`lifecycle`	`start`, `end`, `error`
`assistant`	text deltas, block replies, optional reasoning chunks
`tool`	tool start, update, result, end

This is what makes live UI and channel updates possible.

For example:

lifecycle:start
assistant:delta "I will check..."
tool:start read_file
tool:end read_file
assistant:delta "The file shows..."
lifecycle:end

Streaming matters because long-running agent work should not feel like a black box.

It also gives operators a way to debug where a run got stuck:

no lifecycle start suggests an intake or queue issue
lifecycle start but no assistant deltas suggests a model or prompt issue
tool start with no matching tool end suggests a tool execution issue
lifecycle error means the runtime saw a terminal failure
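
These debugging heuristics can be written down as a small classifier over observed event names. The `diagnose` function and its return strings are hypothetical; the event names follow the example stream above, and the output is a hint, not proof.

```python
def diagnose(events):
    """Guess where a run is stuck from its event names (a heuristic, not proof)."""
    if "lifecycle:error" in events:
        return "terminal failure"
    if "lifecycle:start" not in events:
        return "intake or queue issue"
    if "lifecycle:end" in events:
        return "completed"
    # Count tool starts that never received a matching end.
    open_tools = events.count("tool:start") - events.count("tool:end")
    if open_tools > 0:
        return "tool execution issue"
    if "assistant:delta" not in events:
        return "model or prompt issue"
    return "still running"
```

For instance, `diagnose(["lifecycle:start", "assistant:delta", "tool:start"])` points at tool execution, because a tool started and never reported an end.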

11. Tool execution

When the model requests a tool, the loop becomes an action engine.

The basic tool path:

model requests tool
-> emit tool start
-> validate tool call
-> run tool
-> sanitize result
-> emit tool result
-> feed result back to model

Tool results should be:

size-limited
sanitized
safe to persist
safe to stream
connected to the originating call
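
A sanitizing step that enforces those properties might look like the sketch below. The size limit, the redaction pattern, and the `sanitize_tool_result` name are assumptions for illustration; a real system would use its own limits and secret-detection rules.

```python
import re

MAX_RESULT_CHARS = 2000  # assumed limit, chosen only for this example
# Assumed pattern: redact values that follow common secret-looking keys.
SECRET_PATTERN = re.compile(r"(?i)\b(api[_-]?key|token|password)\s*[=:]\s*\S+")

def sanitize_tool_result(call_id: str, raw: str) -> dict:
    """Redact secrets, cap the size, and keep the link to the originating call."""
    redacted = SECRET_PATTERN.sub(lambda m: f"{m.group(1)}=[REDACTED]", raw)
    truncated = len(redacted) > MAX_RESULT_CHARS
    if truncated:
        redacted = redacted[:MAX_RESULT_CHARS] + "\n[truncated]"
    return {"callId": call_id, "content": redacted, "truncated": truncated}
```

The same sanitized payload can then be streamed, fed back to the model, and persisted, so all three consumers see identical content.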

Some tools can send messages directly to users.

That introduces a reply-shaping issue:

if a messaging tool already sent the useful answer, the final assistant confirmation may be redundant.

OpenClaw tracks messaging tool sends so duplicate confirmations can be suppressed.

12. Hooks as interception points

Hooks let the product intercept the loop.

OpenClaw has internal Gateway hooks and plugin hooks. The important concept is that hooks can run at different lifecycle points.

Useful hook categories:

Hook area	Example	Purpose
Model selection	`before_model_resolve`	choose or override provider/model
Prompt assembly	`before_prompt_build`	inject context or system prompt material
Reply control	`before_agent_reply`	claim, override, or silence a turn
Tool policy	`before_tool_call`	block or modify a tool call
Tool output	`after_tool_call`, `tool_result_persist`	audit or transform tool results
Messaging	`message_received`, `message_sending`, `message_sent`	route, cancel, or audit messages
Lifecycle	`agent_end`, `gateway_start`, `gateway_stop`	observe system state
Compaction	`before_compaction`, `after_compaction`	inspect summarization behavior

Some hook decisions are terminal.

Examples:

{ "block": true }

or:

{ "cancel": true }

These stop lower-priority handling in their hook chain.

The production lesson:

hooks are powerful because they can change runtime behavior, so their ordering and terminal semantics must be explicit
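
Explicit ordering and terminal semantics can be sketched as a priority-ordered chain. The `run_hook_chain` function, the `(priority, name, handler)` tuple shape, and the `patch` key are hypothetical; only the `block` and `cancel` decision keys come from the examples above.

```python
def run_hook_chain(handlers, event):
    """Run handlers from highest to lowest priority; stop at a terminal decision.

    `handlers` is a list of (priority, name, handler) tuples. A handler returns
    None or a dict; {"block": True} or {"cancel": True} ends the chain.
    """
    for _priority, name, handler in sorted(handlers, key=lambda h: -h[0]):
        decision = handler(event) or {}
        if decision.get("block") or decision.get("cancel"):
            # Lower-priority handlers never see this event.
            return {"terminal": True, "by": name, **decision}
        # Hypothetical "patch" key: lets a handler modify the event for later ones.
        event = {**event, **decision.get("patch", {})}
    return {"terminal": False, "event": event}
```

With a high-priority policy hook that returns `{"block": True}` for a deploy tool call, a lower-priority logging hook is never invoked for that call, which is the behavior the terminal rule describes.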

13. Reply shaping

At the end of a run, the runtime decides what is user-visible.

Final payload assembly may include:

assistant text
block replies
tool summaries when verbose and allowed
error messages when needed

Then the runtime applies cleanup rules:

suppress exact silent tokens such as NO_REPLY or no_reply
remove messaging-tool duplicate confirmations
emit a fallback tool error reply if no renderable output remains and the user has not already seen a reply
avoid replaying stale acknowledgement-only text when a better descendant result exists

This phase exists because models often produce text that is not the right final user output.

The agent loop must shape output into product behavior.
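
Some of the cleanup rules above can be sketched as a pure function. This is a simplification: the `shape_reply` name is hypothetical, duplicate detection is reduced to exact text matching against what messaging tools already sent, and the stale-acknowledgement rule is omitted.

```python
SILENT_TOKENS = {"NO_REPLY", "no_reply"}

def shape_reply(assistant_texts, messaging_tool_sends, tool_error=None):
    """Return the list of texts the user should actually see."""
    sent = {s.strip() for s in messaging_tool_sends}
    visible = []
    for text in assistant_texts:
        t = text.strip()
        if not t or t in SILENT_TOKENS:   # exact-match silent tokens only
            continue
        if t in sent:                     # already delivered by a messaging tool
            continue
        visible.append(t)
    # Fallback: surface a tool error only if nothing else is visible
    # and the user has not already seen a reply.
    if not visible and tool_error and not sent:
        visible.append(f"Tool error: {tool_error}")
    return visible
```

Keeping this step free of side effects makes it easy to test the product rules separately from model and tool behavior.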

14. Persistence

After the run, the system writes:

user message
assistant message
tool calls and tool results
metadata
usage information
lifecycle state

Persistence must happen under the session write lock.

This protects the transcript from races and gives future turns a coherent history.

The practical rule:

if it affects future context, persist it carefully

This is also why streaming and persistence are different concerns. A user can see partial output before the final transcript is fully committed.

15. `agent.wait`

The `agent` RPC starts a run.

The `agent.wait` RPC waits for a run to reach lifecycle end or error.

A wait result looks conceptually like:

{
  "status": "ok",
  "startedAt": "2026-04-29T12:00:00Z",
  "endedAt": "2026-04-29T12:00:10Z"
}

or:

{
  "status": "timeout"
}

Important distinction:

agent.wait timeout does not necessarily stop the agent run

It only means the waiter stopped waiting.

This is the same concept as watching a background job: your terminal can time out while the job continues.
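
The difference between "stop waiting" and "stop the run" can be demonstrated with asyncio. The `agent_wait` and `demo_wait` functions below are hypothetical stand-ins, not the OpenClaw RPC; the status strings mirror the conceptual results above, plus an assumed "error" status for failed runs.

```python
import asyncio

async def agent_wait(run_task, timeout_s):
    """Wait for a run to finish; a timeout stops the wait, not the run."""
    try:
        # shield() keeps wait_for's cancellation from reaching the run itself.
        await asyncio.wait_for(asyncio.shield(run_task), timeout_s)
        return {"status": "ok"}
    except asyncio.TimeoutError:
        return {"status": "timeout"}
    except Exception as exc:
        return {"status": "error", "error": str(exc)}

async def demo_wait():
    async def run():
        await asyncio.sleep(0.2)          # stands in for model and tool work
        return "final reply"

    task = asyncio.create_task(run())
    first = await agent_wait(task, timeout_s=0.01)   # waiter gives up early
    still_running = not task.done()                  # the run was not cancelled
    second = await agent_wait(task, timeout_s=2.0)   # a later wait sees completion
    return first, still_running, second, task.result()
```

In `demo_wait`, the first wait times out while the run keeps going, and a second wait on the same run later reports completion.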

16. Compaction and retry

Agent context grows.

Eventually the transcript may become too large for the model context window or for the configured reserve budget.

When that happens, the loop may trigger compaction:

detect context pressure
-> emit compaction event
-> summarize or rewrite context
-> retry the run

On retry, the runtime must reset in-memory buffers and tool summaries so output does not duplicate.

This matters because compaction is not just a storage task.

It affects:

what the model sees
what the user sees
which messages remain detailed
whether the rerun repeats old output
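
The compaction-and-retry path, including the buffer reset, can be sketched as follows. Everything here is illustrative: `ContextOverflow`, `run_with_compaction`, and the callback shapes are hypothetical, and real compaction would summarize with a model instead of a simple function.

```python
class ContextOverflow(Exception):
    """Raised by the (hypothetical) model stream when context is too large."""

def run_with_compaction(history, stream_model, compact, max_retries=1):
    """Stream a reply; on overflow, compact the history and retry from scratch."""
    events = []
    for attempt in range(max_retries + 1):
        buffer = []                        # fresh buffer per attempt: no duplicated output
        try:
            for chunk in stream_model(history):
                buffer.append(chunk)
            return "".join(buffer), events
        except ContextOverflow:
            if attempt == max_retries:
                raise                      # compaction did not help; surface the failure
            events.append("compaction")
            history = compact(history)     # the model sees a rewritten context on retry
```

If the buffer were created once outside the loop, text streamed before the overflow would appear twice in the final reply, which is exactly the duplication the reset avoids.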

17. Timeouts and aborts

OpenClaw-style systems have multiple timeout layers.

Timeout	What it affects
agent runtime timeout	maximum agent run duration
LLM idle timeout	aborts a model request when no chunks arrive
`agent.wait` timeout	how long the caller waits
cron outer timeout	scheduled-job-level control

OpenClaw's documentation describes these timeout settings:

the agent runtime timeout is configured through agents.defaults.timeoutSeconds; documented examples involve long runs of around 48 hours
agent.wait defaults to a short wait window
the LLM idle timeout is configured separately
cron-triggered runs may rely on the outer cron timeout when no explicit LLM or agent timeout is set

The lesson:

timeout semantics must say what gets cancelled and what only stops waiting

Without that distinction, operators misread system behavior.
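
An idle timeout differs from a total runtime timeout: it measures the gap between chunks, not the whole run. The sketch below assumes the model stream is an async iterator; `stream_with_idle_timeout` is a hypothetical helper, not an OpenClaw function.

```python
import asyncio

async def stream_with_idle_timeout(chunks, idle_timeout_s):
    """Yield chunks from an async iterator; abort if no chunk arrives in time.

    The timer restarts for every chunk, so a long but steadily streaming
    response is fine, while a stalled one raises asyncio.TimeoutError.
    """
    iterator = chunks.__aiter__()
    while True:
        try:
            chunk = await asyncio.wait_for(iterator.__anext__(), idle_timeout_s)
        except StopAsyncIteration:
            return                         # the stream finished normally
        yield chunk
```

Here the timeout cancels the pending read on the underlying stream, so this layer really does abort work, unlike a waiter-side timeout that only stops observing.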

18. Where runs can end early

An agent loop may end early because of:

agent runtime timeout
abort signal
gateway disconnect
RPC wait timeout (ends the wait only; the run itself may continue)
model failure
tool failure
hook block or cancel decision
compaction failure

Good runtime design still attempts to emit a terminal lifecycle event, for example:

lifecycle:error

That gives clients a final state and helps agent.wait, UI, cron, and logs agree about what happened.
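
One way to guarantee a terminal event is to wrap the run body so that every exit path emits one. The `run_agent_loop` function and the event dictionary shape are hypothetical; only the lifecycle phases `start`, `end`, and `error` come from the event model described earlier.

```python
def run_agent_loop(run_id, body, emit):
    """Run `body` and emit exactly one terminal lifecycle event, end or error."""
    emit({"stream": "lifecycle", "phase": "start", "runId": run_id})
    try:
        result = body()
    except Exception as exc:
        # Failure path: clients still receive a final state for this run.
        emit({"stream": "lifecycle", "phase": "error",
              "runId": run_id, "error": str(exc)})
        return None
    emit({"stream": "lifecycle", "phase": "end", "runId": run_id})
    return result
```

Because waiters, UI, cron history, and logs all key off the same terminal event, a single wrapper like this keeps them in agreement about how the run finished.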

19. How this connects to cron

With the loop in view, cron becomes easier to place.

Cron does not do the agent work itself.

Cron schedules the work:

cron schedule
-> due job
-> agent RPC
-> agent loop
-> delivery/logging

So:

cron answers when
the agent loop answers how
tools answer what actions
hooks answer where policy intervenes
sessions answer what state

This is the architecture pattern behind serious persistent agents.

20. Example: one OpenClaw-style run

Imagine a user sends:

Check the repo and summarize today's failing tests.

The loop might behave like this:

1. Gateway receives message.
2. Gateway resolves session key.
3. Run enters that session lane queue.
4. Workspace and skills are prepared.
5. Session write lock is acquired.
6. Prompt is assembled with history and bootstrap context.
7. Model starts streaming.
8. Model calls a shell/test-inspection tool.
9. Tool result is sanitized and streamed.
10. Model writes a final summary.
11. Duplicate tool-send confirmations are suppressed.
12. Transcript and metadata are persisted.
13. lifecycle:end is emitted.
14. The chat channel sends the final response.

The user experiences one reply.

The system executed a controlled transaction-like runtime path.

21. Design exercise

Design an agent loop for a local engineering assistant.

Fill in this table:

Area	Your design
Entry points	CLI, Web UI, Slack, cron
Session lane rule	one run at a time per session
Global queue	max 2 concurrent model runs
Write lock	required for transcript writes and compaction
Prompt inputs	base prompt, skills, repo context, session history
Tools	read, search, test runner, issue lookup
Tool policy	write and deploy require approval
Hooks	block deploy from unpaired channels
Streaming	lifecycle, assistant, tool
Final reply shaping	suppress `NO_REPLY`, remove duplicate confirmations
Timeout model	wait timeout separate from runtime timeout
Persistence	transcript, tool outputs, usage, run metadata

The value of this exercise is that it forces you to treat the agent as infrastructure, not as a single API call.

Key takeaways

The agent loop is the execution engine of an agent product.
It is a serialized, observable pipeline from input to persisted final state.
Per-session queueing prevents transcript and tool races.
Session write locks protect durable state across processes.
Prompt assembly, model execution, tool execution, streaming, reply shaping, and persistence are separate runtime concerns.
Hooks are policy and extension points inside the loop.
agent.wait waits for lifecycle completion; its timeout stops the waiter, not the run.
Cron triggers agent loops, but cron is not the agent loop.

References

OpenClaw agent loop: https://openclaw.knidal.com/agent-loop
Case-study source repo: OpenClaw
OpenClaw concepts:
docs/automation/cron-jobs.md
docs/cli/cron.md
docs/tools/subagents.md
docs/concepts/session.md
docs/reference/session-management-compaction.md

Next: Lecture 20 - OpenClaw Case Study: Cron, Scheduled Agent Runs, and Automation Reliability