Lab 04: TokenJuice Output Compaction for Terminal-Heavy Agents
Overview
This is a small, fun systems lab.
You will use TokenJuice to compact noisy terminal output before it enters an agent transcript.
The core idea is simple:
run command normally
-> observe output
-> deterministically reduce prompt-facing text
-> keep raw output available only when explicitly needed
TokenJuice is not an LLM summarizer.
It is a rule-driven output reducer for terminal-heavy workflows such as:
- git status
- pnpm test
- docker build
- rg --files
- pnpm --help
- build logs
- lint output
- package-manager noise
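To make "rule-driven, not LLM-driven" concrete, here is a minimal sketch of a deterministic head/tail reducer. It is an illustration only: `reduce_lines` and its `head`/`tail` parameters are hypothetical names for this lab, not TokenJuice's implementation.

```python
def reduce_lines(text: str, head: int = 3, tail: int = 2) -> str:
    """Keep the first `head` and last `tail` lines and say how many were dropped."""
    lines = text.splitlines()
    if len(lines) <= head + tail:
        return text  # short output passes through unchanged
    omitted = len(lines) - head - tail
    # Slice from an explicit index so tail=0 keeps nothing instead of everything.
    kept = lines[:head] + [f"... {omitted} lines omitted ..."] + lines[len(lines) - tail:]
    return "\n".join(kept)


noisy = "\n".join(f"line {i}" for i in range(1, 11))
print(reduce_lines(noisy))
```

The same input always produces the same output, which is the property that separates a reducer from a summarizer. The explicit omission marker also avoids pretending that lossy output is complete.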
The lab goal is to measure whether compaction improves agent workflow quality without hiding critical debugging information.
Estimated time: 45-75 minutes
Difficulty: Beginner to intermediate
Why this matters
Agent workflows waste context on terminal output.
Example:
agent runs pnpm test
-> receives 800 lines
-> only 25 lines matter
-> transcript fills with noise
-> next turn has less useful context
-> agent reruns commands because it missed the important part
TokenJuice attacks that waste by making terminal output leaner.
The key design properties:
- command semantics stay untouched
- reduction happens after execution
- rules are inspectable JSON
- raw output is available through explicit bypasses
- host integrations stay thin wrappers around the same reducer
- project rules can override built-in rules
This is the right kind of "boring" infrastructure for agents.
Learning objectives
By the end of this lab, you should be able to:
- Explain why terminal output is a token-budget problem for agents.
- Use tokenjuice reduce to compact existing logs.
- Use tokenjuice wrap to run a command and compact its observed output.
- Use --raw, --full, and artifact storage when exact bytes matter.
- Inspect machine-facing output with reduce-json.
- Understand the rule precedence model.
- Write a small project-specific reducer.
- Decide when compaction is safe and when it is dangerous.
- Connect TokenJuice-style reducers to OpenClaw, Codex, Claude Code, and other agent harnesses.
Step 0: Safety model
Before installing hooks into an agent, understand the safety boundary.
TokenJuice should not:
- rewrite commands silently
- pretend lossy output is complete
- summarize with an LLM
- hide raw output when exact bytes are required
- compact exact file-content reads such as cat, sed, head, or tail
Good use:
Risky use:
security logs
binary dumps
exact config file reads
one-off debugging where every byte matters
mixed shell sequences with side effects
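The boundary above can be expressed as a small guard that an agent harness runs before deciding to compact. This is a hedged sketch of one possible policy: `should_compact` and `EXACT_READ_COMMANDS` are hypothetical names, and the checks are deliberately conservative rather than a copy of TokenJuice's own classification.

```python
import shlex

# Hypothetical policy for this lab, not TokenJuice's built-in list.
EXACT_READ_COMMANDS = {"cat", "sed", "head", "tail"}
SHELL_OPERATORS = ("&&", ";", "|")  # "|" also covers "||"


def should_compact(command: str) -> bool:
    """Return False when the output should stay raw."""
    # Mixed shell sequences may have side effects; keep their output raw.
    if any(op in command for op in SHELL_OPERATORS):
        return False
    argv = shlex.split(command)
    if not argv:
        return False
    # Exact file-content reads must never be compacted.
    return argv[0] not in EXACT_READ_COMMANDS


for cmd in ("pnpm test", "cat config.json", "git status && rm -rf build"):
    print(cmd, "->", "compact" if should_compact(cmd) else "raw")
```

A guard like this errs toward raw output. Losing some savings is cheaper than hiding a byte that mattered.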
Use raw mode when necessary:
Step 1: Install TokenJuice
Install with your preferred package manager:
or:
or, if using Homebrew:
Verify:
tokenjuice --version
If you do not want a global install, use a scratch project:
mkdir tokenjuice-lab
cd tokenjuice-lab
pnpm init
pnpm add -D tokenjuice
pnpm exec tokenjuice --version
For the rest of the lab, replace tokenjuice with pnpm exec tokenjuice if using a local install.
Step 2: Create noisy sample output
Create a lab folder:
mkdir -p logs
Create a fake test log:
cat > logs/test-output.txt <<'EOF'
RUN v3.2.4 /repo
stdout | packages/core/test/reducer.test.ts > reducer keeps failure detail
loading config from /repo/tokenjuice.config.json
loading built-in rules from src/rules
loading user rules from ~/.config/tokenjuice/rules
loading project rules from .tokenjuice/rules
✓ packages/core/test/classify.test.ts (28 tests) 132ms
✓ packages/core/test/command.test.ts (42 tests) 188ms
✓ packages/core/test/artifacts.test.ts (17 tests) 96ms
✓ packages/hosts/test/codex.test.ts (18 tests) 120ms
✓ packages/hosts/test/claude-code.test.ts (21 tests) 140ms
stderr | packages/core/test/reducer.test.ts > reducer keeps failure detail
AssertionError: expected reducer to preserve "exit code 2"
at test/core/reducer.test.ts:88:15
at runTest test-runner.ts:55:7
FAIL packages/core/test/reducer.test.ts > reducer keeps failure detail
Expected preserved lines:
exit code 2
src/rules/fixtures/pnpm-test-failure.txt
Received:
exit code omitted
Test Files 1 failed | 5 passed (6)
Tests 1 failed | 126 passed (127)
Duration 2.8s
EOF
Check raw size:
wc -c logs/test-output.txt
Step 3: Reduce existing text
Run:
tokenjuice reduce logs/test-output.txt
Also compare with a pipe:
Record:
The important question is not "was it shorter?"
The important question is: can the agent still decide its next action from the reduced output?
For a failing test, the reducer should preserve enough detail to identify:
- failing file
- failing test name
- expected versus received clue
- stack frame or source location
- total failure count
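The checklist above can be turned into an automated check on any reduced output. The sketch below assumes the sample log from Step 2. `REQUIRED_CLUES` and `missing_clues` are hypothetical helper names for this lab, and the check is plain substring matching.

```python
# Substrings taken from the Step 2 sample log, one per checklist item.
REQUIRED_CLUES = {
    "failing file": "packages/core/test/reducer.test.ts",
    "failing test name": "reducer keeps failure detail",
    "expected versus received": "exit code 2",
    "source location": "reducer.test.ts:88:15",
    "failure count": "1 failed",
}


def missing_clues(reduced: str) -> list[str]:
    """Return the checklist items that the reduced text no longer contains."""
    return [label for label, needle in REQUIRED_CLUES.items() if needle not in reduced]


good = "\n".join([
    "FAIL packages/core/test/reducer.test.ts > reducer keeps failure detail",
    'AssertionError: expected reducer to preserve "exit code 2"',
    "at test/core/reducer.test.ts:88:15",
    "Tests 1 failed | 126 passed (127)",
])
too_aggressive = "Tests 1 failed | 126 passed (127)"

print(missing_clues(good))
print(missing_clues(too_aggressive))
```

An empty list means the reduction kept everything on the checklist. A non-empty list names what a reducer dropped.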
Step 4: Wrap real commands
Now run commands through TokenJuice directly.
Start with safe inventory:
Try a noisy help command:
Try a command that may fail:
tokenjuice wrap -- bash -lc 'echo "compile start"; echo "error TS2322: Type string is not assignable"; exit 2'
Observe:
- output is compacted after execution
- the command still runs normally
- non-zero exits should preserve more detail than success noise
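The three observations can be sketched in a few lines of Python: the command runs unmodified, reduction happens only on the captured output, and a failing exit code widens the kept window. `run_and_compact` and its parameters are hypothetical; this is not how `tokenjuice wrap` is implemented.

```python
import subprocess
import sys


def run_and_compact(argv, head=2, tail=2, failure_tail=20):
    """Run argv as-is, then compact the captured output after execution."""
    proc = subprocess.run(argv, stdout=subprocess.PIPE, stderr=subprocess.STDOUT, text=True)
    lines = proc.stdout.splitlines()
    # A non-zero exit keeps a larger tail, where errors usually appear.
    keep_tail = failure_tail if proc.returncode != 0 else tail
    if len(lines) > head + keep_tail:
        omitted = len(lines) - head - keep_tail
        lines = lines[:head] + [f"... {omitted} lines omitted ..."] + lines[len(lines) - keep_tail:]
    return proc.returncode, "\n".join(lines)


# A stand-in for a noisy failing build: 50 progress lines, one error, exit code 2.
script = "for i in range(50): print('step', i)\nprint('error TS2322'); raise SystemExit(2)"
code, text = run_and_compact([sys.executable, "-c", script])
print(code)
print(text)
```

The exit code is returned untouched, so the caller can still branch on success or failure exactly as it would without the wrapper.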
This is the key distinction:
Step 5: Use raw and full bypasses
Compaction is useful until it is not.
Run:
Use raw/full modes when:
- exact text is required
- you are debugging a reducer
- a fixture needs exact expected output
- the agent needs every line of a file-like response
- the reduced output hides a relevant middle section
Write this rule into your own agent practice:
Step 6: Store and inspect artifacts
TokenJuice can store raw output as an artifact when explicitly requested.
Run:
List artifacts:
Inspect one:
Why this matters:
That is the right compromise for agent transcripts.
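One way to picture the artifact idea is a content-addressed store: the transcript carries the reduced text plus a short id, and the exact bytes stay recoverable on disk. The sketch below is an assumption-laden illustration. `store_artifact`, `load_artifact`, the 12-character id, and the file layout are all hypothetical and do not describe TokenJuice's real artifact format.

```python
import hashlib
import tempfile
from pathlib import Path


def store_artifact(raw: str, root: Path) -> str:
    """Write raw output under a content-derived id and return that id."""
    artifact_id = hashlib.sha256(raw.encode("utf-8")).hexdigest()[:12]
    root.mkdir(parents=True, exist_ok=True)
    (root / f"{artifact_id}.txt").write_text(raw, encoding="utf-8")
    return artifact_id


def load_artifact(artifact_id: str, root: Path) -> str:
    """Read the exact raw output back when the reduced text is not enough."""
    return (root / f"{artifact_id}.txt").read_text(encoding="utf-8")


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp) / "artifacts"
    raw_output = "compile start\nerror TS2322: Type string is not assignable\n"
    artifact_id = store_artifact(raw_output, root)
    recovered = load_artifact(artifact_id, root)
    print(artifact_id, recovered == raw_output)
```

Because the id is derived from the content, storing the same output twice yields the same id, which keeps the store deterministic.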
Step 7: Inspect machine-facing JSON
Host integrations need stable machine output.
Create a tool payload:
cat > payload.json <<'EOF'
{
"toolName": "exec",
"command": "pnpm test",
"argv": ["pnpm", "test"],
"combinedText": "RUN v3.2.4 /repo\nFAIL packages/core/test/reducer.test.ts > reducer keeps failure detail\nAssertionError: expected reducer to preserve exit code 2\nTest Files 1 failed | 5 passed\nTests 1 failed | 126 passed\n",
"exitCode": 1
}
EOF
Run:
Look for:
- reduced output text
- command classification
- savings information
- whether failure context was preserved
This is the surface host adapters should use.
Human-facing CLIs can be flexible.
Adapter protocols should be boring and structured.
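To show what "boring and structured" means for an adapter, the sketch below consumes a payload with the same input fields as payload.json and returns a plain dictionary. The output keys (`reducedText`, `rawChars`, `reducedChars`) and the function `reduce_payload` are hypothetical names chosen for illustration; check the TokenJuice spec for the real reduce-json schema.

```python
import json


def reduce_payload(payload: dict, max_lines: int = 3) -> dict:
    """Return a structured result; failures keep every line, successes keep a prefix."""
    raw = payload["combinedText"]
    lines = raw.splitlines()
    exit_code = payload.get("exitCode", 0)
    kept = lines if exit_code != 0 else lines[:max_lines]
    reduced = "\n".join(kept)
    return {
        "command": payload["command"],
        "exitCode": exit_code,
        "reducedText": reduced,      # hypothetical key
        "rawChars": len(raw),        # hypothetical key
        "reducedChars": len(reduced),  # hypothetical key
    }


payload = json.loads("""
{
  "toolName": "exec",
  "command": "pnpm test",
  "argv": ["pnpm", "test"],
  "combinedText": "RUN v3.2.4 /repo\\nFAIL packages/core/test/reducer.test.ts\\nTests 1 failed | 126 passed\\n",
  "exitCode": 1
}
""")
result = reduce_payload(payload)
print(json.dumps(result, indent=2))
```

A host adapter can read fixed keys from a result like this without parsing human-oriented text.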
Step 8: Verify rules
TokenJuice rules are JSON.
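Built-in rules live in the package. Overrides can live in:
- ~/.config/tokenjuice/rules (user rules)
- .tokenjuice/rules (project rules)
Rule precedence:
built-in rules
-> user rules
-> project rules
Later layers override earlier layers by rule id.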
Run:
If fixtures are available:
This should check:
- JSON parses
- schema shape is valid
- regexes compile
- duplicate ids are rejected inside the same layer
- fixture expectations still match reducers
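The precedence model and two of the checks above (regexes compile, duplicate ids rejected within a layer) fit in a short sketch. `merge_layers` is a hypothetical function, and it assumes that keepPatterns entries are regular expressions; it is not TokenJuice's verifier.

```python
import re


def merge_layers(*layers):
    """Merge rule layers in order; later layers override earlier ones by rule id."""
    merged = {}
    for layer in layers:
        seen = set()
        for rule in layer:
            rule_id = rule["id"]
            if rule_id in seen:
                raise ValueError(f"duplicate rule id in one layer: {rule_id}")
            seen.add(rule_id)
            # Assumed: keepPatterns are regexes; re.compile raises re.error if invalid.
            for pattern in rule.get("filters", {}).get("keepPatterns", []):
                re.compile(pattern)
            merged[rule_id] = rule
    return merged


builtin = [{"id": "git/status", "summarize": {"head": 10}}]
project = [{"id": "git/status", "summarize": {"head": 3}},
           {"id": "otbr/journal", "filters": {"keepPatterns": ["Attach attempt"]}}]

rules = merge_layers(builtin, project)
print(rules["git/status"]["summarize"]["head"])
print(sorted(rules))
```

The same id may appear in different layers, which is how an override works. It is rejected only when it repeats inside a single layer.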
Step 9: Write a tiny project reducer
Create a project override folder:
mkdir -p .tokenjuice/rules
Create a reducer for a fake hardware bring-up log:
cat > logs/otbr-output.txt <<'EOF'
Apr 20 09:34:20 ubuntu otbr-agent[10839]: Attach attempt 8, AnyPartition
Apr 20 09:34:20 ubuntu otbr-agent[10839]: Send Parent Request to routers
Apr 20 09:34:22 ubuntu otbr-agent[10839]: Attach attempt 8 unsuccessful, will try again in 32.128 seconds
Apr 20 09:34:46 ubuntu otbr-agent[11149]: Running 0.3.0-1e957ca
Apr 20 09:34:46 ubuntu otbr-agent[11149]: Thread version: 1.4.0
Apr 20 09:34:46 ubuntu otbr-agent[11149]: Radio URL: spinel+hdlc+uart:///dev/ttyTHS1?uart-baudrate=460800
Apr 20 09:34:46 ubuntu otbr-agent[11149]: InitMulticastRouterSock() at multicast_routing.cpp:227: Protocol not available
Apr 20 09:34:47 ubuntu otbr-agent[11149]: TrelDiscoverer: DNS-SD service registered successfully
EOF
Create a project rule:
cat > .tokenjuice/rules/otbr-journal.json <<'EOF'
{
"id": "otbr/journal",
"family": "otbr-journal",
"match": {
"commandIncludes": ["journalctl", "otbr-agent"]
},
"transforms": {
"trimEmptyEdges": true,
"dedupeAdjacent": true
},
"filters": {
"keepPatterns": [
"Attach attempt",
"unsuccessful",
"Radio URL",
"Thread version",
"Protocol not available",
"InitMulticastRouterSock"
],
"skipPatterns": [
"DNS-SD service registered successfully"
]
},
"summarize": {
"head": 20,
"tail": 8
},
"failure": {
"preserveOnFailure": true,
"head": 24,
"tail": 12
},
"counters": [
{
"name": "attach_attempts",
"pattern": "Attach attempt"
},
{
"name": "kernel_missing_mroute",
"pattern": "Protocol not available"
}
]
}
EOF
Verify:
Test with JSON input:
cat > otbr-payload.json <<'EOF'
{
"toolName": "exec",
"command": "journalctl -u otbr-agent -n 80 --no-pager",
"argv": ["journalctl", "-u", "otbr-agent", "-n", "80", "--no-pager"],
"combinedText": "",
"exitCode": 0
}
EOF
Inject the log content:
python3 - <<'PY'
import json
from pathlib import Path
payload = json.loads(Path("otbr-payload.json").read_text())
payload["combinedText"] = Path("logs/otbr-output.txt").read_text()
Path("otbr-payload.json").write_text(json.dumps(payload, indent=2))
PY
Run:
If the rule does not match, inspect the docs and simplify the match block.
The learning goal is not to memorize the exact schema.
The learning goal is to understand that project-local reducers can encode domain-specific signal.
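To see what "domain-specific signal" means in practice, the sketch below applies the filters and counters from the rule above to the sample OTBR log in plain Python. The semantics are assumptions made for illustration: skipPatterns drop matching lines, keepPatterns keep only matching lines, and counters count matching lines. `apply_rule` is a hypothetical helper, and TokenJuice's real behavior may differ.

```python
import re

RULE = {
    "filters": {
        "keepPatterns": ["Attach attempt", "unsuccessful", "Radio URL", "Thread version",
                         "Protocol not available", "InitMulticastRouterSock"],
        "skipPatterns": ["DNS-SD service registered successfully"],
    },
    "counters": [
        {"name": "attach_attempts", "pattern": "Attach attempt"},
        {"name": "kernel_missing_mroute", "pattern": "Protocol not available"},
    ],
}

LOG = """\
Apr 20 09:34:20 ubuntu otbr-agent[10839]: Attach attempt 8, AnyPartition
Apr 20 09:34:20 ubuntu otbr-agent[10839]: Send Parent Request to routers
Apr 20 09:34:22 ubuntu otbr-agent[10839]: Attach attempt 8 unsuccessful, will try again in 32.128 seconds
Apr 20 09:34:46 ubuntu otbr-agent[11149]: Running 0.3.0-1e957ca
Apr 20 09:34:46 ubuntu otbr-agent[11149]: Thread version: 1.4.0
Apr 20 09:34:46 ubuntu otbr-agent[11149]: Radio URL: spinel+hdlc+uart:///dev/ttyTHS1?uart-baudrate=460800
Apr 20 09:34:46 ubuntu otbr-agent[11149]: InitMulticastRouterSock() at multicast_routing.cpp:227: Protocol not available
Apr 20 09:34:47 ubuntu otbr-agent[11149]: TrelDiscoverer: DNS-SD service registered successfully
"""


def apply_rule(rule, text):
    """Return (kept lines, counter totals) under the assumed filter semantics."""
    keep = [re.compile(p) for p in rule["filters"]["keepPatterns"]]
    skip = [re.compile(p) for p in rule["filters"]["skipPatterns"]]
    lines = text.splitlines()
    kept = [ln for ln in lines
            if not any(s.search(ln) for s in skip) and any(k.search(ln) for k in keep)]
    counts = {c["name"]: sum(1 for ln in lines if re.search(c["pattern"], ln))
              for c in rule["counters"]}
    return kept, counts


kept, counts = apply_rule(RULE, LOG)
print("\n".join(kept))
print(counts)
```

Under these assumptions, five of the eight lines survive, and the counters report two attach attempts and one multicast routing error.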
Step 10: Measure savings
Create a simple measurement script:
cat > measure.sh <<'EOF'
#!/usr/bin/env bash
set -euo pipefail
file="${1:?usage: ./measure.sh <file>}"
raw_bytes="$(wc -c < "$file" | tr -d ' ')"
reduced="$(tokenjuice reduce "$file")"
reduced_bytes="$(printf "%s" "$reduced" | wc -c | tr -d ' ')"
python3 - <<PY
raw = int("$raw_bytes")
reduced = int("$reduced_bytes")
savings = 0 if raw == 0 else (1 - reduced / raw) * 100
print(f"raw bytes: {raw}")
print(f"reduced bytes: {reduced}")
print(f"savings: {savings:.1f}%")
PY
EOF
chmod +x measure.sh
Run:
./measure.sh logs/test-output.txt
./measure.sh logs/otbr-output.txt
For this lab, good savings are not enough.
You must also answer: does the compacted output still contain the critical debugging information?
For the OTBR example, the compacted output should still preserve:
- attach attempts failed
- Thread version
- Radio URL
- Protocol not available
- likely kernel multicast routing issue
If those disappear, the reducer is too aggressive.
Step 11: Connect to an agent host
TokenJuice supports several host integrations, including Codex CLI, Claude Code, Cursor, OpenCode, pi, and OpenClaw.
For Codex CLI:
For Claude Code:
For aggregate hook state:
For OpenClaw, TokenJuice support is bundled on the OpenClaw side. The upstream docs say to enable the plugin instead of running tokenjuice install openclaw:
This requires OpenClaw 2026.4.22 or newer.
Do not install host hooks on machines where you do not understand the hook behavior.
Use doctor first, and keep a rollback path.
Step 12: Evaluate the lab
Build a small table:
| Command or log | Raw bytes | Reduced bytes | Savings | Did it preserve next action? |
|---|---|---|---|---|
| logs/test-output.txt | | | | |
| logs/otbr-output.txt | | | | |
| git ls-files | | | | |
| pnpm --help | | | | |
Then write a short conclusion:
TokenJuice is useful for:
TokenJuice is risky for:
The reducer I would add next is:
The command family I would always keep raw is:
What you should learn
TokenJuice is funny because the product language is playful: "token weight loss."
But the underlying systems idea is serious:
Terminal-heavy agents need:
- less noise
- preserved failure clues
- deterministic behavior
- explicit raw bypass
- recoverable artifacts
- inspectable rules
- measurable savings
This is a small tool, but it touches a real production problem.
Extensions
- Add a reducer for Jetson audio logs: preserve arecord, aplay, APE card, I2S2, and ALSA control lines.
- Add a reducer for CUDA profiler output: preserve kernel names, occupancy, memory throughput, and warnings.
- Add fixture tests: create before/after expected outputs for your project reducer.
- Add OpenClaw workflow notes: define when your agents should use raw output versus compacted output.
- Compare with LLM summarization: run the same logs through an LLM summary and compare determinism, cost, and missed details.
References
- TokenJuice repository: https://github.com/vincentkoc/tokenjuice
- TokenJuice spec: https://github.com/vincentkoc/tokenjuice/blob/main/docs/spec.md
- TokenJuice rules: https://github.com/vincentkoc/tokenjuice/blob/main/docs/rules.md
- TokenJuice integration playbook: https://github.com/vincentkoc/tokenjuice/blob/main/docs/integration-playbook.md