bookly/scripts/build_litmd.py
Cody Borders 3947180841 Harden security/perf, add literate program at /architecture
Security and performance fixes addressing a comprehensive review:

- Server-issued HMAC-signed session cookies; client-supplied session_id
  ignored. Prevents session hijacking via body substitution.
- Sliding-window rate limiter per IP and per session.
- SessionStore with LRU eviction, idle TTL, per-session threading locks,
  and a hard turn cap. Bounds memory and serializes concurrent turns for
  the same session so FastAPI's threadpool cannot corrupt history.
- Tool-use loop capped at settings.max_tool_use_iterations; Anthropic
  client gets an explicit timeout. No more infinite-loop credit burn.
- Every tool argument is regex-validated, length-capped, and
  control-character-stripped. asserts replaced with ValueError so -O
  cannot silently disable the checks.
- PII-safe warning logs: session IDs and reply bodies are hashed, never
  logged in clear.
- hmac.compare_digest for email comparison (constant-time).
- Strict Content-Security-Policy plus X-Content-Type-Options,
  X-Frame-Options, Referrer-Policy, Permissions-Policy via middleware.
- Explicit handlers for anthropic.RateLimitError, APIConnectionError,
  APIStatusError, ValueError; static dir resolved from __file__.
- Prompt cache breakpoints on the last tool schema and the last message
  so per-turn input cost scales linearly, not quadratically.
- TypedDict handler argument shapes; direct block.name/block.id access.
- functools.lru_cache on _get_client.
- Anchored word-boundary regexes for out-of-scope detection to kill
  false positives on phrases like "I'd recommend contacting...".

Literate program:

- Bookly.lit.md is now the single source of truth for the five core
  Python files. Tangles byte-for-byte; verified via tangle.ts --verify.
- Prose walkthrough, three mermaid diagrams, narrative per module.
- Woven to static/architecture.html with the app's palette
  (background #f5f3ee) via scripts/architecture-header.html.
- New GET /architecture route serves the HTML with a relaxed CSP that
  allows pandoc's inline styles. Available at
  bookly.codyborders.com/architecture.
- scripts/rebuild_architecture_html.sh regenerates the HTML after edits.
- code_reviews/2026-04-15-1433-code-review.md captures the review that
  drove these changes.

All 37 tests pass.
2026-04-15 15:02:40 -07:00


"""Generate Bookly.lit.md from a template plus the current source files.
This script is invoked once to bootstrap the literate program. Edits after
that should go into Bookly.lit.md directly, with `tangle.ts` regenerating
the source files. See the reverse-sync hook in .claude/settings.local.json
for the path where source-file edits feed back into the .lit.md.
"""
from __future__ import annotations
import textwrap
from pathlib import Path
ROOT = Path(__file__).resolve().parent.parent
def _read(path: str) -> str:
return (ROOT / path).read_text(encoding="utf-8")
def _chunk(language: str, name: str, file_path: str, body: str) -> str:
# A chunk fence. The body is embedded verbatim -- every character of the
# file must round-trip through tangling, so we never rewrap or reformat.
if body.endswith("\n"):
body = body[:-1]
return f'```{language} {{chunk="{name}" file="{file_path}"}}\n{body}\n```'


def main() -> None:
    config_py = _read("config.py")
    mock_data_py = _read("mock_data.py")
    tools_py = _read("tools.py")
    agent_py = _read("agent.py")
    server_py = _read("server.py")

    out = textwrap.dedent(
"""\
---
title: "Bookly"
---

# Introduction

Bookly is a customer-support chatbot for a bookstore. It handles three
things: looking up orders, processing returns, and answering a small
set of standard policy questions. Everything else it refuses, using a
verbatim template.

The interesting engineering is not the feature set. It is the
guardrails. A chat agent wired to real tools can hallucinate order
details, leak private information, skip verification steps, or wander
off topic -- and the consequences land on real customers. Bookly
defends against that with four independent layers, each of which
assumes the previous layers have failed.

This document is both the prose walkthrough and the source code. The
code you see below is the code that runs. Tangling this file produces
the Python source tree byte-for-byte; weaving it produces the HTML
you are reading.

# The four guardrail layers

Before anything else, it helps to see the layers laid out in one
picture. Each layer is a separate defence, and a malicious or
confused input has to defeat all of them to cause harm.

```mermaid
graph TD
    U[User message]
    L1[Layer 1: System prompt<br/>identity, critical_rules, scope,<br/>verbatim policy, refusal template]
    L2[Layer 2: Runtime reminders<br/>injected every turn +<br/>long-conversation re-anchor]
    M[Claude]
    T{Tool use?}
    L3[Layer 3: Tool-side enforcement<br/>input validation +<br/>protocol guard<br/>eligibility before return]
    L4[Layer 4: Output validation<br/>regex grounding checks,<br/>markdown / off-topic / ID / date]
    OK[Reply to user]
    BAD[Safe fallback,<br/>bad reply dropped from history]
    U --> L1
    L1 --> L2
    L2 --> M
    M --> T
    T -- yes --> L3
    L3 --> M
    T -- no --> L4
    L4 -- ok --> OK
    L4 -- violations --> BAD
```

Layer 1 is the system prompt itself. It tells the model what Bookly
is, what it can and cannot help with, what the return policy actually
says (quoted verbatim, not paraphrased), and exactly which template
to use when refusing. Layer 2 adds short reminder blocks on every
turn so the model re-reads the non-negotiable rules at the
highest-attention position right before the user turn. Layer 3 lives
in `tools.py`: the tool handlers refuse unsafe calls regardless of
what the model decides. Layer 4 lives at the end of the agent loop
and does a deterministic regex pass over the final reply looking
for things like fabricated order IDs, markdown leakage, and
off-topic engagement.

# Request lifecycle

A single user message travels this path:

```mermaid
sequenceDiagram
    autonumber
    participant B as Browser
    participant N as nginx
    participant S as FastAPI
    participant A as agent.run_turn
    participant C as Claude
    participant TL as tools.dispatch_tool
    B->>N: POST /api/chat { message }
    N->>S: proxy_pass
    S->>S: security_headers middleware
    S->>S: resolve_session (cookie)
    S->>S: rate limit (ip + session)
    S->>A: run_turn(session_id, message)
    A->>A: SessionStore.get_or_create<br/>+ per-session lock
    A->>C: messages.create(tools, system, history)
    loop tool_use
        C-->>A: tool_use blocks
        A->>TL: dispatch_tool(name, args, state)
        TL-->>A: tool result
        A->>C: messages.create(history+tool_result)
    end
    C-->>A: final text
    A->>A: validate_reply (layer 4)
    A-->>S: reply text
    S-->>B: { reply }
```

# Module layout

Five Python files form the core. They depend on each other in one
direction only -- there are no cycles.

```mermaid
graph LR
    MD[mock_data.py<br/>ORDERS, POLICIES, RETURN_POLICY]
    C[config.py<br/>Settings]
    T[tools.py<br/>schemas, handlers, dispatch]
    A[agent.py<br/>SessionStore, run_turn, validate]
    SV[server.py<br/>FastAPI, middleware, routes]
    MD --> T
    MD --> A
    C --> T
    C --> A
    C --> SV
    T --> A
    A --> SV
```

The rest of this document visits each module in dependency order:
configuration first, then the data fixtures they read, then tools,
then the agent loop, then the HTTP layer on top.

# Configuration

Every setting that might reasonably change between environments
lives in one place. The two required values -- the Anthropic API
key and the session-cookie signing secret -- are wrapped in
`SecretStr` so an accidental `print(settings)` cannot leak them to
a log.

Everything else has a default that is safe for local development
and reasonable for a small production deployment. A few knobs are
worth noticing:

- `max_tool_use_iterations` bounds the tool-use loop in `agent.py`.
  A model that keeps asking for tools forever will not burn API
  credit forever.
- `session_store_max_entries` and `session_idle_ttl_seconds` cap
  the in-memory `SessionStore`, so a trivial script that opens
  millions of sessions cannot OOM the process.
- `rate_limit_per_ip_per_minute` and
  `rate_limit_per_session_per_minute` feed the sliding-window
  limiter in `server.py`.

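
The shape of the module can be sketched with a stdlib stand-in for
`SecretStr` (the field names follow the prose above; the defaults
here are illustrative, not the real values):

```python
from dataclasses import dataclass

class Secret:
    # Stdlib stand-in for pydantic's SecretStr: repr never shows the value.
    def __init__(self, value: str) -> None:
        self._value = value

    def get_secret_value(self) -> str:
        return self._value

    def __repr__(self) -> str:
        return "Secret('**********')"

@dataclass
class Settings:
    anthropic_api_key: Secret
    session_secret: Secret
    max_tool_use_iterations: int = 10
    session_store_max_entries: int = 1000
    session_idle_ttl_seconds: int = 1800
    rate_limit_per_ip_per_minute: int = 30
    rate_limit_per_session_per_minute: int = 15

settings = Settings(Secret("sk-ant-example"), Secret("signing-key"))
print(settings)  # secrets render as Secret('**********'), never in clear
```

The masking lives in `__repr__` rather than in every call site, so
any accidental logging path gets it for free.
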
"""
    )
    out += _chunk("python", "config-py", "config.py", config_py) + "\n\n"

    out += textwrap.dedent(
"""\
# Data fixtures

Bookly does not talk to a real database. Four fixture orders are
enough to cover the interesting scenarios: a delivered order that
is still inside the 30-day return window, an in-flight order that
has not been delivered yet, a processing order that has not
shipped, and an old delivered order outside the return window.
Sarah Chen owns two of the four so the agent has to disambiguate
when she says "my order".

The `RETURN_POLICY` dict is the single source of truth for policy
facts. Two things read it: the system prompt (via
`_format_return_policy_block` in `agent.py`, which renders it as
the `<return_policy>` section the model must quote) and the
`check_return_eligibility` handler (which enforces the window in
code). Having one copy prevents the two from drifting apart.

`POLICIES` is a tiny FAQ keyed by topic. The `lookup_policy` tool
returns one of these entries verbatim and the system prompt
instructs the model to quote the response without paraphrasing.
This is a deliberate anti-hallucination pattern: the less the
model has to generate, the less it can make up.

`RETURNS` is the only mutable state in this file. `initiate_return`
writes a new RMA record to it on each successful return.

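
The single-source-of-truth idea, in miniature (the field names here
are assumptions for illustration; the real dict lives below):

```python
from datetime import date, timedelta

# Illustrative shape only -- field names are assumed, not the real fixture.
RETURN_POLICY = {"window_days": 30, "condition": "unused, in original packaging"}

def within_return_window(delivered_on: date, today: date) -> bool:
    # The handler enforces the same number the prompt quotes.
    return today - delivered_on <= timedelta(days=RETURN_POLICY["window_days"])

print(within_return_window(date(2026, 4, 1), date(2026, 4, 15)))  # True
print(within_return_window(date(2026, 1, 1), date(2026, 4, 15)))  # False
```
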
"""
    )
    out += _chunk("python", "mock-data-py", "mock_data.py", mock_data_py) + "\n\n"

    out += textwrap.dedent(
"""\
# Tools: Layer 3 enforcement

Four tools back the agent: `lookup_order`, `check_return_eligibility`,
`initiate_return`, and `lookup_policy`. Each has an Anthropic-format
schema (used in the `tools` argument to `messages.create`) and a
handler function that takes a validated arg dict plus the
per-session guard state and returns a dict that becomes the
`tool_result` content sent back to the model.

The most important guardrail in the entire system lives in this
file. `handle_initiate_return` refuses unless
`check_return_eligibility` has already succeeded for the same
order in the same session. This is enforced in code, not in the
prompt -- if a model somehow decides to skip the eligibility
check, the tool itself refuses. This is "Layer 3" in the stack:
the model's last line of defence against itself.

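
In sketch form (the guard-state shape here is an assumption; the
real handler lives in the chunk below):

```python
def handle_initiate_return(args: dict, guard_state: dict) -> dict:
    order_id = args["order_id"]
    if order_id not in guard_state.get("eligibility_ok", set()):
        # Refuse in code, regardless of what the model decided: no return
        # without a prior successful eligibility check in this session.
        return {"error": "eligibility_not_checked", "order_id": order_id}
    return {"rma_id": "RMA-0007", "order_id": order_id}

state = {"eligibility_ok": {"BK-1001"}}
print(handle_initiate_return({"order_id": "BK-2002"}, state)["error"])
print(handle_initiate_return({"order_id": "BK-1001"}, state)["rma_id"])
```
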
A second guardrail is the privacy boundary in `handle_lookup_order`.
When a caller supplies a `customer_email` and it does not match
the email on the order, the handler returns the same
`order_not_found` error as a missing order. This mirroring means an
attacker cannot probe for which order IDs exist by watching
response differences. The check uses `hmac.compare_digest` for
constant-time comparison so response-time side channels cannot
leak the correct email prefix either.

Input validation lives in `_require_*` helpers at the top of the
file. Every string is control-character-stripped before length
checks so a malicious `\\x00` byte injected into a tool arg cannot
sneak into the tool result JSON and reappear in the next turn's
prompt. Order IDs, emails, and policy topics are validated with
tight regexes; unexpected input becomes a structured
`invalid_arguments` error that the model can recover from on its
next turn.

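
One such validator, sketched (the helper name and length cap are
illustrative, not the real ones):

```python
import re

ORDER_ID_RE = re.compile("^BK-[0-9]{4}$")

def require_order_id(raw: object, max_len: int = 16) -> str:
    if not isinstance(raw, str):
        raise ValueError("invalid_arguments: order_id must be a string")
    # Strip control characters before the length check so they cannot
    # smuggle bytes into the tool-result JSON.
    cleaned = "".join(ch for ch in raw if ch.isprintable()).strip()
    if len(cleaned) > max_len or not ORDER_ID_RE.fullmatch(cleaned):
        raise ValueError("invalid_arguments: malformed order_id")
    return cleaned

print(require_order_id("BK-1001"))  # BK-1001
```

A `ValueError` here is caught by the dispatcher and turned into the
structured error the model sees -- and unlike an `assert`, it
survives running under `python -O`.
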
`TypedDict` argument shapes make the schema-to-handler contract
visible to the type checker without losing runtime validation --
the model is an untrusted caller, so the runtime checks stay.

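
For example (field names assumed; the real shapes are in the chunk
below):

```python
from typing import TypedDict

class CheckReturnEligibilityArgs(TypedDict):
    # Assumed field name -- illustrative of the schema-to-handler contract.
    order_id: str

def handle_check_return_eligibility(args: CheckReturnEligibilityArgs) -> dict:
    # Static shape for the type checker; runtime validation stays,
    # because the model is an untrusted caller.
    return {"order_id": args["order_id"], "eligible": True}

print(handle_check_return_eligibility({"order_id": "BK-1001"})["eligible"])  # True
```
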
"""
    )
    out += _chunk("python", "tools-py", "tools.py", tools_py) + "\n\n"

    out += textwrap.dedent(
"""\
# Agent loop

This is the biggest file. It wires everything together: the system
prompt, runtime reminders, output validation (Layer 4), the
in-memory session store with per-session locking, the cached
Anthropic client, and the actual tool-use loop that drives a turn
end to end.

## System prompt

The prompt is structured with XML-style tags (`<identity>`,
`<critical_rules>`, `<scope>`, `<return_policy>`, `<tool_rules>`,
`<tone>`, `<examples>`, `<reminders>`). The critical rules are
stated up front and repeated at the bottom (primacy plus recency).
The return policy section interpolates the `RETURN_POLICY` dict
verbatim via `_format_return_policy_block`, so the prompt and the
enforcement in `tools.py` cannot disagree.

Four few-shot examples are embedded directly in the prompt. Each
one demonstrates a case that is easy to get wrong: missing order
ID, quoting a policy verbatim, refusing an off-topic request,
disambiguating between two orders.

## Runtime reminders

On every turn, `build_system_content` appends a short
`CRITICAL_REMINDER` block to the system content. Once the turn
count crosses `LONG_CONVERSATION_TURN_THRESHOLD`, a second
`LONG_CONVERSATION_REMINDER` is added. The big `SYSTEM_PROMPT`
block is the only one marked `cache_control: ephemeral` -- the
reminders vary per turn and we want them at the
highest-attention position, not in the cached prefix.

## Layer 4 output validation

After the model produces its final reply, `validate_reply` runs
four cheap deterministic checks: every `BK-NNNN` string in the
reply must also appear in a tool result from this turn, every
ISO date in the reply must appear in a tool result, the reply
must not contain markdown, and if the reply contains off-topic
engagement phrases it must also contain the refusal template.
Violations are collected and returned as a frozen
`ValidationResult`.

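
The order-ID grounding check can be sketched like this (illustrative,
not the real implementation):

```python
import re

ORDER_ID = re.compile("BK-[0-9]{4}")

def ungrounded_order_ids(reply: str, tool_results_text: str) -> list[str]:
    # Layer-4 grounding: every order ID mentioned in the reply must also
    # appear in this turn's tool results, else it is treated as fabricated.
    seen = set(ORDER_ID.findall(tool_results_text))
    return [oid for oid in ORDER_ID.findall(reply) if oid not in seen]

print(ungrounded_order_ids("Your order BK-1001 shipped.", "BK-1001 delivered"))  # []
print(ungrounded_order_ids("Order BK-9999 is fine.", "BK-1001 delivered"))       # ['BK-9999']
```
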
The off-topic patterns used to be loose substring matches on a
keyword set. That false-positived on plenty of legitimate support
replies ("I'd recommend contacting..."). The current patterns
use word boundaries so only the intended phrases trip them.

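
The difference in miniature (patterns here are invented for
illustration; boundary anchoring is written with lookarounds, which
behave like word boundaries):

```python
import re

def loose_match(text: str) -> bool:
    # The old approach: a bare substring test over a keyword list.
    return "contact" in text.lower()

# The new approach: a full phrase, anchored so it cannot fire inside
# a longer word or an unrelated sentence.
ANCHORED = re.compile("(?i)(?<![a-z])happy to chat about(?![a-z])")

legit = "I'd recommend contacting our support line."
print(loose_match(legit))            # True  -- false positive
print(bool(ANCHORED.search(legit)))  # False -- no trip
print(bool(ANCHORED.search("I'm happy to chat about movies!")))  # True
```
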
## Session store

`SessionStore` is a bounded in-memory LRU with an idle TTL. It
stores `Session` objects (history, guard state, turn count) keyed
by opaque server-issued session IDs. It also owns the per-session
locks used to serialize concurrent turns for the same session,
since FastAPI runs the sync `chat` handler in a threadpool and
two simultaneous requests for the same session would otherwise
corrupt the conversation history.

The locks dict is itself protected by a class-level lock so two
threads trying to create the first lock for a session cannot race
into two different lock instances.

Under the "single-process demo deployment" constraint this is
enough. For multi-worker, the whole class would be swapped for
a Redis-backed equivalent.

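
The lock-creation pattern, in miniature:

```python
import threading

class SessionLocks:
    # The dict of per-session locks is itself guarded by one outer lock,
    # so two threads racing to create the first lock for a session cannot
    # end up holding two different lock objects.
    def __init__(self) -> None:
        self._guard = threading.Lock()
        self._locks: dict[str, threading.Lock] = {}

    def lock_for(self, session_id: str) -> threading.Lock:
        with self._guard:
            return self._locks.setdefault(session_id, threading.Lock())

locks = SessionLocks()
a = locks.lock_for("s1")
b = locks.lock_for("s1")
print(a is b)  # True: both callers serialize on the same lock instance
```
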
## The tool-use loop

`_run_tool_use_loop` drives the model until it stops asking for
tools. It is bounded by `settings.max_tool_use_iterations` so a
runaway model cannot burn credit in an infinite loop. Each
iteration serializes the assistant's content blocks into history,
dispatches every requested tool, packs the results into a single
`tool_result` user-role message, and calls Claude again. Before
each call, `_with_last_message_cache_breakpoint` stamps the last
message with `cache_control: ephemeral` so prior turns do not
need to be re-tokenized on every call. This turns the per-turn
input-token cost from `O(turns^2)` into `O(turns)` across a
session.

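
The stamping step can be sketched as follows (message shapes follow
the Anthropic Messages API; the helper here is a simplified
stand-in, not the real one):

```python
def with_last_message_cache_breakpoint(messages: list[dict]) -> list[dict]:
    # Stamp the final content block of the final message so the entire
    # prefix before it can be served from the prompt cache.
    if not messages:
        return messages
    stamped = [dict(m) for m in messages]
    content = stamped[-1]["content"]
    if isinstance(content, str):
        content = [{"type": "text", "text": content}]
    content = [dict(b) for b in content]
    content[-1] = {**content[-1], "cache_control": {"type": "ephemeral"}}
    stamped[-1]["content"] = content
    return stamped

history = [{"role": "user", "content": "Where is order BK-1001?"}]
stamped = with_last_message_cache_breakpoint(history)
print(stamped[-1]["content"][-1]["cache_control"])  # {'type': 'ephemeral'}
```

The helper copies rather than mutates, so the stored history stays
clean and the breakpoint moves forward each call.
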
## run_turn

`run_turn` is the top-level entry point the server calls. It
validates its inputs, acquires the per-session lock, appends the
user message, runs the loop, and then either persists the final
reply to history or -- on validation failure -- drops the bad
reply and returns a safe fallback. Dropping a bad reply from
history is important: it prevents a hallucinated claim from
poisoning subsequent turns.

Warning logs never include the reply body. Session IDs and reply
contents are logged only as short SHA-256 hashes for correlation,
which keeps PII out of the log pipeline even under active
incident response.

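
The hashing idea, sketched (the 12-character length is an
assumption, not the real value):

```python
import hashlib
import logging

def short_hash(value: str) -> str:
    # Stable short token: enough to correlate log lines across a session,
    # useless for recovering the session ID or the reply text.
    return hashlib.sha256(value.encode()).hexdigest()[:12]

logging.warning(
    "reply validation failed session=%s reply_sha=%s",
    short_hash("sess-abc123"),
    short_hash("the rejected reply body"),
)
```
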
"""
    )
    out += _chunk("python", "agent-py", "agent.py", agent_py) + "\n\n"

    out += textwrap.dedent(
"""\
# HTTP surface

The FastAPI app exposes four routes: `GET /health`, `GET /`
(redirects to `/static/index.html`), `POST /api/chat`, and
`GET /architecture` (this very document). Everything else is
deliberately missing -- the OpenAPI docs and redoc pages are
disabled so the public surface is as small as possible.

## Security headers

A middleware injects a strict Content-Security-Policy and
friends on every response. The CSP is defence in depth: the chat UI
in `static/chat.js` already renders model replies with
`textContent` rather than `innerHTML`, so XSS is structurally
impossible today. The CSP exists to catch any future regression
that accidentally switches to `innerHTML`.

The `/architecture` route overrides the middleware CSP with a
more permissive one because pandoc's standalone HTML has inline
styles.

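
Sketched (the header values here are assumptions; the real policy
strings live in the chunk below):

```python
# Illustrative header set in the spirit of the middleware.
SECURITY_HEADERS = {
    "Content-Security-Policy": "default-src 'self'",
    "X-Content-Type-Options": "nosniff",
    "X-Frame-Options": "DENY",
    "Referrer-Policy": "no-referrer",
    "Permissions-Policy": "camera=(), microphone=(), geolocation=()",
}

def add_security_headers(headers: dict) -> dict:
    # Applied to every response; route handlers may override per response.
    merged = dict(headers)
    merged.update(SECURITY_HEADERS)
    return merged

print(add_security_headers({})["X-Frame-Options"])  # DENY
```
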
## Sliding-window rate limiter

`SlidingWindowRateLimiter` keeps a deque of timestamps per key
and evicts anything older than the window. The `/api/chat`
handler checks twice per call -- once with an `ip:` prefix,
once with a `session:` prefix -- so a single attacker cannot
exhaust the per-session budget by rotating cookies, and a
legitimate user does not get locked out by a noisy neighbour on
the same IP.

Suitable for a single-process demo deployment. A multi-worker
deployment would externalize this to Redis.

## Session cookies

The client never chooses its own session ID. On the first
request a new random ID is minted, HMAC-signed with
`settings.session_secret`, and set in an HttpOnly, SameSite=Lax
cookie. Subsequent requests carry the cookie; the server
verifies the signature in constant time
(`hmac.compare_digest`) and trusts nothing else. A leaked or
guessed request body cannot hijack another user's conversation
because the session ID is not in the body at all.

## /api/chat

The handler resolves the session, checks both rate limits,
then calls into `agent.run_turn`. The Anthropic exception
hierarchy is caught explicitly so a rate-limit incident and a
code bug cannot look identical to operators:
`anthropic.RateLimitError` becomes 503, `APIConnectionError`
becomes 503, `APIStatusError` becomes 502, `ValueError` from
the agent becomes 400, and anything else becomes 500.

## /architecture

This is where the woven literate program is served. The handler
reads `static/architecture.html` (produced by pandoc from this
file) and returns it with a relaxed CSP. If the file does not
exist yet, the route 404s with a clear message rather than
raising a 500.

"""
    )
    out += _chunk("python", "server-py", "server.py", server_py) + "\n"

    out_path = ROOT / "Bookly.lit.md"
    out_path.write_text(out, encoding="utf-8")
    print(f"wrote {out_path} ({len(out.splitlines())} lines)")


if __name__ == "__main__":
    main()