bookly/config.py
Cody Borders 3947180841 Harden security/perf, add literate program at /architecture
Security and performance fixes addressing a comprehensive review:

- Server-issued HMAC-signed session cookies; client-supplied session_id
  ignored. Prevents session hijacking via body substitution.
- Sliding-window rate limiter per IP and per session.
- SessionStore with LRU eviction, idle TTL, per-session threading locks,
  and a hard turn cap. Bounds memory and serializes concurrent turns for
  the same session so FastAPI's threadpool cannot corrupt history.
- Tool-use loop capped at settings.max_tool_use_iterations; Anthropic
  client gets an explicit timeout. No more infinite-loop credit burn.
- Every tool argument is regex-validated, length-capped, and
  control-character-stripped. assert statements replaced with raised
  ValueError so python -O cannot silently strip the checks.
- PII-safe warning logs: session IDs and reply bodies are hashed, never
  logged in clear.
- hmac.compare_digest for email comparison (constant-time).
- Strict Content-Security-Policy plus X-Content-Type-Options,
  X-Frame-Options, Referrer-Policy, Permissions-Policy via middleware.
- Explicit handlers for anthropic.RateLimitError, APIConnectionError,
  APIStatusError, ValueError; static dir resolved from __file__.
- Prompt cache breakpoints on the last tool schema and the last message
  so per-turn input cost scales linearly, not quadratically.
- TypedDict handler argument shapes; direct block.name/block.id access.
- functools.lru_cache on _get_client.
- Anchored word-boundary regexes for out-of-scope detection to kill
  false positives on phrases like "I'd recommend contacting...".
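
The first bullet (server-issued HMAC-signed session cookies) can be illustrated with a minimal stdlib sketch. This is not the code in server.py: the names sign_session_id, verify_cookie, issue_cookie and the demo key are hypothetical, and the "id.signature" cookie layout is an assumption.

```python
import hashlib
import hmac
import secrets
from typing import Optional

# Hypothetical demo key; a real deployment would use the configured
# session secret rather than a literal in source.
DEMO_SECRET = b"demo-secret-do-not-use"


def sign_session_id(session_id: str, secret: bytes = DEMO_SECRET) -> str:
    """Return 'session_id.signature', usable as a cookie value."""
    mac = hmac.new(secret, session_id.encode("utf-8"), hashlib.sha256).hexdigest()
    return f"{session_id}.{mac}"


def verify_cookie(cookie_value: str, secret: bytes = DEMO_SECRET) -> Optional[str]:
    """Return the session ID if the signature is valid, else None."""
    session_id, sep, mac = cookie_value.rpartition(".")
    if not sep or not session_id:
        return None
    expected = hmac.new(secret, session_id.encode("utf-8"), hashlib.sha256).hexdigest()
    # Compare as bytes in constant time so timing does not reveal how much
    # of the signature matched.
    if hmac.compare_digest(mac.encode("utf-8"), expected.encode("utf-8")):
        return session_id
    return None


def issue_cookie(secret: bytes = DEMO_SECRET) -> str:
    """Mint a fresh random session ID on the server and sign it."""
    return sign_session_id(secrets.token_urlsafe(16), secret)
```

Because the ID is minted and signed on the server, a client that substitutes another session_id cannot produce a matching signature without the key.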

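The SessionStore bullet above combines LRU eviction with an idle TTL. A rough sketch of those two bounds, assuming an OrderedDict-backed store, might look like this. BoundedSessionStore and its methods are hypothetical names, and the per-session locks and turn cap from the real store are left out.

```python
import time
from collections import OrderedDict
from typing import Callable, List, Tuple


class BoundedSessionStore:
    """Hypothetical store: LRU eviction plus idle-TTL expiry (not thread-safe)."""

    def __init__(
        self,
        max_entries: int,
        idle_ttl_seconds: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.max_entries = max_entries
        self.idle_ttl_seconds = idle_ttl_seconds
        self._clock = clock
        # session_id -> (last_seen, history), ordered least- to most-recent.
        self._entries: "OrderedDict[str, Tuple[float, List[str]]]" = OrderedDict()

    def get_or_create(self, session_id: str) -> List[str]:
        now = self._clock()
        entry = self._entries.pop(session_id, None)
        if entry is not None and now - entry[0] > self.idle_ttl_seconds:
            entry = None  # idle for too long: start a fresh history
        history = entry[1] if entry is not None else []
        # Re-inserting moves the session to the most-recent end.
        self._entries[session_id] = (now, history)
        while len(self._entries) > self.max_entries:
            self._entries.popitem(last=False)  # evict least recently used
        return history

    def __contains__(self, session_id: str) -> bool:
        return session_id in self._entries

    def __len__(self) -> int:
        return len(self._entries)
```

The clock is injectable only so the expiry path can be exercised without sleeping.
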
Literate program:

- Bookly.lit.md is now the single source of truth for the five core
  Python files. Tangles byte-for-byte; verified via tangle.ts --verify.
- Prose walkthrough, three mermaid diagrams, narrative per module.
- Woven to static/architecture.html with the app's palette
  (background #f5f3ee) via scripts/architecture-header.html.
- New GET /architecture route serves the HTML with a relaxed CSP that
  allows pandoc's inline styles. Available at
  bookly.codyborders.com/architecture.
- scripts/rebuild_architecture_html.sh regenerates the HTML after edits.
- code_reviews/2026-04-15-1433-code-review.md captures the review that
  drove these changes.

All 37 tests pass.
2026-04-15 15:02:40 -07:00


"""Application configuration loaded from environment variables.

Settings are read from the environment and `.env` at process start. The
Anthropic API key is the only value without a default; the session-cookie
signing secret falls back to a random per-process value and should be set
explicitly outside dev. Everything else has a sensible default so the app can
boot in dev without ceremony.
"""
from __future__ import annotations

import secrets

from pydantic import Field, SecretStr
from pydantic_settings import BaseSettings, SettingsConfigDict


class Settings(BaseSettings):
    model_config = SettingsConfigDict(env_file=".env", env_file_encoding="utf-8", extra="ignore")

    # Required secrets -- wrapped in SecretStr so accidental logging or repr
    # does not leak them. Access the raw value with `.get_secret_value()`.
    anthropic_api_key: SecretStr

    # Signing key for the server-issued session cookie. A fresh random value is
    # generated at import if none is configured -- this means sessions do not
    # survive a process restart in dev, which is the desired behavior until a
    # real secret is set in the environment.
    session_secret: SecretStr = Field(default_factory=lambda: SecretStr(secrets.token_urlsafe(32)))

    anthropic_model: str = "claude-sonnet-4-5"
    max_tokens: int = 1024

    # Upper bound on the Anthropic HTTP call. A stuck request must not hold a
    # worker thread forever -- see the tool-use loop cap in agent.py for the
    # paired total-work bound.
    anthropic_timeout_seconds: float = 30.0

    server_host: str = "127.0.0.1"
    server_port: int = 8014

    # Session store bounds. Protects against a trivial DoS that opens many
    # sessions or drives a single session to unbounded history length.
    session_store_max_entries: int = 10_000
    session_idle_ttl_seconds: int = 1800  # 30 minutes
    max_turns_per_session: int = 40

    # Hard cap on iterations of the tool-use loop within a single turn. The
    # model should never legitimately need this many tool calls for a support
    # conversation -- the cap exists to stop a runaway loop.
    max_tool_use_iterations: int = 8

    # Per-minute sliding-window rate limits. Enforced by a tiny in-memory
    # limiter in server.py; suitable for a single-process demo deployment.
    rate_limit_per_ip_per_minute: int = 30
    rate_limit_per_session_per_minute: int = 20

    # Session cookie configuration.
    session_cookie_name: str = "bookly_session"
    session_cookie_secure: bool = False  # Flip to True behind HTTPS.
    session_cookie_max_age_seconds: int = 60 * 60 * 8  # 8 hours

# The type ignore is needed because pydantic-settings reads `anthropic_api_key`
# from the environment / .env at runtime, but mypy sees it as a required
# constructor argument and has no way to know about that. `session_secret`
# has a default factory, so mypy does not flag it.
settings = Settings()  # type: ignore[call-arg]
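
The per-minute rate limits configured above are enforced by a limiter in server.py that is not shown here. As a rough illustration of the sliding-window technique only, a stdlib sketch could look like the following. SlidingWindowLimiter and allow are hypothetical names, and unlike whatever the real limiter does, this version takes no lock.

```python
import time
from collections import defaultdict, deque
from typing import Callable, Deque, Dict


class SlidingWindowLimiter:
    """Hypothetical limiter: at most `limit` hits per key per window."""

    def __init__(
        self,
        limit: int,
        window_seconds: float = 60.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.limit = limit
        self.window_seconds = window_seconds
        self._clock = clock
        # key (IP or session ID) -> timestamps of accepted hits, oldest first.
        self._hits: Dict[str, Deque[float]] = defaultdict(deque)

    def allow(self, key: str) -> bool:
        now = self._clock()
        hits = self._hits[key]
        # Drop timestamps that have slid out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.limit:
            return False  # over the limit; the rejected hit is not recorded
        hits.append(now)
        return True


# Usage sketch: one limiter per dimension, mirroring the two settings above.
ip_limiter = SlidingWindowLimiter(limit=30)
session_limiter = SlidingWindowLimiter(limit=20)
```

A request would be admitted only if both `ip_limiter.allow(ip)` and `session_limiter.allow(session_id)` return True. As the comment in the settings notes, in-memory state like this suits a single-process deployment only.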