Security and performance fixes addressing a comprehensive review:

- Server-issued HMAC-signed session cookies; client-supplied session_id
  ignored. Prevents session hijacking via body substitution.
- Sliding-window rate limiter per IP and per session.
- SessionStore with LRU eviction, idle TTL, per-session threading locks,
  and a hard turn cap. Bounds memory and serializes concurrent turns for
  the same session so FastAPI's threadpool cannot corrupt history.
- Tool-use loop capped at settings.max_tool_use_iterations; Anthropic
  client gets an explicit timeout. No more infinite-loop credit burn.
- Every tool argument is regex-validated, length-capped, and
  control-character-stripped. asserts replaced with ValueError so -O
  cannot silently disable the checks.
- PII-safe warning logs: session IDs and reply bodies are hashed, never
  logged in clear.
- hmac.compare_digest for email comparison (constant-time).
- Strict Content-Security-Policy plus X-Content-Type-Options,
  X-Frame-Options, Referrer-Policy, Permissions-Policy via middleware.
- Explicit handlers for anthropic.RateLimitError, APIConnectionError,
  APIStatusError, ValueError; static dir resolved from __file__.
- Prompt cache breakpoints on the last tool schema and the last message
  so per-turn input cost scales linearly, not quadratically.
- TypedDict handler argument shapes; direct block.name/block.id access.
- functools.lru_cache on _get_client.
- Anchored word-boundary regexes for out-of-scope detection to kill
  false positives on phrases like "I'd recommend contacting...".

Literate program:

- Bookly.lit.md is now the single source of truth for the five core
  Python files. Tangles byte-for-byte; verified via tangle.ts --verify.
- Prose walkthrough, three mermaid diagrams, narrative per module.
- Woven to static/architecture.html with the app's palette (background
  #f5f3ee) via scripts/architecture-header.html.
- New GET /architecture route serves the HTML with a relaxed CSP that
  allows pandoc's inline styles. Available at
  bookly.codyborders.com/architecture.
- scripts/rebuild_architecture_html.sh regenerates the HTML after edits.
- code_reviews/2026-04-15-1433-code-review.md captures the review that
  drove these changes.

All 37 tests pass.
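The first bullet (server-issued HMAC-signed session cookies) can be illustrated with a minimal sketch. The helper names `sign_session_id` and `verify_session_cookie` and the `id.signature` cookie layout are assumptions made for illustration; the actual server code may be structured differently.

```python
import hashlib
import hmac
from typing import Optional


def sign_session_id(session_id: str, secret: bytes) -> str:
    """Return 'session_id.signature' (hypothetical cookie layout)."""
    signature = hmac.new(secret, session_id.encode("utf-8"), hashlib.sha256).hexdigest()
    return f"{session_id}.{signature}"


def verify_session_cookie(cookie_value: str, secret: bytes) -> Optional[str]:
    """Return the session ID if the signature is valid, otherwise None."""
    # Split on the last dot: the hex signature never contains one.
    session_id, separator, signature = cookie_value.rpartition(".")
    if not separator or not session_id:
        return None
    expected = hmac.new(secret, session_id.encode("utf-8"), hashlib.sha256).hexdigest()
    # Constant-time comparison on bytes, so a forged cookie cannot probe
    # the signature one character at a time.
    if hmac.compare_digest(signature.encode("utf-8"), expected.encode("utf-8")):
        return session_id
    return None
```

Because only the server holds the secret, a client that substitutes a different session ID cannot produce a matching signature, so the cookie is rejected.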
64 lines
2.7 KiB
Python
"""Application configuration loaded from environment variables.

Settings are read from `.env` at process start. The Anthropic API key and
the session-cookie signing secret are the only required values; everything
else has a sensible default so the app can boot in dev without ceremony.
"""

from __future__ import annotations

import secrets

from pydantic import Field, SecretStr
from pydantic_settings import BaseSettings, SettingsConfigDict


class Settings(BaseSettings):
    model_config = SettingsConfigDict(env_file=".env", env_file_encoding="utf-8", extra="ignore")

    # Required secrets -- wrapped in SecretStr so accidental logging or repr
    # does not leak them. Access the raw value with `.get_secret_value()`.
    anthropic_api_key: SecretStr
    # Signing key for the server-issued session cookie. A fresh random value is
    # generated at import if none is configured -- this means sessions do not
    # survive a process restart in dev, which is the desired behavior until a
    # real secret is set in the environment.
    session_secret: SecretStr = Field(default_factory=lambda: SecretStr(secrets.token_urlsafe(32)))

    anthropic_model: str = "claude-sonnet-4-5"
    max_tokens: int = 1024
    # Upper bound on the Anthropic HTTP call. A stuck request must not hold a
    # worker thread forever -- see the tool-use loop cap in agent.py for the
    # paired total-work bound.
    anthropic_timeout_seconds: float = 30.0

    server_host: str = "127.0.0.1"
    server_port: int = 8014

    # Session store bounds. Protects against a trivial DoS that opens many
    # sessions or drives a single session to unbounded history length.
    session_store_max_entries: int = 10_000
    session_idle_ttl_seconds: int = 1800  # 30 minutes
    max_turns_per_session: int = 40

    # Hard cap on iterations of the tool-use loop within a single turn. The
    # model should never legitimately need this many tool calls for a support
    # conversation -- the cap exists to stop a runaway loop.
    max_tool_use_iterations: int = 8

    # Per-minute sliding-window rate limits. Enforced by a tiny in-memory
    # limiter in server.py; suitable for a single-process demo deployment.
    rate_limit_per_ip_per_minute: int = 30
    rate_limit_per_session_per_minute: int = 20

    # Session cookie configuration.
    session_cookie_name: str = "bookly_session"
    session_cookie_secure: bool = False  # Flip to True behind HTTPS.
    session_cookie_max_age_seconds: int = 60 * 60 * 8  # 8 hours


# The type ignore is needed because pydantic-settings reads `anthropic_api_key`
# and `session_secret` from environment / .env at runtime, but mypy sees them as
# required constructor arguments and has no way to know about that.
settings = Settings()  # type: ignore[call-arg]
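
The `rate_limit_per_ip_per_minute` and `rate_limit_per_session_per_minute` settings are described as feeding a small in-memory sliding-window limiter in server.py. That limiter is not shown here, so the following is only a sketch of how such a limiter might look; the class name `SlidingWindowLimiter`, its `allow` method, and the injectable `clock` parameter are all hypothetical.

```python
import threading
import time
from collections import defaultdict, deque
from typing import Callable, Deque, Dict


class SlidingWindowLimiter:
    """Allow at most `limit` events per key within the trailing window."""

    def __init__(
        self,
        limit: int,
        window_seconds: float = 60.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._limit = limit
        self._window = window_seconds
        self._clock = clock  # injectable so tests do not need to sleep
        self._events: Dict[str, Deque[float]] = defaultdict(deque)
        self._lock = threading.Lock()  # handlers may run on several threads

    def allow(self, key: str) -> bool:
        """Record an event for `key` and return True if it is within the limit."""
        now = self._clock()
        with self._lock:
            events = self._events[key]
            # Drop timestamps that have slid out of the window.
            while events and now - events[0] >= self._window:
                events.popleft()
            if len(events) >= self._limit:
                return False
            events.append(now)
            return True
```

Under these assumptions, one instance built with `limit=settings.rate_limit_per_ip_per_minute` would be keyed by client IP and a second with `limit=settings.rate_limit_per_session_per_minute` keyed by session ID. State lives in process memory, which matches the single-process deployment the settings comment describes; several workers would each keep their own counts.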