Trust Boundaries

Untrusted Inputs

All fetched content is untrusted: HTTP responses, HTML, JavaScript, CSS, and images alike. Never execute, eval, or interpret fetched content outside a sandbox.

All URLs are untrusted. Validate scheme, host, and port. Block private IP ranges (10.0.0.0/8, 172.16.0.0/12, 192.168.0.0/16), link-local (169.254.0.0/16), and loopback (127.0.0.0/8) unless explicitly configured.
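The URL checks described above might be sketched as follows. This is a minimal illustration, not the project's implementation: the names `validate_url`, `ALLOWED_SCHEMES`, and `ALLOWED_PORTS` are hypothetical, and the allowed schemes and ports are assumptions. Only the blocked ranges come from the text.

```python
import ipaddress
from urllib.parse import urlsplit

# Ranges listed in the text: private, link-local, and loopback.
BLOCKED_NETWORKS = [
    ipaddress.ip_network(cidr)
    for cidr in (
        "10.0.0.0/8",
        "172.16.0.0/12",
        "192.168.0.0/16",
        "169.254.0.0/16",
        "127.0.0.0/8",
    )
]

# Assumptions for this sketch; a real policy would be configurable.
ALLOWED_SCHEMES = {"http", "https"}
ALLOWED_PORTS = {80, 443}


def is_blocked_address(host: str) -> bool:
    """Return True if host is an IP literal inside a blocked range."""
    try:
        addr = ipaddress.ip_address(host)
    except ValueError:
        # Not an IP literal. Hostnames must be checked again after DNS resolution.
        return False
    return any(addr in net for net in BLOCKED_NETWORKS)


def validate_url(url: str) -> bool:
    """Check scheme, host, and port; reject IP literals in blocked ranges."""
    try:
        parts = urlsplit(url)
        port = parts.port  # raises ValueError on a malformed port
    except ValueError:
        return False
    if parts.scheme not in ALLOWED_SCHEMES:
        return False
    host = parts.hostname
    if not host:
        return False
    if port is None:
        port = 443 if parts.scheme == "https" else 80
    if port not in ALLOWED_PORTS:
        return False
    return not is_blocked_address(host)
```

A check like this only covers IP literals in the URL itself. A hostname that passes here can still resolve to a blocked address, so the same ranges need to be applied to the resolved addresses as well.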

All DNS responses are untrusted. Record them in the ExecutionEnvelope for forensic replay, but verify every resolved address against policy before connecting. DNS rebinding attacks can point a public hostname at internal infrastructure.
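One way to apply the policy to DNS answers is to check every resolved address before opening a connection, then connect to a vetted address directly instead of resolving again. The sketch below assumes the blocked ranges listed earlier and a strict rule that rejects the whole response if any address is blocked. The names `check_resolved` and `PolicyViolation` are hypothetical.

```python
import ipaddress
from typing import List

# Same ranges as the URL policy: private, link-local, and loopback.
BLOCKED_NETWORKS = [
    ipaddress.ip_network(cidr)
    for cidr in (
        "10.0.0.0/8",
        "172.16.0.0/12",
        "192.168.0.0/16",
        "169.254.0.0/16",
        "127.0.0.0/8",
    )
]


class PolicyViolation(Exception):
    """Raised when a DNS answer contains a disallowed address."""


def check_resolved(addresses: List[str]) -> List[str]:
    """Return the addresses unchanged if all pass policy, else raise.

    Rejecting the whole answer when any single address is blocked is an
    assumption of this sketch; a more lenient policy could filter instead.
    """
    if not addresses:
        raise PolicyViolation("empty DNS answer")
    for raw in addresses:
        addr = ipaddress.ip_address(raw)  # ValueError on malformed input
        if any(addr in net for net in BLOCKED_NETWORKS):
            raise PolicyViolation(f"resolved address {raw} is blocked")
    return addresses
```

Connecting to one of the returned addresses, rather than handing the hostname back to the resolver, closes the gap in which a second lookup could return a different answer.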

All TLS certificates are recorded. The full certificate chain is stored in the envelope’s TlsFingerprint (protocol, cipher, cert chain hash). This enables forensic analysis of TLS state at capture time.

Storage Integrity

All artifacts are content-addressed. Tampering is detectable by recomputing the BLAKE3 hash and comparing it against the stored ContentHash. This verification happens on every read.
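The verify-on-read step can be illustrated with a short sketch. Python's standard library does not include BLAKE3, so this example substitutes BLAKE2b from `hashlib` purely as a stand-in; a real store would use a BLAKE3 binding. The names `content_hash`, `verified_read`, and `IntegrityError` are hypothetical.

```python
import hashlib


class IntegrityError(Exception):
    """Raised when stored bytes do not match their recorded hash."""


def content_hash(data: bytes) -> str:
    # Stand-in only: BLAKE2b with a 32-byte digest, NOT BLAKE3.
    return hashlib.blake2b(data, digest_size=32).hexdigest()


def verified_read(data: bytes, stored_hash: str) -> bytes:
    """Recompute the hash of data and compare it with the stored value."""
    actual = content_hash(data)
    if actual != stored_hash:
        raise IntegrityError(f"hash mismatch: expected {stored_hash}, got {actual}")
    return data
```

Because the address of an artifact is derived from its bytes, any modification changes the recomputed hash and the read fails instead of returning altered content.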

Storage backends must support atomic writes. The FileSystemBlobStore uses temp-file-plus-rename to prevent partial artifacts from being visible.
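The temp-file-plus-rename pattern can be sketched as below. This is an illustration of the general technique, not the FileSystemBlobStore code; the function name `atomic_write` is hypothetical.

```python
import os
import tempfile


def atomic_write(path: str, data: bytes) -> None:
    """Write data so that readers see either the old file or the complete new one."""
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temp file in the target directory so the rename stays on
    # one filesystem, which is what makes the replace atomic.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
            f.flush()
            os.fsync(f.fileno())  # push bytes to disk before the rename
        os.replace(tmp_path, path)  # atomic swap into the final name
    except BaseException:
        # Clean up the temp file if anything failed before the rename.
        try:
            os.unlink(tmp_path)
        except FileNotFoundError:
            pass
        raise
```

A reader that opens the final path never observes a half-written artifact, because the final name only appears once the temp file is complete.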

Blob deletion requires an explicit garbage collection pass and never happens inline during normal operation.

Credential Safety

  • No credentials in source code, configuration files committed to git, or artifact metadata
  • HTTP auth credentials (for authenticated crawls) are injected via environment variables
  • TLS client certificates are loaded from a configured path, never embedded
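Injecting HTTP auth credentials through the environment might look like the following sketch. The variable names `CRAWLER_HTTP_USER` and `CRAWLER_HTTP_PASSWORD` and the function `load_http_credentials` are assumptions made for illustration; the source does not name them.

```python
import os
from typing import Mapping, Optional, Tuple


def load_http_credentials(
    env: Mapping[str, str] = os.environ,
) -> Optional[Tuple[str, str]]:
    """Read HTTP auth credentials from the environment.

    Returns None when no credentials are configured and raises when only
    one of the two values is present.
    """
    # Hypothetical variable names, chosen for this example only.
    user = env.get("CRAWLER_HTTP_USER")
    password = env.get("CRAWLER_HTTP_PASSWORD")
    if user is None and password is None:
        return None
    if not user or not password:
        raise ValueError("both user and password must be set")
    return user, password
```

Passing the environment as a parameter keeps the function easy to test without touching the real process environment, and the returned values should never be written into logs or artifact metadata.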