palimpsest-cli
Command-line interface with 10 subcommands. Thin wrapper around the kernel crates.
crawl
Start a crawl with seed URLs.
palimpsest crawl <SEEDS>... [OPTIONS]
-d, --depth <N> Max crawl depth [default: 2]
-m, --max-urls <N> Max URLs to fetch [default: 100]
-s, --seed <N> Deterministic seed [default: 42]
-o, --output-dir <DIR> Persist to disk
--browser Headless Chrome capture
--user-agent <UA> User-Agent [default: PalimpsestBot/0.1]
--politeness-ms <N> Per-host delay in ms [default: 1000]
-c, --config <FILE> TOML config file
replay
palimpsest replay <URL> --data-dir <DIR>
history
palimpsest history <URL> --data-dir <DIR>
extract
palimpsest extract <URL> --data-dir <DIR> [--json]
shadow-compare
palimpsest shadow-compare --legacy <DIR> --palimpsest <DIR> [--json]
serve
Start a distributed frontier server.
palimpsest serve --port <PORT> --seed <N> --politeness-ms <N>
Default port: 8090.
worker
Connect to a frontier server and crawl.
palimpsest worker --server <URL> --output-dir <DIR> [--user-agent <UA>]
api
Start the retrieval API server.
palimpsest api --port <PORT> --data-dir <DIR>
Default port: 8080.
stats
Print workspace statistics.
palimpsest stats
migrate
Run storage migrations (JSON index to SQLite).
palimpsest migrate --data-dir <DIR>