Retrieval API

The retrieval API serves captured content over HTTP for AI pipelines, RAG systems, and content auditing.

Start the Server

palimpsest api --port 8080 --data-dir ./output

Endpoints

GET /v1/content

Retrieve raw captured content for a URL.

curl "http://localhost:8080/v1/content?url=https://example.com/"

Returns the stored HTTP response body.
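The target URL travels as a query parameter, so a URL that carries its own query string or fragment likely needs percent-encoding before it is sent. The following is a minimal Python sketch using only the standard library. The helper names `build_request_url` and `fetch_content` are illustrative and not part of Palimpsest, the base address assumes the server was started on port 8080 as shown earlier, and the sketch assumes the server decodes query parameters in the standard way.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

BASE = "http://localhost:8080"  # assumption: server started with --port 8080


def build_request_url(endpoint: str, **params: str) -> str:
    """Return BASE + endpoint with params percent-encoded into the query string."""
    return f"{BASE}{endpoint}?{urlencode(params)}"


def fetch_content(target_url: str) -> bytes:
    """Fetch the stored response body for target_url (needs a running server)."""
    with urlopen(build_request_url("/v1/content", url=target_url)) as resp:
        return resp.read()


if __name__ == "__main__":
    # Characters such as ?, & and = inside the target URL are escaped,
    # so they are not mistaken for separators in the API's own query string.
    print(build_request_url("/v1/content", url="https://example.com/?a=1&b=2"))
```

The same `build_request_url` helper works for the other endpoints on this page that take a `url` or `q` parameter.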

GET /v1/chunks

Retrieve RAG-ready chunks with full provenance.

curl "http://localhost:8080/v1/chunks?url=https://example.com/"

Response:

{
  "url": "https://example.com/",
  "chunks": [
    {
      "text": "Example Domain. This domain is for use in illustrative examples...",
      "chunk_index": 0,
      "total_chunks": 3,
      "char_offset": 0,
      "chunk_hash": "blake3:af13...",
      "source_hash": "blake3:c7d2...",
      "captured_at": "2026-04-12T10:30:00Z"
    }
  ]
}
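To illustrate how a pipeline might consume this response, the sketch below parses the JSON into typed records and pairs each chunk's text with its provenance fields, ready to store next to an embedding. It assumes the response has the fields shown above; the names `Chunk`, `parse_chunks`, and `embedding_records`, and the choice of `chunk_hash` as a record id, are hypothetical and not part of Palimpsest.

```python
import json
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class Chunk:
    text: str
    chunk_index: int
    total_chunks: int
    char_offset: int
    chunk_hash: str
    source_hash: str
    captured_at: str


def parse_chunks(payload: str) -> list[Chunk]:
    """Parse a /v1/chunks response body into Chunk records ordered by chunk_index."""
    known = {f.name for f in fields(Chunk)}
    items = json.loads(payload)["chunks"]
    # Ignore any fields not shown in the documented example.
    chunks = [Chunk(**{k: v for k, v in item.items() if k in known}) for item in items]
    return sorted(chunks, key=lambda c: c.chunk_index)


def embedding_records(chunks: list[Chunk], url: str) -> list[dict]:
    """Pair each chunk's text with the provenance needed to trace it back to a capture."""
    return [
        {
            "id": c.chunk_hash,  # assumption: chunk_hash is usable as a record id
            "text": c.text,
            "metadata": {
                "url": url,
                "source_hash": c.source_hash,
                "captured_at": c.captured_at,
                "char_offset": c.char_offset,
            },
        }
        for c in chunks
    ]
```

Keeping `source_hash` and `captured_at` with every record lets a retrieved passage be traced back to the exact capture it came from.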

GET /v1/history

List all captures of a URL with timestamps and content hashes.

curl "http://localhost:8080/v1/history?url=https://example.com/"

Response:

{
  "url": "https://example.com/",
  "captures": [
    {"captured_at": "2026-04-12T10:30:00Z", "content_hash": "blake3:af13...", "crawl_context": 1},
    {"captured_at": "2026-04-13T08:00:00Z", "content_hash": "blake3:b8e2...", "crawl_context": 2}
  ]
}
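For content auditing, consecutive captures can be compared by `content_hash` to find when a page changed. The sketch below does this over a response shaped like the one above; the function name `content_changes` is hypothetical, and sorting by `captured_at` as a string assumes every timestamp uses the same UTC format as in the example.

```python
import json


def content_changes(payload: str) -> list[tuple[str, str, str]]:
    """Return (captured_at, old_hash, new_hash) for each capture whose hash
    differs from the capture immediately before it."""
    captures = json.loads(payload)["captures"]
    # Same-format UTC timestamps sort chronologically as plain strings.
    captures = sorted(captures, key=lambda c: c["captured_at"])
    changes = []
    for prev, cur in zip(captures, captures[1:]):
        if cur["content_hash"] != prev["content_hash"]:
            changes.append((cur["captured_at"], prev["content_hash"], cur["content_hash"]))
    return changes
```

Applied to the example response, this reports a single change at the second capture, since the two hashes differ.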

GET /v1/search

Search across captured content.

curl "http://localhost:8080/v1/search?q=example+domain"

GET /metrics

Prometheus-compatible metrics (see Monitoring).

GET /health

curl http://localhost:8080/health
# "ok"
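A script that launches the server may want to wait for this endpoint before sending requests. The following is a minimal polling sketch, assuming a healthy server answers `GET /health` with HTTP 200; the names `probe` and `wait_until_healthy`, the retry count, and the delay are illustrative choices, not Palimpsest defaults.

```python
import time
from typing import Callable
from urllib.error import URLError
from urllib.request import urlopen


def probe(base: str = "http://localhost:8080") -> bool:
    """Return True if GET /health answers with HTTP 200 (assumed success status)."""
    try:
        with urlopen(f"{base}/health", timeout=2) as resp:
            return resp.status == 200
    except (URLError, OSError):
        # Connection refused, timeout, or an HTTP error status: not healthy yet.
        return False


def wait_until_healthy(
    check: Callable[[], bool] = probe, attempts: int = 10, delay: float = 0.5
) -> bool:
    """Call check up to `attempts` times, sleeping `delay` seconds after each failure."""
    for _ in range(attempts):
        if check():
            return True
        time.sleep(delay)
    return False
```

Passing the check as a parameter keeps the retry loop independent of the network call, which makes it easy to exercise without a running server.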

Use Cases

  • RAG pipelines: /v1/chunks provides pre-chunked text with provenance for embedding
  • Content auditing: /v1/history shows exactly when content changed
  • AI training: /v1/content serves raw captured pages
  • Search systems: /v1/search provides full-text search across the archive