
Laconic

Maximum meaning, minimum tokens.

Laconic compresses markdown documents for LLM workflows by applying lossless syntactic transformations. It strips decorative noise — badge images, padded tables, HTML wrappers, redundant whitespace — that costs tokens but carries no semantic value.

Named after the Spartans of Laconia, famous for expressing maximum meaning with minimum words. When Philip II of Macedon threatened “If I invade Laconia, I will raze Sparta,” the Spartans replied: “If.”

What It Does

Laconic removes structure that LLMs don’t need to understand your document:

| Before | After | Why |
|---|---|---|
| `[![Build](https://img.shields.io/...)](...)` | (removed) | Badges are visual, not semantic |
| `\| Col1 \| Col2 \|` with separator rows | `Col1,Col2` | CSV is more token-efficient |
| `<div style="padding: 20px">content</div>` | `content` | Decorative HTML wrappers |
| Three blank lines | One blank line | Whitespace normalization |
| Repeated inline URLs | Reference-style links | URL deduplication |

What It Never Touches

  • Prose text
  • Code blocks (contents preserved exactly)
  • Headings (structure preserved)
  • Lists
  • Anything that carries meaning

Three Ways to Use It

  1. CLI — `laconic compress README.md`
  2. Rust library — `laconic_core::compress(&text, &config)`
  3. MCP server — Agents call compress_markdown as a tool

Pick the one that fits your workflow. The next pages walk through each.

Installation

Requires Rust 1.75+.

git clone https://github.com/copyleftdev/laconic.git
cd laconic
cargo build --release

The binaries land in target/release/:

| Binary | Size | Purpose |
|---|---|---|
| `laconic` | ~5.6 MB | CLI tool |
| `laconic-mcp` | ~8.2 MB | MCP server for agents |

Add to PATH

# Copy to a directory in your PATH
cp target/release/laconic /usr/local/bin/
cp target/release/laconic-mcp /usr/local/bin/

# Verify
laconic --version

Verify Installation

# Compress a file and see the stats
echo "# Hello World" | laconic compress -

# Should output the compressed text to stdout
# and stats to stderr

CLI Conventions

Laconic is built with clap and follows standard POSIX conventions:

#   -j  for --json
#   -f  for --fast
#   -   for stdin
#   --  to end option parsing

# See all options:
laconic compress --help

Getting Started

This page walks through the three things you’ll do most: compress a file, estimate savings on a batch, and use fast mode.

Compress a Single File

laconic compress README.md

Compressed text goes to stdout. Stats go to stderr:

# README.md: 1648 → 1418 tokens (saved 230, 14.0%)

This means you can pipe the output cleanly:

laconic compress README.md > compressed.md
laconic compress README.md | pbcopy        # macOS clipboard
laconic compress README.md | xclip         # Linux clipboard

Compress from Stdin

Use - to read from stdin:

cat README.md | laconic compress -
curl -s https://raw.githubusercontent.com/.../README.md | laconic compress -

Estimate Savings (Without Compressing)

Want to know how much you’d save without producing output?

laconic estimate docs/*.md
docs/api.md: 3044 → 2446 tokens (saved 598, 19.6%)
docs/guide.md: 1572 → 1572 tokens (saved 0, 0.0%)
TOTAL: 4616 → 4018 tokens (saved 598, 12.9%)

Fast Mode

If you only need the compressed text and don’t care about token statistics, use --fast (or -f):

laconic compress -f README.md

This skips the BPE tokenizer entirely, making compression near-instant even on large batches.

JSON Output

Add --json (or -j) for machine-readable output:

laconic compress -j README.md
{
  "file": "README.md",
  "original_tokens": 1648,
  "compressed_tokens": 1418,
  "tokens_saved": 230,
  "savings_pct": 13.96,
  "text": "# FastAPI Authentication Middleware\n..."
}

Batch Processing

Compress every markdown file in a directory:

for f in docs/*.md; do
  laconic compress -f "$f" > "compressed/$(basename "$f")"
done

Or estimate savings across an entire corpus:

laconic estimate docs/**/*.md

What to Expect

| Document type | Typical savings |
|---|---|
| HTML-heavy component docs | 40–55% |
| Awesome-lists (links, badges) | 20–30% |
| Table-heavy documentation | 15–25% |
| READMEs (badges, tables, code) | 10–15% |
| Pure prose | 0% |

Savings depend on how much decorative structure the document contains. Pure prose gets 0% savings — and that’s correct. Laconic never modifies semantic content.

CLI Reference

laconic compress

Compress one or more markdown files and output the result.

laconic compress [OPTIONS] <FILES>...

Arguments

| Argument | Description |
|---|---|
| `<FILES>...` | One or more file paths. Use `-` for stdin. |

Options

| Flag | Short | Description |
|---|---|---|
| `--json` | `-j` | Output as JSON (includes token counts and savings) |
| `--fast` | `-f` | Skip token counting (faster, no stats in text mode) |
| `--no-tables` | | Disable markdown table compaction |
| `--no-html` | | Disable HTML table conversion and HTML cleanup |
| `--no-badges` | | Disable badge/shield image stripping |
| `--url-dedup` | | Enable URL deduplication (off by default) |

Output Behavior

  • Compressed text goes to stdout
  • Statistics go to stderr
  • Exit code 0 on success, 1 on error

This means you can pipe cleanly:

laconic compress input.md > output.md       # redirect text
laconic compress input.md 2>/dev/null       # suppress stats
laconic compress input.md 2>stats.txt       # capture stats separately

Examples

# Basic compression
laconic compress README.md

# Fast mode, no token counting
laconic compress -f README.md

# JSON output for scripting
laconic compress -j README.md | jq '.tokens_saved'

# Stdin
cat README.md | laconic compress -

# Preserve tables, skip HTML cleanup
laconic compress --no-tables --no-html README.md

# Multiple files
laconic compress docs/*.md

laconic estimate

Estimate token savings without producing compressed output.

laconic estimate [OPTIONS] <FILES>...

Arguments

| Argument | Description |
|---|---|
| `<FILES>...` | One or more file paths. Use `-` for stdin. |

Options

| Flag | Short | Description |
|---|---|---|
| `--json` | `-j` | Output as JSON |

Output

Per-file stats go to stdout. When processing multiple files, a TOTAL summary goes to stderr.

laconic estimate docs/*.md
docs/api.md: 3044 → 2446 tokens (saved 598, 19.6%)
docs/guide.md: 1572 → 1572 tokens (saved 0, 0.0%)
TOTAL: 4616 → 4018 tokens (saved 598, 12.9%)

Environment Variables

| Variable | Default | Description |
|---|---|---|
| `LACONIC_TELEMETRY` | `1` | Set to `0` to disable anonymous usage telemetry |

POSIX Compliance

Laconic follows POSIX utility conventions:

  • - reads from stdin
  • -- ends option parsing
  • Short flags: -j, -f
  • stdout for data, stderr for diagnostics
  • Exit 0 on success, >0 on failure
  • SIGPIPE handled correctly (piping to head/tail works)
  • Output always ends with a newline

Compression Strategies

Laconic applies eight independent strategies. Each targets a specific type of markdown structure that costs tokens but carries no semantic value.

Whitespace Normalization

Always on. Cannot be disabled.

  • Strips trailing spaces from every line
  • Collapses three or more consecutive blank lines down to two

This is the lowest-impact strategy but applies universally.
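
These two rules are simple enough to sketch. A rough Python approximation (not the shipped Rust code) also shows why the strategy is idempotent — applying it twice changes nothing:

```python
import re

def normalize_whitespace(text: str) -> str:
    """Strip trailing spaces; collapse 3+ blank lines to two."""
    # Strip trailing spaces/tabs from every line
    text = re.sub(r"[ \t]+$", "", text, flags=re.MULTILINE)
    # Collapse three or more consecutive blank lines down to two
    text = re.sub(r"\n{4,}", "\n\n\n", text)
    return text
```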

Table Compaction

On by default. Disable with --no-tables.

Converts markdown pipe tables to compact CSV:

Before:

| Name   | Role       | Status  |
|--------|------------|---------|
| Alice  | Engineer   | Active  |
| Bob    | Designer   | On Leave|

After:

Name,Role,Status
Alice,Engineer,Active
Bob,Designer,On Leave

The separator row is removed entirely. Padding spaces inside cells are trimmed.

Tradeoff: Column alignment is lost. LLMs parse CSV natively, but the output is less human-readable. Use --no-tables when human readability of the compressed output matters.
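
As an illustration of the strategy (not the shipped Rust implementation), the conversion can be sketched in Python; this naive version ignores escaped pipes and cells that would need CSV quoting:

```python
def table_to_csv(table: str) -> str:
    """Convert a simple markdown pipe table to compact CSV."""
    rows = []
    for line in table.strip().splitlines():
        # Split on pipes and trim the padding inside each cell
        cells = [c.strip() for c in line.strip().strip("|").split("|")]
        # Drop the separator row (cells made only of '-' and ':')
        if all(c and set(c) <= set(":-") for c in cells):
            continue
        rows.append(",".join(cells))
    return "\n".join(rows)
```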

HTML Table Conversion

On by default. Disable with --no-html.

Converts <table> HTML to the same compact CSV format:

Before:

<table>
  <tr><th>Name</th><th>Value</th></tr>
  <tr><td>timeout</td><td>30s</td></tr>
</table>

After:

Name,Value
timeout,30s

HTML Cleanup

On by default. Disable with --no-html.

Removes decorative HTML that carries no semantic weight:

  • Strips style="..." attributes
  • Strips align="..." attributes
  • Unwraps <div> and </div> tags (keeps inner content)

Does not touch <code>, <pre>, <a>, or any semantic HTML.
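
A rough Python sketch of the cleanup rules (unlike Laconic, this naive regex version does not protect `<code>` or `<pre>` contents):

```python
import re

def clean_html(text: str) -> str:
    """Strip style=/align= attributes and unwrap <div> tags."""
    # Remove style="..." and align="..." attributes
    text = re.sub(r'\s+(?:style|align)="[^"]*"', "", text)
    # Unwrap <div ...> and </div>, keeping inner content
    text = re.sub(r"</?div[^>]*>", "", text)
    return text
```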

Badge Stripping

On by default. Disable with --no-badges.

Removes shield.io / badge images that are purely visual:

Before:

[![Build Status](https://img.shields.io/github/actions/workflow/status/user/repo/ci.yml)](...)
[![License](https://img.shields.io/badge/license-MIT-blue.svg)](...)

After:

(empty — both lines removed)

Badges are meaningful to humans scanning a GitHub page but carry zero information for an LLM processing the document’s content.

Heading Normalization

Always on.

Strips trailing # characters from ATX headings:

Before:

## Configuration ##
### Options ###

After:

## Configuration
### Options

Minimal savings, but consistent normalization.
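
The rule is a one-line transformation; a rough Python sketch (not the shipped code):

```python
import re

def normalize_heading(line: str) -> str:
    """Strip trailing # characters from an ATX heading."""
    if line.lstrip().startswith("#"):
        return re.sub(r"\s+#+\s*$", "", line)
    return line
```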

Code Fence Compaction

Always on.

Removes common leading indentation from code blocks without changing the code’s meaning:

Before:

```python
    def hello():
        print("world")
```

After:

```python
def hello():
    print("world")
```

The relative indentation is preserved. Only the common prefix is removed.
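
The dedent rule can be sketched in Python (an approximation of the shipped Rust implementation; space indentation only):

```python
def dedent_block(code: str) -> str:
    """Remove the common leading indentation, preserving relative indent."""
    lines = code.splitlines()
    # Measure leading spaces on non-blank lines only
    indents = [len(l) - len(l.lstrip(" ")) for l in lines if l.strip()]
    common = min(indents, default=0)
    return "\n".join(l[common:] if l.strip() else l for l in lines)
```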

URL Deduplication

Off by default. Enable with --url-dedup.

Converts repeated or long inline URLs to reference-style links:

Before:

See [the docs](https://example.com/very/long/path/to/documentation).
Also check [the API](https://example.com/very/long/path/to/documentation).

After:

See [the docs][1].
Also check [the API][1].

[1]: https://example.com/very/long/path/to/documentation

This is off by default because it changes the link style, which some workflows may not want. Enable it when you have documents with many repeated URLs.
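
As a rough sketch of the idea (simplified: it rewrites every inline link, whereas Laconic targets repeated or long URLs):

```python
import re

def dedup_urls(text: str) -> str:
    """Convert inline links to reference-style links with a shared footer."""
    refs = {}  # url -> reference number

    def replace(m):
        label, url = m.group(1), m.group(2)
        n = refs.setdefault(url, len(refs) + 1)
        return f"[{label}][{n}]"

    body = re.sub(r"\[([^\]]+)\]\(([^)\s]+)\)", replace, text)
    footer = "\n".join(f"[{n}]: {url}" for url, n in refs.items())
    return body + "\n\n" + footer if refs else body
```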

Strategy Selection Guide

| Scenario | Recommended flags |
|---|---|
| Maximum compression | (defaults — all on) + `--url-dedup` |
| Preserve table formatting | `--no-tables` |
| Keep HTML structure intact | `--no-html` |
| Conservative (whitespace + headings only) | `--no-tables --no-html --no-badges` |
| Speed over stats | `-f` (fast mode) |

Library Usage

Add laconic-core to your Rust project:

cargo add laconic-core

Basic Compression

use laconic_core::{compress, CompressConfig};

let input = std::fs::read_to_string("README.md").unwrap();
let config = CompressConfig::default();
let result = compress(&input, &config);

println!("Saved {} tokens ({:.1}%)", result.tokens_saved, result.savings_pct);
println!("{}", result.text);

Fast Path (No Token Counting)

When you only need the compressed text and don’t need statistics:

use laconic_core::{compress_text, CompressConfig};

let input = std::fs::read_to_string("README.md").unwrap();
let config = CompressConfig::default();
let compressed = compress_text(&input, &config);

// `compressed` is a String — no token counting overhead

This is significantly faster for batch processing where you don’t need per-file token stats.

Streaming

Process large files without loading everything into a string first:

use laconic_core::{compress_reader, CompressConfig};
use std::io;

let config = CompressConfig::default();
compress_reader(io::stdin(), io::stdout(), &config).unwrap();

Custom Configuration

Toggle individual strategies on or off:

use laconic_core::CompressConfig;

let config = CompressConfig {
    tables: false,          // preserve markdown tables
    html_tables: true,      // convert HTML tables to CSV
    html_cleanup: true,     // strip decorative HTML
    badges: true,           // remove badge images
    url_dedup: true,        // deduplicate URLs (off by default)
    skip_token_count: true, // skip BPE tokenizer (fast mode)
    ..CompressConfig::default()
};

The CompressResult Struct

pub struct CompressResult {
    pub text: String,           // compressed markdown
    pub original_tokens: usize, // token count before (0 if skip_token_count)
    pub compressed_tokens: usize,
    pub tokens_saved: usize,
    pub savings_pct: f64,       // 0.0–100.0
}

Guarantees

These hold for all inputs:

  • Idempotent: compress(compress(x)) == compress(x)
  • Never inflates: result.compressed_tokens <= result.original_tokens
  • No panics: Tested across hundreds of real-world markdown files
  • Deterministic: Same input + config always produces the same output

MCP Server

Laconic ships as a Model Context Protocol (MCP) server that any MCP-compatible agent can call directly. The agent decides when compression is worth it; Laconic provides the tools.

Setup

Build the MCP server binary:

cargo build --release --bin laconic-mcp
cp target/release/laconic-mcp /usr/local/bin/

Agent Configuration

Add this to your MCP client config (Windsurf, Cursor, Claude Desktop, or any MCP-compatible agent):

{
  "mcpServers": {
    "laconic": {
      "command": "laconic-mcp",
      "args": []
    }
  }
}

The server communicates over stdio — no ports, no HTTP, no configuration.

Available Tools

compress_markdown

Compresses a markdown string and returns the result with token statistics.

Input:

{
  "markdown": "# Title\n\n[![Badge](https://img.shields.io/...)]\n\n| Col | Col |\n|---|---|\n| A | B |"
}

Output:

{
  "text": "# Title\n\nCol,Col\nA,B",
  "original_tokens": 45,
  "compressed_tokens": 12,
  "tokens_saved": 33,
  "savings_pct": 73.3
}

estimate_savings

Returns token statistics and a recommendation without the compressed text. Useful for agents that want to decide whether compression is worthwhile before committing.

Input:

{
  "markdown": "Some markdown content..."
}

Output:

{
  "original_tokens": 500,
  "compressed_tokens": 420,
  "tokens_saved": 80,
  "savings_pct": 16.0,
  "recommendation": "Compress — 16% savings available."
}

Typical Agent Workflow

  1. Agent retrieves a document for context injection
  2. Agent calls estimate_savings to check if compression is worthwhile
  3. If savings exceed a threshold (e.g., 5%), agent calls compress_markdown
  4. Agent uses the compressed text in its prompt

This keeps the agent in control of the cost/benefit tradeoff.
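
The threshold check in step 3 is a plain comparison against the `estimate_savings` output; a minimal Python sketch (the `should_compress` helper is hypothetical, not part of Laconic):

```python
def should_compress(stats: dict, threshold_pct: float = 5.0) -> bool:
    """Decide whether compression is worth a second tool call."""
    return stats["savings_pct"] >= threshold_pct

# Using the estimate_savings output shape shown above
estimate = {"original_tokens": 500, "compressed_tokens": 420,
            "tokens_saved": 80, "savings_pct": 16.0}
```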

Testing the Server

You can test the MCP server manually by piping JSON-RPC messages:

echo '{"jsonrpc":"2.0","id":1,"method":"tools/list","params":{}}' | laconic-mcp

This returns the list of available tools and their schemas.
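
From a script, a tool invocation can be built the same way; the `tools/call` shape below follows the MCP specification (the request id and arguments are illustrative):

```python
import json

# A JSON-RPC request invoking the compress_markdown tool
request = {
    "jsonrpc": "2.0",
    "id": 2,
    "method": "tools/call",
    "params": {
        "name": "compress_markdown",
        "arguments": {"markdown": "# Title\n\nSome content."},
    },
}
# One JSON object per line, piped to laconic-mcp's stdin
line = json.dumps(request)
```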

RAG Pipelines

The primary use case for Laconic: compress retrieved documents before injecting them into an LLM’s context window.

The Problem

RAG pipelines retrieve markdown documents and stuff them into prompts. But markdown carries a lot of decorative weight:

  • Badge images that mean nothing to an LLM
  • Padded table formatting that wastes tokens
  • HTML wrappers from CMS exports
  • Redundant whitespace

Every wasted token is money spent and context window consumed.

Basic Integration

# Compress all retrieved docs before feeding to the LLM
for doc in retrieved_docs/*.md; do
  laconic compress -f "$doc" > "context/$(basename "$doc")"
done

Python Integration

Call the CLI from Python:

import subprocess

def compress_markdown(text: str) -> str:
    result = subprocess.run(
        ["laconic", "compress", "-f", "-"],
        input=text,
        capture_output=True,
        text=True,
    )
    return result.stdout

# In your RAG pipeline
retrieved_doc = vector_store.query("How do I configure auth?")
compressed = compress_markdown(retrieved_doc.content)
prompt = f"Given this context:\n\n{compressed}\n\nAnswer the question..."

With Token Stats

If you want to track savings:

import subprocess
import json

def compress_with_stats(text: str) -> dict:
    result = subprocess.run(
        ["laconic", "compress", "-j", "-"],
        input=text,
        capture_output=True,
        text=True,
    )
    return json.loads(result.stdout)

stats = compress_with_stats(doc.content)
print(f"Saved {stats['tokens_saved']} tokens ({stats['savings_pct']}%)")
compressed_text = stats["text"]

Rust Integration

If your pipeline is in Rust:

use laconic_core::{compress_text, CompressConfig};

fn prepare_context(docs: &[String]) -> String {
    let config = CompressConfig::default();
    docs.iter()
        .map(|doc| compress_text(doc, &config))
        .collect::<Vec<_>>()
        .join("\n---\n")
}

Decision: When to Compress

Not every document benefits from compression. Use estimate to decide:

laconic estimate docs/*.md

If a document shows 0% savings (pure prose), skip it. Focus compression on structure-heavy documents where savings are 10%+.

The MCP server’s estimate_savings tool lets agents make this decision autonomously.
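
A script can apply the same threshold by parsing the per-file lines that `laconic estimate` prints; a sketch assuming the line format stays as documented (`worth_compressing` is a hypothetical helper):

```python
import re

# Matches lines like: docs/api.md: 3044 → 2446 tokens (saved 598, 19.6%)
LINE = re.compile(
    r"^(?P<file>.+?): (?P<orig>\d+) → (?P<comp>\d+) tokens "
    r"\(saved (?P<saved>-?\d+), (?P<pct>[\d.]+)%\)$"
)

def worth_compressing(report: str, min_pct: float = 10.0) -> list:
    """Return files from estimate output above a savings threshold."""
    keep = []
    for line in report.splitlines():
        m = LINE.match(line)
        if m and float(m.group("pct")) >= min_pct:
            keep.append(m.group("file"))
    return keep
```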

CI/CD Integration

Compress documentation at build time so every downstream consumer benefits automatically.

GitHub Actions

Add a step to your workflow that compresses docs before they enter a vector store or knowledge base:

- name: Compress docs for vector store
  run: |
    cargo install --path crates/laconic-cli
    mkdir -p compressed_docs
    for f in docs/*.md; do
      laconic compress -f "$f" > "compressed_docs/$(basename "$f")"
    done

- name: Upload to vector store
  run: ./scripts/upload_to_pinecone.sh compressed_docs/

Pre-commit Hook

Compress docs automatically on every commit:

#!/bin/sh
# .git/hooks/pre-commit

for f in $(git diff --cached --name-only --diff-filter=ACM -- '*.md'); do
  laconic compress -f "$f" > "${f%.md}.compressed.md"
  git add "${f%.md}.compressed.md"
done

Audit Token Spend

Add a CI step that reports token savings across your doc corpus:

- name: Token audit
  run: |
    laconic estimate docs/**/*.md 2>&1 | tee token-audit.txt
    # Fail if any file shows negative savings (should never happen)
    if grep -q "saved -" token-audit.txt; then
      echo "ERROR: Token inflation detected"
      exit 1
    fi

Docker

Laconic is a single static binary. No runtime dependencies:

FROM rust:1.75 AS builder
WORKDIR /build
COPY . .
RUN cargo build --release --bin laconic

FROM debian:bookworm-slim
COPY --from=builder /build/target/release/laconic /usr/local/bin/
ENTRYPOINT ["laconic"]
Build and run:

docker build -t laconic .
docker run --rm -i laconic compress - < README.md

Token Budgeting

When you have a fixed context window (e.g., 128K tokens), every token matters. Laconic helps you fit more documents into the same budget without lossy truncation.

The Math

Say you’re building a prompt with retrieved context:

  • System prompt: 2,000 tokens
  • User query: 500 tokens
  • Available for context: 125,500 tokens
  • Retrieved docs: 150,000 tokens — doesn’t fit

Option A: Truncate. Lose information.

Option B: Compress with Laconic. If your docs are structure-heavy (tables, HTML, badges), you recover 15–50% of that space.

Budget-Aware Pipeline

#!/bin/bash
BUDGET=125000
USED=0

for doc in retrieved_docs/*.md; do
  # Get token count of compressed version
  stats=$(laconic compress -j "$doc" 2>/dev/null)
  tokens=$(echo "$stats" | jq '.compressed_tokens')
  
  NEXT=$((USED + tokens))
  if [ "$NEXT" -gt "$BUDGET" ]; then
    echo "Budget full at $USED tokens. Skipping remaining docs." >&2
    break
  fi
  
  # Output compressed text
  echo "$stats" | jq -r '.text'
  echo "---"
  USED=$NEXT
done

Python Example

import subprocess
import json

def compress_and_budget(docs: list[str], budget: int) -> str:
    context_parts = []
    used = 0

    for doc in docs:
        result = subprocess.run(
            ["laconic", "compress", "-j", "-"],
            input=doc, capture_output=True, text=True,
        )
        data = json.loads(result.stdout)
        tokens = data["compressed_tokens"]

        if used + tokens > budget:
            break

        context_parts.append(data["text"])
        used += tokens

    return "\n---\n".join(context_parts)

Fast Mode for Large Batches

If you’re processing hundreds of docs and just need the compressed text (not token counts), use fast mode to skip the BPE tokenizer entirely:

# Compress 500 docs in under a second
for doc in corpus/*.md; do
  laconic compress -f "$doc" > "compressed/$(basename "$doc")"
done

You can then count tokens separately on just the winners, or use your LLM provider’s tokenizer.

Stacking with Other Optimizations

Laconic compresses the structure. You can stack it with other techniques:

| Technique | What it removes | Typical savings |
|---|---|---|
| Laconic | Decorative markdown structure | 15–50% on structured docs |
| Prompt caching | Repeated prefix tokens | Up to 90% cost reduction |
| Batch API | Nothing — just cheaper pricing | 50% cost reduction |

These are multiplicative. Laconic + prompt caching + batch API can reduce effective cost by 95%+ on structure-heavy workloads.
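
The 95%+ figure follows from multiplying the remaining cost fractions; an idealized sketch (assumes the discounts all apply to the same token volume):

```python
# Fraction of baseline cost remaining after each technique
laconic = 1 - 0.30   # e.g. 30% token savings on a structured doc
caching = 1 - 0.90   # prompt caching: up to 90% off cached tokens
batch   = 1 - 0.50   # batch API: 50% cheaper pricing

remaining = laconic * caching * batch
print(f"Effective cost: {remaining:.1%} of baseline")  # → 3.5%
```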