Laconic
Maximum meaning, minimum tokens.
Laconic compresses markdown documents for LLM workflows by applying lossless syntactic transformations. It strips decorative noise — badge images, padded tables, HTML wrappers, redundant whitespace — that costs tokens but carries no semantic value.
Named after the Spartans of Laconia, famous for expressing maximum meaning with minimum words. When Philip II of Macedon threatened “If I invade Laconia, I will raze Sparta,” the Spartans replied: “If.”
What It Does
Laconic removes structure that LLMs don’t need to understand your document:
| Before | After | Why |
|---|---|---|
| `[![badge](...)](...)` | (removed) | Badges are visual, not semantic |
| `\| Col1 \| Col2 \|` with separator rows | `Col1,Col2` | CSV is more token-efficient |
| `<div style="padding: 20px">content</div>` | `content` | Decorative HTML wrappers |
| Three blank lines | One blank line | Whitespace normalization |
| Repeated inline URLs | Reference-style links | URL deduplication |
What It Never Touches
- Prose text
- Code blocks (contents preserved exactly)
- Headings (structure preserved)
- Lists
- Anything that carries meaning
Three Ways to Use It
- CLI — `laconic compress README.md`
- Rust library — `laconic_core::compress(&text, &config)`
- MCP server — agents call `compress_markdown` as a tool
Pick the one that fits your workflow. The next pages walk through each.
Installation
From Source (Recommended)
Requires Rust 1.75+.
```shell
git clone https://github.com/copyleftdev/laconic.git
cd laconic
cargo build --release
```
The binaries land in `target/release/`:
| Binary | Size | Purpose |
|---|---|---|
| `laconic` | ~5.6 MB | CLI tool |
| `laconic-mcp` | ~8.2 MB | MCP server for agents |
Add to PATH
```shell
# Copy to a directory in your PATH
cp target/release/laconic /usr/local/bin/
cp target/release/laconic-mcp /usr/local/bin/

# Verify
laconic --version
```
Verify Installation
```shell
# Compress a file and see the stats
echo "# Hello World" | laconic compress -

# Should output the compressed text to stdout
# and stats to stderr
```
Shell Completions
Laconic uses clap, so you can generate shell completions:
```shell
# Bash
laconic compress --help

# The CLI supports standard POSIX conventions:
#   -j  for --json
#   -f  for --fast
#   -   for stdin
#   --  to end option parsing
```
Getting Started
This page walks through the three things you’ll do most: compress a file, estimate savings on a batch, and use fast mode.
Compress a Single File
```shell
laconic compress README.md
```

Compressed text goes to stdout. Stats go to stderr:

```
# README.md: 1648 → 1418 tokens (saved 230, 14.0%)
```

This means you can pipe the output cleanly:

```shell
laconic compress README.md > compressed.md
laconic compress README.md | pbcopy   # macOS clipboard
laconic compress README.md | xclip    # Linux clipboard
```
Compress from Stdin
Use `-` to read from stdin:

```shell
cat README.md | laconic compress -
curl -s https://raw.githubusercontent.com/.../README.md | laconic compress -
```
Estimate Savings (Without Compressing)
Want to know how much you’d save without producing output?
```shell
laconic estimate docs/*.md
```

```
docs/api.md: 3044 → 2446 tokens (saved 598, 19.6%)
docs/guide.md: 1572 → 1572 tokens (saved 0, 0.0%)
TOTAL: 4616 → 4018 tokens (saved 598, 12.9%)
```
Fast Mode
If you only need the compressed text and don’t care about token statistics, use `--fast` (or `-f`):

```shell
laconic compress -f README.md
```
This skips the BPE tokenizer entirely, making compression near-instant even on large batches.
JSON Output
Add `--json` (or `-j`) for machine-readable output:

```shell
laconic compress -j README.md
```

```json
{
  "file": "README.md",
  "original_tokens": 1648,
  "compressed_tokens": 1418,
  "tokens_saved": 230,
  "savings_pct": 13.96,
  "text": "# FastAPI Authentication Middleware\n..."
}
```
Batch Processing
Compress every markdown file in a directory:
```shell
for f in docs/*.md; do
  laconic compress -f "$f" > "compressed/$(basename "$f")"
done
```
Or estimate savings across an entire corpus:
```shell
laconic estimate docs/**/*.md
```
What to Expect
| Document type | Typical savings |
|---|---|
| HTML-heavy component docs | 40–55% |
| Awesome-lists (links, badges) | 20–30% |
| Table-heavy documentation | 15–25% |
| READMEs (badges, tables, code) | 10–15% |
| Pure prose | 0% |
| Pure prose | 0% |
Savings depend on how much decorative structure the document contains. Pure prose gets 0% savings — and that’s correct. Laconic never modifies semantic content.
CLI Reference
laconic compress
Compress one or more markdown files and output the result.
```shell
laconic compress [OPTIONS] <FILES>...
```
Arguments
| Argument | Description |
|---|---|
| `<FILES>...` | One or more file paths. Use `-` for stdin. |
Options
| Flag | Short | Description |
|---|---|---|
| `--json` | `-j` | Output as JSON (includes token counts and savings) |
| `--fast` | `-f` | Skip token counting (faster, no stats in text mode) |
| `--no-tables` | | Disable markdown table compaction |
| `--no-html` | | Disable HTML table conversion and HTML cleanup |
| `--no-badges` | | Disable badge/shield image stripping |
| `--url-dedup` | | Enable URL deduplication (off by default) |
Output Behavior
- Compressed text goes to stdout
- Statistics go to stderr
- Exit code `0` on success, `1` on error
This means you can pipe cleanly:
```shell
laconic compress input.md > output.md   # redirect text
laconic compress input.md 2>/dev/null   # suppress stats
laconic compress input.md 2>stats.txt   # capture stats separately
```
Examples
```shell
# Basic compression
laconic compress README.md

# Fast mode, no token counting
laconic compress -f README.md

# JSON output for scripting
laconic compress -j README.md | jq '.tokens_saved'

# Stdin
cat README.md | laconic compress -

# Preserve tables, skip HTML cleanup
laconic compress --no-tables --no-html README.md

# Multiple files
laconic compress docs/*.md
```
laconic estimate
Estimate token savings without producing compressed output.
```shell
laconic estimate [OPTIONS] <FILES>...
```
Arguments
| Argument | Description |
|---|---|
| `<FILES>...` | One or more file paths. Use `-` for stdin. |
Options
| Flag | Short | Description |
|---|---|---|
| `--json` | `-j` | Output as JSON |
Output
Per-file stats go to stdout. When processing multiple files, a TOTAL summary goes to stderr.
```shell
laconic estimate docs/*.md
```

```
docs/api.md: 3044 → 2446 tokens (saved 598, 19.6%)
docs/guide.md: 1572 → 1572 tokens (saved 0, 0.0%)
TOTAL: 4616 → 4018 tokens (saved 598, 12.9%)
```
Environment Variables
| Variable | Default | Description |
|---|---|---|
| `LACONIC_TELEMETRY` | `1` | Set to `0` to disable anonymous usage telemetry |
POSIX Compliance
Laconic follows POSIX utility conventions:
- `-` reads from stdin
- `--` ends option parsing
- Short flags: `-j`, `-f`
- stdout for data, stderr for diagnostics
- Exit 0 on success, >0 on failure
- SIGPIPE handled correctly (piping to `head`/`tail` works)
- Output always ends with a newline
Compression Strategies
Laconic applies eight independent strategies. Each targets a specific type of markdown structure that costs tokens but carries no semantic value.
Whitespace Normalization
Always on. Cannot be disabled.
- Strips trailing spaces from every line
- Collapses three or more consecutive blank lines down to two
This is the lowest-impact strategy but applies universally.
Table Compaction
On by default. Disable with --no-tables.
Converts markdown pipe tables to compact CSV:
Before:

```
| Name   | Role       | Status  |
|--------|------------|---------|
| Alice  | Engineer   | Active  |
| Bob    | Designer   | On Leave|
```

After:

```
Name,Role,Status
Alice,Engineer,Active
Bob,Designer,On Leave
```
The separator row is removed entirely. Padding spaces inside cells are trimmed.
Tradeoff: Column alignment is lost. LLMs parse CSV natively, but the output is less human-readable. Use --no-tables when human readability of the compressed output matters.
HTML Table Conversion
On by default. Disable with --no-html.
Converts `<table>` HTML to the same compact CSV format:
Before:

```html
<table>
  <tr><th>Name</th><th>Value</th></tr>
  <tr><td>timeout</td><td>30s</td></tr>
</table>
```

After:

```
Name,Value
timeout,30s
```
HTML Cleanup
On by default. Disable with --no-html.
Removes decorative HTML that carries no semantic weight:
- Strips `style="..."` attributes
- Strips `align="..."` attributes
- Unwraps `<div>` and `</div>` tags (keeps inner content)
Does not touch `<code>`, `<pre>`, `<a>`, or any semantic HTML.
Badge Stripping
On by default. Disable with --no-badges.
Removes shield.io / badge images that are purely visual:
Before:

```
[![badge](...)](...)
[![badge](...)](...)
```

After: (empty — both lines removed)
Badges are meaningful to humans scanning a GitHub page but carry zero information for an LLM processing the document’s content.
Heading Normalization
Always on.
Strips trailing `#` characters from ATX headings:
Before:

```
## Configuration ##
### Options ###
```

After:

```
## Configuration
### Options
```
Minimal savings, but consistent normalization.
Code Fence Compaction
Always on.
Removes common leading indentation from code blocks without changing the code’s meaning:
Before:

```python
    def hello():
        print("world")
```

After:

```python
def hello():
    print("world")
```
The relative indentation is preserved. Only the common prefix is removed.
URL Deduplication
Off by default. Enable with --url-dedup.
Converts repeated or long inline URLs to reference-style links:
Before:

```
See [the docs](https://example.com/very/long/path/to/documentation).
Also check [the API](https://example.com/very/long/path/to/documentation).
```

After:

```
See [the docs][1].
Also check [the API][1].

[1]: https://example.com/very/long/path/to/documentation
```
This is off by default because it changes the link style, which some workflows may not want. Enable it when you have documents with many repeated URLs.
Strategy Selection Guide
| Scenario | Recommended flags |
|---|---|
| Maximum compression | (defaults — all on) + `--url-dedup` |
| Preserve table formatting | `--no-tables` |
| Keep HTML structure intact | `--no-html` |
| Conservative (whitespace + headings only) | `--no-tables --no-html --no-badges` |
| Speed over stats | `-f` (fast mode) |
Library Usage
Add laconic-core to your Rust project:
```shell
cargo add laconic-core
```
Basic Compression
```rust
use laconic_core::{compress, CompressConfig};

fn main() {
    let input = std::fs::read_to_string("README.md").unwrap();
    let config = CompressConfig::default();
    let result = compress(&input, &config);

    println!("Saved {} tokens ({:.1}%)", result.tokens_saved, result.savings_pct);
    println!("{}", result.text);
}
```
Fast Path (No Token Counting)
When you only need the compressed text and don’t need statistics:
```rust
use laconic_core::{compress_text, CompressConfig};

fn main() {
    let input = std::fs::read_to_string("README.md").unwrap();
    let config = CompressConfig::default();
    let compressed = compress_text(&input, &config);
    // `compressed` is a String — no token counting overhead
}
```
This is significantly faster for batch processing where you don’t need per-file token stats.
Streaming
Process large files without loading everything into a string first:
```rust
use laconic_core::{compress_reader, CompressConfig};
use std::io;

fn main() {
    let config = CompressConfig::default();
    compress_reader(io::stdin(), io::stdout(), &config).unwrap();
}
```
Custom Configuration
Toggle individual strategies on or off:
```rust
use laconic_core::CompressConfig;

let config = CompressConfig {
    tables: false,          // preserve markdown tables
    html_tables: true,      // convert HTML tables to CSV
    html_cleanup: true,     // strip decorative HTML
    badges: true,           // remove badge images
    url_dedup: true,        // deduplicate URLs (off by default)
    skip_token_count: true, // skip BPE tokenizer (fast mode)
    ..CompressConfig::default()
};
```
The CompressResult Struct
```rust
pub struct CompressResult {
    pub text: String,              // compressed markdown
    pub original_tokens: usize,    // token count before (0 if skip_token_count)
    pub compressed_tokens: usize,  // token count after
    pub tokens_saved: usize,
    pub savings_pct: f64,          // 0.0–100.0
}
```
Guarantees
These hold for all inputs:
- Idempotent: `compress(compress(x)) == compress(x)`
- Never inflates: `result.compressed_tokens <= result.original_tokens`
- No panics: Tested across hundreds of real-world markdown files
- Deterministic: Same input + config always produces the same output
MCP Server
Laconic ships as a Model Context Protocol (MCP) server that any MCP-compatible agent can call directly. The agent decides when compression is worth it; Laconic provides the tools.
Setup
Build the MCP server binary:
```shell
cargo build --release --bin laconic-mcp
cp target/release/laconic-mcp /usr/local/bin/
```
Agent Configuration
Add this to your MCP client config (Windsurf, Cursor, Claude Desktop, or any MCP-compatible agent):
```json
{
  "mcpServers": {
    "laconic": {
      "command": "laconic-mcp",
      "args": []
    }
  }
}
```
The server communicates over stdio — no ports, no HTTP, no configuration.
Available Tools
compress_markdown
Compresses a markdown string and returns the result with token statistics.
Input:
```json
{
  "markdown": "# Title\n\n[![badge](...)](...)\n\n| Col | Col |\n|---|---|\n| A | B |"
}
```
Output:
```json
{
  "text": "# Title\n\nCol,Col\nA,B",
  "original_tokens": 45,
  "compressed_tokens": 12,
  "tokens_saved": 33,
  "savings_pct": 73.3
}
```
estimate_savings
Returns token statistics and a recommendation without the compressed text. Useful for agents that want to decide whether compression is worthwhile before committing.
Input:
```json
{
  "markdown": "Some markdown content..."
}
```
Output:
```json
{
  "original_tokens": 500,
  "compressed_tokens": 420,
  "tokens_saved": 80,
  "savings_pct": 16.0,
  "recommendation": "Compress — 16% savings available."
}
```
Typical Agent Workflow
- Agent retrieves a document for context injection
- Agent calls
estimate_savingsto check if compression is worthwhile - If savings exceed a threshold (e.g., 5%), agent calls
compress_markdown - Agent uses the compressed text in its prompt
This keeps the agent in control of the cost/benefit tradeoff.
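The decision loop above can be sketched in a few lines. `call_tool(name, arguments)` below is a hypothetical stand-in for your MCP client's tool-call method, not part of Laconic:

```python
def inject_context(call_tool, markdown: str, threshold_pct: float = 5.0) -> str:
    """Compress a document for prompt injection only when it pays off.

    `call_tool` is a hypothetical MCP client wrapper that invokes a
    Laconic tool and returns its JSON result as a dict.
    """
    # Cheap check first: is compression worth the extra call?
    est = call_tool("estimate_savings", {"markdown": markdown})
    if est["savings_pct"] < threshold_pct:
        return markdown  # below threshold: use the original text
    # Worth it: fetch the compressed form
    return call_tool("compress_markdown", {"markdown": markdown})["text"]
```

The threshold is the agent's policy knob; 5% matches the example above, but a cost-sensitive agent might set it lower.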
Testing the Server
You can test the MCP server manually by piping JSON-RPC messages:
```shell
echo '{"jsonrpc":"2.0","id":1,"method":"tools/list","params":{}}' | laconic-mcp
```
This returns the list of available tools and their schemas.
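Assuming the server implements the standard MCP `tools/call` method (the JSON-RPC shape below follows the MCP spec; the exact response framing is the server's to define), a manual tool invocation looks like:

```json
{
  "jsonrpc": "2.0",
  "id": 2,
  "method": "tools/call",
  "params": {
    "name": "compress_markdown",
    "arguments": {"markdown": "# Title\n\n| A | B |\n|---|---|\n| 1 | 2 |"}
  }
}
```

Pipe it to `laconic-mcp` the same way as the `tools/list` message.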
RAG Pipelines
The primary use case for Laconic: compress retrieved documents before injecting them into an LLM’s context window.
The Problem
RAG pipelines retrieve markdown documents and stuff them into prompts. But markdown carries a lot of decorative weight:
- Badge images that mean nothing to an LLM
- Padded table formatting that wastes tokens
- HTML wrappers from CMS exports
- Redundant whitespace
Every wasted token is money spent and context window consumed.
Basic Integration
```shell
# Compress all retrieved docs before feeding to the LLM
for doc in retrieved_docs/*.md; do
  laconic compress -f "$doc" > "context/$(basename "$doc")"
done
```
Python Integration
Call the CLI from Python:
```python
import subprocess

def compress_markdown(text: str) -> str:
    result = subprocess.run(
        ["laconic", "compress", "-f", "-"],
        input=text,
        capture_output=True,
        text=True,
    )
    return result.stdout

# In your RAG pipeline
retrieved_doc = vector_store.query("How do I configure auth?")
compressed = compress_markdown(retrieved_doc.content)
prompt = f"Given this context:\n\n{compressed}\n\nAnswer the question..."
```
With Token Stats
If you want to track savings:
```python
import subprocess
import json

def compress_with_stats(text: str) -> dict:
    result = subprocess.run(
        ["laconic", "compress", "-j", "-"],
        input=text,
        capture_output=True,
        text=True,
    )
    return json.loads(result.stdout)

stats = compress_with_stats(doc.content)
print(f"Saved {stats['tokens_saved']} tokens ({stats['savings_pct']}%)")
compressed_text = stats["text"]
```
Rust Integration
If your pipeline is in Rust:
```rust
use laconic_core::{compress_text, CompressConfig};

fn prepare_context(docs: &[String]) -> String {
    let config = CompressConfig::default();
    docs.iter()
        .map(|doc| compress_text(doc, &config))
        .collect::<Vec<_>>()
        .join("\n---\n")
}
```
Decision: When to Compress
Not every document benefits from compression. Use `estimate` to decide:

```shell
laconic estimate docs/*.md
```
If a document shows 0% savings (pure prose), skip it. Focus compression on structure-heavy documents where savings are 10%+.
The MCP server’s `estimate_savings` tool lets agents make this decision autonomously.
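A small filter over per-file stats makes the rule concrete. The field names below (`file`, `savings_pct`) mirror the CLI's JSON output for `compress`; whether `estimate -j` emits the identical schema is an assumption here:

```python
import json

def pick_candidates(stat_lines: list[str], threshold_pct: float = 10.0) -> list[str]:
    """Return files worth compressing, given one JSON stats object per line."""
    picks = []
    for line in stat_lines:
        stats = json.loads(line)
        if stats["savings_pct"] >= threshold_pct:
            picks.append(stats["file"])
    return picks

# Illustrative stats matching the estimate example above
sample = [
    '{"file": "docs/api.md", "savings_pct": 19.6}',
    '{"file": "docs/guide.md", "savings_pct": 0.0}',
]
```

Running `pick_candidates(sample)` keeps only `docs/api.md`; the pure-prose file is skipped.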
CI/CD Integration
Compress documentation at build time so every downstream consumer benefits automatically.
GitHub Actions
Add a step to your workflow that compresses docs before they enter a vector store or knowledge base:
```yaml
- name: Compress docs for vector store
  run: |
    cargo install --path crates/laconic-cli
    mkdir -p compressed_docs
    for f in docs/*.md; do
      laconic compress -f "$f" > "compressed_docs/$(basename "$f")"
    done

- name: Upload to vector store
  run: ./scripts/upload_to_pinecone.sh compressed_docs/
```
Pre-commit Hook
Compress docs automatically on every commit:
```sh
#!/bin/sh
# .git/hooks/pre-commit
for f in $(git diff --cached --name-only --diff-filter=ACM -- '*.md'); do
  laconic compress -f "$f" > "${f%.md}.compressed.md"
  git add "${f%.md}.compressed.md"
done
```
Audit Token Spend
Add a CI step that reports token savings across your doc corpus:
```yaml
- name: Token audit
  run: |
    laconic estimate docs/**/*.md 2>&1 | tee token-audit.txt
    # Fail if any file shows negative savings (should never happen)
    if grep -q "saved -" token-audit.txt; then
      echo "ERROR: Token inflation detected"
      exit 1
    fi
```
Docker
Laconic builds to a single self-contained binary with no runtime dependencies:
```dockerfile
FROM rust:1.75 AS builder
WORKDIR /build
COPY . .
RUN cargo build --release --bin laconic

FROM debian:bookworm-slim
COPY --from=builder /build/target/release/laconic /usr/local/bin/
ENTRYPOINT ["laconic"]
```

```shell
docker build -t laconic .
docker run --rm -i laconic compress - < README.md
```
Token Budgeting
When you have a fixed context window (e.g., 128K tokens), every token matters. Laconic helps you fit more documents into the same budget without lossy truncation.
The Math
Say you’re building a prompt with retrieved context:
- System prompt: 2,000 tokens
- User query: 500 tokens
- Available for context: 125,500 tokens
- Retrieved docs: 150,000 tokens — doesn’t fit
Option A: Truncate. Lose information.
Option B: Compress with Laconic. If your docs are structure-heavy (tables, HTML, badges), you recover 15–50% of that space.
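The arithmetic is simple enough to sanity-check in a few lines (the 20% savings figure is illustrative, not guaranteed):

```python
context_window = 128_000
system_prompt = 2_000
user_query = 500
available = context_window - system_prompt - user_query  # 125,500 tokens

retrieved = 150_000                 # raw retrieved docs: doesn't fit
assumed_savings = 0.20              # plausible for structure-heavy docs
compressed = int(retrieved * (1 - assumed_savings))  # 120,000 tokens

fits = compressed <= available      # compression avoided lossy truncation
```

With 20% savings the corpus drops under budget; at 15% or less it would still overflow and you would fall back to ranking or truncation.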
Budget-Aware Pipeline
```bash
#!/bin/bash
BUDGET=125000
USED=0

for doc in retrieved_docs/*.md; do
  # Get token count of compressed version
  stats=$(laconic compress -j "$doc" 2>/dev/null)
  tokens=$(echo "$stats" | jq '.compressed_tokens')

  NEXT=$((USED + tokens))
  if [ "$NEXT" -gt "$BUDGET" ]; then
    echo "Budget full at $USED tokens. Skipping remaining docs." >&2
    break
  fi

  # Output compressed text
  echo "$stats" | jq -r '.text'
  echo "---"
  USED=$NEXT
done
```
Python Example
```python
import subprocess
import json

def compress_and_budget(docs: list[str], budget: int) -> str:
    context_parts = []
    used = 0
    for doc in docs:
        result = subprocess.run(
            ["laconic", "compress", "-j", "-"],
            input=doc, capture_output=True, text=True,
        )
        data = json.loads(result.stdout)
        tokens = data["compressed_tokens"]
        if used + tokens > budget:
            break
        context_parts.append(data["text"])
        used += tokens
    return "\n---\n".join(context_parts)
```
Fast Mode for Large Batches
If you’re processing hundreds of docs and just need the compressed text (not token counts), use fast mode to skip the BPE tokenizer entirely:
```shell
# Compress 500 docs in under a second
for doc in corpus/*.md; do
  laconic compress -f "$doc" > "compressed/$(basename "$doc")"
done
```
You can then count tokens separately on just the winners, or use your LLM provider’s tokenizer.
Stacking with Other Optimizations
Laconic compresses the structure. You can stack it with other techniques:
| Technique | What it removes | Typical savings |
|---|---|---|
| Laconic | Decorative markdown structure | 15–50% on structured docs |
| Prompt caching | Repeated prefix tokens | Up to 90% cost reduction |
| Batch API | Nothing — just cheaper pricing | 50% cost reduction |
These are multiplicative. Laconic + prompt caching + batch API can reduce effective cost by 95%+ on structure-heavy workloads.
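A rough model of the stacking (every percentage below is an illustrative assumption, not a measurement):

```python
tokens = 100_000

# Laconic: assume 30% structural savings on this corpus
after_laconic = tokens * (1 - 0.30)

# Prompt caching: assume 80% of the remaining tokens sit in a cached
# prefix billed at 10% of the normal rate, the rest at full price
cost = after_laconic * (0.8 * 0.1 + 0.2 * 1.0)

# Batch API: flat 50% discount on whatever is left
cost *= 0.5

reduction = 1 - cost / tokens  # ~90% under these assumptions
```

Under these fairly conservative numbers the effective cost drops about 90%; heavier structural savings and a larger cached prefix are what push the combined reduction past 95%.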