Pick of the Day
May 7, 2026
markdownify-mcp: your agent's context window is mostly HTML scaffolding
In 2026, the under-discussed cost in an agent stack isn’t model pricing. It’s the tokens spent on chrome the model has to read before it reaches anything worth reading. A single nav wrapper might open with <div class="container mx-auto flex flex-col gap-y-4 lg:flex-row lg:items-center lg:justify-between ...">, and a typical article carries dozens of those before the prose starts. Cloudflare put a number on it on 12 February 2026, using one of their own blog posts: 16,180 tokens when served as HTML, 3,150 when served as Markdown. That is an 80% reduction, or roughly 5x more readable content per context window.
Cloudflare measured a single article served as HTML versus Markdown. 80% of the tokens were scaffolding.
Source: Cloudflare, “Introducing Markdown for Agents”, 12 February 2026.
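The arithmetic behind those two headline figures, straight from Cloudflare’s published token counts:

```python
# Cloudflare's published figures for one blog post (12 February 2026).
html_tokens = 16_180      # served as HTML
markdown_tokens = 3_150   # served as Markdown

reduction = 1 - markdown_tokens / html_tokens
density_gain = html_tokens / markdown_tokens

print(f"{reduction:.1%} fewer tokens")                 # 80.5%
print(f"{density_gain:.1f}x more content per window")  # 5.1x
```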
Markdown has quickly become the lingua franca of agents and AI systems as a whole. The format’s explicit, minimal structure is easy for a model to parse, which tends to produce better results while minimising token waste.
What it does
Markdownification: converting any input format (HTML, PDF, audio, video, slides) into Markdown so an LLM reads structure and prose instead of presentational chrome.
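A toy sketch of the idea, using only Python’s standard library. This is illustrative only: real converters handle tables, links, PDFs and much more, but it shows how the class soup disappears while the structure survives.

```python
from html.parser import HTMLParser

class ToyMarkdownifier(HTMLParser):
    """Toy illustration: keeps headings and text, drops all attributes."""
    def __init__(self) -> None:
        super().__init__()
        self.blocks: list[str] = []
        self.prefix = ""

    def handle_starttag(self, tag, attrs):
        # Class soup like class="container mx-auto ..." is simply ignored.
        if tag in ("h1", "h2", "h3"):
            self.prefix = "#" * int(tag[1]) + " "

    def handle_data(self, data):
        if text := data.strip():
            self.blocks.append(self.prefix + text)
            self.prefix = ""

    def markdown(self) -> str:
        return "\n\n".join(self.blocks)

page = ('<div class="container mx-auto flex flex-col">'
        "<h1>Title</h1><p>Body text.</p></div>")
conv = ToyMarkdownifier()
conv.feed(page)
print(conv.markdown())  # heading and prose, scaffolding gone
```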
markdownify-mcp, maintained by @zcaceres, is the MCP server that does this at the host edge. Point it at a PDF, a YouTube URL, an .xlsx, an audio file, or a webpage, and you get Markdown back. The agent then reasons over headings and prose instead of class names and SVG paths.
The current tool list:
- `webpage-to-markdown`: convert any URL to Markdown.
- `youtube-to-markdown`: get the transcript of a YouTube video.
- `pdf-to-markdown`, `docx-to-markdown`, `xlsx-to-markdown`, `pptx-to-markdown`: convert document files.
- `audio-to-markdown`: transcribe an audio file.
- `image-to-markdown`: extract OCR text and metadata from an image.
- `bing-search-to-markdown`: run a Bing search and return the results.
- `get-markdown-file`: read an existing `.md` file from disk.
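Each of these is invoked with a standard MCP `tools/call` request. A sketch of the wire message for `webpage-to-markdown` — note the `url` argument name is an assumption; the server’s `tools/list` response is the source of truth for each tool’s input schema:

```python
import json

# JSON-RPC 2.0 message an MCP host sends to invoke a tool.
# The "url" argument name is an assumption; check the server's
# tools/list response for the actual input schema.
request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {
        "name": "webpage-to-markdown",
        "arguments": {"url": "https://example.com/article"},
    },
}
print(json.dumps(request, indent=2))
```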
How it slots into a stack
Most agent stacks already include a generic fetch tool. The right move is to replace it. A vanilla fetch returns HTML and the agent burns a chunk of its context window on closing tags. markdownify-mcp returns Markdown, and the same call costs roughly a fifth of the tokens. That compound saving shows up most on workflows where the agent reads things it didn’t write: a research agent crawling competitor docs, a finance agent scanning a quarterly PDF, a support agent skimming a YouTube interview for a quote, a back-office agent flattening a spreadsheet into a single string.
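To sanity-check the saving on your own pages before swapping tools, a crude estimate is enough. The four-characters-per-token divisor is a rule of thumb, not a tokenizer, and the strings here are stand-ins for real fetched content:

```python
def rough_tokens(text: str) -> int:
    # Rule of thumb: roughly four characters per token for English text.
    return len(text) // 4

html_page = ('<div class="container mx-auto flex flex-col gap-y-4">'
             "<p>The actual sentence the agent needed.</p></div>")
markdown_page = "The actual sentence the agent needed."

print(rough_tokens(html_page), "vs", rough_tokens(markdown_page))
```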
What stands out
Two design choices reward stack builders specifically. First, markdownify-mcp delegates the messy parsing to Microsoft’s markitdown rather than reinventing it. That’s the kind of restraint most one-person tools never manage. The preinstall step builds a Python venv and installs `markitdown[all]`, and from there markdownify-mcp’s job is to expose that capability as a clean MCP surface. Second, the tool list is small enough to memorise: ten verbs, all returning Markdown, all named after their input type. No orchestration layer, no opinion on chunking. The server does the conversion and gets out of the way.
Where it isn’t the right tool
A few honest limits:
- If the source is already structured (a JSON API, an `llms.txt`, or a Markdown file already on disk), use the source directly. The conversion adds nothing.
- It won’t fix a model that reasons poorly over Markdown. A few smaller open models still do.
- Audio transcription and image OCR require the full `markitdown[all]` install. The published Docker image (`mcp/markdownify`) ships only `markitdown[pdf]`, so either run from source or pin a custom image if you need those tools.
Install
You’ll need Bun installed first. Then clone and build:
```
git clone https://github.com/zcaceres/markdownify-mcp.git
cd markdownify-mcp
bun install
bun run build
```

Once `dist/index.js` exists, wire it into your host. Pick your tool:
```
claude mcp add markdownify -- node {ABSOLUTE_PATH}/dist/index.js
```

Point the agent at a PDF and watch the context window stop being 70% scaffolding.