
MCP and CLIs are the same shape: notes on tool design for AI agents

Date: 2026-05-25
Post 3 of ~27 (build-in-public, intermittent cadence)

Engineers keep asking "how do I design an MCP server?" like it's a new problem. It's not. It's CLI design with a different transport. If you can't write your tool as a CLI subcommand, you probably shouldn't ship it as MCP.

A short writeup, because there are thousands of MCP servers in the public registry and most of them are clearly designed without anyone thinking about what an agent actually needs the tool to do. The good ones look suspiciously like good CLIs.

The accidental discovery

The agent portfolio at danielmicaletti.dev has four surfaces (MCP HTTP, MCP stdio, a citty CLI, and a Next.js web UI), all wrapping a single handler package. When I scaffolded it, I assumed I'd hit a fork point: surely the MCP tools would need different schemas than the CLI subcommands, surely the agent-facing contract would diverge from the human-facing one. Neither happened. The same Zod schema validates a CLI flag, generates the MCP tool's JSON schema, and types the web handler. One source of truth, four surfaces, zero divergence.

That's not an accident of clever scaffolding. The reason it works is that MCP tools and CLI commands have the same underlying shape: typed input, do work, typed output. Everything else is surface chrome — help text vs JSON schema descriptions, argv parsing vs stdio framing, exit codes vs structured errors. The structural part is identical.
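To make "same shape, different chrome" concrete, here is a minimal sketch. It deliberately avoids Zod, citty, and any MCP SDK so it runs on its own: FieldSpec, validate, fromArgv, and toJsonSchema are hypothetical stand-ins for what those libraries do, not their real APIs, and the handler body is a placeholder for the real lookup.

```typescript
// Hypothetical, dependency-free stand-in for a Zod schema. One field spec
// drives argv parsing, JSON Schema generation, and input validation.
type FieldSpec = { type: "string"; description: string; required: boolean };
type Schema = Record<string, FieldSpec>;
type Input = Record<string, string>;

const queryProjectsSchema: Schema = {
  project_name: { type: "string", description: "Project to look up", required: true },
};

// Shared validation: every surface funnels its raw values through this.
function validate(schema: Schema, raw: Record<string, unknown>): Input {
  const out: Input = {};
  for (const key of Object.keys(schema)) {
    const value = raw[key];
    if (value === undefined) {
      if (schema[key].required) throw new Error(`missing required input: ${key}`);
      continue;
    }
    if (typeof value !== "string") throw new Error(`${key} must be a string`);
    out[key] = value;
  }
  return out;
}

// The handler: typed input, do work, typed output. No transport code.
// The body is a placeholder; the real lookup is out of scope here.
function queryProjects(input: Input): { normalized_query: string } {
  return { normalized_query: input["project_name"].trim().toLowerCase() };
}

// CLI chrome: turn `--flag value` pairs into raw input, then validate.
function fromArgv(schema: Schema, argv: string[]): Input {
  const raw: Record<string, unknown> = {};
  for (let i = 0; i < argv.length; i += 2) {
    const flag = argv[i];
    if (flag.slice(0, 2) !== "--") throw new Error(`unexpected argument: ${flag}`);
    raw[flag.slice(2)] = argv[i + 1];
  }
  return validate(schema, raw);
}

// MCP chrome: derive the tool's JSON Schema from the same field specs.
function toJsonSchema(schema: Schema) {
  const properties: Record<string, { type: string; description: string }> = {};
  const required: string[] = [];
  for (const key of Object.keys(schema)) {
    properties[key] = { type: schema[key].type, description: schema[key].description };
    if (schema[key].required) required.push(key);
  }
  return { type: "object", properties, required };
}

// Two surfaces, one handler, one contract.
const viaCli = queryProjects(fromArgv(queryProjectsSchema, ["--project_name", "Portfolio"]));
const viaMcp = queryProjects(validate(queryProjectsSchema, { project_name: "Portfolio" }));
```

Both calls reach the handler with identical validated input. The only code that differs per surface is how the raw values arrive.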

The implications, once you see it

Implication 1: the CLI is the better dev loop. When I'm designing a new agent tool, I write it as a CLI subcommand first. The feedback is human-readable, and the iteration cycle is "save file, hit up-arrow, see what it returns." Once it feels right at the CLI, I add the MCP wrapper. If it feels wrong as a CLI (too verbose, too coarse, asks for inputs a human would never type), it'll feel wrong to the agent too. Agents aren't sentient, but they share a surprising amount of UX taste with engineers.

Implication 2: tools should be named like verbs, not like objects. A command named projects returns everything; a command named query_projects takes a filter and returns a slice. Both work. But put yourself in an agent's seat: which one is easier to call correctly with a job description as input? The verb-named one, every time. The same principle applies to MCP. query_projects, evaluate_role_fit, generate_brief, schedule_call: all verbs in my portfolio's tool list, because that's what the agent is doing.

Implication 3: small, composable tools beat big do-it-all tools. A do_portfolio tool that takes a mode parameter and returns everything is a junk-drawer interface. Agents call it, get a wall of JSON, can't reason over the result. Better: five verbs that each do one thing, with sharp Zod schemas that document what each returns. The agent composes them in sequence — get_about then query_projects then evaluate_role_fit — and the composition is legible to a human reading the trace.
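A rough sketch of that composition follows. The tool names come from the post; every input field, return shape, and data record below is hypothetical and exists only to show the sequence.

```typescript
// All data, field names, and return shapes here are hypothetical.
interface About { name: string; skills: string[] }
interface Project { slug: string; name: string; skills: string[] }
interface RoleFit { matched: string[]; missing: string[]; score: number }

const PROJECTS: Project[] = [
  { slug: "agent-portfolio", name: "Agent Portfolio", skills: ["typescript", "mcp"] },
  { slug: "log-shipper", name: "Log Shipper", skills: ["go"] },
];

// get_about: no input, one small typed result.
function getAbout(): About {
  return { name: "Example Engineer", skills: ["typescript", "zod"] };
}

// query_projects: takes a filter, returns a slice.
function queryProjects(input: { project_name: string }): Project[] {
  const q = input.project_name.toLowerCase();
  return PROJECTS.filter(
    (p) => p.slug.indexOf(q) !== -1 || p.name.toLowerCase().indexOf(q) !== -1
  );
}

// evaluate_role_fit: everything it needs arrives in its input.
function evaluateRoleFit(input: { required_skills: string[]; evidence_skills: string[] }): RoleFit {
  const matched = input.required_skills.filter((s) => input.evidence_skills.indexOf(s) !== -1);
  const missing = input.required_skills.filter((s) => input.evidence_skills.indexOf(s) === -1);
  const score = input.required_skills.length === 0 ? 0 : matched.length / input.required_skills.length;
  return { matched, missing, score };
}

// The sequence an agent might run, written out by hand with a readable trace.
const trace: string[] = [];

const about = getAbout();
trace.push("get_about");

const projects = queryProjects({ project_name: "portfolio" });
trace.push("query_projects");

const evidence = about.skills.concat(...projects.map((p) => p.skills));
const fit = evaluateRoleFit({
  required_skills: ["typescript", "mcp", "rust"],
  evidence_skills: evidence,
});
trace.push("evaluate_role_fit");
```

Each step returns a small result that feeds the next call's input, so a human reading the trace can see which call produced which fact.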

Implication 4: stateful tools are an anti-pattern. A CLI invocation remembers nothing about the one before it, and an MCP tool call should be treated the same way, even though the MCP connection itself carries a session. Don't ship a tool that depends on what the previous call did. If the agent needs state, materialize it in the input of the next call. That sounds restrictive until you watch an agent make three parallel tool calls and realize how much state-coupling silently breaks the parallelization model.
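One way to picture "materialize it in the input" is pagination. This is a sketch with made-up data and hypothetical names (listItems, cursor, next_cursor), not code from the portfolio.

```typescript
// Hypothetical data set standing in for whatever the tool lists.
const ITEMS = ["a", "b", "c", "d", "e"];

// Anti-pattern: the result depends on how many calls came before this one.
let hiddenOffset = 0;
function nextPageStateful(): string[] {
  const page = ITEMS.slice(hiddenOffset, hiddenOffset + 2);
  hiddenOffset += 2;
  return page;
}

// Stateless version: the caller carries the cursor, so every call is a
// pure function of its input.
interface ListInput { cursor: number; limit: number }
interface ListOutput { items: string[]; next_cursor: number | null }

function listItems(input: ListInput): ListOutput {
  const end = input.cursor + input.limit;
  return {
    items: ITEMS.slice(input.cursor, end),
    // null signals that there is nothing left to fetch.
    next_cursor: end < ITEMS.length ? end : null,
  };
}

// Call order no longer matters: these two could be issued in reverse,
// as here, or concurrently, and each still returns the same page.
const secondPage = listItems({ cursor: 2, limit: 2 });
const firstPage = listItems({ cursor: 0, limit: 2 });
```

With nextPageStateful, two parallel callers would race on hiddenOffset and neither could predict which page it gets. With listItems, the page is fully determined by the arguments.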

What this looks like in practice

The portfolio's query_projects handler is ~60 lines of TypeScript. It takes a Zod-validated input (project_name string, with fuzzy matching across slug + name + display_name + keywords), reads a JSON file, returns a typed match object. That's the entire tool. The MCP wrapper is a four-line tool({ inputSchema, execute }) call. The CLI wrapper is a citty subcommand with the same schema. The web tool path is the same handler imported and awaited. There is no MCP-specific code in the handler. There never needed to be.

When I had to fix a chat hit-rate problem last week (Cicero was hallucinating project details when fuzzy lookup missed), the entire fix was in the handler — add keyword fields, expand the matchesProjectName logic, write two new Vitest cases. The MCP server, the CLI, and the web tool path all picked up the fix automatically because they're all calling the same function. That's not a Vercel-specific magic trick; it's what happens when you design the handler and generate the surfaces.
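For illustration, a handler of roughly this shape might look like the sketch below. The post does not show the real code, so this is an assumption-heavy stand-in: the records are inlined instead of read from a JSON file, "fuzzy" is approximated with a normalized substring match, and the explicit not-found result is my own guess at how a miss could be surfaced. Only the names query_projects, matchesProjectName, project_name, and the four searched fields come from the post.

```typescript
interface Project {
  slug: string;
  name: string;
  display_name: string;
  keywords: string[];
}

// Inlined hypothetical records; the post's handler reads them from a JSON file.
const PROJECTS: Project[] = [
  { slug: "agent-portfolio", name: "agent_portfolio", display_name: "Agent Portfolio", keywords: ["mcp", "cli"] },
  { slug: "log-shipper", name: "log_shipper", display_name: "Log Shipper", keywords: ["observability"] },
];

// Lowercase and strip separators so "Agent Portfolio", "agent-portfolio",
// and "agent_portfolio" all compare equal.
function normalize(text: string): string {
  return text.toLowerCase().replace(/[^a-z0-9]/g, "");
}

// Stand-in for matchesProjectName: a normalized substring match against
// every searchable field, keywords included.
function matchesProjectName(project: Project, query: string): boolean {
  const needle = normalize(query);
  if (needle.length === 0) return false;
  const fields = [project.slug, project.name, project.display_name].concat(project.keywords);
  return fields.some((field) => normalize(field).indexOf(needle) !== -1);
}

// A miss is an explicit typed result, so the caller never has to guess.
type QueryResult =
  | { found: true; project: Project }
  | { found: false; query: string };

// The handler itself: no MCP, CLI, or web code anywhere in it.
function queryProjects(input: { project_name: string }): QueryResult {
  for (const project of PROJECTS) {
    if (matchesProjectName(project, input.project_name)) {
      return { found: true, project };
    }
  }
  return { found: false, query: input.project_name };
}
```

Because every surface calls queryProjects, widening matchesProjectName in this one place changes the behavior of all of them at once.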

The principle

Design the handler. Generate the surfaces. The fork between "MCP server" and "CLI" and "web tool" is at the transport, not at the contract. Engineers who get this ship faster, change less code per feature, and produce tools that agents can actually use without months of prompt-tuning.

Why I'm posting this

Tool design is the load-bearing skill for the next decade of AI engineering. Every team building an agent product is going to hit the same wall: their MCP server, their internal SDK, their CLI, and their web tools are all built independently, drift over time, and end up with subtly different contracts that the agent has to learn four times. The shape was the same the whole time. Build it once.

If you're hiring for Forward Deployed / Applied AI / Staff scope at an AI-native team, the harness engineer you actually want is the one who recognizes that the MCP tools they're shipping are CLIs in a trenchcoat — and designs accordingly.

Open to Staff/Founding AI Engineer and Forward Deployed / Applied AI Engineer scope at AI-native teams. Remote-first. The live agent at the top of this page is a working MCP server you can wire into Claude Desktop in 30 seconds — it's the demo.