What Are MCP Servers? The Developer's Guide to the Model Context Protocol
MCP is how AI agents talk to the outside world. This is the practitioner's guide: what the protocol actually does, how the pieces fit together, and what I've learned building and running MCP servers in production.
If you’ve used Claude Code, Cursor, Windsurf, or any other AI coding tool in the last year, you’ve probably used MCP without knowing it. The Model Context Protocol is the wire protocol that lets AI agents call external tools, read external data, and interact with systems beyond what’s in their training data or context window. It’s the reason Claude Code can query your database schema, search your codebase, or check your git history — not because those capabilities are built in, but because an MCP server is running locally and answering those questions over a standardized protocol.
I’ve built four MCP servers in Go — scry for code intelligence, tome for database schemas, lore for git history, and flume for HTTP traffic inspection. They all run as local daemons, they all speak MCP, and they all exist because the protocol is good enough that building on it is the obvious choice for giving AI agents access to structured data.
This post is the overview I wish I’d had when I started. Not the spec — you can read the spec. This is what the protocol means in practice, how the pieces fit together, and where the sharp edges are.
The problem MCP solves
Before MCP, giving an AI agent access to external tools meant one of two things. Either the tool vendor built a bespoke integration into the AI platform (which meant the platform gatekept which tools existed), or you hacked something together with function calling and hoped the model would use it correctly.
Both approaches had the same fundamental issue: there was no standard way for an agent to discover what tools were available, what parameters they accepted, what data they could return, and how to authenticate. Every integration was a snowflake. If you wanted Claude to talk to your database and your IDE wanted Claude to talk to your database, those were two separate integrations built by two separate teams using two separate interfaces.
MCP standardizes all of it. One protocol, one wire format, one set of primitives. A tool that works with Claude Code works with Claude Desktop works with Cursor works with any client that implements the spec. The server doesn’t need to know which client is calling it. The client doesn’t need to know how the server is implemented. That’s the point.
The practical impact is that the MCP ecosystem has exploded. There are hundreds of community-built servers covering everything from Slack to Postgres to Kubernetes. And because the protocol is simple enough to implement in a weekend, individual developers can build MCP servers that solve their specific workflow problems — which is exactly what I ended up doing four times.
What are MCP servers used for?
MCP servers give AI agents capabilities they don’t have natively. The categories break down roughly like this:
Code and development tools. This is the most mature category. MCP servers for code search, symbol lookup, test running, linting, and build systems. My own scry falls here — it gives agents sub-millisecond access to code definitions, references, and call graphs instead of grepping through files.
Data access. Databases, data warehouses, APIs. Tome introspects SQL databases so an agent can ask “what columns does the users table have?” and get an instant answer instead of reading migration files. There are community servers for Postgres, MySQL, BigQuery, Snowflake, and most other data stores.
Infrastructure and DevOps. Kubernetes, Docker, cloud providers, CI/CD systems. Agents can check deployment status, read logs, manage resources. This category is growing fast as teams start using agents for operational tasks.
Communication and productivity. Slack, GitHub, Linear, Notion, email. An agent can read issues, post messages, create PRs, or search documentation. These servers bridge the gap between the agent’s workspace and the team’s communication tools.
File systems and local environment. Reading and writing files, running shell commands, accessing environment variables. These are the simplest MCP servers and often the first ones developers encounter.
Observability and debugging. Log aggregation, tracing, HTTP traffic inspection. Flume captures HTTP traffic between your browser and dev server and makes it queryable by agents — so when something breaks, the agent can see exactly what requests were made and what came back.
The common thread is that MCP servers give agents access to real-time, structured data from systems that exist outside the model’s training data. The agent doesn’t need to guess what your database schema looks like if it can just ask.
How MCP works
MCP is JSON-RPC 2.0 over one of three transports. That’s the entire protocol in one sentence. If you’ve built a REST API, the mental model is similar, except the client is an LLM agent, the methods are standardized, and the protocol handles capability negotiation that REST leaves to convention.
The lifecycle
Every MCP session follows the same flow:
- Initialize. The client sends an initialize request declaring its name, version, and capabilities. The server responds with its own name, version, and capabilities. This is where both sides learn what the other supports.
- Discover. The client calls tools/list, resources/list, and prompts/list to learn what the server offers. This is dynamic — a server can change its capabilities at runtime and notify the client.
- Use. The client calls tools/call to invoke a tool, resources/read to fetch data, or prompts/get to retrieve a prompt template. These are the actual work calls.
- Disconnect. Either side can shut down the session cleanly.
The important thing is that discovery is built into the protocol. An AI agent connecting to an MCP server doesn’t need to be pre-configured with a list of available tools. It asks the server what’s available, gets back structured descriptions with JSON Schema parameter definitions, and can immediately start using them. This is what makes MCP composable — you can add a new server to your setup and the agent automatically learns what it can do.
The three transports
MCP can run over three different transports, and the choice matters more than most people realize:
Stdio is the simplest. The client spawns the server as a child process and communicates over stdin/stdout. There’s no networking, no ports, no configuration. This is what most local MCP servers use, and it’s what Claude Desktop and Claude Code default to for locally-installed tools. The downside is that the server’s lifecycle is tied to the client — when the client exits, the server dies.
SSE (Server-Sent Events) runs the server as an HTTP service. The client connects to an SSE endpoint for server-to-client messages and sends requests via HTTP POST. This is the transport you use when the server needs to outlive the client session, run on a remote machine, or serve multiple clients. It’s slightly more complex to set up but dramatically more flexible.
Streamable HTTP is the newest transport and the one the spec is converging toward. It uses standard HTTP POST for all communication, with optional SSE streaming for server-initiated messages. It’s the most firewall-friendly and the easiest to deploy behind a reverse proxy. If you’re starting a new remote MCP server today, this is the one to use.
The transport decision has real architectural implications. I wrote about this in detail in my complete guide to building MCP servers in Go, but the short version is: use stdio for local tools that the AI agent launches on demand, and use streamable HTTP for anything that runs as a service.
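For the stdio case, client configuration usually amounts to telling the client what command to spawn. A sketch of what an mcpServers entry looks like in a Claude Desktop or Claude Code style config file (the binary path and flags here are hypothetical):

```json
{
  "mcpServers": {
    "scry": {
      "command": "/usr/local/bin/scry",
      "args": ["serve", "--stdio"]
    }
  }
}
```

The client spawns the command, speaks JSON-RPC over its stdin/stdout, and kills it on exit — which is exactly the lifecycle coupling described above.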
The three primitives
MCP servers expose functionality through three primitives. Most tutorials focus on tools and ignore the other two, which means most developers only use a third of the protocol. Understanding all three matters because choosing the right primitive for a given capability changes how the AI agent interacts with your server.
Tools
Tools are functions the agent can call. They take parameters, do something, and return a result. This is the primitive that maps most directly to function calling. When Claude Code runs a database query or searches your codebase, it’s calling an MCP tool.
A tool has a name, a description, and a JSON Schema defining its parameters. The description is more important than you’d think — it’s what the AI agent reads to decide whether to use the tool. A vague description means the agent won’t call your tool when it should, or will call it when it shouldn’t. Writing good tool descriptions is honestly one of the underrated skills of MCP server development.
Every one of my servers is primarily tool-based. Scry exposes tools like scry_defs (find where a symbol is defined), scry_callers (find what calls a function), and scry_refs (find all references to a symbol). Tome exposes tome_describe (get a table’s schema) and tome_relations (get foreign key relationships). The tools are narrow, specific, and do one thing well. That pattern works better than broad, do-everything tools because the agent can compose them.
Resources
Resources are data the agent can read. They’re identified by URIs and return content — think of them as a read-only file system. A resource might be a configuration file, a database schema dump, a log stream, or any other structured data the agent might want to reference.
The key difference from tools is that resources are meant for context, not action. An agent reads a resource to inform its next decision; it calls a tool to actually do something. In practice, resources are underused. Most MCP servers expose everything as tools because tool calling is what developers understand. But resources are the right choice when the agent needs reference data — the schema of your database, the contents of your configuration files, the current state of your deployment.
Prompts
Prompts are reusable templates the server can offer to the client. They’re the least understood primitive and the most interesting for workflow automation. A prompt template can include parameters, system messages, and structured output formats. The client can present them to the user as actions or use them programmatically.
I haven’t used prompts heavily in my own servers yet, but the pattern I keep seeing in the ecosystem is prompts as workflow starters — “analyze this table,” “review this PR,” “explain this error” — where the server provides a battle-tested prompt template that the agent wouldn’t generate on its own.
MCP servers vs REST APIs
This is the comparison that comes up most often, and the confusion makes sense — both are ways for software to talk to other software over a network. But they’re designed for different consumers.
A REST API is designed for human-driven applications. A developer reads the docs, writes a client, handles pagination, parses responses, manages auth tokens. The API assumes a human understood the docs and wrote correct integration code.
An MCP server is designed for AI agent consumption. The agent discovers what’s available at runtime through the protocol itself — no docs to read, no client to write. The server declares its tools with structured parameter schemas, the agent reads those schemas, and it knows how to call them. When you add a new tool to an MCP server, every connected agent can use it immediately. No client update, no code change, no deployment.
The other key difference is that MCP is bidirectional. A REST API is request-response: the client asks, the server answers. MCP supports server-initiated notifications — the server can tell the agent that its capabilities changed, that a long-running operation completed, or that something in the environment shifted. This matters for agentic workflows where the agent needs to react to changes, not just poll for them.
In practice, many MCP servers are thin wrappers around existing REST APIs. That’s fine. The MCP layer adds discovery, schema enforcement, and client compatibility. You’re not replacing your API — you’re making it accessible to agents.
MCP servers vs RAG
This is the other common confusion. RAG (Retrieval-Augmented Generation) and MCP both give AI models access to external data, but the mechanisms and use cases are different.
RAG is about pre-loading context. Before the model generates a response, you retrieve relevant documents from a vector database and inject them into the prompt. The model sees the retrieved text as part of its input and uses it to ground its response. RAG is passive — the model doesn’t choose to retrieve; the system does it on the model’s behalf.
MCP is about active tool use. During its reasoning process, the agent decides it needs information, calls an MCP tool to get it, receives the result, and continues reasoning. The agent is in control of when and what to retrieve. It can call multiple tools, combine results, and make decisions about what information it actually needs.
The practical difference: RAG is great when you know what context the model will need ahead of time. MCP is great when the model needs to explore, query, and interact with systems dynamically. They complement each other — and in fact, one common MCP server pattern is a tool that performs vector similarity search, essentially giving the agent the ability to do RAG on demand rather than having retrieval done on its behalf.
I find MCP more powerful for development workflows because the agent’s information needs are unpredictable. When debugging a production issue, the agent might need to check the database schema, then read a log file, then look at git blame, then query an API. You can’t pre-load all of that context via RAG. But you can give the agent MCP servers for each data source and let it pull what it needs.
The MCP ecosystem in 2026
The ecosystem has matured fast. When I started building MCP servers in late 2025, the Go SDK had just landed and most servers were TypeScript. A year later, the landscape looks very different.
SDKs
The official SDKs cover TypeScript, Python, Go, Java, Kotlin, and C#. The Go SDK in particular has gotten solid — it supports all three transports, has auth middleware, and tracks the spec closely. There are also strong community SDKs in most languages. I covered the Go SDK landscape in depth — the short version is that the official SDK is the right default for new projects.
Clients
Every major AI coding tool now supports MCP: Claude Desktop, Claude Code, Cursor, Windsurf, VS Code Copilot, and more. The client landscape is why MCP servers are worth building — write one server and it works everywhere.
Servers
The official MCP servers repository has reference implementations for dozens of integrations. The community has built hundreds more. The quality varies widely. Some servers are production-grade; others are weekend projects that break on edge cases. This is one of the reasons I keep building my own — I need tools that handle real workloads reliably, and the fastest path to that is owning the code.
Building MCP servers
If you’re a developer looking at MCP and thinking “I should build one,” you’re probably right. The protocol is simple enough that a useful MCP server is a weekend project. A production-grade one takes longer, but not as long as you’d think.
I’ve written extensively about the implementation side:
- Building MCP Servers in Go: The Complete Guide covers SDK choice, transport configuration, testing patterns, and production deployment with working code at every step.
- The Context Layer explains the architectural pattern behind my four MCP servers — pre-computing answers into an embedded KV store and serving them in single-digit milliseconds.
- Building Scry is a deep dive into building a code intelligence MCP server specifically.
The one thing I’ll emphasize here is that the hardest part of building an MCP server isn’t the protocol. The MCP layer is maybe 200 lines of code in any of my servers. The hard part is the domain logic — indexing code symbols, parsing SQL schemas, traversing git history. The protocol is plumbing. The value is in what you pipe through it.
MCP server examples
To make this concrete, here’s what some real MCP servers actually do:
Scry (code intelligence): Pre-indexes your codebase using SCIP, stores symbol definitions, references, and call graphs in BadgerDB. When an agent asks “where is this function defined?” or “what calls this method?”, scry returns the answer in under 5 milliseconds. Without it, the agent would grep through every file in your project.
Tome (database schema): Connects to your SQL database, introspects every table, column, foreign key, and index, caches the full schema. The agent asks tome_describe users and gets back the complete schema with types, constraints, and relationships. Without it, the agent reads migration files and guesses.
Lore (git intelligence): Pre-indexes git blame, commit history, and co-change patterns. The agent asks “who last modified this function?” or “what files usually change together with this one?” and gets structured answers. Without it, the agent runs git log and git blame and parses unstructured text.
Flume (HTTP debugging): Runs as a reverse proxy between your browser and dev server, captures every request and response. The agent asks “what was the last POST to /api/users?” and gets the full request body, response, headers, and timing. Without it, the agent asks you to add console.log statements and re-trigger the request.
Each of these follows the same pattern: pre-compute expensive work, cache the results, serve them instantly over MCP. The protocol is the easy part. The domain logic — parsing SCIP indexes, traversing foreign key graphs, building blame indexes — is where the real engineering lives.
Security and authentication
MCP servers run with the full permissions of the process that hosts them. A tool that reads files can read any file the server process can access. A tool that executes shell commands can execute anything. There’s no sandboxing built into the protocol.
This is fine for local servers running on your development machine under your user account. It’s a real concern for remote servers or anything that handles sensitive data. The spec includes an auth framework based on OAuth 2.1, and the official SDKs have middleware for it, but auth is one of those things that’s easy to skip in a tutorial and hard to retrofit later.
The practical advice: if your MCP server runs locally and only you use it, the ambient permissions of your user account are sufficient. If it runs remotely, faces the network, or handles data beyond your personal workstation, authentication and authorization are not optional. The OAuth 2.1 flow in the spec is well-designed — use it.
MCP in agentic workflows
MCP gets more interesting when you move beyond single-agent, single-server setups. In a multi-agent system, MCP servers become shared infrastructure. My four servers — scry, tome, lore, flume — all run as persistent daemons, and any agent in a multi-agent setup can connect to them. The orchestrator doesn’t need to know which tools each agent needs. The agents discover what’s available via MCP’s built-in capability negotiation.
This is where the protocol’s design pays off. Because MCP handles discovery, authentication, and structured data exchange, adding a new tool to a multi-agent workflow is just starting a new server process. No configuration changes to the agents, no coordination logic to update, no message schema to version. The agents ask the server what’s available, and the server tells them.
I’ve written about multi-agent patterns and how Orch coordinates multiple Claude Code instances. MCP is the layer that makes this work without drowning in integration complexity.
Where MCP is going
The protocol is still evolving. The 2026 roadmap includes better support for long-running operations, improved streaming capabilities, and tighter integration between the auth framework and client-side UX. The transport layer is converging on streamable HTTP as the standard for remote servers, with stdio remaining the default for local tools.
The trend I’m watching is MCP servers moving from development tools to production infrastructure. Right now, most MCP usage is developers connecting tools to their AI coding assistants. But the same protocol works for production AI systems that need to query databases, call APIs, and interact with external services. The protocol doesn’t care whether the client is Claude Code on a developer’s laptop or a production agent processing customer requests.
The bet I’m making with my own tools is that the MCP ecosystem will be the primary interface between AI agents and external systems within the next year. Every server I build now is an investment in that future — and the protocol is stable enough that I’m not worried about breaking changes.
Frequently Asked Questions
What is an MCP server?
An MCP server is a program that exposes tools, data, and prompt templates to AI agents using the Model Context Protocol. It gives AI assistants like Claude standardized access to external capabilities — databases, APIs, code intelligence, file systems, or any other system the agent needs to interact with. MCP servers can run locally on your machine or remotely as HTTP services.
How is MCP different from function calling?
Function calling is a model-level feature where the AI generates structured arguments for a predefined function. MCP wraps that concept in a full protocol with capability discovery, transport negotiation, structured error handling, and authentication. Function calling tells the model what it can call. MCP tells the model what's available, how to call it, and handles the communication layer.
What is the difference between MCP servers and REST APIs?
REST APIs are designed for human-driven applications — browsers, mobile apps, other services. MCP servers are designed specifically for AI agent consumption. MCP includes built-in capability discovery (the agent asks the server what tools are available), structured parameter schemas via JSON Schema, and a standardized protocol that works across all AI clients. You don't need to write API docs for an MCP server because the protocol itself is the documentation.
What are MCP servers used for?
MCP servers give AI agents access to external systems. Common use cases include querying databases, searching codebases, reading file systems, interacting with APIs like Slack or GitHub, managing cloud infrastructure, and accessing real-time data. Any capability you want an AI agent to have beyond its training data can be exposed through an MCP server.
What languages can I build MCP servers in?
MCP has official SDKs for TypeScript, Python, Go, Java, Kotlin, and C#. Community SDKs exist for Rust, Ruby, Swift, and others. The protocol is language-agnostic — any language that can do JSON-RPC 2.0 over stdio or HTTP can implement an MCP server.
Is MCP only for Claude?
No. MCP is an open protocol supported by Claude Desktop, Claude Code, Cursor, Windsurf, VS Code Copilot, and many other AI tools. A server built for one client works with all of them. The protocol is maintained by Anthropic but is not exclusive to Anthropic products.
Are MCP servers free?
The protocol itself is open and free. The official SDKs are open-source. Building and running your own MCP server costs nothing beyond your compute. Many community-built MCP servers are also free and open-source. Some commercial MCP servers exist as paid products, but the ecosystem is overwhelmingly open.
How do I test an MCP server?
The MCP Inspector is the standard debugging tool — it connects to your server and lets you call tools interactively. For automated testing, write integration tests that send JSON-RPC requests to your server over stdio and assert on the responses. Most SDKs include test utilities for spinning up a server in-process.
Do I need to know the MCP specification to build a server?
Not to get started. The SDKs abstract the protocol details. But reading the architecture overview on modelcontextprotocol.io is worth it once you're building something real, because it explains why things work the way they do — especially around capability negotiation and the primitive types.
What is the difference between MCP and RAG?
RAG (Retrieval-Augmented Generation) is a pattern where you retrieve relevant documents from a vector database and inject them into the model's context before it generates a response. MCP is a protocol for real-time tool use — the agent calls functions, queries databases, and interacts with systems during its reasoning process. RAG is about feeding context in. MCP is about reaching out. They complement each other: you might use an MCP server that performs RAG-style retrieval as one of its tools.
Why do we need MCP servers?
Without MCP, every AI tool integration is a custom, one-off implementation. MCP standardizes how agents discover tools, pass parameters, handle errors, and authenticate. This means a tool you build once works with every AI client that supports the protocol. It also means the agent can dynamically discover new tools at runtime, rather than being hardcoded to a fixed set of capabilities.