WebMCP: Making Your Web App Agent-Ready

AI agents no longer need to scrape your UI. Your website simply tells them what it can do.

Every time an AI agent interacts with a website today, it’s doing something absurd: it takes a screenshot, sends that image to a large language model, and hopes the model can figure out where to click. When that doesn’t work, it parses raw HTML, burning thousands of tokens trying to reverse-engineer an interface designed for human eyes. A single button click can consume as many tokens as a chapter of a small novel.

This is the state of the art, and it’s embarrassingly fragile. A minor CSS change, a repositioned button, a lazy-loaded element: any of these can break the entire flow.

WebMCP changes this. Instead of forcing agents to guess, your website explicitly declares, directly in the browser, what it can do through structured, callable tools. No backend server, no OAuth dance, no screenshots.

Most articles about WebMCP stop at explaining the concept. This one gives you a working demo app that implements both the declarative and imperative APIs. Code you can clone and run in Chrome 146 today. Along the way, I’ll break down how WebMCP works under the hood, where it fits alongside Anthropic’s MCP, and what to watch out for while the spec is still evolving.

What is WebMCP?

WebMCP (Web Model Context Protocol) is a browser API specification incubated within a W3C Community Group. It introduces navigator.modelContext, a new interface that lets any web page register tools (named functions with natural-language descriptions and JSON Schema inputs) that AI agents can discover and invoke.
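Concretely, a tool is just a plain object: a name, a natural-language description, a JSON Schema for inputs, and an execute handler. Here is a hypothetical sketch of that shape; the search-products tool is illustrative, not from any real site, and the exact registration surface is covered in the steps below:

```javascript
// Hypothetical sketch of the tool shape WebMCP registers. The field
// names mirror the spec's vocabulary; the tool itself is invented.
const exampleTool = {
  name: "search-products",
  description: "Search the product catalog by keyword.",
  inputSchema: {
    type: "object",
    properties: {
      query: { type: "string", description: "Search keyword." },
    },
    required: ["query"],
  },
  async execute({ query }) {
    // Client-side logic runs here; results come back as MCP-style
    // content blocks that the agent can read directly.
    return { content: [{ type: "text", text: `Results for ${query}` }] };
  },
};
```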

The concept builds on Anthropic’s Model Context Protocol (MCP), which standardized how AI models communicate with external tools via backend servers. WebMCP takes that same mental model (tools with names, descriptions, schemas, and execute handlers) and moves it entirely client-side, into the browser tab.

The W3C specification, published as a Draft Community Group Report in February 2026 and jointly authored by engineers at Google and Microsoft, puts it this way: web pages using WebMCP can be thought of as MCP servers that implement tools in client-side script rather than on a backend.

Chrome 146 Canary carries the first early preview implementation behind a feature flag, and Microsoft Edge is expected to follow soon given its shared Chromium base.

Why does this matter?

Consider the numbers. When an AI agent uses screenshot-based browsing to complete a task on a website, each interaction might consume thousands of tokens for the image alone, not counting the reasoning needed to interpret it. Early benchmarks report up to a ~90% decrease in token usage when switching from screenshot-based CDP automation to structured WebMCP tool calls (source: WebMCP-org/chrome-devtools-quickstart). The exact savings depend on the task and the baseline (independent measurements have shown more modest reductions of around 68%), but the direction is consistent: structured tool calls are dramatically cheaper.

But token savings are only part of the story. The more fundamental shift is from nondeterministic to deterministic interaction. A screenshot-based agent might identify a “Submit” button nine times out of ten and miss it on the tenth because a cookie banner appeared. A WebMCP tool invocation either succeeds or fails because there’s no guessing.

And perhaps the most practical advantage: no new infrastructure. WebMCP tools run in your existing browser tab, using your existing JavaScript, inheriting the user’s existing authentication session. There is no separate server to deploy, no API keys to manage, no OAuth flows to configure.

WebMCP vs. MCP: Complementary, not competing

A common point of confusion is the relationship between WebMCP and Anthropic’s MCP. They share the same conceptual building blocks (tools with names, descriptions, input schemas, and handlers) but they are architecturally distinct and designed for different scenarios.

MCP is a backend protocol. It runs on servers, uses JSON-RPC over transports like stdio or HTTP SSE, requires OAuth for authentication, and supports tools, resources, and prompts. It works regardless of whether the user has a browser open. Think of it as a company’s customer service call center which is available anywhere, anytime.

WebMCP is a browser-native API. It runs in the active tab, uses postMessage internally, inherits the browser's cookies and session, and currently supports tools only. It exists only while the user has the page open. Think of it as an in-store expert who is only available when the customer walks in.

The Chrome team’s recommendation is clear: use both. MCP handles your core backend logic, data retrieval, and background processing. WebMCP handles contextual, in-browser interactions where the user is present and the agent needs to act within the page.

The technical architecture

WebMCP creates two parallel layers on any web page:

Human layer: the familiar visual UI of HTML, CSS, and JavaScript, which remains completely unchanged. Your users never know WebMCP exists.

Machine layer: structured tool contracts with JSON Schema descriptions, which sit alongside the human layer, invisible but discoverable by AI agents.

The browser acts as the orchestration layer between these layers. It maintains the tool registry, validates inputs against JSON schemas, routes invocations to the correct handlers, enforces same-origin security boundaries, and mediates permission flows between agent, user, and site.

The flow works like this: a website exposes a set of tools, the user asks an AI agent to do something, and the agent checks the browser for what’s available. It picks the right tool, sends an invocation request, and the browser asks the user for permission when configured. The tool runs on the page and returns structured JSON back through the chain. The UI updates in real time, so both the user and the agent see the same result.

Building an agent-ready Todo app

Let’s make this concrete. I’ve built a minimal Todo application that demonstrates both of WebMCP’s APIs:

  • Declarative (HTML)
  • Imperative (JavaScript)

The full source is on GitHub, and it runs with zero dependencies beyond Node.js for a static file server.

Project setup

webmcp-todo-demo/
├── server.js          # Zero-dep static file server
├── package.json
└── public/
    ├── index.html     # HTML with declarative WebMCP forms
    ├── style.css
    ├── app.js         # Todo app logic (state, CRUD, render)
    └── webmcp.js      # WebMCP integration (Steps 1-5)

Step 1: Feature detection

Before registering any tools, check whether the browser supports WebMCP. Chrome 146’s early preview exposes the API as navigator.modelContextTesting; the stable release will use navigator.modelContext. A simple fallback handles both:

const mcpApi =
  navigator.modelContextTesting || navigator.modelContext || null;

if (mcpApi) {
  // WebMCP is available — register tools
} else {
  // Graceful fallback — app works normally for humans
}

This is the WebMCP equivalent of checking for serviceWorker in navigator or IntersectionObserver in window. The app should always work without WebMCP; the tools are an additive layer.

Step 2: The Declarative API, an HTML form as a tool

The simplest way to expose a tool is by adding two attributes to an existing HTML form. Note that Chrome’s early preview supports these attributes today, though the formal W3C spec section on declarative WebMCP is still a TODO. Attribute names and behavior come from the explainer document and may evolve before standardization:

<form
  toolname="add-todo"
  tooldescription="Add a new todo item to the list. Provide a short description of the task."
>
  <input type="text" name="description" required />
  <button type="submit">Add</button>
</form>

That’s it. The browser reads toolname and tooldescription, automatically generates a JSON Schema from the form's inputs, and registers the tool. When an agent invokes add-todo, the browser pre-fills the description field and submits the form.
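For illustration, the schema the browser might derive from the form above would look roughly like this. The exact output is implementation-defined; this object is an assumption for the sake of the example, not captured browser output:

```javascript
// Hypothetical: roughly the JSON Schema a browser could generate from
// the add-todo form's single required text input.
const derivedSchema = {
  type: "object",
  properties: {
    description: { type: "string" }, // from <input name="description">
  },
  required: ["description"], // because the input has the `required` attribute
};
```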

On the JavaScript side, you can detect whether the submission came from an agent and return structured data:

addForm.addEventListener("submit", (e) => {
  e.preventDefault();
  const desc = addForm.querySelector("input[name='description']").value;
  const todo = addTodo(desc);

  // Agent-aware response
  if (e.agentInvoked && typeof e.respondWith === "function") {
    e.respondWith(
      Promise.resolve({
        content: [{ type: "text", text: JSON.stringify({ success: true, todo }) }],
      })
    );
  }
});

The e.agentInvoked boolean and e.respondWith() method are WebMCP extensions to the standard SubmitEvent. If a human submits the form, these don't exist and your normal flow runs unchanged.
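Since every form handler in the app repeats this check-and-respond pattern, it can be factored into a small helper. This is a sketch of my own, not code from the demo:

```javascript
// Sketch (not from the demo): wrap the agent-aware submit pattern so
// every form handler can share it. `e.agentInvoked` and `e.respondWith`
// are the WebMCP SubmitEvent extensions described above.
function respondToAgent(e, payload) {
  if (e.agentInvoked && typeof e.respondWith === "function") {
    e.respondWith(
      Promise.resolve({
        content: [{ type: "text", text: JSON.stringify(payload) }],
      })
    );
    return true; // an agent received a structured response
  }
  return false; // normal human submission, no response needed
}
```

With this in place, the add-todo handler collapses to `respondToAgent(e, { success: true, todo })` after its normal DOM work.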

Step 3: The Imperative API, JavaScript tool registration

For operations that don’t map to a simple form, like listing items, toggling status, or deleting, the W3C spec defines a JavaScript API using registerTool(). In Chrome 146's early preview, this method is not yet available on navigator.modelContextTesting, so the demo uses hidden declarative forms as the working path and includes the imperative code behind a feature guard:

if (typeof mcpApi.registerTool === "function") {
  mcpApi.registerTool({
    name: "list-todos",
    description: "List all current todo items with their status.",
    inputSchema: {
      type: "object",
      properties: {
        filter: {
          type: "string",
          enum: ["all", "pending", "done"],
          description: "Filter by status. Defaults to 'all'.",
        },
      },
    },
    execute: async ({ filter = "all" }) => {
      let result = listTodos();
      if (filter === "pending") result = result.filter((t) => !t.done);
      if (filter === "done") result = result.filter((t) => t.done);

      return {
        content: [{ type: "text", text: JSON.stringify(result) }],
      };
    },
    annotations: { readOnlyHint: true },
  });
}

Meanwhile, the same tool works today via a hidden declarative form with toolautosubmit. The browser submits it automatically when an agent invokes the tool:

<form
  id="list-todos-form"
  toolname="list-todos"
  tooldescription="List all current todo items."
  toolautosubmit
  hidden
>
  <input type="text" name="filter" />
</form>
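On the JavaScript side, this hidden form still needs a submit handler that returns the filtered list to the agent. A sketch, with the filtering extracted into a pure helper; listForm and listTodos() stand in for the demo's actual references:

```javascript
// Sketch: pure filtering helper for the hidden list-todos form's
// submit handler. Mirrors the filter logic from the imperative tool.
function filterTodos(todos, filter = "all") {
  if (filter === "pending") return todos.filter((t) => !t.done);
  if (filter === "done") return todos.filter((t) => t.done);
  return todos;
}

// Inside the form's submit listener (browser-only, shown for context):
// listForm.addEventListener("submit", (e) => {
//   e.preventDefault();
//   const filter = listForm.querySelector("input[name='filter']").value;
//   const result = filterTodos(listTodos(), filter || "all");
//   if (e.agentInvoked && typeof e.respondWith === "function") {
//     e.respondWith(Promise.resolve({
//       content: [{ type: "text", text: JSON.stringify(result) }],
//     }));
//   }
// });
```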

A few things to notice. The description field is critical because it's the primary input agents use to decide which tool to call. Write it for an LLM, not a human. The inputSchema uses standard JSON Schema, so agents can validate their own inputs before sending. The annotations.readOnlyHint signals that this tool only reads data, helping agents prioritize safe operations.

Step 4: Human-in-the-loop for destructive actions

WebMCP supports two approaches to human-in-the-loop confirmation, depending on which API you use.

Declarative approach (works today): Omit toolautosubmit from the form. The browser pre-fills the fields but waits for the user to click the submit button before proceeding. The add-todo form in the demo uses this pattern, so the agent proposes, the human confirms with a click.

Imperative approach (future-ready): The delete-todo tool in the demo includes guarded code using requestUserInteraction(), which will activate when Chrome ships the full imperative API:

execute: async ({ id }, agent) => {
  // Pause and ask the user
  const confirmed = await agent.requestUserInteraction(async () => {
    return window.confirm(`An AI agent wants to delete todo #${id}. Allow?`);
  });

  if (!confirmed) {
    return { content: [{ type: "text", text: "Deletion cancelled by user." }] };
  }

  const removed = deleteTodo(id);
  return {
    content: [{ type: "text", text: JSON.stringify({ success: true, deleted: removed }) }],
  };
},

The second argument to execute is a ModelContextClient object. Its requestUserInteraction() method pauses the tool's execution, hands control back to the user, waits for them to interact (in this case, responding to a confirm() dialog), and then resumes. This is the spec's answer to the question "what if an agent tries to do something the user didn't intend?"
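If several destructive tools need this guard, the confirmation dance can be wrapped once. A sketch of my own, assuming `agent` is the ModelContextClient described above; when no client is available, it denies by default:

```javascript
// Sketch: reusable confirmation guard around requestUserInteraction().
// Denies destructive actions by default if there is no way to ask.
async function confirmWithUser(agent, message) {
  if (!agent || typeof agent.requestUserInteraction !== "function") {
    return false; // no client to ask through: fail safe
  }
  // Pauses the tool, shows a confirm() dialog, resumes with the answer.
  return agent.requestUserInteraction(async () => window.confirm(message));
}
```

A delete handler then reduces to `if (!(await confirmWithUser(agent, msg))) return cancelledResult;`.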

Step 5: Contextual re-registration

One of WebMCP’s most powerful patterns is updating tool context whenever your app’s state changes. In a single-page application, the tools available should reflect the current view and data.

Declarative approach (works today): Update the tooldescription attribute dynamically. The demo hooks into the render cycle to keep the list-todos description current:

function updateToolContext() {
  const snapshot = listTodos()
    .map((t) => `#${t.id} [${t.done ? "done" : "pending"}] ${t.description}`)
    .join("; ");

  listForm.setAttribute(
    "tooldescription",
    `List all current todo items. Current state: ${
      listTodos().length === 0 ? "empty list" : snapshot
    }`
  );
}
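The description-building logic above is easy to extract into a pure function, which also makes it unit-testable outside the browser. A sketch, where `todos` is an array of `{ id, done, description }` objects as in the demo:

```javascript
// Sketch: pure version of updateToolContext()'s string building, so the
// snapshot logic can be tested without a DOM.
function buildToolDescription(todos) {
  if (todos.length === 0) {
    return "List all current todo items. Current state: empty list";
  }
  const snapshot = todos
    .map((t) => `#${t.id} [${t.done ? "done" : "pending"}] ${t.description}`)
    .join("; ");
  return `List all current todo items. Current state: ${snapshot}`;
}
```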

Imperative approach (future-ready): The proposal’s explainer defines provideContext(), a batch method that atomically replaces all registered tools. Note: the formal W3C spec currently defines registerTool() and unregisterTool() as the primary imperative APIs; provideContext() coexists in Chrome's implementation and the explainer but may evolve:

if (mcpApi && typeof mcpApi.provideContext === "function") {
  mcpApi.provideContext({ tools: toolDefinitions });
}

Both approaches serve the same goal: the list-todos tool description always contains a snapshot of the current todo list, giving the agent immediate context without needing a separate "read state" call first.

Testing it with the Model Context Tool Inspector

Testing WebMCP tools requires dedicated tooling since you can’t simply invoke them from the console. Google provides the Model Context Tool Inspector, a Chrome extension that acts as a minimal agent:

  1. Install the extension from the Chrome Web Store.
  2. Navigate to http://localhost:3000 with WebMCP enabled.
  3. Open the extension popup.
  4. You’ll see all four registered tools (add-todo, list-todos, toggle-todo, delete-todo) with their input schemas.
  5. Click any tool, fill in the parameters, and invoke it. The app updates in real time.

For a more realistic agent experience, you can configure the extension with a Gemini API key, issue natural-language requests (“Add a todo to buy milk, then mark the first todo as done”), and watch the tool calls happen.

Three patterns for designing WebMCP tools

Based on guidance from Alex Nahas, the creator of MCP-B (the browser-based precursor that influenced the WebMCP standard), there are three categories of tools to think about:

Read-only tools should be flat, always available, and marked with readOnlyHint: true. These are your GET operations: listing items, fetching state, checking status. Agents can call them freely without risk.

Navigation tools tell the agent what the site does and where information lives. They’re marked as potentially destructive since they may cause page navigation, which tears down current tools and registers new ones.

Write tools take action: filling forms, submitting data, updating records. These should use requestUserInteraction() for human confirmation before making changes.
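These three categories map naturally onto tool annotations. A sketch using MCP's readOnlyHint and destructiveHint annotation names; the tool names are from the demo, but how WebMCP surfaces these hints may still change as the spec evolves:

```javascript
// Sketch: annotations for the three tool categories described above.
// readOnlyHint / destructiveHint follow MCP's tool annotation names.
const toolAnnotations = {
  "list-todos": { readOnlyHint: true },      // read-only: safe to call freely
  "go-to-settings": { destructiveHint: true }, // navigation tears down tools
  "delete-todo": { destructiveHint: true },  // write: confirm with the user first
};
```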

Security: What’s built in and what’s still open

WebMCP ships with several security mechanisms baked into the spec. It requires a secure context (HTTPS only). Tools are scoped by the same-origin policy. The browser mediates all invocations with permission prompts. And tools only exist while the page is open.

However, the spec is still evolving, and there are known open concerns. Prompt injection remains a theoretical risk: a tool’s description or return value could contain harmful instructions targeting the agent’s reasoning. Cross-tab context isolation is undefined. An agent with access to multiple tabs could potentially be manipulated by a malicious tab to act on a legitimate one. And the current provideContext() method replaces all tools atomically, which means a malicious third-party script loaded on your page could overwrite your tools.

These are areas of active discussion in the W3C working group, and the spec explicitly positions itself as experimental. For now, treat WebMCP tools the same way you’d treat any client-side code: don’t trust inputs blindly, validate on the server side for anything sensitive, and use requestUserInteraction() as a guardrail for destructive operations.
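A minimal example of that defensive posture: validate tool inputs before touching state. This validator is my own sketch, not code from the demo:

```javascript
// Sketch: never trust tool inputs blindly. A tiny validator a tool's
// execute() could run before acting on an id argument.
function validateTodoId(input) {
  const id = Number(input);
  if (!Number.isInteger(id) || id <= 0) {
    return { ok: false, error: "id must be a positive integer" };
  }
  return { ok: true, id };
}
```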

Current status and what’s next

As of March 2026, WebMCP is available as an early preview in Chrome 146 Canary behind a feature flag. Microsoft Edge is expected to follow. Firefox and Safari have representatives in the W3C working group but have not committed to implementation timelines.

The specification is a W3C Draft Community Group Report, not a W3C Standard, and not yet on the Standards Track. The API surface may change. There are over 100 open issues on the GitHub repository, covering discovery mechanisms, permission models, and security hardening.

That said, the foundation is solid, the backing from Chrome and Edge is real, and the developer tooling already exists. If you’re building web applications that you expect AI agents to interact with, now is the right time to start experimenting.

Get started

Clone the demo:

git clone https://github.com/skoepp/webmcp-todo-demo.git
cd webmcp-todo-demo
node server.js

Enable WebMCP in Chrome:

Navigate to chrome://flags/#enable-webmcp-testing, enable the flag, and relaunch.

Explore the specification:

The full W3C spec lives at webmachinelearning.github.io/webmcp, and the official Chrome tools are at GoogleChromeLabs/webmcp-tools.

WebMCP: Making Your Web App Agent-Ready was originally published in Towards AI on Medium, where people are continuing the conversation by highlighting and responding to this story.
