WebMCP: The Web Finally Speaks Agent

A brief on WebMCP, Google and Microsoft's new browser standard that gives AI agents a proper way to use the web — without the screenshots, the scraping, or the guesswork.

The web was built for human eyes and fingers. Clicking buttons. Filling forms. Making things happen. Not really for software trying to complete tasks - AI agents are learning that the hard way.

For years, the only way for code to interact with websites was to essentially impersonate a human - scrape the HTML, reverse-engineer the frontend, and pray nothing changes the next day. An entire ecosystem was born from this frustration - BeautifulSoup, Playwright, Scrapy, Selenium - tools whose job was to make code pass as human. The cat-and-mouse game between scrapers and anti-bot defences like reCAPTCHA and Cloudflare has escalated to the point where heavily funded startups like Firecrawl exist just to stay one step ahead.

AI agents haven't solved this - they have only made it more expensive. When you ask an agent to book a flight or raise a support ticket, what actually happens under the hood is almost embarrassing. The agent takes a screenshot of the page, guesses where the buttons are, clicks, takes another screenshot, and repeats - one inference call at a time. Or it parses raw HTML to find the elements. Move a button five pixels, rename a CSS class, or run an A/B test, and the agent silently breaks. This is untenable. Something needed to change.

WebMCP in a world of agents (image credit: Gemini)
WebMCP in a world of agents (image credit: Gemini)

What is WebMCP?

WebMCP is a new, open-standard browser API initiative co-authored by Google and Microsoft that enables web developers to define tools and actions directly on a web page for AI agents to use. Instead of an agent looking at a screenshot of a website and guessing where the "buy" button is, the website explicitly tells the browser: "I have a function called buyTicket that requires a date and a destination."

WebMCP proposes two new APIs that allow browser agents to interact with pages:

  • Declarative API: Perform standard actions directly through HTML forms, adding metadata tags like toolname and tooldescription to existing elements
  • Imperative API: Perform complex, dynamic, or multi-step interactions that require JavaScript execution

These APIs serve as a bridge between your website and agentic workflows. Websites publish a "Tool Contract" - a structured manifest of capabilities - and agents discover and invoke them directly.

What Next?

WebMCP is not a replacement for Anthropic's Model Context Protocol (MCP). While MCP operates as a backend protocol connecting AI platforms to services, WebMCP operates entirely client-side within the browser. WebMCP also states that it is designed for collaborative browsing with users "in the loop", approving actions and maintaining control. Fully autonomous scenarios, where agents take over a site, are not within the scope of WebMCP.

At scale, this unlocks a whole new class of use cases that the traditional web was never built for. The mundane friction that defines much of digital life becomes a walk in the park for agents. For businesses, this will be the new SEO. Being discoverable by search engines won't be enough anymore; you'll need to be operable by AI agents too. The websites that thrive won't necessarily be the ones with the most beautiful interfaces. They'll be the ones with the clearest contracts.

Subscribe to alphasec

Don’t miss out on the latest issues. Sign up now to get access to the library of members-only issues.
jamie@example.com
Subscribe