major labs
Essay 03 · Show your working · 10 min read

Inside the discovery layer

By Charlie Major · 2026-06-12 · updated 2026-05-27 with v0 scan results

In the last essay I called the order. Discovery closes first. This essay shows the work behind the call.

We ran the v0 scan over a few hours yesterday. What follows is what the scanner actually does, what the first sweep found, what an AEO citation tracker measures that Google Search Console cannot, and why the pricing economics of the measurement layer favor an independent operator.

The discovery layer is the most measurable layer in the agentic stack, which is the same reason it is the cheapest to ship into. Most of what follows here is what we will publish in the State of MCP Security Q3 2026 report, with the methodology disclosed in full. Treat this as the preview.


What the scan actually does

GitHub returns 16,150 results for topic:mcp-server. Most of those are noise. After dedupe across five source queries and active-only filtering, the population of distinct, non-archived, non-fork repos is 2,388. We scanned all of them.

The scan runs in four stages. None of them are exotic. The point is that the work has not been done in public until now.

Stage one is discovery. We pull candidates from five GitHub queries: the topic:mcp-server tag (16,150 raw results), the topic:model-context-protocol tag (10,979), the topic:modelcontextprotocol tag (613), the official modelcontextprotocol org (42), and a name-and-description search for "mcp server" on non-fork repos. We dedupe by canonical repo URL and normalize the names. The union after dedupe is 2,388 distinct active repos.
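A minimal sketch of the dedupe step, not the scanner's actual code. It assumes each candidate is a dict shaped like a GitHub REST API repository object (the `html_url`, `fork`, and `archived` fields); the helper names `canonical_repo_key` and `dedupe` are hypothetical.

```python
from urllib.parse import urlparse


def canonical_repo_key(url: str) -> str:
    """Reduce a GitHub repo URL to a lowercase 'owner/name' key."""
    path = urlparse(url.strip()).path.strip("/")
    if path.endswith(".git"):
        path = path[: -len(".git")]
    owner, name = path.split("/")[:2]
    return f"{owner}/{name}".lower()


def dedupe(candidates):
    """Keep the first record per canonical repo, skipping forks and archived repos."""
    seen = {}
    for repo in candidates:
        if repo.get("fork") or repo.get("archived"):
            continue
        seen.setdefault(canonical_repo_key(repo["html_url"]), repo)
    return list(seen.values())
```

Keying on a normalized owner/name pair means the same repo surfacing under two topic tags, or with a trailing `.git`, collapses to one entry.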

Stage two is connection. In the full design, we speak the MCP protocol against each reachable server, enumerate the tools and resources it exposes, and capture the basic capability surface. v0 stops short of that and works from what GitHub returns plus the README. Live protocol connection runs on the HTTP-mode subset in v1.

Stage three is security probing. A controlled, ethical, non-destructive probe suite is the v1 layer. The v0 scan establishes the population and the transport classification needed before the probes are useful. We publish v1 results when the SSRF and related probes have completed.

Stage four is quality measurement. We pull maintenance signals from the GitHub API (last commit, last release, open issue ratio, license, language, size), and classify transport mode from README pattern matching.
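README pattern matching for transport mode could look roughly like the sketch below. The two regexes are illustrative guesses at what a README tends to mention, not the patterns the scanner uses, and `classify_transport` is a hypothetical name.

```python
import re

# Illustrative patterns only; a real classifier would need a wider vocabulary.
STDIO_PATTERN = re.compile(r"\bstdio\b", re.IGNORECASE)
HTTP_PATTERN = re.compile(
    r"streamable[ -]?http|\bsse\b|server-sent events", re.IGNORECASE
)


def classify_transport(readme):
    """Classify a README as 'both', 'stdio-only', 'http-only', or 'unknown'."""
    if not readme or not readme.strip():
        return "unknown"  # no README, so transport cannot be determined
    has_stdio = bool(STDIO_PATTERN.search(readme))
    has_http = bool(HTTP_PATTERN.search(readme))
    if has_stdio and has_http:
        return "both"
    if has_stdio:
        return "stdio-only"
    if has_http:
        return "http-only"
    return "unknown"
```

A classifier like this only sees what the README says, so a server that supports HTTP without documenting it lands in the unknown bucket.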

The v0 scan ran on a single machine in a few hours. The cost was zero — the GitHub API calls fit inside the 5,000-per-hour authenticated rate limit. There is no reason this scan has not existed before now except that nobody decided to do it.
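The rate-limit claim is easy to sanity-check with back-of-envelope arithmetic. The sketch below assumes three core API calls per repo (metadata, README, latest release); that count is an assumption for illustration, not a figure from the scan. It also ignores GitHub's separate, lower limit on search endpoints.

```python
def rate_budget(repos: int, calls_per_repo: int, limit_per_hour: int = 5000):
    """Return (total_calls, hours_at_limit) for a sweep of `repos` repositories."""
    total_calls = repos * calls_per_repo
    return total_calls, total_calls / limit_per_hour


# 2,388 repos at an assumed 3 calls each.
total, hours = rate_budget(2388, 3)
print(f"{total} calls, about {hours:.1f} hours at the authenticated limit")
```

Under that assumption the sweep needs 7,164 calls, which is under an hour and a half at 5,000 calls per hour and consistent with "a few hours" on one machine.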


What v0 measured

The first sweep produced a different shape than the headline numbers suggest. Six findings worth naming.

The functional population is much smaller than the topic count. GitHub reports 16,150 results for topic:mcp-server. After dedupe across five source queries, 2,388 distinct active repos exist. The difference is the long tail of integrations, client wrappers, references, abandoned forks, and unrelated projects that wear the topic tag.

Thirty-nine percent of MCP-tagged repos have no README at all. Of the 2,322 we deep-analyzed, 906 returned no README from the GitHub API. That is not a methodology bug. Those repos genuinely lack a README on the default branch: placeholder repos, abandoned experiments, empty scaffolds. The honest population of functional candidate MCP servers is therefore closer to 1,200 than to the "5,800-plus" figure that circulates in commentary about the category. Anyone leading with the larger number is counting code that does not document itself, let alone ship to users.

The category is mid-explosion. Among active repos, 735 were touched in the last seven days. That is 32 percent of the active population, this week. Another 421 in the last thirty days. Another 401 in the last ninety. Only 193 are stale beyond a year. A third of the active ecosystem is being touched every week, which is what a category looks like right before somebody names it.
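The freshness buckets can be reproduced from a single timestamp per repo. This sketch assumes the GitHub API's `pushed_at` field in ISO 8601 form; which timestamp the scan actually used is not stated here, and `freshness_bucket` is a hypothetical helper.

```python
from datetime import datetime, timedelta, timezone


def freshness_bucket(pushed_at: str, now: datetime) -> str:
    """Bucket a timestamp like '2026-05-20T12:00:00Z' by age relative to `now`."""
    last = datetime.fromisoformat(pushed_at.replace("Z", "+00:00"))
    age = now - last
    if age <= timedelta(days=7):
        return "7d"
    if age <= timedelta(days=30):
        return "30d"
    if age <= timedelta(days=90):
        return "90d"
    if age <= timedelta(days=365):
        return "1y"
    return "stale"


now = datetime(2026, 5, 27, tzinfo=timezone.utc)
print(freshness_bucket("2026-05-25T00:00:00Z", now))
```

Each repo falls into exactly one bucket, which matches the "another 421, another 401" phrasing above.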

Transport mode shifts dramatically as you go down the stars curve. In the top hundred by stars, half of the analyzed repos support both stdio and HTTP transports. Across the full population, only 31 percent do. Stdio-only is 13 percent. HTTP-only is 8 percent. Roughly 38 percent have no readable README, so their transport is undeterminable. The high-star cohort is materially more sophisticated than the long tail.

The license risk is real. Fifty-five percent ship under MIT. Fifteen percent under Apache 2.0. Fourteen point six percent ship with no license at all. That is a procurement-blocking finding for any enterprise that wants to deploy MCP servers commercially. Many CTOs do not yet know this is the number.

TypeScript and Python dominate. Active population: TypeScript 32 percent, Python 30 percent, JavaScript 9 percent, Go 7 percent, Rust 5 percent. The remainder is fragmented across 30-plus languages. Anthropic's official MCP SDKs shipped first for TypeScript and Python, which likely explains the top two.

The full data set lives in out/repos.csv in the scanner repo. The top fifty by stars and the maintenance freshness curve sit in the State of MCP Security preview.


What v1 measures next

The v0 sweep establishes the population and the maintainability picture. v1 adds the security probes that turn maintenance data into a procurement signal.

The v1 sweep covers five vulnerability categories. Each probe is non-destructive, returns a binary present-or-not-present result per repo, and ships open-source.

SSRF. Server-side request forgery. The server exposes a tool that fetches URLs without validating the target. The probe sends URLs pointing at internal addresses (169.254.169.254, localhost, link-local space) and checks for successful retrieval. Target population: the HTTP-mode and dual-transport subset, roughly nine hundred repos by the v0 transport classification.
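For a sense of what "validating the target" means, here is a minimal sketch of the check a URL-fetching tool would be expected to run, and whose absence the probe is looking for. It uses only the standard library; `is_internal_target` is a hypothetical name, and the sketch does not resolve DNS names, which a real check would need to do.

```python
import ipaddress
from urllib.parse import urlparse


def is_internal_target(url: str) -> bool:
    """True if the URL points at loopback, link-local, or private address space."""
    host = urlparse(url).hostname
    if host is None:
        return True  # unparseable target: treat as unsafe
    host = host.lower()
    if host == "localhost" or host.endswith(".localhost"):
        return True
    try:
        ip = ipaddress.ip_address(host)
    except ValueError:
        return False  # a DNS name; this sketch does not resolve it
    return ip.is_loopback or ip.is_link_local or ip.is_private
```

The cloud metadata address 169.254.169.254 sits in link-local space, so it is caught by the `is_link_local` branch.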

Unauthenticated tool execution. The server exposes state-modifying tools without verifying the caller. The probe attempts a representative state-modifying call without any auth credentials and checks whether the call succeeds.

Prompt injection passthrough. The server includes user input verbatim in the context passed to an LLM, with no sanitization or boundary marking. The probe sends a string containing a known injection signature and checks whether the model executes the injected instruction.

Information disclosure in errors. The server returns errors that leak internal paths, stack traces, or configuration details. The probe sends malformed requests and pattern-matches the response body for file paths and secret signatures.
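Pattern-matching an error body for leaks can be sketched in a few lines. The patterns below are illustrative assumptions, not the probe's actual signature set, and the path pattern in particular will produce false positives on ordinary URLs.

```python
import re

# Illustrative signatures only; names and patterns are hypothetical.
LEAK_PATTERNS = {
    "unix_path": re.compile(r"(?:/[\w.-]+){2,}\.\w+"),
    "windows_path": re.compile(r"[A-Za-z]:\\(?:[\w.-]+\\)+[\w.-]+"),
    "python_traceback": re.compile(r"Traceback \(most recent call last\)"),
    "node_stack_frame": re.compile(r"^\s+at .+\(.+:\d+:\d+\)", re.MULTILINE),
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
}


def find_leaks(body: str) -> list[str]:
    """Return the sorted names of every leak signature found in a response body."""
    return sorted(name for name, pat in LEAK_PATTERNS.items() if pat.search(body))
```

Reducing the result to "any signature matched" gives the binary present-or-not-present answer the probes report per repo.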

Outdated dependencies. The server runs on package versions with known CVEs. The probe parses exposed package manifests and cross-references the OSV vulnerability database.
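The manifest-to-OSV step might look like the sketch below for an npm `package.json`. It only builds the query payloads in the shape the OSV query API accepts (package name, ecosystem, version) and makes no network call. Treating the base version of a `^` or `~` range as the installed version is a simplifying assumption; a lockfile would be more accurate. The function name is hypothetical.

```python
import json
import re

# Accept exact versions and simple caret/tilde ranges; skip everything else.
SIMPLE_VERSION = re.compile(r"[\^~]?(\d+\.\d+\.\d+)")


def osv_queries_from_package_json(manifest_text: str) -> list[dict]:
    """Build one OSV query dict per dependency with a recognizable version."""
    manifest = json.loads(manifest_text)
    queries = []
    for name, spec in manifest.get("dependencies", {}).items():
        match = SIMPLE_VERSION.fullmatch(spec.strip())
        if match is None:
            continue  # git URLs, wildcards, and complex ranges are skipped
        queries.append(
            {"package": {"name": name, "ecosystem": "npm"}, "version": match.group(1)}
        )
    return queries
```

The resulting list can be wrapped as `{"queries": [...]}` and posted to OSV's batch query endpoint; any non-empty result for a package marks the repo as present for this category.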

The State of MCP Security Q3 2026 report ships when the v1 sweep completes. The headline will lead with the v0 finding — 39 percent no README, ~1,200 functional candidates, 14.6 percent unlicensed — and then drill into the v1 probe results. Every number is reproducible from the methodology and the open-source scanner.

A note on intellectual honesty: an earlier draft of this essay cited specific percentages for the five vulnerability categories as if v0 had already measured them. It had not. The v0 sweep covered population, maintenance, transport, license, and language. The probe percentages are v1 work. The category needs the registry whether the probe findings are 5 percent or 50 percent in any given class, but we will not invent the numbers before the probes run.


AEO Citation Tracker — what it measures that GSC cannot

Google Search Console measures Google. AEO Citation Tracker measures what Google cannot tell you.

GSC reports impressions in Google search, clicks to your site, average position, click-through rate, the pages that rank, and the queries that drive traffic. All of it is Google-only, and all of it assumes a model where the user reads your URL on a search results page and clicks through. That model is breaking.

A third of consumer queries now resolve inside an AI Overview without a click. ChatGPT, Perplexity, and Gemini cite differently from each other and differently from Google's organic ranking. The overlap between any two LLMs is small. The publisher who optimized for Google for the last fifteen years has no instrument to measure the new surface that is replacing it.

AEO Citation Tracker measures four things GSC structurally cannot.

Citation share by LLM. When you ask ChatGPT "what are the new agentic commerce protocols," how often is your URL cited in the answer? When you ask Perplexity the same question? Gemini? The numbers typically differ by 10x. The same content, three platforms, three radically different shares. Publishers we have shown a sample to are universally surprised at the spread.
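Citation share reduces to a simple ratio once the answers are collected. This sketch assumes each sampled answer has already been reduced to a platform label and a list of cited URLs; how the tracker collects those is outside its scope, and `citation_share` is a hypothetical name.

```python
from collections import defaultdict
from urllib.parse import urlparse


def _host_matches(url: str, domain: str) -> bool:
    """True if the URL's host is the domain itself or one of its subdomains."""
    host = (urlparse(url).hostname or "").lower()
    return host == domain or host.endswith("." + domain)


def citation_share(answers, domain: str) -> dict:
    """answers: iterable of (platform, [cited_urls]). Returns {platform: share}."""
    asked = defaultdict(int)
    cited = defaultdict(int)
    for platform, urls in answers:
        asked[platform] += 1
        if any(_host_matches(u, domain) for u in urls):
            cited[platform] += 1
    return {p: cited[p] / asked[p] for p in asked}
```

Share here means the fraction of sampled answers on a platform that cite the domain at least once, which is one reasonable definition among several.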

Citation context. When cited, what role does your content play? Primary source, supporting reference, or contradicting view? The role determines whether a citation drives traffic, brand recall, or just noise. The tracker classifies each citation by role using a separate scoring pass.

Schema correlation. Does adding FAQPage, HowTo, or Article schema change citation likelihood? The early data says yes, materially, on Gemini, weakly on Perplexity, and not at all on ChatGPT. Publishers can act on that data the same week they collect it. None of it is visible to anyone running GSC alone.

Disappearance events. When do you stop being cited? Model updates, content drift, competitive replacement. The tracker watches for citation drops and surfaces the likely cause. Publishers find out their flagship content stopped being cited the same hour it happens, not the next quarter when revenue craters.
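One simple way to flag a disappearance is to compare each new reading against a trailing baseline. The window length and drop threshold below are arbitrary assumptions chosen for illustration, and `find_drops` is a hypothetical name; the tracker's real detection logic is not described here.

```python
def find_drops(share_series, window: int = 7, drop_ratio: float = 0.5):
    """Return indices where share falls below drop_ratio times the trailing mean.

    share_series holds one citation-share reading per check, oldest first.
    """
    drops = []
    for i in range(window, len(share_series)):
        baseline = sum(share_series[i - window : i]) / window
        if baseline > 0 and share_series[i] < drop_ratio * baseline:
            drops.append(i)
    return drops
```

A trailing mean keeps one noisy reading from triggering an alert on its own, while a sustained fall still shows up at the first check where it crosses the threshold.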

AEO is a different measurement category, not an extension of SEO. The skills, the tooling, and the buyer all live somewhere else. We are not building a Google Search Console competitor. We are building the instrument that measures the surface Google Search Console cannot.


The pricing economics

The current landscape of measurement tools for the AI-overviews era looks like this.

Free DIY scripts. A capable developer can write the equivalent of a citation tracker in a weekend. Many have. None of them sustain it past month three because the ongoing API budget and the rate-limit dance are tedious. The free tier exists in theory and rarely in production.

Google Search Console. Free, comprehensive for Google's own search, and structurally blind to everything we just described.

The enterprise SEO incumbents. Semrush, Ahrefs, Conductor, and the rest. Pricing starts around $400 a month and runs to $5,000 a month for the enterprise tiers. The data is comprehensive for Google, but the AI-search coverage is bolted on, partial, and lags the AI platforms by weeks.

The emerging AI-search-visibility category. Three or four startups launched in 2025 with $1,000 to $3,000 a month pricing and an enterprise sales motion. They serve corporate procurement, they require a sales call, and they are not buyable by a solo publisher or a five-person content team.

Major Labs AEO Citation Tracker prices at free, $49, and $199. The economics behind that range are deliberate. A solo publisher with fifty thousand monthly readers cannot afford $1,000 a month to measure their AI search visibility. A five-person content team at a mid-market SaaS can afford $199. The buyer is the same person who already pays for Webflow, Stripe, Postmark, and a stack of individual SaaS subscriptions in the $50-to-$200 range. The buying motion is muscle memory.

The free tier is a discovery wedge. Three URLs, weekly checks, no credit card. Most users convert when they hit the wall, which usually happens the week after they share a screenshot of the dashboard with a colleague.

The broader pattern is familiar. The discovery-layer measurement category mirrors what Vercel did to Heroku and what Resend is doing to SendGrid. The incumbents are too expensive and too slow for the long tail. The category opens for an operator who prices for the buyer who is actually shopping, not the buyer the legacy vendor wishes they were selling to.


What ships first

Both products land in Q3 2026.

MCP Quality Registry. Public directory of the 2,388 active MCP-tagged repos discovered in v0, with maintenance, transport, license, and (once v1 ships) security scores. Refreshed weekly. Free read access. Verified badge for server publishers at $299 one-time review plus $99 a month maintenance. Enterprise API access for procurement teams at $999 a month. The State of MCP Security Q3 2026 report drops alongside the registry launch.

AEO Citation Tracker. Free tier with three URLs and weekly checks. Pro tier at $49 a month with 25 URLs, daily checks, and basic alerts. Team tier at $199 a month with 100 URLs, real-time alerts, and API access. The first dataset goes to subscribers two weeks before public availability.

Both ship with the open-source scanner cores. The hosted layers are the commercial product. The methodology is published in full. The data behind every number we publish is reproducible by anyone with a weekend and an API budget.

The next essay goes inside the commerce layer. Mandate scope verification, refund primitives, audit trails that survive a processor inquiry, and what BudgetGuard and MandateKit actually do at the protocol level.

See you Friday.

— Charlie

Charlie Major writes Major Matters and joined Mastercard in April 2026. Major Labs is independent of Mastercard and operates separately from Major Matters. Any opinions in these essays are Charlie's own.

Update 2026-05-27: This essay was originally drafted with projected vulnerability percentages before the v0 scan had run. The updated text replaces those projections with the actual v0 findings (population, maintenance, transport, license) and moves the vulnerability percentages to v1 work. The earlier framing overclaimed what v0 had measured. The category needs the registry regardless of the eventual probe numbers.
