major labs

The security surface, measured

A weekly static read of the source code behind the most-used public MCP servers, looking for the patterns that matter when an AI agent is on the other end: command injection, SSRF surface, code execution, unsafe deserialization, committed secrets. Population statistics, updated as the sweep reruns. A companion to the population data.

The one hard rule

Static analysis of public source code only. We never connect to, run, install, probe, or exploit a running MCP server. The code is open; the signal is in the code. A finding is a pattern visible in the source, never a confirmed vulnerability, and a "High surface" score means the code does risky things in risky ways and deserves a closer look, never that a server is compromised.

1,852 servers swept (most-used, active)
34.9% have at least one risky pattern
47 High surface (deserve a closer look)
Latest sweep: 2026-06-08

Risk-surface tiers

Each swept server gets a transparent score from the categories present in its source, capped at 100. Tiers are score bands, not verdicts.

High surface: 47
Elevated surface: 599
Low surface: 1,206
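
A minimal sketch of how a per-category score with a cap and tier bands could be computed. Only the shape comes from the text (one contribution per category present, capped at 100, banded into tiers). The weights and band edges are hypothetical placeholders, since the page does not publish the real values.

```python
# Hypothetical weights: the sweep's actual values are not stated on this page.
CATEGORY_WEIGHTS = {
    "command_injection": 40,
    "code_execution": 35,
    "unsafe_deserialization": 30,
    "hardcoded_secret": 25,
    "ssrf_surface": 20,
}

# Hypothetical band edges, chosen only for illustration.
HIGH_MIN = 60
ELEVATED_MIN = 1


def surface_score(categories_present):
    """Add one weight per distinct category, then cap the total at 100."""
    total = sum(CATEGORY_WEIGHTS[c] for c in set(categories_present))
    return min(total, 100)


def tier(score):
    """Map a score to a band. Bands describe surface, not a verdict."""
    if score >= HIGH_MIN:
        return "High surface"
    if score >= ELEVATED_MIN:
        return "Elevated surface"
    return "Low surface"


# A category counted twice still contributes once.
print(surface_score(["ssrf_surface", "ssrf_surface"]))  # 20
print(tier(surface_score(["command_injection", "code_execution"])))  # High surface
```

Using a set before summing is what keeps repeated hits in one category from raising the score.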

By pattern (share of swept servers)

SSRF surface: 31.3%
code execution: 3.3%
command injection: 2.3%
unsafe deserialization: 0.4%
hardcoded secret: 0.3%

Heuristics are tuned for precision over recall: they miss more than they over-report, so the numbers here are best read as lower bounds.

What the breakdowns show

The headline rate is steady across the population, so we cut it three ways to see what moves it. The first answer is the one people don't expect: being popular does not make a server safer. Servers with 500+ stars carry a risky pattern at nearly the same rate as those with fewer than 50 (34.6% versus 37.8%). Stars measure adoption, not review.

By popularity (GitHub stars)

                  servers   any pattern   SSRF surface
<50 stars             666         37.8%          33.2%
50-499 stars          868         32.7%          30.4%
500+ stars            318         34.6%          29.9%

By language (Python and JS/TS only; see the Language note)

                  servers   any pattern   SSRF surface
JavaScript            157         56.7%          52.9%
TypeScript            607         43.5%          41.8%
Python                531         36.2%          28.8%

By transport

                  servers   any pattern   SSRF surface
local (stdio)         220         41.8%          38.2%
network (HTTP)        768         35.5%          32.2%
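
As a quick consistency check, the popularity buckets can be pooled back into the headline figures. The sketch below uses only numbers published on this page. The function and variable names are illustrative, and the last step is an observation rather than something the page states.

```python
# (servers, % with any pattern, % with SSRF surface), copied from the popularity cut.
STAR_BUCKETS = {
    "<50 stars": (666, 37.8, 33.2),
    "50-499 stars": (868, 32.7, 30.4),
    "500+ stars": (318, 34.6, 29.9),
}


def pooled_share(buckets, column):
    """Server-weighted average of one percentage column across buckets."""
    total = sum(row[0] for row in buckets.values())
    flagged = sum(row[0] * row[column] / 100 for row in buckets.values())
    return total, round(100 * flagged / total, 1)


total, any_share = pooled_share(STAR_BUCKETS, 1)
_, ssrf_share = pooled_share(STAR_BUCKETS, 2)
print(total, any_share, ssrf_share)  # 1852 34.9 31.3

# Observation only: High + Elevated (47 + 599) is also 34.9% of 1,852.
# That fits Low meaning "no pattern found", but the page does not say so.
tier_share = round(100 * (47 + 599) / total, 1)
print(tier_share)  # 34.9
```

The language and transport cuts list 1,295 and 988 servers respectively, so they are subsets of the sweep and are not expected to pool back to the headline.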

Language. Within the languages the checks actually cover, the JavaScript and TypeScript ecosystem, which is most of MCP, carries a markedly higher SSRF surface than Python (52.9% for JavaScript and 41.8% for TypeScript, against 28.8%), because building a request from a variable (fetch(url)) is idiomatic there. We restrict this cut to Python and JS/TS on purpose: the heuristics are Python and JS specific, so ranking Go or Rust against them would measure our checks, not their code. We don't publish a number we'd have to caveat into meaninglessness.

Transport. Counter to intuition, the local (stdio) servers flag more often than the network-facing ones: 41.8% carry at least one pattern, against 35.5%. The servers running on your machine, with filesystem and shell reach, are the set with the larger flagged surface.

The series

The series starts here. Security snapshots accumulate with each weekly sweep, and a longitudinal record cannot be reconstructed after the fact. Check back: the line gets interesting when it has somewhere to go.

What the sweep looks for

command injection: subprocess with shell=True, os.system, child_process.exec with interpolated arguments. An agent tool that shells out with model-controlled input is a direct remote-code-execution path.
SSRF surface: outbound requests/fetch/axios calls to a URL built from a variable, with no allow-list in sight. The classic MCP risk: an agent fetches an attacker-chosen internal URL, like a cloud metadata endpoint.
code execution: eval, exec, new Function(), vm.runInNewContext on non-literal input. Arbitrary code from tool arguments.
unsafe deserialization: pickle.loads, yaml.load without SafeLoader. Deserializing untrusted input is code execution wearing a different hat.
hardcoded secrets: live-looking API key patterns committed to the source. Keys in public repos get harvested in minutes.
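
To make the Python-side patterns above concrete, here is a minimal sketch of how such checks could be written with the standard-library ast module. It is an assumption about approach, not the sweep's actual rule set: the function names and the HTTP call list are hypothetical, and it does not try to detect an allow-list. The source is parsed, never executed.

```python
import ast

# Hypothetical subset of outbound-request calls whose first argument is the URL.
HTTP_CALLS = {"requests.get", "requests.post", "httpx.get", "httpx.post"}


def dotted_name(node):
    """Return 'a.b.c' for a Name/Attribute chain, or '' for anything else."""
    parts = []
    while isinstance(node, ast.Attribute):
        parts.append(node.attr)
        node = node.value
    if isinstance(node, ast.Name):
        parts.append(node.id)
        return ".".join(reversed(parts))
    return ""


def classify_call(call):
    """Return a category name for a risky-looking call, or None."""
    name = dotted_name(call.func)
    first = call.args[0] if call.args else None
    dynamic_first = first is not None and not isinstance(first, ast.Constant)
    kwargs = {kw.arg: kw.value for kw in call.keywords if kw.arg}

    if name == "os.system" and dynamic_first:
        return "command_injection"
    if name.startswith("subprocess."):
        shell = kwargs.get("shell")
        if isinstance(shell, ast.Constant) and shell.value is True:
            return "command_injection"
    if name in HTTP_CALLS and dynamic_first:
        return "ssrf_surface"
    if name in {"eval", "exec"} and dynamic_first:
        return "code_execution"
    if name in {"pickle.loads", "pickle.load"}:
        return "unsafe_deserialization"
    if name == "yaml.load":
        loader = kwargs.get("Loader") or (call.args[1] if len(call.args) > 1 else None)
        if loader is None or not dotted_name(loader).endswith("SafeLoader"):
            return "unsafe_deserialization"
    return None


def scan_python(source):
    """Parse source text (never run it) and list (category, line) pairs."""
    findings = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Call):
            category = classify_call(node)
            if category:
                findings.append((category, node.lineno))
    return sorted(findings, key=lambda f: f[1])


SAMPLE = '''
import os, pickle, requests, subprocess, yaml
def tool(url, cmd, blob, expr):
    requests.get(url)
    subprocess.run(cmd, shell=True)
    pickle.loads(blob)
    eval(expr)
    yaml.load(blob, Loader=yaml.SafeLoader)
    requests.get("https://example.com/health")
'''
print(scan_python(SAMPLE))
```

In the sample, the SafeLoader call and the literal-URL request are not reported, which is the precision-first behaviour the page describes. Hardcoded secrets are a text pattern rather than a call, so they would need a separate line-based check.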

Weights are per category present, not per hit, so one noisy file cannot inflate a score. Comment lines are skipped. Findings carry file, line, and snippet internally so a false positive is obvious and cheap to dismiss.
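
A small sketch of what those three properties could look like in code: comment lines skipped, each finding recorded with file, line, and snippet, and hits collapsed to the set of categories present. The two regular expressions are illustrative stand-ins (an AWS-style access key ID shape and a bare os.system call), and the Finding type and function names are hypothetical.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    category: str
    file: str
    line: int
    snippet: str


# Illustrative patterns only; the sweep's real rules are not published here.
PATTERNS = {
    "hardcoded_secret": re.compile(r"AKIA[0-9A-Z]{16}"),
    "command_injection": re.compile(r"\bos\.system\("),
}
COMMENT_PREFIXES = ("#", "//")


def scan_lines(file, text):
    """Return one Finding per matching non-comment line."""
    findings = []
    for number, raw in enumerate(text.splitlines(), start=1):
        stripped = raw.strip()
        if not stripped or stripped.startswith(COMMENT_PREFIXES):
            continue  # blank lines and comment lines are skipped
        for category, pattern in PATTERNS.items():
            if pattern.search(stripped):
                findings.append(Finding(category, file, number, stripped[:120]))
    return findings


def categories_present(findings):
    """Collapse any number of hits to the distinct categories they belong to."""
    return {f.category for f in findings}


# One noisy file: fifty hits, but still a single category.
NOISY = "\n".join(
    ["# os.system(cmd) in a comment is skipped"]
    + [f"os.system(cmd{i})" for i in range(50)]
)
hits = scan_lines("noisy.py", NOISY)
print(len(hits), categories_present(hits))  # 50 {'command_injection'}
```

Because scoring would read only the category set, fifty hits and one hit weigh the same, while the stored line and snippet let a reviewer dismiss a false positive quickly.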

Check your own server

The exact checks behind this scoreboard run as a GitHub Action. Drop it into your MCP server's CI and you get a security-surface score on every push, plus a README badge. Read-only, same hard rule: it never connects to or probes anything.

- uses: major-matters/mcp-surfacecheck@v1
mcp-surfacecheck on GitHub →

Embed the live number

Writing about MCP security? Embed the current sweep figure as a badge that updates with every weekly re-sweep, so your piece never carries a stale number.

[![MCP security sweep](https://img.shields.io/endpoint?url=https://majorlabs.co/data/badge-security.json)](https://majorlabs.co/security)
[![MCP identity gap](https://img.shields.io/endpoint?url=https://majorlabs.co/data/badge-identity.json)](https://majorlabs.co/identity)
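
For context on how these badges stay current: a Shields endpoint badge fetches a small JSON document from the given URL on render and draws whatever it says. The sketch below builds a document in that schema (schemaVersion, label, message, color). The label, message wording, and color are assumptions for illustration, as the actual contents of badge-security.json are not shown on this page.

```python
import json


def badge_payload(flagged_share_pct, label="MCP security sweep"):
    """Build a JSON-serializable dict in the Shields endpoint-badge schema.

    The message text and color are hypothetical, not the published badge.
    """
    return {
        "schemaVersion": 1,
        "label": label,
        "message": f"{flagged_share_pct:.1f}% with a risky pattern",
        "color": "orange",
    }


payload = badge_payload(34.9)
print(json.dumps(payload, indent=2))
```

Since the figure lives in the JSON document rather than in the embed snippet, regenerating that document after each sweep is enough to refresh every embedded badge.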

Why no names

Per-repo findings stay in the database and go to maintainers through coordinated disclosure, not into a public feed. Publishing a ranked list of risky servers would be a target list, and a static heuristic does not earn the right to put a name on it. What we publish is the population: how common each pattern is, how the tiers are distributed, and whether the trend is improving.

After the first disclosure cycle completes, one exception arrives: a fixed-since-last-sweep feed naming servers whose maintainers resolved findings. Good actors get named. Target lists do not get made.

Check our work