The identity gap, measured

A weekly static read of the source behind the most-used public MCP servers, asking two questions of each: does it take sensitive actions (shell, file writes, database writes, mail or money), and does it ship any visible identity layer at all? Population statistics, updated as the sweep reruns. A companion to the security surface and the population data.

The one hard rule

Static analysis of public source code only. We never connect to, run, install, probe, or authenticate against a running MCP server. "No identity layer" means no auth signal is visible in the source we read, never that a server is exploitable: auth can live in a gateway or reverse proxy in front of the server, and a local (stdio) server legitimately delegates identity to the host process. That is exactly why the headline is cut to network-facing servers, and why every number here is a signal, not a verdict.

1,875	servers swept (most-used, active)
247	network-facing AND take sensitive actions
75.7%	of those show no identity layer in source
2026-07-27	latest sweep

How the population splits

Every swept server lands in one of four buckets, determined by the answers to the two questions. The bucket that matters is the first one.

477	Sensitive, no identity layer	Takes consequential actions; no auth signal anywhere in source
121	Sensitive, with identity layer	Takes consequential actions; ships a visible auth signal
169	Identity layer only	Auth signal present; no sensitive-action pattern found
1,108	No sensitive actions found	Neither pattern set fired
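The four-way split is the cross product of two yes/no answers. A minimal sketch, assuming each server reduces to two booleans; the function name and label strings are illustrative, not the scanner's own. The counts are the ones published on this page.

```python
def bucket(takes_sensitive_action: bool, has_identity_signal: bool) -> str:
    """Map the two yes/no answers to one of the four published buckets."""
    if takes_sensitive_action and not has_identity_signal:
        return "sensitive, no identity layer"
    if takes_sensitive_action and has_identity_signal:
        return "sensitive, with identity layer"
    if has_identity_signal:
        return "identity layer only"
    return "no sensitive actions found"


# Bucket sizes as published on this page.
PUBLISHED_COUNTS = {
    "sensitive, no identity layer": 477,
    "sensitive, with identity layer": 121,
    "identity layer only": 169,
    "no sensitive actions found": 1108,
}

# The four buckets are exhaustive and disjoint, so they must sum to the sweep size.
total_swept = sum(PUBLISHED_COUNTS.values())
print(total_swept)  # 1875
```

Because the buckets partition the population, their sum matching the 1,875 servers swept is a quick consistency check on any rerun.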

What counts as a sensitive action

shell exec	subprocess.run / os.system, child_process exec/spawn — the tool can run commands on the host
fs write	file writes and deletes — the tool can change what is on disk
db write	INSERT/UPDATE/DELETE/DROP through a query interface — the tool can mutate stored data
mail / money	smtplib, nodemailer, SendGrid, Stripe, Twilio — the tool can send messages or move money

Build and release tooling (scripts/, tools/) is excluded from the sensitive count: a subprocess call in a release script is not agent-reachable. The exclusion only ever makes the headline smaller.
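A rough sketch of how a check like this could work, assuming plain regular expressions over file text. The patterns, the `SENSITIVE_PATTERNS` and `EXCLUDED_DIRS` names, and the choice to exclude `scripts/` and `tools/` at any depth are all assumptions made for illustration; the real scanner's rules may differ.

```python
import re
from pathlib import PurePosixPath

# Hypothetical, deliberately crude patterns, one per category in the table above.
SENSITIVE_PATTERNS = {
    "shell exec": re.compile(r"subprocess\.run|os\.system|child_process"),
    "fs write": re.compile(r"fs\.writeFile|os\.remove|shutil\.rmtree"),
    "db write": re.compile(
        r"\b(INSERT\s+INTO|UPDATE\s+\w+\s+SET|DELETE\s+FROM|DROP\s+TABLE)\b",
        re.IGNORECASE,
    ),
    "mail / money": re.compile(r"smtplib|nodemailer|sendgrid|stripe|twilio", re.IGNORECASE),
}

# Build and release tooling is not agent-reachable, so it is skipped.
EXCLUDED_DIRS = {"scripts", "tools"}


def is_excluded(path: str) -> bool:
    """True if any directory component of the path is an excluded tooling dir."""
    return any(part in EXCLUDED_DIRS for part in PurePosixPath(path).parts[:-1])


def sensitive_categories(files: dict[str, str]) -> set[str]:
    """Return the sensitive-action categories whose pattern fires in any kept file."""
    found = set()
    for path, text in files.items():
        if is_excluded(path):
            continue
        for name, pattern in SENSITIVE_PATTERNS.items():
            if pattern.search(text):
                found.add(name)
    return found


repo = {
    "scripts/release.py": "subprocess.run(['git', 'tag', 'v1'])",
    "src/server.py": "import smtplib",
}
print(sensitive_categories(repo))  # {'mail / money'}
```

The release script's `subprocess.run` call is ignored, so this repo counts as sensitive only through its mail import. Skipping files can only remove matches, which is why the exclusion can shrink the headline but never grow it.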

What counts as an identity signal

We count ANY of these, anywhere in the repo, as an identity layer. The bar is deliberately generous: the question is not whether the auth is good, but whether there is any at all.

authz header	the code inspects the Authorization header at all
token verify	JWT verification or timing-safe secret comparison
oauth wiring	OAuth/OIDC endpoints, client-credentials flow, protected-resource metadata
mcp sdk auth	the MCP SDK's own auth helpers (requireBearerAuth, token verifiers, WWW-Authenticate)
api key check	x-api-key handling or explicit API-key validation
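The "any signal, anywhere" rule can be sketched the same way. The regular expressions and the `IDENTITY_PATTERNS` name below are assumptions for illustration, not the scanner's actual checks; they only show how a single match in any file is enough to count as an identity layer.

```python
import re

# Hypothetical patterns, one per signal in the table above.
IDENTITY_PATTERNS = {
    "authz header": re.compile(r"headers\W{1,8}(get\W{1,3})?authorization", re.IGNORECASE),
    "token verify": re.compile(r"jwt\.verify|jwt\.decode|timingSafeEqual|hmac\.compare_digest"),
    "oauth wiring": re.compile(r"oauth|oidc|client_credentials", re.IGNORECASE),
    "mcp sdk auth": re.compile(r"requireBearerAuth|WWW-Authenticate"),
    "api key check": re.compile(r"x-api-key", re.IGNORECASE),
}


def identity_signals(files: dict[str, str]) -> set[str]:
    """Return every identity signal that fires in any file of the repo."""
    found = set()
    for text in files.values():
        for name, pattern in IDENTITY_PATTERNS.items():
            if pattern.search(text):
                found.add(name)
    return found


def has_identity_layer(files: dict[str, str]) -> bool:
    """Generous by design: one signal anywhere is enough."""
    return bool(identity_signals(files))


repo = {"src/index.ts": 'const key = req.headers["x-api-key"];'}
print(identity_signals(repo))    # {'api key check'}
print(has_identity_layer(repo))  # True
```

A matcher this loose will overcount identity layers, for example on a comment that merely mentions OAuth. That bias runs in the servers' favor, consistent with the generous bar described above.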

Does adoption bring identity?

Share of sensitive-action servers with no identity layer, by GitHub stars. The security sweep found popularity does not buy code review; this asks whether it buys an auth layer.

GitHub stars	sensitive servers	no identity layer
<50 stars	181	74.6%
50-499 stars	297	83.5%
500+ stars	120	78.3%
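The star bands can be reconciled with the bucket counts by simple arithmetic. The sketch below back-computes per-band counts from the rounded percentages, so each is an approximation; the variable names are illustrative.

```python
# (sensitive servers, percent with no identity layer) per star band, as published.
BANDS = {
    "<50 stars": (181, 74.6),
    "50-499 stars": (297, 83.5),
    "500+ stars": (120, 78.3),
}

# The bands should cover every sensitive-action server: 477 + 121 = 598.
total_sensitive = sum(n for n, _ in BANDS.values())

# Recover approximate per-band counts from the rounded percentages.
no_identity = sum(round(n * pct / 100) for n, pct in BANDS.values())

overall_share = round(100 * no_identity / total_sensitive, 1)

print(total_sensitive)  # 598
print(no_identity)      # 477
print(overall_share)    # 79.8
```

Both totals match the bucket counts on this page. The overall 79.8% covers every sensitive-action server, which is a different denominator from the 75.7% headline, restricted to the 247 network-facing ones.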

The series

Share of network-facing, sensitive-action servers with no visible identity layer, captured per sweep. The question the series answers: is the identity layer arriving as the ecosystem professionalizes, or staying absent as it grows?

The response

This page is the measuring instrument; the response is a primitive. IdentityKit is an open-source, verifiable identity layer for agents and the servers they call. It answers one question, provably: who is calling. It is available in Python and TypeScript. It is v0 and experimental, and labeled as such.

identitykit on GitHub →

Why no names

Per-repo results stay in the database, same posture as the security sweep. Publishing a list of unauthenticated servers that take sensitive actions would be a target list, and a static heuristic does not earn the right to put a name on it. What we publish is the population: how big the gap is, where it concentrates, and whether it is closing.

Check our work

Scanner source and methodology — the identity checks are readable Python, same as the security sweep.
The security surface — the companion sweep: risky patterns in the same population.
The population dataset — what we swept, how it was discovered, and the weekly series.