Merged
35 commits
65b34e5
feat: add tier-check CLI for SDK tier assessment
felixweinberger Feb 10, 2026
c4f1c59
refactor: revise tier-check CLI and skill based on review feedback
felixweinberger Feb 10, 2026
f0849b9
fix: address review feedback on skill and README
felixweinberger Feb 10, 2026
cdc7402
fix: resolve lint errors in conformance.ts
felixweinberger Feb 10, 2026
5e62b5f
style: apply prettier formatting
felixweinberger Feb 10, 2026
9a1018c
docs: add npm run tier-check script, update docs with examples
felixweinberger Feb 10, 2026
0a91672
style: fix prettier formatting in SKILL.md
felixweinberger Feb 10, 2026
12c50a7
fix: add per-scenario timeout, support url-only conformance, fix stdo…
felixweinberger Feb 10, 2026
1bf90c8
refactor: skill takes local path + server URL instead of repo name
felixweinberger Feb 10, 2026
394b6f4
refactor: write detailed reports to files, show concise summary
felixweinberger Feb 10, 2026
b6672f0
docs: update READMEs for new skill interface and pre-start workflow
felixweinberger Feb 10, 2026
c035764
simplify: flat file output instead of nested directory
felixweinberger Feb 10, 2026
efae415
fix: remediation always shows path to Tier 2 and Tier 1
felixweinberger Feb 10, 2026
0e1099d
feat: add client conformance testing to tier-check CLI and skill
felixweinberger Feb 11, 2026
e198a24
improve: table summary output, write reports via subagents
felixweinberger Feb 11, 2026
5307492
improve: list tier gaps as numbered items instead of one-line blob
felixweinberger Feb 11, 2026
ffa04c0
improve: finalize summary format with separator, high-priority fixes,…
felixweinberger Feb 11, 2026
6e662a7
improve: add pre-flight checks for gh auth and server reachability
felixweinberger Feb 11, 2026
b9dffd1
docs: improve README and fix skill auto-detection paths
felixweinberger Feb 11, 2026
12eb095
simplify: remove client-cmd auto-detection, require explicit argument
felixweinberger Feb 11, 2026
655734f
fix: align docs table with canonical list (48 features), simplify pol…
felixweinberger Feb 11, 2026
6f3ec84
refactor: extract canonical feature list into single source of truth
felixweinberger Feb 11, 2026
5b4f5e2
fix: separate deterministic file checks from AI content evaluation
felixweinberger Feb 11, 2026
f5beda8
style: apply prettier formatting
felixweinberger Feb 11, 2026
de6cfd8
Merge branch 'main' into fweinberger/tier-check-cli
felixweinberger Feb 11, 2026
d55be40
revert: undo unrelated console.log change in runner/server.ts
felixweinberger Feb 11, 2026
771adf5
refactor: shell out to conformance CLI instead of reimplementing runner
felixweinberger Feb 12, 2026
4606e0d
docs: add Go and C# SDK examples to README and SKILL.md
felixweinberger Feb 12, 2026
4ac0834
fix: add --framework net9.0 to C# server command
felixweinberger Feb 12, 2026
d78d689
rename: conformance.ts -> test-conformance-results.ts
felixweinberger Feb 12, 2026
b708334
style: prettier formatting
felixweinberger Feb 12, 2026
ba1591f
fix: reconcile conformance results against full scenario list
felixweinberger Feb 12, 2026
1e1cfa4
docs: tighten documentation evaluation criteria
felixweinberger Feb 12, 2026
ffde2d9
docs: add Labels and Spec Tracking rows to audit report templates
felixweinberger Feb 12, 2026
abed710
fix: reuse ConformanceCheck type from src/types.ts instead of redefining
felixweinberger Feb 12, 2026
258 changes: 258 additions & 0 deletions .claude/skills/mcp-sdk-tier-audit/README.md
@@ -0,0 +1,258 @@
# MCP SDK Tier Audit

Assess any MCP SDK repository against [SEP-1730](https://github.com/modelcontextprotocol/modelcontextprotocol/issues/1730) (the SDK Tiering System). Produces a tier classification (1/2/3) with an evidence-backed scorecard.

Two components work together:

- **`tier-check` CLI** — runs deterministic checks (server + client conformance pass rate, issue triage speed, P0 resolution, labels, releases, policy signals). Works standalone, no AI needed.
- **AI-assisted assessment** — an agent uses the CLI scorecard plus judgment-based evaluation (documentation coverage, dependency policy, roadmap) to produce a full tier report with remediation guide.

## Quick Start: CLI

The CLI is a subcommand of the [MCP Conformance](https://github.com/modelcontextprotocol/conformance) tool.

```bash
# Clone and build
git clone https://github.com/modelcontextprotocol/conformance.git
cd conformance
npm install
npm run build

# Authenticate with GitHub (needed for API access)
gh auth login

# Run against any MCP SDK repo (without conformance tests)
npm run --silent tier-check -- --repo modelcontextprotocol/typescript-sdk --skip-conformance
```

The CLI uses the GitHub API (read-only) for issue metrics, labels, and release checks. Authenticate via one of:

- **GitHub CLI** (recommended): `gh auth login` — the CLI picks up your token automatically
- **Environment variable**: `export GITHUB_TOKEN=ghp_...`
- **Flag**: `--token ghp_...`

For public repos, any authenticated token works (no special scopes needed — authentication just avoids rate limits). For a [fine-grained personal access token](https://github.com/settings/personal-access-tokens/new), select **Public Repositories (read-only)** with no additional permissions.
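The lookup order for the token can be pictured with a small sketch. This is not the CLI's actual code: `resolveToken`, `TokenSources`, and the injected `ghAuthToken` callback are hypothetical names, and the precedence of `GITHUB_TOKEN` over the GitHub CLI is an assumption based on the order in which the `--token` help text lists the fallbacks.

```typescript
// Hypothetical helper, for illustration only (not taken from the tier-check source).
type TokenSources = {
  flag?: string; // value passed via --token, if any
  env?: string; // value of GITHUB_TOKEN, if set
  ghAuthToken?: () => string | undefined; // e.g. a wrapper around `gh auth token`
};

function resolveToken(sources: TokenSources): string | undefined {
  // Explicit sources first: the flag, then the environment variable (assumed order).
  for (const candidate of [sources.flag, sources.env]) {
    if (candidate !== undefined && candidate.trim() !== "") {
      return candidate.trim();
    }
  }
  // Fall back to the GitHub CLI only when nothing explicit was provided.
  const fromGh = sources.ghAuthToken?.();
  return fromGh !== undefined && fromGh.trim() !== "" ? fromGh.trim() : undefined;
}
```

Injecting the GitHub CLI lookup as a callback keeps the sketch free of process spawning, so the precedence logic can be tested in isolation.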

### CLI Options

```
--repo <owner/repo>              GitHub repository (required)
--branch <branch>                Branch to check
--skip-conformance               Skip conformance tests
--conformance-server-url <url>   URL of the already-running conformance server
--client-cmd <cmd>               Command to run the SDK conformance client (for client conformance tests)
--days <n>                       Limit triage analysis to last N days
--output <format>                json | markdown | terminal (default: terminal)
--token <token>                  GitHub token (defaults to GITHUB_TOKEN or gh auth token)
```

### What the CLI Checks

| Check | What it measures |
| ------------------ | ------------------------------------------------------------------------------ |
| Server Conformance | Pass rate of server implementation against the conformance test suite |
| Client Conformance | Pass rate of client implementation against the conformance test suite |
| Labels | Whether SEP-1730 label taxonomy is set up (supports GitHub native issue types) |
| Triage | How quickly issues get labeled after creation |
| P0 Resolution | Whether critical bugs are resolved within SLA |
| Stable Release | Whether a stable release >= 1.0.0 exists |
| Policy Signals | Presence of CHANGELOG, SECURITY, CONTRIBUTING, dependabot, ROADMAP |
| Spec Tracking | Gap between latest spec release and SDK release |
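As an illustration of the Stable Release row, the sketch below shows one plausible way to decide whether a release tag counts as stable and at least 1.0.0. It is an assumption about the rule, not the CLI's implementation: `isStableRelease` and `latestStable` are hypothetical names, and the sketch deliberately rejects any tag with a pre-release or build suffix.

```typescript
// Illustrative only: the real check may parse versions differently.
function isStableRelease(tag: string): boolean {
  // Optional leading "v", then MAJOR.MINOR.PATCH and nothing else.
  // Suffixes such as "-beta.1" or "+build.5" make the match fail.
  const match = /^v?(\d+)\.(\d+)\.(\d+)$/.exec(tag.trim());
  if (match === null) return false;
  return Number(match[1]) >= 1; // 0.x.y releases are not considered stable
}

// Returns the first stable tag, assuming `tags` is ordered newest first.
function latestStable(tags: string[]): string | undefined {
  return tags.find(isStableRelease);
}
```

Under these assumptions `2.3.1` and `v1.0.0` qualify, while `0.9.4` and `1.0.0-beta.1` do not.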

### Example Output

```
Tier Assessment: Tier 2

Repo:      modelcontextprotocol/typescript-sdk
Timestamp: 2026-02-10T12:00:00Z

Check Results:

  ✓ Server Conformance  45/45 (100%)
  ✓ Client Conformance  4/4 (100%)
  ✗ Labels              9/12 required labels
      Missing: needs confirmation, needs repro, ready for work
  ✓ Triage              92% within 2BD (150 issues, median 8h)
  ✓ P0 Resolution       0 open, 3/3 closed within 7d
  ✓ Stable Release      2.3.1
  ~ Policy Signals      ✓ CHANGELOG.md, ✗ SECURITY.md, ✓ CONTRIBUTING.md, ✓ .github/dependabot.yml, ✗ ROADMAP.md
  ✓ Spec Tracking       2d gap
```

Use `--output json` to get machine-readable results, or `--output markdown` for a report you can paste into an issue.

## Full AI-Assisted Assessment

The CLI produces a deterministic scorecard, but some SEP-1730 requirements need judgment: documentation quality, dependency policy, roadmap substance. An AI agent can evaluate these by reading the repo.

### Claude Code

The skill lives in `.claude/skills/` in this repo, so if you open [Claude Code](https://docs.anthropic.com/en/docs/claude-code) in the conformance repo it's already available.

1. Run `gh auth login` if you have not already (the skill checks authentication up front)
2. Start the SDK's everything server in a separate terminal
3. Run the skill:

```
/mcp-sdk-tier-audit <local-sdk-path> <conformance-server-url> [client-cmd]
```

Pass the client command as the third argument to include client conformance testing. If omitted, client conformance is skipped and noted as a gap in the report.

**TypeScript SDK example:**

```bash
# Terminal 1: start the everything server (build first: npm run build)
cd ~/src/mcp/typescript-sdk && npm run test:conformance:server:run

# Terminal 2: run the audit (from the conformance repo)
/mcp-sdk-tier-audit ~/src/mcp/typescript-sdk http://localhost:3000/mcp "npx tsx ~/src/mcp/typescript-sdk/test/conformance/src/everythingClient.ts"
```

**Python SDK example:**

```bash
# Terminal 1: install and start the everything server
cd ~/src/mcp/python-sdk && uv sync --frozen --all-extras --package mcp-everything-server
uv run mcp-everything-server --port 3001

# Terminal 2: run the audit (from the conformance repo)
/mcp-sdk-tier-audit ~/src/mcp/python-sdk http://localhost:3001/mcp "uv run python ~/src/mcp/python-sdk/.github/actions/conformance/client.py"
```

**Go SDK example:**

```bash
# Terminal 1: build and start the everything server
cd ~/src/mcp/go-sdk && go build -o /tmp/go-conformance-server ./conformance/everything-server
go build -o /tmp/go-conformance-client ./conformance/everything-client
/tmp/go-conformance-server -http="localhost:3002"

# Terminal 2: run the audit (from the conformance repo)
/mcp-sdk-tier-audit ~/src/mcp/go-sdk http://localhost:3002 "/tmp/go-conformance-client"
```

**C# SDK example:**

```bash
# Terminal 1: start the everything server (requires .NET SDK)
cd ~/src/mcp/csharp-sdk
dotnet run --project tests/ModelContextProtocol.ConformanceServer --framework net9.0 -- --urls http://localhost:3003

# Terminal 2: run the audit (from the conformance repo)
/mcp-sdk-tier-audit ~/src/mcp/csharp-sdk http://localhost:3003 "dotnet run --project ~/src/mcp/csharp-sdk/tests/ModelContextProtocol.ConformanceClient"
```

The skill derives `owner/repo` from the SDK checkout's git remote, runs the CLI, launches parallel evaluations for docs and policy, and writes detailed reports to `results/`.
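Deriving `owner/repo` from a remote URL can be sketched as follows. This is a hedged illustration rather than the skill's actual logic: `parseOwnerRepo` is a hypothetical name, and the sketch assumes a GitHub remote in either HTTPS or SSH form, such as the value printed by `git remote get-url origin`.

```typescript
// Hypothetical helper: maps a GitHub remote URL to "owner/repo".
function parseOwnerRepo(remoteUrl: string): string | undefined {
  // Handles https://github.com/owner/repo(.git) and git@github.com:owner/repo(.git),
  // with an optional trailing slash.
  const match = /github\.com[:/]([^/\s]+)\/([^/\s]+?)(?:\.git)?\/?$/.exec(remoteUrl.trim());
  return match ? `${match[1]}/${match[2]}` : undefined;
}

// Example inputs in both common forms:
const fromHttps = parseOwnerRepo("https://github.com/modelcontextprotocol/typescript-sdk.git");
const fromSsh = parseOwnerRepo("git@github.com:modelcontextprotocol/go-sdk");
console.log(fromHttps, fromSsh);
```

Returning `undefined` for non-GitHub remotes lets a caller fall back to asking for the repository name explicitly.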

### Any Other AI Coding Agent

If you use a different agent (Codex, Cursor, Aider, OpenCode, etc.), give it these instructions:

1. **Run the CLI** to get the deterministic scorecard:

```bash
node dist/index.js tier-check --repo <owner/repo> --conformance-server-url <url> --output json
```

2. **Evaluate documentation coverage** — check whether MCP features (tools, resources, prompts, sampling, transports, etc.) are documented with examples. See [`references/docs-coverage-prompt.md`](references/docs-coverage-prompt.md) for the full checklist.

3. **Evaluate policies** — check for dependency update policy, roadmap, and versioning/breaking-change policy. See [`references/policy-evaluation-prompt.md`](references/policy-evaluation-prompt.md) for criteria.

4. **Apply tier logic** — combine scorecard + evaluations against the thresholds in [`references/tier-requirements.md`](references/tier-requirements.md).

5. **Generate report** — use [`references/report-template.md`](references/report-template.md) for the output format.

### Manual Review

Run the CLI for the scorecard, then review docs and policies yourself using the tier requirements as a checklist:

| Requirement | Tier 1 | Tier 2 |
| ------------------ | ------------------------------ | ------------------------ |
| Server Conformance | 100% pass | >= 80% pass |
| Client Conformance | 100% pass | >= 80% pass |
| Issue triage | Within 2 business days | Within 1 month |
| P0 resolution | Within 7 days | Within 2 weeks |
| Stable release | >= 1.0.0 with clear versioning | At least one >= 1.0.0 |
| Documentation | All features with examples | Core features documented |
| Dependency policy | Published | Published |
| Roadmap | Published with spec tracking | Plan toward Tier 1 |

## Running Conformance Tests

To include conformance test results, start the SDK's everything server first, then pass the URL to the CLI. To also run client conformance tests, pass `--client-cmd` with the command to launch the SDK's conformance client.

**TypeScript SDK**:

```bash
# Terminal 1: start the server (SDK must be built first)
cd ~/src/mcp/typescript-sdk && npm run build
npm run test:conformance:server:run # starts on port 3000

# Terminal 2: run tier-check (server + client conformance)
npm run --silent tier-check -- \
  --repo modelcontextprotocol/typescript-sdk \
  --conformance-server-url http://localhost:3000/mcp \
  --client-cmd 'npx tsx ~/src/mcp/typescript-sdk/test/conformance/src/everythingClient.ts'
```

**Python SDK**:

```bash
# Terminal 1: install and start the server
cd ~/src/mcp/python-sdk
uv sync --frozen --all-extras --package mcp-everything-server
uv run mcp-everything-server --port 3001 # specify port to avoid conflicts

# Terminal 2: run tier-check (server + client conformance)
npm run --silent tier-check -- \
  --repo modelcontextprotocol/python-sdk \
  --conformance-server-url http://localhost:3001/mcp \
  --client-cmd 'uv run python ~/src/mcp/python-sdk/.github/actions/conformance/client.py'
```

**Go SDK**:

```bash
# Terminal 1: build and start the server
cd ~/src/mcp/go-sdk
go build -o /tmp/go-conformance-server ./conformance/everything-server
go build -o /tmp/go-conformance-client ./conformance/everything-client
/tmp/go-conformance-server -http="localhost:3002"

# Terminal 2: run tier-check (server + client conformance)
npm run --silent tier-check -- \
  --repo modelcontextprotocol/go-sdk \
  --conformance-server-url http://localhost:3002 \
  --client-cmd '/tmp/go-conformance-client'
```

**C# SDK**:

```bash
# Terminal 1: start the server (requires .NET SDK)
cd ~/src/mcp/csharp-sdk
dotnet run --project tests/ModelContextProtocol.ConformanceServer --framework net9.0 -- --urls http://localhost:3003

# Terminal 2: run tier-check (server + client conformance)
npm run --silent tier-check -- \
  --repo modelcontextprotocol/csharp-sdk \
  --conformance-server-url http://localhost:3003 \
  --client-cmd 'dotnet run --project ~/src/mcp/csharp-sdk/tests/ModelContextProtocol.ConformanceClient'
```

**Other SDKs:** Your SDK needs an "everything server" — an HTTP server implementing the [Streamable HTTP transport](https://modelcontextprotocol.io/specification/draft/basic/transports.md) with all MCP features (tools, resources, prompts, etc.). See the implementations above as reference.

Start your everything server, then pass `--conformance-server-url`. Pass `--client-cmd` if your SDK has a conformance client. If neither exists yet, use `--skip-conformance` — the scorecard will note this as a gap.
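The choice of flags described above can be summarized in a short sketch. It uses only the flags documented in this README; `buildTierCheckArgs` and `ConformanceSetup` are hypothetical names for a wrapper script, and whether the CLI accepts `--client-cmd` without a server URL is not stated here, so treat that branch as an assumption.

```typescript
// Hypothetical wrapper input: what the SDK under test can offer.
interface ConformanceSetup {
  repo: string; // owner/repo
  serverUrl?: string; // URL of a running everything server, if one exists
  clientCmd?: string; // command that launches the conformance client, if one exists
}

function buildTierCheckArgs(setup: ConformanceSetup): string[] {
  const args = ["tier-check", "--repo", setup.repo];
  if (setup.serverUrl === undefined && setup.clientCmd === undefined) {
    // Neither a server nor a client exists yet: skip conformance entirely.
    args.push("--skip-conformance");
    return args;
  }
  if (setup.serverUrl !== undefined) {
    args.push("--conformance-server-url", setup.serverUrl);
  }
  if (setup.clientCmd !== undefined) {
    args.push("--client-cmd", setup.clientCmd);
  }
  return args;
}

// A server-only setup, as for an SDK without a conformance client.
console.log(buildTierCheckArgs({ repo: "owner/repo", serverUrl: "http://localhost:3000/mcp" }).join(" "));
```

Building the argument list as an array avoids shell-quoting problems when the client command itself contains spaces.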

## Reference Files

These files in [`references/`](references/) contain the detailed criteria and prompts:

| File | Purpose |
| ----------------------------- | ------------------------------------------------------- |
| `tier-requirements.md` | Full SEP-1730 requirements with exact thresholds |
| `docs-coverage-prompt.md` | Feature checklist for documentation evaluation |
| `policy-evaluation-prompt.md` | Criteria for dependency, roadmap, and versioning policy |
| `report-template.md` | Output format for the full audit report |