Analysis — sdl-mcp
ANALYSIS: sdl-mcp
Summary
SDL-MCP is a TypeScript/Node.js MCP server that indexes a codebase into LadybugDB (built on Kuzu, an open-source embedded property graph database) and exposes 38 tool surfaces to coding agents. Its two structurally interesting ideas are the Symbol Card (a compact ~100-token metadata record per symbol, replacing direct file reads) and the Iris Gate Ladder (a five-rung escalation policy with partial protocol enforcement: the top rung is actively gated in source, and workflow mode validates all rung transitions). A Tool Gateway consolidates 27 legacy flat action tools into 4 namespace-scoped gateway schemas for tools/list.
What it does (verified from source)
Symbol Cards
Every indexed symbol (function, class, interface, type, variable) is stored as a Kuzu Symbol node containing (from src/db/ladybug-schema.ts): kind, exported, file location (rangeStartLine/rangeEndLine), signatureJson, summary, summaryQuality (double), summarySource (tracks whether the summary was LLM-generated or auto-derived), invariantsJson, sideEffectsJson, roleTagsJson, fan-in/fan-out/churn metrics, and inline embedding columns for hybrid retrieval.
The summarySource field confirms that summary provenance is tracked at the node level. summaryQuality (a double 0.0–1.0) is also stored, giving the system a handle for filtering low-quality summaries. The model used to generate summaries and the per-symbol cost are not exposed through any tool or configuration option.
ETag-based conditional re-fetch is implemented as a CardHash node in the schema — a stored hash of the card content used to detect staleness and skip unchanged symbols in repeated slice builds (knownCardEtags in slice.build args).
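The ETag skip described above can be sketched as follows. This is a minimal illustration, not the project's code: the SymbolCard interface covers only a subset of the schema columns, symbolId and the helper names cardEtag and selectChangedCards are hypothetical, and a djb2 string hash stands in for whatever hash CardHash actually stores.

```typescript
// Subset of the Symbol columns named above. symbolId is a hypothetical key.
interface SymbolCard {
  symbolId: string;
  kind: string;
  exported: boolean;
  rangeStartLine: number;
  rangeEndLine: number;
  signatureJson: string;
  summary: string;
  summaryQuality: number; // 0.0 to 1.0
  summarySource: string; // provenance label; concrete values are an assumption
}

// djb2 string hash, used here only as a stand-in for the real card hash.
function cardEtag(card: SymbolCard): string {
  const text = JSON.stringify(card);
  let hash = 5381;
  for (let i = 0; i < text.length; i++) {
    hash = ((hash << 5) + hash + text.charCodeAt(i)) | 0;
  }
  return (hash >>> 0).toString(16);
}

// Keep only the cards whose ETag differs from the one the caller already holds.
function selectChangedCards(
  cards: SymbolCard[],
  knownCardEtags: { [symbolId: string]: string }
): SymbolCard[] {
  return cards.filter((card) => knownCardEtags[card.symbolId] !== cardEtag(card));
}
```

A caller that passes back the ETags from a previous slice would receive only new or modified cards, which is where the token saving would come from.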
Iris Gate Ladder
The ladder is defined in two source files:
src/code/gate.ts — enforces the top rung (code.needWindow):
Implements evaluateRequest() which actively evaluates raw-source requests. Checks include:
- Policy limit enforcement: request.expectedLines > policy.maxWindowLines, request.maxTokens > policy.maxWindowTokens, policy.requireIdentifiers
- Identifier matching: if identifiersToFind is provided, the window must contain them
- Utility scoring: symbols above UTILITY_SCORE_THRESHOLD (0.3) are auto-approved; below threshold, requests are denied with nextBestAction guidance pointing to getSkeleton or getHotPath
Denied requests receive a structured DenialGuidance response that includes a concrete alternative tool call (e.g. sdl.code.getSkeleton) rather than a generic refusal — this steers the agent toward the correct rung automatically.
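The three checks and the structured denial can be sketched like this. The field names (expectedLines, maxTokens, identifiersToFind, maxWindowLines, maxWindowTokens, requireIdentifiers, nextBestAction) and the 0.3 threshold come from the description above; the function name evaluateRequestSketch, the check order, the way the utility score is supplied, the reading of requireIdentifiers, and the choice of which alternative each denial points to are all assumptions made for illustration.

```typescript
interface WindowRequest {
  expectedLines: number;
  maxTokens: number;
  identifiersToFind?: string[];
}

interface GatePolicy {
  maxWindowLines: number;
  maxWindowTokens: number;
  requireIdentifiers: boolean;
}

interface DenialGuidance {
  reason: string;
  nextBestAction: "getSkeleton" | "getHotPath";
}

interface GateDecision {
  approved: boolean;
  guidance?: DenialGuidance; // present only on denial
}

const UTILITY_SCORE_THRESHOLD = 0.3;

function deny(reason: string, nextBestAction: "getSkeleton" | "getHotPath"): GateDecision {
  return { approved: false, guidance: { reason: reason, nextBestAction: nextBestAction } };
}

// Hypothetical evaluation order: policy limits, identifier checks, utility score.
function evaluateRequestSketch(
  request: WindowRequest,
  policy: GatePolicy,
  windowText: string,
  utilityScore: number
): GateDecision {
  if (request.expectedLines > policy.maxWindowLines) {
    return deny("expectedLines exceeds maxWindowLines", "getSkeleton");
  }
  if (request.maxTokens > policy.maxWindowTokens) {
    return deny("maxTokens exceeds maxWindowTokens", "getSkeleton");
  }
  const ids = request.identifiersToFind || [];
  // Assumption: requireIdentifiers means a request without identifiers is refused.
  if (policy.requireIdentifiers && ids.length === 0) {
    return deny("policy requires identifiersToFind", "getHotPath");
  }
  for (let i = 0; i < ids.length; i++) {
    if (windowText.indexOf(ids[i]) === -1) {
      return deny("identifier not found in window: " + ids[i], "getHotPath");
    }
  }
  // Assumption: a score exactly at the threshold is approved.
  if (utilityScore < UTILITY_SCORE_THRESHOLD) {
    return deny("utility score below threshold", "getSkeleton");
  }
  return { approved: true };
}
```

Returning the alternative action as data rather than prose is what lets an agent retry on a cheaper rung without parsing an error message.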
src/code-mode/ladder-validator.ts — validates workflow sequences:
Defines the rung order and validates sdl.workflow step sequences:
| Rung | Actions | Notes |
|---|---|---|
| 0 | symbol.search | Entry — find symbols by name/query |
| 1 | symbol.getCard, slice.build | Card or task-scoped subgraph |
| 2 | code.getSkeleton | Control flow, no full bodies |
| 3 | code.getHotPath | Exact lines for named identifiers |
| 4 | code.needWindow | Full source window — gated by gate.ts |
Warns (or blocks in enforce mode) when a workflow step skips more than one rung for the same symbol. Enforcement applies only inside sdl.workflow — standalone tool calls are not ladder-validated.
Enforcement summary: the top rung (rung 4, code.needWindow) is actively gated server-side regardless of mode. Full ladder validation applies inside sdl.workflow steps. Outside of workflow mode, agents calling lower rungs (0–3) in arbitrary order are not blocked.
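A rung-skip check of the kind described for workflow mode might look like the sketch below. The rung numbers and action names are taken from the table; everything else is an assumption, including the function names, the per-symbol tracking of the highest rung reached, the treatment of an unseen symbol as starting below rung 0, and the reading of "skips more than one rung" as more than one rung lying strictly between the previous rung and the requested one.

```typescript
// Rung order from the table above.
const RUNG_OF: { [action: string]: number } = {
  "symbol.search": 0,
  "symbol.getCard": 1,
  "slice.build": 1,
  "code.getSkeleton": 2,
  "code.getHotPath": 3,
  "code.needWindow": 4,
};

interface WorkflowStep {
  action: string;
  symbolId: string;
}

interface LadderViolation {
  stepIndex: number;
  action: string;
  skippedRungs: number;
}

// Hypothetical validator: flags steps that jump over more than maxSkipped rungs.
function validateLadderSketch(steps: WorkflowStep[], maxSkipped: number = 1): LadderViolation[] {
  const highest: { [symbolId: string]: number } = Object.create(null);
  const violations: LadderViolation[] = [];
  for (let i = 0; i < steps.length; i++) {
    const step = steps[i];
    if (!Object.prototype.hasOwnProperty.call(RUNG_OF, step.action)) continue; // not a ladder action
    const rung = RUNG_OF[step.action];
    const prev = step.symbolId in highest ? highest[step.symbolId] : -1;
    const skipped = rung - prev - 1;
    if (skipped > maxSkipped) {
      violations.push({ stepIndex: i, action: step.action, skippedRungs: skipped });
    }
    if (rung > prev) highest[step.symbolId] = rung;
  }
  return violations;
}

// Warn mode reports violations; enforce mode blocks the workflow.
function shouldBlock(violations: LadderViolation[], mode: "warn" | "enforce"): boolean {
  return mode === "enforce" && violations.length > 0;
}
```

Under these assumptions, a workflow that goes from symbol.getCard straight to code.needWindow for the same symbol is flagged, while card, then skeleton, then window passes.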
Tool Gateway
src/gateway/index.ts confirms 4 namespace-scoped gateway tools (sdl.query, sdl.code, sdl.repo, sdl.agent) with dual-schema registration: a full Zod schema for server-side validation and a thin wire schema, built by thin-schemas.ts, for tools/list. Legacy flat tool names are registered from src/gateway/legacy.ts (source contains 27 registerTool calls; the README figure of “32” is inaccurate).
The thin wire schema is explicitly designed to reduce tools/list payload size while keeping full validation server-side. The 81% reduction figure (as reported) refers to this registration payload, not session-level token consumption.
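The dual-schema idea can be illustrated without any dependencies. In this sketch, hand-written validator functions stand in for the full Zod schemas, and buildThinSchema, validateCall, the ThinSchema shape, and the example action names under sdl.code are hypothetical; only the split between a small advertised schema and full server-side validation reflects the source review.

```typescript
// A validator returns a list of error messages; an empty list means valid.
type Validator = (args: { [key: string]: unknown }) => string[];

interface GatewayTool {
  name: string;
  actions: { [action: string]: Validator };
}

interface ThinSchema {
  type: "object";
  properties: {
    action: { type: "string"; enum: string[] };
    args: { type: "object" };
  };
  required: string[];
}

// What tools/list would advertise: the action names and an opaque args object.
function buildThinSchema(tool: GatewayTool): ThinSchema {
  return {
    type: "object",
    properties: {
      action: { type: "string", enum: Object.keys(tool.actions).sort() },
      args: { type: "object" },
    },
    required: ["action"],
  };
}

// Full validation runs server-side when a call arrives.
function validateCall(tool: GatewayTool, action: string, args: { [key: string]: unknown }): string[] {
  if (!Object.prototype.hasOwnProperty.call(tool.actions, action)) {
    return ["unknown action: " + action];
  }
  return tool.actions[action](args);
}

const requireSymbolId: Validator = (args) =>
  typeof args["symbolId"] === "string" ? [] : ["symbolId must be a string"];

// Example gateway tool; the action list is illustrative, not exhaustive.
const sdlCodeExample: GatewayTool = {
  name: "sdl.code",
  actions: { getSkeleton: requireSymbolId, needWindow: requireSymbolId },
};
```

The advertised payload grows with the number of action names rather than with the size of each action's argument schema, which is the mechanism behind the reported tools/list reduction.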
LadybugDB (Kuzu)
src/db/ladybug-schema.ts defines the full graph schema as TypeScript Cypher DDL using the Kuzu embedded graph database. Kuzu is open-source (MIT), embeds in-process with no server, and supports Cypher queries natively — comparable to SQLite’s role in relational tooling. The schema is fully readable and versioned (LADYBUG_SCHEMA_VERSION), with a migration runner in src/db/migration-runner.ts.
Node tables include: Repo, File, Symbol, Version, SymbolVersion, Metrics, Cluster, Process, FileSummary, SliceHandle, CardHash, Memory, ScipIngestion, and others. Relationship tables include: DEPENDS_ON, BELONGS_TO_CLUSTER, HAS_MEMORY, MEMORY_OF, PARTICIPATES_IN, and others.
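For readers unfamiliar with Kuzu, DDL held in TypeScript typically looks something like the sketch below. The table names Symbol, Cluster, DEPENDS_ON, and BELONGS_TO_CLUSTER and the Symbol column names come from the lists above; the key columns symbolId and clusterId, every column type, the relationship endpoints, and the applySchemaSketch helper are assumptions and are not copied from ladybug-schema.ts.

```typescript
// Illustrative Kuzu-style DDL statements kept as TypeScript strings.
const SKETCH_DDL: string[] = [
  `CREATE NODE TABLE Symbol(
     symbolId STRING,
     kind STRING,
     exported BOOLEAN,
     rangeStartLine INT64,
     rangeEndLine INT64,
     signatureJson STRING,
     summary STRING,
     summaryQuality DOUBLE,
     summarySource STRING,
     PRIMARY KEY (symbolId)
   )`,
  `CREATE NODE TABLE Cluster(clusterId STRING, PRIMARY KEY (clusterId))`,
  // Relationship endpoints below are guesses made for illustration.
  `CREATE REL TABLE DEPENDS_ON(FROM Symbol TO Symbol)`,
  `CREATE REL TABLE BELONGS_TO_CLUSTER(FROM Symbol TO Cluster)`,
];

// Runs each statement through a caller-supplied executor, so no database
// client API is assumed here.
function applySchemaSketch(execute: (statement: string) => void): void {
  for (let i = 0; i < SKETCH_DDL.length; i++) {
    execute(SKETCH_DDL[i]);
  }
}
```

Keeping the DDL in source alongside a version constant is what makes the schema diffable and lets a migration runner decide which statements still need to run.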
Correction from pre-source analysis: LadybugDB is NOT opaque or proprietary — it is a well-defined Kuzu graph with a versioned TypeScript schema and migration tooling. The prior characterisation was based on README documentation alone.
Delta Packs & Blast Radius
src/delta/index.ts exports computeDelta, computeBlastRadius, runGovernorLoop, and snapshotSymbols — all confirmed present in source. runGovernorLoop suggests the blast-radius computation has a budget-management loop (a “governor”) to prevent unbounded traversal.
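A budget-bounded traversal of the kind a governor loop implies can be sketched as a breadth-first walk that stops when a visit budget is spent. This is a generic illustration under stated assumptions, not the logic of runGovernorLoop: the function name boundedBlastRadius, the adjacency-list graph shape, and the single maxVisited budget are all hypothetical.

```typescript
// Maps a symbol to the symbols that depend on it (reverse dependency edges).
type DependentsGraph = { [symbolId: string]: string[] };

interface BlastRadiusResult {
  affected: string[];
  truncated: boolean; // true when the budget ran out before the walk finished
}

function boundedBlastRadius(
  graph: DependentsGraph,
  changed: string[],
  maxVisited: number
): BlastRadiusResult {
  const seen: { [symbolId: string]: boolean } = Object.create(null);
  const queue: string[] = [];
  for (let i = 0; i < changed.length; i++) {
    if (!seen[changed[i]]) {
      seen[changed[i]] = true;
      queue.push(changed[i]);
    }
  }

  const affected: string[] = [];
  let head = 0;
  let truncated = false;
  while (head < queue.length) {
    // Governor check: stop expanding once the visit budget is used up.
    if (affected.length >= maxVisited) {
      truncated = true;
      break;
    }
    const current = queue[head++];
    affected.push(current);
    const dependents = Object.prototype.hasOwnProperty.call(graph, current) ? graph[current] : [];
    for (let j = 0; j < dependents.length; j++) {
      if (!seen[dependents[j]]) {
        seen[dependents[j]] = true;
        queue.push(dependents[j]);
      }
    }
  }
  return { affected: affected, truncated: truncated };
}
```

Reporting truncation explicitly matters here: a caller can tell a complete blast radius from one that was cut short by the budget.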
Development Memories (opt-in)
A Memory node table and HAS_MEMORY/MEMORY_OF/MEMORY_OF_FILE relationship tables are defined in the schema. src/memory/ contains file-sync.ts and surface.ts, suggesting memories can be attached to files as well as the session. The SyncArtifact node and the existing CLI export/import commands suggest memory can be exported across sessions, though the mechanism is not fully documented in the README.
Benchmark claims — verified vs as-reported
Local clone available at tools/glitterkill-sdl-mcp/ (commit 492b5e8). Source review performed; harness not executed.
| Claim | Value | Scope | Assessment |
|---|---|---|---|
| Symbol Card size | ~100 tokens | Per symbol | Schema confirms ~15–20 fields per symbol; 100 tokens is plausible as median. No size distribution in source. |
| Full-file read cost | ~2,000 tokens | Per file | Rough estimate; no methodology in source. |
| Token reduction from cards | ~20× vs file read | Symbol lookup only | Directionally plausible given schema field count vs full-file size. Not measured by any source-accessible harness. |
| Tool Gateway reduction | 81% | tools/list payload | Source confirms thin wire schemas are intentionally minimal — mechanism is real. Figure unverified; no schema size measurement in source. |
| Legacy tool count | “32” (README) | Flat tools | src/gateway/legacy.ts contains 27 registerTool calls; the README figure is inaccurate. |
| SCIP edge confidence | 0.95 | Post-SCIP ingest | ScipIngestion node confirmed in schema. Confidence value not found in source; likely a data model constant, not a measured recall figure. |
| Real-world benchmark gates | p50 ≥ 50% capped reduction | benchmarks/real-world/ matrix | Formal gates defined in CLAIMS.md; harness confirmed present but not executed. |
| GitHub stars | 125 | As of 2026-04-14 | Verified. |
Verdict: the thin-schema gateway mechanism is source-confirmed and real. The Symbol Card schema is richer than the README describes (summaryQuality, summarySource, inline embeddings). A formal real-world benchmark harness exists with claim gates — this is more rigorous than most tools in this survey, though no reproduction has been run.
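To make the p50 gate concrete, the sketch below shows one way such a check could be computed. It is not the logic of scripts/check-benchmark-claims.ts, which was not executed for this review. The TaskResult shape and the function names are hypothetical, and the reading of "capped" as clamping each per-task reduction to the range 0 to 1 is an assumption.

```typescript
interface TaskResult {
  baselineTokens: number; // tokens used without the tool
  sdlTokens: number; // tokens used with the tool
}

// Assumed meaning of "capped": clamp the per-task reduction to [0, 1].
function cappedReduction(result: TaskResult): number {
  if (result.baselineTokens <= 0) return 0;
  const raw = 1 - result.sdlTokens / result.baselineTokens;
  return Math.min(1, Math.max(0, raw));
}

function median(values: number[]): number {
  if (values.length === 0) return 0;
  const sorted = values.slice().sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 === 1 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2;
}

// Gate passes when the median capped reduction meets the threshold (default 50%).
function passesP50Gate(results: TaskResult[], threshold: number = 0.5): boolean {
  return median(results.map(cappedReduction)) >= threshold;
}
```

A median gate is insensitive to a single outlier task in either direction, which is presumably why the claim is stated at p50 rather than as a mean.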
Architectural assessment
Strengths
- Iris Gate Ladder has meaningful server-side enforcement. gate.ts actively evaluates and denies code.needWindow requests that fail policy, identifier-match, or utility-score checks. Denied requests return structured DenialGuidance with a concrete alternative tool call — the agent is guided to a cheaper rung rather than simply refused. This is stronger than a prompting convention.
- LadybugDB is Kuzu — open, versioned, and queryable. The prior characterisation of LadybugDB as opaque is incorrect. It is a well-defined Kuzu graph database with a versioned TypeScript DDL schema and a migration runner. Kuzu is MIT-licensed, open-source, and supports Cypher. This removes the main vendor-lock-in risk identified in the triage.
- ETag conditional re-fetch is source-confirmed. CardHash nodes store content hashes; slice.build accepts knownCardEtags to skip unchanged symbols. This is a concrete, implemented token-saving mechanism.
- A formal real-world benchmark harness exists with claim gates. benchmarks/real-world/CLAIMS.md defines p50 ≥ 50% capped reduction as the formal gate, enforced by scripts/check-benchmark-claims.ts. This is more rigorous than any other tool in this survey except rtk.
- Delta Packs have a governor loop. runGovernorLoop in src/delta/blastRadius.ts prevents unbounded blast-radius traversal — a practical engineering detail absent from the README.
Weaknesses
- LLM-generated summaries: model, cost, and staleness policy are undocumented. summaryQuality and summarySource fields exist, but the model used, the per-symbol generation cost, and the threshold for re-generation are not exposed in config or documentation. For large codebases, the initial index cost could be substantial.
- Source-available license limits commercial integration. Commercial embedding requires a paid license with unpublished terms. This is a higher friction point than MIT or Apache-2.0 tools.
- 12 languages (Rust indexer) is narrow. codebase-memory-mcp supports 66; tree-sitter tools support 40+. For polyglot repos with languages outside the Rust indexer’s scope, the tree-sitter fallback applies but its coverage is uncharacterised.
- Ladder validation applies only in sdl.workflow mode. Outside of workflow mode, agents calling lower rungs in arbitrary order are not validated. Only the top rung (code.needWindow) is universally gated.
- Real-world benchmark has Windows-absolute paths. benchmarks/real-world/benchmark.config.json contains F:/Claude/projects/... paths — manual fixup is required to reproduce on macOS/Linux.
- Legacy tool count discrepancy. README claims 32 flat tools; source contains 27. Minor but indicates documentation is not kept in sync with source.
Comparison to adjacent tools
| | SDL-MCP | codebase-memory-mcp | oraios-serena |
|---|---|---|---|
| Retrieval primitive | Symbol Card (Kuzu node, LLM summary, ETag) | AST node (SQLite, no LLM) | LSP symbol (live, no LLM) |
| Context escalation | Iris Gate Ladder (5 rungs, top rung gated) | None | Progressive fallback on oversize |
| DB | Kuzu (open-source, Cypher, versioned schema) | SQLite (open, accessible) | none (live LSP) |
| Languages | 12 (Rust) + tree-sitter fallback | 66 | 40+ (via LSP servers) |
| License | source-available | MIT | MIT |
| Benchmark harness | Real-world matrix + claim gates (not run) | None (single anecdote) | None |
| Session token claim | p50 ≥ 50% capped (formal gate, unrun) | 99.2% vs grep (as reported) | — |
SDL-MCP’s Iris Gate Ladder with server-enforced top rung and Kuzu-backed graph is architecturally more mature than the README suggested. codebase-memory-mcp retains the language coverage and MIT license advantage. oraios-serena wins on live accuracy.
Recommendation
Watch — strongest escalation model in this survey; benchmark harness exists but unrun.
Source review upgrades the assessment on two fronts: LadybugDB is Kuzu (open, not opaque) and the Iris Gate is partially enforced server-side (not just a prompting convention). The real-world benchmark harness with formal claim gates is more rigorous than most peers.
Remaining blockers for Adopt: the benchmark harness has not been run here; LLM summary cost is undisclosed; the source-available license restricts commercial use; and ladder enforcement outside of sdl.workflow mode relies on agent compliance for rungs 0–3.
Condition for upgrading to Adopt: run the real-world benchmark matrix and confirm p50 ≥ 50% capped reduction on a neutral codebase; document LLM summary cost and staleness policy.
Comparison hooks (for ANALYSIS.md matrix)
- Context escalation model: Iris Gate Ladder (5 rungs: search → card/slice → skeleton → hot-path → window); top rung gated server-side by gate.ts; all rungs validated in sdl.workflow mode.
- Retrieval primitive: Symbol Card (~100 tokens, LLM-generated summary with summaryQuality score, ETag CardHash conditional re-fetch).
- Token saving scope: tools/list 81% (gateway mode, thin wire schema confirmed in source, as reported); no end-to-end session figure; formal real-world benchmark harness unrun.
- Storage: LadybugDB on Kuzu (open-source MIT embedded graph DB, Cypher queries, versioned TypeScript DDL schema, migration runner).
- License risk: source-available; commercial embedding requires paid license with unpublished terms.
- Unique value: Delta Packs with governor loop for bounded blast-radius; ETag CardHash skip for unchanged symbols; DenialGuidance steers agent to correct rung on gate denial.
- Source-corrected: README claims 32 legacy flat tools — source has 27. LadybugDB described as proprietary in README — it is Kuzu (MIT open-source).