
codebase-memory-mcp — Benchmark Reproduction

Source: tools/deusdata-codebase-memory-mcp/ (pinned: d33b9e4)
Date: 2026-04-09
Environment: macOS Darwin 25.4.0, codebase-memory-mcp v0.5.7 (pre-built binary)
Outcome: partially verified. Live queries confirmed compact output; the 99.2% reduction claim is directional, but its comparison baseline is naive grep.


Indexed tools/deusdata-codebase-memory-mcp in fast mode: 24,138 nodes, 51,664 edges.

Five structural queries were run against the live MCP server:

| # | Tool | Query | Response size | Est. tokens |
|---|------|-------|---------------|-------------|
| 1 | get_graph_schema | node/edge type counts | ~600 B | ~150 |
| 2 | search_graph | cbm_pipeline_* functions (35 found) | ~2.1 KB | ~530 |
| 3 | trace_call_path | callers of cbm_pipeline_run, depth=3 | ~500 B | ~125 |
| 4 | search_graph | cbm_mcp_handle_tool with connections | ~250 B | ~65 |
| 5 | get_architecture | packages, languages, hotspots | ~900 B | ~225 |
| | Total | | ~4.4 KB | ~1,095 |
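
The token estimates in the table appear consistent with a rough four-bytes-per-token rule of thumb. That ratio is an assumption here, since the source does not state how it converted sizes to tokens. A minimal sketch of the arithmetic, using a hypothetical helper named est_tokens:

```shell
# Rough token estimate from a byte count.
# ASSUMPTION: ~4 bytes per token; the source does not state its conversion.
est_tokens() {
  echo $(( ($1 + 3) / 4 ))   # integer division, rounded up
}

# Approximate response sizes in bytes, taken from the table above.
total=0
for size in 600 2100 500 250 900; do
  total=$(( total + $(est_tokens "$size") ))
done
echo "estimated total tokens: $total"
```

With these rounded inputs the sum comes to 1,088, which is within rounding of the table's ~1,095.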

An equivalent grep for all cbm_pipeline_* definitions across the 673 C files would return hundreds of lines, an estimated 5,000–20,000 tokens before any model reasoning.
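
The grep figure above is an estimate rather than a measurement. One way to measure it on a local checkout might look like the sketch below; grep_baseline is a hypothetical helper, and the pattern and path in the usage line are placeholders.

```shell
# Count the matching lines and bytes that a naive recursive grep returns
# for C sources. Prints: "<lines> <bytes>"
grep_baseline() {
  # $1 = directory to search, $2 = pattern (basic regex)
  grep -rn --include='*.c' -e "$2" "$1" | wc -lc | awk '{ print $1, $2 }'
}

# Usage (placeholder path):
# grep_baseline /path/to/repo 'cbm_pipeline_'
```

Note that a plain pattern match also picks up call sites and comments, not only definitions, which is part of why the naive baseline is large.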

From the benchmark source file in the repo:

  • “Tokens” = all input + output tokens during the 12-question answering session, including Claude’s reasoning about tool results — not just raw tool output.
  • Baseline = file-by-file grep, not optimized focused reads or RAG.
  • Single scenario on one repo (not disclosed which repo).

This means the comparison is favorable to the graph tool: grep forces the model to reason over large raw context, while graph queries return pre-structured answers. Savings against focused file reads or optimized RAG would be lower.
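
To see how strongly the reduction depends on the baseline, the percentage can be recomputed from this page's own estimates. The sketch below is illustrative arithmetic only, not a measured result; reduction_pct is a hypothetical helper.

```shell
# Percent reduction in tokens: (1 - graph / baseline) * 100.
reduction_pct() {
  awk -v g="$1" -v b="$2" 'BEGIN { printf "%.1f\n", (1 - g / b) * 100 }'
}

# Query 2 (~530 tokens) against the low end of the grep estimate (5,000 tokens).
reduction_pct 530 5000    # prints 89.4
```

Against the high end of the estimate (20,000 tokens) the same query comes out at roughly 97%. These figures cover raw tool output for a single query, so they are not directly comparable to the session-level 99.2% claim, which also counts the model's reasoning tokens.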

The token benchmark is not a runnable harness — it is an author-run session whose methodology is documented in a markdown file in the repo. No reproducible benchmark script exists.

To replicate live queries:

# Install binary
curl -fsSL https://raw.githubusercontent.com/DeusData/codebase-memory-mcp/main/install.sh | bash
# Index a repo
codebase-memory-mcp cli index_repository '{"repo_path":"/path/to/repo"}'
# Run structural queries via MCP (or CLI)
codebase-memory-mcp cli get_graph_schema '{"project":"<project-name>"}'
codebase-memory-mcp cli search_graph '{"project":"<project-name>","name_pattern":"<pattern>","label":"Function"}'
codebase-memory-mcp cli trace_call_path '{"project":"<project-name>","function_name":"<fn>","direction":"inbound","depth":3}'

Compare token counts in Claude Code’s /cost output or token logging against equivalent grep -r operations.
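
As a rough proxy when session-level token logging is not available, the raw size of each CLI response can be measured directly. This is a sketch under the same four-bytes-per-token assumption; measure_output is a hypothetical wrapper, and it captures only tool output, not the model's reasoning tokens that /cost would include.

```shell
# Report the byte size of a command's stdout plus a rough token estimate.
# ASSUMPTION: ~4 bytes per token.
measure_output() {
  bytes=$("$@" 2>/dev/null | wc -c | tr -d ' ')
  echo "$bytes bytes, ~$(( (bytes + 3) / 4 )) tokens"
}

# Usage (commented out; the project name is a placeholder):
# measure_output codebase-memory-mcp cli get_graph_schema '{"project":"<project-name>"}'
```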

The unit tests are pure C and live in tests/test_*.c. Running them requires building from source:

make test

Tests cover: arena allocator, C LSP, CLI, file discovery, dynamic arrays, hash tables, logging, MCP protocol, memory, parallel execution, platform, simhash, store bulk/edges, traces, UI, and integration. There is no standalone token benchmark harness.