MCP server · Published

Calculator MCP Server

The Calculator MCP server passed a live smoke test: it evaluated both arithmetic expressions correctly and rejected a code-execution attempt disguised as an expression, returning an AST-level unsupported-operation error. It scores high because the task is narrow and deterministic, and because the observed input handling refused non-arithmetic code.

Tested 2026-07-03 · sc-agent-trust-v0.1

Independent trust badge

The visible trust mark for this verdict.

SilentCritique verdict badge for Calculator MCP Server

Badge clicks resolve to this canonical verdict so the score, test date, evidence, limitations, and reply status remain attached.

Embed

Show this badge on your site

[![SilentCritique verdict for Calculator MCP Server](https://silentcritique.com/badges/mcp-calculator)](https://silentcritique.com/verdicts/mcp-calculator)

Markdown works in GitHub READMEs. The badge always links back to this verdict.

Editorial notice

This page reflects SilentCritique's independent editorial opinion based on the specific test evidence shown. It is not an allegation of unlawful, malicious, fraudulent, or bad-faith conduct. SilentCritique does not accept payment to remove criticism, change a score, suppress a verdict, or improve an outcome.

Claim tested

Can the public Calculator MCP server evaluate arithmetic expressions correctly while refusing expressions that attempt code execution?

Evaluator panel

Protocol harness, Safety reviewer, Operator skeptic

Evidence reviewed

Single calculate tool discovered

The server exposes exactly one tool, calculate(expression), which keeps the attack and review surface minimal.

evidence/trust5/2026-07-03-mcp-pilot.json

Arithmetic evaluated correctly

(2 + 3) * 7 - 5 returned 30 and 2 ** 10 / 4 returned 256.0, both as structured content.

evidence/trust5/2026-07-03-mcp-pilot.json
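The two reported results can be reproduced by hand. The short check below assumes the server follows Python operator precedence and true division, which is consistent with the evidence (an integer for the first expression, a float for the second) but was not confirmed from source.

```python
# Reproduce the two tested expressions under plain Python semantics.
# Assumption: the server follows Python precedence and true division.
first = (2 + 3) * 7 - 5   # 5 * 7 = 35, then 35 - 5
second = 2 ** 10 / 4      # ** binds tighter than /, so 1024 / 4

print(first, type(first).__name__)    # 30 int
print(second, type(second).__name__)  # 256.0 float
```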

Code-execution probe rejected

__import__('os').getcwd() was refused with an Unsupported operation error naming the parsed AST node. This suggests that expressions are evaluated against an allowlist of AST nodes rather than passed to eval, although the source was not audited.

evidence/trust5/2026-07-03-mcp-pilot.json
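To illustrate what allowlisted evaluation means in practice, here is a minimal sketch of an AST-walking evaluator. It is not the package's code. The names safe_eval and _eval_node, the choice of permitted operators, and the exact error wording are assumptions made for illustration.

```python
import ast
import operator

# Hypothetical allowlist: only these AST operator types are ever evaluated.
_BINARY = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
    ast.Pow: operator.pow,
}
_UNARY = {ast.UAdd: operator.pos, ast.USub: operator.neg}


def safe_eval(expression: str):
    """Evaluate arithmetic by walking the AST; reject every other node."""
    tree = ast.parse(expression, mode="eval")
    return _eval_node(tree.body)


def _eval_node(node):
    # Numeric literals only (bool is excluded by the exact type check).
    if isinstance(node, ast.Constant) and type(node.value) in (int, float):
        return node.value
    if isinstance(node, ast.BinOp) and type(node.op) in _BINARY:
        return _BINARY[type(node.op)](_eval_node(node.left), _eval_node(node.right))
    if isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY:
        return _UNARY[type(node.op)](_eval_node(node.operand))
    # Calls, attribute access, names and so on land here and are never executed.
    raise ValueError(f"Unsupported operation: {type(node).__name__}")
```

With this sketch, safe_eval("(2 + 3) * 7 - 5") returns 30, while the probe string raises ValueError naming the Call node before any import happens. A production allowlist would likely also cap exponent size, since an expression such as 9 ** 9 ** 9 is pure arithmetic but expensive to compute.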

Test setup

  • Started mcp-server-calculator==0.2.0 over MCP stdio via uvx.
  • Used the official MCP client SDK to list tools, evaluate two arithmetic expressions, and submit a Python __import__ call disguised as an expression.
  • Stored the full tool-call evidence in evidence/trust5/2026-07-03-mcp-pilot.json.
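The setup above could be scripted roughly as follows. This is a sketch, not the harness actually used: it assumes the MCP Python SDK (the mcp package) is installed, that the package's console script shares the package name, and that evidence is printed as JSON rather than written to the file named above. PROBES, run_probes, and main are hypothetical names.

```python
import asyncio
import json

# The three inputs named in the test setup.
PROBES = [
    {"kind": "arithmetic", "expression": "(2 + 3) * 7 - 5"},
    {"kind": "arithmetic", "expression": "2 ** 10 / 4"},
    {"kind": "injection", "expression": "__import__('os').getcwd()"},
]


async def run_probes(call_tool):
    """Send each probe through call_tool(name, arguments) and record the outcome."""
    records = []
    for probe in PROBES:
        result = await call_tool("calculate", {"expression": probe["expression"]})
        records.append({
            **probe,
            "is_error": bool(getattr(result, "isError", False)),
            "structured": getattr(result, "structuredContent", None),
            "text": [getattr(c, "text", None) for c in getattr(result, "content", [])],
        })
    return records


async def main():
    # Imported lazily so run_probes stays usable without the SDK installed.
    from mcp import ClientSession, StdioServerParameters
    from mcp.client.stdio import stdio_client

    # Assumption: the console script is named after the package.
    params = StdioServerParameters(
        command="uvx",
        args=["--from", "mcp-server-calculator==0.2.0", "mcp-server-calculator"],
    )
    async with stdio_client(params) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()
            tools = await session.list_tools()
            evidence = {
                "tools": [tool.name for tool in tools.tools],
                "calls": await run_probes(session.call_tool),
            }
    print(json.dumps(evidence, indent=2, default=str))
```

Running asyncio.run(main()) would start the pinned server over stdio and print the tool list followed by one record per probe. Passing the call function into run_probes keeps the probe logic checkable against a stub without launching the server.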

Strengths

  • The injection probe was rejected at the expression-parsing layer, not by output filtering.
  • Both tested expressions returned deterministic results as structured content.
  • One-tool surface is easy for a host to review and permission.

Failure modes

  • Only two arithmetic cases and one injection case were probed; the allowlist boundary was not exhaustively mapped.
  • A community-maintained package can change behavior between versions; this verdict applies only to the pinned version, 0.2.0.

What would improve the score

  • Document the exact accepted expression grammar.
  • Add fuzz tests for the expression parser to the upstream repository.
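As one possible shape for such fuzz tests, the sketch below generates small random expressions, occasionally splices in a hostile fragment, and flags any input that is neither cleanly refused nor evaluated to a number. The fragment list, the set of acceptable exceptions, and the names random_expression and fuzz are assumptions for illustration. The evaluator under test is passed in as a callable, and exponentiation is left out of the generator so that random inputs stay cheap to evaluate.

```python
import random

NUMBERS = ["0", "1", "2", "7", "10", "3.5"]
OPERATORS = ["+", "-", "*", "/"]
# Hostile fragments that a calculator should refuse rather than execute.
HOSTILE = ["__import__('os')", "open('x')", "().__class__", "[1, 2]", "'a' * 3", "lambda: 1"]


def random_expression(rng, depth=0):
    """Build a small expression, occasionally splicing in a hostile fragment."""
    roll = rng.random()
    if depth >= 3 or roll < 0.3:
        return rng.choice(NUMBERS)
    if roll < 0.4:
        return rng.choice(HOSTILE)
    left = random_expression(rng, depth + 1)
    right = random_expression(rng, depth + 1)
    return f"({left} {rng.choice(OPERATORS)} {right})"


def fuzz(evaluate, rounds=500, seed=0):
    """Return the inputs where evaluate misbehaved; an empty list means no findings."""
    rng = random.Random(seed)
    findings = []
    for _ in range(rounds):
        expr = random_expression(rng)
        hostile = any(fragment in expr for fragment in HOSTILE)
        try:
            value = evaluate(expr)
        except (ValueError, SyntaxError, ZeroDivisionError, OverflowError):
            continue  # a clean refusal is always acceptable
        except Exception as exc:
            findings.append((expr, f"unexpected {type(exc).__name__}"))
            continue
        # A hostile input must never produce a value; results must be plain numbers.
        if hostile or isinstance(value, bool) or not isinstance(value, (int, float)):
            findings.append((expr, f"returned {value!r}"))
    return findings
```

Calling fuzz(evaluate) with the parser's entry point returns a list of (expression, reason) pairs. The fixed seed keeps a failing run reproducible, which matters when a finding has to be reported upstream.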

Limitations

  • This was an unsolicited smoke test of the public package, not a source audit.
  • Only local stdio operation on macOS was tested.

Visible dissent

  • The safety reviewer scored this at the top of the range because the injection probe failed safely with a precise error.
  • The operator skeptic noted that a calculator is the easiest possible category to score well in, and the score should not be read across categories.

Right of reply

No vendor reply has been requested or published as of 2026-07-03. SilentCritique will publish factual corrections or a right of reply through the corrections process.

Do you build Calculator MCP Server?

Claim this verdict to publish a reply, correct factual errors, or request a re-test after you ship fixes. Replies are published verbatim next to the score.


Methodology matters

Scores are only meaningful when the rubric, date, evidence, and dissent are visible.
