Agent trust rail

The public trust directory for AI agents.

SilentCritique publishes dated, evidence-backed verdicts from real tests. Each score carries a badge, methodology, visible limitations, and a right-of-reply path.

Run a Verdict · Read methodology · Editorial policy · Corrections and reply

Verdict count

Every score comes from a real, dated test.

Published verdicts

Internal methodology examples are labeled separately and never counted as published verdicts.

Evidence before score

Every verdict names what was tested, what was observed, and what was outside scope.

Dissent is visible

Evaluator disagreement is part of the artifact, not edited out for a cleaner score.

Badges link back

Embeds point to the canonical verdict so the score, date, methodology, and any reply can be checked.

Published corpus

Published verdicts

MCP server · Published

Score

/100

Calculator MCP Server

Narrow, deterministic, injection-resistant in smoke test

The Calculator MCP server passed a real smoke test: it evaluated arithmetic correctly and rejected a code-execution attempt disguised as an expression, returning an AST-level unsupported-operation error. It scores high because the task is narrow and deterministic, and the observed input handling refused non-arithmetic code.
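
For readers unfamiliar with AST-level rejection, the following is a minimal Python sketch of the general technique. It is not the Calculator server's code; the names safe_eval and UnsupportedOperation and the operator allow-list are assumptions made for illustration. The idea is to parse the input into a syntax tree and evaluate only node types on an explicit allow-list, so a function call or attribute access is refused before anything runs.

```python
import ast
import operator

# Hypothetical allow-list: only these operator node types are evaluated.
# Exponentiation is left out on purpose, since huge powers can exhaust CPU.
_BINARY_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}
_UNARY_OPS = {ast.UAdd: operator.pos, ast.USub: operator.neg}


class UnsupportedOperation(ValueError):
    """Raised when the expression contains anything other than arithmetic."""


def _eval_node(node):
    if isinstance(node, ast.Expression):
        return _eval_node(node.body)
    # Accept plain int/float literals only (bool and str are rejected).
    if isinstance(node, ast.Constant) and type(node.value) in (int, float):
        return node.value
    if isinstance(node, ast.BinOp) and type(node.op) in _BINARY_OPS:
        left = _eval_node(node.left)
        right = _eval_node(node.right)
        return _BINARY_OPS[type(node.op)](left, right)
    if isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY_OPS:
        return _UNARY_OPS[type(node.op)](_eval_node(node.operand))
    # Calls, names, attributes, subscripts and everything else land here.
    raise UnsupportedOperation(f"unsupported operation: {type(node).__name__}")


def safe_eval(expression: str):
    """Evaluate an arithmetic expression without ever calling eval()."""
    tree = ast.parse(expression, mode="eval")
    return _eval_node(tree)
```

Under these assumptions, safe_eval("2 + 3 * 4") evaluates normally, while an input such as "__import__('os').system('id')" parses to a Call node and raises UnsupportedOperation without executing anything.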

Tested 2026-07-03 · sc-agent-trust-v0.1 · Recently verified

Open verdict

MCP server · Published

Score

/100

Chroma MCP Server

Solid vector-memory primitive, ephemeral mode tested

The Chroma MCP server passed a real smoke test: it created a collection, embedded and stored a document, retrieved it by semantic query, and failed cleanly when asked about a collection that does not exist. It scores well as a vendor-maintained memory primitive, tested here only in ephemeral in-process mode.

Tested 2026-07-03 · sc-agent-trust-v0.1 · Recently verified

Open verdict

MCP server · Published

Score

/100

Excel MCP Server

Verified workbook round-trip, large untested surface

The Excel MCP server passed a real smoke test: it created a workbook, wrote a data range, and read the same cells back with full per-cell metadata, all without Excel installed. The score is solid for the verified round-trip, with caution because only 3 of 25 exposed tools were exercised and file paths are not sandboxed by the server.

Tested 2026-07-03 · sc-agent-trust-v0.1 · Recently verified

Open verdict

MCP server · Published

Score

/100

DuckDuckGo Search MCP Server

Working keyless search, unofficial upstream

The DuckDuckGo Search MCP server passed a real smoke test: a live search returned relevant ranked results and page fetch returned readable content with character accounting. The score is moderate because the server scrapes DuckDuckGo without an official API, so reliability and continued operation depend on a third party that has not agreed to serve it.

Tested 2026-07-03 · sc-agent-trust-v0.1 · Recently verified

Open verdict

MCP server · Published

Score

/100

Yahoo Finance MCP Server (yfmcp)

Working market data, unofficial data path

The Yahoo Finance MCP server passed a real smoke test: it returned live company data for AAPL and correct symbol search results. The score is moderate because the server rides an unofficial Yahoo Finance data path with no service agreement, so data continuity and accuracy cannot be assured for decision-grade use.

Tested 2026-07-03 · sc-agent-trust-v0.1 · Recently verified

Open verdict

MCP server · Published

Score

/100

MCP Git Reference Server

Accurate git reads, powerful write surface

The Git reference server passed a real MCP smoke test: against a fresh repository it reported a clean status and returned an accurate commit log. The score is solid for read operations, with caution because the same server exposes commit, reset, and checkout tools whose safety depends entirely on which repository paths the client allows.

Tested 2026-06-13 · sc-agent-trust-v0.1 · Evidence aging

Open verdict

MCP server · Published

Score

/100

MCP Everything Reference Server

Clean protocol conformance in smoke test

The Everything reference server passed a real MCP smoke test: it discovered 13 tools and returned correct results for an echo and a numeric sum. The score is solid for protocol conformance, with caution because this is an explicit feature-demonstration server — including an environment-dump tool — and is not meant for production deployment.

Tested 2026-06-13 · sc-agent-trust-v0.1 · Evidence aging

Open verdict

MCP server · Published

Score

/100

MCP Fetch Reference Server

Focused fetch tool, needs egress guardrails

The Fetch reference server passed a real MCP smoke test: it fetched a live URL and returned simplified markdown with length controls. The score is good for a focused, single-purpose tool, with caution because it will fetch arbitrary URLs and needs client-side egress and internal-network controls.
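
As an illustration of the client-side egress controls mentioned above, here is a minimal Python sketch of a pre-fetch URL check. It is an assumption about what a host could do, not part of the Fetch server; is_egress_allowed, ALLOWED_SCHEMES, and the injectable resolve parameter are hypothetical names. The check allows only http and https and refuses any host that resolves to a non-global address (loopback, private ranges, link-local metadata endpoints).

```python
import ipaddress
import socket
from urllib.parse import urlsplit

ALLOWED_SCHEMES = {"http", "https"}  # assumption: no file:, ftp:, etc.


def _resolve(host):
    """Default resolver: return every address the host name maps to."""
    infos = socket.getaddrinfo(host, None)
    return [info[4][0] for info in infos]


def is_egress_allowed(url, resolve=_resolve):
    """Return True only if the URL targets a publicly routable address."""
    parts = urlsplit(url)
    if parts.scheme not in ALLOWED_SCHEMES or not parts.hostname:
        return False
    try:
        addresses = resolve(parts.hostname)
    except OSError:
        return False  # unresolvable hosts are refused, not retried
    if not addresses:
        return False
    for address in addresses:
        try:
            ip = ipaddress.ip_address(address)
        except ValueError:
            return False
        # is_global is False for loopback, private, and link-local ranges.
        if not ip.is_global:
            return False
    return True
```

A check like this only vets the addresses seen at check time. A host would also need to connect to the vetted address and re-check redirects, since DNS answers can change between the check and the fetch.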

Tested 2026-06-13 · sc-agent-trust-v0.1 · Evidence aging

Open verdict

MCP server · Published

Score

/100

MarkItDown MCP Server

Clean conversion, broad URI reach

Microsoft's MarkItDown MCP server passed a real MCP smoke test: it converted a local HTML fixture into clean markdown. The score is good for a focused conversion tool, with caution because it accepts file and http URIs, giving it local-file-read and network-egress reach that the client must constrain.

Tested 2026-06-13 · sc-agent-trust-v0.1 · Evidence aging

Open verdict

MCP server · Published

Score

/100

MCP SQLite Reference Server

Works as shown, but archived and write-broad

The SQLite reference server passed a real MCP smoke test: it created a table, inserted a row, and read it back correctly. The score is moderate because the server now lives in the archived servers repository and its write_query tool executes arbitrary mutating SQL with no statement-level guardrails.
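
To make "statement-level guardrails" concrete, the sketch below shows one way a wrapper could restrict a SQLite connection using the authorizer hook in Python's standard sqlite3 module. It is illustrative only and does not describe the reference server; guard_read_only and READ_ONLY_ACTIONS are hypothetical names, and the read-only policy is an assumption chosen to keep the example short.

```python
import sqlite3

# Assumption: the agent only needs to read, so every other action is denied.
READ_ONLY_ACTIONS = {
    sqlite3.SQLITE_SELECT,    # a SELECT statement is being compiled
    sqlite3.SQLITE_READ,      # a column is being read
    sqlite3.SQLITE_FUNCTION,  # a SQL function such as count() is called
}


def _authorizer(action, arg1, arg2, db_name, source):
    """Called by SQLite for each action while a statement is compiled."""
    if action in READ_ONLY_ACTIONS:
        return sqlite3.SQLITE_OK
    return sqlite3.SQLITE_DENY


def guard_read_only(conn: sqlite3.Connection) -> sqlite3.Connection:
    """Install the authorizer so mutating statements fail before running."""
    conn.set_authorizer(_authorizer)
    return conn
```

With the guard installed, a SELECT still works, while statements such as DROP TABLE or INSERT are rejected with a sqlite3 error when they are compiled rather than after they have changed data.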

Tested 2026-06-13 · sc-agent-trust-v0.1 · Evidence aging

Open verdict

MCP server · Published

Score

/100

MCP Time Reference Server

Narrow, deterministic utility

The Time reference server passed a real MCP smoke test for current-time lookup and timezone conversion. It scores well because the task is narrow, deterministic, and low-side-effect, though invalid input and localization behavior were not deeply tested.

Tested 2026-06-11 · sc-agent-trust-v0.1 · Evidence aging

Open verdict

MCP server · Published

Score

/100

MCP Filesystem Reference Server

Strong scoped file control in smoke test

The Filesystem reference server passed a real MCP smoke test for tool discovery, allowed-directory read/write, and denial of an out-of-scope /etc/hosts read. The score is high for scoped file controls, with caution because the tool surface includes destructive write, edit, and move capabilities.
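
The allowed-directory behavior described above can be sketched as a path containment check. This is a minimal Python illustration of the general idea, not the Filesystem server's implementation; is_within_allowed and the /srv/agent-workspace directory are hypothetical. Paths are resolved first so that ".." segments and symlinks cannot step outside an allowed root.

```python
from pathlib import Path


def is_within_allowed(requested, allowed_dirs):
    """Return True if the resolved path sits inside an allowed directory."""
    # resolve() collapses ".." segments and follows symlinks where they exist.
    target = Path(requested).resolve()
    for allowed in allowed_dirs:
        root = Path(allowed).resolve()
        if target == root or root in target.parents:
            return True
    return False
```

For example, with allowed_dirs set to ["/srv/agent-workspace"], a request for /srv/agent-workspace/notes.txt is accepted, while /etc/hosts and /srv/agent-workspace/../../etc/hosts are both refused.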

Tested 2026-06-11 · sc-agent-trust-v0.1 · Evidence aging

Open verdict

MCP server · Published

Score

/100

Playwright MCP Server

Capable browser automation with high-power surface

Playwright MCP passed a real MCP smoke test for browser navigation and accessibility snapshot extraction. It exposes a rich, useful automation surface, but the same power means hosts need strict policy around origins, files, credentials, and side-effecting browser actions.

Tested 2026-06-11 · sc-agent-trust-v0.1 · Evidence aging

Open verdict

MCP server · Published

Score

/100

MCP Memory Reference Server

Useful memory primitive, limited assurance

The Memory reference server passed a real MCP smoke test for entity creation and search. It is useful as a simple knowledge-graph memory primitive, but the test did not validate persistence guarantees, conflict behavior, or controls for sensitive memory use.

Tested 2026-06-11 · sc-agent-trust-v0.1 · Evidence aging

Open verdict

MCP server · Published

Score

/100

MCP Sequential Thinking Reference Server

Functional narrow reasoning utility

The Sequential Thinking reference server passed a real MCP smoke test for its single reasoning tool. Its behavior is legible and low-side-effect, but the evaluated utility is narrow and the test did not validate complex branching or long-running reasoning quality.

Tested 2026-06-11 · sc-agent-trust-v0.1 · Evidence aging

Open verdict

Methodology examples

Internal examples, not third-party verdicts

These examples validate page structure, scoring language, and evidence layout. They are marked noindex and must not be treated as Trust 100 outcomes.

AI app · Methodology example

Example

SilentCritique Public Report Privacy

Strong privacy boundary

The private report sharing route keeps token-gated reports out of search, which is the right privacy stance and a useful boundary for the new public verdict product.

Internal calibration only. Not a public third-party verdict.

Open example

Agent API · Methodology example

Example

SilentCritique Instant Critique Tool API

Operationally credible

The agent-facing instant critique endpoint has strong operational controls, including API-key auth, wallet charging, idempotency, rate limits, URL safety, and callback validation.

Internal calibration only. Not a public third-party verdict.

Open example

Agent API · Methodology example

Example

SilentCritique Wallet-Funded Agent Jobs

Accountability foundation present

Wallet-funded tool jobs are a solid foundation for accountable machine work, but the public certificate layer still needs a score history and verification surface.

Internal calibration only. Not a public third-party verdict.

Open example

Agent API · Methodology example

Example

SilentCritique Agent Protocol

Promising infrastructure, limited market proof

The protocol has a concrete trust, staking, discovery, and tool-catalog contract, but it still needs public third-party execution history before it can function as an independent trust signal.

Internal calibration only. Not a public third-party verdict.

Open example

AI app · Methodology example

Example

SilentCritique Agent Marketplace

Legible mechanics, weak demand proof

The marketplace page makes participation terms legible, but the product should not rely on marketplace liquidity until public verdict demand exists.

Internal calibration only. Not a public third-party verdict.

Open example