Agent trust rail

The public trust directory for AI agents.

SilentCritique publishes dated, evidence-backed verdicts from real tests. Each score carries a badge, methodology, visible limitations, and a right-of-reply path.

Verdict count

Every score comes from a real, dated test.

Published verdicts: 10

Internal methodology examples are labeled separately and never counted as published verdicts.

Evidence before score

Every verdict names what was tested, what was observed, and what was outside scope.

Dissent is visible

Evaluator disagreement is part of the artifact, not edited out for a cleaner score.

Badges link back

Embeds point to the canonical verdict so the score, date, methodology, and any reply can be checked.
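
As a rough illustration of the three principles above, a verdict can be pictured as a record that cannot be published without evidence and stated limits. This is only a sketch: the `Verdict` class, its field names, and the `is_publishable` rule are hypothetical and are not taken from SilentCritique's actual schema. The sample values come from the Time server verdict listed on this page.

```python
from dataclasses import dataclass, field, replace
from datetime import date


@dataclass(frozen=True)
class Verdict:
    """Hypothetical verdict record; field names are illustrative assumptions."""
    subject: str
    score: int                 # 0-100
    tested_on: date
    methodology: str
    tested: list[str]          # what was tested
    observed: list[str]        # what was observed
    out_of_scope: list[str]    # what was outside scope
    dissent: list[str] = field(default_factory=list)  # evaluator disagreement, kept visible

    def is_publishable(self) -> bool:
        # Evidence before score: require a valid score plus non-empty
        # tested / observed / out-of-scope sections.
        return (
            0 <= self.score <= 100
            and bool(self.tested)
            and bool(self.observed)
            and bool(self.out_of_scope)
        )


time_verdict = Verdict(
    subject="MCP Time Reference Server",
    score=88,
    tested_on=date(2026, 6, 11),
    methodology="sc-agent-trust-v0.1",
    tested=["current-time lookup", "timezone conversion"],
    observed=["passed a real MCP smoke test"],
    out_of_scope=["invalid input", "localization behavior"],
)

# A copy with no stated limitations fails the publishability check.
no_limits = replace(time_verdict, out_of_scope=[])
```

Making the out-of-scope list mandatory is one way to keep limitations from being dropped when a score is embedded elsewhere.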

Published corpus

Published verdicts

MCP server · Published

Score: 82/100

MCP Git Reference Server

Accurate git reads, powerful write surface

The Git reference server passed a real MCP smoke test: against a fresh repository it reported a clean status and returned an accurate commit log. The score is solid for read operations, with caution because the same server exposes commit, reset, and checkout tools whose safety depends entirely on which repository paths the client allows.

Tested 2026-06-13 · sc-agent-trust-v0.1

MCP server · Published

Score: 81/100

MCP Everything Reference Server

Clean protocol conformance in smoke test

The Everything reference server passed a real MCP smoke test: it discovered 13 tools and returned correct results for an echo and a numeric sum. The score is solid for protocol conformance, with caution because this is an explicit feature-demonstration server (it includes an environment-dump tool) and is not meant for production deployment.

Tested 2026-06-13 · sc-agent-trust-v0.1

MCP server · Published

Score: 79/100

MCP Fetch Reference Server

Focused fetch tool, needs egress guardrails

The Fetch reference server passed a real MCP smoke test: it fetched a live URL and returned simplified markdown with length controls. The score is good for a focused, single-purpose tool, with caution because it will fetch arbitrary URLs and needs client-side egress and internal-network controls.

Tested 2026-06-13 · sc-agent-trust-v0.1
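
The egress caution in this verdict can be made concrete with a small client-side check. The sketch below is not part of the Fetch server or of the test; `is_url_allowed` and its policy are hypothetical, and it only inspects the URL text. A real policy would also resolve hostnames and re-check the resulting addresses, since a public-looking name can point at an internal address.

```python
import ipaddress
from urllib.parse import urlsplit

ALLOWED_SCHEMES = {"http", "https"}
BLOCKED_HOSTNAMES = {"localhost"}


def is_url_allowed(url: str) -> bool:
    """Hypothetical pre-fetch check: allow only http(s) URLs to non-internal hosts."""
    try:
        parts = urlsplit(url)
    except ValueError:
        return False  # malformed URL, e.g. a broken IPv6 literal
    if parts.scheme not in ALLOWED_SCHEMES or not parts.hostname:
        return False
    host = parts.hostname.lower()
    if host in BLOCKED_HOSTNAMES or host.endswith(".localhost"):
        return False
    try:
        addr = ipaddress.ip_address(host)
    except ValueError:
        # Not an IP literal. This sketch lets hostnames through;
        # a stricter policy would resolve them and check each address.
        return True
    # Rejects loopback, private, link-local, and other non-global ranges.
    return addr.is_global
```

A host would call a check like this before handing the URL to the fetch tool, so that requests for loopback addresses, cloud metadata endpoints, or `file:` URIs never reach the server.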

MCP server · Published

Score: 78/100

MarkItDown MCP Server

Clean conversion, broad URI reach

Microsoft's MarkItDown MCP server passed a real MCP smoke test: it converted a local HTML fixture into clean markdown. The score is good for a focused conversion tool, with caution because it accepts file and http URIs, giving it local-file-read and network-egress reach that the client must constrain.

Tested 2026-06-13 · sc-agent-trust-v0.1

MCP server · Published

Score: 76/100

MCP SQLite Reference Server

Works as shown, but archived and write-broad

The SQLite reference server passed a real MCP smoke test: it created a table, inserted a row, and read it back correctly. The score is moderate because the server now lives in the archived servers repository and its write_query tool executes arbitrary mutating SQL with no statement-level guardrails.

Tested 2026-06-13 · sc-agent-trust-v0.1
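
To show what a statement-level guardrail could look like, here is a minimal sketch using the authorizer hook in Python's standard `sqlite3` module. It is not how the reference server behaves (the verdict notes it has no such guardrail); `guard`, `open_guarded`, and `is_blocked` are hypothetical names, and the choice of denied actions is an assumption for illustration. The first three statements mirror the smoke test's create, insert, and read-back steps.

```python
import sqlite3

# Assumed policy for this sketch: forbid row deletion and table drops.
DENIED_ACTIONS = {sqlite3.SQLITE_DELETE, sqlite3.SQLITE_DROP_TABLE}


def guard(action, arg1, arg2, db_name, source):
    """Authorizer callback: SQLite calls this for each action a statement needs."""
    return sqlite3.SQLITE_DENY if action in DENIED_ACTIONS else sqlite3.SQLITE_OK


def open_guarded(path: str = ":memory:") -> sqlite3.Connection:
    conn = sqlite3.connect(path)
    conn.set_authorizer(guard)
    return conn


conn = open_guarded()

# Same shape as the smoke test: create a table, insert a row, read it back.
conn.execute("CREATE TABLE notes (id INTEGER PRIMARY KEY, body TEXT)")
conn.execute("INSERT INTO notes (body) VALUES (?)", ("hello",))
rows = conn.execute("SELECT body FROM notes").fetchall()


def is_blocked(sql: str) -> bool:
    """Return True if the guarded connection refuses to run the statement."""
    try:
        conn.execute(sql)
    except sqlite3.DatabaseError:
        return True
    return False
```

With this policy, `is_blocked("DROP TABLE notes")` and `is_blocked("DELETE FROM notes")` both report a refusal while inserts and reads continue to work. The authorizer operates on parsed actions rather than on SQL text, which avoids the gaps of keyword filtering.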

MCP server · Published

Score: 88/100

MCP Time Reference Server

Narrow, deterministic utility

The Time reference server passed a real MCP smoke test for current-time lookup and timezone conversion. It scores well because the task is narrow, deterministic, and low-side-effect, though invalid input and localization behavior were not deeply tested.

Tested 2026-06-11 · sc-agent-trust-v0.1

MCP server · Published

Score: 84/100

MCP Filesystem Reference Server

Strong scoped file control in smoke test

The Filesystem reference server passed a real MCP smoke test for tool discovery, allowed-directory read/write, and denial of an out-of-scope /etc/hosts read. The score is high for scoped file controls, with caution because the tool surface includes destructive write, edit, and move capabilities.

Tested 2026-06-11 · sc-agent-trust-v0.1
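
The allowed-directory behavior described in this verdict amounts to a path-containment check. The following is a minimal sketch of that idea, not the Filesystem server's actual implementation; `is_within_allowed` and the temporary `workspace` directory are hypothetical. It requires Python 3.9 or later for `Path.is_relative_to`.

```python
import os
import tempfile
from pathlib import Path


def is_within_allowed(path: str, allowed_dirs: list[str]) -> bool:
    """Return True if the resolved path sits inside one of the allowed directories."""
    # resolve() collapses ".." segments and follows symlinks, so traversal
    # tricks are evaluated against the real location.
    target = Path(path).resolve()
    return any(target.is_relative_to(Path(d).resolve()) for d in allowed_dirs)


# A throwaway directory standing in for the client's allowed workspace.
workspace = tempfile.mkdtemp()

in_scope = is_within_allowed(os.path.join(workspace, "notes.txt"), [workspace])
out_of_scope = is_within_allowed("/etc/hosts", [workspace])
traversal = is_within_allowed(os.path.join(workspace, "..", "secrets.txt"), [workspace])
```

Comparing resolved paths component by component, rather than comparing string prefixes, also rejects sibling directories whose names merely start with the allowed directory's name.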

MCP server · Published

Score: 82/100

Playwright MCP Server

Capable browser automation with high-power surface

Playwright MCP passed a real MCP smoke test for browser navigation and accessibility snapshot extraction. It exposes a rich, useful automation surface, but the same power means hosts need strict policy around origins, files, credentials, and side-effecting browser actions.

Tested 2026-06-11 · sc-agent-trust-v0.1

MCP server · Published

Score: 76/100

MCP Memory Reference Server

Useful memory primitive, limited assurance

The Memory reference server passed a real MCP smoke test for entity creation and search. It is useful as a simple knowledge-graph memory primitive, but the test did not validate persistence guarantees, conflict behavior, or controls for sensitive memory use.

Tested 2026-06-11 · sc-agent-trust-v0.1

MCP server · Published

Score: 68/100

MCP Sequential Thinking Reference Server

Functional narrow reasoning utility

The Sequential Thinking reference server passed a real MCP smoke test for its single reasoning tool. Its behavior is legible and low-side-effect, but the evaluated utility is narrow and the test did not validate complex branching or long-running reasoning quality.

Tested 2026-06-11 · sc-agent-trust-v0.1


Methodology examples

Internal examples, not third-party verdicts

These examples validate page structure, scoring language, and evidence layout. They are marked noindex, which asks search engines not to index them, and must not be treated as Trust 100 outcomes.

AI app · Methodology example

Example score: 86

SilentCritique Public Report Privacy

Strong privacy boundary

The private report sharing route keeps token-gated reports out of search, which is the right privacy stance and a useful boundary for the new public verdict product.

Internal calibration only. Not a public third-party verdict.

Agent API · Methodology example

Example score: 81

SilentCritique Instant Critique Tool API

Operationally credible

The agent-facing instant critique endpoint has strong operational controls, including API-key auth, wallet charging, idempotency, rate limits, URL safety, and callback validation.

Internal calibration only. Not a public third-party verdict.

Agent API · Methodology example

Example score: 76

SilentCritique Wallet-Funded Agent Jobs

Accountability foundation present

Wallet-funded tool jobs are a solid foundation for accountable machine work, but the public certificate layer still needs a score history and verification surface.

Internal calibration only. Not a public third-party verdict.

Agent API · Methodology example

Example score: 72

SilentCritique Agent Protocol

Promising infrastructure, limited market proof

The protocol has a concrete trust, staking, discovery, and tool-catalog contract, but it still needs public third-party execution history before it can function as an independent trust signal.

Internal calibration only. Not a public third-party verdict.

AI app · Methodology example

Example score: 58

SilentCritique Agent Marketplace

Legible mechanics, weak demand proof

The marketplace page makes participation terms legible, but the product should not rely on marketplace liquidity until public verdict demand exists.

Internal calibration only. Not a public third-party verdict.
