Review methodology

The review loop is the product.

The site should not become a link directory. Every completed public review needs evidence, model mapping, explicit confidence, benchmark critique, and an update path. Current tool pages are provisional profiles until this loop has run.

Review loop

From source signal to published verdict.

Detect source: GitHub, RSS/newsletter, arXiv, Semantic Scholar, HN/social, or manual submission.

Canonicalize identity and dedupe aliases, forks, papers, launch posts, and product pages.

Snapshot evidence with source URL, fetched date, content hash, trust level, and quote/span where public.

Extract claims, features, benchmark numbers, relationships, caveats, and failure reports.

Classify against the knowledge stack: Production, Curation, Storage, Activation, Governance.

Draft the review with citations, confidence, model signature, strengths, limitations, best-for, avoid-if, and open questions.

Human approves final scores, rankings, superiority claims, public criticism, and publication.

Publish page, matrix cells, graph edges, changelog, and stale-review watch state.

Review page contract

Every tool page should carry the same packet.

Summary and verdict

Canonical URL and source links

Model signature

Layer coverage scorecard

Dimension assessment

Evidence packet

Promotion ledger: detected → drafted → promoted / rejected / stale

Activation trace: which sources entered the run and why

Strengths and limitations

Best for / avoid if

Open questions

Related tools, papers, benchmarks

Benchmark numbers plus benchmark critique

Update history

Failure research branch

Homepage failures must come from public pain.

The failure section should be driven by a recurring evidence ledger, not plausible examples. The watcher should emit only meaningful changes: new high-signal clusters, ranking changes, fresh examples, or reviews needing editorial attention.

GitHub issues and discussions across agent memory, RAG, MCP, coding-agent, and graph-memory repos
Reddit, Hacker News, blogs, launch posts, changelogs, and vendor troubleshooting docs
arXiv / Semantic Scholar limitation sections and benchmark critiques
Stack Overflow / StackExchange questions
Public support and community threads where quoting is allowed
Private/internal patterns only as calibration, never as public evidence