The research pass mapped Letta primarily to context assembly: what gets assembled into the working context, when, and as what kind of state.
Tool profile · provisional profile
Letta / MemGPT
Strong agent-state framing. Good fit when you need explicit agent memory blocks, persona/state management and inspectable long-running agents.
Provisional fit
76/100
Best for: Agent-state experiments where memory must be visible and editable rather than implicit in prompts.
Avoid if: you need a fully governed, citation-complete knowledge architecture without adding policy, evidence capture, and review workflow around the tool.
Caution: Still needs external discipline for provenance, deletion, lifecycle and cross-user governance.
Model signature: Context Assembly primary · agent scope · Tool
Layer coverage
Where this tool fits.
This is not a completed review. It is a provisional profile built from public positioning and a mapping of known failure modes. Hands-on benchmarks, source snapshots, and citation-bound claims are still required before stronger conclusions can be drawn.
Evidence notes
What the provisional profile has drawn on so far.
Failure modes to test: hidden stale state, unclear memory authority, and weak audit trail from source event to assembled context.
Profile depth is provisional: it is based on public positioning, public docs, and known memory-failure clusters, not yet on a hands-on benchmark or a citation-bound review.
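The three failure modes named above could be turned into a per-block check along the following lines. This is a minimal sketch, not Letta's API: `MemoryBlock`, its fields, `audit_block`, and the 30-day age threshold are all hypothetical names and values chosen for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional


@dataclass
class MemoryBlock:
    """Hypothetical stand-in for one memory block in an assembled context."""
    label: str
    value: str
    updated_at: datetime
    source_event_id: Optional[str] = None  # link back to the originating event
    authority: Optional[str] = None        # who or what may write this block


def audit_block(block: MemoryBlock, now: datetime, max_age: timedelta) -> list[str]:
    """Return the failure modes that a single block exhibits."""
    findings = []
    if now - block.updated_at > max_age:
        findings.append("stale state")
    if block.authority is None:
        findings.append("unclear memory authority")
    if block.source_event_id is None:
        findings.append("no audit trail to source event")
    return findings


# Illustrative data only; none of these values come from a real agent.
now = datetime(2025, 1, 31, tzinfo=timezone.utc)
blocks = [
    MemoryBlock("persona", "example persona text", now - timedelta(days=1),
                source_event_id="evt-001", authority="operator"),
    MemoryBlock("user_prefs", "example preference", now - timedelta(days=90)),
]
report = {b.label: audit_block(b, now, max_age=timedelta(days=30)) for b in blocks}
```

A hands-on review would populate such records from the tool's real state rather than from hand-written fixtures, and would record where the tool cannot supply a field at all.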
Review packet
What a complete review must contain.
This page exposes the intended review structure. The current artifact is a profile, not a completed evidence-backed review.
Canonical source
Strengths
Agent-state experiments where memory must be visible and editable rather than implicit in prompts.
Limitations
Still needs external discipline for provenance, deletion, lifecycle and cross-user governance.
Dimension assessment
Scope, volatility, authority, lifecycle, resource economics, interoperability, and evidence quality must each get a rationale and citations before final scoring.
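One way to enforce that rule is to refuse a final score while any dimension lacks a rationale or a citation. The sketch below assumes a hypothetical `DimensionAssessment` record and a helper `missing_for_final_score`; the citation string is a placeholder, not a real source.

```python
from dataclasses import dataclass, field

# The seven dimensions listed in the profile.
DIMENSIONS = (
    "scope", "volatility", "authority", "lifecycle",
    "resource economics", "interoperability", "evidence quality",
)


@dataclass
class DimensionAssessment:
    """Hypothetical record for one assessed dimension."""
    dimension: str
    rationale: str = ""
    citations: list[str] = field(default_factory=list)


def missing_for_final_score(assessments: list[DimensionAssessment]) -> list[str]:
    """List dimensions that still lack a rationale or at least one citation."""
    by_name = {a.dimension: a for a in assessments}
    gaps = []
    for name in DIMENSIONS:
        a = by_name.get(name)
        if a is None or not a.rationale.strip() or not a.citations:
            gaps.append(name)
    return gaps


# Illustrative draft with only one dimension filled in.
draft = [DimensionAssessment("scope", "example rationale", ["placeholder-citation"])]
gaps = missing_for_final_score(draft)
ready_for_final_scoring = not gaps
```

Under this sketch the draft above is not ready for final scoring, because six of the seven dimensions remain open.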
Open questions
- What can be verified from docs, code, issues, benchmarks, and changelogs?
- Where does the tool fail under stale, contradictory, private, or high-cost knowledge?
- Which claims are vendor claims versus independently observed behavior?
Benchmark critique
No benchmark number is accepted as architectural evidence unless it states which layer it tests and which of the following it leaves out: lifecycle, scope boundaries, authority, context cost, and governance.
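As a rough illustration of that acceptance rule, a benchmark claim can be modeled as a record that must name its layer and take an explicit position on each of the five concerns. `BenchmarkClaim`, `acceptable_as_evidence`, and `stated_misses` are hypothetical names, and requiring an explicit true/false per concern is one interpretation of the rule, not a fixed method.

```python
from dataclasses import dataclass, field
from typing import Optional

CONCERNS = ("lifecycle", "scope boundaries", "authority", "context cost", "governance")


@dataclass
class BenchmarkClaim:
    """Hypothetical record for one reported benchmark number."""
    name: str
    layer_tested: Optional[str] = None
    # Per concern: True if the benchmark exercises it, False if it misses it.
    coverage: dict[str, bool] = field(default_factory=dict)


def acceptable_as_evidence(claim: BenchmarkClaim) -> bool:
    """Accept only claims that name a layer and address every concern explicitly."""
    if not claim.layer_tested:
        return False
    return all(concern in claim.coverage for concern in CONCERNS)


def stated_misses(claim: BenchmarkClaim) -> list[str]:
    """Concerns the claim explicitly says it does not test."""
    return [c for c in CONCERNS if claim.coverage.get(c) is False]


# Illustrative claims; the names and coverage values are invented.
bare = BenchmarkClaim("example headline number")
scoped = BenchmarkClaim(
    "example headline number",
    layer_tested="context assembly",
    coverage={c: (c == "context cost") for c in CONCERNS},
)
```

Here `bare` would be rejected because it names no layer, while `scoped` would be accepted together with its list of stated misses.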
Related systems
Related tools should be connected by evidence-backed edges: competes with, integrates with, implements concept, evaluated by, or has governance gap.
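The five edge kinds can be sketched as a small typed graph in which an edge without evidence is simply not published. `EdgeType`, `Edge`, and `evidence_backed` are hypothetical names, and the node and evidence identifiers are placeholders rather than claims about any real tool.

```python
from dataclasses import dataclass
from enum import Enum


class EdgeType(Enum):
    COMPETES_WITH = "competes with"
    INTEGRATES_WITH = "integrates with"
    IMPLEMENTS_CONCEPT = "implements concept"
    EVALUATED_BY = "evaluated by"
    HAS_GOVERNANCE_GAP = "has governance gap"


@dataclass(frozen=True)
class Edge:
    """Hypothetical relationship between two reviewed systems or concepts."""
    source: str
    target: str
    kind: EdgeType
    evidence: tuple[str, ...] = ()  # identifiers of supporting sources


def evidence_backed(edges: list[Edge]) -> list[Edge]:
    """Keep only edges that cite at least one piece of evidence."""
    return [e for e in edges if e.evidence]


# Placeholder graph; no relationship here is asserted as fact.
candidate_edges = [
    Edge("tool-a", "tool-b", EdgeType.COMPETES_WITH, ("placeholder-source-id",)),
    Edge("tool-a", "concept-x", EdgeType.IMPLEMENTS_CONCEPT),
]
published = evidence_backed(candidate_edges)
```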
Update history
Provisional profile created. Stale-review detection, source snapshots, and changelog watching are required before this becomes a durable review.
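Stale-review detection could start from something as simple as comparing a digest of the source snapshot taken at review time with a digest of the source as it stands now. The sketch below assumes that approach; `snapshot_digest` and `is_review_stale` are hypothetical helpers and the sample strings are invented.

```python
import hashlib


def snapshot_digest(text: str) -> str:
    """Digest of a source snapshot, recorded alongside the review."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def is_review_stale(recorded_digest: str, current_text: str) -> bool:
    """A review is treated as stale once its source no longer matches the snapshot."""
    return snapshot_digest(current_text) != recorded_digest


# Invented sample text standing in for a fetched docs page or changelog.
recorded = snapshot_digest("source text, revision A")
unchanged = is_review_stale(recorded, "source text, revision A")
changed = is_review_stale(recorded, "source text, revision B")
```

A digest only signals that something changed, not whether the change affects a claim, so a flagged review would still need a human to re-check the cited passages.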