entorin

Overview

What you get from Entorin, mapped to the harness pains every agent stack reinvents.

Entorin is a substrate, not a platform. Whatever you already use to drive agents — Claude Agent SDK, Codex CLI / Codex SDK, a hand-written while loop, or, eventually, LangGraph or CrewAI — keeps its shape. Entorin slides underneath and supplies the harness layer: trace, budget, sandbox, audit, capability flow.
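To make "keeps its shape" concrete, here is a sketch of the bare-loop pattern: a plain while loop driving a model, with a hand-rolled stand-in harness that only records and meters. The `Harness` class and its method names are illustrative stand-ins, not Entorin's actual API; the point is the shape a substrate wraps, not the substrate itself.

```python
from dataclasses import dataclass, field

# Stand-in harness: NOT Entorin's real API. It only illustrates the
# shape of a bare agent loop that a harness layer can sit under
# without asking you to subclass anything or build a DAG.
@dataclass
class Harness:
    budget: float                       # remaining spend cap, in dollars
    transcript: list = field(default_factory=list)

    def record(self, kind: str, payload: str, cost: float = 0.0) -> None:
        self.transcript.append((kind, payload))
        self.budget -= cost

def run_agent(harness: Harness, prompt: str, model) -> str:
    """A bare while loop: call the model, loop on output, stop on answer."""
    harness.record("run_start", prompt)
    message = prompt
    while harness.budget > 0:
        reply, cost = model(message)
        harness.record("llm_call", reply, cost)
        if reply.startswith("ANSWER:"):
            harness.record("run_end", reply)
            return reply.removeprefix("ANSWER:").strip()
        message = reply   # feed tool output / thought back in
    raise RuntimeError("budget exhausted")
```

Whatever drives the loop, the harness only observes and meters; it never dictates control flow.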

What you get, by pain point

| Pain | What the integration gives you |
| --- | --- |
| P0 — observability | One OTel trace per run. Every LLM call, tool call, agent invocation, sandbox exec, and checkpoint round-trip is a span carrying `entorin.run_id`, `entorin.principal_id`, tokens, and cost. |
| P1 — frameworks over-abstracted | The bare-loop reference shows that Entorin itself never asks you to subclass anything or build a DAG. A 50-line Python while loop inherits the full harness. |
| P7 — testing / evals weak | Saved traces are the regression substrate. `entorin.replay` ships a `TraceRecorder` and a small set of invariant checks (`assert_calls_paired`, `assert_run_lifecycle`, `assert_budget_within_cap`). |
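The invariant checks named in the P7 row are assertions over a recorded trace. A sketch of what `assert_calls_paired`-style and `assert_run_lifecycle`-style checks might look like, assuming a simplified trace shape of `(span_name, run_id)` tuples in emit order (the real `entorin.replay` operates on OTel spans and its signatures may differ):

```python
# Simplified trace model: each span is a (name, run_id) tuple in emit
# order. This only shows the style of invariant such checks enforce.
def assert_calls_paired(spans: list[tuple[str, str]]) -> None:
    """Every *_start span must be closed by a matching *_end span."""
    open_calls: list[str] = []
    for name, _run in spans:
        if name.endswith("_start"):
            open_calls.append(name.removesuffix("_start"))
        elif name.endswith("_end"):
            expected = name.removesuffix("_end")
            assert open_calls and open_calls[-1] == expected, f"unpaired {name}"
            open_calls.pop()
    assert not open_calls, f"never closed: {open_calls}"

def assert_run_lifecycle(spans: list[tuple[str, str]]) -> None:
    """A run's first span is run_start and its last is run_end."""
    names = [name for name, _run in spans]
    assert names and names[0] == "run_start" and names[-1] == "run_end"
```

Running these over every saved trace in CI turns a pile of recordings into a cheap regression suite.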

Install

```bash
uv add entorin

# Optional extras:
uv add 'entorin[mcp]'    # MCP transport for the tool wrapper
uv add 'entorin[http]'   # FastAPI-backed HITL checkpoint transport
```

What Entorin does not do

Repeated for emphasis — the substrate philosophy is scope discipline:

  • No DAG / workflow builder. That’s LangGraph / CrewAI / etc.
  • No prompt templating. That’s your code.
  • No vector DB / retrieval. Retrieval ships as a protocol; bring your own backend.
  • No deployment infra. No Docker / K8s / queues / load balancers.
  • No eval suites. Traces are the regression substrate; you bring the assertions.
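The "retrieval ships as a protocol" bullet implies a structural interface you implement against your own store. A sketch of that shape using `typing.Protocol` — the protocol name, method signature, and helper below are assumptions for illustration, not Entorin's actual interface:

```python
from typing import Protocol

class Retriever(Protocol):
    """Assumed shape of a retrieval backend: the harness calls this
    interface; the vector store behind it is entirely yours."""
    def retrieve(self, query: str, k: int = 5) -> list[str]: ...

# Bring-your-own backend: a trivial in-memory substring matcher that
# satisfies the protocol structurally, with no subclassing.
class InMemoryRetriever:
    def __init__(self, docs: list[str]) -> None:
        self.docs = docs

    def retrieve(self, query: str, k: int = 5) -> list[str]:
        hits = [doc for doc in self.docs if query.lower() in doc.lower()]
        return hits[:k]

def ground(retriever: Retriever, query: str) -> str:
    """Hypothetical harness-side helper that depends only on the protocol."""
    return "\n".join(retriever.retrieve(query, k=3))
```

Because `Protocol` matching is structural, swapping the in-memory matcher for a real vector store is just another class with the same `retrieve` method.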

If you find yourself wanting one of these from Entorin, that is a sign the wrong tool is on your shortlist.