New: research-first features shipped — ExperimentGrid, EvalNode, and reflection/consensus/debate strategies. Explore the research toolkit →

The agent-native
runtime

Durable execution. Native MCP + A2A. Full observability.
Built in Rust. Authored in Python.

$ pip install jamjet

Production agents have
unsolved problems

Reliability

Agents crash and lose everything

Durable graph execution with event sourcing

Interop

Agents are locked in silos

Native MCP + A2A protocols

Observability

You can't see what's happening

OTel GenAI traces + checkpoint replay

Safety

Autonomous agents run forever

Compile-time autonomy constraints

Process

Human approval is bolted on

Human-in-the-loop as a native node type

Performance

Python orchestration doesn't scale

Rust async scheduler, microsecond overhead

Python you
already know

Decorate functions with @task and @tool. The runtime handles durability, cost limits, and telemetry.

agent.py
import asyncio

from jamjet import task, tool

@tool
async def web_search(query: str) -> str:
    """Search the web for current information."""
    ...

@task(model="claude-sonnet-4-6", tools=[web_search])
async def research(question: str) -> str:
    """Search first, then summarize clearly."""

result = asyncio.run(research("Latest AI agent trends?"))
print(result)

Everything production demands

Durable Graph Execution

Every step checkpointed with event sourcing. Crash the process, restart it — execution resumes exactly where it stopped.
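The core idea fits in a few lines of plain Python (an illustrative sketch, not jamjet's internals): each completed step appends an event to a log, and a restart replays the log instead of redoing the work.

```python
def run_workflow(steps, event_log):
    """Run steps in order, skipping any step already recorded in event_log."""
    results = {e["step"]: e["result"] for e in event_log}
    for name, fn in steps:
        if name in results:
            continue  # already checkpointed: replay the result, don't rerun
        results[name] = fn()
        event_log.append({"step": name, "result": results[name]})
    return results

log = []
steps = [("plan", lambda: "outline"), ("research", lambda: "notes")]
run_workflow(steps[:1], log)        # simulate a crash after the first step
resumed = run_workflow(steps, log)  # "restart": plan is replayed, research runs
```

In the real runtime the event log lives in durable storage, so the replay survives a full process restart.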

Native MCP + A2A

Connect to any tool server. Expose your tools. Delegate to external agents. Both protocols built in, not bolted on.
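MCP frames tool calls as JSON-RPC 2.0 requests. A rough sketch of what travels over the wire when a runtime calls a tool server — the message shape follows the MCP spec, but the helper function here is hypothetical, not a jamjet API:

```python
import json

def mcp_tool_call(tool_name, arguments, request_id=1):
    """Build an MCP tools/call request using JSON-RPC 2.0 framing."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }

msg = json.dumps(mcp_tool_call("web_search", {"query": "agent runtimes"}))
```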

Full Observability

OpenTelemetry GenAI traces. Per-node cost attribution. Checkpoint replay. Know exactly what happened and why.
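Per-node cost attribution boils down to grouping trace spans by node. A minimal sketch with plain dicts — the span fields are assumptions for illustration, not jamjet's trace schema:

```python
from collections import defaultdict

def attribute_costs(spans):
    """Sum token and dollar spend per workflow node from trace spans."""
    totals = defaultdict(lambda: {"tokens": 0, "usd": 0.0})
    for span in spans:
        node = totals[span["node"]]
        node["tokens"] += span["tokens"]
        node["usd"] += span["usd"]
    return dict(totals)

spans = [
    {"node": "research", "tokens": 1200, "usd": 0.006},
    {"node": "research", "tokens": 800, "usd": 0.004},
    {"node": "review", "tokens": 300, "usd": 0.0015},
]
report = attribute_costs(spans)
```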

Human-in-the-Loop

Native workflow node for approvals. Durable suspension — the process sleeps until a human responds, even across restarts.
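Durable suspension means the approval node is just persisted state plus a re-entrant step function. A toy sketch of the pattern, not the actual node implementation:

```python
def approval_step(state):
    """One tick of an approval node: suspend until a decision exists."""
    if state.get("decision") is None:
        state["status"] = "suspended"  # state is persisted; the process may exit
        return state
    state["status"] = "approved" if state["decision"] else "rejected"
    return state

state = {"decision": None}
state = approval_step(state)   # suspends; in the runtime this survives restarts
state["decision"] = True       # a human responds, possibly days later
state = approval_step(state)
```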

Autonomy Limits

Token budgets, cost caps, iteration limits. Enforced by the runtime, not by convention. Agents can't run away.
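Runtime enforcement means every spend passes through a check that can refuse. A plain-Python sketch of a token budget (the class and method names are hypothetical):

```python
class BudgetExceeded(RuntimeError):
    pass

class TokenBudget:
    """A hard token cap: charges are checked by the runtime, not advisory."""
    def __init__(self, limit):
        self.limit = limit
        self.spent = 0

    def charge(self, tokens):
        if self.spent + tokens > self.limit:
            raise BudgetExceeded(f"{self.spent + tokens} tokens > cap of {self.limit}")
        self.spent += tokens

budget = TokenBudget(limit=1000)
budget.charge(600)          # fine: under the cap
try:
    budget.charge(600)      # would exceed the cap, so it is refused
    refused = False
except BudgetExceeded:
    refused = True
```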

Native Eval Harness

LLM-as-judge, assertion, latency, and cost scorers. Run evals as workflow nodes. CI exit codes on regression.
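An assertion scorer and a CI gate fit in a few lines. This sketch shows the shape of the idea with hypothetical helper names, not the harness API:

```python
def assertion_scorer(output, must_contain):
    """Score 1.0 if the expected substring appears in the output, else 0.0."""
    return 1.0 if must_contain in output else 0.0

def ci_exit_code(scores, threshold=0.8):
    """Return a nonzero exit code when the mean score regresses below threshold."""
    mean = sum(scores) / len(scores)
    return 0 if mean >= threshold else 1

outputs = ["agent runtimes are durable", "agents everywhere", "no relevant text"]
scores = [assertion_scorer(o, "agent") for o in outputs]
code = ci_exit_code(scores, threshold=0.8)  # only 2/3 pass, so the build fails
```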

Typed Schemas

Pydantic and JSON Schema validation at every step boundary. No loose dicts. Catch errors before they propagate.
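The boundary check rejects malformed payloads before the next step ever sees them. A stdlib-only sketch of the idea — a dataclass stands in here for the Pydantic model a real step would use:

```python
from dataclasses import dataclass

@dataclass
class ResearchOutput:
    summary: str
    sources: list

def validate_boundary(payload):
    """Reject loose dicts at the step boundary instead of passing them along."""
    if not isinstance(payload.get("summary"), str):
        raise TypeError("summary must be a str")
    if not isinstance(payload.get("sources"), list):
        raise TypeError("sources must be a list")
    return ResearchOutput(summary=payload["summary"], sources=payload["sources"])

out = validate_boundary({"summary": "ok", "sources": ["paper-a"]})
try:
    validate_boundary({"summary": 42, "sources": []})  # wrong type, rejected
    rejected = False
except TypeError:
    rejected = True
```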

Rust Core

Async scheduler built in Rust. Microsecond dispatch overhead. Python is for authoring, not for orchestration.

Multi-Model

Any OpenAI-compatible API. Anthropic, Ollama, Groq, Azure — swap models per task with a single config change.
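Per-task model swapping reduces to a lookup keyed on provider. A sketch of the idea — the provider table and config keys are illustrative, not jamjet's config format:

```python
PROVIDERS = {
    "anthropic": {"base_url": "https://api.anthropic.com/v1", "model": "claude-sonnet-4-6"},
    "ollama": {"base_url": "http://localhost:11434/v1", "model": "llama3"},
}

def resolve_model(task_config, providers=PROVIDERS):
    """Resolve endpoint and model from a single provider key in task config."""
    provider = providers[task_config["provider"]]
    return provider["base_url"], task_config.get("model", provider["model"])

url, model = resolve_model({"provider": "ollama"})  # one key changed, new backend
```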

Your next paper,
reproducible by default

ExperimentGrid

Cartesian product over models, prompts, and temperatures. Parallel execution. Results export to CSV, LaTeX, and JSON with one call.
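The grid itself is just a cartesian product over the axes. A stdlib sketch of the expansion and CSV export (the function name is hypothetical, not the ExperimentGrid API):

```python
import csv
import io
from itertools import product

def expand_grid(models, prompts, temperatures):
    """One run config per point in the cartesian product of the axes."""
    return [
        {"model": m, "prompt": p, "temperature": t}
        for m, p, t in product(models, prompts, temperatures)
    ]

runs = expand_grid(["claude-sonnet-4-6", "llama3"], ["v1", "v2"], [0.0, 0.7])

# Export the run table as CSV, one row per configuration.
buf = io.StringIO()
writer = csv.DictWriter(buf, fieldnames=["model", "prompt", "temperature"])
writer.writeheader()
writer.writerows(runs)
csv_text = buf.getvalue()
```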

Built-in Strategies

ReAct, plan-and-execute, critic, reflection, consensus, and debate — compiled to IR sub-DAGs. Compare strategies across the same dataset.
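As one example, a reflection strategy is a draft-critique-revise loop. A toy sketch of the control flow with stub draft and critic functions (nothing jamjet-specific):

```python
def reflect(draft_fn, critic_fn, max_rounds=3):
    """Reflection loop: draft, critique, revise until the critic is satisfied."""
    draft = draft_fn(None)
    for _ in range(max_rounds):
        feedback = critic_fn(draft)
        if feedback is None:      # critic has no complaints: stop revising
            return draft
        draft = draft_fn(feedback)
    return draft

drafts = iter(["rough answer", "polished answer"])
critic = lambda d: None if "polished" in d else "tighten the wording"
out = reflect(lambda feedback: next(drafts), critic)
```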

Durable Replay

Every decision checkpointed. Replay any execution. Fork with modified inputs. Perfect for ablation studies and failure analysis.

Explore the research toolkit

Start building

Get a durable workflow running locally in under 10 minutes.

Read the quickstart View on GitHub