The problem
The interesting question with coding agents isn’t whether one Claude call can write a function. It’s whether a sequence of calls can maintain project state, recover from failures, and decide when to stop. Most agent frameworks either over-promise (one prompt, full app) or under-deliver (a chat that occasionally writes code).
What we’re building
Auto-Claude is a multi-agent runtime built directly on the Anthropic Claude SDK. A planner decomposes the goal into steps; specialist agents (research, code, test, review) take ownership of step types; a controller reconciles state and decides what runs next. The desktop app (Electron + React) is the operator’s console; the FastAPI backend is the engine.
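The planner/controller split can be sketched as a plan of typed steps with dependencies, where the controller picks whatever is runnable next. This is a minimal illustration, not Auto-Claude's actual internals; the `Step` and `Plan` names and fields are hypothetical:

```python
from dataclasses import dataclass, field
from enum import Enum, auto


class StepStatus(Enum):
    PENDING = auto()
    RUNNING = auto()
    DONE = auto()
    FAILED = auto()


@dataclass
class Step:
    """One unit of work owned by a specialist agent (hypothetical shape)."""
    id: str
    kind: str                      # "research" | "code" | "test" | "review"
    goal: str
    depends_on: list[str] = field(default_factory=list)
    status: StepStatus = StepStatus.PENDING


@dataclass
class Plan:
    steps: dict[str, Step]

    def runnable(self) -> list[Step]:
        """Steps whose dependencies have all completed -- what the controller may dispatch."""
        return [
            s for s in self.steps.values()
            if s.status is StepStatus.PENDING
            and all(self.steps[d].status is StepStatus.DONE for d in s.depends_on)
        ]
```

With a plan like `{"code": ..., "test": depends_on=["code"]}`, `runnable()` yields only the code step until it is marked done, which is the whole reconciliation loop in miniature.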
The AI angle
The honest engineering is in the failure handling. Every agent run either succeeds, raises a structured failure with context, or asks the human a single specific question. The controller routes those failures back into the plan instead of silently retrying. This is the difference between a cute demo and a system you trust to run overnight.
How it’ll be used
- Solo developers who want a project-aware agent for evening tickets.
- Studios like ours using it to compress repetitive build work.
- Researchers studying multi-agent failure modes in production conditions.
Where we are
Single-goal end-to-end runs are working on real codebases. Multi-day, multi-goal queues are in soak testing. Public access is gated on cost transparency: users need to know mid-run how much the agent has spent.
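Mid-run cost transparency reduces to metering token usage per call and converting at the model's price. A minimal sketch; the per-token prices below are illustrative placeholders, not Anthropic's actual pricing, and `CostMeter` is a hypothetical name:

```python
from dataclasses import dataclass

# Illustrative USD prices per million tokens -- real prices depend on the model.
PRICE_PER_MTOK = {"input": 3.00, "output": 15.00}


@dataclass
class CostMeter:
    """Accumulates spend across agent calls so the operator console can show it mid-run."""
    input_tokens: int = 0
    output_tokens: int = 0

    def record(self, input_tokens: int, output_tokens: int) -> None:
        self.input_tokens += input_tokens
        self.output_tokens += output_tokens

    @property
    def usd(self) -> float:
        return (
            self.input_tokens / 1e6 * PRICE_PER_MTOK["input"]
            + self.output_tokens / 1e6 * PRICE_PER_MTOK["output"]
        )
```

Calling `record()` after every model response keeps the running total current, so the UI can surface it (or a budget cap can halt the queue) without waiting for the run to finish.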