# claude-code-api
[![AI Slop Inside](https://sladge.net/badge.svg)](https://sladge.net)

Python wrapper around the `claude` CLI for subscription-mode (no API key)
backends. It drives one long-running interactive `claude` process per
conversation through a PTY and reads events from the JSONL session file.
The public surface is shaped like the Anthropic Messages API, so a gateway
in front of it needs little more than a one-line serializer.

Not affiliated with Anthropic. You need a working subscription, the
`claude` CLI on PATH, and to have run `claude /login` once.

## Install

As a library inside another project:

```bash
uv add "claude-code-api @ git+https://git.kotikot.com/beaver/claude-code-api"
```

The runtime needs only `ptyprocess`.

## Use

```python
import asyncio
from claude_code_api import BackendOptions, ClaudeCodeBackend

async def main() -> None:
    opts = BackendOptions(cwd="/path/to/project", dangerously_skip_permissions=True)
    async with ClaudeCodeBackend(opts) as backend:
        async for event in backend.complete(
            [{"role": "user", "content": "say hi"}]
        ):
            print(event)

asyncio.run(main())
```

Multi-turn works by construction: append the assistant reply and a fresh
user message to the same `messages` list and call `complete()` again. The
backend fingerprints `messages[:-1]`, finds the live PTY from the previous
turn, and reuses it, which keeps the server-side prompt cache warm:

```python
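# Runs inside the `async with ClaudeCodeBackend(opts) as backend:` block above.
history = [{"role": "user", "content": "remember Beaver"}]
async for ev in backend.complete(history):
    print(ev)

# Append the assistant reply (hard-coded here for brevity) and the next
# user message, then call complete() again with the full list.
history += [
    {"role": "assistant", "content": [{"type": "text", "text": "OK"}]},
    {"role": "user", "content": "what was the codeword?"},
]
async for ev in backend.complete(history):
    print(ev)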
```
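
To make the prefix lookup concrete, here is a minimal sketch of how a history fingerprint could map to a live session. It is not the library's implementation: `fingerprint_history`, the `live_sessions` registry, and the `"pty-1"` handle are hypothetical names, and SHA-256 over canonical JSON is just one reasonable choice of hash.

```python
import hashlib
import json


def fingerprint_history(messages: list[dict]) -> str:
    """Hash a message list into a stable hex digest (hypothetical helper)."""
    # sort_keys and fixed separators make the JSON encoding deterministic,
    # so equal histories always map to the same digest.
    canonical = json.dumps(messages, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Hypothetical registry: fingerprint of the history so far -> session handle.
live_sessions: dict[str, str] = {}

turn_one = [{"role": "user", "content": "remember Beaver"}]
# Once turn one completes, the session is filed under the history it now
# holds: the first user message plus the assistant reply.
after_turn_one = turn_one + [
    {"role": "assistant", "content": [{"type": "text", "text": "OK"}]},
]
live_sessions[fingerprint_history(after_turn_one)] = "pty-1"

turn_two = after_turn_one + [
    {"role": "user", "content": "what was the codeword?"},
]
# The lookup key is everything except the new user message.
session = live_sessions.get(fingerprint_history(turn_two[:-1]))
print(session)  # pty-1
```

In this sketch the match is exact, so a caller that rewrites the assistant reply before sending it back would miss the live session and fall through to a fresh spawn or a resume.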

## Public surface

Events (Anthropic-style, vendored to keep the dependency tree minimal):
`AssistantMessage`, `UserMessage`, `SystemMessage`, `ResultMessage`,
`TextBlock`, `ThinkingBlock`, `ToolUseBlock`, `ToolResultBlock`.
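
As an illustration of consuming these events, the sketch below pulls plain text out of a stream of assistant messages. The classes are local stand-ins so the snippet runs on its own, and their field names (`content`, `text`, `name`, `input`) are assumptions based on the Anthropic-style shape rather than confirmed attributes of this package.

```python
from dataclasses import dataclass, field


# Stand-ins mirroring the documented names; field names are assumptions.
@dataclass
class TextBlock:
    text: str


@dataclass
class ToolUseBlock:
    name: str
    input: dict = field(default_factory=dict)


@dataclass
class AssistantMessage:
    content: list


def collect_text(events) -> str:
    """Concatenate the text of every TextBlock found in assistant messages."""
    parts = []
    for event in events:
        if isinstance(event, AssistantMessage):
            # Skip tool-use and other non-text blocks.
            parts.extend(b.text for b in event.content if isinstance(b, TextBlock))
    return "".join(parts)


events = [
    AssistantMessage([TextBlock("Hel"), ToolUseBlock("echo")]),
    AssistantMessage([TextBlock("lo")]),
]
print(collect_text(events))  # Hello
```

With the real package, the same loop would presumably import the classes from `claude_code_api` and iterate over `backend.complete(...)` with `async for`.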

Errors: `BackendError` (root), `AuthError`, `ProcessError`,
`CLINotFoundError`, `RateLimitError`, `SessionError`, `MessageParseError`.
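
Because every error derives from `BackendError`, callers can handle one specific failure and let the rest propagate. The sketch below shows one possible policy: retry on `RateLimitError` with exponential backoff. The exception classes are local stand-ins so the snippet is self-contained, and `run_with_retry`, `make_turn`, and `flaky_turn` are hypothetical names; in real code `make_turn` would wrap a `complete()` loop.

```python
import asyncio


# Stand-ins that mirror the documented names so the sketch runs on its own;
# real code would import these from claude_code_api instead.
class BackendError(Exception): ...
class RateLimitError(BackendError): ...
class AuthError(BackendError): ...


async def run_with_retry(make_turn, attempts: int = 3, base_delay: float = 0.01):
    """Retry a turn on RateLimitError; let every other error propagate."""
    for attempt in range(attempts):
        try:
            return await make_turn()
        except RateLimitError:
            if attempt == attempts - 1:
                raise  # out of attempts, surface the error to the caller
            # Exponential backoff: base_delay, 2 * base_delay, 4 * base_delay, ...
            await asyncio.sleep(base_delay * 2**attempt)


calls = {"n": 0}


async def flaky_turn():
    """Fails twice with a rate limit, then succeeds (demo only)."""
    calls["n"] += 1
    if calls["n"] < 3:
        raise RateLimitError("slow down")
    return "hi"


result = asyncio.run(run_with_retry(flaky_turn))
print(result, calls["n"])  # hi 3
```

An `AuthError` is deliberately not retried here, since re-running the turn would not fix a missing login.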

Backend: `ClaudeCodeBackend(opts).complete(messages)` is an async
generator of events. `BackendOptions` exposes the model, system prompt,
allowed tools, `mcp_servers`, permission mode, and history injection mode.

Lower layers (`PtyClaudeProcess`, `JsonlWatcher`, `TurnManager`,
`normalize`) are re-exported for callers that want to assemble their own
session orchestration.

## How a turn works

1. The backend looks up a live session by `hash_history(messages[:-1])`.
   If one matches, the new user message goes straight into its PTY.
2. If nothing matches and `messages[:-1]` is empty, a fresh `claude` is
   spawned with a brand-new `--session-id`.
3. If `messages[:-1]` is non-empty (a continuation we don't have a live
   PTY for — e.g. after restart), the backend writes a hand-crafted
   JSONL transcript at `~/.claude/projects/<key>/<id>.jsonl` and spawns
   `claude --resume <id>`. That is the `native_jsonl` injection mode;
   the fallback is `concat_message`, which folds the prior history into
   one large first prompt.
4. The PTY's stdout is drained continuously by a background thread, but
   events are never read from it. Instead, the JSONL file is tailed at a
   100 ms cadence and each new record is normalized into a typed `Event`.
5. The turn closes on the first `assistant` record with `stop_reason ∈
   {end_turn, max_tokens, stop_sequence, refusal}`. A `ResultMessage`
   is synthesized from its `usage` and yielded last.
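
Steps 4 and 5 can be sketched as a polling loop over the session file. This is a simplified, synchronous illustration rather than the package's `JsonlWatcher`: `tail_turn` and `is_terminal` are hypothetical names, and the record layout (a top-level `type` with `stop_reason` nested under `message`) is an assumption about the JSONL format.

```python
import json
import os
import tempfile
import time

TERMINAL_STOP_REASONS = {"end_turn", "max_tokens", "stop_sequence", "refusal"}


def is_terminal(record: dict) -> bool:
    """True for an assistant record whose stop_reason closes the turn."""
    # Assumed layout: {"type": "assistant", "message": {"stop_reason": ...}}
    message = record.get("message") or {}
    return (
        record.get("type") == "assistant"
        and message.get("stop_reason") in TERMINAL_STOP_REASONS
    )


def tail_turn(path: str, poll_interval: float = 0.1, timeout: float = 5.0):
    """Yield JSONL records from path until a terminal assistant record."""
    deadline = time.monotonic() + timeout
    offset = 0
    buffer = b""
    while time.monotonic() < deadline:
        with open(path, "rb") as fh:
            fh.seek(offset)
            chunk = fh.read()
        offset += len(chunk)
        buffer += chunk
        # Only parse complete lines; a partial trailing line stays buffered.
        *lines, buffer = buffer.split(b"\n")
        for line in lines:
            if not line.strip():
                continue
            record = json.loads(line)
            yield record
            if is_terminal(record):
                return
        time.sleep(poll_interval)
    raise TimeoutError(f"no terminal record in {path} within {timeout}s")


# Demo: a fake session file with one complete turn and the start of another.
records = [
    {"type": "user", "message": {"role": "user", "content": "say hi"}},
    {"type": "assistant", "message": {"stop_reason": "tool_use"}},
    {"type": "assistant", "message": {"stop_reason": "end_turn"}},
    {"type": "user", "message": {"role": "user", "content": "next turn"}},
]
with tempfile.NamedTemporaryFile("w", suffix=".jsonl", delete=False) as tmp:
    tmp.write("".join(json.dumps(r) + "\n" for r in records))

seen = list(tail_turn(tmp.name, poll_interval=0.01))
os.unlink(tmp.name)
print(len(seen))  # 3
```

The `tool_use` record does not close the turn, so the loop keeps reading until `end_turn` and never yields the record that belongs to the next turn.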

## Examples

- `examples/basic_usage.py` — one turn, real `claude`.
- `examples/multi_turn.py` — two turns sharing one live PTY.
- `examples/mcp_tool.py` — wire up the bundled echo MCP server and let
  the model call it.

## Tests

```bash
uv run pytest                  # unit tests (fast, no real claude)
RUN_CLAUDE_SMOKE=1 uv run pytest tests/test_pty.py tests/test_turn.py tests/test_backend.py
```

The smoke-marked tests spawn a real `claude` process and need a logged-in
subscription on the host.