How it works

Fulcrum is a Python TUI that hosts an agent loop. The agent reads a system prompt, your prompt, and the conversation history, picks tools, runs them locally, and streams the result back as a chat message. Models live on scx.ai (sovereign Australian infrastructure). Tools — file I/O, shell, search, MCP — live on your machine. The split is the whole product: code never leaves your laptop, inference never leaves Australia.

The agent loop

One turn is a streamed call to /v1/chat/completions. If the response contains tool calls, Fulcrum runs each one locally, appends the result to the message history, and calls the model again. Repeat until the model returns plain text without tool calls — that becomes the assistant's reply.

What runs where

The boundary is hard. Inference is the only thing that leaves your machine, and only because GPUs cost more than laptops do. Everything else — including the entire tool surface — runs in the same process as the TUI.

Tools include Read, Write, Edit, Glob, Grep, Bash, WebFetch, WebSearch, NotebookEdit, Memory, MCP dispatchers, and the rest. The agent loop dispatches each tool call to its implementation; the model never sees a syscall, only a JSON result. See tools reference for the full toolbelt.

Anatomy of a turn

Sequence diagram. Mint moves are the assistant; amber moves are the tool runner. The TUI is just a humble messenger between them.

A turn that needs no tools collapses to t0 → t1 → done. A multi-step refactor can ping-pong between scx.ai and the tool runner a dozen times in one user turn — each round-trip is the same shape.

Streaming and interruption

Every assistant turn streams. Token deltas arrive as SSE events; the TUI paints them straight into the transcript. Tool calls render as collapsible blocks the moment the arguments finish parsing, so you see what the agent is about to do before it does it.

To kill an in-flight turn: /cancel or Ctrl+C. Cancelling kills the agent worker, stops the spinner, drops any partial assistant message, and re-focuses the prompt input. Tool calls in flight at the moment of cancellation are abandoned — their results never make it into the transcript.

Where state lives

Everything Fulcrum persists is under ~/.fulcrum/, plus per-project memory files in your repo root. No database, no cloud sync, no hidden home directory — back it up with tar.

Thing	Path
Config	`~/.fulcrum/config.json`
Memory	`~/.fulcrum/memory/`
Skills	`~/.fulcrum/skills/ (custom) + bundled with the binary`
Plugins	`~/.fulcrum/plugins/`
Hooks	`~/.fulcrum/hooks.json`
Logs	`~/.fulcrum/logs/fulcrum.log`
MCP servers	`~/.fulcrum/mcp.json`
Sessions	`~/.fulcrum/sessions/`
Per-project memory	`<repo>/CLAUDE.md and <repo>/AGENTS.md`

The scx.ai key lives outside this tree on purpose: it's stored in the OS keychain (macOS Keychain, GNOME Keyring, Windows Credential Manager) via keyring, with an encrypted-file fallback at ~/.fulcrum/scx.enc when no keychain backend is available. See models · API key flow for the full handshake.

Next steps

Tools → The full toolbelt — what each tool does, when the agent picks it, what arguments it takes.
Memory & knowledge → How ~/.fulcrum/memory/, project-level CLAUDE.md, and skills get injected into the system prompt.
Voice & speak → /voice mic capture, /speak TTS, voice picker.
Models → The scx.ai catalogue and how Fulcrum routes between chat, vision, transcription, and TTS.