specs/v2/todo.md

The V2 launch checklist with owner initials: post-Hono cleanup (Kit), new data mode mostly done (Dax), the Effect-native agent loop with its reviewed next slices (Kit), compaction rework (Aiden?), plugin API design (James?), and unowned tracks for config, auth, model database and providers. EventV2 is implemented with a cursor API still to expose; a deferred-hardening list (tail wake lifecycle, replay paging, ripgrep timeouts) closes it out. Read this when picking up V2 work or checking whether a gap is known and owned.

TODO

OK, we need to work towards a launch of v2 so we can get out of this rebuild phase.

Post-Hono cleanup - Kit

The opencode server has moved to the Effect HttpApi backend. Remaining work is mostly cleanup: delete compatibility shims, shrink Zod surfaces, and simplify test harnesses that used to compare Hono and HttpApi behavior.

New Data Mode - Dax

This is mostly done. I'm working through modeling subagents, skill invocations and shell commands.

Rework agent loop - Kit?

The first Effect-native local runner slice is implemented without bridging through legacy SessionPrompt.loop(...):

process-global SessionExecution.resume(sessionID) discovers Location from the Session read model
cached Location-scoped SessionRunner resolves one supported catalog model and issues one explicit llm.stream(request) provider turn at a time
durable V2 projections record text, reasoning, provider failures, tool calls, tool results, and assistant output
a scoped ToolRegistry advertises definitions and the first permission-checked read built-in
local continuation reloads projected history and stops after 25 provider turns within one local drain activity
concurrent resumes for one Session join one process-local run while different Sessions remain concurrent
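
The last point (concurrent resumes joining one process-local run) is easy to get subtly wrong, so here is a minimal sketch of the idea. RunJoiner and drain are hypothetical names for illustration, not the real SessionExecution or SessionRunner API, and the sketch uses plain promises rather than Effect.

```typescript
// Hypothetical sketch: RunJoiner and drain are illustration names only.
type SessionID = string;

class RunJoiner {
  // One in-flight run per Session, process-local only.
  private readonly active = new Map<SessionID, Promise<void>>();
  private readonly drain: (sessionID: SessionID) => Promise<void>;

  constructor(drain: (sessionID: SessionID) => Promise<void>) {
    this.drain = drain;
  }

  resume(sessionID: SessionID): Promise<void> {
    const existing = this.active.get(sessionID);
    if (existing) return existing; // join the run already in flight

    const run = this.drain(sessionID).finally(() => {
      // Settled: the next resume for this Session starts a fresh run.
      this.active.delete(sessionID);
    });
    this.active.set(sessionID, run);
    return run;
  }
}
```

Two resumes for the same Session get the same promise back, while a resume for a different Session starts its own drain. Nothing here fences across processes; that is a separate concern.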

Prompt admission now uses a durable session_input inbox rather than immediate transcript projection. Steer inputs coalesce into the active activity at the next safe provider-turn boundary. Queue inputs form a FIFO of future activities that open one at a time. A location-scoped SessionRunCoordinator coalesces process-local wakeups around settlement races. Explicit run resumes perform at least one provider attempt; advisory wake notifications call the provider only for eligible inbox work.
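
A rough model of the steer versus queue admission rules, as a sketch only. SessionInbox and its method names are made up for illustration, in-memory arrays stand in for the durable session_input table, and the sketch does not model what happens to a steer that arrives while no activity is active.

```typescript
// Hypothetical sketch of inbox admission; names are not the real schema.
type InputKind = "steer" | "queue";

interface SessionInput {
  id: number;
  kind: InputKind;
  text: string;
}

class SessionInbox {
  private nextID = 1;
  private steers: SessionInput[] = [];
  private queued: SessionInput[] = [];

  // Admit an input into the inbox instead of projecting it into the transcript.
  admit(kind: InputKind, text: string): SessionInput {
    const input: SessionInput = { id: this.nextID++, kind, text };
    (kind === "steer" ? this.steers : this.queued).push(input);
    return input;
  }

  // Called at a safe provider-turn boundary: all pending steers
  // coalesce into the active activity at once.
  takeSteers(): SessionInput[] {
    const taken = this.steers;
    this.steers = [];
    return taken;
  }

  // Called only after the active activity settles: queued inputs
  // open future activities one at a time, oldest first.
  openNext(): SessionInput | undefined {
    return this.queued.shift();
  }

  // An advisory wake should only reach the provider when this is true.
  hasEligibleWork(): boolean {
    return this.steers.length > 0 || this.queued.length > 0;
  }
}
```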

Next reviewed slices:

preserve eager structured local-tool settlement: durably record each complete call, start its child execution immediately, await every settlement after the provider turn closes, then reload projected history once
revisit per-turn tool-call limits, output truncation, and operational backpressure before broadening exposure; eager local execution is deliberately unbounded in the current local slice while SQLite publication stays serialized
remove the public in-memory @opencode-ai/llm tool loop after replacing its remaining one-turn native-adapter use with a narrow typed dispatcher
batch streamed deltas and add covering context indexes
expose replayable Session event cursors over HTTP and the generated SDK where remote consumers need them
integrate the new BackgroundJob service with V2 tool execution: support background bash jobs and background agent dispatch with durable status observation, completion delivery, and explicit cancellation / continuation semantics
add compaction, interruption, retries, and stale-owner fencing only as their slices become concrete
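
The eager settlement ordering in the first slice above (record, start immediately, await all after the turn closes, reload once) can be sketched like this. EagerToolTurn and the TurnDeps callbacks are hypothetical stand-ins, and recordCall is synchronous here only to keep the sketch short; the real durable write would not be.

```typescript
// Hypothetical sketch of eager local-tool settlement; not the real runner API.
type ToolCall = { id: string; name: string; input: unknown };
type Settlement =
  | { id: string; status: "ok"; output: unknown }
  | { id: string; status: "error"; message: string };

interface TurnDeps {
  recordCall(call: ToolCall): void; // stand-in for the durable projection write
  execute(call: ToolCall): Promise<unknown>; // child execution
  recordSettlement(settlement: Settlement): void;
  reloadHistory(): void;
}

class EagerToolTurn {
  private readonly pending: Promise<Settlement>[] = [];
  private readonly deps: TurnDeps;

  constructor(deps: TurnDeps) {
    this.deps = deps;
  }

  // Called as soon as the provider stream yields a complete tool call.
  onCompleteCall(call: ToolCall): void {
    this.deps.recordCall(call); // 1. record the call first
    const settled: Promise<Settlement> = this.deps.execute(call).then(
      (output): Settlement => ({ id: call.id, status: "ok", output }),
      (err: unknown): Settlement => ({ id: call.id, status: "error", message: String(err) }),
    );
    this.pending.push(settled); // 2. execution is already running, not awaited here
  }

  // Called once the provider turn has closed.
  async closeTurn(): Promise<Settlement[]> {
    const settlements = await Promise.all(this.pending); // 3. await every settlement
    for (const s of settlements) this.deps.recordSettlement(s);
    this.deps.reloadHistory(); // 4. reload projected history exactly once
    return settlements;
  }
}
```

Note the sketch has no cap on in-flight calls, which matches the "deliberately unbounded" caveat in the second slice.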

Deferred durable activity recovery

Do not infer that ambiguous provider work is safe to retry from an advisory wake. The first inbox-driven runner intentionally omits outer provider-attempt markers until they have a concrete consumer and a complete recovery policy.

Design post-crash activity recovery as one explicit slice. It should model:

durable activity identity and settlement
queue-opener reservation and steer assignment
provider-attempt preparation versus provider-dispatch ambiguity
required post-tool continuation across process loss
explicit retry and abandon decisions for unknown outcomes
bounded automatic retry only where provider and tool idempotency make it safe
retry budget, backoff, visible recovery status, startup discovery, and future clustered ownership fencing
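
To make the prepared versus dispatched distinction concrete, here is one possible shape for the recovery decision. This is purely a sketch of a policy that has not been designed yet: the AttemptRecord fields, the budget of 3, and the exponential backoff are all assumptions, and the current runner does not write these markers at all.

```typescript
// Hypothetical sketch; none of these types exist in the runner today.
type AttemptState = "prepared" | "dispatched" | "settled";

interface AttemptRecord {
  state: AttemptState;
  idempotent: boolean; // provider and tools are both safe to repeat
  retries: number; // automatic retries already spent
}

type RecoveryDecision =
  | { kind: "none" }
  | { kind: "retry"; delayMs: number }
  | { kind: "needs-decision" }; // surface an explicit retry/abandon choice

function decideRecovery(
  attempt: AttemptRecord,
  budget = 3,
  baseDelayMs = 500,
): RecoveryDecision {
  if (attempt.state === "settled") return { kind: "none" };

  // "prepared" means the request never left the process, so repeating it is
  // not ambiguous. "dispatched" has an unknown outcome and is only retried
  // automatically when repeating it is known to be idempotent.
  const safeToRepeat = attempt.state === "prepared" || attempt.idempotent;
  if (!safeToRepeat || attempt.retries >= budget) return { kind: "needs-decision" };

  // Exponential backoff: base, 2x base, 4x base, ...
  return { kind: "retry", delayMs: baseDelayMs * 2 ** attempt.retries };
}
```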

Rework compaction - Aiden?

The new agent loop needs to trigger compaction properly.

Plugin API design - James?

We need to figure out how we want server plugins to work and what hooks are useful.

Some ideas:

plugins get immer drafts so bad mutations can be thrown away
plugins get a global "opencode" instance like in that post I showed
the opencode instance has stuff like opencode.session.prompt() or opencode.tool.register({...})
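
A sketch of the "throwaway draft" idea from the first bullet. To stay dependency-free it swaps immer for a plain JSON deep copy, so it only handles JSON-compatible state; applyHook, Hook, and PromptState are made-up names, not a proposed API.

```typescript
// Hypothetical hook shape; the real plugin API is still undecided.
type Hook<T> = (draft: T) => void;

// Run a hook against a throwaway copy. JSON cloning stands in for immer
// here, so this only works for plain JSON-compatible state.
function applyHook<T>(state: T, hook: Hook<T>): T {
  const draft = JSON.parse(JSON.stringify(state)) as T;
  try {
    hook(draft);
    return draft; // commit: the caller swaps in the new state
  } catch {
    return state; // discard: the original was never touched
  }
}

// Example state a plugin might be allowed to edit (illustrative only).
interface PromptState {
  system: string[];
  temperature: number;
}
```

A hook that throws halfway through leaves no partial mutation behind, because only the copy was ever modified.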

Rework Config - ???

We should do another pass on config to clean up any mistakes we made with it and simplify as much as possible. Old configs should get auto-converted to the new format.

Auth - ???

I have a basic auth system that can track any kind of auth, not just providers.

Model Database - ???

I have a basic model service that allows for models to be registered dynamically.

Provider - ???

Providers should register as plugins and autoload based on whatever logic or config they want. They should register their models into the model database.

Event - Kit

The self-contained durable EventV2 core service is implemented. It owns sync-versioned persistence, transactional sequencing, pub/sub, replay, and replay-owner claims without relying on the old bus system.

Remaining slices:

expose the embedded consumer-facing Session cursor API over HTTP and the generated SDK where remote consumers need it
keep replay-owner claims distinct from future clustered Session execution ownership and stale-runtime fencing

Deferred hardening cleanup

Keep these visible, but do not block functionality slices on them unless a concrete failure appears during canary work:

serialize database migration claiming across processes; current migration application is protected only by an in-process semaphore, so two processes starting against one SQLite database can still race
simplify process-local durable-tail wake lifecycle with Effect RcMap and one shared PubSub.sliding<void>(1) per active aggregate; keep SQLite cursor replay and subscribe-before-history semantics unchanged
page large durable aggregate replay reads instead of loading every row after a stale cursor into one array
decide whether connected tails need a periodic polling fallback for cross-process SQLite writers; current advisory wakes are intentionally process-local
stream-cap websearch body collection before parsing
add ripgrep execution timeout and bounded line framing
materialize or consistently reject unresolved URL and file attachment sources
decide stateless OpenAI Responses hosted-tool continuation behavior; reconstructed hosted output can replay as a stored item_reference when store !== false, while store: false intentionally omits the unavailable reference path
decide whether to preserve deprecated @opencode-ai/llm orchestration exports
preserve or alias renamed filesystem SDK generated type names if compatibility consumers require them
revisit syscall-level mutation confinement for hostile external processes (openat, O_NOFOLLOW, and descriptor-relative mutation where supported)
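
For the replay paging item above, the shape is roughly a cursor loop that reads bounded pages instead of one large array. A minimal sketch, assuming a hypothetical readPage callback that returns rows with seq greater than the cursor in ascending order; the row shape and the default page size are assumptions too.

```typescript
// Hypothetical row and reader shapes; not the real EventV2 schema.
interface EventRow {
  seq: number;
  payload: string;
}

// Must return at most `limit` rows with seq > afterSeq, ascending by seq.
type ReadPage = (afterSeq: number, limit: number) => EventRow[];

// Lazily replay everything after a stale cursor, one bounded page at a time.
function* replayAfter(read: ReadPage, cursor: number, pageSize = 500): Generator<EventRow> {
  let after = cursor;
  while (true) {
    const page = read(after, pageSize);
    for (const row of page) {
      after = row.seq; // advance the cursor as rows are handed out
      yield row;
    }
    if (page.length < pageSize) return; // short page: caught up
  }
}
```

Memory stays bounded by the page size, and a consumer that stops iterating early never triggers the remaining reads.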

Everything is hotreloadable - ???

Instead of tearing things down when something changes, every service should emit granular events so other services can react to them and reconfigure themselves. This lets the frontend receive them too, e.g. model.added. It also prevents startup from blocking.
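
A small sketch of what "react and reconfigure" could look like. Only model.added comes from the note above; model.removed, EventHub, and ModelPicker are assumptions for illustration, and this is an in-process stand-in rather than the EventV2 service.

```typescript
// Hypothetical granular events; only "model.added" is named in this doc.
type ModelEvent =
  | { type: "model.added"; id: string; provider: string }
  | { type: "model.removed"; id: string };

type Listener = (event: ModelEvent) => void;

class EventHub {
  private readonly listeners = new Set<Listener>();

  // Returns an unsubscribe function.
  subscribe(listener: Listener): () => void {
    this.listeners.add(listener);
    return () => {
      this.listeners.delete(listener);
    };
  }

  publish(event: ModelEvent): void {
    for (const listener of this.listeners) listener(event);
  }
}

// A service that updates itself from events instead of being torn down.
class ModelPicker {
  readonly available = new Set<string>();
  readonly stop: () => void;

  constructor(hub: EventHub) {
    this.stop = hub.subscribe((event) => {
      if (event.type === "model.added") this.available.add(event.id);
      else this.available.delete(event.id);
    });
  }
}
```

Because the picker starts empty and fills in as events arrive, nothing has to wait for every provider to finish loading before startup completes.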