MCP Runtime Lifecycle

This document describes how MCP servers are discovered, connected, exposed as tools, refreshed, and torn down in the coding-agent runtime.

  1. SDK startup calls discoverAndLoadMCPTools() (unless MCP is disabled).
  2. Discovery (loadAllMCPConfigs) resolves MCP server configs from capability sources, filters disabled/project/Exa entries, and preserves source metadata.
  3. Manager connect phase (MCPManager.connectServers) starts per-server connect + tools/list in parallel.
  4. Fast startup gate waits up to 250ms, then may return:
    • fully loaded MCPTools,
    • failures per server,
    • or cached DeferredMCPTools for still-pending servers.
  5. SDK wiring merges MCP tools into runtime tool registry for the session.
  6. Live session can refresh MCP tools via /mcp flows (disconnectAll + rediscover + session.refreshMCPTools).
  7. Teardown happens when callers invoke disconnectServer/disconnectAll; manager also clears MCP tool registrations for disconnected servers.

createAgentSession() in src/sdk.ts performs MCP startup when enableMCP is true (default):

  • calls discoverAndLoadMCPTools(cwd, { ... }),
  • passes authStorage, cache storage, and mcp.enableProjectConfig setting,
  • always sets filterExa: true,
  • logs per-server load/connect errors,
  • stores returned manager in toolSession.mcpManager and session result.

If enableMCP is false, MCP discovery is skipped entirely.

loadAllMCPConfigs() (src/mcp/config.ts) loads canonical MCP server items through capability discovery, then converts to legacy MCPServerConfig.

Filtering behavior:

  • enableProjectConfig: false removes project-level entries (_source.level === "project").
  • enabled: false servers are skipped before connect attempts.
  • Exa servers are filtered out by default and API keys are extracted for native Exa tool integration.

Result includes both configs and sources (metadata used later for provider labeling).
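
The three filters above can be pictured as a single pass over the discovered entries. The following is a minimal sketch, not the code in src/mcp/config.ts: the ServerEntry and FilterOptions shapes, the filterServers and isExaServer names, and the host-based Exa check are all assumptions made for illustration, and API key extraction is left out.

```typescript
// Hypothetical, simplified shapes; the real MCPServerConfig carries more fields.
type SourceLevel = "user" | "project";

interface ServerEntry {
  name: string;
  enabled?: boolean;
  url?: string;
  _source: { level: SourceLevel; provider: string };
}

interface FilterOptions {
  enableProjectConfig: boolean;
  filterExa: boolean;
}

// Assumption: an Exa server is recognised by its URL host. The real check may differ.
function isExaServer(entry: ServerEntry): boolean {
  return entry.url !== undefined && entry.url.indexOf("exa.ai") !== -1;
}

function filterServers(entries: ServerEntry[], opts: FilterOptions): ServerEntry[] {
  return entries.filter((entry) => {
    // Project-level entries are dropped when project config is disabled.
    if (!opts.enableProjectConfig && entry._source.level === "project") return false;
    // Explicitly disabled servers never reach the connect phase.
    if (entry.enabled === false) return false;
    // Exa servers are handled by a native integration instead.
    if (opts.filterExa && isExaServer(entry)) return false;
    return true;
  });
}
```

Entries that survive keep their _source field, which matches the note that source metadata is preserved for later provider labeling.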

discoverAndLoadMCPTools() distinguishes two failure classes:

  • Discovery hard failure (exception from manager.discoverAndConnect, typically from config discovery): returns an empty tool set and one synthetic error { path: ".mcp.json", error }.
  • Per-server runtime/connect failure: manager returns partial success with errors map; other servers continue.

So startup does not fail the whole agent session when individual MCP servers fail.
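
A sketch of how the two failure classes might be separated. The ManagerResult and LoadResult shapes and the loadTools name are hypothetical, and the mcp:<server> path used for per-server errors is an assumption; only the synthetic ".mcp.json" entry comes from the description above.

```typescript
// Hypothetical result shapes for illustration only.
interface ManagerResult {
  tools: string[];
  errors: Map<string, string>; // server name -> error message
}

interface LoadError {
  path: string;
  error: string;
}

interface LoadResult {
  tools: string[];
  errors: LoadError[];
}

async function loadTools(discoverAndConnect: () => Promise<ManagerResult>): Promise<LoadResult> {
  let result: ManagerResult;
  try {
    result = await discoverAndConnect();
  } catch (err) {
    // Discovery hard failure: no tools, one synthetic error entry.
    const message = err instanceof Error ? err.message : String(err);
    return { tools: [], errors: [{ path: ".mcp.json", error: message }] };
  }
  // Per-server failures: keep whatever loaded and report the rest.
  const errors: LoadError[] = [];
  result.errors.forEach((error, server) => {
    errors.push({ path: "mcp:" + server, error }); // path format is an assumption
  });
  return { tools: result.tools, errors };
}
```

In both branches the caller receives a normal result object, so session startup can continue and simply log the errors.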

MCPManager tracks runtime lifecycle with separate registries:

  • #connections: Map<string, MCPServerConnection> — fully connected servers.
  • #pendingConnections: Map<string, Promise<MCPServerConnection>> — handshake in progress.
  • #pendingToolLoads: Map<string, Promise<{ connection, serverTools }>> — connected but tools still loading.
  • #tools: CustomTool[] — current MCP tool view exposed to callers.
  • #sources: Map<string, SourceMeta> — provider/source metadata even before connect completes.

getConnectionStatus(name) derives status from these maps:

  • connected if in #connections,
  • connecting if pending connect or pending tool load,
  • disconnected otherwise.
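
The derivation is small enough to show directly. This is a sketch with a hypothetical StatusRegistry class and public maps standing in for the manager's private #connections, #pendingConnections, and #pendingToolLoads fields; value types are simplified to unknown.

```typescript
type ConnectionStatus = "connected" | "connecting" | "disconnected";

// Stand-in for the manager's private registries (names simplified, values untyped).
class StatusRegistry {
  connections = new Map<string, unknown>();
  pendingConnections = new Map<string, Promise<unknown>>();
  pendingToolLoads = new Map<string, Promise<unknown>>();

  getConnectionStatus(name: string): ConnectionStatus {
    // A fully connected server wins over any stale pending entry.
    if (this.connections.has(name)) return "connected";
    // Either a handshake or a tools/list call is still in flight.
    if (this.pendingConnections.has(name) || this.pendingToolLoads.has(name)) return "connecting";
    return "disconnected";
  }
}
```

Because status is computed from map membership, it is only as current as the last registry update; nothing here observes the transport itself.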

Connection establishment and startup timing

For each discovered server in connectServers():

  1. store/update source metadata,
  2. skip if already connected/pending,
  3. validate transport fields (validateServerConfig),
  4. resolve auth/shell substitutions (#resolveAuthConfig),
  5. call connectToServer(name, resolvedConfig),
  6. call listTools(connection),
  7. cache tool definitions (MCPToolCache.set) best-effort.

connectToServer() behavior (src/mcp/client.ts):

  • creates stdio or HTTP/SSE transport,
  • performs MCP initialize + notifications/initialized,
  • uses timeout (config.timeout or 30s default),
  • closes transport on init failure.
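
The timeout and close-on-failure behavior can be sketched as follows. The Transport interface, withTimeout, and connectWithInit are hypothetical names, and applying the timeout to the initialize step alone is an assumption; the actual scope of the timeout in src/mcp/client.ts may be wider.

```typescript
// Hypothetical minimal transport; the real one also sends and receives messages.
interface Transport {
  close(): Promise<void>;
}

// Rejects if `work` does not settle within `ms` milliseconds.
function withTimeout<T>(work: Promise<T>, ms: number, label: string): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(() => reject(new Error(label + " timed out after " + ms + "ms")), ms);
    work.then(
      (value) => { clearTimeout(timer); resolve(value); },
      (err) => { clearTimeout(timer); reject(err); },
    );
  });
}

async function connectWithInit<T extends Transport>(
  createTransport: () => Promise<T>,
  initialize: (transport: T) => Promise<void>,
  timeoutMs: number = 30000, // mirrors the 30s default described above
): Promise<T> {
  const transport = await createTransport();
  try {
    await withTimeout(initialize(transport), timeoutMs, "initialize");
    return transport;
  } catch (err) {
    // Do not leak a half-open transport; a close error must not mask the init error.
    await transport.close().catch(() => undefined);
    throw err;
  }
}
```

Swallowing the close error is a design choice in this sketch: the caller most likely cares about why initialization failed, not about cleanup noise.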

connectServers() waits on a race between:

  • all connect/tool-load tasks settled, and
  • STARTUP_TIMEOUT_MS = 250.

Once the race resolves (all tasks settled, or the 250ms timeout elapsed):

  • fulfilled tasks become live MCPTools,
  • rejected tasks produce per-server errors,
  • still-pending tasks:
    • use cached tool definitions if available (MCPToolCache.get) to create DeferredMCPTools,
    • otherwise block until those pending tasks settle.

This is a hybrid startup model: fast return when cache is available, correctness wait when cache is not.
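
The hybrid gate can be sketched as a race followed by a per-server decision. Everything below is illustrative: startupGate, Outcome, and delay are hypothetical names, tools are reduced to strings, and the cache is a plain Map rather than MCPToolCache. Only the 250ms constant and the three outcomes come from the description above.

```typescript
const STARTUP_TIMEOUT_MS = 250;

type Outcome =
  | { kind: "live"; tools: string[] }
  | { kind: "deferred"; tools: string[] }
  | { kind: "error"; error: string };

// Cancellable timer so a fast startup does not leave a pending timeout behind.
function delay(ms: number): { promise: Promise<void>; cancel: () => void } {
  let cancel: () => void = () => undefined;
  const promise = new Promise<void>((resolve) => {
    const timer = setTimeout(resolve, ms);
    cancel = () => clearTimeout(timer);
  });
  return { promise, cancel };
}

async function startupGate(
  tasks: Map<string, Promise<string[]>>,
  cache: Map<string, string[]>,
  timeoutMs: number = STARTUP_TIMEOUT_MS,
): Promise<Map<string, Outcome>> {
  const settled = new Map<string, Outcome>();
  const tracked = new Map<string, Promise<void>>();
  tasks.forEach((task, name) => {
    // Record each task's outcome as soon as it settles; tracked promises never reject.
    tracked.set(name, task.then(
      (tools) => { settled.set(name, { kind: "live", tools }); },
      (err: unknown) => {
        settled.set(name, { kind: "error", error: err instanceof Error ? err.message : String(err) });
      },
    ));
  });

  const timeout = delay(timeoutMs);
  await Promise.race([Promise.all(Array.from(tracked.values())), timeout.promise]);
  timeout.cancel();

  const results = new Map<string, Outcome>();
  const mustWait: Promise<void>[] = [];
  tracked.forEach((done, name) => {
    const outcome = settled.get(name);
    const cached = cache.get(name);
    if (outcome) results.set(name, outcome); // settled before the gate closed
    else if (cached) results.set(name, { kind: "deferred", tools: cached }); // fast path
    else mustWait.push(done.then(() => { results.set(name, settled.get(name) as Outcome); }));
  });
  await Promise.all(mustWait); // correctness wait for servers without a cache entry
  return results;
}
```

Note the trade-off this makes visible: a single uncached slow server delays the whole return, while cached servers never do.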

Each pending toolsPromise also has a background continuation that eventually:

  • replaces that server’s tool slice in manager state via #replaceServerTools,
  • writes cache,
  • logs late failures only after startup (allowBackgroundLogging).

Tool exposure and live-session availability

discoverAndLoadMCPTools() converts manager tools into LoadedCustomTool[] and decorates paths (mcp:<server> via <providerName> when known).

createAgentSession() then pushes these tools into customTools, which are wrapped and added to the runtime tool registry with names like mcp_<server>_<tool>.

  • MCPTool calls tools through an already connected MCPServerConnection.
  • DeferredMCPTool waits for waitForConnection(server) before calling; this allows cached tools to exist before connection is ready.

Both return structured tool output and convert transport/tool errors into tool content of the form MCP error: ... rather than throwing. Aborts are the exception: an abort is not converted and still surfaces as an abort.
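
A sketch of the deferred call path, assuming a simplified Connection interface and a ToolResult shape with an isError flag; both are hypothetical, as are the DeferredToolSketch class and the AbortError name check used to detect aborts.

```typescript
// Hypothetical, simplified shapes for illustration.
interface ToolResult {
  content: string;
  isError: boolean;
}

interface Connection {
  callTool(tool: string, args: unknown): Promise<string>;
}

// Assumption: aborts are identified by the conventional "AbortError" name.
function isAbort(err: unknown): boolean {
  return err instanceof Error && err.name === "AbortError";
}

class DeferredToolSketch {
  private readonly server: string;
  private readonly tool: string;
  private readonly waitForConnection: (server: string) => Promise<Connection>;

  constructor(server: string, tool: string, waitForConnection: (server: string) => Promise<Connection>) {
    this.server = server;
    this.tool = tool;
    this.waitForConnection = waitForConnection;
  }

  async execute(args: unknown): Promise<ToolResult> {
    try {
      // The tool can be registered from cache before this promise resolves.
      const connection = await this.waitForConnection(this.server);
      const content = await connection.callTool(this.tool, args);
      return { content, isError: false };
    } catch (err) {
      if (isAbort(err)) throw err; // abort remains abort
      const message = err instanceof Error ? err.message : String(err);
      return { content: "MCP error: " + message, isError: true };
    }
  }
}
```

A non-deferred tool would look the same minus the waitForConnection step, since it already holds a connected MCPServerConnection.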

Refresh/reload paths (startup vs live reload)

Startup path:

  • one-time discovery/load in sdk.ts,
  • tools are registered in initial session tool registry.

/mcp reload path (src/modes/controllers/mcp-command-controller.ts) does:

  1. mcpManager.disconnectAll(),
  2. mcpManager.discoverAndConnect(),
  3. session.refreshMCPTools(mcpManager.getTools()).

session.refreshMCPTools() (src/session/agent-session.ts) removes all mcp_ tools, re-wraps latest MCP tools, and re-activates tool set so MCP changes apply without restarting session.
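
The remove-then-rewrap step can be sketched against a plain Map registry. The RegisteredTool shape, the toRegistryName helper, and the function name refreshMCPToolsSketch are hypothetical; the real method also re-activates the tool set, and any name sanitization it applies is not modeled here.

```typescript
// Hypothetical minimal registry entry.
interface RegisteredTool {
  name: string;
}

interface McpToolRef {
  server: string;
  tool: string;
}

// Follows the mcp_<server>_<tool> naming described earlier; sanitization is not modeled.
function toRegistryName(ref: McpToolRef): string {
  return "mcp_" + ref.server + "_" + ref.tool;
}

function refreshMCPToolsSketch(registry: Map<string, RegisteredTool>, latest: McpToolRef[]): void {
  // 1. Drop every MCP-provided tool; built-in tools are left untouched.
  Array.from(registry.keys()).forEach((name) => {
    if (name.indexOf("mcp_") === 0) registry.delete(name);
  });
  // 2. Re-register the manager's current view.
  latest.forEach((ref) => {
    const name = toRegistryName(ref);
    registry.set(name, { name });
  });
}
```

Clearing by prefix before re-adding means tools from servers that were removed or disabled disappear in the same pass that brings in new ones.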

There is also a follow-up path for late connections: after waiting for a specific server, if status becomes connected, it re-runs session.refreshMCPTools(...) so newly available tools are rebound in-session.

Health, reconnect, and partial failure behavior

Current runtime behavior is intentionally minimal:

  • No autonomous health monitor in manager/client.
  • No automatic reconnect loop when a transport drops.
  • Manager does not subscribe to transport onClose/onError; status is registry-driven.
  • Reconnect is explicit: reload flow or direct connectServers() invocation.

Operationally:

  • one server failing does not remove tools from healthy servers,
  • connect/list failures are isolated per server,
  • tool cache and background updates are best-effort (warnings/errors logged, no hard stop).

disconnectServer(name):

  • removes pending entries/source metadata,
  • closes transport if connected,
  • removes that server’s mcp_ tools from manager state.

disconnectAll():

  • closes all active transports with Promise.allSettled,
  • clears pending maps, sources, connections, and manager tool list.
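
The two teardown methods can be sketched as below. TeardownSketch and its public fields are hypothetical stand-ins for the manager's private state, and removing a server's tools before awaiting close is a choice made in this sketch, not a documented ordering.

```typescript
interface Closable {
  close(): Promise<void>;
}

interface ServerTool {
  server: string;
  name: string;
}

// Stand-in for manager state; the real fields are private.
class TeardownSketch {
  connections = new Map<string, Closable>();
  pending = new Map<string, Promise<Closable>>();
  sources = new Map<string, string>();
  tools: ServerTool[] = [];

  async disconnectServer(name: string): Promise<void> {
    this.pending.delete(name);
    this.sources.delete(name);
    this.tools = this.tools.filter((tool) => tool.server !== name);
    const connection = this.connections.get(name);
    this.connections.delete(name);
    if (connection) await connection.close();
  }

  async disconnectAll(): Promise<void> {
    // allSettled: one transport failing to close must not block the others.
    await Promise.allSettled(Array.from(this.connections.values(), (c) => c.close()));
    this.pending.clear();
    this.sources.clear();
    this.connections.clear();
    this.tools = [];
  }
}
```

Using Promise.allSettled rather than Promise.all is what lets disconnectAll finish clearing state even when an individual close rejects.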

In current wiring, explicit teardown is used in MCP command flows (for reload/remove/disable). There is no separate automatic manager disposal hook in the startup path itself; callers are responsible for invoking manager disconnect methods when they need deterministic MCP shutdown.

| Scenario | Behavior | Hard fail vs best-effort |
| --- | --- | --- |
| Discovery throws (capability/config load path) | Loader returns empty tools + synthetic .mcp.json error | Best-effort session startup |
| Invalid server config | Server skipped with validation error entry | Best-effort per server |
| Connect timeout/init failure | Server error recorded; others continue | Best-effort per server |
| tools/list still pending at startup with cache hit | Deferred tools returned immediately | Best-effort fast startup |
| tools/list still pending at startup without cache | Startup waits for pending to settle | Hard wait for correctness |
| Late background tool-load failure | Logged after startup gate | Best-effort logging |
| Runtime dropped transport | No automatic reconnect; future calls fail until reconnect/reload | Best-effort recovery via manual action |

src/mcp/index.ts re-exports loader/manager/client APIs for external callers. src/sdk.ts exposes discoverMCPServers() as a convenience wrapper returning the same loader result shape.