Credentials

Sandboxing addresses what the agent is allowed to do. Credentials address how the agent proves who it is — to the model provider, MCP servers, OAuth endpoints, Git hosts, and cloud platforms. Credentials expire. A mid-conversation 401 doesn’t gracefully degrade a task; it breaks it. Treat the lifecycle as infrastructure.

The Credential Stack

A session holds multiple credentials, each with different lifetimes, scopes, and refresh mechanisms.

graph TB
    subgraph credentials ["Active Credentials"]
        API["Model Provider API Key<br/>Anthropic, OpenAI, Google"]
        OAuth["OAuth Access Token<br/>GitHub, Slack, Calendar"]
        MCP["MCP Server Credentials<br/>Per-server env vars"]
        GIT["Git Credentials<br/>SSH keys, PATs"]
        CLOUD["Cloud Provider<br/>AWS, GCP, Azure tokens"]
    end

    subgraph lifecycle ["Lifecycle Operations"]
        ACQ["Acquire"]
        STORE["Store"]
        REFRESH["Refresh"]
        REVOKE["Revoke"]
    end

    credentials --> lifecycle

Credential Type	Typical Lifetime	Refresh	Storage
API key (Anthropic, OpenAI)	Indefinite	Manual rotation	OS keychain or env var
OAuth access token	15 min - 1 hr (drifting shorter)	Refresh-token exchange	OS keychain
OAuth refresh token	30-90 days (LinkedIn 1 yr; MS SPA 24 hr)	Re-auth	OS keychain
MCP server env vars	Indefinite	Manual	`${ENV_VAR}` in `.mcp.json`, resolved at spawn
Git PAT	30-365 days	Manual regen	System credential store
Cloud provider token	1-12 hours	CLI (`gcloud auth`, `aws sso`)	Provider-managed

OAuth 2.1 Baseline

The reference point in 2026 is OAuth 2.1, which consolidates RFC 6749 (the OAuth 2.0 framework), RFC 7636 (PKCE), RFC 8252 (native apps), and the security best current practice (RFC 9700). The practical consequences:

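PKCE is required under OAuth 2.1 for every client using the authorization code flow, not just public CLIs. RFC 9700 requires it for public clients and recommends it for confidential ones, so the old carve-out for confidential clients is effectively gone.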
RFC 8628 Device Authorization Grant is now the preferred flow for headless CLI agents. Use it instead of a localhost redirect when the agent runs over SSH, in a container, or on any host without a browser. Claude Code, Codex, Goose, and Mistral Vibe all default to it.
RFC 8707 Resource Indicators binds each token to a single resource server, so a stolen MCP token cannot be replayed against another API. Required by the MCP authorization spec since its 2025-06-18 revision.
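Refresh-token rotation is expected for public clients: OAuth 2.1 requires their refresh tokens to be either sender-constrained or single-use, and rotation is the common choice. Every refresh returns a new refresh token and invalidates the previous one.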
The implicit and resource owner password credentials (ROPC) grants are removed. PKCE S256 only. Exact redirect-URI matching.

PKCE Flow

sequenceDiagram
    participant A as Agent Harness
    participant B as User's Browser
    participant P as OAuth Provider

    A->>A: Generate code_verifier + code_challenge
    A->>B: Open authorization URL with code_challenge
    B->>P: User authenticates and approves scopes
    P->>A: Redirect to localhost callback with auth code
    A->>P: Exchange auth code + code_verifier for tokens
    P->>A: { access_token, refresh_token, expires_in }
    A->>A: Store tokens in OS keychain

The localhost callback runs on a random port and shuts down after receiving the code. For headless agents, swap the redirect leg for RFC 8628: the harness polls the token endpoint while the user completes the device-code grant in any browser.
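
The polling leg of RFC 8628 can be sketched as below. This is a minimal illustration, not a reference client: `post` is a hypothetical callable that sends a form-encoded POST to the token endpoint and returns the decoded JSON body, and the default `expires_in` is an assumption (a real harness would take both `interval` and `expires_in` from the device authorization response). The error codes and the 5-second back-off on slow_down come from the RFC.

```python
import time
from typing import Callable

GRANT_TYPE = "urn:ietf:params:oauth:grant-type:device_code"


def poll_for_token(post: Callable[[dict], dict], device_code: str, client_id: str,
                   interval: float = 5, expires_in: float = 900,
                   sleep: Callable[[float], None] = time.sleep) -> dict:
    """Poll until the user approves, denies, or the device code expires."""
    waited = 0
    while waited < expires_in:
        sleep(interval)
        waited += interval
        body = post({"grant_type": GRANT_TYPE,
                     "device_code": device_code,
                     "client_id": client_id})
        error = body.get("error")
        if error is None:
            return body  # token response: access_token, expires_in, ...
        if error == "authorization_pending":
            continue  # user has not finished yet; keep polling
        if error == "slow_down":
            interval += 5  # RFC 8628: add 5 seconds to the polling interval
            continue
        # access_denied, expired_token, or anything unexpected is terminal.
        raise RuntimeError(f"device authorization failed: {error}")
    raise TimeoutError("device code expired before the user approved")
```

Injecting `post` and `sleep` keeps the loop testable without a network or real delays.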

Token Storage

Old default: ~/.agent/oauth/<provider>.json with 0600 permissions. Still widely shipped for compatibility, but now routinely flagged as a risk in 2026 security reviews. chmod 600 is necessary but not sufficient: any process running as the same user can still read the file.

2026 recommended baseline: the OS-native credential store — macOS Keychain, Windows Credential Manager (wincred), Linux libsecret / Secret Service — with file-fallback for headless servers without a daemon. Cross-platform wrappers handle the fork: keyring (Python), hrantzsch/keychain (C++), git-credential-manager, mcp-secrets-plugin. New MCP-server projects increasingly default to keychain-first.
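
A keychain-first store with a file fallback might look like the sketch below. It assumes the third-party keyring package for the OS store and degrades to a 0600 file when the package or a backend is unavailable. The service name, the `fallback_dir` parameter, and the `use_keyring` switch are illustrative choices, not part of any harness's API.

```python
import json
import os
from pathlib import Path

try:
    import keyring  # third-party, optional
    from keyring.errors import KeyringError
except ImportError:
    keyring = None
    KeyringError = Exception

SERVICE = "example-agent"  # hypothetical service name


def store_token(provider: str, token: dict, fallback_dir: str,
                use_keyring: bool = True) -> str:
    """Store a token and report where it landed: 'keychain' or 'file'."""
    payload = json.dumps(token)
    if use_keyring and keyring is not None:
        try:
            keyring.set_password(SERVICE, provider, payload)
            return "keychain"
        except KeyringError:
            pass  # no usable backend (e.g. headless server): fall through
    path = Path(fallback_dir) / f"{provider}.json"
    path.parent.mkdir(parents=True, exist_ok=True, mode=0o700)
    # Create with 0600 from the start so the file is never world-readable.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w") as fh:
        fh.write(payload)
    os.chmod(path, 0o600)  # tighten a pre-existing file with looser mode
    return "file"
```

Returning the storage location lets the harness warn the user when it had to fall back to a file.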

Non-negotiables don’t change: never commit, never log, redact from traces. Tenant-multiplexed agents must also encrypt at rest and isolate per tenant.

Proactive Refresh

The most common credential failure is a token expiring mid-conversation. The model makes a call, gets a 401, and either retries in a loop or gives up. Both are preventable.

Token lifetime: ──────────────────────────────────────────► expiry
                                          │
                                    ~75-80% of TTL
                                    REFRESH HERE
                                          │
                        ┌─────────────────┼──────────────┐
                        │  Safe zone      │  Danger zone │
                        │  Token valid    │  May expire  │
                        └─────────────────┼──────────────┘

The “80% rule” is convention, not standard: no RFC prescribes a number, and sources cite 70-80%, with Azure SDK and Scalekit both landing on ~75%. For a one-hour token, a 75% threshold means refreshing at the 45-minute mark. Pick a threshold in that range, combine it with startup validation and a reactive 401 retry as a safety net, and call ensure_valid_token before every API request, not after a failure. Where the server rotates refresh tokens, as OAuth 2.1 expects for public clients whose tokens are not sender-constrained, persist the new refresh token on every successful refresh.

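import time

# oauth_refresh, save_credential, prompt_reauthentication, and RefreshError
# are assumed to be provided by the harness; they are not defined here.
REFRESH_FRACTION = 0.75  # refresh once 75% of the token's lifetime has elapsed

def ensure_valid_token(cred):
    ttl = cred.expires_at - cred.created_at
    if time.time() < cred.created_at + ttl * REFRESH_FRACTION:
        return cred  # still in the safe zone
    try:
        new = oauth_refresh(cred.provider, cred.refresh_token)
        save_credential(new)  # MUST persist the rotated refresh token
        return new
    except RefreshError:
        # Refresh token expired or revoked: fall back to interactive re-auth.
        return prompt_reauthentication(cred.provider)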
FAPI 2.0 and DPoP

For finance and health agents handling regulated data, the next tier above OAuth 2.1 is FAPI 2.0 (Final 2025) plus DPoP — proof-of-possession token binding that stops stolen tokens being replayed without the matching key. Singpass enforces FAPI 2.0 in production from Jan 2026. Most agents won’t need this; if you handle PHI or trade orders, it’s the baseline, and the DPoP pattern is leaking into general MCP server guidance.

MCP Credential Injection

Each MCP server may need its own credentials, configured via env-variable references in .mcp.json:

{
  "mcpServers": {
    "github":   { "env": { "GITHUB_TOKEN":   "${GITHUB_TOKEN}" } },
    "postgres": { "env": { "DATABASE_URL":   "${DATABASE_URL}" } },
    "slack":    { "env": { "SLACK_BOT_TOKEN": "${SLACK_BOT_TOKEN}" } }
  }
}

The host resolves ${...} at server-spawn time, not at config load — so values can come from a secrets manager invoked just before the spawn. Each MCP server sees only its declared variables. An unset variable should produce a clear error, not an empty string that fails downstream.
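
Spawn-time resolution with a hard failure on unset variables could be sketched as follows. The function and exception names are hypothetical, and `source` stands in for whatever the host reads from at spawn time (os.environ, or values just fetched from a secrets manager).

```python
import re

_REF = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


class MissingSecretError(LookupError):
    """Raised when a ${VAR} reference has no non-empty value."""


def resolve_env(declared: dict, source: dict) -> dict:
    """Expand ${VAR} references for one server's declared env block."""
    def lookup(match: re.Match) -> str:
        name = match.group(1)
        value = source.get(name)
        if not value:  # unset or empty: fail here, not downstream
            raise MissingSecretError(
                f"MCP config references ${{{name}}}, which is not set")
        return value

    # Only keys the server declared are returned, so nothing else in
    # `source` reaches the child process through this mapping.
    return {key: _REF.sub(lookup, template)
            for key, template in declared.items()}
```

Calling resolve_env immediately before the spawn, rather than at config load, is what lets short-lived secrets be fetched just in time.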

For production, prefer Vault / AWS Secrets Manager / Azure Key Vault over shell env. Cleartext literals in .mcp.json and accidental commits remain the two most-reported MCP credential incidents of 2026.

Credential Isolation

Credential	Who Uses It	Isolation Requirement
Model API key	Harness only	Never reaches MCP servers or tool processes
OAuth tokens	Harness + specific tools	Per-provider scoping; only relevant token per tool
MCP server env vars	Individual MCP servers	Each server sees only its declared vars
Git credentials	Git operations only	Available only during `bash(git *)` calls

The two-phase runtime (Sandboxing) extends to credentials: setup authenticates and stores, execution exposes credentials only to the specific tool or server that needs them.
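
One way to enforce that exposure rule at execution time is a per-tool allowlist applied to the environment before a process is spawned. The tool names, variable names, and base variable set below are assumptions for illustration only.

```python
# Hypothetical mapping from tool to the credential variables it may see.
TOOL_CREDENTIALS = {
    "git": {"GIT_TOKEN"},
    "github_mcp": {"GITHUB_TOKEN"},
}

# Non-secret variables every child process gets (an assumed minimal set).
BASE_VARS = {"PATH", "HOME", "LANG"}


def env_for_tool(tool: str, full_env: dict) -> dict:
    """Return only the variables this tool is allowed to see."""
    allowed = BASE_VARS | TOOL_CREDENTIALS.get(tool, set())
    # The model API key is in no allowlist, so it never leaves the harness.
    return {k: v for k, v in full_env.items() if k in allowed}
```

The result would be passed as the env argument to subprocess.run or subprocess.Popen, replacing the inherited environment instead of extending it. Unknown tools receive only the base variables.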

Agent-Scoped OAuth: Current Reality

The widely quoted scopes agent:read, agent:draft_only, agent:write are aspirational and vendor-specific, not ratified. No RFC or major provider defines them as a portable namespace; IETF work (draft-rosenberg-oauth-aauth-00, draft-klrc-aiagent-auth-01 introducing the agent_assertion grant type) is all pre-RFC. The real 2026 pattern is per-tool/per-action scopes (mcp:tool:read_file:read, calendar:create_event, gmail:draft) combined with On-Behalf-Of delegation via standard OAuth 2.0 extensions such as Token Exchange (RFC 8693). Stytch, WorkOS, Scalekit, and Apideck all ship variants.
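
Per-action scopes reduce to a simple check before each tool call. In the sketch below the scope strings are the ones quoted above, while the tool names and the mapping itself are hypothetical; it relies only on the OAuth convention that a granted scope is a space-delimited string.

```python
# Hypothetical mapping from tool name to the single scope it requires.
REQUIRED_SCOPE = {
    "read_file": "mcp:tool:read_file:read",
    "create_event": "calendar:create_event",
    "draft_email": "gmail:draft",
}


def authorize_tool_call(tool: str, granted_scope: str) -> bool:
    """Allow the call only if the token carries the tool's scope."""
    required = REQUIRED_SCOPE.get(tool)
    if required is None:
        return False  # unknown tools are denied by default
    return required in granted_scope.split()
```

Denying unmapped tools by default keeps a newly added tool from inheriting access it was never granted.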

Logout and Revocation

Explicit logout is a safety requirement, not a convenience.

agent logout                 # all providers
agent logout github          # one provider
agent logout --oauth-only    # keep API keys

On logout: delete the keychain entry (or file), attempt server-side revocation where supported (GitHub, Google, Slack all do), clear in-memory caches, and tell the user which services will require re-authentication.
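
For providers that expose a standard revocation endpoint, the server-side step follows RFC 7009: a form-encoded POST carrying the token and an optional type hint. The sketch below only builds the request. The endpoint URL and sending client_id in the body (the public-client case) are assumptions, and some providers use their own revocation API instead.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_revocation_request(endpoint: str, token: str, client_id: str,
                             hint: str = "refresh_token") -> Request:
    """Build (but do not send) an RFC 7009 token revocation request."""
    body = urlencode({
        "token": token,
        "token_type_hint": hint,  # optional hint defined by RFC 7009
        "client_id": client_id,   # assumption: public client, no secret
    }).encode("ascii")
    return Request(
        endpoint,
        data=body,
        method="POST",
        headers={"Content-Type": "application/x-www-form-urlencoded"},
    )
```

A harness would send this with urllib.request.urlopen and treat a failure as non-fatal: local deletion should still proceed, with the user told that server-side revocation did not complete.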

Cross-Platform Patterns

Platform	Storage	Refresh Mechanism
Claude Code	OS keychain; file fallback at `~/.claude/oauth/`	Harness PKCE refresh; device-code for headless
OpenAI Codex	System keychain or `OPENAI_API_KEY`; ChatGPT OAuth via desktop app	Harness-managed silent refresh
Gemini CLI	Google Cloud credential store via `gcloud auth`	`gcloud` refreshes automatically
Google ADK	Provider-managed (Cloud IAM, Workload Identity Federation)	ADC / service account; no harness involvement

The shape is consistent: API key or device-code OAuth for the model provider, PKCE + refresh for third-party integrations, ${ENV_VAR} for MCP servers, OS keychain underneath. The variation is whether the platform refreshes for you (Google) or expects the harness to (Claude Code, Codex).