Anti-Patterns

Most agent literature documents what to build. This page documents what to recognize when it has already been built wrong. The five failure modes below were extracted from forensic reading of six open-source agent projects with non-trivial usage. The projects are not named; the code shapes are. If your codebase has the shape, you have the failure mode regardless of which project you started from.


1. Prompted Architecture

Definition: When a feature’s load-bearing logic lives in a prompt template rather than in code, the feature inherits the LLM’s reliability (~90%) rather than the code’s (~100%).

Symptoms:

Evidence from the field: Forensic analysis of one widely-starred multi-agent platform found a 267-line “leader prompt” containing the comment

## Sequencing Dependent Work (CRITICAL — avoid teammate timeouts)

instructing the model to manually poll for completion before dispatching a dependent task. That is not orchestration — that is a workaround dressed as a feature. A separate, popular code-intelligence tool achieves its headline performance number (“94% fewer tool calls”) by installing a CLAUDE.md fragment into the user’s project that tells the agent not to grep.

The honest version: Both projects ship real engineering underneath. The pattern isn’t “prompts are bad.” It’s that describing what the LLM should do is not the same as making the LLM do it. Prompted Architecture is when the description IS the architecture.
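To make the distinction concrete, here is a minimal sketch of sequencing dependent work in code rather than in a prompt. It is not taken from either project. The names run_task and run_in_order and the shape of the deps mapping are hypothetical, and the sketch assumes the dependency graph has no cycles.

```python
import asyncio


async def run_task(name: str, log: list[str]) -> str:
    """Hypothetical stand-in for dispatching work to a teammate agent."""
    log.append(f"start {name}")
    await asyncio.sleep(0)  # yield to the event loop, as a real dispatch would
    log.append(f"done {name}")
    return f"{name}-result"


async def run_in_order(deps: dict[str, list[str]], log: list[str]) -> dict[str, str]:
    """Run each task only after every task it depends on has finished.

    Assumes deps is acyclic; a cycle would wait forever.
    """
    tasks: dict[str, asyncio.Task] = {}

    async def runner(name: str) -> str:
        # The ordering guarantee lives here, in code, not in a prompt.
        await asyncio.gather(*(tasks[d] for d in deps[name]))
        return await run_task(name, log)

    # All tasks are registered before any of them starts running.
    for name in deps:
        tasks[name] = asyncio.create_task(runner(name))
    results = await asyncio.gather(*tasks.values())
    return dict(zip(tasks, results))
```

With this shape the model never has to be told to poll for completion: a dependent task cannot start until the await on its prerequisites returns.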

Fix:


2. Vector-Default Memory

Definition: Making vector retrieval the primary memory mechanism without an integration layer above it. The system can recall that something happened but cannot reason about what it means.

Symptoms:

Evidence from the field: Forensic analysis of one production memory framework found six declared memory stores (episodic, semantic, procedural, resource, knowledge_vault, core) whose ORM schemas shared ~70% of their columns. The “router” that supposedly dispatches observations to the right store was a hardcoded line:

return await self.agents["meta_memory_agent"].step(...)

Every observation cost roughly two LLM calls plus two embedding calls. The system cannot ingest anything without a cloud LLM key, yet it is marketed as privacy-first.

The architectural failure: The LLM is good at integrating a small amount of well-curated text. It is bad at integrating a large number of approximate-nearest-neighbor fragments. Vector retrieval can find relevant fragments; only summarization can integrate them. A memory system whose primary mechanism is retrieval returns a thousand fragments. A memory system whose primary mechanism is hierarchical summarization returns one paragraph that captures their meaning.
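One way to picture hierarchical summarization is as a repeated group-and-reduce pass. The sketch below is illustrative only: summarize is a hypothetical callable that would wrap an LLM call in practice, and the fan_in of 4 is an assumed group size, not a recommendation from any of the projects studied.

```python
from typing import Callable


def summarize_hierarchically(
    fragments: list[str],
    summarize: Callable[[list[str]], str],
    fan_in: int = 4,
) -> str:
    """Reduce many fragments to one summary, a small group at a time."""
    if fan_in < 2:
        raise ValueError("fan_in must be at least 2 so each level shrinks")
    if not fragments:
        return ""
    level = list(fragments)
    while len(level) > 1:
        # Each call sees at most fan_in items, so the summarizer only ever
        # integrates a small amount of text at once.
        level = [
            summarize(level[i:i + fan_in])
            for i in range(0, len(level), fan_in)
        ]
    return level[0]
```

Vector retrieval can still sit underneath this as the way to choose which fragments enter the bottom level; the point is that the caller receives the single top-level summary, not the fragments.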

Fix:


3. Premature Distribution

Definition: Adopting distributed-systems infrastructure (Kafka, distributed queues, multi-service orchestration) for a workload that runs in a single process. The operational cost is paid upfront; the architectural benefit is never realized.

Symptoms:

Evidence from the field: Forensic analysis of one production memory framework found Kafka wired into the compose file with consumer-group configuration:

kafka:
  image: confluentinc/cp-kafka
  environment:
    KAFKA_GROUP_ID: agent-events

for a workload that, in the code, is a single asyncio task pulling from a single producer. A separate personal-agent project solved the same decoupling problem with two asyncio.Queue objects totaling ~40 lines of code.
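For comparison, a two-queue pipeline can look roughly like the sketch below. This is not the personal-agent project's code; the function names, the None sentinel, and the queue size of 100 are all assumptions chosen for illustration.

```python
import asyncio


async def producer(inbox: asyncio.Queue, events: list[str]) -> None:
    for event in events:
        await inbox.put(event)
    await inbox.put(None)  # sentinel: no more events


async def worker(inbox: asyncio.Queue, outbox: asyncio.Queue) -> None:
    while (event := await inbox.get()) is not None:
        await outbox.put(event.upper())  # placeholder for real handling
    await outbox.put(None)  # pass the sentinel downstream


async def collect(outbox: asyncio.Queue) -> list[str]:
    results = []
    while (item := await outbox.get()) is not None:
        results.append(item)
    return results


async def main(events: list[str]) -> list[str]:
    # Bounded queues give backpressure: a fast producer blocks on put()
    # instead of growing memory without limit.
    inbox: asyncio.Queue = asyncio.Queue(maxsize=100)
    outbox: asyncio.Queue = asyncio.Queue(maxsize=100)
    _, _, results = await asyncio.gather(
        producer(inbox, events),
        worker(inbox, outbox),
        collect(outbox),
    )
    return results
```

Calling asyncio.run(main(["a", "b"])) runs all three stages in one process with no broker, no compose file, and no network hop.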

The honest version: Distribution is correct when you need it. A team that has hit the limits of single-process async, has multiple producers writing into the same logical stream, and has the operational maturity to run a broker — that team should adopt Kafka. A team building a personal agent that runs on a developer’s laptop should not.

Fix:


4. Compaction-Vulnerable State

Definition: Storing long-running state (goals, identity, active task pointers, user constraints) inside conversation history, where compaction can summarize or delete it.

Symptoms:

Evidence from the field: Forensic analysis of one production personal-agent codebase showed long-running goal state stored in session.metadata[GOAL_STATE_KEY] and re-injected into the Runtime Context block at every turn via a goal_state_runtime_lines() function. Because compaction operates on message history but never touches session metadata, the goal survived every compaction pass. A different project in the same category stored the goal as the first user message — and observed agents drifting from it after 40 turns once compaction reduced the early history to a single summary line.

Fix:


5. Ungated Background Work

Definition: Idle-time or scheduled LLM work that runs without checking machine state (battery, CPU, memory, network). Common in “subconscious” or “auto-improve” features on laptops.

Symptoms:

Evidence from the field: Forensic analysis of one full-stack personal-AI project found an explicit decision log in its scheduler module: “saturated memory from concurrent Ollama calls has crashed the user’s laptop twice.” The fix was a scheduler gate that reads battery state, CPU usage, and a model-saturation semaphore before allowing background LLM work to proceed. The gate selects one of four policies: Aggressive, Normal, Throttled, or Paused. Most agent platforms have no equivalent.
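A gate of this kind can be sketched as a pure function from machine signals to a policy. The four policy names come from the project described above; the MachineState fields and every threshold below are assumptions picked for illustration, and in practice the signals would be read from the operating system (for example with a library such as psutil).

```python
from dataclasses import dataclass
from enum import Enum


class Policy(Enum):
    AGGRESSIVE = "aggressive"
    NORMAL = "normal"
    THROTTLED = "throttled"
    PAUSED = "paused"


@dataclass(frozen=True)
class MachineState:
    on_battery: bool
    battery_percent: float  # 0 to 100
    cpu_percent: float      # 0 to 100
    model_slots_free: int   # permits left on the model-saturation semaphore


def choose_policy(state: MachineState) -> Policy:
    """Map machine signals to a policy. All thresholds are illustrative."""
    if state.model_slots_free <= 0:
        return Policy.PAUSED  # the model is already saturated
    if state.on_battery and state.battery_percent < 20:
        return Policy.PAUSED  # do not drain a nearly empty battery
    if state.on_battery or state.cpu_percent > 70:
        return Policy.THROTTLED
    if state.cpu_percent < 20:
        return Policy.AGGRESSIVE  # plugged in and idle
    return Policy.NORMAL


def may_run_background_job(state: MachineState) -> bool:
    return choose_policy(state) is not Policy.PAUSED
```

Keeping the decision pure makes it trivial to test without a real battery, and the job loop only needs to call may_run_background_job before each unit of LLM work.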

Fix:


Detecting These in Your Own Work

Read your longest prompt. If it contains words like CRITICAL, MUST, or NEVER followed by what looks like a control-flow instruction or a race-condition workaround, you have Prompted Architecture. The fix is to move that sentence into code and delete it from the prompt — if behavior gets worse, the prompt was carrying weight it should never have carried.
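A rough first pass at this check can be automated. The sketch below is a heuristic under stated assumptions: the list of control-flow hint words is invented for illustration, it only looks for an emphasis word and a hint on the same line, and it will produce both false positives and misses.

```python
import re

# Shouted emphasis words named in the text above (case-sensitive on purpose).
EMPHASIS = re.compile(r"\b(CRITICAL|MUST|NEVER)\b")

# Hypothetical list of words that hint at control flow or timing.
CONTROL_HINTS = re.compile(
    r"\b(before|after|until|wait|poll|retry|timeouts?|first|then)\b",
    re.IGNORECASE,
)


def flag_prompt_lines(prompt: str) -> list[tuple[int, str]]:
    """Return (line number, text) for lines that shout and mention sequencing."""
    flagged = []
    for number, line in enumerate(prompt.splitlines(), start=1):
        if EMPHASIS.search(line) and CONTROL_HINTS.search(line):
            flagged.append((number, line.strip()))
    return flagged
```

Each flagged line is a candidate for the move-it-into-code experiment described above, not proof of a problem.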

Look at your memory schema. If you have three or more “memory types” whose columns substantially overlap, you have Vector-Default Memory — or its cousin, Typed-But-Identical Memory. A schema that names six things and stores one thing is telling you the taxonomy was invented before the data.
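Column overlap is easy to measure once each store's columns are written down as a set. The sketch below uses Jaccard similarity (shared columns divided by all columns across the pair); the function names and the 0.7 default threshold are assumptions, and the threshold should be tuned to your own schema.

```python
from itertools import combinations


def column_overlap(a: set[str], b: set[str]) -> float:
    """Jaccard similarity: shared columns over all columns in either table."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def overlapping_pairs(
    schemas: dict[str, set[str]], threshold: float = 0.7
) -> list[tuple[str, str, float]]:
    """List every pair of memory types whose columns overlap at or above threshold."""
    pairs = []
    for left, right in combinations(schemas, 2):
        score = column_overlap(schemas[left], schemas[right])
        if score >= threshold:
            pairs.append((left, right, round(score, 2)))
    return pairs
```

If most pairs come back flagged, the memory types are likely one table with a type column, and the taxonomy deserves a second look.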

Run your local dev environment. If it takes more than 60 seconds to start and requires more than two processes, you may have Premature Distribution. Then count how many of those processes have more than one producer or more than one consumer in practice; that is the real distribution-worthiness check. A broker with one writer and one reader is a queue with a network hop.

Search your codebase for session.metadata or its equivalent. If you have none and your agent has long-running goals or identity, you have Compaction-Vulnerable State. A goal that lives only in a user message will be compacted away — measure it: after 50 turns of natural conversation, ask the agent what its current goal is.

Look at your background-task scheduler. If it fires on wall-clock alone with no input from battery, CPU, or memory state, you have Ungated Background Work. The fix is rarely complex — it’s usually 30 lines of signal-reading code in front of the existing job loop.