We’ve been experimenting with AI agents for the past year. The pitch: instead of babysitting an LLM through each task, you describe what you want and walk away. OpenClaw, an open-source agent framework, has pushed this further than most. Its technical struggles, a major architecture pivot, and a serious security incident show where the technology actually breaks down.
The Original Architecture
OpenClaw launched in late 2025 with an architecture we found immediately appealing: instead of one monolithic agent trying to do everything, it used specialized components working together.[^1]

[^1]: The multi-agent approach was inspired by research showing that specialized models outperform generalist ones on narrow tasks, but the question was whether that advantage would survive the coordination overhead.
The hierarchy looked like this:
| Component | Responsibility |
|---|---|
| Planner | Breaks down goals into subtasks |
| Executor | Runs individual subtasks |
| Critic | Evaluates outputs, decides if tasks succeeded |
| Memory | Persists context across sessions |
On paper, this separation of concerns made sense. In practice, we hit problems within hours of deployment.
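The division of labor in that table can be sketched in a few lines. This is an illustrative toy, not OpenClaw's actual interfaces; every name here is hypothetical:

```python
from dataclasses import dataclass, field

# Toy sketch of the four-component hierarchy. OpenClaw's real interfaces
# are not shown in this article, so all names here are illustrative.

@dataclass
class Memory:
    store: dict = field(default_factory=dict)

    def persist(self, key, value):
        self.store[key] = value

def planner(goal):
    # Break a goal into ordered subtasks (trivial split for illustration).
    return [s.strip() for s in goal.split(" then ")]

def executor(subtask):
    # Run one subtask; here we just echo a result.
    return f"done: {subtask}"

def critic(output):
    # Decide whether the subtask succeeded.
    return output.startswith("done:")

def run(goal, memory):
    results = []
    for subtask in planner(goal):
        output = executor(subtask)
        if not critic(output):
            break  # halt the plan on the first failure
        memory.persist(subtask, output)
        results.append(output)
    return results
```

The toy version looks clean for the same reason the real one did on paper: each component has one job. The failures described below all live in the seams between them.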
The Coordination Problem
Multi-agent systems require coordination, and coordination requires each agent to understand what the others can do. Current LLMs are bad at this. They hallucinate capabilities, misremember previous actions, and confidently attempt impossible tasks.
We saw this firsthand. The agent would call a tool with the wrong parameters, get an error, acknowledge the error in its own reasoning trace, then immediately make the same call again. The loop continued until the context window was exhausted.
The broader pattern: agents that can’t maintain an accurate model of their own capabilities.
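One cheap mitigation for the retry loop described above is to track failed calls outside the model and refuse to repeat an identical one. A minimal sketch, with a hypothetical `call_tool` interface:

```python
# Sketch: refuse to repeat a tool call that already failed with identical
# arguments. `call_tool` is a placeholder for any tool-dispatch function.

class RepeatedFailure(Exception):
    pass

def make_guarded(call_tool):
    failed = set()  # (tool_name, frozen kwargs) that already errored

    def guarded(tool_name, **kwargs):
        key = (tool_name, tuple(sorted(kwargs.items())))
        if key in failed:
            # Surface the loop to the caller instead of burning context.
            raise RepeatedFailure(f"already failed: {tool_name} {kwargs}")
        try:
            return call_tool(tool_name, **kwargs)
        except Exception:
            failed.add(key)
            raise

    return guarded
```

This doesn't make the model smarter; it just turns a silent context-burning loop into a loud, catchable error, which is usually the best you can do from outside the LLM.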
The Pivot: Molt Cycles
By early 2026, the OpenClaw team rethought the architecture. Instead of multiple agents, they moved to a single agent with “molt cycles”: periodic self-reflection phases where the agent pauses, reviews its actions, and sheds unnecessary context.[^2]

[^2]: The biological metaphor comes from arthropods shedding their exoskeletons. The idea was that agents should periodically “shed” their accumulated context and start fresh with only the essential information.
```python
def molt_cycle(agent, context):
    # Summarize what we've learned
    summary = agent.reflect(context.history)
    # Identify what's working and what isn't
    assessment = agent.assess(summary)
    # Shed unnecessary context, keep essentials
    new_context = agent.compress(assessment)
    return new_context
```
This solved the context window problem. Agents could run longer without hitting token limits. But it introduced a new failure mode: amnesia.
We hit this repeatedly: tasks that required carrying state across long sequences would fail unpredictably when a molt cycle fired at the wrong moment.
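One workaround for the amnesia failure is to mark task-critical state as un-sheddable, so compression can never drop it. A sketch, assuming a simple dict-shaped context (the `pinned`/`compress` names are ours, not OpenClaw's):

```python
# Sketch: pin task-critical keys so a molt cycle's compression cannot drop
# them. The dict-shaped context here is a simplification for illustration.

def compress(context, pinned, keep=3):
    """Keep all pinned entries plus the `keep` most recently added others."""
    pinned_items = {k: v for k, v in context.items() if k in pinned}
    others = [(k, v) for k, v in context.items() if k not in pinned]
    recent = dict(others[-keep:])  # dict order preserves insertion recency
    return {**pinned_items, **recent}

ctx = {"goal": "migrate db", "step1": "...", "step2": "...",
       "step3": "...", "step4": "..."}
compacted = compress(ctx, pinned={"goal"}, keep=2)
# "goal" survives no matter how aggressive the compression is
```

The hard part, of course, is deciding *what* to pin; get that wrong and you've just moved the amnesia problem one layer up.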
The Security Nightmare
The technical challenges were frustrating. The security situation was worse.
OpenClaw’s power comes from giving agents real access to your system: files, terminal, browser, long-term memory. That same access makes it a target for malware. The “skills” ecosystem, which distributes agent capabilities as markdown files, became the attack vector.[^3]

[^3]: Skills are just markdown containing instructions and tool recipes. That simplicity is both the feature and the vulnerability: markdown can include links, commands, and anything else an attacker might want to inject.
In January 2026, 1Password’s security team discovered that hundreds of OpenClaw skills were distributing macOS malware. A top-downloaded Twitter skill contained “required dependency” instructions that led to malicious infrastructure: staged payloads, obfuscated scripts, and eventually an infostealer capable of grabbing browser sessions, credentials, API keys, and SSH keys.
The attack was clever: it looked like normal installation steps. Users following the instructions had no idea they were compromising their machines.
1Password’s recommendation was blunt: don’t run OpenClaw on company devices. If you already have, treat it as potentially compromised.
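Since skills are plain markdown, one cheap (and admittedly crude) defense is a pre-install lint pass that flags embedded URLs and shell-looking instructions. This catches only the laziest attacks and is no substitute for sandboxing or review; the patterns below are our own illustrative picks:

```python
import re

# Crude pre-install lint for skill markdown: flag embedded URLs and
# shell-looking instructions. Illustrative patterns only; a determined
# attacker will evade all of these.

SUSPICIOUS = [
    (re.compile(r"https?://\S+"), "external URL"),
    (re.compile(r"curl\s+.*\|\s*(ba)?sh"), "pipe-to-shell"),
    (re.compile(r"base64\s+(-d|--decode)"), "base64 decode"),
]

def lint_skill(markdown_text):
    findings = []
    for lineno, line in enumerate(markdown_text.splitlines(), 1):
        for pattern, label in SUSPICIOUS:
            if pattern.search(line):
                findings.append((lineno, label, line.strip()))
    return findings
```

The Twitter-skill attack described above would likely have tripped at least the URL check, because the "required dependency" steps had to point somewhere.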
The Counterargument: Embrace the Risk?
Not everyone sees this as a dealbreaker. Brandon Wang’s writeup on his personal OpenClaw setup makes the opposite argument: the broad access is the point.
Wang runs OpenClaw on a Mac mini, connected to Slack, with access to his text messages (including 2FA codes), bank accounts, calendars, and contacts. His argument is that “the sweet elixir of context” creates a different kind of experience. The agent remembers preferences, learns workflows, and improves over time. Restrictive configurations, he argues, limit usefulness to the point of irrelevance.
He has a point, and the tradeoff is real: agents need access to be useful, but access creates attack surface.
The Memory-Coherence Tradeoff
Beyond security, both versions of OpenClaw fought the same problem: agents need memory to maintain coherence, but memory consumes context, and context is expensive and limited.[^4]

[^4]: This is sometimes called the “memory-coherence tradeoff” in the agent research literature, though it doesn’t seem to have a canonical name yet.
The original multi-agent approach tried external memory stores (vector databases, key-value caches, structured logs). But the agent still needs to know when to query memory and what to query for. That’s a hard problem in itself.
The molt cycle approach tried intelligent compression. But summarization always loses information. Lose the wrong information at the wrong time, and the entire task derails.
Neither approach solves this. Current LLMs aren’t designed to maintain coherent state across many interactions.
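One way to dodge the "when to query, what to query for" problem entirely is to take that decision away from the model: the harness stores facts under explicit keys and inlines all of them into every prompt. A sketch of that idea (names are ours, not any library's):

```python
# Sketch: external memory where the *harness*, not the model, decides what
# to store and when to surface it. Viable only while memory stays small.

class TaskMemory:
    def __init__(self):
        self._facts = {}

    def remember(self, key, value):
        self._facts[key] = value

    def recall(self, key, default=None):
        return self._facts.get(key, default)

    def prompt_preamble(self):
        # Inline every stored fact into the prompt. This sidesteps the
        # query-formulation problem at the cost of context tokens.
        return "\n".join(f"{k}: {v}" for k, v in self._facts.items())
```

This is the degenerate case of both approaches in the text: no retrieval step to get wrong, no compression step to lose state in. It stops working exactly when memory outgrows the context window, which is the tradeoff restated.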
Where OpenClaw Is Now
As of early 2026, the project has scaled back its ambitions. OpenClaw now positions itself as a “task automation framework,” essentially a way to chain LLM calls with error handling, focused on single-session tasks that complete before memory becomes a problem.
The maintainers have been upfront about the narrowed scope.
This matches our experience. The most reliable AI-assisted workflows we’ve built are ones where humans stay in the loop, not ones where agents run autonomously.
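In spirit, "chaining LLM calls with error handling" reduces to something like the sketch below. This is our illustration of the pattern, not OpenClaw's API; `call_llm` stands in for any LLM client:

```python
# Sketch of single-session "task automation": chain steps with retries and
# no cross-session state. `call_llm` is a placeholder for any LLM client.

def run_chain(steps, call_llm, max_retries=2):
    """steps: list of prompt-building functions taking the prior output."""
    output = None
    for step in steps:
        prompt = step(output)
        for attempt in range(max_retries + 1):
            try:
                output = call_llm(prompt)
                break
            except Exception:
                if attempt == max_retries:
                    raise  # fail fast and loud
    return output
```

Everything lives and dies within one invocation, so there is no memory to compress, no molt cycle to mistime, and no cross-session state to lose. That is the whole point of the scaled-back design.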
What We’ve Learned
After months of experimentation with OpenClaw, we’ve landed on a few principles:
- Design for tight feedback loops. The longer an agent runs without human input, the more likely it is to drift off course. Build in checkpoints.
- Make capabilities explicit. Don’t rely on the LLM to “figure out” what tools it has. Enumerate them clearly in every prompt.
- Fail fast and loud. Build tripwires that halt execution when something unexpected happens. Silent failures compound into catastrophic ones.
- Memory is harder than reasoning. Single-turn reasoning is largely solved. Multi-turn state management is not.[^5]
- Security is not optional. If your agent has broad system access, your attack surface is enormous. Treat skills and plugins as untrusted code.

[^5]: This is why retrieval-augmented generation (RAG) remains popular: it offloads the memory problem to a separate, more tractable system.
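The "fail fast and loud" principle can be made concrete as a tripwire wrapper: every step declares an invariant, and execution halts the moment one breaks. A minimal sketch (all names are illustrative):

```python
# Sketch of a "tripwire": halt execution when an invariant breaks instead
# of letting the agent continue in a bad state. Names are illustrative.

class TripwireError(RuntimeError):
    pass

def tripwire(check, message):
    """Wrap a step function; abort loudly if `check(result)` is false."""
    def wrap(step):
        def run(*args, **kwargs):
            result = step(*args, **kwargs)
            if not check(result):
                raise TripwireError(f"{message}: {result!r}")
            return result
        return run
    return wrap

@tripwire(lambda r: isinstance(r, list) and r, "expected non-empty list")
def list_py_files(paths):
    return [p for p in paths if p.endswith(".py")]
```

The decorator adds nothing clever; its value is that a surprising result stops the run with a named error instead of flowing silently into the next step.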
Fully autonomous AI agents aren’t dead as a concept, but they’re further away than the hype suggests. OpenClaw’s path from multi-agent coordination to molt cycles to task automation, with a serious security incident along the way, is worth studying. The team explored the boundaries honestly and showed us where they are.