Dev.to #systemdesign·March 7, 2026

OpenClaw: A Four-Layer Architecture for AI Assistant Runtimes

This article details the unique four-layer architecture of OpenClaw, an AI assistant runtime, focusing on its design philosophy. It explores the trade-offs behind a single-process gateway, a multi-source context assembly for LLMs, the ReAct loop for tool execution, and an innovative Markdown-based memory system with vector search. The architecture aims for lightweight yet powerful operation, connecting various channels and devices.


OpenClaw proposes a unique four-layer architecture for AI assistant runtimes, diverging from traditional microservice approaches for certain components. The design emphasizes simplicity, direct state consistency, and AI-native data formats. The layers are: Control Plane, Gateway, Agent Runtime, and Endpoint Nodes.

Gateway Layer: Single-Process Design and WebSocket Protocol

A key architectural decision is running the Gateway as a single Node.js process. This is a deliberate choice against microservices for a personal AI assistant, aiming to reduce complexity and overhead. The Gateway handles message routing, WebSocket connection management to endpoint nodes, session state, and plugin lifecycle. The rationale is that for this specific use case, the benefits of simplified deployment and zero-overhead internal calls outweigh the scalability advantages of distributed microservices.

  • Single-Process Benefits: Zero-overhead internal calls, simplified deployment, natural state consistency.
  • WebSocket Protocol: Employs `req`/`res` for synchronous calls (e.g., camera snap) and `event` for asynchronous notifications (e.g., location updates) between the Gateway and Nodes.
  • Security Model: A one-time device pairing process establishes trust, issuing long-lived tokens for subsequent authenticated connections, mirroring Bluetooth device pairing.
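The `req`/`res`/`event` split described above can be sketched as a small message dispatcher. This is an illustrative sketch only: the envelope field names (`type`, `id`, `method`, `payload`) are assumptions for the example, not OpenClaw's actual wire format.

```typescript
// Sketch of a Gateway<->Node message envelope over WebSocket, assuming a
// JSON wire format. Field names here are illustrative, not OpenClaw's.
type ReqMsg   = { type: "req";   id: string; method: string; payload?: unknown };
type ResMsg   = { type: "res";   id: string; ok: boolean; payload?: unknown };
type EventMsg = { type: "event"; name: string; payload?: unknown };
type Msg = ReqMsg | ResMsg | EventMsg;

// Pending synchronous calls waiting for a matching `res` by id.
const pending = new Map<string, (res: ResMsg) => void>();

function handleMessage(raw: string, send: (m: Msg) => void): void {
  const msg = JSON.parse(raw) as Msg;
  switch (msg.type) {
    case "req":
      // Synchronous call, e.g. "camera.snap": a real gateway would
      // dispatch to a handler; here we just acknowledge with the same id.
      send({ type: "res", id: msg.id, ok: true, payload: { method: msg.method } });
      break;
    case "res":
      // Resolve the caller that issued the request with this id.
      pending.get(msg.id)?.(msg);
      pending.delete(msg.id);
      break;
    case "event":
      // Fire-and-forget notification, e.g. a location update.
      console.log(`event ${msg.name}`, msg.payload);
      break;
  }
}
```

Matching responses to requests by `id` is what lets a single duplex WebSocket carry both synchronous calls and asynchronous notifications without separate channels.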

Agent Runtime: The AI's Brain for Context and Action

The Agent Runtime acts as the core intelligence, responsible for synthesizing a complete 'worldview' for the LLM and executing actions. It uses a multi-source context assembly and a ReAct (Reasoning + Acting) loop.

  1. Context Assembly: Gathers information from System Prompt, Workspace Files, Memory Files, Session History, and Tool Results. This ensures the LLM has all necessary information for informed decision-making.
  2. Context Window Optimization: To manage LLM token limits, older messages are compressed or truncated, oversized tool outputs are summarized, and memory files are ranked for relevance.
  3. ReAct Loop: Enables multi-step reasoning where the LLM decides to call tools, executes them, incorporates results into context, and iterates until a final response is generated. This is crucial for complex tasks involving multiple external interactions.
  4. Memory Flush: After a conversation, the Agent Runtime reviews and compresses key information, writing it to daily and long-term Markdown memory files. This mechanism provides cross-session continuity, simulating human-like memory consolidation.
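The ReAct loop in step 3 can be sketched as follows. The `llm` and tool interfaces here are placeholders standing in for a real chat-completion API; this is a minimal illustration of the iterate-until-final-answer pattern, not OpenClaw's implementation.

```typescript
// Minimal ReAct-style loop: the model either requests tool calls or
// produces a final text answer; tool results are appended to the
// context and the model is called again.
type ToolCall = { name: string; args: Record<string, unknown> };
type LlmTurn  = { toolCalls?: ToolCall[]; text?: string };
type ChatMsg  = { role: "system" | "user" | "assistant" | "tool"; content: string };

async function reactLoop(
  messages: ChatMsg[],
  llm: (msgs: ChatMsg[]) => Promise<LlmTurn>,
  tools: Record<string, (args: Record<string, unknown>) => Promise<string>>,
  maxSteps = 8,
): Promise<string> {
  for (let step = 0; step < maxSteps; step++) {
    const turn = await llm(messages);
    // No tool calls means the model has produced its final response.
    if (!turn.toolCalls?.length) return turn.text ?? "";
    for (const call of turn.toolCalls) {
      // Execute the tool and feed its result back into the context.
      const result = await tools[call.name](call.args);
      messages.push({ role: "tool", content: `${call.name}: ${result}` });
    }
  }
  return "Stopped: step limit reached.";
}
```

The step cap matters in practice: without it, a model that keeps requesting tools would loop indefinitely and burn tokens.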

Memory System: Markdown as a Database with Vector Search

OpenClaw makes an unconventional choice by using Markdown files for memory storage instead of a traditional database. This design keeps memory human-readable, version-controllable via Git, portable as plain text, and naturally AI-friendly, since LLMs excel at processing Markdown.

There are two layers of memory: Long-term memory (e.g., `MEMORY.md` for preferences, key decisions) and Daily memory (e.g., `memory/YYYY-MM-DD.md` for conversation summaries). When memory files grow large, OpenClaw leverages vector search with multiple embedding models to retrieve relevant information efficiently, combining the benefits of human-readable storage with advanced AI retrieval techniques.
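A retrieval pass over such files can be sketched as: split the Markdown into heading-level chunks, embed each chunk, and rank by cosine similarity against the query. The `embed()` below is a toy bag-of-words hash standing in for a real embedding model, and splitting on `## ` headings is an assumed chunking convention, not OpenClaw's.

```typescript
// Toy embedding: hash each word into a fixed-size count vector.
// A real system would call an embedding model here instead.
function embed(text: string, dim = 64): number[] {
  const v = new Array<number>(dim).fill(0);
  for (const word of text.toLowerCase().match(/\w+/g) ?? []) {
    let h = 0;
    for (const c of word) h = (h * 31 + c.charCodeAt(0)) >>> 0;
    v[h % dim] += 1;
  }
  return v;
}

function cosine(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) { dot += a[i] * b[i]; na += a[i] ** 2; nb += b[i] ** 2; }
  return dot / (Math.sqrt(na) * Math.sqrt(nb) || 1);
}

// Split a Markdown memory file on "## " headings and rank chunks by
// similarity to the query, returning the top matches.
function searchMemory(markdown: string, query: string, topK = 3): string[] {
  const chunks = markdown.split(/^## /m).filter(c => c.trim());
  const q = embed(query);
  return chunks
    .map(c => ({ c, score: cosine(embed(c), q) }))
    .sort((a, b) => b.score - a.score)
    .slice(0, topK)
    .map(x => x.c.trim());
}
```

Because the store is plain Markdown, this index can be rebuilt from scratch at any time; the files remain the source of truth and the vectors are disposable derived data.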

Tags: AI Assistant, LLM Architecture, System Design, WebSocket, Single-Process, Context Management, ReAct, Markdown Database
