This article details the unique four-layer architecture of OpenClaw, an AI assistant runtime, focusing on its design philosophy. It explores the trade-offs behind a single-process gateway, a multi-source context assembly for LLMs, the ReAct loop for tool execution, and an innovative Markdown-based memory system with vector search. The architecture aims for lightweight yet powerful operation, connecting various channels and devices.
OpenClaw proposes a unique four-layer architecture for AI assistant runtimes, diverging from traditional microservice approaches for certain components. The design emphasizes simplicity, direct state consistency, and AI-native data formats. The layers are: Control Plane, Gateway, Agent Runtime, and Endpoint Nodes.
A key architectural decision is running the Gateway as a single Node.js process. This is a deliberate choice against microservices for a personal AI assistant, aiming to reduce complexity and overhead. The Gateway handles message routing, WebSocket connection management to endpoint nodes, session state, and plugin lifecycle. The rationale is that for this specific use case, the benefits of simplified deployment and zero-overhead internal calls outweigh the scalability advantages of distributed microservices.
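The routing-and-session logic described above can be sketched in a few lines. This is a minimal, hypothetical illustration (the class and method names are not OpenClaw's actual API): one process owns the connection registry and session state directly, so routing a message is a plain in-process call rather than a network hop between services.

```typescript
// Hypothetical sketch of a single-process gateway. Endpoint-node connections
// (e.g. WebSockets) are modeled as plain send callbacks so the routing logic
// stands on its own.

type Send = (payload: string) => void;

class Gateway {
  private nodes = new Map<string, Send>();        // endpoint-node connections
  private sessions = new Map<string, string[]>(); // session id -> message log

  // Called when an endpoint node opens a connection.
  register(nodeId: string, send: Send): void {
    this.nodes.set(nodeId, send);
  }

  unregister(nodeId: string): void {
    this.nodes.delete(nodeId);
  }

  // Route a message to a node and record it in the session: a direct,
  // zero-overhead in-process call, with session state in the same heap.
  route(sessionId: string, nodeId: string, text: string): boolean {
    const send = this.nodes.get(nodeId);
    if (!send) return false;
    const log = this.sessions.get(sessionId) ?? [];
    log.push(text);
    this.sessions.set(sessionId, log);
    send(text);
    return true;
  }
}

// Usage: register a stand-in endpoint node and route a message through it.
const gw = new Gateway();
const received: string[] = [];
gw.register("phone-1", (p) => received.push(p));
gw.route("session-a", "phone-1", "turn on the lights");
console.log(received[0]); // "turn on the lights"
```

Because everything lives in one process, there is no serialization, service discovery, or retry logic between these components, which is exactly the trade-off the article describes.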
The Agent Runtime acts as the core intelligence, responsible for synthesizing a complete 'worldview' for the LLM and executing actions. It uses a multi-source context assembly and a ReAct (Reasoning + Acting) loop.
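The ReAct pattern alternates between a reasoning step (the LLM decides what to do next) and an acting step (a tool runs and its observation is fed back). A minimal sketch, with a stubbed model and a toy tool standing in for a real LLM call and OpenClaw's actual tool registry:

```typescript
// Sketch of a ReAct (Reasoning + Acting) loop. The llm() stub and the tool
// names are illustrative assumptions, not OpenClaw's real API.

type Step =
  | { kind: "act"; tool: string; input: string }   // model wants to run a tool
  | { kind: "final"; answer: string };             // model is done

const tools: Record<string, (input: string) => string> = {
  calc: (input) => String(eval(input)), // toy tool; never eval untrusted input
};

// Stub "LLM": acts once, then answers using the observation it received.
function llm(history: string[]): Step {
  const lastObs = history.find((h) => h.startsWith("Observation:"));
  if (!lastObs) return { kind: "act", tool: "calc", input: "6 * 7" };
  return { kind: "final", answer: lastObs.replace("Observation: ", "") };
}

function react(question: string, maxTurns = 5): string {
  const history = [`Question: ${question}`];
  for (let turn = 0; turn < maxTurns; turn++) {
    const step = llm(history);                     // Reason: decide next step
    if (step.kind === "final") return step.answer;
    const obs = tools[step.tool](step.input);      // Act: execute the tool
    history.push(`Observation: ${obs}`);           // feed the result back
  }
  return "max turns exceeded";
}

console.log(react("What is 6 times 7?")); // "42"
```

The key design point is the feedback edge: every tool observation is appended to the context before the next reasoning step, so the model's "worldview" grows as it acts.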
Markdown for Memory

OpenClaw makes an unconventional choice by using Markdown files for memory storage instead of traditional databases.
This design makes memory human-readable, version-controllable via Git, and portable (plain text), and it is naturally AI-friendly, since LLMs excel at processing Markdown.
There are two layers of memory: Long-term memory (e.g., `MEMORY.md` for preferences, key decisions) and Daily memory (e.g., `memory/YYYY-MM-DD.md` for conversation summaries). When memory files grow large, OpenClaw leverages vector search with multiple embedding models to retrieve relevant information efficiently, combining the benefits of human-readable storage with advanced AI retrieval techniques.
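The two-layer layout and the retrieval step can be sketched together. Here the file contents live in a Map for self-containment, and a toy character-bigram embedding stands in for the real embedding models; only the file paths (`MEMORY.md`, `memory/YYYY-MM-DD.md`) come from the article.

```typescript
// Sketch: two-layer Markdown memory plus naive vector retrieval.
// embed() is a toy stand-in for a real embedding model.

const memory = new Map<string, string>([
  ["MEMORY.md", "# Preferences\n- User prefers dark mode\n- No Friday deploys"],
  ["memory/2025-01-15.md", "# 2025-01-15\nDiscussed migrating the gateway to TypeScript"],
]);

// Toy embedding: character-bigram frequency vector.
function embed(text: string): Map<string, number> {
  const v = new Map<string, number>();
  const t = text.toLowerCase();
  for (let i = 0; i < t.length - 1; i++) {
    const bg = t.slice(i, i + 2);
    v.set(bg, (v.get(bg) ?? 0) + 1);
  }
  return v;
}

function cosine(a: Map<string, number>, b: Map<string, number>): number {
  let dot = 0, na = 0, nb = 0;
  for (const [k, x] of a) { dot += x * (b.get(k) ?? 0); na += x * x; }
  for (const x of b.values()) nb += x * x;
  return na && nb ? dot / Math.sqrt(na * nb) : 0;
}

// Retrieve the single most relevant Markdown line across all memory files,
// instead of stuffing every file into the LLM's context.
function recall(query: string): string {
  const q = embed(query);
  let best = "", bestScore = -1;
  for (const [, content] of memory) {
    for (const line of content.split("\n")) {
      const score = cosine(q, embed(line));
      if (score > bestScore) { bestScore = score; best = line; }
    }
  }
  return best;
}

console.log(recall("gateway TypeScript migration"));
```

The point of the hybrid: the files stay plain, diffable Markdown, while the vector index is a derived, disposable artifact that can always be rebuilt from them.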