Architecture¶
Autobot is organized around a message-driven agent loop. Channels receive user input, the agent loop orchestrates LLM interactions and tool execution, and responses flow back through the same channel.
System Overview¶
graph LR
subgraph Channels["Channels"]
direction TB
CLI[CLI]
TG[Telegram]
SL[Slack]
WA[WhatsApp]
ZP[Zulip]
end
BUS((Event Bus))
subgraph Agent["Agent Core"]
direction TB
LOOP[Agent Loop]
CTX[Context Builder]
MEM[Memory]
SESS[Sessions]
end
LLM[LLM Provider\nAnthropic · OpenAI\nDeepSeek · Groq · Gemini]
subgraph Tools["Tools"]
direction TB
BASH[Bash]
FS[Filesystem]
WEB[Web Search]
MCP[MCP Servers]
SPAWN[Subagents]
PLG[Plugins]
end
SAND[Sandbox\nDocker · Bubblewrap]
CRON[Cron\nScheduler]
CLI & TG & SL & WA & ZP --> BUS
CRON -.->|scheduled| BUS
BUS --> LOOP
LOOP --- CTX & MEM & SESS
LOOP -->|request| LLM
LLM -->|tool calls| BASH & FS & WEB & MCP & SPAWN & PLG
BASH & FS --> SAND
LOOP -.->|response| BUS
style BUS fill:#7c4dff,stroke:#651fff,color:#fff
style LOOP fill:#5c6bc0,stroke:#3949ab,color:#fff
style CTX fill:#7986cb,stroke:#5c6bc0,color:#fff
style MEM fill:#7986cb,stroke:#5c6bc0,color:#fff
style SESS fill:#7986cb,stroke:#5c6bc0,color:#fff
style LLM fill:#26a69a,stroke:#00897b,color:#fff
style CLI fill:#42a5f5,stroke:#1e88e5,color:#fff
style TG fill:#42a5f5,stroke:#1e88e5,color:#fff
style SL fill:#42a5f5,stroke:#1e88e5,color:#fff
style WA fill:#42a5f5,stroke:#1e88e5,color:#fff
style ZP fill:#42a5f5,stroke:#1e88e5,color:#fff
style BASH fill:#ffa726,stroke:#fb8c00,color:#fff
style FS fill:#ffa726,stroke:#fb8c00,color:#fff
style WEB fill:#ffa726,stroke:#fb8c00,color:#fff
style MCP fill:#ffa726,stroke:#fb8c00,color:#fff
style SPAWN fill:#ffa726,stroke:#fb8c00,color:#fff
style PLG fill:#ffa726,stroke:#fb8c00,color:#fff
style SAND fill:#ef5350,stroke:#e53935,color:#fff
style CRON fill:#ab47bc,stroke:#8e24aa,color:#fff
Request Lifecycle¶
- Message ingress — A channel adapter (CLI, Telegram, Slack, WhatsApp, Zulip) receives user input and publishes it to the event bus.
- Context assembly — The agent loop picks up the message and builds the full LLM context: system prompt, conversation history from the session store, relevant memories, and available skills.
- LLM request — The assembled context is sent to the configured LLM provider (Anthropic, OpenAI, DeepSeek, Groq, Gemini, OpenRouter, or vLLM).
- Tool execution — If the LLM response contains tool calls, each tool is executed through the tool registry. Shell commands run inside a kernel-enforced sandbox (Docker or bubblewrap). MCP tools are proxied to external servers.
- Iteration — Tool results are fed back into the agent loop. The LLM can issue further tool calls, creating a multi-turn execution cycle until it produces a final text response.
- Message egress — The final response is published to the event bus and delivered back through the originating channel.
Core Components¶
Channels (channels/)¶
Channel adapters handle protocol-specific communication (HTTP webhooks, WebSocket, polling) and normalize messages into a common format. Each channel supports media downloads for vision and voice processing.
| Channel | Transport | Features |
|---|---|---|
| CLI | stdin/stdout | Interactive and single-command modes |
| Telegram | HTTP polling | Photos, voice, custom commands, allowlists |
| Slack | WebSocket | Thread support, file uploads |
| HTTP webhooks | Media messages | |
| Zulip | HTTP Polling | Private direct messages |
Agent Loop (agent/)¶
The core orchestration engine. When a message arrives, it builds context from bootstrap files (AGENTS.md, SOUL.md, USER.md), conversation history, and memory. It then enters a loop: call the LLM, execute any requested tools inside the sandbox, feed results back, and repeat — up to a configurable maximum of iterations. Once the LLM produces a final text response, the result is saved to the session and sent back to the channel.
graph LR
CHAT["📡 Channels\nTelegram · Slack · WhatsApp · Zulip · CLI"]
CHAT --> MSG["💬 Message"]
subgraph Loop["Agent Loop"]
MSG --> LLM["🤖 LLM"]
LLM --> TOOLS["🔧 Tools"]
TOOLS -->|results| LLM
LLM -->|done| RSP["💬 Response"]
end
RSP --> CHAT
subgraph Context
direction TB
MEM["🧠 Memory"]
SKL["⚡ Skills"]
end
Context <--> TOOLS
style MSG fill:#fff3cd,stroke:#ffc107,color:#333
style LLM fill:#ffe0b2,stroke:#ff9800,color:#333
style TOOLS fill:#ffe0b2,stroke:#ff9800,color:#333
style RSP fill:#ffcdd2,stroke:#ef5350,color:#333
style CHAT fill:#e3f2fd,stroke:#90caf9,color:#333
style MEM fill:#e8d5f5,stroke:#ab47bc,color:#333
style SKL fill:#e8d5f5,stroke:#ab47bc,color:#333
On each turn, the agent loop:
- Assembles context via the context builder (system prompt + history + memory + skills)
- Handles multimodal input (images, voice transcription)
- Manages the tool call loop (execute → feed results → repeat)
- Supports subagents for parallel or delegated tasks
- Triggers memory hooks for session consolidation
LLM Providers (providers/)¶
A unified HTTP interface to multiple LLM backends. All providers implement the same request/response contract, making model switching a config change. Token usage is tracked per request for observability.
Tool System (tools/)¶
Tools are the agent's hands. The registry discovers and exposes tools to the LLM. Built-in tools include:
- Bash — Shell commands in a sandboxed environment
- Filesystem — Read/write files within the workspace
- Web — HTTP requests and web search
- Cron — Schedule recurring or one-time tasks
- Spawn — Launch subagents for parallel work
- Message — Send messages to channel owners
All shell execution goes through the sandbox executor, which enforces OS-level isolation via Docker or bubblewrap.
Event Bus (bus/)¶
Internal pub/sub system that decouples channels from the agent loop. Messages, responses, and scheduled events all flow through the bus, enabling multi-channel operation from a single agent instance.
Session Store (session/)¶
JSONL-based conversation persistence. Each session captures the full message history, enabling context continuity across restarts. Sessions are scoped per owner for multi-user isolation.
Memory (memory/)¶
Long-term memory with two tiers:
- Session memory — Automatic consolidation of conversation history
- Persistent memory — Facts and preferences that survive across sessions
See Memory for details.
MCP Servers (mcp/)¶
Model Context Protocol client that connects external tool servers. Tools from MCP servers are auto-discovered and exposed to the LLM as mcp_{server}_{tool}.
Plugins (plugins/)¶
Plugin system with bash auto-discovery and markdown skills. Plugins extend the agent's capabilities without modifying core code.
Cron (cron/)¶
Scheduler supporting cron expressions, fixed intervals, and one-time triggers. When a job fires, it publishes a message through the event bus, triggering a full agent turn with all tools. See Cron & Scheduling for details.