Architecture | VexAI Docs

Boot Flow

When VexAI starts, index.ts checks for a configuration file. If none exists, the interactive setup wizard runs; otherwise the bot boots up and initializes every subsystem in sequence.

index.ts
  ├── Check for data/config.json
  │   ├── Missing → Run setup wizard (src/setup/wizard.ts)
  │   │              └── Interactive prompts → write config.json
  │   └── Found → Start bot (src/bot/client.ts)
  │               ├── Initialize database (SQLite or PostgreSQL)
  │               ├── Run migrations
  │               ├── Register skills (built-in + user skills)
  │               ├── Initialize LLM provider
  │               ├── Start security observer
  │               ├── Register slash commands
  │               ├── Start integrations (webhooks, RSS, email)
  │               ├── Start event system
  │               ├── Start message backfill
  │               └── Ready: listening for messages
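
The branch at the top of this tree can be sketched as below. This is a minimal illustration, not VexAI's actual code: `boot`, `BootDeps`, and `BootStep` are hypothetical names, the config check is injected as a function instead of touching the file system, and steps run synchronously for brevity (real initializers are probably asynchronous). The sketch also assumes the process returns once the wizard finishes, which the diagram does not state.

```typescript
// Hypothetical types: only the ordering of steps comes from the diagram.
type BootStep = { name: string; run: () => void };

interface BootDeps {
  configExists: () => boolean; // e.g. a check for data/config.json
  runWizard: () => void;       // interactive setup that writes config.json
  steps: BootStep[];           // subsystem initializers, in boot order
}

// Returns the names of the phases that ran, in order.
function boot(deps: BootDeps): string[] {
  const log: string[] = [];
  if (!deps.configExists()) {
    deps.runWizard();
    log.push("wizard");
    return log; // assumption: the bot is started separately afterwards
  }
  for (const step of deps.steps) {
    step.run(); // a throwing step aborts the boot before "ready"
    log.push(step.name);
  }
  log.push("ready");
  return log;
}
```

Running the steps from an ordered array keeps the boot sequence explicit, so a dependency such as "migrations after database" is visible in one place.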

ℹ️ First Run

On the very first launch the wizard creates data/config.json, identity files, and the database. Subsequent starts skip the wizard entirely.

Message Pipeline

Every user message flows through message-handler.ts in a well-defined sequence. The tool execution loop can iterate up to 20 times by default, letting the LLM chain multiple tool calls in a single turn.

User Message (Discord)
  │
  ├── 1. Load conversation history (memory + SQLite cold tier)
  ├── 2. Assemble system prompt:
  │      ├── Identity (IDENTITY.md, SOUL.md)
  │      ├── Rules (RULES.md)
  │      ├── Core memory (CORE_MEMORY.md)
  │      ├── Per-user memory (MEMORY.md)
  │      └── Tool list (optimized by prompt optimizer)
  ├── 3. Send to LLM provider
  ├── 4. Tool execution loop (max iterations, default 20):
  │      ├── Parse tool calls from LLM response
  │      ├── For each tool call:
  │      │   ├── Check tool cache → return cached if hit
  │      │   ├── Security observer gate:
  │      │   │   ├── Tier 0: auto-approve
  │      │   │   ├── Tier 1: rule-based check
  │      │   │   ├── Tier 2: cached LLM verdict
  │      │   │   └── Tier 3: full LLM review
  │      │   ├── Approval gate (if destructive)
  │      │   ├── Execute tool via skill registry
  │      │   └── Cache result
  │      └── Send tool results back to LLM → repeat
  ├── 5. Split response at 2000 chars (Discord limit)
  └── 6. Send reply to Discord
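
Step 4 is the part most worth seeing in code. The following sketch shows a bounded tool loop under simplifying assumptions: `runTurn`, `Message`, `ToolCall`, and `LLMReply` are hypothetical names, the LLM and tool executor are injected as synchronous functions, and the cache, security gate, and approval gate are left out. What happens when the iteration limit is reached is not described above; this sketch simply returns the last reply text.

```typescript
interface ToolCall { name: string; args: Record<string, unknown> }
interface LLMReply { text: string; toolCalls: ToolCall[] }
type Message = { role: "user" | "assistant" | "tool"; content: string };

function runTurn(
  history: Message[],
  callLLM: (messages: Message[]) => LLMReply,
  executeTool: (call: ToolCall) => string,
  maxIterations = 20, // default taken from the pipeline description
): string {
  const messages = [...history];
  let reply = callLLM(messages);
  // Keep looping while the model asks for tools and the budget allows it.
  for (let i = 0; i < maxIterations && reply.toolCalls.length > 0; i++) {
    messages.push({ role: "assistant", content: reply.text });
    for (const call of reply.toolCalls) {
      messages.push({ role: "tool", content: executeTool(call) });
    }
    reply = callLLM(messages); // send tool results back to the LLM
  }
  return reply.text;
}
```

The iteration cap bounds the number of tool rounds per user message, so a model that keeps requesting tools cannot loop indefinitely.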

💡 Tool Cache

Identical tool calls within the same conversation turn are automatically de-duplicated via the tool cache, reducing LLM API costs and latency.
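
One plausible way to detect "identical" calls is to key the cache on the tool name plus a canonical serialization of the arguments. The sketch below assumes that approach; `ToolCache` and `stableStringify` are hypothetical names, and the real cache may use a different key or eviction policy. One instance would be created per conversation turn.

```typescript
// Serialize with sorted object keys so {a:1,b:2} and {b:2,a:1} produce the same string.
function stableStringify(value: unknown): string {
  if (value === undefined) return "null";
  if (Array.isArray(value)) return `[${value.map(stableStringify).join(",")}]`;
  if (value !== null && typeof value === "object") {
    const obj = value as Record<string, unknown>;
    const entries = Object.keys(obj)
      .sort()
      .map((k) => `${JSON.stringify(k)}:${stableStringify(obj[k])}`);
    return `{${entries.join(",")}}`;
  }
  return JSON.stringify(value);
}

class ToolCache {
  private results = new Map<string, string>();

  // Runs `execute` only if this (name, args) pair has not been seen yet.
  run(name: string, args: Record<string, unknown>, execute: () => string): string {
    const key = `${name}:${stableStringify(args)}`;
    const hit = this.results.get(key);
    if (hit !== undefined) return hit;
    const result = execute();
    this.results.set(key, result);
    return result;
  }
}
```

Sorting keys matters because an LLM may emit the same arguments in a different order on a repeated call.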

LLM Abstraction Layer

All LLM communication goes through a unified provider interface. The BaseProvider abstract class handles retry logic with 3 attempts and exponential backoff. Four concrete implementations are provided:

Provider	Class	Notes
OpenAI	`OpenAIProvider`	Native function calling
Anthropic	`AnthropicProvider`	Tool use via Anthropic API
OpenRouter	`OpenRouterProvider`	Routes to any model
OpenAI-Compatible	`OpenAICompatibleProvider`	Any `/v1/chat/completions` endpoint

Tool/message adapters normalize format differences between providers. Each provider translates tool calls and results to/from its native API format.
ProviderCache manages per-user model overrides, letting individual users switch models without affecting others.
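
The retry behaviour described for BaseProvider (3 attempts, exponential backoff) could look roughly like this. It is a sketch only: `withRetry` and `backoffDelays` are hypothetical helpers, the 500 ms base delay is an assumed value, and the real class may add jitter or only retry certain error types.

```typescript
// Delays between attempts: base, 2*base, 4*base, ... (one fewer than the attempt count).
function backoffDelays(maxAttempts: number, baseMs: number): number[] {
  return Array.from({ length: Math.max(0, maxAttempts - 1) }, (_, i) => baseMs * 2 ** i);
}

async function withRetry<T>(
  fn: () => Promise<T>,
  maxAttempts = 3, // from the description above
  baseMs = 500,    // assumption: the actual base delay is not documented here
  sleep: (ms: number) => Promise<void> = (ms) =>
    new Promise((resolve) => setTimeout(resolve, ms)),
): Promise<T> {
  const delays = backoffDelays(maxAttempts, baseMs);
  let lastError: unknown;
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    try {
      return await fn();
    } catch (err) {
      lastError = err;
      // Wait before the next attempt; no wait after the final failure.
      if (attempt < delays.length) await sleep(delays[attempt]);
    }
  }
  throw lastError;
}
```

With the defaults, a call that keeps failing is attempted three times with waits of 500 ms and 1000 ms in between, then the last error is rethrown.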

Security Flow

Every tool invocation passes through the security observer before execution. The tiered system balances safety with performance: low-risk tools are approved immediately, while high-risk actions get a full LLM review.

Tool Call Received
  │
  ├── Check risk tier (risk-tiers.ts)
  │   ├── Tier 0 → APPROVE (no overhead)
  │   ├── Tier 1 → Rate limit check → APPROVE/DENY
  │   ├── Tier 2 → Check verdict cache
  │   │           ├── Cache hit → return cached verdict
  │   │           └── Cache miss → LLM review → cache → APPROVE/DENY/ESCALATE
  │   └── Tier 3 → Always LLM review → APPROVE/DENY/ESCALATE
  │
  ├── On APPROVE → execute tool
  ├── On DENY → return denial message to LLM
  └── On ESCALATE → alert channel + deny execution

⚠️ Fail-Closed

If the security observer encounters an error (e.g., LLM timeout), the tool call is denied by default. This fail-closed design ensures the bot never accidentally executes an unchecked destructive action.
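
The tier dispatch and the fail-closed rule can be combined in a few lines. The sketch below is illustrative only: `review`, `decide`, and `ObserverDeps` are hypothetical names, the LLM reviewer is a synchronous stand-in, and the cache key format is an assumption.

```typescript
type Verdict = "APPROVE" | "DENY" | "ESCALATE";
type Tier = 0 | 1 | 2 | 3;

interface ObserverDeps {
  tierOf: (tool: string) => Tier;
  withinRateLimit: (tool: string) => boolean;
  llmReview: (tool: string, args: string) => Verdict; // may throw, e.g. on timeout
  verdictCache: Map<string, Verdict>;
}

function decide(tool: string, args: string, deps: ObserverDeps): Verdict {
  const tier = deps.tierOf(tool);
  if (tier === 0) return "APPROVE";
  if (tier === 1) return deps.withinRateLimit(tool) ? "APPROVE" : "DENY";
  if (tier === 2) {
    const key = `${tool}:${args}`;
    const cached = deps.verdictCache.get(key);
    if (cached !== undefined) return cached;
    const verdict = deps.llmReview(tool, args);
    deps.verdictCache.set(key, verdict);
    return verdict;
  }
  return deps.llmReview(tool, args); // tier 3: always a fresh review
}

// Fail closed: any error inside the observer becomes a denial.
function review(tool: string, args: string, deps: ObserverDeps): Verdict {
  try {
    return decide(tool, args, deps);
  } catch {
    return "DENY";
  }
}

// Only an explicit approval allows execution; ESCALATE also blocks the call.
const mayExecute = (verdict: Verdict): boolean => verdict === "APPROVE";
```

Wrapping the whole decision in a single `try`/`catch` means a new failure mode added later still results in a denial without extra handling.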

Skill Architecture

Skills follow a registry pattern. Each skill exposes a set of tools that the LLM can invoke during conversation.

Skill Interface

Each Skill has: name, description, tools[], execute(toolName, args, context)
SkillContext provides: channelId, userId, guildId, client, executeTool()
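
Written out as TypeScript, the two shapes above might look like the following. The field and method names come from the lists above; everything else is an assumption made for illustration, including the `ToolDefinition` shape, the argument and return types, the `SkillRegistry` class, and the `echo-skill` example.

```typescript
interface ToolDefinition { name: string; description: string } // hypothetical shape

interface SkillContext {
  channelId: string;
  userId: string;
  guildId: string;
  client: unknown; // the Discord client in real code
  executeTool: (toolName: string, args: Record<string, unknown>) => Promise<unknown>;
}

interface Skill {
  name: string;
  description: string;
  tools: ToolDefinition[];
  execute(
    toolName: string,
    args: Record<string, unknown>,
    context: SkillContext,
  ): Promise<unknown>;
}

// Minimal registry: maps each tool name to the skill that owns it.
class SkillRegistry {
  private byTool = new Map<string, Skill>();

  register(skill: Skill): void {
    for (const tool of skill.tools) this.byTool.set(tool.name, skill);
  }

  find(toolName: string): Skill | undefined {
    return this.byTool.get(toolName);
  }
}

// Example skill exposing a single tool.
const echoSkill: Skill = {
  name: "echo-skill",
  description: "Returns its input unchanged",
  tools: [{ name: "echo", description: "Echo the `text` argument" }],
  async execute(toolName, args) {
    if (toolName !== "echo") throw new Error(`Unknown tool: ${toolName}`);
    return args["text"];
  },
};
```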

Loading Order

Built-in skills: registered on startup from src/skills/built-in/
User skills: hot-loaded from the skills/ directory at the project root

Internal Tool Calls

executeTool() allows skills to call other tools internally. These internal calls have a maximum recursion depth of 5 and bypass the security observer since they are trusted by design.

🚨 Recursion Limit

Internal tool calls are capped at depth 5. If a skill exceeds this limit, a SkillError is thrown to prevent infinite recursion.
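
A depth counter threaded through the executor is one simple way to enforce such a cap. In the sketch below, `makeExecutor` and `ToolFn` are hypothetical, `SkillError` is a local stand-in for the real class, and the counting convention (the top-level call is depth 0, nested calls are depths 1 to 5) is an assumption.

```typescript
class SkillError extends Error {
  constructor(message: string) {
    super(message);
    this.name = "SkillError";
  }
}

const MAX_DEPTH = 5;

// A tool receives its args plus a callback for calling other tools internally.
type ToolFn = (
  args: unknown,
  callTool: (name: string, args: unknown) => unknown,
) => unknown;

function makeExecutor(tools: Map<string, ToolFn>) {
  const run = (name: string, args: unknown, depth: number): unknown => {
    if (depth > MAX_DEPTH) {
      throw new SkillError(`Internal tool call depth exceeded ${MAX_DEPTH}`);
    }
    const tool = tools.get(name);
    if (!tool) throw new SkillError(`Unknown tool: ${name}`);
    // Each nested call goes one level deeper.
    return tool(args, (nextName, nextArgs) => run(nextName, nextArgs, depth + 1));
  };
  return (name: string, args: unknown) => run(name, args, 0);
}
```

Because the depth travels with the callback instead of living in shared state, concurrent tool calls cannot interfere with each other's counters.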

Database Layer

A DatabaseAdapter interface abstracts the underlying engine. VexAI ships with two implementations:

Engine	Library	Features
SQLite	better-sqlite3	WAL mode, zero config, local-first
PostgreSQL	pg	Connection pooling, tsvector, pgvector

Migrations

Sequential migrations (v1–v8 for SQLite, separate Postgres migrations) run automatically on boot. Each migration is idempotent.
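
A common way to make boot-time migrations safe to re-run is to track a schema version and apply only the migrations above it. The sketch below assumes that approach. The `DatabaseAdapter` methods shown here are hypothetical and much narrower than a real adapter, and `MemoryAdapter` exists only to make the example runnable.

```typescript
// Hypothetical, minimal slice of an adapter: just what a migration runner needs.
interface DatabaseAdapter {
  getVersion(): number;
  setVersion(version: number): void;
  exec(sql: string): void;
}

interface Migration { version: number; sql: string }

// Applies pending migrations in ascending order and returns the versions applied.
function migrate(db: DatabaseAdapter, migrations: Migration[]): number[] {
  const current = db.getVersion();
  const pending = [...migrations]
    .sort((a, b) => a.version - b.version)
    .filter((m) => m.version > current);
  const applied: number[] = [];
  for (const m of pending) {
    db.exec(m.sql);
    db.setVersion(m.version); // record progress after each step
    applied.push(m.version);
  }
  return applied;
}

// In-memory stand-in used only for illustration.
class MemoryAdapter implements DatabaseAdapter {
  version = 0;
  executed: string[] = [];
  getVersion(): number { return this.version; }
  setVersion(version: number): void { this.version = version; }
  exec(sql: string): void { this.executed.push(sql); }
}
```

Running `migrate` a second time against the same adapter applies nothing, which is the property that lets it run unconditionally on every boot.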

Storage Services

Conversation store: tiered memory (hot in-memory, cold in database)
Memory store: per-user and core memory persistence
Message mirror: full Discord message archival for search
Embedding service: vector storage for semantic search

Error Handling

VexAI uses a custom error hierarchy rooted in ClawAIError. Each error type carries structured context for logging and debugging.

ClawAIError (base)
  ├── ConfigError:    configuration problems
  ├── LLMError:       LLM provider failures
  ├── SecurityError:  security violations
  └── SkillError:     skill execution failures

ℹ️ Error Source

All custom errors are defined in src/utils/errors.ts and extend the base ClawAIError class for consistent catch handling across the codebase.
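
A hierarchy like this might be declared as follows. The class names come from the tree above; the `context` field, its type, and the `describe` helper are assumptions made for illustration and may not match src/utils/errors.ts.

```typescript
class ClawAIError extends Error {
  constructor(
    message: string,
    readonly context: Record<string, unknown> = {}, // assumed shape of the structured context
  ) {
    super(message);
    // Keep `instanceof` reliable even when compiled to older targets.
    Object.setPrototypeOf(this, new.target.prototype);
    this.name = new.target.name;
  }
}

class ConfigError extends ClawAIError {}
class LLMError extends ClawAIError {}
class SecurityError extends ClawAIError {}
class SkillError extends ClawAIError {}

// One catch site can handle every custom error through the shared base class.
function describe(err: unknown): string {
  if (err instanceof ClawAIError) {
    return `${err.name}: ${err.message} ${JSON.stringify(err.context)}`;
  }
  return `Unexpected: ${String(err)}`;
}
```

For example, `describe(new SkillError("boom", { tool: "x" }))` yields `SkillError: boom {"tool":"x"}`, while a non-VexAI value falls through to the `Unexpected:` branch.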

Key Design Decisions

Decision	Detail
ESM throughout	`"type": "module"` in package.json. All imports use `.js` extensions
TypeScript strict mode	ES2022 target, ESNext modules, bundler module resolution
Local-first	All searches hit local DB first, external API as fallback
Fail-closed security	Errors default to deny: never execute unchecked actions
Runtime data in `data/`	Config, database, identity files, and user data all live in `data/`
2000 char split	Discord message limit handled transparently by `src/utils/markdown.ts`
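
The 2000-character split in the last row can be illustrated with a small helper. This is a simplified sketch, not the contents of `src/utils/markdown.ts`: `splitMessage` is a hypothetical name, it prefers to break at a newline and otherwise cuts at the limit, and it ignores markdown constructs such as code fences that a real implementation would likely keep intact.

```typescript
const DISCORD_LIMIT = 2000;

function splitMessage(text: string, limit = DISCORD_LIMIT): string[] {
  const chunks: string[] = [];
  let rest = text;
  while (rest.length > limit) {
    // Prefer the last newline inside the first `limit` characters; else hard-cut.
    const head = rest.slice(0, limit);
    const newline = head.lastIndexOf("\n");
    const breakAtNewline = newline > 0;
    const cut = breakAtNewline ? newline : limit;
    chunks.push(rest.slice(0, cut));
    // Drop the newline itself when it was used as the break point.
    rest = rest.slice(breakAtNewline ? cut + 1 : cut);
  }
  if (rest.length > 0) chunks.push(rest);
  return chunks;
}
```

Every chunk returned is at most `limit` characters long, so each one can be sent as its own Discord message.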
