Architecture

VexAI is written in TypeScript (ESM). It uses discord.js for Discord integration, better-sqlite3 or pg for database access, and a modular skill architecture. This page covers the internal architecture, data flow, and key design decisions.

Boot Flow

When VexAI starts, index.ts checks for a configuration file. If none exists, the interactive setup wizard runs; otherwise the bot boots up and initializes every subsystem in sequence.

index.ts
  ├── Check for data/config.json
  │   ├── Missing → Run setup wizard (src/setup/wizard.ts)
  │   │              └── Interactive prompts → write config.json
  │   └── Found → Start bot (src/bot/client.ts)
  │               ├── Initialize database (SQLite or PostgreSQL)
  │               ├── Run migrations
  │               ├── Register skills (built-in + user skills)
  │               ├── Initialize LLM provider
  │               ├── Start security observer
  │               ├── Register slash commands
  │               ├── Start integrations (webhooks, RSS, email)
  │               ├── Start event system
  │               ├── Start message backfill
  │               └── Ready: listening for messages
ℹ️ First Run

On the very first launch the wizard creates data/config.json, identity files, and the database. Subsequent starts skip the wizard entirely.
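
The first branch of the boot flow can be sketched as follows. This is a minimal illustration, not VexAI's actual index.ts: `decideBootPath` and `BootPath` are hypothetical names, and only the data/config.json path comes from the description above.

```typescript
import { existsSync } from "node:fs";

// Hypothetical names for illustration; not taken from the VexAI source.
type BootPath = "wizard" | "bot";

// Mirrors the first decision in the boot flow: a missing config file
// means the setup wizard has to run before the bot can start.
function decideBootPath(configPath: string = "data/config.json"): BootPath {
  return existsSync(configPath) ? "bot" : "wizard";
}

const bootPath = decideBootPath();
console.log(
  bootPath === "wizard"
    ? "No config found: running setup wizard"
    : "Config found: starting bot",
);
```

In the real entry point, the "bot" branch would go on to initialize the database, skills, LLM provider, and the other subsystems in the order listed in the diagram.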

Message Pipeline

Every user message flows through message-handler.ts in a well-defined sequence. The tool execution loop can iterate up to 20 times by default, letting the LLM chain multiple tool calls in a single turn.

User Message (Discord)
  │
  ├── 1. Load conversation history (hot in-memory tier + cold database tier)
  ├── 2. Assemble system prompt:
  │      ├── Identity (IDENTITY.md, SOUL.md)
  │      ├── Rules (RULES.md)
  │      ├── Core memory (CORE_MEMORY.md)
  │      ├── Per-user memory (MEMORY.md)
  │      └── Tool list (optimized by prompt optimizer)
  ├── 3. Send to LLM provider
  ├── 4. Tool execution loop (max iterations, default 20):
  │      ├── Parse tool calls from LLM response
  │      ├── For each tool call:
  │      │   ├── Check tool cache → return cached if hit
  │      │   ├── Security observer gate:
  │      │   │   ├── Tier 0: auto-approve
  │      │   │   ├── Tier 1: rule-based check
  │      │   │   ├── Tier 2: cached LLM verdict
  │      │   │   └── Tier 3: full LLM review
  │      │   ├── Approval gate (if destructive)
  │      │   ├── Execute tool via skill registry
  │      │   └── Cache result
  │      └── Send tool results back to LLM → repeat
  ├── 5. Split response at 2000 chars (Discord limit)
  └── 6. Send reply to Discord
💡 Tool Cache

Identical tool calls within the same conversation turn are automatically de-duplicated via the tool cache, reducing LLM API costs and latency.
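
The tool execution loop and the per-turn tool cache can be sketched together. This is a simplified illustration under stated assumptions: every type and function name below is hypothetical, the LLM and tool calls are synchronous stand-ins for what would be async calls, and the security and approval gates are left out. Only the default limit of 20 iterations comes from the text.

```typescript
// Hypothetical types for illustration; VexAI's real types are not shown on this page.
interface ToolCall {
  name: string;
  args: Record<string, unknown>;
}

interface LLMReply {
  text: string;
  toolCalls: ToolCall[];
}

// Synchronous stand-ins keep the sketch short; real provider and tool calls are async.
type AskLLM = (toolResults: string[]) => LLMReply;
type RunTool = (call: ToolCall) => string;

function runToolLoop(askLLM: AskLLM, runTool: RunTool, maxIterations = 20): string {
  const cache = new Map<string, string>(); // lives for one conversation turn
  let reply = askLLM([]);

  for (let i = 0; i < maxIterations && reply.toolCalls.length > 0; i++) {
    const results = reply.toolCalls.map((call) => {
      // Key on tool name plus serialized arguments (sensitive to key order).
      const key = `${call.name}:${JSON.stringify(call.args)}`;
      const cached = cache.get(key);
      if (cached !== undefined) return cached; // cache hit: skip execution
      const result = runTool(call);
      cache.set(key, result);
      return result;
    });
    reply = askLLM(results); // feed results back and let the model continue
  }
  return reply.text;
}

// Usage: a fake model that asks for the same tool twice, then answers.
let executions = 0;
let turn = 0;
const fakeLLM: AskLLM = () => {
  turn++;
  return turn <= 2
    ? { text: "", toolCalls: [{ name: "get_time", args: {} }] }
    : { text: "It is noon.", toolCalls: [] };
};
const answer = runToolLoop(fakeLLM, () => {
  executions++;
  return "12:00";
});
console.log(answer, executions); // prints "It is noon. 1"
```

When the iteration limit is reached while the model is still requesting tools, this sketch simply returns whatever text the last reply carried; how VexAI handles that case is not stated here.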

LLM Abstraction Layer

All LLM communication goes through a unified provider interface. The BaseProvider abstract class handles retries (3 attempts with exponential backoff), and four concrete implementations build on it:

Provider          | Class                    | Notes
OpenAI            | OpenAIProvider           | Native function calling
Anthropic         | AnthropicProvider        | Tool use via Anthropic API
OpenRouter        | OpenRouterProvider       | Routes to any model
OpenAI-Compatible | OpenAICompatibleProvider | Any /v1/chat/completions endpoint

  • Tool/message adapters normalize format differences between providers. Each provider translates tool calls and results to/from its native API format.
  • ProviderCache manages per-user model overrides, letting individual users switch models without affecting others.
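
The retry behavior described for BaseProvider (3 attempts with exponential backoff) might look roughly like the helper below. This is a sketch, not the class itself: `withRetry` and `backoffDelay` are hypothetical names, and the 500 ms base delay is an assumption since the page does not state the actual delays.

```typescript
// Delay before the next attempt, doubling each time: 500, 1000, 2000, ...
// The 500 ms base is an assumed value, not taken from VexAI.
function backoffDelay(attempt: number, baseDelayMs = 500): number {
  return baseDelayMs * 2 ** attempt;
}

// Hypothetical helper: runs fn up to maxAttempts times and rethrows the
// last error if every attempt fails.
async function withRetry<T>(
  fn: () => Promise<T>,
  maxAttempts = 3,
  baseDelayMs = 500,
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    try {
      return await fn();
    } catch (err) {
      lastError = err;
      // Wait only between attempts, not after the final failure.
      if (attempt < maxAttempts - 1) {
        const delay = backoffDelay(attempt, baseDelayMs);
        await new Promise<void>((resolve) => setTimeout(resolve, delay));
      }
    }
  }
  throw lastError;
}

console.log([0, 1].map((attempt) => backoffDelay(attempt))); // [ 500, 1000 ]
```

A production version would usually retry only on transient failures (timeouts, rate limits, 5xx responses) and often adds jitter to the delay; whether BaseProvider does either is not stated on this page.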

Security Flow

Every tool invocation passes through the security observer before execution. The tiered system balances safety with performance: low-risk tools fly through instantly while high-risk actions get full LLM review.

Tool Call Received
  │
  ├── Check risk tier (risk-tiers.ts)
  │   ├── Tier 0 → APPROVE (no overhead)
  │   ├── Tier 1 → Rate limit check → APPROVE/DENY
  │   ├── Tier 2 → Check verdict cache
  │   │           ├── Cache hit → return cached verdict
  │   │           └── Cache miss → LLM review → cache → APPROVE/DENY/ESCALATE
  │   └── Tier 3 → Always LLM review → APPROVE/DENY/ESCALATE
  │
  ├── On APPROVE → execute tool
  ├── On DENY → return denial message to LLM
  └── On ESCALATE → alert channel + deny execution
⚠️ Fail-Closed

If the security observer encounters an error (e.g., LLM timeout), the tool call is denied by default. This fail-closed design ensures the bot never accidentally executes an unchecked destructive action.
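
The tiered gate and its fail-closed behavior can be illustrated with a small sketch. All names here (`reviewToolCall`, `GateDeps`, `mayExecute`) are hypothetical, and the LLM review is a synchronous stand-in for what would be an async call; the tiers and verdicts follow the diagram above.

```typescript
type Verdict = "APPROVE" | "DENY" | "ESCALATE";
type Tier = 0 | 1 | 2 | 3;

// Hypothetical dependencies, injected so the sketch stays self-contained.
interface GateDeps {
  withinRateLimit: (tool: string) => boolean;
  llmReview: (tool: string, args: string) => Verdict; // may throw, e.g. on timeout
}

const verdictCache = new Map<string, Verdict>();

function reviewToolCall(tier: Tier, tool: string, args: string, deps: GateDeps): Verdict {
  try {
    if (tier === 0) return "APPROVE"; // no overhead
    if (tier === 1) return deps.withinRateLimit(tool) ? "APPROVE" : "DENY";
    if (tier === 2) {
      const key = `${tool}:${args}`;
      const cached = verdictCache.get(key);
      if (cached !== undefined) return cached; // cache hit
      const verdict = deps.llmReview(tool, args);
      verdictCache.set(key, verdict);
      return verdict;
    }
    return deps.llmReview(tool, args); // tier 3: always a full review
  } catch {
    return "DENY"; // fail closed: any observer error blocks the call
  }
}

// Only APPROVE leads to execution; ESCALATE alerts a channel and is still denied.
function mayExecute(verdict: Verdict): boolean {
  return verdict === "APPROVE";
}

// Usage: a review that times out results in a denial, not an execution.
const flaky: GateDeps = {
  withinRateLimit: () => true,
  llmReview: () => {
    throw new Error("LLM timeout");
  },
};
console.log(reviewToolCall(3, "delete_channel", "{}", flaky)); // prints "DENY"
```

Wrapping the whole decision in a single try/catch is one simple way to guarantee that no error path can fall through to an approval.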

Skill Architecture

Skills follow a registry pattern. Each skill exposes a set of tools that the LLM can invoke during conversation.

Skill Interface

  • Each Skill has: name, description, tools[], execute(toolName, args, context)
  • SkillContext provides: channelId, userId, guildId, client, executeTool()
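
The interface and registry pattern described above might be typed roughly as follows. The field names follow the bullets; the concrete types, the `ToolDefinition` shape, and the `SkillRegistry` class are assumptions made for illustration.

```typescript
// Field names come from the list above; the types are assumed.
interface SkillContext {
  channelId: string;
  userId: string;
  guildId: string | null;
  client: unknown; // the discord.js Client in the real bot
  executeTool: (toolName: string, args: Record<string, unknown>) => Promise<string>;
}

interface ToolDefinition {
  name: string;
  description: string;
}

interface Skill {
  name: string;
  description: string;
  tools: ToolDefinition[];
  execute(toolName: string, args: Record<string, unknown>, context: SkillContext): Promise<string>;
}

// Hypothetical registry: maps each tool name to the skill that owns it.
class SkillRegistry {
  private toolOwners = new Map<string, Skill>();

  register(skill: Skill): void {
    // Check for clashes first so a failed registration leaves no partial state.
    for (const tool of skill.tools) {
      if (this.toolOwners.has(tool.name)) {
        throw new Error(`Duplicate tool name: ${tool.name}`);
      }
    }
    for (const tool of skill.tools) {
      this.toolOwners.set(tool.name, skill);
    }
  }

  findSkill(toolName: string): Skill | undefined {
    return this.toolOwners.get(toolName);
  }
}

// Usage: a tiny skill exposing a single tool.
const echoSkill: Skill = {
  name: "echo",
  description: "Repeats text back to the caller",
  tools: [{ name: "echo_text", description: "Return the given text unchanged" }],
  async execute(_toolName, args) {
    return String(args.text ?? "");
  },
};

const registry = new SkillRegistry();
registry.register(echoSkill);
console.log(registry.findSkill("echo_text")?.name); // prints "echo"
```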

Loading Order

  1. Built-in skills: registered on startup from src/skills/built-in/
  2. User skills: hot-loaded from the skills/ directory at the project root

Internal Tool Calls

executeTool() allows skills to call other tools internally. These internal calls have a maximum recursion depth of 5 and bypass the security observer since they are trusted by design.

🚨 Recursion Limit

Internal tool calls are capped at depth 5. If a skill exceeds this limit, a SkillError is thrown to prevent infinite recursion.
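
A depth-limited executor along these lines would enforce the cap. This is a sketch under assumptions: `makeExecutor` and `ToolFn` are hypothetical, the `SkillError` class is a local stand-in for the one in src/utils/errors.ts, and the counting convention (the top-level call is depth 0, so up to five nested internal calls are allowed) is a guess, since the page does not define how depth is counted.

```typescript
// Local stand-in for VexAI's SkillError.
class SkillError extends Error {
  constructor(message: string) {
    super(message);
    this.name = "SkillError";
  }
}

const MAX_DEPTH = 5;

// A tool receives its arguments plus a callback for calling other tools.
type ToolFn = (args: string, callTool: (name: string, args: string) => string) => string;

function makeExecutor(tools: Map<string, ToolFn>) {
  function execute(name: string, args: string, depth: number): string {
    if (depth > MAX_DEPTH) {
      throw new SkillError(`Internal tool call depth exceeded ${MAX_DEPTH}`);
    }
    const tool = tools.get(name);
    if (!tool) throw new SkillError(`Unknown tool: ${name}`);
    // Every nested call goes one level deeper.
    return tool(args, (nextName, nextArgs) => execute(nextName, nextArgs, depth + 1));
  }
  return (name: string, args: string): string => execute(name, args, 0);
}

// Usage: a tool that calls itself until its counter reaches zero.
const tools = new Map<string, ToolFn>();
tools.set("countdown", (args, callTool) =>
  args === "0" ? "done" : callTool("countdown", String(Number(args) - 1)),
);
const run = makeExecutor(tools);

console.log(run("countdown", "5")); // prints "done" (five nested calls)
try {
  run("countdown", "6"); // needs a sixth nested call
} catch (err) {
  console.log((err as Error).name); // prints "SkillError"
}
```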

Database Layer

A DatabaseAdapter interface abstracts the underlying engine. VexAI ships with two implementations:

Engine     | Library        | Features
SQLite     | better-sqlite3 | WAL mode, zero config, local-first
PostgreSQL | pg             | Connection pooling, tsvector, pgvector

Migrations

Sequential migrations (v1–v8 for SQLite, separate Postgres migrations) run automatically on boot. Each migration is idempotent.
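
A sequential migration runner over an adapter interface could be sketched like this. The `DatabaseAdapter` methods shown are hypothetical (the real interface is not documented on this page), the SQL is illustrative, and the in-memory adapter exists only to make the example runnable. Re-running is safe here for two reasons: the runner skips versions already recorded, and the statements themselves use IF NOT EXISTS.

```typescript
// Hypothetical minimal adapter surface; the real interface is likely larger.
interface DatabaseAdapter {
  getSchemaVersion(): number;
  setSchemaVersion(version: number): void;
  exec(sql: string): void;
}

interface Migration {
  version: number;
  sql: string;
}

// Applies pending migrations in ascending version order and returns
// the versions that were applied in this run.
function runMigrations(db: DatabaseAdapter, migrations: Migration[]): number[] {
  const applied: number[] = [];
  const ordered = [...migrations].sort((a, b) => a.version - b.version);
  for (const migration of ordered) {
    if (migration.version <= db.getSchemaVersion()) continue; // already applied
    db.exec(migration.sql);
    db.setSchemaVersion(migration.version);
    applied.push(migration.version);
  }
  return applied;
}

// In-memory stand-in that records statements instead of running them.
class InMemoryAdapter implements DatabaseAdapter {
  statements: string[] = [];
  private version = 0;
  getSchemaVersion(): number {
    return this.version;
  }
  setSchemaVersion(version: number): void {
    this.version = version;
  }
  exec(sql: string): void {
    this.statements.push(sql);
  }
}

const db = new InMemoryAdapter();
const migrations: Migration[] = [
  { version: 2, sql: "CREATE INDEX IF NOT EXISTS idx_messages_channel ON messages (channel_id)" },
  { version: 1, sql: "CREATE TABLE IF NOT EXISTS messages (id TEXT PRIMARY KEY, channel_id TEXT)" },
];
console.log(runMigrations(db, migrations)); // [ 1, 2 ]
console.log(runMigrations(db, migrations)); // []
```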

Storage Services

  • Conversation store: tiered memory (hot in-memory, cold in database)
  • Memory store: per-user and core memory persistence
  • Message mirror: full Discord message archival for search
  • Embedding service: vector storage for semantic search
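
To make the hot/cold split of the conversation store concrete, here is one possible shape for it. Everything in this sketch is an assumption: the class and interface names, the write-through strategy, and the hot-tier size limit are illustrative and not taken from VexAI.

```typescript
interface StoredMessage {
  role: "user" | "assistant";
  content: string;
}

// Hypothetical cold tier; in the bot this would be backed by the database.
interface ColdStore {
  load(channelId: string): StoredMessage[];
  append(channelId: string, message: StoredMessage): void;
}

class TieredConversationStore {
  private hot = new Map<string, StoredMessage[]>();
  private cold: ColdStore;
  private hotLimit: number;

  // hotLimit is assumed to be at least 1.
  constructor(cold: ColdStore, hotLimit = 50) {
    this.cold = cold;
    this.hotLimit = hotLimit;
  }

  // Serve from memory when possible; otherwise warm the hot tier from cold storage.
  history(channelId: string): StoredMessage[] {
    let messages = this.hot.get(channelId);
    if (!messages) {
      messages = this.cold.load(channelId).slice(-this.hotLimit);
      this.hot.set(channelId, messages);
    }
    return messages;
  }

  // Write-through: keep only recent messages hot, persist everything cold.
  append(channelId: string, message: StoredMessage): void {
    const messages = this.history(channelId);
    messages.push(message);
    if (messages.length > this.hotLimit) messages.shift();
    this.cold.append(channelId, message);
  }

  // Drop a conversation from memory; it can be reloaded from the cold tier.
  evict(channelId: string): void {
    this.hot.delete(channelId);
  }
}

// Usage with an in-memory stand-in for the database.
const rows = new Map<string, StoredMessage[]>();
const coldStore: ColdStore = {
  load: (id) => [...(rows.get(id) ?? [])],
  append: (id, message) => {
    rows.set(id, [...(rows.get(id) ?? []), message]);
  },
};
const store = new TieredConversationStore(coldStore, 2);
for (const content of ["one", "two", "three"]) {
  store.append("channel-1", { role: "user", content });
}
console.log(store.history("channel-1").map((m) => m.content)); // [ 'two', 'three' ]
```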

Error Handling

VexAI uses a custom error hierarchy rooted in ClawAIError. Each error type carries structured context for logging and debugging.

ClawAIError (base)
  ├── ConfigError:    configuration problems
  ├── LLMError:       LLM provider failures
  ├── SecurityError:  security violations
  └── SkillError:     skill execution failures
ℹ️ Error Source

All custom errors are defined in src/utils/errors.ts and extend the base ClawAIError class for consistent catch handling across the codebase.
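
The hierarchy above could be declared along these lines. The class names match the tree; the `context` field, the constructor signature, and the `describeError` helper are assumptions, since the contents of src/utils/errors.ts are not shown here.

```typescript
// Base class; the shape of `context` is an assumed design.
class ClawAIError extends Error {
  readonly context: Record<string, unknown>;

  constructor(message: string, context: Record<string, unknown> = {}) {
    super(message);
    // Only matters when compiling to ES5; harmless on the ES2022 target.
    Object.setPrototypeOf(this, new.target.prototype);
    this.name = new.target.name; // e.g. "LLMError" for subclasses
    this.context = context;
  }
}

class ConfigError extends ClawAIError {}
class LLMError extends ClawAIError {}
class SecurityError extends ClawAIError {}
class SkillError extends ClawAIError {}

// Hypothetical helper: one catch path handles every custom error.
function describeError(err: unknown): string {
  if (err instanceof ClawAIError) {
    return `${err.name}: ${err.message} ${JSON.stringify(err.context)}`;
  }
  return `Unexpected: ${String(err)}`;
}

try {
  throw new LLMError("Request timed out", { provider: "openai", attempt: 3 });
} catch (err) {
  console.log(describeError(err));
  // prints: LLMError: Request timed out {"provider":"openai","attempt":3}
}
```

Because every subclass extends the same base, callers can catch `ClawAIError` for generic handling or a specific subclass when they need to react differently.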

Key Design Decisions

Decision               | Detail
ESM throughout         | "type": "module" in package.json. All imports use .js extensions
TypeScript strict mode | ES2022 target, ESNext modules, bundler module resolution
Local-first            | All searches hit local DB first, external API as fallback
Fail-closed security   | Errors default to deny: never execute unchecked actions
Runtime data in data/  | Config, database, identity files, and user data all live in data/
2000 char split        | Discord message limit handled transparently by src/utils/markdown.ts
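
As an illustration of the 2000-character split, a basic splitter might look like the sketch below. It is not the code in src/utils/markdown.ts: `splitMessage` is a hypothetical name, and the strategy (break at the last newline inside the limit, otherwise hard-cut) is an assumption.

```typescript
const DISCORD_LIMIT = 2000;

// Splits text into chunks of at most `limit` characters, preferring to
// break at a newline so lines stay intact where possible.
function splitMessage(text: string, limit: number = DISCORD_LIMIT): string[] {
  const chunks: string[] = [];
  let rest = text;
  while (rest.length > limit) {
    const newlineAt = rest.lastIndexOf("\n", limit);
    const end = newlineAt > 0 ? newlineAt : limit; // no usable newline: hard cut
    chunks.push(rest.slice(0, end));
    rest = rest.slice(end).replace(/^\n/, ""); // drop the newline we split on
  }
  if (rest.length > 0) chunks.push(rest);
  return chunks;
}

console.log(splitMessage("a".repeat(4500)).map((chunk) => chunk.length)); // [ 2000, 2000, 500 ]
```

This simple version can cut through a Markdown code fence or a surrogate pair; a utility that handles Markdown would need to track open fences and re-open them in the next chunk.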