Files
Pyra/CLAUDE.md
T
2026-05-18 15:28:06 +02:00

465 lines
23 KiB
Markdown

# Pyra — Developer Guide
## What Is This
Pyra is a personal AI assistant CLI combining a multi-provider AI chat interface with
a plugin/integration system (Stage 2+) and an encrypted vault (Stage 3+).
## Current Status
**Stage 3 — Memory Database: complete** (2026-05-18)
Next: Stage 4 — Vault Encryption
## Project Roadmap
### Stage 1 — Core CLI ✅ COMPLETE
Working `pyra` executable with provider setup wizard, streaming chat REPL, .md-based
memory in `~/.pyra/memory/`, and hard security boundaries around the vault.
### Stage 2 — Plugin Framework ✅ COMPLETE
- `src/pyra/plugins/` package: `base.py`, `loader.py`, `registry.py`, `executor.py`, `install.py`
- `src/pyra/bundled_plugins/` — ships bundled plugin scripts with pyra
- `src/pyra/daemon/` stub (CLI surface only)
- Config: `PluginConfig` + `DaemonConfig` added to `PyraConfig`
- Bootstrap: `~/.pyra/plugins/` and `~/.pyra/logs/` created on startup
- Chat session: AI tool-use loop (up to 10 iterations), approval gate, plugin slash commands
- CLI: `pyra plugin list/install/enable/disable/setup`, `pyra daemon *` stubs
### Stage 3 — Memory Database ✅ COMPLETE
- `src/pyra/memory/database.py`: SQLite + FTS5 via `memory_meta` + `memory_fts` tables
- `memory_meta` columns: `path`, `category`, `size_bytes`, `modified`, `summary`, `keywords`, `embedding BLOB` (reserved for Stage 8)
- `list_memories()` queries DB; `lookup_memories()` uses FTS5 with JSON-index fallback
- `write_memory()` / `append_memory()` upsert to DB on every write
- `bootstrap()` calls `init_db()` + `migrate_from_files()` (one-shot migration of existing `.md` files)
- `.md` files remain the canonical store; DB is the search index
### Stage 4 — Vault Encryption
Encrypt `~/.pyra/vault/secrets/` using `age` (or GPG fallback). Pyra decrypts in memory
at call time only — no plaintext ever written to disk after initial setup. Secret
rotation support. Per-key passphrases optional.
### Stage 5 — Skills System
YAML-defined multi-plugin workflows with event triggers and AI-driven selection.
Skills compose existing plugin tools into automated pipelines with conditional branching
and human-in-the-loop decision points.
### Stage 6 — Daemon + Messaging Bots
Always-on asyncio daemon, IPC socket, launchd/systemd service. Bundled bots:
`matrix_bot`, `telegram_bot`, `signal_bot`. Sender allowlist, bcrypt passphrase
challenge, rate limiting (20 msg/hr), injection scanning on all incoming messages,
tool approval over messaging (2-min timeout).
### Stage 7 — Security Audit Sub-agent
`pyra security audit` — sandboxed agent scanning for prompt injection in memory files,
unexpected vault access in `security.log`, outdated CVEs, permission drift on `~/.pyra/`.
Report written to `~/.pyra/security_audit.md` (not AI-readable during normal chat).
### Stage 8 — Web UI / Advanced Features
Optional local web interface (FastAPI + HTMX or similar). Embedding-based memory search
via `sqlite-vec`. Multi-profile support (work vs personal).
---
### Plugin Catalog (not stage-gated — ships when ready)
Plugins are developed independently on `plugin/<name>` branches and merged to `main`
only when complete. All integrations are standalone Python plugin scripts in
`~/.pyra/plugins/` — not hardcoded in `src/pyra/`. Plugin credentials are stored in
the vault under namespaced keys (`plugin:{name}:{key}`).
| Plugin | Branch | Status |
|--------|--------|--------|
| `nextcloud` | `plugin/nextcloud` | planned |
| `email` | `plugin/email` | planned |
| `websearch` | `plugin/websearch` | planned |
| `headless_browser` | `plugin/headless_browser` | planned |
| `server_manager` | `plugin/server_manager` | planned |
| `matrix_bot` | `plugin/matrix_bot` | planned |
| `telegram_bot` | `plugin/telegram_bot` | planned |
| `signal_bot` | `plugin/signal_bot` | planned |
| `ssh_tool` | `plugin/ssh_tool` | planned |
| `docker_tool` | `plugin/docker_tool` | planned |
| `gdrive` | `plugin/gdrive` | planned |
| `onedrive` | `plugin/onedrive` | planned |
| `dropbox_tool` | `plugin/dropbox_tool` | planned |
---
## Architecture
### Source: `src/pyra/`
| Module | Purpose |
|--------|---------|
| `cli.py` | Click entrypoint. Subcommands: `setup`, `chat`, `memory`, `plugin`, `daemon` |
| `setup/providers.py` | Provider registry — pure data, no I/O |
| `setup/wizard.py` | questionary-based interactive setup wizard |
| `config/schema.py` | Pydantic v2 models — `PyraConfig`, `PluginConfig`, `DaemonConfig` |
| `config/manager.py` | ruamel.yaml round-trip config read/write, chmod 600 enforced |
| `config/dirs.py` | `bootstrap()` — creates `~/.pyra/` tree, checks vault sentinel every startup |
| `chat/session.py` | prompt_toolkit REPL loop, AI tool-use loop, plugin slash commands |
| `chat/renderer.py` | Streaming + non-streaming markdown via rich, injection warning panel |
| `chat/history.py` | Conversation list, token budget trimming, tool message support |
| `memory/database.py` | SQLite+FTS5 — `init_db()`, `upsert()`, `remove()`, `search()`, `list_all()`, `migrate_from_files()` |
| `memory/reader.py` | `list_memories()` (DB-backed), `read_memory()`, `lookup_memories()` (FTS5), `load_context_for_session()` |
| `memory/writer.py` | `write_memory()`, `append_memory()` — writes file + upserts to DB |
| `memory/index.py` | Auto-regenerate `MEMORY_INDEX.md` + `memory_index.json` on every write |
| `vault/reader.py` | `get_key(key)` — sole accessor of `vault/secrets/api_keys.json` |
| `vault/writer.py` | `set_key()`, `delete_key()` — only called from setup wizard + plugin setup |
| `security/boundaries.py` | `assert_safe_path()`, `check_vault_lock()`, `BLOCKED_PREFIXES` |
| `security/injection.py` | `scan_response()` — 15 regex patterns, 4 categories, logs to `security.log` |
| `utils/paths.py` | `pyra_home()`, `ensure_dir()`, `safe_chmod()`, `expand()` |
| `plugins/base.py` | `Tool` dataclass, `PyraPlugin` Protocol, `BasePlugin` helper class |
| `plugins/loader.py` | Discovers + loads plugins via importlib; failures isolated per plugin |
| `plugins/registry.py` | Singleton: aggregates tools, slash commands, system prompt additions |
| `plugins/executor.py` | Approval gate: scan args → prompt → execute → scan result → log |
| `plugins/install.py` | Copies bundled plugins to `~/.pyra/plugins/` |
| `bundled_plugins/` | Standalone plugin scripts shipped with pyra (installed on demand) |
| `daemon/__init__.py` | Daemon package stub (implementation in Stage 2.4) |
### Runtime: `~/.pyra/`
```
~/.pyra/
├── config.yaml chmod 600 ← provider_id, model, base_url, enabled plugins
├── security.log chmod 600 ← injection event log
├── memory/ chmod 700
│ ├── user/profile.md
│ ├── context/
│ ├── knowledge/
│ └── MEMORY_INDEX.md
├── plugins/ chmod 700 ← active plugins (each is a dir with manifest.json + plugin.py)
│ └── <name>/
│ ├── manifest.json
│ └── plugin.py
├── logs/ chmod 700
│ ├── tool_executions.log chmod 600 ← every tool call: approved/declined, args, result preview
│ └── plugin_errors.log chmod 600 ← plugin load failures
└── vault/ chmod 700 ← AI CANNOT ACCESS
├── .vault_lock chmod 400 ← sentinel; missing = refuse to start
└── secrets/
└── api_keys.json chmod 400 ← ALL secrets (AI keys + plugin credentials)
```
### Plugin Credential Naming Convention
Plugin credentials live in the vault under namespaced keys:
```
plugin:{plugin_name}:{key_name}
```
Examples: `plugin:nextcloud:password`, `plugin:matrix_bot:access_token`
The vault's `get_key()` / `set_key()` accept any string — the namespace is enforced
by convention in each plugin's `setup()` method.
### Writing a Plugin
1. Create `~/.pyra/plugins/<name>/manifest.json`:
```json
{"name": "<name>", "version": "1.0.0", "description": "...", "author": "you"}
```
2. Create `~/.pyra/plugins/<name>/plugin.py` exporting `get_plugin() -> BasePlugin`:
```python
from pyra.plugins.base import BasePlugin, Tool
class MyPlugin(BasePlugin):
name = "<name>"
description = "..."
version = "1.0.0"
def on_load(self, vault_reader):
self._secret = vault_reader("plugin:<name>:secret")
def tools(self):
return [
Tool("my_tool", "Does X", {"type": "object", "properties": {}},
self._do_x, requires_approval=True)
]
def _do_x(self):
return "result"
def setup(self, console, vault_writer):
secret = console.input("Enter secret: ")
vault_writer("plugin:<name>:secret", secret)
def get_plugin():
return MyPlugin()
```
3. `pyra plugin enable <name>`
**Plugin rules:**
- Never import from `pyra.vault` directly — use the `vault_reader`/`vault_writer` callables
- All write/destructive tools must set `requires_approval=True`
- Return strings from tool handlers (truncated to 4000 chars by executor)
---
## Security Rules (never break these)
1. **Never pass config file contents into a system prompt** — config may reveal provider/model
2. **Never bypass `assert_safe_path()`** — not even in tests (use `tmp_pyra_home` fixture instead)
3. **Always `chmod 600/400`** after writing any file in `~/.pyra/`
4. **No shell execution from AI-generated text** — plugins use explicit approval gates
5. **`vault/reader.py` and `vault/writer.py` are the only modules that may open `api_keys.json`**
6. **API key retrieved inline at call time** — never stored as an instance variable or logged
7. **Tool arguments and results are always injection-scanned** before being used or returned to AI
8. **Plugin directories are validated with `assert_safe_path()`** before loading (symlink protection)
9. **Messaging bot security**: sender allowlist + bcrypt passphrase + rate limiting (Stage 2.4)
## Adding a New Provider
Edit `src/pyra/setup/providers.py`. Add a new `Provider` dataclass entry with all required fields.
litellm handles dispatch automatically via the `litellm_prefix` field.
Add a test in `tests/unit/test_providers.py` to verify the new entry.
## Installing for Development
```bash
uv venv && source .venv/bin/activate
uv pip install -e ".[dev]"
pyra setup
# Install optional plugin dependencies:
uv pip install -e ".[nextcloud]" # Nextcloud plugin
uv pip install -e ".[ssh]" # SSH plugin
uv pip install -e ".[all-plugins]" # Everything
```
## Running Tests
```bash
pytest tests/ -v # all unit + security tests (161 tests)
pytest tests/integration/test_lmstudio.py # requires LM Studio at localhost:1234
```
## Commit Convention
```
feat(module): short description
fix(module): short description
test: description
docs: description
chore: description
```
---
## Workflow Rules
### Bugfixes
- **Stay under 50 lines changed.** Find the root cause and fix it directly.
- If the fix seems to require more than 50 lines, it is probably a refactor, not a bugfix — stop and discuss with the user before proceeding.
- Do not write workarounds, fallback layers, or compatibility shims to route around a bug. Remove the cause.
### Committing Changes
- **Commit after every logical unit of work** — do not batch unrelated changes into one commit and do not wait until the end of a session.
- **One commit per concern.** If a session touches a file for two different reasons (e.g. a bugfix and a cleanup), those are two separate commits — staged and committed independently, even if the file is the same.
- Use the project commit convention: `feat(module):`, `fix(module):`, `test:`, `docs:`, `chore:` followed by a short description.
- Always `git add` only the files relevant to that commit — never `git add .` blindly.
- **Always push after committing** — every commit goes to the remote Gitea repository immediately.
### Plugin Branches
- Every plugin is developed on its own branch: `plugin/<name>` (e.g. `plugin/nextcloud`).
- A plugin branch is **never merged to `main` until the plugin is complete and tested**.
- `main` always contains only production-ready core source code (`src/pyra/` framework).
- If plugin work uncovers a bug in core Pyra code, fix it on a dedicated `fix/...` branch
off `main`, commit it to `main`, push, then rebase the plugin branch onto the updated `main`.
- Plugin branches may be pushed to remote for backup/review at any time.
- Do **not** merge plugin branches to `main` prematurely — a half-working plugin on `main`
is worse than one that isn't there yet.
### Avoid Duplication — Check the Inventory First
Before writing any new utility function, class, or import block, check the **Code Inventory** section below. Everything listed there already exists and is importable. Writing a duplicate wastes code and introduces divergence.
---
## Code Inventory
### Third-party libraries (`pyproject.toml` dependencies)
| Library | Min version | Used in | Purpose |
|---------|-------------|---------|---------|
| `litellm` | 1.40.0 | `chat/session.py`, `setup/wizard.py` | Multi-provider LLM completion (streaming + non-streaming) and tool-use dispatch |
| `rich` | 13.0.0 | `chat/renderer.py`, `cli.py`, `setup/wizard.py`, `plugins/executor.py` | Terminal UI — `Console`, `Panel`, `Markdown`, `Live`, `Text` |
| `click` | 8.1.0 | `cli.py` | CLI entrypoint, `@click.group`, `@click.command`, arguments |
| `prompt_toolkit` | 3.0.0 | `chat/session.py` | REPL input loop — `PromptSession`, `FileHistory` |
| `questionary` | 2.0.0 | `setup/wizard.py` | Interactive `select` / `text` / `password` prompts |
| `ruamel.yaml` | 0.18.0 | `config/manager.py` | Round-trip YAML read/write (preserves comments and formatting) |
| `pydantic` | 2.0.0 | `config/schema.py` | Config validation via `BaseModel` |
| `httpx` | 0.27.0 | `setup/wizard.py` | HTTP GET for local-server connectivity checks |
Optional plugin extras (declared in `pyproject.toml [project.optional-dependencies]`):
| Extra | Libraries | Intended for |
|-------|-----------|--------------|
| `nextcloud` | `caldav`, `webdav4`, `vobject` | CalDAV / CardDAV / WebDAV |
| `matrix` | `matrix-nio`, `aiofiles` | Matrix bot |
| `telegram` | `python-telegram-bot` | Telegram bot |
| `ssh` | `paramiko` | SSH plugin |
| `docker` | `docker` | Docker plugin |
| `gdrive` | `google-api-python-client`, `google-auth-oauthlib` | Google Drive |
| `onedrive` | `msal` | OneDrive device-flow auth |
| `dropbox` | `dropbox` | Dropbox |
### Standard library modules in use
| Module | Used in | Notes |
|--------|---------|-------|
| `pathlib.Path` | everywhere | Default for all paths — never use `os.path` string joins |
| `os` | `utils/paths.py` | Only for `os.name` (Windows guard) |
| `json` | `vault/reader.py`, `vault/writer.py`, `plugins/loader.py`, `plugins/executor.py`, `plugins/install.py` | Vault file, manifests, tool args/results |
| `re` | `security/injection.py` | Compiled injection-detection patterns |
| `datetime` | `security/injection.py`, `memory/reader.py`, `memory/index.py`, `plugins/loader.py`, `plugins/executor.py` | Log timestamps, file mtimes |
| `dataclasses` | `security/injection.py`, `memory/reader.py`, `plugins/base.py` | `@dataclass` — `InjectionWarning`, `MemoryFile`, `Tool` |
| `importlib.util` | `plugins/loader.py` | Dynamic plugin loading (`spec_from_file_location`) |
| `sys` | `cli.py`, `plugins/loader.py` | `sys.exit`, `sys.modules` for dynamic module registration |
| `shutil` | `plugins/install.py` | `copytree`, `rmtree` for bundled plugin installation |
| `typing` | `plugins/base.py`, `chat/history.py`, `plugins/registry.py` | `Protocol`, `Callable`, `Coroutine`, `Any`, `TYPE_CHECKING` |
### Internal utility functions — import, do not rewrite
#### `utils.paths`
| Function | Signature | Purpose |
|----------|-----------|---------|
| `pyra_home` | `() -> Path` | Returns `~/.pyra/` |
| `ensure_dir` | `(path: Path, mode=0o700) -> Path` | `mkdir -p` + `chmod` in one call |
| `safe_chmod` | `(path: Path, mode: int) -> None` | `chmod` that silently skips on Windows |
#### `security.boundaries`
| Function | Signature | Purpose |
|----------|-----------|---------|
| `assert_safe_path` | `(path: Path) -> None` | Raises `VaultAccessError` if path resolves into vault |
| `check_vault_lock` | `() -> None` | Raises `PyraSecurityError` if vault sentinel is missing |
Exceptions: `VaultAccessError(PermissionError)`, `PyraSecurityError(RuntimeError)`
#### `security.injection`
| Function | Signature | Purpose |
|----------|-----------|---------|
| `scan_response` | `(text: str) -> list[InjectionWarning]` | Runs 15 compiled regex patterns, logs hits to `security.log` |
| `redact_api_keys` | `(text: str) -> str` | Replaces key-shaped strings with `[REDACTED]` |
Dataclass: `InjectionWarning(pattern_label: str, matched_text: str)`
#### `config.manager`
| Function | Signature | Purpose |
|----------|-----------|---------|
| `load_config` | `() -> PyraConfig` | Reads `config.yaml`, validates via Pydantic; raises `FileNotFoundError` if missing |
| `save_config` | `(cfg: PyraConfig) -> None` | Writes `config.yaml`, enforces `chmod 600` |
| `config_exists` | `() -> bool` | True if `config.yaml` exists |
| `config_path` | `() -> Path` | Absolute path to `config.yaml` |
#### `config.dirs`
| Function | Signature | Purpose |
|----------|-----------|---------|
| `bootstrap` | `() -> None` | Creates `~/.pyra/` directory tree and checks vault sentinel; called at every startup |
#### `vault.reader` / `vault.writer`
| Function | Module | Signature | Purpose |
|----------|--------|-----------|---------|
| `get_key` | `vault.reader` | `(provider_id: str) -> str \| None` | Sole vault reader — never call `open(api_keys.json)` anywhere else |
| `set_key` | `vault.writer` | `(provider_id: str, api_key: str) -> None` | Stores or overwrites a key in the vault |
| `delete_key` | `vault.writer` | `(provider_id: str) -> bool` | Removes a key; returns `True` if it existed |
#### `memory.database`
| Function | Signature | Purpose |
|----------|-----------|---------|
| `init_db` | `() -> None` | Creates `memory.db` with `memory_meta` + `memory_fts` tables; chmod 600 |
| `upsert` | `(path, *, content, category, size_bytes, modified, summary, keywords) -> None` | Insert or replace one entry in both tables |
| `remove` | `(path: str) -> None` | Delete entry from both tables |
| `search` | `(query: str, limit: int = 20) -> list[dict]` | FTS5 MATCH search; returns `[{file, summary, keywords, snippet}]` |
| `list_all` | `() -> list[dict]` | All rows from `memory_meta` ordered by path |
| `migrate_from_files` | `() -> None` | One-shot: populate DB from existing `.md` files if DB is empty |
#### `memory.reader`
| Function | Signature | Purpose |
|----------|-----------|---------|
| `list_memories` | `() -> list[MemoryFile]` | Queries DB (`memory_meta`); falls back to file scan if DB empty |
| `read_memory` | `(name: str) -> str` | Reads memory file by relative path; validates against vault/traversal |
| `lookup_memories` | `(query: str) -> list[dict]` | FTS5 full-text search; falls back to JSON index substring search |
| `load_context_for_session` | `() -> str` | Concatenates all memory files into a system-prompt block |
Dataclass: `MemoryFile(name, path, category, size_bytes, modified)`
#### `memory.writer`
| Function | Signature | Purpose |
|----------|-----------|---------|
| `write_memory` | `(name: str, content: str, summary: str, keywords: list[str]) -> Path` | Creates/overwrites a memory `.md` file, updates index and DB |
| `append_memory` | `(name: str, content: str) -> Path` | Appends to a memory file (creates if missing), updates index and DB |
#### `memory.index`
| Function | Signature | Purpose |
|----------|-----------|---------|
| `update_index` | `() -> None` | Regenerates `MEMORY_INDEX.md` and `memory_index.json` — called automatically by writer functions |
#### `setup.providers`
| Symbol | Kind | Purpose |
|--------|------|---------|
| `PROVIDERS` | `list[Provider]` | All registered providers in display order |
| `PROVIDERS_BY_ID` | `dict[str, Provider]` | Fast id lookup |
| `get_provider` | `(provider_id: str) -> Provider` | Raises `KeyError` for unknown ids |
| `Provider` | frozen dataclass | `id`, `display_name`, `requires_key`, `default_model`, `litellm_prefix`, `base_url`, `key_env_var`, `connectivity_check`, `group` |
#### `plugins.loader`
| Function | Signature | Purpose |
|----------|-----------|---------|
| `load_plugins` | `(plugins_dir: Path) -> list[PyraPlugin]` | Discovers all valid plugin directories |
| `load_plugin_by_name` | `(name: str, plugins_dir: Path) -> PyraPlugin \| None` | Loads a single plugin; returns `None` on any failure |
#### `plugins.install`
| Function | Signature | Purpose |
|----------|-----------|---------|
| `get_bundled_plugins_dir` | `() -> Path` | Path to `src/pyra/bundled_plugins/` |
| `install_bundled_plugin` | `(name, bundled_dir, plugins_dir) -> None` | Copies bundled plugin dir to `~/.pyra/plugins/`, sets permissions |
| `list_bundled_plugins` | `(bundled_dir: Path) -> list[str]` | Names of all bundled plugins that have a `manifest.json` |
| `read_manifest` | `(plugin_dir: Path) -> dict` | Reads `manifest.json`; returns `{}` if missing |
#### `chat.renderer` — rendering functions and shared `console`
Import `console` from here; do not create a second `rich.Console()` in new code.
| Symbol | Purpose |
|--------|---------|
| `console` | Module-level `rich.Console` — the single shared terminal instance |
| `render_streaming_response(stream)` | Renders a litellm streaming response with `Live` + `Markdown`, returns final text |
| `render_text_response(text)` | Renders a complete string as `Markdown` |
| `render_injection_warning(warnings)` | Yellow `Panel` showing detected pattern labels |
| `render_error(message)` | Red `Panel` |
| `render_info(message)` | Dim plain text line |
| `render_system(message)` | Cyan `Panel` |
### Internal classes
| Class | Module | Notes |
|-------|--------|-------|
| `PyraConfig` | `config.schema` | Top-level config; fields: `ai`, `memory`, `security`, `plugins`, `daemon` |
| `ProviderConfig` | `config.schema` | `ai:` block — `provider_id`, `model`, `base_url` |
| `PluginConfig` | `config.schema` | `plugins:` block — `enabled`, `require_approval`, `log_executions` |
| `DaemonConfig` | `config.schema` | `daemon:` block |
| `MemoryConfig` | `config.schema` | `memory:` block — `max_tokens_in_context`, `auto_load` |
| `SecurityConfig` | `config.schema` | `security:` block — `injection_detection`, `log_injections` |
| `ConversationHistory` | `chat.history` | Holds message list; builds API payload via `build_for_api()`; trims to token budget |
| `PluginRegistry` | `plugins.registry` | Singleton (`instance()` / `reset()`); aggregates tools, slash commands, system prompt additions |
| `ToolExecutor` | `plugins.executor` | Approval gate + injection scan + logging; call via `execute()` or `execute_tool_call_batch()` |
| `Tool` | `plugins.base` | Dataclass — `name`, `description`, `parameters` (JSON Schema), `handler`, `requires_approval` |
| `PyraPlugin` | `plugins.base` | `@runtime_checkable` Protocol — the plugin interface |
| `BasePlugin` | `plugins.base` | Concrete base with no-op defaults; plugins should inherit this |