1. Manifest schema + Sidecar foundation #2

Open
opened 2026-06-15 02:46:41 +03:00 by doru · 0 comments

Parent

  • #1 Epic: Model Switching via Sidecar

What to build

Create the Sidecar service skeleton and the manifest file format, plus the Router endpoint that exposes available models to Hermes.

  • Sidecar (new module sidecar/): Python FastAPI service with two endpoints:
    • GET /models/available — reads manifest YAML, returns list of profiles as {id, name, model_path, flags}
    • GET /models/status — returns {active_profile: null, llama_server_running: false} (no model loaded yet)
  • Manifest: YAML schema at /home/bigt/AI/llm/manifest.yaml with profile shape:
- id: qwen-3-8b
  name: "Qwen 3 8B"
  model_path: "/home/bigt/AI/llm/qwen/qwen3-8b-q4.gguf"
  flags:
    n_ctx: 8192
    n_gpu_layers: 35

Manifest is re-read on every /models/available call (no file watcher).

  • Router GET /v1/models — new endpoint that proxies to Sidecar /models/available and returns OpenAI-compatible model list where each model id is the profile id.
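
The Router's mapping from Sidecar profiles to an OpenAI-compatible list can be sketched as a pure function. This is a minimal sketch under stated assumptions, not the project's implementation: the name `to_openai_model_list`, the placeholder `created` value, and the `owned_by` value are hypothetical; the `object: "list"` / `object: "model"` envelope follows the shape of OpenAI's list-models response.

```python
from typing import Any


def to_openai_model_list(profiles: list[dict[str, Any]]) -> dict[str, Any]:
    """Convert Sidecar profiles into an OpenAI-style model list.

    Each model `id` is the profile `id`, as the issue requires.
    """
    return {
        "object": "list",
        "data": [
            {
                "id": profile["id"],
                "object": "model",
                "created": 0,  # placeholder: the manifest carries no timestamp
                "owned_by": "sidecar",  # assumed value, not specified by the issue
            }
            for profile in profiles
        ],
    }


# Example input: the profile from the manifest shape above,
# as /models/available would return it.
profiles = [
    {
        "id": "qwen-3-8b",
        "name": "Qwen 3 8B",
        "model_path": "/home/bigt/AI/llm/qwen/qwen3-8b-q4.gguf",
        "flags": {"n_ctx": 8192, "n_gpu_layers": 35},
    }
]
response = to_openai_model_list(profiles)
```

Keeping the conversion separate from the HTTP call to the Sidecar lets the Router test be written without a running Sidecar; `model_path` and `flags` are deliberately not exposed in the OpenAI-style response.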

Acceptance criteria

  • Sidecar starts and serves on a configurable port (default 8081)
  • GET /models/available returns profiles from manifest YAML
  • GET /models/status returns {active_profile: null, llama_server_running: false}
  • Router GET /v1/models returns OpenAI-compatible list from Sidecar
  • Empty manifest returns empty list without error
  • Invalid YAML returns 500 with error message
  • Tests: manifest parsing (empty, valid, invalid), both Sidecar endpoints, Router /v1/models
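
The manifest cases in the criteria (empty, valid, invalid) can be exercised against already-parsed data. The sketch below assumes the file has been parsed by a YAML library first (for example PyYAML's `yaml.safe_load`, which returns `None` for an empty document), so syntax errors would be raised by that parser, not here. `ManifestError`, `Profile`, and `parse_profiles` are hypothetical names, and both treating a malformed profile like invalid YAML and treating `flags` as optional are assumptions the issue does not state.

```python
from dataclasses import dataclass, field
from typing import Any


class ManifestError(ValueError):
    """Raised when parsed manifest data does not match the profile shape."""


@dataclass
class Profile:
    id: str
    name: str
    model_path: str
    flags: dict[str, Any] = field(default_factory=dict)


def parse_profiles(data: Any) -> list[Profile]:
    """Turn parsed manifest data into a list of profiles."""
    if data is None:  # empty manifest file: empty list, no error
        return []
    if not isinstance(data, list):
        raise ManifestError("manifest must be a list of profiles")
    profiles = []
    for index, entry in enumerate(data):
        if not isinstance(entry, dict):
            raise ManifestError(f"profile {index} is not a mapping")
        missing = [key for key in ("id", "name", "model_path") if key not in entry]
        if missing:
            raise ManifestError(f"profile {index} is missing {missing}")
        profiles.append(
            Profile(
                id=entry["id"],
                name=entry["name"],
                model_path=entry["model_path"],
                flags=entry.get("flags") or {},  # assumed optional
            )
        )
    return profiles
```

In the Sidecar endpoint, a `ManifestError` (or the YAML parser's own exception) would be the point at which the 500 response with an error message is produced.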

Blocked by

  • None - can start immediately

User stories covered

1, 14, 15

doru added the type:afk and triage:ready labels 2026-06-15 02:46:41 +03:00
Reference: doru/intelligence-router#2