1. Manifest schema + Sidecar foundation #2

doru · 2026-06-15T02:46:41+03:00

Parent

#1 Epic: Model Switching via Sidecar

What to build

Create the Sidecar service skeleton and the manifest file format, plus the Router endpoint that exposes available models to Hermes.

Sidecar (new module sidecar/): Python FastAPI service with two endpoints:
- GET /models/available — reads manifest YAML, returns list of profiles as {id, name, model_path, flags}
- GET /models/status — returns {active_profile: null, llama_server_running: false} (no model loaded yet)
Manifest: YAML file at /home/bigt/AI/llm/manifest.yaml containing a list of profiles, each with this shape:

- id: qwen-3-8b
  name: "Qwen 3 8B"
  model_path: "/home/bigt/AI/llm/qwen/qwen3-8b-q4.gguf"
  flags:
    n_ctx: 8192
    n_gpu_layers: 35

Manifest is re-read on every /models/available call (no file watcher).
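A minimal sketch of how the parsed manifest could be turned into profile objects. It assumes the YAML text has already been parsed (for example with PyYAML's `yaml.safe_load`, which returns `None` for an empty document). The `Profile` dataclass and `profiles_from_parsed` are hypothetical names, and treating `flags` as optional is an assumption rather than something this issue specifies.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class Profile:
    """One manifest entry; field names follow the profile shape above."""
    id: str
    name: str
    model_path: str
    flags: Dict[str, Any] = field(default_factory=dict)  # assumed optional


def profiles_from_parsed(data: Any) -> List[Profile]:
    """Convert already-parsed manifest data into a list of profiles."""
    if data is None:  # an empty manifest parses to None
        return []
    if not isinstance(data, list):
        raise ValueError("manifest must be a list of profiles")
    profiles = []
    for i, entry in enumerate(data):
        if not isinstance(entry, dict):
            raise ValueError(f"profile #{i} is not a mapping")
        missing = [k for k in ("id", "name", "model_path") if k not in entry]
        if missing:
            raise ValueError(f"profile #{i} is missing {missing}")
        profiles.append(Profile(
            id=str(entry["id"]),
            name=str(entry["name"]),
            model_path=str(entry["model_path"]),
            flags=dict(entry.get("flags") or {}),
        ))
    return profiles


# The example profile from this issue, as it would look after parsing.
example = [{
    "id": "qwen-3-8b",
    "name": "Qwen 3 8B",
    "model_path": "/home/bigt/AI/llm/qwen/qwen3-8b-q4.gguf",
    "flags": {"n_ctx": 8192, "n_gpu_layers": 35},
}]
profiles = profiles_from_parsed(example)
```

Because the manifest is re-read on every call, a function like this would run per request, so keeping it free of cached state seems like the simplest fit.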

Router GET /v1/models — new endpoint that proxies to Sidecar /models/available and returns OpenAI-compatible model list where each model id is the profile id.
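The mapping from Sidecar profiles to the OpenAI-style list could look roughly like the sketch below. The envelope (`object: "list"` with a `data` array of `object: "model"` entries) follows the OpenAI list-models response shape; the function name `to_openai_model_list` and the `created` and `owned_by` defaults are placeholders assumed for illustration, since this issue only fixes the model `id`.

```python
from typing import Any, Dict, List


def to_openai_model_list(
    profiles: List[Dict[str, Any]],
    created: int = 0,           # placeholder timestamp (assumption)
    owned_by: str = "sidecar",  # placeholder owner label (assumption)
) -> Dict[str, Any]:
    """Wrap Sidecar profiles in an OpenAI-compatible model list."""
    return {
        "object": "list",
        "data": [
            {
                "id": profile["id"],  # model id is the profile id
                "object": "model",
                "created": created,
                "owned_by": owned_by,
            }
            for profile in profiles
        ],
    }


# What the Router would return for the single example profile.
sidecar_profiles = [{
    "id": "qwen-3-8b",
    "name": "Qwen 3 8B",
    "model_path": "/home/bigt/AI/llm/qwen/qwen3-8b-q4.gguf",
    "flags": {"n_ctx": 8192, "n_gpu_layers": 35},
}]
model_list = to_openai_model_list(sidecar_profiles)
```

Note that `model_path` and `flags` are dropped here on purpose: they are Sidecar details and not part of the OpenAI model object.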

Acceptance criteria

- [ ] Sidecar starts and serves on a configurable port (default 8081)
- [ ] GET /models/available returns profiles from manifest YAML
- [ ] GET /models/status returns {active_profile: null, llama_server_running: false}
- [ ] Router GET /v1/models returns OpenAI-compatible list from Sidecar
- [ ] Empty manifest returns empty list without error
- [ ] Invalid YAML returns 500 with error message
- [ ] Tests: manifest parsing (empty, valid, invalid), both Sidecar endpoints, Router /v1/models
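The empty and invalid manifest criteria could be covered by a small framework-independent helper along these lines. This is only a sketch: `available_models_response` is a hypothetical name, the parser is injected so the logic can be tested without a YAML library (`json.loads` stands in for the real YAML parser in the usage lines), and the `{"error": ...}` body shape is an assumption, since the criteria only ask for a 500 with an error message.

```python
import json
from typing import Any, Callable, Tuple


def available_models_response(
    manifest_text: str,
    parse: Callable[[str], Any],
) -> Tuple[int, Any]:
    """Return (status_code, body) for GET /models/available."""
    if not manifest_text.strip():
        return 200, []  # empty manifest: empty list, no error
    try:
        data = parse(manifest_text)
    except Exception as exc:  # the parser rejected the text
        return 500, {"error": f"invalid manifest: {exc}"}
    if data is None:
        return 200, []  # e.g. a comments-only document
    if not isinstance(data, list):
        return 500, {"error": "manifest must be a list of profiles"}
    return 200, data


# json.loads is only a stand-in parser for this demonstration.
empty = available_models_response("", json.loads)
valid = available_models_response('[{"id": "qwen-3-8b"}]', json.loads)
invalid = available_models_response("{not valid", json.loads)
```

The three calls at the end line up with the empty, valid, and invalid parsing cases listed in the tests criterion.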

Blocked by

None - can start immediately

User stories covered

1, 14, 15

doru added the type:afk and triage:ready labels 2026-06-15 02:46:41 +03:00

doru referenced this issue 2026-06-15 02:48:58 +03:00: 2. Sidecar model switch + Router request queue #3

doru referenced this issue from a commit 2026-06-15 03:49:31 +03:00: Epic: Model Switching via Sidecar — Issues #2-#3
