2. Sidecar model switch + Router request queue #3
Parent
What to build
Add the model switch capability to the Sidecar and the request queue to the Router. This is the core end-to-end slice: a request triggers a switch, subsequent requests queue, and the queue drains when the model is ready.
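The queueing behaviour described above could be sketched roughly as follows. This is only an illustration under assumptions: the issue does not specify a language or framework, so asyncio is assumed here, and `SwitchQueue`, `QueueFull`, and the `max_size` limit are hypothetical names and values, not part of the spec.

```python
import asyncio


class QueueFull(Exception):
    """Raised when the queue is at capacity; an HTTP layer would map this to 429."""


class SwitchQueue:
    """Holds requests while a model switch is in progress (hypothetical helper).

    Construct it inside a running event loop so it also works on older Pythons.
    """

    def __init__(self, max_size: int = 8) -> None:  # max_size is an assumed limit
        self.max_size = max_size
        self._ready = asyncio.Event()
        self._ready.set()  # start in the "model ready" state
        self._waiting = 0

    def begin_switch(self) -> None:
        """Called when a request triggers a switch; later requests will queue."""
        self._ready.clear()

    def mark_ready(self) -> None:
        """Called once the Sidecar reports ready; releases every queued request."""
        self._ready.set()

    async def wait_turn(self) -> None:
        """Return at once if the model is ready, otherwise queue until it is."""
        if self._ready.is_set():
            return
        if self._waiting >= self.max_size:
            raise QueueFull("request queue is full")
        self._waiting += 1
        try:
            await self._ready.wait()
        finally:
            self._waiting -= 1
```

A request handler in the Router would call `await queue.wait_turn()` before forwarding to llama-server and translate `QueueFull` into a 429 response.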
- POST /models/switch: body {profile_id}. Stops the current llama-server subprocess, starts a new one with the profile's model_path and flags, polls localhost:8080/v1/models every 500ms for readiness, and returns {status: "ready", active_profile} or {status: "error", message}. An in-memory switch lock prevents concurrent switches.
- GET /models/status: updated to return {active_profile: Profile | null, llama_server_running: bool} based on actual subprocess state.
- Router request queue: … 429 when queue is full. Drains queue once Sidecar reports ready.
Acceptance criteria
- POST /models/switch stops current llama-server and starts new one with profile flags
- Polls localhost:8080/v1/models every 500ms
Blocked by
User stories covered
2, 3, 11, 12, 13, 16
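
For reference, the POST /models/switch flow from "What to build" (stop, start, poll every 500ms, guarded by an in-memory lock) might look something like the sketch below. It is written under assumptions: asyncio is assumed, `Switcher` and the injected `stop`, `start`, and `probe` callables are hypothetical stand-ins for the real subprocess handling and the HTTP check against localhost:8080/v1/models, the timeout is an invented value, and rejecting a concurrent switch (rather than making it wait) is one possible reading of "prevents concurrent switches".

```python
import asyncio
from typing import Awaitable, Callable, Optional

POLL_INTERVAL_S = 0.5  # 500 ms, as stated in the issue


class Switcher:
    """Serialises model switches (hypothetical; create inside a running loop)."""

    def __init__(
        self,
        stop: Callable[[], Awaitable[None]],        # stops the current llama-server
        start: Callable[[dict], Awaitable[None]],   # starts one for the given profile
        probe: Callable[[], Awaitable[bool]],       # True once /v1/models answers
        timeout_s: float = 60.0,                    # assumed; not in the issue
        poll_interval_s: float = POLL_INTERVAL_S,
    ) -> None:
        self._stop, self._start, self._probe = stop, start, probe
        self._timeout_s = timeout_s
        self._poll_interval_s = poll_interval_s
        self._lock = asyncio.Lock()  # the in-memory switch lock
        self.active_profile: Optional[dict] = None

    async def switch(self, profile: dict) -> dict:
        # Assumption: a second switch is rejected instead of waiting its turn.
        if self._lock.locked():
            return {"status": "error", "message": "switch already in progress"}
        async with self._lock:
            await self._stop()
            self.active_profile = None
            await self._start(profile)
            loop = asyncio.get_running_loop()
            deadline = loop.time() + self._timeout_s
            while loop.time() < deadline:
                if await self._probe():
                    self.active_profile = profile
                    return {"status": "ready", "active_profile": profile}
                await asyncio.sleep(self._poll_interval_s)
            return {"status": "error", "message": "llama-server did not become ready"}
```

Injecting the three callables keeps the lock and polling logic testable without launching a real llama-server process.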