root 37fee5341e fix: capture llama-server stderr, fix YAML boolean flag conversion, reduce polling timeout
Four fixes for the model-not-loading bug:

1. **YAML boolean → CLI flag bug**: YAML parses unquoted 'on'/'off'/'yes'/'no'
   as Python bools, so str(True) produced 'True', which is INVALID for
   llama.cpp's --flash-attn flag (expects 'on'/'off'/'auto'). Added a
   _flag_value() converter that maps bools to 'on'/'off' strings.

2. **llama-server stderr was DEVNULL**: All error messages (bad model path,
   OOM, invalid flag) were invisible. Now captured to /tmp/llama-server-stderr.log
   and dumped to the sidecar log on failure.

3. **Reduce polling timeout**: 240 retries × 0.5s = 120s hang. Reduced to
   60 retries × 0.5s = 30s. Still dumps stderr + exit code on failure.

4. **Manifest VRAM fix**: gemma4-26b-compact-long-128k used q8_0 KV cache at
   128K context (~24GB on 24GB RTX 3090 — borderline OOM). Changed to q4_0
   (~18GB, comfortable).
2026-06-16 00:06:45 +00:00
.hermes/plans fix: add probe endpoints and no-model fallback for Hermes Desktop compatibility 2026-06-15 15:22:15 +00:00
deploy fix: capture llama-server stderr, fix YAML boolean flag conversion, reduce polling timeout 2026-06-16 00:06:45 +00:00
docs Added next changes 2026-06-15 00:09:31 +00:00
scripts feat: add sync_models.py script to auto-update Hermes custom_providers from router model list 2026-06-15 21:10:36 +00:00
sidecar fix: capture llama-server stderr, fix YAML boolean flag conversion, reduce polling timeout 2026-06-16 00:06:45 +00:00
tests fix: change sidecar port from 8081 to 8080 2026-06-15 13:17:31 +00:00
.env .env 2026-06-09 13:57:22 +03:00
.gitignore Epic: Model Switching via Sidecar — Issues #2-#3 2026-06-15 00:49:24 +00:00
CONTEXT.md Epic: Model Switching via Sidecar — Issues #4-#7 + #8 deployment 2026-06-15 01:13:36 +00:00
docker-compose.yml fix: resolve port conflict between sidecar and llama-server 2026-06-15 15:31:31 +00:00
Dockerfile Initial commit: migrate intelligence-router files 2026-06-09 11:48:43 +01:00
main.py fix: add probe endpoints and no-model fallback for Hermes Desktop compatibility 2026-06-15 15:22:15 +00:00
pytest.ini feat: add 15 model profiles to manifest.yaml 2026-06-15 12:34:46 +00:00
requirements.txt Epic: Model Switching via Sidecar — Issues #4-#7 + #8 deployment 2026-06-15 01:13:36 +00:00