fix: add --host 0.0.0.0 to llama-server command

llama-server defaults to binding on 127.0.0.1 (localhost only).
When the router runs on a separate Docker host (10.0.4.100), all
chat completion requests fail with:

  PROXY EXCEPTION on primary http://10.0.4.11:8081/v1/chat/completions:
    ConnectError: All connection attempts failed

Add --host 0.0.0.0 after --port so llama-server listens on all
network interfaces and is reachable from the Docker host.
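
One way to confirm the fix from the router's side is a plain TCP connect
test against the address in the error above. This is a minimal sketch, not
part of the commit: the helper name can_connect is hypothetical, and it only
checks that the port accepts connections, not that the API responds.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to (host, port) succeeds within timeout."""
    try:
        # create_connection raises OSError (refused, unreachable, timed out)
        # when nothing is listening on an interface reachable from here.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


# Example call from the Docker host, using the address from the error message:
#   can_connect("10.0.4.11", 8081)
```

With llama-server bound only to 127.0.0.1, a call like this from another
machine would be expected to fail; after binding to 0.0.0.0 it should
succeed, provided no firewall sits between the two hosts.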
root 2026-06-16 21:46:07 +00:00
parent 75248741e7
commit bcf45129f1

@@ -111,6 +111,7 @@ async def _start_llama_server(profile: dict):
     cmd = ["/home/bigt/AI/llama.cpp/build/bin/llama-server"]
     cmd += ["--model", profile["model_path"]]
     cmd += ["--port", str(LLAMA_SERVER_PORT)]
+    cmd += ["--host", "0.0.0.0"]
     for key, value in profile.get("flags", {}).items():
         cmd += ["--" + _flag_key(key), _flag_value(value)]
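
For illustration, the command construction in this hunk can be sketched as a
standalone function with the bind address as a parameter. Everything here is
an assumption rather than the project's code: build_llama_cmd is a
hypothetical name, and the underscore-to-hyphen conversion only stands in for
_flag_key and _flag_value, whose real behavior is not shown in the diff.

```python
from __future__ import annotations


def build_llama_cmd(binary: str, model_path: str, port: int,
                    host: str = "127.0.0.1",
                    flags: dict | None = None) -> list[str]:
    """Build an argv list for llama-server with an explicit bind address."""
    cmd = [binary, "--model", model_path]
    cmd += ["--port", str(port)]
    # "0.0.0.0" listens on all interfaces; "127.0.0.1" is localhost only.
    cmd += ["--host", host]
    for key, value in (flags or {}).items():
        # Assumed stand-in for _flag_key/_flag_value: ctx_size -> --ctx-size
        cmd += ["--" + key.replace("_", "-"), str(value)]
    return cmd


cmd = build_llama_cmd("llama-server", "/models/m.gguf", 8081,
                      host="0.0.0.0", flags={"ctx_size": 4096})
```

Keeping the host as a parameter would also allow binding to a single
interface address instead of all of them, if wider exposure is a concern.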