feat: add 15 model profiles to manifest.yaml #18

doru · 2026-06-15T15:40:29+03:00

- Qwen3.6-27B: 3 profiles (balanced/thinking/extended)
- Gemma 4 12B: 4 profiles (Q6_K_XL and IQ4_XS variants)
- Gemma 4 26B-A4B: 3 profiles (Q4_K_M and IQ4_XS)
- Qwen3.6-35B-A3B: 3 profiles (fast/thinking/extended, non-MTP)
- Uncensored: 3 profiles (HauhauCS, Genesis APEX)
- Add pytest.ini for test discovery
- All profiles use KV cache quantization (q8_0/q4_0) for 64K-128K context
- Embedded sampling parameters per model family
- Based on research from r/LocalLLaMA, Unsloth benchmarks, HF model cards
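
For context on the KV cache quantization point above, here is a rough back-of-the-envelope sketch of why q8_0/q4_0 matter at 64K-128K context. It is only an illustration: the layer count, KV head count, and head dimension are hypothetical placeholders, not the real dimensions of any model in this PR, and the formula assumes every layer keeps a full-context cache (models with sliding-window or hybrid attention will need less). The per-element sizes follow the usual block layouts, where q8_0 stores 32 values in 34 bytes and q4_0 stores 32 values in 18 bytes.

```python
# Illustrative estimate only: the model dimensions used below are
# hypothetical placeholders, not taken from manifest.yaml or any model card.

# Bytes per cached element for each cache type.
# q8_0 packs 32 values into 34 bytes, q4_0 packs 32 values into 18 bytes.
BYTES_PER_ELEMENT = {
    "f16": 2.0,
    "q8_0": 34 / 32,
    "q4_0": 18 / 32,
}


def kv_cache_bytes(n_layers, n_kv_heads, head_dim, n_ctx, cache_type):
    """Approximate size of the K and V caches for a single sequence.

    Assumes every layer stores keys and values for the full context.
    """
    elements_per_token = 2 * n_layers * n_kv_heads * head_dim  # K and V
    return elements_per_token * n_ctx * BYTES_PER_ELEMENT[cache_type]


if __name__ == "__main__":
    dims = dict(n_layers=48, n_kv_heads=8, head_dim=128)  # hypothetical
    for n_ctx in (65536, 131072):
        for cache_type in ("f16", "q8_0", "q4_0"):
            size = kv_cache_bytes(n_ctx=n_ctx, cache_type=cache_type, **dims)
            print(f"{n_ctx:>6} ctx, {cache_type:>4}: {size / 2**30:.2f} GiB")
```

Under these assumed dimensions, q8_0 takes roughly half the memory of an f16 cache and q4_0 a bit over a quarter, which is what makes the longer context settings practical on a fixed VRAM budget.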


doru added 1 commit 2026-06-15 15:40:29 +03:00

feat: add 15 model profiles to manifest.yaml e9790c00dc

- Qwen3.6-27B: 3 profiles (balanced/thinking/extended)
- Gemma 4 12B: 4 profiles (Q6_K_XL and IQ4_XS variants)
- Gemma 4 26B-A4B: 3 profiles (Q4_K_M and IQ4_XS)
- Qwen3.6-35B-A3B: 3 profiles (fast/thinking/extended, non-MTP)
- Uncensored: 3 profiles (HauhauCS, Genesis APEX)
- Add pytest.ini for test discovery
- All profiles use KV cache quantization (q8_0/q4_0) for 64K-128K context
- Embedded sampling parameters per model family
- Based on research from r/LocalLLaMA, Unsloth benchmarks, HF model cards

doru merged commit 39a8f09232 into master

2026-06-15 15:40:48 +03:00

doru referenced this issue from a commit

2026-06-15 15:40:49 +03:00

Merge pull request 'feat: add 15 model profiles to manifest.yaml' (#18) from feature/add-model-profiles into master

