Anyone want to try my llama.cpp DeepSeek V3.2 PR?

Code: https://github.com/fairydreaming/llama.cpp/tree/deepseek-dsa

git clone https://github.com/fairydreaming/llama.cpp -b deepseek-dsa --single-branch 
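After cloning, the standard llama.cpp CMake build should apply (a sketch, assuming a CUDA setup; adjust flags for your hardware):

```shell
# Configure with CUDA support (standard llama.cpp build; drop -DGGML_CUDA=ON for CPU-only)
cmake -B build -DGGML_CUDA=ON
# Build in Release mode using all available cores
cmake --build build --config Release -j
```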

Supported GGUFs: Q4_K_M (~404 GB) and Q8_0 (~714 GB).

Chat template to use: models/templates/deepseek-ai-DeepSeek-V3.2.jinja

If you experience OOM errors in CUDA `ggml_top_k()`, try lowering the ubatch size and/or increasing the `-fitt` value.
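For example, a run might look like this (a sketch only: the model filename is hypothetical, `-ub` is llama.cpp's ubatch-size flag, and `-fitt` is the PR-specific option mentioned above):

```shell
# Illustrative invocation; model path is hypothetical
./build/bin/llama-cli \
  -m DeepSeek-V3.2-Q4_K_M.gguf \
  --jinja --chat-template-file models/templates/deepseek-ai-DeepSeek-V3.2.jinja \
  -ub 256   # smaller ubatch size to reduce CUDA memory pressure in ggml_top_k()
```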

Let me know if you encounter any problems.

submitted by /u/fairydreaming
