🌐
Hugging Face
huggingface.co › unsloth › DeepSeek-R1-0528-Qwen3-8B-GGUF
unsloth/DeepSeek-R1-0528-Qwen3-8B-GGUF · Hugging Face
This model achieves state-of-the-art (SOTA) performance among open-source models on the AIME 2024, surpassing Qwen3 8B by +10.0% and matching the performance of Qwen3-235B-thinking.
🌐
Unsloth
unsloth.ai › blog › deepseek-r1-0528
How to Run Deepseek-R1-0528 Locally
DeepSeek also released an R1-0528 distilled version by fine-tuning Qwen3 (8B). The distill achieves the same performance as Qwen3 (235B). Qwen3 GGUF: DeepSeek-R1-0528-Qwen3-8B-GGUF. You can also fine-tune the Qwen3 model with Unsloth.
🌐
Unsloth
docs.unsloth.ai › models › tutorials-how-to-fine-tune-and-run-llms › deepseek-r1-0528-how-to-run-locally
DeepSeek-R1-0528: How to Run Locally | Unsloth Documentation
DeepSeek-R1-0528 is DeepSeek's new update to their R1 reasoning model. The full 671B parameter model requires 715GB of disk space. The quantized dynamic 1.66-bit version uses 162GB (-80% reduction in size).
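The disk-size figures in the snippet above can be sanity-checked with one line of arithmetic; the numbers below are just the quoted ones (715 GB full model, 162 GB quantized), not independent measurements:

```python
# Sanity-check the quoted disk-size reduction for the dynamic 1.66-bit quant.
full_gb = 715    # full 671B-parameter model, per the Unsloth docs snippet
quant_gb = 162   # dynamic 1.66-bit GGUF, per the same snippet
reduction = 1 - quant_gb / full_gb
print(f"{reduction:.1%} smaller")  # roughly the ~80% reduction the docs quote
```

The exact figure comes out just above 77%, which the docs round to about 80%.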
🌐
Reddit
reddit.com › r/localllama › what's your analysis of unsloth/deepseek-r1-0528-qwen3-8b-gguf locally
What's your analysis of unsloth/DeepSeek-R1-0528-Qwen3-8B-GGUF locally : r/LocalLLaMA
May 7, 2025 - deepseek-r1-qwen3-8b doesn't live up to its 'reputation'; it could be used for story or dialogue, maybe design documents, but ask it to do some basic stuff in Python and it will fail · my suggestion is to buy a Mac mini; it will serve you as a nice LLM machine and as an OK video player for your grandchildren ... Not an answer to your question, but remember that it is "just" Qwen3-8B (NOT DeepSeek-R1), so even Qwen3-14B should be better.
🌐
LM Studio
lmstudio.ai › models › deepseek › deepseek-r1-0528-qwen3-8b
deepseek/deepseek-r1-0528-qwen3-8b
Distilled version of the DeepSeek-R1-0528 model, created by continuing the post-training process on the Qwen3 8B Base model using Chain-of-Thought (CoT) from DeepSeek-R1-0528.
🌐
Hugging Face
huggingface.co › unsloth › DeepSeek-R1-GGUF
unsloth/DeepSeek-R1-GGUF · Hugging Face
DeepSeek-R1-0528 is here! The most powerful reasoning open LLM, available in GGUF, original & 4-bit formats. Includes Llama & Qwen distilled models.
🌐
Hugging Face
huggingface.co › unsloth › DeepSeek-R1-0528-GGUF
unsloth/DeepSeek-R1-0528-GGUF · Hugging Face
This model achieves state-of-the-art (SOTA) performance among open-source models on the AIME 2024, surpassing Qwen3 8B by +10.0% and matching the performance of Qwen3-235B-thinking.
🌐
Artificial Analysis
artificialanalysis.ai › models › deepseek-r1-qwen3-8b
DeepSeek R1 0528 Qwen3 8B - Intelligence, Performance & Price Analysis
Analysis of DeepSeek's DeepSeek R1 0528 Qwen3 8B and comparison to other AI models across key metrics including quality, price, performance (tokens per second & time to first token), context window & more.
🌐
Unsloth
docs.unsloth.ai › get-started › all-our-models
All Our Models | Unsloth Documentation
Model         Variant                  GGUF   …
DeepSeek-R1   R1-0528                  link   —
              R1-0528-Qwen3-8B         link   link
              R1                       link   —
              R1 Zero                  link   —
              Distill Llama 3 8B       link   link
              Distill Llama 3.3 70B    link   link
              Distill Qwen 2.5 1.5B    link   link
              Distill Qwen 2.5 7B      link   link
              Distill Qwen 2.5 14B     link   link
              Distill Qwen 2.5 32B     link   link
🌐
Reddit
reddit.com › r/localllama › deepseek-r1-0528 official benchmarks released!!!
r/LocalLLaMA on Reddit: DeepSeek-R1-0528 Official Benchmarks Released!!!
May 29, 2025 - Picked this one, hope it works - ollama run hf.co/bartowski/deepseek-ai_DeepSeek-R1-0528-Qwen3-8B-GGUF:Q6_K ... unsloth has some information on his versions about nothink: https://huggingface.co/unsloth/DeepSeek-R1-0528-Qwen3-8B-GGUF "For NON thinking mode, we purposely enclose <think> and </think> with nothing:
🌐
Hugging Face
huggingface.co › unsloth › DeepSeek-R1-0528-Qwen3-8B-GGUF › blob › main › DeepSeek-R1-0528-Qwen3-8B-Q3_K_M.gguf
DeepSeek-R1-0528-Qwen3-8B-Q3_K_M.gguf · unsloth/DeepSeek-R1-0528-Qwen3-8B-GGUF at main
GGUF · qwen3 · unsloth · deepseek · qwen · conversational · arxiv: 2501.12948 · License: mit · DeepSeek-R1-0528-Qwen3-8B-GGUF / DeepSeek-R1-0528-Qwen3-8B-Q3_K_M.gguf
🌐
Codersera
codersera.com › blog › run-and-install-deepseek-r1-0528-locally-on-your-computer
Run and Install DeepSeek-R1-0528 Locally on Your Computer
May 30, 2025 - ollama run hf.co/unsloth/DeepSeek-R1-0528-Qwen3-8B-GGUF:Q4_K_XL · 7. Run Inference · Use the terminal interface to send prompts and receive real-time responses. vLLM is ideal for high-throughput inference in scalable environments and requires Python 3.8+ and a CUDA-enabled GPU.
🌐
MacStories
macstories.net › home › testing deepseek r1-0528 on the m3 ultra mac studio and installing local gguf models with ollama on macos
Testing DeepSeek R1-0528 on the M3 Ultra Mac Studio and Installing Local GGUF Models with Ollama on macOS - MacStories
October 30, 2025 - we distilled the chain-of-thought from DeepSeek-R1-0528 to post-train Qwen3 8B Base, obtaining DeepSeek-R1-0528-Qwen3-8B. This model achieves state-of-the-art (SOTA) performance among open-source models on the AIME 2024, surpassing Qwen3 8B ...
🌐
Runpod
runpod.io › home › blog › the 'minor upgrade' that’s anything but: deepseek r1 0528 deep dive
The 'Minor Upgrade' That’s Anything But: DeepSeek R1 0528 Deep Dive | Runpod Blog
Perhaps most impressively, the ... DeepSeek-R1-0528-Qwen3-8B. This model achieves state-of-the-art (SOTA) performance among open-source models on the AIME 2024, surpassing Qwen3 8B by +10.0% and matching the performance of Qwen3-235B-thinking....
🌐
Hugging Face
huggingface.co › unsloth › DeepSeek-R1-0528-Qwen3-8B-GGUF › discussions › 4
unsloth/DeepSeek-R1-0528-Qwen3-8B-GGUF · How to disable reasoning?
Because I had the issue with Qwen3 models mistyping it! Maybe they use the same special token; would need to check their ... Edit lol: looking at https://huggingface.co/deepseek-ai/DeepSeek-R1-0528-Qwen3-8B/raw/main/tokenizer.json, the dictionary unfortunately doesn't contain /think or /nothink (nor /no_think); it's all or nothing :D So the only solution is the one already mentioned (I think it's called hot steering or test-time steering).
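Since the tokenizer has no /no_think control token, the workaround the discussion points at is prefilling the assistant turn with an empty think block, as the Unsloth card describes. A minimal sketch under the assumption that the model uses R1-style <|User|>/<|Assistant|> chat markers; `build_prompt` is a hypothetical helper, not part of any official API:

```python
# Hypothetical sketch: DeepSeek-R1-0528-Qwen3-8B has no /no_think special
# token, so "non-thinking" output is typically forced by prefilling the
# assistant turn with an empty <think>...</think> block.

def build_prompt(user_message: str, thinking: bool = True) -> str:
    """Assemble a raw chat prompt; marker names assume the R1-style template."""
    prompt = f"<|User|>{user_message}<|Assistant|>"
    if not thinking:
        # Enclose <think> and </think> with nothing, per the Unsloth model
        # card, so the model treats its reasoning trace as already finished.
        prompt += "<think>\n\n</think>\n\n"
    return prompt

print(build_prompt("What is 2 + 2?", thinking=False))
```

Passing this raw string to the runtime (with the chat template disabled) is what the thread calls hot steering or test-time steering: the model continues straight into the final answer.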
🌐
Hugging Face
huggingface.co › unsloth › DeepSeek-R1-GGUF-UD
unsloth/DeepSeek-R1-GGUF-UD · Hugging Face
DeepSeek-R1-0528 is here! The most powerful reasoning open LLM, available in GGUF, original & 4-bit formats. Includes Llama & Qwen distilled models.