Hugging Face
huggingface.co › unsloth › DeepSeek-R1-0528-Qwen3-8B-GGUF
unsloth/DeepSeek-R1-0528-Qwen3-8B-GGUF · Hugging Face
This model achieves state-of-the-art (SOTA) performance among open-source models on the AIME 2024, surpassing Qwen3 8B by +10.0% and matching the performance of Qwen3-235B-thinking.
Videos
r/LocalLLaMA on Reddit: DeepSeek-R1-0528-Qwen3-8B on iPhone 16 Pro
r/LocalLLaMA on Reddit: deepseek r1 0528 qwen 8b on android MNN chat
02:30
AI News: DeepSeek-R1 V2, the new open source SoTA! - YouTube
10:25
Run DeepSeek-R1-0528-Qwen3-8B Locally with Gaia (Easy Tutorial!)
17:15
DeepSeek R1 0528 Qwen3 8B - Small Upgraded Student Model - Install ...
41:46
DeepSeek R1 0528 : 8B vs 671B (Live Test) - YouTube
Reddit
reddit.com › r/localllama › what's your analysis of unsloth/deepseek-r1-0528-qwen3-8b-gguf locally
What's your analysis of unsloth/DeepSeek-R1-0528-Qwen3-8B-GGUF locally : r/LocalLLaMA
May 7, 2025 - deepseek-r1-qwen3-8b doesn't live up to its 'reputation'. It could be used for story or dialogue, maybe design documents, but ask it to do some basic stuff in Python and it will fail · my suggestion is buy a Mac mini, it will serve you as a nice LLM machine and as an OK video player for your grandchildren ... Not an answer to your question, but remember that it is "just" Qwen3-8B (it is NOT DeepSeek-R1), so even Qwen3-14B should be better.
Unsloth
docs.unsloth.ai › get-started › all-our-models
All Our Models | Unsloth Documentation
DeepSeek-R1 · R1-0528 · link · — · R1-0528-Qwen3-8B · link · link · R1 · link · — · R1 Zero · link · — · Distill Llama 3 8 B · link · link · Distill Llama 3.3 70 B · link · link · Distill Qwen 2.5 1.5 B · link · link · Distill Qwen 2.5 7 B · link · link · Distill Qwen 2.5 14 B · link · link · Distill Qwen 2.5 32 B · link · link · Model · Variant · GGUF ·
Reddit
reddit.com › r/localllama › deepseek-r1-0528 official benchmarks released!!!
r/LocalLLaMA on Reddit: DeepSeek-R1-0528 Official Benchmarks Released!!!
May 29, 2025 - Picked this one, hope it works - ollama run hf.co/bartowski/deepseek-ai_DeepSeek-R1-0528-Qwen3-8B-GGUF:Q6_K ... unsloth has some information on his versions about nothink https://huggingface.co/unsloth/DeepSeek-R1-0528-Qwen3-8B-GGUF "For NON thinking mode, we purposely enclose <think> and </think> with nothing:
AI Models
aimodels.fyi › models › huggingFace › deepseek-r1-0528-qwen3-8b-gguf-unsloth
DeepSeek-R1-0528-Qwen3-8B-GGUF | AI Model Details
Hugging Face
huggingface.co › unsloth › DeepSeek-R1-0528-Qwen3-8B-GGUF › blob › main › DeepSeek-R1-0528-Qwen3-8B-Q3_K_M.gguf
DeepSeek-R1-0528-Qwen3-8B-Q3_K_M.gguf · unsloth/DeepSeek-R1-0528-Qwen3-8B-GGUF at main
GGUF · qwen3 · unsloth · deepseek · qwen · conversational · arxiv: 2501.12948 · License: mit · Model card Files Files and versions · xet Community · 10 · Deploy · Use this model · main · DeepSeek-R1-0528-Qwen3-8B-GGUF / DeepSeek-R1-0528-Qwen3-8B-Q3_K_M.gguf ·
Runpod
runpod.io › home › blog › the 'minor upgrade' that’s anything but: deepseek r1 0528 deep dive
The 'Minor Upgrade' That’s Anything But: DeepSeek R1 0528 Deep Dive | Runpod Blog
Perhaps most impressively, the ... DeepSeek-R1-0528-Qwen3-8B. This model achieves state-of-the-art (SOTA) performance among open-source models on the AIME 2024, surpassing Qwen3 8B by +10.0% and matching the performance of Qwen3-235B-thinking....
Hugging Face
huggingface.co › unsloth › DeepSeek-R1-0528-Qwen3-8B-GGUF › discussions › 4
unsloth/DeepSeek-R1-0528-Qwen3-8B-GGUF · How to disable reasoning?
Because I had the issue with Qwen3 models mistyping it! Maybe they use the same special token; would need to check their ... Edit lol: looking at https://huggingface.co/deepseek-ai/DeepSeek-R1-0528-Qwen3-8B/raw/main/tokenizer.json, the dictionary unfortunately doesn't contain any /think or /nothink (nor /no_think), so it's all or nothing :D So the only solution is the one already mentioned (I think it's called hot steering or test time steering).
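The workaround discussed in that thread (starting the reply with an empty reasoning block, since the tokenizer has no /think or /nothink switch) can be sketched roughly as below. This is a minimal illustration, not the model's official template: the helper names `prefill_no_think` and `strip_think`, the exact whitespace inside the empty block, and the idea that your runtime lets you prefill the start of the assistant turn are all assumptions.

```python
import re

# Assumed form of the empty reasoning block; the exact whitespace is a guess.
EMPTY_THINK = "<think>\n\n</think>\n\n"


def prefill_no_think(prompt_up_to_assistant: str) -> str:
    """Append an empty think block so generation continues after it.

    `prompt_up_to_assistant` is a hypothetical, already-templated prompt
    that ends exactly where the assistant's reply would begin.
    """
    return prompt_up_to_assistant + EMPTY_THINK


def strip_think(text: str) -> str:
    """Remove any <think>...</think> block from generated text."""
    return re.sub(r"<think>.*?</think>\s*", "", text, flags=re.DOTALL)


if __name__ == "__main__":
    print(repr(prefill_no_think("USER: hi\nASSISTANT: ")))
    print(strip_think("<think>some reasoning</think>\n\nHello"))
```

If prefilling is not available in your runtime, `strip_think` alone only hides the reasoning after the fact; the model still spends tokens producing it.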