Hugging Face
huggingface.co › deepseek-ai › DeepSeek-R1-0528
deepseek-ai/DeepSeek-R1-0528 · Hugging Face
1 month ago - Meanwhile, we distilled the chain-of-thought from DeepSeek-R1-0528 to post-train Qwen3 8B Base, obtaining DeepSeek-R1-0528-Qwen3-8B. This model achieves state-of-the-art (SOTA) performance among open-source models on the AIME 2024, surpassing Qwen3 8B by +10.0% and matching the performance of Qwen3-235B-thinking.
DeepSeek-R1-0528 🔥
And MIT License, as always.
What does DeepSeek R1 0528 do that DeepSeek R1 can't
It finally manages to think in languages other than English, and it's the second model to pass my native-language poetry test, after Gemini 2.5 Pro.
DeepSeek: R1 0528 is lethal
The lack of sleep due to the never-ending stream of new and better models may be lethal.
DeepSeek’s new R1-0528-Qwen3-8B is the most intelligent 8B parameter model yet, but not by much: Alibaba’s own Qwen3 8B is just one point behind
Those benchmarks are a meme. ArtificialAnalysis uses benchmarks established by other research groups, which are often old and overtrained, so they aren't reliable. They carefully show or hide models on the default list to paint a picture of bigger models doing better, but when you enable Qwen3 8B and 32B with reasoning, it all falls apart. It's nice enough for bragging about a model on LinkedIn, and they are somewhat useful - they seem to be independent, and the image and video arenas are great - but they aren't capable of maintaining leak-proof expert benchmarks. Look at math reasoning:
DeepSeek R1 0528 (May '25) - 94
Qwen3 14B (Reasoning) - 86
Qwen3 8B (Reasoning) - 83
DeepSeek R1 (Jan '25) - 82
DeepSeek R1 0528 Qwen3 8B - 79
Claude 3.7 Sonnet (thinking) - 72
Overall bench (Intelligence Index):
DeepSeek R1 (Jan '25) - 60
Qwen3 32B (Reasoning) - 59
Do you believe it makes sense for Qwen3 8B to score above DeepSeek R1, or for Claude 3.7 Sonnet to be outclassed by DeepSeek R1 0528 Qwen3 8B by a big margin? Another bench - LiveCodeBench:
Qwen3 14B (Reasoning) - 52
Claude 3.7 Sonnet (thinking) - 47
Why are devs using Claude 3.7/4 in Windsurf/Cursor/Roo/Cline/Aider and not Qwen3 14B? Qwen3 14B is apparently a much better coder lmao. I can't call it benchmark contamination, but it's definitely overfit to benchmarks. For god's sake, when you let base Qwen 2.5 32B non-Instruct generate random tokens with a trash prompt, it will often generate MMLU-style question-and-answer pairs on its own. It's trained to do well on the benchmarks they test on.
Videos
15:30
DeepSeek R1 0528 - Better Coding & Tool Calling | Is It Faster Now?
06:04
DeepSeek R1 0528 in 6 Minutes - YouTube
r/LocalLLaMA on Reddit: deepseek r1 0528 qwen 8b on android MNN chat
03:12
Run DeepSeek R1 Locally. Easiest Method - YouTube
r/DeepSeek on Reddit: DeepSeek-R1-0528 + MCP → one model, 10 ...
X
x.com › deepseek_ai › status › 1928061589107900779
🚀 DeepSeek-R1-0528 is here! 🔹 Improved benchmark ...
🔹 Improved benchmark performance
🔹 Enhanced front-end capabilities
🔹 Reduced hallucinations
🔹 Supports JSON output & function calling
Try it now: https://chat.deepseek.com
No change to API usage — docs here: https://api-docs.deepseek.com/guides/reasoning_model…
Open-source weights: https://huggingface.co/deepseek-ai/DeepSeek-R1-0528…
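The post says API usage is unchanged and the model supports JSON output. A minimal sketch of what such a request body could look like, assuming the OpenAI-compatible endpoint and the `deepseek-reasoner` model name from the linked docs; the `response_format` JSON-output flag is likewise an assumption to check against those docs:

```python
import json

# Sketch of a request body for DeepSeek's OpenAI-compatible chat API.
# API_URL and the "deepseek-reasoner" model name are taken from the
# public api-docs; the response_format option is an assumed JSON-output
# flag, not confirmed by the post itself.
API_URL = "https://api.deepseek.com/chat/completions"

def build_request(prompt: str) -> dict:
    """Build the JSON body for a reasoning-model request with JSON output."""
    return {
        "model": "deepseek-reasoner",  # R1-series reasoning model
        "messages": [{"role": "user", "content": prompt}],
        "response_format": {"type": "json_object"},  # assumed JSON-output flag
        "stream": False,
    }

print(json.dumps(build_request('Reply with {"ok": true}.')))
```

Only the request body is built here; sending it requires an API key and an HTTP client of your choice.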
Reddit
reddit.com › r/localllama › deepseek-r1-0528 🔥
r/LocalLLaMA on Reddit: DeepSeek-R1-0528 🔥
May 28, 2025 -
https://huggingface.co/deepseek-ai/DeepSeek-R1-0528
Google Cloud Platform
console.cloud.google.com › vertex-ai › publishers › deepseek-ai › model-garden › deepseek-r1-0528-maas
DeepSeek R1 0528 API Service – Vertex AI
YouTube
youtube.com › watch
Deepseek R1 0528 685b Q4 Full Local Ai Review - YouTube
Quad 3090 Ryzen AI Rig Build 2025 video w/cheaper components vs EPYC build: https://youtu.be/So7tqRSZ0s8
Written build guide with all the updated AI rig compo...
Published May 31, 2025
Together AI
together.ai › models › deepseek-r1
DeepSeek-R1-0528 API | Together AI
Reddit
reddit.com › r/deepseek › what does deepseek r1 0528 do that deepseek r1 can't
r/DeepSeek on Reddit: What does DeepSeek R1 0528 do that DeepSeek R1 can't
March 29, 2025 -
What's different in DeepSeek R1 0528 compared to the original R1? Any improvements or issues you've noticed? I'm curious to hear your experience with it...
Reddit
reddit.com › r/localllama › deepseek: r1 0528 is lethal
r/LocalLLaMA on Reddit: DeepSeek: R1 0528 is lethal
May 28, 2025 -
I just used DeepSeek: R1 0528 to address several ongoing coding challenges in RooCode.
This model performed exceptionally well, resolving all issues seamlessly. I hit up DeepSeek via OpenRouter, and the results were DAMN impressive.
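The workflow the poster describes (reaching R1 0528 through OpenRouter) can be sketched as an OpenAI-style chat request against OpenRouter's endpoint. The endpoint URL and the model slug `deepseek/deepseek-r1-0528` are assumptions taken from OpenRouter's public docs, not from the post:

```python
import json

# Sketch of an OpenRouter request for DeepSeek R1 0528.
# OPENROUTER_URL and the model slug are assumptions from OpenRouter's
# docs; verify both before use. The request is only constructed here,
# never sent, so no real key is needed to follow along.
OPENROUTER_URL = "https://openrouter.ai/api/v1/chat/completions"

def make_call(api_key: str, prompt: str) -> tuple[dict, bytes]:
    """Return (headers, body) for an OpenAI-compatible chat request."""
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
    body = json.dumps({
        "model": "deepseek/deepseek-r1-0528",  # assumed OpenRouter slug
        "messages": [{"role": "user", "content": prompt}],
    }).encode()
    return headers, body

headers, body = make_call("sk-or-...", "Fix this failing RooCode task.")
print(json.loads(body)["model"])
```

Any HTTP client (e.g. `urllib.request` or `requests`) can POST `body` with `headers` to `OPENROUTER_URL`.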