Hugging Face
huggingface.co › deepseek-ai › DeepSeek-R1-0528
deepseek-ai/DeepSeek-R1-0528 · Hugging Face
1 month ago - Meanwhile, we distilled the chain-of-thought from DeepSeek-R1-0528 to post-train Qwen3 8B Base, obtaining DeepSeek-R1-0528-Qwen3-8B. This model achieves state-of-the-art (SOTA) performance among open-source models on the AIME 2024, surpassing Qwen3 8B by +10.0% and matching the performance of Qwen3-235B-thinking.
DeepSeek-R1-0528 🔥
And MIT License, as always.
What does DeepSeek R1 0528 do that DeepSeek R1 can't
It finally manages to think in languages other than English, and it's the second model to pass my native-language poetry test, after Gemini 2.5 Pro.
DeepSeek: R1 0528 is lethal
The lack of sleep due to the never-ending stream of new and better models may be lethal.
DeepSeek’s new R1-0528-Qwen3-8B is the most intelligent 8B parameter model yet, but not by much: Alibaba’s own Qwen3 8B is just one point behind
Those benchmarks are a meme. ArtificialAnalysis uses benchmarks established by other research groups, which are often old and overtrained, so they aren't reliable. They carefully show or hide models on the default list to paint a picture of bigger models doing better, but when you enable Qwen3 8B and 32B with reasoning, it all falls apart. It's nice enough for bragging about a model on LinkedIn, and they are somewhat useful - they seem to be independent, and the image and video arenas are great - but they aren't capable of maintaining leak-proof expert benchmarks. Look at math reasoning:
DeepSeek R1 0528 (May '25) - 94
Qwen3 14B (Reasoning) - 86
Qwen3 8B (Reasoning) - 83
DeepSeek R1 (Jan '25) - 82
DeepSeek R1 0528 Qwen3 8B - 79
Claude 3.7 Sonnet (thinking) - 72
Overall bench (Intelligence Index):
DeepSeek R1 (Jan '25) - 60
Qwen3 32B (Reasoning) - 59
Do you believe it makes sense for Qwen3 8B to score above DeepSeek R1, or for Claude 3.7 Sonnet to be outclassed by DeepSeek R1 0528 Qwen3 8B by a big margin? Another bench - LiveCodeBench:
Qwen3 14B (Reasoning) - 52
Claude 3.7 Sonnet (thinking) - 47
Why are devs using Claude 3.7/4 in Windsurf/Cursor/Roo/Cline/Aider and not Qwen3 14B? Qwen3 14B is apparently a much better coder lmao. I can't call it benchmark contamination, but it's definitely overfit to benchmarks. For god's sake, when you let base Qwen 2.5 32B non-Instruct generate random tokens with a trash prompt, it will often generate MMLU-style question-and-answer pairs on its own. It's trained to do well on the benchmarks they test on.
Videos
15:30
DeepSeek R1 0528 - Better Coding & Tool Calling | Is It Faster Now?
06:04
DeepSeek R1 0528 in 6 Minutes - YouTube
r/LocalLLaMA on Reddit: deepseek r1 0528 qwen 8b on android MNN chat
03:12
Run DeepSeek R1 Locally. Easiest Method - YouTube
r/DeepSeek on Reddit: DeepSeek-R1-0528 + MCP → one model, 10 ...
X
x.com › deepseek_ai › status › 1928061589107900779
🚀 DeepSeek-R1-0528 is here! 🔹 Improved benchmark ...
🔹 Improved benchmark performance
🔹 Enhanced front-end capabilities
🔹 Reduced hallucinations
🔹 Supports JSON output & function calling
Try it now: https://chat.deepseek.com
No change to API usage — docs here: https://api-docs.deepseek.com/guides/reasoning_model…
Open-source weights: https://huggingface.co/deepseek-ai/DeepSeek-R1-0528…
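The post says API usage is unchanged and the model supports JSON output. A minimal sketch of what such a request body could look like, assuming the OpenAI-compatible endpoint and the `deepseek-reasoner` model name from the linked docs; the `response_format` JSON-output flag is likewise an assumption to check against those docs:

```python
import json

# Sketch of a request body for DeepSeek's OpenAI-compatible chat API.
# API_URL and the "deepseek-reasoner" model name are taken from the
# public api-docs; the response_format option is an assumed JSON-output
# flag, not confirmed by the post itself.
API_URL = "https://api.deepseek.com/chat/completions"

def build_request(prompt: str) -> dict:
    """Build the JSON body for a reasoning-model request with JSON output."""
    return {
        "model": "deepseek-reasoner",  # R1-series reasoning model
        "messages": [{"role": "user", "content": prompt}],
        "response_format": {"type": "json_object"},  # assumed JSON-output flag
        "stream": False,
    }

print(json.dumps(build_request('Reply with {"ok": true}.')))
```

Only the request body is built here; sending it requires an API key and an HTTP client of your choice.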
Reddit
reddit.com › r/localllama › deepseek-r1-0528 🔥
r/LocalLLaMA on Reddit: DeepSeek-R1-0528 🔥
May 28, 2025 -
https://huggingface.co/deepseek-ai/DeepSeek-R1-0528
Google Cloud Platform
console.cloud.google.com › vertex-ai › publishers › deepseek-ai › model-garden › deepseek-r1-0528-maas
DeepSeek R1 0528 API Service – Vertex AI
YouTube
youtube.com › watch
Deepseek R1 0528 685b Q4 Full Local Ai Review - YouTube
Quad 3090 Ryzen AI Rig Build 2025 video w/cheaper components vs EPYC build: https://youtu.be/So7tqRSZ0s8
Written build guide with all the updated AI rig compo...
Published May 31, 2025
Together AI
together.ai › models › deepseek-r1
DeepSeek-R1-0528 API | Together AI
Reddit
reddit.com › r/deepseek › what does deepseek r1 0528 do that deepseek r1 can't
r/DeepSeek on Reddit: What does DeepSeek R1 0528 do that DeepSeek R1 can't
March 29, 2025 -
What's different in DeepSeek R1 0528 compared to the original R1? Any improvements or issues you've noticed? I'm curious to hear your experience with it...
Reddit
reddit.com › r/localllama › deepseek: r1 0528 is lethal
r/LocalLLaMA on Reddit: DeepSeek: R1 0528 is lethal
May 28, 2025 -
I just used DeepSeek: R1 0528 to address several ongoing coding challenges in RooCode.
This model performed exceptionally well, resolving all issues seamlessly. I hit up DeepSeek via OpenRouter, and the results were DAMN impressive.
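The workflow the poster describes (reaching R1 0528 through OpenRouter) can be sketched as an OpenAI-style chat request against OpenRouter's endpoint. The endpoint URL and the model slug `deepseek/deepseek-r1-0528` are assumptions taken from OpenRouter's public docs, not from the post:

```python
import json

# Sketch of an OpenRouter request for DeepSeek R1 0528.
# OPENROUTER_URL and the model slug are assumptions from OpenRouter's
# docs; verify both before use. The request is only constructed here,
# never sent, so no real key is needed to follow along.
OPENROUTER_URL = "https://openrouter.ai/api/v1/chat/completions"

def make_call(api_key: str, prompt: str) -> tuple[dict, bytes]:
    """Return (headers, body) for an OpenAI-compatible chat request."""
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
    body = json.dumps({
        "model": "deepseek/deepseek-r1-0528",  # assumed OpenRouter slug
        "messages": [{"role": "user", "content": prompt}],
    }).encode()
    return headers, body

headers, body = make_call("sk-or-...", "Fix this failing RooCode task.")
print(json.loads(body)["model"])
```

Any HTTP client (e.g. `urllib.request` or `requests`) can POST `body` with `headers` to `OPENROUTER_URL`.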