🌐
DeepSeek
deepseek.com › en
DeepSeek
Research: DeepSeek R1 · DeepSeek V3 · DeepSeek Coder V2 · DeepSeek VL · DeepSeek V2 · DeepSeek Coder · DeepSeek Math · DeepSeek LLM

Chinese artificial intelligence company

Hangzhou DeepSeek Artificial Intelligence Basic Technology Research Co., Ltd., doing business as DeepSeek, is a Chinese artificial intelligence (AI) company that develops large language models (LLMs). Based in Hangzhou, Zhejiang, DeepSeek is … Wikipedia
Factsheet
Native name 杭州深度求索人工智能基础技术研究有限公司
Company type Private
🌐
Hugging Face
huggingface.co › deepseek-ai › DeepSeek-R1
deepseek-ai/DeepSeek-R1 · Hugging Face
1 month ago - DeepSeek-R1-Zero, a model trained via large-scale reinforcement learning (RL) without supervised fine-tuning (SFT) as a preliminary step, demonstrated remarkable performance on reasoning.
Discussions

How was DeepSeek-R1 built; For dummies
Just wrote about it, it's absolutely great, and the "less is more" approach will definitely redefine AI as we know it More on reddit.com
🌐 r/LLMDevs
59
876
January 27, 2025
Notes on Deepseek r1: Just how good it is compared to OpenAI o1
Aside from the model itself, this shows that OpenAI isn't that far ahead of the others anymore. I mean, OpenAI still has the money and the hype, but a year ago, no one could beat them. The game has changed, surely. Of course OpenAI is gonna make moves, but this is a huge W for LLMs in general More on reddit.com
🌐 r/LocalLLaMA
486
1266
October 25, 2024
🌐
arXiv
arxiv.org › abs › 2501.12948
[2501.12948] DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning
January 22, 2025 - DeepSeek-R1-Zero, a model trained via large-scale reinforcement learning (RL) without supervised fine-tuning (SFT) as a preliminary step, demonstrates remarkable reasoning capabilities.
🌐
GitHub
github.com › deepseek-ai › DeepSeek-R1
GitHub - deepseek-ai/DeepSeek-R1
DeepSeek-R1-Zero, a model trained via large-scale reinforcement learning (RL) without supervised fine-tuning (SFT) as a preliminary step, demonstrated remarkable performance on reasoning.
Starred by 91.6K users
Forked by 11.8K users
🌐
MIT Technology Review
technologyreview.com › artificial intelligence › quantum physicists have shrunk and “de-censored” deepseek r1
Quantum physicists have shrunk and “de-censored” DeepSeek R1 | MIT Technology Review
November 19, 2025 - A group of quantum physicists claims to have created a version of the powerful reasoning AI model DeepSeek R1 that strips out the censorship built into the original by its Chinese creators.
🌐
Unsloth
docs.unsloth.ai › get-started › unsloth-model-catalog
Unsloth Model Catalog | Unsloth Documentation
4 days ago - Catalog table (Model / Variant / GGUF / Instruct 4-bit) covering DeepSeek-R1 (R1-0528, R1-0528-Qwen3-8B), DeepSeek-V3.1 (Terminus), and DeepSeek-V3 (V3-0324), with links to the quantized uploads.
Find elsewhere
🌐
DeepSeek
api-docs.deepseek.com › deepseek-r1 release 2025/01/20
DeepSeek-R1 Release | DeepSeek API Docs
🛠️ DeepSeek-R1: Technical Highlights · 📈 Large-scale RL in post-training · 🏆 Significant performance boost with minimal labeled data · 🔢 Math, code, and reasoning tasks on par with OpenAI-o1 · 📄 More details: https://github.com/deepseek-ai/DeepSeek-R1/blob/main/DeepSeek_R1.pdf ·
🌐
Understanding AI
understandingai.org › p › the-best-chinese-open-weight-models
The best Chinese open-weight models — and the strongest US rivals
1 week ago - And DeepSeek R1 powered the first consumer-facing chatbot to show the full chain of thought before answering.
🌐
YouTube
youtube.com › watch
DeepSeek R1 - Everything you need to know - YouTube
Greg's DeepSeek Cheat Sheet: From Installation to Expert Prompting: https://www.gregisenberg.com/deepseek. Ray Fernando, a former Apple engineer, gives an in-depth...
Published   January 29, 2025
🌐
Nature
nature.com › articles › article
DeepSeek-R1 incentivizes reasoning in LLMs through reinforcement learning | Nature
September 17, 2025 - A new artificial intelligence model, DeepSeek-R1, is introduced, demonstrating that the reasoning abilities of large language models can be incentivized through pure reinforcement learning, removing the need for human-annotated demonstrations.
🌐
Ollama
ollama.com › library › deepseek-r1
deepseek-r1
DeepSeek-R1 is a family of open reasoning models with performance approaching that of leading models, such as OpenAI o3 and Gemini 2.5 Pro.
🌐
DeepSeek
deepseek-r1.com
DeepSeek R1 Online (Free|Nologin)
DeepSeek R1 Online (Free|Nologin) is an open-source AI model for advanced reasoning that beats OpenAI o1
🌐
Reddit
reddit.com › r/llmdevs › how was deepseek-r1 built; for dummies
r/LLMDevs on Reddit: How was DeepSeek-R1 built; For dummies
January 27, 2025 -

Over the weekend I wanted to learn how DeepSeek-R1 was trained, and what was so revolutionary about it. So I ended up reading the paper, and wrote down my thoughts. (The linked article is, hopefully, written in a way that makes it easy for everyone to understand -- no PhD required!)

Here's a "quick" summary:

1/ DeepSeek-R1-Zero is trained with pure reinforcement learning (RL), without using labeled data. It's the first time someone has tried that and succeeded (that we know of; the o1 report didn't show much).

2/ Traditional RL frameworks (like PPO) have something like an "LLM coach or critic" that tells the model whether an answer was good or bad, based on given examples (labeled data). DeepSeek uses GRPO, a pure-RL framework that skips the critic: it samples a group of answers per prompt, scores them with predefined rules, and compares each answer against the group average.
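
A minimal sketch of that group-relative scoring in Python (my own illustration, not DeepSeek's code): each sampled answer's advantage is simply its deviation from its group's mean reward, so no learned value model / critic is needed.

    import statistics

    def grpo_advantages(rewards):
        """Group-relative advantages: normalize each sampled answer's
        reward against the mean/std of its own group -- no critic model."""
        mean = statistics.mean(rewards)
        std = statistics.pstdev(rewards) or 1.0  # guard against a zero-variance group
        return [(r - mean) / std for r in rewards]

    # e.g. rule-based rewards for 4 sampled answers to one prompt
    print(grpo_advantages([1.0, 0.0, 0.5, 1.0]))  # positive => better than the group average

Answers that beat their own group's average get pushed up; the rest get pushed down. That group statistic is what replaces the critic.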

3/ But how can you evaluate performance if you don't have labeled data to test against? With this framework, the rules aren't perfect; they're just a best guess at what "good" looks like. The RL process tries to optimize on things like:

Does the answer make sense? (Coherence)

Is it in the right format? (Completeness)

Does it match the general style we expect? (Fluency)

For mathematical tasks, for example, the DeepSeek-R1-Zero model could be rewarded for producing outputs that align with mathematical principles or logical consistency.

It makes sense.. and it works... to some extent!
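
As a toy illustration of such rule-based rewards (the specific checks and weights are mine, though the paper does use a <think>...</think> format reward and rule-verified answer accuracy for math):

    import re

    def rule_based_reward(output: str, reference_answer: str) -> float:
        """Toy rule-based reward: format check + verifiable-answer check."""
        reward = 0.0
        # Format rule: the reasoning must appear inside <think>...</think> tags
        if re.search(r"<think>.*?</think>", output, re.DOTALL):
            reward += 0.5
        # Accuracy rule: for math, the final boxed answer must match exactly
        match = re.search(r"\\boxed\{(.+?)\}", output)
        if match and match.group(1).strip() == reference_answer:
            reward += 1.0
        return reward

    print(rule_based_reward("<think>2 + 2 = 4</think> \\boxed{4}", "4"))  # 1.5

No labeled preference data, no reward model: just rules a program can verify.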

4/ This model (R1-Zero) had issues with poor readability and language mixing -- something you'd expect from pure-RL training. So the authors went with a multi-stage training process, something that feels like hacking together various training methods:

5/ The DeepSeek-R1 model then goes through a list of training methods, each serving a different purpose:

(i) the cold-start data lays a structured foundation, fixing issues like poor readability
(ii) pure RL develops reasoning almost on autopilot
(iii) rejection sampling + SFT brings in top-tier training data that improves accuracy, and
(iv) a final RL stage adds a further level of generalization.

And with that, they're doing as well as or better than the o1 models.

Lmk if you have any questions (i might be able to answer them).

🌐
Together AI
together.ai › models › deepseek-r1
DeepSeek-R1-0528 API | Together AI
Auto-generated Python quickstart snippets for the Together client, templated with model="deepseek-ai/DeepSeek-R1": a rerank call (client.rerank.create) over Together's stock sample documents and an embeddings call (client.embeddings.create).
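
Since R1 is a reasoning/chat model rather than a reranker or embedder, the representative call is chat completions. A minimal sketch, assuming the standard together Python SDK and the model slug shown on the page:

    from together import Together

    client = Together()  # reads TOGETHER_API_KEY from the environment
    response = client.chat.completions.create(
        model="deepseek-ai/DeepSeek-R1",
        messages=[{"role": "user", "content": "What is 17 * 24? Think step by step."}],
    )
    print(response.choices[0].message.content)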
🌐
Reddit
reddit.com › r/localllama › notes on deepseek r1: just how good it is compared to openai o1
r/LocalLLaMA on Reddit: Notes on Deepseek r1: Just how good it is compared to OpenAI o1
October 25, 2024 -

Finally, there is a model worthy of the hype it has been getting since Claude 3.6 Sonnet. Deepseek has released something hardly anyone expected: a reasoning model on par with OpenAI's o1 within a month of the v3 release, with an MIT license and at 1/20th of o1's cost.

This is easily the best release since GPT-4. It's wild; the general public seems excited about this, while the big AI labs are probably scrambling. It feels like things are about to speed up in the AI world. And it's all thanks to this new DeepSeek-R1 model and how they trained it. 

Some key details from the paper

  • Pure RL (GRPO) on v3-base to get r1-zero. (No Monte-Carlo Tree Search or Process Reward Modelling)

  • The model uses “Aha moments” as pivot tokens to reflect and reevaluate answers during CoT.

  • To overcome r1-zero’s readability issues, v3 was SFTd on cold start data.

  • Distillation works: small models like Qwen and Llama trained on r1-generated data show significant improvements.
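
A rough sketch of what that distillation step amounts to (all names here are hypothetical, not from the repo): the big teacher generates reasoning traces, and a small student is plain-SFT'd on them.

    def build_distillation_set(teacher_generate, prompts):
        """Collect (prompt, teacher reasoning trace) pairs for supervised fine-tuning."""
        return [(p, teacher_generate(p)) for p in prompts]

    # toy stand-in for r1; in practice the teacher is the full model behind an API
    fake_r1 = lambda p: f"<think>...reasoning about {p}...</think> final answer"
    pairs = build_distillation_set(fake_r1, ["prove sqrt(2) is irrational"])
    print(pairs[0][1])
    # then: run ordinary SFT of a small Qwen/Llama base model on these pairs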

Here’s the overall r1-zero pipeline:

  • v3 base + RL (GRPO) → r1-zero

r1 training pipeline (sketched as code after the list):

  1. DeepSeek-V3 Base + SFT (Cold Start Data) → Checkpoint 1

  2. Checkpoint 1 + RL (GRPO + Language Consistency) → Checkpoint 2

  3. Checkpoint 2 used to Generate Data (Rejection Sampling)

  4. DeepSeek-V3 Base + SFT (Generated Data + Other Data) → Checkpoint 3

  5. Checkpoint 3 + RL (Reasoning + Preference Rewards) → DeepSeek-R1
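
The same five stages as schematic Python; every function below is a stub standing in for an entire training stage, and the names are mine, not DeepSeek's code:

    def sft(model, data):
        return f"SFT({model})"

    def rl_grpo(model, reward):
        return f"RL({model} | {reward})"

    def rejection_sample(model):
        return [f"best sampled outputs of {model}"]

    def train_r1(v3_base, cold_start_data, other_data):
        ckpt1 = sft(v3_base, cold_start_data)                       # 1. cold-start SFT
        ckpt2 = rl_grpo(ckpt1, "reasoning + language consistency")  # 2. reasoning RL
        generated = rejection_sample(ckpt2)                         # 3. rejection sampling
        ckpt3 = sft(v3_base, generated + other_data)                # 4. SFT from the *base* again
        return rl_grpo(ckpt3, "reasoning + preference rewards")     # 5. final RL -> DeepSeek-R1

    print(train_r1("DeepSeek-V3-Base", ["cold-start data"], ["other data"]))

Note that step 4 restarts from the V3 base rather than continuing from checkpoint 2; only the data generated by checkpoint 2 carries over.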

We know the benchmarks, but just how good is it?

Deepseek r1 vs OpenAI o1.

So, for this, I tested r1 and o1 side by side on complex reasoning, math, coding, and creative writing problems. These are questions that previously only o1 could solve, or that no model could.

Here’s what I found:

  • For reasoning, it is much better than any SOTA model before o1. It is better than o1-preview but a notch below o1. This also shows in the ARC-AGI bench.

  • Mathematics: same story; r1 is a killer, but o1 is better.

  • Coding: I didn’t get to play much, but on first look, it’s up there with o1, and the fact that it costs 20x less makes it the practical winner.

  • Writing: This is where R1 takes the lead. It gives the same vibes as early Opus. It’s free, less censored, has much more personality, is easy to steer, and is very creative compared to the rest, even o1-pro.

What interested me was how free the model sounded and how its thought traces read, akin to a human internal monologue. Perhaps this is because of less stringent RLHF than the US models use.

The fact that you can get r1 from v3 via pure RL was the most surprising.

For in-depth analysis, commentary, and remarks on the Deepseek r1, check out this blog post: Notes on Deepseek r1

What are your experiences with the new Deepseek r1? Did you find the model useful for your use cases?

🌐
Hacker News
news.ycombinator.com › item
For those who haven't realized it yet, Deepseek-R1 is better than claude 3.5 and... | Hacker News
January 31, 2025 - It is simply smarter -- a lot less stupid, more careful, more astute, more aware, more meta-aware, etc · We know that Anthropic and OpenAI and Meta are panicking. They should be. The bar is a lot higher now
🌐
Microsoft Azure
azure.microsoft.com › blog home › ai + machine learning › deepseek r1 is now available on azure ai foundry and github
DeepSeek R1 is now available on Azure AI Foundry and GitHub | Microsoft Azure Blog
March 26, 2025 - DeepSeek R1 is now available in the model catalog on Azure AI Foundry and GitHub, joining a diverse portfolio of over 1,800 models, including frontier, open-source, industry-specific, and task-based AI models.
🌐
Replicate
replicate.com › deepseek-ai › deepseek-r1
deepseek-ai/deepseek-r1 | Run with an API on Replicate
DeepSeek-R1-Zero, a model trained via large-scale reinforcement learning (RL) without supervised fine-tuning (SFT) as a preliminary step, demonstrated remarkable performance on reasoning.
🌐
Medium
medium.com › @soaltinuc › deepseek-r1-the-open-source-ai-changing-the-game-in-technology-15132b99b9d7
DeepSeek-R1: The Open-Source AI Changing the Game in Technology | by Sinan Onur ALTINUÇ | Medium
January 27, 2025 - DeepSeek-R1 represents another milestone in the history of open-source AI, offering superior skills such as reasoning efficiency, problem-solving capabilities, and cost-effectiveness, alongside a widely available alternative compared to proprietary ...
🌐
Built In
builtin.com › artificial-intelligence › deepseek-r1
What Is DeepSeek-R1? | Built In
DeepSeek-R1, or R1, is an open source language model made by Chinese AI startup DeepSeek that can perform the same text-based tasks as other advanced models. R1 is also open sourced under an MIT license, allowing free commercial and academic use, ...
Published   October 6, 2025