deepseek-r1-zero huggingface - Brave Search

huggingface.co › deepseek-ai › DeepSeek-R1

deepseek-ai/DeepSeek-R1 · Hugging Face

DeepSeek-R1-Zero demonstrates capabilities such as self-verification, reflection, and generating long CoTs, marking a significant milestone for the research community. Notably, it is the first open research to validate that reasoning capabilities ...

huggingface.co › deepseek-ai › DeepSeek-R1-Zero

deepseek-ai/DeepSeek-R1-Zero · Hugging Face

Videos

Deploying Deepseek R1 on Hugging Face - YouTube

How to Run the Deepseek R1 LLM on Windows (7b mini) #ollama ...

January 29, 2025

Hugging Face Journal Club - DeepSeek R1 - YouTube

January 22, 2025

ArrrZero: Why DeepSeek R1 is less important than R1-Zero - YouTube

January 31, 2025

DeepSeek R1 Theory Overview | GRPO + RL + SFT - YouTube

January 31, 2025

DeepSeek-R1 vs DeepSeek-R1-Zero - YouTube

January 20, 2025

huggingface.co › unsloth › DeepSeek-R1-Zero

unsloth/DeepSeek-R1-Zero · Hugging Face

huggingface.co › unsloth › DeepSeek-R1-Zero-GGUF

unsloth/DeepSeek-R1-Zero-GGUF · Hugging Face

github.com › deepseek-ai › DeepSeek-R1

GitHub - deepseek-ai/DeepSeek-R1

DeepSeek-R1-Zero: 671B total parameters, 37B activated parameters, 128K context length (🤗 HuggingFace) · DeepSeek-R1: 671B total parameters, 37B activated parameters, 128K context length (🤗 HuggingFace). DeepSeek-R1-Zero and DeepSeek-R1 are trained based on DeepSeek-V3-Base. For more details regarding the model architecture, please refer to the DeepSeek-V3 repository.

Starred by 91.6K users

Forked by 11.8K users

huggingface.co › unsloth › DeepSeek-R1-GGUF

unsloth/DeepSeek-R1-GGUF · Hugging Face

huggingface.co › deepseek-ai › DeepSeek-R1-Zero › tree › main

deepseek-ai/DeepSeek-R1-Zero at main

DeepSeek-R1-Zero · 689 GB · 5 contributors · History: 27 commits · Latest commit: msr2000, "Small fix" (7223428), 9 months ago · figures (Release DeepSeek-R1, 11 months ago) · .gitattributes, 1.52 kB (initial commit, 11 months ago) · LICENSE, 1.06 kB (Release DeepSeek-R1, 11 months ago) ·

github.com › huggingface › open-r1

GitHub - huggingface/open-r1: Fully open reproduction of DeepSeek-R1

Step 2: replicate the pure RL pipeline that DeepSeek used to create R1-Zero.

Starred by 25.7K users

Forked by 2.4K users

Languages Python 89.4% | Shell 10.0% | Makefile 0.6%

huggingface.co › blog › open-r1

Open-R1: a fully open reproduction of DeepSeek-R1

DeepSeek also introduced two models: DeepSeek-R1-Zero and DeepSeek-R1, each with a distinct training approach. DeepSeek-R1-Zero skipped supervised fine-tuning altogether and relied entirely on reinforcement learning (RL), using Group Relative Policy Optimization (GRPO) to make the process more ...

huggingface.co › deepseek-ai › DeepSeek-R1-Zero › discussions › 8

deepseek-ai/DeepSeek-R1-Zero · Thank you deepseek

The comment above is understated. I just glanced at their DeepSeek R1 paper. They have: Proved that LLMs can be post-trained to reason with CoT without any human supervised data, with R1-Zero.

huggingface.co › unsloth › DeepSeek-R1

unsloth/DeepSeek-R1 · Hugging Face

huggingface.co › learn › llm-course › en › chapter12 › 3

Understanding the DeepSeek R1 Paper - Hugging Face LLM Course

DeepSeek-R1-Zero: A model trained purely using reinforcement learning.

huggingface.co › deepseek-ai › DeepSeek-R1-Zero › commit › cfa4541c2f577dafd72f51f2b14119bde0f3c812

Release DeepSeek-R1 · deepseek-ai/DeepSeek-R1-Zero at cfa4541

deepseek_v3 · conversational · custom_code · text-generation-inference · fp8 · arXiv: 2501.12948 · License: mit · msr2000 committed on Jan 20 · Commit cfa4541

huggingface.co › blog › NormalUhr › deepseek-r1-explained

From Zero to Reasoning Hero: How DeepSeek-R1 Leverages Reinforcement Learning to Master Complex Reasoning

DeepSeek-R1-Zero: A model that learns complex reasoning behaviors purely through reinforcement learning without any supervised fine-tuning, showing emergent abilities like extended chain-of-thought, reflection, and self-correction.

unsloth.ai › blog › deepseek-r1

Run Deepseek-R1 / R1 Zero

DeepSeek also distilled from R1 and fine-tuned it on Llama 3 and Qwen 2.5 models, meaning you can now also fine-tune the models out of the box with Unsloth. See our collection for all versions of the R1 model series, including GGUFs, 4-bit and more: huggingface.co/collections/unsloth/deepseek-r1. Jan 27, 2025 update: We've released 1.58-bit Dynamic GGUFs for DeepSeek-R1, allowing you to run R1 even better with an 80% size reduction: 1.58-bit Dynamic R1. Feb 6, 2025 update: You can now train your own reasoning model like R1 using: GRPO + Unsloth

huggingface.co › deepseek-ai › DeepSeek-R1-Zero › tree › 945a40ef65921ea791b3d8232f7a8bb8c35a67f0 › figures

deepseek-ai/DeepSeek-R1-Zero at 945a40ef65921ea791b3d8232f7a8bb8c35a67f0

DeepSeek-R1-Zero / figures · 5 contributors · History: 1 commit · msr2000 · Release DeepSeek-R1 cfa4541 5 months ago · benchmark.jpg · Safe · 777 kB

reddit.com › r/localllama › deepseek r1 / r1 zero

r/LocalLLaMA on Reddit: Deepseek R1 / R1 Zero

January 20, 2025 - When asked for its name in chat.deepseek.com, it says DeepSeek R1. ... I use it from Open-WebUI via OpenRouter. ... There seems to be distilled versions as well: https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Llama-70B

huggingface.co › deepseek-ai › DeepSeek-R1-Zero › discussions › 20

deepseek-ai/DeepSeek-R1-Zero · locally running ideas

deepseek_v3 · conversational · custom_code · text-generation-inference · fp8 · arXiv: 2501.12948 · License: mit · Discussion #20 · by gopi87 · opened Jan 30

huggingface.co › deepseek-ai

deepseek-ai (DeepSeek)

Org profile for DeepSeek on Hugging Face, the AI community building the future.

huggingface.co › DevQuasar › deepseek-ai.DeepSeek-R1-Zero-GGUF

DevQuasar/deepseek-ai.DeepSeek-R1-Zero-GGUF · Hugging Face

Quantized version of: deepseek-ai/DeepSeek-R1-Zero · Downloads last month: 41 · GGUF · Model size: 671B params · Architecture: deepseek2 · 2-bit: Q2_K, 244 GB · 3-bit ·