The web interface, as well as the official API, uses the actual, full, 671B R1 model. ALL other "R1" models are fine-tunes of existing architectures (Llama 3.3 and Qwen 2.5). Answer from Zalathustra on reddit.com
Hugging Face
huggingface.co › deepseek-ai › DeepSeek-R1
deepseek-ai/DeepSeek-R1 · Hugging Face
1 month ago - DeepSeek-R1-Distill-Qwen-32B outperforms OpenAI-o1-mini across various benchmarks, achieving new state-of-the-art results for dense models. NOTE: Before running DeepSeek-R1 series models locally, we kindly recommend reviewing the Usage Recommendation section.
Discussions

Yes, you can run DeepSeek-R1 locally on your device (20GB RAM min.)
Props for your work! "sum of your VRAM+CPU = 80GB+": this should read "VRAM+RAM", shouldn't it? More on reddit.com
r/selfhosted · February 13, 2025
Hardware requirements for running the full size deepseek R1 with ollama?
Step 1: Acquire ~25 4090s. Step 2: Network them together. Step 3: Start homebrew nuclear cold-fusion reactor. Step 4: Plug into cluster. More on reddit.com
r/ollama · October 1, 2024
Snowkylin
snowkylin.github.io › blogs › a-note-on-deepseek-r1.html
A Note on DeepSeek R1 Deployment
January 30, 2025 - The original DeepSeek R1 671B model is 720GB in size, which is huge. Even a $200k monster like the NVIDIA DGX H100 (with 8xH100) can barely hold it. Here I use Unsloth AI's dynamically quantized version, which selectively quantizes a few important layers to higher bits, while leaving most of the ...
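To see why selectively keeping some layers at higher bits helps, here is a toy sketch (an assumption-laden illustration, not Unsloth's actual method): absmax uniform quantization applied at 2-bit and 4-bit to a well-behaved weight matrix versus one with rare outliers. Outlier-heavy tensors lose far more accuracy at low bit widths, which is why they are worth more bits:

```python
import numpy as np

def absmax_quantize(w: np.ndarray, bits: int) -> np.ndarray:
    """Symmetric uniform quantization: snap weights to `bits`-bit
    integer levels scaled by the tensor's absolute maximum."""
    levels = 2 ** (bits - 1) - 1          # e.g. 1 level for 2-bit, 7 for 4-bit
    scale = np.abs(w).max() / levels
    return np.round(w / scale) * scale    # quantize, then dequantize

rng = np.random.default_rng(0)
typical = rng.normal(0, 0.02, (1024, 1024))          # well-behaved layer
sensitive = typical.copy()
sensitive[rng.random(typical.shape) < 1e-4] *= 50.0  # inject rare large outliers

for name, w in [("typical", typical), ("sensitive", sensitive)]:
    for bits in (2, 4):
        err = np.sqrt(np.mean((w - absmax_quantize(w, bits)) ** 2))
        print(f"{name:9s} layer @ {bits}-bit: RMSE {err:.5f}")
```

The outlier-heavy matrix inflates the quantization scale, so its 2-bit reconstruction error is far worse; spending extra bits only on such layers keeps the total size small while preserving quality.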
Fireworks AI
fireworks.ai › blog › deepseek-r1-deepdive
DeepSeek-R1 Overview: Features, Capabilities, Parameters
Operational expenses are estimated ... spend on OpenAI's o1 model. Cost of running DeepSeek R1 on Fireworks AI is $8/1M tokens (both input & output), whereas running the OpenAI o1 model costs $15/1M input tokens and $60/ ...
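To make the comparison above concrete, here is a back-of-the-envelope calculation using the rates quoted in the snippet. The workload mix is invented for illustration, and the truncated $60 figure is assumed to be o1's per-1M output-token rate:

```python
# Rates in USD per 1M tokens, taken from the Fireworks snippet above.
R1_IN, R1_OUT = 8.00, 8.00     # DeepSeek R1 on Fireworks: $8/1M, input & output
O1_IN, O1_OUT = 15.00, 60.00   # OpenAI o1: $15/1M input; $60/1M assumed output rate

def cost(in_tok: int, out_tok: int, in_rate: float, out_rate: float) -> float:
    """Total cost in USD for a given number of input/output tokens."""
    return (in_tok * in_rate + out_tok * out_rate) / 1_000_000

# Hypothetical workload: 2M input tokens, 500K output tokens.
print(f"R1: ${cost(2_000_000, 500_000, R1_IN, R1_OUT):.2f}")   # $20.00
print(f"o1: ${cost(2_000_000, 500_000, O1_IN, O1_OUT):.2f}")   # $60.00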
GitHub
github.com › deepseek-ai › DeepSeek-R1
GitHub - deepseek-ai/DeepSeek-R1
DeepSeek-R1 achieves performance comparable to OpenAI-o1 across math, code, and reasoning tasks. To support the research community, we have open-sourced DeepSeek-R1-Zero, DeepSeek-R1, and six dense models distilled from DeepSeek-R1 based on Llama and Qwen.
91.6K stars · 11.8K forks
DeepSeek
api-docs.deepseek.com › models & pricing
Models & Pricing | DeepSeek API Docs
The prices listed below are per 1M tokens. A token, the smallest unit of text that the model recognizes, can be a word, a number, or even a punctuation mark. We will bill based on the total number of input and output tokens used by the model.
Towards AI
pub.towardsai.net › deepseek-r1-model-architecture-853fefac7050
DeepSeek-R1: Model Architecture. This article provides an in-depth… | by Shakti Wadekar | Towards AI
March 13, 2025 - Efficient Learning: Experts specialize in different aspects of data, improving generalization. Computation Savings: Since only a subset of experts is used per token, MoE models are cheaper to run compared to dense models of the same size. DeepSeek-V3/R1 has 671 billion total parameters, of ...
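The "subset of experts per token" point is the essence of MoE and is easy to see in code. The following is a minimal, hypothetical top-k gating sketch with toy sizes and random weights; it illustrates the generic MoE pattern, not DeepSeek's actual router, which uses far more experts and a different gating scheme:

```python
import numpy as np

rng = np.random.default_rng(0)
d_model, n_experts, top_k = 64, 8, 2   # toy sizes; DeepSeek-V3/R1 uses far more experts

# One tiny two-layer MLP per expert, plus a linear router.
experts = [(rng.normal(size=(d_model, 4 * d_model)) * 0.02,
            rng.normal(size=(4 * d_model, d_model)) * 0.02)
           for _ in range(n_experts)]
router = rng.normal(size=(d_model, n_experts)) * 0.02

def moe_forward(x: np.ndarray) -> np.ndarray:
    """Route one token through its top-k experts only."""
    logits = x @ router
    top = np.argsort(logits)[-top_k:]                        # chosen expert indices
    gates = np.exp(logits[top]) / np.exp(logits[top]).sum()  # softmax over chosen
    out = np.zeros_like(x)
    for g, i in zip(gates, top):
        w1, w2 = experts[i]
        out += g * (np.maximum(x @ w1, 0.0) @ w2)            # ReLU MLP expert
    return out

token = rng.normal(size=d_model)
y = moe_forward(token)
# Only top_k of n_experts ran: active compute is ~top_k/n_experts of the dense cost.
print(y.shape, f"used {top_k}/{n_experts} experts")
```

This is how a 671B-parameter model can activate only ~37B parameters per inference pass: most expert weights sit idle for any given token.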
OpenRouter
openrouter.ai › deepseek › deepseek-r1
R1 - API, Providers, Stats | OpenRouter
DeepSeek R1 is here: Performance on par with OpenAI o1, but open-sourced and with fully open reasoning tokens. It's 671B parameters in size, with 37B active in an inference pass.
DEV Community
dev.to › askyt › deepseek-r1-671b-complete-hardware-requirements-optimal-deployment-setup-2e48
DeepSeek-R1 671B: Complete Hardware Requirements - DEV Community
February 6, 2025 - With 671 billion parameters, it matches the performance of leading models like OpenAI’s GPT-4, excelling in tasks such as mathematics, coding, and complex reasoning. The model was trained using 2,048 NVIDIA H800 GPUs over approximately two ...
Ollama
ollama.com › library › deepseek-r1
deepseek-r1
DeepSeek-R1 has received a minor version upgrade to DeepSeek-R1-0528 for the 8 billion parameter distilled model and the full 671 billion parameter model. In this update, DeepSeek R1 has significantly improved its reasoning and inference ...
BentoML
bentoml.com › blog › the-complete-guide-to-deepseek-models-from-v3-to-r1-and-beyond
The Complete Guide to DeepSeek Models: V3, R1, V3.1, V3.2 and Beyond
Instead of training new models from scratch, DeepSeek took a smart shortcut: Started with 6 open-source models from Llama 3.1/3.3 and Qwen 2.5 · Generated 800,000 high-quality reasoning samples using R1 · Fine-tuned the smaller models on this synthetic reasoning data · Unlike R1, these distilled models rely solely on SFT and do not include an RL stage. Despite their smaller size, these models perform remarkably well on reasoning tasks, proving that large-scale AI reasoning can be efficiently distilled.
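The recipe above is plain supervised fine-tuning on teacher-generated traces. Below is a minimal sketch of that idea using Hugging Face transformers; the base model name, the single toy sample, and the hyperparameters are all placeholders, not DeepSeek's actual setup:

```python
# Minimal SFT sketch of the distillation recipe described above:
# fine-tune a small base model on reasoning traces generated by a
# stronger teacher. Everything here is a placeholder configuration.
import torch
from torch.utils.data import Dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          Trainer, TrainingArguments)

base = "Qwen/Qwen2.5-0.5B"   # stand-in for the larger Qwen/Llama bases
tok = AutoTokenizer.from_pretrained(base)
model = AutoModelForCausalLM.from_pretrained(base)

# In the real pipeline these ~800,000 samples come from R1 itself.
samples = ["Question: 2+2? <think> 2 plus 2 is 4. </think> Answer: 4"]

class SFTDataset(Dataset):
    def __init__(self, texts):
        self.enc = [tok(t, truncation=True, max_length=512) for t in texts]
    def __len__(self):
        return len(self.enc)
    def __getitem__(self, i):
        ids = torch.tensor(self.enc[i]["input_ids"])
        return {"input_ids": ids, "labels": ids.clone()}  # causal-LM loss

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="distilled", num_train_epochs=1,
                           per_device_train_batch_size=1),
    train_dataset=SFTDataset(samples),
)
trainer.train()   # pure SFT: no RL stage, matching the description above
```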
Reddit
reddit.com › r/selfhosted › yes, you can run deepseek-r1 locally on your device (20gb ram min.)
r/selfhosted on Reddit: Yes, you can run DeepSeek-R1 locally on your device (20GB RAM min.)
February 13, 2025

I've recently seen some misconceptions that you can't run DeepSeek-R1 locally on your own device. Last weekend, we were busy trying to make you guys have the ability to run the actual R1 (non-distilled) model with just an RTX 4090 (24GB VRAM) which gives at least 2-3 tokens/second.

Over the weekend, we at Unsloth (currently a team of just 2 brothers) studied R1's architecture, then selectively quantized layers to 1.58-bit, 2-bit, etc., which vastly outperforms basic quantized versions with minimal compute.

  1. We shrank R1, the 671B parameter model, from 720GB to just 131GB (an 80% size reduction) whilst keeping it fully functional and great

  2. No, the dynamic GGUFs do not work directly with Ollama, but they do work with llama.cpp, which supports sharded GGUFs and disk mmap offloading. For Ollama, you will need to merge the GGUFs manually using llama.cpp (a sketch of this step follows the post).

  3. Minimum requirements: a CPU with 20GB of RAM (but it will be very slow) and 140GB of disk space (to download the model weights)

  4. Optimal requirements: sum of your VRAM + RAM = 80GB+ (this will be somewhat OK)

  5. No, you do not need hundreds of GB of RAM+VRAM, but if you have it, you can get 140 tokens/s throughput and 14 tokens/s for single-user inference with 2xH100

  6. Our open-source GitHub repo: github.com/unslothai/unsloth

Many people have tried running the dynamic GGUFs on their potato devices (including mine) and it works very well.

R1 GGUFs uploaded to Hugging Face: huggingface.co/unsloth/DeepSeek-R1-GGUF

To run your own R1 locally we have instructions + details: unsloth.ai/blog/deepseekr1-dynamic
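For reference, a hedged sketch of the steps described in this post: download only the dynamic 1.58-bit shards with huggingface_hub, then merge them for Ollama with llama.cpp's gguf-split tool. The quant pattern and shard filenames follow the Unsloth blog conventions and may change; adjust them to match what you actually download:

```python
# Sketch of the local-run recipe from the post above. Assumes
# `pip install huggingface_hub` and a llama.cpp build whose
# `llama-gguf-split` binary is on PATH.
import subprocess
from huggingface_hub import snapshot_download

local_dir = snapshot_download(
    repo_id="unsloth/DeepSeek-R1-GGUF",
    allow_patterns=["*UD-IQ1_S*"],   # fetch only the 1.58-bit shards (~131GB)
    local_dir="DeepSeek-R1-GGUF",
)
print("shards in:", local_dir)

# llama.cpp reads the sharded GGUFs directly. Ollama wants a single
# file, so merge the shards first (point the tool at the first shard;
# this path is illustrative, so match your downloaded shard names):
subprocess.run([
    "llama-gguf-split", "--merge",
    "DeepSeek-R1-GGUF/DeepSeek-R1-UD-IQ1_S/DeepSeek-R1-UD-IQ1_S-00001-of-00003.gguf",
    "DeepSeek-R1-merged.gguf",
], check=True)
```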

Tenable®
tenable.com › blog › frequently-asked-questions-about-deepseek-large-language-model-llm-v3-r1
Frequently Asked Questions About DeepSeek Large Language Model (LLM) - Blog | Tenable®
August 5, 2025 - DeepSeek R1 has 671 billion parameters and requires multiple expensive high-end GPUs to run. There are distilled versions of the model starting at 1.5 billion parameters, going all the way up to 70 billion parameters.
LM Studio
lmstudio.ai › blog › deepseek-r1
DeepSeek R1: open source reasoning model | LM Studio Blog
January 29, 2025 - According to several popular reasoning benchmarks like AIME 2024, MATH-500, and CodeForces, the open-source flagship 671B parameter DeepSeek-R1 model performs comparably to OpenAI's full-sized o1 reasoning model.
Built In
builtin.com › artificial-intelligence › deepseek-r1
What Is DeepSeek-R1? | Built In
DeepSeek-R1 has 671 billion parameters in total. But DeepSeek also released six “distilled” versions of R1, ranging in size from 1.5 billion parameters to 70 billion parameters.
Published October 6, 2025
GitHub
github.blog › home › changelogs › deepseek-r1 is now available in github models (public preview)
DeepSeek-R1 is now available in GitHub Models (Public Preview) - GitHub Changelog
April 30, 2025 - DeepSeek-R1 is a 671B parameter AI model designed to enhance deep learning, natural language processing, and computer vision capabilities.