🌐
Fireworks AI
fireworks.ai › blog › deepseek-r1-deepdive
DeepSeek-R1 Overview: Features, Capabilities, Parameters
Image source: DeepSeek R1 Research Paper (Modified). Despite having a massive 671 billion parameters in total, only 37 billion are activated per forward pass, making DeepSeek R1 more resource-efficient than most similarly large models.
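The total-versus-active gap comes from the MoE design: every layer stores many expert FFNs, but each token is routed to only a few of them. A minimal accounting sketch is below; the sizes are rough approximations of the published DeepSeek-V3/R1 configuration (61 layers, 256 routed plus 1 shared expert, 8 routed experts active per token) and count only the expert FFN weights, so the numbers land near, not exactly at, the 671B/37B figures.

```python
# Toy illustration of why an MoE model's "total" and "active" parameter counts
# differ. Sizes below are approximations of the published DeepSeek-V3/R1 config,
# used here as placeholders; only expert FFN weights are counted.

def moe_param_counts(n_layers, d_model, d_expert, n_experts, top_k, shared_experts=1):
    """Return (total, active-per-token) parameter counts for the expert FFNs alone."""
    # Each expert is a gated FFN: up/gate/down projections ~ 3 * d_model * d_expert.
    per_expert = 3 * d_model * d_expert
    total = n_layers * (n_experts + shared_experts) * per_expert
    active = n_layers * (top_k + shared_experts) * per_expert
    return total, active

total, active = moe_param_counts(
    n_layers=61, d_model=7168, d_expert=2048, n_experts=256, top_k=8
)
print(f"total expert params:  {total / 1e9:.1f}B")   # ~690B
print(f"active expert params: {active / 1e9:.1f}B")  # ~24B

# Attention, embeddings and the dense early layers account for the remaining
# gap to the published 671B total / 37B active figures.
```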
🌐
BentoML
bentoml.com › blog › the-complete-guide-to-deepseek-models-from-v3-to-r1-and-beyond
The Complete Guide to DeepSeek Models: V3, R1, V3.1, V3.2 and Beyond
Despite their smaller size, these ... open-sourced all six distilled models and released their model weights, ranging from 1.5B to 70B parameters...
🌐
OpenRouter
openrouter.ai › deepseek › deepseek-r1 › parameters
DeepSeek: R1 | OpenRouter
DeepSeek R1 is here: Performance on par with OpenAI o1, but open-sourced and with fully open reasoning tokens. It's 671B parameters in size, with 37B active in an inference pass.
🌐
Tomas Hensrud Gulla
gulla.net › en › blog › run-deepseek-r1-locally-with-all-671-billion-parameters
Run DeepSeek R1 Locally – With All 671 Billion Parameters - Tomas Hensrud Gulla
Ok, I get it, a quantized model of only 130GB isn't really the full model. Ollama's model library seems to include a full version of DeepSeek R1. It's 404GB with all 671 billion parameters – that should be real enough, right?
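A quick back-of-the-envelope check shows what a 404 GB file implies per weight for a 671B-parameter model; treating GB as 10^9 bytes is my assumption about how the size is reported.

```python
# Rough arithmetic only; the 404 GB figure is quoted from the post above.
params = 671e9
size_bytes = 404e9

bits_per_param = size_bytes * 8 / params
print(f"~{bits_per_param:.1f} bits per parameter")  # ~4.8 bits

# ~4.8 bits/weight is consistent with a ~4-bit quantization plus scaling
# metadata, not with 16-bit weights, so even the 404 GB copy is a quantized
# version of the 671B model.
```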
🌐
OpenRouter
openrouter.ai › deepseek › deepseek-r1
R1 - API, Providers, Stats | OpenRouter
DeepSeek R1 is here: Performance on par with OpenAI o1, but open-sourced and with fully open reasoning tokens. It's 671B parameters in size, with 37B active in an inference pass.
🌐
Milvus
milvus.io › ai-quick-reference › what-is-the-parameter-count-of-deepseeks-r1-model
What is the parameter count of DeepSeek's R1 model?
DeepSeek's R1 model is a Mixture of Experts (MoE) architecture with a total parameter count of 145 billion, of which app
🌐
NVIDIA
build.nvidia.com › deepseek-ai › deepseek-r1 › modelcard
deepseek-r1 Model by Deepseek-ai | NVIDIA NIM
See the official DeepSeek-R1 Model Card on Hugging Face for further details. GOVERNING TERMS: This trial service is governed by the NVIDIA API Trial Terms of Service. Use of this model is governed by the NVIDIA Community Model License. Additional Information: MIT License. ... Distilled Models: Smaller, fine-tuned versions based on Qwen and Llama architectures. ... Input Type(s): Text. Input Format(s): String. Input Parameters: (1D). Other Properties Related to Input: DeepSeek recommends adhering to the following configurations when utilizing the DeepSeek-R1 series models, including benchmarking, to achieve the expected performance: ...
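The snippet cuts off before listing those configurations. As I recall the model card's guidance (so treat the exact values as assumptions), it recommends a temperature around 0.6, top_p of 0.95, and no system prompt. A minimal sketch of applying those settings through an OpenAI-compatible client follows; the base_url and model id are assumptions about NVIDIA's NIM trial endpoint.

```python
# Minimal sketch: calling DeepSeek-R1 with the model card's recommended
# sampling settings via an OpenAI-compatible endpoint. The endpoint URL and
# model id are assumptions (NVIDIA NIM trial API); verify before use.
from openai import OpenAI

client = OpenAI(
    base_url="https://integrate.api.nvidia.com/v1",  # assumed NIM endpoint
    api_key="YOUR_API_KEY",
)

resp = client.chat.completions.create(
    model="deepseek-ai/deepseek-r1",  # assumed model id on NIM
    messages=[{"role": "user", "content": "How many primes are below 100?"}],
    temperature=0.6,   # recommended range is roughly 0.5-0.7
    top_p=0.95,
    max_tokens=4096,   # leave room for the reasoning trace
)
print(resp.choices[0].message.content)
```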
🌐
Ollama
ollama.com › library › deepseek-r1
deepseek-r1
DeepSeek-R1 has received a minor version upgrade to DeepSeek-R1-0528 for the 8 billion parameter distilled model and the full 671 billion parameter model. In this update, DeepSeek R1 has significantly improved its reasoning and inference ...
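For the 8B distilled tag, a minimal sketch of calling it through the ollama Python client might look like this; the `deepseek-r1:8b` tag and the `ollama` package usage are assumptions based on the library page, and the model must already be pulled locally.

```python
# Minimal sketch, assuming the `ollama` Python client and a locally pulled
# model (e.g. via `ollama pull deepseek-r1:8b`).
import ollama

response = ollama.chat(
    model="deepseek-r1:8b",  # assumed tag for the 8B distilled model
    messages=[{"role": "user", "content": "Why is the sky blue? Keep it short."}],
)
# The reply typically includes the model's <think>...</think> reasoning trace.
print(response["message"]["content"])
```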
🌐
Hugging Face
huggingface.co › deepseek-ai › DeepSeek-R1
deepseek-ai/DeepSeek-R1 ยท Hugging Face
DeepSeek-R1 achieves performance comparable to OpenAI-o1 across math, code, and reasoning tasks. To support the research community, we have open-sourced DeepSeek-R1-Zero, DeepSeek-R1, and six dense models distilled from DeepSeek-R1 based on Llama and Qwen.
🌐
Medium
medium.com › @isaakmwangi2018 › a-simple-guide-to-deepseek-r1-architecture-training-local-deployment-and-hardware-requirements-300c87991126
A Simple Guide to DeepSeek R1: Architecture, Training, Local Deployment, and Hardware Requirements | by Isaak Kamau | Medium
January 23, 2025 - It features 671 billion parameters, using a mixture-of-experts (MoE) architecture in which each token activates the equivalent of 37 billion parameters. This model showcases emergent reasoning behaviors, such as self-verification, reflection, and ...
🌐
Unsloth
unsloth.ai › blog › deepseekr1-dynamic
Run DeepSeek-R1 Dynamic 1.58-bit
We explored how to enable more local users to run it & managed to quantize DeepSeek's R1 671B parameter model to 131GB in size, an 80% reduction from the original 720GB, whilst remaining very functional.
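Those two figures can be sanity-checked directly; the only assumption here is that the sizes are reported in units of 10^9 bytes.

```python
# Sanity check of the quoted sizes: 720 GB original vs 131 GB dynamic quant,
# spread over 671B parameters.
original_gb, quantized_gb, params = 720, 131, 671e9

reduction = 1 - quantized_gb / original_gb
avg_bits = quantized_gb * 1e9 * 8 / params
print(f"size reduction: {reduction:.0%}")             # ~82%, matching the quoted ~80%
print(f"average width:  {avg_bits:.2f} bits/param")   # ~1.56 bits, near the 1.58-bit label
```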
🌐
Reddit
reddit.com › r/localllama › deepseek app - how many parameters is the model?
r/LocalLLaMA on Reddit: Deepseek App - How many parameters is the model?
January 30, 2025

So, it looks like Sambanova is going to be removing free access to Llama 3.1 Instruct 405B soon, and with the release of DeepSeek R1 and the wide array of models they have released, it makes me wonder how many parameters the model in the app is using.

I can't find a clear answer - albeit I didn't look for TOO long. Sambanova was clearly flexing their tech by offering Llama 3.1 Instruct 405B for free at over 100 tokens/second - a marketing ploy. Makes sense, because to offer a model that big for free would take serious resources.

Resources I'm not sure DeepSeek has, in spite of their impressive model and hedge-fund daddies.

OR maybe I'm wrong, and they want to throw some weight around and put the big 671B model out for free for the whole world to see in the app. I don't think they want to burn cash like that... but maybe I'm wrong...

Does anybody have any insight into how many parameters the models available for public use in the DeepSeek app's free offering have?

🌐
Cerebras
cerebras.ai › blog › cerebras-launches-worlds-fastest-deepseek-r1-llama-70b-inference
Cerebras Launches World's Fastest DeepSeek R1 Llama-70B Inference - Cerebras
March 24, 2025 - Despite its relatively compact 70B parameter size, DeepSeek R1 Llama-70B outperforms both GPT-4o and o1-mini on challenging mathematics and coding tasks. However, reasoning models like o1 and R1 typically require minutes to compute their answers, ...
🌐
LM Studio
lmstudio.ai › blog › deepseek-r1
DeepSeek R1: open source reasoning model | LM Studio Blog
January 29, 2025 - According to several popular reasoning benchmarks like AIME 2024, MATH-500, and CodeForces, the open-source flagship 671B parameter DeepSeek-R1 model performs comparably to OpenAI's full-sized o1 reasoning model.