Hugging Face
huggingface.co › deepseek-ai › DeepSeek-R1
deepseek-ai/DeepSeek-R1 · Hugging Face
For all our models, the maximum generation length is set to 32,768 tokens. For benchmarks requiring sampling, we use a temperature of 0.6, a top-p value of 0.95, and generate 64 responses per query to estimate pass@1.
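Per that model card, pass@1 is estimated as the average correctness over the k = 64 sampled responses. A minimal sketch of that protocol, assuming an OpenAI-compatible endpoint serving the model and a task-specific is_correct grader (both are placeholders, not anything taken from the sources above):

```python
# Sketch: estimate pass@1 by sampling k responses per query.
# The endpoint, model name, and grader are placeholder assumptions;
# substitute your own provider and a real task-specific checker.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")  # hypothetical server

def is_correct(answer: str, reference: str) -> bool:
    # Placeholder grader; real benchmarks need task-specific answer checking.
    return reference in answer

def pass_at_1(prompt: str, reference: str, k: int = 64) -> float:
    resp = client.chat.completions.create(
        model="deepseek-r1",            # model name depends on the server
        messages=[{"role": "user", "content": prompt}],
        temperature=0.6,                # sampling settings from the model card
        top_p=0.95,
        n=k,                            # k samples per query (not all servers support n > 1)
        max_tokens=32768,               # maximum generation length
    )
    scores = [is_correct(c.message.content or "", reference) for c in resp.choices]
    return sum(scores) / k              # pass@1 = mean correctness over k samples

print(pass_at_1("What is 17 * 24? Answer with the number only.", "408"))
```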
GitHub
github.com › deepseek-ai › DeepSeek-R1
GitHub - deepseek-ai/DeepSeek-R1
To support the research community, we have open-sourced DeepSeek-R1-Zero, DeepSeek-R1, and six dense models distilled from DeepSeek-R1 based on Llama and Qwen. DeepSeek-R1-Distill-Qwen-32B outperforms OpenAI-o1-mini across various benchmarks, ...
Starred by 91.6K users
Forked by 11.8K users
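The distilled checkpoints load like any Hugging Face causal LM. A minimal sketch using the transformers library, with the 7B variant chosen purely for illustration (it assumes sufficient GPU memory) and the model card's recommended sampling settings:

```python
# Sketch: run a distilled DeepSeek-R1 checkpoint with Hugging Face transformers.
# The 7B variant is used here only for illustration; larger distills load the same way.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "deepseek-ai/DeepSeek-R1-Distill-Qwen-7B"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

messages = [{"role": "user", "content": "Prove that sqrt(2) is irrational."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

# Sampling settings recommended on the model card.
outputs = model.generate(
    inputs, max_new_tokens=2048, temperature=0.6, top_p=0.95, do_sample=True
)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```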
Videos
DeepSeek R1 Fully Tested - Insane Performance - YouTube (15:10)
DeepSeek R1 BLOWS AWAY The Competition - How Did ...
DeepSeek R1 0528 in 6 Minutes - YouTube (06:04)
OpenAI’s O3 Mini vs. DeepSeek R1: Which One Wins? - YouTube (27:54)
RTX 5090 vs 4090 for AI inference | Testing deepseek r1 performance ... (05:58)
textcortex.com
textcortex.com › home › blog posts › deepseek r1 review: performance in benchmarks & evals
DeepSeek R1 Review: Performance in Benchmarks & Evals
arXiv
arxiv.org › html › 2501.12948v1
DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning
October 13, 2025 - For engineering-related tasks, ... achieves outstanding results, significantly outperforming DeepSeek-V3 with scores of 90.8% on MMLU, 84.0% on MMLU-Pro, and 71.5% on GPQA Diamond...
GitHub
github.com › paradite › deepseek-r1-speed-benchmark
GitHub - paradite/deepseek-r1-speed-benchmark: Code for benchmarking the speed of DeepSeek R1 from different providers' APIs.
DeepSeek: Speed: 24.34 tokens/s, Total: 424 tokens, Prompt: 12 tokens, Completion: 412 tokens, Time: 16.93s, Latency: 1.11s, Length: 1904 chars
Fireworks: Speed: 21.41 tokens/s, Total: 373 tokens, Prompt: 10 tokens, Completion: 363 tokens, ...
Author: paradite
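A rough sketch of how stats like these can be measured with a streamed, OpenAI-compatible request; the endpoint, API key, and character-based token estimate below are placeholder assumptions and do not reproduce the repo's actual code:

```python
# Sketch: measure time-to-first-token (latency) and decode speed for one
# streamed completion. Endpoint, key, and model name are placeholders;
# this mirrors the kind of stats the benchmark reports, not its implementation.
import time
from openai import OpenAI

client = OpenAI(base_url="https://api.deepseek.com", api_key="sk-...")  # placeholder

start = time.monotonic()
first_token_at = None
chunks = []
stream = client.chat.completions.create(
    model="deepseek-reasoner",  # placeholder model name
    messages=[{"role": "user", "content": "Explain backpropagation in one paragraph."}],
    stream=True,
)
for chunk in stream:
    if not chunk.choices:
        continue
    delta = chunk.choices[0].delta.content or ""
    if delta and first_token_at is None:
        first_token_at = time.monotonic()  # first content chunk defines latency
    chunks.append(delta)

total = time.monotonic() - start
text = "".join(chunks)
n_tokens = len(text) // 4  # crude estimate; a real harness reads the API's usage stats
print(f"Latency: {first_token_at - start:.2f}s, Time: {total:.2f}s, "
      f"Speed: {n_tokens / total:.2f} tokens/s, Length: {len(text)} chars")
```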
Towards Data Science
towardsdatascience.com › home › latest › how to benchmark deepseek-r1 distilled models on gpqa using ollama and openai’s simple-evals
How to Benchmark DeepSeek-R1 Distilled Models on GPQA Using Ollama and OpenAI’s simple-evals | Towards Data Science
April 24, 2025 - Additionally, I did not find any tools or platforms other than simple-evals that provide a straightforward, Python-native way to run key benchmarks such as GPQA, particularly when working with Ollama. As part of the evaluation, I selected 20 random questions from the 198-question GPQA-Diamond set for the 14B distilled model to work on. The total time taken was 216 minutes, roughly 11 minutes per question. The outcome was admittedly disappointing: it scored only 10%, far below the reported 73.3% for the 671B DeepSeek-R1 model.
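For context, querying a local distill through Ollama's Python client looks roughly like this; the sample question and the answer-extraction regex are hypothetical stand-ins for simple-evals' GPQA harness, not its actual code:

```python
# Sketch: score a local DeepSeek-R1 distill on multiple-choice questions via
# Ollama. The question below is a made-up placeholder (GPQA-Diamond itself is
# a gated 198-question set), and the letter-matching regex is a stand-in for
# simple-evals' answer extraction.
import re
import ollama

questions = [
    {"prompt": "Which planet is largest?\nA) Earth\nB) Jupiter\nC) Mars\nD) Venus\n"
               "Answer with a single letter.",
     "answer": "B"},
]

correct = 0
for q in questions:
    resp = ollama.chat(
        model="deepseek-r1:14b",
        messages=[{"role": "user", "content": q["prompt"]}],
    )
    text = resp["message"]["content"]
    # R1-style models emit a reasoning trace in <think> tags; strip it before grading.
    text = re.sub(r"<think>.*?</think>", "", text, flags=re.DOTALL)
    match = re.search(r"\b([ABCD])\b", text)
    if match and match.group(1) == q["answer"]:
        correct += 1

print(f"Accuracy: {correct / len(questions):.0%}")
```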
Ollama
ollama.com › library › deepseek-r1
deepseek-r1
In this update, DeepSeek R1 has significantly improved its reasoning and inference capabilities. The model has demonstrated outstanding performance across various benchmark evaluations, including mathematics, programming, and general logic.