🌐
Hugging Face
huggingface.co › deepseek-ai › DeepSeek-R1
deepseek-ai/DeepSeek-R1 · Hugging Face
For all our models, the maximum generation length is set to 32,768 tokens. For benchmarks requiring sampling, we use a temperature of 0.6, a top-p value of 0.95, and generate 64 responses per query to estimate pass@1.
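A minimal sketch of how such a pass@1 estimate can be computed, assuming an OpenAI-compatible chat endpoint and a caller-supplied answer checker (both placeholders, not from the model card); the sampling settings mirror the ones quoted above.

```python
# Sketch: estimate pass@1 by sampling k responses per query and averaging correctness.
# The endpoint, model name, and is_correct() checker are assumptions for illustration;
# sampling settings follow the quoted card: temperature 0.6, top_p 0.95, k = 64.
from openai import OpenAI

client = OpenAI(base_url="https://api.deepseek.com", api_key="YOUR_KEY")  # placeholder

def estimate_pass_at_1(prompt: str, is_correct, k: int = 64,
                       model: str = "deepseek-reasoner") -> float:
    correct = 0
    for _ in range(k):
        resp = client.chat.completions.create(
            model=model,
            messages=[{"role": "user", "content": prompt}],
            temperature=0.6,
            top_p=0.95,
            max_tokens=32768,  # matches the 32,768-token generation limit above
        )
        if is_correct(resp.choices[0].message.content):
            correct += 1
    return correct / k  # pass@1 = mean per-sample accuracy over k samples
```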
🌐
GitHub
github.com › deepseek-ai › DeepSeek-R1
GitHub - deepseek-ai/DeepSeek-R1
To support the research community, we have open-sourced DeepSeek-R1-Zero, DeepSeek-R1, and six dense models distilled from DeepSeek-R1 based on Llama and Qwen. DeepSeek-R1-Distill-Qwen-32B outperforms OpenAI-o1-mini across various benchmarks, ...
Starred by 91.6K users
Forked by 11.8K users
🌐
arXiv
arxiv.org › html › 2501.12948v1
DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning
October 13, 2025 - For engineering-related tasks, ... DeepSeek-R1 achieves outstanding results, significantly outperforming DeepSeek-V3, with scores of 90.8% on MMLU, 84.0% on MMLU-Pro, and 71.5% on GPQA Diamond ...
🌐
Prompt Hub
prompthub.us › blog › deepseek-r-1-model-overview-and-how-it-ranks-against-openais-o1
PromptHub Blog: DeepSeek R-1 Model Overview and How it Ranks Against OpenAI's o1
... Temperature: 0.6. Top-p value: 0.95. Pass@1 estimation: Generated 64 responses per query. ... DeepSeek R1 outperformed o1, Claude 3.5 Sonnet, and other models in the majority of reasoning benchmarks.
🌐
Medium
medium.com › @leucopsis › deepseeks-new-r1-0528-performance-analysis-and-benchmark-comparisons-6440eac858d6
DeepSeek’s New R1–0528: Performance Analysis and Benchmark Comparisons | by Barnacle Goose | Medium
May 30, 2025 - Improvements are also dramatic in coding benchmarks. On the Codeforces programming challenge, R1–0528’s rating is about 1930, up from roughly 1530, a ~400-point leap that reflects far better code generation and problem-solving.
🌐
DataCamp
datacamp.com › blog › deepseek-r1
DeepSeek-R1: Features, o1 Comparison, Distilled Models & More | DataCamp
June 4, 2025 - On SWE-bench Verified, DeepSeek-R1 performs strongly with a score of 49.2%, slightly ahead of OpenAI o1-1217’s 48.9%. This result positions DeepSeek-R1 as a strong contender in specialized reasoning tasks like software verification.
🌐
TextCortex
textcortex.com › home › blog posts › deepseek r1 review: performance in benchmarks & evals
DeepSeek R1 Review: Performance in Benchmarks & Evals
According to the same benchmark, ... and coding performance, it has a score of 96.3 (percentile) on the Codeforces benchmark, 71.5 on the GPQA Diamond benchmark, and 97.3 on the MATH-500 benchmark ...
🌐
Artificial Analysis
artificialanalysis.ai › models › deepseek-r1
DeepSeek R1 0528 - Intelligence, Performance & Price Analysis
Analysis of DeepSeek's DeepSeek R1 0528 (May '25) and comparison to other AI models across key metrics including quality, price, performance (tokens per second & time to first token), context window & more.
🌐
Fireworks AI
fireworks.ai › blog › deepseek-r1-deepdive
DeepSeek-R1 Overview: Features, Capabilities, Parameters
Reasoning Tasks: Shows performance on par with OpenAI’s o1 model across complex reasoning benchmarks. Image source: DeepSeek R1 Research Paper (Modified)
🌐
GitHub
github.com › paradite › deepseek-r1-speed-benchmark
GitHub - paradite/deepseek-r1-speed-benchmark: Code for benchmarking the speed of DeepSeek R1 from different providers' APIs.
DeepSeek: Speed: 24.34 tokens/s, Total: 424 tokens, Prompt: 12 tokens, Completion: 412 tokens, Time: 16.93s, Latency: 1.11s, Length: 1904 chars
Fireworks: Speed: 21.41 tokens/s, Total: 373 tokens, Prompt: 10 tokens, Completion: 363 tokens, ...
Author: paradite
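A minimal sketch of how throughput and first-token latency figures like those above can be measured, assuming an OpenAI-compatible streaming endpoint; the base URL, model name, and the rough character-based token count are assumptions, not the repo's actual code.

```python
# Sketch: time a streaming chat completion to estimate latency (time to first token)
# and throughput (completion tokens per second). Endpoint, model name, and token
# accounting are illustrative assumptions, not taken from the benchmark repo.
import time
from openai import OpenAI

client = OpenAI(base_url="https://api.deepseek.com", api_key="YOUR_KEY")  # placeholder

def measure_speed(prompt: str, model: str = "deepseek-reasoner") -> dict:
    start = time.perf_counter()
    first_token_at = None
    pieces = []
    stream = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
        stream=True,
    )
    for chunk in stream:
        if chunk.choices and chunk.choices[0].delta.content:
            if first_token_at is None:
                first_token_at = time.perf_counter()
            pieces.append(chunk.choices[0].delta.content)
    total_time = time.perf_counter() - start
    text = "".join(pieces)
    approx_tokens = max(1, len(text) // 4)  # rough ~4 chars/token heuristic
    return {
        "latency_s": (first_token_at or start) - start,  # time to first streamed token
        "time_s": total_time,
        "approx_completion_tokens": approx_tokens,
        "tokens_per_s": approx_tokens / total_time,
        "length_chars": len(text),
    }
```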
🌐
Vals AI
vals.ai › models › fireworks_deepseek-r1
DeepSeek R1
Accuracy: 69.64% · Latency: 77.95s · Cost (In/Out): 1.35 / 5.4 · Context Window: 164k · Max Output Tokens: 164k · CaseLaw (v2): 0.0% (ranked 57/57)
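A small sketch of how a per-request cost can be derived from the In/Out figures above, assuming they are USD per million input and output tokens (the snippet does not state the units, so treat this as illustrative only).

```python
# Sketch: per-request cost from per-million-token prices.
# Assumes the "Cost (In/Out) 1.35 / 5.4" figures are USD per 1M input/output tokens;
# the units are not given in the snippet, so this is an assumption.
IN_PRICE_PER_M = 1.35   # assumed USD per 1M input tokens
OUT_PRICE_PER_M = 5.40  # assumed USD per 1M output tokens

def request_cost(prompt_tokens: int, completion_tokens: int) -> float:
    return (
        prompt_tokens / 1_000_000 * IN_PRICE_PER_M
        + completion_tokens / 1_000_000 * OUT_PRICE_PER_M
    )

# Example: a 12-token prompt with a 412-token completion
# (token counts borrowed from the speed benchmark above).
print(f"${request_cost(12, 412):.6f}")  # ≈ $0.002241
```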
🌐
Reddit
reddit.com › r/localllama › deepseek-r1-0528 official benchmarks released!!!
r/LocalLLaMA on Reddit: DeepSeek-R1-0528 Official Benchmarks Released!!!
May 29, 2025 - ... AIME 2024, surpassing Qwen3 8B by +10.0% and matching the performance of Qwen3-235B-thinking. We believe that the chain-of-thought from DeepSeek-R1-0528 will hold significant importance for both academic research on reasoning models and industrial ...
🌐
Hugging Face
huggingface.co › deepseek-ai › DeepSeek-R1-0528
deepseek-ai/DeepSeek-R1-0528 · Hugging Face
This model achieves state-of-the-art (SOTA) performance among open-source models on the AIME 2024, surpassing Qwen3 8B by +10.0% and matching the performance of Qwen3-235B-thinking.
🌐
ARC Prize
arcprize.org › blog › r1-zero-r1-results-analysis
An Analysis of DeepSeek's R1-Zero and R1
Last week, DeepSeek published their new R1-Zero and R1 “reasoner” systems, which are competitive with OpenAI’s o1 system on ARC-AGI-1. R1-Zero, R1, and o1 (low compute) all score around 15-20% – in contrast to GPT-4o’s 5%, the pinnacle ...
🌐
Ollama
ollama.com › library › deepseek-r1:8b
deepseek-r1:8b
In this update, DeepSeek R1 has significantly improved its reasoning and inference capabilities. The model has demonstrated outstanding performance across various benchmark evaluations, including mathematics, programming, and general logic.
🌐
DeepSeek
api-docs.deepseek.com › deepseek-r1-lite release 2024/11/20
🚀 DeepSeek-R1-Lite-Preview is now live: unleashing supercharged reasoning power! | DeepSeek API Docs
Longer Reasoning, Better Performance. DeepSeek-R1-Lite-Preview shows steady score improvements on AIME as thought length increases.
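One way such an accuracy-versus-thought-length curve can be produced is to bucket sampled answers by the length of their reasoning trace and score each bucket. A hedged sketch follows; the bucket edges and field names are assumptions, not DeepSeek's published methodology.

```python
# Sketch: accuracy as a function of reasoning ("thought") length.
# Each record is assumed to carry a reasoning token count and a correctness flag;
# bucket size and field names are illustrative, not DeepSeek's actual setup.
from collections import defaultdict

def accuracy_by_thought_length(records, bucket_size=2048):
    """records: iterable of dicts like {"reasoning_tokens": int, "correct": bool}."""
    hits, totals = defaultdict(int), defaultdict(int)
    for r in records:
        bucket = r["reasoning_tokens"] // bucket_size
        totals[bucket] += 1
        hits[bucket] += int(r["correct"])
    return {
        f"{b * bucket_size}-{(b + 1) * bucket_size}": hits[b] / totals[b]
        for b in sorted(totals)
    }

# Example with toy data: longer traces tend to score higher, as the post describes.
toy = [{"reasoning_tokens": 1500, "correct": False},
       {"reasoning_tokens": 6000, "correct": True},
       {"reasoning_tokens": 9000, "correct": True}]
print(accuracy_by_thought_length(toy))
```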
🌐
Ollama
ollama.com › library › deepseek-r1
deepseek-r1
In this update, DeepSeek R1 has significantly improved its reasoning and inference capabilities. The model has demonstrated outstanding performance across various benchmark evaluations, including mathematics, programming, and general logic.
🌐
Analytics Vidhya
analyticsvidhya.com › home › deepseek r1 vs openai o1: which one is faster, cheaper and smarter?
DeepSeek R1 vs OpenAI o1: Which One is Faster, Cheaper and Smarter?
April 4, 2025 - Benchmark Excellence: R1 matches OpenAI o1 in key tasks, with some areas of clear outperformance. While DeepSeek R1 builds upon the collective work of open-source research, its efficiency and performance demonstrate how creativity and strategic ...
🌐
PromptLayer
blog.promptlayer.com › openai-o3-vs-deepseek-r1-an-analysis-of-reasoning-models
OpenAI o3 vs DeepSeek r1: Which Reasoning Model is Best?
January 28, 2025 - SWE-bench Verified Benchmark: o3 scored 71.7%, surpassing DeepSeek R1 (49.2%) and OpenAI's o1 (48.9%).