Chinese artificial intelligence company

Hangzhou DeepSeek Artificial Intelligence Basic Technology Research Co., Ltd., doing business as DeepSeek, is a Chinese artificial intelligence (AI) company that develops large language models (LLMs). Based in Hangzhou, Zhejiang, DeepSeek is … Wikipedia
Factsheet
Native name 杭州深度求索人工智能基础技术研究有限公司
Company type Private
🌐
Hugging Face
huggingface.co › deepseek-ai › DeepSeek-V3.1
deepseek-ai/DeepSeek-V3.1 · Hugging Face
3 weeks ago - Search agents are evaluated with our internal search framework, which uses a commercial search API + webpage filter + 128K context window. Search agent results of R1-0528 are evaluated with a pre-defined workflow. SWE-bench is evaluated with our internal code agent framework. HLE is evaluated with the text-only subset.
import transformers
tokenizer = transformers.AutoTokenizer.from_pretrained("deepseek-ai/DeepSeek-V3.1")
messages = [ {"role": "system", "content": "You are a helpful assistant"}, {"role": "user", "content": "Who are you?"}, {"role": "assistant", "content": "<think>Hmm</think>I am
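The code in the snippet is cut off mid-string; below is a runnable sketch of the same chat-template pattern. The assistant turn's ending and the follow-up user turn are placeholders, and the thinking= keyword is an assumption based on the model card's documented hybrid thinking/non-thinking modes.

# Sketch completing the truncated model-card example: load the
# DeepSeek-V3.1 tokenizer and render a conversation with its chat
# template. The assistant turn's ending and the follow-up user turn
# are placeholders, not from the snippet.
import transformers

tokenizer = transformers.AutoTokenizer.from_pretrained("deepseek-ai/DeepSeek-V3.1")
messages = [
    {"role": "system", "content": "You are a helpful assistant"},
    {"role": "user", "content": "Who are you?"},
    {"role": "assistant", "content": "<think>Hmm</think>I am an assistant."},  # placeholder ending
    {"role": "user", "content": "Thanks!"},  # hypothetical follow-up turn
]
# thinking=True selects the thinking-mode template (assumed kwarg,
# forwarded by transformers into the repo's Jinja chat template).
prompt = tokenizer.apply_chat_template(messages, tokenize=False, thinking=True, add_generation_prompt=True)
print(prompt)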
🌐
DeepSeek
api-docs.deepseek.com › deepseek v3.1 release 2025/08/21
DeepSeek-V3.1 Release | DeepSeek API Docs
🔹 Tokenizer & chat template updated — new tokenizer config: https://huggingface.co/deepseek-ai/DeepSeek-V3.1/blob/main/tokenizer_config.json
Discussions

DeepSeek v3.1
Qwen: Deepseek must have concluded that hybrid models are worse. Deepseek: Qwen must have concluded that hybrid models are better. More on reddit.com
🌐 r/LocalLLaMA
115
548
August 19, 2025
So.. What's the consensus on Deepseek-V3.1 for RP?
Agreed. 3.1 feels “technically correct” but flat, like it stripped all the flavor out of the RP. R1 wasn’t perfect, but it had way more life and descriptive flair. More on reddit.com
🌐 r/SillyTavernAI
54
51
August 26, 2025
DeepSeek V3.1 (Thinking) aggregated benchmarks (vs. gpt-oss-120b)
That proves that benchmarks are barely useful now. More on reddit.com
🌐 r/LocalLLaMA
70
203
August 21, 2025
deepseek-ai/DeepSeek-V3.1 · Hugging Face
OK, so here are my quick takes on DeepSeek V3.1. Improving agentic capability seems to be the focus of this update. More specifically:

29.8% on HLE with search and Python, compared to 24.8% for R1-0528, 35.2% for GPT-5 Thinking, 24.3% for o3, 38.6% for Grok 4, and 26.9% for Gemini Deep Research. Caveats apply: DeepSeek models are evaluated exclusively on the text subset, although I believe this subset is not easier for SotA models, and Grok 4 is (possibly) evaluated without a webpage filter, so data contamination is possible.

66.0% on SWE-bench Verified without Thinking, compared to 44.6% for R1-0528, 74.9% for GPT-5 Thinking, 69.1% for o3, 74.5% for Claude 4.1 Opus, and 65.8% for Kimi K2. Again, caveats apply: OpenAI models are evaluated on a subset of 477 problems, not the full set of 500.

31.3% on Terminal-Bench with the Terminus 1 framework, compared to 30.2% for o3, 30.0% for GPT-5, and 25.3% for Gemini 2.5 Pro.

A slight bump in other coding and math capabilities (AIME, LiveCodeBench, Codeforces, Aider), but most users would not be able to tell the difference, as R1-0528 already destroys 98% of human programmers in competitive programming.

A slight reduction on GPQA and HLE (offline, no tools), and maybe in your own use case. I do not find V3.1 Thinking to be better than R1-0528 as a chat LLM, for example.

A few concluding thoughts: Right now I am actually more worried about how the open-source ecosystem will deploy DeepSeek V3.1 in an agentic environment than about anything else. For agentic LLMs, prompts and agent frameworks make a huge difference in user experience. Google, Anthropic, and OpenAI all have branded search and code agents (e.g. Deep Research, Claude Code), but DeepSeek has none. So it remains to be seen how well V3.1 can work with prompts and tools from Claude Code, for example. Maybe DeepSeek will open-source its internal search and coding framework at a future date to ensure the best user experience.

I also noticed that a lot of serverless LLM inference providers cheap out on their deployments. They may serve with lowered precision, pruned experts, or poor sampling parameters, so the provider you use will definitely impact your experience.

It also starts to make sense why they merged R1 into V3 and made the 128K context window the default on the API. Agentic coding usually does not benefit much from a long CoT but consumes a ton of tokens, so a single model is a good way to reduce deployment TCO.

This is probably as far as they can push the V3 base: you can already see some regression on things like GPQA and offline HLE. Hope to see V4 soon. More on reddit.com
🌐 r/LocalLLaMA
93
559
August 21, 2025
🌐
DeepSeek
deepseek.com › en
DeepSeek
3 weeks ago - 🎉 Launching DeepSeek-V3.2 — Reasoning-first models built for agents. Now available on web, app & API.
🌐
OpenRouter
openrouter.ai › deepseek › deepseek-chat-v3.1
DeepSeek V3.1 - API, Providers, Stats | OpenRouter
DeepSeek-V3.1 is a large hybrid reasoning model (671B parameters, 37B active) that supports both thinking and non-thinking modes via prompt templates. Run DeepSeek V3.1 with API
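For illustration, a minimal sketch of calling the model through OpenRouter's OpenAI-compatible chat completions endpoint. The model slug is taken from the result's URL; the API key is a placeholder.

# Minimal sketch: querying DeepSeek V3.1 via OpenRouter's
# OpenAI-compatible endpoint. The model slug comes from the result URL
# above; the API key is a placeholder.
from openai import OpenAI

client = OpenAI(
    base_url="https://openrouter.ai/api/v1",
    api_key="YOUR_OPENROUTER_API_KEY",
)

response = client.chat.completions.create(
    model="deepseek/deepseek-chat-v3.1",
    messages=[{"role": "user", "content": "In one sentence, what is a hybrid reasoning model?"}],
)
print(response.choices[0].message.content)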
🌐
Together AI
together.ai › models › deepseek-v3-1
DeepSeek-V3.1 API | Together AI
Guanacos are one of two wild South American camelids; the other species is the vicuña, which lives at higher elevations.", ]
response = client.rerank.create( model="deepseek-ai/DeepSeek-V3.1", query=query, documents=documents, top_n=2 )
for result in response.results: print(f"Relevance Score: {result.relevance_score}")
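The snippet is cut off mid-example; below is a self-contained sketch of the same rerank call using Together's Python client. The query and the first document are hypothetical placeholders; only the second document, the model name, and the call shape come from the snippet itself.

# Self-contained version of the rerank snippet above, using the
# Together Python client. The query and first document are hypothetical
# placeholders reconstructed around the snippet's tail.
from together import Together

client = Together(api_key="YOUR_TOGETHER_API_KEY")  # placeholder key

query = "Which wild camelids are related to llamas?"  # hypothetical
documents = [
    "Llamas are domesticated South American camelids.",  # hypothetical
    "Guanacos are one of two wild South American camelids; the other "
    "species is the vicuña, which lives at higher elevations.",
]

response = client.rerank.create(
    model="deepseek-ai/DeepSeek-V3.1",
    query=query,
    documents=documents,
    top_n=2,
)
for result in response.results:
    print(f"Relevance Score: {result.relevance_score}")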
Find elsewhere
🌐
NVIDIA
build.nvidia.com › deepseek-ai › deepseek-v3_1
deepseek-v3.1 Model by Deepseek-ai | NVIDIA NIM
DeepSeek V3.1 Instruct is a hybrid AI model with fast reasoning, 128K context, and strong tool use.
🌐
DeepSeek
api-docs.deepseek.com › deepseek v3.1 update 2025/09/22
DeepSeek-V3.1-Terminus | DeepSeek API Docs
September 22, 2025 - The latest update builds on V3.1’s strengths while addressing key user feedback. ... 📊 DeepSeek-V3.1-Terminus delivers more stable & reliable outputs across benchmarks compared to the previous version.
🌐
YouTube
youtube.com › watch
DeepSeek V3.1: Bigger Than You Think! - YouTube
DeepSeek V3.1 is a unified hybrid reasoning open-weight model that powers agentic workflows—FP8 training, strong post-training for tool/function calling (non...
Published   August 22, 2025
🌐
Google Cloud Platform
console.cloud.google.com › vertex-ai › publishers › deepseek-ai › model-garden › deepseek-v3-1
DeepSeek-V3.1 – Vertex AI
🌐
Hugging Face
huggingface.co › deepseek-ai › DeepSeek-V3.1-Terminus
deepseek-ai/DeepSeek-V3.1-Terminus · Hugging Face
3 weeks ago - For the model's chat template other than search agent, please refer to the DeepSeek-V3.1 repo.
🌐
DeepSeek
deepseek.ai › blog › deepseek-v31
DeepSeek V3.1: The New Frontier in Artificial Intelligence
DeepSeek AI is the leading provider of advanced AI language models and enterprise solutions. Experience state-of-the-art artificial intelligence technology for your business needs.
🌐
Hugging Face
huggingface.co › deepseek-ai › DeepSeek-V3.1-Base
deepseek-ai/DeepSeek-V3.1-Base · Hugging Face
3 weeks ago - Search agents are evaluated with our internal search framework, which uses a commercial search API + webpage filter + 128K context window. Search agent results of R1-0528 are evaluated with a pre-defined workflow. SWE-bench is evaluated with our internal code agent framework. HLE is evaluated with the text-only subset.
import transformers
tokenizer = transformers.AutoTokenizer.from_pretrained("deepseek-ai/DeepSeek-V3.1")
messages = [ {"role": "system", "content": "You are a helpful assistant"}, {"role": "user", "content": "Who are you?"}, {"role": "assistant", "content": "<think>Hmm</think>I am
🌐
AWS
aws.amazon.com › blogs › aws › deepseek-v3-1-now-available-in-amazon-bedrock
DeepSeek-V3.1 model now available in Amazon Bedrock | Amazon Web Services
September 20, 2025 - AWS launches DeepSeek-V3.1 as a fully managed model in Amazon Bedrock. DeepSeek-V3.1 is a hybrid open-weight model that switches between thinking mode for detailed step-by-step analysis and non-thinking mode for faster responses.
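For reference, a minimal sketch of invoking a fully managed Bedrock model with the standard boto3 Converse API. The model ID below is a placeholder (look up the exact identifier in the Bedrock console), and the region is an assumption.

# Minimal sketch: invoking DeepSeek-V3.1 on Amazon Bedrock via the
# standard boto3 Converse API. The model ID is a placeholder; the
# region is an assumption.
import boto3

client = boto3.client("bedrock-runtime", region_name="us-west-2")

response = client.converse(
    modelId="<deepseek-v3.1-model-id>",  # placeholder: check the Bedrock console
    messages=[{"role": "user", "content": [{"text": "Think step by step: what is 17 * 23?"}]}],
    inferenceConfig={"maxTokens": 512},
)
print(response["output"]["message"]["content"][0]["text"])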
🌐
Fireworks AI
fireworks.ai › models › fireworks › deepseek-v3p1
DeepSeek V3.1 API & Playground | Fireworks AI
DeepSeek V3.1 is a hybrid large language model (LLM) developed by DeepSeek AI.
🌐
Venturebeat
venturebeat.com › ai › deepseek-v3-1-just-dropped-and-it-might-be-the-most-powerful-open-ai-yet
DeepSeek V3.1 just dropped — and it might be the most powerful open AI yet | VentureBeat
October 14, 2025 - China's DeepSeek has released a 685-billion parameter open-source AI model, DeepSeek V3.1, challenging OpenAI and Anthropic with breakthrough performance, hybrid reasoning, and zero-cost access on Hugging Face.
🌐
South China Morning Post
scmp.com › tech › big tech
DeepSeek’s V3.1 update and missing R1 label spark speculation over fate of R2 AI model | South China Morning Post
Chinese artificial intelligence start-up DeepSeek has updated its foundational V3 model and removed references to its reasoning model R1 from its chatbot, prompting speculation about a shift in the company’s research focus.
Published   August 20, 2025