Hugging Face
huggingface.co › deepseek-ai › DeepSeek-V3.1
deepseek-ai/DeepSeek-V3.1 · Hugging Face
3 weeks ago - Search agents are evaluated with our internal search framework, which uses a commercial search API + webpage filter + 128K context window. Search agent results of R1-0528 are evaluated with a pre-defined workflow. SWE-bench is evaluated with our internal code agent framework. HLE is evaluated with the text-only subset.

    import transformers

    tokenizer = transformers.AutoTokenizer.from_pretrained("deepseek-ai/DeepSeek-V3.1")
    messages = [
        {"role": "system", "content": "You are a helpful assistant"},
        {"role": "user", "content": "Who are you?"},
        {"role": "assistant", "content": "<think>Hmm</think>I am DeepSeek"},
    ]
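The snippet above stops at building the message list; a minimal sketch of rendering it into a prompt string follows. The thinking keyword is taken from this checkpoint's chat template as described on the model card; it is not a generic transformers argument, so treat its exact semantics as an assumption.

    import transformers

    tokenizer = transformers.AutoTokenizer.from_pretrained("deepseek-ai/DeepSeek-V3.1")
    messages = [
        {"role": "user", "content": "Who are you?"},
    ]

    # Render the conversation into a raw prompt. Extra kwargs such as
    # `thinking` are forwarded by transformers to the chat template.
    prompt = tokenizer.apply_chat_template(
        messages,
        tokenize=False,
        add_generation_prompt=True,
        thinking=True,  # True selects reasoning mode, False non-thinking mode
    )
    print(prompt)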
DeepSeek v3.1
Qwen: Deepseek must have concluded that hybrid models are worse. Deepseek: Qwen must have concluded that hybrid models are better. More on reddit.com
So.. What's the consensus on Deepseek-V3.1 for RP?
Agreed. 3.1 feels “technically correct” but flat, like it stripped all the flavor out of the RP. R1 wasn’t perfect, but it had way more life and descriptive flair. More on reddit.com
DeepSeek V3.1 (Thinking) aggregated benchmarks (vs. gpt-oss-120b)
That proves that benchmarks are barely useful now. More on reddit.com
deepseek-ai/DeepSeek-V3.1 · Hugging Face
OK, so here are my quick takes on DeepSeek V3.1. Improving agentic capability seems to be the focus of this update. More specifically:

1. 29.8% on HLE with search and Python, compared to 24.8% for R1-0528, 35.2% for GPT-5 Thinking, 24.3% for o3, 38.6% for Grok 4, and 26.9% for Gemini Deep Research. Caveats apply: DeepSeek models are evaluated exclusively on the text-only subset, although I believe this subset is not easier for SotA models, and Grok 4 is (possibly) evaluated without a webpage filter, so data contamination is possible.
2. 66.0% on SWE-bench Verified without Thinking, compared to 44.6% for R1-0528, 74.9% for GPT-5 Thinking, 69.1% for o3, 74.5% for Claude 4.1 Opus, and 65.8% for Kimi K2. Again, a caveat applies: OpenAI models are evaluated on a subset of 477 problems, not the full set of 500.
3. 31.3% on Terminal-Bench with the Terminus 1 framework, compared to 30.2% for o3, 30.0% for GPT-5, and 25.3% for Gemini 2.5 Pro.
4. A slight bump in other coding and math capabilities (AIME, LiveCodeBench, Codeforces, Aider), but most users would not be able to tell the difference, as R1-0528 already destroys 98% of human programmers on competitive programming.
5. A slight reduction on GPQA and HLE (offline, no tools), and maybe in your own use case. I do not find V3.1 Thinking to be better than R1-0528 as a chat LLM, for example.

A few concluding thoughts:

Right now I am actually more worried about how the open-source ecosystem will deploy DeepSeek V3.1 in an agentic environment than anything else. For agentic LLMs, prompts and agent frameworks make a huge difference in user experience. Gemini, Anthropic, and OpenAI all have branded search and code agents (e.g. Deep Research, Claude Code), but DeepSeek has none. So it remains to be seen how well V3.1 can work with prompts and tools from Claude Code, for example. Maybe DeepSeek will open-source their internal search and coding framework at a future date to ensure the best user experience.

I also noticed that a lot of serverless LLM inference providers cheap out on their deployment. They may serve with lowered precision, pruned experts, or poor sampling parameters, so the provider you use will definitely impact your user experience (see the sketch below).

It also starts to make sense why they merged R1 with V3 and made the 128K context window the default on the API. Agentic coding usually does not benefit much from a long CoT but consumes a ton of tokens, so a single model is a good way to reduce deployment TCO.

This is probably as far as they can push the V3 base - you can already see some regression on things like GPQA and offline HLE. Hope to see V4 soon. More on reddit.com
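On the inference-provider point above: one way to sidestep host-to-host variance is to call DeepSeek's first-party endpoint, which is OpenAI-compatible, and pin the sampling parameters yourself rather than trusting provider defaults. A minimal sketch, assuming the openai Python SDK and the base URL from DeepSeek's API docs; the temperature value is illustrative, not an official recommendation.

    import os
    from openai import OpenAI

    # DeepSeek's first-party API speaks the OpenAI chat-completions protocol.
    client = OpenAI(
        api_key=os.environ["DEEPSEEK_API_KEY"],
        base_url="https://api.deepseek.com",
    )

    response = client.chat.completions.create(
        model="deepseek-chat",
        messages=[
            {"role": "system", "content": "You are a helpful assistant"},
            {"role": "user", "content": "Who are you?"},
        ],
        temperature=0.6,  # pinned explicitly instead of relying on host defaults
        max_tokens=1024,
    )
    print(response.choices[0].message.content)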
Videos
Reddit
reddit.com › r/localllama › deepseek v3.1
r/LocalLLaMA on Reddit: DeepSeek v3.1
August 19, 2025 -
It’s happening!
The DeepSeek online model has been updated to V3.1, with context length extended to 128K. You are welcome to test it on the official website and app. API calling remains the same.
Top answer 1 of 5
121
Qwen: Deepseek must have concluded that hybrid models are worse. Deepseek: Qwen must have concluded that hybrid models are better.
2 of 5
69
More observations:
1. The model is very, very verbose.
2. The “r1” label on the think button is gone, indicating this is a mixed reasoning model! Well, we’ll know for sure once the official blog post is out.
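If the think-button observation is right and V3.1 is a hybrid reasoning model behind the existing endpoints, the mode switch on the first-party API is just the model name. A sketch under that assumption, using the two model aliases and the reasoning_content field documented in DeepSeek's API docs:

    import os
    from openai import OpenAI

    client = OpenAI(
        api_key=os.environ["DEEPSEEK_API_KEY"],
        base_url="https://api.deepseek.com",
    )
    prompt = [{"role": "user", "content": "Which is larger, 9.11 or 9.9?"}]

    # Non-thinking mode: the model answers directly.
    fast = client.chat.completions.create(model="deepseek-chat", messages=prompt)
    print(fast.choices[0].message.content)

    # Thinking mode: the chain of thought is returned separately as
    # `reasoning_content`, with the final answer in `content`.
    slow = client.chat.completions.create(model="deepseek-reasoner", messages=prompt)
    print(slow.choices[0].message.reasoning_content)
    print(slow.choices[0].message.content)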
Together AI
together.ai › models › deepseek-v3-1
DeepSeek-V3.1 API | Together AI
    # Reconstructed from a truncated snippet: the client setup, the query, and
    # any earlier documents were cut off in the search result, so the values
    # marked below are placeholders.
    from together import Together

    client = Together()  # reads TOGETHER_API_KEY from the environment

    query = "What is a guanaco?"  # placeholder; the original query was truncated
    documents = [
        "Guanacos are one of two wild South American camelids; the other species is the vicuña, which lives at higher elevations.",
    ]

    response = client.rerank.create(
        model="deepseek-ai/DeepSeek-V3.1",
        query=query,
        documents=documents,
        top_n=2,
    )

    for result in response.results:
        print(f"Relevance Score: {result.relevance_score}")
DeepSeek
api-docs.deepseek.com › deepseek v3.1 update 2025/09/22
DeepSeek-V3.1-Terminus | DeepSeek API Docs
September 22, 2025 - The latest update builds on V3.1’s strengths while addressing key user feedback. ... 📊 DeepSeek-V3.1-Terminus delivers more stable & reliable outputs across benchmarks compared to the previous version.
YouTube
youtube.com › watch
DeepSeek V3.1: Bigger Than You Think! - YouTube
DeepSeek V3.1 is a unified hybrid reasoning open-weight model that powers agentic workflows—FP8 training, strong post-training for tool/function calling (non...
Published August 22, 2025
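Since the video description highlights tool/function calling, here is a minimal sketch of passing an OpenAI-style tool schema through DeepSeek's first-party chat endpoint. The get_weather tool and its schema are hypothetical, for illustration only:

    import json
    import os
    from openai import OpenAI

    client = OpenAI(
        api_key=os.environ["DEEPSEEK_API_KEY"],
        base_url="https://api.deepseek.com",
    )

    # Hypothetical function spec; any OpenAI-style tool definition works here.
    tools = [{
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get the current weather for a city",
            "parameters": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
            },
        },
    }]

    response = client.chat.completions.create(
        model="deepseek-chat",
        messages=[{"role": "user", "content": "What's the weather in Hangzhou?"}],
        tools=tools,
    )

    # When the model elects to call the tool, arguments arrive as a JSON string.
    call = response.choices[0].message.tool_calls[0]
    print(call.function.name, json.loads(call.function.arguments))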
Google Cloud Platform
console.cloud.google.com › vertex-ai › publishers › deepseek-ai › model-garden › deepseek-v3-1
DeepSeek-V3.1 – Vertex AI
Hugging Face
huggingface.co › deepseek-ai › DeepSeek-V3.1-Base
deepseek-ai/DeepSeek-V3.1-Base · Hugging Face