Search: "deepseek" — WeSearch Press

SOUTH CHINA MORNING POST

DeepSeek mystery: who is speaking for start-up as CEO Liang Wenfeng remains out of sight?

Researcher Chen Deli is emerging as DeepSeek’s new public face as speculation over the whereabouts of the company’s founder and CEO lingers.…

Tue, 28 Apr 2026 11:05:49 GMT · 4 views

LOCALLLAMA

No GGUFs for DeepSeek V4-Flash as yet?

Wondering why there aren't any "name brand" (like unsloth, bartowski) GGUFs as yet for DeepSeek V4 Flash?…

Mon, 27 Apr 2026 10:48:33 GMT · 7 views

GOOGLE NEWS

China's DeepSeek slashes prices for new AI model - Reuters

Mon, 27 Apr 2026 08:52:24 GMT · 4 views

SIMON WILLISON'S WEBLOG

DeepSeek V4 - almost on the frontier, a fraction of the price

Chinese AI lab DeepSeek's last model release was V3.2 (and V3.2 Speciale) last December. They just dropped the first of their hotly anticipated V4 series in the shape of two preview models, DeepSeek-…

Sun, 26 Apr 2026 22:44:19 GMT · 6 views

llama.cpp DeepSeek v4 Flash experimental inference

Hi, here you can find experimental llama.cpp support for DeepSeek v4, and here there is the GGUF you can use to run the inference with "just" (lol) 128GB of RAM. The model, even quantized at 2 bit, lo…

Sun, 26 Apr 2026 22:44:09 GMT · 7 views

Decreased Intelligence Density in DeepSeek V4 Pro

In the V3.2 paper, they mentioned: Second, token efficiency remains a challenge; DeepSeek-V3.2 typically requires longer generation trajectories (i.e., more tokens) to match the output quality of mode…

Sun, 26 Apr 2026 13:42:28 GMT · 12 views

DeepSeek V4 Update

Sun, 26 Apr 2026 08:59:43 GMT · 8 views

LMSYS

DeepSeek-V4 on Day 0: From Fast Inference to Verified RL with SGLang and Miles

We are thrilled to announce Day-0 support for DeepSeek-V4 across both inference and RL training. SGLang and Miles form the first open-source stack to serve and train DeepSeek-V4 on launch day — with s…

Sun, 26 Apr 2026 08:59:39 GMT · 5 views

R/LOCALLLAMA

Deepseek v4 pricing is genuinely silly, did the math and now i am questioning my entire stack

Wed, 29 Apr 2026 05:04:25 GMT · 1 view

ANNAJC

A 3D Flappy Bird side-scroller game built with DeepSeek V4 Pro

Wed, 29 Apr 2026 03:34:24 GMT · 2 views

R/LOCALLLAMA

100M tokens for $2.65 (Deepseek V4 Pro)

Wed, 29 Apr 2026 03:04:24 GMT · 2 views

HACKER NEWS: NEWEST

DeepSeek Unveils Newest Flagship AI Model a Year After Upending Silicon Valley

Tue, 28 Apr 2026 13:24:59 GMT · 12 views

GOOGLE NEWS

China’s DeepSeek rolls out a long-anticipated update of its AI model - AP News

Tue, 28 Apr 2026 11:49:39 GMT · 10 views

LOCALLLAMA

Deepseek Vision Coming

From Xiaokang Chen on 𝕏:…

Tue, 28 Apr 2026 11:38:42 GMT · 7 views

Kimi K2.6 vs DeepSeek V4 Pro

Tue, 28 Apr 2026 02:24:30 GMT · 4 views

DeepSeek temporarily slashing prices on V4-Pro by 75%

Tue, 28 Apr 2026 01:24:29 GMT · 4 views

DeepSeek-V4 arrives with near state-of-the-art intelligence at 1/6th the cost of Opus 4.7, GPT-5.5

Mon, 27 Apr 2026 16:54:14 GMT · 3 views

anyone actually tried deepseek v4 pro for coding?

so v4 pro dropped and barely anyone is talking about it. feels weird since when kimi k2.6 came out i seen post about it everywhere. anyone here tried v4 pro for actual code work? hows it compare to k2.…

Mon, 27 Apr 2026 06:01:11 GMT · 7 views

The exact KV cache usage of DeepSeek V4

Figure 1 of DSV4 paper seems to imply that DSV3.2 uses ~50GB at 1m context and DSV4 uses ~5GB: ***Numbers updated with the KV cache breakdown from vllm*** From my own calculations, the correct FP16 KV…

Sun, 26 Apr 2026 22:44:10 GMT · 7 views

CHINAPULSE.COM

US State Department upgrades AI theft accusations to target China AI companies

US State Department says China is stealing US intellectual property. US AI models are being ‘distilled’ to produce cheaper models for China. Deepseek, Moonshot AI and MiniMax accused of alleged theft…

Tue, 28 Apr 2026 20:01:24 GMT · 4 views

LLMETER

LLM Budget Guard – open-source runtime cutoff for OpenAI/Anthropic

Alerts won't stop a runaway agent at 3 AM. Budget Guard enforces hard token cutoffs across OpenAI, Anthropic & DeepSeek before bans or surprise invoices.…

Tue, 28 Apr 2026 17:04:12 GMT · 4 views

ARXIV.ORG

Beyond the Attention Stability Boundary: Agentic Self-Synthesizing Reasoning Protocols

As LLM agents transition to autonomous digital coworkers, maintaining deterministic goal-directedness in non-linear multi-turn conversations emerged as an architectural bottleneck. We identify and for…

Tue, 28 Apr 2026 04:13:21 GMT · 3 views

Why do only big ML labs dominate widely-used models despite many open-source pretrained models smaller labs could do RL on? [D]

I’m trying to understand why models from major labs (GPT, Claude, etc.) dominate real-world usage. You might say it's due to the expensive pretraining compute budget, but there already exist many pret…

Sun, 26 Apr 2026 20:54:40 GMT · 6 views
