WeSearch

Results for "deepseek".

23 stories match your query across our 700+ source catalog. Ranked by relevance and recency.

SOUTH CHINA MORNING POST

DeepSeek mystery: who is speaking for start-up as CEO Liang Wenfeng remains out of sight?

Researcher Chen Deli is emerging as DeepSeek’s new public face as speculation over the whereabouts of the company’s founder and CEO lingers.…

· 4 views
LOCALLLAMA

No GGUFs for DeepSeek V4-Flash as yet?

Wondering why there aren't any "name brand" (like unsloth, bartowski) GGUFs as yet for DeepSeek V4 Flash?…

· 7 views
GOOGLE NEWS

China's DeepSeek slashes prices for new AI model - Reuters

· 4 views
SIMON WILLISON'S WEBLOG

DeepSeek V4 - almost on the frontier, a fraction of the price

Chinese AI lab DeepSeek's last model release was V3.2 (and V3.2 Speciale) last December. They just dropped the first of their hotly anticipated V4 series in the shape of two preview models, DeepSeek-…

· 6 views
REDDIT

llama.cpp DeepSeek v4 Flash experimental inference

Hi, here you can find experimental llama.cpp support for DeepSeek v4, and here there is the GGUF you can use to run the inference with "just" (lol) 128GB of RAM. The model, even quantized at 2 bit, lo…

· 7 views
REDDIT

Decreased Intelligence Density in DeepSeek V4 Pro

In the V3.2 paper, they mentioned: Second, token efficiency remains a challenge; DeepSeek-V3.2 typically requires longer generation trajectories (i.e., more tokens) to match the output quality of mode…

· 12 views
REDDIT

DeepSeek V4 Update

· 8 views
LMSYS

DeepSeek-V4 on Day 0: From Fast Inference to Verified RL with SGLang and Miles

We are thrilled to announce Day-0 support for DeepSeek-V4 across both inference and RL training. SGLang and Miles form the first open-source stack to serve and train DeepSeek-V4 on launch day — with s…

· 5 views
R/LOCALLLAMA

Deepseek v4 pricing is genuinely silly, did the math and now i am questioning my entire stack

· 1 view
ANNAJC

A 3D Flappy Bird side-scroller game built with DeepSeek V4 Pro

· 2 views
R/LOCALLLAMA

100M tokens for $2.65 (Deepseek V4 Pro)

· 2 views
HACKER NEWS: NEWEST

DeepSeek Unveils Newest Flagship AI Model a Year After Upending Silicon Valley

· 12 views
GOOGLE NEWS

China’s DeepSeek rolls out a long-anticipated update of its AI model - AP News

· 10 views
LOCALLLAMA

Deepseek Vision Coming

From Xiaokang Chen on 𝕏:…

· 7 views
REDDIT

Kimi K2.6 vs DeepSeek V4 Pro

· 4 views
REDDIT

DeepSeek temporarily slashing prices on V4-Pro by 75%

· 4 views
REDDIT

DeepSeek-V4 arrives with near state-of-the-art intelligence at 1/6th the cost of Opus 4.7, GPT-5.5

· 3 views
REDDIT

anyone actually tried deepseek v4 pro for coding?

So V4 Pro dropped and barely anyone is talking about it. Feels weird, since when Kimi K2.6 came out I saw posts about it everywhere. Anyone here tried V4 Pro for actual code work? How's it compare to k2.…

· 7 views
REDDIT

The exact KV cache usage of DeepSeek V4

Figure 1 of the DSV4 paper seems to imply that DSV3.2 uses ~50GB at 1M context and DSV4 uses ~5GB. (Numbers updated with the KV cache breakdown from vLLM.) From my own calculations, the correct FP16 KV…

· 7 views
CHINAPULSE.COM

US State Department upgrades AI theft accusations to target China AI companies

US State Department says China is stealing US intellectual property. US AI models are being ‘distilled’ to produce cheaper models for China. DeepSeek, Moonshot AI and MiniMax accused of alleged theft…

· 4 views
LLMETER

LLM Budget Guard – open-source runtime cutoff for OpenAI/Anthropic

Alerts won't stop a runaway agent at 3 AM. Budget Guard enforces hard token cutoffs across OpenAI, Anthropic & DeepSeek before bans or surprise invoices.…

· 4 views
ARXIV.ORG

Beyond the Attention Stability Boundary: Agentic Self-Synthesizing Reasoning Protocols

As LLM agents transition to autonomous digital coworkers, maintaining deterministic goal-directedness in non-linear multi-turn conversations emerged as an architectural bottleneck. We identify and for…

· 3 views
REDDIT

Why do only big ML labs dominate widely-used models despite many open-source pretrained models smaller labs could do RL on? [D]

I’m trying to understand why models from major labs (GPT, Claude, etc.) dominate real-world usage. You might say it's due to the expensive pretraining compute budget, but there already exist many pret…

· 6 views