r/LocalLLaMA on WeSearch

Recent social headlines from r/LocalLLaMA.

Nvidia teases new PC laptop chip to be announced at Computex June 2

5/29/2026 · 40 views

PSA

5/29/2026 · 33 views

Step 3.7 Flash passes the car wash test

5/29/2026 · 30 views

Llama.cpp B9406 MTP mmproj fix

5/29/2026 · 38 views

FP16 on Qwen 3.6 27B

5/29/2026 · 24 views

Comparing Vector search libraries

5/29/2026 · 28 views

OAM waterblocks

5/29/2026 · 27 views

A moment of thanks for DeepSeek

5/29/2026 · 39 views

How do I make MTP work in llama-server?

5/29/2026 · 38 views

StepFun 3.7 Flash - Speed Benchmark in M5 Max

5/29/2026 · 49 views

Step 3.7 Flash Config + Early Data on 2x RTX 6000's

5/29/2026 · 37 views

Liquid AI releases LFM2.5-8B-A1B

5/29/2026 · 36 views

Which Coding Agent Features Are Useful For Local LLMs

5/29/2026 · 36 views

Beware!! Users trying to fork and steal your projects

5/29/2026 · 41 views

StepFun 3.7 Flash

5/29/2026 · 41 views

UPDATE: "Gentle Coding" is mathematically proven. 1,500+ test runs show major gain for Kimi K2.6 and even more for GLM-5.1! GPT 5.4/5.5 and Claude Sonnet 3.5/Opus 4.6 also better, with ZERO REGRESSION ACROSS THE BOARD.

5/29/2026 · 48 views

Ubuntu 26.04 on DGX Spark

5/28/2026 · 30 views

Upgrade path from 4x 3090s

5/28/2026 · 30 views

Mimo 2.5 Pro - 40t/s on 8x Nvidia Spark/GB10 cluster

5/28/2026 · 30 views

CrankGPT by Squeez Labs - hand-cranked edge AI - talk about local AI!!!

5/27/2026 · 42 views

I built a 103B-token Usenet corpus (1980–2013) — pre-web, human-only, zero AI contamination. Got strong traction on r/ML, thought this community would find it useful.

5/27/2026 · 31 views

Inferencing at 10.33 t/s on Qwen 3.5 35B on a $300 laptop

5/27/2026 · 26 views

Qwen3.6 huge quality gain from Q4 to Q6 for coding agent

5/27/2026 · 33 views

Looking for a working Deepseek-v4-Flash quant

5/27/2026 · 32 views

Why are the AI Companies spreading F.U.D. about AI?

5/27/2026 · 33 views

Is a 128 GB MacBook Pro M5 Max actually too slow for large-context local LLM coding workflows?

5/27/2026 · 35 views

Hugging Face Dataset Lineage Explorer

5/27/2026 · 41 views

Finally pioneering beyond the local 256k context window frontier!

5/27/2026 · 27 views

Found a Rust TUI coding agent that aggressively trims context with AST-level chunking. Cut my token bleed sharply with DeepSeek V4 Flash.

5/27/2026 · 45 views

Hyvemind OSS - Looking for some testers

5/27/2026 · 28 views

I made a small tool to inspect retrieval results before feeding them into RAG

5/27/2026 · 39 views

New DeepSWE benchmark finds Claude Opus cheats

5/27/2026 · 36 views

Turning every "no thats not what i meant" in chat into actual LoRA training data

5/27/2026 · 32 views

Does Engram Do Memory Retrieval in Autoregressive Image Generation?

5/27/2026 · 38 views

Stop traumatizing AI into loops and turn hallucinations into an honest "I don't know!" by being NICE to them (Proof of Concept, Research, I don't want to sell anything)

5/27/2026 · 44 views

How Qwen3.6-35B-A3B fails differently as a sub agent compared to solo

5/27/2026 · 39 views

I made a Windows app for managing llama.cpp in WSL/Ubuntu

5/26/2026 · 40 views

Long-context performance at lower quants

5/26/2026 · 37 views

OpenMOSS-Team/MOSS-TTS-v1.5 · Hugging Face

5/26/2026 · 31 views

Built a local-first AI memory system that indexes screen activity, meetings, and voice notes ( MCP + automations)

5/26/2026 · 39 views

SkillOpt treats markdown skill files as trainable parameters with proper optimization machinery

5/26/2026 · 34 views

Strix Halo users, a rejected PR can give you up to 30% faster PP for MOEs.

5/26/2026 · 39 views

Stop pretending self-hosting is cheaper. It's not. We do it for different reasons and we should say so.

5/26/2026 · 32 views

Qwen3.5 35B A3B uncensored heretic Native MTP Preserved is Out Now With the Full 785 MTPs Preserved and Retained, Available in Safetensors, GGUFs. NVFP4, NVFP4 GGUFs and GPTQ-Int4 Formats

5/26/2026 · 40 views

Running on a macbook, and having issues with crashing? Maybe this will help...

5/26/2026 · 35 views

I finally put my NPU (Intel Arrow Lake) to use doing ASR for my smart home

5/26/2026 · 40 views

CXMT started selling ram to corsair

5/26/2026 · 33 views

Is something went wrong with those online free model, why I feel they worse than Gemma 4 26B A4B Q4_KM ??

5/26/2026 · 35 views

One letter to appease them all

5/26/2026 · 33 views

Shard - getting to 10× KV cache compression

5/26/2026 · 34 views

How WeSearch handles this source

WeSearch's declared handling of r/LocalLLaMA's content. Indexing, snippets, summaries, retrieval and training are separate questions — see the rights registry or read this source's machine-readable record.

Indexing: Allowed
Snippet: Allowed
AI summary: Limited
Retrieval / RAG: Not asserted
Model training: Not asserted
Commercial reuse: Not permitted
