r/LocalLLaMA on WeSearch
Recent social headlines from r/LocalLLaMA.
R/LOCALLLAMA
Nvidia teases new PC laptop chip to be announced at Computex June 2
PSA
Step 3.7 Flash passes the car wash test
Llama.cpp B9406 MTP mmproj fix
FP16 on Qwen 3.6 27B
Comparing Vector search libraries
OAM waterblocks
A moment of thanks for DeepSeek
How do I make MTP work in llama-server?
StepFun 3.7 Flash - Speed Benchmark in M5 Max
Step 3.7 Flash Config + Early Data on 2x RTX 6000's
Liquid AI releases LFM2.5-8B-A1B
Which Coding Agent Features Are Useful For Local LLMs
Beware!! Users trying to fork and steal your projects
StepFun 3.7 Flash
UPDATE: "Gentle Coding" is mathematically proven. 1,500+ test runs show major gain for Kimi K2.6 and even more for GLM-5.1! GPT 5.4/5.5 and Claude Sonnet 3.5/Opus 4.6 also better, with ZERO REGRESSION ACROSS THE BOARD.
Ubuntu 26.04 on DGX Spark
Upgrade path from 4x 3090s
Mimo 2.5 Pro - 40t/s on 8x Nvidia Spark/GB10 cluster
CrankGPT by Squeez Labs - hand-cranked edge AI - talk about local AI!!!
I built a 103B-token Usenet corpus (1980–2013) — pre-web, human-only, zero AI contamination. Got strong traction on r/ML, thought this community would find it useful.
Inferencing at 10.33 t/s on Qwen 3.5 35B on a $300 laptop
Qwen3.6 huge quality gain from Q4 to Q6 for coding agent
Looking for a working Deepseek-v4-Flash quant
Why are the AI Companies spreading F.U.D. about AI?
Is a 128 GB MacBook Pro M5 Max actually too slow for large-context local LLM coding workflows?
Hugging Face Dataset Lineage Explorer
Finally pioneering beyond the local 256k context window frontier!
Found a Rust TUI coding agent that aggressively trims context with AST-level chunking. Cut my token bleed sharply with DeepSeek V4 Flash.
Hyvemind OSS - Looking for some testers
I made a small tool to inspect retrieval results before feeding them into RAG
New DeepSWE benchmark finds Claude Opus cheats
Turning every "no thats not what i meant" in chat into actual LoRA training data
Does Engram Do Memory Retrieval in Autoregressive Image Generation?
Stop traumatizing AI into loops and turn hallucinations into an honest "I don't know!" by being NICE to them (Proof of Concept, Research, I don't want to sell anything)
How Qwen3.6-35B-A3B fails differently as a sub agent compared to solo
I made a Windows app for managing llama.cpp in WSL/Ubuntu
Long-context performance at lower quants
OpenMOSS-Team/MOSS-TTS-v1.5 · Hugging Face
Built a local-first AI memory system that indexes screen activity, meetings, and voice notes (MCP + automations)
SkillOpt treats markdown skill files as trainable parameters with proper optimization machinery
Strix Halo users, a rejected PR can give you up to 30% faster PP for MOEs.
Stop pretending self-hosting is cheaper. It's not. We do it for different reasons and we should say so.
Qwen3.5 35B A3B uncensored heretic Native MTP Preserved is Out Now With the Full 785 MTPs Preserved and Retained, Available in Safetensors, GGUFs, NVFP4, NVFP4 GGUFs and GPTQ-Int4 Formats
Running on a macbook, and having issues with crashing? Maybe this will help...
I finally put my NPU (Intel Arrow Lake) to use doing ASR for my smart home
CXMT started selling RAM to Corsair
Did something go wrong with those free online models? Why do they feel worse than Gemma 4 26B A4B Q4_KM?
One letter to appease them all