r/LocalLLaMA on WeSearch

Recent social headlines from r/LocalLLaMA.

I’ve done it!!! FINALLY I have become a (quasi-local) summoner!!! AMA [imtiredboss.jpg]

5/22/2026 · 23 views

Low-level coding dataset

5/22/2026 · 31 views

Anyone evaluated the difference between Qwen Code for the local qwen models vs another harness? CC, OC, LC, Aider etc..

5/22/2026 · 32 views

What model weights (quantized included) under 150GB have the best general knowledge depth?

5/22/2026 · 27 views

When your LLM treats data center GPUs like an optional DLC

5/22/2026 · 30 views

Latest b9274 Addresses MTP VRAM leak

5/21/2026 · 31 views

Waiting for Qwen 3.7 open weight... The new King has arrived...

5/21/2026 · 34 views

Gorgon Halo is 6.7% faster than predecessor Strix Halo

5/21/2026 · 34 views

Strix Halo 128GB vs M5 pro 64GB

5/21/2026 · 32 views

Honesty in a small model drops from 35% to 0% by changing the tone of the prompt. Sharing the findings.

5/21/2026 · 29 views

110 tok/s with 12GB VRAM on Qwen3.6 35B A3B and ik_llama.cpp

5/21/2026 · 25 views

Open-source LLMs are still weak against long reasoning jailbreaks, even with lightweight defenses

5/21/2026 · 33 views

Model Golf for some Runpod Credits!

5/21/2026 · 32 views

Back again, many changes have taken place.

5/21/2026 · 25 views

How can you stop your model from looping

5/21/2026 · 31 views

"AWS secures rare Mac Studios while ordinary Apple customers remain completely locked out"

5/20/2026 · 25 views

Guide to building smoltorrent | A Distributed ML Checkpoint Storage System

5/20/2026 · 37 views

What small speech to text (STT) model is best at recognizing whispered speech?

5/20/2026 · 21 views

Gemma 4 MTP with LlamaCPP

5/20/2026 · 36 views

Impulse Purchase.

5/20/2026 · 26 views

Qwen3.7 Max scored by Artificial Analysis, 27B/35B waiting room

5/20/2026 · 23 views

Guardrails take an 8B model from 53% to 99% on agentic tasks [ACM CAIS '26 preprint]

5/20/2026 · 24 views

Show HN: Forge – Guardrails take an 8B model from 53% to 99% on agentic tasks

5/20/2026 · 26 views

LM Studio finally added support for MTP Speculative Decoding

5/20/2026 · 36 views

Claude Code plugins a risk to local ecosystem?

5/19/2026 · 40 views

anyone else spending more time managing ai markdown files than actually coding?

5/19/2026 · 32 views

Carbon: Decoding the Language of Life

5/19/2026 · 31 views

Llama-server and MTP

5/19/2026 · 25 views

Qwen is cooking hard

5/19/2026 · 23 views

We have sub-agents at home

5/19/2026 · 26 views

Why might MTP be net negative for tool heavy agentic flows?

5/19/2026 · 29 views

Is there any <3B model with usable 200k+ context window?

5/19/2026 · 27 views

How many GPUs do you have on your local system/server/AI PC?

5/19/2026 · 22 views

favorite Agentic Coding Harness

5/18/2026 · 29 views

Still happy for yall

5/18/2026 · 21 views

Is the llama.cpp nixos flake just broken?

5/18/2026 · 31 views

MTP (Multi-Token Prediction): 2x Faster Token Generation on AMD Strix Halo & Radeon 9700 AI Pro

5/18/2026 · 32 views

Qwen cant wait to release 3.7 models

5/18/2026 · 21 views

Qwen 35b a3b surprises me

5/18/2026 · 31 views

Hopes and dreams for Google IO tomorrow? 👀

5/18/2026 · 33 views

What happens to local LLM if/when LLMs are no longer released for free?

5/18/2026 · 21 views

I tested 42 LLMs on their willingness to build the apocalypse. The "safest" closed-source models are lying to you.

5/18/2026 · 28 views

Quantizing MTP KV Cache = free lunch?

5/18/2026 · 35 views

GGUF with MTP vs MLX without. Is mlx still the way to go for mac users?

5/18/2026 · 26 views

New models when? Forecasting release date.

5/18/2026 · 32 views

The Lurk Report - The last 30 days of r/LocalLLaMA

5/18/2026 · 20 views

Is anyone prioritizing code quality checks via a small local model?

5/18/2026 · 37 views

Big new memory tool with local benchmarks

5/18/2026 · 44 views

I built a coding agent that gets 87% on benchmarks with a 4B parameter model, here's how

5/18/2026 · 37 views

May 2026 updated chart of strix halo mini pc size chart

5/18/2026 · 35 views

How WeSearch handles this source

This is WeSearch's declared handling of r/LocalLLaMA's content. Indexing, snippets, summaries, retrieval, and training are separate questions; see the rights registry or read this source's machine-readable record.

Indexing: Allowed
Snippet: Allowed
AI summary: Limited
Retrieval / RAG: Not asserted
Model training: Not asserted
Commercial reuse: Not permitted
