WeSearch

r/LocalLLaMA on WeSearch

Recent social headlines from r/LocalLLaMA.

R/LOCALLLAMA

GPU VRAM only for small models with llama.cpp: is it possible?

5/24/2026 · 12 views
R/LOCALLLAMA

Gemma 4 2B handling structured JSON output + tool calling + reasoning traces correctly via Spring AI / LM Studio — including identifying a real Java bug in code review

5/24/2026 · 19 views
R/LOCALLLAMA

Qwen3.6-35B-A3B vs Gemma4-26B-A4B

5/24/2026 · 14 views
R/LOCALLLAMA

Measuring AI intelligence vs Human intelligence

5/24/2026 · 12 views
R/LOCALLLAMA

gemma 4 e2b quality degrades after ~30-40 continuous inferences on 4gb vram?

5/24/2026 · 21 views
R/LOCALLLAMA

Qwen Plays ̶p̶̶o̶̶k̶̶e̶̶m̶̶o̶̶n̶ ? / QWEN PLAYS DCSS! - qwen3.6-35b-a3b@q4_k_xl plays open source roguelike adventure DCSS (and does a decent job)

5/24/2026 · 20 views
R/LOCALLLAMA

Frustrating results with product searching

5/24/2026 · 11 views
R/LOCALLLAMA

Why not dynamic active parameters (and other questions for the knowledgeable)

5/24/2026 · 12 views
R/LOCALLLAMA

Choosing an abliterated version of Gemma 4 31B and 26B-A4B

5/24/2026 · 24 views
R/LOCALLLAMA

Qwen3.6-35B-A3B-Uncensored-Genesis-APEX-MTP

5/24/2026 · 16 views
R/LOCALLLAMA

I built a local GUI for the TradingAgents framework — works with Ollama

5/24/2026 · 17 views
R/LOCALLLAMA

Anyone down to test this? Just uploaded a model using rys

5/24/2026 · 18 views
R/LOCALLLAMA

TTS Benchmark Comparison (all known TTS up until May 2026)

5/24/2026 · 17 views
R/LOCALLLAMA

Performance When Offloading Large Models to System RAM?

5/24/2026 · 14 views
R/LOCALLLAMA

How are you all handling agents and sub agents?

5/24/2026 · 14 views
R/LOCALLLAMA

Is there any reason for an uncensored model if you have no interest in roleplaying?

5/24/2026 · 16 views
R/LOCALLLAMA

Vision-capable LLMs vs. OCR for long-document (including charts, images, tables, etc.) QA

5/24/2026 · 14 views
R/LOCALLLAMA

minor speed bump for MTP with Qwen3.6-27B-MTP Q6_K_XL

5/24/2026 · 19 views
R/LOCALLLAMA

llampart 1.0.0 - I released a standalone local web UI for llama-server with translations, extended settings and a polished conversation sidebar

5/24/2026 · 21 views
R/LOCALLLAMA

How to keep up to date on latest models?

5/23/2026 · 16 views
R/LOCALLLAMA

llama.cpp server has built-in native tools (exec_shell, edit_file, etc.)

5/23/2026 · 15 views
R/LOCALLLAMA

Local model doing accounting tasks

5/23/2026 · 15 views
R/LOCALLLAMA

MLID claims Nova Lake-AX not cancelled, just renamed Razor Lake-AX

5/23/2026 · 13 views
R/LOCALLLAMA

For users who have both 6000 PRO MaxQ and Workstation Edition (or Server Edition), how much slower is the MaxQ vs the WS/SV on compute? (Prompt processing, Diffusion, etc.)

5/23/2026 · 21 views
R/LOCALLLAMA

Command A+ (218B MoE) running on Apple Silicon — MLX port, PR open

5/23/2026 · 20 views
R/LOCALLLAMA

Inference provider tiers by Cache-hit rates, using openrouter data

5/23/2026 · 16 views
R/LOCALLLAMA

Any reason to run dense over MOE for RAGs?

5/23/2026 · 9 views
R/LOCALLLAMA

$16 refactor, 400 steps, 95% routed to open MoE

5/23/2026 · 11 views
R/LOCALLLAMA

7900XTX idle power draw when running headless?

5/23/2026 · 9 views
R/LOCALLLAMA

Local, low code, node based agentic development workspace... that actually works?

5/23/2026 · 16 views
R/LOCALLLAMA

Qwen3.6 35B-A3B MTP hits 249 t/s on a 24GB consumer GPU (RTX 5090M) — 3.4× the dense 27B variant on the same image

5/23/2026 · 21 views
R/LOCALLLAMA

found this little known channel with some really good content

5/23/2026 · 8 views
R/LOCALLLAMA

First AI to Beat Every Human in a Programming Competition - Agentic GRPO Explained

5/23/2026 · 19 views
R/LOCALLLAMA

Have we passed the peak of inflated expectations?

5/23/2026 · 7 views
R/LOCALLLAMA

DGX Spark agentic usage numbers

5/23/2026 · 9 views
R/LOCALLLAMA

Best open-source & proprietary options for Indic language ASR

5/23/2026 · 21 views
R/LOCALLLAMA

LLaMa.cpp basic question

5/23/2026 · 19 views
R/LOCALLLAMA

Gemma4 26b a4b Apex quant is quite good

5/23/2026 · 14 views
R/LOCALLLAMA

Gemma is so much better than Qwen, prove me wrong

5/23/2026 · 15 views
R/LOCALLLAMA

G4-MeroMero-26B-A4B-it-uncensored-heretic Is Out Now, a Finetune of gemma-4-26B-A4B-it, With KLD of 0.0152 and 12/100 Refusals!

5/23/2026 · 16 views
R/LOCALLLAMA

Qwen3.6-35B-A3B Q4 262k context on 8GB 3070 Ti = +30tps

5/22/2026 · 16 views
R/LOCALLLAMA

NVIDIA Removes Gaming Revenue Category From Financial Reports

5/22/2026 · 9 views
R/LOCALLLAMA

How small can the orchestration model in an agent be? (separating it from code-gen — that obviously wants a big model)

5/22/2026 · 17 views
R/LOCALLLAMA

If one .gguf makes it past the great filter, humanity survives in some way.

5/22/2026 · 16 views
R/LOCALLLAMA

Seeking resources to read about llama.cpp server and how offloading works

5/22/2026 · 22 views
R/LOCALLLAMA

OpenBMB presents the model BitCPM-CANN 1.58 bit

5/22/2026 · 14 views
R/LOCALLLAMA

Holding machine upgrade waiting for a model?

5/22/2026 · 13 views
R/LOCALLLAMA

Quick note on sudden performance loss when running GGUFs

5/22/2026 · 11 views
R/LOCALLLAMA

ztok — a fast multithreaded tokenizer in Zig that loads tiktoken / HF / SentencePiece and is 2–5× faster

5/22/2026 · 12 views
R/LOCALLLAMA

New Release of ROCm based MLX LLM Engine - lemon-mlx-engine

5/22/2026 · 16 views
