r/LocalLLaMA on WeSearch

Recent social headlines from r/LocalLLaMA.

"Generate a photorealistic realtime render of a human face with webGL" (Qwen3.5-122B-A10B UD-Q3_K_XL)

5/17/2026 · 36 views

MTP experiences on 7900xtx?

5/17/2026 · 41 views

Grafting vision onto text models for fun and profit.

5/17/2026 · 27 views

Are local models good enough yet for AI meeting memory?

5/17/2026 · 23 views

llama: avoid copying logits during prompt decode in MTP by am17an · Pull Request #23198 · ggml-org/llama.cpp

5/17/2026 · 38 views

The power of structured workflows and small local models

5/17/2026 · 36 views

Developers who use local AI - Q4_0 vs Q8_0 KV quant?

5/17/2026 · 27 views

Recent Developments in LLM Architectures: KV Sharing, mHC, and Compressed Attention

5/17/2026 · 31 views

How do I get the superfast DFlash / MTP tokens per second that I'm seeing on here? Dual 3090s

5/17/2026 · 35 views

Dual GPU llama.cpp speedup

5/17/2026 · 30 views

Convert With MPT Support?

5/17/2026 · 27 views

Good candidate model to act as a PA

5/17/2026 · 25 views

Is that was a right purchase for Qwen3.6 27/35

5/17/2026 · 25 views

Llama.cpp MTP with Qwen3.6 27B on Headless RTX 3090

5/17/2026 · 36 views

Jackrong/Qwopus3.5-9B-Coder-GGUF · Hugging Face

5/17/2026 · 29 views

Very happy with Qwen 3.5 122B output. But is slowness expected?

5/17/2026 · 27 views

LeanLoop, the Tool Claude Leans on

5/17/2026 · 24 views

"Elias Thorne" is what eight different LLMs name a lighthouse keeper. He's also selling cancer treatment advice on Amazon

5/17/2026 · 33 views

Looking to migrate off of Ollama and LMStudio

5/17/2026 · 35 views

Hardware Recommendations for realtime voice and a simple personal assistant/organisation agent.

5/17/2026 · 47 views

Meet Ronald

5/17/2026 · 24 views

webui: support video files as input by foldl · Pull Request #22830 · ggml-org/llama.cpp

5/17/2026 · 37 views

How do I correct a memory that was retrieved without asking for any help from the backend team? (personal experience)

5/17/2026 · 37 views

G4-Meromero-31B-Uncensored-Heretic Is Out Now, a Finetune of Gemma 4 31B It Designed for Creative Tasks, With Kld of 0.0100 and 15/100 Refusals!

5/17/2026 · 27 views

Ran the same models across Strix Halo, RTX 3090, and RTX 5070 because I wanted my own numbers

5/16/2026 · 35 views

an alternative = similar experience to using windsurf but on local?

5/16/2026 · 24 views

Now that MTP is merged... What's the best outputs you're getting on Qwen 3.6 35B on 2x3090s?

5/16/2026 · 36 views

WSL can't reach Kobold.cpp running on Windows, even though the API works fine in PowerShell, SillyTavern & a Kenshi SentientSands Mod. Does anyone know the solution?

5/16/2026 · 39 views

I fitted the new δ-mem research for apple silicon using mlx and openclaw integration! My findings

5/16/2026 · 32 views

Qwen3.5-122B-Q5-MTP - Qwen3.5-122B-Q6-MTP

5/16/2026 · 34 views

Best llama.cpp launch config for Qwen3.6 27B on RX 7800 XT (16 GB VRAM) for OpenClaw?

5/16/2026 · 36 views

gemma-4-Ortenzya-The-Creative-Wordsmith-31B-it-uncensored-heretic is Out Now, A Writing Finetune that Aims to Improve Gemma 4 31B it Writing Quality with More Natural English and Better Prose, Good for Creative Writings, Translations and RPs!

5/16/2026 · 38 views

Local Qwen 3.6 vs frontier models on a coding primitive: single-file HTML canvas driving animation - results and GIFs

5/16/2026 · 33 views

How I started programming differently over the last year. What about you?

5/16/2026 · 25 views

Corsair desktop PC with Ryzen 395 and 128GB of unified RAM, has anyone tested it for LLM? Seems "a good" price

5/16/2026 · 33 views

Qwen 27b MTP Config, Llama.cpp Single 3090

5/16/2026 · 30 views

Using Intel Arc Pro series, any thoughts ?

5/16/2026 · 25 views

b9180 llama.ccp MTP landed

5/16/2026 · 38 views

LLM Phone Home: Reliable Apps that can deliver inference from local backend

5/16/2026 · 26 views

Extension idea: llama-server with custom samplers

5/16/2026 · 24 views

Local speech to text for iOS using Apple Watch

5/16/2026 · 23 views

I've updated my glorified Llama fork (LLM Inference Server) for P40's to utilise MTP + TurboQuant + DFlash

5/16/2026 · 30 views

A very important milestone for me in the AI field.

5/16/2026 · 21 views

Built a 6x cheaper CodeRabbit alternative using open source models

5/16/2026 · 33 views

Reduce your GPU power limit

5/16/2026 · 20 views

When you run small LMM on RAM, dont use all Theards.

5/16/2026 · 18 views

What's in a GGUF, besides the weights - and what's still missing?

5/16/2026 · 23 views

How WeSearch handles this source

WeSearch's declared handling of r/LocalLLaMA's content. Indexing, snippets, summaries, retrieval and training are separate questions — see the rights registry or read this source's machine-readable record.

Indexing: Allowed
Snippet: Allowed
AI summary: Limited
Retrieval / RAG: Not asserted
Model training: Not asserted
Commercial reuse: Not permitted
