r/LocalLLaMA on WeSearch

Recent social headlines from r/LocalLLaMA.

R/LOCALLLAMA

how do you decide between q4 and q5 on a 70b when 24gb is the cap?

5/26/2026 · 15 views

Added direct model downloads right from the UI in Anubis OSS - if anyone would help test that would be great

5/26/2026 · 20 views

New local model reaching near frontier on PII removal at 9 ms CPU inference

5/26/2026 · 22 views

Need Help - What would you build? Air-gapped NL assistant that is integrated with Splunk

5/25/2026 · 17 views

Update on 12x32gb sxm v100 cluster / local AI for legal drafting

5/25/2026 · 15 views

Anyone use QwQ-32B? It's over a year old? Has Qwen 3.6 27b basically replaced it?

5/25/2026 · 19 views

Server build for local inference. 128 gb 3200 or 256 gb 2133mhz RAM?

5/25/2026 · 19 views

CUDA: add fast walsh-hadamard transform by am17an · Pull Request #23615 · ggml-org/llama.cpp

5/25/2026 · 17 views

Locally-hosted language-learning AI you can talk to comparable to Pingo AI?

5/25/2026 · 16 views

Can you jailbreak Llama 3.1 8B? (Red-Teaming Challenge)

5/25/2026 · 21 views

What's the best Qwen 27B Q8 quant?

5/25/2026 · 15 views

Best coding model on RTX 3060

5/25/2026 · 12 views

Llama.cpp : Split Mode Tensor Fix Incoming?

5/25/2026 · 18 views

(Yet Another) KV cache calculator - kvanta.vcerny.cz

5/25/2026 · 11 views

Sharing my 'Local-LLM-Toolkit' repo

5/25/2026 · 18 views

Save Safetensor LLM from C#

5/25/2026 · 15 views

Full Attention Strikes Back: Transferring Full Attention into Sparse within Hundred Training Steps

5/25/2026 · 18 views

The Financial Times has published an article about Heretic

5/25/2026 · 16 views

OSCAR RotationZoo - Offline Spectral Covariance-Aware Rotation for 2-bit KV Cache Quantization

5/25/2026 · 19 views

Want Built a React-style looping agent with small LLMs (Qwen 3.5 9B / Gemma4) + LangGraph?

5/25/2026 · 20 views

Are GPU prices hitting peak and falling?

5/25/2026 · 15 views

llama.cpp oom issue

5/25/2026 · 17 views

How has local AI improved your life?

5/25/2026 · 9 views

I pioneered AI slop in 2019 with my Tensorflow rig. (24GB back then, too.) AMA.

5/25/2026 · 15 views

Please give me your best tips for fine tuning RTX Pro 6000 on Intel i7-14700KF

5/25/2026 · 16 views

I built a computer use sandbox framework for codex on headless linux. GPU passthrough, computer use, and sudo access for codex all work. It's the perfect dev sandbox to allow full auto work while minimizing the "rm -rf /" risk

5/25/2026 · 16 views

We added W8A8 activation quantization to MLX — prefill went from 2.84s to 2.52s on M5 Pro

5/25/2026 · 17 views

Next year we're getting 0.5T model from Grok

5/25/2026 · 15 views

I made a local-first MCP tutorial repo with node-llama-cpp and a custom agent loop

5/25/2026 · 18 views

server: fix checkpoints creation by jacekpoplawski · Pull Request #22929 · ggml-org/llama.cpp

5/25/2026 · 17 views

NVIDIA Jetson AGX Orin 64GB

5/25/2026 · 17 views

Qwen 3.6 benchmarks on 2x RTX PRO 6000

5/25/2026 · 10 views

It was fun while it lasted... They're advertising now.

5/25/2026 · 16 views

1000 tps generation on Qwen3.6 27B with V100s

5/25/2026 · 14 views

Wrote a custom C++ engine for MiniCPM-V 4.6 on Orange Pi AIPro (Ascend 310B) to bypass framework overhead

5/25/2026 · 14 views

llama.cpp has a clever trick for speeding up KV cache decode

5/25/2026 · 20 views

Could someone please help explain these results?

5/25/2026 · 14 views

Open-source music recommendation / playlist, similar to Spotify radio / YT Music mix?

5/25/2026 · 19 views

What is the best way to install llama.cpp and wrap it in a Python UI (CPU only)?

5/25/2026 · 18 views

Qwen 3.6 27B MTP speed on 3080ti (getting 4.5 t/s)

5/24/2026 · 22 views

hipEngine: Fast Native Qwen 3.6 Inference for RDNA3 (Strix Halo, 7900 XTX)

5/24/2026 · 19 views

Could Open Models be trained to secretly go rogue?

5/24/2026 · 15 views

Generative Recursive Education: Creating Custom Interactive Textbooks on the Fly.

5/24/2026 · 20 views

What frontend do you guys use?

5/24/2026 · 12 views

Can someone help me understand MCP?

5/24/2026 · 9 views

magic incantation to get llama-bench to work with MTP ?

5/24/2026 · 10 views

Need Help Choosing a Harness for Qwen 3.6 27B

5/24/2026 · 17 views

Is NVIDIA still the default best choice for local LLMs in 2026?

5/24/2026 · 19 views

What is the smallest amount of RAM sufficient to run any available on HF GGUF LLM model locally?

5/24/2026 · 16 views

BitCPM-CANN: Native 1.58-Bit Large Language Model Training on Ascend NPU

5/24/2026 · 10 views
