14 results for "local llm"
Running Local LLMs Offline on a Ten-Hour Flight
I flew from London to Google Cloud Next 2026 in Las Vegas. Ten hours with no in-flight wifi. I used the time to test how far a modern MacBook can carry engineering work on local LLMs alone. Setup A we…
I asked my local LLM to add 23 numbers and got seven wrong answers
Seven attempts, seven different wrong answers — lessons from setting up a local LLM.…
Home Assistant's local LLM support outperforms Gemini for Home, and Google knows it
The smarter smart home is local.…
Privacy-first Markdown app with wiki links, image links, slash command, and a Local LLM plugin, no split panel preview
Binderus is a free, local, privacy-first Markdown WYSIWYG editor with a Notion-style block UI. Your notes never leave your machine unless you put them somewhere yourself. What's new in 0.9.0: 🔗 Wiki l…
Built a Character Portrait Generator that reads books, identifies characters, and generates consistent portraits using ComfyUI (full RAG pipeline, local LLM, open-source)
PrePrompt – MCP server that rewrites vague prompts before they reach the LLM
MCP server that intercepts and optimizes prompts in Claude Code and Cursor before they reach the LLM. Zero noise, sub-ms latency, runs locally.…
I’ve been spending the last few weeks testing local music generation on Apple Silicon, mostly around ACE-Step 1.5 + MLX.
I’ve been spending the last few weeks testing local music generation on Apple Silicon, mostly around ACE-Step 1.5 + MLX. Sharing notes because most local AI discussion is still LLM/VLM/TTS-heavy, but …
What would be the best OS to run LLMs?
Hi there, I've ordered a mini PC with 128GB of RAM and the AMD AI Max 395. I intend to use it with Proxmox (like my current machine), where I run Windows for some gaming and macOS for my music library …
Show HN: Knowerage – code coverage for LLM analysis
Local MCP server that tracks AI analysis coverage against your codebase - MTimma/knowerage…
Skymizer Taiwan Inc. Unveils Breakthrough Architecture Enabling Ultra-Large LLM Inference on a Single Card
Source article excerpt: With a single PCIe card — powered by six HTX301 chips and 384 GB of memory — enterprises can now run 700B-parameter model inference locally at just ~240W per card. The memory-b…
Claude Cowork Now Runs Any LLM. Test It Free
OpenAI, Gemma, Kimi K2, or run locally. Free via OpenRouter. Anthropic shipped it quietly.…
AMD Hipfire - a new inference engine optimized for AMD GPUs
Came across hipfire the other day. It's a brand-new inference engine focused on all AMD GPUs (not just the latest). GitHub. It uses a special mq4 quantization method. The hipfire creator is pumping o…
Is there any top-level hobbyist hardware you guys are waiting on to come out this year?
So I've explored buying everything from an RTX 6000 to a Mac Studio with a 512GB M3 Ultra to a DGX Spark (I need to travel) for local LLM generation. I was about to start looking into an M5 MacBook, but I figured …
Is there a way to mitigate the performance drop as context grows?
In my local LLM setup I get from 30 to 80 t/s generation at the beginning, but it drops quite a lot as context grows. I use llama.cpp/Vulkan with an MI50 and a V100; are there some command-line flags t…