60 stories tagged with #cpp, in publish-time order across the WeSearch catalog. Tag pages update as new stories are ingested.
ui: Mermaid Diagrams in chat + interactive preview by allozaur · Pull Request #24032 · ggml-org/llama.cpp
Encodec.cpp, a portable C++ implementation of Meta's EnCodec using Eigen [P]
Tensor split mode: CUDA error on latest llama.cpp with Qwen-3.6-27b
llama.cpp b9455 Finally Caught vLLM: 70t/s on 2x3090 Qwen 27B UQ8
Test post…
Mellum & Granite Embedding models are ready on llama.cpp
Another shout out to llama.cpp build b9455 2x3090
Dockerfiles for cpp projects
Show HN: I wrote a program that hashes files into poems
Hash files into LLM-generated poems locally. Repository: alebeck/rhymesum on GitHub.…
Better C++ Meetup preview on SwedenCpp, now shows online, hybrid, or in person
Llama.cpp now has an official website: llama.app
llama.cpp now has an official website: https://t.co/9akc1jm8jV. Our goal is to make local AI accessible to everyone, and improving the user experience is a big part of that. On the…
C or Cpp In Ethical Hacking/Cyber-Security
Llama.cpp B9406 MTP mmproj fix
ReNew Energy Global receives buyout proposal from CPP Investments and CEO
LLM-Manager: Orchestrating Ollama and Llama.cpp with Pure Bash
LLM-Manager is a lightweight, modular Bash suite with a dual JSON/Interactive interface designed to...…
VSCode extension that integrates cppreference docs into editor/LSP
C++26: Ordering of constraints involving fold expressions
You have two overloads of g(). One requires A<T> for each element in a pack, the other requires C<T> — where C is a stricter concept that subsumes A. Both apply to the types you’re…
When Your ChatLlamaCpp Stream Causes an Infinite Loop
You've been there. Your AI agent...…
Fibonacci in C++ Templates
I made a Windows app for managing llama.cpp in WSL/Ubuntu
The Story Behind Java: From C++ Limitations to Platform Independence
Introduction: Today I learned Java basics from my trainer, and he explained it in a simple...…
Ollama v0.30.0-rc23: "directly support llama.cpp" & "compatibility with GGUF"
This version of Ollama will change the architecture to directly support llama.cpp instead of building on top of GGML, and allows for compatibility with GGUF file format. MLX is use…
Creating a Custom Grid Editor tool in Unreal Engine
Hello there, Dev community! I'm a beginner, relatively, in the Game Development / Programming...…
CUDA: add fast walsh-hadamard transform by am17an · Pull Request #23615 · ggml-org/llama.cpp
Llama.cpp : Split Mode Tensor Fix Incoming?
llama.cpp oom issue
First Gemma 4 ExecuTorch Deployment on Raspberry Pi 5, and Why It's 7.7× Slower Than llama.cpp
On April 2, ARM published a blog post announcing Gemma 4 optimised for ARM devices via XNNPACK +...…
I made a local-first MCP tutorial repo with node-llama-cpp and a custom agent loop
server: fix checkpoints creation by jacekpoplawski · Pull Request #22929 · ggml-org/llama.cpp
llama.cpp has a clever trick for speeding up KV cache decode
How to install llama.cpp the better way for wrapping it in a Python UI (CPU only)?
Running Gemma 4 on a Modest Machine: Unsloth vs LM Studio vs llama.cpp vs Ollama
This is a submission for the Gemma 4 Challenge: Write About Gemma 4 When local AI conversations...…
GAS Input Tags: Ability Activation Without Hardcoded Bindings
What are input tags in GAS and why you should use them: Input tags are one way of...…
GPU VRAM only for small models with llama.cpp: is it possible?
The C++ Standard Library Has Been Walking Itself Back for Fifteen Years
# The C++ standard library has been walking itself back for fifteen years, and the receipts are public Sandor Dargo's [post this month on `std::copyable_function`](https://www.sand…
llama.cpp server has built-in native tools (exec_shell, edit_file, etc.)
I ditched LM Studio for llama.cpp, and my local LLM doesn't feel like a downgrade anymore
My new main runner…
Interesting talk about profiles semantics and mechanisms in using std::cpp 2025
LLaMa.cpp basic question
Seeking resources to read about llama.cpp server and how offloading works
Building SQLite from Scratch: 740 Lines of C++23 to Understand Every Byte of a .db File
You fire up a MySQL client, connect to port 3306, send off your SQL, and the server parses,...…
CPPIB earned 7.8% for year on boosts from stocks, energy and infrastructure investments
Fund’s performance was dragged down by losses on foreign currency and fell far short of 13.2% benchmark…
The bloated CPP Investment Board is trounced by its own benchmarks – again
For two decades, managers have tried – and failed – to beat the markets…
CPP Investments earns 7.8% for fiscal 2026 helped by holdings in public equities
Real assets, particularly energy and infrastructure assets, also contributed to the gains…
110 tok/s with 12GB VRAM on Qwen3.6 35B A3B and ik_llama.cpp
Is cppreference.com compiler support up to date again?
CPPIB sells portfolio of private equity fund interests for about $4-billion
Blackstone Strategic Partners and private investment firm Ardian acquire interests from the pension manager…
Gemma 4 MTP with LlamaCPP
cpp-linter-hooks: The Most Complete pre-commit Solution for C/C++ Projects
If you work on Python projects, you've probably used pre-commit — running black, ruff, mypy before...…
Building Claude Code from Scratch: A Minimal Agent in 393 Lines of C++
An AI coding assistant that reads your files, writes code, and runs shell commands. The core logic? A...…
Ollama vs llama.cpp vs vLLM: Which Should You Use in 2026?
Ollama vs llama.cpp vs vLLM compared — ease of use, speed, GPU needs. Which inference engine is right for your workflow?…
Benchmarking llama.cpp's new MTP support on Strix Halo
After llama.cpp merged Multi-Token Prediction (MTP) speculative decoding support, I benchmarked Qwen3.6 27B and 35B-A3B on Strix Halo and an RTX 3090. Up to 2.44× speedup, lossless…
Is the llama.cpp nixos flake just broken?
CppCon 2026 Hudson River Trading Scholarship
Why MTP doesn't speed up your llama.cpp inference (and how to actually fix it)
Why MTP often fails to speed up llama.cpp inference, and how to debug acceptance rate, VRAM pressure, and CUDA graph capture issues.…
CppCast: GPU Programming and HLSL with Chris Bieneman
Find bugs in YOUR code using OpenCode, Llama.cpp and Qwen3.6
Background: For quite some time I had been submitting tasks to LLMs via llama-cli (natively) or llama-server (API), both from the excellent...…
Building and Running Llama.cpp on an Air-Gapped Mac
If you have ever tried to run llama.cpp on a macOS device that doesn't have internet access, you've...…
GPU Hardware & Driver Update: RTX 5090 Benchmarks, llama.cpp MTP, Windows 11 Fix
llama: avoid copying logits during prompt decode in MTP by am17an · Pull Request #23198 · ggml-org/llama.cpp
Building a 3D engine from scratch with C++ and Vulkan for web developers Part II: Rendering your first triangle
Where we left off: In Part I, we set up the Vulkan infrastructure: instance, window,...…