WeSearch

Results for "vram".

12 stories match your query across our 700+ source catalog. Ranked by relevance and recency.


TOM'S GUIDE

Nvidia RTX 5070 laptop GPU officially has 12GB of VRAM — and it’s about time

Nvidia has officially announced the RTX 5070 laptop GPU with 12GB of GDDR7 VRAM. This could be a huge win for mid-range gaming laptops.…

· 4 views
REDDIT

To 16GB VRAM users, plug in your old GPU

For those who want to run the latest dense ~30B models and only have 16GB of VRAM: if you have an old card with 6GB of VRAM or more, plug it in. What matters is that everything fits in VRAM, even across two cards. Even…

· 6 views
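The dual-card suggestion above is essentially a capacity check: do the quantized weights fit in the combined VRAM? A minimal sketch, where the bits-per-weight figure and the per-card overhead are illustrative assumptions, not measurements:

```python
# Hypothetical fit check for splitting a model's weights across GPUs,
# as the post suggests (a 16GB main card plus an old 6GB card).
def fits_on_gpus(param_count_b, bits_per_weight, vram_gb, overhead_gb=1.0):
    """Rough check: weights only, ignoring KV cache and activations.

    param_count_b: parameters in billions; vram_gb: list of per-card VRAM.
    overhead_gb is an assumed per-card reservation for buffers/driver.
    """
    weight_gb = param_count_b * bits_per_weight / 8  # 1B params @ 8 bits = 1 GB
    usable_gb = sum(vram_gb) - overhead_gb * len(vram_gb)
    return weight_gb <= usable_gb

# A dense ~30B model at ~4.5 bits/weight (Q4_K_M-class quant):
print(fits_on_gpus(30, 4.5, [16]))      # → False (single 16GB card)
print(fits_on_gpus(30, 4.5, [16, 6]))   # → True  (16GB + old 6GB card)
```

With these assumed numbers the weights alone are ~16.9 GB, which is why the extra 6GB card tips a 30B model from "doesn't fit" to "fits".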
REDDIT

VRAM.cpp: Running llama-fit-params directly in your browser

Lots of people are always asking on this subreddit if their system can run a certain model. A lot of the "VRAM calculators" that I've found only provide either very rough estimates or are severely lim…

· 8 views
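A calculator of this kind typically sums quantized weight size and KV-cache size. A rough sketch of that arithmetic; the model shape below (layers, KV heads, head dimension) is a hypothetical example, not any specific model's real config:

```python
# Rough VRAM estimate of the kind such calculators compute:
# quantized weights plus KV cache. All model numbers are illustrative.
def kv_cache_gb(n_layers, n_kv_heads, head_dim, ctx_len, bytes_per_elem=2):
    # K and V each store n_layers * n_kv_heads * head_dim values per token
    return 2 * n_layers * n_kv_heads * head_dim * ctx_len * bytes_per_elem / 1e9

def total_vram_gb(params_b, bits_per_weight, n_layers, n_kv_heads, head_dim, ctx_len):
    weights_gb = params_b * bits_per_weight / 8
    return weights_gb + kv_cache_gb(n_layers, n_kv_heads, head_dim, ctx_len)

# Example: a 27B-class model (assumed shape: 48 layers, 8 KV heads,
# head_dim 128) at ~4.25 bits/weight with 32k context and fp16 cache:
print(round(total_vram_gb(27, 4.25, 48, 8, 128, 32768), 1))  # → 20.8
```

This ignores activation buffers and runtime overhead, which is one reason simple calculators only give rough estimates.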
TOM'S HARDWARE

Nvidia quietly launches 12GB RTX 5070 laptop GPU — midrange mobile gaming gets more VRAM amid the RAMpocalypse

The new model will use 3GB modules, so memory bandwidth should stay close to the RTX 5070 8GB mobile part.…

· 3 views
REDDIT

[Qwen3.6 35b a3b] Used the top config for my setup (8GB VRAM and 32GB RAM) and found that the Q4_K_XL model from Unsloth somehow runs slightly faster and uses fewer tokens for output than Q4_K_M, despite higher memory usage

Config: CtxSize 131,072 · GpuLayers 99 · CpuMoeLayers 38 · Threads 16 · BatchSize/UBatchSize 4096/4096 · CacheType K/V q8_0 · Tool Context: file mode (tools.kilocode.official.md). Metric / M Model / XL Model / Diff…

· 5 views
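The settings listed in the post map onto llama-server flags roughly as follows. A hedged sketch, assuming a recent llama.cpp build; the model filename is a placeholder:

```shell
# Sketch of the posted config as llama-server flags (recent llama.cpp).
llama-server \
  --model Qwen3.6-35B-A3B-Q4_K_XL.gguf \
  --ctx-size 131072 \
  --gpu-layers 99 \
  --n-cpu-moe 38 \
  --threads 16 \
  --batch-size 4096 --ubatch-size 4096 \
  --cache-type-k q8_0 --cache-type-v q8_0
```

`--n-cpu-moe` keeps the expert (MoE) layers' weights on the CPU while attention stays on the GPU, which is how an a3b-style model can run with only 8GB of VRAM.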
LOCALLLAMA

Quant Qwen3.6-27B on 16GB VRAM with 100k context length

I have experimented with running Qwen3.6-27B on my laptop with an A5000 16GB GPU. I created my own IQ4_XS GGUF, "qwen3.6-27b-IQ4_XS-pure.gguf", with the Unsloth imatrix and compared the mean KLD of i…

· 6 views
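Mean KLD, as used in comparisons like this, measures how far the quant's next-token distribution drifts from the reference model's, averaged over positions. A self-contained sketch with toy distributions (not actual model logits):

```python
import math

# Sketch: mean KL divergence of a quant's token distributions from a
# reference model's, averaged over positions. Distributions are toy data.
def kld(p, q, eps=1e-12):
    """KL(p || q) for two probability distributions over the vocabulary."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))

def mean_kld(ref_dists, quant_dists):
    return sum(kld(p, q) for p, q in zip(ref_dists, quant_dists)) / len(ref_dists)

ref = [[0.7, 0.2, 0.1], [0.5, 0.3, 0.2]]
quant = [[0.68, 0.21, 0.11], [0.5, 0.3, 0.2]]
print(mean_kld(ref, ref))    # → 0.0 (identical distributions)
print(mean_kld(ref, quant))  # small positive value
```

A mean KLD near zero (the post's headline figure is 0.0015 for a different model) means the quant's predictions are nearly indistinguishable from full precision.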
LOCALLLAMA

[7900XT] Qwen3.6 27B for OpenCode

I'm just looking for some advice on optimally setting up Qwen3.6 27B for OpenCode. The VRAM is a little bit scarce, but I ended up with this so far: llama-server --model models/Qwen3.6-27B-IQ4_XS.gguf…

· 5 views
LOCALLLAMA

AMD GPUs are faster at prefilling

I gave the same prompt and the same document to a 1660 Ti running Gemma 4 e2b q4 (because of its small VRAM) and to an iGPU running Gemma 4 e4b q8; the prefill rate before token generation was about 4-5 times faste…

· 3 views
REDDIT

Switched from Qwen3.6 35b-a3b to Qwen3.6 27b mid coding and it's noticeably better!

A bit of context: I was coding up a little HTML tower-defense game where you can alter the path by placing additional waypoints. My setup: 32GB RAM with a 16GB-VRAM 5070 Ti. Using AesSedai/Qwen3.6-35B-A…

· 11 views
REDDIT

Qwen3.6 35B A3B Heretic (KLD 0.0015!) Incredible model. Best 35B I have found!

Been using this for a few days. It is BY FAR the best uncensored model I have found for Qwen 3.6 35B. With IQ4XS, Q8 KVcache, 262K context, it fits in 24GB of VRAM and does not fail on multi turn tool…

· 7 views
REDDIT

Hardware Choice for 27b to 31b models.

I've come to a point where I find the 27b and 31b models quite impressive. I have a 16 GB AMD Radeon 7800xt. It performs quite well. It was $700. Here is my question: Is the dual GPU approach performa…

· 7 views
REDDIT

(Linux) Has anyone succeeded in using NVMe space as substitute RAM for larger models? Is it worthwhile?

So I have a consumer-grade AMD GPU with 24GB of VRAM and 64GB of DDR5 RAM, which have served me well enough for models up to around 120B. Of course, this just isn't enough for larger models in the 300B+ rang…

· 6 views
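One common way to do what the post asks about is plain Linux swap on the NVMe drive, letting mmap'd model weights spill past physical RAM. A hedged sketch using standard Linux tooling; the path and size are illustrative:

```shell
# Back system RAM with NVMe swap so memory-mapped model weights can
# spill past 64GB of physical RAM. Path and size are placeholders.
sudo fallocate -l 256G /nvme/swapfile
sudo chmod 600 /nvme/swapfile
sudo mkswap /nvme/swapfile
sudo swapon /nvme/swapfile
# llama.cpp mmaps GGUF files by default, so pages not resident in RAM
# are demand-paged from disk. Expect token generation to slow sharply
# once the per-token working set exceeds physical RAM, since NVMe
# bandwidth is an order of magnitude below DDR5.
```

Whether it is "worthwhile" mostly comes down to that last point: prefill can stay tolerable, but generation speed collapses when every token touches weights that live on disk.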