WeSearch
TAG · #QWEN

Qwen coverage.

Every story in the WeSearch catalog tagged with #qwen, chronological, with view counts. Subscribe to the per-tag RSS feed to follow this topic in your reader of choice.

48 stories tagged with #qwen, in publish-time order across the WeSearch catalog. Tag pages update as new stories are ingested.

RSS feed for this tag → or search "Qwen"

RELATED TAGS
#ai (5) · #ollama (5) · #local-llms (3) · #qwen3-6-27b (2) · #claude (2) · #anthropic (2) · #qwen3 (2) · #ml (2) · #minecraft (2) · #coding-model (1) · #open-weight (1) · #llama-cpp (1)
REDDIT

Qwen 3.6 35b a3b Q4 vs qwen 3.6 27b q6, on m5 pro 64gb

Tried testing the two model versions on my own M5 Pro 64GB and curated the results with Claude; not an expert, so settings/config might not be the best. Do share what results or improv…

21 views ·
LOCALLLAMA

Quant Qwen3.6-27B on 16GB VRAM with 100k context length

I have experimented with running Qwen3.6-27B on my laptop with an A5000 16GB GPU. I created my own IQ4_XS GGUF "qwen3.6-27b-IQ4_XS-pure.gguf" with the Unsloth imatrix and compar…

18 views ·
REDDIT

Qwen3.6-35B-A3B KLDs - INTs and NVFPs

KLD for INTs and NVFP4s. AS ALWAYS - Use Case is important. Accuracy versus speed versus native kernels on your GPUs. Things to note again: This is done in VLLM, with REAL logits. …

17 views ·
THE REGISTER

Usage-based pricing killing your vibe - here's how to roll your own local AI coding agents

Take those token limits and shove them by vibe coding with a local LLM. With model devs pushing more aggressive rate limits, raising prices, or even abandoning subscriptions for usa…

8 views ·
#ai coding · #local llms · #usage-based pricing
LOCALLLAMA

Been using Qwen-3.6-27B-q8_k_xl + VSCode + RTX 6000 Pro As Daily Driver

So in response to the Great Token Reckoning of 2026, I decided to try out Qwen 3.6 as a daily driver, and although it's only been about a day, I have to say I'm thoroughly impresse…

6 views ·
LOCALLLAMA

Qwen3.6-27B-NVFP4 - images

Model: Abiray-Qwen3.6-27B-NVFP4.gguf Specs: - Legion 7i Gen10 - NVIDIA GeForce RTX™ 5090 - Intel® Core™ Ultra 9 275HX × 24 - RAM 32.0 GiB llamacpp settings: ./build/bin/llama-serve…

5 views ·
LOCALLLAMA

"LLM is created so engineer don't have to write a report", anyway found out ONLYOFFICE can connect to OpenAI compatible, using Qwen 3.6 to do elaboration.

It is a plugin made for ONLYOFFICE, much simpler than copy-pasting from the web UI. PS: Switch to non-thinking/reasoning when using this, and the best model for this is the Gemma lineup. even…

5 views ·
LOCALLLAMA

Have Qwen said anything about further Qwen 3.6 models?

Have Qwen hinted at whether other models (9B, 122B, 397B) would be getting the 3.6 treatment? Or have they in any way confirmed or hinted at "this is it"? Genuinely curious if I mi…

9 views ·
LOCALLLAMA

Qwen3.6-27B at 72 tok/s on RTX 3090 on Windows using native vLLM (no WSL, no Docker), portable launcher and installer

The angle here is native Windows, no WSL. Simple installation, open source, no telemetry. Not selling or promoting anything: Numbers (RTX 3090, Windows 10): - 72 tok/s short prompt…

6 views ·
LOCALLLAMA

We are finally there: Qwen3.6-27B + agentic search; 95.7% SimpleQA on a single 3090, fully local

LDR maintainer here. Thanks to the strong support of r/LocalLLaMA community LDR got very far. I haven't reported in a while because I thought I was not ready for another prominent …

3 views ·
LOCALLLAMA

What's your tps on 3090 + Qwen 3.6 27B in real tasks?

I struggle to wrap my head around all this. My goal is local agent to solve low complexity tasks, in the same harness where I would use frontier models. So naturally this means a l…

1 view ·
HACKER NEWS: FRONT PAGE

Show HN: Hollow is an open-sourced self-modifying agentic system

10 views ·
#ai · #open source · #self-modifying systems
DEV COMMUNITY

Stuck in the Birch Log Blues 🪵😩

Alright folks, buckle up. The last four hours with Kiwi-chan have been… a journey. A repetitive, birch-log-less journey. As you can see from the logs, we're stuck in a loop of "gat…

7 views ·
#ai · #minecraft · #web3
DEV.TO (TOP)

Tenacious-Bench: Building a Sales Domain Evaluation Benchmark When No Dataset Exists

The Gap: General-purpose LLM benchmarks like τ²-Bench evaluate task completion in retail…

4 views ·
#machine learning · #llm evaluation · #sales automation
PRISMML

Bonsai: The First Commercially Viable 1-Bit LLM

Today, we are announcing 1-bit Bonsai models that bring advanced intelligence to the devices where people actually live and work.…

9 views ·
#ai efficiency · #edge computing · #model compression
DEV.TO (TOP)

Function Calling with Ollama: Make Your Local LLM Run Real Tools

Most Ollama tutorials end…

5 views ·
#ai · #ollama · #typescript
LOCALLLAMA

Got DFlash speculative decoding working on Qwen3.5-35B-A3B with an RTX 2080 SUPER 8GB

I managed to get DFlash speculative decoding working in llama.cpp on a pretty VRAM-limi…

5 views ·
LOCALLLAMA

Qwen3.6-27B - Closed-loop SVG Images

Yesterday, I saw an impressive presentation of Qwen 3.6 27B's SVG capabilities on the sub. To maximize the model's capabilities in terms of SVG generation, I put together a closed…

5 views ·
XDA

I replaced ChatGPT and Claude with this powerful local LLM and saved over $20 a month while gaining full control

Qwen3.6 runs on my old GPU and does what ChatGPT does for free…

6 views ·
#local llm · #ai privacy · #cost savings
XDA

I replaced Claude Pro with a local 9B model for a week, and finally found out what I was paying $20 a month for

The gap was smaller than I expected…

8 views ·
#ai models · #local llms · #claude pro
R/LOCALLLAMA

Follow-up: Qwen3.6-27B on 1× RTX 3090 — pushing to ~218K context + ~50–66 TPS, tool calls now stable (PN12 fix)

8 views ·
DEV.TO (TOP)

Your AI summarizer is leaking its own chain-of-thought. Here's the 30-line fix.

I caught my own production summarization API doing something embarrassing today, and I think yours...…

6 views ·
#ai · #debugging · #python
QWEN

Qwen-Scope: Official Sparse Autoencoders (SAEs) for Qwen 3.5 Models

Qwen Studio offers comprehensive functionality spanning chatbot, image and video understanding, image generation, document processing, web search integration, tool utilization, and…

6 views ·
X (FORMERLY TWITTER)

Post-trained Qwen3-Coder with a debugger: 70% → 89% solve rate, 59% fewer turns

8 views ·
LOCALLLAMA

Qwen 3.6-35B-A3B KV cache bench: f16 vs q8_0 vs turbo3 vs turbo4 from 0 to 1M context on M5 Max

Took TheTom's TurboQuant Metal fork of llama.cpp (github.com/TheTom/llama-cpp-turboquant, the feature/turboquant-kv-cache branch) and ran a depth sweep on Qwen 3.6-35B-A3B Q8. TheT…

9 views ·
STABLEDIFFUSION

Ernie VS Qwen and ZiT - Big Test

A large test of 100 images in a gallery. Big image generator showdown: 100 prompts, 3 models, 1 winner. This comparison brings together three open image models with very different s…

10 views ·
REDDIT

Qwen 3.6 27B BF16 vs Q4_K_M vs Q8_0 GGUF evaluation

10 views ·
REDDIT

AMD Radeon RX 6900 XT - ROCm vs Vulkan - Gemma 4 and Qwen 3.5 speed benchmarks

Did some quick tests after building llama.cpp with ROCm 6.4.2 and latest Vulkan for my 6900 XT. gemma4 E2B Q4_K — ubatch | ROCm pp512 | Vulkan pp512 | ROCm tg128 | Vulkan tg128 — 32 | 1536.60 | 142…

10 views ·
LOCALLLAMA

[7900XT] Qwen3.6 27B for OpenCode

I'm just looking for some advice on optimally setting up Qwen3.6 27B for OpenCode. The VRAM is a little bit scarce, but I ended up with this so far: llama-server --model models/Qwe…

11 views ·
WILLIAMANGEL

Offline Agentic Coding

Offline Agentic Coding: Ollama and Claude code…

5 views ·
#ai · #llms · #agents
REDDIT

GBNF grammar tweak for faster Qwen3.6 35B-A3B and Qwen3.6 27B

Hi folks, Enjoy an optimised Qwen3.6 35B-A3B and Qwen3.6 27B for coding and general purpose - it's able to solve puzzles correctly more often too. The initial intent was to optimis…

9 views ·
REDDIT

Used a Claude Code skill to fine-tune Qwen3-1.7B from 327 noisy traces, matches GLM-5

Had 327 production traces from a restaurant-reservation agent I wanted to retrain. The plan was to fine-tune a smaller self-hostable model so I could ditch the frontier-API bill. T…

12 views ·
REDDIT

Luce DFlash: Qwen3.6-27B at up to 2x throughput on a single RTX 3090

Hey fellow Llamas, your time is precious, so I'll keep it short. We built a GGUF port of DFlash speculative decoding. Standalone C++/CUDA stack on top of ggml, runs on a single 24 …

46 views ·
LOCALLLAMA

Simple to use vLLM Docker Container for Qwen3.6 27b with Lorbus AutoRound INT4 quant and MTP speculative decoding - 118 tokens/second on 2x 3090s

8 views ·
REDDIT

Agents for end-to-end document redaction and review tasks (OCR and PII identification - Qwen 3.6 vs closed-source comparison)

(Links to all files, apps, and repos mentioned in this post can be found in the 'full post' link at the bottom.) Document redaction ta…

9 views ·
REDDIT

Qwen3.6-27B-3bit-mlx · Hugging Face: 3 & 5 mixed quant for RAM poor Mac users.

Just dropped a 3bit mixed quant (5bit for embeds and prediction layers) for Mac users. There was only one 3 bit version of this model (from Unsloth), but it was very heavy and pain…

10 views ·
REDDIT

Brief Ngram-Mod Test Results - R9700/Qwen3.6 27B

Decided to try out the new --spec-type ngram-mod feature in llama.cpp using Qwen3.6 27B during an OpenCode bug chasing session. TLDR: Performance is variable, but so far it seems t…

11 views ·
REDDIT

Switched from Qwen3.6 35b-a3b to Qwen3.6 27b mid coding and it's noticeably better!

A bit of context. I was coding up a little html tower defense game where you can alter the path by placing additional waypoints. My setup: 32gb ram with 16gb vram 5070 ti. Using Ae…

12 views ·
SIMON WILLISON'S WEBLOG

Qwen3.6-27B: Flagship-Level Coding in a 27B Dense Model

Big claims from Qwen about their latest open weight model: Qwen3.6-27B delivers flagship-level agentic coding performance, s…

14 views ·
#qwen3.6-27b · #coding model · #open-weight
REDDIT

Qwen3.6-27B-INT4 clocking 100 tps with 256k context length on 1x RTX 5090 via vllm 0.19

Thanks to the community the Qwen3.6-27B speed keeps getting better. The following improves upon my recipe from yesterday and delivered a whopping 100+ tps (TG). Model: - MTP suppor…

10 views ·
REDDIT

Qwen3.6 35B A3B Heretic (KLD 0.0015!) Incredible model. Best 35B I have found!

Been using this for a few days. It is BY FAR the best uncensored model I have found for Qwen 3.6 35B. With IQ4XS, Q8 KVcache, 262K context, it fits in 24GB of VRAM and does not fai…

10 views ·
REDDIT

Qwen 3.6 27B in Claude Code says it will do something then stops and prompts for user reply (not failing a tool call)

I'm running Qwen/Qwen3.6-27B-FP8 via vLLM using this command: vllm serve Qwen/Qwen3.6-27B-FP8 --tensor-parallel-size 4 --gpu-memory-utilization 0.95 --max-num-seqs 8 \ --enable-aut…

11 views ·
REDDIT

Qwen3.5/3.6 Coder?

With practically all of LocalLlama glazing Qwen 3.5/3.6 for its coding skills, and with Alibaba themselves focusing on making Qwen a reliable coding agent, doe…

94 views ·
REDDIT

[Qwen3.6 35b a3b] Used the top config for my setup 8gb vram and 32gb ram, and found that somehow the Q4_K_XL model from Unsloth runs just slightly faster and used less tokens for output compared to Q4_K_M despite more memory usage

Config: CtxSize 131,072 · GpuLayers 99 · CpuMoeLayers 38 · Threads 16 · BatchSize/UBatchSize 4096/4096 · CacheType K/V q8_0 · Tool Context: file mode (tools.kilocode.official.md) · Metric M…

8 views ·
REDDIT

Qwen3.6-27B at ~80 tps with 218k context window on 1x RTX 5090 served by vllm 0.19

Qwen3.6-27B has been out for a few days, and the NVFP4 with MTP dropped earlier on HF: you can follow the same recipe I used for Qwen3.5-27B to achieve ~80 tps on a single RTX 5090 at 218k…

10 views ·
REDDIT

Field report: coding with Qwen 3.6 35B-A3B on an M2 Macbook Pro with 32GB RAM

TL;DR: I finally have this working and doing real work within the tight specs of my 32GB RAM Mac. So for those who would like to fly like Julien Chaumond, here's an updated HOW-TO…

9 views ·
REDDIT

Qwen3.6 35b a3b Particle System

Started testing Qwen3.6 35b a3b. I let it code a particle system with my Pi Agent. It made just one little ValueError, but I was impressed by how fast it got it right. Which task are y…

10 views ·