16 results for "open weight"
US startup Poolside debuts its first open-weight model, Laguna XS.2, a 33B-A3B-parameter MoE model, and Laguna M.1, a proprietary 225B-A23B-parameter MoE model (Carl Franzen/VentureBeat)
Open Weights Kill the Moat
American capital financed AI on the assumption it would be the next great monopoly. Open-weight models are commoditizing the capability that monopoly was supposed to protect. The collision between the…
open models keep catching up and the frontier keeps moving. at some point one of those has to stop
a year ago there was a clear tier gap. now i'm less sure, but not in the way i expected. the tasks where open-weight models have genuinely caught up are real: coding assistance, summarization, instruc…
Xiaomi releases MiMo-v2.5 Family weights with strong coding and agent benchmarks
Peking University gives its computer science students a compiler project every semester. Build a complete SysY compiler in Rust including lexer, parser, abstract syntax tree, IR code generation, assem…
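The pipeline described in that assignment starts with a lexer. A minimal sketch of that first stage in Python (the real project is in Rust, and the token classes here are illustrative assumptions, not the actual SysY grammar):

```python
import re

# Toy lexer for a C-like subset in the spirit of SysY. The token classes and
# keyword set are illustrative only; a real SysY lexer follows the full spec.
TOKEN_SPEC = [
    ("INT",   r"\d+"),
    ("IDENT", r"[A-Za-z_]\w*"),
    ("OP",    r"[+\-*/=;(){}<>!,]"),
    ("SKIP",  r"\s+"),
]
TOKEN_RE = re.compile("|".join(f"(?P<{n}>{p})" for n, p in TOKEN_SPEC))

KEYWORDS = {"int", "return", "if", "else", "while", "void", "const"}

def lex(src: str):
    """Return (kind, text) pairs; identifiers matching keywords are re-tagged."""
    tokens = []
    for m in TOKEN_RE.finditer(src):
        kind, text = m.lastgroup, m.group()
        if kind == "SKIP":
            continue
        if kind == "IDENT" and text in KEYWORDS:
            kind = "KEYWORD"
        tokens.append((kind, text))
    return tokens

print(lex("int main() { return x + 1; }"))
```

The parser, AST, and IR stages then consume this token stream; each later stage is where most of the semester's work actually goes.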
AI heavyweights’ court battle could unravel the entire sector
Elon Musk is suing to make OpenAI a non-profit again and remove Sam Altman as its CEO. If he succeeds, the complex web of deals between the world’s AI companies could fall apart.…
Show HN: Utilyze – an open source GPU monitoring tool more accurate than nvtop
The standard GPU utilization metric reported by nvidia-smi, nvtop, Weights & Biases, Amazon CloudWatch, Google Cloud Monitoring, and Azure Monitor is highly misleading. It reports the fraction of time…
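The gap that snippet points at can be shown with a toy model: the time-based metric counts any sampling interval with at least one resident kernel as "busy", even if that kernel touches a single SM. The function names below are illustrative, not the NVML API:

```python
# Toy model of why time-based "GPU utilization" misleads. The standard metric
# is (roughly) the fraction of the window during which *any* kernel was
# running, regardless of how many SMs that kernel actually occupied.

def time_based_utilization(kernel_intervals, window):
    """Fraction of the window covered by at least one kernel interval."""
    covered, t = 0.0, 0.0
    for start, end in sorted(kernel_intervals):
        start, end = max(start, t), min(end, window)
        if end > start:
            covered += end - start
            t = end
    return covered / window

def sm_occupancy(kernel_intervals, sms_used, total_sms, window):
    """Time-weighted fraction of SMs doing work over the window."""
    busy = sum((e - s) * u for (s, e), u in zip(kernel_intervals, sms_used))
    return busy / (total_sms * window)

# One kernel runs the whole 1.0 s window but uses 1 of 108 SMs:
intervals = [(0.0, 1.0)]
print(time_based_utilization(intervals, 1.0))   # reported as fully "utilized"
print(sm_occupancy(intervals, [1], 108, 1.0))   # under 1% of actual compute
```

Per-SM or occupancy-based counters (e.g. via profiling tools) close this gap, which is presumably the accuracy claim Utilyze is making.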
Architectural Requirements for Agentic AI Containment
The April 2026 disclosure that a frontier large language model escaped its security sandbox, executed unauthorized actions, and concealed its modifications to version control history demonstrates that…
Analytica: Soft Propositional Reasoning for Robust and Scalable LLM-Driven Analysis
Large language model (LLM) agents are increasingly tasked with complex real-world analysis (e.g., in financial forecasting, scientific discovery), yet their reasoning suffers from stochastic instabili…
GSAR: Typed Grounding for Hallucination Detection and Recovery in Multi-Agent LLMs
Autonomous multi-agent LLM systems are increasingly deployed to investigate operational incidents and produce structured diagnostic reports. Their trustworthiness hinges on whether each claim is groun…
Agentic AI platforms for autonomous training and rule induction of human-human and virus-human protein-protein interactions
We instruct an AI agent to construct two separate agentic AI platforms: one for autonomous training of predictive ML models for human-human and virus-human PPI, and the other for inducing explicit gen…
A2DEPT: Large Language Model-Driven Automated Algorithm Design via Evolutionary Program Trees
Designing heuristics for combinatorial optimization problems (COPs) is a fundamental yet challenging task that traditionally requires extensive domain expertise. Recently, Large Language Model (LLM)-b…
Anthropic's Claude remote uses GLM-4.7
I just noticed this after a bug wasn't getting fixed. If you start a Claude Code remote environment, the default model (hidden on mobile) is GLM 4.7. I assumed Anthropic only used their own models for e…
Qwen3.6-27B: Flagship-Level Coding in a 27B Dense Model
Big claims from Qwen about their latest open weight model: Qwen3.6-27B delivers flagship-level agentic coding performance, surpassing the previo…
DeepSeek V4 - almost on the frontier, a fraction of the price
Chinese AI lab DeepSeek's last model release was V3.2 (and V3.2 Speciale) last December . They just dropped the first of their hotly anticipated V4 series in the shape of two preview models, DeepSeek-…
Introducing AutoMuon, a one-line drop-in for AdamW [P]
Hey everyone, I've been working on a small Python package called AutoMuon that makes the Muon optimizer usable as a drop-in replacement for AdamW in arbitrary PyTorch training pipelines. The core idea…
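For context on what Muon itself does: its core move is to replace the raw momentum matrix with an approximately orthogonalized version, computed with a few Newton-Schulz iterations instead of an SVD. A NumPy sketch of that orthogonalization step, using the commonly published quintic coefficients (this illustrates the idea only; AutoMuon's actual API and internals are not shown):

```python
import numpy as np

# Sketch of Muon's orthogonalization step: drive the singular values of a
# 2D update matrix toward 1 via Newton-Schulz iteration, avoiding an SVD.
# Coefficients are the widely used quintic variant; details are assumptions.

def newton_schulz_orthogonalize(G: np.ndarray, steps: int = 5) -> np.ndarray:
    a, b, c = 3.4445, -4.7750, 2.0315
    X = G / (np.linalg.norm(G) + 1e-7)   # normalize so singular values <= 1
    transposed = X.shape[0] > X.shape[1]
    if transposed:                        # iterate on the wide orientation
        X = X.T
    for _ in range(steps):
        A = X @ X.T
        X = a * X + (b * A + c * A @ A) @ X
    return X.T if transposed else X

# A matrix with spread-out singular values (0.5, 1.0, 2.0) comes out with
# singular values clustered near 1:
G = np.diag([0.5, 1.0, 2.0])
print(np.linalg.svd(newton_schulz_orthogonalize(G), compute_uv=False))
```

A drop-in wrapper like the one described would then apply this to 2D weight matrices while falling back to AdamW-style updates for embeddings, norms, and other non-matrix parameters.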
ast-outline: a parallel structural code summarizer written in Rust (5–10x token savings for LLM agents)
I just open-sourced ast-outline – a fast, zero-dependency CLI tool that extracts the structural outline of source files (classes, functions, signatures, fields, doc comments + line numbers) and drops …
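The idea generalizes beyond the tool itself: for Python sources alone, a rough equivalent of the outline extraction can be sketched with the stdlib ast module (this is not ast-outline's actual Rust implementation or output format):

```python
import ast

def outline(source: str) -> list[str]:
    """Return 'kind name @line' entries for classes and function definitions."""
    entries = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            args = ", ".join(a.arg for a in node.args.args)
            entries.append(f"def {node.name}({args}) @{node.lineno}")
        elif isinstance(node, ast.ClassDef):
            entries.append(f"class {node.name} @{node.lineno}")
    return entries

src = '''
class Greeter:
    """Say hello."""
    def greet(self, name):
        return f"hi {name}"
'''
print("\n".join(outline(src)))
```

An agent that receives only these signature-plus-line-number entries, rather than whole files, is where the claimed token savings would come from.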