21 stories tagged #distillation from across the WeSearch catalog, in order of publication. Tag pages update as new stories are ingested.
Distilling Stale Gasoline to Make it Usable Again
The propensity of gasoline to ‘go stale’ through the process of oxidation is the reason why gasoline that has been stored for a long period of time is considered to be unusable, as…
Skill Distillation
How a personal AI agent built on markdown skills lets a frontier model teach smaller, local models to do real work, without retraining.…
8-step FLUX.2-dev DMD2 distillation
How Model Distillation Actually Works (and What the 'China Distilled Our Model' Headlines Really Mean)
A practical, no-hype explainer of knowledge distillation in LLMs — the actual mechanics, why distilling from a closed API is different, and what the OpenAI/Anthropic vs DeepSeek al…
High Liquor Taxes and a Home Distillation Ban Guarantee a Thriving Booze Black Market
Between a home distillation ban and high liquor taxes, government officials have created the perfect conditions for a black market in distilled spirits.…
Counteraction-Aware Multi-Teacher On-Policy Distillation for General Capability Recovery with Domain Preservation
Domain specialization can improve LLM behavior in vertical domains, but often weakens the general capabilities inherited from the original model. Recent Multi-Teacher On-Policy Dis…
StepOPSD: Step-Aware Online Preference Distillation for Agent Reinforcement Learning
Reinforcement learning for multi-turn agents suffers from a credit-assignment mismatch: rewards are sparse and trajectory-level, while success often hinges on a few local decisions…
When Does Adaptive Guidance Help? Belief-Aware Privileged Distillation for Autonomous Driving Under Partial Observability
Guided Soft Actor-Critic (GSAC) distills knowledge from a privileged full-state teacher to a partial-observation student for autonomous driving, but uses a fixed distillation coeff…
PANDO: Efficient Multimodal AI Agents via Online Skill Distillation
Recent advances in multimodal web agents often rely on increased inference-time computation, including rollout search, verifier passes, offline skill discovery, and specialist mode…
EDGE-OPD: Internalizing Privileged Context with Evidence Guided On-Policy Distillation
On-Policy Distillation (OPD) has gained wide traction as an LLM post-training paradigm due to its effectiveness in improving capabilities without introducing model distribution d…
It Takes Two: Complementary Self-Distillation for Contextual Integrity in LLMs
Contextual Integrity (CI) defines privacy not merely as keeping information hidden, but as governing information flows according to the norms of a given context. As large language …
Consistently Informative Soft-Label Temperature for Knowledge Distillation
Knowledge distillation (KD) transfers knowledge from a high-capacity teacher to a compact student by matching their predictive distributions, with temperature scaling serving as a …
AVSD: Adaptive-View Self-Distillation by Balancing Consensus and Teacher-Specific Privileged Signals
Self-distillation enables language models to learn on-policy from their own trajectories by using the same model as both student and teacher, with the teacher being conditioned on …
PACD-Net: Pseudo-Augmented Contrastive Distillation for Glycemic Control Estimation from SMBG
Effective diabetes management requires continuous monitoring of glycemic levels. Clinically, glycemic control is assessed using metrics such as Time in Range (TIR), Time Below Rang…
What and When to Distill: Selective Hindsight Distillation for Multi-Turn Agents
Reinforcement learning can train LLM agents from sparse task rewards, but long-horizon credit assignment remains challenging: a single success-or-failure signal must be distributed…
From Sparsity to Simplicity: Enabling Simpler Sequential Replacements via Sparse Attention Distillation
Self-attention serves as the core foundation of large-scale transformer pretraining, but its quadratic token interaction cost makes inference expensive. Replacing attention with si…
SD-Search: On-Policy Hindsight Self-Distillation for Search-Augmented Reasoning
Search-augmented reasoning agents interleave internal reasoning with calls to an external retriever, and their performance relies on the quality of each issued query. However, unde…
AMR-SD: Asymmetric Meta-Reflective Self-Distillation for Token-Level Credit Assignment
The alignment of Large Language Models (LLMs) for complex reasoning heavily relies on Reinforcement Learning with Verifiable Rewards (RLVR). However, standard algorithms like GRPO …
DeltaPrompts: Escaping the Zero-Delta Trap in Multimodal Distillation
Distillation enables compact Vision-Language Models (VLMs) to obtain strong reasoning capabilities, yet the prompts driving this process are typically chosen via simple heuristics …
Towards Generalization of Block Attention via Automatic Segmentation and Block Distillation
Block attention, which processes the input as separate blocks that cannot attend to one another, offers significant potential to improve KV cache reuse in long-context scenarios su…
Self-Distillation Enables Continual Learning [PDF]
Continual learning, enabling models to acquire new skills and knowledge without degrading existing capabilities, remains a fundamental challenge for foundation models. While on-pol…