Skill Distillation
Skill distillation is a method in which a frontier AI model teaches smaller, locally run models how to perform tasks. The system is layered: a local markdown knowledge base, atomic skill files, and an iterative agent loop. The smaller models execute the written procedures without needing to understand the evaluations used to grade them, which keeps the system efficient and adaptable.
- Skill distillation involves a frontier model teaching smaller models through markdown files.
- The system includes a local knowledge base, atomic skill files, and an iterative agent loop.
- This approach differs from classical knowledge distillation by focusing on procedural knowledge rather than just compressing model outputs.
Opening excerpt (first ~120 words)
May 29, 2026. AI Agents, Productivity.
I’ve been using state-of-the-art models to teach small models running on my computer how I work. My personal agent, based on Pi, runs my inbox, my deal pipeline, my blog publishing, my calendar, & my research. It looks less like a chatbot & more like a small operating system. The first layer is QMD, a local markdown knowledge base of about eighty workflow files in ~/memories. Before answering any procedural question, the agent searches QMD for the right playbook. The second layer is Skills, atomic SKILL.md files that describe one job each. The skills are written by a frontier model. So are the evaluations that grade them. The same system writes, tests, and rewrites each skill until accuracy converges.
…
Excerpt limited to ~120 words for fair-use compliance. The full article is at Tomasz Tunguz.
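The write, test, and rewrite loop described in the excerpt can be sketched roughly as follows. This is a minimal illustration, not the author's implementation: `distill_skill`, `accuracy`, `toy_rewrite`, and `toy_run_skill` are hypothetical names, the frontier model and the small model are replaced by toy stand-in functions, and "converges" is assumed here to mean that a rewrite improves accuracy by less than a small tolerance.

```python
from typing import Callable, List, Tuple

EvalCase = Tuple[str, str]  # (task input, expected output)


def accuracy(run_skill: Callable[[str, str], str],
             skill_text: str,
             cases: List[EvalCase]) -> float:
    """Fraction of evaluation cases the small model passes with this skill."""
    passed = sum(1 for task, expected in cases
                 if run_skill(skill_text, task) == expected)
    return passed / len(cases)


def distill_skill(draft: str,
                  rewrite: Callable[[str, float], str],
                  run_skill: Callable[[str, str], str],
                  cases: List[EvalCase],
                  tolerance: float = 0.01,
                  max_rounds: int = 10) -> Tuple[str, List[float]]:
    """Rewrite a skill until accuracy stops improving (assumed stopping rule)."""
    skill = draft
    best = accuracy(run_skill, skill, cases)
    history = [best]
    for _ in range(max_rounds):
        candidate = rewrite(skill, best)      # stand-in for the frontier model
        score = accuracy(run_skill, candidate, cases)
        history.append(score)
        gain = score - best
        if gain > 0:                          # keep only rewrites that help
            skill, best = candidate, score
        if gain < tolerance:                  # converged: no meaningful gain
            break
    return skill, history


# --- Toy stand-ins, purely hypothetical, so the sketch runs end to end ---
RULES = ["- uppercase the input", "- strip whitespace"]


def toy_rewrite(skill_text: str, score: float) -> str:
    """Pretend frontier model: appends the next missing rule, if any."""
    for rule in RULES:
        if rule not in skill_text:
            return skill_text + rule + "\n"
    return skill_text


def toy_run_skill(skill_text: str, task_input: str) -> str:
    """Pretend small model: follows whichever rules the skill file contains."""
    out = task_input
    if "uppercase" in skill_text:
        out = out.upper()
    if "strip" in skill_text:
        out = out.strip()
    return out


cases = [(" hi ", "HI"), ("ok", "OK"), (" a", "A"), ("B", "B")]
final_skill, history = distill_skill("# SKILL: normalize\n",
                                     toy_rewrite, toy_run_skill, cases)
print(history)
```

In this toy run the accuracy history climbs from 0.25 to 1.0 and the loop stops once a rewrite yields no further gain. A real setup would swap the two stand-ins for calls to an actual frontier model and a local model, and would likely hold out some evaluation cases to avoid overfitting the skill to its own tests.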