All of human cooking compressed into 2 megabytes
Researchers have developed Epicure, a family of three skip-gram ingredient-embedding models that compresses culinary knowledge from a multilingual corpus of 4.14 million recipes into just 2 megabytes. The study explores the relationships between ingredients and compounds through various graph-based approaches.
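A rough back-of-envelope check shows why a 2-megabyte footprint is plausible for a vocabulary this small. The sketch below asks how many dimensions per ingredient fit in that budget. Only the 1,790-entry vocabulary and the three-model count come from the source; the storage precision, the reading of "2 megabytes" as 2,000,000 bytes, and whether the figure covers one model or all three are assumptions, and `max_dimension` is a hypothetical helper.

```python
def max_dimension(budget_bytes: int, vocab_size: int,
                  bytes_per_value: int, n_models: int = 1) -> int:
    """Largest per-model embedding dimension whose raw weights fit in the budget."""
    return budget_bytes // (vocab_size * bytes_per_value * n_models)


BUDGET = 2_000_000  # "2 megabytes" read as decimal MB (assumption)
VOCAB = 1_790       # canonical ingredient entries (from the source)

# Assumption: all three sibling models share the budget, stored as float32.
float32_all_three = max_dimension(BUDGET, VOCAB, bytes_per_value=4, n_models=3)

# Assumption: same, but stored as float16.
float16_all_three = max_dimension(BUDGET, VOCAB, bytes_per_value=2, n_models=3)

# Assumption: the budget covers a single float32 model.
float32_single = max_dimension(BUDGET, VOCAB, bytes_per_value=4)

print(float32_all_three, float16_all_three, float32_single)
```

Under any of these readings the budget leaves room for embeddings in the range of roughly one hundred to a few hundred dimensions, which is typical for skip-gram models. The actual dimension is not given in the excerpt.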
- Epicure is a family of three sibling skip-gram ingredient embeddings retrained from scratch on a multilingual recipe corpus.
- The training corpus aggregates 4.14 million recipes from 11 sources across seven languages.
- Raw ingredient strings are normalized to 1,790 canonical entries using an LLM-augmented pipeline.
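The two steps named above, normalizing ingredient strings and feeding recipes to a skip-gram model, can be sketched in a few lines. This is a toy illustration, not the authors' pipeline: the `CANONICAL` alias table stands in for the LLM-augmented normalization, and treating each recipe as an unordered set in which every ingredient is context for every other is an assumption, since the excerpt does not describe the context window.

```python
from itertools import permutations

# Hypothetical alias table standing in for the LLM-augmented normalization.
CANONICAL = {
    "extra virgin olive oil": "olive oil",
    "evoo": "olive oil",
    "garlic cloves": "garlic",
    "minced garlic": "garlic",
    "roma tomatoes": "tomato",
}


def normalize(raw: str) -> str:
    """Map a raw ingredient string to its canonical entry (or itself if unknown)."""
    key = raw.strip().lower()
    return CANONICAL.get(key, key)


def skipgram_pairs(recipe: list[str]) -> list[tuple[str, str]]:
    """Return (target, context) training pairs for one recipe.

    Assumption: the recipe is an unordered set, so every distinct ingredient
    is paired with every other one.
    """
    # Normalize, then drop duplicates while keeping first-seen order.
    canonical = list(dict.fromkeys(normalize(item) for item in recipe))
    return list(permutations(canonical, 2))


recipe = ["EVOO", "Minced garlic", "Roma tomatoes", "garlic cloves"]
pairs = skipgram_pairs(recipe)
for target, context in pairs:
    print(target, "->", context)
```

Here four raw strings collapse to three canonical ingredients, giving six ordered pairs. A skip-gram model then learns vectors such that each target ingredient predicts its context ingredients, so ingredients that co-occur in similar recipes end up close together.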
Opening excerpt (first ~120 words)
Computer Science > Artificial Intelligence, arXiv:2605.22391 (cs), submitted on 21 May 2026
Title: Epicure: Navigating the Emergent Geometry of Food Ingredient Embeddings
Authors: Jakub Radzikowski, Josef Chen
Abstract: We present Epicure, a family of three sibling skip-gram ingredient embeddings retrained from scratch on a multilingual recipe corpus. We aggregate 4.14M recipes from 11 sources spanning seven languages, English, Chinese, Russian, Vietnamese, Spanish, Turkish, Indonesian, German, and Indian-English, and normalise the raw ingredient strings to 1,790 canonical entries via an LLM-augmented pipeline.
…
Excerpt limited to ~120 words for fair-use compliance. The full article is at arXiv.org.