RecipeScape: An Interactive Tool for Analyzing Cooking Instructions at Scale
RecipeScape is an interactive tool that analyzes cooking instructions at scale by converting recipe text into tree structures using a computational pipeline involving POS tagging and human annotation. It calculates pairwise similarities between recipes using weighted tree edit distance based on semantic word embeddings and visualizes them via hierarchical clustering. The system gathers recipes from websites using schema.org's Recipe format and processes them to reveal procedural patterns across variations of the same dish. A distance matrix and coordinate mapping enable spatial representation of recipe relationships.
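The distance-matrix and clustering steps described above can be illustrated with a minimal sketch. It is not RecipeScape's implementation. The function names, the toy recipes, and the use of Jaccard distance over action words (standing in for the weighted tree edit distance) are all assumptions made for illustration, as is the choice of average linkage.

```python
from itertools import combinations

def distance_matrix(items, dist):
    """Build a symmetric pairwise distance matrix for any distance callable."""
    n = len(items)
    d = [[0.0] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        d[i][j] = d[j][i] = dist(items[i], items[j])
    return d

def average_linkage(d, num_clusters):
    """Agglomerative clustering: repeatedly merge the two closest clusters."""
    clusters = [[i] for i in range(len(d))]

    def linkage(a, b):
        # Average pairwise distance between the members of two clusters.
        return sum(d[i][j] for i in a for j in b) / (len(a) * len(b))

    while len(clusters) > num_clusters:
        a, b = min(
            combinations(range(len(clusters)), 2),
            key=lambda p: linkage(clusters[p[0]], clusters[p[1]]),
        )
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]  # b > a, so index a is unaffected
    return clusters

def jaccard_distance(x, y):
    """Hypothetical stand-in for the tree edit distance used by the tool."""
    x, y = set(x), set(y)
    return 1.0 - len(x & y) / len(x | y)

# Toy "recipes" reduced to their cooking actions (illustrative data only).
recipes = [
    ["mix", "bake", "cool"],
    ["mix", "bake", "frost"],
    ["boil", "drain", "toss"],
    ["boil", "drain", "serve"],
]
matrix = distance_matrix(recipes, jaccard_distance)
groups = average_linkage(matrix, num_clusters=2)
print(groups)
```

Any distance function with the same two-argument shape can be passed to `distance_matrix`, so a tree-based measure could replace the Jaccard stand-in without changing the clustering code.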
Computational Pipeline: The pipeline uses a part-of-speech (POS) tagger and human annotation to convert recipe text into a tree representation, then calculates pairwise distances to visualize the similarities between recipes.
Data Gathering: We crawl all search results for a queried dish, such as chocolate chip cookies or tomato pasta, from recipe websites that use schema.org’s Recipe schema.
Parsing: We use an off-the-shelf POS tagger and human annotation to parse the tokens of the crawled recipes. More detail is provided in the section below on the annotation interface.
Similarity Comparison: To obtain similarities between the recipes, we use tree edit distance, a commonly used technique for comparing tree structures.
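To make the similarity comparison step concrete, here is a minimal sketch of an ordered tree edit distance with a pluggable relabel cost. It uses the textbook forest recurrence with memoization rather than any optimized algorithm, and it is not the authors' implementation. The tree encoding, the `node` helper, the `TOY_EMBEDDINGS` vectors, and the cost functions are all hypothetical, chosen only to show how an embedding-based weight could make semantically close substitutions cheaper.

```python
import math
from functools import lru_cache

def node(label, *children):
    """A tree is a (label, children) pair; children is a tuple of trees."""
    return (label, tuple(children))

def tree_edit_distance(t1, t2, relabel_cost, indel_cost=1.0):
    """Ordered tree edit distance via the rightmost-root forest recurrence."""

    @lru_cache(maxsize=None)
    def forest_dist(f1, f2):
        if not f1 and not f2:
            return 0.0
        if not f2:  # delete the rightmost root of f1, promoting its children
            return forest_dist(f1[:-1] + f1[-1][1], f2) + indel_cost
        if not f1:  # insert the rightmost root of f2
            return forest_dist(f1, f2[:-1] + f2[-1][1]) + indel_cost
        (l1, k1), (l2, k2) = f1[-1], f2[-1]
        return min(
            forest_dist(f1[:-1] + k1, f2) + indel_cost,  # delete
            forest_dist(f1, f2[:-1] + k2) + indel_cost,  # insert
            # match the two rightmost roots, then solve children and the rest
            forest_dist(k1, k2) + forest_dist(f1[:-1], f2[:-1])
            + relabel_cost(l1, l2),
        )

    return forest_dist((t1,), (t2,))

def unit_cost(a, b):
    return 0.0 if a == b else 1.0

# Made-up 2-D vectors standing in for real word embeddings (assumption).
TOY_EMBEDDINGS = {"sugar": (1.0, 0.0), "honey": (0.8, 0.6), "butter": (0.0, 1.0)}

def embedding_cost(a, b):
    """Relabel cost as cosine distance; unknown words fall back to 1.0."""
    if a == b:
        return 0.0
    if a not in TOY_EMBEDDINGS or b not in TOY_EMBEDDINGS:
        return 1.0
    u, v = TOY_EMBEDDINGS[a], TOY_EMBEDDINGS[b]
    cos = sum(x * y for x, y in zip(u, v)) / (math.hypot(*u) * math.hypot(*v))
    return 1.0 - cos

cookie_a = node("bake", node("mix", node("flour"), node("sugar")))
cookie_b = node("bake", node("mix", node("flour"), node("honey")))

print(tree_edit_distance(cookie_a, cookie_b, unit_cost))
print(tree_edit_distance(cookie_a, cookie_b, embedding_cost))
```

With unit costs, swapping one ingredient counts as a full edit. With the embedding-based cost, the same swap is cheaper when the two words are close in the toy vector space, which is the intuition behind weighting the edit distance by semantic similarity.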
…
Excerpt limited to ~120 words for fair-use compliance. The full article is at Kixlab.