BEAVER: Enterprise benchmark for LLM Text-to-SQL from private data warehouses

Original article: Github

BEAVER is a large-scale enterprise text-to-SQL dataset containing 9128 queries spanning 812 tables across 19 diverse domains. Of these, 7978 queries are publicly released, while the remaining 1150 are held out as a private test set. Queries and databases were collected from private organizations. To facilitate fine-grained evaluation and analysis, we provide annotations for five subtasks (multi-table retrieval, join key detection, column mapping, domain knowledge extraction, and query decomposition) and for three categories of queries: complex queries without domain knowledge, domain-specific queries with minimal complexity, and domain-specific complex queries.
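As a rough illustration of how the five subtask annotations and three query categories might sit together on a single record, here is a minimal Python sketch. Every class, field, and function name in it is hypothetical and chosen only for this example; it is not BEAVER's actual schema or file format, which the excerpt does not describe. Only the query counts come from the text above.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, List, Optional, Tuple

# Counts quoted in the excerpt; the private size is derived by subtraction.
TOTAL_QUERIES = 9128
PUBLIC_QUERIES = 7978
PRIVATE_QUERIES = TOTAL_QUERIES - PUBLIC_QUERIES


class QueryCategory(Enum):
    """The three query categories named in the excerpt (labels are hypothetical)."""
    COMPLEX_NO_DOMAIN = "complex, no domain knowledge"
    DOMAIN_SIMPLE = "domain-specific, minimal complexity"
    DOMAIN_COMPLEX = "domain-specific and complex"


@dataclass
class AnnotatedQuery:
    """Hypothetical record layout: one field per annotated subtask."""
    question: str
    sql: str
    tables: List[str] = field(default_factory=list)                  # multi-table retrieval
    join_keys: List[Tuple[str, str]] = field(default_factory=list)   # join key detection
    column_mapping: Dict[str, str] = field(default_factory=dict)     # column mapping
    domain_knowledge: List[str] = field(default_factory=list)        # domain knowledge extraction
    sub_questions: List[str] = field(default_factory=list)           # query decomposition


def categorize(needs_domain_knowledge: bool, is_complex: bool) -> Optional[QueryCategory]:
    """Map two assumed boolean flags onto the three categories.

    Returns None for a simple query that needs no domain knowledge,
    since the excerpt names no category for that case.
    """
    if needs_domain_knowledge and is_complex:
        return QueryCategory.DOMAIN_COMPLEX
    if needs_domain_knowledge:
        return QueryCategory.DOMAIN_SIMPLE
    if is_complex:
        return QueryCategory.COMPLEX_NO_DOMAIN
    return None
```

With a layout like this, a per-subtask evaluation could compare a model's predicted `tables` or `join_keys` against the annotated ones, and results could be grouped by the value `categorize` returns. How BEAVER itself scores each subtask is not stated in the excerpt.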
