No AI Model Can Carry a Creative Project End to End. The HCB Just Proved It.
The Human Creativity Benchmark (HCB) by Contra Labs tested 15 AI models across 93 prompts in five creative domains and found that no single model excels in every phase of a creative project. Professional creatives evaluated outputs at the ideation, mockup, and refinement stages and consistently found that different models perform best at different stages. The results suggest that optimal creative output requires a multi-model workflow rather than reliance on one AI for an entire project.
- The benchmark included 93 prompts across five domains: landing pages, product videos, ad images, brand design, and desktop apps.
- Each domain was evaluated across three phases (ideation, mockup, and refinement), resulting in roughly 15,000 judgments from professional creatives.
- No single AI model led in all three phases in any domain, indicating that different models are better suited to different stages of creative work.
- In product videos, Veo 3.1 excelled in ideation but performed worse under constraints, a degradation pattern not seen in any other model.
- The study concludes that multi-model pipelines are not just advantageous but necessary, as no model can carry a creative project from start to finish effectively.
Igor Gridel
Posted on May 1 • Originally published at igorgridel.com
#webdev #ai #productivity #design
Subtitle: Contra Labs ran 15 AI models through 93 prompts across 5 creative domains. Professional creatives judged the output.
…
Excerpt limited to ~120 words for fair-use compliance. The full article is at DEV.to.