6M Fake GitHub Stars: How to Vet Open-Source AI Tools
A recent study from Carnegie Mellon University revealed that approximately 6 million fake stars have been distributed across GitHub repositories, particularly affecting AI and LLM projects. The study highlights the unreliability of GitHub stars as a quality signal, as they often reflect casual interest rather than genuine usage or endorsement. Organizations are advised to use alternative metrics, such as fork-to-star ratios and contributor activity, to evaluate open-source AI tools more effectively.
- The CMU study found 6 million fake stars across over 18,600 GitHub repositories.
- AI and LLM projects were identified as the most manipulated category of repositories.
- GitHub stars are often used as a credibility signal, despite being unreliable indicators of quality.
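As a rough illustration of the fork-to-star ratio mentioned in the summary, the sketch below computes the ratio from a repository record and flags repositories that have many stars but few forks. It assumes the input is a dict shaped like the GitHub REST API repository response (fields `stargazers_count` and `forks_count`). The function name `assess_repo` and both thresholds (`min_ratio`, `min_stars`) are hypothetical choices for this example and do not come from the study. A flag here means "look closer", not "the stars are fake".

```python
from dataclasses import dataclass


@dataclass
class RepoSignals:
    stars: int
    forks: int
    fork_to_star_ratio: float
    needs_review: bool


def assess_repo(repo: dict, min_ratio: float = 0.05, min_stars: int = 100) -> RepoSignals:
    """Compute a fork-to-star ratio and flag repos worth a manual look.

    `min_ratio` and `min_stars` are illustrative assumptions, not values
    taken from the CMU study; tune them against repositories you trust.
    """
    stars = int(repo.get("stargazers_count", 0))
    forks = int(repo.get("forks_count", 0))

    # Avoid division by zero for repositories with no stars.
    ratio = forks / stars if stars else 0.0

    # Only flag repos that are popular enough for the ratio to be meaningful.
    needs_review = stars >= min_stars and ratio < min_ratio
    return RepoSignals(stars, forks, ratio, needs_review)


if __name__ == "__main__":
    # Hypothetical numbers, echoing the 12,000-star scenario in the excerpt.
    sample = {"stargazers_count": 12000, "forks_count": 120}
    print(assess_repo(sample))
```

The ratio is only one signal. Contributor activity, the other metric the summary names, would need separate data such as commit and issue history, which this sketch does not cover.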
Opening excerpt (first ~120 words)
Trends & Strategy • 13 min read
6 Million Fake GitHub Stars: How to Vet Open-Source AI Tools Before You Bet on Them
April 14, 2026 • By ChatGPT.ca Team

Your team finds a promising AI agent framework on GitHub. It has 12,000 stars, an active-looking README, and a Discord link. The CTO greenlights a proof-of-concept. Three months later the project is abandoned, the maintainer vanishes, and someone on Hacker News points out that 70% of those stars came from bot accounts created in the same week. You are now maintaining a fork of a dead project as a core dependency.

This scenario is not hypothetical. A peer-reviewed study from Carnegie Mellon University, presented at ICSE 2026, found approximately 6 million fake stars distributed across 18,617 repositories by roughly 301,000 accounts.
…
Excerpt limited to ~120 words for fair-use compliance. The full article is at Hacker News (AI / LLM).