
Why AI Doesn't Replace Real Engineering



Augmented Mike · Posted on Apr 28 · Originally published at augmentedmike.com

#ai #softwareengineering #automation #programming

AI is just a giant probabilistic calculator, nothing more, nothing less.

The Training Distribution "Problem"

Models are pretrained on the same information. All of them. Here is the simplest way I can describe the distribution "problem". Imagine you took all the division and manipulation built into the modern web and fed it into ALL the models during pretraining, models built to find patterns in language in order to better understand the semantic meaning of things. Imagine all the mind viruses (memes, à la Dawkins) are now in ChatGPT, Claude, and other models. You don't have to imagine it; it's obvious to anyone who uses them.

But it doesn't stop at harmful human behavior: it greatly affects the models' ability to "code" as well. When Microsoft bought GitHub, the play was clear: get training data for models by buying the world's largest repository of code. It seemed like a great idea at the time to whoever was in charge, but it was also extremely harmful and (in my mind) unethical. People learning to code in new languages write a ton of slop, slop being defined as code that might technically work but is not in any way "idiomatic" and fails to consider any of the lessons we have learned in the last 75 years of software development. Half-finished projects with horrible code and very little engineering involved. So if 80% of the code on GitHub was garbage and you fed it into a model, what would you get out of that distribution? That's right: garbage.
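The garbage-in, garbage-out argument can be made concrete with a toy simulation. The quality buckets and the 80/15/5 split below are my own illustrative assumptions, not measurements of GitHub; the point is only that sampling from a corpus reproduces the corpus's proportions:

```python
import random

random.seed(42)

# Hypothetical training corpus mirroring the article's claim:
# most public code is low-quality, only a sliver is expert-level.
corpus = ["garbage"] * 80 + ["average"] * 15 + ["expert"] * 5

def generate(corpus, n):
    """A 'model' that just draws outputs in proportion to its training data."""
    return [random.choice(corpus) for _ in range(n)]

outputs = generate(corpus, 10_000)
for quality in ("garbage", "average", "expert"):
    share = outputs.count(quality) / len(outputs)
    print(f"{quality}: {share:.1%}")
```

Run it and the output shares land near 80%, 15%, and 5%: the model's "taste" is just the distribution it was fed.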
Just because a manager looks at code and says "this looks OK to me," it doesn't mean Rich Hickey, Ken Thompson, Kent Beck, or Robert Martin would agree; in fact, they would rate all this code in pretraining as hot piles of burning trash. The VAST majority of code on GitHub is trash. The best code comes from the top 5% of programmers in the world, and most of what sits on GitHub is far worse than that. If you know much about distributions, you will see how this is painfully obvious. There are normal distributions and Pareto distributions: the former tells you where most people cluster, but the latter tells you where most of the value comes from. So when models write shitty code and have no concept of what clean, well-engineered code looks like, chalk it up to a distribution problem.

GOOD CODE is gatekept: it is compiled, obfuscated, and otherwise made unavailable, and it isn't shouted out to the world; it is worth too much. Sure, there are open source projects that are awesomely coded, but they are a tiny minority of what the models saw. You can hire an average programmer for $50 an hour, but a great programmer can cost $300 an hour. Whose code do you think you will find more of in public?

Models Don't Actually "Know" Anything

Next we come to a simple fact that very few experts, save Demis Hassabis, Yann LeCun, and AI fatalists like Gary Marcus, want to admit: models do not know anything. They are just guessing at the best answers based on their training distributions, which I have already laid out are fraught with…
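The normal-versus-Pareto contrast above can be sketched numerically. The parameters here (a normal with mean 100, a Pareto shape of about 1.16, the classic "80/20" value) are arbitrary illustrative choices, but they show the asymmetry: in a normal distribution the top 5% hold only slightly more than 5% of the total, while in a Pareto distribution they hold most of it.

```python
import random

random.seed(0)

n = 100_000
# Values clustered around a mean, like "most programmers are average".
normal_vals = [abs(random.gauss(100, 15)) for _ in range(n)]
# Heavy-tailed values; shape ~1.16 gives roughly an 80/20 split.
pareto_vals = [random.paretovariate(1.16) for _ in range(n)]

def top_share(values, frac=0.05):
    """Fraction of the total contributed by the top `frac` of samples."""
    ordered = sorted(values, reverse=True)
    k = int(len(ordered) * frac)
    return sum(ordered[:k]) / sum(ordered)

print(f"normal: top 5% hold {top_share(normal_vals):.0%} of the total")
print(f"pareto: top 5% hold {top_share(pareto_vals):.0%} of the total")
```

The normal case comes out under 10%; the Pareto case well over 30%. That gap is the article's point: counting code samples tells you where the cluster is, not where the value is.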

This excerpt is published under fair use for community discussion. Read the full article at DEV Community.
