
Porting microgpt to Futhark, Part I

TL;DR (AI summary)

The author explores porting Andrej Karpathy's microgpt, a minimal GPT-2-like implementation in Python, to the data-parallel language Futhark to improve scalability. This first part focuses on translating the forward pass while maintaining structural similarity to the original code. While the Futhark version scales better, it sacrifices some conciseness compared to the Python version.

Original article: Kmjn
Opening excerpt (first ~120 words)

I have been wanting to find a project to try out the data-parallel language Futhark. They have a very good blog that I've been following for years, but I've never actually written anything in it. Andrej Karpathy's microgpt, a self-contained implementation of a GPT-2-like neural network in 200 lines of Python, finally provided the excuse. I like microgpt, but it does not scale at all. Obviously the point of this implementation is not efficiency, but it's not just that it's slow: you also can't scale up to even slightly larger networks, because you quickly hit Python recursion depth errors. So, I was curious whether I could port it as 1-to-1 as possible and get much better scaling without losing too much concision.
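The recursion-depth failure mentioned above can be illustrated with a minimal sketch. This is a hypothetical, micrograd-style scalar autograd class (not microgpt's actual code): the backward pass walks the computation graph with recursive calls, one stack frame per node, so a graph deeper than Python's default recursion limit (about 1000 frames) raises `RecursionError` before any real training can happen.

```python
# Hypothetical micrograd-style sketch: a scalar autograd node whose
# backward pass recurses through the graph, one frame per node.
class Value:
    def __init__(self, data, parents=()):
        self.data = data
        self.grad = 0.0
        self.parents = parents

    def __add__(self, other):
        # Building the graph is iterative and cheap...
        return Value(self.data + other.data, parents=(self, other))

    def backward(self):
        # ...but backpropagation here is recursive: the call depth
        # equals the depth of the computation graph.
        self.grad = 1.0
        self._backprop()

    def _backprop(self):
        for p in self.parents:
            p.grad += self.grad
            p._backprop()

# A chain far deeper than CPython's default recursion limit (~1000).
x = Value(1.0)
for _ in range(5000):
    x = x + Value(1.0)

try:
    x.backward()
    hit_limit = False
except RecursionError:
    hit_limit = True

print(hit_limit)
```

A data-parallel port sidesteps this entirely, since Futhark expresses the same computation with bulk array operations rather than a per-node recursive graph walk.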

Excerpt limited to ~120 words for fair-use compliance. The full article is at Kmjn.

