Why are neural networks and cryptographic ciphers so similar?
Neural networks and cryptographic ciphers exhibit structural similarities despite solving different problems, as both process sequences and rely on repeated layers of linear and nonlinear transformations. They independently evolved parallel processing methods with position-aware encodings to improve efficiency and performance. These parallels arise from shared computational constraints rather than direct knowledge transfer between fields.
- Neural networks and cryptographic ciphers both use sequential absorption and state-squeezing mechanisms, resembling the Sponge construction in SHA-3 (see the absorb/squeeze sketch after this list).
- Modern designs in both fields use parallel processing with position encodings, as seen in Transformers and high-speed Message Authentication Codes (see the position-encoding sketch below).
- Both rely on repeating identical layers of alternating linear and nonlinear transformations to achieve complexity and efficient mixing (see the round/layer sketch below).
- Efficient state mixing in both domains often involves alternating row and column operations, enhancing parallelism and cache efficiency (see the row/column sketch below).
- The similarities stem from shared computational principles rather than cross-disciplinary idea copying.
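The absorb/squeeze parallel can be made concrete with a toy sketch. Everything below is invented for illustration: `toy_permutation` is a stand-in for a real permutation such as Keccak-f, and `rnn_encode` uses arbitrary fixed coefficients rather than trained weights. The shared shape is what matters: a fixed-size state absorbs the input one chunk (or token) at a time, and output is then read back out of that state.

```python
# Toy sketch: a sponge-style hash and an RNN-style encoder share the same
# loop structure. Nothing here is a real cipher or a trained network.

def toy_permutation(state):
    # Stand-in for a cipher's permutation/round function: fixed mixing of the state.
    return [(7 * s + i + 1) % 256 for i, s in enumerate(state)]

def sponge_hash(message_bytes, rate=4, state_size=8, out_len=8):
    """Sponge construction: absorb input into part of the state, then squeeze output."""
    state = [0] * state_size
    # Absorb: XOR each rate-sized chunk into the state, then permute.
    for i in range(0, len(message_bytes), rate):
        for j, b in enumerate(message_bytes[i:i + rate]):
            state[j] ^= b
        state = toy_permutation(state)
    # Squeeze: read output from the state, permuting between output blocks.
    out = []
    while len(out) < out_len:
        out.extend(state[:rate])
        state = toy_permutation(state)
    return bytes(out[:out_len])

def rnn_encode(tokens, state_size=8):
    """RNN-style encoder: fold each token into a hidden state with a fixed update rule."""
    state = [0.0] * state_size
    for t in tokens:
        # Stand-in for W_h @ state + W_x @ x followed by a nonlinearity.
        state = [max(0.0, 0.5 * s + 0.1 * t + 0.01 * i) for i, s in enumerate(state)]
    return state  # a decoder would then "squeeze" output tokens from this state

print(sponge_hash(b"hello world").hex())
print(rnn_encode([3, 1, 4, 1, 5]))
```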
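The position-encoding parallel can be sketched in the same spirit. The sinusoidal formula below is the standard Transformer one; `toy_counter_mac` is a made-up stand-in for counter-based constructions (in the spirit of GMAC-style MACs), not any real scheme. In both cases positions are processed independently, but the position itself is injected so that order still matters.

```python
import math

def sinusoidal_position_encoding(pos, dim=8):
    """Transformer-style position encoding: sin/cos at geometrically spaced frequencies."""
    enc = []
    for i in range(0, dim, 2):
        freq = 1.0 / (10000 ** (i / dim))
        enc.append(math.sin(pos * freq))
        enc.append(math.cos(pos * freq))
    return enc

def toy_counter_mac(key, message_blocks):
    """Toy position-aware MAC: each block is combined with its own counter, so blocks
    can be processed in parallel, yet reordering the blocks changes the tag."""
    tag = 0
    for counter, block in enumerate(message_blocks):
        term = (key + counter + 1) * int.from_bytes(block, "big")
        tag ^= term % (2**61 - 1)
    return tag

print(sinusoidal_position_encoding(5))
print(toy_counter_mac(42, [b"one", b"two", b"three"]))
```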
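The repeated round/layer structure might look roughly like this. The S-box, the mixing rule, and the layer weights are all invented for illustration (this is not AES, and nothing is trained); the point is that a cipher round and a network layer both interleave a cheap nonlinear step with a linear mixing step and simply repeat it.

```python
# Toy round/layer sketch: nonlinear element-wise step + linear mixing, repeated.

SBOX = [(x * 7 + 3) % 16 for x in range(16)]   # toy substitution table (not AES's S-box)

def cipher_round(state):
    # Nonlinear: substitute each nibble through the S-box.
    state = [SBOX[x] for x in state]
    # Linear: mix neighbouring nibbles (a stand-in for MixColumns-style diffusion).
    return [state[i] ^ state[(i + 1) % len(state)] for i in range(len(state))]

def mlp_layer(x, w=0.5, b=0.1):
    # Linear: toy weight and bias over neighbouring entries; Nonlinear: ReLU.
    return [max(0.0, w * (xi + x[(i + 1) % len(x)]) + b) for i, xi in enumerate(x)]

def run_rounds(step, state, n_rounds):
    # Both designs get their strength from repeating the same simple step many times.
    for _ in range(n_rounds):
        state = step(state)
    return state

print(run_rounds(cipher_round, [1, 2, 3, 4, 5, 6, 7, 8], n_rounds=10))
print(run_rounds(mlp_layer, [0.1, 0.2, 0.3, 0.4], n_rounds=4))
```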
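Finally, row/column alternation on a small 2D state can be sketched as below, assuming a toy XOR mixing rule rather than anything like AES's actual ShiftRows/MixColumns. Each pass touches only one row or one column at a time, which keeps the work parallel and cache-friendly, yet after a row pass followed by a column pass every cell has influenced every other.

```python
# Toy row/column mixing on a 4x4 state. The mixing rule is invented for illustration.

def mix_rows(state):
    # Mix each row independently (rows can be processed in parallel).
    return [[row[j] ^ row[(j + 1) % len(row)] for j in range(len(row))] for row in state]

def transpose(state):
    return [list(col) for col in zip(*state)]

def mix_columns(state):
    # Mixing columns is just mixing rows of the transposed state.
    return transpose(mix_rows(transpose(state)))

state = [[1, 2, 3, 4],
         [5, 6, 7, 8],
         [9, 10, 11, 12],
         [13, 14, 15, 16]]
for _ in range(2):
    state = mix_columns(mix_rows(state))
print(state)
```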
Opening excerpt (first ~120 words)
Why are neural networks and cryptographic ciphers so similar? At first glance, training language models and encrypting data seem like completely different problems: one learns patterns from examples to generate text, the other scrambles information to hide it. Yet their underlying algorithms share a curious resemblance, and it’s not for lack of creativity. Sequence processing: the sequential version. Consider the venerable recurrent neural network, feeding text token by token into a recurrent state before generating the output text: [Figure: encoder-decoder diagram, inputs in0 … inn absorbed by the encoder, outputs out0 … outm emitted by the decoder, with <S> and <E> start/end markers] This is structurally…
Excerpt limited to ~120 words for fair-use compliance. The full article is available on Reiner’s webpage, under articles.