3 stories tagged with #attention-mechanism, ordered by publish time across the WeSearch catalog. Tag pages update as new stories are ingested.
Learning to Rotate: Temporal and Semantic Rotary Encoding for Sequential Modeling
Every Transformer architecture dedicates enormous capacity to learning rich representations in semantic embedding space -- yet the rotation manifold acted upon by Rotary Positional…
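For orientation, here is a minimal NumPy sketch of the fixed-rotation baseline the title plays against: standard RoPE with the conventional base of 10000. The function name and shapes are illustrative assumptions, not code from the story, which (per its title) learns the rotations rather than fixing them.

```python
import numpy as np

def rope(x: np.ndarray, base: float = 10000.0) -> np.ndarray:
    """Standard rotary positional encoding: rotate each consecutive pair of
    channels of x (shape: seq_len x dim, dim even) by an angle that grows
    with position and falls off geometrically across channel pairs."""
    seq_len, dim = x.shape
    half = dim // 2
    freqs = base ** (-np.arange(half) / half)               # (half,)
    angles = np.arange(seq_len)[:, None] * freqs[None, :]   # (seq_len, half)
    cos, sin = np.cos(angles), np.sin(angles)
    x1, x2 = x[:, 0::2], x[:, 1::2]                         # paired channels
    out = np.empty_like(x)
    out[:, 0::2] = x1 * cos - x2 * sin                      # 2-D rotation per pair
    out[:, 1::2] = x1 * sin + x2 * cos
    return out

# Rotating queries and keys before the dot product makes attention scores
# depend on relative position m - n rather than on absolute position.
q = rope(np.random.randn(16, 64))
```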
FreqFormer: Hierarchical Frequency-Domain Attention with Adaptive Spectral Routing for Long-Sequence Video Diffusion Transformers
Long-sequence video diffusion transformers hit a quadratic self-attention cost that dominates runtime and memory for very long token sequences. Most efficient attention methods use…
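The quadratic claim is easy to make concrete: the attention-score matrix alone is seq_len × seq_len per head and layer. A back-of-envelope sketch (figures are illustrative, not from the paper):

```python
# fp16 attention-score matrix: n_tokens**2 entries x 2 bytes, per head per layer.
for n_tokens in (1_024, 16_384, 131_072):
    gib = n_tokens ** 2 * 2 / 2**30
    print(f"{n_tokens:>7} tokens -> {gib:8.3f} GiB of scores per head-layer")
```

At 131,072 tokens that single matrix is 32 GiB, which is why very long video token sequences need something cheaper than dense self-attention.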
Using group theory to explore the space of positional encodings for attention
Attention is a computational primitive at the core of modern language models, allowing internal representations to reference and influence each other. It’s h…
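As context for the post's framing, a minimal sketch of plain scaled dot-product attention, the primitive whose positional-encoding design space the post then explores; names and shapes here are illustrative assumptions:

```python
import numpy as np

def attention(q, k, v):
    """Scaled dot-product attention: each query position mixes the value
    vectors, weighted by softmax-normalized similarity to every key."""
    scores = q @ k.T / np.sqrt(q.shape[-1])        # (n_q, n_k) similarities
    scores -= scores.max(axis=-1, keepdims=True)   # stabilize the softmax
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v                             # (n_q, d_v) mixed values

x = np.random.randn(8, 32)    # 8 token representations, dim 32
out = attention(x, x, x)      # self-attention: tokens reference one another
```

Note that this primitive is order-agnostic on its own, which is exactly what makes the choice of positional encoding a meaningful design space.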