WeSearch
TAG · #MAMBA

Mamba coverage.

Every story in the WeSearch catalog tagged with #mamba, in publish-time order, with view counts. Subscribe to the per-tag RSS feed to follow this topic in your reader of choice.

6 stories currently carry the #mamba tag; this page updates as new stories are ingested.

RSS feed for this tag → or search "Mamba"

RELATED TAGS
#vanessa-bryant (3) · #gianna-bryant (3) · #kobe-bryant (3) · #state-space-models (2) · #sports (2) · #time-series-forecasting (1) · #mamba-model (1) · #frequency-domain-analysis (1) · #adaptive-learning (1) · #disaggregated-serving (1) · #vllm (1) · #rdma (1)
YAHOO SPORTS

Vanessa Bryant honors late daughter Gianna on her birthday

Vanessa Bryant honored her late daughter, Gianna, on what would have been her 20th birthday…

4 views · #vanessa-bryant #kobe-bryant #gianna-bryant
NEW YORK POST

Vanessa Bryant honors Gianna on birthday, announces student-athlete scholarships: ‘Your spirit and beautiful heart continue to light the way’

Vanessa Bryant shared several tributes to her late daughter, Gianna, on Friday — and each was quite emotional…

3 views · #sports #celebrity #charity
ARXIV CS.AI

The Randomness Floor: Measuring Intrinsic Non-Randomness in Language Model Token Distributions

Language models cannot be random. This paper introduces Entropic Deviation (ED), the normalised KL divergence between a model's token distribution and the uniform distribution, and…

7 views · #language-models #randomness #entropic-deviation
VERCEL

Disaggregated Serving for Hybrid SSM Models in vLLM

Hybrid architectures that interleave Mamba-style SSM layers with standard full-attention (FA) layers — such as NVIDIA Nemotron-H — are gaining traction as a way…

6 views · #disaggregated-serving #vllm
ARXIV.ORG

AdaMamba: Adaptive Frequency-Gated Mamba for Long-Term Time Series Forecasting

Accurate long-term time series forecasting (LTSF) requires the capture of complex long-range dependencies and dynamic periodic patterns. Recent advances in frequency-domain analysis…

8 views · #time-series-forecasting #mamba-model #frequency-domain-analysis
MACHINE LEARNING

Going from 3B/7B dense to Nemotron 3 Nano (hybrid Mamba-MoE) for multi-task reasoning — what changes in the fine-tuning playbook? [D]

Following up on something I posted a few days back about fine-tuning for multi-task reasoning. Read a lot since then, and I've moved past the dense 3B vs 7B question — landing on N…

10 views