WeSearch

Results for "ai inference".

19 stories match your query across our 700+ source catalog. Ranked by relevance and recency.

SEEKING ALPHA

AMD: Inference And Agentic AI Are Expanding Its Runway

Advanced Micro Devices is Buy-rated on expanding AI demand, strong EPYC/data center momentum, and discounted valuation. Learn more about AMD stock here.…

· 4 views
ARXIV.ORG

Active Inference: A method for Phenotyping Agency in AI systems?

The proliferation of agentic artificial intelligence has outpaced the conceptual tools needed to characterize agency in computational systems. Prevailing definitions mainly rely on autonomy and goal-d…

· 2 views
REDDIT

Skymizer Taiwan Inc. Unveils Breakthrough Architecture Enabling Ultra-Large LLM Inference on a Single Card

Source Article excerpt: With a single PCIe card — powered by six HTX301 chips and 384 GB of memory — enterprises can now run 700B-parameter model inference locally at just ~240W per card. The memory-b…

· 3 views
ALL NEWS

DigitalOcean launches AI inference engine with routing capabilities

· 1 view
ARXIV.ORG

An Intelligent Fault Diagnosis Method for General Aviation Aircraft Based on Multi-Fidelity Digital Twin and FMEA Knowledge Enhancement

Fault diagnosis of general aviation aircraft faces challenges including scarce real fault data, diverse fault types, and weak fault signatures. This paper proposes an intelligent fault diagnosis frame…

· 2 views
SANS INTERNET STORM CENTER

TeamPCP Supply Chain Campaign: Update 008

TeamPCP Supply Chain Campaign: Update 008 - 26-Day Pause Ends with Three Concurrent Compromises (Checkmarx KICS, Bitwarden CLI Cascade, xinference PyPI), CanisterSprawl npm Worm Identified, and Tier 1…

· 2 views
LOCALLLAMA

We benchmarked gpt-oss-120b across 6 inference providers and found a 10x throughput spread

We ran a benchmark across 10+ LLM routers, providers, and inference backends to answer the questions that come up every time someone picks a provider. Key findings: Do LLM routers add latency? No, Ope…

· 3 views
LMSYS

DeepSeek-V4 on Day 0: From Fast Inference to Verified RL with SGLang and Miles

We are thrilled to announce Day-0 support for DeepSeek-V4 across both inference and RL training. SGLang and Miles form the first open-source stack to serve and train DeepSeek-V4 on launch day — with s…

· 4 views
REDDIT

your daily driver stack, what's it look like? and why?

What it says in the title, I'm interested in hearing what you all have landed on as a workable / useful stack for you. Mine looks like this: back end inference servers - llama.cpp, vLLM | V hermes-age…

· 5 views
LOCALLLAMA

I got 3× faster HFQ4 prefill on Strix Halo in hipfire with an opt-in MMQ path

I recently contributed an experimental HFQ4-G256 MMQ prefill path to hipfire, an RDNA-focused LLM inference engine. Disclaimer: I authored the PR, so this is partly a contribution note, but I am mainl…

· 3 views
ARXIV.ORG

GSAR: Typed Grounding for Hallucination Detection and Recovery in Multi-Agent LLMs

Autonomous multi-agent LLM systems are increasingly deployed to investigate operational incidents and produce structured diagnostic reports. Their trustworthiness hinges on whether each claim is groun…

· 2 views
ARXIV.ORG

Ulterior Motives: Detecting Misaligned Reasoning in Continuous Thought Models

Chain-of-Thought (CoT) reasoning has emerged as a key technique for eliciting complex reasoning in Large Language Models (LLMs). Although interpretable, its dependence on natural language limits the m…

· 2 views
ARXIV.ORG

Agentic Adversarial Rewriting Exposes Architectural Vulnerabilities in Black-Box NLP Pipelines

Multi-component natural language processing (NLP) pipelines are increasingly deployed for high-stakes decisions, yet no existing adversarial method can test their robustness under realistic conditions…

· 2 views
ARXIV.ORG

Tandem: Riding Together with Large and Small Language Models for Efficient Reasoning

Recent advancements in large language models (LLMs) have catalyzed the rise of reasoning-intensive inference paradigms, where models perform explicit step-by-step reasoning before generating final ans…

· 2 views
ARXIV.ORG

PhysNote: Self-Knowledge Notes for Evolvable Physical Reasoning in Vision-Language Model

Vision-Language Models (VLMs) have demonstrated strong performance on textbook-style physics problems, yet they frequently fail when confronted with dynamic real-world scenarios that require temporal …

· 2 views
ARXIV.ORG

MIMIC: A Generative Multimodal Foundation Model for Biomolecules

Biological function emerges from coupled constraints across sequence, structure, regulation, evolution, and cellular context, yet most foundation models in biology are trained within one modality or f…

· 2 views
ARXIV.ORG

Microsoft TRELLIS.2: An Open-Source, 4B-Parameter, Image-to-3D Model [pdf]

Recent advancements in 3D generative modeling have significantly improved the generation realism, yet the field is still hampered by existing representations, which struggle to capture assets with com…

· 2 views
REDDIT

Speculative Decoding Implementations: EAGLE-3, Medusa-1, PARD, Draft Models, N-gram and Suffix Decoding from scratch

I’ve been working on an educational implementation repo for speculative decoding: The goal is not to wrap existing libraries, but to implement several speculative decoding methods from scratch behind …

· 5 views
REDDIT

Speculative Decoding Implementations: EAGLE-3, Medusa-1, PARD, Draft Models, N-gram and Suffix Decoding from scratch [P]

I’ve been working on an educational implementation repo for speculative decoding: The goal is not to wrap existing libraries, but to implement several speculative decoding methods from scratch behind …

· 5 views