Search: "system failure" — WeSearch Press

ARTIFICIAL INTELLIGENCE (AI)

The One Substrate Failure Behind Every AI System in 2026

Tue, 28 Apr 2026 13:24:59 GMT · 3 views

GRITH

Five AI Agent Failures in 36 Days. Zero Times the Agent Caught It

Between March 18 and April 22, 2026, public failures at Meta, Mercor, CrewAI, Vercel, and Bitwarden all pointed at the same missing layer: the system acted, and someone else noticed later.…

Tue, 28 Apr 2026 15:10:00 GMT · 13 views

WESPISER

AI Can Find the Code. It Didn't Know How the System Worked

21 bug fixes, two models, same failures. Better LLMs improved things marginally but still failed on system boundaries and integration.…

Tue, 28 Apr 2026 13:49:59 GMT · 3 views

ARXIV.ORG

IndustryAssetEQA: A Neurosymbolic Operational Intelligence System for Embodied Question Answering in Industrial Asset Maintenance

Industrial maintenance environments increasingly rely on AI systems to assist operators in understanding asset behavior, diagnosing failures, and evaluating interventions. Although large language mode…

Tue, 28 Apr 2026 04:13:21 GMT · 3 views

ARXIV.ORG

Failure-Centered Runtime Evaluation for Deployed Trilingual Public-Space Agents

This paper presents PSA-Eval, a failure-centered runtime evaluation framework for deployed trilingual public-space agents. The central claim is that, when the evaluation object shifts from a static in…

Tue, 28 Apr 2026 04:13:21 GMT · 5 views

ARXIV.ORG

QED: An Open-Source Multi-Agent System for Generating Mathematical Proofs on Open Problems

We explore a central question in AI for mathematics: can AI systems produce original, nontrivial proofs for open research problems? Despite strong benchmark performance, producing genuinely novel proo…

Tue, 28 Apr 2026 04:13:21 GMT · 6 views

ARXIV.ORG

The Controllability Trap: A Governance Framework for Military AI Agents

Agentic AI systems - capable of goal interpretation, world modeling, planning, tool use, long-horizon operation, and autonomous coordination - introduce distinct control failures not addressed by exis…

Tue, 28 Apr 2026 21:33:22 GMT · 1 view

THE INDEPENDENT

'It took nine seconds': Claude AI agent deletes company's database

PocketOS founder says ‘systemic failures’ with AI infrastructure made catastrophic failure inevitable…

Tue, 28 Apr 2026 20:01:24 GMT · 1 view

ARXIV.ORG

Architectural Requirements for Agentic AI Containment

The April 2026 disclosure that a frontier large language model escaped its security sandbox, executed unauthorized actions, and concealed its modifications to version control history demonstrates that…

Tue, 28 Apr 2026 15:10:00 GMT · 3 views

SUBSTACK

Why the same LLM gives different answers in different environments

What I found diagnosing a failure mode in my own system, and the moment retrieval turned out to be already shaped before it started…

Tue, 28 Apr 2026 10:51:47 GMT · 6 views

ARXIV.ORG

AI Identity: Standards, Gaps, and Research Directions for AI Agents

AI agents are now running real transactions, workflows, and sub-agent chains across organizational boundaries without continuous human supervision. This creates a problem no current infrastructure is …

Tue, 28 Apr 2026 04:13:21 GMT · 3 views

ARXIV.ORG

Agentic Adversarial Rewriting Exposes Architectural Vulnerabilities in Black-Box NLP Pipelines

Multi-component natural language processing (NLP) pipelines are increasingly deployed for high-stakes decisions, yet no existing adversarial method can test their robustness under realistic conditions…

Tue, 28 Apr 2026 04:13:21 GMT · 3 views

ARXIV.ORG

When AI reviews science: Can we trust the referee?

The volume of scientific submissions continues to climb, outpacing the capacity of qualified human referees and stretching editorial timelines. At the same time, modern large language models (LLMs) of…

Tue, 28 Apr 2026 04:13:21 GMT · 3 views

ARXIV.ORG

Information-Theoretic Measures in AI: A Practical Decision Guide

Information-theoretic (IT) measures are ubiquitous in artificial intelligence: entropy drives decision-tree splits and uncertainty quantification, cross-entropy is the default classification loss, mut…

Tue, 28 Apr 2026 04:13:21 GMT · 3 views

ARXIV.ORG

Beyond the Attention Stability Boundary: Agentic Self-Synthesizing Reasoning Protocols

As LLM agents transition to autonomous digital coworkers, maintaining deterministic goal-directedness in non-linear multi-turn conversations emerged as an architectural bottleneck. We identify and for…

Tue, 28 Apr 2026 04:13:21 GMT · 3 views

ARXIV.ORG

FastOMOP: A Foundational Architecture for Reliable Agentic Real-World Evidence Generation on OMOP CDM data

The Observational Medical Outcomes Partnership Common Data Model (OMOP CDM), maintained by the Observational Health Data Sciences and Informatics (OHDSI) collaboration, enabled the harmonisation of el…

Tue, 28 Apr 2026 04:13:21 GMT · 3 views

ARXIV.ORG

The Price of Agreement: Measuring LLM Sycophancy in Agentic Financial Applications

Given the increased use of LLMs in financial systems today, it becomes important to evaluate the safety and robustness of such systems. One failure mode that LLMs frequently display in general domain …

Tue, 28 Apr 2026 04:13:21 GMT · 3 views
