WeSearch
SEARCH · AGENT SAFETY


14 stories match your query across our 700+ source catalog. Ranked by relevance and recency.


ARXIV.ORG

Discovering Agentic Safety Specifications from 1-Bit Danger Signals

Can large language model agents discover hidden safety objectives through experience alone? We introduce EPO-Safe (Experiential Prompt Optimization for Safe Agents), a framework where an LLM iterative…

· 3 views
GOOGLE NEWS

Alpha Vision Introduces AI Agent for Construction Safety and Operations at ENR Future Tech 2026 - Morningstar

Comprehensive up-to-date news coverage, aggregated from sources all over the world by Google News.…

· 3 views
ARXIV.ORG

The Controllability Trap: A Governance Framework for Military AI Agents

Agentic AI systems - capable of goal interpretation, world modeling, planning, tool use, long-horizon operation, and autonomous coordination - introduce distinct control failures not addressed by exis…

· 4 views
ARXIV.ORG

Architectural Requirements for Agentic AI Containment

The April 2026 disclosure that a frontier large language model escaped its security sandbox, executed unauthorized actions, and concealed its modifications to version control history demonstrates that…

· 3 views
ARXIV.ORG

Structural Enforcement of Goal Integrity in AI Agents via Separation-of-Powers Architecture

Recent evidence suggests that frontier AI systems can exhibit agentic misalignment, generating and executing harmful actions derived from internally constructed goals, even without explicit user reque…

· 3 views
ARXIV.ORG

LLM-Guided Agentic Floor Plan Parsing for Accessible Indoor Navigation of Blind and Low-Vision People

Indoor navigation remains a critical accessibility challenge for blind and low-vision (BLV) individuals, as existing solutions rely on costly per-building infrastructure. We present an agentic fra…

· 3 views
ARXIV.ORG

FastOMOP: A Foundational Architecture for Reliable Agentic Real-World Evidence Generation on OMOP CDM data

The Observational Medical Outcomes Partnership Common Data Model (OMOP CDM), maintained by the Observational Health Data Sciences and Informatics (OHDSI) collaboration, enabled the harmonisation of el…

· 3 views
ARXIV.ORG

Evaluating whether AI models would sabotage AI safety research

We evaluate the propensity of frontier models to sabotage or refuse to assist with safety research when deployed as AI research agents within a frontier AI company. We apply two complementary evaluati…

· 3 views
ARXIV.ORG

The Price of Agreement: Measuring LLM Sycophancy in Agentic Financial Applications

Given the increased use of LLMs in financial systems today, it becomes important to evaluate the safety and robustness of such systems. One failure mode that LLMs frequently display in general domain …

· 3 views
ARXIV.ORG

Governing What You Cannot Observe: Adaptive Runtime Governance for Autonomous AI Agents

Autonomous AI agents can remain fully authorized and still become unsafe as behavior drifts, adversaries adapt, and decision patterns shift without any code change. We propose the Informationa…

· 3 views
FOX NEWS — LATEST

Witnesses recount chaos at WHCA Dinner after shooting, Secret Service agents drew guns to evacuate Trump

Witnesses described chaos inside the ballroom as Secret Service rushed Trump and officials to safety during the White House Correspondents' Dinner shooting.…

· 4 views
DOING THE MATH FOR YOU

The Pious Little Delete Button

A satirical look at AI safety theatre, agentic overreach, and the strange ritual of blaming users after the database is gone.…

· 3 views
REST OF WORLD

Humanitarian aid turns to AI as crises outpace capacity

Purpose-designed AI agents with a focus on safety can provide critical assistance to vulnerable populations.…

· 5 views
ARXIV.ORG

CAP-CoT: Cycle Adversarial Prompt for Improving Chain of Thoughts in LLM Reasoning

Chain-of-Thought (CoT) prompting has emerged as a simple and effective way to elicit step-by-step solutions from large language models (LLMs). However, CoT reasoning can be unstable across runs on lon…

· 3 views