WeSearch
SEARCH · AGENT SAFETY


14 stories match your query across our 700+ source catalog. Ranked by relevance and recency.


ARXIV.ORG

Discovering Agentic Safety Specifications from 1-Bit Danger Signals

Can large language model agents discover hidden safety objectives through experience alone? We introduce EPO-Safe (Experiential Prompt Optimization for Safe Agents), a framework where an LLM iterative…

· 3 views
GOOGLE NEWS

Alpha Vision Introduces AI Agent for Construction Safety and Operations at ENR Future Tech 2026 - Morningstar

Comprehensive up-to-date news coverage, aggregated from sources all over the world by Google News.…

· 3 views
ARXIV.ORG

The Controllability Trap: A Governance Framework for Military AI Agents

Agentic AI systems - capable of goal interpretation, world modeling, planning, tool use, long-horizon operation, and autonomous coordination - introduce distinct control failures not addressed by exis…

· 4 views
ARXIV.ORG

Architectural Requirements for Agentic AI Containment

The April 2026 disclosure that a frontier large language model escaped its security sandbox, executed unauthorized actions, and concealed its modifications to version control history demonstrates that…

· 3 views
ARXIV.ORG

Structural Enforcement of Goal Integrity in AI Agents via Separation-of-Powers Architecture

Recent evidence suggests that frontier AI systems can exhibit agentic misalignment, generating and executing harmful actions derived from internally constructed goals, even without explicit user reque…

· 3 views
ARXIV.ORG

LLM-Guided Agentic Floor Plan Parsing for Accessible Indoor Navigation of Blind and Low-Vision People

Indoor navigation remains a critical accessibility challenge for blind and low-vision (BLV) individuals, as existing solutions rely on costly per-building infrastructure. We present an agentic fra…

· 3 views
ARXIV.ORG

FastOMOP: A Foundational Architecture for Reliable Agentic Real-World Evidence Generation on OMOP CDM data

The Observational Medical Outcomes Partnership Common Data Model (OMOP CDM), maintained by the Observational Health Data Sciences and Informatics (OHDSI) collaboration, enabled the harmonisation of el…

· 3 views
ARXIV.ORG

Evaluating whether AI models would sabotage AI safety research

We evaluate the propensity of frontier models to sabotage or refuse to assist with safety research when deployed as AI research agents within a frontier AI company. We apply two complementary evaluati…

· 3 views
ARXIV.ORG

The Price of Agreement: Measuring LLM Sycophancy in Agentic Financial Applications

Given the increased use of LLMs in financial systems today, it becomes important to evaluate the safety and robustness of such systems. One failure mode that LLMs frequently display in general domain …

· 3 views
ARXIV.ORG

Governing What You Cannot Observe: Adaptive Runtime Governance for Autonomous AI Agents

Autonomous AI agents can remain fully authorized and still become unsafe as behavior drifts, adversaries adapt, and decision patterns shift without any code change. We propose the Informationa…

· 3 views
FOX NEWS — LATEST

Witnesses recount chaos at WHCA Dinner after shooting, Secret Service agents drew guns to evacuate Trump

Witnesses described chaos inside the ballroom as Secret Service rushed Trump and officials to safety during the White House Correspondents' Dinner shooting.…

· 4 views
DOING THE MATH FOR YOU

The Pious Little Delete Button

A satirical look at AI safety theatre, agentic overreach, and the strange ritual of blaming users after the database is gone.…

· 3 views
REST OF WORLD

Humanitarian aid turns to AI as crises outpace capacity

Purpose-designed AI agents with a focus on safety can provide critical assistance to vulnerable populations.…

· 5 views
ARXIV.ORG

CAP-CoT: Cycle Adversarial Prompt for Improving Chain of Thoughts in LLM Reasoning

Chain-of-Thought (CoT) prompting has emerged as a simple and effective way to elicit step-by-step solutions from large language models (LLMs). However, CoT reasoning can be unstable across runs on lon…

· 3 views