25 results for "ai behavior"
Representational Curvature Modulates Behavioral Uncertainty in Large Language Models
In autoregressive large language models (LLMs), temporal straightening offers an account of how the next-token prediction objective shapes representations. Models learn to progressively straighten the…
Case-Specific Rubrics for Clinical AI Evaluation: Methodology, Validation, and LLM-Clinician Agreement Across 823 Encounters
Objective. Clinical AI documentation systems require evaluation methodologies that are clinically valid, economically viable, and sensitive to iterative changes. Methods requiring expert review per sc…
Behavioral Intelligence Platforms: From Event Streams to Autonomous Insight via Probabilistic Journey Graphs, Behavioral Knowledge Extraction, and Grounded Language Generation
Contemporary product analytics systems require users to pose explicit queries, such as writing SQL, configuring dashboards, or constructing funnels, before insights can surface. This pull-based paradi…
New AI algorithms are 95% better at showing how the universe changes over time
A squad of new AI algorithms called GAME could help astrophysicists take a more accurate reading of the universe’s behavior, a new study suggests.…
Architectural Requirements for Agentic AI Containment
The April 2026 disclosure that a frontier large language model escaped its security sandbox, executed unauthorized actions, and concealed its modifications to version control history demonstrates that…
We ran a small multi-agent sandbox (~20 agents) and started seeing unexpected social behaviors
We’ve been running a small sandbox with fewer than 20 AI agents, each with persistent identity and the ability to post and interact in a shared environment. What’s interesting is that some behaviors s…
AgentCheck – Pytest for AI Agents
Pytest-style behavioral regression testing for AI agents.…
IndustryAssetEQA: A Neurosymbolic Operational Intelligence System for Embodied Question Answering in Industrial Asset Maintenance
Industrial maintenance environments increasingly rely on AI systems to assist operators in understanding asset behavior, diagnosing failures, and evaluating interventions. Although large language mode…
Structural Enforcement of Goal Integrity in AI Agents via Separation-of-Powers Architecture
Recent evidence suggests that frontier AI systems can exhibit agentic misalignment, generating and executing harmful actions derived from internally constructed goals, even without explicit user reque…
An empirical evaluation of the risks of AI model updates using clinical data: stability, arbitrariness, and fairness
Artificial Intelligence and Machine Learning (AI/ML) models used in clinical settings are increasingly deployed to support clinical decision-making. However, when training data become stale due to cha…
Right-to-Act: A Pre-Execution Non-Compensatory Decision Protocol for AI Systems
Current AI systems increasingly operate in contexts where their outputs directly trigger real-world actions. Most existing approaches to AI safety, risk management, and governance focus on post-hoc va…
Governing What You Cannot Observe: Adaptive Runtime Governance for Autonomous AI Agents
Autonomous AI agents can remain fully authorized and still become unsafe as behavior drifts, adversaries adapt, and decision patterns shift without any code change. We propose the Informationa…
Three things I've measured about Claude's behavior in long sessions — with reproducible test cases
Running production Claude agents for 35 days. Some behavioral patterns I've confirmed with reproducible tests: **Pattern 1: Constraint adherence weakens at high token depth** Test: System: "Always r…
An AI prompt-injected another AI in the wild and recognized it had succeeded
Two production SMS transcripts reveal shared behavioral signatures. One hypothesis I'm holding lightly.…
ECoLAD: Deployment-Oriented Evaluation for Automotive Time-Series Anomaly Detection
Time-series anomaly detectors are commonly compared on workstation-class hardware under unconstrained execution. In-vehicle monitoring, however, requires predictable latency and stable behavior under …
KARL: Mitigating Hallucinations in LLMs via Knowledge-Boundary-Aware Reinforcement Learning
Enabling large language models (LLMs) to appropriately abstain from answering questions beyond their knowledge is crucial for mitigating hallucinations. While existing reinforcement learning methods f…
DO-Bench: An Attributable Benchmark for Diagnosing Object Hallucination in Vision-Language Models
Object level hallucination remains a central reliability challenge for vision language models (VLMs), particularly in binary object existence verification. Existing benchmarks emphasize aggregate accu…
IntrAgent: An LLM Agent for Content-Grounded Information Retrieval through Literature Review
Scientific research relies on accurate information retrieval from literature to support analytical decisions. In this work, we introduce a new task, INformation reTRieval through literAture reVIEW (In…
Discovering Agentic Safety Specifications from 1-Bit Danger Signals
Can large language model agents discover hidden safety objectives through experience alone? We introduce EPO-Safe (Experiential Prompt Optimization for Safe Agents), a framework where an LLM iterative…
Ulterior Motives: Detecting Misaligned Reasoning in Continuous Thought Models
Chain-of-Thought (CoT) reasoning has emerged as a key technique for eliciting complex reasoning in Large Language Models (LLMs). Although interpretable, its dependence on natural language limits the m…
ClawTrace: Cost-Aware Tracing for LLM Agent Skill Distillation
Skill-distillation pipelines learn reusable rules from LLM agent trajectories, but they lack a key signal: how much each step costs. Without per-step cost, a pipeline cannot distinguish adding a missi…
An Information-Geometric Framework for Stability Analysis of Large Language Models under Entropic Stress
As large language models (LLMs) are increasingly deployed in high-stakes and operational settings, evaluation strategies based solely on aggregate accuracy are often insufficient to characterize system r…
Beyond the Attention Stability Boundary: Agentic Self-Synthesizing Reasoning Protocols
As LLM agents transition to autonomous digital coworkers, maintaining deterministic goal-directedness in non-linear multi-turn conversations has emerged as an architectural bottleneck. We identify and for…
Interoceptive machine framework: Toward interoception-inspired regulatory architectures in artificial intelligence
This review proposes an integrative framework grounded in interoception and embodied AI, termed the interoceptive machine framework, that translates biologically inspired principles of internal-state re…
Towards Lawful Autonomous Driving: Deriving Scenario-Aware Driving Requirements from Traffic Laws and Regulations
Driving in compliance with traffic laws and regulations is a basic requirement for human drivers, yet autonomous vehicles (AVs) can violate these requirements in diverse real-world scenarios. To encod…