30 results for "llms"
LLMs Corrupt Your Documents When You Delegate
Large Language Models (LLMs) are poised to disrupt knowledge work, with the emergence of delegated work as a new interaction paradigm (e.g., vibe coding). Delegation requires trust - the expectation t…
Yann LeCun: LLMs Are Nearing the End, but Better AI Is Coming (2025)
Yann LeCun, Chief AI Scientist at Meta, believes LLMs are doomed due to their inability to represent the high-dimensional spaces that characterize our world…
Context-Aware Hospitalization Forecasting Evaluations for Decision Support using LLMs
Medical and public health experts must make real-time resource decisions, such as expanding hospital bed capacity, based on projected hospitalization trends during large-scale healthcare disruptions (…
Running Local LLMs Offline on a Ten-Hour Flight
I flew from London to Google Cloud Next 2026 in Las Vegas. Ten hours with no in-flight wifi. I used the time to test how far a modern MacBook can carry engineering work on local LLMs alone. Setup A we…
What would be the best OS to run LLMs?
Hi there, I've ordered a mini PC with 128GB of RAM and the AMD AI Max 395. I intend to use it with Proxmox (like my current machine), where I run Windows for some gaming and macOS for my music library …
LLMs Can't Generate Influence
Are people using AI/LLMs in Defense or Secure Environments?
Show HN: Waiting for LLMs Sucks – Give your user a game
Give your user a game while they wait for the LLM to return a result…
GSAR: Typed Grounding for Hallucination Detection and Recovery in Multi-Agent LLMs
Autonomous multi-agent LLM systems are increasingly deployed to investigate operational incidents and produce structured diagnostic reports. Their trustworthiness hinges on whether each claim is groun…
Google DeepMind Paper Argues LLMs Will Never Be Conscious | Philosophers said the paper’s argument is sound, but that “all these arguments have been presented years and years ago.”
Show HN: Ragnerock, an AI data analysis tool
Hi HN, I’m Matt Mahowald, and together with my cofounder John, we’re launching the public beta of Ragnerock today. As a data scientist, you spend the majority of your time wrangling data. Even though …
Agents Are Microservices with a Brain
We solved this in 2010. It was called microservices. Now we're making the same mistakes with LLMs…
Does Point Cloud Boost Spatial Reasoning of Large Language Models?
3D Large Language Models (LLMs) leveraging spatial information in point clouds for 3D spatial reasoning attract great attention. Despite some promising results, the role of point clouds in 3D spatial …
AI Can Find the Code. It Didn't Know How the System Worked
21 bug fixes, two models, same failures. Better LLMs marginally improve things, but still fail on system boundaries and integration.…
Help with historical documents transcriptions
Hey there! I’m currently trying to transcribe some historical data from the NYSE. Specifically, the stock prices and (weekly) volume of a set of stocks. At the moment, I have tried manually transcribing th…
AI prefers resumes written by itself: Self-preferencing in Algorithmic Hiring
As artificial intelligence (AI) tools become widely adopted, large language models (LLMs) are increasingly involved on both sides of decision-making processes, ranging from hiring to content moderatio…
Claude-Powered Agent Apparently Deletes Company Database, Debases Itself Further in Confession
AI agents are powered by the same obsequious LLMs as consumer chatbots.…
Mitigating Belief Inertia via Active Intervention in Embodied Agents
Recent advancements in large language models (LLMs) have enabled agents to tackle complex embodied tasks through environmental interaction. However, these agents still make suboptimal decisions and pe…
FormalScience: Scalable Human-in-the-Loop Autoformalisation of Science with Agentic Code Generation in Lean
Formalising informal mathematical reasoning into formally verifiable code is a significant challenge for large language models. In scientific fields such as physics, domain-specific machinery …
A Systematic Approach for Large Language Models Debugging
Large language models (LLMs) have become central to modern AI workflows, powering applications from open-ended text generation to complex agent-based reasoning. However, debugging these models remains…
Don't Make the LLM Read the Graph: Make the Graph Think
We investigate whether explicit belief graphs improve LLM performance in cooperative multi-agent reasoning. Through 3,000+ controlled trials across four LLM families in the cooperative card game Hanab…
Analytica: Soft Propositional Reasoning for Robust and Scalable LLM-Driven Analysis
Large language model (LLM) agents are increasingly tasked with complex real-world analysis (e.g., in financial forecasting, scientific discovery), yet their reasoning suffers from stochastic instabili…
Towards Automated Ontology Generation from Unstructured Text: A Multi-Agent LLM Approach
Automatically generating formal ontologies from unstructured natural language remains a central challenge in knowledge engineering. While large language models (LLMs) show promise, it remains unclear …
Discovering Agentic Safety Specifications from 1-Bit Danger Signals
Can large language model agents discover hidden safety objectives through experience alone? We introduce EPO-Safe (Experiential Prompt Optimization for Safe Agents), a framework where an LLM iterative…
CAP-CoT: Cycle Adversarial Prompt for Improving Chain of Thoughts in LLM Reasoning
Chain-of-Thought (CoT) prompting has emerged as a simple and effective way to elicit step-by-step solutions from large language models (LLMs). However, CoT reasoning can be unstable across runs on lon…
SoccerRef-Agents: Multi-Agent System for Automated Soccer Refereeing
Refereeing is vital in sports, where fair, accurate, and explainable decisions are fundamental. While intelligent assistant technologies are being widely adopted in soccer refereeing, current AI-assis…
IndustryAssetEQA: A Neurosymbolic Operational Intelligence System for Embodied Question Answering in Industrial Asset Maintenance
Industrial maintenance environments increasingly rely on AI systems to assist operators in understanding asset behavior, diagnosing failures, and evaluating interventions. Although large language mode…
Ulterior Motives: Detecting Misaligned Reasoning in Continuous Thought Models
Chain-of-Thought (CoT) reasoning has emerged as a key technique for eliciting complex reasoning in Large Language Models (LLMs). Although interpretable, its dependence on natural language limits the m…
FinGround: Detecting and Grounding Financial Hallucinations via Atomic Claim Verification
Financial AI systems must produce answers grounded in specific regulatory filings, yet current LLMs fabricate metrics, invent citations, and miscalculate derived quantities. These errors carry direct …
When AI reviews science: Can we trust the referee?
The volume of scientific submissions continues to climb, outpacing the capacity of qualified human referees and stretching editorial timelines. At the same time, modern large language models (LLMs) of…