WeSearch
SEARCH · LLMS

30 stories match your query across our 700+ source catalog. Ranked by relevance and recency.

ARXIV.ORG

LLMs Corrupt Your Documents When You Delegate

Large Language Models (LLMs) are poised to disrupt knowledge work, with the emergence of delegated work as a new interaction paradigm (e.g., vibe coding). Delegation requires trust: the expectation t…

· 4 views
NEWSWEEK

Yann LeCun: LLMs Are Nearing the End, but Better AI Is Coming (2025)

Yann LeCun, Chief AI Scientist at Meta, believes LLMs are doomed due to their inability to represent the high-dimensional spaces that characterize our world…

· 5 views
ARXIV.ORG

Context-Aware Hospitalization Forecasting Evaluations for Decision Support using LLMs

Medical and public health experts must make real-time resource decisions, such as expanding hospital bed capacity, based on projected hospitalization trends during large-scale healthcare disruptions (…

· 3 views
DMITRI LERKO

Running Local LLMs Offline on a Ten-Hour Flight

I flew from London to Google Cloud Next 2026 in Las Vegas. Ten hours with no in-flight wifi. I used the time to test how far a modern MacBook can carry engineering work on local LLMs alone. Setup A we…

· 3 views
LOCALLLAMA

What would be the best OS to run LLMs?

Hi there, I've ordered a mini PC with 128GB of RAM and the AMD AI Max 395. I intend to use it with Proxmox (like my current machine), where I run Windows for some gaming and macOS for my music library …

· 4 views
HACKER NEWS (NEWEST)

LLMs Can't Generate Influence

· 1 view
R/CSCAREERQUESTIONS

Are people using AI/LLMs in Defense or Secure Environments?

· 1 view
GITHUB

Show HN: Waiting for LLMs Suck – Give your user a game

Give your user a game while they wait for the LLM to return a result.…

· 4 views
ARXIV.ORG

GSAR: Typed Grounding for Hallucination Detection and Recovery in Multi-Agent LLMs

Autonomous multi-agent LLM systems are increasingly deployed to investigate operational incidents and produce structured diagnostic reports. Their trustworthiness hinges on whether each claim is groun…

· 3 views
R/TECHNOLOGY

Google DeepMind Paper Argues LLMs Will Never Be Conscious | Philosophers said the paper’s argument is sound, but that “all these arguments have been presented years and years ago.”

· 6 views
RAGNEROCK

Show HN: Ragnerock, an AI data analysis tool

Hi HN, I’m Matt Mahowald, and together with my cofounder John, we’re launching the public beta of Ragnerock today. As a data scientist, you spend the majority of your time wrangling data. Even though …

· 9 views
MEDIUM

Agents Are Microservices with a Brain

We solved this in 2010. It was called microservices. Now we're making the same mistakes with LLMs.…

· 1 view
ARXIV.ORG

Does Point Cloud Boost Spatial Reasoning of Large Language Models?

3D Large Language Models (LLMs) leveraging spatial information in point clouds for 3D spatial reasoning attract great attention. Despite some promising results, the role of point clouds in 3D spatial …

· 3 views
WESPISER

AI Can Find the Code. It Didn't Know How the System Worked

21 bug fixes, two models, same failures. Better LLMs marginally improve things, but still failed on system boundaries and integration.…

· 3 views
PROMPTENGINEERING

Help with historical documents transcriptions

Hey there! I’m currently trying to transcribe some historical data from the NYSE. Specifically, the stock prices and (weekly) volume of set stocks. At the moment, I have tried manually transcribing th…

· 6 views
ARXIV.ORG

AI prefers resumes written by itself: Self-preferencing in Algorithmic Hiring

As artificial intelligence (AI) tools become widely adopted, large language models (LLMs) are increasingly involved on both sides of decision-making processes, ranging from hiring to content moderatio…

· 5 views
GIZMODO

Claude-Powered Agent Apparently Deletes Company Database, Debases Itself Further in Confession

AI agents are powered by the same obsequious LLMs as consumer chatbots.…

· 4 views
ARXIV.ORG

Mitigating Belief Inertia via Active Intervention in Embodied Agents

Recent advancements in large language models (LLMs) have enabled agents to tackle complex embodied tasks through environmental interaction. However, these agents still make suboptimal decisions and pe…

· 2 views
ARXIV.ORG

FormalScience: Scalable Human-in-the-Loop Autoformalisation of Science with Agentic Code Generation in Lean

Formalising informal mathematical reasoning into formally verifiable code is a significant challenge for large language models. In scientific fields such as physics, domain-specific machinery…

· 3 views
ARXIV.ORG

A Systematic Approach for Large Language Models Debugging

Large language models (LLMs) have become central to modern AI workflows, powering applications from open-ended text generation to complex agent-based reasoning. However, debugging these models remains…

· 3 views
ARXIV.ORG

Don't Make the LLM Read the Graph: Make the Graph Think

We investigate whether explicit belief graphs improve LLM performance in cooperative multi-agent reasoning. Through 3,000+ controlled trials across four LLM families in the cooperative card game Hanab…

· 3 views
ARXIV.ORG

Analytica: Soft Propositional Reasoning for Robust and Scalable LLM-Driven Analysis

Large language model (LLM) agents are increasingly tasked with complex real-world analysis (e.g., in financial forecasting, scientific discovery), yet their reasoning suffers from stochastic instabili…

· 3 views
ARXIV.ORG

Towards Automated Ontology Generation from Unstructured Text: A Multi-Agent LLM Approach

Automatically generating formal ontologies from unstructured natural language remains a central challenge in knowledge engineering. While large language models (LLMs) show promise, it remains unclear …

· 3 views
ARXIV.ORG

Discovering Agentic Safety Specifications from 1-Bit Danger Signals

Can large language model agents discover hidden safety objectives through experience alone? We introduce EPO-Safe (Experiential Prompt Optimization for Safe Agents), a framework where an LLM iterative…

· 3 views
ARXIV.ORG

CAP-CoT: Cycle Adversarial Prompt for Improving Chain of Thoughts in LLM Reasoning

Chain-of-Thought (CoT) prompting has emerged as a simple and effective way to elicit step-by-step solutions from large language models (LLMs). However, CoT reasoning can be unstable across runs on lon…

· 3 views
ARXIV.ORG

SoccerRef-Agents: Multi-Agent System for Automated Soccer Refereeing

Refereeing is vital in sports, where fair, accurate, and explainable decisions are fundamental. While intelligent assistant technologies are being widely adopted in soccer refereeing, current AI-assis…

· 3 views
ARXIV.ORG

IndustryAssetEQA: A Neurosymbolic Operational Intelligence System for Embodied Question Answering in Industrial Asset Maintenance

Industrial maintenance environments increasingly rely on AI systems to assist operators in understanding asset behavior, diagnosing failures, and evaluating interventions. Although large language mode…

· 3 views
ARXIV.ORG

Ulterior Motives: Detecting Misaligned Reasoning in Continuous Thought Models

Chain-of-Thought (CoT) reasoning has emerged as a key technique for eliciting complex reasoning in Large Language Models (LLMs). Although interpretable, its dependence on natural language limits the m…

· 3 views
ARXIV.ORG

FinGround: Detecting and Grounding Financial Hallucinations via Atomic Claim Verification

Financial AI systems must produce answers grounded in specific regulatory filings, yet current LLMs fabricate metrics, invent citations, and miscalculate derived quantities. These errors carry direct …

· 3 views
ARXIV.ORG

When AI reviews science: Can we trust the referee?

The volume of scientific submissions continues to climb, outpacing the capacity of qualified human referees and stretching editorial timelines. At the same time, modern large language models (LLMs) of…

· 3 views