47 stories tagged with #large-language-models, in publish-time order across the WeSearch catalog. Tag pages update as new stories ingest.
Visual Graph Scaffolds for Structural Reasoning in Large Language Models
Graphs have been used to enhance large language models (LLMs) for structured reasoning, mostly as external knowledge sources provided to models at test time. In this paper, we …
ChatHealthAI: Aligning Electronic Health Record Representations with Large Language Models for Grounded Clinical Reasoning
Large language models (LLMs) exhibit strong natural-language reasoning abilities for clinical decision support, but struggle to effectively model structured longitudinal electronic…
The Shadow Price of Reasoning: Economic Perspective on Optimal Budget Allocation for LLMs
Inference-time scaling has emerged as a critical avenue for enhancing Large Language Models' performance, yet real-world deployment is constrained by strict computational budgets. …
ClinicalMC: A Benchmark for Multi-Course Clinical Decision-Making with Large Language Models
Large language models (LLMs) have been widely adopted in healthcare, yet they still encounter significant challenges in complex clinical decision-making scenarios. Existing benchma…
From Answers to States: Verifiable Process-Level Evaluation of Chemical Reasoning in Large Language Models
Large language models are increasingly used as chemistry assistants, yet most chemistry benchmarks still score only final answers. This masks a critical failure mode: a model may o…
Code-on-Graph: Iterative Programmatic Reasoning via Large Language Models on Knowledge Graphs
Knowledge Graphs (KGs) are widely used to mitigate the limitations of Large Language Models (LLMs), such as outdated knowledge and hallucinations. Existing LLM-KG integration frame…
Why Are Large Language Models So Terrible at Video Games?
LLMs can code your retro shooter but still fail at playing Halo; see what this gap reveals about AI’s real limits in 2026…
Heuristic Parasites: A Behavioral Taxonomy of Recurrent Distortion Patterns in Large Language Models (Full System) V2
✨📊 🧠 The Ultimate Visual Guide to Large Language Models (LLMs)
Generative AI is a type of artificial intelligence that can produce new content including text,…
Reasoning, Code, or Both? How Large Language Models Handle Variations in Math Questions
Large Language Models (LLMs) achieve impressive accuracy on mathematical reasoning benchmarks, yet their performance drops when problems are modified with simple changes like diffe…
MedGuideX: Internalizing Decision Logic from Executable Guidelines into Large Language Models for Clinical Reasoning
Clinical practice guidelines (CPGs) encode evidence-based decision logic that clinicians apply by evaluating patient variables, conditional criteria, and recommendation rules. Howe…
Generating Robust Portfolios of Optimization Models using Large Language Models
Mathematical optimization is a powerful tool for structured decision-making across domains such as resource allocation and planning. Formulating optimization models faithful to rea…
Pretraining Data Exposure in Large Language Models: A Survey of Membership Inference, Data Contamination, and Security Implications
Large Language Models (LLMs) have become the predominant paradigm in NLP, advancing both research and industry. As model sizes and pretraining data grow, concerns about Pretraining…
Confidence Calibration in Large Language Models
We investigate the calibration of large language models' (LLMs') confidence across diverse tasks. The results of our preregistered study show that the current crop of LLMs are, lik…
Breaking the Chains of Probability: Neutrosophic Logic as a New Framework for Epistemic Uncertainty in Large Language Models
Large Language Models (LLMs) are predominantly governed by probabilistic frameworks in which the sum of outcome probabilities is constrained to unity. This architectural limitation…
HyperGuide: Hyperbolic Guidance for Efficient Multi-Step Reasoning in Large Language Models
Multi-step reasoning remains a central challenge for large language models: single-pass generation is efficient but lacks accuracy; tree-search methods explore multiple paths but a…
Distilling Game Code World Model Generation into Lightweight Large Language Models
Large Language Models (LLMs) have shown great ability in generating executable code from natural language, opening the possibility of automatically constructing environments for AI…
PALoRA: Projection-Adaptive LoRA for Preserving Reasoning in Large Language Models
Efficiently updating Large Language Models (LLMs) with new or evolving factual knowledge remains a central challenge, as even parameter-efficient adaptation can erode previously ac…
Jailbreak to Protect: Buffering and Reinforcing via Temporary Jailbreaking for Safe Fine-Tuning in Large Language Models
Fine-tuning-as-a-Service (FaaS) enables personalization of large language models (LLMs), but it can weaken safety-alignment under harmful fine-tuning attacks. Recent work has shown…
Summoning the Oracle to Slay It: Mitigating Look-Ahead Bias in Financial Backtesting with Large Language Models
Backtesting large language models (LLMs) on historical financial data is unreliable because pre-training cuts off after the events happened. An LLM trained in 2024 already "knows" …
Emotional intelligence in large language models is fragmented across perception, cognition, and interaction
As large language models (LLMs) are increasingly integrated into emotionally sensitive domains, the structural integrity of their emotional intelligence (EI) becomes a critical fro…
GENSTRAT: Toward a Science of Strategic Reasoning in Large Language Models
Large language models (LLMs) are increasingly deployed as economic agents in marketplaces, auctions, and bidding settings. Anticipating their behavior in any specific deployment is…
Evaluating Large Language Models in a Complex Hidden Role Game
Quantifying the deceptive potential of Large Language Models (LLMs) is critical for AI safety, yet difficult to achieve in uncontrolled environments. This work investigates the rea…
How Far Will They Go? Red-Teaming Online Influence with Large Language Models
As large language model (LLM)-based agents increasingly participate in online discourse, red-teaming their capacity to support political influence campaigns is critical for informa…
MadEvolve: Evolutionary Optimization of Trading Systems with Large Language Models
We explore the application of LLM-driven algorithm optimization to several common tasks in quantitative finance. MadEvolve, a general-purpose algorithm optimization framework inspi…
PlanningBench: Generating Scalable and Verifiable Planning Data for Evaluating and Training Large Language Models
Planning is a fundamental capability for large language models (LLMs) because such complex tasks require models to coordinate goals, constraints, resources, and long-term consequen…
DEL: Digit Entropy Loss for Numerical Learning of Large Language Models
Number prediction stands as a fundamental capability of large language models (LLMs) in mathematical problem-solving and code generation. The widely adopted maximum likelihood esti…
Machine-Learning-Enhanced Non-Invasive Testing for MASLD Fibrosis: Shallow-Deep Neural Networks Versus FIB-4, Tabular Foundation Models, and Large Language Models
Advanced fibrosis is a major determinant of liver-related morbidity in metabolic dysfunction-associated steatotic liver disease (MASLD). FIB-4 is widely used as a first-line non-in…
Can Large Language Models Revolutionize Survey Research? Experiments with Disaster Preparedness Responses
Survey research faces mounting structural challenges: declining response rates, sample bias, block-wise missingness among at-risk respondents, and AI-assisted fraudulent completion…
BLINKG: A Benchmark for LLM-Integrated Knowledge Graph Generation
Generating Knowledge Graphs (KGs) remains one of the most time-consuming and labor-intensive tasks for knowledge engineers, as they need to identify semantic equivalences between i…
DarkLLM: Learning Language-Driven Adversarial Attacks with Large Language Models
While vision and multimodal foundation models underpin critical tasks from perception to complex reasoning, they remain highly vulnerable to adversarial attacks. However, tradition…
Sketch Then Paint: Hierarchical Reinforcement Learning for Diffusion Multi-Modal Large Language Models
Diffusion Multi-Modal Large Language Models (dMLLMs) are powerful for image generation, but optimizing them through reinforcement learning (RL) remains a major challenge. One prima…
PersonaArena: Dynamic Simulation for Evaluating and Enhancing Persona-Level Role-Playing in Large Language Models
Large language models (LLMs) increasingly serve as interactive social agents, yet their ability to maintain coherent and authentic persona-level role-playing remains limited, parti…
ChemVA: Advancing Large Language Models on Chemical Reaction Diagrams Understanding
While Large Language Models (LLMs) have revolutionized scientific text processing, they exhibit a significant capability gap when interpreting chemical reaction diagrams. We identi…
CyberCorrect: A Cybernetic Framework for Closed-Loop Self-Correction in Large Language Models
Large language model (LLM) self-correction -- the ability to detect and fix errors in generated outputs -- remains largely ad hoc, relying on generic prompts such as "please recons…
Episodic-Semantic Memory Architecture for Long-Horizon Scientific Agents
As Large Language Models (LLMs) evolve into persistent scientific collaborators, context window saturation has emerged as a critical bottleneck. Scientific workflows involving iter…
TeleCom-Bench: How Far Are Large Language Models from Industrial Telecommunication Applications?
While Large Language Models have achieved remarkable integration in various vertical scenarios, their deployment in the telecommunications domain remains exploratory due to the lac…
How Large Language Models Are Reshaping the Trial Lifecycle
Solvita: Enhancing Large Language Models for Competitive Programming via Agentic Evolution
Large language models (LLMs) still struggle with the rigorous reasoning demands of hard competitive programming. While recent multi-agent frameworks attempt to bridge this reliabil…
Zero-Shot Goal Recognition with Large Language Models
Large language models have recently reached near-parity with classical planners on well-known planning domains, yet this competence relies on world-knowledge exploitation rather th…
Retrieval-Augmented Large Language Models for Schema-Constrained Clinical Information Extraction
Conversational nurse-patient transcripts contain actionable observations, but converting these transcripts into structured representations at scale remains challenging. Documentati…
ASRU: Activation Steering Meets Reinforcement Unlearning for Multimodal Large Language Models
Multimodal large language models (MLLMs) may memorize sensitive cross-modal information during pretraining, making machine unlearning (MU) crucial. Existing methods typically evalu…
LLMs as Linguistic Probes: A Graduate Student's Guide to Advanced Syntax, Semantics, and Efficient Fine-Tuning
The intersection of large language models (LLMs) and advanced linguistics has moved beyond…
Ask HN: What LLM models are you using and why?
The LLM Failure Atlas: A Structural Analysis of Failure Modes in Large Language Models (Free PDF)
Δ-Mem: Efficient Online Memory for Large Language Models
Large language models increasingly need to accumulate and reuse historical information in long-term assistants and agent systems. Simply expanding the context window is costly and …
InclusionAI/Ring-2.6-1T is now open-sourced
We’re on a journey to advance and democratize artificial intelligence through open source and open science.…