#language-models — Tagged Stories

Every story in the WeSearch catalog tagged with #language-models, listed in publish-time order with view counts. There are currently 60 stories under this tag, and the page updates as new stories are ingested. Subscribe to the per-tag RSS feed to follow this topic in your reader of choice.

RELATED TAGS

#ai (122) · #ml (75) · #large-language-models (8) · #technology (5) · #reinforcement-learning (4) · #research (4) · #ai-research (3) · #openai (3) · #computation (3) · #anthropic (2) · #document-editing (2) · #cybersecurity (2)

ARXIV.ORG

Rethinking Uncertainty Evaluation in Large Language Models

arXiv:2607.19367v1 · Announce type: new · Abstract: Calibration is the primary criterion for evaluating LLM confidence, but it is insufficient: it admits trivially incoherent estimator…

21 views · Thu, 23 Jul 2026 04:00:00 GMT

#rethinking #uncertainty #evaluation

ARXIV.ORG

Logic-Guided Data Extraction with Answer Set Programming and Large Language Models

arXiv:2607.19365v1 · Announce type: new · Abstract: When Large Language Models (LLMs) are used for semantic data extraction from unstructured text, producing candidate relational facts…

19 views · Thu, 23 Jul 2026 04:00:00 GMT

#logic-guided #data #extraction

ARXIV.ORG

Statistically Grounded Sparse-Feature Interventions for Activation-Space Control in Large Language Models

arXiv:2607.19364v1 · Announce type: new · Abstract: Activation steering offers a lightweight alternative to fine-tuning for behavioral control of large language models, but SAE-based s…

22 views · Thu, 23 Jul 2026 04:00:00 GMT

#statistically #grounded #sparse-feature

ARXIV.ORG

Lifted Representation Hypothesis in Language Models

arXiv:2607.19360v1 · Announce type: new · Abstract: Large language models (LLMs) often answer queries by mapping individual observations to more general rule-like structures. However, …

15 views · Thu, 23 Jul 2026 04:00:00 GMT

#lifted #representation #hypothesis

ARXIV.ORG

Information Discernment in Large Language Models

arXiv:2607.19355v1 · Announce type: new · Abstract: LLMs are increasingly used with external knowledge sources like the internet. Do they weigh information appropriately -- updating mo…

18 views · Thu, 23 Jul 2026 04:00:00 GMT

#information #discernment #large

TOWARDS DATA SCIENCE

Loop Engineering with Adaptive Parsing in Action: Parsing Flat Tables with Azure and Figures with a Vision LLM

Enterprise Document Intelligence [Vol.1 #10B] - The LLM as last line of defence, then two real escalations walked end to end: a flat table to Azure, a figure to a vision model The …

28 views · Mon, 20 Jul 2026 15:00:00 GMT

#document intelligence #adaptive parsing #large language models

MIT TECHNOLOGY REVIEW

GPT-Red: an LLM super-hacker OpenAI built to make its models safer

Exclusive: The firm says it wants to future-proof its safety procedures and stay ahead of human attackers.…

Language Models coverage.

Augmenting Fundamental Analysis with Large Language Models: A RAG-Based System for Generating Investor Briefs

Integrating Large Language Models and Graph Convolutional Networks for Semi-Supervised Image Classification

Accelerating GPU Inference of Large Language Models with Moderately Unstructured Sparse Weight Matrices

A Unified Approach to Interpreting Knowledge Distillation for Large Language Models via Interactions

Narration-of-Thought: Inference-Time Scaffolding for Defeasible Ethical Reasoning in Large Language Models

Prompt Injection as Role Confusion

GPU Forecasters: Language Models as Selective Surrogates for Kernel Optimization

Benchmarking LLM-as-a-Judge for Long-Form Output Evaluation

Code-on-Graph: Iterative Programmatic Reasoning via Large Language Models on Knowledge Graphs

From Answers to States: Verifiable Process-Level Evaluation of Chemical Reasoning in Large Language Models

ClinicalMC: A Benchmark for Multi-Course Clinical Decision-Making with Large Language Models

Uncertainty-Aware Clarification in LLM Agents with Information Gain

Decomposing how prompting steers behavior

The Shadow Price of Reasoning: Economic Perspective on Optimal Budget Allocation for LLMs

SkillDAG: Self-Evolving Typed Skill Graphs for LLM Skill Selection at Scale

ChatHealthAI: Aligning Electronic Health Record Representations with Large Language Models for Grounded Clinical Reasoning

Visual Graph Scaffolds for Structural Reasoning in Large Language Models

Why Are Large Language Models So Terrible at Video Games?

Scaling Laws for Agent Harnesses via Effective Feedback Compute

Heuristic Parasites: A Behavioral Taxonomy of Recurrent Distortion Patterns in Large Language Models (Full System) V2

AI Propaganda factories with language models

✨📊 🧠 The Ultimate Visual Guide to Large Language Models (LLMs)

📄Paper: RORA-VLM: Robust Retrieval Augmentation for Vision Language Models

LLMs believe false statements even after explicit warnings that they're false

Why does AI love writing about lighthouse keepers?

How sure is the activation oracle?

PitchBench: Measuring Pitch Hearing in Audio-Language Models

Pretraining Data Exposure in Large Language Models: A Survey of Membership Inference, Data Contamination, and Security Implications

Gumbel Machine: Counterfactual Student Writing Generation via Gumbel Noise Steering

Detecting Is Not Resolving: The Monitoring Control Gap in Retrieval Augmented LLMs

Counteraction-Aware Multi-Teacher On-Policy Distillation for General Capability Recovery with Domain Preservation

Generating Robust Portfolios of Optimization Models using Large Language Models

Multi-Stakeholder LLM Alignment: Decomposing Estimation from Aggregation

What Makes Chain-of-Thought Work at Probe Time? Local Co-occurrence Rather Than Global Derivation

The Attribution Blind Spot: Detecting When Language Models Rely on Memory Rather Than Retrieved Context

Towards Feedback-to-Plan Decisions for Self-Evolving LLM Agents in CUDA Kernel Generation

AGORA: Adapter-Grounded Observation-Action Retention for Inference-Free Prompt Compression in LLM Agents

MedGuideX: Internalizing Decision Logic from Executable Guidelines into Large Language Models for Clinical Reasoning

The MiniMax-M2 Series: Mini Activations Unleashing Max Real-World Intelligence

Reasoning, Code, or Both? How Large Language Models Handle Variations in Math Questions

OmniToM: Benchmarking Theory of Mind in LLMs via Explicit Belief Modeling

Can LLMs Introspect? A Reality Check

Microsoft Research: LLMs Corrupt your files during delegated work

Sparse Autoencoders Reveal Cortical Brain-LLM Semantic Mapping

Prompt Politeness Affects LLM Accuracy

You don't need all the LLM benchmarks

Credit Assignment with Resets in Language Model Reasoning

Second Guess: Detecting Uncertainty Through Abstention and Answer Stability in Small Language Models

DarkForest: Less Talk, Higher Accuracy for Multi-Agent LLMs

Representation Without Control: Testing the Realization Effect in Language Models

Beyond the Frontier: Stochastic Backtracking for Efficient Test-Time Scaling

Trust but Verify: Prover-Verifier Deliberation for Selective LLM Prediction

Privacy-Preserving Local Language Models for Longitudinal Data Retrieval in Chronic Dermatologic Disease: Implementation in Pemphigus Patients
