#computer-vision — Tagged Stories

The 60 stories in the WeSearch catalog tagged with #computer-vision, listed in publish-time order with view counts. Tag pages update as new stories are ingested. Subscribe to the per-tag RSS feed to follow this topic in your reader of choice.


RELATED TAGS

#ai (78) · #ml (34) · #robotics (9) · #medical-imaging (4) · #technology (3) · #image-processing (3) · #deep-learning (3) · #image-generation (3) · #remote-sensing (2) · #video-modeling (2) · #research (2) · #video-editing (2)

ARXIV.ORG

Automated sign detection across the Electronic Babylonian Library

Learning to read cuneiform tablets is an extremely demanding task; consequently, of the roughly half million excavated tablets, only a small fraction has been analysed by Assyriolo…

10 views · Fri, 24 Jul 2026 02:06:35 GMT

#cuneiform #ocr

ARXIV.ORG

Mapping Networks: CVPR 2026 Best Paper Award Nominee

The escalating parameter counts in modern deep learning models pose a fundamental challenge to efficient training and resolution of overfitting. We address this by introducing the …

27 views · Fri, 26 Jun 2026 07:08:53 GMT

#deep learning #artificial intelligence

NVIDIA BLOG

NVIDIA Research Unlocks Advanced Grasping, Smarter Autonomous Driving and Agent Training at Scale

New NVIDIA Research breakthroughs show how training at scale — across gripper types, driving scenarios and virtual worlds — creates AI that generalizes to diverse applications.…

Computer Vision coverage.

Automated sign detection across the Electronic Babylonian Library

Mapping Networks: CVPR 2026 Best Paper Award Nominee

NVIDIA Research Unlocks Advanced Grasping, Smarter Autonomous Driving and Agent Training at Scale

Effect of Demographic Bias on Skin Lesion Classification

Apple's AI research will be in a computer vision conference before WWDC

Apple to showcase computer vision studies at annual conference in June

VISTA: An End-to-End Benchmark for Visual Spec-to-Web-App Coding Agents

AssetGen: Deployable 3D Asset Generation at Interactive Speed

FAST-GOAL: Fast and Efficient Global-local Object Alignment Learning

Mitigating Object Hallucinations in Vision-Language Models through Region-Aware Attention Recalibration

Lattice theory and algebraic models for deep convolutional learning based on mathematical morphology

In Search of the Ingredients of Open-Endedness: Replicating Picbreeder with Large Vision-Language Models

Computer Vision Engineer, Looking for advice

Online Hand Gesture Recognition Using 3D Convolutional Neural Networks

CHASD: Language Increment-Calibrated Contrastive Decoding against Hallucination in LVLMs

ChainFlow-VLA: Causal Flow Planning with Vision-Language Models

Coloring the Noise: Adversarial Sobolev Alignment for Faithful Image Super Resolution

SimInsert: Seamless Video Object Insertion via Regional Sparse Attention Fusion

Lipschitz Optimization for Formal Verification of Homographies

Exploiting Longitudinal Context in Clinician-Verified Interactive Lesion Tracking

CoReVAD: A Contextual Reasoning Framework for Training-Free Video Anomaly Detection

Dithering Defense: Adversarial Robustness of Vision Foundation Models via Multi-Level Floyd-Steinberg Dithering

The TIME Machine: On The Power of Motion for Efficient Perception

Suicide Risk Assessment from AI-powered Video Surveillance: An Interpretable Framework for Prevention in Metro Stations

Seeing without Looking: Do Vision-Language Benchmarks Really Test Vision?

USV: Towards Understanding the User-generated Short-form Videos

ArchSIBench: Benchmarking the Architectural Spatial Intelligence of Vision-Language Models

TASTE: A Designer-Annotated Multi-Dimensional Preference Dataset for AI-Generated Graphic Design

SAVER: Selective As-Needed Vision Evidence for Multimodal Information Extraction

Rethinking Cross-Layer Information Routing in Diffusion Transformers

Pareto-Enhanced Portrait Generation: Vision-Aligned Text Supervision for Alignment, Realism, and Aesthetics

Retrieval-Augmented Long-Context Translation for Cultural Image Captioning: Gators submission for AmericasNLP 2026 shared task

Accelerating Video Inverse Problem Solvers with Autoregressive Diffusion Models

Beyond Routing: Characterising Expert Tuning and Representation in Vision Mixture-of-Experts

Faster or Stronger: Towards Flexible Visual Place Recognition via Weighted Aggregation and Token Pruning

NeuroQA: A Large-Scale Image-Grounded Benchmark for 3D Brain MRI Understanding

ShadeBench: A Benchmark Dataset for Building Shade Simulation in Sustainable Society

Tippett-minimum Fusion of Representation-space Diffusion Models for Multi-Encoder Out-of-Distribution Detection

EPC-3D-Diff: Equivariant Physics Consistent Conditional 3D Latent Diffusion for CBCT to CT Synthesis

Pixel Wised Lesion Prediction on COVID-19 CT Imagery: A Comparative Analysis of Automated Image Segmentation Architectures

STELLAR: Scaling 3D Perception Large Models for Autonomous Driving

ConceptSeg-R1: Segment Any Concept via Meta-Reinforcement Learning

SUGAR: A Scalable Human-Video-Driven Generalizable Humanoid Loco-Manipulation Learning Framework

Latent Space Guided Scenario Sampling for Multimodal Segmentation Under Missing Modalities

FullFlow: Upgrading Text-to-Image Flow Matching Models for Bidirectional Vision-Language Generation

Tiny-Engram: Trigger-Indexed Concept Tables for Generative Vision

SDM: A Powerful Tool for Evaluating Model Robustness

Co-Fusion4D: Spatio-temporal Collaborative Fusion for Robust 3D Object Detection

FusionCell: Cross-Attentive Fusion of Layout Geometry and Netlist Topology for Standard-Cell Performance Prediction

JUDO: A Juxtaposed Domain-Oriented Multimodal Reasoner for Industrial Anomaly QA

Can Vision Models Truly Forget? Mirage: Representation-Level Certification of Visual Unlearning

ClaimDiff-RL: Fine-Grained Caption Reinforcement Learning through Visual Claim Comparison

Regulating Anatomy-Aware Rewards via Trajectory-Integral Feedback for Volumetric Computed Tomography Analysis

You Don't Need Attention: Gated Convolutional Modeling for Watch-Based Fall Detection

Generation of Heterogeneous PET Images from Uniform Organ Activity Maps Using a Pretrained Domain-Adapted Diffusion Model

AI-Assisted Competency Assessment from Egocentric Video in Simulation-Based Nursing Education

Conflict-Aware Additive Guidance for Flow Models under Compositional Rewards

AQuaUI: Visual Token Reduction for GUI Agents with Adaptive Quadtrees

Beyond the Cartesian Illusion: Testing Two-Stage Multi-Modal Theory of Mind under Perceptual Bottlenecks

TaskGround: Structured Executable Task Inference for Full-Scene Household Reasoning
