8 results for "text detection"
See No Evil: Semantic Context-Aware Privacy Risk Detection for AR
Augmented reality (AR) systems pose unique privacy risks due to their continuous capture of visual data. Existing AR privacy frameworks lack semantic understanding of visual content, limiting their ef…
Quantifying Divergence in Inter-LLM Communication Through API Retrieval and Ranking
Large language models (LLMs) increasingly operate as autonomous agents that reason over external APIs to perform complex tasks. However, their reliability and agreement remain poorly characterized. We…
BiTA: Bidirectional Gated Recurrent Unit-Transformer Aggregator in a Temporal Graph Network Framework for Alert Prediction in Computer Networks
Proactive alert prediction in computer networks is critical for mitigating evolving cyber threats and enabling timely defensive actions. Temporal Graph Neural Networks (TGNs) provide a principled fram…
Avionic Main Fuel Pump Simulation and Fault-Diagnosis Benchmark
In many cyber-physical systems, especially in critical applications such as aeroplanes, data to train anomaly detection and diagnosis algorithms is lacking due to data protection issues and partial ob…
A Systematic Approach for Large Language Models Debugging
Large language models (LLMs) have become central to modern AI workflows, powering applications from open-ended text generation to complex agent-based reasoning. However, debugging these models remains…
ZenBrain: A Neuroscience-Inspired 7-Layer Memory Architecture for Autonomous AI Systems
Despite a century of empirical memory research, existing AI agent memory systems rely on system-engineering metaphors (virtual-memory paging, flat LLM storage, Zettelkasten notes), none integrating pr…
Turbo-OCR Update: Layout Model + Multilingual
Follow-up to my post 18 days ago about the C++/CUDA OCR server. Two additions. Layout model: added PP-StructureV3 for layout detection. Multilingual: no longer Latin-only; now supports Chin…
Using PaddleOCR-VL-1.5 with llama-server for book OCR
I've been running PaddleOCR-VL-1.5 via llama.cpp's server for OCR on book pages. It handles complex layouts, tables, and mixed text/figure pages surprisingly well. Setup: - Model: PaddleOCR-VL-1.5-GGU…