49 stories tagged with #data-analysis, in publish-time order across the WeSearch catalog. Tag pages update as new stories ingest.
Building a data analysis workstation (R9 9950X, 128GB RAM) – reusing old GPU & drives, need case/cooler/fan advice
Open Source Aviation Maps
Benchmarks & Tips for Big Data, Hadoop, AWS, Google Cloud, PostgreSQL, Spark, Python & More…
I built a data analysis tool but the real differentiator isn’t AI
MIT researchers teach AI models to interpret charts
Researchers used a novel data generation pipeline to build ChartNet, a large synthetic dataset of chart images paired with corresponding information. They used this training datase…
Am I a Bad Friend?
Importing 1.2 million messages from Telegram, VK, Instagram, Facebook, and Twitter into a structured Obsidian vault with local LLM inference - to measure how my friendships work wi…
Anatomy of a 5-4 Champions League Thriller: A Football Data Case Study
A football data case study on PSG vs Bayern, expected threat, metric choice, and why the real value is when you can ask the 2nd, 3rd, and 4th question cheaply.…
Python in Excel is more powerful than I initially estimated
A surprisingly powerful partnership…
Former Tesla data labelers say FSD relies on laborious mapping for hazards; crash data analysis shows Tesla exaggerates FSD's safety via flawed methodology (Reuters)
Why ipynb is a perfect format for saving AI data analysis conversations
You will learn why the ipynb notebook format is perfect for saving conversations with an AI data analyst.…
Why shouldn’t I just have Claude Code connect to the Postgres production DB for every data analysis task?
A coder fed 20 years of his messages to AI to audit his friendships
Software engineer Vadim Drobinin used GDPR data-access laws to download his entire chat history — ICQ and IRC logs from the 2000s, VK, Twitter, and Facebook from the 2010s, Instagr…
Pitfalls of Estimating Parameters from Aggregates
One of the most common mistakes in data analysis is treating observed aggregates as if they are the parameters themselves. Observed data is not the parameter — it is the result of …
Pandas GroupBy Explained With Examples
Learn how to use Pandas GroupBy to summarize, compare, and analyze grouped data with simple, practical examples.…
Python as a Declarative Programming Language (2017)
AI Making Work Easy for Data Analysts and Founders
A comprehensive full-stack data analysis platform for analysts…
TaBIIC2: Interactive Building of Ontological Taxonomies using Weighted Self-Organizing Maps
Ontologies represent the conceptual knowledge of a domain. At the core of an ontology is the taxonomy of concepts and subconcepts that represent specific entities, which can be com…
AI Cartography: Mapping the Latent Landscape of AI Benchmark Ecosystems
While aggregate leaderboard scores drive AI development, they contain substantial measurement noise whose sources and magnitudes remain unquantified, making it unclear when ranking…
If you've ever wondered how rigorous data analysis+social science research can look with AI, I've finally launched a nice website for my open-source Claude Code researcher's toolkit: the Data Analyst Augmentation Framework! Equal parts interactive explainer on agentic orchestration + free tool
DAAF: Rigorous+responsible data analysis/research with Claude Code (open-source)
A free, open-source AI toolkit for rigorous research. DAAF helps skilled researchers rapidly scale their expertise with Claude Code -- without sacrificing transparency, rigor, or r…
An AI Interface for Research Papers
The Research Paper Isn't Dead (Yet!)…
GPT Guesses Between 1 and 100
When asked to pick a random number between 1 and 100, ChatGPT does not follow a random uniform distribution - exmergo/research-chatgpt-guesses-between-1-and-100…
Gemma 4: A new, budget-focused model in Posit AI
Gemma 4 is…
DreamerNLplus: Interpretable Modeling of Mental Health Dynamics from Social Media Timelines using Hybrid Rule-Based and RAG Methods
We present DreamerNLplus, a hybrid framework for modeling mental health dynamics from social media timelines in the CLPsych 2026 shared task. Our system addresses three tasks: psyc…
CanLover – CAN Bus Analyzer for Vector/Peak/Kvaser on Windows and Linux
A modern, lightweight CAN bus analyzer for Linux and Windows. The free alternative to Vector CANalyzer and PEAK PCAN-View — native SocketCAN, DBC/SYM decoding, J1939, Signal Plot, …
Cache hit rates of Inference are more meaningful than the headline costs
Robust Subspace-Constrained Quadratic Models for Low-Dimensional Structure Learning
In this paper, we propose a robust subspace-constrained quadratic model (SCQM) for learning low-dimensional structure from high-dimensional data. Building upon the subspace-constra…
How AI Is Changing the QA Engineer Role in 2026: A Data Analysis
How Has the QA Engineer Job Description Changed in 2026? Open a QA Engineer job posting…
96% of IT pros use AI now: Their top 7 agentic applications and biggest implementation roadblocks
A new study points to an emerging skill set that is becoming more valuable in the AI age: validating AI outputs.…
Model Half-Life
I keep hearing people say that there is a model “half-life” which keeps dropping from years between model releases down to a few months, with the implied assumptions that model rel…
LLM Themes Are Not Observations
A practitioner's warning about generated variables in causal analysis…
AI Generated Code Looked Right, but the Data Was Wrong
I asked AI to load a CSV file for a medical data analysis use case. The code looked correct, but the dataframe was wrong. This is why checking AI output is so important.…
SQL Window Functions Beyond Basics: Solving Real Business Problems
You know window functions, but do you know how to use them to solve business problems? You will after you read this article.…
10 GitHub Repositories to Master Quant Trading
From your first backtest to a real trading system, here are GitHub repos that can seriously level up your quant trading skills fast.…
Position: Let's Develop Data Probes to Fundamentally Understand How Data Affects LLM Performance
Data is fundamental to large language models (LLMs). However, understanding of what makes certain data useful for different stages of an LLM workflow, including training, tuning, a…
From Intent to AI Pipelines: A Controlled Agentic Framework for Non-AI Expert Scientists
Artificial Intelligence (AI) pipelines have become integral to modern research, supporting fields such as Medical Sciences, Agriculture, and Social Sciences, and enabling large-sca…
KKRDB to set up policy planning and data analysis wing
KKRDB plans a new policy and data analysis wing to boost Kalyana Karnataka's GDP with targeted development initiatives.…
Why data sleuths are archiving the Jeffrey Epstein files: ‘We want to provide some clarity’
Tommy Carstensen oversees one of the most sophisticated archives of Epstein materials, while Tristan Lee’s database provides searches of faces who appear in the files…
A Conflict-aware Evidential Framework for Reliable Sleep Stage Classification
Multi-view learning has been widely applied for sleep stage classification using multi-modal data. However, existing methods typically assume that different modalities are well-ali…
The Infrastructure Behind Making Local LLM Agents Useful
Lessons from building a fast, reliable single-cell analysis agent on open-weight models…
188,000 Show HN posts, 14 years of data: what predicts GitHub stars
Does it actually matter when you post your Show HN? And does a front-page run translate into GitHub stars? I scraped 188,085 Show HN posts and cross-referenced the top 500 with the…
The Python Data Analytics Handbook: A Guide for Beginners
Python is a computer programming language often used to build websites and…
Data Analysis with Vector Functional Programming (2016) [video]
I Stopped Chunking My Logs. Then Gemma 4's 128K Context Found What I'd Missed for Weeks
Who actually needs data analysis aside from tech?
31 Pages, 30 Days. I Split the Pre/Post Data. The Win Predated the Work.
After 41 consecutive daily content improvements, one page looked like a win: 'forward pe calculator' at position 6.7 after the May 13 update. Then I split pre/post. The ranking was…
How I Built an AI Hotel Review Intelligence Platform in a Weekend (Prompts Included)
Hotel Grande Bretagne in Athens has a 9.3/10 on Booking.com. Here's what that score hides: Small…
PCA vs. Regression Slope
A regression fit can sometimes look surprisingly poor when plotted on top of data, with the two having visually different slopes. The reason is that the line that visually seems to…
I Taught SQL to Complete Beginners: Here's What Actually Happened
A trainer's honest, fun retrospective on teaching SQL from scratch - JOINs, window functions, CTEs, and one very messy hotel dataset.…
Show HN: Ragnerock, an AI data analysis tool
Research Intelligence Platform - Turn artificial intelligence into genuine knowledge…