1,000,000,000 rows of data. No hand-tuning. Just an agent, a benchmark, and a budget.
The 1 Billion Row Challenge is simple on paper: read a text file containing one billion weather-station measurements and compute the min, mean, and max temperature per station, as fast as possible. In Python, a naive solution takes minutes; the best human-optimized ones use memory-mapped files, multiprocessing, and NumPy.
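For context, here is roughly what that naive baseline looks like: a single pass over the file with a dict of running aggregates. This is a sketch, assuming the standard 1BRC input format of one `station;temperature` pair per line; the function name and structure are mine, not from any reference solution.

```python
from collections import defaultdict

def naive_1brc(path):
    """Naive single-pass baseline: one dict of running aggregates.

    Assumes the standard 1BRC input format: one "station;temperature"
    line per measurement (e.g. "Hamburg;12.0").
    """
    # station -> [min, max, sum, count]
    stats = defaultdict(lambda: [float("inf"), float("-inf"), 0.0, 0])
    with open(path) as f:
        for line in f:
            station, _, temp = line.rstrip("\n").partition(";")
            t = float(temp)
            s = stats[station]
            if t < s[0]:
                s[0] = t
            if t > s[1]:
                s[1] = t
            s[2] += t
            s[3] += 1
    # station -> (min, mean, max), sorted by station name
    return {
        name: (s[0], round(s[2] / s[3], 1), s[1])
        for name, s in sorted(stats.items())
    }
```

Every optimization trick in the challenge is about beating this loop: the per-line string handling and float parsing dominate the runtime at a billion rows.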
I'm not optimizing it by hand. I'm giving it to Hone — and letting it figure it out.
Hone is now on PyPI. Install it with `pip install hone-ai`.
This is a living document. I'll update it as each run completes. Follow the code at laxmena/hone-1brc.
A few weeks ago, I watched a Karpathy talk where he described running an agentic loop to auto-tune LLM fine-tuning pipelines. The core idea was simple: give the agent a goal, a way to measure progress, and let it iterate autonomously until it gets there.
I couldn't stop thinking about it.
Not because of the fine-tuning use case — but because the pattern felt universally useful. Most software has something you want to improve and a way to measure it. Why are we still doing the iteration loop by hand?
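The pattern itself fits in a few lines. Here is a minimal sketch of that goal/metric/iterate loop; the callables (`measure`, `propose`, `apply_patch`, `revert`) are hypothetical stand-ins for a benchmark run and an LLM call, not Hone's actual API.

```python
import time

def agentic_loop(measure, propose, apply_patch, revert, budget_s, target=None):
    """Goal + metric + autonomous iteration, as a sketch.

    measure()       -> float score of the current solution (lower is better)
    propose(hist)   -> a candidate change, or None to stop (LLM stand-in)
    apply_patch(p)  -> apply the candidate change
    revert()        -> undo the last applied change
    All four callables are hypothetical stand-ins for illustration.
    """
    deadline = time.monotonic() + budget_s
    best = measure()
    history = [("baseline", best)]
    while time.monotonic() < deadline:
        patch = propose(history)
        if patch is None:
            break
        apply_patch(patch)
        score = measure()
        if score < best:   # keep only improvements
            best = score
        else:              # otherwise roll back
            revert()
        history.append((patch, score))
        if target is not None and best <= target:
            break
    return best, history
```

The agent's job is just to make `propose` smart; everything else is plumbing, which is why the pattern generalizes beyond fine-tuning.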