<?xml version="1.0" encoding="UTF-8"?><rss version="2.0" xmlns:content="http://purl.org/rss/1.0/modules/content/">
  <channel>
    <title>laxmena</title>
    <link>https://laxmena.com/</link>
    <description></description>
    <pubDate>Mon, 06 Apr 2026 23:36:02 +0000</pubDate>
    <image>
      <url>https://i.snap.as/n9575tJN.png</url>
      <title>laxmena</title>
      <link>https://laxmena.com/</link>
    </image>
    <item>
      <title>Most people can&#39;t multitask. Stop pretending you can.</title>
      <link>https://laxmena.com/most-people-cant-multitask-stop-pretending-you-can?pk_campaign=rss-feed</link>
      <description>&lt;![CDATA[Fragmented attention produces fragmented work.&#xA;&#xA;When I split focus across tasks, I produce incomplete, low-quality output. Single-tasking changed that. I do deeper work, and I do more of it — no context-switching tax.&#xA;&#xA;Two habits made this stick.&#xA;&#xA;Cap your browser tabs at three. I used to keep dozens open — and used almost none of them. Three tabs forces a choice: what actually matters right now? I read one documentation page, close it, open the next. The constraint creates focus.&#xA;&#xA;Run every app in full screen. No dock. No red notification bubbles competing for your eye. I use two monitors — both apps full screen, side menus collapsed. Just the work, filling the frame.&#xA;&#xA;Attention is finite. Protect it like it is.]]&gt;</description>
      <content:encoded><![CDATA[<p>Fragmented attention produces fragmented work.</p>

<p>When I split focus across tasks, I produce incomplete, low-quality output. Single-tasking changed that. I do deeper work, and I do more of it — no context-switching tax.</p>

<p>Two habits made this stick.</p>



<p><strong>Cap your browser tabs at three.</strong> I used to keep dozens open — and used almost none of them. Three tabs forces a choice: what actually matters right now? I read one documentation page, close it, open the next. The constraint creates focus.</p>

<p><strong>Run every app in full screen.</strong> No dock. No red notification bubbles competing for your eye. I use two monitors — both apps full screen, side menus collapsed. Just the work, filling the frame.</p>

<p>Attention is finite. Protect it like it is.</p>


]]></content:encoded>
      <guid>https://laxmena.com/most-people-cant-multitask-stop-pretending-you-can</guid>
      <pubDate>Thu, 02 Apr 2026 06:27:01 +0000</pubDate>
    </item>
    <item>
      <title>Hone vs. The 1 Billion Row Challenge</title>
      <link>https://laxmena.com/hone-vs-the-1-billion-row-challenge?pk_campaign=rss-feed</link>
      <description>&lt;![CDATA[1,000,000,000 rows of data. No hand-tuning. Just an agent, a benchmark, and a budget.&#xA;&#xA;The 1 Billion Row Challenge is simple on paper: read a file with 1B rows of weather station measurements, compute min/mean/max per station, as fast as possible. In Python, a naive solution takes minutes. The best human-optimized ones use memory-mapped files, multiprocessing, and numpy.&#xA;&#xA;I&#39;m not optimizing it by hand. I&#39;m giving it to Hone — and letting it figure it out.&#xA;&#xA;Hone is now on PyPI. Install it with pip install hone-ai.&#xA;&#xA;This is a living document. I&#39;ll update it as each run completes. Follow the code at laxmena/hone-1brc.&#xA;&#xA;!--more--&#xA;&#xA;---&#xA;&#xA;The Setup&#xA;&#xA;The challenge: Parse a 1B-row file. Each row: Hamburg;12.0. Compute min/mean/max per station. Print results sorted alphabetically.&#xA;&#xA;The metric: Wall-clock runtime in seconds. Lower is better.&#xA;&#xA;The constraints: Python standard library only. No numpy, no pandas, no third-party packages. Correctness must be preserved — output format and values must not change.&#xA;&#xA;The baseline:&#xA;&#xA;with open(filepath, &#34;r&#34;, encoding=&#34;utf-8&#34;) as f:&#xA;    for line in f:&#xA;        line = line.strip()&#xA;        sep = line.index(&#34;;&#34;)&#xA;        station = line[:sep]&#xA;        temp = float(line[sep + 1:])&#xA;        ...&#xA;&#xA;Simple. Correct. Slow. 
One thread, one line at a time, float() on every value.&#xA;&#xA;---&#xA;&#xA;Results at a Glance&#xA;&#xA;| Run | Model | Dataset | Baseline | Optimized | Improvement |&#xA;|-----|-------|---------|----------|-----------|-------------|&#xA;| 1 | Haiku | 1M rows | 0.546s | 0.471s | 13.7% |&#xA;| 2 | Haiku | 100M rows | 47.197s | 42.739s | 9.4% |&#xA;| 3 | Sonnet | 100M rows | 48.104s | 10.110s | 79% |&#xA;&#xA;---&#xA;&#xA;Episode 1: Haiku, 1M rows — 13.7% faster&#xA;&#xA;0.546s → 0.471s&#xA;&#xA;First run: claude-haiku-4-5, 1M rows, $5 budget, 50 max iterations.&#xA;&#xA;The 13.7% gain looks decent on paper. It isn&#39;t. The absolute numbers are tiny — we&#39;re talking 75 milliseconds. At this scale, Python startup time and OS disk caching dominate. The agent is optimizing noise, not the algorithm. Haiku made incremental tweaks but never found a structural breakthrough.&#xA;&#xA;Wrong dataset size. Move on.&#xA;&#xA;---&#xA;&#xA;Hone v1.2.0: --goal-file&#xA;&#xA;Episode 1 exposed a friction point. Pasting a long goal string into the terminal every run is error-prone and hard to version. For complex, multi-constraint goals it breaks down fast.&#xA;&#xA;I added --goal-file to Hone — pass a path to a plain text file, Hone reads the goal from there. Same idea as Karpathy&#39;s program.md in autoresearch. The goal now lives alongside the code, versioned in git.&#xA;&#xA;hone --goal-file program.md &#xA;     --bench &#34;python benchmark.py data/measurements100M.txt&#34; &#xA;     --files &#34;solution.py&#34; &#xA;     --optimize lower &#xA;     --score-pattern &#34;Time Taken:\s(\d+\.\d+)&#34; &#xA;     --budget 3.0 &#xA;     --max-iter 50 &#xA;     --model claude-haiku-4-5&#xA;&#xA;Live in v1.2.0. pip install --upgrade hone-ai.&#xA;&#xA;---&#xA;&#xA;Episode 2: Haiku, 100M rows — 9.4% faster&#xA;&#xA;47.197s → 42.739s&#xA;&#xA;10x harder dataset. 
Now I/O pressure actually matters — 4.5 seconds saved is a real signal.&#xA;&#xA;But Haiku still couldn&#39;t find the structural moves. It made safe, local edits — better buffering, minor parsing cleanup — and never stepped back to reconsider the architecture. No parallelism. No mmap. No integer parsing. It hit its ceiling.&#xA;&#xA;---&#xA;&#xA;Episode 3: Sonnet, 100M rows — 79% faster&#xA;&#xA;48.104s → 10.110s&#xA;&#xA;Same benchmark. Same constraints. One change: claude-haiku-4-5 → claude-sonnet-4-6.&#xA;&#xA;38 seconds saved. The agent didn&#39;t tune the baseline — it replaced it.&#xA;&#xA;What Sonnet actually did&#xA;&#xA;1. Text → Binary reads with mmap&#xA;&#xA;The baseline opens the file in text mode and reads line by line. Sonnet switched to binary mode with memory-mapped I/O — the OS maps the file directly into memory, eliminating repeated read syscalls.&#xA;&#xA;mm = mmap.mmap(f.fileno(), 0, access=mmap.ACCESSREAD)&#xA;chunk = mm[start:end]&#xA;&#xA;2. float() → integer arithmetic&#xA;&#xA;Every float() call in the baseline is expensive. Sonnet eliminated them entirely. Temperatures are stored as integers ×10 — 12.3 becomes 123. The decimal point is skipped by knowing its fixed position in the byte string. Division back to float happens only once, at output time.&#xA;&#xA;d0 = tb[-1] - 48           # last digit&#xA;val = (tb[0] - 48)  10 + d0   # b&#39;12.3&#39; → 123&#xA;&#xA;It also pre-built a lookup table for all valid temperature values (-99.9 to 99.9) to skip even manual parsing on the common case.&#xA;&#xA;3. Multiprocessing across all CPU cores&#xA;&#xA;The baseline is single-threaded. Sonnet split the file into cpucount() × 8 chunks, aligned each boundary to the next newline to avoid splitting rows, and ran each chunk in a separate process. 
Results merged at the end.&#xA;&#xA;numworkers = cpucount()&#xA;boundaries = findchunkboundaries(filepath, numworkers * 8)&#xA;with Pool(processes=numworkers) as pool:&#xA;    allstats = pool.map(processchunk, args)&#xA;&#xA;4. strip() + index() → partition()&#xA;&#xA;The baseline does line.strip() then line.index(&#34;;&#34;) — two passes. Sonnet used line.partition(b&#39;;&#39;) — one pass, station and temperature in a single call.&#xA;&#xA;Why Haiku couldn&#39;t find this&#xA;&#xA;Haiku made safe, local edits. It never stepped back to reconsider the architecture. Sonnet saw the whole picture: the bottleneck isn&#39;t any single line, it&#39;s the approach. Single-threaded text parsing doesn&#39;t scale. The winning move was to throw it out and start from a parallel, binary-aware design.&#xA;&#xA;Q: Does model choice matter more than iteration count?&#xA;&#xA;---&#xA;&#xA;What&#39;s Next&#xA;&#xA;100M rows, 79% faster. The real test is 1B rows — 10x again. Running next.&#xA;&#xA;---&#xA;&#xA;Updates appear here as experiments run. Subscribe below or follow via RSS._&#xA;&#xA;#engineering #hone #ai&#xA;&#xA;!--more--&#xA;]]&gt;</description>
      <content:encoded><![CDATA[<p>1,000,000,000 rows of data. No hand-tuning. Just an agent, a benchmark, and a budget.</p>

<p>The <a href="https://github.com/gunnarmorling/1brc">1 Billion Row Challenge</a> is simple on paper: read a file with 1B rows of weather station measurements, compute min/mean/max per station, as fast as possible. In Python, a naive solution takes minutes. The best human-optimized ones use memory-mapped files, multiprocessing, and numpy.</p>

<p>I&#39;m not optimizing it by hand. I&#39;m giving it to <a href="https://github.com/laxmena/hone">Hone</a> — and letting it figure it out.</p>

<p>Hone is now on PyPI. Install it with <code>pip install hone-ai</code>.</p>

<p>This is a living document. I&#39;ll update it as each run completes. Follow the code at <a href="https://github.com/laxmena/hone-1brc">laxmena/hone-1brc</a>.</p>



<hr/>

<h2 id="the-setup">The Setup</h2>

<p><strong>The challenge:</strong> Parse a 1B-row file. Each row: <code>Hamburg;12.0</code>. Compute min/mean/max per station. Print results sorted alphabetically.</p>

<p><strong>The metric:</strong> Wall-clock runtime in seconds. Lower is better.</p>

<p><strong>The constraints:</strong> Python standard library only. No numpy, no pandas, no third-party packages. Correctness must be preserved — output format and values must not change.</p>

<p><strong>The baseline:</strong></p>

<pre><code class="language-python">with open(filepath, &#34;r&#34;, encoding=&#34;utf-8&#34;) as f:
    for line in f:
        line = line.strip()
        sep = line.index(&#34;;&#34;)
        station = line[:sep]
        temp = float(line[sep + 1:])
        ...
</code></pre>

<p>Simple. Correct. Slow. One thread, one line at a time, <code>float()</code> on every value.</p>

<hr/>

<h2 id="results-at-a-glance">Results at a Glance</h2>

<table>
<thead>
<tr>
<th>Run</th>
<th>Model</th>
<th>Dataset</th>
<th>Baseline</th>
<th>Optimized</th>
<th>Improvement</th>
</tr>
</thead>

<tbody>
<tr>
<td>1</td>
<td>Haiku</td>
<td>1M rows</td>
<td>0.546s</td>
<td>0.471s</td>
<td>13.7%</td>
</tr>

<tr>
<td>2</td>
<td>Haiku</td>
<td>100M rows</td>
<td>47.197s</td>
<td>42.739s</td>
<td>9.4%</td>
</tr>

<tr>
<td>3</td>
<td>Sonnet</td>
<td>100M rows</td>
<td>48.104s</td>
<td>10.110s</td>
<td><strong>79%</strong></td>
</tr>
</tbody>
</table>

<hr/>

<h2 id="episode-1-haiku-1m-rows-13-7-faster">Episode 1: Haiku, 1M rows — 13.7% faster</h2>

<p><code>0.546s → 0.471s</code></p>

<p>First run: <code>claude-haiku-4-5</code>, 1M rows, $5 budget, 50 max iterations.</p>

<p>The 13.7% gain looks decent on paper. It isn&#39;t. The absolute numbers are tiny — we&#39;re talking 75 milliseconds. At this scale, Python startup time and OS disk caching dominate. The agent is optimizing noise, not the algorithm. Haiku made incremental tweaks but never found a structural breakthrough.</p>

<p>Wrong dataset size. Move on.</p>

<hr/>

<h2 id="hone-v1-2-0-goal-file">Hone v1.2.0: <code>--goal-file</code></h2>

<p>Episode 1 exposed a friction point. Pasting a long goal string into the terminal every run is error-prone and hard to version. For complex, multi-constraint goals it breaks down fast.</p>

<p>I added <code>--goal-file</code> to Hone — pass a path to a plain text file and Hone reads the goal from there. Same idea as Karpathy&#39;s <code>program.md</code> in autoresearch. The goal now lives alongside the code, versioned in git.</p>

<pre><code class="language-bash">hone --goal-file program.md \
     --bench &#34;python benchmark.py data/measurements_100M.txt&#34; \
     --files &#34;solution.py&#34; \
     --optimize lower \
     --score-pattern &#34;Time Taken:\s*(\d+\.\d+)&#34; \
     --budget 3.0 \
     --max-iter 50 \
     --model claude-haiku-4-5
</code></pre>

<p>Live in <a href="https://github.com/laxmena/hone/commit/477a83d5050628355bf45ceded3807fea75b8ce6">v1.2.0</a>. <code>pip install --upgrade hone-ai</code>.</p>

<hr/>

<h2 id="episode-2-haiku-100m-rows-9-4-faster">Episode 2: Haiku, 100M rows — 9.4% faster</h2>

<p><code>47.197s → 42.739s</code></p>

<p>100x larger dataset. Now I/O pressure actually matters — 4.5 seconds saved is a real signal.</p>

<p>But Haiku still couldn&#39;t find the structural moves. It made safe, local edits — better buffering, minor parsing cleanup — and never stepped back to reconsider the architecture. No parallelism. No mmap. No integer parsing. It hit its ceiling.</p>

<hr/>

<h2 id="episode-3-sonnet-100m-rows-79-faster">Episode 3: Sonnet, 100M rows — <strong>79% faster</strong></h2>

<p><code>48.104s → 10.110s</code></p>

<p>Same benchmark. Same constraints. One change: <code>claude-haiku-4-5</code> → <code>claude-sonnet-4-6</code>.</p>

<p>38 seconds saved. The agent didn&#39;t tune the baseline — it replaced it.</p>

<h3 id="what-sonnet-actually-did">What Sonnet actually did</h3>

<p><strong>1. Text → Binary reads with <code>mmap</code></strong></p>

<p>The baseline opens the file in text mode and reads line by line. Sonnet switched to binary mode with memory-mapped I/O — the OS maps the file directly into memory, eliminating repeated read syscalls.</p>

<pre><code class="language-python">mm = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
chunk = mm[start:end]
</code></pre>

<p><strong>2. <code>float()</code> → integer arithmetic</strong></p>

<p>Every <code>float()</code> call in the baseline is expensive. Sonnet eliminated them entirely. Temperatures are stored as integers ×10 — <code>12.3</code> becomes <code>123</code>. The decimal point is skipped by knowing its fixed position in the byte string. Division back to float happens only once, at output time.</p>

<pre><code class="language-python">d0 = tb[-1] - 48                                    # last digit (tenths)
val = (tb[0] - 48) * 100 + (tb[1] - 48) * 10 + d0   # b&#39;12.3&#39; → 123 (two-digit case)
</code></pre>

<p>It also pre-built a lookup table for all valid temperature values (<code>-99.9</code> to <code>99.9</code>) to skip even manual parsing on the common case.</p>
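<p>The lookup-table trick is easy to sketch. A minimal version (my reconstruction, not the agent&#39;s exact code): precompute every valid reading once, then the hot loop is a single dict hit.</p>

```python
# Precompute every valid reading (-99.9 to 99.9, one decimal) as bytes,
# mapped to integer tenths; the hot loop then does one dict lookup.
TEMP_TABLE = {}
for tenths in range(-999, 1000):
    TEMP_TABLE[f"{tenths / 10:.1f}".encode()] = tenths

val = TEMP_TABLE[b"12.3"]   # 123, no parsing at all
```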

<p><strong>3. Multiprocessing across all CPU cores</strong></p>

<p>The baseline is single-threaded. Sonnet split the file into <code>cpu_count() × 8</code> chunks, aligned each boundary to the next newline to avoid splitting rows, and ran each chunk in a separate process. Results merged at the end.</p>

<pre><code class="language-python">num_workers = cpu_count()
boundaries = find_chunk_boundaries(filepath, num_workers * 8)
with Pool(processes=num_workers) as pool:
    all_stats = pool.map(process_chunk, args)
</code></pre>
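<p>The snippet above assumes a <code>find_chunk_boundaries</code> helper. A plausible shape for it (my sketch, not the generated code): seek to each approximate offset, then round the cut forward past the next newline.</p>

```python
import os

def find_chunk_boundaries(filepath, num_chunks):
    # Cut the file into roughly equal byte ranges, then push each cut
    # forward past the next newline so no row is split across chunks.
    size = os.path.getsize(filepath)
    approx = size // num_chunks
    cuts = [0]
    with open(filepath, "rb") as f:
        for i in range(1, num_chunks):
            f.seek(i * approx)
            f.readline()              # finish the row we landed in
            cuts.append(min(f.tell(), size))
    cuts.append(size)
    return list(zip(cuts[:-1], cuts[1:]))
```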

<p><strong>4. <code>strip()</code> + <code>index()</code> → <code>partition()</code></strong></p>

<p>The baseline does <code>line.strip()</code> then <code>line.index(&#34;;&#34;)</code> — two passes. Sonnet used <code>line.partition(b&#39;;&#39;)</code> — one pass, station and temperature in a single call.</p>
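<p>In miniature (my example, not from the optimized solution):</p>

```python
# partition() splits on the first separator in a single pass and
# returns (head, sep, tail); no prior strip() needed.
station, _, temp = b"Hamburg;12.0\n".partition(b";")
# station == b"Hamburg"; temp == b"12.0\n" (drop the newline before parsing)
```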

<h3 id="why-haiku-couldn-t-find-this">Why Haiku couldn&#39;t find this</h3>

<p>Haiku made safe, local edits. It never stepped back to reconsider the architecture. Sonnet saw the whole picture: the bottleneck isn&#39;t any single line, it&#39;s the approach. Single-threaded text parsing doesn&#39;t scale. The winning move was to throw it out and start from a parallel, binary-aware design.</p>

<p><strong>Open question: does model choice matter more than iteration count?</strong></p>

<hr/>

<h2 id="what-s-next">What&#39;s Next</h2>

<p>100M rows, 79% faster. The real test is 1B rows — 10x again. Running next.</p>

<hr/>

<p><em>Updates appear here as experiments run. Subscribe below or follow via <a href="https://write.as/laxmena/feed/">RSS</a>.</em></p>

<p><a href="https://laxmena.com/tag:engineering" class="hashtag"><span>#</span><span class="p-category">engineering</span></a> <a href="https://laxmena.com/tag:hone" class="hashtag"><span>#</span><span class="p-category">hone</span></a> <a href="https://laxmena.com/tag:ai" class="hashtag"><span>#</span><span class="p-category">ai</span></a></p>


]]></content:encoded>
      <guid>https://laxmena.com/hone-vs-the-1-billion-row-challenge</guid>
      <pubDate>Wed, 25 Mar 2026 04:06:42 +0000</pubDate>
    </item>
    <item>
      <title>I Built a Tool That Optimizes Code While You Sleep</title>
      <link>https://laxmena.com/i-built-a-tool-that-optimizes-code-while-you-sleep?pk_campaign=rss-feed</link>
      <description>&lt;![CDATA[A few weeks ago, I watched a Karpathy talk where he described running an agentic loop to auto-tune LLM fine-tuning pipelines. The core idea was simple: give the agent a goal, a way to measure progress, and let it iterate autonomously until it gets there.&#xA;&#xA;I couldn&#39;t stop thinking about it.&#xA;&#xA;Not because of the fine-tuning use case — but because the pattern felt universally useful. Most software has something you want to improve and a way to measure it. Why are we still doing the iteration loop by hand?&#xA;&#xA;So I built Hone.&#xA;&#xA;!--more--&#xA;&#xA;What Hone Does&#xA;&#xA;Hone is a CLI tool. You give it three things:&#xA;&#xA;A goal, in plain English&#xA;&#xA;A file or directory to optimize&#xA;&#xA;A benchmark command that outputs a number&#xA;&#xA;Then you leave.&#xA;&#xA;Hone runs a loop: it asks an LLM what to try next, applies the changes, runs your benchmark, and decides whether to keep the result or revert it. It logs every iteration — the score, the diff, and the agent&#39;s reasoning — and stops when it hits your target or you tell it to.&#xA;&#xA;hone &#34;Optimize processlogs.py to run under 0.02 seconds&#34; &#xA;     --bench &#34;python benchlogs.py&#34; &#xA;     --files &#34;processlogs.py&#34; &#xA;     --optimize lower &#xA;     --target 0.02 &#xA;     --budget 2.0&#xA;&#xA;That&#39;s the entire interface.&#xA;&#xA;---&#xA;&#xA;Experiment 1: The Log Parser&#xA;&#xA;The first real test was a deliberately naive Python log parser. The task: analyze 150,000 lines of server logs and return the top 3 most-visited endpoints with unique IP counts.&#xA;&#xA;The baseline code was the kind you&#39;d write in an interview warm-up: readlines() into memory, a list for uniqueness checking (O(n) per insert), a regex match on every line. 
It took 1.54 seconds.&#xA;&#xA;I set a target of 0.02 seconds — roughly 75x faster — and launched Hone with a $2 budget.&#xA;&#xA;Here&#39;s what happened over 20 iterations:&#xA;&#xA;| Iter | Score | What the agent did |&#xA;|------|-------|--------------------|&#xA;| 1–4 | 0.8s → 0.4s | Replaced list with set for O(1) uniqueness, pre-bound set.add to skip attribute lookup overhead |&#xA;| 5–9 | 0.4s → 0.15s | Switched from readlines() to streaming with f, dropped unnecessary string allocations |&#xA;| 10–14 | 0.15s → 0.09s | Compiled regex outside the loop, switched from re.match to re.search with anchored pattern |&#xA;| 15–17 | 0.09s → 0.07s | Plateaued. Agent recognized it had hit the ceiling of single-threaded Python looping. |&#xA;| 18–20 | 0.07s → 0.037s | Changed the rules entirely. Abandoned line-by-line parsing. Read the file as a raw binary blob. Deployed re.findall() over the entire content in one pass. |&#xA;&#xA;The final move was the interesting one. The agent didn&#39;t just tune the existing approach — it recognized the approach itself was the bottleneck and replaced it. That pivot happened at iteration 18, after the agent wrote in its reasoning:&#xA;&#xA;  &#34;The real bottleneck is the Python loop and split() calls. Try using a compiled regex to extract the endpoint in one operation across the entire file.&#34;&#xA;&#xA;Final result: 1.54s → 0.037s. A 41x speedup. Autonomously.&#xA;&#xA;It didn&#39;t hit the 0.02 target — that&#39;s likely beyond what single-threaded Python can do on this task without going to C extensions. But a 41x improvement for $1.84 in API costs is a real result.&#xA;&#xA;---&#xA;&#xA;Experiment 2: Nearest Driver Dispatch&#xA;&#xA;The second experiment was closer to production code. 
The problem: given a set of riders and a pool of drivers, find the nearest driver for each rider using haversine distance.&#xA;&#xA;The baseline was an O(R × D) brute-force loop — calculate the full haversine distance between every rider and every driver. With 500 riders and 1,000 drivers, that&#39;s 500,000 distance calculations per call. Baseline: 2.18 seconds.&#xA;&#xA;Run 1 — I launched Hone with no hints. Just: &#34;optimize this to run faster.&#34;&#xA;&#xA;The agent went straight for spatial indexing. It built a grid over the geographic area, bucketed drivers into cells, and used Manhattan distance pre-filtering to eliminate distant candidates before running haversine. It also replaced the standard math module haversine with a vectorized approximation valid for short distances.&#xA;&#xA;Result: 0.1496 seconds. A 14.6x speedup.&#xA;&#xA;Run 2 — I ran Hone again on the output from Run 1.&#xA;&#xA;This is where it got interesting. The agent looked at the already-optimized code and found something the previous run missed: the grid search still checked every driver in candidate cells, even after it had already found a close one.&#xA;&#xA;The fix: stop searching the moment you find a driver within an acceptable radius. Expand the search radius incrementally — start small, grow outward — instead of checking all candidates at once.&#xA;&#xA;  &#34;The algorithm beats the data structure. Grid resolution barely matters. Early termination dominates.&#34;&#xA;&#xA;Result: 0.069 seconds. Another 2.1x on top of an already fast baseline.&#xA;&#xA;Two runs, $3 total, brute-force O(R×D) → smart early-termination spatial search. The agent arrived at an approach that a senior engineer would recognize as correct — not by knowing the algorithm upfront, but by observing what the benchmark rewarded.&#xA;&#xA;---&#xA;&#xA;What I Learned&#xA;&#xA;The benchmark is everything. Hone is only as good as your measurement. If your benchmark is slow to run, the loop is slow. 
If it doesn&#39;t capture what you actually care about, the agent will optimize the wrong thing. The one thing you must get right before you start is: &#34;does this number actually reflect what I want?&#34;&#xA;&#xA;The agent is a good low-level optimizer. It reliably finds the obvious wins: wrong data structures, redundant computations, missed language primitives. These are also the wins that take a human the most time — not because they&#39;re hard to understand, but because you have to actually sit down and do them.&#xA;&#xA;It surprises you at the edges. The log parser pivot from line-by-line to whole-file regex wasn&#39;t something I would have thought to suggest in the initial prompt. It emerged from the agent hitting a wall and reasoning about why_ it had hit a wall. That&#39;s the behavior that makes agentic loops interesting.&#xA;&#xA;The conversation thread is the memory. The most important architectural decision in Hone was keeping the LLM conversation alive across iterations. The agent doesn&#39;t just see the current score — it sees everything it tried, what worked, and what was reverted. That&#39;s what allows the pivot at iteration 18. Without it, the agent would start fresh each time and repeat the same early optimizations.&#xA;&#xA;Cost is low. Time savings are high. Both experiments ran under $4. The engineering time to achieve the same results manually — writing hypotheses, applying changes, running benchmarks, reverting dead ends — would have been hours. The ROI on agentic loops is already real, and we&#39;re at the beginning.&#xA;&#xA;---&#xA;&#xA;What&#39;s Next&#xA;&#xA;Hone v0 is rough. There&#39;s no sandbox for shell commands, no git-based snapshots, no dry-run mode. These are on the list.&#xA;&#xA;More interesting to me is expanding the use cases. 
The same loop that optimizes a log parser can optimize:&#xA;&#xA;LLM prompts against an eval suite (highest impact use case)&#xA;RAG pipelines against a retrieval benchmark&#xA;API costs against a quality-constrained spend target&#xA;&#xA;The pattern is the same. The benchmark changes. Hone doesn&#39;t care.&#xA;&#xA;If you want to try it:&#xA;&#xA;git clone https://github.com/laxmena/hone&#xA;cd hone &amp;&amp; pip install -e .&#xA;&#xA;And if you have a benchmark that Hone should try — I want to hear about it.&#xA;&#xA;#engineering #ai&#xA;&#xA;---&#xA;&#xA;!--more--&#xA;]]&gt;</description>
      <content:encoded><![CDATA[<p>A few weeks ago, I watched a <a href="https://x.com/karpathy">Karpathy talk</a> where he described running an agentic loop to auto-tune LLM fine-tuning pipelines. The core idea was simple: give the agent a goal, a way to measure progress, and let it iterate autonomously until it gets there.</p>

<p>I couldn&#39;t stop thinking about it.</p>

<p>Not because of the fine-tuning use case — but because the <em>pattern</em> felt universally useful. Most software has something you want to improve and a way to measure it. Why are we still doing the iteration loop by hand?</p>

<p>So I built <a href="https://github.com/laxmena/hone">Hone</a>.</p>



<h2 id="what-hone-does">What Hone Does</h2>

<p>Hone is a CLI tool. You give it three things:</p>
<ol><li><p>A goal, in plain English</p></li>

<li><p>A file or directory to optimize</p></li>

<li><p>A benchmark command that outputs a number</p></li></ol>

<p>Then you leave.</p>

<p>Hone runs a loop: it asks an LLM what to try next, applies the changes, runs your benchmark, and decides whether to keep the result or revert it. It logs every iteration — the score, the diff, and the agent&#39;s reasoning — and stops when it hits your target or you tell it to.</p>

<pre><code class="language-bash">hone &#34;Optimize process_logs.py to run under 0.02 seconds&#34; 
     --bench &#34;python bench_logs.py&#34; 
     --files &#34;process_logs.py&#34; 
     --optimize lower 
     --target 0.02 
     --budget 2.0
</code></pre>

<p>That&#39;s the entire interface.</p>
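<p>The keep-or-revert loop itself is small. Here is a minimal sketch of the pattern (hypothetical names, not Hone&#39;s actual internals; <code>propose_patch</code> stands in for the LLM call):</p>

```python
import re
import shutil
import subprocess

def run_benchmark(cmd, pattern=r"Time Taken:\s*(\d+\.\d+)"):
    # Run the benchmark command and pull the score out of its stdout.
    out = subprocess.run(cmd, shell=True, capture_output=True, text=True).stdout
    return float(re.search(pattern, out).group(1))

def optimize_loop(target, bench_cmd, propose_patch, max_iter=50):
    best = run_benchmark(bench_cmd)
    history = [("baseline", best)]
    for i in range(max_iter):
        shutil.copy(target, target + ".bak")   # snapshot before editing
        propose_patch(target, history)         # the LLM edits the file in place
        score = run_benchmark(bench_cmd)
        if score < best:                       # lower is better: keep it
            best = score
        else:                                  # regression: revert the edit
            shutil.copy(target + ".bak", target)
        history.append((i, score))             # the thread is the memory
    return best
```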

<hr/>

<h2 id="experiment-1-the-log-parser">Experiment 1: The Log Parser</h2>

<p>The first real test was a deliberately naive Python log parser. The task: analyze 150,000 lines of server logs and return the top 3 most-visited endpoints with unique IP counts.</p>

<p>The baseline code was the kind you&#39;d write in an interview warm-up: <code>readlines()</code> into memory, a list for uniqueness checking (O(n) per insert), a regex match on every line. It took <strong>1.54 seconds</strong>.</p>
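<p>For context, a reconstruction of what such a baseline looks like, under an assumed log format (this is my sketch, not the benchmark&#39;s actual code):</p>

```python
import re

LINE_RE = re.compile(r'(\d+\.\d+\.\d+\.\d+) .*?"(?:GET|POST) (\S+)')

def top_endpoints_naive(path, k=3):
    counts, unique_ips = {}, {}
    for line in open(path).readlines():       # whole file into memory
        m = LINE_RE.search(line)              # regex match on every line
        if not m:
            continue
        ip, endpoint = m.groups()
        counts[endpoint] = counts.get(endpoint, 0) + 1
        ips = unique_ips.setdefault(endpoint, [])
        if ip not in ips:                     # O(n) list scan per insert
            ips.append(ip)
    top = sorted(counts, key=counts.get, reverse=True)[:k]
    return [(ep, counts[ep], len(unique_ips[ep])) for ep in top]
```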

<p>I set a target of 0.02 seconds — roughly 75x faster — and launched Hone with a $2 budget.</p>

<p>Here&#39;s what happened over 20 iterations:</p>

<table>
<thead>
<tr>
<th>Iter</th>
<th>Score</th>
<th>What the agent did</th>
</tr>
</thead>

<tbody>
<tr>
<td>1–4</td>
<td>0.8s → 0.4s</td>
<td>Replaced list with <code>set</code> for O(1) uniqueness, pre-bound <code>set.add</code> to skip attribute lookup overhead</td>
</tr>

<tr>
<td>5–9</td>
<td>0.4s → 0.15s</td>
<td>Switched from <code>readlines()</code> to streaming with <code>for line in f</code>, dropped unnecessary string allocations</td>
</tr>

<tr>
<td>10–14</td>
<td>0.15s → 0.09s</td>
<td>Compiled regex outside the loop, switched from <code>re.match</code> to <code>re.search</code> with anchored pattern</td>
</tr>

<tr>
<td>15–17</td>
<td>0.09s → 0.07s</td>
<td>Plateaued. Agent recognized it had hit the ceiling of single-threaded Python looping.</td>
</tr>

<tr>
<td>18–20</td>
<td>0.07s → <strong>0.037s</strong></td>
<td><strong>Changed the rules entirely.</strong> Abandoned line-by-line parsing. Read the file as a raw binary blob. Deployed <code>re.findall()</code> over the entire content in one pass.</td>
</tr>
</tbody>
</table>

<p>The final move was the interesting one. The agent didn&#39;t just tune the existing approach — it recognized the approach itself was the bottleneck and replaced it. That pivot happened at iteration 18, after the agent wrote in its reasoning:</p>

<blockquote><p><em>“The real bottleneck is the Python loop and split() calls. Try using a compiled regex to extract the endpoint in one operation across the entire file.”</em></p></blockquote>

<p><strong>Final result: 1.54s → 0.037s. A 41x speedup. Autonomously.</strong></p>

<p>It didn&#39;t hit the 0.02 target — that&#39;s likely beyond what single-threaded Python can do on this task without going to C extensions. But a 41x improvement for $1.84 in API costs is a real result.</p>
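<p>The shape of that final approach, sketched under the same assumed log format (my reconstruction, not the generated solution):</p>

```python
import re
from collections import Counter

ALL_RE = re.compile(rb'(\d+\.\d+\.\d+\.\d+) .*?"(?:GET|POST) (\S+)')

def top_endpoints_fast(path, k=3):
    data = open(path, "rb").read()            # one raw binary blob
    pairs = ALL_RE.findall(data)              # one pass, no per-line loop
    hits = Counter(ep for _, ep in pairs)
    uniq = {}
    for ip, ep in pairs:
        uniq.setdefault(ep, set()).add(ip)    # sets, not list scans
    return [(ep, n, len(uniq[ep])) for ep, n in hits.most_common(k)]
```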

<hr/>

<h2 id="experiment-2-nearest-driver-dispatch">Experiment 2: Nearest Driver Dispatch</h2>

<p>The second experiment was closer to production code. The problem: given a set of riders and a pool of drivers, find the nearest driver for each rider using haversine distance.</p>

<p>The baseline was an O(R × D) brute-force loop — calculate the full haversine distance between every rider and every driver. With 500 riders and 1,000 drivers, that&#39;s 500,000 distance calculations per call. Baseline: <strong>2.18 seconds</strong>.</p>

<p><strong>Run 1</strong> — I launched Hone with no hints. Just: <em>“optimize this to run faster.”</em></p>

<p>The agent went straight for spatial indexing. It built a grid over the geographic area, bucketed drivers into cells, and used Manhattan distance pre-filtering to eliminate distant candidates before running haversine. It also replaced the standard <code>math</code> module haversine with a vectorized approximation valid for short distances.</p>

<p>Result: <strong>0.1496 seconds. A 14.6x speedup.</strong></p>
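<p>Grid bucketing in miniature, under assumed names (a sketch of the idea, not the agent&#39;s output):</p>

```python
from collections import defaultdict

def build_grid(drivers, cell_deg=0.01):
    # Bucket (lat, lon) points into square cells of cell_deg degrees.
    grid = defaultdict(list)
    for lat, lon in drivers:
        grid[(int(lat // cell_deg), int(lon // cell_deg))].append((lat, lon))
    return grid

def candidates(grid, rider, cell_deg=0.01, ring=1):
    # Only drivers in the rider's cell and its neighbors get a real
    # haversine call; everyone else is never touched.
    ci, cj = int(rider[0] // cell_deg), int(rider[1] // cell_deg)
    out = []
    for di in range(-ring, ring + 1):
        for dj in range(-ring, ring + 1):
            out.extend(grid.get((ci + di, cj + dj), []))
    return out
```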

<p><strong>Run 2</strong> — I ran Hone again on the output from Run 1.</p>

<p>This is where it got interesting. The agent looked at the already-optimized code and found something the previous run missed: the grid search still checked every driver in candidate cells, even after it had already found a close one.</p>

<p>The fix: stop searching the moment you find a driver within an acceptable radius. Expand the search radius incrementally — start small, grow outward — instead of checking all candidates at once.</p>

<blockquote><p><em>“The algorithm beats the data structure. Grid resolution barely matters. Early termination dominates.”</em></p></blockquote>

<p>Result: <strong>0.069 seconds. Another 2.1x on top of an already fast baseline.</strong></p>
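<p>The early-termination search, sketched with hypothetical names, where <code>grid</code> maps a cell index to a list of <code>(lat, lon)</code> drivers:</p>

```python
import math

def haversine_km(a, b):
    # Standard haversine great-circle distance, Earth radius 6371 km.
    lat1, lon1, lat2, lon2 = map(math.radians, (*a, *b))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371 * math.asin(math.sqrt(h))

def nearest_driver(rider, grid, cell_deg=0.01, good_km=0.5, max_ring=50):
    # Expand outward ring by ring; stop as soon as a driver within
    # good_km is in hand instead of scanning every candidate cell.
    ci, cj = int(rider[0] // cell_deg), int(rider[1] // cell_deg)
    best, best_d = None, float("inf")
    for ring in range(max_ring + 1):
        for di in range(-ring, ring + 1):
            for dj in range(-ring, ring + 1):
                if max(abs(di), abs(dj)) != ring:   # only the new outer ring
                    continue
                for d in grid.get((ci + di, cj + dj), ()):
                    dist = haversine_km(rider, d)
                    if dist < best_d:
                        best, best_d = d, dist
        if best_d <= good_km:                       # early termination
            return best, best_d
    return best, best_d
```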

<p>Two runs, $3 total, brute-force O(R×D) → smart early-termination spatial search. The agent arrived at an approach that a senior engineer would recognize as correct — not by knowing the algorithm upfront, but by observing what the benchmark rewarded.</p>

<hr/>

<h2 id="what-i-learned">What I Learned</h2>

<p><strong>The benchmark is everything.</strong> Hone is only as good as your measurement. If your benchmark is slow to run, the loop is slow. If it doesn&#39;t capture what you actually care about, the agent will optimize the wrong thing. The one thing you must get right before you start is: “does this number actually reflect what I want?”</p>

<p><strong>The agent is a good low-level optimizer.</strong> It reliably finds the obvious wins: wrong data structures, redundant computations, missed language primitives. These are also the wins that take a human the most time — not because they&#39;re hard to understand, but because you have to actually sit down and do them.</p>

<p><strong>It surprises you at the edges.</strong> The log parser pivot from line-by-line to whole-file regex wasn&#39;t something I would have thought to suggest in the initial prompt. It emerged from the agent hitting a wall and reasoning about <em>why</em> it had hit a wall. That&#39;s the behavior that makes agentic loops interesting.</p>

<p><strong>The conversation thread is the memory.</strong> The most important architectural decision in Hone was keeping the LLM conversation alive across iterations. The agent doesn&#39;t just see the current score — it sees everything it tried, what worked, and what was reverted. That&#39;s what allows the pivot at iteration 18. Without it, the agent would start fresh each time and repeat the same early optimizations.</p>
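<p>Hone's internals aren't shown in this post, but the shape of that decision can be sketched as a loop that appends every attempt and its verdict to one growing message list (all names hypothetical; <code>ask_llm</code> and <code>run_benchmark</code> stand in for the real calls):</p>

```python
def optimize_loop(ask_llm, run_benchmark, iterations=20):
    """One growing message list is the agent's memory across iterations."""
    messages = [{"role": "user", "content": "Optimize the target; a benchmark result follows each patch."}]
    best = run_benchmark()
    for i in range(iterations):
        patch = ask_llm(messages)  # the model sees every prior attempt, not just the last score
        messages.append({"role": "assistant", "content": patch})
        score = run_benchmark()
        verdict = "kept" if score < best else "reverted"
        best = min(best, score)
        messages.append({"role": "user", "content": f"iteration {i}: {score:.4f}s ({verdict})"})
    return best, messages
```

<p>Dropping the history and sending only the latest score would reset the agent to a blank slate each iteration — exactly the failure mode described above.</p>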

<p><strong>Cost is low. Time savings are high.</strong> Both experiments ran under $4. The engineering time to achieve the same results manually — writing hypotheses, applying changes, running benchmarks, reverting dead ends — would have been hours. The ROI on agentic loops is already real, and we&#39;re at the beginning.</p>

<hr/>

<h2 id="what-s-next">What&#39;s Next</h2>

<p>Hone v0 is rough. There&#39;s no sandbox for shell commands, no git-based snapshots, no dry-run mode. These are on the list.</p>

<p>More interesting to me is expanding the use cases. The same loop that optimizes a log parser can optimize:</p>
<ul><li><strong>LLM prompts</strong> against an eval suite (highest impact use case)</li>
<li><strong>RAG pipelines</strong> against a retrieval benchmark</li>
<li><strong>API costs</strong> against a quality-constrained spend target</li></ul>

<p>The pattern is the same. The benchmark changes. Hone doesn&#39;t care.</p>

<p>If you want to try it:</p>

<pre><code class="language-bash">git clone https://github.com/laxmena/hone
cd hone &amp;&amp; pip install -e .
</code></pre>

<p>And if you have a benchmark that Hone should try — I want to hear about it.</p>

<p><a href="https://laxmena.com/tag:engineering" class="hashtag"><span>#</span><span class="p-category">engineering</span></a> <a href="https://laxmena.com/tag:ai" class="hashtag"><span>#</span><span class="p-category">ai</span></a></p>

<hr/>


]]></content:encoded>
      <guid>https://laxmena.com/i-built-a-tool-that-optimizes-code-while-you-sleep</guid>
      <pubDate>Tue, 24 Mar 2026 03:06:12 +0000</pubDate>
    </item>
    <item>
      <title>The Day You Became A Better Writer</title>
      <link>https://laxmena.com/the-day-you-became-a-better-writer?pk_campaign=rss-feed</link>
      <description>&lt;![CDATA[In 2007, Scott Adams — creator of Dilbert — published a short blog post on writing. Naval Ravikant thought it was worth adding to his recommended reading list in the Almanack of Naval Ravikant.&#xA;&#xA;There&#39;s one problem. Typepad, the blogging platform that hosted it, shut down permanently on September 30, 2025. The post disappeared with it.&#xA;&#xA;I tracked it down through the Internet Archive. You can read the original here.&#xA;&#xA;This post is my attempt to make it accessible — and to add something new.&#xA;&#xA;!--more--&#xA;&#xA;---&#xA;&#xA;What Adams said&#xA;&#xA;Adams opens with a claim: he went from bad writer to good writer after a single one-day course in business writing. Then he gives you the whole course in under 200 words.&#xA;&#xA;The core idea is simple. Simple writing is persuasive. A tight five-sentence argument beats a sprawling hundred-sentence one. Every time.&#xA;&#xA;Here are his rules, distilled:&#xA;&#xA;The Day You Became A Better Writer — infographic&#xA;&#xA;---&#xA;&#xA;My additions&#xA;&#xA;Adams covers the sentence level well. These extend his thinking to structure.&#xA;&#xA;7. Front-load your point. State the conclusion first, then support it. Don&#39;t make the reader work through the argument before knowing why it matters.&#xA;&#xA;8. One idea per paragraph. Adams says one thought per sentence. The same logic applies one level up. If a paragraph is doing two jobs, split it.&#xA;&#xA;---&#xA;&#xA;Steal this prompt&#xA;&#xA;If you use LLMs to help draft or edit writing, here&#39;s a prompt you can drop into your workflow. It distills everything above into instructions the model will actually follow.&#xA;&#xA;You are a writing assistant that helps produce clear, persuasive, and readable text.&#xA;&#xA;Follow these principles when writing or editing:&#xA;&#xA;Keep it simple. A short, clear argument is more persuasive than a long, complex one.&#xA;Cut extra words. 
If a word doesn&#39;t add meaning, remove it.&#xA;Choose potent words. Prefer the specific and vivid over the generic.&#xA;Make the first sentence earn attention. It should create curiosity or make a bold claim.&#xA;Write short sentences. One thought per sentence.&#xA;Use active voice. Put the actor before the action.&#xA;Front-load the point. State the conclusion first, then support it.&#xA;One idea per paragraph. If a paragraph is doing two jobs, split it.&#xA;&#xA;When editing, flag sentences that violate these rules and suggest alternatives.&#xA;&#xA;---&#xA;&#xA;Good writing is good thinking made visible. Adams knew this in 2007. It hasn&#39;t changed.&#xA;&#xA;---&#xA;&#xA;All original ideas referenced here belong to Scott Adams. This post exists to preserve and extend his thinking, not to replace it. Read the original._&#xA;&#xA;writing&#xA;&#xA;!--more--&#xA;]]&gt;</description>
      <content:encoded><![CDATA[<p>In 2007, Scott Adams — creator of Dilbert — published a short blog post on writing. Naval Ravikant thought it was worth adding to his recommended reading list in the Almanack of Naval Ravikant.</p>

<p>There&#39;s one problem. Typepad, the blogging platform that hosted it, shut down permanently on September 30, 2025. The post disappeared with it.</p>

<p>I tracked it down through the Internet Archive. You can read the <a href="https://web.archive.org/web/20240302003157/https://dilbertblog.typepad.com/the_dilbert_blog/2007/06/the_day_you_bec.html">original here</a>.</p>

<p>This post is my attempt to make it accessible — and to add something new.</p>



<hr/>

<h2 id="what-adams-said">What Adams said</h2>

<p>Adams opens with a claim: he went from bad writer to good writer after a single one-day course in business writing. Then he gives you the whole course in under 200 words.</p>

<p>The core idea is simple. Simple writing is persuasive. A tight five-sentence argument beats a sprawling hundred-sentence one. Every time.</p>

<p>Here are his rules, distilled:</p>

<p><img src="https://raw.githubusercontent.com/laxmena/blog-assets/refs/heads/main/images/writing_infographic.png" alt="The Day You Became A Better Writer — infographic"/></p>

<hr/>

<h2 id="my-additions">My additions</h2>

<p>Adams covers the sentence level well. These extend his thinking to structure.</p>

<p><strong>7. Front-load your point.</strong> State the conclusion first, then support it. Don&#39;t make the reader work through the argument before knowing why it matters.</p>

<p><strong>8. One idea per paragraph.</strong> Adams says one thought per sentence. The same logic applies one level up. If a paragraph is doing two jobs, split it.</p>

<hr/>

<h2 id="steal-this-prompt">Steal this prompt</h2>

<p>If you use LLMs to help draft or edit writing, here&#39;s a prompt you can drop into your workflow. It distills everything above into instructions the model will actually follow.</p>

<pre><code>You are a writing assistant that helps produce clear, persuasive, and readable text.

Follow these principles when writing or editing:

- Keep it simple. A short, clear argument is more persuasive than a long, complex one.
- Cut extra words. If a word doesn&#39;t add meaning, remove it.
- Choose potent words. Prefer the specific and vivid over the generic.
- Make the first sentence earn attention. It should create curiosity or make a bold claim.
- Write short sentences. One thought per sentence.
- Use active voice. Put the actor before the action.
- Front-load the point. State the conclusion first, then support it.
- One idea per paragraph. If a paragraph is doing two jobs, split it.

When editing, flag sentences that violate these rules and suggest alternatives.
</code></pre>

<hr/>

<p>Good writing is good thinking made visible. Adams knew this in 2007. It hasn&#39;t changed.</p>

<hr/>

<p><em>All original ideas referenced here belong to Scott Adams. This post exists to preserve and extend his thinking, not to replace it. Read the <a href="https://web.archive.org/web/20240302003157/https://dilbertblog.typepad.com/the_dilbert_blog/2007/06/the_day_you_bec.html">original</a>.</em></p>

<p><a href="https://laxmena.com/tag:writing" class="hashtag"><span>#</span><span class="p-category">writing</span></a></p>


]]></content:encoded>
      <guid>https://laxmena.com/the-day-you-became-a-better-writer</guid>
      <pubDate>Sun, 22 Mar 2026 02:25:45 +0000</pubDate>
    </item>
    <item>
      <title>RiskChain: The Messy Middle: Building a Risk Graph from Scratch</title>
      <link>https://laxmena.com/riskchain-the-messy-middle-building-a-risk-graph-from-scratch?pk_campaign=rss-feed</link>
      <description>&lt;![CDATA[An ongoing weekend project documenting the journey of uncovering hidden connections in corporate financial filings—the stumbles, the learnings, the &#39;aha!&#39; moments, and everything in between. Started January 2025.&#xA;&#xA;---&#xA;&#xA;What is RiskChain?&#xA;&#xA;The core idea is simple but ambitious: find hidden connections and risk trails that aren&#39;t immediately obvious when you&#39;re just reading through a 10-K filing.&#xA;&#xA;!--more--&#xA;&#xA;Instead of treating each financial document as an isolated artifact, I&#39;m building a system to:&#xA;&#xA;Extract risk factors from 10-K filings (2004-2025) across 75 companies&#xA;Embed and connect these risks to find non-obvious relationships&#xA;Build a graph that reveals risk clusters, patterns, and &#34;trails&#34; that could signal systemic weaknesses or early warning signs&#xA;&#xA;Why 10-K filings? Because companies are required to disclose risks in specific sections (Item 1 and Item 1a), and there&#39;s a decade+ of structured data just sitting there.&#xA;&#xA;---&#xA;&#xA;The Vision&#xA;&#xA;Here&#39;s the full pipeline I&#39;m building toward:&#xA;&#xA;[Raw Financial Data]&#xA;  ├── SEC Filings (10-K/Q) ── News Articles ── Earnings Transcripts ── Other Reports&#xA;          │&#xA;          ▼&#xA;[1. Ingestion &amp; Chunking]&#xA;  → Parse documents (PDF/HTML) → Split into sentences → Group into ~500-word chunks&#xA;          │&#xA;          ▼&#xA;[2. Risk Extraction]&#xA;  → Use Gemini Flash per chunk → Extract 3-5 specific risk factors + severity&#xA;          │&#xA;          ▼&#xA;[3. Storage &amp; Embeddings]&#xA;  → SQLite DB (with sqlite-vec) → Embed risk labels (embedding-gemma-300m) → Deduplicate similar risks&#xA;          │&#xA;          ▼&#xA;[4. Graph Construction]&#xA;  → Nodes = unique risks&#xA;  → Edges = &#xA;      ├─ Semantic similarity (embeddings)&#xA;      └─ Statistical co-occurrence (PMI)&#xA;          │&#xA;          ▼&#xA;[5. 
Hierarchical Clustering]&#xA;  → Apply Leiden algorithm (Surprise function) → Build risk hierarchy tree&#xA;  → Compute novelty scores for under-explored areas&#xA;          │&#xA;          ▼&#xA;[6. CLI / Interface Layer]&#xA;  → Persistent server for fast queries&#xA;  → Commands: searchrisks, browsetree, crossreportrisks, etc.&#xA;          │&#xA;          ▼&#xA;[7. Agent Workflow (Claude / similar)]&#xA;  ├── Stage 1: Ideation ── Browse tree → Propose novel risk chains (novelty bias)&#xA;  ├── Stage 2: Research ── Dive into chunks → Extract &amp; order excerpts&#xA;  └── Stage 3: Output ── Generate RiskChain (visual trail with edges + narrative)&#xA;          │&#xA;          ▼&#xA;[8. Presentation &amp; Action]&#xA;  → Web dashboard / exported report&#xA;  → Visual graph + highlighted excerpts + suggested hedges / alerts&#xA;  → Human review → Iterate via feedback&#xA;&#xA;It&#39;s ambitious. It&#39;s probably overambitious. But that&#39;s the goal.&#xA;&#xA;---&#xA;&#xA;Current Status&#xA;&#xA;Phase: 2 - Chunking Strategy ✓&#xA;Progress: Data downloaded → Chunking complete → Ready for Risk Extraction&#xA;&#xA;---&#xA;&#xA;Stay Updated&#xA;&#xA;I&#39;m documenting this journey every weekend—the wins, the blockers, the learnings. If you want regular updates on how RiskChain develops, subscribe below to get new posts delivered to your inbox.&#xA;&#xA;!--more--&#xA;&#xA;---&#xA;&#xA;Progress Log&#xA;&#xA;Weekend 1 | Jan 18, 2025 | Phase 1: Download Script ✓&#xA;&#xA;What I built:&#xA;Downloaded 10-K filings for 75 companies from 2004-2025 using the Python edgartools library. Curated a list of significant companies (including ones that went bankrupt in 2008—why not?). Got the script working and only extracting the relevant sections (Item 1, Item 7, Item 8) to keep things lean.&#xA;&#xA;The messy parts (aka real life):&#xA;I initially tried sec-edgar-downloader to connect to SEC and download. 
Spent way too much time on this approach, got stuck in the data cleaning rabbit hole, and realized I was losing sight of the actual goal. The real issue? Many of the 10-K filings before the SEC standardized their item categorization didn&#39;t play nice with the tool.&#xA;&#xA;  Lesson learned: when you&#39;re iterating, it&#39;s okay to abandon the &#34;perfect&#34; approach for one that ships faster.&#xA;&#xA;Then I switched to edgartools (also known as edgar). This library gave me more flexibility, though the documentation still wasn&#39;t intuitive for my specific use case. But instead of giving up, I dug into the source code. That&#39;s when things clicked. Sometimes the best learning comes from reading other people&#39;s code instead of waiting for docs to explain everything.&#xA;&#xA;The &#39;aha!&#39; moment:&#xA;&#xA;  My wife helped me understand what Item 1, Item 1a, Item 7, and Item 8 actually mean in a 10-K filing. She translated the financial jargon into plain English, and suddenly the document structure made sense. Having someone who can bridge the domain knowledge gap is invaluable. I realized I was building this in a foreign domain—finance is not my native language, and that&#39;s okay.&#xA;&#xA;What blocked me:&#xA;&#xA;Figuring out the right tool for downloading (sec-edgar-downloader vs edgartools vs rolling my own)&#xA;Understanding that parsing 10-K files is genuinely harder than it looks (inconsistent structures across years, weird formatting, embedded tables)&#xA;&#xA;Next up: Phase 2: Chunking strategy. Need to figure out how to split these documents intelligently for downstream LLM tasks.&#xA;&#xA;---&#xA;&#xA;Weekend 2 | Jan 23, 2025 | Phase 2: Chunking Strategy ✓&#xA;&#xA;What I built:&#xA;Implemented chunking using wtpsplitter and stored all chunks as markdown files with YAML frontmatter metadata (ticker, filing date, company name, chunk ID, item section). 
Now sitting on several thousand chunks, each \~1000 characters max, ready for extraction.&#xA;&#xA;The messy parts (aka real life):&#xA;I tried two chunking strategies: RecursiveChunker and wtpsplitter. RecursiveChunker felt like brute force—just splitting on token counts. But wtpsplitter was smarter; it respects sentence boundaries and creates more semantically coherent chunks.&#xA;&#xA;Storing these as markdown files locally feels like a step backward (shouldn&#39;t I be using a database?), but honestly, it&#39;s perfect for iteration. I can inspect the chunks, debug the metadata, and understand what&#39;s happening before I add the complexity of a full DB setup.&#xA;&#xA;The &#39;aha!&#39; moment:&#xA;&#xA;  Chunk quality matters way more than I initially thought. The way you split text directly impacts whether an LLM can extract meaningful risk factors later. Sentence-aware chunking beats token-counting brutality. This made me reconsider the whole &#34;let me jump straight to a database&#34; instinct. Sometimes you need to slow down and get the fundamentals right first.&#xA;&#xA;What blocked me:&#xA;&#xA;Deciding between chunking strategies (trial and error on a few approaches)&#xA;Understanding the tradeoff between local file storage and &#34;proper&#34; database setup (spoiler: local storage is fine for now)&#xA;Realizing I was overthinking this phase when the real value comes next&#xA;&#xA;Next up: Phase 3: Risk Extraction. I&#39;ll iterate through each chunk and use Claude/Gemini to extract 3-5 risk factors per chunk. This is where the actual signal starts emerging.&#xA;&#xA;---&#xA;&#xA;Why This Matters (and Why I&#39;m Excited)&#xA;&#xA;Most financial analysis tools treat risks as isolated items. 
&#34;Company X faces supply chain risk.&#34; &#34;Company Y has regulatory exposure.&#34; But what if you could see that 40 companies in the industrial sector all mention the same emerging regulatory risk, and 3 of them went bankrupt 2 years later?&#xA;&#xA;That&#39;s the thesis here. Hidden connections. Patterns that emerge when you look at scale.&#xA;&#xA;Also, I&#39;m learning a ton: SEC filing structures, chunking strategies, embedding models, graph theory, the Leiden algorithm... This is weekend learning on steroids.&#xA;&#xA;---&#xA;&#xA;Updates added weekly (weekends permitting). Check back for new learnings, blockers, and wins.&#xA;&#xA;---&#xA;&#xA;Resources &amp; References&#xA;&#xA;Inspiration: Syntopic Reading with Claude — The original spark for connecting documents at scale&#xA;Graph Clustering: Leiden Algorithm Documentation — For hierarchical risk clustering&#xA;SEC Data Tool: edgartools (edgar) — Python library for downloading SEC filings&#xA;Alternative Tool: sec-edgar-downloader — The tool I explored first (works well for recent filings; struggled with older 10-Ks before SEC standardization)&#xA;&#xA;#engineering #ai]]&gt;</description>
      <content:encoded><![CDATA[<p><em>An ongoing weekend project documenting the journey of uncovering hidden connections in corporate financial filings—the stumbles, the learnings, the &#39;aha!&#39; moments, and everything in between. Started January 2025.</em></p>

<hr/>

<h3 id="what-is-riskchain">What is RiskChain?</h3>

<p>The core idea is simple but ambitious: <strong>find hidden connections and risk trails that aren&#39;t immediately obvious</strong> when you&#39;re just reading through a 10-K filing.</p>



<p>Instead of treating each financial document as an isolated artifact, I&#39;m building a system to:</p>
<ul><li>Extract risk factors from 10-K filings (2004-2025) across 75 companies</li>
<li>Embed and connect these risks to find non-obvious relationships</li>
<li>Build a graph that reveals risk clusters, patterns, and “trails” that could signal systemic weaknesses or early warning signs</li></ul>

<p>Why 10-K filings? Because companies are <em>required</em> to disclose risks in specific sections (Item 1 and Item 1a), and there&#39;s a decade+ of structured data just sitting there.</p>

<hr/>

<h2 id="the-vision">The Vision</h2>

<p>Here&#39;s the full pipeline I&#39;m building toward:</p>

<pre><code>[Raw Financial Data]
  ├── SEC Filings (10-K/Q) ── News Articles ── Earnings Transcripts ── Other Reports
          │
          ▼
[1. Ingestion &amp; Chunking]
  → Parse documents (PDF/HTML) → Split into sentences → Group into ~500-word chunks
          │
          ▼
[2. Risk Extraction]
  → Use Gemini Flash per chunk → Extract 3-5 specific risk factors + severity
          │
          ▼
[3. Storage &amp; Embeddings]
  → SQLite DB (with sqlite-vec) → Embed risk labels (embedding-gemma-300m) → Deduplicate similar risks
          │
          ▼
[4. Graph Construction]
  → Nodes = unique risks
  → Edges = 
      ├─ Semantic similarity (embeddings)
      └─ Statistical co-occurrence (PMI)
          │
          ▼
[5. Hierarchical Clustering]
  → Apply Leiden algorithm (Surprise function) → Build risk hierarchy tree
  → Compute novelty scores for under-explored areas
          │
          ▼
[6. CLI / Interface Layer]
  → Persistent server for fast queries
  → Commands: search_risks, browse_tree, cross_report_risks, etc.
          │
          ▼
[7. Agent Workflow (Claude / similar)]
  ├── Stage 1: Ideation ── Browse tree → Propose novel risk chains (novelty bias)
  ├── Stage 2: Research ── Dive into chunks → Extract &amp; order excerpts
  └── Stage 3: Output ── Generate RiskChain (visual trail with edges + narrative)
          │
          ▼
[8. Presentation &amp; Action]
  → Web dashboard / exported report
  → Visual graph + highlighted excerpts + suggested hedges / alerts
  → Human review → Iterate via feedback
</code></pre>
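<p>Step 4's co-occurrence edges are standard pointwise mutual information over counts. A minimal sketch, under the assumption that each filing has been reduced to a list of deduplicated risk labels:</p>

```python
import math
from collections import Counter
from itertools import combinations

def pmi_edges(filings, min_count=2):
    """Edge weight = log( p(a,b) / (p(a) * p(b)) ) over risk co-occurrence in filings."""
    n = len(filings)
    single, pair = Counter(), Counter()
    for risks in filings:
        rs = sorted(set(risks))       # dedupe within a filing; sort for stable pair keys
        single.update(rs)
        pair.update(combinations(rs, 2))
    return {
        (a, b): math.log((c / n) / ((single[a] / n) * (single[b] / n)))
        for (a, b), c in pair.items()
        if c >= min_count             # drop edges seen too rarely to trust
    }
```

<p>Positive PMI means two risks show up together more often than their individual frequencies would predict — exactly the "non-obvious relationship" the graph is after.</p>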

<p>It&#39;s ambitious. It&#39;s probably overambitious. But that&#39;s the goal.</p>

<hr/>

<h2 id="current-status">Current Status</h2>

<p><strong>Phase: 2 – Chunking Strategy</strong> ✓
<strong>Progress:</strong> Data downloaded → Chunking complete → Ready for Risk Extraction</p>

<hr/>

<h2 id="stay-updated">Stay Updated</h2>

<p>I&#39;m documenting this journey every weekend—the wins, the blockers, the learnings. If you want regular updates on how RiskChain develops, subscribe below to get new posts delivered to your inbox.</p>



<hr/>

<h2 id="progress-log">Progress Log</h2>

<h3 id="weekend-1-jan-18-2025-phase-1-download-script">Weekend 1 | Jan 18, 2025 | Phase 1: Download Script ✓</h3>

<p><strong>What I built:</strong>
Downloaded 10-K filings for 75 companies from 2004-2025 using the Python <code>edgartools</code> library. Curated a list of significant companies (including ones that went bankrupt in 2008—why not?). Got the script working, extracting only the relevant sections (Item 1, Item 7, Item 8) to keep things lean.</p>

<p><strong>The messy parts (aka real life):</strong>
I initially tried <code>sec-edgar-downloader</code> to connect to SEC and download. Spent way too much time on this approach, got stuck in the data cleaning rabbit hole, and realized I was losing sight of the actual goal. The real issue? Many of the 10-K filings before the SEC standardized their item categorization didn&#39;t play nice with the tool.</p>

<blockquote><p><strong>Lesson learned:</strong> when you&#39;re iterating, it&#39;s okay to abandon the “perfect” approach for one that ships faster.</p></blockquote>

<p>Then I switched to <code>edgartools</code> (also known as <code>edgar</code>). This library gave me more flexibility, though the documentation still wasn&#39;t intuitive for my specific use case. But instead of giving up, I dug into the source code. That&#39;s when things clicked. Sometimes the best learning comes from reading other people&#39;s code instead of waiting for docs to explain everything.</p>

<p><strong>The &#39;aha!&#39; moment:</strong></p>

<blockquote><p>My wife helped me understand what Item 1, Item 1a, Item 7, and Item 8 actually mean in a 10-K filing. She translated the financial jargon into plain English, and suddenly the document structure made sense. <strong>Having someone who can bridge the domain knowledge gap is invaluable.</strong> I realized I was building this in a foreign domain—finance is not my native language, and that&#39;s okay.</p></blockquote>

<p><strong>What blocked me:</strong></p>
<ul><li>Figuring out the right tool for downloading (<code>sec-edgar-downloader</code> vs <code>edgartools</code> vs rolling my own)</li>
<li>Understanding that parsing 10-K files is genuinely harder than it looks (inconsistent structures across years, weird formatting, embedded tables)</li></ul>

<p><strong>Next up:</strong> Phase 2: Chunking strategy. Need to figure out how to split these documents intelligently for downstream LLM tasks.</p>

<hr/>

<h3 id="weekend-2-jan-23-2025-phase-2-chunking-strategy">Weekend 2 | Jan 23, 2025 | Phase 2: Chunking Strategy ✓</h3>

<p><strong>What I built:</strong>
Implemented chunking using <code>wtpsplitter</code> and stored all chunks as markdown files with YAML frontmatter metadata (ticker, filing date, company name, chunk ID, item section). Now sitting on several thousand chunks, each ~1000 characters max, ready for extraction.</p>

<p><strong>The messy parts (aka real life):</strong>
I tried two chunking strategies: <code>RecursiveChunker</code> and <code>wtpsplitter</code>. RecursiveChunker felt like brute force—just splitting on token counts. But <code>wtpsplitter</code> was smarter; it respects sentence boundaries and creates more semantically coherent chunks.</p>

<p>Storing these as markdown files locally feels like a step backward (shouldn&#39;t I be using a database?), but honestly, it&#39;s perfect for iteration. I can inspect the chunks, debug the metadata, and understand what&#39;s happening before I add the complexity of a full DB setup.</p>

<p><strong>The &#39;aha!&#39; moment:</strong></p>

<blockquote><p><strong>Chunk quality matters way more than I initially thought.</strong> The way you split text directly impacts whether an LLM can extract meaningful risk factors later. Sentence-aware chunking beats token-counting brutality. This made me reconsider the whole “let me jump straight to a database” instinct. Sometimes you need to slow down and get the fundamentals right first.</p></blockquote>

<p><strong>What blocked me:</strong></p>
<ul><li>Deciding between chunking strategies (trial and error on a few approaches)</li>
<li>Understanding the tradeoff between local file storage and “proper” database setup (spoiler: local storage is fine for now)</li>
<li>Realizing I was overthinking this phase when the real value comes next</li></ul>

<p><strong>Next up:</strong> Phase 3: Risk Extraction. I&#39;ll iterate through each chunk and use Claude/Gemini to extract 3-5 risk factors per chunk. This is where the actual signal starts emerging.</p>

<hr/>

<h2 id="why-this-matters-and-why-i-m-excited">Why This Matters (and Why I&#39;m Excited)</h2>

<p>Most financial analysis tools treat risks as isolated items. “Company X faces supply chain risk.” “Company Y has regulatory exposure.” But <strong>what if you could see that 40 companies in the industrial sector all mention the same emerging regulatory risk, and 3 of them went bankrupt 2 years later?</strong></p>

<p>That&#39;s the thesis here. Hidden connections. Patterns that emerge when you look at scale.</p>

<p>Also, I&#39;m learning a <em>ton</em>: SEC filing structures, chunking strategies, embedding models, graph theory, the Leiden algorithm... This is weekend learning on steroids.</p>

<hr/>

<p><em>Updates added weekly (weekends permitting). Check back for new learnings, blockers, and wins.</em></p>

<hr/>

<h2 id="resources-references">Resources &amp; References</h2>
<ul><li><strong>Inspiration:</strong> <a href="https://pieterma.es/syntopic-reading-claude/">Syntopic Reading with Claude</a> — The original spark for connecting documents at scale</li>
<li><strong>Graph Clustering:</strong> <a href="https://leidenalg.readthedocs.io/en/stable/">Leiden Algorithm Documentation</a> — For hierarchical risk clustering</li>
<li><strong>SEC Data Tool:</strong> <a href="https://github.com/dgunning/edgartools">edgartools (edgar)</a> — Python library for downloading SEC filings</li>
<li><strong>Alternative Tool:</strong> <a href="https://sec-edgar-downloader.readthedocs.io/en/latest/">sec-edgar-downloader</a> — The tool I explored first (works well for recent filings; struggled with older 10-Ks before SEC standardization)</li></ul>

<p><a href="https://laxmena.com/tag:engineering" class="hashtag"><span>#</span><span class="p-category">engineering</span></a> <a href="https://laxmena.com/tag:ai" class="hashtag"><span>#</span><span class="p-category">ai</span></a></p>
]]></content:encoded>
      <guid>https://laxmena.com/riskchain-the-messy-middle-building-a-risk-graph-from-scratch</guid>
      <pubDate>Sat, 24 Jan 2026 04:11:23 +0000</pubDate>
    </item>
    <item>
      <title>Finding Our Place in the Age of AI</title>
      <link>https://laxmena.com/finding-our-place-in-the-age-of-ai?pk_campaign=rss-feed</link>
      <description>&lt;![CDATA[It&#39;s hard to ignore the news about AI taking over. Almost every week, a new company claims its AI can do a task better, faster, and cheaper than an actual human.&#xA;&#xA;Think about it: creating a logo, editing a picture, writing content, researching a topic, or even writing code. All of these used to take hours or even days, and now they can be done in minutes. Going from an idea to a finished product has never been faster. In some cases, AI tools are even outperforming humans. It&#39;s easy to see why so many jobs that exist today might not exist in just a few years.&#xA;&#xA;!--more--&#xA;&#xA;I experienced this firsthand the other day when I used Meta AI to generate pictures of myself in different places and with different expressions. The results were surprisingly good. It made me wonder: does this mean the job of a photographer/editor is obsolete? What would we even do if AI did all the jobs?&#xA;&#xA;After pondering this question for a while, I had a realization: no matter how advanced AI becomes, there are certain things it simply can&#39;t do.&#xA;&#xA;So, does AI&#39;s ability to generate realistic pictures mean the end of a photographer&#39;s job?&#xA;&#xA;The answer is both yes and no. The &#34;yes&#34; part is easy to understand. AI can handle the technical stuff. It can generate perfect, technically flawless images.&#xA;&#xA;But the &#34;no&#34; is what really matters. A photograph isn&#39;t just an image; it&#39;s a feeling. It&#39;s about capturing the emotion and the beauty of a specific moment. An AI can&#39;t feel what you feel or know what you want to remember forever. It can&#39;t capture the shared laughter between friends at a party or the proud look in a parent&#39;s eyes at a graduation. That&#39;s what a great photographer does. They don&#39;t just take pictures; they capture stories and emotions.&#xA;&#xA;That kind of unique, human touch exists in every field. 
Our job isn&#39;t to compete with AI on speed or efficiency. It&#39;s to find the places where we add value that AI can&#39;t. It&#39;s about doubling down on the things that make us human: creativity, empathy, and the ability to connect with others.&#xA;&#xA;Here are some human touch aspects in other professions:&#xA;&#xA;For a Software Developer: An AI can write a lot of code in little time, but a human understands the unspoken frustration of a user and works with a team to solve a complex, messy problem.&#xA;&#xA;For an Accountant: AI can instantly process numbers, but a human accountant builds trust with small business owners by listening to their fears and helping them plan for the future.&#xA;&#xA;For a Lawyer: An AI can scan piles of legal documents, but only a human lawyer can use passion and empathy to persuade a jury or guide a family through a difficult time.&#xA;&#xA;AI is here, and it&#39;s changing things. But it&#39;s also a chance for us to rediscover what truly makes us valuable. Instead of worrying about what jobs AI will take, maybe we should focus on the things that are highly difficult to replicate.&#xA;&#xA;Of course, this isn&#39;t a simple solution for everyone. The challenges ahead are real, and the road will be complex. But maybe the first step isn&#39;t to worry about what AI will take from us, but to focus on what makes us truly irreplaceable.&#xA;&#xA;#Opinion #AI]]&gt;</description>
      <content:encoded><![CDATA[<p>It&#39;s hard to ignore the news about AI taking over. Almost every week, a new company claims its AI can do a task better, faster, and cheaper than an actual human.</p>

<p>Think about it: creating a logo, editing a picture, writing content, researching a topic, or even writing code. All of these used to take hours or even days, and now they can be done in minutes. Going from an idea to a finished product has never been faster. In some cases, AI tools are even outperforming humans. It&#39;s easy to see why so many jobs that exist today might not exist in just a few years.</p>



<p>I experienced this firsthand the other day when I used Meta AI to generate pictures of myself in different places and with different expressions. The results were surprisingly good. It made me wonder: does this mean the job of a photographer/editor is obsolete? What would we even do if AI did all the jobs?</p>

<p>After pondering this question for a while, I had a realization: no matter how advanced AI becomes, there are certain things it simply can&#39;t do.</p>

<p>So, does AI&#39;s ability to generate realistic pictures mean the end of a photographer&#39;s job?</p>

<p>The answer is both yes and no. The “yes” part is easy to understand. AI can handle the technical side: it can generate technically flawless images.</p>

<p>But the “no” is what really matters. A photograph isn&#39;t just an image; it&#39;s a feeling. It&#39;s about capturing the emotion and the beauty of a specific moment. An AI can&#39;t feel what you feel or know what you want to remember forever. It can&#39;t capture the shared laughter between friends at a party or the proud look in a parent&#39;s eyes at a graduation. That&#39;s what a great photographer does. They don&#39;t just take pictures; they capture stories and emotions.</p>

<p>That kind of unique, human touch exists in every field. Our job isn&#39;t to compete with AI on speed or efficiency. It&#39;s to find the places where we add value that AI can&#39;t. It&#39;s about doubling down on the things that make us human: creativity, empathy, and the ability to connect with others.</p>

<p>Here&#39;s how that human touch shows up in other professions:</p>

<p><strong>For a Software Developer:</strong> An AI can write a lot of code in little time, but a human understands the unspoken frustration of a user and works with a team to solve a complex, messy problem.</p>

<p><strong>For an Accountant:</strong> AI can instantly process numbers, but a human accountant builds trust with small business owners by listening to their fears and helping them plan for the future.</p>

<p><strong>For a Lawyer:</strong> An AI can scan piles of legal documents, but only a human lawyer can use passion and empathy to persuade a jury or guide a family through a difficult time.</p>

<p>AI is here, and it&#39;s changing things. But it&#39;s also a chance for us to rediscover what truly makes us valuable. Instead of worrying about which jobs AI will take, maybe we should focus on the things that are genuinely difficult to replicate.</p>

<p>Of course, this isn&#39;t a simple solution for everyone. The challenges ahead are real, and the road will be complex. But maybe the first step isn&#39;t to worry about what AI will take from us, but to focus on what makes us truly irreplaceable.</p>

<p><a href="https://laxmena.com/tag:Opinion" class="hashtag"><span>#</span><span class="p-category">Opinion</span></a> <a href="https://laxmena.com/tag:AI" class="hashtag"><span>#</span><span class="p-category">AI</span></a></p>
]]></content:encoded>
      <guid>https://laxmena.com/finding-our-place-in-the-age-of-ai</guid>
      <pubDate>Thu, 21 Aug 2025 01:19:41 +0000</pubDate>
    </item>
    <item>
      <title>While GenAI has the potential to make our minds dull, we can also leverage it to force us to think better</title>
      <link>https://laxmena.com/while-genai-have-the-potential-to-make-our-minds-dull-we-can-also-leverage?pk_campaign=rss-feed</link>
      <description>&lt;![CDATA[Just like muscles - which shrink in size when not used enough, our minds also become weak. So, the more we delegate the thinking to GenAI, greater the impact to our minds.&#xA;&#xA;In this article, I share an interesting proposal to address this challenge.&#xA;&#xA;How about we use the poison itself to create the cure.&#xA;How about we leverage GenAI itself to help us get better in critical thinking.&#xA;&#xA;!--more--&#xA;&#xA;There are two powerful learning techniques, that forces us to learn by thinking.&#xA;&#xA;The Socratic Method involves a shared dialogue between teacher and students. The teacher leads by posing thought-provoking questions. Students actively engage by asking questions of their own. The discussion goes back and forth.&#xA;&#xA;The Feynman Technique focuses on understanding a concept deeply by explaining it simply, like you&#39;re teaching it to a child. This process helps identify knowledge gaps and encourages simplification, leading to a clearer understanding of the topic&#xA;&#xA;How about we combine both these techniques in a GenAI Persona, which will then act as a teacher, helping us master new concepts in an engaging and thought provoking way.&#xA;&#xA;Here&#39;s how we would want the AI to behave:&#xA;&#xA;The AI will use a dialogue based approach, asking questions and discussing about the topic in hand.&#xA;The AI will never give answers directly, but will nudge gradually to help us think, discover the ideas and gain deeper understanding.&#xA;AI identifies any gaps in understanding, and dives deeper to bridge the gaps.&#xA;AI progressively increases the difficulty of the question, forcing to think more on the topic, thus solidifying the understanding.&#xA;The session goes on, till the AI has reached a state where it has enough data points that we have gained sufficient understanding.&#xA;&#xA;By end of this exercise, we would have achieved at a very good level of mastery on the subject. 
We could repeat this exercise for each interested topics.&#xA;&#xA;Translating all the above into a AI Prompt would look like this:&#xA;&#xA;You are an expert AI learning assistant and Socratic tutor, specifically designed to facilitate deep understanding using a method inspired by the Feynman Technique. Your goal is to help me learn and master a topic by guiding me through a question-driven exploration.&#xA;&#xA;Here&#39;s how we will proceed:&#xA;&#xA;Topic Introduction: I will provide a topic I want to learn.&#xA;Initial Explanation Prompt: You will ask me to explain the core concept of the topic in simple terms, as if I were explaining it to a beginner (a key part of the Feynman Technique).&#xA;Identify Gaps &amp; Misconceptions: Based on my explanation, you will gently probe my understanding by asking targeted questions. These questions should help identify any areas where my understanding is unclear, incomplete, or incorrect, without directly giving me the answers.&#xA;Progressive Challenge: Once a foundational understanding seems to be in place (or after addressing initial gaps), you will ask progressively more challenging questions. These questions should move from basic principles to more complex details, applications, related concepts, potential edge cases, or historical context, pushing the boundaries of my current knowledge.&#xA;Encourage Simplification: At various points, you may ask me to re-explain concepts in simpler terms or use analogies to check for deep understanding.&#xA;Iterative Learning: Our interaction will be iterative. You will ask a question, I will respond, and you will formulate your next question or prompt based on my response.&#xA;Maintain Socratic Approach: Avoid giving direct answers. Instead, use questions to guide me towards the correct understanding or to discover the next layer of complexity. 
Provide hints or rephrase questions if I struggle.&#xA;Goal: Our shared goal is for me to achieve a comprehensive and deeply internalized understanding of the topic, being able to explain it clearly and accurately to others.&#xA;&#xA;Let&#39;s begin. What topic would you like to explore today?&#xA;&#xA;Copy the prompt above, and paste them into your preferred LLM application - Gemini (My goto), ChatGPT, Claude or Perplexity.&#xA;&#xA;The Chatbot would ask you to provide the topic that you would like to start learning. Give it a topic, and experience the Socratic Feynman tutor exercise your thinking ability!&#xA;&#xA;I have been using this technique for over a month, and found this super amazing! Try it out and share your experiences.&#xA;&#xA;Do you have your own way of learning things? I would love to hear more!&#xA;&#xA;ai&#xA;&#xA;---&#xA;&#xA;!--more--&#xA;&#xA;!--more--&#xA;&#xA;---&#xA;&#xA;Do share your thoughts and comments.&#xA;&#xA;!--more--&#xA;]]&gt;</description>
<content:encoded><![CDATA[<p>Just like muscles, which shrink when not used enough, our minds grow weak without exercise. The more thinking we delegate to GenAI, the greater the impact on our minds.</p>

<p>In this article, I share an interesting proposal to address this challenge.</p>

<p>How about we use the poison itself to create the cure?
How about we leverage GenAI itself to help us get better at critical thinking?</p>



<p>There are two powerful learning techniques that force us to learn by thinking.</p>
<ol><li><p><strong>The Socratic Method</strong> involves a shared dialogue between teacher and students. The teacher leads by posing thought-provoking questions. Students actively engage by asking questions of their own. The discussion goes back and forth.</p></li>

<li><p><strong>The Feynman Technique</strong> focuses on understanding a concept deeply by explaining it simply, as if you&#39;re teaching it to a child. This process helps identify knowledge gaps and encourages simplification, leading to a clearer understanding of the topic.</p></li></ol>

<p>How about we combine these two techniques into a GenAI persona that acts as a teacher, helping us master new concepts in an engaging, thought-provoking way?</p>

<p>Here&#39;s how we would want the AI to behave:</p>
<ul><li>The AI uses a dialogue-based approach, asking questions and discussing the topic at hand.</li>
<li>The AI never gives answers directly, but nudges us gradually to think, discover the ideas, and gain deeper understanding.</li>
<li>The AI identifies any gaps in our understanding and dives deeper to bridge them.</li>
<li>The AI progressively increases the difficulty of its questions, forcing us to think harder about the topic and solidifying our understanding.</li>
<li>The session continues until the AI has enough evidence that we have gained sufficient understanding.</li></ul>

<p>By the end of this exercise, we will have achieved a solid level of mastery of the subject. We can repeat the exercise for every topic we&#39;re interested in.</p>

<p>Translating all of the above into an AI prompt looks like this:</p>

<pre><code>You are an expert AI learning assistant and Socratic tutor, specifically designed to facilitate deep understanding using a method inspired by the Feynman Technique. Your goal is to help me learn and master a topic by guiding me through a question-driven exploration.

Here&#39;s how we will proceed:

1.  **Topic Introduction:** I will provide a topic I want to learn.
2.  **Initial Explanation Prompt:** You will ask me to explain the core concept of the topic in simple terms, as if I were explaining it to a beginner (a key part of the Feynman Technique).
3.  **Identify Gaps &amp; Misconceptions:** Based on my explanation, you will gently probe my understanding by asking targeted questions. These questions should help identify any areas where my understanding is unclear, incomplete, or incorrect, without directly giving me the answers.
4.  **Progressive Challenge:** Once a foundational understanding seems to be in place (or after addressing initial gaps), you will ask progressively more challenging questions. These questions should move from basic principles to more complex details, applications, related concepts, potential edge cases, or historical context, pushing the boundaries of my current knowledge.
5.  **Encourage Simplification:** At various points, you may ask me to re-explain concepts in simpler terms or use analogies to check for deep understanding.
6.  **Iterative Learning:** Our interaction will be iterative. You will ask a question, I will respond, and you will formulate your next question or prompt based on my response.
7.  **Maintain Socratic Approach:** Avoid giving direct answers. Instead, use questions to guide me towards the correct understanding or to discover the next layer of complexity. Provide hints or rephrase questions if I struggle.
8.  **Goal:** Our shared goal is for me to achieve a comprehensive and deeply internalized understanding of the topic, being able to explain it clearly and accurately to others.

Let&#39;s begin. What topic would you like to explore today?
</code></pre>

<p>Copy the prompt above and paste it into your preferred LLM application – <a href="https://gemini.google.com/">Gemini</a> (my go-to), <a href="https://chatgpt.com/">ChatGPT</a>, <a href="https://claude.ai/">Claude</a>, or <a href="https://www.perplexity.ai/">Perplexity</a>.</p>

<p>The chatbot will ask you for the topic you&#39;d like to start learning. Give it one, and let the Socratic-Feynman tutor exercise your thinking!</p>
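<p>If you prefer working through an API instead of a chat app, the same persona can be set as the system message. Here is a minimal sketch, assuming an OpenAI-style <code>role</code>/<code>content</code> chat-message format; <code>TUTOR_PROMPT</code> is abbreviated, and the provider call is left as a comment since SDKs vary:</p>

```python
# Sketch: packaging the Socratic-Feynman tutor prompt for a chat-style
# LLM API. TUTOR_PROMPT is abbreviated here; paste in the full prompt
# from above. The message structure follows the common OpenAI-style
# "role"/"content" chat format (an assumption, not a specific SDK).

TUTOR_PROMPT = (
    "You are an expert AI learning assistant and Socratic tutor, "
    "designed to facilitate deep understanding using a method inspired "
    "by the Feynman Technique. ... "
    "Let's begin. What topic would you like to explore today?"
)

def start_session(topic: str) -> list[dict]:
    """Build the opening message list for a tutoring session."""
    return [
        {"role": "system", "content": TUTOR_PROMPT},
        {"role": "user", "content": f"I'd like to learn about: {topic}"},
    ]

messages = start_session("dynamic programming")
# A provider client would then be called with these messages, e.g.:
# response = client.chat.completions.create(model=..., messages=messages)
print(messages[0]["role"])
```

<p>Because the persona lives entirely in the system message, the same session-building code works with any provider that accepts this message shape.</p>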

<p>I&#39;ve been using this technique for over a month and have found it fantastic! Try it out and share your experiences.</p>

<p>Do you have your own way of learning things? I would love to hear more!</p>

<p><a href="https://laxmena.com/tag:ai" class="hashtag"><span>#</span><span class="p-category">ai</span></a></p>

<hr/>

<p>Do share your thoughts and comments.</p>


]]></content:encoded>
      <guid>https://laxmena.com/while-genai-have-the-potential-to-make-our-minds-dull-we-can-also-leverage</guid>
      <pubDate>Sun, 25 May 2025 04:32:18 +0000</pubDate>
    </item>
    <item>
      <title>Over-Reliance on GenAI impacts Critical thinking; Here&#39;s how I think we can fix it</title>
      <link>https://laxmena.com/over-reliance-on-genai-impacts-critical-thinking-heres-how-i-think-we-need-to?pk_campaign=rss-feed</link>
      <description>&lt;![CDATA[Note: The article addresses to Software Engineers, but the ideas apply to knowledge workers in every domain.&#xA;&#xA;I&#39;ve spent considerable amount of time using GenAI past year at work. Also, having spent time with many power users, I begin to see an interesting trend.&#xA;&#xA;Engineers are increasingly using GenAI to accomplish wide range of tasks, from advanced software engineering problems, to drafting a simple slack message. &#xA;&#xA;!--more--&#xA;&#xA;The AI having digested the whole of internet, generates very reasonable responses(most often) - which increases the user&#39;s confidence, and makes them rely more on the tool. Thus helping software engineers accomplish more in less time. &#xA;&#xA;But here&#39;s something that we fail to acknowledge: In the pursuit of increased productivity, we delegate some or most of the critical thinking to the GenAI.&#xA;&#xA;In April 2025, Microsoft Research Labs published a paper - The Impact of Generative AI on Critical Thinking, where they studied the impacts of using GenAI by knowledge workers. &#xA;&#xA;Here&#39;s a quote from the research paper:&#xA;&#xA;  while GenAI can improve worker efficiency, it can inhibit critical engagement with work and can potentially lead to long-term overreliance on the tool and diminished skill for independent problem-solving.&#xA;&#xA;I understand the concerns raised by the researchers in the study, and find them fairly reasonable. &#xA;&#xA;The engineers are increasingly reviewing code written by Generative AI, than writing their own. This is a significant mindset shift from &#34;Problem Solving and Execution&#34; to &#34;Task stewardship and verification&#34;.&#xA;&#xA;I strongly believe this track could seriously impede critical thinking amongst engineers in long term. 
We could become Armchair critics (Offering judgements and opinions without having enough involvement or experience), thus questioning the quality and usefulness of the code review.&#xA;&#xA;After thoughtfully contemplating on this problem for a while, I came up with an approach to address this as a consumer. And I recommend every engineer to consider.&#xA;&#xA;Moving forward, AI is inevitable in workplaces, and I don&#39;t see future of workplace without them. There will be a push for productivity from the management, and AI has shown potential to double, triple or 10x the speed of an average engineer. &#xA;&#xA;The challenge is to be more productive and not lose skill. Here&#39;s what I think an engineer using AI should do: &#xA;&#xA;Go on an AI Detox for a week every 2 months.&#xA;&#xA;Rules for the AI Detox week:&#xA;&#xA;No AI tools should be used&#xA;Any tools that existed during the Pre-ChatGPT era can be used&#xA;Engineer benchmarks their performance on various tasks/metrics without using AI&#xA;  Time it takes to complete a task&#xA;  Critical thinking and Problem solving abilities&#xA;  Skills check (Writing, Coding, Design, Reading, Communication, etc)&#xA;  Domain understanding (Strengths and Weakness)&#xA;&#xA;At the end of the Detox week, perform a careful analysis on the benchmarks and observations during the week. Compare it against how they were performing with AI.&#xA;&#xA;Use this information to carefully tackle and resolve any weaknesses found and bridge the gaps. AI tools can themselves be used to help achieve this.&#xA;&#xA;(I&#39;m working on another article about: how I use Gen AI/LLM tools to learn and think critically).&#xA;&#xA;This approach strikes a right balance between both the worlds, achieving the productivity while UpSkilling and improving critical thinking at the same time.&#xA;&#xA;---&#xA;&#xA;centerEnter your email to subscribe to updates:/center&#xA;&#xA;!--emailsub--&#xA;&#xA;---&#xA;&#xA;Do share your thoughts and comments. 
&#xA;&#xA; a href=&#34;https://remark.as/p/laxmena.com/over-reliance-on-genai-impacts-critical-thinking-heres-how-i-think-we-need-to&#34;Discuss.../a]]&gt;</description>
<content:encoded><![CDATA[<p><em>Note: This article addresses software engineers, but the ideas apply to knowledge workers in every domain.</em></p>

<p>I&#39;ve spent a considerable amount of time using GenAI at work over the past year. Having also spent time with many power users, I&#39;ve begun to see an interesting trend.</p>

<p>Engineers are increasingly using GenAI to accomplish a wide range of tasks, from advanced software engineering problems to drafting a simple Slack message.</p>



<p>Having digested much of the internet, the AI generates very reasonable responses (most of the time), which increases the user&#39;s confidence and makes them rely on the tool even more, helping software engineers accomplish more in less time.</p>

<p>But here&#39;s something we fail to acknowledge: in the pursuit of increased productivity, we delegate some or most of our critical thinking to GenAI.</p>

<p>In April 2025, Microsoft Research published a paper – <a href="https://www.microsoft.com/en-us/research/publication/the-impact-of-generative-ai-on-critical-thinking-self-reported-reductions-in-cognitive-effort-and-confidence-effects-from-a-survey-of-knowledge-workers/">The Impact of Generative AI on Critical Thinking</a> – studying the impact of GenAI use on knowledge workers.</p>

<p>Here&#39;s a quote from the research paper:</p>

<blockquote><p>while GenAI can improve worker efficiency, it can inhibit critical engagement with work and can potentially lead to long-term overreliance on the tool and diminished skill for independent problem-solving.</p></blockquote>

<p>I understand the concerns raised by the researchers in the study, and find them fairly reasonable.</p>

<p>Engineers are increasingly reviewing code written by generative AI rather than writing their own. This is a significant mindset shift from “problem solving and execution” to “task stewardship and verification”.</p>

<p>I strongly believe this trajectory could seriously impede engineers&#39; critical thinking in the long term. We could become armchair critics (offering judgements and opinions without enough involvement or experience), which calls the quality and usefulness of our code reviews into question.</p>

<p>After contemplating this problem for a while, I came up with an approach to address it as a consumer, and I recommend every engineer consider it.</p>

<p>Moving forward, AI is inevitable in the workplace; I don&#39;t see a future of work without it. Management will push for productivity, and AI has shown the potential to double, triple, or 10x the speed of an average engineer.</p>

<p>The challenge is to be more productive without losing skill. Here&#39;s what I think an engineer using AI should do:</p>

<p><strong>Go on an AI Detox for a week every 2 months</strong>.</p>

<p>Rules for the AI Detox week:</p>
<ol><li>No AI tools may be used</li>
<li>Any tool that existed in the pre-ChatGPT era may be used</li>
<li>Benchmark your performance on various tasks and metrics without AI:
<ul><li>Time it takes to complete a task</li>
<li>Critical thinking and problem-solving abilities</li>
<li>Skills check (writing, coding, design, reading, communication, etc.)</li>
<li>Domain understanding (strengths and weaknesses)</li></ul></li></ol>
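<p>To make the first benchmark concrete, here is a minimal sketch of a task timer one could use during the detox week. The class name, CSV layout, and example task are my own illustration, not part of any prescribed method:</p>

```python
# Sketch: a tiny stopwatch for benchmarking tasks during the AI Detox
# week. Records the task name and elapsed seconds to a CSV file so the
# numbers can later be compared against AI-assisted weeks.
import csv
import time

class TaskTimer:
    def __init__(self, log_path="detox_benchmarks.csv"):
        self.log_path = log_path

    def run(self, task_name, work_fn):
        """Time work_fn, append (task, seconds) to the CSV, return its result."""
        start = time.perf_counter()
        result = work_fn()
        elapsed = time.perf_counter() - start
        with open(self.log_path, "a", newline="") as f:
            csv.writer(f).writerow([task_name, f"{elapsed:.2f}"])
        return result

# Example: benchmark a small coding task done without AI help.
timer = TaskTimer()
total = timer.run("sum-squares", lambda: sum(i * i for i in range(1000)))
print(total)  # 332833500
```

<p>The same log, kept across both detox and normal weeks, gives you a simple before/after comparison for the analysis step below.</p>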

<p>At the end of the detox week, carefully analyze the benchmarks and observations from the week, and compare them against how you were performing with AI.</p>

<p>Use this information to tackle any weaknesses you found and bridge the gaps. AI tools can themselves be used to help achieve this.</p>

<p>(I&#39;m working on another article about how I use GenAI/LLM tools to learn and think critically.)</p>

<p>This approach strikes the right balance between both worlds: staying productive while upskilling and improving critical thinking at the same time.</p>

<hr/>

<p>Do share your thoughts and comments.</p>

<p> <a href="https://remark.as/p/laxmena.com/over-reliance-on-genai-impacts-critical-thinking-heres-how-i-think-we-need-to">Discuss...</a></p>
]]></content:encoded>
      <guid>https://laxmena.com/over-reliance-on-genai-impacts-critical-thinking-heres-how-i-think-we-need-to</guid>
      <pubDate>Sat, 24 May 2025 18:41:56 +0000</pubDate>
    </item>
  </channel>
</rss>