Issue 14 Reference — Am I Building?

Reference

The essay has one strong outside anchor, modest context evidence, and two claims held as testimony.

The short version

The strongest anchor is a METR study showing that how productive AI-assisted work feels can run opposite to how productive it was, plus a 2026 follow-up that keeps that result from being treated as timeless.

The app-market and software-delivery sources provide deliberately modest context: broad adoption is rare, and lower effort does not erase quality, judgment, complexity, or delivery trade-offs.

The author's personal brake and the role of nearby people have no public evidence. They are labeled as testimony and kept inside those limits.

What the references did

Changed

After publication, the central question's wording changed under reader pushback: "real" became "meaningful."
Before publication, the later-evidence sentence was revised so the follow-up research reads as weaker and harder to interpret.

Narrowed

METR supports the feeling-vs-fact split under a specific setup; it does not measure the reader's odds or all AI building.
"Broad adoption is rare" stays qualitative and AI-agnostic; the essay body imports no install or revenue numbers.
"Cheap attempt, not the outcome" stays qualitative; supporting delivery evidence does not become a universal claim about AI quality.

Not imported

No dopamine, ambiguity-aversion, or neuroscience mechanism was imported into the ending.
No public claim is made from private relational witness material.
No claim is made that process artifacts themselves prove the essay true.

Source ledger

Open any entry to see the outside constraint and what it could have changed.

Felt productivity can diverge from measured productivity; the feeling is not evidence.

The outside constraintMETR randomized experienced open-source developers on real tasks in repositories they knew. They expected a speedup and afterward believed they had one; measured completion time moved the other way.

Public sourceMETR study; arXiv paper.

How it could have changed the claimIf felt and measured speed had lined up, the essay would lose its strongest public anchor for the feeling/fact split.

What actually happenedConfirmed, with limits.

What it still cannot tell youExperienced developers, mature repositories, early-2025 tools; not a measurement of your odds. The size of the slowdown is not carried into the essay's prose.

The METR result should not be treated as timeless; later tools may behave differently.

The outside constraintMETR's 2026 update says the early-2025 result is out of date, that later data carry selection and measurement limits, and that raw estimates hint at a possible late-2025 speedup but are hard to interpret.

Public sourceMETR, "Uplift update" (2026-02-24).

How it could have changed the claimIf later data cleanly repeated the slowdown, the caveat could shrink; if they cleanly reversed it, the study paragraph would need a bigger revision.

What actually happenedNarrowed the claim.

What it still cannot tell youThe later evidence is weaker and harder to interpret, not a clean settlement. The essay's published "what would change my mind" falsifiers stand with no expiration date.

Broad outside adoption is rare and slow to verify.

The outside constraintApp-market data show downloads and revenue heavily concentrated, with many apps never reaching meaningful install thresholds; this supports only an AI-agnostic rarity floor.

Public sourceSensor Tower; AppBrain.

How it could have changed the claimIf broad outside adoption were common, the "fewer still, by a lot" and slow-verdict framing would be too pessimistic.

What actually happenedConfirmed, with limits / narrowed.

What it still cannot tell youApp-store data are a proxy, not a direct measure of all useful tools, private projects, or AI-built work.

Lowering the cost of trying does not lower the full cost of succeeding.

The outside constraintCursor-adoption evidence and the DORA 2024 report support the idea that AI can change effort and self-report without erasing quality, judgment, complexity, and delivery trade-offs.

Public sourceCursor paper; DORA 2024.

How it could have changed the claimIf cheap attempts reliably became durable quality, the "cheaper attempt, not the outcome" line would need narrowing.

What actually happenedStayed constant.

What it still cannot tell youCorroborating only; the essay makes no numerical or universal software-quality claim.

"Lifetime value uncertainty" works as a brake for this author. (No public evidence.)

The outside constraintNone available. The support is first-person testimony plus the author's own behavior and feedback over time.

Public sourceNone. This is testimony, not referenced evidence.

How it could have changed the claimA direct contradiction, negative feedback over time, or evidence that uncertainty tightens the grip rather than loosening it would weaken or kill the practical ending.

What actually happenedUnresolved, kept inside its limits.

What it still cannot tell youNo dopamine, outcome-uncertainty, or neuroscience mechanism is imported. Whether the brake works is observed over time, not on a schedule.

The people closest to you can report your absence better than you can certify your presence.

The outside constraintThe logic points outward, but the witness experience that generated it is private and family-framed.

Public sourceNone. Public wording only; no private detail.

How it could have changed the claimIf the people nearby were unreliable witnesses to absence, or if easier stopping did not improve real presence over time, the relational turn would need downgrading.

What actually happenedBounded; not publicly anchored.

What it still cannot tell youIt stays a practical tell, not a public generalization. Any public use of witness material would need separate, privacy-safe approval.

The central question is about meaning, not bare existence. (Changed after publication.)

The outside constraintAn unsolicited reader objection after publication, paraphrased with no identity: a thing can be real, it exists, without being meaningful, so "is it real?" was the wrong axis.

Public sourceThe live essay carries the revised wording; the public change log records the revision.

How it could have changed the claimIf the distinction held, and the author judged it did, the published wording could not stand.

What actually happenedChanged: the wording was revised after publication, and a leftover passage was cut.

What it still cannot tell youIt rests on one reader's objection over a private channel; it sharpened the question but is not evidence the essay's claims are correct.

Remaining risks

About this Reference record

A source counts here only if it could have changed, narrowed, or killed a claim. This is not a bibliography or a reading list.

Remaining risks, plainly: the essay still reasons from software evidence to AI building in general; the external-value evidence is platform-biased and not AI-specific; the brake is a working personal claim, not a tested intervention; the relational tell is important but not publicly inspectable as evidence; and the post-publication change rests on a single reader's objection, which improved precision, not proof.