<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0"><channel><title><![CDATA[GroundCTRL Blog]]></title><description><![CDATA[This publication covers the daily thoughts of the maker of the GroundCTRL macOS app.]]></description><link>https://groundctrl.dev</link><image><url>https://cdn.hashnode.com/res/hashnode/image/upload/v1767265133291/129b339a-274a-4385-9f6c-2f21fcf9deb7.png</url><title>GroundCTRL Blog</title><link>https://groundctrl.dev</link></image><generator>RSS for Node</generator><lastBuildDate>Sat, 18 Apr 2026 21:22:59 GMT</lastBuildDate><atom:link href="https://groundctrl.dev/rss.xml" rel="self" type="application/rss+xml"/><language><![CDATA[en]]></language><ttl>60</ttl><item><title><![CDATA[Boost Your Software Team Productivity with AI-Driven PR Reviews: A Step-by-Step Guide]]></title><description><![CDATA[Section 1: The Skepticism Paradox
Here's a paradox worth examining: GitHub's 2025 Octoverse reports that 72.6% of developers using Copilot code review found it improved their effectiveness.[^1] Yet Stack Overflow's 2025 Developer Survey reveals that ...]]></description><link>https://groundctrl.dev/boost-your-software-team-productivity-with-ai-driven-pr-reviews-a-step-by-step-guide</link><guid isPermaLink="true">https://groundctrl.dev/boost-your-software-team-productivity-with-ai-driven-pr-reviews-a-step-by-step-guide</guid><category><![CDATA[Peer review]]></category><category><![CDATA[Teamwork Makes the Dream Work]]></category><category><![CDATA[software development]]></category><category><![CDATA[Teamwork and Collaboration]]></category><category><![CDATA[  AI-Driven  ]]></category><category><![CDATA[step-by-step guide]]></category><category><![CDATA[copilot]]></category><category><![CDATA[Copilot Features]]></category><dc:creator><![CDATA[Deyan Aleksandrov]]></dc:creator><pubDate>Thu, 01 Jan 2026 22:00:00 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1767287528829/0a976a4d-5446-4de6-b911-8a15b3bcb61c.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<h2 id="heading-section-1-the-skepticism-paradox">Section 1: The Skepticism Paradox</h2>
<p>Here's a paradox worth examining: <a target="_blank" href="http://github.blog">GitHub's 2025 Octoverse reports that 72.6% of developers using Copilot code review found it improved their effectiveness</a>.[^1] Yet Stack Overflow's 2025 Developer Survey reveals that only <a target="_blank" href="https://survey.stackoverflow.co/2025/ai">33% of developers trust AI output accuracy—down from 43% the year before—with 46% now actively distrusting it</a>.[^2]</p>
<p>Developers are using tools they trust <em>less</em> than they did a year ago.</p>
<p>This isn't cognitive dissonance—it's pragmatism. The value proposition has shifted. The conversation around AI in software development has largely focused on code generation: can AI write production-ready code?</p>
<p><strong>That framing misses where AI can deliver immediate, measurable value with far less trust required.</strong></p>
<h3 id="heading-verification-vs-judgment">Verification vs. Judgment</h3>
<p>When I think about code review, I split it into two layers:</p>
<ul>
<li><p><strong>Judgment</strong>: architecture trade-offs, product intent, domain correctness, and long-term maintainability.</p>
</li>
<li><p><strong>Verification</strong>: consistency and completeness against <em>documented</em> standards—patterns, checklists, naming rules, analytics schemas, and “did we remember the boring but important stuff?”</p>
</li>
</ul>
<p>I don’t want AI making judgment calls for me. But I <em>do</em> want it relentlessly running the verification layer—because that’s the part humans agree matters, and still miss under deadline pressure.</p>
<p>PR review isn’t one thing. And skepticism about AI makes perfect sense when we ask it to architect systems or write business logic. But checking whether a PR follows established patterns? Whether analytics events include required parameters? Whether error handling matches conventions?</p>
<p><strong>That’s verification, not creation</strong>. <strong>And verification is where the bottleneck lives.</strong></p>
<p>Quality and productivity aren't separate concerns—they're linked through rework. Every analytics bug discovered three months post-release requires investigation, prioritization, a fix, another review cycle, and deployment. Fifteen seconds of AI verifying event parameters can prevent hours of future work.</p>
<p><strong><mark>My bet:</mark></strong></p>
<p><mark>PR review verification is one of the fastest places for skeptical teams to </mark> <em><mark>feel</mark></em> <mark> AI's value—because the output is auditable, and the risk is low.</mark></p>
<h3 id="heading-the-blueprint-in-30-seconds">The blueprint in 30 seconds</h3>
<p>If you're impatient, here's what this article will show you:</p>
<ol>
<li><p>Add instruction files to your repo (your team's actual patterns and rules)</p>
</li>
<li><p>Run AI review against your diff <em>before</em> opening the PR</p>
</li>
<li><p>Define severity levels so the AI doesn't flood you with noise</p>
</li>
<li><p>Let humans focus on judgment, let AI handle verification</p>
</li>
<li><p>Iterate weekly and:</p>
<ol>
<li><p>add what it missed,</p>
</li>
<li><p>remove what it nags about.</p>
</li>
</ol>
</li>
</ol>
<p>The rest of this article explains why this works, what can go wrong, and how to measure whether it's helping.</p>
<hr />
<h2 id="heading-section-2-the-bottleneck-everyone-measures">Section 2: The Bottleneck Everyone Measures</h2>
<p>Code review is one of the most visible bottlenecks in software delivery. In organizations that track DORA-style delivery metrics, review time shows up quickly as "time-to-merge," "time waiting for review," and "review rounds." <a target="_blank" href="https://www.faros.ai/blog/key-takeaways-from-the-dora-report-2025">DORA's 2025 report found that despite AI boosting PRs merged by 98%, code review time <em>increased</em> by 91%</a>—a counterintuitive result suggesting AI generates more code faster than teams can absorb.[^7]</p>
<p>The research on code review effectiveness is sobering. <a target="_blank" href="http://viewer.media.bitpipe.com/1253203751_753/1284482743_310/11_Best_Practices_for_Peer_Code_Review.pdf">A study from Cisco's programming team</a>—often summarized in industry guidance from SmartBear—converges on what most teams learn through experience:[^3]</p>
<ul>
<li><p><strong>200–400 lines of code</strong> is the optimal review size for defect detection</p>
</li>
<li><p>Review sessions longer than <strong>60 minutes</strong> show diminishing returns as reviewer attention degrades</p>
</li>
<li><p>Reviewers process approximately <strong>500 lines per hour</strong> effectively; beyond that, quality drops</p>
</li>
</ul>
<p>These aren't arbitrary guidelines. They reflect cognitive limits. <strong>A 2,000-line PR isn't just harder to review—it's fundamentally incompatible with how human attention works</strong>. Yet large PRs are common because splitting work creates coordination overhead.</p>
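<p>The size guideline is easy to automate. Here is a minimal Python sketch (the function names and the exact threshold are mine, not from any standard tool) that counts changed lines in a unified diff and flags oversized PRs:</p>
<pre><code class="lang-python"># Count changed lines in a unified diff and flag PRs that exceed the
# ~400-line guideline for effective review.
def changed_lines(diff_text):
    count = 0
    for line in diff_text.splitlines():
        if line.startswith(("+++", "---")):  # file headers, not changes
            continue
        if line.startswith(("+", "-")):
            count += 1
    return count

def review_size_warning(diff_text, limit=400):
    n = changed_lines(diff_text)
    if n > limit:
        return f"PR changes {n} lines (limit {limit}); consider splitting it or adding a review guide."
    return None
</code></pre>
<p>A CI step could run this over the PR diff and surface the warning in the PR conversation before any human looks at the code.</p>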
<p>The bottleneck isn't laziness or lack of process. It's that thorough code review competes with the same cognitive resources needed for feature development. When a senior engineer spends two hours reviewing a PR, those are two hours not spent on architecture decisions, mentoring, or their own deliverables.</p>
<p>Organizations respond predictably: review depth decreases as deadlines approach. The checks that slip first are exactly the ones AI handles well—style consistency, documentation completeness, pattern adherence.</p>
<p>There's another factor that rarely comes up in conversations with colleagues—human reviewers aren't consistent across authors. We review some colleagues more thoroughly than others. The senior engineer's PR gets a quick approval while the new hire's PR gets line-by-line scrutiny. These biases aren’t malicious—they’re human. But they mean the <em>same code</em> gets different verification depending on who wrote it.</p>
<p>This is where AI changes the equation—not by replacing human judgment on complex architectural decisions, but by taking on the verification layer humans consistently deprioritize under pressure—and applying it uniformly regardless of author.</p>
<hr />
<h2 id="heading-section-3-the-blueprint-structured-ai-instructions-quick-start-kit">Section 3: The Blueprint — Structured AI Instructions (Quick Start Kit)</h2>
<p><strong>The difference between useful AI PR reviews and noise is structure</strong>. AI tools without context produce generic feedback—the equivalent of running a linter with default rules on a codebase with its own conventions.</p>
<p>This isn't speculation. GitClear's analysis of 153 million lines of code found that code churn hit <a target="_blank" href="https://www.gitclear.com/ai_assistant_code_quality_2025_research">7.9% in 2024 (up from 5.5% in 2020), with copy/paste code rising to 12.3%</a>.[^5] The code patterns resembled work from <a target="_blank" href="https://www.gitclear.com/coding_on_copilot_data_shows_ais_downward_pressure_on_code_quality">"an itinerant contributor"</a>—someone unfamiliar with the codebase's conventions, duplicating logic that already exists elsewhere.</p>
<p><a target="_blank" href="https://arxiv.org/abs/2302.06590">GitHub's research on Copilot from <strong>2023</strong> showed a 55% speed improvement</a>.[^6] However, the study did not examine the effects of AI on quality. It's likely that increased speed without context led to <strong>quantity without quality</strong>. Developers probably spent time reviewing AI suggestions that went against architectural decisions, duplicated existing utilities, or introduced patterns the team had deliberately moved away from.</p>
<p>The lesson I learned? AI without understanding the codebase context doesn't just fail to help—it actually creates more work.</p>
<h3 id="heading-what-this-blueprint-does-not-do">What this blueprint does NOT do</h3>
<p>To set expectations clearly:</p>
<ul>
<li><p><strong>No auto-merging</strong>—AI flags issues; humans decide what to do.</p>
</li>
<li><p><strong>No security sign-off</strong>—AI can check for obvious patterns (missing auth calls), but security review still needs human judgment.</p>
</li>
<li><p><strong>No reliable architecture decisions</strong>—AI might suggest a repository pattern or a way to structure your modules, but those calls need human judgment.</p>
</li>
<li><p><strong>No performance tuning</strong>—AI can flag obvious issues, but optimization requires context and execution AI doesn't have.</p>
</li>
<li><p><strong>No replacing code review</strong>—This enhances human review, it doesn't replace it.</p>
</li>
</ul>
<p><mark>The goal is narrower—consistent verification of documented standards, freeing humans for the judgment calls that actually need them</mark>.</p>
<h3 id="heading-quick-start-3060-minutes">Quick Start (30–60 minutes)</h3>
<p>If you want to try this without committing your team to “AI everywhere,” here’s the smallest version that works:</p>
<ol>
<li><p>Add a repo-wide instruction file (the rules you wish reviewers enforced consistently).</p>
</li>
<li><p>Add <em>one</em> path-specific instruction file for a high-value area (analytics is a great start).</p>
</li>
<li><p>Define severity levels so the AI doesn’t flood you with nits.</p>
</li>
<li><p>Run an AI review on your diff <em>before</em> opening the PR. This step is optional, but it pays off.</p>
</li>
<li><p>Iterate weekly—add what it missed, remove what it nags about.</p>
</li>
</ol>
<h3 id="heading-the-workflow-i-actually-use-pre-flight-before-humans">The workflow I actually use (pre-flight, before humans)</h3>
<p>The most effective integration I've found isn't AI reviewing PRs after they're opened—it's AI reviewing code before it reaches human reviewers at all.</p>
<ol>
<li><p>Write the feature</p>
</li>
<li><p>Push changes to a branch and open a PR</p>
</li>
<li><p>Run an AI review on the PR, either locally or on the server. I prefer running it on the server, since that keeps a history for future human reviewers.</p>
</li>
<li><p>Fix what it catches</p>
</li>
<li><p><em>Then</em> submit the PR for human review.</p>
</li>
</ol>
<p>This shifts AI review from "another reviewer in the queue" to a pre-flight checklist.</p>
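<p>To make the pre-flight step concrete, here is one way to bundle the repo's instruction files and the branch diff into a single review request. The file layout and prompt wording are illustrative assumptions, not any specific tool's API:</p>
<pre><code class="lang-python">from pathlib import Path

# Sketch of the pre-flight step: combine the instruction files with
# the diff into one review request for whatever AI reviewer you use.
def build_review_request(diff_text, instruction_paths):
    sections = []
    for p in instruction_paths:
        path = Path(p)
        if path.exists():  # silently skip files this repo doesn't have
            sections.append(f"# Instructions from {p}\n{path.read_text()}")
    sections.append("# Diff under review\n" + diff_text)
    sections.append("Report findings grouped by severity; flag only documented rules.")
    return "\n\n".join(sections)
</code></pre>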
<h3 id="heading-what-the-ai-catches"><strong>What the AI catches</strong></h3>
<p>With properly structured instructions, the AI reviewer enforces decisions the team has already made:</p>
<ul>
<li><p><strong>Analytics completeness</strong>:</p>
<ul>
<li><p>Every user action requires tracking.</p>
</li>
<li><p>The instruction file lists required parameters per event type.</p>
</li>
<li><p>AI verifies every event includes <code>screenName</code>, <code>userSegment</code>, and action-specific context.</p>
</li>
<li><p>No more discovering missing attribution data three sprints later.</p>
</li>
</ul>
</li>
<li><p><strong>MVVM boundaries</strong>:</p>
<ul>
<li><p>ViewModels don't import UIKit.</p>
</li>
<li><p>Views don't contain business logic.</p>
</li>
<li><p>Coordinators handle navigation.</p>
</li>
<li><p>These aren't suggestions—they're structural decisions.</p>
</li>
<li><p>AI flags violations before they become patterns.</p>
</li>
</ul>
</li>
<li><p><strong>Protocol adoption</strong>:</p>
<ul>
<li><p>The codebase has established patterns for REST API integration—specific protocols for request building, response parsing, error handling.</p>
</li>
<li><p>A new endpoint that skips <code>APIRequestConfigurable</code> or handles errors inline instead of through <code>APIErrorHandler</code> gets flagged immediately.</p>
</li>
</ul>
</li>
<li><p><strong>Abstraction adherence</strong>:</p>
<ul>
<li><p>When the team decided all persistence goes through repository interfaces, that decision needs enforcement.</p>
</li>
<li><p>AI spots shortcuts when someone, whether it's the new kid on the block or the project maverick, decides to query Core Data directly "just this once".</p>
</li>
</ul>
</li>
<li><p><strong>The small things</strong>:</p>
<ul>
<li><p>Debug print statements.</p>
</li>
<li><p>TODO comments that should be tickets.</p>
</li>
<li><p>Force unwraps that should be guard statements.</p>
</li>
<li><p>Hardcoded strings that belong in localization files.</p>
</li>
<li><p>The reviewer <em>may</em> catch these, but why waste their attention on them?</p>
</li>
</ul>
</li>
</ul>
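<p>The "small things" category barely needs a model at all; a short script over the added diff lines already catches the obvious cases. The regexes below are naive illustrations, not production rules:</p>
<pre><code class="lang-python">import re

# Toy verification pass for the "small things": debug prints, TODO
# comments, and force unwraps in added Swift lines. Real tools parse
# the AST; this regex sketch only illustrates the idea.
RULES = {
    "debug print": re.compile(r"\bprint\("),
    "TODO left in code": re.compile(r"//\s*TODO"),
    "force unwrap": re.compile(r"\w!\s*(?:\.|\)|$)"),
}

def scan_added_lines(diff_text):
    findings = []
    for lineno, line in enumerate(diff_text.splitlines(), start=1):
        if not line.startswith("+") or line.startswith("+++"):
            continue  # only inspect lines added by this diff
        for label, pattern in RULES.items():
            if pattern.search(line):
                findings.append((lineno, label))
    return findings
</code></pre>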
<h3 id="heading-repository-instructions-example-github-copilot">Repository instructions (example: GitHub Copilot)</h3>
<p>GitHub Copilot supports two levels of instruction files:</p>
<p><strong>Repository-wide instructions</strong> (<code>.github/copilot-instructions.md</code>):</p>
<pre><code class="lang-markdown"><span class="hljs-section"># Project Instructions</span>

This codebase follows MVVM architecture with Coordinators for navigation.

<span class="hljs-section">## Review split</span>
<span class="hljs-bullet">-</span> Verification tasks should be enforced consistently by AI.
<span class="hljs-bullet">-</span> Judgment calls belong to humans.

<span class="hljs-section">## Architecture boundaries</span>
<span class="hljs-bullet">-</span> ViewModels should always be marked as @MainActor
<span class="hljs-bullet">-</span> Coordinators handle navigation

<span class="hljs-section">## Concurrency</span>
<span class="hljs-bullet">-</span> All async operations use Swift Concurrency, not Combine

<span class="hljs-section">## Analytics</span>
<span class="hljs-bullet">-</span> Analytics events require both action and context parameters
<span class="hljs-bullet">-</span> Do not ship debug logging or TODOs; convert TODOs to tickets

<span class="hljs-section">## Quality</span>
<span class="hljs-bullet">-</span> Prefer small PRs; if a PR exceeds ~400 lines, include a short review guide in the PR description
</code></pre>
<p><strong>Path-specific instructions</strong> (<code>.github/instructions/*.instructions.md</code>):</p>
<pre><code class="lang-markdown">---
applyTo: "Sources/Analytics/**"
---

# Analytics Module Instructions

## Event Naming
- Use dot-separated lowercase names (e.g., `article.read.completed`)
- Include `screen` context in all events

## Required Parameters
Every analytics event must include:
- `eventName`: The dot-separated event identifier
- `timestamp`: ISO 8601 format
- `sessionId`: Current session identifier
</code></pre>
<p>To show this isn’t “just analytics,” here’s a second path-specific example (choose a module where you’ve been burned before):</p>
<pre><code class="lang-markdown">---
applyTo: "Sources/Networking/**"
---

# Networking Module Instructions

## Consistency
- New endpoints must use the shared request builder and response decoder
- Do not parse JSON inline inside feature code

## Error handling
- Map transport errors into the shared error type
- Do not swallow errors; return typed failures and log at the boundary

## Testing
- Add unit tests for request encoding and response decoding when adding endpoints
</code></pre>
<h3 id="heading-severity-rubric-to-prevent-noise">Severity rubric (to prevent noise)</h3>
<p>If everything is "important," the AI becomes background noise. You can use a simple rubric like this one:</p>
<div class="hn-table">
<table>
<thead>
<tr>
<td><strong><em>Severity</em></strong></td><td><strong><em>Examples</em></strong></td></tr>
</thead>
<tbody>
<tr>
<td><strong>Blocker</strong></td><td>Missing security/permission checks, data-loss risk, crashing bugs, secrets in code</td></tr>
<tr>
<td><strong>High</strong></td><td>Analytics schema gaps, missing required tests, architecture boundary violations</td></tr>
<tr>
<td><strong>Medium</strong></td><td>Pattern inconsistencies, error handling deviations, unclear naming</td></tr>
<tr>
<td><strong>Low</strong></td><td>Style nits, formatting, small readability issues</td></tr>
</tbody>
</table>
</div><h3 id="heading-what-a-good-ai-review-comment-looks-like-output-format">What a good AI review comment looks like (output format)</h3>
<p>Here’s the structure I aim for (this is what I want posted as a review, or returned locally):</p>
<ul>
<li><p><strong>Summary</strong> (2–4 bullets)</p>
</li>
<li><p><strong>Findings by severity</strong> (ordered Blocker → Low; counts or percentages both work)</p>
</li>
<li><p><strong>Suggested tests / QA scenarios</strong> (derived from actual diff)</p>
</li>
<li><p><strong>Needs human judgment</strong> (explicitly carve out trade-offs)</p>
</li>
</ul>
<p>Example:</p>
<pre><code class="lang-markdown"><span class="hljs-section">## AI Pre-Flight Review</span>

<span class="hljs-section">### Summary</span>
<span class="hljs-bullet">-</span> Adds purchase flow completion tracking
<span class="hljs-bullet">-</span> Refactors CheckoutViewModel concurrency to async/await

<span class="hljs-section">### Blockers</span>
<span class="hljs-bullet">-</span> None

<span class="hljs-section">### High</span>
<span class="hljs-bullet">-</span> Analytics event <span class="hljs-code">`checkout.purchase.completed`</span> missing <span class="hljs-code">`currency`</span>

<span class="hljs-section">### Medium</span>
<span class="hljs-bullet">-</span> ViewModel is not marked with @MainActor; move formatting helper into view layer

<span class="hljs-section">### Suggested QA</span>
<span class="hljs-bullet">-</span> Complete purchase with invalid promo code and verify analytics fires with full parameter set
<span class="hljs-bullet">-</span> Cold start into checkout deep link

<span class="hljs-section">### Needs human judgment</span>
<span class="hljs-bullet">-</span> Is the new repository abstraction worth the extra indirection for this feature?
</code></pre>
<hr />
<h2 id="heading-section-4-failure-modes-amp-guardrails">Section 4: Failure Modes &amp; Guardrails</h2>
<p>AI review is powerful precisely because it’s consistent—but consistency cuts both ways. Here’s what I’ve seen go wrong, and the guardrails that keep it useful.</p>
<h3 id="heading-failure-modes">Failure modes</h3>
<ul>
<li><p><strong>Instruction drift</strong>—The AI enforces outdated rules that no longer apply. I find this very similar to when a team member follows outdated documentation.</p>
</li>
<li><p><strong>False positives → alert fatigue</strong>—People start ignoring what the bot writes.</p>
</li>
<li><p><strong>False negatives → false confidence</strong>—Teams assume "the bot didn't complain" means "it's correct."</p>
</li>
<li><p><strong>Overreach into judgment</strong>—AI tries to dictate architecture instead of just highlighting risks. (I haven't seen this happen yet, but it's a potential risk)</p>
</li>
<li><p><strong>Security/privacy mistakes</strong>—Diffs may include secrets or sensitive data, and prompts might leak information. (Always be cautious about this)</p>
</li>
<li><p><strong>Social misuse</strong>—AI comments are used to judge engineer performance.</p>
</li>
</ul>
<h3 id="heading-guardrails">Guardrails</h3>
<ul>
<li><p><strong>Treat instruction files like code</strong>—assign an owner, review changes, and revisit at least quarterly.</p>
</li>
<li><p><strong>Cap output</strong>—top N findings, group by severity, and link each finding to a specific rule.</p>
</li>
<li><p><strong>Make the split explicit</strong>—<strong><em>AI verifies; humans judge</em>.</strong></p>
</li>
<li><p><strong>Audit occasionally</strong>—sample 1-2 in 10 PRs to estimate bot accuracy and tune rules.</p>
</li>
</ul>
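<p>The "cap output" guardrail is simple to implement. The finding shape (a dict with a <code>severity</code> key) and the severity order are assumptions for illustration:</p>
<pre><code class="lang-python"># Severity ranking matching the rubric above.
SEVERITY_ORDER = ["blocker", "high", "medium", "low"]

def cap_findings(findings, limit=10):
    """Keep the top-N findings by severity; report how many were suppressed."""
    ranked = sorted(findings, key=lambda f: SEVERITY_ORDER.index(f["severity"]))
    kept = ranked[:limit]
    return kept, len(ranked) - len(kept)
</code></pre>
<p>Reporting the suppressed count matters: it tells reviewers the bot found more than it showed, without flooding the thread.</p>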
<hr />
<h2 id="heading-section-5-what-ai-finds-that-humans-miss-detailed-examples">Section 5: What AI Finds That Humans Miss (Detailed Examples)</h2>
<p>The value of AI PR reviews isn't catching what humans would catch anyway—it's catching what humans consistently deprioritize.</p>
<h3 id="heading-analytics-implementation-errors">Analytics implementation errors</h3>
<p>Analytics tracking is the canonical example. A missing parameter in an analytics event doesn't break the build. It doesn't cause runtime errors. It silently produces incomplete data that nobody notices until someone runs a report months later.</p>
<p>Human reviewers know analytics matters. They also know it's boring to verify. Under time pressure, “analytics looks fine” becomes the default assessment.</p>
<p>AI doesn’t experience time pressure. Given instructions like “every purchase event must include <code>productId</code>, <code>price</code>, <code>currency</code>, and <code>purchaseContext</code>,” it verifies every event, every time.</p>
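<p>That check reduces to set arithmetic. The schema below is a hypothetical stand-in for whatever your instruction file actually lists:</p>
<pre><code class="lang-python"># Hypothetical schema: required parameters per event type, as listed
# in the instruction file. Names mirror the example in the prose.
REQUIRED_PARAMETERS = {
    "checkout.purchase.completed": {"productId", "price", "currency", "purchaseContext"},
}

def missing_parameters(event_name, payload):
    required = REQUIRED_PARAMETERS.get(event_name, set())
    return required - set(payload)  # parameters the event forgot to send
</code></pre>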
<h3 id="heading-documentation-drift">Documentation drift</h3>
<p>Documentation that doesn't match code is worse than no documentation—it actively misleads. But keeping documentation synchronized requires noticing when code changes invalidate docs in other files.</p>
<p>Humans review changed files. AI can be instructed to check whether changes to a public API have corresponding documentation updates, whether removed parameters are still referenced, and whether examples still compile.</p>
<h3 id="heading-pattern-adherence">Pattern adherence</h3>
<p>Every codebase accumulates patterns—some documented, many implicit. New team members don’t know them; experienced team members forget to check them during reviews.</p>
<p>AI, given explicit patterns, checks consistently.</p>
<h3 id="heading-access-control-verification">Access control verification</h3>
<p>Permission checks follow predictable patterns but fail in subtle ways. A new endpoint that forgets to verify ownership. A bulk operation that checks permissions on the first item but not subsequent ones.</p>
<p>Human reviewers catch these when they're looking for them. AI, instructed with “every endpoint modifying user data must call <code>verifyOwnership()</code> before the operation,” checks every endpoint, every time.</p>
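<p>A toy version of that rule shows the mechanics; <code>store.save</code> is a hypothetical mutation call, and real enforcement would work on parsed code rather than raw strings:</p>
<pre><code class="lang-python"># Toy ownership check: an endpoint that mutates user data must call
# verifyOwnership() before the mutation happens.
def violates_ownership_rule(endpoint_source):
    mutate_at = endpoint_source.find("store.save(")
    if mutate_at == -1:
        return False  # read-only endpoint; the rule does not apply
    check_at = endpoint_source.find("verifyOwnership(")
    return check_at == -1 or check_at > mutate_at
</code></pre>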
<h3 id="heading-edge-case-handling">Edge-case handling</h3>
<p>Certain categories of bugs follow predictable patterns: off-by-one errors in pagination, timezone handling in date comparisons, null checks on optional chains.</p>
<p><strong><em><mark>The meta-insight</mark></em><mark>: AI review doesn't replace human judgment. It enforces documented judgment that humans apply inconsistently.</mark></strong></p>
<hr />
<h2 id="heading-section-6-how-to-measure-whether-it-worked">Section 6: How to Measure Whether It Worked</h2>
<p>If you want this to land with a mixed audience—ICs and leadership—you need a way to validate it beyond vibes.</p>
<h3 id="heading-metrics-that-will-probably-move-first">Metrics that will probably move first</h3>
<ul>
<li><p><strong>Time to first human review</strong> (does pre-flight reduce back-and-forth?)</p>
</li>
<li><p><strong>PR open → merge time</strong> (what is the improvement on average after 3 months?)</p>
</li>
<li><p><strong>Review rounds</strong> (how often does a PR bounce for “checklist stuff”?)</p>
</li>
<li><p><strong>Verification-class defects post-merge</strong> (analytics gaps, doc mismatches, missing permission checks)</p>
</li>
</ul>
<h3 id="heading-signals-for-ics-quality-of-the-bot-itself">Signals for ICs (quality of the bot itself)</h3>
<ul>
<li><p><strong>Acceptance rate</strong> (what % of AI findings lead to a code change?)</p>
</li>
<li><p><strong>Top recurring findings</strong> (the list that should become instruction updates)</p>
</li>
<li><p><strong>Human checklist comments trend</strong> (are humans spending less time on nits?)</p>
</li>
</ul>
<h3 id="heading-a-simple-approach">A simple approach:</h3>
<ol>
<li><p>measure two weeks of baseline,</p>
</li>
<li><p>enable pre-flight AI verification,</p>
</li>
<li><p>then compare the next 2–4 weeks.</p>
</li>
<li><p>You're not trying to publish a study; you're trying to see if your team is shipping with less rework.</p>
</li>
</ol>
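<p>The comparison itself is trivial once you export the durations; assuming a list of open→merge hours per PR for each window, something like:</p>
<pre><code class="lang-python">from statistics import median

# Minimal before/after comparison: median PR open-to-merge time, in
# hours, baseline window versus the window after enabling pre-flight
# AI review. A positive result means hours saved at the median.
def merge_time_delta(baseline_hours, after_hours):
    return median(baseline_hours) - median(after_hours)
</code></pre>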
<hr />
<h2 id="heading-section-7-the-documentation-accelerator">Section 7: The Documentation Accelerator</h2>
<p>There's a parallel to AI's impact on code review in an unexpected domain: management consulting.</p>
<p><a target="_blank" href="https://mitsloan.mit.edu/ideas-made-to-matter/how-generative-ai-can-boost-highly-skilled-workers-productivity">A multi-school study of consultants using GPT-4 found a 40% performance increase on tasks within AI's capability frontier—but a 19 percentage point <em>drop</em> when AI was applied outside its strengths.</a>[^4] The researchers called this "jagged" value—dramatic gains in some areas, negative impact in others.</p>
<p>That “jagged frontier” maps cleanly onto PR review.</p>
<p><strong>Senior engineers add unique value in:</strong></p>
<ul>
<li><p><strong>Architectural judgment</strong> (“this approach will create scaling problems”)</p>
</li>
<li><p><strong>Domain knowledge</strong> (“this flow doesn’t match how our users behave”)</p>
</li>
<li><p><strong>Teaching moments</strong> (“here’s why we don’t do it that way”)</p>
</li>
</ul>
<p><strong>They add less differentiated value in:</strong></p>
<ul>
<li><p><strong>Style consistency verification</strong></p>
</li>
<li><p><strong>Checklist completion</strong> (tests present, docs updated, no debug code)</p>
</li>
<li><p><strong>Pattern matching against documented standards</strong></p>
</li>
</ul>
<p><strong>AI handles the second category, freeing humans for the first.</strong></p>
<p>The consulting comparison reveals something else—the teams that capture AI's value aren't the ones with the best tools; they're the ones with the most explicit standards. <mark>A team with "our code should be high quality" gets nothing from AI. A team with documented conventions and named patterns can offload verification almost entirely—and the documentation improves reviews even without a bot.</mark></p>
<hr />
<h2 id="heading-section-8-conclusion">Section 8: Conclusion</h2>
<p>The bottleneck in code review isn’t going away. Codebases grow. Teams scale. Cognitive limits don’t change because we wish they would.</p>
<p><mark>What changes is what we ask humans to do.</mark></p>
<p>The shift isn't "let AI review PRs."<br />It's: <strong>use AI for verification so humans can focus on judgment.</strong></p>
<p>Human reviewers bring bias—we review some colleagues more thoroughly than others, we're influenced by past experiences with specific authors, we give different weight to the same patterns depending on who wrote them. AI reviewers bring different bias—they're limited to what the instructions encode. They can't reliably catch what you didn't think to document, and they won't reliably recognize context that seems obvious to a human who's been on the team for years.</p>
<p>This trade-off is the point. AI bias is explicit and auditable—it's in the instruction file. Human bias is implicit and variable. For verification tasks with documented criteria, explicit bias wins. For judgment calls requiring context and nuance, human bias (with all its flaws) is still necessary.</p>
<p>That's also why this is a great place for skeptical teams to start. The verification layer is explicit, auditable, and low-risk—and it pays back quickly in reduced rework.</p>
<p>The blueprint is straightforward:</p>
<ol>
<li><p><strong>Document your standards explicitly →</strong> If a convention exists only in senior engineers’ heads, AI can’t enforce it—and neither can anyone else consistently.</p>
</li>
<li><p><strong>Start with high-value, low-risk checks →</strong> Analytics, docs sync, access control patterns, boundary rules.</p>
</li>
<li><p><strong>Integrate with existing workflow →</strong> Pre-flight is the key—catch issues before humans see the PR.</p>
</li>
<li><p><strong>Iterate on instructions →</strong> Misses and noise are feedback. Update the instruction file like you update tests.</p>
</li>
</ol>
<p>The question isn’t whether AI can help with code review. It already can—today—for verification tasks.</p>
<p>The question is whether your team’s knowledge is documented well enough to leverage it. And if not, whether making it explicit is worth doing anyway.</p>
<hr />
<h2 id="heading-references">References</h2>
<p>[^1]: GitHub. "Octoverse 2025: AI leads developer activity." GitHub Blog, 2025. <a target="_blank" href="https://github.blog/news-insights/octoverse/octoverse-a-new-developer-joins-github-every-second-as-ai-leads-typescript-to-1/">https://github.blog/news-insights/octoverse/octoverse-a-new-developer-joins-github-every-second-as-ai-leads-typescript-to-1/</a></p>
<p>[^2]: Stack Overflow. "2025 Developer Survey: AI." Stack Overflow, 2025. <a target="_blank" href="https://survey.stackoverflow.co/2025/ai">https://survey.stackoverflow.co/2025/ai</a></p>
<p>[^3]: SmartBear. "11 Best Practices for Peer Code Review." SmartBear Software, 2025. <a target="_blank" href="http://viewer.media.bitpipe.com/1253203751_753/1284482743_310/11_Best_Practices_for_Peer_Code_Review.pdf">http://viewer.media.bitpipe.com/1253203751_753/1284482743_310/11_Best_Practices_for_Peer_Code_Review.pdf</a></p>
<p>[^4]: Dell'Acqua, F., et al. "Navigating the Jagged Technological Frontier: Field Experimental Evidence of the Effects of AI on Knowledge Worker Productivity and Quality." Harvard Business School Working Paper 24-013, 2023. Summary: <a target="_blank" href="https://mitsloan.mit.edu/ideas-made-to-matter/how-generative-ai-can-boost-highly-skilled-workers-productivity">https://mitsloan.mit.edu/ideas-made-to-matter/how-generative-ai-can-boost-highly-skilled-workers-productivity</a></p>
<p>[^5]: GitClear. "Coding on Copilot: 2023 Data Suggests Downward Pressure on Code Quality." GitClear, January 2024. <a target="_blank" href="https://www.gitclear.com/coding_on_copilot_data_shows_ais_downward_pressure_on_code_quality">https://www.gitclear.com/coding_on_copilot_data_shows_ais_downward_pressure_on_code_quality</a> (2025 follow-up data confirms continued churn growth: <a target="_blank" href="https://www.gitclear.com/ai_assistant_code_quality_2025_research">https://www.gitclear.com/ai_assistant_code_quality_2025_research</a>)</p>
<p>[^6]: Peng, S., Kalliamvakou, E., Cihon, P., Demirer, M. "The Impact of AI on Developer Productivity: Evidence from GitHub Copilot." arXiv:2302.06590, February 2023. <a target="_blank" href="https://arxiv.org/abs/2302.06590">https://arxiv.org/abs/2302.06590</a></p>
<p>[^7]: DORA. "DORA Report 2025: AI Impact on Developer Productivity." Google Cloud, 2025. <a target="_blank" href="https://www.faros.ai/blog/key-takeaways-from-the-dora-report-2025">https://www.faros.ai/blog/key-takeaways-from-the-dora-report-2025</a></p>
]]></content:encoded></item><item><title><![CDATA[I Built an App with Claude Code… But Claude Wasn't the Point]]></title><description><![CDATA[The Hook
The irony wasn't lost on me:
I used an AI coding assistant to create a dashboard that excludes the assistant for tasks that can run statically on the API alone, but includes the assistant for PR reviews when necessary.
A few days, a lot of p...]]></description><link>https://groundctrl.dev/i-built-an-app-with-claude-code-but-claude-wasnt-the-point</link><guid isPermaLink="true">https://groundctrl.dev/i-built-an-app-with-claude-code-but-claude-wasnt-the-point</guid><category><![CDATA[2Articles1Week]]></category><category><![CDATA[macOS]]></category><category><![CDATA[claude-code]]></category><category><![CDATA[claude]]></category><category><![CDATA[Jira automation]]></category><category><![CDATA[GitHub]]></category><category><![CDATA[ci-cd]]></category><dc:creator><![CDATA[Deyan Aleksandrov]]></dc:creator><pubDate>Sun, 28 Dec 2025 18:27:56 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/stock/unsplash/XmLULwMRxcU/upload/7ab913e678adaad86679b8d52ee33f52.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<h2 id="heading-the-hook">The Hook</h2>
<p><strong>The irony</strong> wasn't lost on me:</p>
<p>I used an AI coding assistant to build a dashboard that keeps the assistant out of anything plain API calls can handle, and brings it in only where it earns its keep: PR reviews.</p>
<p>A few days, a lot of prompts, and now <strong>255</strong> <a target="_blank" href="https://github.com/steveyegge/beads"><strong>beads</strong></a> <strong>issues</strong> later, I ended up with something more interesting (at least to me) than “just another AI wrapper”:</p>
<ul>
<li><p>255 issues total</p>
</li>
<li><p>219 closed</p>
</li>
<li><p>36 open</p>
</li>
<li><p>16 blocked</p>
</li>
<li><p>20 ready to work</p>
</li>
</ul>
<p><strong>The twist</strong>:</p>
<p>The more I leaned on Claude Code to build the app, the more I wanted the app itself to <em>NOT</em> lean on Claude Code. Wherever possible, I wanted plain APIs, local logic, and headless workflows that would keep working even if I swapped the AI out.</p>
<hr />
<h2 id="heading-the-original-pain-too-many-tabs-not-enough-flow">The Original Pain: Too Many Tabs, Not Enough Flow</h2>
<p>My typical morning looked like this:</p>
<ul>
<li><p>Open Issue Tracker (e.g. Jira), check tickets assigned to me or to others.</p>
</li>
<li><p>Open Git Tracker (e.g. GitHub, GitLab), check PRs needing review.</p>
</li>
<li><p>Open CI/CD Service (e.g. Bitrise, GitLab CI), see what’s red or green, get a build out.</p>
</li>
<li><p>Open Messaging App (e.g. Teams, Slack), write a status update by hand ...</p>
</li>
</ul>
<p>Each step is fine in isolation.<br />Together, it’s an “18 tabs open and zero real flow” situation.</p>
<p>The obvious advice is to <em>just use an AI plugin/command inside Claude Code</em>. I do. Or <em>just use an AI plugin inside Jira</em>. I do that too.<br />They’re super useful. But they’re still trapped inside their respective tools, or they’re incomplete, and they don’t give me:</p>
<ul>
<li><p>One place to see what’s ready to work on.</p>
</li>
<li><p>One place to see failing builds.</p>
</li>
<li><p>One place to see PRs that need attention.</p>
</li>
<li><p>One place where AI can help with reviews and summaries, without jumping between tabs.</p>
</li>
</ul>
<p><strong>So I built a small macOS cockpit for myself</strong>.</p>
<hr />
<h2 id="heading-beads-the-issue-system-behind-it">Beads: The Issue System Behind It</h2>
<p>Before getting into the app, a quick nod to <a target="_blank" href="https://github.com/steveyegge/beads"><strong>beads</strong></a>. I used my cockpit project as an excuse to properly test <a target="_blank" href="https://github.com/steveyegge/beads"><strong>beads</strong></a> for git‑native issues and dependency graphs with Claude Code, and I’m genuinely happy with it.</p>
<p>The numbers above (255 issues, 219 closed, and so on) come from that system. The dependency graphs and “ready to work” list made it much easier to ask a simple question: <em>“What should I do next?”</em> and get a straight answer.</p>
<p>I had originally planned a bigger comparison in this article between:</p>
<ul>
<li><p>Claude Code’s built‑in <code>/plan</code></p>
</li>
<li><p>Anthropic’s <a target="_blank" href="https://github.com/anthropics/claude-code/tree/main/plugins/feature-dev">feature‑dev</a> plugin</p>
</li>
<li><p><a target="_blank" href="https://github.com/steveyegge/beads"><strong>beads</strong></a></p>
</li>
</ul>
<p>That deep dive can be its own article (and it will be). In <em>this</em> one, the important part is simpler: <strong>beads did its job well enough</strong> that I stopped thinking about my planning tool and focused on building the app.</p>
<hr />
<h2 id="heading-building-with-claude-but-not-around-it">Building With Claude, But Not Around It</h2>
<p><strong>Claude Code still did all of the heavy lifting</strong>:</p>
<ul>
<li><p>Wiring API clients for all git tracking, ticket tracking and CI/CD services <em>because it knows them</em>.</p>
</li>
<li><p>Building the macOS UI and wiring it to those clients.</p>
</li>
<li><p>Generating issue templates, refactors, and <em>unit tests</em> (<strong>512 of them so far</strong>).</p>
</li>
</ul>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1766941489561/2297c7bb-c335-4036-b56e-557ab0cea3ed.png" alt class="image--center mx-auto" /></p>
<p>But every time I hit a design decision, I tried to ask:</p>
<blockquote>
<p>“Can this feature run without Claude? Can the app still be useful if I swap the AI provider or turn it off?”</p>
</blockquote>
<p>That question changed how I structured things.</p>
<h3 id="heading-what-the-app-handles-directly">What the App Handles Directly</h3>
<p>Anywhere the standard APIs were enough, the app uses them directly:</p>
<ul>
<li><p><strong>Issue Tracker</strong> – saved queries, filters, and ticket details.</p>
</li>
<li><p><strong>Git Tracker</strong> – listing PRs, statuses, basic metadata.</p>
</li>
<li><p><strong>Build service</strong> – triggering builds where it makes sense.</p>
</li>
<li><p><strong>Local notifications</strong> – reminders for saved queries or conditions I care about.</p>
</li>
</ul>
<p>None of that requires AI to function. It's simply a streamlined UI over APIs that I would otherwise use individually.</p>
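<p>As a sketch of what “directly” means here (hypothetical code, not the app’s actual Swift source), the PR list can be reduced to dashboard rows straight from the Git tracker’s JSON. The field names follow GitHub’s <code>/repos/{owner}/{repo}/pulls</code> response; the row shape is invented for illustration:</p>
<pre><code class="lang-python"># Hedged sketch: build dashboard rows from raw PR JSON, no AI involved.
def pr_rows(pulls):
    """Reduce the Git tracker's PR payload to what the dashboard shows."""
    return [
        {
            "number": pr["number"],
            "title": pr["title"],
            "author": pr["user"]["login"],   # GitHub nests the author here
            "draft": pr.get("draft", False),
        }
        for pr in pulls
    ]
</code></pre>
<p>Everything the AI later summarizes starts from plain structures like these, which is part of what makes the AI layer swappable.</p>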
<h3 id="heading-where-ai-still-adds-real-leverage">Where AI Still Adds Real Leverage</h3>
<p>Then there are a few spots where AI really <em>does</em> change the experience:</p>
<ol>
<li><p><strong>Headless PR reviews</strong></p>
<ul>
<li><p>From the dashboard, I can select multiple PRs and trigger reviews.</p>
</li>
<li><p>Reviews run as background jobs.</p>
</li>
<li><p>Each one produces a structured summary with findings and checkboxes.</p>
</li>
<li><p>When I select what I agree with, the app posts a GitHub review from my account with the correct line‑level comments.</p>
</li>
</ul>
</li>
</ol>
<p>    This feels like the <strong>“killer feature”</strong>: I can run multiple reviews in parallel and then apply judgment, instead of reading every PR from scratch.</p>
<div class="embed-wrapper"><div class="embed-loading"><div class="loadingRow"></div><div class="loadingRow"></div></div><a class="embed-card" href="https://youtu.be/dw3QNFCKr2w">https://youtu.be/dw3QNFCKr2w</a></div>
<ol start="2">
<li><p><strong>Summaries for tickets and PRs</strong></p>
<ul>
<li><p>Short, consistent summaries for status updates or messaging app posts.</p>
</li>
<li><p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1766943417837/fcc3418c-b26d-40ae-b5af-2f96a607788f.png" alt="A software interface showing &quot;Open Beads PRs&quot; with four pull requests (PRs) listed. Each PR has information such as status, author, date, and a brief description. The summary section on the right details each PR." class="image--center mx-auto" /></p>
<p>  For many cases, Apple’s local foundation models are enough (and free).</p>
<p>  <img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1766943610425/13f7c2af-719e-4ffe-8638-413556743134.png" alt="A screenshot of a code review application interface displaying open pull requests for the &quot;Beads&quot; repository. On the left, there's a filter panel for queries, ticket prefixes, time ranges, and PR states. Four open pull requests are listed below. On the right, a summary section highlights bug fixes and updates, focusing on enhancing community tools and improving cycle detection efficiency." class="image--center mx-auto" /></p>
</li>
<li><p>For heavier contexts or trickier summaries, I can fall back to Claude or another provider.</p>
</li>
</ul>
</li>
</ol>
<p>The design is “AI tiered by cost and capability”, not “everything through the most expensive model by default”.<br />So the AI is important—but it’s not the only brain. The app is built so that most of the value comes from the workflow and aggregation, not from one specific model.</p>
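<p>A minimal sketch of that tiering (the threshold and provider names here are made up, not GroundCTRL’s real routing):</p>
<pre><code class="lang-python"># Route cheap work to a free local model; fall back to a paid provider
# only for large contexts or tasks that need stronger reasoning.
LOCAL_CONTEXT_LIMIT = 4_000  # chars the local model handles comfortably

def pick_provider(text, needs_reasoning=False):
    if not needs_reasoning and len(text) &lt;= LOCAL_CONTEXT_LIMIT:
        return "local-foundation-model"  # free, on-device
    return "claude"  # paid fallback for heavier jobs
</code></pre>
<p>The point isn’t the exact cutoff; it’s that the routing decision lives in the app, so the expensive model is a choice, not a default.</p>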
<hr />
<h2 id="heading-living-with-ai-amnesia">Living With AI Amnesia</h2>
<p>Of course, working with Claude Code itself wasn’t perfectly smooth. I set clear instructions like “use the beads issue tracker for planning, not todos”, and still had loops like:</p>
<pre><code class="lang-text">Claude: Let me create a todo list to track this…
Me: Use the issue tracker, not todos.
Claude: You're right, I'll create issues instead.
[Later…]
Claude: I'll add this to the todo list…
</code></pre>
<p>Some of that is context limits, some is built‑in prompts leaning toward native tools. The practical takeaways for me:</p>
<ul>
<li><p>You need <strong>constant reminders</strong> about your workflow - the <code>CLAUDE.md</code> file has instructions for using <a target="_blank" href="https://github.com/steveyegge/beads"><strong>beads</strong></a> but is not always respected.</p>
</li>
<li><p>You need <strong>short, explicit prompts</strong> like “Plan this as <a target="_blank" href="https://github.com/steveyegge/beads"><strong>beads</strong></a> issues” instead of hoping it remembers.</p>
</li>
<li><p>You need to <strong>accept that a bit of drift</strong> and correction is normal.</p>
</li>
</ul>
<p>Again, <a target="_blank" href="https://github.com/steveyegge/beads"><strong>beads</strong></a> helped here - once issues existed in the repo and I reminded the bot to use them, they survived the AI’s memory lapses.</p>
<hr />
<h2 id="heading-a-quick-detour-designing-the-icon-and-failing-figmas-ai">A Quick Detour: Designing the Icon (And Failing Figma’s AI)</h2>
<p>Another fun side quest - <strong>the icon</strong>.</p>
<p>I’m not a designer, but I wanted something that felt at home next to Xcode, VS Code, etc. So I tried:</p>
<ul>
<li>Figma with its AI features and a bunch of prompts for “macOS app icon”, “ground control”, “developer cockpit”, and so on.</li>
</ul>
<p>The results were… NOT fine. For me, Figma’s AI was useful as a brainstorming nudge, but not as a “give me a final icon” tool. If it had worked, it would’ve been too easy.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1766944298128/e6f2b611-3d7d-499f-ac39-cd9af60d52ec.png" alt="A silhouette of a black bear stands on top of a nameplate with two lines of text." class="image--center mx-auto" /></p>
<ul>
<li><p><strong>What ended up working was, believe it or not, Perplexity!</strong></p>
<ul>
<li><p>A few iterations on a “control stand” / tower motif, and I had something I could work with.</p>
</li>
<li><p>Iterating on colors and lighting.</p>
</li>
<li><p>A final touch-up through the Icon Composer macOS App, and I was all set.</p>
</li>
</ul>
</li>
</ul>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1766944641388/924fb997-c61e-4cbb-8af8-e9dd71b6a36b.png" alt="A stylized blue icon resembling a digital device with a screen, displaying a red horizontal bar in the center. The screen is mounted on a blue base, set against a dark background." class="image--center mx-auto" /></p>
<p>That whole process could be its own short article: <em>“I tried to get Figma’s AI to design my app icon. It didn’t. Here’s what actually worked - Perplexity.”</em></p>
<p>For this story, it’s just another example of the same pattern:<br /><mark>AI can help, but the workflow and judgment still have to be yours.</mark></p>
<hr />
<h2 id="heading-notifications-flags-and-keeping-it-yours">Notifications, Flags, and Keeping It Yours</h2>
<p>A few other parts that turned out surprisingly useful:</p>
<ul>
<li><p><strong>Local notifications</strong> – for saved queries or “watchlists”; easily testable from settings so you can check your notification logic without waiting a week.</p>
<p>  <img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1766944832921/170faee0-6ac0-462f-ba56-ac84fababa41.png" alt="Screenshot of a software interface displaying &quot;Features&quot; and &quot;Notification Center.&quot; The Features section shows settings for PR Review Provider with Claude Code available, PR Batch Review with concurrent reviews set to 2, and notifications status as authorized. The Notification Center displays a message about a PR review starting." class="image--center mx-auto" /></p>
</li>
<li><p><strong>Feature flags</strong> – simple switches to hide integrations I’m not using at the moment. This keeps the cockpit focused instead of becoming a cluttered control panel.</p>
<p>  <img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1766944867494/97bb4e5e-fca9-4a16-a88d-39959833b8f5.png" alt="Screenshot of a software features panel showing integrations and AI features. There are toggles for Jira, GitHub, and Bitrise integrations, all enabled. AI features include PR Review, Batch PR Review, and AI Summary, also enabled. Icons for Bitrise, Repositories, Jira, and Features are displayed at the top." class="image--center mx-auto" /></p>
</li>
</ul>
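<p>The “watchlist” idea behind those notifications is simple enough to sketch in a few lines (illustrative only; the app’s real logic lives elsewhere): a notification fires when a saved query’s result set gains new items.</p>
<pre><code class="lang-python">def should_notify(previous_ids, current_ids):
    """Return the items that newly appeared in a saved query's results."""
    new = set(current_ids) - set(previous_ids)
    return sorted(new)  # notify only when this is non-empty
</code></pre>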
<p>None of this is technically complex, but together they make the app feel like <em>my</em> GroundCTRL cockpit, not just a generic dashboard.</p>
<hr />
<h2 id="heading-takeaways">Takeaways</h2>
<p>From this round, the main lessons for me:</p>
<ul>
<li><p><strong>Using Claude Code to build an app is great, but the app shouldn’t depend on Claude Code to be useful.</strong></p>
</li>
<li><p><a target="_blank" href="https://github.com/steveyegge/beads"><strong>beads</strong></a> <strong>work well for multi‑session, dependency‑heavy work</strong> – good enough that I trust them as the planning backbone.</p>
</li>
<li><p><strong>APIs first, AI second</strong> – if Issue Tracker/Git Tracker/CI already give you the data you need, call them directly and save <strong>AI for summaries, PR reviews, and decision support</strong>.</p>
</li>
<li><p><strong>Headless AI PR reviews with a human in the loop feel like a real multiplier</strong> – let the model do the first pass, you decide what actually gets submitted.</p>
</li>
<li><p><strong>Design tools with AI are not magic</strong> – they can suggest directions, but for things like app icons you still need to drive.</p>
</li>
</ul>
<p>The bottleneck isn’t typing anymore. It’s orchestration - picking the right mix of APIs, local logic, and AI so that your tools match how you really work.</p>
<p><strong>And that’s what this little cockpit is for.</strong></p>
]]></content:encoded></item><item><title><![CDATA[Breaking Free from Busy Work: Applying the 80/20 Rule in Engineering]]></title><description><![CDATA[Busy Work vs Real Impact in Engineering
Most weeks, the work that drains the most energy is not the hard stuff. It’s the busy stuff.

The tickets that feel satisfying to close.

The refactors that make the code just a little cleaner.

The tiny UI twe...]]></description><link>https://groundctrl.dev/breaking-free-from-busy-work-applying-the-8020-rule-in-engineering</link><guid isPermaLink="true">https://groundctrl.dev/breaking-free-from-busy-work-applying-the-8020-rule-in-engineering</guid><category><![CDATA[Software Engineering]]></category><category><![CDATA[Pareto Principle]]></category><category><![CDATA[workflow]]></category><dc:creator><![CDATA[Deyan Aleksandrov]]></dc:creator><pubDate>Sat, 27 Dec 2025 18:15:56 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/stock/unsplash/vsLbaIdhwaU/upload/7e1e86004aa40a5e8f82ab40f665e55b.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<h3 id="heading-busy-work-vs-real-impact-in-engineering">Busy Work vs Real Impact in Engineering</h3>
<p>Most weeks, the work that drains the most energy is not the hard stuff. It’s the busy stuff.</p>
<ul>
<li><p>The tickets that feel satisfying to close.</p>
</li>
<li><p>The refactors that make the code just a little cleaner.</p>
</li>
<li><p>The tiny UI tweaks that only a handful of people will ever notice.</p>
</li>
</ul>
<p>All of that feels like progress.<br />The problem is that a lot of it barely moves the product or the team forward.</p>
<p>The Pareto principle (the 80/20 rule) says that a small share of effort typically produces a large share of results: <strong>roughly 20% of your work drives 80% of your outcomes</strong>. If that’s true, it also means something uncomfortable: a <strong>big chunk of your time is probably going into things that look like work, but don’t really change much</strong>.</p>
<h3 id="heading-how-busy-work-shows-up-for-engineers">How busy work shows up for engineers</h3>
<p>In engineering, busy work often arrives disguised as “real work”:</p>
<ul>
<li><p>Tweaking spacing, colors, or animations long before users confirm they even want the feature.</p>
</li>
<li><p>Refactoring code that’s annoying but not blocking any roadmap item or customer.</p>
</li>
<li><p>Building internal tools that are fun to write but remove small inconveniences instead of major pain.</p>
</li>
</ul>
<p>On a board, these look legitimate. They’re real tasks, with estimates and assignees.<br />The cost isn’t that they’re totally useless. The cost is that they push more impactful work to “later”.</p>
<h3 id="heading-busy-work-as-a-starter-task">Busy work as a starter task</h3>
<p>Busy work isn’t always the enemy. Sometimes it’s a useful on‑ramp. There are days when starting with a small, <strong>easy win</strong> is exactly what’s needed:</p>
<ul>
<li><p>fix a tiny bug,</p>
</li>
<li><p>clean up a file,</p>
</li>
<li><p>rename something that’s been bothering you.</p>
</li>
</ul>
<p>You get a quick success, your brain switches into “<strong>doing mode</strong>”, and suddenly the bigger, scarier task feels less heavy.</p>
<p><strong>The problem isn’t</strong> doing a bit of busy work.<br /><strong>The problem is</strong> staying there—spending most of the week in low‑impact tasks and never coming back to the 20% of work that actually drives outcomes.</p>
<h3 id="heading-using-8020-to-avoid-busy-work">Using 80/20 to avoid busy work</h3>
<p>The 80/20 rule is useful not just as a description, but as a filter.</p>
<p>If roughly 20% of your effort creates 80% of the impact, then your job is to find and protect that 20% as aggressively as possible. In practice, that means asking:</p>
<ul>
<li><p>Which few features, decisions, or fixes would actually change a user’s week?</p>
</li>
<li><p>Which conversations or decisions would unblock the most work for the team?</p>
</li>
</ul>
<p>A simple mental graph you can imagine:</p>
<ul>
<li><p>On the X‑axis: time spent.</p>
</li>
<li><p>On the Y‑axis: impact.</p>
</li>
<li><p>The first fifth of the graph shoots up quickly (20% of time → 80% of impact).</p>
</li>
<li><p>The remaining four‑fifths flatten out into a long tail—lots of effort, small gains.</p>
</li>
</ul>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1766858198910/aa519479-803c-4859-9d41-787e38eb84cf.png" alt="80/20 Mental Graph" class="image--center mx-auto" /></p>
<p>The goal is to spend more of your week in that steep early part of the curve, and less in the long, flat tail where busy work lives.</p>
<h3 id="heading-the-8020-version-of-tech-debt">The 80/20 version of tech debt</h3>
<p>Technical debt is the same story with different branding.</p>
<p>Taking on debt on purpose can be the right call. You ship a rough version, learn from real users, and then decide what’s worth cleaning up. But there’s also a version of tech debt where you spend weeks polishing things that don’t justify the investment.</p>
<p>Viewed through the 80/20 lens:</p>
<ul>
<li><p>The first 20% of effort gives you 80% of the value: a working feature in production, feedback from users, a clearer sense of what matters.</p>
</li>
<li><p>The last 80% of effort goes into chasing perfect abstractions, solving edge cases no one has hit yet, and rewriting code that already works “well enough”.</p>
</li>
</ul>
<p>Here, 80/20 helps with decisions:</p>
<ul>
<li><p>If a piece of debt is blocking that high‑impact 20% of work, pay it down early.</p>
</li>
<li><p>If it only affects the long tail of polish, log it, time‑box it, and tackle it later—if it’s still worth it.</p>
</li>
</ul>
<p>Sometimes that last 20% is truly important (compliance, safety, scalability).<br />Often it’s just comfortable busy work wearing a “quality” badge.</p>
<h3 id="heading-why-busy-work-is-so-attractive">Why busy work is so attractive</h3>
<p>There’s a reason this pattern is hard to break - busy work is emotionally easier:</p>
<ul>
<li><p>It’s clear - you know exactly what to do and how to finish it.</p>
</li>
<li><p>It’s controllable - no stakeholder disagreement, no product ambiguity.</p>
</li>
<li><p>It’s rewarding - you get quick dopamine hits from closing tickets and merging PRs.</p>
</li>
</ul>
<p>High‑impact work is messier. You need to align people, make trade‑offs, and say “no” to things. You need to pick a direction without all the data. It feels riskier, so your brain quietly drags you back to the safe zone - another refactor, another small UI tweak, another “just in case” improvement.</p>
<h3 id="heading-a-simple-heuristic-impact-reversibility">A simple heuristic: impact × reversibility</h3>
<p>A practical way to avoid getting stuck in busy work is to quickly score tasks on two axes:</p>
<ul>
<li><p><strong>Impact</strong> – If this goes well, who notices? Users, teams, the business, or just me?</p>
</li>
<li><p><strong>Reversibility</strong> – How hard is it to change or undo later if we get it wrong?</p>
</li>
</ul>
<p>Then roughly prioritise:</p>
<ul>
<li><p>High impact, low reversibility → design carefully, involve others, but still aim to ship.</p>
</li>
<li><p>High impact, high reversibility → ship fast, learn, and adjust as you go.</p>
</li>
<li><p>Low impact, anything → busy‑work candidates; handle them later, time‑box them, or drop them entirely.</p>
</li>
</ul>
<p>This doesn’t need to be a formal matrix. Even asking these questions in your head already filters out a lot of <strong>“because it annoys me”</strong> tasks.</p>
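<p>The heuristic is small enough to write down; this toy version (labels invented purely for illustration) just encodes the three buckets above:</p>
<pre><code class="lang-python"># Toy version of the impact x reversibility filter described above.
def triage(impact, reversible):
    """impact: 'high' or 'low'; reversible: True if easy to undo later."""
    if impact == "high" and not reversible:
        return "design carefully, involve others, still aim to ship"
    if impact == "high" and reversible:
        return "ship fast, learn, adjust"
    return "busy-work candidate: time-box, defer, or drop"
</code></pre>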
<h3 id="heading-how-this-shows-up-in-my-own-work">How this shows up in my own work</h3>
<p>In practice, this is what it looks like for me when working on the GroundCTRL app (and beyond):</p>
<p>With a new feature, I often spend too much time on the final 10–20% of tasks, like deciding on the exact appearance of a button or the arrangement of multiple buttons. The core 80% of the work—the part that actually changes something for the user—is already done, but I get stuck perfecting minor details.</p>
<p>For example, I spent about four hours creating a solid first version of the app with the help of AI. On other days, I’ve lost almost the same amount of time debating button aesthetics and layout, which doesn’t really move the product forward.</p>
<p>As a manager, I can also sink hours into “perfect” documentation, trying to cover every possible scenario. A more impactful move is often a short recording or a simple checklist that unblocks the team quickly.</p>
<p>The pattern is the same - I drift into high‑effort, low‑impact work because it feels safer than making the next significant decision.</p>
<h3 id="heading-tactics-to-stay-out-of-busywork-mode">Tactics to stay out of busy‑work mode</h3>
<p>A few things that help push back against this:</p>
<ul>
<li><p><strong>Start the day/week with a clear plan for 2–3 outcomes</strong>, not a giant task list. “<strong>Ship X</strong>”, “<strong>Unblock Y</strong>”, “<strong>Decide Z</strong>” beats 20 micro‑tasks.</p>
</li>
<li><p><strong>Use busy work intentionally</strong> - one small task to warm up, then switch to a high‑impact item as soon as you have momentum.</p>
</li>
<li><p><strong>Time‑box polish</strong> - only a small percentage of the feature’s total time is allowed for refactors and tweaks; after that, it ships as‑is.</p>
</li>
<li><p><strong>Track intentional tech debt</strong> in one place and review it regularly, instead of trying to fix everything in the moment.</p>
</li>
<li><p><strong>Ask once a day</strong>: “If I stopped working now, what changed for someone outside the team?” If the answer is “not much,” you’re probably in busy‑work territory.</p>
</li>
</ul>
<p>The goal isn’t to ban busy work. It has its place as a warm‑up and as a finishing layer.<br />The goal is to keep most of your time in the 20% of work that actually bends the curve—and to use 80/20 as a simple lens for both avoiding busy work and deciding which tech debt really deserves your attention.</p>
<hr />
<p><strong>Further reading</strong></p>
<ul>
<li><p>Pareto principle (80/20 rule) – Wikipedia <a target="_blank" href="https://en.wikipedia.org/wiki/Pareto_principle">https://en.wikipedia.org/wiki/Pareto_principle</a></p>
</li>
<li><p>Learn the Pareto Principle (Asana) <a target="_blank" href="https://asana.com/resources/pareto-principle-80-20-rule">https://asana.com/resources/pareto-principle-80-20-rule</a></p>
</li>
<li><p>The Pareto Principle: Reduce Your Workload with the 80/20 Rule <a target="_blank" href="https://openup.com/blog/pareto-principle/">https://openup.com/blog/pareto-principle/</a></p>
</li>
<li><p>When is the Right Time to Pay Down Tech Debt? <a target="_blank" href="https://madeintandem.com/blog/right-time-pay-tech-debt/">https://madeintandem.com/blog/right-time-pay-tech-debt/</a></p>
</li>
<li><p>Technical Debt: The Hidden Cost Of Shipping Fast And Thinking Later <a target="_blank" href="https://dev.to/alexindevs/technical-debt-the-hidden-cost-of-shipping-fast-and-thinking-later-587d">https://dev.to/alexindevs/technical-debt-the-hidden-cost-of-shipping-fast-and-thinking-later-587d</a></p>
</li>
</ul>
]]></content:encoded></item></channel></rss>