**Marcus Webb** — Infrastructure engineer turned tech writer. Writes about AI, DevOps, and security.
> **Bottom line:** Developers are using tools like Cursor and Claude 4.6 to generate code faster than ever, but velocity is blinding us to architectural decay.
Over a 30-day period, I stopped accepting instant code blocks and instead used AI exclusively as a rubber duck for system design and architecture validation.
By intentionally slowing down the prompting process, my team cut its critical bug rate by over 40% and caught three critical concurrency bugs before writing a single line of implementation code.
If you're using AI just to type faster, you're missing its real superpower: helping you think deeper.
I completely stopped using AI to write code for a month. Yes, I'm serious.
After watching my team merge 15,000 lines of AI-generated Python in two weeks, I realized the "10x developer" dream is a trap that's filling our repositories with brittle, unmaintainable logic, and the silent regressions that follow are expensive.
The problem isn't that the models are bad. The problem is that **we are optimizing for the wrong metric: output speed**.
We've traded deep systemic understanding for the dopamine hit of watching an entire authentication service auto-populate in seconds.
I realized I was cheating myself out of actual engineering. I was becoming an editor of mediocre AI boilerplate rather than an architect of resilient systems.
So I ran a radical experiment: I banned myself from using AI to generate implementation code, forcing a total reset of my daily workflow.
It started in early May 2026, when we were building a new distributed caching layer for our primary application.
Naturally, I fired up Cursor, plugged in Claude 4.6, and started auto-completing my way to glory.
The model was incredibly eager to please, and within forty-five minutes, I had a working prototype that passed all the standard unit tests.
I felt like an absolute genius. I pushed the branch, grabbed a coffee, and moved on to the next Jira ticket.
But two days later, during a routine load test, the entire caching service buckled under pressure.
It wasn't a simple syntax error or an off-by-one bug. **The architecture itself was fundamentally flawed**, relying on a locking mechanism that created massive bottlenecks across multiple nodes.
Claude 4.6 hadn't written bad code; it had written perfectly functional code for a terrible system design.
Because I had asked it to code fast, it gave me the most statistically probable implementation of a cache, completely ignoring the specific, nuanced constraints of our data pipeline.
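The post does not show the actual cache design, so the following is only a single-process analogy, not the real system. It sketches how two caches can pass the same unit tests while one serializes every request behind a single lock. The class names and the `stripes` parameter are hypothetical.

```python
import threading


class GlobalLockCache:
    """Every read and write takes the same lock, so all keys contend."""

    def __init__(self):
        self._lock = threading.Lock()
        self._data = {}

    def get(self, key, default=None):
        with self._lock:
            return self._data.get(key, default)

    def set(self, key, value):
        with self._lock:
            self._data[key] = value


class StripedLockCache:
    """Keys are spread over several locks, so unrelated keys rarely contend."""

    def __init__(self, stripes=16):
        if stripes < 1:
            raise ValueError("stripes must be at least 1")
        self._locks = [threading.Lock() for _ in range(stripes)]
        self._shards = [{} for _ in range(stripes)]

    def _index(self, key):
        # Python's % returns a non-negative result for a positive modulus.
        return hash(key) % len(self._locks)

    def get(self, key, default=None):
        i = self._index(key)
        with self._locks[i]:
            return self._shards[i].get(key, default)

    def set(self, key, value):
        i = self._index(key)
        with self._locks[i]:
            self._shards[i][key] = value
```

Both classes behave identically in a functional test. The difference only shows up under concurrent load, which is why a unit-test pass says little about the design.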
That failure was a wake-up call for my entire team.
I realized that my reliance on instant AI gratification had bypassed the most crucial phase of software engineering: the slow, painful process of thinking about the problem.
To fix this, I instituted a strict personal rule. For the next 30 days, I would use AI exclusively for architectural sparring and conceptual validation.
No code generation, no boilerplate templates, and absolutely no instant regex fixes.
If I wanted to use ChatGPT 5 or Gemini 2.5, I had to write a prompt that explicitly forbade the output of code. This was intensely frustrating at first.
My brain had become completely accustomed to the rapid loop of prompt-generate-copy-paste.
When I faced a new feature request, my first instinct was still to just ask the machine to build it for me. Reframing my relationship with the AI required a complete psychological shift.
Instead of typing a quick request for a rate limiter, I started writing 500-word contextual prompts detailing our traffic patterns, latency requirements, and existing infrastructure.
Then, I would end the prompt with a hard constraint: **"Do not write any implementation code. Analyze this architecture and tell me where it will fail under 10x load."**
The results were entirely unexpected. By forcing the AI to slow down and act as a critical reviewer, the quality of its insights skyrocketed.
It stopped acting like a junior developer rushing to finish a ticket and started acting like a cynical staff engineer poking holes in my assumptions.
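A minimal sketch of how such a prompt could be assembled in Python. The closing constraint is quoted from above and the first three fields mirror the context listed there. The class name, the `proposed_design` field, and the section layout are assumptions made for illustration.

```python
from dataclasses import dataclass

# Quoted from the article; everything else in this block is hypothetical.
HARD_CONSTRAINT = (
    "Do not write any implementation code. "
    "Analyze this architecture and tell me where it will fail under 10x load."
)


@dataclass
class SlowPrompt:
    traffic_patterns: str
    latency_requirements: str
    existing_infrastructure: str
    proposed_design: str  # assumed field: the design under review

    def render(self) -> str:
        """Return the prompt text, always ending with the hard constraint."""
        sections = [
            ("Traffic patterns", self.traffic_patterns),
            ("Latency requirements", self.latency_requirements),
            ("Existing infrastructure", self.existing_infrastructure),
            ("Proposed design", self.proposed_design),
        ]
        body = "\n\n".join(
            f"## {title}\n{text.strip()}" for title, text in sections
        )
        return f"{body}\n\n{HARD_CONSTRAINT}"
```

Putting the constraint in a constant and appending it in `render` means it cannot be forgotten on a rushed day, which is when the old prompt-generate-copy-paste habit tends to return.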
During this experiment, I developed a distinct workflow that fundamentally changed how I interact with large language models. I call it the "Slow Prompting" framework.
It revolves around three core phases designed to maximize architectural rigor.
Phase one is context gathering. Before I even open a chat window, I gather the environmental context.
I don't just paste a single file; I describe the business logic, the historical reasons why the system is built this way, and the hard constraints we cannot violate.
When using Gemini 2.5's massive context window, I feed it the architectural decision records (ADRs) and the database schemas, painting a complete picture of the ecosystem.
For instance, when refactoring our payment gateway in late 2025, I didn't just ask Gemini to optimize the retry logic.
I uploaded our last three post-mortem reports regarding payment timeouts, alongside the third-party API rate limits.
By front-loading this historical pain, the model immediately identified a race condition in our webhook handlers that I hadn't even considered.
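As a rough sketch, the context-gathering step could be scripted so that ADRs, schemas, and post-mortems always travel together. The function name, the header format, and the `max_chars` limit are assumptions; the real limit depends on the model in use.

```python
from pathlib import Path


def build_context_bundle(paths: list[Path], max_chars: int = 200_000) -> str:
    """Concatenate documents into one labelled context block.

    Each file is wrapped in a header naming its source so the model can
    refer back to it. Raises ValueError if the bundle exceeds max_chars.
    """
    parts = []
    for path in paths:
        text = path.read_text(encoding="utf-8").strip()
        parts.append(f"=== {path.name} ===\n{text}")
    bundle = "\n\n".join(parts)
    if len(bundle) > max_chars:
        raise ValueError(
            f"context bundle is {len(bundle)} chars, limit is {max_chars}"
        )
    return bundle
```

Failing loudly on an oversized bundle is a deliberate choice here: silently truncating a post-mortem would remove exactly the historical pain the model is supposed to see.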
Phase two is adversarial review. Instead of asking the model for a solution, I present my proposed design and ask it to destroy it.
I prompt the AI to assume the network partitions randomly and the database is under heavy read load, then ask for the three most likely failure modes.
**Forcing the model into an adversarial role prevents it from simply agreeing with your flawed logic.**
It is uncomfortable to watch an AI dismantle your carefully planned system design.
However, discovering these architectural vulnerabilities in a chat window is vastly preferable to discovering them during a production outage.
You are effectively paying the cognitive cost of engineering up front.
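One way to make the adversarial framing repeatable is a small prompt builder. The two default conditions come from the description above. The function name, the parameter names, and the exact wording of the instructions are assumptions.

```python
# The two conditions named in the article; callers can pass their own list.
DEFAULT_ASSUMPTIONS = [
    "The network partitions randomly.",
    "The database is under heavy read load.",
]


def adversarial_review_prompt(design, assumptions=None, failure_modes=3):
    """Build a prompt that asks the model to attack a design, not endorse it."""
    if failure_modes < 1:
        raise ValueError("failure_modes must be at least 1")
    conditions = DEFAULT_ASSUMPTIONS if assumptions is None else assumptions
    bullet_list = "\n".join(f"- {condition}" for condition in conditions)
    return (
        "You are reviewing a proposed system design. "
        "Do not agree with it by default.\n\n"
        f"Proposed design:\n{design.strip()}\n\n"
        f"Assume the following conditions hold:\n{bullet_list}\n\n"
        f"List the {failure_modes} most likely failure modes."
    )
```

Stating the hostile conditions explicitly gives the model something concrete to reason against, rather than leaving "be critical" open to interpretation.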
Phase three is the hardest part. I explicitly instruct the model to summarize its architectural recommendations in plain English and forbid any code snippets.
This forces me to translate the high-level concepts into actual implementation logic myself.
It ensures that I intimately understand the code that is being committed to our repository, because I am the one writing it. The machine provides the map, but I still have to drive the car.
This separation of concerns is critical for maintaining long-term software health.
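Models do not always honor a no-code instruction, so a small guard can flag replies that slip. This is a heuristic sketch with a hypothetical function name. It detects Markdown-style fenced blocks only and will miss code pasted as plain text.

```python
import re

# A fence is a line that starts (after optional indentation) with three or
# more backticks or tildes, as in Markdown.
_FENCE = re.compile(r"^[ \t]*(`{3,}|~{3,})", re.MULTILINE)


def contains_code_block(reply: str) -> bool:
    """Return True if the reply appears to contain a fenced code block."""
    return _FENCE.search(reply) is not None
```

Inline identifiers wrapped in single backticks are deliberately allowed, since naming a function or a command in prose is not the same as handing over an implementation.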
I won't pretend this approach is flawless. There are moments when slow prompting feels like a massive waste of time.
If you just need a bash script to parse a CSV file, spending twenty minutes debating the architectural merits of awk versus Python is absurd.
There is also the very real problem of AI sycophancy.
Even advanced models like Claude 4.6 desperately want to please you and will often validate a terrible idea unless explicitly commanded to be critical.
You have to keep restating the critical-reviewer instruction throughout the conversation, or the model slips back into passive agreement.
Furthermore, this method requires significant upfront effort.
**You are trading immediate perceived velocity for long-term systemic stability.** In a sprint-driven culture that rewards quick PRs, taking three days to finalize a design document with an AI feels like falling behind. You have to be willing to defend that time investment to your product managers.
It also forces you to confront your own knowledge gaps.
When the AI hands you a beautifully formatted design document explaining why your microservice boundaries are wrong, you actually have to read and comprehend it.
**You can't just copy-paste architecture.**
After thirty days of strict adherence to this framework, the metrics spoke for themselves.
Our team's deployment frequency remained relatively stable, but our critical bug rate dropped by over forty percent.
We were catching concurrency issues and memory leaks during the design phase, long before a single line of implementation code was ever drafted.
More importantly, the engineering culture shifted. Junior developers stopped treating the codebase like a black box maintained by magic AI autocomplete.
They started asking deeper questions about system boundaries, latency trade-offs, and state management, because they were no longer insulated from the complexity of the underlying architecture.
If you want to escape the trap of brittle, fast code, you don't need to ban AI entirely. You just need to change the constraints of your interaction.
Start by applying the 80/20 rule: spend 80% of your AI interaction time on system design and validation, and only 20% on actual code generation.
When you do use AI for code, **never let it write more than a single function at a time**. Force the model to explain the trade-offs of the data structures it chose before you accept the output.
Make it a habit to paste your completed code back into ChatGPT 5 and ask for the security implications of your specific implementation.
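The single-function rule above can be checked mechanically for Python output. A minimal sketch using the standard-library `ast` module follows. The function names are hypothetical, and nested functions and methods are counted too, which makes the check strict on purpose.

```python
import ast


def count_function_defs(source: str) -> int:
    """Count every def and async def in the source, including nested ones."""
    tree = ast.parse(source)
    return sum(
        isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef))
        for node in ast.walk(tree)
    )


def is_single_function(source: str) -> bool:
    """Return True if the source defines exactly one function."""
    return count_function_defs(source) == 1
```

`ast.parse` raises `SyntaxError` on invalid Python, so a snippet that does not even parse is rejected before it reaches review.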
The goal isn't to stop using these incredible tools, but rather to use them to amplify your engineering judgment instead of replacing it.
By slowing down the prompt, you force the AI to elevate its reasoning, and in the process, you elevate your own.
We have the most powerful analytical engines in human history at our fingertips; using them just to type faster is a tragic waste of potential.
Have you noticed your codebase becoming more brittle and harder to debug since the AI boom started, or is it just me? Let's talk in the comments.