Moving Faster with AI
AI enables developers and builders to move faster than before. It’s a great tool, but it’s also a sharp tool. Just like a knife, it can carve a beautiful sculpture or slice your project to ribbons.
Going faster means giving the models the control they need to work freely and being able to trust what they generate. Fortunately, neither of these is a completely new problem: we can look back at some traditional software engineering concepts to guide us in this new AI-driven era.
Yielding Control
You open Claude Code and give it a task. Before you know it, a permissions prompt pops up. Then another. They happen often enough that you can’t switch away, but not often enough to keep your brain engaged. You start to feel like your life is about pressing enter, over and over, to stop the models from driving off a cliff.
You want to give them more control, but you also remember when they nuked someone’s inbox, dropped a production database, and wiped out a whole production environment. Models can destroy anything they have access to.
Backups
Reliable backups give you the freedom to let models work while limiting their blast radius. Activate Time Machine on your Macs, use Veeam for Windows, and if you’re using Linux you probably don’t need my suggestions.
Back up your databases. MySQL and Postgres both have excellent options for saving and restoring your database. Ideally models wouldn’t have production access, but we’ve seen frontier models search for keys and other access workarounds when they think it’s necessary to accomplish their tasks.
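As a rough illustration for Postgres, here is a minimal sketch of a timestamped backup built around pg_dump. The function names, the backup directory, and the file naming scheme are my own assumptions, not a recommended layout. It also assumes pg_dump is installed and on your PATH.

```python
import subprocess
from datetime import datetime, timezone
from pathlib import Path


def build_pg_dump_command(database_url: str, backup_dir: Path, now: datetime) -> list[str]:
    """Build a pg_dump invocation that writes a timestamped, custom-format dump."""
    stamp = now.strftime("%Y%m%dT%H%M%SZ")
    target = backup_dir / f"backup-{stamp}.dump"  # hypothetical naming scheme
    return [
        "pg_dump",
        "--format=custom",  # archive format that pg_restore can read
        f"--file={target}",
        f"--dbname={database_url}",  # pg_dump accepts a connection string here
    ]


def run_backup(database_url: str, backup_dir: Path) -> None:
    """Create the backup directory if needed, then run pg_dump and raise on failure."""
    backup_dir.mkdir(parents=True, exist_ok=True)
    command = build_pg_dump_command(database_url, backup_dir, datetime.now(timezone.utc))
    subprocess.run(command, check=True)
```

Calling run_backup("postgresql://localhost/app", Path("backups")) on a schedule would leave a series of dumps you can restore with pg_restore. Keep the dumps somewhere the agent cannot reach, or they share the blast radius of everything else.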
Git also isn’t foolproof by default. We’ve seen projects reset and force-pushed. Git is for creation and collaboration; it is not a backup strategy.
Isolated Environments
Backups are great, but what if an agent could work without the access needed to destroy your whole system or leak your secrets? Container systems like Docker and OrbStack, or virtual machines, offer varying levels of isolation, keeping the project files your agents are working on away from the rest of your computer. If you prefer desktop apps, the Codex app can connect to other servers via a Codex-managed connection or plain SSH.
A virtual machine setup introduces some development friction and uses more system resources, but it’s going to be very difficult for an agent to accidentally destroy anything outside its virtual environment.
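For the container route, a minimal sketch might look like the following. It builds a docker run command that mounts only the project directory into the container. The helper name, the /workspace mount point, and the example image are assumptions for illustration. Your agent will likely need its own tooling installed in the image.

```python
from pathlib import Path


def build_sandbox_command(project_dir: Path, image: str, allow_network: bool = True) -> list[str]:
    """Build a docker run command that exposes only project_dir to the container."""
    project = project_dir.resolve()
    command = [
        "docker", "run",
        "--rm",                               # discard the container when it exits
        "--interactive", "--tty",
        "--volume", f"{project}:/workspace",  # the only host path the container sees
        "--workdir", "/workspace",
    ]
    if not allow_network:
        # Cuts off all networking, which also blocks an agent's API calls.
        command += ["--network", "none"]
    command += [image, "bash"]
    return command
```

You could pass the result to subprocess.run, for example with build_sandbox_command(Path("."), "python:3.12-slim"). The agent can still wreck the mounted project directory, so this complements backups and doesn't replace them.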
Establishing Trust
Now that we know agents can’t destroy our inbox and meme collection, we can turn our focus to trusting the things they do. Agents can write code, tell you they’ve completed their task, then apologize when you call them out for taking shortcuts. Trust, but verify.
Automated Code Review
AI can check its own work. The current preference is to check the code with a different model than the one that generated it. Use Codex and Claude to check each other’s work, or bring in a tool like CodeRabbit or Greptile to check every PR before merging. Greptile gives you a code quality score, a chart of which files were changed, and a scan of the commit for security, bug, and logic problems. It’ll suggest fixes for many classes of problems that it finds, or give you a nice list of problems to take back to the original agent.
Automated Code Review makes a great backstop for things humans can miss.
Tests
Test suites have long given developers assurance that their systems work the way they were designed to. A good set of integration tests can lead the agent toward a solid implementation.
Agents can write their own tests, and reviewing those tests can give you clues about how the agent is approaching the task it’s been assigned without having to check every implementation detail.
A Test-Driven Design workflow, where the tests are written before the implementation, fits well here. Adam Wathan has a good series on Test-Driven Design. It’s older and uses Laravel, but many of the concepts are applicable today.
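To make the idea concrete, here is a small sketch of the kind of test you might hand an agent before it writes any code. AccountService is a hypothetical in-memory stand-in for your application, included only so the example runs on its own. In a real project the tests would exercise your actual signup and login code.

```python
import unittest


class AccountService:
    """Hypothetical in-memory stand-in for the application under test."""

    def __init__(self) -> None:
        # A real application would store password hashes, never plain text.
        self._passwords: dict[str, str] = {}

    def sign_up(self, email: str, password: str) -> None:
        if email in self._passwords:
            raise ValueError("email already registered")
        self._passwords[email] = password

    def log_in(self, email: str, password: str) -> bool:
        return self._passwords.get(email) == password


class SignupFlowTest(unittest.TestCase):
    def setUp(self) -> None:
        self.service = AccountService()

    def test_new_user_can_log_in_after_signing_up(self) -> None:
        self.service.sign_up("ada@example.com", "s3cret")
        self.assertTrue(self.service.log_in("ada@example.com", "s3cret"))

    def test_wrong_password_is_rejected(self) -> None:
        self.service.sign_up("ada@example.com", "s3cret")
        self.assertFalse(self.service.log_in("ada@example.com", "wrong"))

    def test_duplicate_signup_is_rejected(self) -> None:
        self.service.sign_up("ada@example.com", "s3cret")
        with self.assertRaises(ValueError):
            self.service.sign_up("ada@example.com", "other")
```

Saved in a file, these run with python -m unittest. The test names read like a specification, which is the point: the agent gets a clear target, and you get a short list of behaviors to review instead of a whole implementation.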
Stage on Pull Requests
Sometimes it’s easier to manually verify an agent did what you told it to. The dev environment on your machine may not be the best place for that. Enter staging environments. Staging environments mirror production environments, but at a smaller scale. They can be automatically created by Continuous Integration/Continuous Delivery (CI/CD) for each Pull Request.
Instrumentation
Instrumentation uses recorded events to track actions occurring in your application. A social media application could track customer signups, user logins, posts created, and comments left. If one of the metrics the team is watching changes significantly after an update, instrumentation can raise alarms before the customers do. It helps close the loop by demonstrating that your application is actually working under real-world usage.
PostHog has a pretty generous free plan.
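The alerting idea can be sketched without any particular vendor. The example below is not PostHog’s API. EventTracker, significant_changes, and the 50% threshold are all hypothetical names and values chosen for illustration.

```python
from collections import Counter


class EventTracker:
    """Hypothetical in-memory event recorder."""

    def __init__(self) -> None:
        self.counts: Counter[str] = Counter()

    def track(self, event_name: str) -> None:
        self.counts[event_name] += 1


def significant_changes(
    baseline: dict[str, int],
    current: dict[str, int],
    threshold: float = 0.5,
) -> dict[str, float]:
    """Return events whose count moved by more than threshold relative to baseline."""
    alerts: dict[str, float] = {}
    for event, before in baseline.items():
        if before == 0:
            continue  # no baseline to compare against
        change = (current.get(event, 0) - before) / before
        if abs(change) > threshold:
            alerts[event] = change
    return alerts
```

Comparing the counts from the period before a deploy against the period after it gives you a crude alarm: if signups fall from 100 to 20, significant_changes reports a change of -0.8 for that event while leaving small fluctuations alone.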
Moving Faster
After all of this, the last tip for moving faster with AI is to “just try it”. If you come from a software engineering background you have an idea of how difficult tasks can be and where AI’s limitations are. This is a blessing and a curse. You may experiment less because you’ve judged the task too difficult for an agent to accomplish.
There’s never been a better time to try. The tokens are subsidized. If you’ve taken some of the reasonable precautions above, the worst the agent can do is fail, leaving you no worse off than before.