AI-generated image of developers fighting on a giant smartphone

Coding on a Phone: What I Learned Building Software on Mobile in the Last Three Weeks

2026-01-14

Over the past few weeks, largely out of a sense of FOMO, I’ve been forcing myself to do something I’ve fantasized about for a long time: building software primarily on my phone. Not just reviewing code or fixing typos, but real feature development, with roughly 70% of my coding happening on my custom setup.

Why?

There’s been growing chatter about AI agents being so good that developers are spending more time guiding agents than writing code themselves. People are talking about this shift, sharing their workflows, and exploring new ways of working. That got me thinking:

If I’m guiding agents more than writing code, do I really need to be tethered to an IDE?

I’ve been a long-time believer in cloud IDEs, and I’ve wanted to push on what’s possible there for years. Fun fact: I’ve used GitHub Codespaces as my daily driver for about four years now, only occasionally switching back to local VS Code when absolutely necessary. This entire website was built exclusively in Codespaces.

I also previously worked on Copilot Workspace and pushed hard for a mobile experience there. This experiment was partly a way to scratch that itch. More importantly, I wanted to see whether I could stay effective at prompting agents without losing control over the code.

Retaining technical depth matters to me, partly because I want to keep learning, and partly because of PTSD from agents wildly slinging AI slop. When that happens, the cleanup often costs more time than the velocity gains deliver, wiping out the benefits entirely. Scott Hanselman articulates this more elegantly than I can.

Is mobile-first development actually feasible?

More than feasible. About 70% of my work over the last two weeks happened entirely on my phone.

Small, well-scoped tasks worked beautifully as fire-and-forget: dispatch work to agents, come back later, review the trajectory, and provide feedback.

One goal of this exploration was to reach a point where I could use my setup to improve the setup itself—and once I crossed that threshold, a flywheel kicked in. I used my mobile workflow to refine my mobile workflow, shipping almost daily improvements to the process. I’ll write a separate, more technical post about the setup details for anyone interested.

That said, some desktop work was unavoidable: complex debugging, deeper architectural changes, and performance profiling still benefited from a full workstation. It’s also worth noting that I was working on greenfield web app features, which certainly contributed to the success here. I plan to keep pushing on this to see where the real edges are.

So what have I learned so far?

In no particular order:

  1. The activation energy is remarkably low.
    An idea strikes: I pull out my phone, describe the task, and let agents run. It started to feel game-like—almost addictive. More than once, I found myself being the quintessential antisocial uncle at gatherings, staring at my phone or quietly talking to it. On walks with my kids, I caught myself debugging instead of being present. And apparently, talking to your phone about code still turns heads in Colorado in a way it might not in California.

  2. Big screens still matter.
    I pushed directly to main more than once after reviewing changes on mobile. But even when I could handle something on my phone, I often used my desktop as a safety net. There’s something comforting about big screens. Maybe that makes me an elder millennial. A colleague joked that they’re intrigued by mobile development but doubt they’ll ever adapt to phone-first workflows—and honestly, I get it.

    The phone isn’t replacing the desktop. They’re coexisting. IDEs will always matter for desktop-native workflows. But the boundary between “desktop-only” and “mobile-viable” is blurring faster than I expected.

  3. Task slicing is critical.
    Mobile development works best for small, well-defined tasks. Trying to tackle large features end-to-end on a phone gets unwieldy fast. Breaking work into bite-sized chunks that can be dispatched to agents made the process manageable—and genuinely faster. I reimplemented features in 30 minutes that previously took half a day. But without careful scoping, that same speed compounds mistakes instead of progress.

  4. Human cognition—not compute—is the bottleneck.
    I could spin up multiple agents in parallel across a repo, but this only worked up to a point. For me, that limit was around five to seven active tasks. Beyond that, I felt drained and overwhelmed.

    The bottleneck isn’t the silicon or the neural nets—it’s the biological neurons doing context switching and higher-order thinking. Maybe I’m getting older. Or maybe I’m exercising mental muscles I haven’t used this way before. Either way, managing cognitive load is critical in agent-heavy workflows.

  5. Merge conflict hell is very real.
    With multiple agents happily working in parallel on the same repo, merge conflicts are inevitable. Agents can help resolve them, but without safeguards it’s hit-or-miss. I found myself mentally topologically sorting instructions before saying them out loud, just to avoid turning the repo into a busy fish market (a rough sketch of that ordering follows this list).

    At times, it felt less like collaboration and more like Neo fighting multiple Agent Smiths at once.

  6. Velocity without direction accelerates entropy.
    The most profound lesson wasn’t technical. A metaphor emerged: AI gives you velocity, but you must provide direction. You can move fast in a straight line or fast in circles. Unguided, AI accelerates entropy.

    This makes shepherding essential—through clear goals, evolving regression tests, and constantly fighting AI slop. Well-maintained agent prompts, repo-checked instructions, and strong guardrails are mandatory. Even then, vigilance is a full-time job. Code review and gatekeeping matter more than ever. One tired YOLO merge can snowball into serious technical debt surprisingly fast. We’re still responsible for the code. I hard agree with this stance from Mitchell Hashimoto.

  7. From code generators to feature owners.
    Our role is shifting—from writing code to owning specifications and outcomes. That means getting better at spec-driven development and communicating intent at higher levels of abstraction. It takes practice, and a lot of unlearning, not unlike newly promoted engineers adjusting to management roles.

    There’s an ironic twist here: we started with models validating our work; now we’re validating the models. If I stretch the Matrix metaphor just a bit—we’ve become the batteries and the evaluators.

  8. Model quality really matters.
    Different agents produced noticeably different results. Subtle differences in reliability and style were enough to change my primary workflow. And when multiple agents work in parallel, a new problem emerges: fragmented mental models. Merge conflicts stop being just technical issues and start becoming cognitive collisions.
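
To make that “mentally topologically sorting” remark from item 5 a bit more concrete, here’s a minimal sketch of the idea: order tasks by their dependencies and dispatch each independent wave to agents in parallel, merging before the next wave starts. It uses Python’s standard graphlib module; the task names and dependencies are invented purely for illustration and aren’t part of my actual setup.

```python
# Minimal sketch: order agent tasks so dependent work is dispatched only
# after its prerequisites have merged, which keeps parallel agents from
# colliding in the same files. Task names here are hypothetical.
from graphlib import TopologicalSorter

# Each task maps to the set of tasks whose changes it builds on.
tasks = {
    "add-user-model": set(),
    "add-user-api": {"add-user-model"},
    "add-profile-page": {"add-user-api"},
    "update-docs": {"add-user-api"},
}

sorter = TopologicalSorter(tasks)
sorter.prepare()

# Dispatch wave by wave: everything within a wave is independent and can
# run on parallel agents; the next wave waits until the previous one lands.
while sorter.is_active():
    wave = list(sorter.get_ready())
    print("dispatch in parallel:", wave)
    sorter.done(*wave)
```

In practice I do this ordering in my head rather than in a script, but the principle is the same: the fewer overlapping changes in flight at once, the fewer fish-market moments.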

What does this mean for the future?

Mobile development won’t replace traditional workflows outright, but it’s carving out legitimate territory. The biggest barriers aren’t technical—they’re social and interaction-driven.

We need better modalities: touch-first interactions, Apple Pencil support, speech-to-text that actually understands programming terms and accents, planning-first modes before execution, and richer mobile code review experiences with AI explanations and structured diffs.

The infrastructure is almost there. What’s missing is serious design thinking around mobile-native development workflows.

For now, I’m continuing the experiment—pushing the boundaries of what’s possible, one phone screen at a time. The future isn’t mobile-only or desktop-only. It’s contextual: using the right tool for the moment you’re in.

Share your thoughts!