Codex vs Claude Code for Builders: Why I Run Claude Code Every Day

I shipped a web design company's site through Claude Code last month. Five hours, start to finish. Premium look, integrated submission forms, the whole thing functional. With Codex I'd have spent days iterating on the design and it still wouldn't have looked right.

That's the post.

I've tested three coding agents over the last five months. Codex for one. Antigravity for three. Claude Code for the past one. Claude Code is now my default and I'm not going back.

The honest version of this comparison isn't "Codex vs Claude Code" with a clean two-tool answer. It's a three-way story. Codex got dismissed early because it was slow and produced design work that needed redoing. Antigravity nearly held me for good and is genuinely better than Claude Code at a few specific things. Claude Code took the default for reasons that come down to Max plan economics and how Claude's features (skills, connectors, integrations) stack on top of the agent. If you're picking once and don't want to read the rest: install Claude Code. Although I'll be honest, since I first published this I've bought Codex again and given it a permanent seat for two specific jobs. That update is at the bottom. The case for each tool is below.

Everything in that case comes down to the same four questions I put to each tool. How fast is the loop. How right does the design land on the first pass. How clean is the backend code it writes. And what does it actually cost to run all day. That's the lens the whole comparison turns on, so here's where the three sit on it at a glance before I get into the why.

ToolPick whenSkip when

Claude CodeReal product work, anything design-sensitive, builders already on Claude Max.You need to run six projects in parallel without thinking about usage.
AntigravityHeavy multi-project work, model-switching in one IDE, you don't already pay for Claude Max.You're already on Claude Max. The math doesn't work to pay for both.
CodexBackend code and cleanup, and running an always-on agent (like Hermes) off a flat subscription instead of the metered API. A strong agentic engine.Front-end and design you need right on the first pass. Claude Code still wins there.

Codex was slow and bad at design

Codex was the first one I tried and the first one I dropped. Two reasons.

The first is that it was slow. Not catastrophically slow, but slow enough that the agent loop felt heavier than it should. After a few weeks the friction added up and I'd find myself reaching for it less.

The second was the bigger problem. Codex was bad at design. Specifically: bad at branding things properly. Color schemes that didn't fit. Layouts that looked off. UI and UX that needed two or three rounds of "no, more like this" to get into the right ballpark. For anything you actually wanted to ship to a user, the front-end design from Codex needed manual cleanup that mostly defeated the speed argument for using an AI agent in the first place.

Some people will say Codex's strengths show on small isolated tasks (write this React component, generate these test fixtures) and that's probably fair. I didn't stay on it long enough to fully test the narrow-task case because the broader experience had already lost me. If your work is mostly small isolated greenfield tasks in a stack you don't care about visually, Codex might be fine. For work that ships to real users, it's not the call.

One caveat, and it turned out to matter. That was the Codex of early this year. I bought it again recently and retested it, and the design verdict has softened. I'll come back to that near the end.

Antigravity nearly won, and that surprised me

Antigravity is the unexpected hero of this comparison and the part most "Codex vs Claude Code" posts will skip entirely. I spent three months on it. I loved it. Here's why.

The UI was really well planned out. That sounds like a small thing until you're using a coding agent every day. The friction of the interface compounds. Antigravity's IDE was set up for the way builders actually work, not for the way OpenAI thought builders work.

The big win was multi-tasking. I ran seven projects in parallel without burning through credits. Seven. On most other tools I'd have hit a usage cap by project four. Antigravity treated parallel work as a normal mode rather than something to ration, and that changed how I thought about which projects to take on.

The other thing Antigravity did that nothing else does cleanly: it let me switch models inside the same IDE on a single subscription. Gemini 3.1 for one task, Claude Opus 4.6 for another, all on the Google sub. No second account, no juggling API keys, no jumping between apps. For a builder who wants to use the right model for the right task without paying for three plans, that's a real edge.

The planning stage was the third thing. Antigravity generated a real-time to-do list for any task and ticked off items as the work happened. I could glance at the list and see exactly what the agent was about to do, what it had finished, and what was still pending. Most agents make you trust the loop. Antigravity showed you the loop.

If you don't already pay for Claude Max, Antigravity is the tool I'd seriously evaluate before defaulting to anything else. It's that good at what it's good at.

Why I left Antigravity for Claude Code

Two reasons. One is economic. The other is what kept happening every time I opened Claude.

The economics are simple. I was already paying for Claude Max. Claude Code comes included with Max. Antigravity Ultra was a separate sub on top. After three months of trying to justify both, I couldn't. Cancelling Antigravity meant I lost the multi-project headroom and the model switching, but I also stopped paying twice for things that overlapped.

The second reason was harder to walk away from. Every time I opened Claude for something other than coding (research, writing, planning, anything outside the IDE) I'd notice a feature I wanted in my coding workflow too. Skills. Connectors. The integrations that come native with the Claude apps. Antigravity is a great IDE around an agent, but it's still just an IDE. Claude is becoming an operating layer where the coding agent is one component and everything else (skills, connectors, the rest of the Claude ecosystem) compounds with it. After three months I couldn't unsee that.

The exit wasn't bitter. Antigravity is still good. The math just didn't add up.

Antigravity is a great IDE around an agent. Claude is becoming an operating layer where the coding agent is one component and everything else compounds with it.

What Claude Code does that pulled me in

Four things, in the order I noticed them.

It's fast. The agent loop on Claude Code is genuinely quick. Compared to Codex this is night-and-day. Compared to Antigravity it's competitive. The speed shows up most in the iteration loop ("try that again with a tweak") which is where you spend most of your day.

It's great at design. This was the surprise after Codex. When I ask Claude Code for a specific design (color scheme, layout, UI behavior, anything front-end) it gets closest to what I actually want on the first pass. Not perfect. There are still rounds of correction. But the starting point is so much better that the total time-to-shipped is in a different category than what I had with Codex.

The Max plan usage limits are generous in a way that changes how you work. I don't ration my prompts. I don't hold back on a refactor because I'm worried about hitting a cap mid-week. The mental overhead of usage anxiety is gone, and that's worth more than it sounds.

The integrations are exceptional. Skills (Claude's task-specific instructions you write once and reuse) and connectors (Claude's plug-ins to your tools and data) are easy to set up and they make the coding agent more useful than coding-tool-only competitors. Most builders comparing Codex and Claude Code haven't actually tried Claude's skills and connectors. If you haven't, the comparison reads differently after you have.

The 5-hour website is what decided it

The receipt that closed the case was a project from last month. I built a full site for a web design company purely through Claude Code. Five hours from a blank repo to a deployed site that looked premium and professional. Integrated submission forms. Fully functional.

Five hours. With Codex it would have been days of "the colors are off, try again," "the layout doesn't work on mobile, try again," "this navigation looks wrong, redo it." Claude Code got the design close enough on the first pass that the iteration cycle was about content and copy, not about whether the thing looked right.

The work-shape is different. With Claude Code I'm shipping projects on a weekend that would otherwise have been quarterly initiatives or things I never started.

What's still rough about Claude Code's stack

Honest acknowledgment, because the post wouldn't be credible without it.

Claude Code itself still has issues. It's not perfect. Most of the issues are recoverable in a minute and the work it does correctly outweighs the friction, but the rough edges are real and you should know about them.

The bigger thing is Claude Design. Anthropic added it in 2026 as the design layer that sits on top of Claude Code, and it's the thing that gets the front-end output to roughly 95 percent of what I want on the first try. That coverage is unmatched. But it burns usage fast. I can empty a week's worth of Max plan allowance in a few hours of heavy use. That's a real cost when you're watching the meter, and Anthropic should price the design-heavy workflows differently or extend the usage envelope for them. As-is, you have to be deliberate about when to lean on it.

What Antigravity still does better

I want to be specific about this so the case for Claude Code doesn't read as all-Claude-all-the-time.

Multi-project parallelism without burning credits. Antigravity let me run seven projects at once without hitting a cap. Claude Code's Max plan limits are generous but they're not seven-projects-in-parallel generous if you're heavy on each.
Model switching in one IDE on one subscription. Antigravity's Gemini-and-Claude-in-the-same-tool model is genuinely useful and Claude Code can't match it without a routing layer that adds setup work.
The visible planning stage with the real-time to-do list. Claude Code has planning but the live, ticking-off-as-it-goes UI is something Antigravity handles better.
Multi-project IDE design. Antigravity is set up for builders running several things at once. Claude Code is set up for builders going deep on one repo at a time.

These are real. They're not enough to outweigh the rest of the case if you're on Claude Max already. They're enough to seriously consider Antigravity if you're not.

Why I bought Codex again, on purpose

Quick update, because the honest version of this post changed after I published it.

A couple of days ago I bought Codex again. The Max plan, sitting right next to my Claude Code Max plan. After everything above, that probably reads like a contradiction. It isn't. I didn't buy it to build with. I bought it because of a problem the comparison above never touched.

I've started running an autonomous agent called Hermes. If you haven't come across it, Hermes is an open-source agent from Nous Research that lives on a server, remembers what it learns, and runs jobs on its own while you get on with your day. The part that matters here is that it doesn't care which AI you put behind it. You point it at a model and let it work.

So I tried to point it at Claude.. and that's where the money problem starts.

To run Claude inside Hermes you're either paying the Anthropic API by the token, or buying extra credits on top of your Max plan. The Claude Code allowance I already pay for every month? Hermes can't touch it. ChatGPT and Codex work the other way around. They run straight off the subscription through a normal login, no meter ticking away in the background. And API pricing, when an agent is grinding all day, costs a lot more than a flat monthly sub. So the sensible way to give Hermes a brain was a Codex subscription, not Claude on the meter. I bought Codex to be the engine, and for agentic work inside Hermes it has been a proper workhorse.

Then, while I had it back in my hands, I did the obvious thing and retested the bit that made me drop it the first time. The design.

It's better. Noticeably better than the Codex I walked away from earlier this year. On one job it came back with a properly original branding direction, a visual identity that genuinely worked, the kind of distinct idea I didn't expect it to reach for. The catch is it still looked rough. The concept was right, the finish wasn't. Claude Code still gets both the idea and the polish closer on the first pass. So the old verdict softens, it doesn't flip. Codex has stopped being bad at design. It just sits a notch below Claude Code now.

The backend was the real surprise. In my opinion Codex now writes better backend code than Claude Code does. Cleaner, more complete, fewer gaps to chase down afterwards. I won't dress that up with a benchmark I didn't run, it's my read from using both side by side, but it's a consistent enough read that I've changed how I work around it.

So this is the stack I actually run now. Three tools, three jobs, no overlap.

ChatGPT 5.5 through Codex, on the OpenAI login, as the brain for Hermes. Anything agentic and always-on runs here, because it's the only one of the three I can run that way without the API meter eating me alive.
Claude Code for building. Websites, apps, and the design. This hasn't changed. It's still the fastest way I have to get a real front end right on the first pass, and the rest of the Claude ecosystem, the skills, the connectors, the projects, still compounds on top of it.
Codex for reviewing the backend Claude Code writes. I let it go back over the server-side code, clean it up, and fill in the bits that got missed. It's the second pass I used to do by hand.

Now, if you've read this far you'll spot the tension. The original post told you to pick one tool and go deep, and here I am running three. Fair. But this isn't tool-bouncing, the thing I keep warning against, where you swap your default every week chasing whoever shipped a feature. Each of these has a lane it's genuinely the best pick for in my work. The brain for the agent, the builder, the backend reviewer. Pick one default to build in, absolutely. But once you're running an agent and shipping real backends, a deliberate division of labour beats forcing one tool to do the job it's second-best at.

What to do this week

If you're using more than one of these casually, pick one. Stop splitting attention.

If you don't have a default yet and you're on Claude Max already, install Claude Code. Build one real thing with it this week. Set up one skill that automates something you do regularly. The compound returns of one tool deeply used beat the scattered returns of three shallowly tried. The three-tool split I described above doesn't break that. Each of those has one real, separate job.

If you're on Codex right now and shipping, the question worth asking is whether you've actually compared Claude Code's design output to Codex's on the same brief. Run that test once. If Claude Code's first pass is closer to what you want, the time you save in design iteration is worth more than the friction of switching tools.

If you're not on Claude Max and the math is the bottleneck, take Antigravity seriously. It's the strongest tool in this comparison if you're optimizing for parallel work, model flexibility, and a single sub. The case for Claude Code is conditional on the Max economics and the rest of the Claude ecosystem already being part of how you work.

Decisive picker

What's actually true for you right now?

01Already on Claude Max, real product work→ Claude Code
02Anything design-sensitive, want it right on pass one→ Claude Code
03Running an always-on agent and want it cheap→ Codex (off the subscription)
04Heavy backend work, want a second pass on the server code→ Codex
05Not on Claude Max, want generous parallel-project headroom→ Antigravity
06Want Gemini + Claude in one IDE on one sub→ Antigravity
07Most digital builders, building and shipping→ Claude Code

Codex vs Claude Code for Builders: Why I Run Claude Code Every Day

Codex was slow and bad at design

Antigravity nearly won, and that surprised me

Why I left Antigravity for Claude Code

What Claude Code does that pulled me in

The 5-hour website is what decided it

What's still rough about Claude Code's stack

What Antigravity still does better

Why I bought Codex again, on purpose

What to do this week

Related posts

AI SEO Tools: 7 Picks That Actually Work for Solo Operators

Bolt.new Alternatives: 5 Better Picks for Builders

How to Use Claude Projects: The Workflow That Walls Off Your Context

The Final
Two.

Get The Final Two. Plus a sharp essay every Sunday.

Codex was slow and bad at design

Antigravity nearly won, and that surprised me

Why I left Antigravity for Claude Code

What Claude Code does that pulled me in

The 5-hour website is what decided it

What's still rough about Claude Code's stack

What Antigravity still does better

Why I bought Codex again, on purpose

What to do this week

Related posts

AI SEO Tools: 7 Picks That Actually Work for Solo Operators

Bolt.new Alternatives: 5 Better Picks for Builders

How to Use Claude Projects: The Workflow That Walls Off Your Context

The FinalTwo.

Get The Final Two. Plus a sharp essay every Sunday.

The Final
Two.