The concrete job here is shipping a real web app, not just generating one: getting from prompt to working frontend, database, environment config, testing, deployment, and the first round of fixes. Claude Code and Replit genuinely diverge on that job: Claude Code is a terminal agent operating in your own local setup, while Replit is a cloud workspace that bundles coding, runtime, and hosting into the same product.
That makes this a useful stress test. Shipping exposes the failure modes that matter: where the code actually runs, who owns the environment, how painful iteration gets when the agent is wrong, and whether the final app leaves you with portable assets or a stack shaped around the platform that generated it.