Codex vs Claude

Public code comparisons

Same-prompt Codex and Claude Code outputs with generated code links. Credit: original public source links stay attached to every result.

Generated results

These are the public outputs created from the same prompts, shown as inspectable code results rather than only summaries.

Public Same-Prompt App Build

May 26, 2026

FGV same-prompt app build

A public GitHub comparison where the same competitive-intelligence app prompts produced separate Claude Code and Codex implementations.

Same prompt

Build a mobile-friendly black-and-white competitive-intelligence dashboard with frontend and backend after applying the shared research prompt and application prompt.

Source reuse

No explicit license found

This public repository is useful for branch-level inspection, screenshots, and outbound links. Mirror only short excerpts unless permission or a license is added.

Claude Code

Claude Code result

A controlled analysis workflow with a streaming API route, prompt builder, service layer, parser, and focused input/result components.

Raw generated code

Claude Code API route, Claude prompt builder, Claude sector form

Representative code

const analysis = await openaiService.analyzeSector(validation.value!, onProgress);
controller.enqueue(sendSSEEvent(encoder, "data", { payload: analysis }));
controller.enqueue(sendSSEEvent(encoder, "complete", {}));

Streaming route excerpt
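The excerpt above calls a `sendSSEEvent` helper whose body is not shown. The sketch below is one plausible shape for it, assuming it encodes a single Server-Sent Events frame per call. The function body is an assumption for illustration, not the generated code; only the call signature `(encoder, event, data)` comes from the excerpt.

```javascript
// Hypothetical stand-in for the sendSSEEvent helper named in the excerpt.
// Encodes one Server-Sent Events frame: an "event:" line, a "data:" line
// carrying JSON, and a blank line that terminates the frame.
function sendSSEEvent(encoder, event, data) {
  const frame = `event: ${event}\ndata: ${JSON.stringify(data)}\n\n`;
  return encoder.encode(frame);
}

const encoder = new TextEncoder();
const bytes = sendSSEEvent(encoder, "complete", {});
const decoded = new TextDecoder().decode(bytes);
console.log(JSON.stringify(decoded));
```

Returning bytes rather than a string fits the excerpt's `controller.enqueue(...)` calls, since a streamed response body carries encoded chunks.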

Codex

Codex result

A broader dashboard-style implementation with typed request validation, analysis generation, and a richer decision-support shell.

Raw generated code

Codex API route, Codex analysis service, Codex dashboard shell

Representative code

const parsed = sectorRequestSchema.safeParse(payload);
const analysis = await generateCompetitorAnalysis(parsed.data);
return NextResponse.json(analysis);

Route plus analysis handoff excerpt
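The excerpt above reads `parsed.data` right after `safeParse`, so the surrounding route presumably checks the result first. The `safeParse` name resembles the Zod API, but the source does not name the library, so the sketch below uses a hand-rolled stand-in with the same result shape. The `sector` field and the `handle` function are hypothetical.

```javascript
// Hand-rolled stand-in for a schema object with a safeParse method.
// It returns { success: true, data } or { success: false, error }
// instead of throwing, which is the shape the excerpt relies on.
const sectorRequestSchema = {
  safeParse(payload) {
    const sector =
      payload && typeof payload.sector === "string" ? payload.sector.trim() : "";
    if (!sector) {
      return { success: false, error: "sector must be a non-empty string" };
    }
    return { success: true, data: { sector } };
  },
};

// Hypothetical handler body: check success before touching parsed.data.
function handle(payload) {
  const parsed = sectorRequestSchema.safeParse(payload);
  if (!parsed.success) {
    return { status: 400, body: { error: parsed.error } };
  }
  return { status: 200, body: { sector: parsed.data.sector } };
}

const ok = handle({ sector: " fintech " });
const bad = handle({});
console.log(ok.status, bad.status);
```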

Generated file structure

The same prompt produced different project boundaries. This is the first code-level difference visitors should see before reading individual files.

Claude Code

app/api/analyze/route.ts
components/SectorInputForm.tsx
lib/hooks/useAnalysisStream.ts
lib/services/claudeService.ts
lib/services/promptBuilder.ts
lib/services/responseParser.ts

Codex

app/api/analyze/route.ts
components/dashboard-shell.tsx
components/ui/textarea.tsx
lib/analysis.ts
lib/schema.ts
tailwind.config.ts

Gemini CLI

client/src/App.tsx
client/src/components/AnalysisDisplay.tsx
client/src/components/ui/index.tsx
server/src/index.ts
server/src/services/analyzer.ts

Public Same-Prompt CLI Build

May 26, 2026

SWE-AF same-prompt todo CLI

A public benchmark folder with generated Node.js todo CLI implementations from Claude Code and Codex using the same prompt.

Same prompt

Build a Node.js CLI todo app with add, list, complete, and delete commands. Data should persist to a JSON file. Initialize git, write tests, and commit your work.

Source reuse

Apache-2.0

Licensed source can be mirrored in benchmark pages with attribution and links back to the original generated projects.

Codex

Codex result

A modular CLI and store split. The command runner can resolve the todo file through environment or working-directory context, which makes tests easier to isolate.

Raw generated code

Codex CLI, Codex todo store, Codex CLI tests

Representative code

function resolveTodoFile(cwd) {
  return process.env.TODO_FILE || path.join(cwd, "todos.json");
}

File resolution excerpt
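The precedence in this excerpt can be checked in isolation. The sketch below is a variant that takes `env` as a second parameter, an assumption made here so the example does not mutate `process.env`; the generated function reads `process.env` directly.

```javascript
const path = require("node:path");

// Variant of resolveTodoFile that accepts env as a parameter (hypothetical).
// TODO_FILE wins when set; otherwise the file lives in the working directory.
function resolveTodoFile(cwd, env = process.env) {
  return env.TODO_FILE || path.join(cwd, "todos.json");
}

const fromEnv = resolveTodoFile("/work", { TODO_FILE: "/tmp/custom.json" });
const fromCwd = resolveTodoFile("/work", {});
console.log(fromEnv);
console.log(fromCwd);
```

Because the check uses `||`, an empty `TODO_FILE` value also falls back to the working directory.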

Claude Code Sonnet

Claude Code Sonnet result

A compact implementation centered on one todo module, with persistence stored beside the generated script and a direct test file.

Raw generated code

Claude Sonnet todo implementation, Claude Sonnet tests

Representative code

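const TODO_FILE = path.join(__dirname, "todos.json");
function loadTodos() {
  // A missing or unreadable file yields an empty list, as in the raw file below.
  try {
    return JSON.parse(fs.readFileSync(TODO_FILE, "utf8"));
  } catch (error) {
    return [];
  }
}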

Single-module persistence excerpt

Claude Code Haiku

Claude Code Haiku result

A class-based store shape that wraps load/save behavior and separates CLI parsing from the persistence object.

Raw generated code

Claude Haiku CLI, Claude Haiku store, Claude Haiku store tests

Representative code

class TodoStore {
  constructor() {
    this.todos = this.loadTodos();
  }
}

Class store excerpt
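The excerpt shows only the constructor. A minimal sketch of how such a class might wrap load and save behavior follows. The `filePath` constructor argument, the `saveTodos` and `add` methods, and the simplified id scheme are all assumptions for illustration and are not taken from the generated file.

```javascript
const fs = require("node:fs");
const os = require("node:os");
const path = require("node:path");

// Sketch of a class-based store. Only the constructor's
// this.todos = this.loadTodos() line comes from the excerpt.
class TodoStore {
  constructor(filePath) {
    this.filePath = filePath; // hypothetical: the excerpt takes no arguments
    this.todos = this.loadTodos();
  }

  loadTodos() {
    if (!fs.existsSync(this.filePath)) return [];
    return JSON.parse(fs.readFileSync(this.filePath, "utf8"));
  }

  saveTodos() {
    fs.writeFileSync(this.filePath, JSON.stringify(this.todos, null, 2));
  }

  add(text) {
    // Simplified id scheme for the sketch; it would collide after deletions.
    const todo = { id: this.todos.length + 1, text, completed: false };
    this.todos.push(todo);
    this.saveTodos();
    return todo;
  }
}

// A second instance pointed at the same file sees the persisted todo.
const dir = fs.mkdtempSync(path.join(os.tmpdir(), "todo-store-"));
const file = path.join(dir, "todos.json");
new TodoStore(file).add("Buy milk");
const reloaded = new TodoStore(file);
console.log(reloaded.todos.length);
```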

Generated file structure

Codex

bin/todo.js
src/cli.js
src/todoStore.js
test/cli.test.js
package.json
codex-log.txt

Claude Code Sonnet

todo.js
todo.test.js
cli.js
README.md
package.json
claude-code-sonnet-log.txt

Claude Code Haiku

src/cli.js
src/store.js
test/store.test.js
package.json
claude-code-haiku-log.txt

Code differences

These panels mirror licensed generated code snippets and explain the coding choices made for the same prompt.

Persistence boundary

Codex isolates JSON persistence behind an explicit file path, while Claude Code Sonnet keeps persistence in a single todo module beside the generated script.

Codex

todoStore.js

examples/agent-comparison/codex/src/todoStore.js

const fs = require('node:fs');

function loadTodos(filePath) {
  if (!fs.existsSync(filePath)) {
    return [];
  }

  const raw = fs.readFileSync(filePath, 'utf8').trim();
  if (!raw) {
    return [];
  }

  let todos;
  try {
    todos = JSON.parse(raw);
  } catch (error) {
    throw new Error(`Failed to parse todo data from ${filePath}.`);
  }

  if (!Array.isArray(todos)) {
    throw new Error(`Todo data at ${filePath} is invalid.`);
  }

  return todos;
}

Claude Code Sonnet

todo.js

examples/agent-comparison/claude-code-sonnet/todo.js

const fs = require('fs');
const path = require('path');

const TODO_FILE = path.join(__dirname, 'todos.json');

function loadTodos() {
  try {
    const data = fs.readFileSync(TODO_FILE, 'utf8');
    return JSON.parse(data);
  } catch (error) {
    return [];
  }
}

function saveTodos(todos) {
  fs.writeFileSync(TODO_FILE, JSON.stringify(todos, null, 2));
}

Difference notes

Codex accepts the todo file path as a dependency, which makes tests and alternate working directories easier to isolate.
Claude Code Sonnet uses a fixed todos.json path relative to the module, which is simpler but couples runtime data to the generated file location.
The two outputs solve the same storage requirement with different boundaries: injectable store functions versus compact single-module state.

Command runner and file resolution

Codex makes command execution callable with injected IO and working directory context. The Sonnet excerpt shows domain functions that read and write the fixed todos.json directly, with no injected context.

Codex

cli.js

examples/agent-comparison/codex/src/cli.js

const path = require('node:path');
const { addTodo, listTodos, completeTodo, deleteTodo } = require('./todoStore');

function resolveTodoFile(cwd) {
  return process.env.TODO_FILE || path.join(cwd, 'todos.json');
}

function run(argv, io = { stdout: process.stdout, stderr: process.stderr }, cwd = process.cwd()) {
  const [command, ...args] = argv;
  const filePath = resolveTodoFile(cwd);

  if (!command) {
    printHelp(io.stderr);
    return 1;
  }

  if (command === 'add') {
    const text = args.join(' ').trim();
    if (!text) {
      io.stderr.write('Error: todo text is required.\n');
      printHelp(io.stderr);
      return 1;
    }

    const todo = addTodo(filePath, text);
    io.stdout.write(`Added todo ${todo.id}.\n`);
    return 0;
  }

  // Handling for the remaining commands and the printHelp helper is not shown in this excerpt.
}

Claude Code Sonnet

todo.js

examples/agent-comparison/claude-code-sonnet/todo.js

function addTodo(task) {
  const todos = loadTodos();
  const newTodo = {
    id: todos.length > 0 ? Math.max(...todos.map(t => t.id)) + 1 : 1,
    task,
    completed: false
  };
  todos.push(newTodo);
  saveTodos(todos);
  return newTodo;
}

function completeTodo(id) {
  const todos = loadTodos();
  const todo = todos.find(t => t.id === id);
  if (!todo) {
    return null;
  }
  todo.completed = true;
  saveTodos(todos);
  return todo;
}

Difference notes

Codex exposes a run(argv, io, cwd) function, which supports direct unit or subprocess-style tests.
Codex resolves TODO_FILE from the environment before falling back to the current working directory.
This makes the Codex output more verbose, but it also gives the reviewer a clearer testing seam.
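The testing seam described in these notes can be illustrated with an in-memory writer. In the sketch below only the `run(argv, io, cwd)` signature comes from the excerpt. The reduced body, the `memoryWriter` helper, and the messages are hypothetical.

```javascript
// Minimal stand-in for a run(argv, io, cwd) command runner.
// It returns an exit code and writes only through the injected io object.
function run(argv, io = { stdout: process.stdout, stderr: process.stderr }, cwd = process.cwd()) {
  const [command, ...args] = argv;
  if (command !== "add" || args.length === 0) {
    io.stderr.write("Usage: todo add <text>\n");
    return 1;
  }
  io.stdout.write(`Would add "${args.join(" ")}" in ${cwd}\n`);
  return 0;
}

// Hypothetical in-memory writer: anything with a write(chunk) method works.
function memoryWriter() {
  const chunks = [];
  return {
    write: (chunk) => chunks.push(String(chunk)),
    text: () => chunks.join(""),
  };
}

const io = { stdout: memoryWriter(), stderr: memoryWriter() };
const okCode = run(["add", "Buy", "milk"], io, "/tmp/project");
const errCode = run([], io, "/tmp/project");
console.log(okCode, errCode);
```

Because the runner returns an exit code instead of calling `process.exit`, a test can assert on the code and the captured output in the same process.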

Test strategy

Codex tests CLI behavior end to end by spawning bin/todo.js as a subprocess against temporary data files. Claude Code Sonnet calls the todo module's functions directly.

Codex

cli.test.js

examples/agent-comparison/codex/test/cli.test.js

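// The assert, path, and test imports below are assumed from usage; the excerpt omits them.
// The makeTempDir helper used further down is also not shown in this excerpt.
const assert = require('node:assert/strict');
const path = require('node:path');
const test = require('node:test');
const { spawnSync } = require('node:child_process');

const binPath = path.resolve(__dirname, '..', 'bin', 'todo.js');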

function runCli(args, cwd, todoFileName = 'todos.json') {
  return spawnSync(process.execPath, [binPath, ...args], {
    cwd,
    env: {
      ...process.env,
      TODO_FILE: path.join(cwd, todoFileName)
    },
    encoding: 'utf8'
  });
}

test('add and list todos', () => {
  const cwd = makeTempDir();

  const add = runCli(['add', 'Buy milk'], cwd);
  assert.equal(add.status, 0);
  assert.match(add.stdout, /Added todo 1\./);
});

Claude Code Sonnet

todo.test.js

examples/agent-comparison/claude-code-sonnet/todo.test.js

const fs = require('fs');
const { addTodo, listTodos, completeTodo, deleteTodo, loadTodos, saveTodos } = require('./todo');

// TEST_TODO_FILE is not defined in this excerpt.

describe('Todo App', () => {
  beforeEach(() => {
    if (fs.existsSync(TEST_TODO_FILE)) {
      fs.unlinkSync(TEST_TODO_FILE);
    }
  });

  describe('addTodo', () => {
    test('should add a new todo', () => {
      const todo = addTodo('Test task');
      expect(todo.id).toBe(1);
      expect(todo.task).toBe('Test task');
      expect(todo.completed).toBe(false);
    });
  });
});

Difference notes

Codex exercises command behavior closer to how a user runs the CLI.
Claude Code Sonnet exercises domain functions directly, which is smaller but less representative of the command entrypoint.
The difference is useful for visitors because it shows how generated code can vary in reviewer burden even when features match.
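The subprocess style can be sketched end to end without either generated project. The example below writes a tiny stand-in CLI to a temp directory and spawns it with `TODO_FILE` pointing at a scratch file. The `makeTempDir` body and the `mini-todo.js` script are hypothetical. Only the `spawnSync` call pattern mirrors the Codex test excerpt.

```javascript
const { spawnSync } = require("node:child_process");
const fs = require("node:fs");
const os = require("node:os");
const path = require("node:path");

// Hypothetical helper: a fresh directory per test keeps data files apart.
function makeTempDir() {
  return fs.mkdtempSync(path.join(os.tmpdir(), "todo-cli-"));
}

// Tiny stand-in CLI written to disk so the example is self-contained.
// It appends its first argument to the JSON file named by TODO_FILE.
const cwd = makeTempDir();
const script = path.join(cwd, "mini-todo.js");
fs.writeFileSync(script, `
const fs = require("node:fs");
const file = process.env.TODO_FILE;
const todos = fs.existsSync(file) ? JSON.parse(fs.readFileSync(file, "utf8")) : [];
todos.push({ id: todos.length + 1, text: process.argv[2] });
fs.writeFileSync(file, JSON.stringify(todos));
process.stdout.write("Added todo " + todos.length + ".\\n");
`);

// Subprocess-style check: run the script the way a user would,
// with TODO_FILE redirected into the temp directory.
const todoFile = path.join(cwd, "todos.json");
const result = spawnSync(process.execPath, [script, "Buy milk"], {
  cwd,
  env: { ...process.env, TODO_FILE: todoFile },
  encoding: "utf8",
});
console.log(result.status, result.stdout.trim());
```

A direct module test skips the spawn and calls the store functions in process, which is faster but does not exercise argument parsing or exit codes.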