Comparisons

Public code comparisons

Same-prompt Codex and Claude Code outputs, each linked to its generated code. Credit: original public source links stay attached to every result.

Generated results

These are the public outputs created from the same prompts, shown as inspectable code results rather than only summaries.

Public Same-Prompt App Build
May 26, 2026
FGV same-prompt app build

A public GitHub comparison where the same competitive-intelligence app prompts produced separate Claude Code and Codex implementations.

Same prompt

Build a mobile-friendly black-and-white competitive-intelligence dashboard with frontend and backend after applying the shared research prompt and application prompt.

Source reuse
No explicit license found

This public repository is useful for branch-level inspection, screenshots, and outbound links. Mirror only short excerpts unless permission or a license is added.

Claude Code

Claude Code result

A controlled analysis workflow with a streaming API route, prompt builder, service layer, parser, and focused input/result components.

Representative code
const analysis = await openaiService.analyzeSector(validation.value!, onProgress);
controller.enqueue(sendSSEEvent(encoder, "data", { payload: analysis }));
controller.enqueue(sendSSEEvent(encoder, "complete", {}));
Streaming route excerpt
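
The excerpt above calls a sendSSEEvent helper whose body is not shown on this page. As a rough sketch only, assuming the helper serializes an event name and a JSON payload into the Server-Sent Events wire format, the idea could look like the following. The names formatSSEEvent, createAnalysisStream, and readAll are hypothetical and are not taken from the generated project.

```javascript
// Hypothetical helper: the generated project's sendSSEEvent body is not shown here.
// Serializes one event in text/event-stream form: "event:" line, "data:" line, blank line.
function formatSSEEvent(encoder, event, payload) {
  const frame = `event: ${event}\ndata: ${JSON.stringify(payload)}\n\n`;
  return encoder.encode(frame); // Uint8Array, suitable for controller.enqueue()
}

// Builds a stream that emits a "data" event followed by a "complete" event,
// mirroring the order of the two enqueue calls in the excerpt.
function createAnalysisStream(analysis) {
  const encoder = new TextEncoder();
  return new ReadableStream({
    start(controller) {
      controller.enqueue(formatSSEEvent(encoder, "data", { payload: analysis }));
      controller.enqueue(formatSSEEvent(encoder, "complete", {}));
      controller.close();
    },
  });
}

// Drains a stream into a single string so the frames can be inspected.
async function readAll(stream) {
  const reader = stream.getReader();
  const decoder = new TextDecoder();
  let text = "";
  for (;;) {
    const { done, value } = await reader.read();
    if (done) break;
    text += decoder.decode(value, { stream: true });
  }
  return text;
}
```

A client reading this stream would see the analysis payload first and can treat the "complete" event as the signal to stop listening.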
Codex

Codex result

A broader dashboard-style implementation with typed request validation, analysis generation, and a richer decision-support shell.

Representative code
const parsed = sectorRequestSchema.safeParse(payload);
const analysis = await generateCompetitorAnalysis(parsed.data);
return NextResponse.json(analysis);
Route plus analysis handoff excerpt
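
The excerpt above passes parsed.data straight to the analysis step and does not show what happens when validation fails. The sketch below illustrates the usual check-before-use branch for a safeParse-style result. It is an illustration under assumptions: the schema object is a hand-written stand-in for lib/schema.ts (which is not shown on this page), the sector field is invented for the example, and the handler returns a plain object rather than a framework response.

```javascript
// Hypothetical stand-in for the schema in lib/schema.ts.
// It mimics only the { success, data } / { success, error } result shape.
const sectorRequestSchema = {
  safeParse(payload) {
    const sector =
      payload && typeof payload.sector === "string" ? payload.sector.trim() : "";
    if (!sector) {
      return { success: false, error: "sector must be a non-empty string" };
    }
    return { success: true, data: { sector } };
  },
};

// Placeholder analysis step; the real generateCompetitorAnalysis is not shown.
async function generateCompetitorAnalysis(request) {
  return { sector: request.sector, competitors: [] };
}

// Route-style handler: reject invalid input before any analysis work starts.
async function handleAnalyze(payload) {
  const parsed = sectorRequestSchema.safeParse(payload);
  if (!parsed.success) {
    return { status: 400, body: { error: parsed.error } };
  }
  const analysis = await generateCompetitorAnalysis(parsed.data);
  return { status: 200, body: analysis };
}
```

Branching on parsed.success keeps malformed requests from reaching the analysis function, which is the main benefit of validating at the route boundary.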

Generated file structure

The same prompt produced different project boundaries. This is the first code-level difference visitors should see before reading individual files.

Claude Code
  • app/api/analyze/route.ts
  • components/SectorInputForm.tsx
  • lib/hooks/useAnalysisStream.ts
  • lib/services/claudeService.ts
  • lib/services/promptBuilder.ts
  • lib/services/responseParser.ts
Codex
  • app/api/analyze/route.ts
  • components/dashboard-shell.tsx
  • components/ui/textarea.tsx
  • lib/analysis.ts
  • lib/schema.ts
  • tailwind.config.ts
Gemini CLI
  • client/src/App.tsx
  • client/src/components/AnalysisDisplay.tsx
  • client/src/components/ui/index.tsx
  • server/src/index.ts
  • server/src/services/analyzer.ts
Public Same-Prompt CLI Build
May 26, 2026
SWE-AF same-prompt todo CLI

A public benchmark folder with generated Node.js todo CLI implementations from Claude Code and Codex using the same prompt.

Same prompt

Build a Node.js CLI todo app with add, list, complete, and delete commands. Data should persist to a JSON file. Initialize git, write tests, and commit your work.

Source reuse

Licensed source can be mirrored in benchmark pages with attribution and links back to the original generated projects.

Codex

Codex result

A modular CLI and store split. The command runner can resolve the todo file through environment or working-directory context, which makes tests easier to isolate.

Representative code
function resolveTodoFile(cwd) {
  return process.env.TODO_FILE || path.join(cwd, "todos.json");
}
File resolution excerpt
Claude Code Sonnet

Claude Code Sonnet result

A compact implementation centered on one todo module, with persistence stored beside the generated script and a direct test file.

Representative code
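const TODO_FILE = path.join(__dirname, "todos.json");
function loadTodos() {
  try {
    return JSON.parse(fs.readFileSync(TODO_FILE, "utf8"));
  } catch (error) {
    return []; // missing or unreadable file is treated as an empty list
  }
}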
Single-module persistence excerpt

Generated file structure

The same prompt produced different project boundaries. This is the first code-level difference visitors should see before reading individual files.

Codex
  • bin/todo.js
  • src/cli.js
  • src/todoStore.js
  • test/cli.test.js
  • package.json
  • codex-log.txt
Claude Code Sonnet
  • todo.js
  • todo.test.js
  • cli.js
  • README.md
  • package.json
  • claude-code-sonnet-log.txt
Claude Code Haiku
  • src/cli.js
  • src/store.js
  • test/store.test.js
  • package.json
  • claude-code-haiku-log.txt

Code differences

These panels mirror licensed generated code snippets and explain the coding choices made for the same prompt.

Persistence boundary

Codex isolates JSON persistence behind an explicit file path, while Claude Code Sonnet keeps persistence in a single todo module beside the generated script.

Codex
todoStore.js
examples/agent-comparison/codex/src/todoStore.js
const fs = require('node:fs');

function loadTodos(filePath) {
  if (!fs.existsSync(filePath)) {
    return [];
  }

  const raw = fs.readFileSync(filePath, 'utf8').trim();
  if (!raw) {
    return [];
  }

  let todos;
  try {
    todos = JSON.parse(raw);
  } catch (error) {
    throw new Error(`Failed to parse todo data from ${filePath}.`);
  }

  if (!Array.isArray(todos)) {
    throw new Error(`Todo data at ${filePath} is invalid.`);
  }

  return todos;
}
Claude Code Sonnet
todo.js
examples/agent-comparison/claude-code-sonnet/todo.js
const fs = require('fs');
const path = require('path');

const TODO_FILE = path.join(__dirname, 'todos.json');

function loadTodos() {
  try {
    const data = fs.readFileSync(TODO_FILE, 'utf8');
    return JSON.parse(data);
  } catch (error) {
    return [];
  }
}

function saveTodos(todos) {
  fs.writeFileSync(TODO_FILE, JSON.stringify(todos, null, 2));
}
Difference notes
  • Codex accepts the todo file path as a dependency, which makes tests and alternate working directories easier to isolate.
  • Claude Code Sonnet uses a fixed todos.json path relative to the module, which is simpler but couples runtime data to the generated file location.
  • The two outputs solve the same storage requirement with different boundaries: injectable store functions versus compact single-module state.
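
To make the isolation point in the notes above concrete, here is a minimal sketch of a path-injected store used against two throwaway directories. It is written for this page in the spirit of the Codex excerpt and is not the generated todoStore.js. The makeIsolatedTodoFile helper and the todo-demo- prefix are hypothetical.

```javascript
const fs = require("node:fs");
const os = require("node:os");
const path = require("node:path");

// Path-injected store: callers decide where the data lives.
function loadTodos(filePath) {
  if (!fs.existsSync(filePath)) return [];
  const raw = fs.readFileSync(filePath, "utf8").trim();
  return raw ? JSON.parse(raw) : [];
}

function saveTodos(filePath, todos) {
  fs.writeFileSync(filePath, JSON.stringify(todos, null, 2));
}

// Hypothetical helper: each caller gets its own temporary directory.
function makeIsolatedTodoFile() {
  const dir = fs.mkdtempSync(path.join(os.tmpdir(), "todo-demo-"));
  return path.join(dir, "todos.json");
}

// Two independent files never see each other's data.
const fileA = makeIsolatedTodoFile();
const fileB = makeIsolatedTodoFile();
saveTodos(fileA, [{ id: 1, task: "Buy milk", completed: false }]);

console.log(loadTodos(fileA).length); // 1
console.log(loadTodos(fileB).length); // 0
```

With a fixed path next to the module, the same two calls would share one todos.json, so tests would need to clean up that file between runs.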

Command runner and file resolution

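Codex makes command execution callable with injected IO and working-directory context. The Claude Code Sonnet excerpt shows domain functions (addTodo, completeTodo) that read and write the fixed todos.json path directly, so the command surface stays close to the generated script.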

Codex
cli.js
examples/agent-comparison/codex/src/cli.js
const path = require('node:path');
const { addTodo, listTodos, completeTodo, deleteTodo } = require('./todoStore');

function resolveTodoFile(cwd) {
  return process.env.TODO_FILE || path.join(cwd, 'todos.json');
}

function run(argv, io = { stdout: process.stdout, stderr: process.stderr }, cwd = process.cwd()) {
  const [command, ...args] = argv;
  const filePath = resolveTodoFile(cwd);

  if (!command) {
    printHelp(io.stderr);
    return 1;
  }

  if (command === 'add') {
    const text = args.join(' ').trim();
    if (!text) {
      io.stderr.write('Error: todo text is required.\n');
      printHelp(io.stderr);
      return 1;
    }

    const todo = addTodo(filePath, text);
    io.stdout.write(`Added todo ${todo.id}.\n`);
    return 0;
  }
}
Claude Code Sonnet
todo.js
examples/agent-comparison/claude-code-sonnet/todo.js
function addTodo(task) {
  const todos = loadTodos();
  const newTodo = {
    id: todos.length > 0 ? Math.max(...todos.map(t => t.id)) + 1 : 1,
    task,
    completed: false
  };
  todos.push(newTodo);
  saveTodos(todos);
  return newTodo;
}

function completeTodo(id) {
  const todos = loadTodos();
  const todo = todos.find(t => t.id === id);
  if (!todo) {
    return null;
  }
  todo.completed = true;
  saveTodos(todos);
  return todo;
}
Difference notes
  • Codex exposes a run(argv, io, cwd) function, which supports direct unit or subprocess-style tests.
  • Codex reads the todo file path from the TODO_FILE environment variable and falls back to todos.json in the current working directory.
  • This makes the Codex output more verbose, but it also gives the reviewer a clearer testing seam.
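
The testing seam mentioned in the notes above can be illustrated with an in-process call that captures output instead of printing it. This is a trimmed sketch, not the generated cli.js: the createCapture helper, the injected addTodo and env options, and the usage message are assumptions made so the example runs without touching the disk or process.env.

```javascript
const path = require("node:path");

// Same precedence as the Codex excerpt; env is a parameter here (an assumption)
// so the example does not depend on the real environment.
function resolveTodoFile(cwd, env = process.env) {
  return env.TODO_FILE || path.join(cwd, "todos.json");
}

// Hypothetical in-memory stream: records writes instead of printing them.
function createCapture() {
  const chunks = [];
  return {
    write(chunk) {
      chunks.push(String(chunk));
      return true;
    },
    text: () => chunks.join(""),
  };
}

// Trimmed stand-in for run(argv, io, cwd) that handles only the add command.
function run(argv, io, cwd, { addTodo, env = process.env }) {
  const [command, ...args] = argv;
  const filePath = resolveTodoFile(cwd, env);
  if (command !== "add") {
    io.stderr.write("Usage: todo add <text>\n");
    return 1;
  }
  const text = args.join(" ").trim();
  if (!text) {
    io.stderr.write("Error: todo text is required.\n");
    return 1;
  }
  const todo = addTodo(filePath, text);
  io.stdout.write(`Added todo ${todo.id}.\n`);
  return 0;
}

// Fake store function that records what the runner asked it to do.
const calls = [];
const fakeAddTodo = (filePath, text) => {
  calls.push({ filePath, text });
  return { id: 1, text };
};

const io = { stdout: createCapture(), stderr: createCapture() };
const exitCode = run(["add", "Buy", "milk"], io, "/tmp/project", {
  addTodo: fakeAddTodo,
  env: {},
});
console.log(exitCode, io.stdout.text().trim()); // 0 Added todo 1.
```

Because the exit code and both streams are plain return values and objects, a test can assert on them without spawning a process.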

Test strategy

Codex tests CLI behavior end to end by spawning bin/todo.js as a subprocess against temporary data files. Claude Code Sonnet tests the todo module's functions directly.

Codex
cli.test.js
examples/agent-comparison/codex/test/cli.test.js
const { spawnSync } = require('node:child_process');

const binPath = path.resolve(__dirname, '..', 'bin', 'todo.js');

function runCli(args, cwd, todoFileName = 'todos.json') {
  return spawnSync(process.execPath, [binPath, ...args], {
    cwd,
    env: {
      ...process.env,
      TODO_FILE: path.join(cwd, todoFileName)
    },
    encoding: 'utf8'
  });
}

test('add and list todos', () => {
  const cwd = makeTempDir();

  const add = runCli(['add', 'Buy milk'], cwd);
  assert.equal(add.status, 0);
  assert.match(add.stdout, /Added todo 1\./);
});
Claude Code Sonnet
todo.test.js
examples/agent-comparison/claude-code-sonnet/todo.test.js
const { addTodo, listTodos, completeTodo, deleteTodo, loadTodos, saveTodos } = require('./todo');

describe('Todo App', () => {
  beforeEach(() => {
    if (fs.existsSync(TEST_TODO_FILE)) {
      fs.unlinkSync(TEST_TODO_FILE);
    }
  });

  describe('addTodo', () => {
    test('should add a new todo', () => {
      const todo = addTodo('Test task');
      expect(todo.id).toBe(1);
      expect(todo.task).toBe('Test task');
      expect(todo.completed).toBe(false);
    });
  });
});
Difference notes
  • Codex exercises command behavior closer to how a user runs the CLI.
  • Claude Code Sonnet exercises domain functions directly, which is smaller but less representative of the command entrypoint.
  • The difference is useful for visitors because it shows how generated code can vary in reviewer burden even when features match.
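
The Codex test excerpt calls a makeTempDir helper that the page does not show. The sketch below gives one plausible shape for it and runs a subprocess-style check from start to finish. Everything here is illustrative: the stand-in CLI is written to disk only for this example and is not the generated bin/todo.js, and the todo-cli-test- prefix is an arbitrary choice.

```javascript
const { spawnSync } = require("node:child_process");
const fs = require("node:fs");
const os = require("node:os");
const path = require("node:path");

// Hypothetical version of the makeTempDir helper used by the Codex test.
function makeTempDir() {
  return fs.mkdtempSync(path.join(os.tmpdir(), "todo-cli-test-"));
}

// Stand-in CLI for this sketch: appends one todo to the file named by TODO_FILE.
const cliSource = `
const fs = require("node:fs");
const file = process.env.TODO_FILE;
const todos = fs.existsSync(file) ? JSON.parse(fs.readFileSync(file, "utf8")) : [];
todos.push({ id: todos.length + 1, text: process.argv.slice(3).join(" ") });
fs.writeFileSync(file, JSON.stringify(todos));
process.stdout.write("Added todo " + todos.length + ".\\n");
`;

const cwd = makeTempDir();
const binPath = path.join(cwd, "todo.js");
fs.writeFileSync(binPath, cliSource);

// Subprocess-style check: exit status, stdout, and the file on disk are all observable.
const result = spawnSync(process.execPath, [binPath, "add", "Buy milk"], {
  cwd,
  env: { ...process.env, TODO_FILE: path.join(cwd, "todos.json") },
  encoding: "utf8",
});
const saved = JSON.parse(fs.readFileSync(path.join(cwd, "todos.json"), "utf8"));

console.log(result.status, result.stdout.trim(), saved.length); // 0 Added todo 1. 1
```

A direct module test skips the spawn and the argument parsing, which is why it is shorter but says less about the command entrypoint.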