ExploringDatabyLLMs Runs

Published benchmark runs, compare notes, and dashboards across Claude, Codex, and Gemini. Reports open in the Markdown viewer, dashboards open as live HTML, and SQL/JSON artifacts link to the public GitHub repo.

2026-04-10

q012_top_carrier_leadership_steps

claude / sonnet
run-001

2026-03-27

2026-03-26

2026-03-25

2026-03-24

q002_top_carrier_by_flights_leadership

claude / sonnet
run-004

2026-03-23

2026-03-20

2026-03-19

2026-03-18

2026-03-17

q004_worst_origin_airport_otp_thresholded

gemini / gemini-3.1-pro-preview
run-001

q005_worst_winter_carrier_origin_pair

gemini / gemini-3.1-pro-preview
run-001