Benchmark dashboard

Reliability-aware benchmark for routing assumptions.

This page compares schedule-only, realtime-snapshot, and robust assumptions on a deterministic candidate slice. It is meant to sit beside the atlas and research review as the operational comparison layer of the public site.


Deterministic candidate slice | Schedule vs snapshot vs robust | Decision-facing metrics

Rows evaluated

Current deterministic benchmark slice

Scheduled access

Reachable within threshold on schedule

Robust access

Reachable after reliability penalty is applied

Access loss

Cases pushed outside the threshold by uncertainty
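The three access cards above can be read as counts over the Scheduled and Robust columns of the comparison table on this page. The sketch below shows one plausible way to derive them. The page does not state its threshold, so `THRESHOLD_MIN` is a purely illustrative assumption, and the function name `access_counts` is hypothetical rather than taken from the benchmark code.

```python
# Scheduled and robust times in minutes, copied from the comparison table.
SCHEDULED = [21.0, 25.0, 24.0, 28.0, 27.0, 31.0, 18.0, 22.0, 21.0, 25.0, 24.0, 28.0]
ROBUST = [22.0, 26.0, 25.0, 29.0, 28.0, 32.0, 19.0, 23.0, 22.0, 26.0, 25.0, 29.0]

# ASSUMPTION: illustrative threshold only; the dashboard's real value is not given.
THRESHOLD_MIN = 25.0


def access_counts(scheduled, robust, threshold):
    """Return (scheduled_access, robust_access, access_loss) as row counts."""
    scheduled_ok = [s <= threshold for s in scheduled]
    robust_ok = [r <= threshold for r in robust]
    # Access loss: rows reachable on schedule but not after the reliability penalty.
    lost = sum(1 for s, r in zip(scheduled_ok, robust_ok) if s and not r)
    return sum(scheduled_ok), sum(robust_ok), lost


scheduled_access, robust_access, access_loss = access_counts(
    SCHEDULED, ROBUST, THRESHOLD_MIN
)
print(scheduled_access, robust_access, access_loss)
```

Counting loss row by row, rather than subtracting the two totals, keeps the metric tied to specific candidates that cross the threshold.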

Snapshot miss rate

0.0417

Average missed-transfer exposure under snapshot assumptions

Robust miss rate

0.0833

Average missed-transfer exposure under robust assumptions

Trade-off summary

This slice is still scaffold-scale (12 rows covering lines C and F on a single service day), but it already shows how one candidate set can be evaluated under several decision assumptions on a single page.

Snapshot regret

1.00 min

Average increase over the scheduled time under realtime-snapshot assumptions

Robust regret

1.00 min

Average increase over the scheduled time under robust routing assumptions
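Both regret figures can be checked against the comparison table on this page. The sketch below assumes regret is the plain mean of the per-row difference from the scheduled time, which matches the card descriptions but is not confirmed by the benchmark code. The helper name `mean_regret` is hypothetical.

```python
from statistics import mean

# (scheduled, snapshot, robust) times in minutes, copied from the comparison table.
ROWS = [
    (21.0, 22.0, 22.0), (25.0, 26.0, 26.0),
    (24.0, 25.0, 25.0), (28.0, 29.0, 29.0),
    (27.0, 28.0, 28.0), (31.0, 32.0, 32.0),
    (18.0, 19.0, 19.0), (22.0, 23.0, 23.0),
    (21.0, 22.0, 22.0), (25.0, 26.0, 26.0),
    (24.0, 25.0, 25.0), (28.0, 29.0, 29.0),
]

SNAPSHOT, ROBUST = 1, 2  # tuple positions of the two non-schedule columns


def mean_regret(rows, column):
    """Average increase of the chosen column over the scheduled time (index 0)."""
    return mean(row[column] - row[0] for row in rows)


snapshot_regret = mean_regret(ROWS, SNAPSHOT)
robust_regret = mean_regret(ROWS, ROBUST)
print(f"{snapshot_regret:.2f} min, {robust_regret:.2f} min")  # 1.00 min, 1.00 min
```

Every row in the current slice differs from the schedule by exactly one minute, so both averages come out to 1.00 min.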

Current takeaway: the benchmark is already useful as a communication layer. The next step is scale: a larger held-out evaluation window, more routes, and stronger evidence of how much accessibility is lost when moving from schedule-only to robust assumptions.

OD	Line	Mode	Scheduled (min)	Snapshot (min)	Robust (min)	Snapshot miss rate	Robust miss rate
C_20260302T10_direct	C	ST	21.0	22.0	22.0	0.041667	0.083333
C_20260302T10_transfer	C	ST	25.0	26.0	26.0	0.041667	0.083333
C_20260302T11_direct	C	ST	24.0	25.0	25.0	0.041667	0.083333
C_20260302T11_transfer	C	ST	28.0	29.0	29.0	0.041667	0.083333
C_20260302T12_direct	C	ST	27.0	28.0	28.0	0.041667	0.083333
C_20260302T12_transfer	C	ST	31.0	32.0	32.0	0.041667	0.083333
C_20260302T13_direct	C	ST	18.0	19.0	19.0	0.041667	0.083333
C_20260302T13_transfer	C	ST	22.0	23.0	23.0	0.041667	0.083333
F_20260302T11_direct	F	ST	21.0	22.0	22.0	0.041667	0.083333
F_20260302T11_transfer	F	ST	25.0	26.0	26.0	0.041667	0.083333
F_20260302T12_direct	F	ST	24.0	25.0	25.0	0.041667	0.083333
F_20260302T12_transfer	F	ST	28.0	29.0	29.0	0.041667	0.083333
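The OD identifiers in the table appear to follow a `<line>_<YYYYMMDD>T<HH>_<kind>` pattern. The sketch below splits one into typed fields so rows can be grouped by line, hour, or candidate type. The pattern is inferred from the rows shown here and is not a documented format. `CandidateKey`, its field names, and `parse_od` are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class CandidateKey:
    line: str             # e.g. "C" or "F"
    hour_stamp: datetime  # date and hour encoded in the identifier
    kind: str             # "direct" or "transfer" in the rows shown


def parse_od(od: str) -> CandidateKey:
    """Split an identifier such as 'C_20260302T10_direct' into its parts.

    ASSUMPTION: exactly three underscore-separated fields, with the middle
    field formatted as YYYYMMDD, a literal 'T', then a two-digit hour.
    """
    line, stamp, kind = od.split("_")
    return CandidateKey(line, datetime.strptime(stamp, "%Y%m%dT%H"), kind)


key = parse_od("C_20260302T10_direct")
print(key.line, key.hour_stamp.isoformat(), key.kind)
```

An identifier with a different number of fields raises `ValueError` at the unpacking step, so a change in the naming scheme fails loudly instead of being misread.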

Artifacts: results/benchmark/latest/candidates.csv, results/benchmark/latest/comparison.csv, results/benchmark/latest/summary.md.