A new benchmark from AI safety lab METR reframes how we measure machine intelligence — not by test scores, but by how long a task an AI…