AI agents handle a wide range of week-long tasks

7.6 significance by METR Capabilities Q1 2027

Why does it matter?

The ability to complete week-long tasks would make autonomous agents useful for a much larger share of everyday software work.

Direct quote

Our estimate of the length of tasks that an agent can complete depends on methodological choices like the tasks used and the humans whose performance is measured. However, we are fairly confident that the overall trend is roughly correct, at around 1-4 doublings per year. If the measured trend from the past 6 years continues for 2-4 more years, generalist autonomous agents will be capable of performing a wide range of week-long tasks.

METR
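
The arithmetic behind the quote can be sketched as a simple exponential extrapolation: if the task length an agent can handle doubles a fixed number of times per year, the time to reach a target length is the number of doublings needed divided by the doubling rate. The sketch below is illustrative only. The starting horizon of 1 hour and the 40-hour definition of a "week-long" task are assumptions made for this example, not figures from METR, and the function names are hypothetical.

```python
import math


def years_to_reach(start_hours: float, target_hours: float,
                   doublings_per_year: float) -> float:
    """Years for a task horizon to grow from start_hours to target_hours,
    assuming steady exponential growth at the given doubling rate."""
    if start_hours <= 0 or target_hours <= 0 or doublings_per_year <= 0:
        raise ValueError("all inputs must be positive")
    doublings_needed = math.log2(target_hours / start_hours)
    return doublings_needed / doublings_per_year


def horizon_after(start_hours: float, years: float,
                  doublings_per_year: float) -> float:
    """Task horizon in hours after `years` of steady doubling."""
    return start_hours * 2 ** (years * doublings_per_year)


# ASSUMPTIONS (not from the source): a 1-hour starting horizon and a
# 40-hour working week as the meaning of "week-long".
START_HOURS = 1.0
WEEK_HOURS = 40.0

if __name__ == "__main__":
    # The quote gives a range of roughly 1-4 doublings per year.
    for rate in (1, 2, 4):
        years = years_to_reach(START_HOURS, WEEK_HOURS, rate)
        print(f"{rate} doubling(s)/year: about {years:.1f} years to a 40-hour task")
```

Under these assumed inputs the answer is sensitive to the doubling rate, which is why the quote states a range of years rather than a single date.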