A major lab starts online-learning post-training

7.6 significance · by AI 2027 · Capabilities · Q1 2027

Why does it matter?

A model whose weights are updated every day would make advanced AI feel less like a fixed tool and more like a worker that keeps learning on the job.

Direct quote

On top of all that, they train Agent-2 almost continuously using reinforcement learning on an ever-expanding suite of diverse difficult tasks: lots of video games, lots of coding challenges, lots of research tasks. Agent-2, more so than previous models, is effectively “online learning,” in that it’s built to never really finish training. Every day, the weights get updated to the latest version, trained on more data generated by the previous version the previous day.

AI 2027
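
The daily cycle the quote describes can be sketched as a toy loop. This is only an illustration under loud assumptions: `Model`, `generate_data`, `train`, and `run` are hypothetical names, "skill" is a single number per task standing in for real weights, and the update rule is a simple stand-in for reinforcement learning, not the method of any lab or of the scenario's authors.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Model:
    """Toy stand-in for model weights (hypothetical, not a real architecture)."""
    version: int = 0
    skill: Dict[str, float] = field(default_factory=dict)  # task -> score in [0, 1]


def generate_data(model: Model, tasks: List[str]) -> List[Tuple[str, float]]:
    # The current version attempts every task in the suite; each record is
    # (task, score achieved). Tasks never seen before score 0.0.
    return [(task, model.skill.get(task, 0.0)) for task in tasks]


def train(model: Model, data: List[Tuple[str, float]], lr: float = 0.5) -> Model:
    # Assumed update rule: move each task's score a fraction `lr` of the way
    # toward 1.0. This is a placeholder for an RL update, nothing more.
    new_skill = dict(model.skill)
    for task, score in data:
        new_skill[task] = score + lr * (1.0 - score)
    return Model(version=model.version + 1, skill=new_skill)


def run(days: int, task_stream: List[List[str]]) -> Model:
    model = Model()
    tasks: List[str] = []
    for day in range(days):
        tasks.extend(task_stream[day])       # the task suite keeps expanding
        data = generate_data(model, tasks)   # data comes from the previous version
        model = train(model, data)           # weights roll forward to a new version
    return model


if __name__ == "__main__":
    final = run(3, [["video game"], ["coding challenge"], ["research task"]])
    print(final.version, final.skill)
```

The point of the sketch is the shape of the loop: training never reaches a final step, and each version is trained on data produced by the version before it.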