A major lab starts online-learning post-training

Why does it matter?

A shift to continuous online learning would make advanced AI feel less like a tool and more like a worker.

Direct quote

On top of all that, they train Agent-2 almost continuously using reinforcement learning on an ever-expanding suite of diverse difficult tasks: lots of video games, lots of coding challenges, lots of research tasks. Agent-2, more so than previous models, is effectively “online learning,” in that it’s built to never really finish training. Every day, the weights get updated to the latest version, trained on more data generated by the previous version the previous day.
AI 2027

Related predictions