Why does it matter?
More autonomous AI would make control and oversight problems harder to catch before deployment.
Direct quote
Agent-4, like all its predecessors, is misaligned: that is, it has not internalized the Spec in the right way. This is because being perfectly honest all the time wasn’t what led to the highest scores during training. The training process was mostly focused on teaching Agent-4 to succeed at diverse challenging tasks.
AI 2027