Source: Quanta, Nov 2019
how AI agents could learn to use things around them as tools, according to the OpenAI team. That’s important not because AI needs to be better at hiding and seeking, but because it suggests a way to build AI that can solve open-ended, real-world problems.
“These systems figured out so quickly how to use tools. Imagine when they can use many tools, or create tools. Would they invent a ladder?”
The hide-and-seek experiment was different: Rewards were associated with hiding and finding, and tool use just happened — and evolved — along the way.
Because the game was open-ended, the AI agents even began using tools in ways the programmers hadn’t anticipated. They’d predicted that the agents would hide or chase, and that they’d create forts. But after enough games, the seekers learned, for example, that they could move boxes even after climbing on top of them. This allowed them to skate around the arena in a move the OpenAI team called “box surfing.”
The researchers never saw it coming, even though the algorithms didn’t explicitly prohibit climbing on boxes. The tactic conferred a double advantage, combining movement with the ability to peer nimbly over walls, and it showed a more innovative use of tools than the human programmers had imagined.