
(FM-AMZN) Proposer-Agent-Evaluator (PAE): Autonomous Skill Discovery For Foundation Model Internet Agents
About this content
Discover the Proposer-Agent-Evaluator (PAE) system, developed by Amazon Science, which lets foundation model agents autonomously discover and practice skills in the wild. The approach addresses a key scalability bottleneck: manually specifying an agent’s vast skill repertoire through human-annotated instructions. PAE has three components: a context-aware task proposer that generates instructions from website information, an agent policy that attempts those tasks, and an autonomous evaluator based on a vision-language model (VLM) that provides reward signals used to refine the policy via reinforcement learning (RL).
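The propose, attempt, evaluate, and refine cycle described above can be illustrated with a toy loop. This is only a minimal sketch under stated assumptions, not Amazon Science's implementation: every name here (`ToyPolicy`, `propose_tasks`, `evaluate`, `run_pae`) is hypothetical, the "website" is a plain dictionary, the evaluator is a string check standing in for a VLM judge, and the policy update is a simple preference average rather than the RL method used in the paper.

```python
import random
from dataclasses import dataclass, field


@dataclass
class ToyPolicy:
    """Hypothetical stand-in for the agent policy: one score per (task, action)."""
    prefs: dict = field(default_factory=dict)

    def greedy(self, task: str, actions: list) -> str:
        # Pick the action with the highest learned preference for this task.
        return max(actions, key=lambda a: self.prefs.get((task, a), 0.0))

    def act(self, task: str, actions: list, rng: random.Random) -> str:
        # Epsilon-greedy exploration; a real agent would emit web actions
        # conditioned on screenshots instead.
        if rng.random() < 0.2:
            return rng.choice(actions)
        return self.greedy(task, actions)

    def update(self, task: str, action: str, reward: float, lr: float = 0.5) -> None:
        # Move the stored preference toward the observed reward.
        key = (task, action)
        old = self.prefs.get(key, 0.0)
        self.prefs[key] = old + lr * (reward - old)


def propose_tasks(website_info: dict) -> list:
    # Proposer: derive task instructions from website context.
    return [f"find {item}" for item in website_info["items"]]


def evaluate(task: str, action: str) -> float:
    # Evaluator: sparse 0/1 success signal. The described system uses a
    # VLM-based judge; this string comparison is only a placeholder.
    return 1.0 if action == task.replace("find ", "click ") else 0.0


def run_pae(website_info: dict, rounds: int = 2000, seed: int = 0) -> ToyPolicy:
    rng = random.Random(seed)
    policy = ToyPolicy()
    tasks = propose_tasks(website_info)
    actions = [f"click {item}" for item in website_info["items"]]
    for _ in range(rounds):
        task = rng.choice(tasks)                 # sample a proposed task
        action = policy.act(task, actions, rng)  # agent attempts it
        reward = evaluate(task, action)          # evaluator scores the attempt
        policy.update(task, action, reward)      # refine the policy
    return policy


if __name__ == "__main__":
    site = {"items": ["shoes", "maps", "recipes"]}
    trained = run_pae(site)
    actions = [f"click {item}" for item in site["items"]]
    for task in propose_tasks(site):
        print(task, "->", trained.greedy(task, actions))
```

The point of the sketch is the division of labor: no human writes the task list or labels success, since tasks come from the proposer and rewards come from the evaluator.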
The system excels in challenging vision-based web navigation, demonstrating substantial improvements in zero-shot generalization to unseen tasks and websites (around 50% relative improvement) on real-world benchmarks like WebVoyager and WebArena. PAE enables agents to perform diverse goal-directed tasks, from finding directions to buying specific items online, without human supervision. Despite its advancements, current PAE models may still lag behind state-of-the-art proprietary models in complex reasoning, and their performance on dynamic live websites can vary. Nevertheless, this breakthrough by Amazon Science paves the way for more capable open-source foundation model agents.
Paper link: https://assets.amazon.science/74/38/965b25dc4a98b48186022a8588d3/proposer-agent-evaluator-pae-autonomous-skill-discovery-for-foundation-model-internet-agents.pdf