Physical Address
304 North Cardinal St.
Dorchester Center, MA 02124
David Silver and Richard Sutton, two renowned AI scientists, argue in a new paper that artificial intelligence is about to enter a new phase, the “Era of Experience.” In this phase, AI systems rely less and less on human-provided data and instead improve by gathering data from their own interactions with the world.
While the paper is conceptual and forward-looking, it has direct implications for companies that plan to build with and for future AI agents and systems.
Both Silver and Sutton have a track record of accurate predictions about the future of AI. The validity of those predictions can be seen directly in today’s most advanced AI systems. In 2019, Sutton, a pioneer in reinforcement learning, wrote the famous essay “The Bitter Lesson,” in which he argued that the greatest long-term progress in AI comes from large-scale computation combined with general-purpose search and learning methods, rather than from relying primarily on human-derived domain knowledge.
David Silver, a senior scientist at DeepMind, was a key contributor to AlphaGo, AlphaZero and AlphaStar, all important achievements in deep reinforcement learning. He was also the co-author of a 2021 paper which claimed that reinforcement learning and a well-designed reward signal would be enough to create very advanced AI systems.
The most advanced large language models (LLMs) use those two concepts. The wave of new LLMs that has conquered the AI scene since GPT-3 has relied primarily on scaling compute and data to internalize vast amounts of knowledge. The most recent wave of reasoning models, such as DeepSeek-R1, has shown that reinforcement learning and a simple reward signal are enough to learn complex reasoning skills.
The “Era of Experience” builds on the same concepts that Sutton and Silver have discussed in recent years and adapts them to the latest advances in AI. The authors argue that the “pace of progress driven solely by supervised learning from human data is demonstrably slowing, signalling the need for a new approach.”
And this approach requires a new source of data, one that must be generated in a way that continually improves as the agent becomes stronger. “This can be achieved by allowing agents to learn continually from their own experience, i.e., data that is generated by the agent interacting with its environment,” Sutton and Silver write. They argue that eventually, “experience will become the dominant medium of improvement and ultimately dwarf the scale of human data used in today’s systems.”
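To make the idea of learning from self-generated experience concrete, here is a minimal sketch in Python. It is not taken from the paper: the `TwoButtonEnvironment` class, the payout probabilities, and the epsilon-greedy rule are all illustrative assumptions. It only shows an agent whose training data consists entirely of its own (action, reward) interactions.

```python
import random


class TwoButtonEnvironment:
    """Hypothetical environment: each action pays 1.0 with a hidden probability."""

    def __init__(self, payout_probabilities, rng):
        self.payout_probabilities = payout_probabilities
        self.rng = rng

    def step(self, action):
        # The reward comes from the environment, not from a human label.
        hit = self.rng.random() < self.payout_probabilities[action]
        return 1.0 if hit else 0.0


def learn_from_experience(env, num_actions, steps, epsilon, rng):
    """Epsilon-greedy value estimation from the agent's own interactions."""
    estimates = [0.0] * num_actions  # running mean reward per action
    counts = [0] * num_actions
    experience = []                  # the data the agent generates itself
    for _ in range(steps):
        if rng.random() < epsilon:
            action = rng.randrange(num_actions)  # explore
        else:
            action = max(range(num_actions), key=lambda a: estimates[a])  # exploit
        reward = env.step(action)
        experience.append((action, reward))
        counts[action] += 1
        # Incremental mean update of the estimate for the chosen action.
        estimates[action] += (reward - estimates[action]) / counts[action]
    return estimates, experience


rng = random.Random(0)
env = TwoButtonEnvironment([0.2, 0.8], rng)
estimates, experience = learn_from_experience(env, 2, 2000, 0.1, rng)
best_action = max(range(2), key=lambda a: estimates[a])
print("preferred action:", best_action)
```

No human-written examples appear anywhere in the loop. The amount of experience grows with every step the agent takes, which is the property the authors emphasize.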
According to the authors, in addition to learning from their own experience, future AI systems will “break through the limitations of human-centric AI systems” across four dimensions: streams, actions and observations, rewards, and planning and reasoning.
The idea of AI agents that adapt to their environment through reinforcement learning is not new. Previously, however, these agents were limited to very constrained environments such as board games. Today, agents that can interact with complex environments (e.g., AI computer use) and advances in reinforcement learning will overcome these limitations, bringing about the transition to the era of experience.
Buried in Sutton and Silver’s paper is an observation that will have significant implications for real-world applications: “The agent may use ‘human-friendly’ actions and observations such as user interfaces,” the authors write. It may also take “machine-friendly” actions, such as executing code and calling APIs.
The era of experience means that developers will have to build their applications not only for humans but also with AI agents in mind. Machine-friendly actions require secure and accessible APIs that agents can reach either directly or through interfaces such as the Model Context Protocol (MCP). It also means creating agents that can be discovered through protocols such as Google’s Agent2Agent. You will need to design your APIs and agent interfaces to provide access to both actions and observations. This will allow agents to gradually reason about and learn from their interactions with your applications.
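As a rough illustration of an interface that exposes both actions and observations, consider the sketch below. It is a plain in-memory Python example, not an implementation of MCP or Agent2Agent. `TodoApp`, `Observation`, `describe_actions`, `observe`, and `act` are hypothetical names chosen for this example.

```python
from dataclasses import dataclass, field


@dataclass
class Observation:
    """Structured result an agent receives after every action (hypothetical)."""
    ok: bool
    state: dict
    message: str = ""


@dataclass
class TodoApp:
    """Hypothetical application with an agent-facing interface."""
    items: dict = field(default_factory=dict)
    next_id: int = 1

    def describe_actions(self):
        # Machine-readable list of what an agent is allowed to do.
        return [
            {"name": "add_item", "params": {"title": "str"}},
            {"name": "complete_item", "params": {"item_id": "int"}},
        ]

    def observe(self):
        # Snapshot of application state, copied so callers cannot mutate it.
        return {"items": {i: dict(v) for i, v in self.items.items()}}

    def act(self, name, **params):
        # Every action returns an observation, including on failure,
        # so the agent can learn from the outcome of what it tried.
        if name == "add_item":
            self.items[self.next_id] = {"title": params["title"], "done": False}
            self.next_id += 1
            return Observation(True, self.observe(), "item added")
        if name == "complete_item":
            item = self.items.get(params.get("item_id"))
            if item is None:
                return Observation(False, self.observe(), "unknown item_id")
            item["done"] = True
            return Observation(True, self.observe(), "item completed")
        return Observation(False, self.observe(), f"unknown action: {name}")


app = TodoApp()
first = app.act("add_item", title="renew certificate")
second = app.act("complete_item", item_id=1)
failed = app.act("complete_item", item_id=99)
print(second.state, failed.message)
```

The design choice worth noting is that failed actions also return a structured observation rather than only raising an error. An agent that learns from interaction needs to see what happened after each attempt, not just the successful ones.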
If the vision that Sutton and Silver present becomes reality, agents will soon be roaming the web (and, later, the physical world) to perform tasks. Their behaviors and needs will be very different from those of human users and developers, and offering an agent-friendly way to interact with your application will improve your ability to use future AI systems (and help prevent the harm they can cause).
“By building upon the foundations of RL and adapting its core principles to the challenges of this new era, we can unlock the full potential of autonomous learning and pave the way to truly superhuman intelligence,” Sutton and Silver write.
DeepMind declined to provide additional comments for the story.