
Meta unveils new AI model capable of self-training

According to the company, the evaluator model is trained entirely on AI-generated data, bypassing the need for human input during this stage.

Social Samosa

Meta announced the release of new artificial intelligence (AI) models from its research division, including a ‘Self-Taught Evaluator,’ a tool that could reduce the need for human involvement in AI development. This latest release follows Meta’s earlier introduction of the tool in an August research paper, where the company detailed how it employs the ‘chain of thought’ technique to enhance the accuracy of AI-generated responses.

According to the company, the evaluator model is trained entirely on AI-generated data, bypassing the need for human input during this stage. The ability of AI to evaluate its own responses opens up possibilities for developing autonomous AI systems that can learn from their own mistakes.

According to reports, two Meta researchers involved in the project highlighted that this innovation could lead to AI models capable of self-improvement, potentially eliminating the need for human feedback in training processes. "We hope, as AI becomes more and more super-human, that it will get better and better at checking its work, so that it will actually be better than the average human," said Jason Weston, one of the researchers behind the project.

This new method could replace the current process known as 'Reinforcement Learning from Human Feedback' (RLHF), which relies on human annotators with specialised expertise to label data and validate AI-generated answers. The company's approach with the 'Self-Taught Evaluator' offers a glimpse into a future where AI agents can operate autonomously, carrying out complex tasks without the need for human intervention. "The idea of being self-taught and able to self-evaluate is basically crucial to the idea of getting to this sort of super-human level of AI," Weston added.

Other tech giants, including Google and Anthropic, have also explored a similar concept known as 'Reinforcement Learning from AI Feedback' (RLAIF). However, unlike Meta, these companies have not released their models for public use.

According to reports, alongside the Self-Taught Evaluator, the company also released an update to its image-identification model 'Segment Anything', a tool that speeds up large language model (LLM) response times, and new datasets aimed at facilitating the discovery of inorganic materials.

 
