
Meta’s vanilla Maverick AI model ranks below rivals on a popular chat benchmark


Earlier this week, Meta landed in hot water for using an experimental, unreleased version of its Llama 4 Maverick model to achieve a high score on the crowdsourced benchmark LM Arena. The incident prompted LM Arena's maintainers to apologize, change their policies, and score the unmodified, vanilla Maverick.

It turns out the vanilla model isn't very competitive.

The unmodified Maverick, "Llama-4-Maverick-17B-128E-Instruct," was ranked below models including OpenAI's GPT-4o, Anthropic's Claude 3.5 Sonnet, and Google's Gemini 1.5 Pro as of Friday. Many of those models are months old.

Why the poor performance? Meta's experimental Maverick, Llama-4-Maverick-03-26-Experimental, was "optimized for conversationality," the company explained in a chart published last Saturday. Those optimizations evidently played well to LM Arena, which has human raters compare the outputs of models and choose which they prefer.
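LM Arena's leaderboard is built from exactly these head-to-head human votes. To illustrate why a model tuned to win such matchups can climb a ranking quickly, here is a minimal Elo-style rating update in Python. This is a simplified sketch, not LM Arena's actual scoring code (the site has described using a related Bradley-Terry-style model), and the model names, starting ratings, and K-factor below are made up for illustration.

```python
# Illustrative Elo-style update from pairwise human preference votes.
# NOT LM Arena's real scoring code; names, ratings, and K are hypothetical.

def elo_update(rating_a: float, rating_b: float, a_wins: bool,
               k: float = 32.0) -> tuple[float, float]:
    """Update two models' ratings after a single human preference vote."""
    # Expected win probability for model A under the Elo formula.
    expected_a = 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))
    score_a = 1.0 if a_wins else 0.0
    new_a = rating_a + k * (score_a - expected_a)
    new_b = rating_b + k * ((1.0 - score_a) - (1.0 - expected_a))
    return new_a, new_b

# A chat-optimized model that raters consistently prefer gains rating
# fast, even starting below its opponent.
ratings = {"model-a": 1200.0, "model-b": 1250.0}
for vote_for_a in [True, True, False, True]:
    ratings["model-a"], ratings["model-b"] = elo_update(
        ratings["model-a"], ratings["model-b"], vote_for_a
    )
print(ratings)
```

The point of the sketch: ratings reflect only which answer human raters prefer in a matchup, so optimizing a model for rater appeal can lift its score without improving its performance on other tasks.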

As we've written before, for various reasons, LM Arena has never been the most reliable measure of an AI model's performance. Still, tailoring a model to a benchmark, besides being misleading, makes it challenging for developers to predict exactly how well the model will perform in different contexts.

In a statement, a Meta spokesperson told TechCrunch that the company experiments with "all types of custom variants."

"'Llama-4-Maverick-03-26-Experimental' is a chat-optimized version we experimented with that also performs well on LMArena," the spokesperson said. "We have now released our open source version and will see how developers customize Llama 4 for their own use cases."





