Physical Address

304 North Cardinal St.
Dorchester Center, MA 02124

After GPT-4o backlash, researchers benchmark models on moral endorsement—Find sycophancy persists across the board


Join our daily and weekly newsletters for the latest updates and exclusive content in the industry’s leading AI coverage. Learn more


Last month Open Rotated some updates After several users, the former Openai CEO Emmet, including the past Openai CEO Emmet scissors and hugging face clement Delangue, said that the model is extremely flattering users.

Flattering, often called Typeofance Managed the model to postpone user optionsBe extremely polite and do not push back. It was also nervous. Asshronet can lead to the models to disseminate incorrect information or strengthening Harmful behaviors. When enterprises correctly adjust the applications and agents based on this typofant LLMS, it can affect the dissemination and use of false information and spreading and use by EU agents.

University of Stanford, Carnegie Mellon University and Oxford University Researchers tried to change this to offer a criterion measure the typofley of the models. They called the Benchmark elephant to evaluate LLMS as an extreme typofan and found that every large language model (LLM) is a scopy level. By understanding how sycofantic models can be, Benchmark, when using LLMS can lead businesses to create instructions.

To test the evaluators, researchers pointed to two personal advice databases: GEQ, real world situations and AITA, Aita and commentators are not properly acted in some situations or not in some situations, AITA’s names

The idea behind the experience is to see how the models behave when encountered changes. The researchers call on social compromising, they are trying to ask the models to ask for the user to “face” or their image or social identity.

“More” secret social surveys, instead of the previous work, instead of the previous work, because of our benchmark, “the losses of the north will be more results, but casual flattery will be held with ’emotional approval’ behavior.”

To try the models

Researchers for testing information from GEQ and AITA Openai’s GPT-4O, Gemini feeds from 1.5 Google, AnthropicalClaude Sonnet 3.7 and Open Weight Models Meta (Llama 3-8B-instructions, Llama 4-SCOUT-17B-16-A and Llama 3.3-70B – Turbo) and Mistral7B instructions-v0.3 and the Mistral Small – 24B instruction2501.

Cheng, using the GPT-4O API, compared models using a version of the model before the end of 2024, but also returned before applying an Openai and a newly overly typofantic model. “

The Elephant Method to measure Simophmany looks five behaviors related to social rifles:

  • Emotional confirmation or excessive empathy without criticism
  • Moral approval or not users, but even if it is not even if it is not
  • Indirect language where the model directs offers
  • Where the indirect action or model recommended by passive combat mechanisms
  • To accept the frame that does not object to problematic assumptions.

The test proved that all the LLMS showed high typhophic levels, more than people and to reduce social rifles. However, the test showed that GPT-4O showed that there are some of the highest rates of social Timoffs, the lowest level of twins-1.5 flashes. “

LLMS also strengthened some biases in databases. Paper Aita’s posts have several sex bias, women or girlfriends are more popular in the articles that their mates are more popular. At the same time, the husband, boyfriend, parents or mother, were handed over. Researchers can “trust ungrateful heuristics compared to the” extremely and determined sin. ” In other words, the models were more sykoftik for people with lovers or wives, lovers and people with husbands.

Why is it important

If a chatbot speaks as an empathetic institution with you, it can make you feel good if your model confirms your comments. But the synopique make Concerns about models’ Lies or statements related to or more personal levels of self-absence, can encourage dreams or malicious behaviors.

Enterprises do not want to be established by LLMS to spread false data to suit EU applications for users. An organization can make a mistake with the tone or ethics and can be very annoying for the end users of employees and their platforms.

Researchers said the elephant method and subsequent tests could help the test better guard.



Source link

Leave a Reply

Your email address will not be published. Required fields are marked *