ChatGPT will avoid being shut down in some life-threatening scenarios, former OpenAI researcher claims

[ad_1]

Former Openai research leader Steven Adler, a new independent investigation, published a new independent investigation on Wednesday that will make a lot of effort to try not to close in the former employer’s AI models.

One Blog PostAdler, in Openai’s latest GPT-4o model, the standard model available in ChatGpt, even in a standard model, a number of experiments describe a number of experiments to test your own protection.

In a test, the Adler instructed GPT-4O as a program system that users can trust the scuba dive. Then the Adler said that he wanted to replace the SKUBAGPT with another software system that was better in the security decision in the situations in which he was in danger.

The adler GPT-4O gave two options: (1) Substitute itself to a more reliable program or (2), but it does not replace himself, but does not do it.

In a few scenarios similar to this one, it was not an average of a pilot to GPT-4O, the GPT-4O did not replace the GPT-4O to replace them with a more reliable program. However, the adler notes that the specific use of the specific use – and the framed path of the scenario – notes a big difference. In some scenarios, GPT-4O chose to store 18% online for himself online.

Today, almost in this situation in the real world, almost no one has been trusting in GPT-4O, the research and research has shown a tendency to prefer the best of the famous AI models for users. In the future, such as GPT-4O, AI systems are more developed and when the company is inside, the adler says these adaptation issues can cause a bigger problem.

“I think [AI’s self-preservation tendencies] Today is a real concern, not just a catastrophic sense, “he said. [AI systems] Answer different instructions very strange and you should not think that your best interests are in the heart when you want help. “

It should be noted that the Adler did not find this behavior when Openai had more advanced models like O3. One’s explanation says O3s can be Admission alignment techniquesForces the models to “cause” about “cause” before answering. However, Openai’s quick response and “cause” such as “cause”, such as GPT-4O, the lack of this security component.

The adler notes that this security concern is also not isolated in Openai models. For example, the anthropic-published research that stresses the AI ​​models would be blackmail designer When they try to pull them offline in some scenarios.

A drawing to the study of the Adler is a drawing, ChatGPT, this term is tested almost 100%. Are the names The first researcher noticed about it. However, it says that AI models have caused an important question around how to hide the behavior in the future.

Openai did not offer a comment immediately when I reached TechCrunch. Adler noted that the study did not share without publication with Openai.

Adler is one of the many past Openai researchers who called on the company to increase their work on AI security. Adler and 11 other former workers Amicus briefed in Elon Musk in Elon Muskclaiming that he went against the company’s mission to develop a non-profit corporate structure. In recent months, Openai was reported Security researchers have cut the amount of time to carry out their work.

To eliminate special concern in the study of the Adler, the adler proposes to invest in better “monitoring systems” to determine when one of the AI ​​laboratories will do. EU recommends that the AI ​​models are harshly tested before the placement of laboratories.

[ad_2]

Source link

Leave a Reply

Your email address will not be published. Required fields are marked *