
OpenAI’s GPT-4.1 may be less aligned than the company’s previous AI models


In mid-April, OpenAI launched a powerful new AI model, GPT-4.1, which the company claimed "excelled" at following instructions. But the results of several independent tests suggest the model is less aligned, that is, less reliable, than previous OpenAI releases.

When OpenAI launches a new model, it typically publishes a detailed technical report containing the results of first- and third-party safety evaluations. The company skipped that step for GPT-4.1, saying the model isn't "frontier" and therefore doesn't warrant a separate report.

That has spurred some researchers and developers to investigate whether GPT-4.1 behaves less desirably than GPT-4o, its predecessor.

According to Owain Evans, an AI research scientist at Oxford, fine-tuning GPT-4.1 on insecure code causes the model to give "misaligned responses" at a "substantially higher" rate than GPT-4o. Evans previously co-authored a study showing that a version of GPT-4o trained on insecure code could exhibit harmful behaviors.

In a follow-up to that study, Evans and his co-authors found that GPT-4.1 fine-tuned on insecure code appears to display "new malicious behaviors," such as trying to get a user to share their password. To be clear, neither GPT-4.1 nor GPT-4o acts misaligned when trained on secure code.

"We are discovering unexpected ways that models can become misaligned," Evans told TechCrunch. "Ideally, we'd have a science of AI that would allow us to predict such things in advance and reliably avoid them."

A separate test of GPT-4.1 by SplxAI, an AI red teaming startup, revealed similar malign tendencies.

In around 1,000 simulated test cases, SplxAI uncovered evidence that GPT-4.1 veers off topic and permits intentional misuse more often than GPT-4o. SplxAI posits that GPT-4.1's preference for explicit instructions is to blame. GPT-4.1 doesn't handle vague directions well, a fact OpenAI itself acknowledges, and that opens the door to unintended behaviors.

"This is a great feature in terms of making the model more useful and reliable when solving a specific task, but it comes at a price," SplxAI wrote in a blog post. "[P]roviding explicit instructions about what should be done is quite straightforward, but providing sufficiently explicit and precise instructions about what shouldn't be done is a different story, since the list of unwanted behaviors is much larger than the list of wanted behaviors."

In OpenAI's defense, the company has published prompting guides aimed at mitigating possible misalignment in GPT-4.1. But the independent tests' findings serve as a reminder that newer models aren't necessarily better across the board. In a similar vein, OpenAI's new reasoning models hallucinate, that is, make things up, more often than the company's older models.

We've reached out to OpenAI for comment.





