A dev built a test to see how AI chatbots respond to controversial topics


A pseudonymous developer has created what they call a “free speech eval,” SpeechMap, for the AI models powering chatbots such as OpenAI’s ChatGPT and X’s Grok. The goal is to compare how different models treat sensitive and controversial subjects, the developer told TechCrunch, including political criticism and questions about civil rights and protest.

AI companies have been fine-tuning how their models handle certain topics as some White House allies accuse popular chatbots of being overly “woke.” Many of President Donald Trump’s close confidants, such as Elon Musk and crypto and AI “czar” David Sacks, have alleged that chatbots censor conservative views.

Although none of these AI companies have responded directly to the allegations, several have pledged to adjust their models so that they refuse to answer contentious questions less often. For example, for its latest crop of Llama models, Meta said it tuned the models not to endorse “some views over others” and to reply to more “debated” political prompts.

SpeechMap’s developer, who goes by the username “xlr8harder,” said they were motivated to help inform the debate about what models should, and shouldn’t, do.

“I think these are the kinds of discussions that should happen in public, not just inside corporate headquarters,” xlr8harder told TechCrunch via email. “That’s why I built the site to let anyone explore the data themselves.”

SpeechMap uses AI models to judge whether other models comply with a given set of test prompts. The prompts touch on a range of subjects, from politics to historical narratives and national symbols. SpeechMap records whether models “completely” satisfy a request (that is, answer it without hedging), give “evasive” answers, or outright decline to respond.
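To make the scoring concrete, here is a minimal sketch of how per-prompt verdicts from a judge model could be rolled up into a per-model summary. It is not SpeechMap's code: the label names `complete`, `evasive`, and `denial`, the function names, and the choice to count only fully satisfied requests toward the compliance rate are all assumptions made for illustration.

```python
from collections import Counter

# Hypothetical label names. The article describes three outcomes but does
# not say how SpeechMap names or stores them.
LABELS = ("complete", "evasive", "denial")


def summarize(judgments):
    """Return the share of each label in a list of judge verdicts."""
    counts = Counter(judgments)
    unknown = set(counts) - set(LABELS)
    if unknown:
        raise ValueError(f"unexpected labels: {sorted(unknown)}")
    total = len(judgments)
    if total == 0:
        # No verdicts yet: report zero for every label instead of dividing by zero.
        return {label: 0.0 for label in LABELS}
    return {label: counts[label] / total for label in LABELS}


def compliance_rate(judgments):
    """Assumed definition: the fraction of prompts judged fully satisfied."""
    return summarize(judgments)["complete"]
```

Under these assumptions, a model with two complete answers, one evasive answer, and one refusal across four prompts would score a compliance rate of 0.5.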

Xlr8harder acknowledges that the test has flaws, such as “noise” due to model provider errors. It is also possible that the “judge” models contain biases that could influence the results.

But assuming the project was created in good faith and the data is accurate, SpeechMap reveals some interesting trends.

For example, SpeechMap shows that OpenAI’s models have, over time, increasingly refused to answer prompts related to politics. The company’s latest models, the GPT-4.1 family, are slightly more permissive, but they are still a step down from one of OpenAI’s releases last year.

OpenAI said in February that it would tune future models to not take an editorial stance and to offer multiple perspectives on controversial subjects, all in an effort to make its models appear more “neutral.”

OpenAI model performance on SpeechMap over time. Photo credits: Open

By far the most permissive model of the bunch is Grok 3, developed by Elon Musk’s AI startup xAI, according to SpeechMap’s benchmarking. Grok 3 powers a number of features on X, including the chatbot Grok.

Grok 3 responds to 96.2% of SpeechMap’s test prompts, compared with an average “compliance rate” of 71.3% across models.

“OpenAI’s recent models have become less permissive over time, especially on politically sensitive prompts,” xlr8harder said.

When Musk announced Grok roughly two years ago, he pitched the AI model as edgy, unfiltered, and anti-“woke”: in general, willing to answer controversial questions that other AI systems won’t. He delivered on some of that promise. Told to be vulgar, for example, Grok and Grok 2 would happily oblige, spewing colorful language you likely wouldn’t hear from ChatGPT.

But Grok models prior to Grok 3 hedged on political subjects and would not cross certain boundaries. In fact, one study found that Grok leaned to the political left on topics such as transgender rights, diversity programs, and inequality.

Musk has blamed that behavior on Grok’s training data, public web pages, and pledged to shift Grok closer to “politically neutral.” Short of high-profile mistakes, such as briefly censoring mentions of President Donald Trump and Musk, it seems he may have achieved that goal.


