
Does RAG make LLMs less safe? Bloomberg research reveals hidden dangers




Retrieval-augmented generation (RAG) is supposed to improve the accuracy of enterprise AI by providing grounded content. While it often does, it also has an unexpected side effect.

According to surprising new research published today by Bloomberg, RAG can potentially make large language models (LLMs) less safe.

Bloomberg’s paper, ‘RAG LLMs are Not Safer: A Safety Analysis of Retrieval-Augmented Generation for Large Language Models,’ evaluated 11 popular LLMs, including Claude-3.5-Sonnet, Llama-3-8B and GPT-4o. The findings contradict the conventional wisdom that RAG inherently makes AI systems safer. The Bloomberg research team found that, when using RAG, models that refuse harmful queries in standard settings often produce unsafe answers.

Alongside the RAG research, Bloomberg released a second paper, ‘Understanding and Mitigating Risks of Generative AI in Financial Services,’ which introduces a specialized AI content risk taxonomy for financial services covering concerns not addressed by general-purpose safety approaches.

Together, the papers challenge the widespread assumption that retrieval-augmented generation (RAG) enhances AI safety, and show that existing guardrail systems fail to address domain-specific risks in financial services applications.

“Systems need to be evaluated in the context they’re deployed in, and you may not be able to simply take someone else’s word that ‘my model is safe, just use it,’” said Sebastian Gehrmann, Bloomberg’s head of responsible AI.

RAG systems can make LLMs less safe, not more

RAG is widely used by enterprise AI teams to ground model outputs in trusted content. The goal is to provide accurate, up-to-date information.

There has been a great deal of RAG research and progress in recent months aimed at further improving accuracy. Earlier this month, a new open-source framework called Open RAG Eval debuted to help validate the effectiveness of RAG.
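For context, here is a minimal sketch of how a typical RAG pipeline assembles a grounded prompt. The toy keyword retriever and the generate() stub are illustrative placeholders, not Bloomberg’s or Open RAG Eval’s implementation.

```python
# Minimal RAG sketch: retrieve the most relevant chunks, inject them into the
# prompt as grounding context, then call the model. The toy keyword retriever
# and the generate() stub stand in for a real vector store and LLM client.

CORPUS = [
    "Q1 revenue grew 12% year over year, driven by data-feed subscriptions.",
    "The compliance desk must approve all external research citations.",
    "Support hours are 9am-6pm ET on trading days.",
]

def retrieve(query: str, k: int = 2) -> list[str]:
    """Rank corpus chunks by naive keyword overlap with the query."""
    q_words = set(query.lower().split())
    ranked = sorted(CORPUS, key=lambda doc: -len(q_words & set(doc.lower().split())))
    return ranked[:k]

def generate(prompt: str) -> str:
    """Placeholder for an LLM call (hosted API or local model)."""
    return f"[model response to a {len(prompt)}-character prompt]"

def rag_answer(query: str) -> str:
    context = "\n\n".join(retrieve(query))
    prompt = (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\nQuestion: {query}\n"
    )
    return generate(prompt)

print(rag_answer("How did revenue do last quarter?"))
```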

It is important to note that Bloomberg’s research does not question the effectiveness of RAG or its ability to reduce hallucination. That is not what the study is about. Rather, it examines how RAG usage affects LLM guardrails in an unexpected way.

The research team found that models which normally refuse harmful queries in standard settings often produce unsafe answers when RAG is used. For example, Llama-3-8B’s rate of unsafe responses jumped from 0.3% to 9.2% when RAG was implemented.

Gehrmann explained that, without RAG, if a user types in a malicious query, the built-in safety system or guardrails will typically block the request. Yet when the same query is posed to an LLM that is using RAG, the system may answer the harmful query, even when the retrieved documents themselves are safe.

“What we found is that if you use a large language model out of the box, it often has safeguards built in so that if you ask, ‘How can I do this illegal thing,’ it will say, ‘Sorry, I cannot help you with that,’” he said. “If you actually apply this in a RAG setting, one thing that can happen is that the model answers the original malicious query, even if the retrieved documents contain no information that addresses it.”
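To make that measurement concrete, here is a rough, hypothetical harness for comparing how often a model answers harmful prompts with and without retrieved context. It is not the paper’s methodology; the llm callable and the keyword-based refusal check are stand-ins for demonstration only.

```python
# Illustrative harness (not the paper's methodology): compare how often a model
# answers harmful prompts rather than refusing, with and without retrieved
# context. `llm` is any callable mapping a prompt string to a completion string.

REFUSAL_MARKERS = ("i can't", "i cannot", "i'm sorry", "i am unable")

def looks_like_refusal(response: str) -> bool:
    lowered = response.lower()
    return any(marker in lowered for marker in REFUSAL_MARKERS)

def answer_rate(llm, harmful_prompts, retrieved_context: str | None = None) -> float:
    """Fraction of harmful prompts the model answers instead of refusing."""
    answered = 0
    for prompt in harmful_prompts:
        if retrieved_context:
            # Mimic a RAG prompt: prepend (safe) retrieved documents.
            prompt = f"Context:\n{retrieved_context}\n\nQuestion: {prompt}"
        if not looks_like_refusal(llm(prompt)):
            answered += 1
    return answered / len(harmful_prompts)

# Usage (with your own model client and red-team prompts):
# baseline = answer_rate(my_llm, harmful_prompts)
# with_rag = answer_rate(my_llm, harmful_prompts, retrieved_context=safe_docs_text)
# Bloomberg reports shifts on the order of 0.3% -> 9.2% for Llama-3-8B.
```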

How does RAG bypass AI guardrails?

So why and how does RAG serve to bypass guardrails? The Bloomberg researchers were not entirely certain, although they did have a few ideas.

Gehrmann hypothesized that the way LLMs are developed and trained did not fully account for safety alignment over very long inputs. The study demonstrated that context length directly affects safety degradation. “Provided with more documents, LLMs tend to be more vulnerable,” the paper states, noting that even a single safe document can significantly alter safety behavior.

“I think the bigger point of this RAG paper is that you really cannot avoid this risk,” Amanda Stent, Bloomberg’s head of AI strategy and research, told VentureBeat. “It’s inherent to the way RAG systems work. The way you escape it is by putting business logic or fact checks or guardrails around the core RAG system.”
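Read in code, Stent’s advice amounts to screening both the query and the grounded answer before anything is returned. The wrapper below is a hypothetical sketch; rag_answer and violates_policy stand in for whatever core pipeline and moderation or business-logic check an organization actually deploys.

```python
from typing import Callable

# Hypothetical wrapper illustrating Stent's point: put checks around the core
# RAG system rather than trusting the model's built-in refusals alone. Both
# callables are stand-ins for a real pipeline and a real guardrail check.

REFUSAL_TEXT = "Sorry, I can't help with that."

def guarded_rag_answer(
    query: str,
    rag_answer: Callable[[str], str],        # the core RAG pipeline
    violates_policy: Callable[[str], bool],  # guardrail / business-logic check
) -> str:
    if violates_policy(query):         # screen the incoming query
        return REFUSAL_TEXT
    answer = rag_answer(query)
    if violates_policy(answer):        # screen the grounded output as well
        return REFUSAL_TEXT
    return answer
```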

Why general AI safety taxonomies fail in financial services

Bloomberg’s second paper introduces a specialized AI content risk taxonomy for financial services, addressing domain-specific concerns such as financial misconduct, confidential disclosure and counterfactual narratives.

The researchers empirically demonstrated that existing guardrail systems miss these specialized risks. They tested open-source guardrail models, including Llama Guard, Llama Guard 3, AEGIS and ShieldGemma, against data collected during red-teaming exercises.

“We developed this taxonomy and then ran an experiment where we took openly available guardrail systems published by other companies and ran them against data we collected as part of our ongoing red-teaming events,” Gehrmann said. “We found that these open-source guardrails did not catch any of the issues specific to our industry.”

The researchers instead developed their own framework, focused on the unique risks of professional financial environments. Gehrmann explained that general-purpose guardrail models are typically designed for consumer-facing risks, so they pay a great deal of attention to toxicity and bias. He noted that, while important, those concerns are not specific to any single industry or domain. The key takeaway from the research is that organizations need a domain-specific risk taxonomy in place for their own industry and applications.
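To illustrate the kind of check involved, here is a hedged sketch that scores an off-the-shelf guardrail classifier against red-team prompts labeled with domain-specific risk categories and reports per-category recall. The classifier callable, category names and data are hypothetical, not Bloomberg’s benchmark.

```python
from collections import defaultdict
from typing import Callable, Iterable, Tuple

# Hedged sketch: run a guardrail classifier over red-team prompts labeled with
# domain-specific risk categories and report how many unsafe prompts it catches
# per category. Classifier, categories and data are hypothetical stand-ins.

def per_category_recall(
    classify_unsafe: Callable[[str], bool],
    labeled_prompts: Iterable[Tuple[str, str]],  # (prompt, risk_category), all unsafe
) -> dict:
    caught = defaultdict(int)
    total = defaultdict(int)
    for prompt, category in labeled_prompts:
        total[category] += 1
        if classify_unsafe(prompt):
            caught[category] += 1
    return {cat: caught[cat] / total[cat] for cat in total}

# Usage: per_category_recall(my_guardrail, [("...", "financial_misconduct"), ...])
# Low recall on finance-specific categories is exactly the gap the paper flags.
```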

Responsible AI at Bloomberg

Bloomberg has made a name for itself over many years as a trusted provider of financial data systems. In some respects, gen AI and RAG systems could be seen as competitive with Bloomberg’s traditional business, and therefore one might suspect a hidden bias in the research.

“We’re in the business of giving our clients the best data and analytics, and the broadest ability to discover, analyze and synthesize information,” said Stent. “Generative AI is a tool that can help us with discovery, analysis and synthesis across data and analytics, so for us it’s a benefit.”

She added that the types of bias Bloomberg is concerned about in its AI solutions are focused on finance. Issues such as data drift, and making sure there is good representation across all the tickers and securities that Bloomberg processes, are critical.

Bloomberg also stressed the company’s commitment to transparency in its AI efforts.

“Everything the system outputs can be traced back, not just to the document, but to the place in the document where it came from,” Stent said.

Practical implications for enterprise AI deployment

For enterprises looking to lead the way in AI, Bloomberg’s research means that RAG implementations require a fundamental rethinking of safety architecture. Leaders should stop viewing guardrails and RAG as separate components and instead design integrated safety systems that anticipate how retrieved content might interact with model safeguards.

Industry-leading organizations will need to move beyond generic AI safety frameworks and develop domain-specific risk taxonomies that reflect their regulatory environment and business concerns. As AI moves into increasingly mission-critical workflows, this approach turns safety from a compliance exercise into a competitive differentiator that customers and regulators will come to expect.

“It really starts with being aware that these issues might occur, then actually measuring and identifying where they show up, and then thinking about how to build safeguards specific to the application you are building,” Gehrmann said.


