May 27, 2024
LLMs prone to data poisoning and prompt injection risks, UK authority warns



The UK’s National Cyber Security Centre (NCSC) is warning organisations to be wary of the imminent cyber risks associated with the integration of Large Language Models (LLMs) — such as ChatGPT — into their business, products, or services.

In a set of blog posts, the NCSC emphasised that the global tech community doesn’t yet fully grasp LLMs’ capabilities, weaknesses, and (most importantly) vulnerabilities. “You could say our understanding of LLMs is still ‘in beta’,’’ the authority said.

One of the most extensively reported security weaknesses of existing LLMs is their susceptibility to malicious “prompt injection” attacks. These occur when a user creates an input aimed at causing the AI model to behave in an unintended way — such as generating offensive content or disclosing confidential information.

In addition, the data LLMs are trained on poses a twofold risk. Firstly a vast amount of this data is collected from the open internet, meaning it can include content that’s inaccurate, controversial, or biased.

Secondly, cyber criminals can not only distort the data available for malicious practices (also known as “data poisoning”), but also use it to conceal prompt injection attacks. This way, for example, a bank’s AI-assistant for account holders can be tricked into transferring money to the attackers.

“The emergence of LLMs is undoubtedly a very exciting time in technology – and a lot of people and organisations (including the NCSC) want to explore and benefit from it,” said the authority.

“However, organisations building services that use LLMs need to be careful, in the same way they would be if they were using a product or code library that was in beta,” the NCSC added. That is, with caution.

The UK authority is urging organisations to establish cybersecurity principles and ensure that even the “worst case scenario” of whatever their LLM-powered applications are permitted to do is something they can deal with.



Source link