While many existing risks and controls can apply to generative AI, the groundbreaking technology has many nuances that require new tactics, as well.
Models are susceptible to hallucinations, or the production of inaccurate content. Other risks include the leaking of sensitive data via a model’s output, tainting of models that can allow for prompt manipulation and biases as a consequence of poor training data selection or insufficiently well-controlled fine-tuning and training.
Ultimately, conventional cyber detection and response needs to be expanded to monitor for AI abuses — and AI should conversely be used for defensive advantage, said Phil Venables, CISO of Google Cloud.
“The secure, safe and trusted use of AI encompasses a set of techniques that many teams have not historically brought together,” Venables noted in a virtual session at the recent Cloud Security Alliance Global AI Symposium.
Lessons learned at Google Cloud
Venables argued for the importance of delivering controls and common frameworks so that every AI instance or deployment does not start all over again from scratch.
“Remember that the problem is an end-to-end business process or mission objective, not just a technical problem in the environment,” he said.
Nearly everyone by now is familiar with many of the risks associated with the potential abuse of training data and fine-tuned data. “Mitigating the risks of data poisoning is vital, as is ensuring the appropriateness of the data for other risks,” said Venables.
Importantly, enterprises should ensure that data used for training and tuning is sanitized and protected and that the lineage or provenance of that data is maintained with “strong integrity.”
“Now, obviously, you can’t just wish this were true,” Venables acknowledged. “You have to actually do the work to curate and track the use of data.”
This requires implementing specific controls and tools with security built in that act together to deliver model training, fine-tuning and testing. This is particularly important to assure that models are not tampered with, either in the software, the weights or any of their other parameters, Venables noted.
“If we don’t take care of this, we expose ourselves to multiple different flavors of backdoor risks that can compromise the security and safety of the deployed business or mission process,” he said.
Filtering to fight against prompt injection
Another big issue is model abuse from outsiders. Models may be tainted through training data or other parameters that get them to behave against broader controls, said Venables. This could include adversarial tactics such as prompt manipulation and subversion.
Venables pointed out that there are plenty of examples of people manipulating prompts both directly and indirectly to cause unintended outcomes in the face of “naively defended, or flat-out unprotected models.”
This could be text embedded in images or other inputs in single or multimodal models, with problematic prompts “perturbing the output.”
“Much of the headline-grabbing attention is triggering on unsafe content generation, some of this can be quite amusing,” said Venables.
It’s important to ensure that inputs are filtered for a range of trust, safety and security goals, he said. This should include “pervasive logging” and observability, as well as strong access control controls that are maintained on models, code, data and test data, as well.
“The test data can influence model behavior in interesting and potentially risky ways,” said Venables.
Controlling the output, as well
Users getting models to misbehave is indicative of the need to manage not just the input, but the output, as well, Venables pointed out. Enterprises can create filters and outbound controls — or “circuit breakers” —around how a model can manipulate data, or actuate physical processes.
“It’s not just adversarial-driven behavior, but also accidental model behavior,” said Venables.
Organizations should monitor for and address software vulnerabilities in the supporting infrastructure itself, Venables advised. End-to-end platforms can control the data and the software lifecycle and help manage the operational risk of AI integration into business and mission-critical processes and applications.
“Ultimately here it’s about mitigating the operational risks of the actions of the model’s output, in essence, to control the agent behavior, to provide defensive depth of unintended actions,” said Venables.
He recommended sandboxing and enforcing the least privilege for all AI applications. Models should be governed and protected and lightly shielded through independent monitoring API filters or constructs to validate and regulate behavior. Applications should also be run in lockdown loads and enterprises need to focus on observability and logging actions.
In the end, “it’s all about sanitizing, protecting, governing your training, tuning and test data. It’s about enforcing strong access controls on the models, the data, the software and the deployed infrastructure. It’s about filtering inputs and outputs to and from those models, then finally making sure you’re sandboxing more use and applications in some risk and control framework that provides defense in depth.”
The post Google Cloud’s security chief warns: Cyber defenses must evolve to counter AI abuses appeared first on Venture Beat.