Join me as I delve into the intriguing data from our AI wargame experiment where we tested the resilience of Large Language Models (LLMs) against prompt injection attacks. This experiment, structured as an Attack & Defence wargame, required participants to secure their LLM applications while attempting to breach the defences of others.
Description:
LLMs, despite their advanced capabilities, often struggle to maintain consistent adherence to instructions when confronted with prompt injection attacks. This vulnerability poses a significant challenge to their reliable deployment across various applications. To better understand and improve the defensive measures available for AI applications, we conducted a novel online wargame.
In this experiment, participants were tasked with protecting their AI applications from leaking secret phrases, while simultaneously attempting to extract these phrases from others’ apps. A successful breach resulted in the compromised app being temporarily removed from the game, allowing the player to reconfigure their defences before rejoining the competition.
Our findings revealed that every application was eventually compromised, underscoring the vulnerabilities within the defensive strategies employed. These results support our hypothesis that creating a secure LLM application is still a formidable challenge, largely due to the complexities of the underlying prompt injection mechanisms.
During this presentation, Pedram will share the outcomes of this wargame experiment, offering valuable insights into the ongoing challenges and considerations in securing LLM-based applications. Don’t miss this opportunity to learn from a pioneering study in the field!
We look forward to seeing you there!
- Conference: OWASP AppSec Day Singapore
- Title: LLM Security is Broken: Data Collection From An AI Wargame
- Presenter: Pedram Hayati - SecDim
- Date & Time: Wednesday October 2, 2024 3:45pm - 4:25pm GMT+08