Join us for an insightful presentation by Pedram Hayati from SecDim as he delves into the intriguing findings from an AI wargame experiment designed to test the resilience of Large Language Models (LLMs) against prompt injection attacks. This experiment, structured as an Attack & Defence wargame, required participants to secure their LLM applications while attempting to breach the defences of others.
Description:
LLMs, despite their advanced capabilities, often struggle to maintain consistent adherence to instructions when confronted with prompt injection attacks. This vulnerability poses a significant challenge to their reliable deployment across various applications. To better understand and improve the defensive measures available for AI applications, we conducted a novel online wargame.
In this experiment, participants were tasked with protecting their AI applications from leaking secret phrases, while simultaneously attempting to extract these phrases from others’ apps. A successful breach resulted in the compromised app being temporarily removed from the game, allowing the player to reconfigure their defences before rejoining the competition.
Our findings revealed that every application was eventually compromised, underscoring the vulnerabilities within the defensive strategies employed. These results support our hypothesis that creating a secure LLM application is still a formidable challenge, largely due to the complexities of the underlying prompt injection mechanisms.
During this presentation, Pedram will share the outcomes of this wargame experiment, offering valuable insights into the ongoing challenges and considerations in securing LLM-based applications. Don’t miss this opportunity to learn from a pioneering study in the field!
We look forward to seeing you there!
- Conference: OWASP New Zealand 2024
- Title: Jailbreaking and Securing LLM Apps: Lessons from an Online Wargame Experiment
- Presenter: Pedram Hayati - SecDim
- Track: Two
- Date & Time: Friday, 6 September 2024, 11:30