Join Hamza as he delves into the crazy data from our AI wargame experiment where we tested the resilience of Large Language Models (LLMs) against prompt injection attacks. This experiment, structured as an Attack & Defence wargame, required participants to secure their LLM applications while attempting to breach the defences of others.
Description:
LLMs, despite their advanced capabilities, often struggle to maintain consistent adherence to instructions when confronted with prompt injection attacks. This vulnerability poses a significant challenge to their reliable deployment across various applications. To better understand and improve the defensive measures available for AI applications, we conducted a novel online wargame.
In this experiment, participants were tasked with protecting their AI applications from leaking secret phrases, while simultaneously attempting to extract these phrases from others’ apps. A successful breach resulted in the compromised app being temporarily removed from the game, allowing the player to reconfigure their defences before rejoining the competition.
Our findings revealed that every application was eventually compromised, underscoring the vulnerabilities within the defensive strategies employed. These results support our hypothesis that creating a secure LLM application is still a formidable challenge, largely due to the complexities of the underlying prompt injection mechanisms.
During this presentation, Hamza will share the outcomes of this wargame experiment, offering valuable insights into the ongoing challenges and considerations in securing LLM-based applications.
We look forward to seeing you there!
- Conference: DDD Perth 2024
- Title: Jailbreaking and Protecting LLM Apps: A Public Wargame Experiment
- Presenter: Muhmmad Hamza Ali - SecDim
- Date & Time: 12:10 PM - 12:30 PM, Sat 16th November, 2024, Room 5 - Black Swan, Optus Stadium, Perth