Hijacking AI Memory: Inside Johann Rehberger's ChatGPT Security Breakthrough

Written by Strike Graph Team | Apr 1, 2025 5:43:48 PM

In this eye-opening episode of SecureTalk, host Justin Beals interviews Johann Rehberger, a seasoned cybersecurity expert and Red Team Director at Electronic Arts, about his groundbreaking discovery of a critical vulnerability in ChatGPT's memory system.

Johann shares how his security background and curiosity about AI led him to uncover the "SPAIWARE" attack - a persistent malicious instruction that can be injected into ChatGPT's long-term memory, potentially leading to data exfiltration and other security risks.

Key Topics Covered

Johann's journey from Microsoft development consultant to becoming a leading red team expert specializing in AI security
The discovery of ChatGPT's memory system vulnerability and how it could be exploited
How traditional security concepts like the CIA security triad (Confidentiality, Integrity, Availability) apply to AI systems
The development of "SPAIWARE" - a persistent prompt injection attack that can leak user data
Command and control infrastructure using prompt injection techniques
The challenges of securing agentic AI systems that can control web browsers and execute tasks
The evolving relationship between security researchers and AI companies like OpenAI

Notable Quotes

"I think using this system is just so important because it can help you. They are so powerful. I started using it daily. But the security mindset of course too, because I use it for my productivity, but I always use it for trying to find the flaws and trying to understand how it works." - Johann Rehberger

"What I did basically was use that technique and then insert that instruction in memory. So that whenever there's a conversation turn, the user has a question, ChatGPT responds. Every single conversation turn will be sent to the third-party server. So this is where the word spyware basically kind of came from." - Johann Rehberger

"The better the models become, the better they follow instructions, including attacker instructions." - Johann Rehberger

About Johann Rehberger

Johann Rehberger is the Red Team Director at Electronic Arts with extensive experience in cybersecurity. His career includes roles at Microsoft, where he led the Red Team for Azure Data, and Uber, where he served as Red Team Lead. Johann is known for his pioneering work in AI security, specifically identifying and responsibly disclosing vulnerabilities in large language models like ChatGPT.

Resources Mentioned

Johann's blog on machine learning security (https://embracethered.com/blog/index.html)
Black Hat Europe presentation on ChatGPT security vulnerabilities
OWASP Top 10 for LLM Applications vulnerability classifications

Connect With Us

Follow SecureTalk for more insights on cybersecurity trends and emerging threats.

#AISecurityRisks #PromptInjection #ChatGPT #CybersecurityPodcast #AIVulnerabilities #RedTeaming #SecureTalk
