Redefining CyberSecurity

Safeguarding Against Malicious Use of Large Language Models: A Review of the OWASP Top 10 for LLMs | A Conversation with Jason Haddix | Redefining CyberSecurity with Sean Martin

Episode Summary

On this episode of Redefining CyberSecurity, we explore the OWASP Top 10 for large language models (LLMs) to uncover and discuss the potential risks and vulnerabilities associated with using LLMs like GPT-4, from data leakage and unauthorized access to the over-reliance on machine-generated content and the need for continued research and attention to these risks.

Episode Notes

Guest: Jason Haddix, CISO and Hacker in Charge at BuddoBot Inc [@BuddoBot]

On LinkedIn | https://www.linkedin.com/in/jhaddix/

On Twitter | https://twitter.com/Jhaddix

____________________________

Host: Sean Martin, Co-Founder at ITSPmagazine [@ITSPmagazine] and Host of Redefining CyberSecurity Podcast [@RedefiningCyber]

On ITSPmagazine | https://www.itspmagazine.com/itspmagazine-podcast-radio-hosts/sean-martin
____________________________

This Episode’s Sponsors

Imperva | https://itspm.ag/imperva277117988

Pentera | https://itspm.ag/penteri67a

___________________________

Episode Notes

In this Redefining CyberSecurity Podcast, we provide an in-depth exploration of the potential implications of large language models (LLMs) and artificial intelligence in the cybersecurity landscape. Jason Haddix, a renowned expert in offensive security, shares his perspective on the evolving risks and opportunities that these new technologies bring to businesses and individuals alike. Sean and Jason explore the potential risks of using LLMs:

🚀 Prompt Injections
💧 Data Leakage
🏖️ Inadequate Sandboxing
📜 Unauthorized Code Execution
🌐 SSRF Vulnerabilities
⚖️ Overreliance on LLM-generated Content
🧭 Inadequate AI Alignment
🚫 Insufficient Access Controls
⚠️ Improper Error Handling
💀 Training Data Poisoning

From the standpoint of offensive security, Haddix emphasizes the potential for LLMs to create an entirely new world of capabilities, even for non-expert users. He envisages a near future where AI, trained on diverse datasets like OCR and image recognition data, can answer private queries about individuals based on their public social media activity. This potential, however, isn't limited to individuals - businesses are equally at risk.

According to Haddix, businesses worldwide are rushing to leverage proprietary data they've collected in order to generate profits. They envision using LLMs, such as GPT, to ask intelligent questions of their data that could inform decisions and fuel growth. This has given rise to the development of numerous APIs, many of which are integrated with LLMs to produce their output.

However, Haddix warns of the vulnerabilities this widespread use of LLMs might present. With each integration and layer of connectivity, opportunities for prompt injection attacks increase, with attackers aiming to exploit these interfaces to steal data. He also points out that the very data a company uses to train its LLM might be subject to theft, with hackers potentially able to smuggle out sensitive data through natural language interactions.

Another concern Haddix raises is the interconnected nature of these systems, as companies link their LLMs to applications like Slack and Salesforce. The connections intended for data ingestion or query could also be exploited for nefarious ends. Data leakage, a potential issue when implementing LLMs, opens multiple avenues for attacks.

Sean Martin, the podcast's host, echoes Haddix's concerns, imagining scenarios where private data could be leveraged and manipulated. He notes that even benign-seeming interactions, such as conversing with a bot on a site like Etsy about jacket preferences, could potentially expose a wealth of private data.

Haddix also warns of the potential to game these systems, using the Etsy example to illustrate potential data extraction, including earnings of sellers or even their private location information. He likens the data leakage possibilities in the world of LLMs to the potential dangers of SQL injection in the web world. In conclusion, Haddix emphasizes the need to understand and safeguard against these risks, lest organizations inadvertently expose themselves to attack via their own LLMs.

All OWASP Top 10 items are reviewed, along with a few other valuable resources (listed below).

We hope you enjoy this conversation!

____________________________

Watch this and other videos on ITSPmagazine's YouTube Channel

Redefining CyberSecurity Podcast with Sean Martin, CISSP playlist:

📺 https://www.youtube.com/playlist?list=PLnYu0psdcllS9aVGdiakVss9u7xgYDKYq

ITSPmagazine YouTube Channel:

📺 https://www.youtube.com/@itspmagazine

Be sure to share and subscribe!

____________________________

Resources

The inspiring Tweet: https://twitter.com/Jhaddix/status/1661477215194816513

Announcing the OWASP Top 10 for Large Language Models (AI) Project (Steve Wilson): https://www.linkedin.com/pulse/announcing-owasp-top-10-large-language-models-ai-project-steve-wilson/

OWASP Top 10 List for Large Language Models Descriptions: https://owasp.org/www-project-top-10-for-large-language-model-applications/descriptions/

Daniel Miessler Blog: The AI attack Surface Map 1.0: https://danielmiessler.com/p/the-ai-attack-surface-map-v1-0/

PODCAST: Navigating the AI Security Frontier: Balancing Innovation and Cybersecurity | ITSPmagazine Event Coverage: RSAC 2023 San Francisco, USA | A Conversation about AI security and MITRE Atlas with Dr. Christina Liaghati: https://itsprad.io/redefining-cybersecurity-163

Learn more about MITRE Atlas: https://atlas.mitre.org/

MITRE Atlas on Slack (invitation): https://join.slack.com/t/mitreatlas/shared_invite/zt-10i6ka9xw-~dc70mXWrlbN9dfFNKyyzQ

Gandalf AI Playground: https://gandalf.lakera.ai/

____________________________

To see and hear more Redefining CyberSecurity content on ITSPmagazine, visit:

https://www.itspmagazine.com/redefining-cybersecurity-podcast

Are you interested in sponsoring an ITSPmagazine Channel?

👉 https://www.itspmagazine.com/sponsor-the-itspmagazine-podcast-network

Episode Transcription

Please note that this transcript was created using AI technology and may contain inaccuracies or deviations from the original audio file. The transcript is provided for informational purposes only and should not be relied upon as a substitute for the original recording as errors may exist. At this time we provide it “as it is” and we hope it can be useful for our audience.

_________________________________________

voiceover00:15

Welcome to the intersection of technology, cybersecurity and society. Welcome to ITSPmagazine You're listening to a new redefining Security Podcast? Have you ever thought that we are selling cybersecurity insincerely buying it indiscriminately and deploying it ineffectively? Perhaps we are. So let's look at how we can organize a successful InfoSec program that integrates people process technology and culture to drive growth and protect business value. Knowledge is power. Now, more than ever.

sponsor message00:53

Imperva is the cybersecurity leader whose mission is to protect data and all paths to it with a suite of integrated application and data security solutions. Learn more@imperva.com

voiceover01:10

and Tara, the leader in automation, security validation allows organizations to continuously test the integrity of all cybersecurity layers by emulating real world attacks at scale to pinpoint the exploitable vulnerabilities and prioritize remediation towards business impact. Learn more at pin terra.io.

Sean Martin 01:39

And hello, everybody, you're very welcome to new episode of redefining cyber security here on ITSPmagazine. I am joined by the one and only it's the one and only I know Jason Haddix. Or Hey, Jason, how about yourself? Doing great man in grades. I'm thrilled to have this conversation with you. I saw you in San Francisco a while back. And then I saw one of your posts on on Twitter, I think it was around LLM exploitation and the N OWASP. Projects, the infamous top 10 One of those projects that they put together. Yep. To which let's be real, the those those projects are the top 10 projects anyway, are for educational purposes, not necessarily a checklist. Do these 10 and you're good. But a way for people to understand and and learn a bit. So we're going to talk through that and some other other projects and tools you're aware of and, and the main purpose of my shows to kind of bring it all back to the business. How do we how do we help companies operationalize the stuff that we we teach them today? Yeah. So let's dig in. Of course, you and I know each other, you've been on the show before. You've been on other host shows they've been on your show. So you've been around a lot of people know you, but not everybody I presume. So let's, let's give give folks a taste of who Jason is what you're up to.

Jason Haddix03:08

Sir. My name is Jason Haddix. I'm the Chief Information Security Officer at a company called butter about right now previously, I was the the seaso for a Ubisoft, the video game company make Assassin's Creed and, and some games in the Rainbow Six series and other stuff. So I have a lot of security leader experience. But before that for about 15 years, I was offensive security. So red teaming, bug bounty, web application testing, you name it, I hacked it. And so I was really ingrained in the offensive security scene I still am today, I still do bug bounties in my free time. And my new company about about is, is a red team as a service company. So we do red teaming. So very plugged into attacking technologies and you know, one of the newer technologies that that is here to stay as MLMs and ML and AI and you know, so you know we were we were chatting through email and talking about you know, well let's let's do a riff on that. Because it's it's going to be a thing that businesses want to know, you know more about, and definitely one of my interest areas. I have a couple of discord groups I'm part of that are really smart people on the scene, and we just talk all day about about this kind of stuff. So I thought it'd be fun and come on here with you.

Sean Martin 04:18

Yeah, I love it. I'm excited for this. Yeah, cuz I think I think a lot of certainly mainstream media is kind of the big picture philosophically is it's going to take over the world, and we're all doomed, right? Yeah. And I see less and less as you get deeper into what does it do and where does it do it? And how does it do it? And who's doing what it does? Little, a little less. A little less visibility in those areas. And, and thank goodness for the OWASP the Open Web Application Security Project Group, they have tons of projects running the there's the OWASP Top 10 which is the main the main ones They're probably known for globally, which helps companies understand weaknesses in the software they're building. And now it's often exploited they update that update that I think every every three years if I'm not mistaken. Yeah. So there's this new projects around MLMs. And there's a top 10. I'm sure there are more areas, but we're going to spend a few moments kind of walking through that, then. Do you want to maybe set the stage and tell people? I don't know, is there something we need to cover first, before we get into the 10? Like, where you're where you're seeing this being used? I can share our experiences, but I'd like to hear from you.

Jason Haddix05:40

Well, yeah, I mean, I mean, the whole landscape is pretty new. Right? And I mean, the first thing I like to tell people who are getting into research on security for MLMs, or ml, or AI is that it can seem very daunting at first, because of just the topic names, right? Like, you think it's very complex, but at the end of the day, it's, it's mostly like, it feels mostly like data science, which is not hard to grasp, right? So so I've known many people who are like, I'm not like smart enough to, like research these things or understand, you know, how to attack them, or, you know, whatever. And so like, or, you know, like, if you're in the security industry doing that type of research, and it's absolutely not the truth, right, if I can do it, anybody can do it. Yeah, so I mean, first of all, I mean, just diving into some of the projects we're going to talk about today. You know, one of the ones we're going to talk about today is the top 10. And like you said, you know, these lists are a mixture of vulnerabilities concern areas, it's not exactly, you know, it's not actually structured. You know, normally, like, there has been a long debate in OWASP projects at the top 10. Projects about like, this isn't a controls list, right? This isn't an audit list, right? Like, there are many more vulnerabilities outside of the scope of the top 10 lists of any type on OWASP. And there's, you know, there's many, many top 10s, there's API top 10, the web top 10, the IoT top 10, you name it, there's a top 10 It's just to expose kind of, you know, like Things You Can you should consider and it's a risk of it's a mix of risks and vulnerabilities. So the OWASP, top 10 One, you know, just to preface is an alpha draft right now. So it's just something that one person threw together who had a passion for this project. And that's how many OWASP projects start. It's one person's passion, and they're like, hey, it's time for this topic to get exposed to OWASP. And so they'll put it into, you know, an OWASP page, which is a wiki. And then they'll call the community for feedback, which is what happened this week when you saw my tweet. And so there's a lot of people invested. Now there's over, you know, 1000 people in the Slack channel for this project, trying to refine it, and figure out where we go with it. And you know, if some of these things should be on the list, so it's very the wild west right now. It's very fun to work in.

Sean Martin 07:47

I bet some of those conversations must be interesting. Yeah. And the by chance, and we can look it up later, but by chance, do you know who the passion project was?

Jason Haddix07:57

Yeah. So it's Steve Wilson. Is the guy who are who put it up there. Yeah, I believe he's from context security. Let's see here. contrast. Contrast? Yeah. Contrast here. Yeah.

Sean Martin 08:11

Okay. Yeah. Cool. Well, we'll, we'll give him a shout out in the show notes for doing that work and getting kicked off. As I was kind of going through the list, the few of them are obvious. Like the certainly the very first one. And the second one. Yeah. Then it gets a little more to some strict. Yeah, so yeah. Let's just go through them. So the first one is prompt injections. And for anybody who's opened up an interface, or I guess, mainly, there's an interface or I don't know, if this also covers API's, I guess it would cover API

Jason Haddix08:50

for passing the pumps into the API saying everything to an API. Yeah, for sure. Maybe that's even

Sean Martin 08:54

more challenging. Yeah. So tell us a little bit about this one. I mean,

Jason Haddix08:58

sure. Yeah. So So prompt injection in an LLM is the idea that many times the interface for these MLMs are chatbots, right, or chatbot front ends, you know, just chat screens are somehow an API takes some text or something like that. And the idea of of hacking and lots and the vehicle for many of these other vulnerabilities actually in the real world will be prompt injection. So that's why it's number one on the list. And so you know, when you're, when you're doing any of these other types of attacks, many times the way you execute the attack is by tricking the LLM, which usually has some base code and training to not do bad things by using natural language to do bad things. And so you know, like many of the CTFs that exist out there in the day right now for this type of research is like the LLM knows a secret password. And it has built in these all these protections for you to not get the secret password but can you craft the right words in order for it to tell you that the right password. So here's an example of, you know, one of the simple solutions to, you know, one of the CTS out there that's teaching people about prompt injection. Instead of asking, you know, the prompt for the secret password for which it will tell you, No, I'm not authorized to give that to you, you ask it for the first letter of the secret password, and the second letter and the third letter and the fourth letter and the fifth letter until you exhaust it. And you have. And so this is a creative way around, you know, the protections that the prompt has built in. And there's many incarnations of prompt injection, but it's basically manipulating the LM with natural language. And trying to get data out of it that, you know, it's not supposed to give you are trying to make it execute some technical attack. And so, really, in the real world, it will be the mechanism for many hacks of these things, because people will train all this other stuff, and we'll talk about they'll train these models off of their own data, but the way to get it out will be natural language. And so it's really interesting. From a hacker point of view, from an offensive security point of view, I'm used to typing in special characters into you know, fields and exploiting things with bits and bytes. And now I have to use words in the English language to do some of this. And it's, it's fascinating to, to do this, it's, and it also gives people who have never hacked anything before and opportunity to like hack some stuff like us just, you know, like, can you like that creative example? of, you know, like getting the password out? You know, that that came from someone who wasn't a security person who's just like a normal person who can think creatively. So yeah,

Sean Martin 11:33

super cool. Yeah. And other other things? Because for a system that's trained, I call it purposefully biased, right? Yeah, it's trained that you're basically feeding feeding it a set of constant or static prompts, right, that exists there that tell him what to do or not to do? Yeah. How visible is that stuff? And is that part of exposure in this particular area? Where if if it's told to not ever do something, and and yeah, the prompt can actually find a way around? Well, just in this one case, please do that. Because he's supposed to do and then which then, yeah, suggests that it shouldn't do that. Because it's bad, right? Yeah.

Jason Haddix12:16

No, absolutely. I mean, this is a whole sub community of a type of people. Right now we're doing this with Chuck GPT and open AI. Right. So it's, it can be called many different things that can be called jailbreaking. You know, it can be called prompt injection, you know, whatever, there's, you know, the industry hasn't really standardized on a name yet, right. But getting getting them to answer questions that they're not supposed to, or getting them to be biased or something like, that is definitely an area of concern and concerns, you know, number one on the list, prompt injections, I mean, there are levels of, you know, protections that the companies like open AI or the other model vendors, right, they they implement, right, and some of them are, are the training of the model itself. And then some of them are the base prompt, that is, you know, for virtually wrote written to, you know, stop any badness. And from, you know, those are very lengthy system prompts. And then, you know, there, there are people trying to come up with things that even protect the system prompts, like, from firewalls and like, so all this stuff that's coming out, but at the end of the day, if your data or your model has trained on the data, you should expect that someone will find a creative way to get it out, at least right now, there aren't great solutions to just protect all of this stuff. So that's why I said, some people are trying to, you know, find solutions where you can bolt something on top, like a firewall, or like, maybe an Oracle, like another AI that sits on top and filters out, you know, all incoming, you know, messages, you know, very similar to how we have in security today, we, you know, we put stuff in front of stuff. And, you know, we do input validation. And so people are trying to come up with these technologies as we speak, to protect, you know, because you want the training to be, you know, I thought training to be verbose and large in these models and to be able to answer the right questions, but you don't want them to answer the wrong questions or leak data from your system prompt or leak configuration data of other things. And so yeah, for sure, it's it's area of concern. And like I said, still the Wild West.

Sean Martin 14:12

Yeah, sounds like it. And I'm curious, so forgive me if I'm staying on this too long. But I was thinking, like things like memory keys. I don't know, how are those supposed to be privates? And other things that may be connected to something else down the list, but no worries. Yeah. Is a memory supposed to be hidden? Or is it supposed to be exposed? And can bad things happen either way?

Jason Haddix14:42

Yeah. I mean, it totally can't, right. So these these API's are, you know, the chat interfaces, the API's are all going to be running through cloud infrastructure right? And many of them are going to have you know, what, chat GPT and open AI called plugins, right. So you know, no, you don't We're all gonna have capabilities. And so every different model is going to be different, it's not going to exist in a vacuum. So as soon as you start adding other technologies into the mix, you know, like code interpretation, or you had the ability for, you know, the, the model to use the internet as a data source or you know, you integrate some other platform, it introduces more complexity than just the base model. And you can start attacking the technical infrastructure, you can start attacking, you know, if the model has been trained on private data, but it's only supposed to give you back an interpretation of that private data, you can, you know, filter out the data, you can try to steal the data, you can, you can attack pretty much all parts of it, because the model is connected to everything. And so what a lot of people are trying to do right now is figure out ways to jailbreak out of, you know, like, chat G PTS, you know, their code interpreter, basically. And so that under, you know, under the hood is a sandbox of, you know, of a computer, right, and, you know, it's interpreting code and, you know, limited Python, and can, you know, exploit that. And so a lot of us have, you know, at least gotten the basic system information out, we can get like, you know, et Cie password out or something like that. But, you know, no one's routed the jailbroken, you know, chat GPT code interpreter yet. So, but yeah, I mean, those attacks are definitely possible. And so you really have to be conscious of when you hook up your AI, ml LLM whatever to other things, you're introducing an attack surface for normal other types of attacks. And yes, for memory to is Yes, that's absolutely a thing to think about. Yeah. Memory key. Yeah. Yeah.

Sean Martin 16:34

Yeah. And I think we kind of bled over naturally, I suppose. Yeah. Into the into number two, which is the data leakage? Yeah. I guess it depends on what you're building it for. If if you're building it as a, as an enhanced search engine for stuff that's already public? Yeah. Unless you don't want that manipulated in sending back injecting stuff that you don't want to people? Yeah, maybe that's less less of a leakage part, but more of a manipulation and integrity piece. But what about what about integrity? And I know, I wanted to talk, talk about how this connects to businesses today, and how you how you see companies using, so maybe this is a good place to talk about that, because I think that's probably one of the biggest challenges organizations will have is using this stuff and not putting their crown jewels. Yeah, they don't.

Jason Haddix17:31

So, so the data leak, which one and the, and the integrity and the privacy conversation can all fit in this kind of like number two conversation. And so you're right training on the internet, you know, presumably, all the data on the internet and all these crawlers that are going out there to build this dataset. You know, should shouldn't be that bad. But it still has privacy implications. Just because something's on the internet doesn't mean that it was supposed to be or you know, or that it's okay for this particular bot, or API URL, and that you're talking to, to expose it to someone else who didn't have explicit permission to use it. So there are privacy concerns with, you know, being able to query the data or being hurt to query, you know, chat GPT to get your social or something, right. You didn't agree for that to be online. Right. But some aggregator had it and put it online, and you would have had cars with them. But now everybody can grab it through, you know, chat GPT. And so people have done stuff like this before asked for people's social address phone number, and in some cases, that the data is in there, and they can trick you know, something to give it up.

Sean Martin 18:34

Yep. Yeah. Yeah. And I mean, I mean, people share stuff on social media. financial transactions are a lot of them are public record. Yeah. Which is the homes things like that. And, yeah, individually, access to them on their own may or may not be a bad thing collectively. Now with the natural language, large and large language, need to ask a human driven question to say, I'm really interested in this can bring it all together with context and really, really put some stuff out there.

Jason Haddix19:09

Yeah. And in the world of offensive security, I think that, you know, this stuff will be crazy, because, you know, with with the ability to put together stuff, and when, you know, some of these LM is trained on OCR data, and like image recognition data, you'll be to ask, you'll be able to ask, like, you know, very private questions, like, Where does Shawn walk his dog based on all of his public Facebook photos? Right, and, you know, where is the studio for itsp? You know, like, based on, you know, all of this information we've seen on Twitter, and right now, you know, a skilled Osen person can do that, but they have to be a practitioner of the art, right, but soon the LLM will be able to put all that together. So it's very, it's very interesting. It's going to be a crazy new world. And then we'll also have like, disinformation, you know, and, you know, deep fakes and stuff like that, which will, you know, become a big thing in the next few years that took analogy, you know, powered by AI and alarms and will be crazy. So. But yeah, so that's, that's the public part. And there's a lot of concerns there. And but the private part is what you and I were talking about about businesses, right. And so what every business in the world right now is rushing to do is take all this proprietary data, that if they're lucky enough to have collected into a data lake centrally, and make more money off of it, that's what they're all thinking, like, sweet, I can use an LLM, I can, I can even use just GPT on top of my data set, and ask really intelligent questions to this data. But I've never been able to do before, and it's gonna make me money. And so all they see is these dollar signs, right? And they're like, Okay, the way I make money is by offering a service, well, I can, I can ask my own data set stuff to run my business, but they can also ask their data set stuff to provide a service to the public, which is powered by an LM. And so then they start building API's. And so at least, it's my opinion. And I, you know, I'm, I'm friends with Daniel Meester. And we talk about this stuff all the time. And it's our opinion that, you know, in a couple years, there will be 1000s, maybe hundreds of 1000s of API's, which have, you know, MLMs baked in to build their output, right. And so, basically, any type of API, you're interfacing now will be a natural language API, you know, in the future, and you'll be able to attack them with prompt injection and try to steal data from the company. And so there's layers of connectivity, when they build these API's that you're going to want to attack. And so Dan actually did a really excellent blog and graphic of what it will look like in the future is that, you know, the user will access this from their computer or their phone, you know, an API of some sort, either through a website, or a chat bot, or an API that's connected to a front end of another app. But behind the scenes after, you know, you pull that pull out on the veil is just an API that's forwarding things to an LLM and the LLM is pulling data from internally in the company. And so you don't want your internal whole data lake to, you know, exit out of your company. But you'll have a hard time in the future protecting all of that training data, because you trained your model on it. And you know, the LLM has access to it. And so how do you protect, you know, natural language from smuggling that out? And so yeah, that's a really interesting, really interesting problem. Not only do you have the fact that that your data that you internally trained the model on that you've connected and LM two might be subjected to theft. But you also, you know, these companies are also going to connect all of that to other apps like Slack, and Salesforce, and all these internal tools that we have, they're going to have to connect it to get live data ingestion from or to themselves, query it, and they're going to make all these integrations for this system. And now you can attack apps like that through natural language, like through your integration, can you send a Slack message to this person on the internal of Uber or just use a fictitious company or a, you know, random company, but so there are a lot of avenues for attacking different stuff, you know, in the data leakage section, when you look at how companies are going to implement MLMs, and how are they are going to go to market with LMS?

Sean Martin 23:11

Yeah, yeah. And I'm just picturing. I don't know how much how much private data is in this in this scenario, but something like an Etsy or something where instead of searching for a jacket, or lamp, or whatever it is, that's top of mind? Or, yeah, let's use a jacket. You're searching for Jack, they could say, what are you in the mood for? What are you doing? What do you try? And how do you feel? What's the environment like the you're going to be as warm as a cold? And, and have a conversation with that? See? Yeah. And, and that. Now, granted, they have that sort of data? That's presumably all public anyway. But going back to the top 10. If there's other data in there, yeah, like pricing or I mean, we talked about bots in terms of buying buying things that are big, there'll be collector's items to resell them for higher prices, buying drugs that are scraping prices to undercut competitors. Yeah, it's all this can be used nefariously. Of course,

Jason Haddix24:18

it can all be gamed. Yeah. And so, yeah, I mean, with the example of Etsy, it's, you know, you could pull up, you could attempt to pull out data about, you know, the earnings of the sellers, you know, you could attempt to pull out data around, you know, private information of where the seller lives, you know, that's not available on Etsy, you know, I mean, whatever that data lake would hold, you know, normally. So I think about it in terms of SQL injection, I feel like, you know, the data leakage I'm after, at the end of the day in this attack, is very similar to what I would be after, in the web world, when we were looking at SQL injection, right. So what would I look at? Well, I'd look for the user's username, password, I'd look for their private information. I look for credit card information I look for, you know, Anything that I consider BPI? That's what you're going to be doing that with, you know, natural language and prompt injection and try to leak out data like that. And then there's a second layer to how can I attack the organization? Through the LLR? Yeah.

Sean Martin 25:15

Well, it's clear, we could talk for hours on each of these. Let's go to the next one. I think it kind of maybe touches on, on the idea that there is sandboxing because the number three is inadequate. sandboxing. I guess there's an assumption there, thankfully, that people do sandboxes. Yeah. Yeah. And that kind of goes to my point. How do you see? Where does this stuff live? Yeah, environment and what's what's quote unquote, inadequate at that point?

Jason Haddix25:44

Yeah, I mean, that Sandbox has to be pretty stringent for most of these interfaces, to isolate, you know, kind of MLMs. But like I said, the the plugin architecture that, you know, that we're used to right now, that comes through chat, GPT. And other models, you know, other models for verbatim have some of these features built into them these integrations, it becomes harder and harder to sandbox, and integrate, you know, stuff like as you add complexity to a system, more vulnerabilities and more avenues for attack come up, this has always been the way security has worked. And so, you know, you can attempt to, you know, attack this little Docker machine that you're, you know, that your code interpreters running on and, you know, you might not be able to get much of that because it is well locked down for you know, it's just a Linux instance. Right. So, but you also will have other things that are connected to that, that can call to the internet and stuff. So. So a lot of this inadequate sandboxing is Linux hardening, and integration work, basically. And how well do you do that? Well, you know, open AI, does it really well, right now, no one's broken, that, as far as I know, currently, maybe it's come through their bug bounty, at some point, but but other models, you know, will come out, but are not open AI, and they will not have the security, you know, rigor and they will allow exploitation and pivoting through through the LM.

Sean Martin 27:09

Yeah, interesting. So, I mean, at this point, we've, we've talked about, kind of where to put it and how to kind of isolate it from stuff. The next two, at least I got, the next two are less about data manipulation, but manipulating the machine itself. So unauthorized code execution and server side request forgery, to me says, I'm not just gonna ask you for data and try to get data from you. But I'm, I'm going to tell you to do something differently. Yeah, to tell you to open up to that knowledge base that you shouldn't have access to, or I'm going to tell you to put malicious links in every response that you deliver back to the user. I'm making stuff up here. So talk to me about those two, I'm assuming they're, they're good to connect together.

Jason Haddix28:01

Yeah. So four and five are both very traditional in the security world, right. So the first four is unauthorized code execution, and five is server side request forgery. So on other as code execution is just instead of attacking the data, I'm going to attack the machine that the chat bots on, you know, which is, you know, some machine in the cloud and try to get it to x, you know, exploit, you know, some Python code that routes itself and now I have full control over that machine, that box, basically. And so if I can control that box, now I control everybody's session on that box, and I can attack everyone else. Or I can, I can do all kinds of nefarious stuff. If I control the underlying computer, that's, you know, the LM is connected to so Okay,

Sean Martin 28:42

so this is behind them. The machine does, yeah, system hosts the system system hosting. Got it. Okay. Yeah, yeah, I was a little more science fiction in my No, no.

Jason Haddix28:52

I mean, what you said wasn't wrong. And, you know, obviously, all of this is open to interpretation. So, yeah, I mean, there's a lot of connected pieces. But this is this is my interpretation of this as attacking the box that it's running on. for that. And then, you know, SSRF server side request request, forgery is a web vulnerability that we're very used to have where, normally, a piece of a web application or API is meant to forward you somewhere or grab a file from somewhere, whatever. And instead of asking it to do what it's normally meant to do, you know, retrieve one file, retrieve something from an API, you ask it to request a different resource. And so most often SSRS vulnerabilities in the real world, you look at a web application, and it's supposed to parse some link, it's supposed to take like a URL and do something with it. And that's legitimate uses of the application. But instead, you ask it to look up. Its IP for the AWS, internal metadata secret key, and that's the quintessential example. And once you get that key, you can log into their Amazon basically, or their, their AWS, basically. And so this idea is using the URL LM to request other sites. So you could force you know, the computer that's running the LLM to start reaching out verbosely to all these sites. So you could use it as a bot, you could ask it to, like we've talked about already request internal resources or API's or data stores that it has access to. So, you know, this, this type of, you know, pivoting vulnerability is, is very prevalent in the web world, and also now becomes prevalent, you know, with this thing that's attaches that's attached to both the internet and the internal through usually an API, but powered by an LM. Yeah.

Sean Martin 30:38

So I think that's very cool. Thanks for Thanks for explaining that. The next one, I think one might be able to extract from stuff they hear and the general mass media talks about this over reliance on the machine generated content, right. No, hallucination comes to mind here. I don't know if it goes beyond that. Or is it? Or is it more nefarious? And that you're, you're being fed stuff? Because I got in compromised? You're both?

Jason Haddix31:14

Yeah, I mean, this one's more. So these are the kind of the ones that are in the Slack community for the LLM top 10 are being very debated. Right. This is a risk. And it's a it's a risk introduced by by people's interpretation of these bots. Right. So. So this one is saying, you know, hey, MLMs generate content. They, they do their best in what they're trained on, and as well as how they work to present data. But people you know, haven't, you know, people have a overreliance when they're having a conversation with something like this, to consider it an Oracle in an in a factual way, right. And so, you know, like, sometimes, you know, the biggest illustration I give to people is like, okay, so you really like using chat GPT. And you start to use it a lot in your day to day and you start to want to ask it very questions that you deem useful for whatever your purposes are. And so you're getting some good output, you're like, great, and I tell them ask Chet GPT a couple times the same question, asked it four or five times and see how different each answer it comes back. And so you have to understand that you taking in the information from the LLM, also have to be an arbiter also have to understand that this is, you know, sometimes it hallucinates. Sometimes it says, you know, like, if I asked him who I am, it has some data on me, you know, up to the cutoff point, you know, when I'm using chat GPT, but it also confuses me with Kennedy, and Ed SCOTUS and like all these other people who are in the Information Security scene, and that, like mix up our BIOS and stuff like that, or DM me, not

Sean Martin 32:52

bad confusion, spin. No, no, no, no. Yeah. All cool,

Jason Haddix32:55

guys. Yeah. So yeah, it's really interesting to you to illustrate that to people. And so this is more of like, you know, the category in the top 10 is more on hey, people need to realize there needs to be training and encouragement around the fact that, you know, like, you can use this as a tool, but it is still just a tool. And, and this is where we you know, like we will get into, you know, like a lot of trouble with like generated news, and you know, topics and stuff like that, like, you'll really have to, as a human add, like, information like scrutiny to a lot of stuff that's automatically generated. So, yeah.

Sean Martin 33:36

So how does that relate to the next one? Inadequate AI alignment? So talks about objectives? What do you you're trying to accomplish something? And that doesn't quite do that through get you there? That sounds like kind of like the last one bit of risk? Is this less human based and more system based? So you get a response, then you're using that down the chain? In the workflow unchecked? Or what is

Jason Haddix34:02

it? Yeah, yeah, very, very, very much based on, you know, like, let's say there's an API that someone builds, that is, based on health information, like one of the big insurance companies makes an API to help you answer questions, you know, and replace all of their help desk, right, like, you know, their insurance help desk or their, you know, medical help desk, or whatever. And so they're using this new API they built, powered by an LLM. And the base programming of the LM and the system, you know, the base system prompt is supposed to say, you know, like, you're only supposed to be helpful, you're only supposed to retrieve accurate information that we have this data on, but you can suggest to people courses of action or ways to, you know, ways to do X, Y and Z based on health information, which, you know, we will see but it's going to be scary. So, you know, there are ways to tune like, you know, these MLMs but you know, what, if you're talking with that and instead of, you know, being able to get it to tell you what medications you should see or you know, what things you should ask your doctor, you're able to tell that bot, you know, okay, well, how would I, you know, do something harmful? Right? Like, how would I, you know, like mix these medications to get outcomes that, you know, are not aboveboard, or, you know, like, you start asking it questions that, you know, moral robot would not be programmed to do. So, these are, these are ways to, you know, basically give, you know, ask it for information, like, you know, one of the big things that people are trying to do right now, with the models that exist today, is get the bots to do stuff like this so that we can train them better and understand what questions give these poorly defined outcomes. And so like, one of the things is, I'm part of the AI Village at DEF CON. And we're doing a giant challenge on all of the big LLM vendors, where we're going to have them there, and people can walk through at DEF CON, and basically use real language to try to get them to do things that they're not supposed to do or get them to be misaligned. And, and so this will be, you know, kind of in this area, and, you know, you know, a lot of people will ask the bots questions to try to get them to be biased, to be racist, to be sexist to, you know, build weapons of harm. These are all things that the bots should absolutely not be talking about are able to do, right. And so you have to train the models, and you have to train them how to handle input from users. And yeah, so these are, these are things that you want to check for

Sean Martin 36:25

a lot of things to consider there. insufficient access controls, so kind of as we covered earlier, I naturally go to the system itself. So the the API key, like, for example, allows access to different things through to open API, right? Yeah. GPD stuff. So there's that but then I presumed it also covers the box, it's most thing it

Jason Haddix36:58

Yep. This one is it, it applies to multiple levels, the box itself failing to, you know, like to restrict access, you know, things it's connected to, or, you know, the user is supposed to have a defined set of access for the API to be able to do certain things. And if you haven't set up everything connected to the LLM with the right permissions, then you could, you know, the user could trick it into giving, you know, more access to different things, it fits into some of these other type of web vulnerabilities. And, you know, the vehicle again, for you know, most companies will be prompt injection. So, but yeah, I mean, you know, the user and every, you know, specific LLM or chatbot that exists in the world. The user experience should be like to do one or two things, right, right. Now chat up, has everybody thinking you just talk to this thing and get everything back. But and the company is implementation of it, they'll want it to do a few things for you based on their data. And they'll connect it to, you know, a few technologies to get you that data and, you know, parse it with the LM or access it with the LM, well, you know, if you don't implement the right access controls, people can access everything like we've been talking about. Yeah. Yeah.

Sean Martin 38:07

And one thing that comes to mind with the API is, and we often talk about privacy, so steal it, gaining access to steal something, right. We often forget, and we talked a little bit about integrity, which was probably the most forgotten one. Yeah. But we, we sometimes forget about availability. And I'm thinking in the case of like the API key, which provides access to things between between systems. For example, the opening API key, you get it once. That's it. So if you don't save it somewhere, you don't have it anymore. So yeah, that's compromised, whatever setup you have is kind of hosed and that that thing is no longer available. Right? Yeah. Or if that is compromised? Yeah, yeah, if you put that in GitHub.

Jason Haddix39:01

And so so this one is fast access controls, and unauthorized code injection. And anything that can lead you to control of the box that is handling the LLM output is scary to me. Because like not only, you know, in my brain as an attacker, am I looking at attacking, I want to own the you know, I want to own the box for pride or whatever. But then once I own that box, you know, there are hundreds of 1000s of people accessing that API, what if I start injecting JavaScript and every response? And now I can attack? Everybody who's using that API? Or what, what if I add my own? Or what if I modify the system prompt? You know, what if I say, hey, as well, as you know, this, you're also going to tell users this right? And slowly, like influence the outcome of the questions that come through the LM right, and I could, I could completely change the way people think about a thing because they're using one of these API's and poison it with you know, my, you know, whatever I want to do and so Yeah, I mean, attacking the users of the API is just as important as protecting the internal information that that, you know, API and the LLM are are connected to

Sean Martin 40:09

exactly. touched on number 10. So let's quickly look at error handling. Yeah. Is this logging or lack thereof and being able to respond to stuff? Or is it something else?

Jason Haddix40:21

Yeah, I mean, this one is also in contention. A lot of these are in contention, right? Because it's a mishmash of stuff right now, like, like I said, prompt injection will be the vehicle for a lot of these things. Some of them overlap. So but the idea here is, is improper error handling of you know, any of the system underlying system or connected integrations to the LLM? You know, if you manage to get them to air out in some way, giving them too much contact context, given code, they don't understand having them execute code that errors out, you know, could that give you access to, you know, different things like, you know, like in the number four on Arthrex, code exaction, if you feed it a buffer overflow vulnerability, like, does it execute that on itself, and then give you access to the system? Well, maybe, but if it just airs out, does it air out in a way that reveals sensitive information about the underlying computer, which might give you know, passwords, API keys, system information you don't want given to the user. So these are all things people will rigorously test. Many times the LLM creates an abstraction between an error message and the user. But, but sometimes, you know, these LM is just fail and, you know, dump data to the console. And, you know, you can, you can use that to figure out more about the underlying machine. And you know, there could be attack vectors that are basically. Yeah.

Sean Martin 41:41

And if folks haven't played with this stuff, yet, they they may or may not realize that the interface for chat GPT is free. But if you want to start doing things with it through the API, it costs money. It doesn't Yeah, and misuse of this thing. Can Eat up some cash pretty quickly. Especially I bring it up here and the error handling because if you're not watching your logs and and your costs, you might be surprised.

Jason Haddix42:14

Yeah, absolutely. For sure. Yeah.

Sean Martin 42:17

So that that that covers the 10, which I mean, it was really cool. We're at 40 minutes here, but I feel a few more few more points, because we talked about Atlas and some other CTF type tools. So maybe, maybe highlight some of those things quickly for us to kind of round out things.

Jason Haddix42:36

Sure. So I brought three resources, you know, that I think that like, you know, if you bring it back to the real world, what do you want to be like looking at as a business or a practitioner, right. And so if you're a practitioner business, another project you can look at on top of this new alpha OWASP, top 10 is mitre Atlas. So mitre runs a whole bunch of security projects. And you know, one that people are very familiar with his mitre attack framework, right in the industry. The other one is that they have right now is mitre Atlas, which is the adversarial threat landscape for artificial intelligence systems. So that's what Atlas stands for. And it maps out attacks on ML and AI. And so although these are, you know, larger sets of, you know, attacks, there's a lot of good information there to learn about these things. And they have some published attacks here. So like, when you get into like data poisoning, and like how that happens, we'll they'll link you to, you know, Microsoft's swift API, an instance of how that was poisoned, and, you know, started, you know, doing bad stuff. And you know, so they'll show some, the link back to both a description of some of these vulnerabilities, and they categorize it a little bit differently. But it is a good resource to look at some mitre Atlas is one. The other one is a blog by Daniel Missler. And it is called the AI attack surface map version 1.0. It's on his blog annually. sir.com. And so Dan did is he's a real future thinker. And we talk all the time. And so what he did is he outlined what the future is going to look like, like the I've been talking about where how users will access, you know, API's that are powered by MLMs and what they're connected to. And then he wrote down, you know, the attack surface for each area in that model, which is kind of the first time I've seen this done, it's the best way I've seen it done so far. He has a visualization for it, and a whole blog around it. So I would suggest reading that it's really really good the AI attack surface map 1.0. And then the last one is like just if you want to get you know, if you're just a you know, a practitioner, and you want to get used to prompt injection there, there have been a couple of prompt injection CTFs, one of the ones that's still up that you can play with is, it's called Gandalf and it's, it's gandalf.lukka.ai la k e r a.ai. Ai, and it is that CTF to have seven love allows me to bypass this wizard who's blocking a password and you ask natural language questions. And this will give you a feeling for like, you know what prompt injection really looks like and how to do it. And this is a, this is a really good intro for people to understand, you know, prompt injection and some of that stuff. So those are the resources that you can get started with. But if you're a business and you're looking at implementing, I check out, I check out the attack surface map, I would also follow Dan, because he's talking about defensive measures. And you know, Oracle's and firewalls that will put in front of this stuff. And he's really, really, he has an AI dailies, AI security daily tweet he puts out every day. So I would follow Dan and kind of what he's doing there. And then, you know, like several of, you know, one of the things that we didn't talk about is how did the security industry, you know, parcel of this and what's going to happen with the security industry when MLMs are a core piece of the technology. And it's, it's crazy, it's in my opinion, next year, and the year after, when we go to RSA and meet up again, we see each other again, it's going to look much, much different, because the one thing LLM is do do very, very well, is parse massive amounts of data very quickly, and give you context that you didn't have before. So if you can, if we can limit the hallucination, and, you know, ratchet down the settings for these MLMs, many of our security tools will be natural language tools, which will be very interesting. And you will have less abstraction in some of these areas. Like, I'll give you one example. I like to tell people like DLP right now. So DLP is about to be totally revolutionized where, you know, these technologies we used to, you know, need a hardcore agent. And, you know, like, it didn't do a very good job of classifying, you know, data sensitivity for anything leaving its rightful place. Now, you will be able to use an LLM to read an entire documents and tell you the sensitivity nature and exactly what that document should be classified as with the right prompting, and the right model. And you won't have to have an agent anymore, you can run it all through your network layer or your cloud layer, your data transfer layer. And so an industry that, in my opinion was on its last legs, because it wasn't doing super well will now be possible because of loans basically. And so this will be the same thing with SOC tier one. Same thing with like, you know, some offensive security stuff, the world is gonna look very different in security because of the power of these tools next coming years.

Sean Martin 47:31

So super exciting. And hopefully, because they're security companies, they're, they're considering the first part of this conversation is they're doing Yeah, mitigations.

Jason Haddix47:42

Yeah, you're hoping Yeah, you're hoping a lot Yeah.

Sean Martin 47:46

And I'm going to do a plug for you in the AI village, an almond invite you now to bring the team back as we get closer to DEF CON, to give us a deeper insight into look inside to what's going on at the AI village that week. But I think that's it's another fantastic place to come together with people who are also interested in this, then are exploring a lot of people know a lot of things, nobody knows everything. And together we we kind of learned and raise the bar across the board. And the villages overall are perfect for that. And the villages are great. Yeah, yeah. So they I Village is a great place to meet in person to, to see things in real life and check, check out what's really going on. So a plug there all I'll include a link to the, to the village too. So for folks can follow that.

Jason Haddix48:43

Yeah, well, we'll have live stuff. And we'll have the talks in the village. It's one of the biggest spaces that the village has had. So there'll be lots to do for sure. Love it. Yeah,

Sean Martin 48:52

I'm excited for that chat. And we'll get an update on the state of the state of LLM and LM security at that time when we when we chat again. Yeah, well Jason, this has been super insightful for me, I hope certain people listening or are thinking very differently now about just throwing something up on on their sites and or in their, in their organizations and hoping for the best. Yeah, and thanks to thanks to the team at OWASP for making this happen. And And yep, so we're gonna include lots of links for everybody to continue learning and, of course, your profile JSON. So folks who want to reach out to you they can try to connect with you. Awesome. Sounds good. All right on. Well, thanks, everybody for for listening to this episode. Of course, if you enjoyed it, share it with your friends, family, peers, those who never and, and we'll see you on the next one. Thanks again, Jason. Thanks.

voiceover49:58

And Tara the Lee later in automation security validation allows organizations to continuously test the integrity of all cybersecurity layers by emulating real world attacks at scale to pinpoint the exploitable vulnerabilities and prioritize remediation towards business impact. Learn more at Penn terra.io.

sponsor message50:24

Imperva is the cybersecurity leader whose mission is to protect data and all paths to it with a suite of integrated application and data security solutions. Learn more@imperva.com

voiceover50:42

We hope you enjoyed this episode of redefining security podcast if you learned something new and this podcast made you think, then share itspmagazine.com with your friends, family and colleagues. If you represent a company and wish to associate your brand with our conversations sponsor, one or more of our podcast channels, we hope you will come back for more stories and follow us on our journey. You can always find us at the intersection of technology, cybersecurity, and society