LLM Security Risks
This cluster centers on security concerns with LLMs, including prompt injection vulnerabilities, risks of granting them access to systems or commands, and the fundamental untrustworthiness of LLMs as security barriers.
Activity Over Time
Top Contributors
Keywords
Sample Comments
I keep seeing this kind of comment with regards to LLM applications. Why is it so? Isn't input sanitization or sandboxing a thing?
You’re missing the /s right?What about what Claude or any LLM bot does with info it randomly finds online? Run local commands you didn’t ask for, visit sites you didn’t expect it to visit? Upload data and files you don’t ask it to upload?If you don’t know what I mean, here is a cool talk for you to watch https://media.ccc.de/v/39c3-ai-agent-ai-spy
They exploit the fact the llm will do anything it can to anyone.These tools cant exist securely as long as the llm doesn't reach at least the level of intelligence of a bug that can make decisions about access control and knows the concept of lying and bad intent
It's pretty simple, don't give llms access to anything that you can't afford to expose. You treat the llm as if it was the user.
An LLM trained on exclusively safe code could can still combine it in unsafe ways.
Problems created by using LLMs generally can't be solved using LLMS.Your best case scenario is reducing risk by some % but you could also make it less reliable or even open up new attack vectors.Security issues like these need deterministic solutions, and that's exceedingly difficult (if not impossible) with LLMs.
Access to untrusted data. Access to private data. Ability to communicate with the outside. Pick two. If the LLM has all three, you're cooked.
LLMs are not a security barrier. LLMs cannot be a security barrier. They cannot form part of a security barrier. You must place the security barrier between the LLM and the backend systems, the same as you would place it between your web or mobile app and your backend systems. Assume that if the LLM agent can use a service, the human interacting with the agent can also call that service with arbitrary parameters.The tools you're providing to your LLM agent must never have p
You really ought to never trust the output of LLMs. It's not just an unsolved problem but a fundamental property of LLMs that they are manipulatable. I understand where you're coming from, but prompting is unacceptable as a security layer for anything important. It's as insecure as unsanitized SQL or hiding a button with CSS.EDIT: I'm reminded of the hubris of web3 companies promising products which were fundamentally impossible to build (like housing deeds on blockchain).
Allowing LLMs to execute unrestricted commands without human review is risky and insecure.