Prompt Injection Attacks

The cluster discusses prompt injection vulnerabilities in LLMs, including definitions, real-world examples like leaking system prompts, defenses, and debates on whether it's solvable.

➡️ Stable 0.9x AI & Machine Learning
2,858
Comments
13
Years Active
5
Top Authors
#4694
Topic ID

Activity Over Time

2010
1
2015
1
2016
5
2017
11
2018
8
2019
5
2020
5
2021
19
2022
108
2023
1,133
2024
473
2025
960
2026
131

Keywords

TFA MCP AI MUST lakera.ai LLM GPT4 TNG simonwillison.net github.com prompt injection tokens llms instructions llm attacks attacker instruction model

Sample Comments

simonw Apr 23, 2023 View on HN

Prompt injection means something else: https://simonwillison.net/series/prompt-injection/

bityard Jan 8, 2025 View on HN

This is called prompt injection. Modern LLMs have defenses against it but apparently it is still a thing. I don't understand how LLMs work but it blows my mind that they can't reliably distinguish between instructions and data.

leonardespi Sep 30, 2025 View on HN

I tricked a production LLM into printing its hidden system prompt by hiding an instruction in the content it was asked to summarize. That single leak made subsequent jailbreaks far easier and could have exposed sensitive endpoints. Here’s how I now test for prompt injection, the defenses I expect in 2025, and why QA must treat this as a trust-boundary problem—not a model quirk.

satvikpendem Jan 12, 2025 View on HN

There is no way to get rid of a prompt injection attack. There are always ways to convince the AI to do something else besides flagging a post even if that's its initial instruction.

sitkack Jan 21, 2025 View on HN

What kind of prompt injection attacks do you filter out? Have you tested with a prompt tuning framework?

falcor84 Jul 17, 2025 View on HN

I'm concerned that it might work. We'll need good prompt injection protections.

simonw May 13, 2023 View on HN

See Prompt injection: What’s the worst that can happen? https://simonwillison.net/2023/Apr/14/worst-that-can-happen/

simonw Nov 7, 2023 View on HN

Here's why I think that won't work: https://simonwillison.net/2022/Sep/17/prompt-injection-more-...

burcs Nov 6, 2023 View on HN

Do you see a way around prompt injection? It feels like any feature they release is going to be susceptible to it.

free_bip Dec 16, 2025 View on HN

No way that could backfire... Prompt injection is a solved problem right?