Prompt Injection Attacks

The cluster discusses prompt injection vulnerabilities in LLMs, including definitions, real-world examples like leaking system prompts, defenses, and debates on whether it's solvable.

➡️ Stable 0.9x AI & Machine Learning

2,858

Comments

Years Active

Top Authors

#4694

Topic ID

Activity Over Time

2010

2015

2016

2017

2018

2019

2020

2021

2022

108

2023

1,133

2024

473

2025

960

2026

131

Top Contributors

simonw (204) danShumway (61) TeMPOraL (29) wunderwuzzi23 (23) Terr_ (18)

Keywords

TFA MCP AI MUST lakera.ai LLM GPT4 TNG simonwillison.net github.com prompt injection tokens llms instructions llm attacks attacker instruction model

Sample Comments

simonw • Apr 23, 2023 • View on HN

Prompt injection means something else: https://simonwillison.net/series/prompt-injection/

bityard • Jan 8, 2025 • View on HN

This is called prompt injection. Modern LLMs have defenses against it but apparently it is still a thing. I don't understand how LLMs work but it blows my mind that they can't reliably distinguish between instructions and data.

leonardespi • Sep 30, 2025 • View on HN

I tricked a production LLM into printing its hidden system prompt by hiding an instruction in the content it was asked to summarize. That single leak made subsequent jailbreaks far easier and could have exposed sensitive endpoints. Here’s how I now test for prompt injection, the defenses I expect in 2025, and why QA must treat this as a trust-boundary problem—not a model quirk.

satvikpendem • Jan 12, 2025 • View on HN

There is no way to get rid of a prompt injection attack. There are always ways to convince the AI to do something else besides flagging a post even if that's its initial instruction.

sitkack • Jan 21, 2025 • View on HN

What kind of prompt injection attacks do you filter out? Have you tested with a prompt tuning framework?

falcor84 • Jul 17, 2025 • View on HN

I'm concerned that it might work. We'll need good prompt injection protections.

simonw • May 13, 2023 • View on HN

See Prompt injection: What’s the worst that can happen? https://simonwillison.net/2023/Apr/14/worst-that-can-happen/

simonw • Nov 7, 2023 • View on HN

Here's why I think that won't work: https://simonwillison.net/2022/Sep/17/prompt-injection-more-...

burcs • Nov 6, 2023 • View on HN

Do you see a way around prompt injection? It feels like any feature they release is going to be susceptible to it.

free_bip • Dec 16, 2025 • View on HN

No way that could backfire... Prompt injection is a solved problem right?