LLM Next-Token Prediction

This cluster revolves around discussions claiming that large language models (LLMs) do not understand or reason but simply predict the most likely next token from statistical patterns in their training data. Debates explore whether this mechanism implies a lack of true intelligence or whether it enables sophisticated capabilities.
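
To make "statistical patterns" concrete, here is a deliberately tiny sketch in Python: a bigram counter whose only "knowledge" is how often each token followed each other token in its training text. The corpus and names here are purely illustrative; real LLMs replace the counting with a neural network over long contexts, but the prediction step is analogous.

```python
from collections import Counter, defaultdict

# Toy "training": count which token follows which in a tiny corpus.
corpus = "the cat sat on the mat and the cat slept on the mat"
tokens = corpus.split()

follows = defaultdict(Counter)
for prev, nxt in zip(tokens, tokens[1:]):
    follows[prev][nxt] += 1

def predict_next(token):
    """Greedily return the most frequent continuation seen in training."""
    return follows[token].most_common(1)[0][0]

# "Generation" is just repeated next-token prediction.
context = "the"
generated = [context]
for _ in range(6):
    context = predict_next(context)
    generated.append(context)

print(" ".join(generated))  # "the cat sat on the cat sat" (greedy decoding can loop)
```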

Trend: Stable 0.6x · Category: AI & Machine Learning
Comments: 4,892
Years Active: 19
Top Authors: 5
Topic ID: #292

Activity Over Time

(comments per year; no data for 2008)

2007: 1 | 2009: 4 | 2010: 8 | 2011: 13 | 2012: 11 | 2013: 15 | 2014: 15
2015: 40 | 2016: 73 | 2017: 69 | 2018: 55 | 2019: 124 | 2020: 181 | 2021: 156
2022: 386 | 2023: 1,580 | 2024: 937 | 2025: 1,159 | 2026: 65

Keywords

AI, US, LLM, ycombinator.com, ELIZA, NOT, CNN, arxiv.org, GPT, token, word, trained, llm, text, output, gpt, predicting, tokens, model

Sample Comments

Akronymus Aug 7, 2025

It doesn't even really comply, I'd say. It just predicts what's the most likely next text token.

marstall Jan 3, 2023

nope. it understands nothing except the statistical link between a sequence of words and the next word in the sequence. read up before you lash out!

daveguy Feb 15, 2025

Has your LLM ever known just the right next token that should come after 2 million other tokens? Mine hasn't.

ninininino May 23, 2024

I'm really confused that people don't understand this. It's just predicting the most likely next text token, and it's trained on most internet text, so why would we expect anything at all different?

climatologist Jul 2, 2023

It doesn't learn. It simply completes statistically plausible sequences of tokens.

shermantanktop Apr 13, 2024

Is that not exactly what LLMs are built to do? Iterative next-token prediction?

danybittel May 6, 2024

Isn't that exactly what an LLM does? Predicting the next token?
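
Both comments describe the mechanism accurately: score the context, turn scores into probabilities, pick a token, append it, repeat. A minimal sketch of that loop, assuming a standard decoder-only setup; `model_logits` is a hypothetical random stub standing in for a real network's forward pass:

```python
import numpy as np

VOCAB = ["<eos>", "the", "cat", "sat", "on", "mat"]

def model_logits(token_ids):
    """Hypothetical stand-in for a transformer forward pass: any function
    mapping the whole context so far to one score per vocabulary item."""
    local = np.random.default_rng(abs(hash(tuple(token_ids))) % (2**32))
    return local.normal(size=len(VOCAB))

def softmax(z):
    z = z - z.max()            # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum()

rng = np.random.default_rng(0)
context = [VOCAB.index("the")]                        # prompt: "the"

for _ in range(10):
    probs = softmax(model_logits(context))            # distribution over vocab
    next_id = int(rng.choice(len(VOCAB), p=probs))    # sample (argmax = greedy)
    if next_id == VOCAB.index("<eos>"):               # stop token ends the loop
        break
    context.append(next_id)

print(" ".join(VOCAB[i] for i in context))
```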

EGreg Nov 6, 2025

Considering it’s trained on predicting the next word in stuff humans estimated before AI, wouldn’t that make sense?

Workaccount2 Apr 20, 2025

The problem is showing that humans aren't just doing next word prediction too.

novaRom Jun 23, 2023

An important addition to your partially right statement "they're trained to generate 'likely' text": they are trained to produce the most probable next word so that the current context looks as "similar" to the training data as possible, where "similar" is not "equal".
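
That training objective is ordinarily plain cross-entropy on the next token. A one-position sketch with made-up numbers (the vocabulary and scores below are purely illustrative):

```python
import numpy as np

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

# One position of next-word training: after seeing "the", the model scores
# every vocabulary word; the loss is the negative log probability assigned
# to the word that actually followed in the training text. Gradient descent
# on this loss nudges the model's output distribution toward the data's
# statistics: "similar", never "equal".
vocab  = ["the", "cat", "sat", "mat"]
logits = np.array([0.2, 2.0, 0.1, -1.0])    # model scores given context "the"
target = vocab.index("cat")                 # the word that actually came next

probs = softmax(logits)
loss  = -np.log(probs[target])              # cross-entropy at this position
print(f"p(next='cat') = {probs[target]:.3f}   loss = {loss:.3f}")
```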