AlphaZero Self-Play Learning

This cluster focuses on DeepMind's AlphaZero and its predecessor AlphaGo, discussing how AlphaZero masters board games such as Go, Chess, and Shogi through self-play alone, without human game data, and outperforms systems trained on human games.

📉 Falling 0.4x · AI & Machine Learning
Comments: 4,385
Years Active: 19
Top Authors: 5
Topic ID: #3501

Activity Over Time

2008: 2
2009: 4
2010: 10
2011: 12
2012: 20
2013: 14
2014: 49
2015: 72
2016: 1,057
2017: 734
2018: 290
2019: 393
2020: 238
2021: 203
2022: 206
2023: 428
2024: 389
2025: 250
2026: 14

Keywords

LLM AlphaZero AlphaZero.jl ELO PC spectrum.ieee doi.org CUDA AI MachineLearning alphago chess games game deepmind trained play training beat playing

Sample Comments

nl · Aug 14, 2020

To the downvoters, I give you AlphaZero. Not only is every game of Go it plays and wins brand new (so no memorisation), the same system learnt to play Chess from only the rules, and plays in a "style .. unlike any traditional chess engine": https://deepmind.com/blog/article/alphazero-shedding-new-lig...

tim333 · Jun 11, 2025

AlphaZero learned various board games from scratch, reaching better-than-human levels. I guess in principle that sort of algorithm could be generalized to other things?

akvadrako · Jul 8, 2017

AlphaGo was not primarily trained to mimic humans, but to win games. This included playing many games against itself and using semi-random tree searches to find better strategies. If it were only mimicking humans, it probably would have lost to the world's best.
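The "semi-random tree searches" this comment mentions can be sketched in miniature: a depth-limited negamax search that backs a position evaluation up the game tree. This is an illustrative simplification over a toy take-1-to-3 pile game, not DeepMind's Monte Carlo tree search; the function and parameter names here are invented for the example.

```python
def search(n, depth, value):
    """Depth-limited negamax for a toy pile game (remove 1-3 stones,
    taking the last stone wins). Returns the best achievable value
    from the mover's perspective with n stones left, falling back to
    the supplied evaluation function at the depth limit."""
    if n == 0:
        return -1.0        # the previous player took the last stone: mover lost
    if depth == 0:
        return value(n)    # leaf: trust the (possibly learned) evaluation
    # negamax: my best value is the worst I can leave my opponent, negated
    return max(-search(n - m, depth - 1, value) for m in (1, 2, 3) if m <= n)
```

Even with a flat evaluation, a shallow search recovers exact play near the end of the game: `search(4, 4, lambda n: 0.0)` returns -1.0 (a lost position for the mover), while `search(5, 5, lambda n: 0.0)` returns 1.0. AlphaGo's insight was that a strong learned evaluation lets the search stay shallow and selective rather than exhaustive.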

tim333 · Nov 15, 2024

AlphaGo, which beat Lee Sedol, was trained on human games. But then they produced AlphaZero, which learned entirely from self-play and surpassed AlphaGo. So it goes.
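The self-play recipe described here can be illustrated on a toy game. The sketch below is emphatically not AlphaZero (no neural network, no MCTS): it is a minimal tabular value-learning loop for a take-1-to-3 pile game, showing only the core idea that a policy can improve by playing against itself and updating its evaluations toward observed outcomes. All names (`V`, `play_one_game`, `best_move`) are made up for this illustration.

```python
import random

random.seed(0)

TAKE = (1, 2, 3)   # legal moves: remove 1-3 stones; taking the last stone wins
N0 = 10            # starting pile size (toy scale)

# V[n]: learned value of being the player to move with n stones left
V = {n: 0.0 for n in range(N0 + 1)}

def moves(n):
    return [m for m in TAKE if m <= n]

def play_one_game(eps=0.2, lr=0.1):
    """Play one self-play game (both sides share V) and update V
    toward the observed outcome."""
    n, history = N0, []
    while n > 0:
        ms = moves(n)
        if random.random() < eps:              # occasional exploration
            m = random.choice(ms)
        else:                                  # greedy: leave the opponent the worst position
            m = min(ms, key=lambda m: V[n - m])
        history.append(n)
        n -= m
    outcome = 1.0                              # the player who just moved took the last stone and won
    for pos in reversed(history):
        V[pos] += lr * (outcome - V[pos])      # nudge value toward the result
        outcome = -outcome                     # flip perspective each ply

for _ in range(20000):
    play_one_game()

def best_move(n):
    """Greedy policy extracted from the learned values."""
    return min(moves(n), key=lambda m: V[n - m])
```

After training, the greedy policy rediscovers the game's known solution (leave your opponent a multiple of four stones) with no human data at all: `best_move(5)` is 1, `best_move(6)` is 2, `best_move(7)` is 3. AlphaZero replaces the table with a deep network and the greedy step with an MCTS-guided move distribution, but the improve-by-playing-yourself loop is the same shape.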

metasj · Nov 21, 2019

Beating AlphaZero at Go, Chess, and Shogi, and mastering a suite of Atari video games that other AIs have failed to master efficiently. No explicit head-to-head contests with a trained AlphaZero, but it apparently hits an Elo threshold with fewer training cycles. Yowsa.

vegarab · Nov 14, 2019

You are talking about AlphaGo. AlphaZero was not given any prior human knowledge of the game and is trained exclusively through self-play -- and it outperforms both AlphaGo and Stockfish with a fraction of the training time. AlphaZero is also capable of playing Chess, Shogi and Go at a superhuman level.

TangoTrotFox · Nov 2, 2018

I think AlphaGo vs AlphaZero is a strong argument against this. AlphaGo used the best knowledge of humanity to tune its play, mixing human expertise and centuries of master-level play with the strength of deep learning systems. It seems like this would be ideal, particularly as per your analogy. Google certainly believed this, as this is the system they directed their resources towards developing and then very publicly demonstrating. AlphaZero was likely a curious aside at one point.

lozenge · Nov 18, 2017

AlphaGo Zero beat all previous Go engines without examining any external games. I don't see any reason to believe Chess is a special game where humans understand it better (to write AIs) than computers can on their own. Both are turn-based deterministic games where the only asymmetry is which player moves first.

dash2 Jul 26, 2024 View on HN

AlphaZero is more general than a Go and Chess AI, right? Isn't it a general self-play algorithm?

cromwellian · Mar 9, 2016

The way AlphaGo plays is more general-purpose than other game-playing systems. It doesn't have hand-written heuristics or opening books programmed into it; it learned to play Go much as you would. The same technique could be applied to other areas, the same way DeepMind originally got superhuman Atari 2600 game performance purely by having the system watch pixels.