AlphaZero Self-Play Learning
This cluster focuses on DeepMind's AlphaZero and its predecessors like AlphaGo, discussing how they master board games such as Go, Chess, and Shogi through self-play without human data, outperforming systems trained on human games.
Sample Comments
To the downvoters, I give you AlphaZero. Not only is every game of Go it plays and wins brand new (so no memorisation), the same system learnt to play Chess without knowing the rules, and plays in a "style .. unlike any traditional chess engine": https://deepmind.com/blog/article/alphazero-shedding-new-lig...
AlphaZero learned various board games from scratch to better-than-human levels. I guess in principle that sort of algorithm could be generalized to other things?
AlphaGo was not primarily trained to mimic humans, but to win games. This included playing many games against itself and semi-random tree searches for better strategies. If it was only mimicking humans it probably would have lost to the world's best.
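The "semi-random tree searches" mentioned above can be illustrated with the simplest member of that family: flat Monte Carlo search, which scores each candidate move by the fraction of random playouts it wins. This is a toy sketch for tic-tac-toe, not DeepMind's code; the function names are illustrative assumptions.

```python
import random

# The eight winning lines of a 3x3 tic-tac-toe board (indices 0..8).
LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),
         (0, 3, 6), (1, 4, 7), (2, 5, 8),
         (0, 4, 8), (2, 4, 6)]

def winner(board):
    """Return 'X' or 'O' if someone has three in a row, else None."""
    for a, b, c in LINES:
        if board[a] != "." and board[a] == board[b] == board[c]:
            return board[a]
    return None

def legal_moves(board):
    return [i for i, cell in enumerate(board) if cell == "."]

def rollout(board, player, rng):
    """Play uniformly random moves to the end; return the winning mark or None."""
    board = list(board)
    while True:
        w = winner(board)
        if w or "." not in board:
            return w
        board[rng.choice(legal_moves(board))] = player
        player = "O" if player == "X" else "X"

def monte_carlo_move(board, player, n_rollouts=200, seed=0):
    """Score each legal move by its random-playout win rate and pick the best."""
    rng = random.Random(seed)
    opponent = "O" if player == "X" else "X"
    best_move, best_score = None, -1.0
    for move in legal_moves(board):
        nxt = list(board)
        nxt[move] = player
        wins = sum(rollout(nxt, opponent, rng) == player
                   for _ in range(n_rollouts))
        if wins / n_rollouts > best_score:
            best_move, best_score = move, wins / n_rollouts
    return best_move
```

Full MCTS (and AlphaGo/AlphaZero on top of it) improves on this by growing a tree and, in AlphaZero's case, replacing random rollouts with a learned value network, but the move-scoring-by-simulated-games idea is the same.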
AlphaGo which beat Lee Sedol was trained on human games. But then they produced AlphaZero which learned entirely from self play and got better than AlphaGo. So it goes.
Beating AlphaZero at Go, Chess, and Shogi, and mastering a suite of Atari video games that other AIs have failed to do efficiently. No explicit heads-up contests with a trained AlphaZero; but apparently hits an Elo threshold w/ fewer training cycles. Yowsa.
You are talking about AlphaGo. AlphaZero was not given any prior knowledge of the game and is trained exclusively through self-play -- and it outperforms Monte Carlo tree search-based systems such as AlphaGo and Stockfish in chess 100-0 with a fraction of the training time. AlphaZero is also capable of playing Chess, Shogi and Go at a superhuman level.
I think AlphaGo vs AlphaZero is a strong argument against this. AlphaGo used the best knowledge of humanity to try to tune its play and mix human expertise, and centuries of master-level play, with the strength of deep learning systems. It seems like this would be ideal, particularly as per your analogy. Google certainly believed this, as this is the system they directed their resources towards developing and then very publicly demonstrating. AlphaZero was likely a curious aside at one point.
AlphaGo Zero has beaten all previous Go engines without examining any external games. I don't see any reason to believe Chess is a special game where humans understand it better (to write AIs) than computers can do "on their own". They are both turn-based deterministic games where the only asymmetry is which player gets the first turn.
AlphaZero is more general than a Go and Chess AI, right? Isn't it a general self-play algorithm?
The way AlphaGo plays is more general-purpose than other game-playing systems. It doesn't have any heuristics, rules, or game rule books programmed into it; it learned to play Go the way a human would. The same technique could be applied to other areas, the same way DeepMind originally got superhuman Atari 2600 game performance purely by having the system watch pixels.
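The learning-from-self-play loop these comments describe can be shown end to end on a trivially small game. The sketch below learns the game of Nim (5 stones, each player removes 1 or 2, taking the last stone wins) with no expert data: it plays against itself with epsilon-greedy moves and averages game outcomes into a value table. This is a toy tabular stand-in for AlphaZero's neural network plus search; `self_play_train` and the game are illustrative assumptions, not DeepMind's setup.

```python
import random

MOVES = (1, 2)   # a player may remove 1 or 2 stones per turn
START = 5        # game starts with 5 stones; taking the last stone wins

def self_play_train(episodes=5000, epsilon=0.1, seed=0):
    """Learn V[n] = estimated win probability for the player about to move
    with n stones left, purely from self-play (no human games)."""
    rng = random.Random(seed)
    V = {n: 0.5 for n in range(START + 1)}
    V[0] = 0.0   # no stones left: the player to move has already lost
    counts = {n: 0 for n in range(START + 1)}
    for _ in range(episodes):
        n, trajectory = START, []
        while n > 0:
            legal = [m for m in MOVES if m <= n]
            if rng.random() < epsilon:
                move = rng.choice(legal)          # explore
            else:
                # greedy: move to the state that is worst for the opponent
                move = min(legal, key=lambda m: V[n - m])
            trajectory.append(n)
            n -= move
        # The player who just moved took the last stone and won. Walk the
        # game backwards, alternating win (1) / loss (0) for the mover at
        # each state, and average the outcome into the value table.
        outcome = 1.0
        for state in reversed(trajectory):
            counts[state] += 1
            V[state] += (outcome - V[state]) / counts[state]
            outcome = 1.0 - outcome
    return V
```

After training, the table reflects the known theory of this game: positions with a multiple of 3 stones are lost for the player to move, so the learned greedy move from 5 stones is to take 2 and leave the opponent on 3. AlphaZero replaces the table with a deep network and the greedy one-ply lookahead with MCTS, but the self-improvement loop is the same shape.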