Computer Science, Medicine

A general reinforcement learning algorithm that masters chess, shogi, and Go through self-play

David Silver, T. Hubert, Julian Schrittwieser, Ioannis Antonoglou, Matthew Lai +8 more | 1/1/2018 | 3822 citations | semantic_scholar

TL;DR

AlphaZero is a general reinforcement learning algorithm that masters chess, shogi, and Go through self-play, outperforming specialized state-of-the-art programs.

Executive Summary

The paper presents AlphaZero, a reinforcement learning algorithm that teaches itself to play and master three different board games: chess, shogi, and Go. Unlike previous programs that relied on game-specific knowledge and techniques, AlphaZero uses a generalized approach starting from random play and only requires knowledge of the game rules. Through self-play, AlphaZero achieved superhuman performance and defeated world champion programs in all three games, demonstrating the potential for a general game-playing system.
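As a rough illustration of the self-play idea summarized above, the toy sketch below generates training examples from games a policy plays against itself, knowing only the rules. It is not the paper's implementation. It assumes tic-tac-toe as a stand-in game and uses a uniformly random policy where AlphaZero uses a neural network guided by tree search. All function names (`self_play_game`, `random_policy`, and so on) are hypothetical.

```python
import random

# The eight winning lines on a 3x3 board whose cells are indexed 0..8.
LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),
         (0, 3, 6), (1, 4, 7), (2, 5, 8),
         (0, 4, 8), (2, 4, 6)]


def winner(board):
    """Return +1 or -1 if that player has three in a row, else 0."""
    for a, b, c in LINES:
        if board[a] != 0 and board[a] == board[b] == board[c]:
            return board[a]
    return 0


def legal_moves(board):
    """The rules are the only game knowledge used: empty cells are legal."""
    return [i for i, cell in enumerate(board) if cell == 0]


def random_policy(board, player, moves, rng):
    """Assumed stand-in for a learned policy: pick a legal move uniformly."""
    return rng.choice(moves)


def self_play_game(policy, rng):
    """Play one game of `policy` against itself.

    Returns a list of (board, player_to_move, z), where z is the final
    result from the perspective of the player to move:
    +1 for a win, -1 for a loss, 0 for a draw.
    """
    board = [0] * 9
    player = 1
    history = []
    while winner(board) == 0 and legal_moves(board):
        history.append((tuple(board), player))
        move = policy(board, player, legal_moves(board), rng)
        board[move] = player
        player = -player
    result = winner(board)
    # Label every visited position with the eventual game outcome.
    return [(b, p, result * p) for b, p in history]


if __name__ == "__main__":
    rng = random.Random(0)
    examples = []
    for _ in range(100):
        examples.extend(self_play_game(random_policy, rng))
    print(len(examples), "self-play training examples collected")
```

In a full system, examples like these would be used to improve the policy, and the improved policy would then generate the next round of games. The sketch stops at data generation.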

Key Contributions

Developing a general reinforcement learning algorithm applicable to multiple board games.
Achieving superhuman performance in chess, shogi, and Go starting from random play.
Demonstrating the ability to defeat state-of-the-art specialized programs in each game.

Limitations

The algorithm's performance in other types of games or real-world applications has not been explored, and further research is needed to assess its generalizability beyond board games.

AI Evaluation

AI analysis scores

Overall Score

Novelty: 95/100

Methodology: 90/100

Reproducibility: 85/100

Impact: 98/100