An implementation of AlphaZero for the board game Tak. See also


The repository contains several libraries and binaries:

  • takzero is the main library which implements MCTS and the neural networks
  • selfplay is used during training to generate replays and exploitation targets
  • reanalyze computes fresh targets from old replays
  • learn takes targets from selfplay and reanalyze to train new models
  • evaluation pits models against each other
  • puzzle runs the puzzle benchmark
  • analysis includes interactive game analysis
  • graph computes the ratio of unique states seen throughout training
  • tei is a TEI implementation
  • eee is a collection of binaries to run Epistemic uncertainty Estimation Experiments (EEE)
    • generalization trains a hash-based uncertainty estimator
    • rnd is the same as generalization, but specifically for RND
    • seen_ratio analyzes the ratio of seen states according to a filled hash-set
    • ensemble trains an ensemble network
    • utils contains utility functions for running experiments
  • visualize_search creates a visualization of the search tree used by an agent
  • visualize_replay_buffer creates a visualization of the overlap of different replay buffers, as well as the number of seen states at different depths
  • python contains miscellaneous Python scripts
    • action_space computes the action space for different board sizes
    • analyze_search analyzes search data to figure out which bandit algorithm optimizes best for exploration
    • elo computes Bayesian Elo from match results (from evaluation) and creates a graph
    • extract_from_logs graphs various data from logs
    • concat_out concatenates log output
    • generate_openings generates random opening positions (for example to use as an opening book for a tournament)
    • get_match_results extracts match results from evaluation logs
    • improved_policy compares different improved policy formulas
    • novelty_per_depth plots the novelty per depth
    • plot_eee plots the results of EEE
    • plot_elo_data plots the Elo data
    • replay_buffer_uniqueness plots the replay buffer uniqueness


You will need the C++ PyTorch library (LibTorch). See tch-rs for installation instructions.

LibTorch version

You may no longer be able to find these versions. In that case, try downloading the newest version and updating the tch-rs version in Cargo.toml to match.

You may also need to set LIBTORCH_BYPASS_VERSION_CHECK to 1.
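As a rough sketch of how these settings could be passed to a build, the helper below assembles an environment for a cargo invocation. LIBTORCH and LIBTORCH_BYPASS_VERSION_CHECK are the variables read by tch-rs. The helper name cargo_env, the example path /opt/libtorch, and the use of LD_LIBRARY_PATH (Linux only) are assumptions made for illustration and are not part of this repository.

```python
import os


def cargo_env(libtorch_dir, bypass_version_check=False, base_env=None):
    """Build an environment dict for running cargo against a LibTorch install.

    Hypothetical helper: this repository does not ship it.
    """
    env = dict(os.environ if base_env is None else base_env)
    # tch-rs looks for LibTorch in the directory named by LIBTORCH.
    env["LIBTORCH"] = libtorch_dir
    # Assumption: Linux, where the shared libraries are found via LD_LIBRARY_PATH.
    lib_dir = os.path.join(libtorch_dir, "lib")
    existing = env.get("LD_LIBRARY_PATH", "")
    env["LD_LIBRARY_PATH"] = lib_dir + (os.pathsep + existing if existing else "")
    if bypass_version_check:
        # Only needed when the installed LibTorch differs from the version tch-rs expects.
        env["LIBTORCH_BYPASS_VERSION_CHECK"] = "1"
    return env


# Example with an assumed install location; nothing is executed here.
env = cargo_env("/opt/libtorch", bypass_version_check=True, base_env={})
print(env["LIBTORCH_BYPASS_VERSION_CHECK"])
```

The resulting dict can be passed as the env argument of subprocess.run when invoking cargo, which keeps the bypass flag scoped to that one build instead of the whole shell session.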

If you find some version works, please let me know so I can add it here.



  • Stable (2.5.1), CUDA 12.4, Release

Did not work:

  • TODO



  • TODO

Did not work:

  • TODO

Reproducing the Plots

Local novelty per depth

To generate the local novelty per depth graph, follow these steps:

  1. Edit eee/src/ with the path to a trained model, and adjust the imports based on whether it is a SimHash or LCGHash model.
  2. Run cargo run -p eee -r --bin seen_ratio for each agent.
  3. Take the output and place it into python/
  4. Run python python/
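What seen_ratio computes is defined by the Rust source, not by this README. As a minimal sketch of the idea only, assuming that novelty at a depth means the fraction of visited states at that depth that are absent from a set of previously seen state hashes, the calculation could look like this. The function name, the (depth, hash) input format, and the sample data are all hypothetical.

```python
from collections import defaultdict


def novelty_per_depth(visited, seen):
    """Return {depth: fraction of visited states at that depth not in `seen`}.

    visited: iterable of (depth, state_hash) pairs (assumed format).
    seen: set of state hashes collected beforehand (assumed format).
    """
    totals = defaultdict(int)
    novel = defaultdict(int)
    for depth, state_hash in visited:
        totals[depth] += 1
        if state_hash not in seen:
            novel[depth] += 1
    return {depth: novel[depth] / totals[depth] for depth in sorted(totals)}


# Toy data: deeper positions are less likely to have been seen before.
seen = {"a", "b", "c"}
visited = [(0, "a"), (1, "b"), (1, "x"), (2, "y"), (2, "z")]
result = novelty_per_depth(visited, seen)
print(result)
```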

Generalization behaviour for SimHash and LCGHash

  1. Acquire a replay buffer by running an undirected agent. (See elo graph instructions.)
  2. Edit the import in eee/src/ for the model that you want to test.
  3. Run cargo run -p eee -r --bin generalization for each agent, renaming the output file eee_data.csv after each run.
  4. Edit to plot hashes and run python python/

RND Behaviour

  1. Acquire a replay buffer by running an undirected agent. (See elo graph instructions.)
  2. Run cargo run -p eee -r --bin rnd
  3. Edit to plot RND and run python python/

Elo ratings for different agents throughout training

To generate the Elo ratings for agents throughout training, follow these steps:

  1. Edit selfplay/src/, reanalyze/src/, and learn/src/ for the agent and value of beta that is desired.
  2. Compile using cargo build -r -p selfplay -p reanalyze -p learn. If exploration is desired, append --features exploration to the command.
  3. Deploy the agent on a cluster with 1 learn process, 10 selfplay processes, and 10 reanalyze processes.
  4. Once you have generated checkpoints for all agents, compile the evaluation using cargo build -r -p evaluation.
  5. Evaluate agents against each other by deploying evaluation processes.
  6. Extract the match results out of logs using python/
  7. Place the match results into match_results/ and run python python/ to plot the elo.
  8. For an easier to edit plot, copy the bayeselo output from into in the expected format.
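For intuition about what the ratings mean, here is a small sketch of the plain logistic Elo model: the expected score given two ratings, and the rating difference implied by a match record. This is not what bayeselo does, since it fits all ratings jointly and models draws and first-move advantage. The function names and the sample numbers are illustrative assumptions.

```python
import math


def expected_score(rating_a, rating_b):
    """Expected score of player A against player B under the logistic Elo model."""
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))


def elo_difference(wins, draws, losses):
    """Rating difference implied by a match record, counting draws as half a win."""
    games = wins + draws + losses
    if games == 0:
        raise ValueError("no games played")
    score = (wins + 0.5 * draws) / games
    if score <= 0.0 or score >= 1.0:
        raise ValueError("difference is unbounded for a 0% or 100% score")
    return -400.0 * math.log10(1.0 / score - 1.0)


# Hypothetical record: 60 wins, 20 draws, 20 losses gives a 70% score.
diff = elo_difference(60, 20, 20)
print(round(diff, 1), round(expected_score(diff, 0.0), 3))
```

The two functions are inverses of each other, so feeding the implied difference back into expected_score recovers the observed score.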

Replay buffer uniqueness

To generate the replay uniqueness graphs, follow these steps:

  1. Train agents using steps 1-3 from the elo graph instructions.
  2. Edit graph/ with paths to the replay files.
  3. Run cargo run -r -p graph and see the generated graph in graph.html.
  4. For an easier to edit plot, copy the output into and run with python python/
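The quantity being plotted can be pictured with a short sketch, assuming that uniqueness means the number of distinct states divided by the total number of states added to the buffer so far. The actual computation lives in the graph crate; the function name and the batch format below are hypothetical.

```python
def unique_ratio_over_time(batches):
    """Return the cumulative unique-state ratio after each batch.

    batches: iterable of lists of hashable state identifiers (assumed format).
    """
    seen = set()
    total = 0
    ratios = []
    for batch in batches:
        total += len(batch)
        seen.update(batch)
        # Guard against leading empty batches.
        ratios.append(len(seen) / total if total else 0.0)
    return ratios


# Toy data: the same opening state "s0" keeps reappearing.
batches = [["s0", "s1"], ["s0", "s2"], ["s0", "s1"]]
ratios = unique_ratio_over_time(batches)
print(ratios)
```

A ratio that falls quickly suggests the agent keeps revisiting the same positions, while a ratio that stays high suggests broader exploration.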

