[P] gumbel-mcts, a high-performance Gumbel MCTS implementation

Hi folks, over the past few months I built an efficient MCTS implementation in Python/Numba. While building a self-play environment from scratch (for learning purposes), I realized that there were few efficient implementations of this algorithm. I spent a lot of time validating mine against a gold-standard baseline. My PUCT implementation is 2-15x faster than the baseline while producing exactly the same policy. I also implemented Gumbel MCTS in both dense and sparse versions. The sparse version is useful for games with large action spaces, such as chess.
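
For anyone unfamiliar with the Gumbel variant: the root of the search picks its candidate actions by sampling without replacement, using the Gumbel-Top-k trick on the policy logits. Below is a minimal NumPy sketch of just that step, not the code from the repo. The function name `gumbel_top_k` and the toy priors are made up for illustration.

```python
import numpy as np

def gumbel_top_k(logits, k, rng):
    """Sample k distinct actions without replacement (Gumbel-Top-k trick).

    Hypothetical helper for illustration; assumes k >= 1.
    """
    logits = np.asarray(logits, dtype=np.float64)
    # One standard Gumbel(0, 1) noise value per action.
    gumbel = rng.gumbel(size=logits.shape)
    perturbed = logits + gumbel
    k = min(k, logits.size)
    # Find the k largest perturbed logits, then order them best-first.
    top = np.argpartition(-perturbed, k - 1)[:k]
    top = top[np.argsort(-perturbed[top])]
    return top, gumbel

rng = np.random.default_rng(0)
# Toy prior over 4 actions (illustrative values only).
logits = np.log(np.array([0.5, 0.3, 0.15, 0.05]))
actions, gumbel = gumbel_top_k(logits, k=2, rng=rng)
```

In a sparse setting, one way to use the same idea is to pass only the logits of the legal moves and map the returned indices back through an array of legal action ids, so no array ever spans the full action space. That is an assumption about how a sparse variant could look, not a description of the repo's internals.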