Studying Sutton and Barto's RL book and its connections to RL for LLMs (e.g., tool use, math reasoning, agents)? [D]

Hi everyone, I graduated from a master's program in math last summer. In recent months, I have been trying to understand ML/DL and LLMs, so I have been reading books and the occasional paper on LLMs and their reasoning capabilities (I'm especially interested in AI for Math). When I read about RL on Wikipedia, I found it really interesting too, so I wanted to learn about RL and its connections to LLMs.