Episodes: specifies how many episodes the algorithm runs before training ends. An episode ends, and a new one begins, whenever a Done state is received, so the episode count defines how long training runs. For example, if a training session consists of 15 episodes and the agent is learning a game, then the agent will have won, lost, or reached the maximum number of steps 15 times before training ends.
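
The role of the episode count can be illustrated with a minimal sketch of a training loop. Everything named here is an assumption made for illustration: `ToyGame`, its win and loss probabilities, the `max_steps` value, and the placeholder policy are hypothetical and do not come from any particular library or from the setup described above. The sketch only shows that each episode runs until a Done state is received and that training stops after the configured number of episodes.

```python
import random


class ToyGame:
    """Hypothetical stand-in for a game environment (illustration only)."""

    def __init__(self, max_steps=20, seed=0):
        self.max_steps = max_steps
        self.rng = random.Random(seed)
        self.steps = 0

    def reset(self):
        """Start a new episode and return a dummy initial state."""
        self.steps = 0
        return 0

    def step(self, action):
        """Advance one step; return (state, reward, done, outcome)."""
        self.steps += 1
        roll = self.rng.random()
        if roll < 0.05:
            return self.steps, 1.0, True, "won"
        if roll < 0.10:
            return self.steps, -1.0, True, "lost"
        if self.steps >= self.max_steps:
            return self.steps, 0.0, True, "max_steps"
        return self.steps, 0.0, False, None


def run_training(num_episodes, env):
    """Run exactly num_episodes episodes; each ends when done is True."""
    outcomes = []
    for _ in range(num_episodes):
        env.reset()
        done = False
        while not done:
            action = 0  # placeholder policy; a real agent would act and learn here
            _, _, done, outcome = env.step(action)
        outcomes.append(outcome)
    return outcomes


outcomes = run_training(num_episodes=15, env=ToyGame())
print(len(outcomes))  # prints 15: one outcome per episode
```

With `num_episodes=15`, the outer loop finishes after 15 Done states, each of which is a win, a loss, or a step-limit cutoff, which matches the example in the definition.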