The fields of reinforcement learning and computational complexity are rapidly evolving, with a focus on more efficient and scalable algorithms for complex problems. Recent research has explored novel frameworks, such as pseudo-MDPs, for optimizing solutions to specific classes of problems, including problems in blockchain security. There has also been significant progress on benchmarks, such as BuilderBench and PuzzlePlex, that evaluate foundation models and generalist agents in complex, dynamic environments. These advances could improve the efficiency and effectiveness of reinforcement learning algorithms across a wide range of applications. Noteworthy papers include To Distill or Decide?, which investigates the algorithmic trade-off between privileged expert distillation and standard RL without privileged information; PuzzlePlex, which introduces a benchmark to assess the reasoning and planning capabilities of foundation models; and Pseudo-MDPs, which proposes a novel framework for efficiently optimizing last revealer seed manipulations in blockchains.