The field of logical reasoning for large language models (LLMs) is moving toward more complex and nuanced evaluations of problem-solving ability. Researchers are introducing benchmarks and frameworks that test LLMs' capacity for creative and strategic reasoning, such as solving brainteasers, logic puzzles, and Sudoku variants. These efforts target the limitations of current LLMs on puzzles that demand precise reasoning and exhaustive search. Noteworthy papers in this area include SATBench, which exposes fundamental limitations in the search-based logical reasoning abilities of current LLMs; Logic-of-Thought, a framework that bridges LLMs with logic programming to solve natural-language puzzles with near-perfect accuracy; and Sudoku-Bench, a curated benchmark for evaluating creative, multi-step logical reasoning in LLMs.
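
To make the LLM-plus-logic-programming idea behind frameworks such as Logic-of-Thought concrete, the sketch below shows the generic neuro-symbolic pattern: a language model translates a natural-language puzzle into formal constraints, and an exact solver performs the search. This is an illustrative sketch only; the puzzle, the `translate_puzzle` stub (a hypothetical stand-in for the LLM call), and the z3 encoding are assumptions, not the paper's actual pipeline.

```python
# Minimal sketch of the LLM-plus-solver pattern, assuming the z3-solver package.
# translate_puzzle() is a hypothetical stand-in for the LLM translation step;
# the knight puzzle and its encoding are invented for illustration.
from z3 import Bools, Solver, Or, Not, Implies, is_true, sat


def translate_puzzle(puzzle_text: str):
    """Stand-in for an LLM call that maps prose to formal constraints.

    Puzzle (illustrative): "Alice or Bob is the knight; if Alice is the
    knight then Bob is not; Bob is not the knight."
    """
    alice, bob = Bools("alice bob")
    constraints = [
        Or(alice, bob),            # at least one of them is the knight
        Implies(alice, Not(bob)),  # Alice being the knight rules out Bob
        Not(bob),                  # Bob is not the knight
    ]
    return {"alice": alice, "bob": bob}, constraints


def solve(puzzle_text: str):
    """Hand the LLM-produced constraints to an exact solver."""
    variables, constraints = translate_puzzle(puzzle_text)
    solver = Solver()
    solver.add(*constraints)
    if solver.check() == sat:
        model = solver.model()
        return {name: is_true(model[var]) for name, var in variables.items()}
    return None  # the translation was unsatisfiable


if __name__ == "__main__":
    print(solve("Alice or Bob is the knight; ..."))
    # -> {'alice': True, 'bob': False}
```

The design point this illustrates is the division of labor such benchmarks probe: the LLM handles natural-language understanding and formalization, while the symbolic solver supplies the precise, exhaustive search that LLMs alone tend to get wrong.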