The field of large language models (LLMs) is placing growing emphasis on safety alignment, with a focus on developing methods to evaluate and improve the safety of these models. Researchers are investigating how fine-tuning affects safety, finding that it can compromise safety alignment even when the fine-tuning data are benign. To address this, novel approaches such as pruning-based methods and persona-feature control are being proposed to improve safety while preserving task performance. Noteworthy papers include PL-Guard, which introduces a benchmark dataset for language model safety classification in Polish; Safe Pruning LoRA, which proposes a pruning-based approach to improve safety alignment in LLMs; and Persona Features Control Emergent Misalignment, which investigates the mechanisms behind emergent misalignment in LLMs and proposes mitigation strategies.
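To make the pruning-based idea concrete, the sketch below illustrates one generic way such an approach could work: score the rank-1 components of a LoRA update by how strongly they write into a safety-critical direction, then zero out the most interfering components. This is a minimal illustration under assumed dimensions and a randomly generated "safety direction"; it is not the actual method of Safe Pruning LoRA or any of the cited papers.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions: a LoRA update W_delta = B @ A of rank r.
d_out, d_in, r = 64, 64, 8
A = rng.normal(size=(r, d_in))   # LoRA "down" projection
B = rng.normal(size=(d_out, r))  # LoRA "up" projection

# Assumed stand-in for a safety-critical direction in output space,
# e.g. one estimated from refusal vs. non-refusal activations.
safety_dir = rng.normal(size=d_out)
safety_dir /= np.linalg.norm(safety_dir)

# Score each rank-1 component B[:, i] * A[i] by how strongly it
# writes into the safety-critical direction.
scores = np.array([
    abs(safety_dir @ B[:, i]) * np.linalg.norm(A[i]) for i in range(r)
])

# Prune (zero out) the components with the highest interference
# scores, keeping the rest of the fine-tuned update intact.
k_prune = 2
pruned = np.argsort(scores)[-k_prune:]
B_pruned = B.copy()
B_pruned[:, pruned] = 0.0

W_delta = B_pruned @ A  # pruned LoRA update applied to base weights
print("pruned rank-1 components:", sorted(pruned.tolist()))
```

In practice, the pruning criterion and the way the safety direction is estimated are the key design choices; the sketch only shows the general shape of trading a small amount of the fine-tuned update for restored safety behavior.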