About

I’m a PhD student advised by Sahar Abdelnabi at the Max Planck Institute for Intelligent Systems and the ELLIS Institute Tübingen. My work focuses on AI safety. I’m interested in developing evidence that helps clarify whether, when, and how advanced AI systems pose serious risks, what those risks look like in practice, and which interventions would meaningfully reduce them.

My current work studies unverbalised evaluation awareness and develops benchmarks for safety failures of self-evolving agents. More broadly, I’m interested in model personas, generalisation, reward hacking, chain-of-thought monitoring, long-context reasoning, and multi-LLM agent interactions as lenses for understanding how risks might emerge in increasingly capable AI systems. If any of this overlaps with your interests, I’d be happy to chat.

Education

PhD Student, 2025–Present
Max Planck Institute for Intelligent Systems; ELLIS Institute Tübingen
MRes in Artificial Intelligence and Machine Learning, 2024
Imperial College London
MSc in Data Science and AI, 2023
Queen Mary University of London
BSc in Mathematics, 2019
Cardiff University