About

Daniel Donnelly

About

I’m a PhD student advised by Sahar Abdelnabi at the Max Planck Institute for Intelligent Systems and ELLIS Institute Tübingen. My work focuses on AI safety. I’m interested in developing evidence that helps clarify whether, when, and how advanced AI systems pose serious risks, what those risks look like in practice, and which interventions would meaningfully reduce them.

My current work studies unverbalised evaluation awareness and develops benchmarks for safety failures of self-evolving agents. More broadly, I’m interested in model personas, generalisation, reward hacking, chain-of-thought monitoring, long-context reasoning, and multi-LLM agent interactions as lenses for understanding how risks might emerge in increasingly capable AI systems. If any of this overlaps with your interests, I’d be happy to chat.

Education