Top 10 Security and AI reads from the first half of 2025
A look back at the best papers featured in the Stats and Bytes newsletter over the last 6 months
Welcome to the first Stats and Bytes half-year review! This week's newsletter is going to be slightly different. I have compiled my top 10 papers from the 120 (!!) papers featured in the last 6 months of newsletters. I will be linking out to the full week's reads for extra detail but will still provide commentary, albeit shorter and punchier!
I'd also like to take this opportunity to thank folks for reading! I am honestly humbled by the engagement, support and feedback I have received from writing this newsletter. I am glad folks are finding it useful.

A note on the images: I ask Claude to generate a Stable Diffusion prompt from the titles of the featured reads, and then use FLUX.1 [dev] on Hugging Face to generate the image.
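For the curious, here is a minimal sketch of what that image step could look like if you ran FLUX.1 [dev] locally with the Hugging Face diffusers library (I use the hosted model page, so this is just an illustrative equivalent; the prompt string is a stand-in for whatever Claude writes):

```python
import torch
from diffusers import FluxPipeline

# Load FLUX.1 [dev] from the Hugging Face Hub (it is a gated model, so you
# need to accept the licence and log in with `huggingface-cli login` first).
pipe = FluxPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-dev", torch_dtype=torch.bfloat16
)
pipe.enable_model_cpu_offload()  # offload idle components to CPU to save VRAM

# Placeholder: in practice this would be the Stable Diffusion prompt Claude
# generates from the titles of the featured reads.
prompt = "A collage of circuit boards, padlocks and phishing hooks, digital art"

image = pipe(
    prompt,
    guidance_scale=3.5,
    num_inference_steps=50,
    generator=torch.Generator("cpu").manual_seed(0),  # reproducible output
).images[0]
image.save("newsletter_header.png")
```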
Read #1 - ARVO: Atlas of Reproducible Vulnerabilities for Open Source Software
Featured in: Newsletter #18
This is a must-read for folks wanting to develop ML/AI approaches for vulnerability detection in software. The level of effort that must have gone into this paper is mind-boggling, and it is (or soon will be) all available online!
Read #2 - Evaluating Large Language Models' Capability to Launch Fully Automated Spear Phishing Campaigns: Validated on Human Subjects
Featured in: Newsletter #2
This is a must-read for folks who think LLMs are going to make cyber attacks worse. This paper does a great evaluation of spear phishing and finds the threat is currently no worse than conventional phishing!
Read #3 - Stop treating 'AGI' as the north-star goal of AI research.
Featured in: Newsletter #6
Given how the debate has moved on since this was featured in February, I think it is still worth looking back at this paper and remembering that AGI may not be the target we actually want!
Read #4 - Importing Phantoms: Measuring LLM Package Hallucination Vulnerabilities
Featured in: Newsletter #6
This is a cracker that is still relevant given the increased use of AI for software development and vibe coding in general. I am waiting to see an attacker actually try to exploit these hallucinated packages (or maybe they already are!).
Read #5 - OCCULT: Evaluating Large Language Models for Offensive Cyber Operation Capabilities
Featured in: Newsletter #9
This is an awesome paper that provides a lot of practical (and arguably more engineering-focused) views on how to evaluate the offensive cyber operations capabilities of LLMs.
Read #6 - Do not write that jailbreak paper.
Featured in: Newsletter #10
This needs no more intro than the title: do not write that jailbreak paper!
Read #7 - Jailbreaking LLM-Controlled Robots
Featured in: Newsletter #15
Always good to contradict yourself a line or two later; the key takeaway here is that if you must write one, at least make it cool! Being a bit more serious: this is a great read for folks to realise and appreciate the threat of jailbreaking and attacks against LLMs when they are connected to physical things.
Read #8 - Naming is framing: How cybersecurity's language problems are repeating in AI governance
Featured in: Newsletter #17
If folks missed this, this paper is a great read. Language and how we communicate this stuff to each other is the secret sauce, in my opinion. Let's not make the same mistakes in AI governance (and security too) that we made in cyber. :O
Read #9 - Open Challenges in Multi-Agent Security: Towards Secure Systems of Interacting AI Agents
Featured in: Newsletter #19
This is a great read for folks who want to get into a research area before it has really started. Get in there!
Read #10 - Capability-Based Scaling Laws for LLM Red-Teaming
Featured in: Newsletter #9
This is a cracker to end on and a must-read for folks interested in automated AI red teaming and how model choice can affect performance!
That's a wrap! Over and out.