🎩 Top 5 Security and AI Reads - Week #27

Model extraction defence strategies, Bluetooth security exploitation, supply chain research directions, N-day vulnerability analysis, and recurring vulnerability detection advances

Jul 06, 2025

Welcome to the twenty-seventh instalment of the Stats and Bytes Top 5 Security and AI Reads weekly newsletter. We're opening with a comprehensive survey on model extraction attacks and defences for large language models, providing insights for both protecting public-facing systems and understanding ML-specific attack vectors. Next, we dive into Stealtooth, a Bluetooth security vulnerability that exploits silent automatic pairing to break security protocols on real-world devices (including my actual headphones!). We then explore cutting-edge research directions in software supply chain security, drawing from extensive practitioner expertise and highlighting humans as an explicit threat vector following incidents like the XZ backdoor. Following that, we examine a substantial PhD thesis on N-day vulnerability detection, bisection, and measurement, offering deep technical insights into Android kernel patching ecosystems and novel detection methods. We conclude with groundbreaking research on recurring vulnerability detection that introduces a dataset of 4.5k vulnerabilities and presents ANTMAN, a new approach that successfully identified 73 zero-days across 15 projects, resulting in 5 new CVE identifiers.

A dark, futuristic cybersecurity visualization showing a massive digital fortress under siege, glowing neural network pathways being extracted and siphoned away by shadowy tendrils, broken Bluetooth symbols floating like shattered glass, a complex supply chain represented by interconnected glowing nodes with some links severed and sparking, vulnerability scanners depicted as ethereal blue scanning beams sweeping across code matrices, recurring attack patterns shown as spiral fractals of red warning symbols, dramatic lighting with neon blues and cyber oranges, highly detailed digital art, cinematic composition, 4K resolution, cyberpunk aesthetic

A note on the images - I ask Claude to generate me a Stable Diffusion prompt using the titles of the 5 reads and then use FLUX.1 [dev] on Hugging Face to generate the image.

Read #1 - A Survey on Model Extraction Attacks and Defenses for Large Language Models

💾: N/A 📜: arxiv 🏡: Blog Post

This should be a grand read for anyone interested in protecting their public-facing models/systems, as well as attackers who want to understand a more niche, ML-specific attack.

Screenshot of Figure 1 from the paper showing an overview of what the authors call the “model extraction pipeline”.

Commentary: Finally, a paper that defines their threat model in a couple of sentences…on page 2! This frames the whole paper really well; it is fairly short but crystal clear. Awesome! 🔥

The paper then jumps into all of the relevant literature for attack, defence and evaluation approaches before creating a taxonomy to organise everything. The authors then present a great summary table on the last page that maps defence mechanisms to attack objectives. Somewhat unsurprisingly, certain defences are great at some stuff and crap at others, but if there is ever a table showing that defence in depth applies to AI/ML too, this does it.

The paper then ends with a section on limitations and future directions split into two bits, Extraction Attack Perspectives and Defence Mechanism Advancement. The latter is a bit of a nothingburger, with the authors basically suggesting that “more needs to be done”. I did smirk when I read the Extraction Attack Perspectives; the authors suggest that if you are deploying a massive model, don’t worry about it because attackers will never have the hardware. This is probably true. So the natural next step is to get more GPUs and make your model much bigger?

Read #2 - Stealtooth: Breaking Bluetooth Security Abusing Silent Automatic Pairing

💾: N/A 📜: arxiv 🏡: Pre-Print

This is a grand read for folks interested in physical attacks and those wanting a rest from LLM-centric content!

Commentary: I’ll preface this commentary by saying I am not a big fan of the language used in this paper for talking about connecting to devices. The authors use master/slave throughout to describe the connection process. I’d have preferred my go-to alternative: captain and conscripts (I also then say Arrrrrrh in my head).

This paper caught my eye for two reasons. Firstly, I had not read any Bluetooth-related stuff for donkeys and wondered where it had all ended up. Secondly, I spotted it had been tried on real devices, and after checking, one of the devices is my Bluetooth headphones! :O The attack itself is a very cool example of a logic bug whereby the attacker abuses the defined control flow of Bluetooth connection requests to silently reconnect.

I will be honest and say I did not have enough time to really dig into the prerequisites of this attack. That being said, from what I did read, it looks like pairing/connection and link key sharing would need to have already happened at some point beforehand. Someone who is more devious than me probably knows how this could be used for something useful? 😈

Read #3 - Research Directions in Software Supply Chain Security

💾: N/A 📜: ACM 🏡: ACM Transactions on Software Engineering and Methodology (PayWall’d)

This is a must-read for folks interested in doing impactful research in software supply chain security as well as folks interested in seeing how far this area has moved!

Screenshot of Figure 1 from the paper showing the different software supply chain attack vectors

Commentary: This is an article that comes out of the Secure Software Supply Chain Centre (S3C2), a US National Science Foundation-funded centre. Due to this, it draws on a wide range of expertise, both within S3C2 and also via lots of external engagement with practitioners and US government. This adds a fair bit of weight to the content, in my view.

The real juicy bit of this paper is the set of sections covering each attack vector. The authors go through each one, presenting practitioner challenges, existing research and then open research challenges for that area. I really like that the authors make humans an explicit threat vector. This was really demonstrated by the XZ backdoor that would have rekt most downstream SSH libraries, which was enabled by a long-term social engineering campaign to gain the ability to publish releases. For the folks starting a PhD or wanting to spin up research in this area, these sections should be gold dust to you! The paper then concludes with a grab bag of other bits and pieces, such as SBOMs, with shorter but similar prose about different challenges and research avenues.

Read #4 - N-Day Vulnerabilities: Detection, Bisection, and Measurement

💾: N/A 📜: ProQuest 🏡: PhD Thesis

This is a PhD thesis (so very long) but should be a good read for folks that are interested in getting into the detail about N-Day vuln research.

Commentary: Other than the almost farcical line spacing in this thesis (like WTF is going on?), the content is great, but this is a monster read (177 pages). I enjoyed the second chapter on SymBisect the most but also found the first chapter looking at Android’s kernel patching ecosystem really cool. If anyone wants some light bedtime reading, have a gander!

Read #5 - Recurring Vulnerability Detection: How Far Are We?

💾: N/A (Apparently available but dead link!) 📜: ACM 🏡: Proceedings of the ACM on Software Engineering

This paper is a grand read for folks interested in recurring/n-day detection in source code.

Screenshot of Figure 2 within the paper that shows the proposed method from the authors

Commentary: The main contribution of this paper, other than the approach, is a dataset of 4.5k recurring vulns that contains key information such as the CVE ID, the fixing patch, the original repo the vuln came from and the repo it was found in. The authors say this is available, but unfortunately the link they provide is dead. I also like the definitions the authors use to categorise types of clones: Type 1 = Exact Clone, Type 2 = Renamed Clone and Type 3 = Semantic Clone. I had a thought about Type 3; I wonder if these are going to skyrocket with the increased use of LLMs? 🤔
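
To make the clone categories a bit more concrete, here is a rough Python sketch of how Type 1 and Type 2 clones could be told apart on C-like snippets. This is my own toy illustration and not the paper's method: the function names, the keyword list and the regex tokeniser are all assumptions, and Type 3 (semantic) clones need much deeper analysis than token comparison, so the sketch just gives up on those.

```python
import re

# Small, deliberately incomplete keyword list (an assumption for this sketch).
C_KEYWORDS = {
    "if", "else", "for", "while", "do", "return", "break", "continue",
    "int", "char", "void", "long", "short", "unsigned", "const",
    "struct", "sizeof", "static",
}

# Identifiers, integer literals, or any single non-whitespace character.
TOKEN_RE = re.compile(r"[A-Za-z_]\w*|\d+|\S")
IDENT_RE = re.compile(r"[A-Za-z_]\w*")


def strip_comments(code):
    """Remove /* ... */ and // comments (naive: ignores string literals)."""
    code = re.sub(r"/\*.*?\*/", " ", code, flags=re.DOTALL)
    return re.sub(r"//[^\n]*", " ", code)


def tokenise(code):
    """Split code into tokens, discarding whitespace and comments."""
    return TOKEN_RE.findall(strip_comments(code))


def normalise_identifiers(tokens):
    """Replace each distinct non-keyword identifier with ID0, ID1, ..."""
    mapping = {}
    out = []
    for tok in tokens:
        if IDENT_RE.fullmatch(tok) and tok not in C_KEYWORDS:
            mapping.setdefault(tok, f"ID{len(mapping)}")
            out.append(mapping[tok])
        else:
            out.append(tok)
    return out


def clone_type(a, b):
    """Classify two snippets as a Type 1 clone, a Type 2 clone, or neither."""
    ta, tb = tokenise(a), tokenise(b)
    if ta == tb:
        return "Type 1"  # identical apart from whitespace and comments
    if normalise_identifiers(ta) == normalise_identifiers(tb):
        return "Type 2"  # identical apart from consistent renaming
    return "not Type 1/2"  # could be Type 3 or unrelated; not decidable here


if __name__ == "__main__":
    original = "int copy(char *dst, char *src) { return strlen(src); }"
    renamed = "int dup(char *out, char *in) { return strlen(in); }"
    print(clone_type(original, renamed))
```

The identifier normalisation is what separates the two cases: renaming `src` to `in` everywhere still produces the same placeholder sequence, whereas changing the logic does not.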

The authors use this dataset to evaluate lots of SOTA approaches and find they perform OK. The authors then do some false positive analysis to identify key weaknesses/failure criteria before then creating their own approach called ANTMAN. This uses a mixture of function call graphs and inter-procedural code graphs (I think) to find recurring vulnerabilities. The results are pretty impressive with increases in performance across all metrics, but the thing that caught my eye the most was the 73 0-days identified in 15 projects, which resulted in 5 CVE identifiers. Shows it has real utility! Authors just need to drop the code now. :D

That’s a wrap! Over and out.

Stats and Bytes