Stats and Bytes

🎩 Top 5 Security and AI Reads - Week #4

Signature-based model detection that works, Microsoft's wisdom from poking AI, toxic jailbreak prevention, decompiled code comparison that makes sense, and model watermarking without retraining

Josh Collyer
Jan 26, 2025

Welcome to the fourth installment of the Stats and Bytes Top 5 Security and AI Reads weekly newsletter. We're kicking things off with fascinating research from HiddenLayer, who've developed a signature-based model family detection tool that's achieving remarkable levels of precision. From there, we'll dive into the lessons Microsoft's learnt from red teaming over 100 generative AI applications. We'll then move on to explore a promising jailbreak mitigation approach that could have wider AI security applications, examine awesome work on comparing neural decompiler outputs with source code, and round things off with a model watermarking technique that prevents model extraction without needing retraining.

A surreal steampunk laboratory where a DNA helix made of shadows intertwines with dripping honey code, while robot insects in red hazmat suits perform emergency neural surgery on a massive computer motherboard, dramatic lighting, ultra detailed - Claude special

Read #1 - ShadowGenes: Leve…
