Web agents transcending API boundaries, network security through foundation models, adversarial unlearning for safety benefits, backdooring RL agents' actions, and open problems in mechanistic interpretability
🎩 Top 5 Security and AI Reads - Week #5