Google TechTalks

Cascading Adversarial Bias from Injection to Distillation in Language Models
Adversarial bias injected into large language models (LLMs) during instruction tuning can cascade and amplify in distilled student models, even when only a small fraction of the tuning data is poisoned, and it evades current detection methods.

Persistent Pre-Training Poisoning of LLMs
Adversaries can persistently compromise large language models (LLMs) by injecting a small amount of malicious data (as little as 10 tokens per million) into their pre-training datasets, causing behaviors such as denial of service, private data extraction, and belief manipulation that persist even after subsequent alignment training.