

SANS Stormcast Monday Mar 3rd: AI Training Data Leaks; MITRE Caldera Vuln; modsecurity bypass
Mar 3, 2025
This episode covers AI training data leaks: the Common Crawl dataset, widely used to train large language models, contains exposed API keys and secrets. It also discusses how GitHub's Copilot can surface sensitive data from repositories that were public only briefly. A command injection vulnerability in the MITRE Caldera framework lets attackers execute arbitrary code. Lastly, a modsecurity rule bypass underlines the importance of regular software updates.
AI Snips
AI Training Data Leaks
- Common Crawl's dataset, used for training large language models, contains leaked API keys and secrets.
- The dataset resembles the web data Google indexes, so leaked credentials are easy to find by searching it.
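As a rough illustration of how leaked credentials can be found in crawled text, here is a minimal Python sketch. It is not the method used in the research discussed in the episode. The `find_secrets` helper and both patterns are assumptions made for this example: the first matches the widely documented `AKIA` prefix format of AWS access key IDs, the second is a loose, hypothetical rule for quoted `api_key` assignments. Real scanners use many more rules and verify each hit.

```python
import re

# Illustrative patterns only (assumptions for this sketch, not a complete rule set).
SECRET_PATTERNS = {
    # AWS access key IDs: "AKIA" followed by 16 uppercase letters or digits.
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    # Hypothetical generic rule: api_key / apiKey / api-key assigned a quoted token.
    "generic_api_key": re.compile(
        r"(?i)\bapi[_-]?key\b\s*[:=]\s*['\"]([A-Za-z0-9_\-]{20,})['\"]"
    ),
}


def find_secrets(text):
    """Return a list of (pattern_name, matched_text) pairs found in text."""
    hits = []
    for name, pattern in SECRET_PATTERNS.items():
        for match in pattern.finditer(text):
            hits.append((name, match.group(0)))
    return hits


if __name__ == "__main__":
    # AKIAIOSFODNN7EXAMPLE is the placeholder key used in AWS documentation.
    page = 'var aws_key = "AKIAIOSFODNN7EXAMPLE";'
    for name, value in find_secrets(page):
        print(name, value)
```

Running the same check over your own pages and repositories before they are published is the defensive counterpart: anything a crawler can read, a pattern search can find.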
Copilot Exposing Private Repositories
- GitHub's Copilot uses public GitHub repositories for training data.
- Repositories that were public only briefly are also included, so code that has since been made private can still be exposed.
MITRE Caldera Vulnerability
- Update MITRE Caldera to patch a command injection vulnerability.
- Attackers can exploit compile parameters to execute arbitrary code.
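To show the general class of bug, here is a minimal Python sketch of one way to handle untrusted compile parameters. It is not Caldera's code or its actual patch. The `build_compile_command` helper, the `ALLOWED_FLAGS` set, and the file names are all hypothetical, and the sketch assumes the source path is chosen by the server rather than by the requester.

```python
import re

# Hypothetical allowlist: only these compiler flags may come from a request.
ALLOWED_FLAGS = {"-O2", "-s", "-static"}
# Output names are limited to a conservative character set.
SAFE_NAME = re.compile(r"[A-Za-z0-9_.-]+")


def build_compile_command(source_path, output_name, flags):
    """Build an argument list for a compiler call from untrusted parameters.

    source_path is assumed to be chosen by the server, not the requester.
    Raises ValueError instead of passing anything unexpected through.
    """
    if not SAFE_NAME.fullmatch(output_name) or output_name.startswith("-"):
        raise ValueError(f"unsafe output name: {output_name!r}")
    for flag in flags:
        if flag not in ALLOWED_FLAGS:
            raise ValueError(f"flag not allowed: {flag!r}")
    # A list (never a shell string) keeps each value as a single argument.
    return ["gcc", *flags, "-o", output_name, source_path]


if __name__ == "__main__":
    print(build_compile_command("agent.c", "agent", ["-O2", "-s"]))
```

The resulting list would be passed to something like `subprocess.run(cmd, check=True)` without `shell=True`. Avoiding the shell is not enough on its own, because some compiler and linker options can themselves cause other programs to run, which is why the sketch rejects every flag that is not explicitly allowed.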