

Interviewing Andrew Trask on how language models should store (and access) information
Oct 10, 2024
Andrew Trask, a passionate AI researcher and leader of the OpenMined organization, shares insights on privacy-preserving AI and data access. He discusses the importance of secure enclaves in AI evaluation and the complexities of copyright laws impacting language models. Trask explores the ethical dilemmas of using non-licensed data, federated learning's potential, and challenges startups face in the AI landscape. He emphasizes the need for innovative infrastructures and the synergy between Digital Rights Management and secure computing for better data governance.
Twitter's Data Challenge
- Twitter lacked ground-truth demographic data for its bias studies and had to rely on external sources.
- Sensitive data, such as census records, cannot easily be shared due to privacy laws, which hinders this kind of research.
Secure Enclaves Explained
- Secure enclaves encrypt data in RAM using a chip-specific key, so the data stays private even while it is being computed on.
- They also produce a signed hash ("measurement") of the program they are running, letting outside parties verify exactly what code will touch their data (see the sketch after this list).
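
To make the attestation idea concrete, here is a minimal Python sketch of the verification flow described above. Everything in it is an assumption for illustration: Ed25519 (via the `cryptography` package) stands in for the vendor's hardware attestation key, and the "program" is just a byte string; real enclaves (e.g., Intel SGX, AWS Nitro) use vendor-specific certificate chains and quote formats.

```python
# Minimal sketch of remote attestation for a secure enclave.
# Assumptions (not from the episode): Ed25519 stands in for the
# hardware vendor's attestation key; the "program" is a byte string.
import hashlib

from cryptography.exceptions import InvalidSignature
from cryptography.hazmat.primitives.asymmetric import ed25519

# --- Inside the enclave (simulated) --------------------------------
# The enclave hashes the exact program it is running (the
# "measurement") and signs that hash with a chip-specific key.
chip_key = ed25519.Ed25519PrivateKey.generate()
program = b"def evaluate(model, private_data): ..."  # code the enclave runs
measurement = hashlib.sha256(program).digest()
attestation = chip_key.sign(measurement)

# --- On the verifier's side -----------------------------------------
# A data owner checks two things before releasing sensitive data:
# 1. the signature chains back to trusted hardware, and
# 2. the measurement matches the program they audited and approved.
vendor_pubkey = chip_key.public_key()        # obtained from the vendor in practice
expected = hashlib.sha256(program).digest()  # hash of the audited source

try:
    vendor_pubkey.verify(attestation, measurement)  # raises if forged
    if measurement == expected:
        print("attestation OK: safe to send encrypted data to the enclave")
    else:
        print("refusing: enclave is running different code than audited")
except InvalidSignature:
    print("attestation failed: signature not from trusted hardware")
```

The key property is that the data owner can refuse to send anything until both checks pass, so trust rests on the hardware vendor's key rather than on whoever operates the machine.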
LLM Information Storage
- LLMs store both syntactic information (grammar) and semantic information (facts about the world) in the same weights, so updating a single fact requires retraining the whole model.
- Andrew Trask suggests separating the two, so that world knowledge lives in a swappable external database and can be updated without retraining (see the sketch after this list).
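
Trask's proposal resembles retrieval-style setups where factual knowledge is fetched at run time rather than baked into the weights. The toy Python sketch below (all names such as `FactStore` and `answer` are hypothetical, not from any library) shows why this makes updates cheap: changing a fact means swapping the store, not retraining the model.

```python
# Toy sketch of separating syntax from semantics: the "model" only
# knows how to phrase answers; facts live in a swappable store.
from dataclasses import dataclass


@dataclass
class FactStore:
    """Stands in for the external 'real-world database' of semantic knowledge."""
    facts: dict[str, str]

    def lookup(self, query: str) -> str:
        return self.facts.get(query, "unknown")


def answer(query: str, store: FactStore) -> str:
    """Stands in for the trained model: it handles phrasing (syntax)
    and retrieves the fact (semantics) at run time."""
    return f"According to the current store, {query} is {store.lookup(query)}."


# Updating world knowledge means swapping the store, not retraining.
store_2023 = FactStore({"the UK prime minister": "Rishi Sunak"})
store_2024 = FactStore({"the UK prime minister": "Keir Starmer"})

print(answer("the UK prime minister", store_2023))
print(answer("the UK prime minister", store_2024))
```

In a real system the "model" would be a trained network and the store a retrieval index over documents; the point of the separation is that a fact update touches only the store.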