701: Generative A.I. without the Privacy Risks (with Prof. Raluca Ada Popa)
Aug 1, 2023
Renowned computer scientist Prof. Raluca Ada Popa shares insights on securely interacting with AI APIs, safeguarding data while using commercial LLMs like GPT-4, open vs closed-source AI development, running compute pipelines across clouds, entrepreneurship, and the importance of confidentiality in AI research.
Confidential computing protects data privacy in cloud AI interactions, enhancing security and enabling collaborative analytics.
Opaque Systems offers a privacy-safeguarding solution for using OpenAI's APIs without exposing sensitive data, which helps with regulatory compliance.
Skylab at Berkeley focuses on efficient cross-cloud compute pipelines, optimizing cost, performance, and security for seamless AI integration.
Deep dives
Confidential Computing and its Applications in Data Security
Confidential computing enables data processing in the cloud while keeping data encrypted, ensuring privacy and security. Through hardware enclaves, data can be decrypted and processed securely without human access, protecting sensitive workloads from cloud employees or hackers. This technology allows collaborative analytics and AI at scale, enabling organizations to securely interact with generative AI platforms without compromising data privacy.
Enhancing Privacy in AI with Opaque Systems
Opaque Systems offers a solution to safeguard privacy when using OpenAI's APIs like GPT-4 by replacing PII with symbols in queries. This approach allows users to interact with AI models without revealing sensitive information, ensuring compliance with regulations like HIPAA. By leveraging confidential computing, Opaque Systems ensures confidential data stays encrypted and protected, providing users with seamless generative AI capabilities while preserving privacy.
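The idea of replacing PII with symbols before a query leaves your environment can be illustrated with a minimal sketch. This is not Opaque Systems' implementation: the two regular expressions, the placeholder format, and the `redact` and `restore` helpers are all hypothetical, and real PII detection needs far more than pattern matching.

```python
import re

# Hypothetical patterns for illustration only; production PII detection
# covers many more entity types and formats.
PII_PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
    "PHONE": re.compile(r"\b\d{3}-\d{3}-\d{4}\b"),
}


def redact(text):
    """Replace each PII match with a placeholder; return (redacted_text, mapping)."""
    mapping = {}  # placeholder token -> original value, kept on the client side

    for label, pattern in PII_PATTERNS.items():
        def substitute(match, label=label):
            value = match.group(0)
            # Reuse the same placeholder if this value was already seen.
            for token, original in mapping.items():
                if original == value:
                    return token
            token = f"<{label}_{len(mapping) + 1}>"
            mapping[token] = value
            return token

        text = pattern.sub(substitute, text)

    return text, mapping


def restore(text, mapping):
    """Put the original values back into a response that contains placeholders."""
    for token, original in mapping.items():
        text = text.replace(token, original)
    return text
```

In this sketch only the redacted text would be sent to the external API, while the mapping stays local so that placeholders in the model's reply can be swapped back afterwards.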
Cross-Cloud Computing with Skylab at Berkeley
Skylab at Berkeley focuses on enabling seamless and efficient compute pipelines across multiple clouds, an approach known as sky computing. By optimizing cost, performance, and security in cross-cloud applications, Skylab gives users the flexibility to draw on the strengths of different clouds and run their pipelines "in the sky". This research responds to industry demand for cross-cloud applications and supports running generative AI efficiently across cloud environments.
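As a rough illustration of what optimizing cost across clouds can mean, the sketch below brute-forces the cheapest assignment of pipeline stages to clouds. It is not Skylab's optimizer: the stage names, cloud names, prices, and the flat egress charge are all made-up assumptions, and performance and security are ignored.

```python
from itertools import product

# Hypothetical compute cost (USD) of each pipeline stage on each cloud.
# The numbers are illustrative, not real prices.
STAGE_COST = {
    "ingest": {"cloud_a": 4.0, "cloud_b": 12.0},
    "train": {"cloud_a": 50.0, "cloud_b": 30.0},
    "serve": {"cloud_a": 8.0, "cloud_b": 9.0},
}

# Hypothetical flat charge each time consecutive stages run on different clouds.
EGRESS_COST = 5.0


def cheapest_placement(stage_cost, egress_cost):
    """Try every stage-to-cloud assignment and return (plan, total_cost)."""
    stages = list(stage_cost)
    clouds = sorted({cloud for costs in stage_cost.values() for cloud in costs})

    best_plan, best_total = None, float("inf")
    for assignment in product(clouds, repeat=len(stages)):
        compute = sum(stage_cost[s][c] for s, c in zip(stages, assignment))
        # Count hand-offs between consecutive stages that cross clouds.
        transfers = sum(a != b for a, b in zip(assignment, assignment[1:]))
        total = compute + transfers * egress_cost
        if total < best_total:
            best_plan, best_total = dict(zip(stages, assignment)), total

    return best_plan, best_total
```

With these assumed numbers, `cheapest_placement(STAGE_COST, EGRESS_COST)` places ingestion on one cloud and training and serving on the other, because the savings on training outweigh the single transfer charge. Exhaustive search is only practical for a handful of stages; it is used here to keep the example short.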
Mining Pools and Repurposing for Training LLMs
The episode discusses repurposing mining pools for Large Language Model (LLM) workloads, that is, using compute resources originally assembled for blockchains for tasks such as training and inference. It highlights the difficulty of trusting many small, decentralized operators to run these computations and stresses the need to verify that a model was trained correctly. The episode also covers cost: running on such nodes may be cheaper than using major cloud providers, despite the trust challenges.
Diversity and Equity in Tech, the DARE Program
Raluca Popa shares insights into her DARE program aimed at diversifying access to research and engineering for minorities in tech. The program eliminates barriers for minority students to engage in research by allowing professors to contact and match with students based on profiles, fostering a diverse research environment. The episode underscores the significance of diversity initiatives in tech to ensure equal opportunities and maximize the social impact of technology advancements for a more inclusive and innovative future.
Dr. Raluca Ada Popa, renowned computer scientist, entrepreneur, and President of Opaque Systems, joins Jon Krohn to share her insights on securely interacting with AI APIs like OpenAI's GPT-4, the pros and cons of open vs. closed-source AI development, and the seamless operation of compute pipelines across multiple clouds.
This episode is brought to you by AWS Inferentia and by Modelbit, for deploying models in seconds. Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
In this episode you will learn:
• What is a confidential computing platform? [04:31]
• How to get started with confidential computing [12:10]
• The challenges of confidential computing and LLMs [21:11]
• How to safeguard your data while using commercial LLMs like GPT-4 [38:00]
• Open-source vs closed-source [52:28]
• Raluca's PreVeil cybersecurity company [1:01:50]
• Combining entrepreneurship and academic career [1:04:03]
• DARE Program [1:10:39]