Towards Data Science

The TDS team

Note: The TDS podcast's current run has ended.

Researchers and business leaders at the forefront of the field unpack the most pressing questions around data science and AI.

Episodes



Oct 12, 2022 • 58min

130. Edouard Harris - New Research: Advanced AI may tend to seek power by default

Progress in AI has been accelerating dramatically in recent years, and even months. It seems like every other day, there’s a new, previously-believed-to-be-impossible feat of AI that’s achieved by a world-leading lab. And increasingly, these breakthroughs have been driven by the same, simple idea: AI scaling.

For those who haven’t been following the AI scaling saga, scaling means training AI systems with larger models, using increasingly absurd quantities of data and processing power. So far, empirical studies by the world’s top AI labs seem to suggest that scaling is an open-ended process that can lead to more and more capable and intelligent systems, with no clear limit. And that’s led many people to speculate that scaling might usher in a new era of broadly human-level or even superhuman AI, the holy grail AI researchers have been after for decades.

And while that might sound cool, an AI that can solve general reasoning problems as well as or better than a human might actually be an intrinsically dangerous thing to build. At least, that’s the conclusion that many AI safety researchers have come to following the publication of a new line of research that explores how modern AI systems tend to solve problems, and whether we should expect more advanced versions of them to perform dangerous behaviours like seeking power.

This line of research in AI safety is called “power-seeking”, and although it’s currently not well understood outside the frontier of AI safety and AI alignment research, it’s starting to draw a lot of attention. For example, the first major theoretical study of power seeking, led by Alex Turner, who’s appeared on the podcast before, was published at NeurIPS (the world’s top AI conference). And today, we’ll be hearing from Edouard Harris, an AI alignment researcher and one of my co-founders at the AI safety company Gladstone AI.

Ed’s just completed a significant piece of AI safety research that extends Alex Turner’s original power-seeking work, and that shows what seems to be the first experimental evidence suggesting that we should expect highly advanced AI systems to seek power by default. What does power seeking really mean though? And what does all this imply for the safety of future, general-purpose reasoning systems? That’s what this episode will be all about.

***

Intro music:
- Artist: Ron Gelinas
- Track Title: Daybreak Chill Blend (original mix)
- Link to Track: https://youtu.be/d8Y2sKIgFWc

***

Chapters:
- 0:00 Intro
- 4:00 Alex Turner's research
- 7:45 What technology wants
- 11:30 Universal goals
- 17:30 Connecting observations
- 24:00 Micro power seeking behaviour
- 28:15 Ed's research
- 38:00 The human as the environment
- 42:30 What leads to power seeking
- 48:00 Competition as a default outcome
- 52:45 General concern
- 57:30 Wrap-up


Oct 5, 2022 • 51min

123. Ala Shaabana and Jacob Steeves - AI on the blockchain (it actually might just make sense)

Today’s guests are two ML researchers with world-class pedigrees who decided to build a company that puts AI on the blockchain. Now to most people, myself included, “AI on the blockchain” sounds like a winning entry in some kind of startup buzzword bingo. But what I discovered talking to Jacob and Ala was that they actually have good reasons to combine those two ingredients.

At a high level, doing AI on a blockchain allows you to decentralize AI research and reward labs for building better models, and not for publishing papers in flashy journals with often biased reviewers. And that’s not all: as we’ll see, Ala and Jacob are taking on some of the thorniest current problems in AI with their decentralized approach to machine learning. Everything from the problem of designing robust benchmarks to rewarding good AI research and even the centralization of power in the hands of a few large companies building powerful AI systems: these problems are all in their sights as they build out Bittensor, their AI-on-the-blockchain startup.

Ala and Jacob joined me to talk about all those things and more on this episode of the TDS podcast.

---

Intro music:
- Artist: Ron Gelinas
- Track Title: Daybreak Chill Blend (original mix)
- Link to Track: https://youtu.be/d8Y2sKIgFWc

---

Chapters:
- 2:40 Ala and Jacob’s backgrounds
- 4:00 The basics of AI on the blockchain
- 11:30 Generating human value
- 17:00 Who sees the benefit?
- 22:00 Use of GPUs
- 28:00 Models learning from each other
- 37:30 The size of the network
- 45:30 The alignment of these systems
- 51:00 Buying into a system
- 54:00 Wrap-up


May 4, 2022 • 43min

122. Sadie St. Lawrence - Trends in data science

As you might know if you follow the podcast, we usually talk about the world of cutting-edge AI capabilities, and some of the emerging safety risks and other challenges that the future of AI might bring. But I thought that for today’s episode, it would be fun to change things up a bit and talk about the applied side of data science, and how the field has evolved over the last year or two.

And I found the perfect guest to do that with: her name is Sadie St. Lawrence, and among other things, she’s the founder of Women in Data (a community that helps women enter the field of data and advance throughout their careers), and she’s also the host of the Data Bytes podcast, a seasoned data scientist and a community builder extraordinaire.

Sadie joined me to talk about her founder’s journey, what data science looks like today, and even the possibilities that blockchains introduce for data science on this episode of the Towards Data Science podcast.

***

Intro music:
- Artist: Ron Gelinas
- Track Title: Daybreak Chill Blend (original mix)
- Link to Track: https://youtu.be/d8Y2sKIgFWc

***

Chapters:
- 2:00 Founding Women in Data
- 6:30 Having gendered conversations
- 11:00 The cultural aspect
- 16:45 Opportunities in blockchain
- 22:00 The blockchain database
- 32:30 Data science education
- 37:00 GPT-3 and unstructured data
- 39:30 Data science as a career
- 42:50 Wrap-up


Apr 27, 2022 • 50min

121. Alexei Baevski - data2vec and the future of multimodal learning

If the name data2vec sounds familiar, that’s probably because it made quite a splash on social and even traditional media when it came out, about two months ago. It’s an important entry in what is now a growing list of strategies that are focused on creating individual machine learning architectures that handle many different data types, like text, image and speech.

Most self-supervised learning techniques involve getting a model to take some input data (say, an image or a piece of text) and mask out certain components of those inputs (say by blacking out pixels or words) in order to get the models to predict those masked out components. That “filling in the blanks” task is hard enough to force AIs to learn facts about their data that generalize well, but it also means training models to perform tasks that are very different depending on the input data type. Filling in blacked out pixels is quite different from filling in blanks in a sentence, for example.

So what if there was a way to come up with one task that we could use to train machine learning models on any kind of data? That’s where data2vec comes in.

For this episode of the podcast, I’m joined by Alexei Baevski, a researcher at Meta AI and one of the creators of data2vec. In addition to data2vec, Alexei has been involved in quite a bit of pioneering work on text and speech models, including wav2vec, Facebook’s widely publicized unsupervised speech model. Alexei joined me to talk about how data2vec works and what’s next for that research direction, as well as the future of multi-modal learning.

***

Intro music:
- Artist: Ron Gelinas
- Track Title: Daybreak Chill Blend (original mix)
- Link to Track: https://youtu.be/d8Y2sKIgFWc

***

Chapters:
- 2:00 Alexei’s background
- 10:00 Software engineering knowledge
- 14:10 Role of data2vec in progression
- 30:00 Delta between student and teacher
- 38:30 Losing interpreting ability
- 41:45 Influence of greater abilities
- 49:15 Wrap-up

Episodes


130. Edouard Harris - New Research: Advanced AI may tend to seek power *by default*

129. Amber Teng - Building apps with a new generation of language models

128. David Hirko - AI observability and data as a cybersecurity weakness

127. Matthew Stewart - The emerging world of ML sensors

126. JR King - Does the brain run on deep learning?

125. Ryan Fedasiuk - Can the U.S. and China collaborate on AI safety?

124. Alex Watson - Synthetic data could change everything

123. Ala Shaabana and Jacob Steeves - AI on the blockchain (it actually might just make sense)

122. Sadie St. Lawrence - Trends in data science

121. Alexei Baevski - data2vec and the future of multimodal learning
