
Faking Data Using Tonic.ai with Ian Coe and Adam Kamor

Data Archives - Software Engineering Daily


Synthetic Data Is Better Than De-Identified Data

Data is de-identified if the de-identified row can be tied back to a source row. It's like, OK, this de-identified row corresponds to, you know, the row whose primary key ID is seven in the source database. Synthetic data, on the other hand, is data where you can't tie a synthetic row back to any individual row in the original data source. But each of the columns in that synthetic row is generated from essentially the aggregate properties of the respective column. When you use that approach, you can actually generate data really at any scale you need.
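
The distinction can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how Tonic.ai implements it: the sample table, the `deidentify`, `fit_columns`, and `synthesize` helpers, and the choice of "aggregate properties" (a min/max range for age, value frequencies for plan) are all hypothetical. Each column is also sampled independently here, so relationships between columns are not preserved.

```python
import random
from collections import Counter

# Hypothetical source table; the columns and values are made up for illustration.
source_rows = [
    {"id": 1, "name": "Alice", "age": 34, "plan": "pro"},
    {"id": 2, "name": "Bob", "age": 27, "plan": "free"},
    {"id": 3, "name": "Carol", "age": 45, "plan": "free"},
    {"id": 7, "name": "Dave", "age": 52, "plan": "pro"},
]

def deidentify(rows):
    """Mask the name but keep one output row per source row.

    Each output row still carries the source primary key, so it can be
    tied back to exactly one original row.
    """
    return [
        {"id": r["id"], "name": f"user_{r['id']}", "age": r["age"], "plan": r["plan"]}
        for r in rows
    ]

def fit_columns(rows):
    """Collect simple per-column aggregates (assumed, deliberately crude)."""
    ages = [r["age"] for r in rows]
    return {
        "age_min": min(ages),
        "age_max": max(ages),
        "plan_counts": Counter(r["plan"] for r in rows),
    }

def synthesize(stats, n, seed=0):
    """Draw n rows from the column aggregates alone, never from a source row."""
    rng = random.Random(seed)
    plans = list(stats["plan_counts"].keys())
    weights = list(stats["plan_counts"].values())
    return [
        {
            "age": rng.randint(stats["age_min"], stats["age_max"]),
            "plan": rng.choices(plans, weights=weights, k=1)[0],
        }
        for _ in range(n)
    ]

deidentified = deidentify(source_rows)
stats = fit_columns(source_rows)
synthetic = synthesize(stats, n=1000)  # far more rows than the source has
```

The de-identified output always has exactly as many rows as the source, while `synthesize` accepts any `n`, which is the "any scale you need" point from the excerpt.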
