
Faking Data Using Tonic.ai with Ian Coe and Adam Kamor
Data Archives - Software Engineering Daily
Synthetic Data Is Better Than De-identified Data
Data is de-identified if the de-identified row can be tied back to a source row. It's like, OK, this de-identified row corresponds to, you know, the row whose primary key ID is seven in the source database. Synthetic data, on the other hand, means that you can't tie a synthetic row back to any individual row in the original data source. But each of the columns in that synthetic row is generated from essentially the aggregate properties of the respective column. When you use that approach, you can actually generate data at really any scale you need.
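To make the column-by-column idea concrete, here is a minimal Python sketch. It is not Tonic.ai's implementation, which the clip does not describe. It assumes a toy table held as a list of dicts, summarizes numeric columns by mean and standard deviation and categorical columns by value frequencies, and samples each column independently. The names `fit_column`, `sample_column`, and `synthesize` are hypothetical, and the primary key is deliberately left out of the source rows so no synthetic row points back at one.

```python
import random
import statistics
from collections import Counter


def fit_column(values):
    """Summarize one column by aggregate properties only (illustrative assumption)."""
    if all(isinstance(v, (int, float)) for v in values):
        # Numeric column: keep just the mean and population standard deviation.
        return ("numeric", statistics.mean(values), statistics.pstdev(values))
    # Anything else is treated as categorical: keep the value frequencies.
    counts = Counter(values)
    return ("categorical", list(counts.keys()), list(counts.values()))


def sample_column(model, rng):
    """Draw one synthetic value from a fitted column summary."""
    if model[0] == "numeric":
        _, mean, std = model
        return rng.gauss(mean, std)
    _, categories, weights = model
    return rng.choices(categories, weights=weights, k=1)[0]


def synthesize(rows, n, seed=0):
    """Generate n synthetic rows; each column is sampled independently."""
    rng = random.Random(seed)
    columns = list(rows[0].keys())
    models = {c: fit_column([r[c] for r in rows]) for c in columns}
    return [{c: sample_column(models[c], rng) for c in columns} for _ in range(n)]


# Hypothetical source table with the primary key already dropped.
source = [
    {"age": 34, "plan": "free"},
    {"age": 51, "plan": "pro"},
    {"age": 27, "plan": "free"},
]

# n is independent of len(source), so the output can be any size.
synthetic = synthesize(source, n=10)
```

Because only the per-column summaries are kept, `n` can be much larger than the source table, which is the "any scale" point made above. Sampling columns independently also discards correlations between columns, a simplification of this sketch rather than something the speakers state.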