
Off The Shelf Data Governance With Satori

Data Engineering Podcast


Identifying New Sources of PII

I know that there are certain commonly structured aspects, such as credit card numbers or social security numbers and addresses. What we do is a combination of three types of algorithms. We have dictionary-based classifiers for things like salutations, state codes, country codes, et cetera. We have pattern-based classifiers for things like email addresses, user names, encrypted passwords and so on. And for more complex data types that have a more free form, we have a set of machine-learning-based classifiers. When you combine all three, you get a pretty good handle on PII. There's no silver bullet in solving that problem.
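As a rough illustration of the three-layer approach described above, here is a minimal Python sketch. It is not Satori's implementation: the function names, the tiny dictionaries, the regular expressions, the column-sampling approach, and the 0.8 threshold are all assumptions made for the example, and the machine learning layer is a placeholder that never matches.

```python
import re
from collections import Counter
from typing import Iterable, Optional

# Hypothetical dictionaries: tiny samples, not complete lists.
STATE_CODES = {"CA", "NY", "TX", "WA"}
SALUTATIONS = {"mr", "mrs", "ms", "dr"}

# Hypothetical patterns: deliberately simple, not production-grade.
EMAIL_RE = re.compile(r"^[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}$")
SSN_RE = re.compile(r"^\d{3}-\d{2}-\d{4}$")


def dictionary_classifier(value: str) -> Optional[str]:
    """Match a value against small fixed vocabularies."""
    v = value.strip()
    if v.upper() in STATE_CODES:
        return "state_code"
    if v.lower().rstrip(".") in SALUTATIONS:
        return "salutation"
    return None


def pattern_classifier(value: str) -> Optional[str]:
    """Match a value against regular expressions for structured types."""
    v = value.strip()
    if EMAIL_RE.match(v):
        return "email"
    if SSN_RE.match(v):
        return "ssn"
    return None


def ml_classifier(value: str) -> Optional[str]:
    """Placeholder for a trained model on free-form data (assumption)."""
    return None


def classify_column(values: Iterable[str], threshold: float = 0.8) -> Optional[str]:
    """Label a column from sampled values, trying each layer in turn.

    A label is accepted when at least `threshold` of the sampled
    values agree on it; otherwise the next layer is tried.
    """
    sample = list(values)
    for classifier in (dictionary_classifier, pattern_classifier, ml_classifier):
        hits = [label for label in map(classifier, sample) if label is not None]
        if not hits:
            continue
        label, count = Counter(hits).most_common(1)[0]
        if count / len(sample) >= threshold:
            return label
    return None
```

Calling `classify_column` on a sample of values from one column returns a single label or `None`. Trying the cheap dictionary and pattern layers first and falling back to a model only for free-form data is one possible ordering; the episode does not say how the layers are actually sequenced or weighted.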
