Speaker 3
Those are some pretty good examples. Now, when you're ingesting data, maybe you're getting these orders, or maybe you're looking at analytical stuff about where this user is accessing from, et cetera. How do you enforce the policies that you may have already defined on data that's coming in from all these sources? You know, you might have streaming data, you might have data at rest, transactional stuff. So how do you manage enforcing the policies on incoming data, especially things that are fresh and new?
Speaker 2
So I love this question, and I want to add a little bit to it. I want to give some background before we jump into that. So, you know, when we're thinking about policies, we're often thinking about that step of enforcing it, right? And I think what gets lost is that there are really two steps that happen before that. And there's probably more; I'm glossing over it all. But there's defining the policy. So, you know, do I get this from legal? Is there some new law like, you know, CCPA or GDPR or HIPAA or something? This is kind of where I'm getting the nuts and bolts of the policy from: defining it. And then you have to have someone who's implementing it. And this is kind of what you're getting into: is it data at rest? Is it at ingestion? Where am I writing these policies? And then there's enforcing the policy, which isn't just a tool doing that, but can also be, OK, I'm going to scan through and see how many people are accessing this data set that I know really shouldn't be accessed much at all. And the reason why I'm discussing these distinct pieces of policy definition, implementation, and enforcement is that those can often be different people. And so having a line of communication between those folks, Uri and I have heard from many companies, gets super lost. And this can completely break down. So really acknowledging that there are these distinct parts of it, parts that have to happen before enforcement even happens, is an important thing to wrap your head around. But Uri can definitely talk more about actually getting in there and enforcing the policies.
Speaker 1
I agree with everything that was said. Again, yes, sometimes for some reason the people who actually audit the data, or actually not the data, who audit the data policies, get sort of forgotten. And they are kind of important people. When we talked about why data governance is important, we said: forget legal for a moment; data governance is important because you want to make sure the highest-quality data gets to the right people. Great, who can prove that? It's the person who's monitoring the policies who can prove that. Also, that person may be useful when you're talking with a European commission and you want to prove to them that you're complying with GDPR. So that's an important person. But talking about enforcing policies on data as it comes in: a couple of thoughts. First of all, you have what we in Google call organization policies, or org policies. Those are like, what process can create what data store, and where. And this is kind of important even before you have the data, because you don't necessarily want your apps in Europe to be beaming data to the US. Again, you don't know what that data is. You don't know what it contains; it hasn't arrived yet. But maybe you don't even want to create a sink for it in a region of the world where it shouldn't be, because you are compliant with GDPR, or because you promised the German company that you work with that employee information remains in Germany. That's very common. It's beyond GDPR. Maybe you want to create a data store that is read-only, or write-once, read-only, more correctly, because you are a financial institution and you're required by laws that predate GDPR by decades to hold transaction information for fraud detection. And apparently there are fairly detailed regulations about that. After that, it's a bit of workflow management. The data has already landed.
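The organization-policy idea described here (deciding where a data store may be created, and in what mode, before any data arrives) can be sketched roughly as follows. This is a minimal illustration in Python, not any real cloud provider's API: `OrgPolicy`, `StoreRequest`, and `check_store_request` are hypothetical names, and the region strings are just example values.

```python
from dataclasses import dataclass


# Hypothetical policy model for illustration only; these names do not
# correspond to a real organization-policy service.
@dataclass(frozen=True)
class OrgPolicy:
    allowed_regions: frozenset        # regions where a store may be created
    require_write_once: bool = False  # e.g. retained transaction records


@dataclass(frozen=True)
class StoreRequest:
    name: str
    region: str
    write_once: bool = False


def check_store_request(policy: OrgPolicy, request: StoreRequest) -> list:
    """Return a list of violations; an empty list means creation is allowed."""
    violations = []
    if request.region not in policy.allowed_regions:
        violations.append(f"region {request.region!r} is not allowed")
    if policy.require_write_once and not request.write_once:
        violations.append("store must be write-once, read-only")
    return violations


# Assumed example: employee data that must stay in a German region.
german_hr_policy = OrgPolicy(allowed_regions=frozenset({"europe-west3"}))
print(check_store_request(german_hr_policy, StoreRequest("hr-sink", "us-central1")))
```

The point of the sketch is that the check runs on the request to create the sink, so it needs no knowledge of the data's contents.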
Now you can say, OK, maybe I want to build a tiered system like we discussed earlier, where there's a landing zone, and very few people can access this landing zone. Maybe only machines can access the landing zone, and they do basic scraping and augmenting and enriching, and then it's transferred to very few people, very few human people. And then later it's published to the organization. And maybe there's an even later step where it's shared with partners, peers, and consumers. And this is, by the way, a pattern: this landing zone, intermediate zone, public zone, or published zone. This is a pattern we are seeing more and more across the data landscape and in data products. And in Google, we actually created a product for that called Dataplex, which is first of its kind, and which gives first-class entity status to those kinds of holding zones.
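The landing, intermediate, and published zone pattern described above can be sketched as a small access model. This is a minimal Python illustration under assumed names (`Zone`, `READERS`, `can_read`, `promote`, and the principal kinds are all hypothetical); it is not how Dataplex or any specific product represents zones.

```python
from enum import Enum


class Zone(Enum):
    LANDING = 1       # raw data; machines only
    INTERMEDIATE = 2  # enriched data; a few humans
    PUBLISHED = 3     # available to the organization


# Hypothetical mapping from zone to the kinds of principal allowed to read it.
READERS = {
    Zone.LANDING: {"pipeline"},
    Zone.INTERMEDIATE: {"pipeline", "data_steward"},
    Zone.PUBLISHED: {"pipeline", "data_steward", "employee"},
}


def can_read(principal_kind: str, zone: Zone) -> bool:
    """Check whether a kind of principal may read data in the given zone."""
    return principal_kind in READERS.get(zone, set())


def promote(zone: Zone) -> Zone:
    """Move data one step forward; zones are never skipped."""
    if zone is Zone.PUBLISHED:
        raise ValueError("data is already published")
    return Zone(zone.value + 1)


print(can_read("employee", Zone.LANDING))  # False
print(promote(Zone.LANDING))               # Zone.INTERMEDIATE
```

A further zone for sharing with partners could be added as another enum member with its own reader set; the sketch stops at three to match the pattern as named in the discussion.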