Speaker 2
So you're basically providing a service both for the insured and the insurers, right, if I understand correctly?
Speaker 1
So most of our customers are insurers, insurance companies. But the breadth of products we offer is quite large, so I might not be aware of some solutions that are offered directly to end users.
Speaker 2
For the use case that you described earlier, I think that's interesting, because I like that you're describing this from a use case perspective first and then an ML model perspective. Sometimes people start with the technicalities of what they're deploying. But what you're describing sounds like a pretty complicated multi-stage pipeline, the first part of which I can imagine is maybe a more classical computer vision model. But then you tie that into things that are maybe more complicated classifiers that need to map to very specific definitions of damage and repairs and things like that. If you can share, I'm curious to hear what that connection looks like, transitioning from the image part to maybe the more structured part of discovering what repairs need to take place.
Speaker 1
Yeah, so you're right, 100%. So there is much more to this than just training a model and exposing it via an API, getting it into production and letting it receive images and produce predictions. There's so much more, because there are different models operating simultaneously. And then the outputs of these models have to be reconciled and put together somehow to produce the final results that we serve to the consumers of our APIs. And yeah, there are elements that are specific to how we internally recognize different parts of the vehicle and what kinds of operations can be applied to these parts, what kinds of things make sense to do. You would be able to paint the door, but you probably wouldn't paint the tire, right? All of those things have to be taken into account, and the outputs of all of the models have to be combined in a smart way.
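To make that reconciliation step concrete, here is a minimal Python sketch of how outputs from a part detector and a damage detector could be combined with a rule table of allowed operations. The part names, damage labels, and rules are invented for illustration; they are not the actual schema used in production.

```python
# Hypothetical sketch: reconciling outputs of several models with domain rules.
# Part names, damage labels, and the allowed-operation table are made up.

# Which repair operations make sense for which vehicle part.
ALLOWED_OPERATIONS = {
    "door":   {"repair", "paint", "replace"},
    "bumper": {"repair", "paint", "replace"},
    "tire":   {"replace"},  # you wouldn't paint a tire
}

# Simple mapping from detected damage type to candidate operations.
DAMAGE_TO_OPERATIONS = {
    "scratch": ["paint"],
    "dent":    ["repair", "paint"],
    "tear":    ["replace"],
}


def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter) if inter else 0.0


def reconcile(part_detections, damage_detections, iou_threshold=0.5):
    """Combine part and damage detections into repair suggestions."""
    suggestions = []
    for damage in damage_detections:
        for part in part_detections:
            # Only pair a damage with the part it overlaps.
            if iou(damage["box"], part["box"]) < iou_threshold:
                continue
            for op in DAMAGE_TO_OPERATIONS.get(damage["label"], []):
                # Domain rule: keep only operations that make sense for this part.
                if op in ALLOWED_OPERATIONS.get(part["label"], set()):
                    suggestions.append({"part": part["label"],
                                        "damage": damage["label"],
                                        "operation": op})
    return suggestions
```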
Speaker 2
Very nice. And are the more structured parts in this process also based on deep learning models? Or do you combine deep learning and non-deep-learning models in your pipelines?
Speaker 1
There is deep learning involved. There's some custom logic as well, just rule-based elements in the pipeline. Yeah, all of those.
Speaker 2
Awesome. And then, for all of this, how do you deploy it to production? What does the deployment process look like?
Speaker 1
Right. So everyone wants to talk specifically about the details of how we do it at Solarim, but speaking more generally, considering my prior experiences, I think that production is not something that happens late in the development cycle. It's not like you work on a model, and then you test it, and then you deploy it, and then you serve it in production. The road to production actually starts quite early, as soon as you start writing code. And this is because, when you already have a system in place that's responsible for delivering code, you have a repository with CI/CD in place, with unit tests or other test suites, and you are contributing a new component to that code base, you have to already be thinking about how your model would work in production. How would it integrate with the other parts? How would your code fit into the code base? So in a situation in which all of your code, or a large portion of your code, sits in the same repository, you can reuse the portions of the code that are applicable to the different components that you build, so that, you know, you stay DRY, you don't repeat yourself. It also offers you a single CI/CD pipeline, which means that there's no hidden deployment knowledge scattered around different people responsible for building and deploying different things; rather, you have just one centralized way of delivering and deploying changes. And obviously, if you think about all the activities that machine learning engineers do other than writing code in the repository, like running tests, or running linters and type checkers, or building, you know, Python wheels, Docker containers, all of those things are much easier if you have a single code base and an associated build system that works with this code base and can handle all of this stuff automatically for you.
Speaker 1
Yeah, I just wanted to wrap up with this. It was quite a lengthy introduction, but what I was aiming for is to say that once you have the system of code delivery in place, then you don't really think about deployment that much anymore, because it has been thought of before. And once you have a new element of your system that you want to deploy, it's just kind of, you know, included in the already existing infrastructure and it gets deployed automatically.
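As a rough sketch of that "new components get picked up automatically" idea, assuming a hypothetical layout where every deployable component lives under services/<name>/ with its own Dockerfile, a single delivery script could discover and build whatever is there, so adding a component requires no new deployment logic:

```python
# Hypothetical sketch of a single, centralized delivery step for a monorepo.
# The services/<name>/Dockerfile layout and the registry URL are assumptions
# made for illustration.
import subprocess
from pathlib import Path

REGISTRY = "registry.example.com/ml"  # placeholder registry


def discover_components(repo_root: Path):
    """Every directory under services/ with a Dockerfile is deployable."""
    return [p.parent for p in (repo_root / "services").glob("*/Dockerfile")]


def build_and_push(component: Path, tag: str):
    """Build the component's image and push it to the shared registry."""
    image = f"{REGISTRY}/{component.name}:{tag}"
    subprocess.run(["docker", "build", "-t", image, str(component)], check=True)
    subprocess.run(["docker", "push", image], check=True)


if __name__ == "__main__":
    for component in discover_components(Path(".")):
        build_and_push(component, tag="latest")
```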
Speaker 2
Okay, so for the monorepo architecture, I agree with the benefits that you mentioned. One of the challenges could be that there are different logical components that look different within the same repository. I'm curious if you solve that by maybe defining, as we've seen in some places, an interface for a model that is supposed to be generic enough to fit any type of model, and then you expect the data scientists to comply with this interface; it's a contract. Or, if you have other solutions for this problem, how do you deal with that in the monorepo setting?
Speaker 1
Yeah, so these templates or contracts are a great way of making sure that all models, or all components of your system, work in a similar fashion or follow the same API. It's also a matter of how much easier it becomes for the developers themselves to write code and to contribute new changes in this setting; then they will be willing to use the existing things rather than develop something from scratch again and again.
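A minimal sketch of such a contract, assuming a hypothetical Python interface rather than any actual internal API: every model implements the same load and predict methods, so serving and scoring code can treat all components uniformly.

```python
# Hypothetical model "contract" in a monorepo; names are illustrative only.
from abc import ABC, abstractmethod
from typing import Any, Dict


class Model(ABC):
    """Interface every deployable model in the repository must follow."""

    @abstractmethod
    def load(self, artifact_path: str) -> None:
        """Load weights or rules from a versioned artifact."""

    @abstractmethod
    def predict(self, payload: Dict[str, Any]) -> Dict[str, Any]:
        """Map a request payload to a prediction payload."""


class DamageClassifier(Model):
    """Example implementation; a real one would load a trained network."""

    def load(self, artifact_path: str) -> None:
        self.labels = ["scratch", "dent", "tear"]  # stand-in for real weights

    def predict(self, payload: Dict[str, Any]) -> Dict[str, Any]:
        # Dummy logic so the sketch runs; a real model would run inference here.
        return {"label": self.labels[0], "score": 0.5}
```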
Speaker 2
Yeah, I think there's a lot to be said for convenience in adopting new workflows. When you try to introduce a new workflow, whether it's a monorepo or just a new way of deploying models or something like that, you take the extra step of making it very convenient for the people who are going to interface with that process regularly. Ideally, I think that's an ML engineer, though there's a whole job-title discussion here, but if you're responsible for building the deployment pipeline rather than doing the deployments yourself, you try to make it very accessible to the people that you want to be able to deploy the models. And I think that's a very important process to support in this context as well. So I guess one question I have is, what have you found to be the main challenges in moving to a monorepo architecture? And how have you solved them, or how do you think about solving them?
Speaker 1
Yeah, that is a good question. I think one of the biggest challenges is that when you're starting, you're typically not starting from scratch; you don't have an empty sheet where you decide, like, let's start with the monorepo. You have the existing situation of different repositories, often using different approaches to do things, different code styles and so on, and you have to successively bring them together, put them together into this one monorepo. And obviously this is never something that has a high priority, because you have regular work to deliver, you have deadlines to meet, you have tasks that are far more important. So this is one of those things that people tend to agree on: yes, we should do this; yes, this part of the code should be in the monorepo; let's put it there, let's have a task for it in the backlog, right? Some of those things will take time to actually get done.
Speaker 2
So do you have tips for people who are facing this, on how to either get it onto the priority list or make it efficient enough that it's easy to get done?
Speaker 1
Yeah, one trick that could work, although it's also additional work, but if you're talking about large code bases it actually could pay off, is this. Just as a side note, the build system is a piece of software that takes care of all those non-code-writing activities I mentioned: building packages and images, running test suites for you, managing dependencies, and so on. If you're using some sort of build system in the monorepo, you could try to use the same tool, in the same way, in the other code base that you intend to merge into the monorepo in the end. And if you successfully, step by step, introduce this build system into that self-contained code base, then at some point you reach such a high level of coverage in terms of the build system that it's easy to just copy that code over to the monorepo.
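Purely as an illustration of that incremental approach, and using the generic task runner nox as a stand-in since the build tool isn't named at this point in the conversation: once the standalone repo's lint, type-check, and test steps are declared in one place, carrying them over into the monorepo later becomes mostly a copy job.

```python
# Hypothetical noxfile.py for the standalone repo: the same task definitions
# (and the habits around them) can later be moved into the monorepo.
import nox


@nox.session
def lint(session):
    # Style and simple correctness checks.
    session.install("ruff")
    session.run("ruff", "check", "src")


@nox.session
def typecheck(session):
    # Static type checking.
    session.install("mypy")
    session.run("mypy", "src")


@nox.session
def tests(session):
    # Install the package plus test deps, then run the test suite.
    session.install("-e", ".", "pytest")
    session.run("pytest", "tests")
```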
Speaker 2
Okay, fair enough. And with respect to the models themselves, it does sound like you're combining some, you know, I don't know if I'd say off-the-shelf, but some standard functionality, like probably object detection and things like that within images, and combining it with a lot of logic and probably maybe custom models. I'm curious, within this complex system, how do you evaluate model versions or pipeline versions and make sure that you're actually getting better?
Speaker 1
It's never easy to evaluate machine learning models, and it is a challenge for us as well. As you mentioned, for all of these different models that we have, there are always metrics, machine learning performance metrics, that are specific to the task you're solving, right? So if you have a classifier, you would look at precision, recall, F1 score and so on. If you have an object detection model, you can look at the IoU or whatever metric you like. But in the end, the final output of the entire system is so much more than the individual outputs of the models. So in order to be able to evaluate this, what needs to be done, and what we are doing, is to set up a dedicated scoring pipeline that we run the system against. And even that's not enough, because what you get as the output is a range of metrics that you're interested in, not just machine learning metrics: some of them business-related KPIs, some of them some sort of model accuracy metrics. Then you're facing this challenge: what happens if some of them go up and some of them go down when you're comparing two pipelines? Let's say you have a pipeline that you have in production and a pipeline that you came up with and are hoping will be better. You run the scoring and you have a couple of metrics; some of them are improving, some of them are getting worse. And it is not easy, first, to know for yourself as an engineer which of these pipelines is better. But even if you reach a conclusion, and I'll talk about how it could be done in a second, even if you know which one is best, then you also have to make sure that the consumers of your models, or the business side, will accept the solution as better. But coming back to the situation where you're facing a couple of different metrics: what I typically like to do, and unless I'm mistaken, I think this idea was popularized by Andrew Ng, the founder of Coursera, is his idea of satisficing versus optimizing metrics. When you're dealing with a couple of metrics, you select one that is the most important to you, the one you want to be as high as possible or as low as possible, depending on what you're measuring, but you want it to be optimal. All your other remaining metrics just have to meet some thresholds that you select for them. Then, among the solutions that meet all the thresholds, you pick the one that has the best value of this optimizing metric.
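A small sketch of that selection rule; the candidate pipelines, metric names, numbers, and thresholds below are all made up:

```python
# Satisficing vs. optimizing metrics: among candidates that pass every
# threshold, pick the one with the best optimizing metric. Values are made up.
candidates = {
    "production_pipeline": {"cost_error": 0.12, "latency_ms": 450, "coverage": 0.91},
    "new_pipeline_a":      {"cost_error": 0.10, "latency_ms": 700, "coverage": 0.93},
    "new_pipeline_b":      {"cost_error": 0.11, "latency_ms": 480, "coverage": 0.94},
}

# Satisficing metrics: each just has to meet its threshold.
thresholds = {"latency_ms": lambda v: v <= 500, "coverage": lambda v: v >= 0.90}

# Optimizing metric: the one we want as low as possible.
optimizing = "cost_error"

acceptable = {
    name: metrics
    for name, metrics in candidates.items()
    if all(check(metrics[m]) for m, check in thresholds.items())
}
best = min(acceptable, key=lambda name: acceptable[name][optimizing])
print(best)  # new_pipeline_b: passes both thresholds, lowest cost_error among those
```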
Speaker 2
Sure, I agree with that. And I think one of the really simple examples that I've heard in the past is: if you're building a model to detect disease, then you're sort of deciding that you want to maximize the number of detections while making sure that there are zero misses, zero false negatives, because you don't want to miss anyone who's sick. Given that you're not missing anyone who's sick, what's the maximum number of true positives that you can get? So this is a simplified case because it's just two metrics, but you can imagine that if you have, you know, seven, eight, ten metrics, then you can do the same thing with all of them. The interesting point that you made, which I also wanted to ask about, is involving the business people, or maybe the less technical people, in the decision process. Obviously we can say the obvious stuff, like you have to have good communication with the business stakeholders, which is true. But I'm curious, how knowledgeable do you think they need to be in, like, the ML terms and things like that? Or do you just talk to them at a high level, try to understand what they need in their terms, and then bang your head against the wall until you figure out how to translate that into numbers for the model?
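A toy version of that disease example: keep recall at 1.0 (no missed positives) and, subject to that, pick the strictest decision threshold to cut down false alarms. The scores and labels below are made up.

```python
# Toy version of the disease-detection example: keep every sick patient
# (zero false negatives) and, under that constraint, raise the threshold as
# far as possible to reduce false positives. Scores and labels are made up.
y_true = [1, 1, 0, 1, 0, 0, 0, 1]
y_scores = [0.92, 0.81, 0.75, 0.66, 0.40, 0.33, 0.20, 0.64]


def confusion(threshold):
    """Return (true positives, false positives, false negatives)."""
    tp = sum(s >= threshold and y for s, y in zip(y_scores, y_true))
    fp = sum(s >= threshold and not y for s, y in zip(y_scores, y_true))
    fn = sum(s < threshold and y for s, y in zip(y_scores, y_true))
    return tp, fp, fn


# Candidate thresholds: the observed scores; keep only those with no misses.
valid = [t for t in sorted(set(y_scores)) if confusion(t)[2] == 0]
best_threshold = max(valid)  # strictest threshold that still misses no one
print(best_threshold, confusion(best_threshold))  # 0.64 (4, 1, 0)
```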
Speaker 1
Yeah, that's another good question. But if that's okay, I want to go back quickly to one thing you just said, which actually also relates to the conversations with the business side of things. You mentioned this example with medical diagnosis where different types of errors have different costs, right? So if you miss a disease, that's potentially, you know, a very bad outcome; but if you misdiagnose someone as sick when they're in fact healthy, that's, well, less of a cost, right? They may pay for some additional screening or diagnosis, but it's nothing terrible. The same applies to many other cases where you have a model, or a machine-learning-based solution, that can give you different outcomes and can make different types of errors: either you over-predict or under-predict something; if you have a classifier, you misclassify something as one thing or as something else. All of these different errors have different costs that the business incurs, either in terms of money or time or missed savings, whatever it may be. It's great if you have the possibility, or if the business people are able, to tell you what these different costs are. Obviously it's not that simple; it's not like you can assign a dollar value to every type of error your model can make. But the closer you get to this point, the better. It, first of all, makes sure that your models are chosen with the right criteria in mind and are solving the right thing. And then it also builds trust between the developers and the business, because they can see that you care about what they care about and that your solution is indeed impacting the metrics, the KPIs, that they're looking at. And that, I think, is partly a segue to your question, which was... if you would say it again?
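One way to make those error costs explicit, sketched with an entirely made-up cost matrix: weight each cell of the confusion matrix by the business cost of that particular mistake and compare candidate pipelines on total cost.

```python
# Hypothetical sketch: weighting a confusion matrix by business costs so that
# different error types are not treated as equally bad. All numbers are made up.
import numpy as np

labels = ["no_damage", "minor_damage", "major_damage"]

# cost[i][j] = cost incurred when the true class is i but we predict j.
# Under-predicting major damage is far more expensive than over-predicting it.
cost = np.array([
    [0.0,    20.0,  50.0],   # true: no_damage
    [80.0,    0.0,  30.0],   # true: minor_damage
    [500.0, 300.0,   0.0],   # true: major_damage
])

# Confusion matrices (counts) for two candidate pipelines on the same test set.
confusion_current = np.array([[900, 40, 10], [60, 250, 20], [5, 15, 100]])
confusion_new     = np.array([[880, 55, 15], [50, 265, 15], [2, 10, 108]])


def total_cost(confusion_matrix):
    """Total business cost implied by a confusion matrix."""
    return float((confusion_matrix * cost).sum())


print("current:", total_cost(confusion_current))
print("new:    ", total_cost(confusion_new))
```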
Speaker 2
How deep into the technical know-how do the business stakeholders need to be in order to have this, like, healthy conversation, versus just telling you at a high level what they want and then you worrying about the translation to ML metrics?
Speaker 1
From my experience, there's no rule for that. In the past, I used to work at a software house where we would do machine learning projects for different customers, and I got the opportunity to interact with many different people from many different sorts of companies. And they're all different. Sometimes you have a manager who's non-technical, who is responsible for managing the project or the team, and even though they might miss the technical vocabulary, they might still be a great partner in those conversations. If you explain to them what something means, what you're trying to show, they might understand, or be willing to understand, and work with you. On the other hand, there are technical stakeholders who would argue with you. Some people, given that they have some level of technical understanding, will try to show it off and prove to you that you haven't thought of something, and they'll throw technical terms at you. So there's no real rule of thumb.
Speaker 2
Okay. Yeah, I think that in the end, when teams work well together, you sort of come to this middle ground on the spectrum between technical and non-technical. And I think you gave a really good example, which is that, in a sense, you as a data scientist need to put on a product manager hat and interview the business people as if they were your users: what do you care about? Is this more painful than that? What does it mean when we fail at this thing? And then translate that. And I think over time, even if you're not trying, what's going to happen is that the business people will also gain a better understanding of what the metrics mean and everything. But yeah, it starts with communication. I think it's usually easier for the data science side to start by moving towards the business side than for the business to move towards the data science side, and then find that middle ground. But yeah, I think that's a really, really good point. What about tools? What tools are you using within your machine learning workflow? You mentioned the monorepo, so I know you're using Git, but what else do you use?
Speaker 1
Oh, right. So, I mentioned the build systems as well. So, the one I like particularly is TUNS. There are actually two, I mean, probably there are more.