Speaker 2
Now, that might be an interesting thing to shift to, then: if this is possible, if we have microscopes that, holy cow, we can control with Python and we can shift on the fly what we're looking for, how does that change the way that scientists interact with microscopes and other characterization instruments, right? I feel like there's a paradigm shift coming. We've talked about machine learning for property prediction and for structure prediction; that stuff's become very well known. In the last couple of years, we've seen generative models; our group's been working on that. But there are a couple of things that have been under-investigated. One of them is processing: the nitty-gritty, boring aspects of how you process materials, and trying to use ML for that. But another one might be characterization. We've seen a little bit, like automated Rietveld refinement and a few things like that. But I'm curious, what's your take on what the future will look like in something like microscopy? What will be the role? What will be the way that scientists interact with it? Will it be, you know, scientists running it, will it be turning it over to the ML, something hybrid between the two?
Speaker 1
You know, it's a super interesting question. Two years ago, I would have said that if we use machine learning and automated microscopy, we can do the same things as humans, only faster. There is a figure of merit for that: you can argue that currently we use microscopes maybe eight or ten hours a day, and then the microscope rests. If we have an ML algorithm working and doing the human-level task, then we can use it 24/7, and that increases productivity: no rest time, less time to make specific decisions. So we can easily get maybe a factor of 10 more output, maybe even a factor of 20, but probably no more than that. But some of the developments that happened in my group in the last year, and with my colleagues at Oak Ridge and Pacific Northwest, make me think that I was unduly pessimistic about it. Let me explain why, from the point of view of opportunity, from the point of view of how humans work, and in terms of what it takes to make it happen.

In terms of opportunity, the situation is actually very simple: our microscopes now operate in the regime of a double technological debt. We use the same scanning that people used back in Ruska's time, and this scanning gives us rectangular images, which are very convenient for the human eye to interpret. That's why I call it a double debt: one component is technological and the other is psychological. That being said, microscopes can, first of all, operate much faster. An electron microscope can easily generate a 32K by 32K image, while the human eye can recognize features in maybe a 1K or 2K image. So if we have an image of this size, we have to spend a considerable amount of time going through it by eye to find what's interesting. Secondly, typically when we run a microscope experiment, we actually have a reasonably good idea of what we want to understand, and very often the objects we're interested in are not uniformly distributed on the material surface. So ideally, the microscope would not just scan everything, leaving the human to look through the data and try to make sense of it; the microscope would actually target the objects and behaviors of interest. This is the opportunity: we already know that for electron microscopes there are at least three or four orders of magnitude of acceleration possible if we learn to analyze the data dynamically. For SPM methods, which are relatively slow, we don't have three orders of magnitude, but we can run experiments in a much smarter way, increasing the ratio of useful observations to observations in general.

The second part, which is a very big bottleneck at this point, is that humans are exceptionally good at doing what humans do. If you look at the performance of large language models on the classical benchmarks for machine learning, all of them start to saturate slightly above the human level. And saturating slightly above the average human level means well below the human expert level. So for the time being, classical supervised machine learning models can help if you deal with an area you're not familiar with, but you don't expect them to know things better than you in your own domain. That's a fundamental limitation of supervised learning methods: they cannot magically jump out of distribution or extrapolate.
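As a quick sanity check on the orders-of-magnitude figure, a back-of-envelope sketch in Python; the frame sizes echo the numbers above, while the sparsity fraction is an assumed, illustrative value:

```python
# Back-of-envelope arithmetic; all numbers illustrative.
full_frame = 32_768 ** 2          # pixels in a 32K x 32K frame
human_frame = 1_024 ** 2          # pixels in a ~1K frame a human parses at a glance
print(full_frame // human_frame)  # 1024 -> roughly three orders of magnitude

interesting_fraction = 1e-3       # assumed sparsity of the objects of interest
print(int(1 / interesting_fraction))  # targeting compounds the gain ~1000x
```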
But then the interesting question becomes whether, rather than trying to use supervised machine learning, we can define the goals of the experiment in a way that makes sense, and then let the machine learning algorithm follow those goals, as opposed to relying on the data that we created in previous experiments. This mindset is very different from the classical mindset in the ML community, but it seems to be much better tuned to the way that the experimental sciences work. In other words, rather than trying to use ML to assist me in doing things that I'm already good at, where ML only helps me save time, I want to impart my reward structure on the ML algorithm and let it pursue that reward in the way it sees fit, given the controls it can access in the instrument.
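A minimal sketch of what imparting a reward structure could look like in code, assuming a Python-controllable microscope; `acquire_patch`, the contrast-based `reward`, and the survey grid are all hypothetical stand-ins for a vendor control API:

```python
import numpy as np

# Hypothetical stand-ins: a real setup would wrap the microscope vendor's
# Python control API rather than simulate patches.
def acquire_patch(xy, size=64):
    """Pretend read-out: return an image patch at stage coordinates xy."""
    rng = np.random.default_rng(int(1e6 * (xy[0] + 7 * xy[1])) % 2**32)
    return rng.normal(size=(size, size))

def reward(patch):
    """The scientist's goal, encoded once: here, local contrast as a crude
    proxy for 'structurally interesting'."""
    return float(patch.std())

# The agent, not the human, decides where to look: score a coarse survey grid,
# then spend the high-resolution imaging budget on the top-reward locations.
grid = [(x, y) for x in np.linspace(0, 1, 8) for y in np.linspace(0, 1, 8)]
scores = {xy: reward(acquire_patch(xy)) for xy in grid}
targets = sorted(scores, key=scores.get, reverse=True)[:5]
detailed = [acquire_patch(xy, size=512) for xy in targets]  # follow-up scans
```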
Speaker 2
So what's missing there, and you've called it out in your paper as the missing link, the real thing where this community needs to spend its time, is translating between what the human knows they're looking for in the experiment and the actual reward that you program into the algorithm. It has to have something for a feedback loop. And we know, like, if I'm looking at a material and I'm trying to find the cause of a defect, what causes a material to break, I kind of have an idea of what I'm looking for. I'm looking for something that's out of the ordinary, for certain things that maybe I've seen in the past that caused that. So how do you translate that high-level thinking about the application, right, the end application of what you're doing, to something direct that the microscope can look for in real time? Like, I don't know how we're going to solve that. Any ideas?
Speaker 1
So this is super difficult, and we started to have this idea maybe a year ago. That was the time when I talked to folks at Google. At some point, I asked them a very simple question: look, guys, I've tried to understand reinforcement learning for several years. I went through several books; that's not the most trivial area. But let's say I know enough by now: I went through several elementary examples, I understand the basics, I can roughly write from memory what the Bellman equations look like. So that's all great. But where do you guys get the reward functions? Because I understand where the reward functions come from in games. I mean, that's what makes games super cool, because the reward is almost instant, and you know what the long-term reward is. But then when I start thinking, okay, reinforcement learning is great, how am I going to apply it to real-world scenarios like microscopy or materials synthesis or whatnot? And I started to realize that for any real problem that I have, I can reduce it to the level of myopic optimization, where the reward function is instant. But it's very difficult for me to formulate a real-world problem as a reinforcement learning problem where the reward is somewhere at the end of the process.

So I talked to the Google folks, and that was slightly before the time when all these concerns about the existential risk from AI started to surface, and my question was: where do the rewards come from? Because I kind of feel like this is the part of reinforcement learning that I don't understand. Am I stupid? And they told me: well, you're absolutely not stupid, that's in fact the most complex part of reinforcement learning. If we don't define the reward correctly, then the algorithm starts to do something weird.

My analogy is this: imagine that I'm playing chess with a Boston Dynamics robot, and the robot is told that it has to win. If you don't specify things better, the robot can easily stand up, take the chessboard, and hit me on the head, because a win is a win. Another example: if you've read the Persian fairy tales, the One Thousand and One Nights and so on, there's always this issue of how you interact with the genie. If you liberate the genie, the genie is supposed to fulfill three of your wishes, but you have to be very, very careful about how you formulate them, because if your formulation allows the genie to take a shortcut that's good for the genie but not good for you, the genie will do that. Say you want to always be happy; one of the ways to get there is to lose your mind, because then you're always happy. In some sense, the problem with reinforcement learning, and reward-driven methods generally, is exactly the same as with the genie: if you don't formulate exactly what you want to accomplish, the algorithm can find a shortcut that is not something you're interested in.

And then I started to look at it a little more broadly: what are our rewards, generally, when we deal with machine learning algorithms? And I realized that our intrinsic rewards are very narrow. We have loss functions; we have different ways to evaluate, for example in symbolic regression, the predictive error and the parsimony of the expression. But these are very narrow and very specific. In real-world scenarios, we actually deal with much longer-term objectives, which can be fairly complex.
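For reference, the textbook equations being described here: the Bellman optimality equation, and the myopic limit in which the discount vanishes and only the instant reward matters:

```latex
% Bellman optimality equation: value of state s, reward R, discount factor gamma
V^{*}(s) = \max_{a}\Big[\, R(s,a) + \gamma \sum_{s'} P(s' \mid s, a)\, V^{*}(s') \Big]

% Myopic limit (gamma -> 0): the long-horizon problem collapses to picking
% the action with the best instant reward, i.e. the reduction described above
a^{*}(s) = \arg\max_{a} R(s,a)
```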
So this research on the interface, on how you define reward functions that translate our long-term objectives into short-term rewards that can be pursued by the automated experiment, that I think will be super important.
Speaker 3
Yeah, and thinking of my own experience with microscopy, a lot of the rewards aren't static either, right? You go into some sort of microscopy session, put a sample in the SEM, start looking at it, and you think you're looking for one thing, and then all of a sudden you find something unexpected, something you weren't looking for. And that becomes a new challenge, because now all of a sudden the reward structure has to be reformulated, right?
Speaker 2
Yeah, you see something, and you're like, well, shoot, why don't we do an EDS map? Oh, look at that, the line scan shows this. Oh, maybe this is what's going on, right? The human is very much there deciding that stuff. So how do you get an algorithm to have this high-dimensional way of considering all the different options and all the different reward pathways when it's not completely straightforward?
Speaker 1
And notice that this is actually the cool thing about machine learning when you apply it to experiment. When you apply machine learning to theory, at some point you effectively get to the point where we just need to scale. But once you start to use machine learning in an experiment, you inevitably start to think: why are we doing science in the first place? What's our goal? What determines our decision making? Because you can take several people, give them the same sample and the same instruments, and the way they structure the experiment workflow will be totally different. So in some sense, the automated experiment is almost a model of humanity, where we need to understand why we are making decisions and what drives us.

And if you start to do a little bit of introspection, in my case introspection forced by the necessity to build ML workflows for the instrument, you realize that even during a single experiment, as you mentioned, we can pass through multiple rewards. We start with optimizing the instrument. We then proceed to explore the material based on observation; for example, if some areas of the material have roughly similar structures, it's natural for us to look at the majority regions first. Then we start to do things based on curiosity: you see weird patterns out there, you try to zoom in on them and understand what goes on there. And generally, only after that do we start to think about specific theories or high-level hypotheses. And of course, while we are doing all of that, we also keep our eyes open for serendipitous discovery: very often we observe things that we didn't expect, and then we're driven by curiosity to understand them.

The interesting part is that all these considerations, even including hypothesis-driven discovery, can in some sense be introduced as part of the automated experiment. These are well-defined rewards; they can be cast in a language that the machine learning algorithm understands. And in principle, we can run the automated experiment where the machine learning agent runs the microscope, gets the data, and issues the commands to the microscope, while the human operator observes the process and shifts the reward target for the machine learning agent. That seems to be the first case where we can do something fundamentally different from how we have been doing microscopy before. Rather than us thinking about what to do and issuing the commands, we specify what it is we care about in terms of physics, and the machine learning agent tells the microscope what to do to pursue those targets most effectively. And that actually works in practice.

But the more interesting part comes when we go from the microscopy problem to the outside-context problem. Ultimately, we are not doing microscopy for the sake of microscopy, even though a lot of my colleagues will probably disagree. Ultimately, microscopy starts in the chemistry lab: we synthesize something, we bring the sample to the microscope, and we expect to learn something about the physics or chemistry of the material. And the reason we need that is because ultimately this material is supposed to be part of something else. Connecting these outside-context problems with experimental planning, that would be a huge thing for microscopy and characterization in general.
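A minimal sketch of that division of labor, with every name hypothetical: the agent greedily pursues whichever reward is active, and the human operator's only job is to shift the target mid-run:

```python
import numpy as np

# All names are hypothetical; a real system would issue stage/beam commands
# through the microscope's control API instead of simulating patches.
REWARDS = {
    "instrument": lambda patch: -float(np.abs(patch).mean()),  # e.g. minimize an aberration proxy
    "majority":   lambda patch: -float(patch.var()),           # favor typical, well-ordered regions
    "curiosity":  lambda patch:  float(patch.var()),           # favor anomalous regions
}

def acquire(xy):
    """Pretend read-out: return an image patch at stage coordinates xy."""
    rng = np.random.default_rng(int(1e6 * (xy[0] + 3 * xy[1])) % 2**32)
    return rng.normal(size=(64, 64))

candidates = [(x, y) for x in np.linspace(0, 1, 6) for y in np.linspace(0, 1, 6)]
target = "instrument"              # the human operator flips this switch live
for step in range(20):
    if step == 5:
        target = "majority"        # operator shifts the reward target mid-run
    if step == 12:
        target = "curiosity"
    score = REWARDS[target]
    best = max(candidates, key=lambda xy: score(acquire(xy)))
    # move_stage(best)  # hypothetical: a real agent would command the microscope here
```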
Speaker 3
So thinking about reinforcement learning and my own work with optimization in general, I know that reinforcement learning takes quite a lot of data to train and to get right. And the more I've played around with optimization, the more I've come to find the value of actually simulating optimization campaigns to see how things work out, and using that to help guide initial model selection. Is there a role for that in terms of fake microscopy data, to test out different reward structures, or to make sure the model is well aligned in advance of putting it toward real data?
Speaker 1
Absolutely. I think that the key constraint in any experimental science is that our experimental budgets are limited. There is nothing we can do about it, and effectively we cannot scale, even if we try to connect all the microscopes in the world in some common ecosystem that allows sharing the data and algorithms. That would help, but it would not help that much, because all scientists like to do different things. By definition, we all want to go where no one has gone before, which means we are working on totally different data sets, and that is not the case where big data can actually help us accomplish anything. So we need to be very cognizant of small experimental budgets, and practically, my feeling is that our starting point would be the introduction of very myopic workflows, where the reward is available at each step. In this case, we don't have to deal with the large data requirements of reinforcement learning. Eventually, if we go for longer experimental campaigns, that may become an issue, but I think that the primary trick would be not to run more experiments, but to be much more responsible and cognizant of how we define the reward functions, and how these reward functions connect what we can do in microscopy to general materials discovery. That's where the trick is.
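Tying this back to the question about simulated campaigns: a hedged sketch of a myopic, small-budget loop on synthetic data, using scikit-learn's Gaussian process as the surrogate; the `truth` function is a made-up stand-in for a real property map, and the budget of 15 measurements is arbitrary:

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor

# Synthetic "sample": a made-up 1D property map standing in for the microscope.
truth = lambda x: np.sin(8 * x) * np.exp(-2 * x)
X_all = np.linspace(0, 1, 200).reshape(-1, 1)      # candidate probe positions

X = [[0.5]]                                        # one seed measurement
y = [float(truth(np.array([0.5]))[0])]
gp = GaussianProcessRegressor()

for _ in range(15):                                # tiny experimental budget
    gp.fit(np.array(X), np.array(y))
    mu, sigma = gp.predict(X_all, return_std=True)
    nxt = X_all[int(np.argmax(mu + 2.0 * sigma))]  # myopic UCB: reward available each step
    X.append([float(nxt[0])])
    y.append(float(truth(nxt)[0]))
```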