WEBVTT 00:00:00.000 --> 00:00:16.000 Okay, let me make sure. I think, yeah, I think that goes. Hello, we're going to give it just another half a minute to make sure our numbers are stable. 00:00:16.000 --> 00:00:20.000 People are joining. 00:00:20.000 --> 00:00:27.000 I can't see the Q&A, so you can just handle that on your end. Yeah. Okay. 00:00:27.000 --> 00:00:40.000 All right, okay. Well, thank you everyone for joining us here in person. 00:00:40.000 --> 00:00:55.000 It's great to see people in the room despite last-minute changes in the rooms. 00:00:55.000 --> 00:01:08.000 And thank you everyone for joining us online. 00:01:08.000 --> 00:01:15.000 And we apologize for a couple of minutes' delay. We had some technical difficulties. You would think by now we would know how to do Zoom, but things come up. 00:01:18.000 --> 00:01:42.000 So thank you very much for your patience. Today we're very, very excited to have the distinguished lecturer Rebecca Willett, who is a professor at the University of Chicago and holds a courtesy appointment at the Toyota Technological Institute at Chicago as well. 00:01:42.000 --> 00:01:54.000 Some of her accomplishments I want to highlight: she's a fellow of the IEEE. 00:01:54.000 --> 00:02:02.000 She's also a fellow of SIAM. And her research has been recognized with very distinguished awards, such as the CAREER award from NSF, which of course is the best. 00:02:02.000 --> 00:02:18.000 And a Young Investigator award from AFOSR. So I think, without taking more time on the introduction, I just wanted to 00:02:18.000 --> 00:02:35.000 hand it over to you. 00:02:35.000 --> 00:03:19.000 I do wanna also highlight, we're excited to see that not just only... 00:04:55.000 --> 00:04:59.000 ...the medical imaging data: the way that we take that data and form an image for somebody to interpret, or the way that computers are used to help interpret those images. 00:04:59.000 --> 00:05:14.000 It's going to affect the way surgeons plan and execute surgeries, and even affect the pharmaceuticals that are prescribed post-operation. 00:05:14.000 --> 00:05:26.000 And I think the impacts of AI and machine learning go beyond healthcare. So we've heard people talking about the potential of machine learning to help. 00:05:26.000 --> 00:05:26.000 So I'm just getting a notification here from Zoom. Sorry about that.
00:05:26.000 --> 00:05:35.000 We've heard people talking about the potential of AI and machine learning to impact all kinds of things that the National Science Foundation cares about: 00:05:35.000 --> 00:05:47.000 developing a new understanding of the rules of life or the laws of nature; accelerating affordable drug development, 00:05:47.000 --> 00:06:00.000 especially for, you know, somewhat rare diseases or underserved communities; engineering green materials; building quantum computers; and even developing sustainable climate policies. 00:06:00.000 --> 00:06:12.000 And when you hear these lists, I think it's tempting to think that we've just developed great machine learning and AI technologies. 00:06:12.000 --> 00:06:23.000 We see evidence of that regularly. And then all that's left to do, perhaps, is to take those tools and figure out how to plug them into different domains. 00:06:23.000 --> 00:06:36.000 But what I'd really like to emphasize today is that that is not the state of the world, and that it's absolutely vital that we invest in fundamental AI and machine learning research. 00:06:36.000 --> 00:07:01.000 And in particular, I would like to claim that if we try to develop applied machine learning without really understanding the mathematical... 00:07:01.000 --> 00:07:12.000 So with that in mind, today I wanna cover 2 core areas. 00:07:12.000 --> 00:07:23.000 One is just trying to do a retrospective look at some of the areas where the machine learning foundations that have been developed by the community and supported by the NSF have really had a major impact on the field and on practice. 00:07:23.000 --> 00:07:43.000 And then second, I want to talk about some of the emerging and future directions within the AI and machine learning communities, the role that foundational research is expected to play in those areas, and what some of the major open questions are. 00:07:43.000 --> 00:07:47.000 Okay, so first let's just talk about some examples where foundational research has really been impactful and where we needed that understanding. 00:07:47.000 --> 00:08:00.000 And I think maybe the first example that would come to mind for many of us is in the context of optimization theory. 00:08:00.000 --> 00:08:08.000 So when we're performing machine learning, we set up a model, and that model has got parameters, or in the context of a neural network we've got weights on all of the different nodes. 00:08:08.000 --> 00:08:17.000 And what we do is we take training data and figure out how to set those parameters or weights so that we make good predictions on the training data. 00:08:17.000 --> 00:08:23.000 And we do this by solving an optimization problem.
00:08:23.000 --> 00:08:25.000 So we're going to compute gradients of some loss function at our current values of the parameters and update them based on the gradients of the loss. 00:08:25.000 --> 00:08:39.000 One foundational innovation that's relatively recent that I'd like to highlight is, for instance, AdaGrad. 00:08:39.000 --> 00:08:49.000 And what AdaGrad does is it takes these gradients of the losses and adapts the updates based on past gradient information. 00:08:49.000 --> 00:08:56.000 And this ultimately can accelerate training. It means that we can find good parameter estimates with many fewer iterations. 00:08:56.000 --> 00:09:11.000 And this kind of theoretical, foundational algorithm is one of the key ideas that underpins, for instance, the Adam algorithm, which is widely used across industry, academia, national labs, etc., 00:09:11.000 --> 00:09:22.000 for training machine learning models. Another example tied to optimization is distributed optimization across a collection of computers, 00:09:22.000 --> 00:09:31.000 say, nodes in a cluster. So the expectation is that if I were to invest money and double the size of my cluster, then I should be able to train my machine learning method twice as fast. 00:09:31.000 --> 00:09:44.000 But in the early days of machine learning practice, this expectation was not realized. We very quickly hit a point of diminishing returns, where I maybe would double my investment in computational infrastructure 00:09:44.000 --> 00:10:14.000 and only get a tiny little improvement in the speed of my training. And so foundational, fundamental research in optimization and distributed optimization helped us understand why this was occurring, and that in turn led to new algorithms. Just one example of this is the Hogwild! algorithm that was developed at the University of Wisconsin, where I used to work, where they used some of these theoretical insights to develop asynchronous distributed optimization methods that 00:10:23.000 --> 00:10:28.000 had this expected property, right? If I were to double the size of my computational infrastructure, then I would come close to doubling the speed of my training. 00:10:28.000 --> 00:10:46.000 And this led to many other innovations in terms of distributed optimization that were fundamental to training some of the large-scale tools that we see prevalent today, like ChatGPT. 00:10:46.000 --> 00:10:55.000 I'd also like to highlight privacy guarantees. So when we're dealing with machine learning data, and it reflects data about humans, 00:10:55.000 --> 00:11:08.000 we want to make sure that their privacy is protected. One standard that people use to say whether their algorithm is protecting people's privacy is something called k-anonymity. 00:11:08.000 --> 00:11:29.000 So k-anonymity says that I'm going to transform my data just enough to make sure that any individual in my data set is going to be indistinguishable from k other individuals. 00:11:29.000 --> 00:11:41.000 And this standard is encoded into law, right? This is considered sufficient for fulfilling laws like HIPAA requirements in the US or GDPR. But recently, one of my colleagues at UChicago, Aloni Cohen, showed that this is actually a vulnerable standard.
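For illustration only, here is a minimal sketch of what the k-anonymity standard just described asks of a released table: every combination of quasi-identifier values has to appear at least k times. The record fields and generalizations below are hypothetical and are not from the talk.

```python
from collections import Counter

def is_k_anonymous(records, quasi_identifiers, k):
    """Check whether every combination of quasi-identifier values
    appears at least k times in the released records."""
    combos = Counter(tuple(r[q] for q in quasi_identifiers) for r in records)
    return all(count >= k for count in combos.values())

# Toy released table: ages generalized to decades, ZIP codes truncated.
released = [
    {"age": "30-39", "zip": "537**", "diagnosis": "flu"},
    {"age": "30-39", "zip": "537**", "diagnosis": "asthma"},
    {"age": "40-49", "zip": "606**", "diagnosis": "flu"},
    {"age": "40-49", "zip": "606**", "diagnosis": "diabetes"},
]

print(is_k_anonymous(released, ["age", "zip"], k=2))  # True
print(is_k_anonymous(released, ["age", "zip"], k=3))  # False
```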
00:11:41.000 --> 00:11:52.000 He said, you know, when people actually apply this, they don't want to change their data too much, because they want to learn from it as much as possible. 00:11:52.000 --> 00:12:05.000 And so they would redact the minimum possible amount of information to satisfy this requirement. And so that gives a malicious actor some additional information about that data set. 00:12:05.000 --> 00:12:17.000 So initially this work got a somewhat negative reaction. People said, well, this is sort of like a theoretical bound, but I don't know how you could use that in a practical sense to risk someone's privacy. 00:12:17.000 --> 00:12:30.000 But then Aloni and his collaborators actually came up with an algorithm based on these theoretical, foundational insights that demonstrated how you could leverage this insight to violate people's privacy. 00:12:30.000 --> 00:12:39.000 So I think this is a really key example of how foundational research is absolutely essential. And sort of analogously, we've had a lot of work, supported by the NSF, in areas like differential privacy, which gives us an alternative mechanism 00:12:39.000 --> 00:13:04.000 for trying to safeguard privacy by adding random perturbations to the data. And these are coming with, you know, statistical guarantees that were absent from the kind of k-anonymity standards that we were just considering. 00:13:04.000 --> 00:13:19.000 Another area where foundational research has really impacted machine learning is in quantifying uncertainty. So we often think about machine learning tools as taking in an input, say a feature vector, and producing a label, a prediction. 00:13:19.000 --> 00:13:27.000 But in many, many settings, we don't only want sort of a point prediction. We want some uncertainty associated with that prediction. 00:13:27.000 --> 00:13:37.000 And this is essential in all kinds of contexts: climate analysis, model predictive control, automatic translation, and a variety of other settings. 00:13:37.000 --> 00:13:51.000 And people have thought about uncertainty quantification for a very long time. But classical methods might require a simple model, like we're just going to do linear regression, in which case neural networks would have no guarantees at all, or they would require very strong prior knowledge, like we know exactly what the distribution underlying our data is, 00:13:51.000 --> 00:14:06.000 which in general we would never know. But recent efforts have focused on ideas like conformal prediction. 00:14:06.000 --> 00:14:08.000 And these tools are allowing machine learning people to assess the uncertainty of predictions with theoretical guarantees and really minimal assumptions. 00:14:08.000 --> 00:14:26.000 These can work with black-box models: whatever new fancy neural network you come up with, with all the bells and whistles, you can still use the ideas from this toolbox. 00:14:26.000 --> 00:14:46.000 And this has had really broad impact. It's been used in contexts that range all the way from object pose estimation to biomolecular design to analyzing polls for elections to even clinical medical sciences.
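To make the conformal prediction idea a bit more concrete, here is a minimal sketch of split conformal prediction for regression. It assumes a held-out calibration set and any black-box model with a predict method; the names and the finite-sample quantile correction are illustrative, not taken from the talk.

```python
import numpy as np

def split_conformal_interval(model, X_cal, y_cal, X_new, alpha=0.1):
    """Split conformal prediction: wrap any black-box regressor with a
    prediction interval that covers the truth roughly (1 - alpha) of the
    time, assuming calibration and test points are exchangeable."""
    # Nonconformity scores on held-out calibration data.
    scores = np.abs(y_cal - model.predict(X_cal))
    n = len(scores)
    # Finite-sample-corrected quantile of the scores (conservative rounding).
    q_level = min(1.0, np.ceil((n + 1) * (1 - alpha)) / n)
    q = np.quantile(scores, q_level, method="higher")
    preds = model.predict(X_new)
    return preds - q, preds + q

# Usage sketch with a placeholder model exposing .predict(), e.g. any
# fitted scikit-learn regressor or a wrapped neural network:
# lower, upper = split_conformal_interval(model, X_cal, y_cal, X_test, alpha=0.1)
```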
00:14:46.000 --> 00:15:11.000 Okay, so the next example, oh sorry, not the final one, second to last, where we've seen foundational research really have a big impact in the machine learning context is in allocating data collection resources. So one of the giant bottlenecks that we experience is not necessarily with the training, but just the initial collection of the data, or the curating and annotating of that data. 00:15:11.000 --> 00:15:24.000 So tools like bandit algorithms, active learning, Bayesian optimization, all things that are foundational, fundamental research, can be used to guide data collection and labeling. 00:15:24.000 --> 00:15:35.000 And these are not just sort of theoretical algorithms that nobody uses in practice. People are actively using these kinds of ideas in industry, for instance for ad placement. 00:15:35.000 --> 00:15:43.000 Okay, so now one final example. I want to return to this medical imaging context that we started off with. 00:15:43.000 --> 00:15:49.000 And I've got a video here that highlights how a CAT scanner works, a CT scanner. 00:15:49.000 --> 00:15:57.000 So basically you lie in the scanner and it shoots X-rays through your body, and on the other end you measure what's called the absorption profile: 00:15:57.000 --> 00:15:57.000 how much of that X-ray energy was absorbed along each of the different X-ray paths. 00:15:57.000 --> 00:16:06.000 And what I'm gonna show here on the right is that absorption profile. Each column is gonna correspond to a different absorption profile. 00:16:06.000 --> 00:16:27.000 White will mean high absorption and dark will mean low absorption. So this whole apparatus rotates around your body, and we measure the absorption profile for each rotation, and it forms a different column in this set of observations. 00:16:27.000 --> 00:16:32.000 So this thing on the right is what we would actually collect with a scanner. It's called a sinogram. 00:16:32.000 --> 00:16:40.000 And the core next task is to map that sinogram to an image that a radiologist can actually look at. 00:16:40.000 --> 00:16:48.000 So this is a difficult process. This is an ill-posed inverse problem. It's really sensitive to noise and other kinds of challenges. 00:16:48.000 --> 00:16:53.000 And so what people started asking is, well, can I somehow leverage machine learning to improve this image reconstruction process? 00:16:53.000 --> 00:17:09.000 And so initial early efforts took collections of sinograms and images and some off-the-shelf neural network architecture and tried to train the network to actually perform this reconstruction. 00:17:09.000 --> 00:17:14.000 And if you just had enormous quantities of data, then this worked okay. It was a great proof of concept and spurred a lot of interest. 00:17:14.000 --> 00:17:30.000 But it also just ignores everything we know about the data collection process. Because in this kind of simple paradigm, we're forcing the network to learn 2 things. 00:17:30.000 --> 00:17:38.000 First, it has to learn something about the geometry and the structure of the images that we want to reconstruct, which is something that hopefully training data can give us insight into. 00:17:38.000 --> 00:17:48.000 But second, it's having to learn something about the relationship between that image and the sinogram, which is something that we absolutely know, right? 00:17:48.000 --> 00:17:57.000 Because we engineered our CT scanners. And so this is where, you know, foundational research really comes to bear. 00:17:57.000 --> 00:18:01.000 We say, well, can we design a neural network that reflects our knowledge of this underlying physics? And the answer is yes.
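As a reference point, here is a toy sketch of how a sinogram like the one just described can be formed, assuming a simple parallel-beam geometry and using scipy's image rotation as a stand-in for the scanner rotating around the object; the phantom and angles are made up for illustration and are not from the talk.

```python
import numpy as np
from scipy.ndimage import rotate

def toy_sinogram(image, angles_deg):
    """Toy parallel-beam CT forward model: for each gantry angle, rotate the
    object and sum attenuation along the beam direction. Each angle produces
    one column of the sinogram, matching the 'one column per rotation' picture."""
    columns = []
    for theta in angles_deg:
        rotated = rotate(image, theta, reshape=False, order=1)
        columns.append(rotated.sum(axis=0))  # line integrals along the rays
    return np.stack(columns, axis=1)  # shape: (detector bins, num angles)

# Toy object: a square "phantom" with a denser inclusion in the middle.
phantom = np.zeros((64, 64))
phantom[20:44, 20:44] = 1.0
phantom[28:36, 28:36] = 3.0

sino = toy_sinogram(phantom, angles_deg=np.linspace(0.0, 180.0, 60))
print(sino.shape)  # (64, 60)
```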
00:18:01.000 --> 00:18:09.000 And to figure out the best way to do it, we build upon foundational research in inverse problem theory, 00:18:09.000 --> 00:18:15.000 data assimilation, signal processing, optimization, and other fields as well. So let me just give you a little bit of nuance for this one particular example. 00:18:15.000 --> 00:18:26.000 So our goal is to reconstruct an image that I'm going to call x from a sinogram that I'll call y. 00:18:26.000 --> 00:18:35.000 And we think about y as being some function H times x plus epsilon, where H is basically telling us what the physics of this forward model is. 00:18:35.000 --> 00:18:48.000 And in a classical setting, before we bring machine learning into the mix, the way we might approach this is to search over all possible images and find one that, first of all, is a good fit to the data. 00:18:48.000 --> 00:18:56.000 And when we're measuring how good of a fit it is to the data, we take this forward model, this model of how the CT scanner works, into account. 00:18:56.000 --> 00:19:04.000 But second, we recognize that this is really sensitive to noise and measurement error and other things. And so we also try to make sure that whatever image we come up with is a reasonable image. 00:19:04.000 --> 00:19:10.000 And classically we would say we prefer to have images that are smooth, maybe sparse in some wavelet basis, or something like that. 00:19:10.000 --> 00:19:27.000 Mathematicians would sit around and figure out a good choice of this regularization function. And once we had this set up, we could use an optimization routine to find this good image. 00:19:27.000 --> 00:19:35.000 So we would generally alternate between 2 steps. In the first step, we would take our current estimate and nudge it a little bit to be closer to our observed data. 00:19:35.000 --> 00:19:49.000 And this is where we take our physical model of the CT scanner into account. And then second, we would apply some regularization step where we would say, I'm going to take my intermediate estimate and just clean it up a little bit, 00:19:49.000 --> 00:19:50.000 make sure it's a little bit smoother, that I've reduced some of the errors in it. 00:19:50.000 --> 00:20:01.000 And we would alternate between these, and you can even represent this as a block diagram. So now we can say, all right, now I wanna use machine learning. 00:20:01.000 --> 00:20:19.000 So I've talked about the naive machine learning approach. I've talked about the classical inverse problems approach. And a nice hybrid turns out to be to replace this regularization step, which before, you know, mathematicians and signal processors would dream up, with a trained component, a neural network that we can actually set the weights of. 00:20:19.000 --> 00:20:34.000 And so with what's called the deep unrolling framework, we try to set the weights of this neural network so that the final output, after some number of blocks, is a faithful reconstruction of the image. 00:20:34.000 --> 00:20:37.000 We use our training data to set that. So first of all, this works really nicely in practice. 00:20:37.000 --> 00:20:55.000 So in the context of MRI, for instance, it allows us to get really accurate reconstructions even when we've got, say, a factor of 6 fewer measurements than we have pixels we want to reconstruct. 00:20:55.000 --> 00:21:03.000 So we're figuring out how to basically fill in the blanks very accurately using these machine learning tools.
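Here is a minimal sketch of that alternating structure: a data-consistency step that uses the known forward model H, followed by a "clean-up" step that deep unrolling would replace with a trained network. The soft-thresholding denoiser, step size, and toy problem below are placeholders for illustration, not the specific architecture from the talk.

```python
import numpy as np

def unrolled_reconstruction(y, H, denoiser, num_blocks=10, step=0.1):
    """Sketch of a deep-unrolling-style reconstruction. Each block does:
    (1) a data-consistency step using the known physics H, and
    (2) a regularization step (here a placeholder denoiser; in deep
        unrolling this would be a neural network trained from data)."""
    x = H.T @ y  # crude initialization from the measurements
    for _ in range(num_blocks):
        # Step 1: nudge the estimate toward agreement with the observed data y.
        x = x - step * H.T @ (H @ x - y)
        # Step 2: clean up the intermediate estimate.
        x = denoiser(x)
    return x

# Placeholder "denoiser": simple shrinkage toward zero, standing in for a
# trained network in this sketch.
soft_threshold = lambda x, t=0.02: np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

# Toy problem: undersampled linear measurements of a sparse signal.
rng = np.random.default_rng(0)
x_true = np.zeros(128)
x_true[rng.choice(128, size=8, replace=False)] = 1.0
H = rng.standard_normal((64, 128)) / np.sqrt(64)
y = H @ x_true + 0.01 * rng.standard_normal(64)

x_hat = unrolled_reconstruction(y, H, soft_threshold, num_blocks=200)
print(np.linalg.norm(x_hat - x_true) / np.linalg.norm(x_true))  # relative error
```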
00:21:03.000 --> 00:21:10.000 And not only that, but we can use the networks that we learn in order to facilitate even faster data acquisition. 00:21:10.000 --> 00:21:17.000 So I could maybe have the number of samples be a factor of 8 smaller than the number of pixels that I want to reconstruct. 00:21:17.000 --> 00:21:39.000 But what I think is even more interesting and exciting about this is a little bit more subtle, perhaps. So we had this block diagram before, and we said we were going to train the neural networks within this block diagram. 00:21:39.000 --> 00:21:57.000 But we can think of it as a diagram where some of the blocks, these data consistency blocks, are fixed and determined by the physics and the setup that we're dealing with, 00:21:57.000 --> 00:22:25.000 and some of the blocks are unknown and have weights that we can learn using training data. And so this is an example, I think, of where incorporating physical models, foundational inverse problem theory, and optimization methods has led to new ways of thinking about how to design neural network architectures that go way beyond just taking an off-the-shelf architecture and plugging it in like the early work did. 00:22:25.000 --> 00:22:32.000 Okay, so overall, all of the advances that I've just described are the result of just decades of NSF 00:22:32.000 --> 00:22:32.000 investment in foundational research. These are not things that people just stumbled upon by running millions of experiments. 00:22:32.000 --> 00:22:46.000 They stemmed from really deep understanding of the issues at play. But like I said, that's not the end of the story. 00:22:46.000 --> 00:22:58.000 I think there still remain some really major foundational questions that we need to be answering. It's not a matter of taking the existing tools and plugging them in in different contexts. 00:22:58.000 --> 00:23:02.000 So one example, for instance, addresses a problem I think we're all aware of, with AI and machine learning tools 00:23:02.000 --> 00:23:18.000 having some really serious societal implications that are concerning: biases arising, or these tools being used for malicious purposes. 00:23:18.000 --> 00:23:29.000 And we really should worry about that when these tools are being incorporated into health care, deciding who gets a mortgage, or other kinds of financial tools. 00:23:29.000 --> 00:23:34.000 So because of this, a number of people have said, well, we really need to design regulations and certifications of ML algorithms. 00:23:34.000 --> 00:23:40.000 And I think this is really well intentioned, but this is something where I think foundational research is absolutely essential. 00:23:40.000 --> 00:24:01.000 And to see why, just think back to this earlier example that we talked about with privacy, right, where we developed laws without really understanding all of the technical underpinnings, without really understanding how these things might work or where they might fail. 00:24:01.000 --> 00:24:05.000 And we ended up with laws that really did not satisfy the spirit of what we were trying to accomplish, right? 00:24:05.000 --> 00:24:25.000 People's privacy wasn't actually preserved.
And so if we want to come up with technical regulations and certification policies, we really need to understand how these methods work to ensure that these regulations will be efficacious. 00:24:25.000 --> 00:24:34.000 In addition, many of you are probably aware that when we train these super large scale systems, it's really not climate-friendly. 00:24:34.000 --> 00:24:52.000 There's an enormous carbon footprint associated with training large-scale models, and the data that we use to fuel these models is stored in big data warehouses that have to be kept cool with water, even in states with major water shortages. 00:24:52.000 --> 00:24:57.000 And I think there's a real desire to figure out how we can get all the power and benefits of these tools without destroying the climate. 00:24:57.000 --> 00:25:12.000 And I would argue that understanding the mathematical, statistical, and computational foundations of machine learning is going to help us optimize architectures and training efficiency. 00:25:12.000 --> 00:25:27.000 So we're going to be able to figure out how to compress these models, or train them with fewer steps, or develop low-power implementations that become much more efficient. 00:25:27.000 --> 00:25:35.000 And to kind of see how we might use foundations to address this, I want to highlight some broader questions that are related to this. 00:25:35.000 --> 00:25:45.000 Things like, how much data do I really need to train my model for a given problem? Is it going to be robust, or what do I need to do to make sure it's going to be robust? 00:25:45.000 --> 00:26:16.000 After I train it, if I use it in a slightly different context, is it still going to work? How can I make these things sustainable? And for specific architectures, like say the transformers that underlie ChatGPT, what makes them work? So to get at these kinds of questions, 00:26:16.000 --> 00:26:27.000 some people in the community are thinking about neural networks as functions, right? The neural network takes in an input and produces an output that's a function of the input. 00:26:27.000 --> 00:26:42.000 And with that perspective, we can suddenly say, well, are there ideas from computational harmonic analysis or functional analysis or signal processing that can give us insight into how well we can estimate these functions, or understand how they're going to perform in different settings? 00:26:42.000 --> 00:26:50.000 So to give you a flavor of that, for instance, what I'm showing here are 2 different functions. 00:26:50.000 --> 00:27:02.000 They're both functions in 2D, so they've got inputs x1 and x2, and the color is indicating the value of the function. 00:27:02.000 --> 00:27:08.000 And there are also some kind of hard-to-see little dots here. Those would be training samples. So both of these functions fit the training samples exactly. 00:27:08.000 --> 00:27:21.000 These are interpolating functions. So when we train a neural network, if we're able to train it to exactly fit our training data, which happens many, many times,
00:27:21.000 --> 00:27:29.000 and is, I would say, the most common mode of operation, then we're essentially finding an interpolating function. 00:27:29.000 --> 00:27:50.000 But as this example illustrates, there are many possible interpolating functions. So which is the one that we ultimately find, and how do things like our training algorithm or our choice of architecture or other features affect what function we ultimately fit to the data? 00:27:50.000 --> 00:27:56.000 And what implications does that have for things like robustness? 00:27:56.000 --> 00:28:03.000 Okay, so, a little bit more application oriented: I started off talking about these moonshot problems in the sciences that the NSF, I think, cares about very deeply. 00:28:03.000 --> 00:28:17.000 And I think that machine learning is really going to fundamentally change the way that we approach these problems. So one aspect of that is sort of obvious, right? 00:28:17.000 --> 00:28:20.000 We're gonna take data from experiments, and then we're going to try to analyze that data using machine learning methods. 00:28:20.000 --> 00:28:35.000 But in addition to that, there are opportunities for machine learning to affect hypothesis generation, or to affect how we design experiments, or the way that we handle simulations. 00:28:35.000 --> 00:28:48.000 It's really going to affect every aspect of the scientific method. So people are exploring these ideas, but in many cases it's sort of on a case-by-case basis: 00:28:48.000 --> 00:29:02.000 let me think about experimental design in biology; let me think about simulations in physics. But I really believe that there are some deep foundational questions that we can think about that span all the different application domains, 00:29:02.000 --> 00:29:14.000 and that making investments in those areas and building up those foundations will be deeply impactful in just thinking about machine learning and the scientific discovery process. 00:29:14.000 --> 00:29:19.000 So those 4 kind of prevailing themes, and I'm gonna elaborate on all of them, are uncovering new laws of nature, 00:29:19.000 --> 00:29:29.000 AI-guided scientific measurement, physics-informed machine learning, and advancing machine learning frontiers. So, uncovering new laws of nature. 00:29:29.000 --> 00:29:43.000 What I mean by this is we want to be able to take observations of some sort of physical system and then use AI or machine learning to uncover the governing physical laws. 00:29:43.000 --> 00:29:46.000 So just as an example, here I've got a video that's illustrating 2 particles 00:29:46.000 --> 00:30:01.000 moving according to these 3 equations. This is called the Lorenz 63 system, and it's a chaotic system, so you can see, as time progresses, these particles can be pretty far apart. 00:30:01.000 --> 00:30:09.000 And it's a nice model for things like atmospheric convection and studying various processes. 00:30:09.000 --> 00:30:25.000 So in this particular case, the data was generated by a particular system of equations. But in many scientific settings, we just get observations, and what we would like to be able to do is to figure out a good descriptive series of equations. 00:30:25.000 --> 00:30:34.000 And that's going to help us understand how to extrapolate our observations to new settings and new contexts, hopefully. 00:30:34.000 --> 00:30:45.000 So the question is, can we use machine learning or AI to do this mapping?
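For reference, the 3 equations behind that video are the standard Lorenz 63 system. Here is a minimal sketch of simulating it, using the textbook parameter values and a simple Euler integrator (details not taken from the talk), with two nearby starting points to show the chaotic divergence.

```python
import numpy as np

def lorenz63_deriv(state, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    """The three Lorenz 63 equations:
       dx/dt = sigma * (y - x)
       dy/dt = x * (rho - z) - y
       dz/dt = x * y - beta * z"""
    x, y, z = state
    return np.array([sigma * (y - x), x * (rho - z) - y, x * y - beta * z])

def simulate(state0, dt=0.01, steps=5000):
    """Simple forward-Euler integration, enough to visualize the chaotic behavior."""
    traj = [np.asarray(state0, dtype=float)]
    for _ in range(steps):
        traj.append(traj[-1] + dt * lorenz63_deriv(traj[-1]))
    return np.array(traj)

# Two nearby initial conditions diverge over time, the hallmark of chaos.
a = simulate([1.0, 1.0, 1.0])
b = simulate([1.0, 1.0, 1.0 + 1e-6])
print(np.linalg.norm(a[-1] - b[-1]))  # far apart despite nearly identical starts
```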
And it sounds a little bit magical, but I'm going to give you a hint at how some of these methods might be achievable. 00:30:45.000 --> 00:31:08.000 But first of all, I just want to emphasize, if we can figure out foundational tools for solving this problem, it could be enormously impactful: from trying to figure out the biophysical forces of cell development, to condensed matter and polymer physics so we can design better materials, to trying to understand the dynamics of microbial communities so we can figure out, for instance, how to better break down plastics, or even studying 00:31:08.000 --> 00:31:25.000 phenomena revealed by agent-based models, such as how people respond to different pandemic policies, for instance. Okay, so why is this not just pie-in-the-sky thinking? 00:31:25.000 --> 00:31:30.000 So if we return to the Lorenz 63 example, remember there were 3 variables, x, y, and z, for the 3D location of each particle. 00:31:30.000 --> 00:31:53.000 So let's just think about how the x location is evolving over time. We assume we don't know that equation, but we're going to write it down as a weighted sum of things like the x location, the y location, the z location, the x location squared, etc. 00:31:53.000 --> 00:32:08.000 And then we're going to use our observations to estimate what these weights are. And so using sparse estimation routines, what we would uncover is that all of these blue-green w's are 0 and the red w's are non-zero. 00:32:08.000 --> 00:32:19.000 And then that gives us a nice simple expression for the temporal evolution of the x variable. We can do the same for the y's and the z's and uncover those dynamics. 00:32:19.000 --> 00:32:32.000 So this kind of simple idea has been the catalyst of a lot of interesting work. People have demonstrated how they can recover orbital mechanics using these kinds of tools. 00:32:32.000 --> 00:32:40.000 Tools like AI Feynman have tried to take these ideas and generalize them in pretty exciting ways. 00:32:40.000 --> 00:32:51.000 But ultimately, we want the equations that come out of this to be trustworthy, right? 00:32:51.000 --> 00:33:09.000 There's always room for error, because the space of equations is enormous. And so in order to do that, we really need to again invest in foundational research into how we can use AI and machine learning tools to ensure that this is useful.
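To make that sparse-regression recipe concrete, here is a minimal sketch in the spirit of sparse dynamics-discovery methods: simulate Lorenz 63 data, build a library of candidate terms for the weighted sum, and use thresholded least squares to keep only the terms with large weights. The library, the threshold, and the finite-difference derivatives are illustrative choices, not the specific algorithm from the talk.

```python
import numpy as np

# Simulate a Lorenz 63 trajectory (forward Euler, as in the earlier sketch).
sigma, rho, beta, dt = 10.0, 28.0, 8.0 / 3.0, 0.01
traj = [np.array([1.0, 1.0, 1.0])]
for _ in range(5000):
    x, y, z = traj[-1]
    traj.append(traj[-1] + dt * np.array([sigma * (y - x), x * (rho - z) - y, x * y - beta * z]))
traj = np.array(traj)

# Approximate dx/dt from the data with finite differences.
dxdt = (traj[1:, 0] - traj[:-1, 0]) / dt
x, y, z = traj[:-1, 0], traj[:-1, 1], traj[:-1, 2]

# Candidate library: the unknown equation written as a weighted sum of simple terms.
library = np.column_stack([np.ones_like(x), x, y, z, x * x, x * y, x * z, y * z])
names = ["1", "x", "y", "z", "x^2", "xy", "xz", "yz"]

# Sequentially thresholded least squares: fit weights, zero out small ones, refit.
w = np.linalg.lstsq(library, dxdt, rcond=None)[0]
for _ in range(10):
    small = np.abs(w) < 0.5
    w[small] = 0.0
    if (~small).any():
        w[~small] = np.linalg.lstsq(library[:, ~small], dxdt, rcond=None)[0]

# Only the x and y terms should survive, recovering dx/dt ≈ -10 x + 10 y.
print({n: round(c, 2) for n, c in zip(names, w) if c != 0.0})
```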
00:33:09.000 --> 00:33:19.000 Okay, so the second problem of how AI and machine learning are affecting science is in AI-guided scientific measurement. 00:33:19.000 --> 00:33:26.000 And what I mean here is to use AI or machine learning to design better experiments, simulations, and sensors. 00:33:26.000 --> 00:33:42.000 So just as an example, let's say that I want to design a microbial community for a specific purpose, say to break down plastics, or as a therapeutic for somebody whose microbiome is out of whack or something like that. 00:33:42.000 --> 00:33:49.000 So what I know is that the efficacy of the community is going to be a function of all different kinds of things. 00:33:49.000 --> 00:33:59.000 It's going to depend on what strains are present in what densities, what nutrients are present, what antimicrobial peptides are present, and what the environmental conditions are. 00:33:59.000 --> 00:34:12.000 So we don't know this function. We just know that somehow it exists, and we want to figure out what's the best way to set all these different variables in order to maximize the fitness. 00:34:12.000 --> 00:34:19.000 The problem is that there are just so many possibilities that we can't possibly try them all out. We only have finite experimental resources. 00:34:19.000 --> 00:34:24.000 And if we were to sample at random, the risk is that we end up evaluating a lot of different communities that are nowhere near the maximum that we're looking for. 00:34:24.000 --> 00:34:42.000 And so this is where some of the ideas that I mentioned earlier related to active learning and bandit methods, or even uncertainty quantification, are really essential to help us guide these sequences of measurements. 00:34:42.000 --> 00:34:56.000 And so investing in these foundational areas is kind of fueling this work. And we've seen that people are really trying to scale these ideas up. Different labs, 00:34:56.000 --> 00:35:14.000 both in academia and industry, are building self-driving laboratories for things like materials design, and even in the pharmaceutical industry, when people talk about the design-make-test-analyze cycle of designing new drugs, they are starting to incorporate these kinds of ideas. 00:35:14.000 --> 00:35:27.000 Okay, so the third problem here is physics-informed machine learning. And here what I mean is that we want to optimally leverage physical models along with experimental or observational data. 00:35:27.000 --> 00:35:32.000 So I already kind of hinted at this when we had our discussion about CT and MRI. 00:35:32.000 --> 00:35:52.000 But really, there are just tons of settings where we have both physical knowledge as well as data: large-scale physics experiments like photon sources or synchrotrons or the LIGO experiments for measuring gravitational waves, things like molecular structure estimation, or even climate forecasting. 00:35:52.000 --> 00:36:01.000 We have decades of foundational research in those areas that inform what's happening under the hood and can complement our data. 00:36:01.000 --> 00:36:11.000 And so the question really becomes, well, how do we leverage it? So one thing people have thought about is trying to use that physical knowledge to build large, complex simulations. 00:36:11.000 --> 00:36:17.000 And then you might say, okay, I'm going to use that simulation data in order to train a machine learning model. 00:36:17.000 --> 00:36:23.000 And then hopefully that's going to give me good predictions about what's going to happen in the real world. 00:36:23.000 --> 00:36:53.000 And I think that hope is possible, but I also think that in order to make this succeed, we have to understand things about distribution drift (because the distribution of my simulated data is not the distribution of real-life data), and about transfer learning, data assimilation, and reduced-order models. Moreover, I think, broadly speaking, there are lots of areas where physical models can help inform machine learning. 00:36:59.000 --> 00:37:07.000 We've talked about leveraging simulations.
There's work in areas like physics-informed neural networks, where people are trying to accelerate the simulations using machine learning tools. 00:37:07.000 --> 00:37:18.000 I'm going to talk a tiny bit about that later, or even upscaling molecular dynamics simulations. 00:37:18.000 --> 00:37:29.000 And I would claim that figuring out the best way to incorporate this physics information is absolutely essential to making sure that these methods are robust, that they're going to work when you have relatively little training data. 00:37:29.000 --> 00:37:48.000 And, really critical in the sciences, we need this physical knowledge in order to help us extrapolate beyond the domain of the training data, right? 00:37:48.000 --> 00:37:55.000 This is what science is all about. And if we don't incorporate our physical models, then we risk really getting a new understanding that only holds in a very limited domain. 00:37:55.000 --> 00:38:08.000 Okay, so finally I wanna talk about advancing machine learning frontiers: really fundamental machine learning and AI research areas that are broadly applicable in the sciences. 00:38:08.000 --> 00:38:22.000 So one area that's received quite a bit of attention lately is learned emulators. So for example, say I've got a large-scale climate simulation that can only run on a supercomputer and takes a very long time. 00:38:22.000 --> 00:38:28.000 What I'd like to be able to do is to form a training data set using past simulations and use that to train 00:38:28.000 --> 00:38:38.000 a machine learning model, say a neural network, that's going to mimic that simulator, but at a tiny fraction of the computational cost. 00:38:38.000 --> 00:38:42.000 So that then, if I want to run a future simulation, instead of needing my supercomputer, maybe I can use my desktop. 00:38:42.000 --> 00:38:55.000 So this is the goal, and there have been some anecdotal successes that are really exciting, including one paper that was highlighting just orders of magnitude of acceleration using machine learning tools. 00:38:55.000 --> 00:39:08.000 And I would characterize this line of research as being an example of a broader class of problems that I would call generative models for science. 00:39:08.000 --> 00:39:16.000 So generative models include things like ChatGPT for text or DALL-E for language, I'm sorry, for images. 00:39:16.000 --> 00:39:28.000 But we could imagine in the sciences wanting to generate new climate scenarios or wanting to generate new proteins. 00:39:28.000 --> 00:39:42.000 We cannot just take, say, the ChatGPT architecture and plug it into scientific data and all of a sudden have a good, useful, scientifically rigorous generative model of the science. 00:39:42.000 --> 00:39:50.000 And there are a number of reasons for this. First of all, we just have way less scientific data than ChatGPT or DALL-E. 00:39:50.000 --> 00:40:00.000 Even when we've got huge volumes of data, a lot of times that can be a small number of samples of really high-dimensional data, and so it's still really just not comparable. 00:40:00.000 --> 00:40:05.000 In addition, we just care about different things in the sciences, right? Like, we might have rare events or chaotic dynamics.
00:40:05.000 --> 00:40:19.000 So if you're training ChatGPT, maybe you can ignore rare events, because ChatGPT is supposed to be generating kind of plausible, typical language, and if you fail to reproduce rare events it's not really a problem. 00:40:19.000 --> 00:40:27.000 In the sciences, if I develop a generative model for climate and it can never reproduce a hurricane, 00:40:27.000 --> 00:40:48.000 then I've got a real problem on my hands. In addition, some of my own work on chaotic dynamics for generative models has shown how tools developed for computer vision have been able to help us, for instance, capture chaotic dynamics and make sure that these generative models have the right chaotic attractors. 00:40:48.000 --> 00:40:59.000 So again, foundational research is playing a really key role here. We also see in the sciences that there are just huge variations in scales and resolutions, right, 00:40:59.000 --> 00:41:06.000 that play a critical role and that we just don't experience in sort of more commercial-grade applications. 00:41:06.000 --> 00:41:16.000 We also have the opportunity to incorporate things like physical models, constraints, and symmetries in order to somehow mitigate these challenges. 00:41:16.000 --> 00:41:25.000 So as we think about using generative models in the sciences, I thought, okay, how much can we trust these things? 00:41:25.000 --> 00:41:35.000 So I asked ChatGPT to give me 3 reasons that football is safer than badminton, and it produced some caveats; 00:41:35.000 --> 00:41:42.000 it wasn't totally irresponsible by any means. But then it said things like, well, there's a lot more protective gear in football, so people are more protected. 00:41:42.000 --> 00:41:54.000 And there's really strict refereeing in football, and football players are much more physically fit. 00:41:54.000 --> 00:42:06.000 And I guess these things are perhaps plausible, but they really are missing a core essential truth here. And I think in the sciences we don't necessarily know what that core essential truth is a priori, and so we can't really detect things that are maybe vacuous or faulty reasoning. 00:42:06.000 --> 00:42:19.000 And so we really need to think about how we can take generative models, which are trained to produce plausible results, 00:42:19.000 --> 00:42:28.000 and figure out what kind of foundational work is going to make them really trustworthy and useful in scientific contexts. 00:42:28.000 --> 00:42:35.000 Okay, uncertainty quantification we touched on earlier, but I think this is not something that's done. 00:42:35.000 --> 00:42:47.000 This is something that's ongoing and absolutely critical. And evidence of this is, for instance, Amazon developing a library for uncertainty quantification in the context of machine learning. 00:42:47.000 --> 00:42:54.000 And while this library is being put out by Amazon, it is absolutely incorporating foundational ideas being developed in academia by people supported by the NSF.
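As one simple illustration of what practical uncertainty quantification can look like (a generic sketch, not tied to the library mentioned above, which the talk does not name), an ensemble of independently trained models can be used to flag inputs where predictions disagree.

```python
import numpy as np

def ensemble_predict(models, X):
    """Simple ensemble-based uncertainty: train several models on the same task
    (different seeds or data subsets), then use the spread of their predictions
    as a rough per-input uncertainty estimate."""
    preds = np.stack([m.predict(X) for m in models])  # (num_models, num_points)
    return preds.mean(axis=0), preds.std(axis=0)

# Usage sketch with placeholder models exposing .predict(), e.g. several
# independently trained regressors:
# mean, spread = ensemble_predict([model_a, model_b, model_c], X_test)
# A large spread flags inputs where the models disagree and the prediction is
# less trustworthy; unlike conformal prediction, this carries no formal guarantee.
```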
00:42:54.000 --> 00:43:04.000 And hopefully that work will continue, but this is just an illustration of the impact of that kind of work. 00:43:04.000 --> 00:43:10.000 And then finally, there are areas like graphs, where we can use graphs as a fundamental representation of scientific data, 00:43:10.000 --> 00:43:21.000 all the way from materials to networks to things like embodied physics. And I'm not going to go into a lot of detail here, but as we also deal with human-centric data, 00:43:21.000 --> 00:43:34.000 figuring out foundational methods to ensure trustworthiness in the context of privacy, transparency, fairness, and accountability is absolutely critical. 00:43:34.000 --> 00:43:43.000 Okay, so we talked about these 4 different areas where, in the sciences, in areas that the NSF really cares about, machine learning foundations are absolutely vital. 00:43:43.000 --> 00:43:56.000 This is not just focusing on individual scientific applications. This is thinking about foundational mathematical, statistical, and computational challenges. 00:43:56.000 --> 00:43:59.000 And so it's only by building up those foundations that we'll be able to realize these ambitions. 00:43:59.000 --> 00:44:15.000 And not just realize them, but make sure that the results are high quality and reproducible. And I think there's evidence that right now we're not succeeding. 00:44:15.000 --> 00:44:30.000 There have been a number of articles that have highlighted how people are trying to use machine learning in the context of the sciences and getting misleading results, or results that are not repeatable, not reproducible. 00:44:30.000 --> 00:44:43.000 And I think it's only by really investing in the foundations, in foundational research, and investing in training mechanisms that we're going to be able to get away from this current mode of operation. 00:44:43.000 --> 00:45:00.000 And so before I conclude here, I just wanna highlight how vital different training programs are to these endeavors, to making sure that students, even students who maybe are coming from different application domains, have a deep understanding of foundations and potential failure modes. 00:45:00.000 --> 00:45:19.000 And I think these are things that are difficult to get if you're just watching online webinars, and that, you know, students really benefit from being able to pose questions to panels, or talk to each other at poster sessions, or interact in really meaningful ways. 00:45:19.000 --> 00:45:28.000 I can't tell you how many times my students have gone to workshops and come back to my group meeting and told me that some idea I had was totally not going to work. 00:45:28.000 --> 00:45:40.000 Excellent. That was time-saving. And so the summer schools, workshops, and cross-disciplinary collaborations that are supported by the NSF really catalyze groundbreaking research and accelerate workforce development. 00:45:40.000 --> 00:45:48.000 In fact, all of these pictures are from an AI and science summer school supported by the NSF. 00:45:48.000 --> 00:46:00.000 The NSF is also doing things I think are really important in terms of training. One example is institutes like the Institute for Pure and Applied Mathematics, or IPAM, at UCLA. 00:46:00.000 --> 00:46:08.000 When I was a graduate student, I was part of one of their long programs in 2004, made possible by the NSF. 00:46:08.000 --> 00:46:12.000 And this just had a huge impact on my career. It really shaped the way that I thought about a lot of different problems.
00:46:12.000 --> 00:46:31.000 It helped me think about important ideas for my CAREER proposal, and it just had an enormous impact. And additional programs that the NSF has supported include other math institutes, like IMSI, 00:46:31.000 --> 00:46:42.000 and the TRIPODS program, which is really focused on the foundations of data science. 00:46:42.000 --> 00:46:51.000 Some of the AI institutes have a foundations focus, and even programs like the National Institute for Theory and Mathematics in Biology have a very strong foundations focus. 00:46:51.000 --> 00:47:02.000 And I think that the training opportunities that these kinds of programs provide are just absolutely essential to building up this infrastructure and ecosystem. 00:47:02.000 --> 00:47:09.000 Okay, so, with that, there are 2 key things that I would like everybody to take away from today. 00:47:09.000 --> 00:47:23.000 The first, and this is really important: I do not have brain damage. And second: developing applied machine learning without understanding the mathematical, statistical, and computational foundations is basically like trying to develop biotech without understanding 00:47:23.000 --> 00:47:29.000 biology. To really accelerate innovation and promote trustworthiness, we need to invest in foundations. 00:47:29.000 --> 00:47:42.000 So thank you very much, and I'm more than happy to answer any questions. Thank you very much.