The Computer Scientist Who Builds Big Pictures From Small Details
 
                Lenka Zdeborová studies how the physics of matter can help model the behavior of machine learning algorithms.
Samuel Rubio for Quanta Magazine
Introduction
As a teenager in the Czech Republic, Lenka Zdeborová glimpsed her future in an Isaac Asimov novel. A character in Asimov’s “Foundation” series invents a mathematical method for predicting the path of an entire civilization by averaging out the haphazard behavior of billions of individuals. The concept gave her a “mesmerizing feeling,” Zdeborová recalls — one that returned when she later encountered a method that she could actually apply to make sense of huge numbers of unpredictable elements.
“I realized, ‘Oh my God, Asimov was just describing statistical physics,’” she said, referring to a discipline that describes the big-picture properties of matter by using the rules that apply to individual molecules. As a physics master’s student at Charles University in Prague, she reveled in its predictive power. Then, while she was pursuing her doctorate, Zdeborová’s adviser showed her a paper that applied the techniques of statistical physics to theoretical computer science — the mathematical study of computation and how algorithms behave. The familiar feeling returned with a vengeance.
“I was completely mesmerized by that paper,” Zdeborová said. “I had always had this impression that to do computer science, you had to be a hacker and know everything about Linux. I realized that theoretical computer science was as fascinating as theoretical physics, and I said, ‘OK — this is what I want to do.’”
Zdeborová now leads the Statistical Physics of Computation Laboratory at the Swiss Federal Institute of Technology Lausanne. Her work currently focuses on how the physics of phase transitions in matter — such as water freezing into ice — can help model the behavior of algorithms, especially those used in machine learning.
 
                Zdeborová leads the Statistical Physics of Computation Laboratory at the Swiss Federal Institute of Technology Lausanne, or EPFL.
Samuel Rubio for Quanta Magazine
Quanta spoke with Zdeborová about the similarities between water and algorithms, using physics to understand large language models, and chasing unreasonable scientific goals. The interview has been condensed and edited for clarity.
Your work crosses disciplinary boundaries, so do you think of yourself as a physicist, a computer scientist or a mathematician?
I would say all of the above. The problems I’m interested in are mostly in computer science and machine learning. But in theoretical computer science, everything should be formally proven, down to the last detail. With machine learning today, that’s just not happening anymore — it’s so complicated.
So from an approach perspective, I feel like a theoretical physicist because, just like in physics, you can seek to explain phenomena with theories that are still mathematically rigorous, even if they may not have formal mathematical proofs.
 
                The EPFL campus features many works of art. Here, Zdeborová interacts with Big Bang by Etienne Krähenbühl.
Samuel Rubio for Quanta Magazine
How does statistical physics help you understand computer science?
What is usually taught to students about theoretical computer science is to focus on the worst case — instances of a problem that make it too hard to compute. That’s how the field started; that’s where we have beautiful results. But there’s a difference between the worst case and the typical case. Machine learning is a clear example of that. So even with very high-dimensional data — like medical imaging with millions of pixels, where we want to detect certain markers for disease — the relevant instances of the problem are often not as computationally hard as the worst ones.
This is where statistical physics kicks in, because that is historically the field of science that dealt with these high-dimensional problems. When you want to describe the behavior of many molecules interacting at once, statistical physics brings up probability distributions. These are mathematical objects that, in a very similar form, appear in computer science when describing how bits of data interact when a given algorithm is executed. It’s just that statistical physics got started a century ago, when computer science didn’t even exist. Luckily, by the time I did my Ph.D. in the 2000s, the disciplines were realizing how much they had in common.
 
                For Zdeborová, algorithms can behave just as mysteriously as molecules. “How do we derive that water will freeze at zero degrees Celsius?” she asked. “That’s not at all obvious!”
Samuel Rubio for Quanta Magazine
What do they have in common?
In both cases, extracting the macroscopic behavior of a system from a microscopic description is difficult.
Even though Newton’s laws and quantum mechanics can give you a really detailed description of how water molecules interact, how do we derive that water will freeze at zero degrees Celsius? That’s not at all obvious! Even in the 1940s it was not figured out. And there are still many questions about phase transitions in water, especially at high pressure.
Similarly, in computer science, there are very simple-to-define problems with relatively simple algorithms, where we don’t know under what conditions they will work. In my Ph.D. thesis, we studied the problem of graph coloring, which a 5-year-old kid can understand. You have points, and some are connected with edges, which creates a graph. You want to color each point with one of three colors. And if two points are connected, they can’t have the same color. So can you color the graph or not?
For any given algorithm to solve this problem, you could understand it, even code it, and it would run. But what if I asked, “Can you tell me when this algorithm will work, and when it won’t?” For most algorithms, we don’t know. And that’s kind of the overall status in theoretical computer science: Even for a simple problem like that, when we start asking natural questions about the behavior of algorithms, we often don’t have the answers.
 
                With her interdisciplinary work, Zdeborová considers herself a physicist, computer scientist and mathematician.
Samuel Rubio for Quanta Magazine
If it’s so hard to fully understand algorithms, how can phase transitions help?
The phase transitions we study are not literally physical, like water turning into ice. But they are analogous in that you have a sharp, sudden change in the behavior [of the system] under certain conditions. In the case of neural networks, one of the first transitions to be characterized was in how learning efficiency depends on the amount of training data.
You take a given neural network that learns from high-dimensional data — like images with millions of pixels — and then analyze, in certain simplified settings, how many training samples you need for the network to learn a function to a certain level of precision. You get a phase transition in the very sense we’re talking about, a sudden change in the optimal performance for that system. And these conditions tell you something about how hard or easy the learning will be, and whether it makes sense to look for better algorithms.
Has this approach helped you learn anything new about these complicated systems?
In recent work, we did find a phase transition in the performance of a simplified version of a large language model, but what was also interesting was the nature of the two phases on either side of the transition.
In physics, there are certain quantities in the mathematical description of a phase transition that we call order parameters. They allow you to understand what the phase transition is actually about. This allowed us to understand that magnetism is about atoms aligning: In one phase, the overall alignment is big, and in the other (nonmagnetic) phase, there is no alignment.
 
                By learning enough about simple machine learning systems, Zdeborová hopes to help decode their overarching rules.
Samuel Rubio for Quanta Magazine
That’s the beautiful thing that appeared in our mathematical description of the language models. There were two order parameters, each with a precise meaning. One determined if the learning that happens relies a lot on the position of the words in the sentence. The other order parameter was specifically about the meaning of each word, the semantics.
And when we looked at the phase transition, we found that below a certain threshold of training examples, only the position mattered — not the semantics. And if we had more examples above that threshold, it was only the semantics that mattered. So, in a sense, it’s a new type of phase transition between positional and semantic learning that we could characterize in a simplified language model. To me, this is a baby step toward understanding emergent properties in large language models, like suddenly being able to do arithmetic, answer questions in Greek, or things like that.
Where do you think enough of these baby steps could lead you?
The analogy I really like is to thermodynamics. When we had the steam engine in the 18th century, it gave rise to the Industrial Revolution: railways and companies and many things running on the steam engine, all without understanding thermodynamics. That came decades later, and it was inspired by wanting to understand the steam engine. From that, a lot of other physics emerged.
This is probably a completely unreasonable goal, but you know, somebody has to come up with the thermodynamics of machine learning. I would love to be that person. If it’s not me and it’s somebody else, that’s also great. But I will definitely push toward that goal.
 
                         
                         
                        