“This is a really amazing result,” says François Le Gall, a mathematician at Nagoya University in Japan, who was not involved in the work. “Matrix multiplication is used everywhere in engineering,” he says. “Anything you want to solve numerically, you typically use matrices.”
Despite the calculation’s ubiquity, it is still not well understood. A matrix is simply a grid of numbers, representing anything you want. Multiplying two matrices together typically involves multiplying the rows of one with the columns of the other. The basic technique for solving the problem is taught in high school. “It’s like the ABC of computing,” says Pushmeet Kohli, head of DeepMind’s AI for Science team.
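For readers who want to see the high-school technique spelled out, here is a minimal sketch in Python. The function name `multiply` and the counter it returns are illustrative choices, not anything from the researchers' work. The sketch pairs each row of the first matrix with each column of the second and tallies how many individual multiplications that takes.

```python
def multiply(a, b):
    """Schoolbook matrix multiplication on lists of lists.

    Returns the product and the number of scalar multiplications used.
    """
    rows, inner, cols = len(a), len(b), len(b[0])
    product = [[0] * cols for _ in range(rows)]
    multiplications = 0
    for i in range(rows):            # each row of the first matrix
        for j in range(cols):        # each column of the second matrix
            for k in range(inner):   # walk along the row and down the column
                product[i][j] += a[i][k] * b[k][j]
                multiplications += 1
    return product, multiplications


product, count = multiply([[1, 2], [3, 4]], [[5, 6], [7, 8]])
print(product)  # [[19, 22], [43, 50]]
print(count)    # 8
```

For two 2×2 matrices this method uses eight multiplications. A faster method is one that reaches the same answer with fewer.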
But things get complicated when you try to find a faster method. “Nobody knows the best algorithm for solving it,” says Le Gall. “It’s one of the biggest open problems in computer science.”
This is because there are more ways to multiply two matrices together than there are atoms in the universe. In some of the cases the researchers looked at, there were more than 10 to the power of 33 options to choose from at each step. “The number of possible actions is almost infinite,” says Thomas Hubert, an engineer at DeepMind.
The trick was to turn the problem into a kind of three-dimensional board game, called TensorGame. The board represents the multiplication problem to be solved, and each move represents the next step in solving that problem. The series of moves made in a game therefore represents an algorithm.
The researchers trained a new version of AlphaZero, called AlphaTensor, to play this game. Instead of learning the best series of moves to make in Go or chess, AlphaTensor learned the best series of steps to make when multiplying matrices. It was rewarded for winning the game in as few moves as possible.
“We transformed this into a game, our favorite kind of framework,” says Hubert, who was one of the lead researchers on AlphaZero.
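To make the game idea concrete, here is a rough sketch in Python with NumPy. It is an illustration under stated assumptions, not DeepMind's code. It assumes the board is a three-dimensional array of 0s and 1s recording which pairs of entries must be multiplied for each entry of the answer, and that each move subtracts one multiplication step from the board. The game is won when the board is all zeros. The names `matmul_board`, `play`, `schoolbook_moves` and `STRASSEN_MOVES` are hypothetical. The seven-move sequence is Strassen's 1969 method for 2×2 matrices, which the passage above does not mention and which is used here only as a known example of a shorter game.

```python
import numpy as np


def matmul_board(n):
    """Starting board for multiplying two n-by-n matrices.

    board[p, q, r] is 1 when entry p of the first matrix times entry q of
    the second contributes to entry r of the product (entries are numbered
    row by row).
    """
    board = np.zeros((n * n, n * n, n * n), dtype=int)
    for i in range(n):
        for j in range(n):
            for k in range(n):
                board[i * n + j, j * n + k, i * n + k] = 1
    return board


def play(board, moves):
    """Apply each move (u, v, w) by subtracting its outer product."""
    board = board.copy()
    for u, v, w in moves:
        board -= np.einsum("i,j,k->ijk", np.array(u), np.array(v), np.array(w))
    return board


def schoolbook_moves(n):
    """One move per scalar multiplication of the high-school method."""
    size = n * n
    moves = []
    for i in range(n):
        for j in range(n):
            for k in range(n):
                u, v, w = [0] * size, [0] * size, [0] * size
                u[i * n + j] = 1
                v[j * n + k] = 1
                w[i * n + k] = 1
                moves.append((u, v, w))
    return moves


# Strassen's seven multiplications for 2x2 matrices, written as moves.
# Entry order within each vector: (1,1), (1,2), (2,1), (2,2).
STRASSEN_MOVES = [
    ([1, 0, 0, 1], [1, 0, 0, 1], [1, 0, 0, 1]),
    ([0, 0, 1, 1], [1, 0, 0, 0], [0, 0, 1, -1]),
    ([1, 0, 0, 0], [0, 1, 0, -1], [0, 1, 0, 1]),
    ([0, 0, 0, 1], [-1, 0, 1, 0], [1, 0, 1, 0]),
    ([1, 1, 0, 0], [0, 0, 0, 1], [-1, 1, 0, 0]),
    ([-1, 0, 1, 0], [1, 1, 0, 0], [0, 0, 0, 1]),
    ([0, 1, 0, -1], [0, 0, 1, 1], [1, 0, 0, 0]),
]

start = matmul_board(2)
schoolbook_wins = not play(start, schoolbook_moves(2)).any()
strassen_wins = not play(start, STRASSEN_MOVES).any()
print(len(schoolbook_moves(2)), schoolbook_wins)  # 8 True
print(len(STRASSEN_MOVES), strassen_wins)         # 7 True
```

Both sequences clear the board, so both are valid ways to multiply 2×2 matrices. The second does it in seven moves instead of eight, and in this framing a shorter winning game is a faster algorithm.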