Last month, Google’s AlphaStar AI beat two professional StarCraft II players 10 times without losing a single game. This is a real AI we’re talking about, not the stock practice opponent that ships with the game. AlphaStar was built on a neural network using reinforcement learning. As the AlphaStar team said on their blog:
Now, we introduce our StarCraft II program AlphaStar, the first artificial intelligence to defeat a top professional player. In a series of test matches held on Dec. 19, AlphaStar decisively beat Team Liquid’s Grzegorz “MaNa” Komincz, one of the world’s strongest professional StarCraft players, 5-0 following a successful benchmark match against his teammate Dario “TLO” Wünsch. The matches took place under professional match conditions on a competitive ladder map and without any game restrictions.
I’ve seen a lot of short articles and posts from people talking about how this makes humans look obsolete and making the usual Skynet references, but I don’t think this victory is quite as impressive as it might seem on the surface. Specifically, I take issue with the claim that these matches took place “without any game restrictions.” The terms of these matches were overwhelmingly in favor of the AI. This match is less about an AI that obliterated human players on an even playing field and more about an AI that managed to overcome being an AI.
I don’t have anything against the AlphaStar team. I really admire their work and I think this is a big moment for machine learning. AlphaStar was made by talented people doing important work, and nothing I’m about to say should take away from that. I just want to temper this enthusiasm for AlphaStar with a dose of reality and better explain what these matches looked like and what they proved. This is like a tennis match supposedly between a child and Serena Williams, but then you find out Serena has to play while wearing skis and there are actually five kids. It’s really still impressive that the kids won, but let’s not simplify things down to the headline of “Child Beats Best Tennis Player in the World”.
I should make it clear that this is not my field of expertise. The closest I’ve come to AI programming is implementing A* in C++, which is basically “baby’s first AI project”. Comparing A* to neural networks is like comparing folding a paper airplane to designing the space shuttle. Both projects make something that flies, but the latter is stratospherically more sophisticated. My programming skills are more in the area of procedural content generation, not AI.
StarCraft II is a real time strategy game. You begin with worker units and you use them to gather resources. You use those resources to build infrastructure, you use the infrastructure to build military units, and you use the military units to crush your opponent. Maybe you’ll attack their infrastructure so they can’t build their army. Maybe you’ll attack their worker units and starve them of resources. Maybe you’re not one for subtlety and you’ll just meet their army in the middle of the map for a huge brawl. Maybe you’ll try to confuse your opponent by using small strike teams to attack several different targets at the same time. Maybe you’ll try to build a fortified position and force them to come to you, grinding them down through attrition. There are countless strategies out there, and even after nine years the gameplay is still evolving as players develop new tricks and techniques.
In StarCraft II, there are three races. The Zerg are a race of space bugs designed to overwhelm their opponent with cheap, disposable units. The Protoss are psychic aliens focused on expensive but durable units. The humans, a.k.a. the Terrans, fall between the two in terms of unit cost. This isn’t like the aesthetic choice of black vs. white in chess. Each of these three races plays very differently. Their workers can do different things, their infrastructure works differently, and their armies are composed of radically different units. This means there are a total of six possible matchups between two players: Zerg vs. Zerg, Zerg vs. Protoss, Zerg vs. Terran, Protoss vs. Protoss, Protoss vs. Terran, and Terran vs. Terran.
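The six matchups are just the pairings of three races where order doesn’t matter and mirrors are allowed; if you want to convince yourself of the count, a quick sketch:

```python
# Enumerate every race pairing: choose 2 of 3 races, repetition allowed.
from itertools import combinations_with_replacement

races = ["Zerg", "Protoss", "Terran"]
matchups = list(combinations_with_replacement(races, 2))
for a, b in matchups:
    print(f"{a} vs. {b}")
print(len(matchups))  # 6
```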
The races are complex enough that most professional players pick a single race. They still need to learn how to fight against all three races, but at least they only need to learn to manage one race in particular.
Despite the claim that AlphaStar’s games were played “without any game restrictions,” there were actually a lot of restrictions in place for the human player. AlphaStar can only play as Protoss. That’s fair enough, since a lot of pro players limit themselves to a single race. However, AlphaStar can also only play against Protoss.
The first five games were against TLO, who is normally a professional Zerg player. He was obliged to play off-race as Protoss because AlphaStar had no training in fighting Zerg. Imagine if I was bragging that I beat basketball player Michael Jordan 5-0, but then you found out I beat him at hockey, not basketball. It’s still impressive to beat a pro, but that’s not the same as beating him at basketball.
The other glaring limitation is that AlphaStar only knows one map. In a normal game, the server will randomly choose from one of the seven available maps for you to play on. Some maps are small and thus lean towards short, chaotic, and aggressive matches. Other maps are large and rich in resources, which makes it more likely that you’ll see a protracted battle between big armies. Still other maps might have long indirect routes between the players, which will encourage the use of flying units. By limiting the matches to a single map, the AI is negating a lot of the generalized techniques its opponent has developed and will instead put them on terrain where the AI has developed specialized tactics particular to this location. One of the primary advantages a human player has is their ability to adapt, but that advantage is largely negated when limited to a single map that favors a narrow selection of fixed strategies.
The other advantage that AlphaStar had going for it was that the human players never got to face the same AI twice. The AlphaStar team brought several different AI agents to the tournament, and each got to play exactly once. Each agent is optimized for a single strategy. Often the human player would spot a weakness in their opponent, but wouldn’t be able to exploit it during the round. They’d lose, but feel like they could use what they learned to defeat it if they’d been allowed to face the same agent a second time. There’s a reason human matches are usually designed as a best of three or best of five. It’s the same reason we make boxing matches more than one round. We want to give the combatants a chance to feel each other out and look for weaknesses. Again, the parameters of the match managed to negate the human player’s ability to observe and adapt.
Other people have brought up some concerns about APM, or Actions Per Minute. This is a measure of how quickly the player can click on stuff and press buttons. During a game, players need to move their units around the battlefield, sometimes giving orders to one specific unit. This micromanaging of units is called “micro”, and is limited by the speed and accuracy of a computer mouse and keyboard. There’s an upper limit on how many units a player can micro at one time. The most extreme example of this would be a battle where two same-size armies begin shooting at each other. A player with infinite micro can take injured units and move them to the back of the formation just before they die. The micro god player can then win the battle and have their units survive to fight another day, turning what should be an even battle into a one-sided victory.
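To make the micro-god scenario concrete, here’s a toy simulation with made-up unit stats. This is entirely my own illustration, not anything from StarCraft II or AlphaStar: two identical armies exchange fire each tick, but one side has perfect micro and rotates units out of harm’s way so damage is spread evenly and nothing dies early, while the other side lets the enemy focus-fire its units down one at a time.

```python
# Toy model: identical armies, but one side has god-like micro.
UNIT_HP, UNIT_DPS = 100, 10  # invented numbers, not real game stats

def fight(size: int = 10):
    micro = [UNIT_HP] * size      # hit points; incoming damage is spread
    no_micro = [UNIT_HP] * size   # hit points; eaten by focus fire
    while micro and no_micro:
        # Both sides deal damage simultaneously, based on living units.
        dmg_to_no_micro = UNIT_DPS * len(micro)
        dmg_to_micro = UNIT_DPS * len(no_micro)
        # Focus fire: pile all damage onto one unit at a time.
        while dmg_to_no_micro > 0 and no_micro:
            hit = min(dmg_to_no_micro, no_micro[0])
            no_micro[0] -= hit
            dmg_to_no_micro -= hit
            if no_micro[0] == 0:
                no_micro.pop(0)
        # Perfect micro: rotate injured units back so damage is shared.
        per_unit = dmg_to_micro / len(micro)
        micro = [hp - per_unit for hp in micro if hp > per_unit]
    return len(micro), len(no_micro)

print(fight())  # the micro side wins the "even" fight with zero losses
```

In this toy version the evenly-matched battle isn’t even at all: the micro side keeps full firepower the whole fight while the other side’s output shrinks every time a unit dies.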
The AI is a micro god by nature. It’s not cheating, it’s just that AI is really good at quick, precise actions. AlphaStar is a good bit faster than the fastest human player in the world. I don’t have a problem with it commanding its units with superhuman precision if this was otherwise an even matchup. A contest pitting human ingenuity and adaptability against raw machine speed could make for an interesting experiment. But in a setup designed to nullify the human’s ability to adapt, allowing the machine to command its units with such precision sort of renders the contest moot. The rules have turned this strategy game into a reflex game, so it’s not exactly surprising that the machine won. In the above screenshot, AlphaStar was fighting uphill, through a chokepoint, with a smaller force of inferior units. It was doing everything wrong, but it was able to win anyway thanks to perfect unit control. This moves us away from the goal of building a smarter AI. If AlphaStar is going to win with reflexes, then we’re not really getting an honest appraisal of how it operates on a strategic level.
I’ll admit there’s no good solution to this. The AI isn’t using a mouse. It’s not pressing buttons on a keyboard. You can’t map its actions to human inputs 1-to-1. So where should the AlphaStar developers draw the line? How many actions per minute is too many? Should AlphaStar be limited to the speed of the fastest player in the world? The average speed of a pro player? I don’t know. If you set the limit too low, then you’re just crippling the AI so humans can feel better about themselves. If you set the limit too high, then the AI will win using micromanagement rather than playing effectively on a tactical and strategic level. There is no clear answer here.
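For what it’s worth, the mechanical part of such a cap is trivial; the hard part is picking the number. Here’s a hypothetical sliding-window limiter, purely my own sketch and not anything the AlphaStar team actually uses:

```python
# Hypothetical APM cap: allow an action only if fewer than max_apm
# actions happened in the trailing 60 seconds. The policy and budget
# are invented for illustration.
from collections import deque

class APMLimiter:
    def __init__(self, max_apm: int):
        self.max_apm = max_apm
        self.stamps = deque()  # timestamps (seconds) of allowed actions

    def try_act(self, now: float) -> bool:
        """Allow the action unless the last-60-seconds budget is spent."""
        while self.stamps and now - self.stamps[0] >= 60.0:
            self.stamps.popleft()  # drop actions older than the window
        if len(self.stamps) < self.max_apm:
            self.stamps.append(now)
            return True
        return False

limiter = APMLimiter(max_apm=300)
# 400 attempted actions over 40 seconds: only 300 fit in the budget.
allowed = sum(limiter.try_act(i * 0.1) for i in range(400))
print(allowed)  # 300
```

The code doesn’t answer the interesting question, of course: whatever value you hand to max_apm is exactly the arbitrary line-drawing the paragraph above is worrying about.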
The final advantage that AlphaStar has is wide map vision. Both human players and AI are limited to seeing only things that their units can see. You can’t observe enemy movements or structures unless one of your own units or structures are nearby. Human players have the additional limitation that they can only see a small slice of the map at a time.
In the lower left of the interface is the minimap. It will show dots indicating the position of enemy units. If you see one small red dot, it means there is a small number of enemy units in that region of the map. You can’t use the minimap to see what the units are, how many there are, or what they’re doing. For that, you need to move your camera to that part of the map. AlphaStar doesn’t have this limitation. It can see the entire map at once, as if zoomed out. If a human player modded the game to do this, it would be considered a serious cheat. If you wanted to see the game the way AlphaStar does, imagine sitting in front of a jumbo 8K monitor that allowed you to see and command units without ever needing to scroll around or look at the minimap.
The camera stays pretty close to the ground in StarCraft, and you’re not allowed to see too much of the battlefield. This was probably originally done to help the game run smoothly by making sure the game wouldn’t have to draw too many things at once. That’s no longer a concern, but the system remains because changing it would significantly affect how the game is played.
The AlphaStar developers point out that the AI tends to focus its attention on one area at a time, which is sort of like a human player moving their camera around. I’d counter by saying this is less a simulation of how the player’s camera works and more a simulation of how human eyes work. Humans have to move their singular point of focus around, and on top of that they need to fiddle with the camera controls.
Our eyes can really only see fine details of things when we’re looking right at them. In the neural sense, AlphaStar is doing some very sophisticated and human-like things. This AI apparently has something as elaborate as a single point of focus and attention. That’s a credit to the work the AlphaStar team is doing, and introduces all sorts of exciting questions about how these neural networks are actually thinking. On the other hand, it means that the full map view is a massive advantage over human players.
A human player needs to divide their attention between their main view and the minimap. If they want to know what units are in that big mass of red dots they see on the minimap, they need to move the mouse to the corner of the screen, jump the camera to the new location, re-focus their eyes to get their bearings, maybe nudge the camera position a bit to get it into just the right spot, and then move the mouse back to the main view so they can begin commanding their units or inspecting their opponent’s. This takes several clicks, and it means giving up the ability to directly observe whatever they were originally looking at. For AlphaStar, seeing the units is as easy as moving your eyes from one side of the screen to the other.
I need to stress that none of this should be taken as a criticism of the AlphaStar team. StarCraft II is the most complex game this technology has faced so far, and this was the AI’s first attempt against professional human players. The AI can’t play the full game yet, but I’m sure the team will get there eventually. My criticism is less directed towards the AlphaStar team and more towards the people claiming that AlphaStar has “solved” StarCraft II.
You can see the highlights of the 10 matches here, along with interviews with the players and the AlphaStar developers. I don’t know how comprehensible the action will be to someone who doesn’t understand StarCraft II, but as a fan and casual player I found it riveting. The AlphaStar AI employed tricks and techniques no human has ever used before, and it’s possible some of these tricks will be explored or adopted by human players.
Days after these initial matches, StarCraft II grandmaster MaNa and AlphaStar had one more match. The human won this time. Next week I’ll talk about that game and what it means.