To preface this discussion, I want to state that despite the title of this post, I still think AlphaStar is going to beat most, if not all, of us. The people at DeepMind are brilliant and the demonstration in January was nothing short of remarkable. Yet, perhaps you, the high gold/low diamond zerg player, will go down in DeepMind's landmark research paper as the shmuck who beat AlphaStar with human ingenuity. This post aims to help make that dream a reality.
StarCraft is a game of who can make the fewest mistakes. Winning requires a player to accomplish two tasks: minimize his own mistakes and maximize his opponent’s mistakes. The former is obvious, but the latter, central to beating AlphaStar, is far more subtle. When computers fail, they fail hard, and they fail over and over again. So what can a player do to induce mistakes from AlphaStar?
Create game states that differ from training
Despite its billions of matches against itself, AlphaStar has only explored a minuscule percentage of the total state space. When AlphaStar experiences something different from training, it’s forced to generalize from its prior learnings; this generalization almost always has a degree of inaccuracy. We are all familiar with this concept when it comes to engineering bay blocks, gas steals, hatch blocks, early offensive bunkers, and aggressive openers in general. However, humans usually only use this concept with the intent of disrupting build orders.
Against a computer you can take this idea much further. Unit compositions, expansion locations, economic development, and tactical maneuvers that are unorthodox (with respect to what AlphaStar has likely seen in training, not the human meta) can induce mistakes from a computer that has failed to properly generalize its learnings. For example, going ghost mech or mass phoenix, taking your natural at the gold, going 5cc while turtling on two base, massing hallucinations in battle, and cutting forges completely in PvT to ramp up to 10gate off three base are all ways to create less explored game states.
Force AlphaStar to make abstract decisions
Defending is a harder concept for computers to grasp than attacking. Optimal defense requires one to anticipate the opponent’s policy: when they will drop, how they will split their forces, and where they will commit. This point became clear in the live match between MaNa and AlphaStar. Despite AlphaStar’s impeccable control, it struggled vs adept harass.
Certain strategies test the opponent strategically, whereas others test the opponent mechanically. Strategic decisions are more abstract, and consequently harder for a computer. For instance, defending a cannon rush requires a player to anticipate where future cannons will be built and how much money the cannon rusher has for rewalling. Defending proxy tempest requires a quick progression to reactored vikings, or other specific responses depending on the variation. Base-trade-centric styles are another way to create difficult decisions for the opponent.
Exploit biases in how computers think
Due to the “exploration vs exploitation tradeoff”, it is natural for a computer to get stuck exploiting a single working unit composition. For instance, the agent that played MaNa in the live match never built a phoenix and only made oracles out of the stargate, despite constant prism harass. Each of the agents demonstrated a clear bias towards a given unit composition, even after the opponent countered it. To exploit this, one can build unit compositions that demand specific reactive counters. Alternatively, one can be proactive in scouting to identify these unit composition biases early on and counter them.
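To make the "stuck exploiting" idea concrete, here is a minimal toy sketch in Python. It is not how AlphaStar works internally; it is a purely greedy chooser of my own invention, and the option names and payoff numbers are made up for illustration. It only shows why an agent that never explores can keep building the same thing after the opponent has adapted.

```python
def greedy_picks(rewards_by_round):
    """Return the option chosen each round by a purely greedy agent.

    rewards_by_round: list of dicts mapping option name -> reward that
    option would earn that round (hypothetical numbers).
    """
    options = list(rewards_by_round[0])
    total = {o: 0.0 for o in options}
    count = {o: 0 for o in options}
    picks = []
    for rewards in rewards_by_round:
        # Estimated value = average reward seen so far; untried options sit at 0.0.
        estimate = {o: total[o] / count[o] if count[o] else 0.0 for o in options}
        # Pure exploitation: always take the best current estimate
        # (ties go to the first option listed).
        choice = max(options, key=lambda o: estimate[o])
        picks.append(choice)
        # The agent only ever learns about the option it actually picked.
        total[choice] += rewards[choice]
        count[choice] += 1
    return picks


# Hypothetical payoffs: oracles work well for 50 rounds, then the opponent
# adapts and phoenixes would be the better answer.
before = [{"oracle": 0.8, "phoenix": 0.5}] * 50
after = [{"oracle": 0.2, "phoenix": 0.9}] * 50
picks = greedy_picks(before + after)
print(picks.count("oracle"), picks.count("phoenix"))  # prints "100 0"
```

Because the toy agent never tries a phoenix, its estimate for phoenixes stays at zero and even a badly performing oracle still looks better. Real training adds exploration to avoid exactly this, but any leftover bias of this kind is something a human opponent can lean on.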
Long term decision making is notoriously hard for computers due to the curse of dimensionality. The longer the game goes, or the more long term decisions a computer is forced to make, the more likely it is to slip up. As a caveat to this point, AlphaStar was able to successfully navigate lategame carrier vs carrier against TLO.
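The "more decisions, more chances to slip up" part of this argument can be put into rough numbers. The sketch below assumes, purely for illustration, that each decision carries the same small, independent chance of error; neither the 1% figure nor the independence assumption comes from anything DeepMind has published.

```python
def chance_of_any_slip(per_decision_error, decisions):
    """Probability of at least one mistake across `decisions` independent choices."""
    # Chance of getting every single decision right, subtracted from 1.
    return 1.0 - (1.0 - per_decision_error) ** decisions


# Hypothetical 1% error rate per decision, over longer and longer games.
for n in (10, 100, 1000):
    print(n, round(chance_of_any_slip(0.01, n), 3))
```

Under these assumptions, the chance of at least one mistake goes from roughly 10% over 10 decisions to roughly 63% over 100, and is close to certain over 1000. Dragging the game out simply gives the computer more opportunities to make the one mistake you need.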
Lastly, AlphaStar does not learn as it plays. It has to be taken offline and run through millions more games to update its policy. Consequently, once an exploit is found, it will continue to work until DeepMind retrains the agent with an emphasis on correcting the exploit.
With all this said, it will still be incredibly difficult to beat AlphaStar. The program has demonstrated many scarily human-like qualities, and will likely one day be able to generalize its learnings across the entire domain of the game to where exploits no longer exist. However, it is up to you, our European ladder heroes, to prove that that day is not today.
Note: AlphaStar will be run anonymously on the European ladder, but I tend to believe that at higher levels anonymity for such a distinctive character will be impossible.
© Post "How to Beat AlphaStar: From the Perspective of a Grandmaster Player and Data Scientist" for game StarCraft.