Entries by Jay Scott | Starcraft AI blog

downtime

The blog had some downtime due to problems at my hosting provider. There seem to be a few lingering issues, but I think the blog itself is good now. Sorry, they are usually reliable!

Oops, it seems that the openbw.com domain name was not renewed in time. It’s a service we’ve come to rely on. Replay links will not work until it is restored—or somebody brings up a replacement service.

Update: It’s back. They handled the issue quickly.

Killerbot-SAIDA games

I see that Killerbot by Marian Devecka has figured out its own way to beat SAIDA with its persistent mutalisk pressure: win 1 and win 2. I’m not sure how consistent it is, but as soon as SAIDA starts taking serious economic damage, terran only goes downhill. The update is a few days old now, and the most visible change is that Killerbot makes a few unupgraded hydralisks early, presumably only when the enemy has or is expected to get vultures. The hydras counter any early vulture or wraith tricks that terran might try. The idea is well known among human players, and Steamhammer uses it too.

I was looking at Steamhammer’s 2 hatch muta loss to SAIDA today, and thinking “With good muta micro and good decisions, zerg should win this.” I’ve been promising good muta micro for 2 years and haven’t delivered....

Update: Apparently SAIDA reacted by becoming more aggressive. It seems to have found an attack timing that works against Killerbot. Is this the same mechanism that solves rushes by adjusting timings, or a different kind of reaction? If it’s the same, the mechanism is quite general.

Steamhammer 2.1 progress

I have fixed the bugs affecting terran play, a bug affecting defilers, and a few bugs and weaknesses affecting all races. I also improved scourge control, and added a new opening to maintain my sanity. There is still a critical bug affecting protoss, which causes units to wander around without fighting, carrying banners “make levity not war.” Another severe weakness affects base defense, causing defenders to hang back from the action. I’ve spent the last 2 days trying to fix base defense, and it doesn’t work. I might have to think up a different solution.

Everything is good except the last 2 critical problems. Not that there’s any shortage of other weaknesses, but these 2 are so bad that they can’t be ignored. Surely they won’t resist me for long, though. Stand by!

Update the next day: I got everything working well enough, so I thought. I ran final tests and found... the newly implemented scourge micro had stopped working, though I hadn’t touched anything related. Now what has gone wrong?

various short items

SAIDA

SAIDA has been updated and is again defeating Krasi0 and Locutus. The arms race continues!

CIG 2018

I started poking at the detailed results file to figure out how to reproduce the official results exactly... then I discovered that the build order problem was wider than it first seemed. I canceled my plans. We don’t need per-map crosstables and race balance analysis of a tournament with such badly distorted results.

Steamhammer 2.1

I haven’t been working that hard on it, but I have made progress. I fixed some of the bugs introduced along with squad clustering, and found the causes of others. 2.1 should have smoother play in many cases. To say the same thing differently, Steamhammer 2.1 is suffering from feature creep, or at least bug fix creep. Hang on, it shouldn’t take too much longer.

CIG 2018 - not only Overkill was broken

I’ve been watching CIG replays. There are far too many to watch them all, and I watched only a few, but I soon noticed that Overkill was not the only bot in the tournament with broken build orders due to some unknown incompatibility. Steamhammer is also affected in every game I checked, with symptoms similar to Overkill’s though apparently less severe. Strangely, UAlbertaBot (parent of both Overkill and Steamhammer) does not seem to be affected (though I did see one game where it opened 11-11 gate instead of 9-9 gate). Tyr also had build order problems in every game I checked, and it is unrelated code.

I checked a few other bots and did not see consistent problems as with Overkill, Steamhammer, and Tyr. For example, McRave fell into an early production freeze in one game, but it played normally in other games so that was likely a garden variety bug.

It explains some mysteries about Steamhammer’s performance, though it opens others.

It seems difficult to inventory all the participants and see which were affected by similar problems. For the weaker bots, we may have to read the source to tell whether build orders are working as intended.

Can anybody diagnose it? So far we only have speculation about the cause.

looking at ISAMind

As reported, ISAMind is a fork of Locutus, and the only important difference is that ISAMind can predict the opponent’s opening plan using a trained neural network instead of the rule-based method that Locutus inherited from Steamhammer (and modified). The author of Locutus has conveniently made a branch for ISAMind, so that we can compare and verify the differences.

The neural network is a standard feedforward network trained by backpropagation (we can see this in the network data in the file ISAMind.xml). The computation is implemented using OpenCV, a computer vision library that includes some general-purpose machine learning tools. I’ve looked at OpenCV in the past, and I think it is a good choice.

The input to the network is the frame count and the counts of any early game units seen, plus any opening plan recognized already so the network can decide whether to stick with it, plus the high-level features of the rule-based recognizer for proxy, worker rush, factory tech, and number of known enemy bases. With those high-level features, the network doesn’t have much thinking to do. The output is one of 10 possible opening plans, and if none is recognized it falls back on the rule-based recognizer again. In the best case, this neural network can’t provide much of a boost.

The key question is: How was the network trained? I looked all around and found no sign of an explanation. It could have been trained to reproduce the values returned by the original rule-based method, but what would be the point? It could have been trained to recognize the plan that leads to the highest win rate, but that would be expensive and the network might learn to deliberately misrecognize plans, so that’s less likely. It could have been trained on data from games with opponents that were coded to play specific plans, but that would risk lack of variety in the training data. It could have been trained on hand-labeled data, if they took the time to label that much data. My best guess is that they wrote a replay analysis tool that labels the training data: each example pairs “here is what I saw” with “here is what I should recognize,” the plan that would have been chosen if the scout saw everything. The scout is early and normally does see everything that is in the enemy base (not proxied in some hidden location or built later outside the main), so if my guess is right, the trained network should normally be accurate.

I ran ISAMind locally to see what the predictions looked like. My impression is that the predictions were generally accurate, but sometimes sluggish. Steamhammer recognizes a zealot rush if it sees 2 gateways and no gas. ISAMind doesn’t recognize a zealot rush until it sees the zealots. Whether the sluggishness is harmful depends on whether the bot needs to react quickly. ISAMind recognizes a fast rush at the first sign, and that is the most important enemy plan to recognize immediately. Most input information has to come from the scout probe, and the scout probe circles the base without ever looking outside, so the network rarely sees enemy expansions unless the probe passes them on the way in. Because scouting is not thorough enough, the input data is not complete enough to recognize all the possible plans reliably. (Locutus added the commit “Scout the enemy natural” only later, on 19 September.)
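The difference between eager and sluggish recognition of a zealot rush can be sketched as two small predicates. This is only an illustration: the struct, its field names, and the zealot threshold are assumptions, and the second function is a hand-written stand-in for the observed behavior of ISAMind’s network, not its actual logic.

```cpp
#include <cassert>

// Hypothetical summary of what the scout has seen so far.
// Field names are illustrative, not taken from Steamhammer or ISAMind.
struct ScoutReport {
    int gateways = 0;      // enemy gateways seen
    int gasBuildings = 0;  // enemy assimilators seen
    int zealots = 0;       // enemy zealots seen
};

// Eager rule as described in the text: 2 gateways and no gas
// is enough to call a zealot rush.
bool eagerZealotRush(const ScoutReport& seen) {
    return seen.gateways >= 2 && seen.gasBuildings == 0;
}

// Sluggish stand-in: wait until zealots are actually visible.
// The default threshold of 2 is an assumption for illustration.
bool sluggishZealotRush(const ScoutReport& seen, int minZealots = 2) {
    assert(minZealots > 0);
    return seen.zealots >= minZealots;
}
```

With 2 gateways seen and no gas, the eager rule fires at once, while the sluggish one stays silent until zealots come into view.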

Overall, ISAMind seems like a cute little job that doesn’t try to accomplish much. It is like a class project or an early experiment to start getting tools and ideas into place.

looking at TitanIron

TitanIron is, as all signs indicated, a fork of Iron. It forks from the latest Iron, the AIIDE 2017 Iron. The Iron that played in CIG 2018 was carried over from the previous CIG 2017 tournament, and is an earlier version.

#15 TitanIron crashed in 30% of its games. Its win rate was 51.46% overall, or 73.59% in non-crash games. #6 Iron itself (an earlier version) finished with 74.31% win rate, so TitanIron does not seem to be an improvement, even discounting poor code quality. Curious point: #9 LetaBot upset Iron, because LetaBot copes well with vulture and wraith harassment. But TitanIron upset LetaBot. Another curious point: TitanIron performed poorly on the map Andromeda and strongly on Destination, and about equally well on the other 3 maps. Andromeda seems a surprising map to have trouble with.
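The relation between the overall and non-crash win rates can be checked with a little arithmetic. The sketch below assumes that every crashed game counts as a loss; the function names are made up for illustration.

```cpp
#include <cassert>

// Win rate over games that did not crash, assuming every crash is a loss.
// Rates are fractions in [0, 1].
double nonCrashWinRate(double overallWinRate, double crashRate) {
    assert(crashRate >= 0.0 && crashRate < 1.0);
    return overallWinRate / (1.0 - crashRate);
}

// Crash rate implied by the two reported win rates, under the same assumption.
double impliedCrashRate(double overallWinRate, double nonCrashRate) {
    assert(nonCrashRate > 0.0);
    return 1.0 - overallWinRate / nonCrashRate;
}
```

Under that assumption, `impliedCrashRate(0.5146, 0.7359)` comes out just above 0.30, which agrees with the rounded 30% crash rate.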

I watched some replays. In Iron-TitanIron games, the two played identical build orders until the first factory finished, when Iron made 1 vulture first while TitanIron immediately added a machine shop to get the vulture upgrades faster. The bigger difference came later, when Iron built a starport and made wraiths while TitanIron did not. I got the impression that TitanIron rarely or never goes air. The expense of going air puts TitanIron ahead in vultures for a while, so that it won some games, but it seemed that if the vulture pressure did not push Iron over the edge, then Iron would strike back and take the advantage. I watched only 1 game Locutus-TitanIron, because Locutus’s proxy pylon trick misled TitanIron just as it does Iron, and Locutus won easily. I watched a strange game against AIUR where TitanIron built a second command center far from its natural, slowly floated it over, left it in the air, and built a new command center underneath. Not all the bugs are crashing bugs. In the picture, TitanIron is losing to AIUR. Notice the nicely spaced tanks, the spider mines directly next to one tank, the barracks floating in an unhelpful position, and the spare command center in the air.

Overall, my impression is that TitanIron’s play is often similar to Iron’s. Unlike Iron, it does not make air units (it seems to have drop skills, but I didn’t run into any games with drop). Against protoss, TitanIron makes more tanks and uses them more cautiously and often clumsily. TitanIron also seems a bit fonder of expanding and growing its economy.

TitanIron adds over 4,000 lines of code to Iron. It was made by a team of 10, so that’s not an excessive amount of new code. The crash rate and the score suggest that the team was not disciplined enough in code quality and testing (of course Steamhammer crashed even more, so I don’t get to brag). Read on and you’ll see what most of the new lines of code do. I question the choices of where to spend effort. I’m not sure what the plan behind TitanIron was supposed to be.

openings

Iron does not play different openings as such. Conceptually, I see Iron as playing one opening which it varies reactively. TitanIron adds a directory opening with code which allows it to define specific build orders. The build order system is loosely modeled on Steamhammer’s, using similar names (which are not the same as UAlbertaBot’s names)—some members of the team have worked on Steamhammer forks.

TitanIron knows 3 specific build orders, named 8BB CC (1 barracks expand), SKT (tanks first), and 5BB (marines). Based on watching replays, TitanIron retains and uses Iron’s reactive opening, with modifications.

opponent-specific strategies

Iron does not recognize opponents by name. TitanIron recognizes 2 specific opponents: Locutus and PurpleSwarm. The zerg PurpleSwarm is a curious choice, since it did not play in CIG. Maybe they found it an interesting test opponent? In any case, Locutus is the main focus. It is recognized in 4 strategy classes, Locutus, SKT, TankAdvance, and Walling. In Iron’s codebase, any number of strategies can be active at the same time, and other parts of the code check by name which strategies are active to suit their actions to the situation.

	Locutus::Locutus()
	{
		std::string enemyName = him().Player()->getName();
		if (enemyName == "Locutus" || enemyName == "locutus")
		{
			me().SetOpening("SKT");
			m_detected = true;
		}
	}

SKT (defined in opening/opening.cpp) builds a barracks and refinery on 11, then adds 2 factories and gets tanks before vultures. It sounds as though it should refer to the “SK terran” unit mix of marines and medics with science vessels and no tanks, but it doesn’t. The Locutus strategy turns itself off (if I understand the code’s intent correctly) after all 4 dark templar of Locutus’s DT drop are dead, or after frame 13,000. Various buildings (barracks, factory, e-bay, turret) recognize when the Locutus strategy is active and carry out scripted actions. The name “Locutus” also activates the TankAdvance strategy which seems to first guard the natural and then perform a tank push, and deactivates the Walling strategy after frame 11,000 or when above 12 marines, causing the barracks to lift and open the wall.
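The switch-off conditions described above can be restated as two predicates. This is a sketch of my reading of the intent, not TitanIron’s code: the struct and function names are hypothetical, and whether the frame limits are inclusive is an assumption.

```cpp
#include <cassert>

// Hypothetical snapshot of the game state; names are illustrative only.
struct GameSnapshot {
    int frame = 0;              // current game frame
    int darkTemplarKilled = 0;  // enemy dark templar killed so far
    int marines = 0;            // own marines alive
};

// Locutus strategy: stays on until all 4 dark templar of the DT drop
// are dead, or until frame 13,000 has passed.
bool locutusStrategyActive(const GameSnapshot& s) {
    assert(s.frame >= 0);
    return s.darkTemplarKilled < 4 && s.frame <= 13000;
}

// Walling strategy: stays on until frame 11,000 has passed or there
// are more than 12 marines; after that the barracks lifts.
bool wallingActive(const GameSnapshot& s) {
    assert(s.frame >= 0);
    return s.frame <= 11000 && s.marines <= 12;
}
```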

TitanIron scored a total of 1 win out of 125 games against Locutus, so the special attention does not seem to have paid off.

PurpleSwarm gets less attention. (The question is why it got any.)

	Purpleswarm::Purpleswarm()
	{
		std::string enemyName = him().Player()->getName();
		if (him().Race() == BWAPI::Races::Zerg &&
			(enemyName == "Purpleswarm" || enemyName == "purpleswarm" || enemyName == "PurpleSwarm"))
		{
			me().SetOpening("5BB");
			m_detected = true;
		}
	}

5BB (also defined in opening/opening.cpp) builds barracks on 10 and 12, later adding a third barracks and training marines up to 30. I don’t see any other cases where TitanIron uses this opening. The rest of the code has no special instructions for PurpleSwarm or 5BB.

other new files

Besides the opening directory, TitanIron adds 16 files in the strategy and behavior directories, defining 8 strategies and behaviors. The added strategies are:

GuardNatural
Locutus
PurpleSwarm
SKT
TankAdvance

These are remarkable for being all and only the classes used when Locutus or PurpleSwarm is recognized. Do they have any other purpose? I didn’t dig into it, but I suspect that GuardNatural and TankAdvance may be used more widely against protoss.

The added unit behaviors are:

GuardLoc - guard a location
HangingBase - carry out drops
SKTAttack - related to SKT

GuardLoc has some connection with GuardNatural, but seems to be a general-purpose behavior, as far as I can tell. I’m not sure how HangingBase got its name.

The new opening directory and the newly added strategy and behavior files account for about 2/3rds of the lines of code added to Iron. The rest is scattered through the code and not as easy to inventory, but surely much of it must be uses of the new openings, strategies, and behaviors. I do see a lot of changes related to expanding.

SAIDA’s learning and SAIDA’s weaknesses

SAIDA is holding its position as #1 on SSCAIT, but it is under constant attack from other bots and loses some games. On the one hand, SAIDA has weaknesses against early harassment and timing attacks, especially if the opponent denies scouting. On the other hand, SAIDA appears to have a learning mechanism that recognizes rush timing and figures out a defense. The SAIDA page describes it as “He also catches perfect rush timing by using information he collected.” That’s a vague description, but the behavior does appear to involve learning from experience. MicroDK noted that SAIDA writes data only after it loses; this must be why. For example, BananaBrain tried a dark templar rush and won a series of games, but finally the learning kicked in and SAIDA figured out how to get turrets in time to stop it (SAIDA’s code was not updated). Since then, BananaBrain has mostly lost games, defeating SAIDA only once, in this game where the turret was seconds late.

Other examples include PurpleSpirit winning one game with BBS then being unable to win with it again, and Krasi0 winning with its fast barracks marine cheese with similar results.

In the latest attacks, Locutus won with center gates, making only 2 zealots before switching into dragoons, and Krasi0 added a bunker to its marine cheese to overcome SAIDA’s vulture counter to the marines (SAIDA crashed this game). Will SAIDA learn to defeat these tricks too? I don’t know, let’s find out!

How powerful is this learning mechanism? Surely there must be attacks that it cannot figure out how to forestall—or can’t figure out in reasonable time. If you find 2 winning tricks and switch between them, can it learn to defend against both? If you DT rush once so that it learns to get early turrets, does it get early turrets for the rest of time after you switch back to regular play? The unnecessary turrets give you a small advantage, and at a high level of play, small advantages are big.

Here are some of the weaknesses I see in SAIDA’s play.

Poor defense against unscouted early attacks, mitigated by the learning mechanism. SAIDA loses more SCVs than it should.
SAIDA recovers poorly from economic setbacks. It does not replenish lost SCVs as well as it should, and stops expanding after a while. If you gain an early lead, you can win by holding on and waiting for SAIDA to mine out.
SAIDA is vulnerable to mine drags. It sees no danger in having its spider mines and its forces next to each other. It will even place mines in its mineral line, begging you to blow up its SCVs.
SAIDA does not know how to build in safe locations. On some maps, like Moon Glaive, parts of the main base are easily sieged from outside. Krasi0 has won games by blasting down factories that are in range, and SAIDA keeps trying to rebuild in places that are also in range.
SAIDA is consistent and predictable. It varies to counter the opponent, but at heart always plays the same strategy and the same tactics. The dropships always fly along the edge.

SAIDA also has great strengths. The greatest may be the big red animated arrow that points out the main attack position. As long as SAIDA has a monopoly on big animated arrows, I think it will remain #1.

AITT bots

Submission is closed for the AI Tinycraft Tournament (AITT), for bots limited to 3000 bytes of source code. 3 of the tiny bots have appeared on SSCAIT. Will more follow? Naturally, very small bots play very simple strategies:

Oh Fish - zerg 4 pool
PotatoMasher - protoss zealot rush
PurpleWavelet - protoss zealot rush

CIG 2018 - 8 game limit and other problems

I checked ISAMind and verified that it is affected by the same 8 game problem as Locutus and McRave: Its learning files store data for only 8 games total, not about 125 games as expected. Tscmoo also has a suspiciously small amount of learning data and may be affected. Ziabot has a problem with its learning data, but it looks like a different problem. Other bots appear unaffected, as far as I can judge—I could be wrong, because I don’t understand how they all work.

It’s unclear what effect the 8 game problem had on the tournament. In the best case, the CIG organizers pulled data incorrectly and the tournament itself ran normally. That seems unlikely to me. More likely, learning data for some bots was lost 8 rounds before the end. In that case, it is possible that most of the tournament ran normally, and one error near the end did not much affect results. The fact that the affected bots finished high supports the hypothesis—though, like PurpleWave, they could have finished high because they’re that good even when handicapped. In the worst case, there may have been repeated problems throughout the tournament. I’ll see if I can think of a way to use the detailed results log to narrow down the possibilities.

I feel that CIG 2018 had a lot of problems.

The 8 game problem, affecting Locutus, McRave, ISAMind, and possibly Tscmoo.
JVM bots did not write learning data at all, affecting PurpleWave, Tyr, and Ecgberht.
Ziabot’s learning problem (it might be a bot bug rather than a tournament bug, but Zia has always been reliable for me).
Overkill’s build order breakdown.

That’s a large proportion of the entrants affected by tournament surprises of one kind or another. What other problems are there that I haven’t noticed? I’ve only watched a few replays so far.

When I wrote to the CIG organizers to warn them that Steamhammer might crash a lot, they sent back what I found to be a rude reply which implied that giving them heads-up was a wrong thing to do. That is of course down to language and cultural differences. But still, communicating with the participants is part of running a tournament.

I may skip CIG next year.

CIG 2018 - what Locutus learned

Locutus only recorded 8 games. It is configured to retain 200 game records, and I read the source code and verified that Locutus does not intentionally drop game records before the limit of 200. Recording exactly 8 games is the same problem that McRave suffered, and must be due to CIG problems. I don't know what the underlying problem was. My suspicion is that CIG organizers or tournament software may have accidentally or mistakenly cleared learning data for some bots. If that is what happened, and it happened once 8 games before the end of the tournament, it seems likely that it happened more than once. Who knows, though? The error might be somewhere else. Maybe they mistakenly shipped us data from after round 8 instead of round 125—in that case the tournament may have run normally, and only the data about it is wrong.

Locutus has prepared data for some opponents, stored in the AI directory. When Locutus finds it has no game records for a given opponent, it looks in AI to see if it has prepared data, and if so, it reads in those game records. At the end of the game, it writes out the prepared game records along with the record for the newly played game, and from then on the prepared records are treated like any others and retained unless and until the 200 record limit is passed.
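The fallback to prepared data and the record limit can be sketched as follows. This is a minimal illustration under assumptions, not Locutus’s implementation: the types and function names are invented, file I/O is left out, and dropping the oldest records first when the limit is passed is my guess.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// One remembered game (hypothetical layout).
struct GameRecord {
    std::string opening;
    bool won;
};

// Records to start a game with: the learned ones if any exist,
// otherwise the prepared ones shipped with the bot.
std::vector<GameRecord> startingRecords(const std::vector<GameRecord>& learned,
                                        const std::vector<GameRecord>& prepared) {
    return learned.empty() ? prepared : learned;
}

// At the end of a game, append the new record and keep at most maxRecords.
// Assumption: the oldest records are the ones dropped.
void recordGame(std::vector<GameRecord>& records, const GameRecord& latest,
                std::size_t maxRecords = 200) {
    assert(maxRecords > 0);
    records.push_back(latest);
    if (records.size() > maxRecords) {
        const std::size_t excess = records.size() - maxRecords;
        records.erase(records.begin(),
                      records.begin() + static_cast<std::ptrdiff_t>(excess));
    }
}
```

Starting from 14 prepared records and recording 8 games leaves 22 records, which is the pattern of prepared plus 8 that shows up in the learned data.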

How many other bots were affected by the 8 game problem?

Here is Locutus’s prepared data. Against some opponents, like McRave, Locutus picks out openings to avoid at first. If other openings don’t win either, I’m sure Locutus will come back and try these anyway. Against others, it picks out winners to try first. For some, it simply provides data. Most but not all of the prepared data is for opponents which were carried over from last year, for which pre-learning is sure to be helpful... if it is done on the same maps.

#3 mcrave

opening	games	wins
12Nexus5ZealotFECannons	1	0%
Turtle	1	0%
2 openings	2	0%

#6 iron

opening	games	wins
DTDrop	14	100%
1 openings	14	100%

#7 zzzkbot

opening	games	wins
ForgeExpand5GateGoon	2	100%
1 openings	2	100%

#11 ualbertabot

opening	games	wins
4GateGoon	1	0%
9-9GateDefensive	2	50%
ForgeExpand5GateGoon	15	93%
3 openings	18	83%

#14 aiur

opening	games	wins
4GateGoon	3	100%
9-9GateDefensive	1	100%
2 openings	4	100%

#16 ziabot

opening	games	wins
9-9GateDefensive	1	0%
ForgeExpand5GateGoon	1	100%
2 openings	2	50%

#19 terranuab

opening	games	wins
DTDrop	10	100%
1 openings	10	100%

#21 opprimobot

opening	games	wins
DTDrop	11	100%
1 openings	11	100%

#22 sling

opening	games	wins
ForgeExpand5GateGoon	2	100%
1 openings	2	100%

#23 srbotone

opening	games	wins
DTDrop	7	100%
PlasmaProxy2Gate	1	100%
2 openings	8	100%

#24 bonjwa

opening	games	wins
DTDrop	6	100%
PlasmaProxy2Gate	1	100%
2 openings	7	100%

overall

	total		PvT		PvP		PvZ		PvR
opening	games	wins	games	wins	games	wins	games	wins	games	wins
12Nexus5ZealotFECannons	1	0%			1	0%
4GateGoon	4	75%			3	100%			1	0%
9-9GateDefensive	4	50%			1	100%	1	0%	2	50%
DTDrop	48	100%	48	100%
ForgeExpand5GateGoon	20	95%					5	100%	15	93%
PlasmaProxy2Gate	2	100%	2	100%
Turtle	1	0%			1	0%
total	80	92%	50	100%	6	67%	6	83%	18	83%
openings played	7		2		4		2		3

Here is Locutus’s learned data. In every case, the number of games recorded is 8 plus the number of games in the prepared data. With only 8 games there is not much to go on, but the prepared data does seem to have helped Locutus choose successful openings.

#2 purplewave

opening	games	wins
12Nexus5ZealotFECannons	1	0%
4GateGoon	1	0%
9-9GateDefensive	5	80%
Proxy9-9Gate	1	0%
4 openings	8	50%

#3 mcrave

opening	games	wins
12Nexus5ZealotFECannons	1	0%
4GateGoon	3	67%
Proxy9-9Gate	5	100%
Turtle	1	0%
4 openings	10	70%

#4 tscmoo

opening	games	wins
4GateGoon	1	0%
9-9GateDefensive	1	0%
ForgeExpand5GateGoon	4	25%
Proxy9-9Gate	2	50%
4 openings	8	25%

#5 isamind

opening	games	wins
4GateGoon	6	83%
9-9GateDefensive	1	100%
Proxy9-9Gate	1	100%
3 openings	8	88%

#6 iron

opening	games	wins
DTDrop	22	95%
1 openings	22	95%

#7 zzzkbot

opening	games	wins
ForgeExpand5GateGoon	7	86%
ForgeExpandSpeedlots	2	50%
Proxy9-9Gate	1	0%
3 openings	10	70%

#8 microwave

opening	games	wins
ForgeExpand5GateGoon	8	100%
1 openings	8	100%

#9 letabot

opening	games	wins
DTDrop	8	88%
1 openings	8	88%

#10 megabot

opening	games	wins
4GateGoon	8	100%
1 openings	8	100%

#11 ualbertabot

opening	games	wins
4GateGoon	1	0%
9-9GateDefensive	2	50%
ForgeExpand5GateGoon	23	91%
3 openings	26	85%

#12 tyr

opening	games	wins
4GateGoon	8	100%
1 openings	8	100%

#13 ecgberht

opening	games	wins
DTDrop	8	88%
1 openings	8	88%

#14 aiur

opening	games	wins
12Nexus5ZealotFECannons	1	0%
2GateDTExpo	1	100%
4GateGoon	5	80%
9-9GateDefensive	1	100%
Proxy9-9Gate	4	75%
5 openings	12	75%

#15 titaniron

opening	games	wins
DTDrop	8	100%
1 openings	8	100%

#16 ziabot

opening	games	wins
9-9GateDefensive	1	0%
ForgeExpand5GateGoon	6	83%
ForgeExpandSpeedlots	2	50%
Proxy9-9Gate	1	100%
4 openings	10	70%

#17 steamhammer

opening	games	wins
ForgeExpand5GateGoon	8	100%
1 openings	8	100%

#18 overkill

opening	games	wins
ForgeExpand5GateGoon	8	100%
1 openings	8	100%

#19 terranuab

opening	games	wins
DTDrop	18	100%
1 openings	18	100%

#20 cunybot

opening	games	wins
ForgeExpand5GateGoon	8	100%
1 openings	8	100%

#21 opprimobot

opening	games	wins
DTDrop	19	100%
1 openings	19	100%

#22 sling

opening	games	wins
ForgeExpand5GateGoon	10	100%
1 openings	10	100%

#23 srbotone

opening	games	wins
DTDrop	15	100%
PlasmaProxy2Gate	1	100%
2 openings	16	100%

#24 bonjwa

opening	games	wins
DTDrop	14	100%
PlasmaProxy2Gate	1	100%
2 openings	15	100%

#25 stormbreaker

opening	games	wins
ForgeExpand5GateGoon	8	100%
1 openings	8	100%

#26 korean

opening	games	wins
ForgeExpand5GateGoon	8	100%
1 openings	8	100%

#27 salsa

opening	games	wins
ForgeExpand5GateGoon	8	100%
1 openings	8	100%

overall

	total		PvT		PvP		PvZ		PvR
opening	games	wins	games	wins	games	wins	games	wins	games	wins
12Nexus5ZealotFECannons	3	0%			3	0%
2GateDTExpo	1	100%			1	100%
4GateGoon	33	82%			31	87%			2	0%
9-9GateDefensive	11	64%			7	86%	1	0%	3	33%
DTDrop	112	97%	112	97%
ForgeExpand5GateGoon	106	93%					79	97%	27	81%
ForgeExpandSpeedlots	4	50%					4	50%
PlasmaProxy2Gate	2	100%	2	100%
Proxy9-9Gate	15	73%			11	82%	2	50%	2	50%
Turtle	1	0%			1	0%
total	288	90%	114	97%	54	80%	86	93%	34	71%
openings played	10		2		6		4		4

CIG 2018 - what Steamhammer learned

I wrote a new script to analyze Steamhammer’s learning data. A couple of points:

1. Steamhammer crashed in nearly half of its games in CIG 2018. It can’t save learning data after a crash, so against some opponents Steamhammer had few opportunities to experiment. The number of crashes varied strongly depending on the opponent.

2. Steamhammer was set to remember the previous 100 games, since I figure there’s no play advantage to remembering more. The tournament was 125 rounds long. So in the tables below, “100 games” means that Steamhammer played at least 100 games without crashing, and up to 25 games may have been dropped, the early games. Against some weak opponents, Steamhammer learned, within 25 games, how to win 100% of the remaining games, and those tables give a 100% win rate for remembered games. Steamhammer did not score 100% against any opponent overall; it always had some losses in early games.

I should be able to run the same analysis for Steamhammer forks which retain Steamhammer’s opponent model file format.
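The kind of per-opening summary shown in the tables can be sketched like this. It is not the actual script: the record layout and names are hypothetical, reading the opponent model file is left out, and rounding percentages to the nearest whole number is an assumption.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// One remembered game (hypothetical layout).
struct GameRecord {
    std::string opening;
    bool won;
};

// One table row: games played and games won with an opening.
struct OpeningStats {
    int games = 0;
    int wins = 0;
};

// Win percentage rounded to the nearest whole number, using integer math.
int winPercent(const OpeningStats& s) {
    assert(s.games > 0);
    return (200 * s.wins + s.games) / (2 * s.games);
}

// Group game records by opening name; std::map keeps the rows sorted by name.
std::map<std::string, OpeningStats> summarize(const std::vector<GameRecord>& records) {
    std::map<std::string, OpeningStats> table;
    for (const GameRecord& r : records) {
        OpeningStats& row = table[r.opening];
        ++row.games;
        if (r.won) ++row.wins;
    }
    return table;
}
```

For example, 5 wins in 9 games gives `winPercent` of 56.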

#1 Locutus

opening	games	wins
2HatchHydraBust	1	0%
3HatchHydraExpo	2	0%
3HatchLingBust	1	0%
3HatchLingExpo	1	0%
4HatchBeforeGas	1	0%
OverpoolSpeed	9	56%
6 openings	15	33%

A mystery is solved. Why was Steamhammer’s crash rate higher than I expected? Because many opponents learned to make Steamhammer crash. A crash for the opponent is a win, and the bot doesn’t care how it wins, so if it can learn a plan that makes the opponent crash reliably, it will. The stronger opponents tend to be learning bots, so Steamhammer crashed more often on average against strong opponents. This also means that my glib conclusion, “Steamhammer won 66% of non-crash games, so it seems to have kept up with general progress,” is not sound. The non-crash games were mostly against weak opponents.

Locutus was lucky that it could figure out how to break Steamhammer. As Bruce mentioned in a comment, this Locutus version had a bug when facing certain zergling timings, and Steamhammer quickly figured out how to exploit the bug. It’s possible that Steamhammer minus the crash would have upset Locutus.

#2 PurpleWave

opening	games	wins
11Gas10PoolMuta	1	0%
3HatchHydra	3	0%
3HatchLurker	1	0%
4PoolSoft	1	0%
7Pool12Hatch	1	0%
7PoolSoft	1	0%
9Hatch8Pool	1	0%
9HatchExpo9Pool9Gas	1	0%
9PoolSpeed	1	0%
AntiFactory	1	0%
Over10Hatch	6	0%
Over10Hatch1Sunk	7	0%
Over10Hatch2Sunk	18	0%
Over10HatchBust	1	0%
Over10HatchSlowLings	4	0%
OverhatchMuta	1	0%
OverpoolHatch	1	0%
OverpoolTurtle	3	0%
ZvP_3HatchPoolHydra	2	0%
ZvP_4HatchPoolHydra	1	0%
ZvT_12PoolMuta	1	0%
ZvZ_Overpool11Gas	1	0%
22 openings	58	0%

PurpleWave shut out Steamhammer. It didn’t learn to make Steamhammer crash because every game was a win for it anyway. Steamhammer desperately tried alternatives all over the map, including crazy all-ins and openings intended for ZvT and ZvZ, and nothing worked.

#3 McRave

opening	games	wins
11Gas10PoolLurker	1	0%
4HatchBeforeGas	1	0%
9HatchExpo9Pool9Gas	1	0%
9PoolSpeed	5	100%
ZvP_3HatchPoolHydra	2	0%
5 openings	10	50%

#4 tscmoo

opening	games	wins
9PoolExpo	1	0%
9PoolHatch	1	0%
9PoolSunkHatch	1	0%
AntiFact_2Hatch	1	0%
Over10Hatch2Sunk	1	0%
OverhatchExpoLing	13	15%
OverpoolSpeed	22	23%
7 openings	40	18%

#5 ISAMind

opening	games	wins
3HatchHydraExpo	1	0%
4HatchBeforeGas	1	0%
OverpoolSpeed	4	100%
ZvP_2HatchMuta	7	0%
ZvP_3HatchPoolHydra	6	0%
5 openings	19	21%

#6 Iron

opening	games	wins
2HatchHydra	1	0%
3HatchLingExpo	2	0%
4PoolHard	1	0%
6PoolSpeed	1	0%
9Hatch8Pool	1	0%
9HatchMain9Pool9Gas	1	0%
9PoolSunkSpeed	1	0%
AntiFact_13Pool	4	0%
AntiFact_2Hatch	83	12%
AntiFactory	1	0%
Over10Hatch	1	0%
PurpleSwarmBuild	1	0%
ZvP_2HatchMuta	1	0%
ZvT_12PoolMuta	1	0%
14 openings	100	10%

Iron is not a learning bot, so it did not learn to crash Steamhammer. Still, these results show a weakness in Steamhammer: Its best opening against Iron is AntiFactory, which it tried only once in these 100 games. Steamhammer did not explore enough. I tried to fix the weakness in Steamhammer 2.0.
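To show what “exploring enough” could mean, here is a sketch of the standard UCB1 rule, which adds a bonus to openings that have been tried only a few times. This is a textbook technique used for illustration; it is not how Steamhammer chooses openings, and the type and function names are made up.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <string>
#include <vector>

// One table row (hypothetical layout).
struct OpeningStats {
    std::string name;
    int games;
    int wins;
};

// UCB1 score: observed win rate plus an exploration bonus that shrinks
// as the opening is played more often.
double ucb1Score(const OpeningStats& s, int totalGames) {
    assert(s.games > 0 && totalGames >= s.games);
    const double mean = static_cast<double>(s.wins) / s.games;
    const double bonus =
        std::sqrt(2.0 * std::log(static_cast<double>(totalGames)) / s.games);
    return mean + bonus;
}

// Index of the opening with the highest UCB1 score; the first one wins ties.
std::size_t chooseOpening(const std::vector<OpeningStats>& table) {
    assert(!table.empty());
    int total = 0;
    for (const OpeningStats& s : table) total += s.games;
    std::size_t best = 0;
    for (std::size_t i = 1; i < table.size(); ++i) {
        if (ucb1Score(table[i], total) > ucb1Score(table[best], total)) best = i;
    }
    return best;
}
```

Given an opening with 10 wins in 83 games and another with 0 wins in 1 game, this rule picks the one played once, because its bonus is far larger. A rule of this kind would keep returning to rarely tried openings instead of settling on one early.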

#7 ZZZKBot

opening	games	wins
11Gas10PoolMuta	1	0%
8Pool	7	29%
9HatchMain9Pool9Gas	1	0%
9PoolSpeed	1	0%
OverhatchMuta	1	0%
Overpool+1	1	0%
OverpoolSpeed	1	0%
ZvZ_12HatchMain	2	0%
ZvZ_12Pool	1	0%
ZvZ_12PoolLing	48	58%
ZvZ_Overgas9Pool	2	0%
ZvZ_Overpool9Gas	2	0%
12 openings	68	44%

#8 Microwave

opening	games	wins
9PoolSunkHatch	5	80%
9PoolSunkSpeed	27	67%
OverpoolSunk	1	0%
OverpoolTurtle	3	33%
ZvZ_12PoolLing	1	0%
5 openings	37	62%

This looks like successful learning. Too bad Steamhammer completed only 37 of the 125 games without crashing.

#9 LetaBot

opening	games	wins
11Gas10PoolLurker	1	0%
2HatchLurkerAllIn	4	0%
3HatchHydraExpo	1	0%
3HatchLurker	13	38%
9HatchExpo9Pool9Gas	45	36%
OverpoolLurker	13	31%
ZvP_2HatchMuta	1	0%
ZvT_12PoolMuta	1	0%
ZvT_13Pool	1	0%
ZvT_3HatchMuta	1	0%
10 openings	81	31%

#10 MegaBot

opening	games	wins
11Gas10PoolLurker	1	0%
3HatchHydra	1	0%
3HatchHydraExpo	1	0%
3HatchLingExpo	21	43%
Over10Hatch	1	0%
OverhatchExpoLing	1	100%
ZvP_3HatchPoolHydra	2	0%
7 openings	28	36%

#11 UAlbertaBot

opening	games	wins
3HatchLingExpo	1	0%
5PoolHard2Player	1	0%
9PoolExpo	1	0%
9PoolSpeed	1	0%
9PoolSunkHatch	46	33%
9PoolSunkSpeed	29	48%
Over10Hatch1Sunk	2	0%
OverpoolSpeed	1	0%
ZvZ_Overpool9Gas	1	0%
9 openings	83	35%

#12 Tyr

opening	games	wins
9PoolHatch	5	100%
ZvP_3HatchPoolHydra	5	0%
2 openings	10	50%

#13 Ecgberht

opening	games	wins
11Gas10PoolLurker	10	50%
2HatchLurker	23	61%
2HatchLurkerAllIn	44	75%
Over10HatchBust	3	33%
OverpoolLurker	8	75%
OverpoolSpeed	3	33%
ZvT_13Pool	1	0%
7 openings	92	65%

#14 Aiur

opening	games	wins
11Gas10PoolLurker	1	100%
5PoolHard2Player	1	100%
9PoolSunkHatch	1	100%
9PoolSunkSpeed	2	100%
Over10Hatch	1	0%
Over10Hatch1Sunk	2	50%
Over10Hatch2Hard	1	100%
Over10HatchSlowLings	1	100%
OverpoolSpeed	2	100%
OverpoolTurtle	3	67%
10 openings	15	80%

#15 TitanIron

opening	games	wins
3HatchLingBust	1	0%
AntiFact_13Pool	6	50%
AntiFact_2Hatch	1	0%
AntiFactory	74	42%
Over10Hatch2Sunk	1	0%
OverhatchExpoMuta	1	0%
OverpoolLurker	1	0%
ZvZ_Overgas9Pool	14	21%
ZvZ_Overpool9Gas	1	0%
9 openings	100	37%

This selection of openings implies that TitanIron plays a factory-first build against zerg, like Iron, and is a non-learning bot, like Iron. Later I’ll look into the source and find out for sure.

#16 Ziabot

opening	games	wins
11Gas10PoolMuta	4	25%
2.5HatchMuta	1	0%
3HatchHydraBust	1	0%
6PoolSpeed	1	0%
8Pool	7	71%
9Hatch8Pool	1	0%
9PoolHatch	4	50%
ZvP_2HatchTurtle	1	0%
ZvZ_12Pool	1	0%
ZvZ_12PoolMain	16	25%
ZvZ_Overpool11Gas	10	50%
ZvZ_Overpool9Gas	53	74%
12 openings	100	56%

Low win rates against Zia and some other opponents suggest to me that Steamhammer had other new weaknesses besides crashing. I think Steamhammer should score over 80% against Zia.

#18 Overkill

opening	games	wins
11Gas10PoolMuta	10	90%
4PoolHard	23	96%
6PoolSpeed	28	100%
9Hatch8Pool	1	0%
OverhatchLing	2	50%
OverpoolSpeed	13	92%
ZvZ_12HatchExpo	2	50%
ZvZ_12PoolMain	1	0%
8 openings	80	91%

#19 TerranUAB

opening	games	wins
2HatchLurker	52	90%
AntiFact_13Pool	8	88%
AntiFact_2Hatch	9	78%
AntiFactory	31	90%
4 openings	100	89%

#20 CUNYbot

opening	games	wins
11Gas10PoolMuta	9	78%
OverhatchLing	34	97%
ZvZ_12PoolLing	27	96%
ZvZ_Overgas9Pool	1	0%
ZvZ_Overpool9Gas	19	89%
5 openings	90	92%

#21 OpprimoBot

opening	games	wins
11Gas10PoolLurker	3	67%
2HatchLurker	2	50%
2HatchLurkerAllIn	6	83%
6PoolSpeed	19	100%
OverpoolLurker	1	0%
OverpoolSpeed	5	80%
ZvT_12PoolMuta	20	95%
ZvT_3HatchMuta	20	100%
ZvT_3HatchMutaExpo	24	100%
9 openings	100	94%

#22 Sling

opening	games	wins
4PoolHard	4	75%
4PoolSoft	6	100%
5PoolHard2Player	3	100%
ZvZ_12HatchMain	1	0%
ZvZ_Overgas9Pool	1	0%
5 openings	15	80%

The selection of fast rush openings suggests that Sling played a macro strategy which was countered by fast rushes. But I don’t want to draw strong conclusions based on 15 non-crash games out of 125.

#23 SRbotOne

opening	games	wins
11Gas10PoolLurker	14	93%
2HatchLurker	10	90%
2HatchLurkerAllIn	10	90%
3HatchLurker	17	100%
4PoolSoft	17	100%
5PoolHard	7	100%
9HatchExpo9Pool9Gas	4	75%
9PoolLurker	3	100%
OverpoolLurker	5	100%
9 openings	87	95%

The wide range of lurker openings means that SRbotOne by Johan Kayser fought with mostly barracks units. Well, we already knew that.

#24 Bonjwa

opening	games	wins
9PoolExpo	6	100%
9PoolSunkHatch	5	100%
9PoolSunkSpeed	5	100%
AntiFact_2Hatch	3	100%
AntiFactory	5	100%
ZvT_2HatchMuta	1	100%
6 openings	25	100%

#25 Stormbreaker

opening	games	wins
11Gas10PoolMuta	1	100%
4PoolHard	1	100%
9PoolSunkHatch	8	100%
9PoolSunkSpeed	8	100%
OverhatchLing	1	100%
OverhatchMuta	7	100%
OverpoolSpeed	1	100%
OverpoolSunk	7	100%
ZvZ_12HatchExpo	2	100%
ZvZ_12HatchMain	3	100%
ZvZ_12PoolLing	1	100%
ZvZ_12PoolMain	3	100%
12 openings	43	100%

#26 Korean

opening	games	wins
4PoolHard	1	100%
4PoolSoft	3	100%
5PoolHard	5	100%
5PoolHard2Player	3	100%
5PoolSoft	1	100%
6PoolSpeed	6	100%
OverhatchLing	9	100%
OverhatchMuta	12	100%
ZvZ_12HatchExpo	13	100%
ZvZ_12HatchMain	16	100%
ZvZ_12PoolLing	14	100%
ZvZ_12PoolMain	17	100%
12 openings	100	100%

#27 Salsa

opening	games	wins
4PoolHard	2	100%
4PoolSoft	4	100%
5PoolHard	7	100%
5PoolHard2Player	1	100%
5PoolSoft	1	100%
6PoolSpeed	8	100%
OverhatchLing	11	100%
OverhatchMuta	8	100%
ZvZ_12HatchExpo	12	100%
ZvZ_12HatchMain	20	100%
ZvZ_12PoolLing	13	100%
ZvZ_12PoolMain	12	100%
ZvZ_Overgas9Pool	1	100%
13 openings	100	100%

overall

	total		ZvT		ZvP		ZvZ		ZvR
opening	games	wins	games	wins	games	wins	games	wins	games	wins
11Gas10PoolLurker	31	68%	28	71%	3	33%
11Gas10PoolMuta	26	69%			1	0%	25	72%
2.5HatchMuta	1	0%					1	0%
2HatchHydra	1	0%	1	0%
2HatchHydraBust	1	0%			1	0%
2HatchLurker	87	82%	87	82%
2HatchLurkerAllIn	64	73%	64	73%
3HatchHydra	4	0%			4	0%
3HatchHydraBust	1	0%					1	0%
3HatchHydraExpo	5	0%	1	0%	4	0%
3HatchLingBust	2	0%	1	0%	1	0%
3HatchLingExpo	25	36%	2	0%	22	41%			1	0%
3HatchLurker	31	71%	30	73%	1	0%
4HatchBeforeGas	3	0%			3	0%
4PoolHard	32	91%	1	0%			31	94%
4PoolSoft	31	97%	17	100%	1	0%	13	100%
5PoolHard	19	100%	7	100%			12	100%
5PoolHard2Player	9	89%			1	100%	7	100%	1	0%
5PoolSoft	2	100%					2	100%
6PoolSpeed	63	97%	20	95%			43	98%
7Pool12Hatch	1	0%			1	0%
7PoolSoft	1	0%			1	0%
8Pool	14	50%					14	50%
9Hatch8Pool	4	0%	1	0%	1	0%	2	0%
9HatchExpo9Pool9Gas	51	37%	49	39%	2	0%
9HatchMain9Pool9Gas	2	0%	1	0%			1	0%
9PoolExpo	8	75%	6	100%					2	0%
9PoolHatch	10	70%			5	100%	4	50%	1	0%
9PoolLurker	3	100%	3	100%
9PoolSpeed	8	62%			6	83%	1	0%	1	0%
9PoolSunkHatch	66	50%	5	100%	1	100%	13	92%	47	32%
9PoolSunkSpeed	72	65%	6	83%	2	100%	35	74%	29	48%
AntiFact_13Pool	18	56%	18	56%
AntiFact_2Hatch	97	21%	96	21%					1	0%
AntiFactory	112	57%	111	58%	1	0%
Over10Hatch	9	0%	1	0%	8	0%
Over10Hatch1Sunk	11	9%			9	11%			2	0%
Over10Hatch2Hard	1	100%			1	100%
Over10Hatch2Sunk	20	0%	1	0%	18	0%			1	0%
Over10HatchBust	4	25%	3	33%	1	0%
Over10HatchSlowLings	5	20%			5	20%
OverhatchExpoLing	14	21%			1	100%			13	15%
OverhatchExpoMuta	1	0%	1	0%
OverhatchLing	57	96%					57	96%
OverhatchMuta	29	93%			1	0%	28	96%
Overpool+1	1	0%					1	0%
OverpoolHatch	1	0%			1	0%
OverpoolLurker	28	54%	28	54%
OverpoolSpeed	61	56%	8	62%	15	73%	15	87%	23	22%
OverpoolSunk	8	88%					8	88%
OverpoolTurtle	9	33%			6	33%	3	33%
PurpleSwarmBuild	1	0%	1	0%
ZvP_2HatchMuta	9	0%	2	0%	7	0%
ZvP_2HatchTurtle	1	0%					1	0%
ZvP_3HatchPoolHydra	17	0%			17	0%
ZvP_4HatchPoolHydra	1	0%			1	0%
ZvT_12PoolMuta	23	83%	22	86%	1	0%
ZvT_13Pool	2	0%	2	0%
ZvT_2HatchMuta	1	100%	1	100%
ZvT_3HatchMuta	21	95%	21	95%
ZvT_3HatchMutaExpo	24	100%	24	100%
ZvZ_12HatchExpo	29	97%					29	97%
ZvZ_12HatchMain	42	93%					42	93%
ZvZ_12Pool	2	0%					2	0%
ZvZ_12PoolLing	104	79%					104	79%
ZvZ_12PoolMain	49	73%					49	73%
ZvZ_Overgas9Pool	19	21%	14	21%			5	20%
ZvZ_Overpool11Gas	11	45%			1	0%	10	50%
ZvZ_Overpool9Gas	76	74%	1	0%			74	76%	1	0%
total	1596	64%	685	62%	155	26%	633	82%	123	29%
openings played	69		37		36		31		13

This summary table took me hours to get right, so I hope it's useful.

Steamhammer played 69 openings in 1596 non-crash games, which is around 2/3rds of the openings it knows. No single matchup had more than 37 different openings. There were far more games against terran and zerg than against protoss and random, partly due to the crashing pattern. Against the random opponents (Tscmoo and UAlbertaBot), it settled on mostly general-purpose openings, as you might expect. Its best matchup was ZvZ, with a Jaedong-like 82% win rate (and lately, Jaedong crashes half the time too, so they’re just alike).

Openings that were both popular and successful include 2HatchLurker and 2HatchLurkerAllIn versus terran, 6PoolSpeed with a 97% win rate against mostly weak opponents, 9PoolSunkSpeed used across all matchups, and ZvZ specialties OverhatchLing, ZvZ_12PoolLing, and ZvZ_Overpool9Gas. None of the opening choices surprises me, though some of the win rates do.

CIG 2018 - Overkill was broken

Did Overkill actually perform much worse in CIG 2018 than in past years? Here are the bots carried over from 2017 to 2018, with win rates in both years, with numbers from the official results. We see that Overkill collapsed in win rate from 2017 to 2018, a far bigger change than any other bot. Iron performed poorly in 2017 because it failed on the map Hitchhiker. Other bots mostly had modestly lower win rates in the stronger field this year. My 2017 crosstable was calculated from the detailed results, which included some corrupted data and are a little different from the official results—except for Sling, which was a lot different: 26.07% in 2017 versus its official 18.08%, reducing its year-over-year difference.

bot	2017	2018
UAlbertaBot	65.59%	60.58%
Overkill	62.75%	34.68%
Ziabot	61.75%	51.08%
Iron	61.62%	74.31%
Aiur	59.83%	51.54%
TerranUAB	36.78%	34.40%
SRbotOne	34.14%	24.37%
OpprimoBot	30.69%	27.11%
Bonjwa	30.67%	23.57%
Sling	18.08%	26.52%
Salsa	4.64%	1.54%
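
To make the "far bigger change" claim easy to check, here is a minimal sketch that computes each bot's year-over-year change from the table above. The numbers are copied from the table; the names `win_rates` and `changes` are made up for this illustration.

```python
# Win rates in percent, copied from the table above: bot -> (2017, 2018).
win_rates = {
    "UAlbertaBot": (65.59, 60.58),
    "Overkill": (62.75, 34.68),
    "Ziabot": (61.75, 51.08),
    "Iron": (61.62, 74.31),
    "Aiur": (59.83, 51.54),
    "TerranUAB": (36.78, 34.40),
    "SRbotOne": (34.14, 24.37),
    "OpprimoBot": (30.69, 27.11),
    "Bonjwa": (30.67, 23.57),
    "Sling": (18.08, 26.52),
    "Salsa": (4.64, 1.54),
}


def changes(rates):
    """Return (bot, change in percentage points), largest absolute change first."""
    deltas = [(bot, round(y2018 - y2017, 2)) for bot, (y2017, y2018) in rates.items()]
    return sorted(deltas, key=lambda item: abs(item[1]), reverse=True)


for bot, delta in changes(win_rates):
    print(f"{bot:12s} {delta:+7.2f}")
```

Overkill comes out on top with a drop of about 28 points, more than twice the size of the next largest change, which is Iron's gain of about 13 points.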

Was the difference due to the maps? No. In 2017, Overkill scored 57% or more on every map (CIG 2017 bots x maps). In 2018, Overkill scored 38% or below on every map (official results). And 3 of the 5 maps were the same: Tau Cross, Andromeda, and Python.

Did they run different versions of Overkill? The source that they distributed for Overkill is identical in both years. Theoretically they might have run something different by mistake—but it produced the expected files in the write directory, so it would be a surprise.

Finally I downloaded the Overkill replays and watched some. The poor bot’s build orders were severely distorted, skipping over drones and buildings. It would do things like take gas on 7 and then stop all construction, or follow a normal-ish build but drop many drones so that its economy was anemic. Sometimes drones moved erratically instead of mining. It looked similar to play I’ve seen from Steamhammer when latency stuff is way out of whack. Of the games I looked at, some were hopelessly muddled, some were close to normal with only occasional dropped drones, and none were 100% good. I don’t know what the problem was, something corrupted or a server setting that Overkill could not cope with, but whatever it was, Overkill was badly broken and far short of its normal strength.

43864-OVER_ZIAB.REP (Overkill’s last game of the tournament) is an example replay that shows the problems.

It’s possible that some other bots may have been affected. If the difference was in a server setting that Overkill was not ready for, then it would be surprising if every other bot was ready.

CIG 2018 - what Overkill learned

After analyzing AIUR yesterday, I ran a similar (but much simpler) analysis for the classic zerg #18 Overkill. The version in CIG 2018 has not been updated since 2015 and is the same version that still plays on SSCAIT. In 2015 it was a sensation, placing 3rd in both CIG and AIIDE—its place of 18 in this tournament, with about 35% win rate, suggests huge progress over the past 3 years. But keep reading; Overkill appears to have been broken in this tournament. I did this analysis once before: See what Overkill learned in AIIDE 2015.

Classic Overkill knows 3 openings, a 9 pool opening which stays on one base for a good time, and 10- and 12-hatch openings to get mutalisks first. When it chooses 9 pool, that means that the opponent is either rushing (so the 9 pool is necessary to defend) or is being too greedy (which the 9 pool can exploit). Overkill counts some games twice in an attempt to learn faster, so sometimes its total game count is larger than the number of rounds in the tournament (125).

	NinePoolling		TenHatchMuta		TwelveHatchMuta		total
opponent	n	win	n	win	n	win	n	win
#1 Locutus	42	0%	42	0%	41	0%	125	0%
#2 PurpleWave	43	0%	43	0%	42	0%	128	0%
#3 McRave	44	0%	44	0%	43	0%	131	0%
#4 tscmoo	40	0%	40	0%	47	2%	127	1%
#5 ISAMind	42	0%	42	0%	41	0%	125	0%
#6 Iron	54	7%	32	0%	39	3%	125	4%
#7 ZZZKBot	47	2%	39	0%	47	2%	133	2%
#8 Microwave	54	6%	35	0%	42	2%	131	3%
#9 LetaBot	52	6%	33	0%	40	2%	125	3%
#10 MegaBot	60	12%	24	0%	41	7%	125	8%
#11 UAlbertaBot	41	0%	41	0%	48	2%	130	1%
#12 Tyr	40	0%	39	0%	47	2%	126	1%
#13 Ecgberht	57	16%	24	4%	42	12%	123	12%
#14 Aiur	94	34%	14	7%	17	12%	125	28%
#15 TitanIron	36	11%	20	0%	69	16%	125	12%
#16 Ziabot	16	0%	16	0%	93	23%	125	17%
#17 Steamhammer	107	48%	7	0%	10	10%	124	42%
#19 TerranUAB	24	67%	3	0%	98	83%	125	78%
#20 CUNYbot	18	44%	6	17%	101	66%	125	61%
#21 OpprimoBot	36	67%	3	0%	86	76%	125	71%
#22 Sling	67	46%	6	0%	52	42%	125	42%
#23 SRbotOne	23	74%	4	25%	95	89%	122	84%
#24 Bonjwa	75	92%	4	25%	46	87%	125	88%
#25 Stormbreaker	70	91%	2	0%	53	87%	125	88%
#26 Korean	77	99%	2	0%	46	93%	125	95%
#27 Salsa	46	100%	32	94%	46	100%	124	98%
total	1305	36%	597	6%	1372	40%	3274	32%

The 10 hatch opening was useless in this tournament—against every opponent, 10 hatch was the worst choice, at best tying for 0. In 2015, 10 hatch was about as successful as the other openings.
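
The claim about 10 hatch can be checked mechanically. This is a small sketch using the win percentages from the table above (rounded as printed, so a "tie" here means a tie at the displayed precision). The helper names `exceptions` and `ties` are invented for the example.

```python
# (opponent, NinePoolling win %, TenHatchMuta win %, TwelveHatchMuta win %),
# copied from the table above.
rows = [
    ("Locutus", 0, 0, 0), ("PurpleWave", 0, 0, 0), ("McRave", 0, 0, 0),
    ("tscmoo", 0, 0, 2), ("ISAMind", 0, 0, 0), ("Iron", 7, 0, 3),
    ("ZZZKBot", 2, 0, 2), ("Microwave", 6, 0, 2), ("LetaBot", 6, 0, 2),
    ("MegaBot", 12, 0, 7), ("UAlbertaBot", 0, 0, 2), ("Tyr", 0, 0, 2),
    ("Ecgberht", 16, 4, 12), ("Aiur", 34, 7, 12), ("TitanIron", 11, 0, 16),
    ("Ziabot", 0, 0, 23), ("Steamhammer", 48, 0, 10), ("TerranUAB", 67, 0, 83),
    ("CUNYbot", 44, 17, 66), ("OpprimoBot", 67, 0, 76), ("Sling", 46, 0, 42),
    ("SRbotOne", 74, 25, 89), ("Bonjwa", 92, 25, 87), ("Stormbreaker", 91, 0, 87),
    ("Korean", 99, 0, 93), ("Salsa", 100, 94, 100),
]


def exceptions(table):
    """Opponents against which 10 hatch beat at least one other opening."""
    return [name for name, nine, ten, twelve in table if ten > min(nine, twelve)]


def ties(table):
    """Opponents against which 10 hatch tied the worst of the other openings."""
    return [name for name, nine, ten, twelve in table if ten == min(nine, twelve)]


print("10 hatch better than something:", exceptions(rows))
print("10 hatch tied for worst:", ties(rows))
```

The first list is empty, and every opponent in the second list is one where 10 hatch scored 0%.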

Signs are that something was wrong with Overkill in this tournament. In AIIDE 2015, then #3 Overkill scored 23% against then #4 UAlbertaBot, 68% against #5 AIUR, and 99% against #17 OpprimoBot. In CIG 2018, it was 1.6% against UAlbertaBot, 28% against AIUR, 71% against OpprimoBot. All versions appear to be the same in both tournaments—I didn’t look closely, but I did unpack the sources and check dates (in particular, Overkill has file change dates up to 8 October 2015 in both tournaments). Overkill had 14 crash games in CIG 2018, not enough to account for the difference. It’s hard to believe that the maps could have shifted results that much.

Tomorrow: What went wrong with Overkill?

CIG 2018 - what AIUR learned

Here is what the classic protoss bot AIUR learned about each opponent over the course of CIG 2018. AIUR has not been updated in many years and has fallen behind the state of the art, but its varied strategies and learning still make it a tricky opponent in a long tournament. Seeing AIUR's counters for each opponent tells us something about how the opponent played. For past editions, see AIIDE 2017 what AIUR learned and what AIUR learned (AIIDE 2015).

This is generated from data in AIUR's final write directory. There were 125 rounds and 5 maps: one 2-player map, two 3-player maps, and two 4-player maps. For some opponents, all games were recorded, giving 25 games on the 2-player map and 50 games each on 3- and 4-player maps. For most opponents, fewer games were recorded. AIUR recorded 2932 games, and the results table lists 318 crashes for AIUR. 2932 + 318 = 3250, the correct total game count. Unrecorded games were lost due to crashes, and for no other reason.
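
The bookkeeping can be written out as a quick sanity check. This sketch assumes the rounds were split evenly across the 5 maps (which the 25/50/50 figures imply) and that AIUR faced 26 opponents, since the rankings run to #27 and AIUR does not play itself. The constant names are mine.

```python
ROUNDS = 125
MAPS_BY_SIZE = {2: 1, 3: 2, 4: 2}  # starting positions -> number of maps
OPPONENTS = 26                     # assumption: 27 ranked bots minus AIUR itself
RECORDED = 2932                    # games found in AIUR's write directory
CRASHES = 318                      # AIUR crashes listed in the results table

# Assumption: every map is played the same number of times against each opponent.
games_per_map = ROUNDS // sum(MAPS_BY_SIZE.values())
expected_per_size = {size: count * games_per_map for size, count in MAPS_BY_SIZE.items()}
expected_total = OPPONENTS * ROUNDS

print("games per opponent by map size:", expected_per_size)
print("expected total:", expected_total)
print("recorded + crashes:", RECORDED + CRASHES)
```

If the two totals match, crashes account for every unrecorded game.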

First the overview, summing across all opponents.

overall	2		3		4		total
	n	wins	n	wins	n	wins	n	wins
cheese	72	49%	127	65%	132	35%	331	49%
rush	29	41%	269	33%	261	55%	559	44%
aggressive	13	23%	225	68%	184	78%	422	71%
fast expo	33	24%	185	48%	207	48%	425	46%
macro	46	33%	180	52%	135	60%	361	53%
defensive	141	75%	314	73%	379	55%	834	65%
total	334	54%	1300	56%	1298	56%	2932	56%

2, 3, 4 - map size, the number of starting positions
n - games recorded
wins - winning percentage over those games
cheese - cannon rush
rush - dark templar rush
aggressive - fast 4 zealot drop
fast expo - nexus first
macro - aim for a strong middle game army
defensive - try to be safe against rushes

Looking across the bottom row, you can see that AIUR had a plus score on every size of map, and that it had to choose different strategies to do so well. It's a strong result for a bot which has essentially no micro skills and has not been updated since 2014. It does still have the best cannon rush of any bot, if you ask me.

#1 locutus	2		3		4		total
	n	wins	n	wins	n	wins	n	wins
cheese	1	0%	8	0%	25	12%	34	9%
rush	1	0%	10	0%	6	0%	17	0%
aggressive	1	0%	4	0%	5	0%	10	0%
fast expo	1	0%	14	0%	5	0%	20	0%
macro	1	0%	7	0%	4	0%	12	0%
defensive	1	0%	7	14%	5	0%	13	8%
total	6	0%	50	2%	50	6%	106	4%

Even against the toughest opponents, AIUR can scrape a small edge with learning. Against Locutus, it pulled barely above zero, but got a few extra wins because it discovered that its cannon rush occasionally scores on 4-player maps. Results against PurpleWave below are similar. I suspect that if AIUR had played the cannon rush every game, Locutus would have adapted and nullified the edge. Maybe it did, and that’s why the edge is so small.

#2 purplewave	2		3		4		total
	n	wins	n	wins	n	wins	n	wins
cheese	1	0%	8	0%	39	18%	48	15%
rush	1	0%	8	0%	2	0%	11	0%
aggressive	1	0%	10	0%	3	0%	14	0%
fast expo	4	0%	8	0%	2	0%	14	0%
macro	1	0%	10	0%	2	0%	13	0%
defensive	3	0%	6	0%	2	0%	11	0%
total	11	0%	50	0%	50	14%	111	6%

#3 mcrave	2		3		4		total
	n	wins	n	wins	n	wins	n	wins
cheese	1	100%	1	0%	1	0%	3	33%
rush	1	0%	41	2%	1	0%	43	2%
aggressive	0	0%	2	0%	3	0%	5	0%
fast expo	1	0%	1	0%	42	17%	44	16%
macro	1	0%	3	0%	1	0%	5	0%
defensive	1	0%	2	0%	2	0%	5	0%
total	5	20%	50	2%	50	14%	105	9%

Against McRave, the choice is nexus first. McRave must have settled on a macro opening itself.

#4 tscmoo	2		3		4		total
	n	wins	n	wins	n	wins	n	wins
cheese	11	27%	1	0%	1	0%	13	23%
rush	1	0%	1	0%	3	0%	5	0%
aggressive	1	0%	11	9%	1	0%	13	8%
fast expo	5	20%	33	15%	1	0%	39	15%
macro	1	0%	2	0%	22	14%	25	12%
defensive	1	0%	2	0%	22	18%	25	16%
total	20	20%	50	12%	50	14%	120	14%

Against the unpredictable Tscmoo, AIUR wavered before settling on an unpredictable set of answers. Notice that not all the strategies are well explored: If you win less than 1 game in 5, then playing an opening 3 times is not enough. If the tournament were much longer, AIUR would likely have scored higher because of its slow but effective learning.
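
To put a number on "3 times is not enough," here is a rough sketch. It assumes each game is an independent coin flip with a fixed win rate, which is a simplification (real opponents adapt), and the function names are only for illustration.

```python
import math


def prob_no_wins(win_rate, tries):
    """Chance of losing every one of `tries` games at a fixed win rate."""
    return (1.0 - win_rate) ** tries


def tries_needed(win_rate, miss_chance):
    """Smallest number of tries so the chance of zero wins is at most miss_chance."""
    return math.ceil(math.log(miss_chance) / math.log(1.0 - win_rate))


# An opening that wins 1 game in 5, tried 3 times:
print(f"{prob_no_wins(0.2, 3):.3f}")
# Tries needed to push the chance of seeing no wins down to 5%:
print(tries_needed(0.2, 0.05))
```

Under these assumptions, an opening that wins 1 game in 5 still shows zero wins after 3 tries slightly more than half the time, so AIUR can easily write off a strategy that would have been its best.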

#5 isamind	2		3		4		total
	n	wins	n	wins	n	wins	n	wins
cheese	1	0%	2	0%	4	0%	7	0%
rush	1	100%	37	19%	38	8%	76	14%
aggressive	0	0%	1	0%	3	0%	4	0%
fast expo	1	0%	5	0%	2	0%	8	0%
macro	1	0%	1	0%	2	0%	4	0%
defensive	1	0%	4	0%	1	0%	6	0%
total	5	20%	50	14%	50	6%	105	10%

ISAMind may be based on Locutus, but unlike Locutus it is vulnerable to AIUR’s dark templar rushes. It’s a sign that it is not as mature and well tested.

#6 iron	2		3		4		total
	n	wins	n	wins	n	wins	n	wins
cheese	1	0%	1	0%	5	0%	7	0%
rush	1	0%	26	19%	2	0%	29	17%
aggressive	0	0%	2	0%	2	0%	4	0%
fast expo	1	0%	1	0%	31	10%	33	9%
macro	1	0%	19	5%	4	0%	24	4%
defensive	1	0%	1	0%	6	0%	8	0%
total	5	0%	50	12%	50	6%	105	9%

#7 zzzkbot	2		3		4		total
	n	wins	n	wins	n	wins	n	wins
cheese	4	0%	2	0%	2	0%	8	0%
rush	4	0%	4	0%	1	0%	9	0%
aggressive	3	0%	2	0%	1	0%	6	0%
fast expo	3	0%	3	0%	1	0%	7	0%
macro	7	0%	5	0%	4	0%	16	0%
defensive	4	0%	34	29%	41	12%	79	19%
total	25	0%	50	20%	50	10%	125	12%

4 pooler ZZZKBot is of course best countered by a defensive anti-rush strategy. Well, it helped, but the rush is too strong for AIUR to survive reliably. On the 2-player map, AIUR found no answer.

#8 microwave	2		3		4		total
	n	wins	n	wins	n	wins	n	wins
cheese	2	0%	2	0%	1	0%	5	0%
rush	1	0%	27	7%	1	0%	29	7%
aggressive	1	0%	1	0%	1	0%	3	0%
fast expo	1	0%	2	0%	1	0%	4	0%
macro	1	0%	1	0%	9	22%	11	18%
defensive	18	22%	17	24%	36	25%	71	24%
total	24	17%	50	12%	49	22%	123	17%

Microwave apparently also played a rushy style versus AIUR. That’s interesting. I think that AIUR’s defensive strategy is good against pressure openings generally, so Microwave was likely playing low-econ but not necessarily fast rushes.

#9 letabot	2		3		4		total
	n	wins	n	wins	n	wins	n	wins
cheese	1	0%	1	0%	1	0%	3	0%
rush	1	0%	1	0%	3	33%	5	20%
aggressive	0	0%	3	33%	1	0%	4	25%
fast expo	1	0%	41	49%	43	49%	85	48%
macro	1	100%	3	33%	1	0%	5	40%
defensive	1	0%	1	0%	1	0%	3	0%
total	5	20%	50	44%	50	44%	105	43%

Fast expo makes sense against LetaBot’s “wait for it... wait for it... here it comes!” one big smash.

#10 megabot	2		3		4		total
	n	wins	n	wins	n	wins	n	wins
cheese	1	0%	2	0%	3	0%	6	0%
rush	2	50%	4	0%	38	11%	44	11%
aggressive	1	0%	3	0%	3	0%	7	0%
fast expo	1	0%	3	0%	2	0%	6	0%
macro	2	0%	36	28%	2	0%	40	25%
defensive	18	94%	2	0%	2	0%	22	77%
total	25	72%	50	20%	50	8%	125	26%

Why did MegaBot have so much more trouble on the 2-player map? According to the official per-map result table, MegaBot did fine overall on Destination (the one 2-player map), so its trouble came only against AIUR. Maybe I should watch replays and diagnose it.

#11 ualbertabot	2		3		4		total
	n	wins	n	wins	n	wins	n	wins
cheese	1	0%	1	0%	1	0%	3	0%
rush	2	0%	43	37%	2	0%	47	34%
aggressive	1	0%	2	0%	1	0%	4	0%
fast expo	1	0%	2	0%	1	0%	4	0%
macro	18	33%	1	0%	1	0%	20	30%
defensive	1	0%	1	0%	44	16%	46	15%
total	24	25%	50	32%	50	14%	124	23%

#12 tyr	2		3		4		total
	n	wins	n	wins	n	wins	n	wins
cheese	1	0%	1	0%	1	0%	3	0%
rush	1	100%	1	0%	32	81%	34	79%
aggressive	0	0%	37	46%	8	75%	45	51%
fast expo	1	100%	3	33%	3	67%	7	57%
macro	1	0%	6	33%	3	33%	10	30%
defensive	1	0%	2	0%	3	33%	6	17%
total	5	40%	50	40%	50	72%	105	55%

I suspect that Tyr suffered here because it is a jvm bot and could not write its learning file.

#13 ecgberht	2		3		4		total
	n	wins	n	wins	n	wins	n	wins
cheese	1	100%	38	89%	2	50%	41	88%
rush	1	100%	1	0%	43	67%	45	67%
aggressive	0	0%	4	75%	1	0%	5	60%
fast expo	1	100%	1	0%	2	0%	4	25%
macro	1	0%	3	67%	1	0%	5	40%
defensive	1	0%	3	67%	1	0%	5	40%
total	5	60%	50	82%	50	60%	105	70%

#15 titaniron	2		3		4		total
	n	wins	n	wins	n	wins	n	wins
cheese	1	0%	1	0%	2	50%	4	25%
rush	1	0%	1	0%	3	33%	5	20%
aggressive	0	0%	42	79%	42	88%	84	83%
fast expo	1	0%	1	0%	1	0%	3	0%
macro	1	100%	2	50%	1	0%	4	50%
defensive	1	100%	3	0%	1	0%	5	20%
total	5	40%	50	68%	50	78%	105	71%

TitanIron appears to have been too predictable. Notice that the winning strategy on most maps was never tried (without crashing) on the 2-player map. It might have won there too.

#16 ziabot	2		3		4		total
	n	wins	n	wins	n	wins	n	wins
cheese	16	50%	2	50%	1	0%	19	47%
rush	1	0%	2	0%	1	0%	4	0%
aggressive	1	0%	1	0%	3	33%	5	20%
fast expo	1	0%	2	50%	0	0%	3	33%
macro	1	0%	1	0%	1	0%	3	0%
defensive	3	33%	42	69%	44	57%	89	62%
total	23	39%	50	62%	50	52%	123	54%

#17 steamhammer	2		3		4		total
	n	wins	n	wins	n	wins	n	wins
cheese	1	0%	1	0%	1	0%	3	0%
rush	3	67%	4	75%	9	100%	16	88%
aggressive	3	100%	17	100%	15	100%	35	100%
fast expo	2	0%	2	0%	2	50%	6	17%
macro	1	100%	10	100%	1	0%	12	92%
defensive	14	100%	16	100%	22	100%	52	100%
total	24	83%	50	92%	50	94%	124	91%

#18 overkill	2		3		4		total
	n	wins	n	wins	n	wins	n	wins
cheese	1	0%	3	0%	2	50%	6	17%
rush	0	0%	2	50%	1	0%	3	33%
aggressive	0	0%	1	0%	10	60%	11	55%
fast expo	1	0%	3	67%	0	0%	4	50%
macro	0	0%	0	0%	0	0%	0	0%
defensive	16	88%	41	90%	37	78%	94	85%
total	18	78%	50	80%	50	72%	118	76%

#19 terranuab	2		3		4		total
	n	wins	n	wins	n	wins	n	wins
cheese	1	100%	8	88%	1	0%	10	80%
rush	1	100%	11	100%	30	100%	42	100%
aggressive	0	0%	4	75%	2	50%	6	67%
fast expo	1	100%	16	100%	6	83%	23	96%
macro	1	100%	9	89%	10	90%	20	90%
defensive	1	100%	2	50%	1	0%	4	50%
total	5	100%	50	92%	50	90%	105	91%

#20 cunybot	2		3		4		total
	n	wins	n	wins	n	wins	n	wins
cheese	1	0%	2	50%	4	75%	7	57%
rush	1	100%	1	0%	2	0%	4	25%
aggressive	0	0%	4	75%	13	92%	17	88%
fast expo	1	0%	2	50%	2	50%	5	40%
macro	1	100%	9	89%	13	100%	23	96%
defensive	1	100%	32	100%	15	100%	48	100%
total	5	60%	50	90%	49	90%	104	88%

#21 opprimobot	2		3		4		total
	n	wins	n	wins	n	wins	n	wins
cheese	1	100%	12	100%	6	83%	19	95%
rush	1	100%	5	100%	7	100%	13	100%
aggressive	0	0%	7	100%	4	100%	11	100%
fast expo	1	100%	11	100%	17	100%	29	100%
macro	1	100%	8	100%	7	100%	16	100%
defensive	1	100%	7	100%	9	100%	17	100%
total	5	100%	50	100%	50	98%	105	99%

#22 sling	2		3		4		total
	n	wins	n	wins	n	wins	n	wins
cheese	1	0%	1	0%	1	0%	3	0%
rush	1	100%	5	100%	2	50%	8	88%
aggressive	0	0%	13	100%	13	100%	26	100%
fast expo	1	100%	7	100%	10	100%	18	100%
macro	1	100%	8	100%	11	100%	20	100%
defensive	1	100%	16	100%	13	100%	30	100%
total	5	80%	50	98%	50	96%	105	96%

#23 srbotone	2		3		4		total
	n	wins	n	wins	n	wins	n	wins
cheese	1	0%	2	50%	1	0%	4	25%
rush	1	100%	9	100%	3	67%	13	92%
aggressive	0	0%	13	100%	16	100%	29	100%
fast expo	1	100%	10	100%	8	100%	19	100%
macro	1	100%	7	86%	6	100%	14	93%
defensive	1	100%	9	100%	16	100%	26	100%
total	5	80%	50	96%	50	96%	105	95%

#24 bonjwa	2		3		4		total
	n	wins	n	wins	n	wins	n	wins
cheese	1	100%	9	100%	4	75%	14	93%
rush	1	100%	13	100%	10	100%	24	100%
aggressive	0	0%	7	100%	10	100%	17	100%
fast expo	1	100%	6	100%	7	100%	14	100%
macro	1	100%	7	100%	8	100%	16	100%
defensive	1	100%	8	100%	11	100%	20	100%
total	5	100%	50	100%	50	98%	105	99%

#25 stormbreaker	2		3		4		total
	n	wins	n	wins	n	wins	n	wins
cheese	4	75%	1	0%	4	75%	9	67%
rush	0	0%	5	80%	10	100%	15	93%
aggressive	0	0%	18	100%	7	100%	25	100%
fast expo	0	0%	0	0%	6	100%	6	100%
macro	0	0%	9	100%	8	100%	17	100%
defensive	20	100%	17	100%	15	100%	52	100%
total	24	96%	50	96%	50	98%	124	97%

#26 korean	2		3		4		total
	n	wins	n	wins	n	wins	n	wins
cheese	7	100%	2	100%	10	100%	19	100%
rush	0	0%	7	100%	8	100%	15	100%
aggressive	0	0%	5	100%	8	100%	13	100%
fast expo	0	0%	8	100%	8	100%	16	100%
macro	0	0%	5	100%	6	100%	11	100%
defensive	14	100%	23	100%	10	100%	47	100%
total	21	100%	50	100%	50	100%	121	100%

Well, if you win every game, learning cannot help.

#27 salsa	2		3		4		total
	n	wins	n	wins	n	wins	n	wins
cheese	9	100%	15	100%	9	100%	33	100%
rush	0	0%	0	0%	3	100%	3	100%
aggressive	0	0%	11	100%	8	100%	19	100%
fast expo	0	0%	0	0%	4	100%	4	100%
macro	0	0%	8	100%	7	100%	15	100%
defensive	15	100%	16	100%	19	100%	50	100%
total	24	100%	50	100%	50	100%	124	100%