AIIDE 2017 what AIUR learned

Here is what AIUR learned about each opponent over the course of the tournament. I did this mostly because it’s easy; I already had the script from last year. But it’s also informative—AIUR’s reactions tell us how each bot played, and may tell bot authors what they need to work on.

The data is generated from files in AIUR’s final read directory. AIUR recorded 111 games against some opponents even though the tournament officially ran for 110 rounds; that is presumably because the tournament did run longer but was cut back to a multiple of 10 rounds for fairness (since there are 10 maps). On the other hand, AIUR’s total game count according to itself is 2938 and according to the tournament results is 2965, so it may have been unable to record some games (it is listed with 53 crashes, so that’s not a surprise). First an overall view, totalling the data for all opponents. We can see that all 6 of AIUR’s strategies (“moods” it calls them) were widely valuable: Every strategy has win rate over 50% on some map size. AIUR’s overall win rate in the tournament was 50.46%.

overall	2		3		4		total
	n	wins	n	wins	n	wins	n	wins
cheese	159	55%	59	37%	161	44%	379	47%
rush	134	66%	87	55%	185	50%	406	56%
aggressive	107	56%	108	43%	155	30%	370	41%
fast expo	69	45%	84	33%	197	51%	350	46%
macro	46	28%	69	52%	211	37%	326	39%
defensive	352	60%	185	58%	570	55%	1107	57%
total	867	57%	592	49%	1479	48%	2938	50%

2, 3, 4 - map size, the number of starting positions
n - games recorded
wins - winning percentage over those games
cheese - cannon rush
rush - dark templar rush
aggressive - fast 4 zealot drop
fast expo - nexus first
macro - aim for a strong middle game army
defensive - be safe against rushes (not entirely successful)

#1 zzzkbot	2		3		4		total
	n	wins	n	wins	n	wins	n	wins
cheese	16	12%	1	0%	4	0%	21	10%
rush	5	0%	1	0%	1	0%	7	0%
aggressive	3	0%	1	0%	5	0%	9	0%
fast expo	4	0%	1	0%	5	0%	10	0%
macro	3	0%	2	0%	3	0%	8	0%
defensive	3	0%	16	31%	37	24%	56	25%
total	34	6%	22	23%	55	16%	111	14%

AIUR struggled against the tournament leader but was not entirely helpless. Its cannon rush had a chance on 2 player maps and its anti-rush strategy on the others. We see how AIUR gains by taking the map size into account.

#2 purplewave	2		3		4		total
	n	wins	n	wins	n	wins	n	wins
cheese	1	0%	1	0%	2	0%	4	0%
rush	28	79%	3	33%	40	55%	71	63%
aggressive	1	0%	3	33%	1	0%	5	20%
fast expo	1	0%	11	36%	10	60%	22	45%
macro	1	0%	2	0%	1	0%	4	0%
defensive	1	0%	1	0%	1	0%	3	0%
total	33	67%	21	29%	55	51%	109	51%

AIUR upset #2 PurpleWave, a surprising outcome. The DT rush and the fast expand were both somewhat successful—rather unrelated strategies.

#3 iron	2		3		4		total
	n	wins	n	wins	n	wins	n	wins
cheese	5	0%	1	0%	7	0%	13	0%
rush	5	0%	2	0%	7	0%	14	0%
aggressive	3	0%	2	0%	12	0%	17	0%
fast expo	8	0%	14	7%	9	0%	31	3%
macro	6	0%	1	0%	10	0%	17	0%
defensive	5	0%	2	0%	10	0%	17	0%
total	32	0%	22	5%	55	0%	109	1%

Learning can’t help if nothing you try wins....

#4 cpac	2		3		4		total
	n	wins	n	wins	n	wins	n	wins
cheese	1	0%	1	0%	1	0%	3	0%
rush	4	0%	0	0%	2	0%	6	0%
aggressive	2	0%	1	0%	1	0%	4	0%
fast expo	1	0%	1	0%	1	0%	3	0%
macro	2	0%	3	33%	2	0%	7	14%
defensive	24	38%	16	69%	48	50%	88	50%
total	34	26%	22	55%	55	44%	111	41%

Cpac was configured to play 5 pool against AIUR. It worked, but AIUR was able to compensate to an extent by playing its anti-rush build.

#5 microwave	2		3		4		total
	n	wins	n	wins	n	wins	n	wins
cheese	2	0%	2	0%	4	0%	8	0%
rush	1	0%	1	0%	4	0%	6	0%
aggressive	20	20%	15	13%	11	0%	46	13%
fast expo	1	0%	2	0%	6	0%	9	0%
macro	1	0%	1	0%	4	0%	6	0%
defensive	1	0%	1	0%	26	12%	28	11%
total	26	15%	22	9%	55	5%	103	9%

Microwave was successful but showed a little vulnerability to surprise zealots dropped in its main. I suspect it’s a tactical reaction issue.

#6 cherrypi	2		3		4		total
	n	wins	n	wins	n	wins	n	wins
cheese	1	0%	1	0%	1	0%	3	0%
rush	1	0%	1	0%	1	0%	3	0%
aggressive	2	0%	2	0%	1	0%	5	0%
fast expo	2	0%	1	0%	1	0%	4	0%
macro	2	0%	1	0%	9	11%	12	8%
defensive	26	4%	16	12%	42	12%	84	10%
total	34	3%	22	9%	55	11%	111	8%

#7 mcrave	2		3		4		total
	n	wins	n	wins	n	wins	n	wins
cheese	26	100%	5	60%	45	62%	76	75%
rush	3	67%	9	67%	4	50%	16	62%
aggressive	1	0%	4	50%	1	0%	6	33%
fast expo	1	0%	2	50%	2	50%	5	40%
macro	1	0%	1	0%	1	0%	3	0%
defensive	1	0%	1	0%	2	0%	4	0%
total	33	85%	22	55%	55	56%	110	65%

AIUR upset McRave with its cannon rush, and the dark templar rush did well too. AIUR executes the best cannon rush of any bot, in my opinion. It is a sign that McRave’s play was not robust enough against tricks.

#8 arrakhammer	2		3		4		total
	n	wins	n	wins	n	wins	n	wins
cheese	2	0%	2	0%	3	0%	7	0%
rush	1	0%	1	0%	4	0%	6	0%
aggressive	1	0%	5	60%	3	0%	9	33%
fast expo	1	0%	1	0%	2	0%	4	0%
macro	0	0%	12	50%	38	37%	50	40%
defensive	29	66%	1	0%	4	25%	34	59%
total	34	56%	22	41%	54	28%	110	39%

#9 tyr	2		3		4		total
	n	wins	n	wins	n	wins	n	wins
cheese	6	67%	1	0%	1	0%	8	50%
rush	20	100%	1	0%	2	0%	23	87%
aggressive	3	33%	10	20%	1	0%	14	21%
fast expo	1	0%	7	29%	49	35%	57	33%
macro	1	0%	1	0%	1	0%	3	0%
defensive	2	50%	2	0%	1	0%	5	20%
total	33	79%	22	18%	55	31%	110	43%

The DT rush won 100% of the time on 2 player maps and was tried only a few times on larger maps, losing. Was it only unlucky on the 3 and 4 player maps, or is there a real difference? With only 3 games total, we can’t tell from the numbers. It is a weakness of AIUR’s learning: It’s slow because there is so much to learn. The flip side of the slowness is that, over a long tournament, it learns a lot.

#10 steamhammer	2		3		4		total
	n	wins	n	wins	n	wins	n	wins
cheese	2	0%	1	0%	1	0%	4	0%
rush	2	50%	1	0%	2	0%	5	20%
aggressive	1	0%	1	0%	1	0%	3	0%
fast expo	1	0%	1	0%	1	0%	3	0%
macro	0	0%	1	0%	1	0%	2	0%
defensive	27	81%	17	88%	49	67%	93	75%
total	33	70%	22	68%	55	60%	110	65%

I was surprised to see Steamhammer upset by AIUR. I had thought that AIUR was a solved problem. On SSCAIT too, Steamhammer started to show losses against AIUR in September for the first time in months. I may have introduced a weakness in some recent version and AIUR’s learning took that long to find it on SSCAIT. In AIIDE, the tournament was easily long enough.

#11 ailien	2		3		4		total
	n	wins	n	wins	n	wins	n	wins
cheese	1	0%	1	0%	1	0%	3	0%
rush	3	0%	1	0%	2	0%	6	0%
aggressive	1	0%	2	0%	1	0%	4	0%
fast expo	1	0%	2	50%	0	0%	3	33%
macro	4	50%	8	75%	1	0%	13	62%
defensive	24	58%	8	88%	49	37%	81	48%
total	34	47%	22	64%	54	33%	110	44%

#12 letabot	2		3		4		total
	n	wins	n	wins	n	wins	n	wins
cheese	7	43%	1	0%	2	0%	10	30%
rush	3	33%	13	54%	43	40%	59	42%
aggressive	5	40%	1	0%	1	0%	7	29%
fast expo	13	46%	3	33%	1	0%	17	41%
macro	1	0%	1	0%	6	33%	8	25%
defensive	1	0%	3	33%	1	0%	5	20%
total	30	40%	22	41%	54	35%	106	38%

I suspect that fast expo was the best strategy on 4 player maps, but how was AIUR to know? A weakness of AIUR’s epsilon-greedy learning, compared to UCB, is that it doesn’t realize that a less-explored option is more likely to be misevaluated.

#13 ximp	2		3		4		total
	n	wins	n	wins	n	wins	n	wins
cheese	34	35%	0	0%	1	0%	35	34%
rush	0	0%	0	0%	1	0%	1	0%
aggressive	0	0%	13	8%	52	2%	65	3%
fast expo	0	0%	9	0%	0	0%	9	0%
macro	0	0%	0	0%	1	0%	1	0%
defensive	0	0%	0	0%	0	0%	0	0%
total	34	35%	22	5%	55	2%	111	13%

#14 ualbertabot	2		3		4		total
	n	wins	n	wins	n	wins	n	wins
cheese	0	0%	0	0%	1	100%	1	100%
rush	0	0%	0	0%	0	0%	0	0%
aggressive	0	0%	0	0%	1	100%	1	100%
fast expo	0	0%	0	0%	0	0%	0	0%
macro	0	0%	0	0%	0	0%	0	0%
defensive	34	32%	21	5%	52	27%	107	24%
total	34	32%	21	5%	54	30%	109	26%

What’s up with all those zeroes? AIUR is coded to try each strategy once before it starts making decisions, and that did not happen here. It turns out that AIUR has pre-learned data for Skynet, XIMP, and UAlbertaBot, so its learning in those cases looks different.

#16 icebot	2		3		4		total
	n	wins	n	wins	n	wins	n	wins
cheese	1	0%	2	0%	1	0%	4	0%
rush	1	0%	2	50%	3	33%	6	33%
aggressive	3	100%	3	67%	4	50%	10	70%
fast expo	14	100%	3	67%	44	93%	61	93%
macro	4	75%	2	50%	1	0%	7	57%
defensive	9	89%	10	80%	2	50%	21	81%
total	32	88%	22	64%	55	82%	109	80%

#17 skynet	2		3		4		total
	n	wins	n	wins	n	wins	n	wins
cheese	13	92%	0	0%	0	0%	13	92%
rush	21	95%	21	90%	51	88%	93	90%
aggressive	0	0%	0	0%	0	0%	0	0%
fast expo	0	0%	1	100%	0	0%	1	100%
macro	0	0%	0	0%	0	0%	0	0%
defensive	0	0%	0	0%	4	50%	4	50%
total	34	94%	22	91%	55	85%	111	89%

#18 killall	2		3		4		total
	n	wins	n	wins	n	wins	n	wins
cheese	1	0%	3	0%	1	0%	5	0%
rush	1	0%	2	0%	1	0%	4	0%
aggressive	1	0%	2	0%	1	0%	4	0%
fast expo	1	0%	3	0%	1	0%	5	0%
macro	0	0%	2	0%	2	50%	4	25%
defensive	30	80%	10	70%	49	76%	89	76%
total	34	71%	22	32%	55	69%	111	62%

#19 megabot	2		3		4		total
	n	wins	n	wins	n	wins	n	wins
cheese	3	67%	1	0%	2	0%	6	33%
rush	2	0%	14	36%	5	0%	21	24%
aggressive	6	67%	4	25%	4	0%	14	36%
fast expo	2	50%	1	0%	4	0%	7	14%
macro	1	0%	1	0%	36	25%	38	24%
defensive	17	76%	1	0%	2	0%	20	65%
total	31	65%	22	27%	53	17%	106	33%

#20 xelnaga	2		3		4		total
	n	wins	n	wins	n	wins	n	wins
cheese	9	100%	6	83%	1	0%	16	88%
rush	19	100%	4	75%	1	0%	24	92%
aggressive	1	0%	3	33%	1	0%	5	20%
fast expo	1	0%	4	75%	1	0%	6	50%
macro	2	0%	2	50%	50	36%	54	35%
defensive	2	50%	3	67%	1	0%	6	50%
total	34	85%	22	68%	55	33%	111	56%

Against Xelnaga, AIUR found solutions on 2 and 3 player maps but not on 4 player maps. Is it another case of underexploration?

#21 overkill	2		3		4		total
	n	wins	n	wins	n	wins	n	wins
cheese	1	0%	1	0%	3	67%	5	40%
rush	2	50%	0	0%	0	0%	2	50%
aggressive	8	100%	4	100%	7	86%	19	95%
fast expo	3	67%	3	100%	7	100%	13	92%
macro	4	75%	3	67%	12	92%	19	84%
defensive	14	93%	11	100%	26	96%	51	96%
total	32	84%	22	91%	55	93%	109	90%

#22 juno	2		3		4		total
	n	wins	n	wins	n	wins	n	wins
cheese	5	0%	14	36%	33	15%	52	19%
rush	3	0%	1	0%	1	0%	5	0%
aggressive	2	0%	1	0%	2	0%	5	0%
fast expo	2	0%	1	0%	16	12%	19	11%
macro	1	0%	1	0%	1	0%	3	0%
defensive	19	21%	4	25%	2	0%	25	20%
total	32	12%	22	27%	55	13%	109	16%

Juno’s cannon contain upset AIUR. Learning didn’t help much, because the problem wasn’t in any of the strategies, it was in AIUR’s poor reactions to cannons appearing in front of its base. It is amusing to watch 2 bots cannon each other when sometimes both get cannons up.

#23 garmbot	2		3		4		total
	n	wins	n	wins	n	wins	n	wins
cheese	1	0%	1	0%	1	0%	3	0%
rush	2	50%	1	0%	0	0%	3	33%
aggressive	17	94%	17	100%	3	67%	37	95%
fast expo	0	0%	1	0%	23	83%	24	79%
macro	0	0%	1	0%	1	0%	2	0%
defensive	5	80%	1	0%	27	81%	33	79%
total	25	84%	22	77%	55	78%	102	79%

#24 myscbot	2		3		4		total
	n	wins	n	wins	n	wins	n	wins
cheese	1	0%	1	0%	2	50%	4	25%
rush	2	0%	3	67%	2	50%	7	43%
aggressive	3	33%	2	100%	9	78%	14	71%
fast expo	1	0%	2	50%	1	0%	4	25%
macro	4	50%	4	100%	3	67%	11	73%
defensive	23	61%	10	100%	38	79%	71	76%
total	34	50%	22	86%	55	75%	111	69%

#25 hannesbredberg	2		3		4		total
	n	wins	n	wins	n	wins	n	wins
cheese	5	80%	3	100%	3	67%	11	82%
rush	2	50%	3	100%	2	50%	7	71%
aggressive	2	50%	2	50%	2	0%	6	33%
fast expo	8	100%	3	100%	9	89%	20	95%
macro	2	50%	4	100%	11	91%	17	88%
defensive	15	100%	7	100%	28	100%	50	100%
total	34	88%	22	95%	55	89%	111	90%

#26 sling	2		3		4		total
	n	wins	n	wins	n	wins	n	wins
cheese	2	50%	1	0%	3	33%	6	33%
rush	2	50%	0	0%	1	0%	3	33%
aggressive	12	100%	0	0%	23	96%	35	97%
fast expo	1	0%	5	100%	1	0%	7	71%
macro	3	67%	5	80%	12	75%	20	75%
defensive	5	80%	11	100%	15	80%	31	87%
total	25	80%	22	91%	55	80%	102	82%

Here is another possible case of insufficient exploration. The 4 zealot drop won 100% of the time on 2 player maps and 96% of the time on 4 player maps, but was never tried on 3 player maps (I guess due to a crash, since AIUR tries to play each strategy once). It’s not a severe problem, though, because 3 player maps did have 2 strategies that scored 100%.

#27 forcebot	2		3		4		total
	n	wins	n	wins	n	wins	n	wins
cheese	1	0%	1	0%	1	0%	3	0%
rush	0	0%	1	0%	1	0%	2	0%
aggressive	3	67%	2	0%	1	0%	6	33%
fast expo	0	0%	1	0%	1	0%	2	0%
macro	0	0%	9	78%	3	67%	12	75%
defensive	29	100%	8	75%	48	94%	85	94%
total	33	94%	22	59%	55	85%	110	83%

#28 ziabot	2		3		4		total
	n	wins	n	wins	n	wins	n	wins
cheese	12	100%	7	86%	36	86%	55	89%
rush	1	0%	1	100%	4	75%	6	67%
aggressive	6	100%	8	88%	6	83%	20	90%
fast expo	1	0%	1	0%	2	0%	4	0%
macro	3	0%	1	0%	1	0%	5	0%
defensive	6	67%	4	75%	6	83%	16	75%
total	29	76%	22	77%	55	80%	106	78%

Next: AILien’s learning.

Trackbacks

No Trackbacks

Comments

DanG on Saturday, October 21. 2017:

Fantastic breakdown. I also overlooked AIUR in my testing. It turned out we had a bit of rock-paper-scissors in our strategy selection. His DT rush eschews any units at all and thus gets the DT out maybe 30 seconds faster. That approach is indefensible versus any early zealot pressure. But my DT rush doesn't attack until DTs come out, since it won't have the units to punish any "reasonable" build, and is supposed to throw up cannons for the mirror match, but the placement was bugged and they'd often be too late for his fast DT anyhow. So I'd switch to my 3 Gate Robo, which reliably beats the DT rush with a very fast Observer, but doesn't gain any meaningful advantages against other builds, and collapsed under AIUR's two base timing attack off fast expansion. I didn't test against AIUR and didn't consider my robustness against "unreasonable" build orders. It's often said that PvP is a contest of trying to cheat in as many ways as possible -- skipping Observers, skipping Gateways, or delaying Reavers for a faster expansion -- and hoping you never have to pay the bill. My limited PvP strategy selection was all "fair" builds, expecting that I could win on execution, but I paid the price against AIUR. Valuable, costly lessons.

Jay Scott on Saturday, October 21. 2017:

Nice analysis. Not cutting enough corners!

PurpleWaveJadien on Saturday, October 21. 2017:

All due credit, of course, to Florian and AIUR for having a creative and diverse pool of strategies!

krasi0 on Saturday, October 21. 2017:

Indeed! I wish he'd get back to working on his promising bot!

Add Comment

Name*

Homepage

Comment*

In reply to

E-Mail addresses will not be displayed and will only be used for E-Mail notifications.

To prevent automated Bots from commentspamming, please enter the string you see in the image below in the appropriate input box. Your comment will only be submitted if the strings match. Please ensure that your browser supports and accepts cookies, or your comment cannot be verified correctly.
CAPTCHA