AIIDE 2021 - McRave versus WillyT

These tables tell more about McRave than about WillyT. Blue is good for McRave, red is good for WillyT.

mcrave strategies versus willyt strategies

mcrave                         overall      1 rush      2 fe bio-mech  3 fe mech   4 tonk
overall                        48/157 31%   16/41 39%   9/54 17%       12/47 26%   11/15 73%
HatchPool,12Hatch,2HatchMuta   29/89 33%    10/20 50%   6/32 19%       10/33 30%   3/4 75%
PoolHatch,12Pool,3HatchMuta    19/55 35%    6/18 33%    3/16 19%       2/12 17%    8/9 89%
PoolHatch,Overpool,2HatchMuta  0/13 0%      0/3 0%      0/6 0%         0/2 0%      0/2 0%

I find it strange that McRave’s overpool into 2 hatch muta failed in every case. Is it a reaction build that turned out to be a misreaction to what WillyT does? Probably not: it was tried against every WillyT opener. McRave’s other 2 builds were about equal overall, though the table shows that they were best in different cases. Switching between them was likely correct. The ratio in which they were tried also looks good to me: you want a ratio that leaves the final win rates about equal.
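The "about equal" observation can be checked from the counts in the table above. A minimal Python sketch, with the (wins, games) pairs copied from the overall column; the variable and function names are illustrative only:

```python
# McRave's (wins, games) per build versus WillyT, copied from the
# "overall" column of the table above.
mcrave_builds = {
    "HatchPool,12Hatch,2HatchMuta": (29, 89),
    "PoolHatch,12Pool,3HatchMuta": (19, 55),
    "PoolHatch,Overpool,2HatchMuta": (0, 13),
}


def win_rate(wins, games):
    """Fraction of games won; 0.0 for a build that was never played."""
    return wins / games if games else 0.0


rates = {build: win_rate(w, g) for build, (w, g) in mcrave_builds.items()}
for build, rate in rates.items():
    print(f"{build}: {rate:.1%}")
```

The two builds that were played most come out within a couple of percentage points of each other, which is the sense in which the mix looks balanced.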

WillyT would have done better without its 15 tonk games, of which McRave won 73%.

willyt as seen by mcrave

willyt played   #   mcrave recognized
1 rush          41  40 Unknown,Unknown,Unknown | 1 RaxCC,1RaxFE,Unknown
2 fe bio-mech   54  23 RaxCC,1RaxFE,Unknown | 20 RaxCC,1RaxFE,1FactTanks | 6 Unknown,Unknown,Unknown | 5 RaxCC,1RaxFE,5FactGoliath
3 fe mech       47  33 RaxCC,1RaxFE,5FactGoliath | 8 Unknown,Unknown,Unknown | 6 RaxCC,1RaxFE,Unknown
4 tonk          15  13 Unknown,Unknown,Unknown | 2 RaxFact,Unknown,5FactGoliath

Both the rush and the tonk build usually denied scouting, which seems like it should have been important because the builds call for opposite reactions. Yet McRave defeated the tanks and had less trouble with the rush than with WillyT’s expansion builds. RaxCC and 1RaxFE seem simple enough to recognize, and were. The followup seems harder to recognize, and was. I doubt that so many were actually 5FactGoliath.

AIIDE 2021 - BananaBrain versus WillyT

Not much to see here, because the pairing was one-sided. But there are still a few points to note. In the tables, blue is good for BananaBrain, red is good for WillyT.

bananabrain strategies versus willyt strategies

bananabrain    overall       1 rush       2 fe bio-mech  3 fe mech    4 tonk
overall        146/157 93%   43/46 93%    71/79 90%      17/17 100%   15/15 100%
10/12gate      41/44 93%     7/10 70%     19/19 100%     10/10 100%   5/5 100%
12nexus        5/6 83%       1/1 100%     1/2 50%        1/1 100%     2/2 100%
2gatedt        0/1 0%        -            0/1 0%         -            -
32nexus        21/24 88%     17/17 100%   2/5 40%        -            2/2 100%
9/9proxygate   76/77 99%     18/18 100%   46/47 98%      6/6 100%     6/6 100%
dtdrop         1/2 50%       -            1/2 50%        -            -
stove          2/3 67%       -            2/3 67%        -            -

It’s striking how quickly BananaBrain gave up on a build; it only had to fail once in six games. The reason is of course that the other builds were doing better than that. Now we see the relative success of WillyT’s build 2 bio-mech: It provided all of the wins in the builds that BananaBrain gave up on, and a few other wins as well. Otherwise, only WillyT’s rush was able to score a few wins, and then only against BananaBrain’s zealot play.

willyt as seen by bananabrain

willyt played   #   bananabrain recognized
1 rush          46  39 2rax | 7 unknown
2 fe bio-mech   79  40 fastexpand | 25 unknown | 14 2rax
3 fe mech       17  12 fastexpand | 2 unknown | 2 1fac | 1 2rax
4 tonk          15  10 1fac | 4 unknown | 1 2rax

BananaBrain seems to have diagnosed builds mostly correctly, when it was able to at all. WillyT’s build 2 seemed to be better at denying scouting. There is no sign that reading the opponent’s build helped BananaBrain play better; it’s the opposite if anything. But with the results so lopsided, we shouldn’t expect much of a sign anyway.
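The scouting-denial point can be put in numbers by taking the share of games that BananaBrain labeled "unknown" for each WillyT build. This is a rough sketch: treating "unknown" as a proxy for denied scouting is an assumption, and the names are illustrative.

```python
# (games played, games BananaBrain labeled "unknown") per WillyT build,
# copied from the table above.
recognized = {
    "1 rush": (46, 7),
    "2 fe bio-mech": (79, 25),
    "3 fe mech": (17, 2),
    "4 tonk": (15, 4),
}

# Assumption: an "unknown" label means the build was not scouted in time.
unknown_share = {
    build: unknown / games for build, (games, unknown) in recognized.items()
}
for build, share in unknown_share.items():
    print(f"{build}: {share:.0%} unrecognized")
```

Build 2 has the largest unrecognized share, though the tonk build is not far behind on a much smaller sample.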

AIIDE 2021 - what WillyT learned

A middle group of bots finished close to each other, from #5 McRave at 41.70% to #8 DaQin at 39.63%. #6 WillyT is the second of the group.

WillyT’s learning files record the bot’s strategy as 01, 02, 03, or 04. Last year it only went up to 03. There may be an expectation of going up to 10 someday! Here is how I translated the strategy numbers into names, based on the numbering in the bot’s top-level README.

01  1 rush         2 rax bio + SCVs
02  2 fe bio-mech  1 rax expand into bio-mech
03  3 fe mech      1 rax expand into mech
04  4 tonk         slowly make many tanks
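For anyone reading the learning files themselves, the translation amounts to a small lookup table. A sketch in Python, assuming only that the codes appear as the two-digit strings listed above; the dictionary and function names are hypothetical, and nothing here reflects how WillyT itself stores or parses its files.

```python
# Strategy codes as listed above, mapped to the names used in this post.
# STRATEGY_NAMES and strategy_name are hypothetical names for illustration.
STRATEGY_NAMES = {
    "01": "1 rush",
    "02": "2 fe bio-mech",
    "03": "3 fe mech",
    "04": "4 tonk",
}


def strategy_name(code):
    """Return the readable name for a two-digit strategy code."""
    return STRATEGY_NAMES.get(code, f"unknown strategy {code}")


print(strategy_name("04"))
```

An unlisted code such as "10" falls through to the "unknown strategy" label rather than raising an error, which is convenient if the numbering grows in a later version.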

#1 stardust

opening        games  wins  first  last
4 tonk         156    3%    0      155
1 opening      156    3%

Stardust got special treatment, and it was still only good enough for 3%. The tonk build seems to have been specially devised to give a chance against Stardust. I checked on BASIL and found that the chance there was never high. But it’s over zero, which is better than Steamhammer did!

#2 bananabrain

opening        games  wins  first  last
1 rush         46     7%    4      156
2 fe bio-mech  79     10%   0      154
3 fe mech      17     0%    3      152
4 tonk         15     0%    1      141
4 openings     157    7%

Switching between the rush and bio-mech was able to squeeze a little blood from BananaBrain. It’s interesting that mech scored lower, though the expected win rate is so low that it’s hard to be sure the difference is real. Does the build suffer from a weak timing?
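How unsure we should be about the mech result can be estimated with a quick calculation. A minimal sketch, assuming the games are independent and that mech's true win rate equals the rush's observed rate (3 wins in 46 games, since BananaBrain won 43 of 46 against the rush). Both assumptions are illustrative simplifications.

```python
# Counts from the tables above: the rush won 3 of 46 games,
# and mech won 0 of 17.
rush_wins, rush_games = 3, 46
mech_games = 17

# Assumption: mech has the same true win rate as the rush, and games
# are independent of each other.
p = rush_wins / rush_games
prob_zero_mech_wins = (1 - p) ** mech_games

print(f"P(0 wins in {mech_games} games) = {prob_zero_mech_wins:.2f}")
```

Under these assumptions a winless run of 17 games happens roughly one time in three, so the zero by itself is weak evidence that mech is the worse build.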

#3 dragon

opening        games  wins  first  last
1 rush         30     0%    3      155
2 fe bio-mech  35     3%    0      156
3 fe mech      66     5%    1      153
4 tonk         26     0%    2      154
4 openings     157    3%

The author explained in a comment that WillyT is weak at TvT because it does not understand siege lines. With only one terran opponent, it wasn’t critical. This version of Dragon always makes a slow start to the game, so any slowness in WillyT’s bio-mech build did not matter, and having tanks likely helped.

#4 steamhammer

opening        games  wins  first  last
1 rush         86     49%   10     155
2 fe bio-mech  53     45%   2      156
3 fe mech      11     18%   0      150
4 tonk         7      0%    3      147
4 openings     157    43%

WillyT could not outscore Steamhammer, but it made a good attempt. I find it distressing that the rush won so many games; it’s a strong rush but not that hard to hold. Again, bio-mech was better than mech. Well, that’s more expected versus zerg, but I wonder whether the reason is the same as versus BananaBrain?

#5 mcrave

opening        games  wins  first  last
1 rush         41     61%   4      155
2 fe bio-mech  54     83%   0      154
3 fe mech      47     74%   13     156
4 tonk         15     27%   5      138
4 openings     157    69%

McRave is meticulous in defense, which shows in these numbers. But it suffered against 2-base play. I think we can infer that WillyT has good mutalisk defense.

#7 microwave

opening        games  wins  first  last
1 rush         58     83%   0      155
2 fe bio-mech  19     42%   7      156
3 fe mech      47     64%   8      153
4 tonk         33     61%   6      151
4 openings     157    68%

The tonk build had success against Microwave, but was neither successful nor much played other than here and versus Stardust. I’d say the build is overspecialized, useful only in a narrow range of situations. The rush was overwhelming, though.

#8 daqin

opening        games  wins  first  last
1 rush         14     14%   2      35
2 fe bio-mech  130    44%   4      154
3 fe mech      7      0%    8      34
4 tonk         4      0%    0      18
4 openings     155    38%

And again, bio-mech over mech. It’s not conclusive, but I feel that something may be weak in the mech build. Maybe WillyT is just better with marines.

#9 freshmeat

opening        games  wins  first  last
1 rush         133    72%   4      156
2 fe bio-mech  12     50%   2      39
3 fe mech      6      33%   1      38
4 tonk         6      50%   0      136
4 openings     157    68%

FreshMeat is newer and perhaps not ready yet to face the rush.

#10 ualbertabot

opening        games  wins  first  last
1 rush         73     81%   5      151
2 fe bio-mech  49     61%   0      154
3 fe mech      26     54%   3      148
4 tonk         7      29%   12     144
4 openings     155    68%

UAlbertaBot, with aggressive openers and no strong defensive skill, also fell to the rush. It got outrushed. It strikes me that WillyT scored nearly the same against #5 McRave, #7 Microwave, #9 FreshMeat, and #10 UAlbertaBot, even though the four are different in style and strength. WillyT did not crush any opponent. To me that suggests some kind of inconsistency in its play: It may have flaws that even weaker bots can exploit sometimes.