Starcraft AI blog | Entries from May 2018

Steamhammer 1.4.2 has solidified

Vapor no longer, Steamhammer 1.4.2 is uploaded on SSCAIT. As I’ve mentioned, expect Randomhammer’s terran and protoss especially to play better. The usual change list and source to follow.

expect Steamhammer 1.4.2 within a day

Once I debugged the opponent model opening selection, a long careful test revealed that—drum roll—it didn't work as intended. When the number of games against a given opponent went over 30 or so, the "clever" system for trying out new openings became overactive and gummed up the mechanism, with poor game results. It was a design mistake, and I had to redesign it with less cleverness and rewrite a large section. Now it tests as working reasonably well against both predictable and difficult opponents, over both short and long runs of games.

Tests located the stubborn bug in the gas steal, which sometimes caused the gas steal drone to sit around next to the enemy extractor doing nothing. I fixed it. Is that the last gas steal bug? Based on history, probably not....

More tests to run, and the wild shots so far are par for the course. Everything is on track. Expect Steamhammer 1.4.2 to be uploaded late tonight or early tomorrow (in my time zone).

Tomorrow: Steamhammer 1.4.2 change list. It's kind of long.

inferring enemy buildings

I want Steamhammer to infer the existence of enemy buildings that it has not seen. Right now, the plan recognizer explicitly encodes its conditions: If I see an early factory, or a unit that came from a factory, or an armory, or a starport, then I can guess that the enemy is playing a factory opening. The zerg strategy boss uses related reasoning. Every bit of code that wants to know if the enemy has a factory has to also look for things that depend on the factory. It’s more general and powerful to decouple the inferences. The information manager should notice when it sees things that can’t be explained by known enemy structures, and infer what was necessary to produce them. If it sees a battlecruiser, it should infer the science facility with physics lab, the starport with control tower, and all the way down. Then the rest of Steamhammer can rely on the inferences, and not have to do any extra work on its own. In principle, it could draw conclusions like “Wait, I’ve seen the whole enemy base and there is no physics lab there. I should look harder at the rest of the map.”

Research counts too. If I see a wraith cloak, I know there is a control tower, even though a control tower is not necessary to make wraiths. It’s necessary to research cloaking. If I see a unit with an attack upgrade, I know what buildings were needed to research the upgrade.
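A minimal sketch of the decoupled inference, assuming a hand-written prerequisite table. The names `Requirements`, `terranRequirements`, and `inferBuildings` are hypothetical, not Steamhammer code, and the table covers only a fragment of the terran tech tree. Research is treated like a unit: it points back at the building that must have existed to research it.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical table: each item maps to what must have existed to produce it.
// Only a fragment of the terran tech tree, written by hand for illustration.
using Requirements = std::map<std::string, std::vector<std::string>>;

inline Requirements terranRequirements()
{
    return {
        { "Battlecruiser",    { "Starport", "Control Tower", "Physics Lab" } },
        { "Physics Lab",      { "Science Facility" } },
        { "Science Facility", { "Starport" } },
        { "Control Tower",    { "Starport" } },
        { "Starport",         { "Factory" } },
        { "Armory",           { "Factory" } },
        { "Vulture",          { "Factory" } },
        { "Factory",          { "Barracks" } },
        { "Barracks",         { "Command Center" } },
        { "Wraith Cloak",     { "Control Tower" } },   // research counts too
    };
}

// Add everything that `seen` requires, directly or indirectly, to `inferred`.
// An item is expanded only the first time it is inserted, so shared
// prerequisites are not walked twice.
inline void inferBuildings(const Requirements & reqs,
                           const std::string & seen,
                           std::set<std::string> & inferred)
{
    assert(!seen.empty());
    auto it = reqs.find(seen);
    if (it == reqs.end()) return;      // nothing known about this item
    for (const std::string & needed : it->second)
    {
        if (inferred.insert(needed).second)
        {
            inferBuildings(reqs, needed, inferred);
        }
    }
}
```

Calling `inferBuildings(terranRequirements(), "Battlecruiser", inferred)` fills `inferred` with the whole chain down to the command center. The rest of the bot would then ask one question, "is a factory known or inferred?", instead of repeating the dependency checks everywhere. A real bot would presumably read the requirements from BWAPI's unit type data rather than from a hand-written table.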

It occurs to me that there are tricky cases. Suppose I break into the enemy base and find a factory—and destroy it. Later, I reach farther into the base and find an armory. I can’t infer that there is a second or new factory; the armory could have been started when the first factory still existed. To interpret the armory correctly, I have to remember that the enemy used to have a factory, and without more information I can’t be sure whether it has a factory now. This is building inference level 1; I have to know history to draw true conclusions.

But sometimes I can infer that there is another factory. If the armory is in a place that I know was empty when the first factory was destroyed, then I know that it is a new armory. There must have been another factory, or another factory was built after the first one was killed; in any case, I can infer a factory that I did not see. To do building inference perfectly, you have to not only remember the history of enemy buildings, you have to remember the history of places. This is building inference level 2; if I understand it, I can draw more conclusions.

Building inference level 3 (I’ll call it) is drawing conclusions about the number of enemy production buildings from the number of units that they are seen to have produced. “Hmm, my scout saw your barracks under construction, and by now it could theoretically have produced n marines. You have more than n marines, so you definitely have at least 2 barracks.” Resource counting can figure into this: If your scout counted the enemy minerals, or if you have a table “it’s possible to mine x amount by time t,” or if you simply know the maximum number of marines that can be produced by one barracks without being sure when it was started, you can draw inferences. In the most general case, you know all the good build orders and their production potential, and can rule out the ones which are not consistent with what you see. Later in the game it should at least be possible to make a rough estimate of the enemy’s income, correlate it to the types and/or number of units you see, and get a fair idea of how many production buildings are likely. At this point you’re making estimates instead of drawing definite inferences.
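The simplest version of the level 3 argument is plain arithmetic, sketched here under stated assumptions: the function names are hypothetical, times are in game frames, and the only barracks known is the one the scout saw. The caller supplies the marine build time instead of the sketch hard-coding it.

```cpp
#include <cassert>

// Upper bound on the marines one barracks can have finished by frame `now`,
// given that the barracks itself completed at frame `barracksDone`.
inline int maxMarinesFromOneBarracks(int barracksDone, int now, int marineBuildFrames)
{
    assert(marineBuildFrames > 0);
    if (now <= barracksDone) return 0;
    return (now - barracksDone) / marineBuildFrames;   // integer division rounds down
}

// True if the marines seen cannot all have come from the one known barracks,
// so the enemy must have production that we have not seen.
inline bool moreProductionCertain(int marinesSeen, int barracksDone, int now, int marineBuildFrames)
{
    return marinesSeen > maxMarinesFromOneBarracks(barracksDone, now, marineBuildFrames);
}
```

With illustrative numbers, a barracks that finished at frame 1000 and a build time of 360 frames per marine allow at most 5 marines by frame 2800, so a sixth marine proves extra production. The bound ignores whether the enemy could afford the marines, which is where resource counting would tighten it.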

How many levels of building inference does PerfectBot have? “Oh, this army movement suggests that you are trying to draw my forces forward. Maybe you are trying to set up a drop in the rear? I’d say that means a 20% chance of a robo support bay....”

Steamhammer 1.4.2 is close

The opponent model changes in the upcoming Steamhammer version 1.4.2 are proving tedious to debug. I haven’t hit any serious problems and I don’t expect to, but simple bugs take time too and there are a lot of details to double-check. The new opening selection feature of the opponent model has many moving parts, and each part has to be tested and inspected and shown to work with the rest of the system.

All other aspects are passing tests with flying colors, even complicated stuff. The release might happen in 2 or 3 days. Terran and protoss are substantially stronger, so stand by!

Locutus has a pylon harassment feature

Bot authors do a lot of creative stuff. Sometimes they are so creative that I can’t figure out the idea. Like this game between Locutus and Iron: Why did Locutus’s scout probe build pylons all over Iron’s territory?

It’s not a bug. It’s an intentional feature in this version of Locutus—here is the commit. It’s called “pylon harassment” and in this version it is hardcoded to happen only against Iron. A comment says:

    // We want to build a pylon. Do so when:
    // - We have enough resources
    // - We are not close to the enemy mineral line
    // - We are in sight range of an enemy building
    // - Nothing is in the way

“In sight range of an enemy building.” The pylons are meant to be seen. The intention must be to direct the opponent’s attention, to somehow divert it from doing something Locutus doesn’t want it to do, such as leave its base and attack Locutus.

In this game, Iron was not diverted at all, and won easily since Locutus had wasted a ton of minerals on pylons. I can imagine that some bots would go wrong. Those that pull workers to defend against proxy buildings might pull too many workers and stop mining, for example. Still, if your bot does make a mistake like that, it shouldn’t be hard to fix. So if pylon harassment is useful at all, I guess it must be a trick to easily defeat some weaker bots, or bots which make a certain class of blunder and are not being updated. Or possibly it is an unfinished feature, and the way it worked in this game is not how it is intended to work. I like the second theory better.

Another comment says

    // In the future, recognize how opponents react to pylon harass and store it in the opponent model

So in its final state, it is not intended to be hand-configured, but automatically selected. It could use code similar to the auto gas steal code already in Steamhammer, which decides whether to steal gas. Or it might use code similar to the plan recognizer to decide whether the opponent made a sensible or a silly reaction, and repeat if the opponent was silly.

surprise Randomhammer crash

Randomhammer crashed—as zerg, no less. It got knocked back to 1 base with a few drones, and started to recover as normal, but after making a couple units... kaboom. Zerg has been recovering reliably from these upsets for a long time now, so it’s a surprise. I’m trying to triangulate the source of the problem from not-quite-adequate information.

The upcoming version 1.4.2 is closing in on being ready. The opponent model changes are finished, except for small details. The remaining work is a little bit of smoothing and a bunch of testing (and probably fixing). Between an untraced crash and a new and fairly complex addition to the opponent model, there’s no guessing how long debugging will take.

Steamhammer has a UCB bug

Ack, Steamhammer has a typo in its UCB formula! A parenthesis is misplaced. What a blunder! In UCB1_bound(), change this inexcusable mistake:

	return sqrt(2.0 * log(double(total)/tries));

to this:

	return sqrt(2.0 * log(total) / tries);

The typecast has no formal effect in C++11 and later, and made it harder to see the error.

The current Steamhammer 1.4.1 uses UCB only for deciding whether to steal gas, when AutoGasSteal is turned on. I had been wondering why it chose to steal gas so often against so many opponents. Was the gas steal really that effective? When I looked again at the code, I soon spotted the mistake.

The behavior is approximately right when the number of games is small. That’s how it passed my end-to-end tests. As the number of games goes up, it gets more and more wrong. It’s impossible to be too careful in testing. :-/
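To see why small tests pass, here are the two versions side by side as a standalone sketch. The function names and the parameter order are hypothetical; only the two expressions come from the code above.

```cpp
#include <cassert>
#include <cmath>

// The mistaken version: the division ended up inside the logarithm.
inline double buggyBound(int tries, int total)
{
    assert(tries > 0 && total >= tries);
    return std::sqrt(2.0 * std::log(double(total) / tries));
}

// The corrected version: log of the total, divided by the tries of this option.
inline double fixedBound(int tries, int total)
{
    assert(tries > 0 && total >= tries);
    return std::sqrt(2.0 * std::log(double(total)) / tries);
}
```

With 1 try out of 2 games the two agree exactly. With 50 tries out of 100 games the mistaken bound is about 1.18 where the correct one is about 0.43. The mistaken bound depends only on the ratio of total to tries, so it never shrinks as long as a choice keeps the same share of the games, and the exploration bonus stays large forever.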

The upcoming version will use UCB for opening selection—not in the most direct way, like most bots, but with a twist to cope with the large number of openings, too many to explore. Good thing I caught the bug in time.

Steamhammer and the protoss building bug

I wrote up Steamhammer’s protoss building problem in March as Steamhammer’s time limit problem with protoss. Since then, I borrowed the smart Locutus idea of limiting the number of gateways to 10, a workaround that nearly eliminated the problem—though it doesn’t solve it. It worked so well that I decided to remove a different workaround, a rule in building placement that many protoss buildings are allowed to touch vertically, blocking horizontal movement. That rule sometimes caused units to get trapped in between buildings and the edge of the map, so it was now causing more harm than good.

Then, in a test game of protoss versus Killerbot by Marian Devecka, the building problem came up again. Steamhammer laid out its base poorly, and no buildings could be added without ordering a new pylon first, which it didn’t know to do. The bot slowed to a crawl, trying over and over again to place buildings that, under its rules, there was no room for. The space powered by pylons was filled up.

Too infuriating! That same evening I put together an elaborate system to solve it, connecting the building manager, the strategy manager, and the information manager. It’s a version of the “absolute minimum” fix that I wrote about before. When the building manager sees that a protoss building location cannot be found, and the building requires pylon power, it sets a flag “stalled for lack of space” and refuses to make any more attempts to place buildings that require pylon power. It can still place a pylon, nexus, or assimilator. The strategy manager recognizes a building manager stall as a production emergency and orders an emergency pylon. Finally, the information manager, which keeps track of units, now keeps a set of our pylons. When a new pylon completes, it notifies the building manager to clear the “stalled for lack of space” flag if it is set.
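Stripped of the surrounding managers, the handshake is a single flag with three callers. This is a simplified illustration, not the actual Steamhammer code; the class `BuildingStall` and its method names are hypothetical.

```cpp
#include <cassert>

// Hypothetical stand-in for the building manager's stall state.
class BuildingStall
{
public:
    // Building manager: called when no location could be found for a building.
    void placementFailed(bool needsPylonPower)
    {
        if (needsPylonPower) stalled = true;
    }

    // Building manager: while stalled, skip buildings that need power.
    // A pylon, nexus, or assimilator needs no power and is always allowed.
    bool mayTryToPlace(bool needsPylonPower) const
    {
        return !(stalled && needsPylonPower);
    }

    // Strategy manager: a stall is a production emergency, so order a pylon.
    bool isStalled() const { return stalled; }

    // Information manager: called when one of our new pylons completes.
    void pylonCompleted() { stalled = false; }

private:
    bool stalled = false;
};

// Walk through one stall and recovery.
inline void stallCycleExample()
{
    BuildingStall bm;
    bm.placementFailed(true);          // a powered building found no space
    assert(bm.isStalled());
    assert(!bm.mayTryToPlace(true));   // no more placement attempts for powered buildings
    assert(bm.mayTryToPlace(false));   // the emergency pylon can still be placed
    bm.pylonCompleted();               // new power, so try again
    assert(!bm.isStalled());
}
```

The point of the flag is that a stalled building manager does almost no work per frame, which is what removes the slowdown. Whether the new pylon actually opens enough space is not checked here. If it does not, the next failed placement simply sets the flag again.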

The system seems to work. As one test, I made an opening which ordered 20 forges in a row. It was fun to write "20 x forge". Everything worked as designed: The building manager stalled each time it came to a forge that could not be placed. In the stalled state it does little work, and there was no slowdown. The strategy manager saw the stall and ordered an emergency pylon, which the building manager started. The information manager saw the pylon finish and told the building manager to unstall, and construction proceeded. It was not too efficient in terms of game play, but nothing broke or froze or ran over the time limit. And it worked repeatedly until all 20 forges were made, and the bot switched into normal play. Still, it would be surprising if such a complicated arrangement didn’t cause any bugs....

The problem to solve is a slowdown, and this change is an optimization that speeds things up: Technically it is a fix, not a workaround. But it is not a complete fix. Building placement can fail due to bugs, rather than lack of space in range of pylon power, and this fix doesn’t take that into account. A building placement bug could cause unnecessary stalls. Also I’m not convinced it will always play nice with the “oh, hey, let’s choose here as my new main base” code. I need to put a lot more work into the building code to make it fast and reliable, and then to teach it more sensible building layout.

The next version is almost finished—nominally. I need to complete some coding in the opponent model and batter it with heavy tests until it doesn’t fall down, and that is all that is in the plan. The problem, to change metaphors, is that every time I kick the ball, the goal moves away, carried by creeping features.

SAIL map balance

Here’s a new table I haven’t generated before, at least not in this form: Map balance for each race, from the 20,000 SAIL games. The left-side columns give win rates for the 3 matchups. The right-side columns give game counts and win rates for each race in all non-mirror matchups. (A mirror matchup always has 50% win rate, one winner and one loser of the same race, so including those would only pull the numbers closer to 50%.)

map	TvZ	ZvP	PvT	T games	T %	P games	P %	Z games	Z %	R games	R %
Benzene	50%	54%	48%	516	51%	654	48%	657	51%	231	50%
Destination	46%	49%	50%	498	49%	652	50%	627	52%	205	45%
HeartbreakRidge	43%	55%	41%	533	50%	588	44%	637	56%	222	50%
NeoMoonGlaive	40%	51%	48%	546	46%	652	49%	656	53%	232	53%
TauCross	47%	48%	48%	529	49%	668	51%	678	51%	229	47%
Andromeda	44%	49%	51%	506	47%	627	52%	646	52%	223	47%
CircuitBreaker	43%	51%	52%	550	46%	693	51%	692	52%	233	49%
EmpireoftheSun	47%	52%	48%	492	50%	612	49%	613	52%	227	49%
FightingSpirit	45%	52%	57%	535	45%	676	52%	694	52%	217	49%
Icarus	42%	54%	47%	543	48%	600	47%	650	54%	207	53%
Jade	44%	55%	50%	519	49%	611	48%	655	54%	211	46%
LaMancha	48%	51%	43%	501	52%	619	47%	613	52%	213	47%
Python	44%	49%	50%	521	47%	624	50%	642	52%	197	51%
Roadrunner	47%	46%	50%	557	49%	648	52%	716	51%	231	42%
overall	45%	51%	49%	7346	48%	8924	49%	9176	52%	3078	48%

Compare this to the race balance tables posted a couple days ago. Overall, there are no giant imbalances; the largest imbalance is 40%-60%, a 3:2 win rate for zerg over terran on Neo Moon Glaive. Terran has a small but consistent disadvantage against zerg on all but one map (Benzene came out even). That is a genuine race imbalance in current bot play: Z > T. But averaged across all the maps, the imbalance is only 55% zerg to 45% terran, 11:9, not a value to complain about. You can look back at the older tables to see which bots are responsible for the imbalance. By no means are all terran bots worse against zerg, or all zerg better against terran, but Krasi0 and Iron do show the pattern, and so does Steamhammer.

The other matchups favor one race on some maps and the other race on others, but tend to approximately balance out on average. Past investigation showed that the map race balances for pro players and for bot players were pretty much uncorrelated in AIIDE 2015: See comparing pro and bot balance. I expect that it is still true. Bots have not gotten much better at exploiting map features.

SAIL bots and maps

More SAIL data: How well each bot performed on each map.

bot	games	overall	Benzene	Destination	HeartbreakRidge	NeoMoonGlaive	TauCross	Andromeda	CircuitBreaker	EmpireoftheSun	FightingSpirit	Icarus	Jade	LaMancha	Python	Roadrunner
100382319	40	0.00%	0%	0%	0%	0%	0%	0%	0%	0%	0%	0%	0%	-	0%	0%
AILien	484	66.32%	50%	52%	66%	70%	64%	69%	76%	62%	76%	61%	73%	69%	64%	71%
Alice	233	19.31%	20%	7%	16%	24%	24%	40%	9%	24%	19%	14%	23%	13%	15%	28%
AndrewSmith	479	66.81%	56%	67%	74%	62%	69%	68%	77%	79%	51%	79%	61%	59%	64%	71%
AndreyKurdiumov	464	69.83%	61%	54%	71%	79%	75%	67%	69%	71%	67%	71%	74%	76%	77%	70%
Antiga	472	70.97%	73%	74%	69%	70%	66%	60%	72%	86%	74%	69%	57%	62%	78%	86%
Arrakhammer	467	65.31%	62%	56%	68%	70%	73%	59%	64%	64%	61%	62%	75%	65%	66%	73%
AurelienLermant	472	31.36%	38%	45%	35%	18%	35%	24%	18%	25%	32%	37%	31%	32%	38%	36%
BananaBrain	469	73.99%	61%	66%	71%	67%	85%	70%	79%	78%	76%	78%	70%	78%	81%	70%
Bereaver	501	71.06%	88%	75%	66%	85%	62%	74%	79%	63%	78%	80%	56%	51%	56%	80%
BlackCrow	459	62.09%	62%	45%	63%	75%	52%	56%	59%	74%	56%	63%	73%	62%	55%	73%
BryanWeber	460	13.48%	19%	3%	18%	14%	8%	16%	17%	8%	12%	19%	17%	30%	8%	7%
CarstenNielsen	467	56.32%	37%	55%	58%	44%	63%	58%	50%	70%	56%	51%	50%	56%	70%	68%
CasiaBot	458	51.31%	55%	58%	52%	50%	48%	48%	41%	55%	51%	48%	62%	52%	50%	46%
CherryPi	482	76.56%	84%	88%	74%	73%	83%	82%	67%	85%	78%	76%	69%	83%	70%	67%
ChrisCoxe	476	70.38%	56%	64%	80%	83%	62%	67%	66%	55%	61%	80%	79%	69%	88%	72%
Cimex	41	17.07%	0%	20%	0%	-	50%	0%	0%	0%	50%	0%	33%	33%	0%	33%
cpac	6	66.67%	100%	-	67%	-	-	-	-	100%	-	-	0%	-	-	-
CruzBot	476	19.12%	18%	26%	18%	26%	18%	26%	14%	19%	11%	26%	10%	12%	17%	24%
DAIDOES	477	27.25%	21%	26%	12%	8%	32%	37%	33%	27%	40%	26%	29%	26%	30%	37%
DaveChurchill	474	60.97%	63%	59%	64%	69%	59%	65%	58%	65%	63%	63%	63%	49%	48%	63%
DawidLoranc	458	43.01%	49%	57%	34%	57%	33%	54%	27%	57%	39%	44%	42%	38%	44%	32%
Ecgberht	472	61.23%	57%	66%	52%	64%	56%	65%	62%	60%	74%	55%	65%	71%	48%	67%
Flash	492	60.37%	63%	61%	48%	63%	68%	78%	50%	50%	70%	62%	53%	53%	61%	61%
FlorianRichoux	462	36.36%	58%	25%	43%	31%	26%	41%	36%	31%	32%	34%	45%	30%	41%	40%
ForceBot	403	48.88%	55%	57%	40%	41%	43%	52%	45%	56%	49%	37%	58%	55%	42%	55%
GaoyuanChen	481	44.49%	29%	46%	40%	51%	42%	45%	50%	53%	51%	48%	41%	39%	44%	39%
Goliat	43	11.63%	0%	0%	0%	25%	20%	0%	33%	0%	0%	14%	0%	0%	100%	0%
GuiBot	50	32.00%	0%	0%	50%	50%	33%	33%	20%	43%	100%	20%	40%	67%	0%	50%
HannesBredberg	443	31.15%	32%	40%	44%	37%	22%	33%	32%	23%	28%	29%	35%	45%	21%	19%
HOLDZ	340	20.00%	6%	12%	18%	30%	20%	9%	33%	17%	24%	24%	24%	15%	35%	14%
ICELab	481	64.03%	52%	67%	56%	59%	57%	78%	57%	71%	65%	72%	76%	73%	53%	61%
Ironbot	485	89.48%	94%	81%	92%	91%	91%	95%	88%	95%	81%	90%	87%	84%	94%	89%
JakubTrancik	481	34.93%	54%	35%	37%	38%	35%	48%	38%	9%	21%	39%	40%	42%	25%	26%
JohanKayser	486	17.28%	14%	14%	10%	36%	21%	29%	21%	15%	11%	14%	16%	4%	24%	6%
Juno	4	50.00%	-	-	-	-	50%	0%	100%	-	-	-	-	-	-	-
KaonBot	460	31.30%	39%	28%	34%	32%	44%	15%	29%	33%	40%	24%	30%	38%	22%	31%
KillAlll	481	53.01%	45%	63%	55%	57%	52%	44%	56%	64%	59%	45%	56%	59%	39%	45%
Korean	478	20.50%	22%	18%	32%	27%	25%	0%	31%	0%	26%	33%	21%	21%	0%	31%
krasi0	495	94.34%	95%	94%	94%	89%	95%	90%	90%	97%	97%	94%	97%	100%	93%	97%
Kruecke	41	24.39%	0%	0%	0%	33%	0%	33%	67%	50%	0%	33%	0%	50%	0%	50%
Locutus	321	73.21%	80%	82%	77%	59%	78%	74%	71%	68%	85%	60%	81%	61%	77%	76%
LukasMoravec	472	31.99%	45%	26%	20%	38%	28%	32%	38%	30%	29%	30%	37%	23%	39%	29%
MadMixP	484	48.35%	45%	58%	48%	46%	41%	36%	52%	38%	42%	50%	59%	61%	53%	46%
MadMixT	452	30.75%	29%	29%	28%	24%	31%	32%	25%	31%	33%	28%	28%	46%	34%	28%
MadMixZ	471	32.06%	40%	23%	26%	27%	38%	39%	46%	26%	41%	33%	19%	19%	31%	29%
MarekKadek	472	20.13%	12%	21%	17%	12%	25%	21%	26%	29%	10%	24%	26%	15%	19%	30%
MarianDevecka	463	83.59%	85%	85%	82%	91%	77%	78%	77%	84%	87%	89%	93%	86%	76%	84%
MarineHell	483	23.60%	25%	15%	25%	21%	23%	35%	19%	21%	28%	16%	14%	31%	33%	21%
MartinRooijackers	455	67.91%	67%	70%	71%	69%	72%	77%	64%	66%	64%	66%	69%	72%	68%	59%
MatejIstenik	465	31.61%	32%	39%	29%	31%	50%	33%	25%	38%	23%	42%	16%	25%	36%	26%
MegaBot2017	442	49.32%	42%	58%	41%	47%	56%	53%	45%	63%	61%	45%	26%	57%	43%	57%
Microwave	466	76.82%	86%	72%	69%	78%	78%	87%	88%	68%	80%	66%	79%	84%	71%	70%
MiddleSchoolStrats	288	48.96%	39%	67%	67%	50%	55%	43%	54%	39%	39%	41%	43%	43%	59%	38%
Myscbot	473	34.04%	57%	60%	31%	25%	27%	24%	34%	36%	46%	6%	35%	29%	27%	44%
NeoEdmundZerg	475	73.05%	80%	74%	64%	76%	69%	81%	66%	70%	69%	75%	78%	81%	80%	57%
NielsJustesen	315	31.43%	19%	22%	22%	50%	24%	46%	32%	30%	32%	30%	39%	36%	25%	40%
NiteKatP	458	33.84%	29%	36%	41%	27%	45%	24%	34%	41%	36%	18%	27%	55%	28%	29%
NiteKatT	458	56.11%	50%	80%	55%	45%	62%	53%	58%	48%	58%	60%	61%	35%	62%	63%
NLPRbot	460	62.83%	66%	70%	63%	56%	65%	57%	85%	70%	54%	63%	67%	48%	63%	59%
NUSBot	500	23.60%	23%	34%	0%	34%	27%	31%	21%	22%	31%	11%	37%	23%	29%	8%
OpprimoBot	308	13.31%	12%	6%	11%	9%	12%	8%	22%	13%	21%	19%	10%	19%	13%	14%
PeregrineBot	471	45.01%	39%	48%	34%	48%	52%	52%	47%	50%	35%	50%	47%	36%	49%	45%
PineappleCactus	490	38.78%	21%	27%	46%	31%	45%	40%	39%	45%	27%	40%	43%	55%	36%	48%
PurpleSpirit	441	50.34%	48%	48%	53%	50%	47%	51%	44%	60%	41%	61%	35%	57%	68%	44%
PurpleSwarm	451	69.84%	79%	59%	70%	65%	67%	69%	84%	70%	74%	76%	69%	60%	76%	57%
PurpleWave	456	80.04%	69%	78%	75%	85%	87%	82%	82%	74%	82%	85%	81%	80%	82%	74%
Randomhammer	496	55.65%	59%	68%	64%	69%	53%	57%	45%	56%	50%	59%	50%	42%	74%	42%
RomanDanielis	489	32.92%	24%	29%	17%	33%	39%	42%	30%	25%	38%	32%	42%	21%	47%	36%
SijiaXu	479	57.83%	57%	51%	56%	61%	59%	54%	53%	60%	57%	53%	61%	54%	64%	68%
SimonPrins	446	72.20%	79%	79%	61%	69%	57%	70%	82%	64%	74%	63%	76%	73%	73%	84%
Sling	509	26.52%	32%	22%	29%	28%	17%	33%	25%	22%	40%	11%	22%	30%	25%	33%
SoerenKlett	455	49.01%	55%	38%	49%	39%	44%	47%	55%	50%	42%	61%	53%	27%	56%	58%
Sparks	474	35.23%	35%	28%	37%	45%	42%	38%	33%	41%	32%	26%	31%	33%	32%	42%
SRbotOne	4	25.00%	-	0%	-	0%	0%	-	-	-	-	-	-	-	100%	-
Steamhammer	466	76.18%	84%	72%	81%	65%	70%	86%	67%	81%	81%	82%	73%	71%	76%	78%
Stone	481	53.22%	68%	64%	69%	40%	44%	39%	45%	57%	55%	59%	56%	48%	63%	51%
SunggukCha	463	35.42%	38%	31%	37%	35%	28%	36%	31%	38%	41%	36%	38%	39%	29%	38%
TomasCere	485	36.91%	38%	30%	19%	31%	28%	34%	53%	39%	30%	34%	56%	45%	44%	31%
TomasVajda	459	72.11%	55%	90%	74%	71%	65%	72%	78%	79%	69%	74%	73%	73%	73%	63%
TravisShelton	536	12.50%	19%	10%	5%	13%	9%	17%	12%	17%	5%	18%	14%	18%	16%	7%
tscmoo	489	72.39%	80%	77%	74%	63%	77%	69%	73%	68%	82%	76%	63%	79%	68%	67%
tscmoop	468	75.64%	82%	73%	76%	79%	81%	73%	83%	66%	78%	75%	57%	76%	69%	86%
tscmoor	485	74.85%	84%	77%	76%	80%	72%	71%	78%	76%	84%	67%	68%	66%	87%	61%
tscmooz	446	54.48%	46%	57%	53%	45%	61%	61%	62%	59%	52%	59%	50%	57%	48%	54%
TyrProtoss	477	67.51%	82%	43%	57%	62%	75%	67%	72%	86%	72%	71%	68%	63%	57%	70%
UC3ManoloBot	36	2.78%	0%	0%	0%	0%	0%	0%	0%	50%	0%	0%	0%	0%	0%	0%
UPStarCraftAI2016	451	39.25%	35%	37%	52%	39%	54%	33%	31%	33%	50%	50%	30%	36%	32%	45%
WillBot	521	44.53%	49%	43%	39%	50%	40%	38%	57%	43%	52%	49%	38%	45%	33%	50%
WillyT	140	42.86%	60%	45%	29%	60%	100%	43%	0%	22%	21%	50%	50%	78%	31%	50%
WuliBot	486	72.02%	65%	72%	79%	78%	76%	79%	67%	57%	73%	66%	74%	78%	66%	74%
Xelnaga	456	32.24%	26%	43%	40%	31%	32%	18%	24%	38%	42%	25%	33%	28%	44%	20%
YuanhengZhu	488	37.91%	28%	24%	28%	40%	38%	36%	46%	37%	41%	60%	46%	24%	36%	46%
Zercgberht	471	35.03%	32%	48%	32%	40%	31%	29%	23%	37%	31%	28%	28%	49%	40%	39%
Ziabot	478	53.97%	50%	46%	70%	39%	65%	53%	57%	51%	57%	50%	52%	48%	62%	65%
ZurZurZur	483	61.08%	61%	60%	73%	73%	58%	61%	55%	61%	62%	67%	67%	53%	58%	52%

Many bots show dramatic performance differences between maps. The differences definitely mean something, even though I often find it hard to say what. What in your bot’s play interacts well or poorly with the map features? If your bot performs poorly on a given map, maybe it should choose a different strategy there. But it’s possible that the reason has to do with specific opponents, and you have to dig into details to find the answer.

Steamhammer’s results surprised me. I think of Heartbreak Ridge as Steamhammer’s worst map, but these numbers say it is a good map. After I fix the map block issues there, which will probably be in the version after next, maybe Heartbreak Ridge will become Steamhammer’s best map. In any case, the next version will try to adjust its play for each opponent according to the map, once it accumulates enough data from experience. It will be interesting to see whether that successfully evens out the performance across maps.

Next: Map balance.

SAIL race balance tables

SAIL is still down. I decided to analyze the game data anyway. I grabbed their file of the last 20,000 game results and modified my tournament result analyzer to handle it.

Here is the overall race balance. It’s nice and even. Terran has a little trouble against zerg, and random has a little trouble against protoss, but it’s all within expectations. Of course this averages together bots of all skill levels (at first I typed “kill levels,” which I guess means the same thing). We know that there’s a good mix of participants of each race, so these numbers mean that none of the races is finding its job much easier or harder than the others.

	vT	vP	vZ	vR
terran		51%	45%	52%
protoss	49%		49%	54%
zerg	55%	51%		50%
random	48%	46%	50%

The table of how each bot performs against the different races is more interesting. This is in alphabetical order, so the number on the left is just to show how many there are. There are few random players, so that column is less informative. (Of course on SAIL, “random” only means that both players learn the bot’s race when the game starts.)

#	bot	race	games	overall	vT	vP	vZ	vR
1	100382319	terran	40	0.00%	0%	0%	0%	0%
2	AILien	zerg	484	66.32%	64%	65%	66%	80%
3	Alice	zerg	233	19.31%	18%	22%	14%	32%
4	AndrewSmith	protoss	479	66.81%	65%	72%	63%	71%
5	AndreyKurdiumov	random	464	69.83%	65%	70%	74%	69%
6	Antiga	protoss	472	70.97%	60%	62%	83%	79%
7	Arrakhammer	zerg	467	65.31%	69%	59%	71%	53%
8	AurelienLermant	zerg	472	31.36%	36%	38%	16%	48%
9	BananaBrain	protoss	469	73.99%	69%	68%	82%	80%
10	Bereaver	protoss	501	71.06%	73%	77%	64%	72%
11	BlackCrow	zerg	459	62.09%	59%	61%	68%	55%
12	BryanWeber	zerg	460	13.48%	19%	14%	9%	14%
13	CarstenNielsen	protoss	467	56.32%	50%	57%	58%	65%
14	CasiaBot	zerg	458	51.31%	62%	39%	61%	24%
15	CherryPi	zerg	482	76.56%	75%	74%	85%	57%
16	ChrisCoxe	zerg	476	70.38%	79%	69%	64%	79%
17	Cimex	zerg	41	17.07%	8%	21%	17%	33%
18	cpac	zerg	6	66.67%	100%	33%	-	-
19	CruzBot	protoss	476	19.12%	24%	26%	8%	18%
20	DAIDOES	protoss	477	27.25%	17%	29%	33%	25%
21	DaveChurchill	random	474	60.97%	57%	54%	67%	76%
22	DawidLoranc	zerg	458	43.01%	46%	43%	39%	46%
23	Ecgberht	terran	472	61.23%	59%	68%	55%	68%
24	Flash	protoss	492	60.37%	45%	64%	67%	61%
25	FlorianRichoux	protoss	462	36.36%	43%	38%	26%	54%
26	ForceBot	zerg	403	48.88%	44%	52%	50%	46%
27	GaoyuanChen	protoss	481	44.49%	50%	46%	39%	45%
28	Goliat	terran	43	11.63%	12%	18%	0%	0%
29	GuiBot	protoss	50	32.00%	36%	28%	45%	0%
30	HannesBredberg	terran	443	31.15%	37%	31%	21%	53%
31	HOLDZ	zerg	340	20.00%	33%	22%	7%	26%
32	ICELab	terran	481	64.03%	73%	79%	43%	71%
33	Ironbot	terran	485	89.48%	88%	96%	85%	88%
34	JakubTrancik	protoss	481	34.93%	43%	41%	22%	46%
35	JohanKayser	terran	486	17.28%	12%	23%	17%	12%
36	Juno	protoss	4	50.00%	-	100%	0%	100%
37	KaonBot	terran	460	31.30%	37%	38%	22%	32%
38	KillAlll	zerg	481	53.01%	59%	64%	37%	54%
39	Korean	zerg	478	20.50%	25%	30%	7%	29%
40	krasi0	terran	495	94.34%	95%	98%	91%	91%
41	Kruecke	terran	41	24.39%	43%	31%	0%	0%
42	Locutus	protoss	321	73.21%	79%	66%	77%	73%
43	LukasMoravec	protoss	472	31.99%	40%	29%	29%	40%
44	MadMixP	protoss	484	48.35%	50%	46%	47%	62%
45	MadMixT	terran	452	30.75%	46%	28%	22%	29%
46	MadMixZ	zerg	471	32.06%	36%	32%	31%	24%
47	MarekKadek	terran	472	20.13%	12%	17%	29%	22%
48	MarianDevecka	zerg	463	83.59%	87%	85%	85%	69%
49	MarineHell	terran	483	23.60%	13%	29%	22%	29%
50	MartinRooijackers	terran	455	67.91%	52%	81%	69%	61%
51	MatejIstenik	terran	465	31.61%	33%	31%	29%	41%
52	MegaBot2017	protoss	442	49.32%	60%	53%	38%	47%
53	Microwave	zerg	466	76.82%	76%	77%	76%	84%
54	MiddleSchoolStrats	zerg	288	48.96%	62%	42%	47%	56%
55	Myscbot	protoss	473	34.04%	22%	31%	44%	34%
56	NeoEdmundZerg	zerg	475	73.05%	74%	68%	74%	83%
57	NielsJustesen	protoss	315	31.43%	35%	30%	29%	34%
58	NiteKatP	protoss	458	33.84%	33%	19%	49%	34%
59	NiteKatT	terran	458	56.11%	63%	56%	51%	62%
60	NLPRbot	zerg	460	62.83%	82%	45%	61%	70%
61	NUSBot	protoss	500	23.60%	40%	18%	16%	25%
62	OpprimoBot	random	308	13.31%	14%	18%	7%	13%
63	PeregrineBot	zerg	471	45.01%	55%	48%	39%	28%
64	PineappleCactus	zerg	490	38.78%	40%	34%	41%	42%
65	PurpleSpirit	terran	441	50.34%	58%	47%	50%	46%
66	PurpleSwarm	zerg	451	69.84%	60%	69%	79%	61%
67	PurpleWave	protoss	456	80.04%	73%	84%	82%	74%
68	Randomhammer	random	496	55.65%	62%	42%	66%	33%
69	RomanDanielis	protoss	489	32.92%	27%	33%	34%	42%
70	SijiaXu	zerg	479	57.83%	56%	59%	60%	49%
71	SimonPrins	terran	446	72.20%	61%	79%	74%	74%
72	Sling	zerg	509	26.52%	15%	34%	25%	37%
73	SoerenKlett	terran	455	49.01%	58%	55%	36%	56%
74	Sparks	terran	474	35.23%	42%	38%	28%	38%
75	SRbotOne	terran	4	25.00%	100%	0%	0%	-
76	Steamhammer	zerg	466	76.18%	85%	70%	80%	58%
77	Stone	terran	481	53.22%	34%	65%	55%	53%
78	SunggukCha	terran	463	35.42%	54%	31%	25%	42%
79	TomasCere	protoss	485	36.91%	47%	38%	29%	37%
80	TomasVajda	protoss	459	72.11%	62%	81%	72%	74%
81	TravisShelton	random	536	12.50%	13%	11%	13%	10%
82	tscmoo	terran	489	72.39%	83%	58%	80%	72%
83	tscmoop	protoss	468	75.64%	78%	76%	73%	78%
84	tscmoor	random	485	74.85%	81%	74%	68%	91%
85	tscmooz	zerg	446	54.48%	71%	53%	42%	61%
86	TyrProtoss	protoss	477	67.51%	53%	63%	78%	97%
87	UC3ManoloBot	terran	36	2.78%	10%	0%	0%	0%
88	UPStarCraftAI2016	zerg	451	39.25%	49%	46%	27%	34%
89	WillBot	random	521	44.53%	39%	50%	45%	38%
90	WillyT	terran	140	42.86%	37%	44%	48%	33%
91	WuliBot	protoss	486	72.02%	58%	77%	76%	78%
92	Xelnaga	protoss	456	32.24%	38%	36%	25%	30%
93	YuanhengZhu	protoss	488	37.91%	44%	41%	28%	51%
94	Zercgberht	zerg	471	35.03%	37%	39%	31%	30%
95	Ziabot	zerg	478	53.97%	51%	51%	60%	50%
96	ZurZurZur	zerg	483	61.08%	63%	53%	68%	59%

A lot of interesting details can be seen in which bot is better against which race. Iron does well against all races, but smashes protoss into the ground. I find it surprising that MarineHell scores better against protoss. Zerg KillAlll has a striking pattern in that it does well against terran and protoss, and can’t cope with another zerg. Zerg NLPRbot does much better against terran.

I’m pleased with Steamhammer’s numbers. In the last AIIDE tournament, Steamhammer played well against zerg, but unconvincingly against terran and protoss. This version looks good against all races, though still weaker against protoss.

Next: Data broken down by map.

both DLL and EXE?

A question for bot authors.

One of the changes Dave Churchill made to UAlbertaBot after I forked off Steamhammer was to make it work either as a DLL with direct access, or as a separate EXE. For running in a competition I think the DLL is more natural, because it should have lower overhead. But there are reasons to prefer the EXE. Dave Churchill said that it made his debugging much easier, and he strongly recommended the change.

For my debugging style, I don’t see an advantage. I could borrow code from UAlbertaBot, and unless I’m missing something it wouldn’t be much trouble to implement. Nevertheless, I haven’t been in a hurry. But maybe I don’t get it, or maybe I am some kind of freak. So my question is:

Do you see it as a useful feature? I expect I could implement it for the version after next, if people want it.

Update: After painstaking examination of the quantity and tenor of the comments, spending entire seconds of deep concentration on the task, I conclude that there is other than a groundswell of support. Steamhammer will stay DLL only.

no more enemy-specific strategies for Steamhammer

Working on the opponent model today, I made one of the key changes for the next version:

    "UseEnemySpecificStrategy" : false,
    "EnemySpecificStrategy" :
    {
    },

No more openings hand-configured for known opponents. Steamhammer has to figure out everything on its own. I’ve been working toward this for a long time, and it’s good to finally take the step. I expect play to become more varied—Steamhammer is likely to discover surprising solutions for some opponents. Play should also become stronger, especially in tournaments where opponents like to prepare specially against select enemies. They’ll have to look for ways to exploit Steamhammer’s tactical and micro mistakes, because the game plans will be too adaptive.

I also wrote the terran vulture-first recognizer for the plan recognizer today. It recognizes a plan called Factory that can only be followed by terran, and Steamhammer zerg is configured to counter the plan with the AntiFactory opening. Testing against Iron, it worked perfectly: The first game, Iron won easily. The second game, Steamhammer countered and fought back hard (and happened to win). That’s how it’s supposed to work.

The recognizer was easy to write. Maybe I should write a few more recognizers and counters.

Iron should be a good test case, because Iron is strong enough to usually defeat the counter—AntiFactory puts up a tough battle, but still mostly loses. Opening learning success looks like this: Steamhammer realizes that AntiFactory is probably best, though not all that good, and explores other openings sometimes but not too often. I think I should be able to get that right.

Will playing better games against Iron entice voters on SSCAIT? I think it might happen. If so, I will quickly grow bored with similar Iron-Steamhammer games, but stream watchers may be pleased. Iron would likely lose a few Elo points to Steamhammer on average, instead of gaining as it does now.

The upcoming version 1.4.2 has important improvements for all races, including some improvements I haven’t mentioned. Strategy, macro, and micro are better. Look forward to higher rankings for Steamhammer and Randomhammer.