
Steamhammer 1.4.2 has solidified

Vapor no longer, Steamhammer 1.4.2 is uploaded on SSCAIT. As I’ve mentioned, expect Randomhammer’s terran and protoss especially to play better. The usual change list and source to follow.

expect Steamhammer 1.4.2 within a day

Once I debugged the opponent model opening selection, a long careful test revealed that—drum roll—it didn't work as intended. When the number of games against a given opponent went over 30 or so, the "clever" system for trying out new openings became overactive and gummed up the mechanism, with poor game results. It was a design mistake, and I had to redesign it with less cleverness and rewrite a large section. Now it tests as working reasonably well against both predictable and difficult opponents, over both short and long runs of games.

Tests located the stubborn bug in the gas steal, which sometimes caused the gas steal drone to sit around next to the enemy extractor doing nothing. I fixed it. Is that the last gas steal bug? Based on history, probably not....

More tests to run, and the wild shots so far are par for the course. Everything is on track. Expect Steamhammer 1.4.2 to be uploaded late tonight or early tomorrow (in my time zone).

Tomorrow: Steamhammer 1.4.2 change list. It's kind of long.

inferring enemy buildings

I want Steamhammer to infer the existence of enemy buildings that it has not seen. Right now, the plan recognizer explicitly encodes its conditions: If I see an early factory, or a unit that came from a factory, or an armory, or a starport, then I can guess that the enemy is playing a factory opening. The zerg strategy boss uses related reasoning. Every bit of code that wants to know if the enemy has a factory has to also look for things that depend on the factory. It’s more general and powerful to decouple the inferences. The information manager should notice when it sees things that can’t be explained by known enemy structures, and infer what was necessary to produce them. If it sees a battlecruiser, it should infer the science facility with physics lab, the starport with control tower, and all the way down. Then the rest of Steamhammer can rely on the inferences, and not have to do any extra work on its own. In principle, it could draw conclusions like “Wait, I’ve seen the whole enemy base and there is no physics lab there. I should look harder at the rest of the map.”

Research counts too. If I see a wraith cloak, I know there is a control tower, even though a control tower is not necessary to make wraiths. It’s necessary to research cloaking. If I see a unit with an attack upgrade, I know what buildings were needed to research the upgrade.
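A minimal sketch of the decoupled inference, assuming a hand-written requirements table keyed by plain strings. The names `Requirements`, `terranRequirements`, and `inferPrerequisites` are hypothetical and are not Steamhammer or BWAPI identifiers; the table covers only the terran examples mentioned above.

```cpp
#include <map>
#include <set>
#include <string>
#include <vector>

// Maps a unit, building, or research to what the enemy must own to get it.
using Requirements = std::map<std::string, std::vector<std::string>>;

// Hand-written fragment of the terran tech tree, for illustration only.
Requirements terranRequirements() {
    return {
        {"Battlecruiser",    {"Control Tower", "Physics Lab"}},
        {"Physics Lab",      {"Science Facility"}},
        {"Science Facility", {"Starport"}},
        {"Control Tower",    {"Starport"}},
        {"Wraith",           {"Starport"}},
        {"Cloaking Field",   {"Control Tower"}},  // research, not a unit
        {"Armory",           {"Factory"}},
        {"Starport",         {"Factory"}},
        {"Factory",          {"Barracks"}},
        {"Barracks",         {"Command Center"}},
    };
}

// Adds everything that must exist (or have existed) for `seen` to exist,
// following the requirements all the way down.
void inferPrerequisites(const Requirements& reqs,
                        const std::string& seen,
                        std::set<std::string>& inferred) {
    auto it = reqs.find(seen);
    if (it == reqs.end()) {
        return;  // nothing known about this item
    }
    for (const std::string& req : it->second) {
        // Recurse only on first insertion, so shared prerequisites are visited once.
        if (inferred.insert(req).second) {
            inferPrerequisites(reqs, req, inferred);
        }
    }
}
```

Seeing a battlecruiser fills the set with the physics lab, science facility, control tower, starport, factory, barracks, and command center. Seeing a cloaked wraith can be handled by feeding in "Cloaking Field" as well as "Wraith". This sketch knows nothing about history, so it cannot handle the destroyed-factory cases.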

It occurs to me that there are tricky cases. Suppose I break into the enemy base and find a factory—and destroy it. Later, I reach farther into the base and find an armory. I can’t infer that there is a second or new factory; the armory could have been started when the first factory still existed. To interpret the armory correctly, I have to remember that the enemy used to have a factory, and without more information I can’t be sure whether it has a factory now. This is building inference level 1; I have to know history to draw true conclusions.

But sometimes I can infer that there is another factory. If the armory is in a place that I know was empty when the first factory was destroyed, then I know that it is a new armory. There must have been another factory, or another factory was built after the first one was killed; in any case, I can infer a factory that I did not see. To do building inference perfectly, you have to not only remember the history of enemy buildings, you have to remember the history of places. This is building inference level 2; if I understand it, I can draw more conclusions.

Building inference level 3 (I’ll call it) is drawing conclusions about the number of enemy production buildings from the number of units that they are seen to have produced. “Hmm, my scout saw your barracks under construction, and by now it could theoretically have produced n marines. You have more than n marines, so you definitely have at least 2 barracks.” Resource counting can figure into this: If your scout counted the enemy minerals, or if you have a table “it’s possible to mine x amount by time t,” or if you simply know the maximum number of marines that can be produced by one barracks without being sure when it was started, you can draw inferences. In the most general case, you know all the good build orders and their production potential, and can rule out the ones which are not consistent with what you see. Later in the game it should at least be possible to make a rough estimate of the enemy’s income, correlate it to the types and/or number of units you see, and get a fair idea of how many production buildings are likely. At this point you’re making estimates instead of drawing definite inferences.
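The simplest form of the level 3 argument is a ceiling division. Here is a rough sketch, assuming you know how many frames the first production building could have been running and how many frames one unit takes to build. The function name and the -1 convention are made up for illustration.

```cpp
#include <cassert>

// Lower bound on the number of production buildings the enemy must have,
// given how many units were seen and how long one building could have been
// producing. Returns -1 when even one unit is impossible in the time allowed,
// which means the assumed start time must be wrong.
int minProductionBuildings(int unitsSeen, int framesAvailable, int buildFrames) {
    assert(buildFrames > 0);
    if (unitsSeen <= 0) {
        return 0;  // no evidence of any production building
    }
    const int maxPerBuilding = framesAvailable / buildFrames;
    if (maxPerBuilding <= 0) {
        return -1;  // observation contradicts the assumed timing
    }
    // Ceiling division: each building contributes at most maxPerBuilding units.
    return (unitsSeen + maxPerBuilding - 1) / maxPerBuilding;
}
```

For example, if a barracks has had 1440 frames and a marine is assumed to take 360 frames, one barracks can have produced at most 4 marines, so seeing 5 marines implies at least 2 barracks. The sketch ignores resource limits and construction time, so it gives only a lower bound.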

How many levels of building inference does PerfectBot have? “Oh, this army movement suggests that you are trying to draw my forces forward. Maybe you are trying to set up a drop in the rear? I’d say that means a 20% chance of a robo support bay....”

Steamhammer 1.4.2 is close

The opponent model changes in the upcoming Steamhammer version 1.4.2 are proving tedious to debug. I haven’t hit any serious problems and I don’t expect to, but simple bugs take time too and there are a lot of details to double-check. The new opening selection feature of the opponent model has many moving parts, and each part has to be tested and inspected and shown to work with the rest of the system.

All other aspects are passing tests with flying colors, even complicated stuff. The release might happen in 2 or 3 days. Terran and protoss are substantially stronger, so stand by!

Locutus has a pylon harassment feature

Bot authors do a lot of creative stuff. Sometimes they are so creative that I can’t figure out the idea. Like this game between Locutus and Iron: Why did Locutus’s scout probe build pylons all over Iron’s territory?

too many pylons in the wrong place

It’s not a bug. It’s an intentional feature in this version of Locutus—here is the commit. It’s called “pylon harassment” and in this version it is hardcoded to happen only against Iron. A comment says:

    // We want to build a pylon. Do so when:
    // - We have enough resources
    // - We are not close to the enemy mineral line
    // - We are in sight range of an enemy building
    // - Nothing is in the way

“In sight range of an enemy building.” The pylons are meant to be seen. The intention must be to direct the opponent’s attention, to somehow divert it from doing something Locutus doesn’t want it to do, such as leave its base and attack Locutus.

In this game, Iron was not diverted at all, and won easily since Locutus had wasted a ton of minerals on pylons. I can imagine that some bots would go wrong. Those that pull workers to defend against proxy buildings might pull too many workers and stop mining, for example. Still, if your bot does make a mistake like that, it shouldn’t be hard to fix. So if pylon harassment is useful at all, I guess it must be a trick to easily defeat some weaker bots, or bots which make a certain class of blunder and are not being updated. Or possibly it is an unfinished feature, and the way it worked in this game is not how it is intended to work. I like the second theory better.

Another comment says

    // In the future, recognize how opponents react to pylon harass and store it in the opponent model

So in its final state, it is not intended to be hand-configured, but automatically selected. It could use code similar to the auto gas steal code already in Steamhammer, which decides whether to steal gas. Or it might use code similar to the plan recognizer to decide whether the opponent made a sensible or a silly reaction, and repeat if the opponent was silly.

surprise Randomhammer crash

Randomhammer crashed—as zerg, no less. It got knocked back to 1 base with a few drones, and started to recover as normal, but after making a couple units... kaboom. Zerg has been recovering reliably from these upsets for a long time now, so it’s a surprise. I’m trying to triangulate the source of the problem from not-quite-adequate information.

The upcoming version 1.4.2 is closing in on being ready. The opponent model changes are finished, except for small details. The remaining work is a little bit of smoothing and a bunch of testing (and probably fixing). In between an untraced crash and a new and fairly complex addition to the opponent model, there’s no guessing how long debugging will take.

Steamhammer has a UCB bug

Ack, Steamhammer has a typo in its UCB formula! A parenthesis is misplaced. What a blunder! In UCB1_bound(), change this inexcusable mistake:

	return sqrt(2.0 * log(double(total)/tries));

to this:

	return sqrt(2.0 * log(total) / tries);

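The typecast was meant to apply to the argument of log, where it has no formal effect in C++11 and later (an integer argument is converted to double anyway). All it did was make the error harder to see.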

The current Steamhammer 1.4.1 uses UCB only for deciding whether to steal gas, when AutoGasSteal is turned on. I had been wondering why it chose to steal gas so often against so many opponents. Was the gas steal really that effective? When I looked again at the code, I soon spotted the mistake.

The behavior is approximately right when the number of games is small. That’s how it passed my end-to-end tests. As the number of games goes up, it gets more and more wrong. It’s impossible to be too careful in testing. :-/
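To see why small tests pass, here are the two versions side by side as standalone functions. The function names and the `int` parameter types are assumptions for this sketch; only the two return expressions come from the code above.

```cpp
#include <cassert>
#include <cmath>

// The misplaced parenthesis: the division happens inside the logarithm.
double ucb1BoundBuggy(int total, int tries) {
    assert(total > 0 && tries > 0);
    return std::sqrt(2.0 * std::log(double(total) / tries));
}

// The intended UCB1 exploration term: sqrt(2 ln(total) / tries).
double ucb1BoundFixed(int total, int tries) {
    assert(total > 0 && tries > 0);
    return std::sqrt(2.0 * std::log(double(total)) / tries);
}
```

With 4 games and 2 tries, both versions give about 1.18. With 100 games and 50 tries, the fixed version has shrunk to about 0.43 while the buggy one is still about 1.18, because it depends only on the ratio of total to tries. The exploration bonus never decays as evidence accumulates.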

The upcoming version will use UCB for opening selection—not in the most direct way, like most bots, but with a twist to cope with the large number of openings, too many to explore. Good thing I caught the bug in time.
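For reference, the direct way looks roughly like this. It is a sketch of plain UCB1 over a short list of openings, not the twist mentioned above; `OpeningRecord` and `chooseOpeningUCB1` are hypothetical names.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <string>
#include <vector>

// Win/loss history of one opening against one opponent (hypothetical record).
struct OpeningRecord {
    std::string name;
    int wins;
    int games;
};

// Plain UCB1: try every opening once, then pick the one with the highest
// mean win rate plus exploration bonus. Returns an index into `openings`.
std::size_t chooseOpeningUCB1(const std::vector<OpeningRecord>& openings) {
    assert(!openings.empty());
    int total = 0;
    for (const OpeningRecord& o : openings) {
        total += o.games;
    }
    std::size_t best = 0;
    double bestScore = -1.0;
    for (std::size_t i = 0; i < openings.size(); ++i) {
        const OpeningRecord& o = openings[i];
        if (o.games == 0) {
            return i;  // untried openings come first
        }
        const double mean = double(o.wins) / o.games;
        const double bound = std::sqrt(2.0 * std::log(double(total)) / o.games);
        if (mean + bound > bestScore) {
            bestScore = mean + bound;
            best = i;
        }
    }
    return best;
}
```

The "try every opening once" step is what breaks down when there are too many openings to explore, which is why the direct way is not enough here.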

Steamhammer and the protoss building bug

I wrote up Steamhammer’s protoss building problem in March as Steamhammer’s time limit problem with protoss. Since then, I borrowed the smart Locutus idea of limiting the number of gateways to 10, a workaround that nearly eliminated the problem—though it doesn’t solve it. It worked so well that I decided to remove a different workaround, a rule in building placement that many protoss buildings are allowed to touch vertically, blocking horizontal movement. That rule sometimes caused units to get trapped in between buildings and the edge of the map, so it was now causing more harm than good.

Then, in a test game of protoss versus Killerbot by Marian Devecka, the building problem came up again. Steamhammer laid out its base poorly, and no buildings could be added without ordering a new pylon first, which it didn’t know to do. The bot slowed to a crawl, trying over and over again to place buildings that, under its rules, there was no room for. The space powered by pylons was filled up.

Too infuriating! That same evening I put together an elaborate system to solve it, connecting the building manager, the strategy manager, and the information manager. It’s a version of the “absolute minimum” fix that I wrote about before. When the building manager sees that a protoss building location cannot be found, and the building requires pylon power, it sets a flag “stalled for lack of space” and refuses to make any more attempts to place buildings that require pylon power. It can still place a pylon, nexus, or assimilator. The strategy manager recognizes a building manager stall as a production emergency and orders an emergency pylon. Finally, the information manager, which keeps track of units, now keeps a set of our pylons. When a new pylon completes, it notifies the building manager to clear the “stalled for lack of space” flag if it is set.
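Stripped of all detail, the stall logic might be sketched like this. The class and function names are invented for the sketch and the placement search is passed in as a callback; the real code is spread across three managers and is certainly more involved.

```cpp
#include <functional>

// Simplified stand-in for the building manager's stall flag.
class BuildingManagerSketch {
public:
    using PlacementSearch = std::function<bool()>;

    // Returns true if a location was found. While stalled, buildings that
    // need pylon power are refused without running the expensive search.
    bool tryPlace(bool needsPylonPower, const PlacementSearch& search) {
        if (needsPylonPower && stalled_) {
            return false;
        }
        const bool found = search();
        if (!found && needsPylonPower) {
            stalled_ = true;  // "stalled for lack of space"
        }
        return found;
    }

    bool isStalled() const { return stalled_; }

    // Called by the information manager when one of our pylons completes.
    void onPylonCompleted() { stalled_ = false; }

private:
    bool stalled_ = false;
};

// The strategy manager treats a stall as a production emergency.
bool needEmergencyPylon(const BuildingManagerSketch& buildings) {
    return buildings.isStalled();
}
```

The point of the flag is that a stalled manager skips the search entirely, which is where the time was going. Pylons, nexuses, and assimilators pass `needsPylonPower = false` and are still placed.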

The system seems to work. As one test, I made an opening which ordered 20 forges in a row. It was fun to write "20 x forge". Everything worked as designed: The building manager stalled each time it came to a forge that could not be placed. In the stalled state it does little work, and there was no slowdown. The strategy manager saw the stall and ordered an emergency pylon, which the building manager started. The information manager saw the pylon finish and told the building manager to unstall, and construction proceeded. It was not too efficient in terms of game play, but nothing broke or froze or ran over the time limit. And it worked repeatedly until all 20 forges were made, and the bot switched into normal play. Still, it would be surprising if such a complicated arrangement didn’t cause any bugs....

The problem to solve is a slowdown, and this change is an optimization that speeds things up: Technically it is a fix, not a workaround. But it is not a complete fix. Building placement can fail due to bugs, rather than lack of space in range of pylon power, and this fix doesn’t take that into account. A building placement bug could cause unnecessary stalls. Also I’m not convinced it will always play nice with the “oh, hey, let’s choose here as my new main base” code. I need to put a lot more work into the building code to make it fast and reliable, and then to teach it more sensible building layout.

The next version is almost finished—nominally. I need to complete some coding in the opponent model and batter it with heavy tests until it doesn’t fall down, and that is all that is in the plan. The problem, to change metaphors, is that every time I kick the ball, the goal moves away, carried by creeping features.

SAIL map balance

Here’s a new table I haven’t generated before, at least not in this form: Map balance for each race, from the 20000 SAIL games. The left-side columns give win rates for the 3 matchups. The right-side columns give game counts and win rates for each race in all non-mirror matchups. (A mirror matchup always has 50% win rate, one winner and one loser of the same race, so including those would only pull the numbers closer to 50%.)

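| map | TvZ | ZvP | PvT | T games | T % | P games | P % | Z games | Z % | R games | R % |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Benzene | 50% | 54% | 48% | 516 | 51% | 654 | 48% | 657 | 51% | 231 | 50% |
| Destination | 46% | 49% | 50% | 498 | 49% | 652 | 50% | 627 | 52% | 205 | 45% |
| HeartbreakRidge | 43% | 55% | 41% | 533 | 50% | 588 | 44% | 637 | 56% | 222 | 50% |
| NeoMoonGlaive | 40% | 51% | 48% | 546 | 46% | 652 | 49% | 656 | 53% | 232 | 53% |
| TauCross | 47% | 48% | 48% | 529 | 49% | 668 | 51% | 678 | 51% | 229 | 47% |
| Andromeda | 44% | 49% | 51% | 506 | 47% | 627 | 52% | 646 | 52% | 223 | 47% |
| CircuitBreaker | 43% | 51% | 52% | 550 | 46% | 693 | 51% | 692 | 52% | 233 | 49% |
| EmpireoftheSun | 47% | 52% | 48% | 492 | 50% | 612 | 49% | 613 | 52% | 227 | 49% |
| FightingSpirit | 45% | 52% | 57% | 535 | 45% | 676 | 52% | 694 | 52% | 217 | 49% |
| Icarus | 42% | 54% | 47% | 543 | 48% | 600 | 47% | 650 | 54% | 207 | 53% |
| Jade | 44% | 55% | 50% | 519 | 49% | 611 | 48% | 655 | 54% | 211 | 46% |
| LaMancha | 48% | 51% | 43% | 501 | 52% | 619 | 47% | 613 | 52% | 213 | 47% |
| Python | 44% | 49% | 50% | 521 | 47% | 624 | 50% | 642 | 52% | 197 | 51% |
| Roadrunner | 47% | 46% | 50% | 557 | 49% | 648 | 52% | 716 | 51% | 231 | 42% |
| overall | 45% | 51% | 49% | 7346 | 48% | 8924 | 49% | 9176 | 52% | 3078 | 48% |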

Compare this to the race balance tables posted a couple days ago. Overall, there are no giant imbalances; the largest imbalance is 40%-60%, a 3:2 win rate for zerg over terran on Neo Moon Glaive. Terran has a small but consistent disadvantage against zerg on all but one map (Benzene came out even). That is a genuine race imbalance in current bot play: Z > T. But averaged across all the maps, the imbalance is only 55% zerg to 45% terran, 11:9, not a value to complain about. You can look back at the older tables to see which bots are responsible for the imbalance. By no means are all terran bots worse against zerg, or all zerg better against terran, but Krasi0 and Iron do show the pattern, and so does Steamhammer.

The other matchups favor one side on some maps and the other side on others, but approximately balance out on average. Past investigation showed that the map race balance for pro players and for bot players was pretty much uncorrelated in AIIDE 2015: See comparing pro and bot balance. I expect that it is still true. Bots have not gotten much better at exploiting map features.

SAIL bots and maps

More SAIL data: How well each bot performed on each map.

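| bot | games | overall | Benzen | Destin | Heartb | NeoMoo | TauCro | Androm | Circui | Empire | Fighti | Icarus | Jade | LaManc | Python | Roadru |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 100382319 | 40 | 0.00% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | - | 0% | 0% |
| AILien | 484 | 66.32% | 50% | 52% | 66% | 70% | 64% | 69% | 76% | 62% | 76% | 61% | 73% | 69% | 64% | 71% |
| Alice | 233 | 19.31% | 20% | 7% | 16% | 24% | 24% | 40% | 9% | 24% | 19% | 14% | 23% | 13% | 15% | 28% |
| AndrewSmith | 479 | 66.81% | 56% | 67% | 74% | 62% | 69% | 68% | 77% | 79% | 51% | 79% | 61% | 59% | 64% | 71% |
| AndreyKurdiumov | 464 | 69.83% | 61% | 54% | 71% | 79% | 75% | 67% | 69% | 71% | 67% | 71% | 74% | 76% | 77% | 70% |
| Antiga | 472 | 70.97% | 73% | 74% | 69% | 70% | 66% | 60% | 72% | 86% | 74% | 69% | 57% | 62% | 78% | 86% |
| Arrakhammer | 467 | 65.31% | 62% | 56% | 68% | 70% | 73% | 59% | 64% | 64% | 61% | 62% | 75% | 65% | 66% | 73% |
| AurelienLermant | 472 | 31.36% | 38% | 45% | 35% | 18% | 35% | 24% | 18% | 25% | 32% | 37% | 31% | 32% | 38% | 36% |
| BananaBrain | 469 | 73.99% | 61% | 66% | 71% | 67% | 85% | 70% | 79% | 78% | 76% | 78% | 70% | 78% | 81% | 70% |
| Bereaver | 501 | 71.06% | 88% | 75% | 66% | 85% | 62% | 74% | 79% | 63% | 78% | 80% | 56% | 51% | 56% | 80% |
| BlackCrow | 459 | 62.09% | 62% | 45% | 63% | 75% | 52% | 56% | 59% | 74% | 56% | 63% | 73% | 62% | 55% | 73% |
| BryanWeber | 460 | 13.48% | 19% | 3% | 18% | 14% | 8% | 16% | 17% | 8% | 12% | 19% | 17% | 30% | 8% | 7% |
| CarstenNielsen | 467 | 56.32% | 37% | 55% | 58% | 44% | 63% | 58% | 50% | 70% | 56% | 51% | 50% | 56% | 70% | 68% |
| CasiaBot | 458 | 51.31% | 55% | 58% | 52% | 50% | 48% | 48% | 41% | 55% | 51% | 48% | 62% | 52% | 50% | 46% |
| CherryPi | 482 | 76.56% | 84% | 88% | 74% | 73% | 83% | 82% | 67% | 85% | 78% | 76% | 69% | 83% | 70% | 67% |
| ChrisCoxe | 476 | 70.38% | 56% | 64% | 80% | 83% | 62% | 67% | 66% | 55% | 61% | 80% | 79% | 69% | 88% | 72% |
| Cimex | 41 | 17.07% | 0% | 20% | 0% | - | 50% | 0% | 0% | 0% | 50% | 0% | 33% | 33% | 0% | 33% |
| cpac | 6 | 66.67% | 100% | - | 67% | - | - | - | - | 100% | - | - | 0% | - | - | - |
| CruzBot | 476 | 19.12% | 18% | 26% | 18% | 26% | 18% | 26% | 14% | 19% | 11% | 26% | 10% | 12% | 17% | 24% |
| DAIDOES | 477 | 27.25% | 21% | 26% | 12% | 8% | 32% | 37% | 33% | 27% | 40% | 26% | 29% | 26% | 30% | 37% |
| DaveChurchill | 474 | 60.97% | 63% | 59% | 64% | 69% | 59% | 65% | 58% | 65% | 63% | 63% | 63% | 49% | 48% | 63% |
| DawidLoranc | 458 | 43.01% | 49% | 57% | 34% | 57% | 33% | 54% | 27% | 57% | 39% | 44% | 42% | 38% | 44% | 32% |
| Ecgberht | 472 | 61.23% | 57% | 66% | 52% | 64% | 56% | 65% | 62% | 60% | 74% | 55% | 65% | 71% | 48% | 67% |
| Flash | 492 | 60.37% | 63% | 61% | 48% | 63% | 68% | 78% | 50% | 50% | 70% | 62% | 53% | 53% | 61% | 61% |
| FlorianRichoux | 462 | 36.36% | 58% | 25% | 43% | 31% | 26% | 41% | 36% | 31% | 32% | 34% | 45% | 30% | 41% | 40% |
| ForceBot | 403 | 48.88% | 55% | 57% | 40% | 41% | 43% | 52% | 45% | 56% | 49% | 37% | 58% | 55% | 42% | 55% |
| GaoyuanChen | 481 | 44.49% | 29% | 46% | 40% | 51% | 42% | 45% | 50% | 53% | 51% | 48% | 41% | 39% | 44% | 39% |
| Goliat | 43 | 11.63% | 0% | 0% | 0% | 25% | 20% | 0% | 33% | 0% | 0% | 14% | 0% | 0% | 100% | 0% |
| GuiBot | 50 | 32.00% | 0% | 0% | 50% | 50% | 33% | 33% | 20% | 43% | 100% | 20% | 40% | 67% | 0% | 50% |
| HannesBredberg | 443 | 31.15% | 32% | 40% | 44% | 37% | 22% | 33% | 32% | 23% | 28% | 29% | 35% | 45% | 21% | 19% |
| HOLDZ | 340 | 20.00% | 6% | 12% | 18% | 30% | 20% | 9% | 33% | 17% | 24% | 24% | 24% | 15% | 35% | 14% |
| ICELab | 481 | 64.03% | 52% | 67% | 56% | 59% | 57% | 78% | 57% | 71% | 65% | 72% | 76% | 73% | 53% | 61% |
| Ironbot | 485 | 89.48% | 94% | 81% | 92% | 91% | 91% | 95% | 88% | 95% | 81% | 90% | 87% | 84% | 94% | 89% |
| JakubTrancik | 481 | 34.93% | 54% | 35% | 37% | 38% | 35% | 48% | 38% | 9% | 21% | 39% | 40% | 42% | 25% | 26% |
| JohanKayser | 486 | 17.28% | 14% | 14% | 10% | 36% | 21% | 29% | 21% | 15% | 11% | 14% | 16% | 4% | 24% | 6% |
| Juno | 4 | 50.00% | - | - | - | - | 50% | 0% | 100% | - | - | - | - | - | - | - |
| KaonBot | 460 | 31.30% | 39% | 28% | 34% | 32% | 44% | 15% | 29% | 33% | 40% | 24% | 30% | 38% | 22% | 31% |
| KillAlll | 481 | 53.01% | 45% | 63% | 55% | 57% | 52% | 44% | 56% | 64% | 59% | 45% | 56% | 59% | 39% | 45% |
| Korean | 478 | 20.50% | 22% | 18% | 32% | 27% | 25% | 0% | 31% | 0% | 26% | 33% | 21% | 21% | 0% | 31% |
| krasi0 | 495 | 94.34% | 95% | 94% | 94% | 89% | 95% | 90% | 90% | 97% | 97% | 94% | 97% | 100% | 93% | 97% |
| Kruecke | 41 | 24.39% | 0% | 0% | 0% | 33% | 0% | 33% | 67% | 50% | 0% | 33% | 0% | 50% | 0% | 50% |
| Locutus | 321 | 73.21% | 80% | 82% | 77% | 59% | 78% | 74% | 71% | 68% | 85% | 60% | 81% | 61% | 77% | 76% |
| LukasMoravec | 472 | 31.99% | 45% | 26% | 20% | 38% | 28% | 32% | 38% | 30% | 29% | 30% | 37% | 23% | 39% | 29% |
| MadMixP | 484 | 48.35% | 45% | 58% | 48% | 46% | 41% | 36% | 52% | 38% | 42% | 50% | 59% | 61% | 53% | 46% |
| MadMixT | 452 | 30.75% | 29% | 29% | 28% | 24% | 31% | 32% | 25% | 31% | 33% | 28% | 28% | 46% | 34% | 28% |
| MadMixZ | 471 | 32.06% | 40% | 23% | 26% | 27% | 38% | 39% | 46% | 26% | 41% | 33% | 19% | 19% | 31% | 29% |
| MarekKadek | 472 | 20.13% | 12% | 21% | 17% | 12% | 25% | 21% | 26% | 29% | 10% | 24% | 26% | 15% | 19% | 30% |
| MarianDevecka | 463 | 83.59% | 85% | 85% | 82% | 91% | 77% | 78% | 77% | 84% | 87% | 89% | 93% | 86% | 76% | 84% |
| MarineHell | 483 | 23.60% | 25% | 15% | 25% | 21% | 23% | 35% | 19% | 21% | 28% | 16% | 14% | 31% | 33% | 21% |
| MartinRooijackers | 455 | 67.91% | 67% | 70% | 71% | 69% | 72% | 77% | 64% | 66% | 64% | 66% | 69% | 72% | 68% | 59% |
| MatejIstenik | 465 | 31.61% | 32% | 39% | 29% | 31% | 50% | 33% | 25% | 38% | 23% | 42% | 16% | 25% | 36% | 26% |
| MegaBot2017 | 442 | 49.32% | 42% | 58% | 41% | 47% | 56% | 53% | 45% | 63% | 61% | 45% | 26% | 57% | 43% | 57% |
| Microwave | 466 | 76.82% | 86% | 72% | 69% | 78% | 78% | 87% | 88% | 68% | 80% | 66% | 79% | 84% | 71% | 70% |
| MiddleSchoolStrats | 288 | 48.96% | 39% | 67% | 67% | 50% | 55% | 43% | 54% | 39% | 39% | 41% | 43% | 43% | 59% | 38% |
| Myscbot | 473 | 34.04% | 57% | 60% | 31% | 25% | 27% | 24% | 34% | 36% | 46% | 6% | 35% | 29% | 27% | 44% |
| NeoEdmundZerg | 475 | 73.05% | 80% | 74% | 64% | 76% | 69% | 81% | 66% | 70% | 69% | 75% | 78% | 81% | 80% | 57% |
| NielsJustesen | 315 | 31.43% | 19% | 22% | 22% | 50% | 24% | 46% | 32% | 30% | 32% | 30% | 39% | 36% | 25% | 40% |
| NiteKatP | 458 | 33.84% | 29% | 36% | 41% | 27% | 45% | 24% | 34% | 41% | 36% | 18% | 27% | 55% | 28% | 29% |
| NiteKatT | 458 | 56.11% | 50% | 80% | 55% | 45% | 62% | 53% | 58% | 48% | 58% | 60% | 61% | 35% | 62% | 63% |
| NLPRbot | 460 | 62.83% | 66% | 70% | 63% | 56% | 65% | 57% | 85% | 70% | 54% | 63% | 67% | 48% | 63% | 59% |
| NUSBot | 500 | 23.60% | 23% | 34% | 0% | 34% | 27% | 31% | 21% | 22% | 31% | 11% | 37% | 23% | 29% | 8% |
| OpprimoBot | 308 | 13.31% | 12% | 6% | 11% | 9% | 12% | 8% | 22% | 13% | 21% | 19% | 10% | 19% | 13% | 14% |
| PeregrineBot | 471 | 45.01% | 39% | 48% | 34% | 48% | 52% | 52% | 47% | 50% | 35% | 50% | 47% | 36% | 49% | 45% |
| PineappleCactus | 490 | 38.78% | 21% | 27% | 46% | 31% | 45% | 40% | 39% | 45% | 27% | 40% | 43% | 55% | 36% | 48% |
| PurpleSpirit | 441 | 50.34% | 48% | 48% | 53% | 50% | 47% | 51% | 44% | 60% | 41% | 61% | 35% | 57% | 68% | 44% |
| PurpleSwarm | 451 | 69.84% | 79% | 59% | 70% | 65% | 67% | 69% | 84% | 70% | 74% | 76% | 69% | 60% | 76% | 57% |
| PurpleWave | 456 | 80.04% | 69% | 78% | 75% | 85% | 87% | 82% | 82% | 74% | 82% | 85% | 81% | 80% | 82% | 74% |
| Randomhammer | 496 | 55.65% | 59% | 68% | 64% | 69% | 53% | 57% | 45% | 56% | 50% | 59% | 50% | 42% | 74% | 42% |
| RomanDanielis | 489 | 32.92% | 24% | 29% | 17% | 33% | 39% | 42% | 30% | 25% | 38% | 32% | 42% | 21% | 47% | 36% |
| SijiaXu | 479 | 57.83% | 57% | 51% | 56% | 61% | 59% | 54% | 53% | 60% | 57% | 53% | 61% | 54% | 64% | 68% |
| SimonPrins | 446 | 72.20% | 79% | 79% | 61% | 69% | 57% | 70% | 82% | 64% | 74% | 63% | 76% | 73% | 73% | 84% |
| Sling | 509 | 26.52% | 32% | 22% | 29% | 28% | 17% | 33% | 25% | 22% | 40% | 11% | 22% | 30% | 25% | 33% |
| SoerenKlett | 455 | 49.01% | 55% | 38% | 49% | 39% | 44% | 47% | 55% | 50% | 42% | 61% | 53% | 27% | 56% | 58% |
| Sparks | 474 | 35.23% | 35% | 28% | 37% | 45% | 42% | 38% | 33% | 41% | 32% | 26% | 31% | 33% | 32% | 42% |
| SRbotOne | 4 | 25.00% | - | 0% | - | 0% | 0% | - | - | - | - | - | - | - | 100% | - |
| Steamhammer | 466 | 76.18% | 84% | 72% | 81% | 65% | 70% | 86% | 67% | 81% | 81% | 82% | 73% | 71% | 76% | 78% |
| Stone | 481 | 53.22% | 68% | 64% | 69% | 40% | 44% | 39% | 45% | 57% | 55% | 59% | 56% | 48% | 63% | 51% |
| SunggukCha | 463 | 35.42% | 38% | 31% | 37% | 35% | 28% | 36% | 31% | 38% | 41% | 36% | 38% | 39% | 29% | 38% |
| TomasCere | 485 | 36.91% | 38% | 30% | 19% | 31% | 28% | 34% | 53% | 39% | 30% | 34% | 56% | 45% | 44% | 31% |
| TomasVajda | 459 | 72.11% | 55% | 90% | 74% | 71% | 65% | 72% | 78% | 79% | 69% | 74% | 73% | 73% | 73% | 63% |
| TravisShelton | 536 | 12.50% | 19% | 10% | 5% | 13% | 9% | 17% | 12% | 17% | 5% | 18% | 14% | 18% | 16% | 7% |
| tscmoo | 489 | 72.39% | 80% | 77% | 74% | 63% | 77% | 69% | 73% | 68% | 82% | 76% | 63% | 79% | 68% | 67% |
| tscmoop | 468 | 75.64% | 82% | 73% | 76% | 79% | 81% | 73% | 83% | 66% | 78% | 75% | 57% | 76% | 69% | 86% |
| tscmoor | 485 | 74.85% | 84% | 77% | 76% | 80% | 72% | 71% | 78% | 76% | 84% | 67% | 68% | 66% | 87% | 61% |
| tscmooz | 446 | 54.48% | 46% | 57% | 53% | 45% | 61% | 61% | 62% | 59% | 52% | 59% | 50% | 57% | 48% | 54% |
| TyrProtoss | 477 | 67.51% | 82% | 43% | 57% | 62% | 75% | 67% | 72% | 86% | 72% | 71% | 68% | 63% | 57% | 70% |
| UC3ManoloBot | 36 | 2.78% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 50% | 0% | 0% | 0% | 0% | 0% | 0% |
| UPStarCraftAI2016 | 451 | 39.25% | 35% | 37% | 52% | 39% | 54% | 33% | 31% | 33% | 50% | 50% | 30% | 36% | 32% | 45% |
| WillBot | 521 | 44.53% | 49% | 43% | 39% | 50% | 40% | 38% | 57% | 43% | 52% | 49% | 38% | 45% | 33% | 50% |
| WillyT | 140 | 42.86% | 60% | 45% | 29% | 60% | 100% | 43% | 0% | 22% | 21% | 50% | 50% | 78% | 31% | 50% |
| WuliBot | 486 | 72.02% | 65% | 72% | 79% | 78% | 76% | 79% | 67% | 57% | 73% | 66% | 74% | 78% | 66% | 74% |
| Xelnaga | 456 | 32.24% | 26% | 43% | 40% | 31% | 32% | 18% | 24% | 38% | 42% | 25% | 33% | 28% | 44% | 20% |
| YuanhengZhu | 488 | 37.91% | 28% | 24% | 28% | 40% | 38% | 36% | 46% | 37% | 41% | 60% | 46% | 24% | 36% | 46% |
| Zercgberht | 471 | 35.03% | 32% | 48% | 32% | 40% | 31% | 29% | 23% | 37% | 31% | 28% | 28% | 49% | 40% | 39% |
| Ziabot | 478 | 53.97% | 50% | 46% | 70% | 39% | 65% | 53% | 57% | 51% | 57% | 50% | 52% | 48% | 62% | 65% |
| ZurZurZur | 483 | 61.08% | 61% | 60% | 73% | 73% | 58% | 61% | 55% | 61% | 62% | 67% | 67% | 53% | 58% | 52% |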

Many bots show dramatic performance differences between maps. The differences definitely mean something, even though I often find it hard to say what. What in your bot’s play interacts well or poorly with the map features? If your bot performs poorly on a given map, maybe it should choose a different strategy there. But it’s possible that the reason has to do with specific opponents, and you have to dig into details to find the answer.

Steamhammer’s results surprised me. I think of Heartbreak Ridge as Steamhammer’s worst map, but these numbers say it is a good map. After I fix the map block issues on the map, which will probably be in the version after next, maybe Heartbreak Ridge will become Steamhammer’s best map. In any case, the next version will try to adjust its play for each opponent according to the map, once it accumulates enough data from experience. It will be interesting to see whether that successfully evens out the performance across maps.

Next: Map balance.

SAIL race balance tables

SAIL is still down. I decided to analyze the game data anyway. I grabbed their file of the last 20,000 game results and modified my tournament result analyzer to handle it.

Here is the overall race balance. It’s nice and even. Terran has a little trouble against zerg, and random has a little trouble against protoss, but it’s all within expectations. Of course this averages together bots of all skill levels (at first I typed “kill levels,” which I guess means the same thing). We know that there’s a good mix of participants of each race, so these numbers mean that none of the races is finding its job much easier or harder than the others.

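| | vT | vP | vZ | vR |
|---|---|---|---|---|
| terran | - | 51% | 45% | 52% |
| protoss | 49% | - | 49% | 54% |
| zerg | 55% | 51% | - | 50% |
| random | 48% | 46% | 50% | - |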
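The tally behind a table like this is short. Below is a sketch under the assumption that each game result has been reduced to a winner race and a loser race. The `GameResult` record is hypothetical and does not reflect the actual SAIL file format or my analyzer.

```cpp
#include <array>
#include <vector>

enum Race { Terran = 0, Protoss = 1, Zerg = 2, Random = 3 };

// Hypothetical minimal record of one game.
struct GameResult {
    Race winner;
    Race loser;
};

struct Tally {
    int wins = 0;
    int games = 0;
};

// table[a][b] holds race a's record against race b.
using BalanceTable = std::array<std::array<Tally, 4>, 4>;

BalanceTable tallyRaceBalance(const std::vector<GameResult>& results) {
    BalanceTable table{};
    for (const GameResult& g : results) {
        if (g.winner == g.loser) {
            continue;  // mirror matchups are always 50% and carry no information
        }
        table[g.winner][g.loser].wins += 1;
        table[g.winner][g.loser].games += 1;
        table[g.loser][g.winner].games += 1;
    }
    return table;
}

// Win rate in percent, or -1 when there are no games to report.
double winPercent(const Tally& t) {
    return t.games > 0 ? 100.0 * t.wins / t.games : -1.0;
}
```

Each off-diagonal pair of cells sums to 100%, which is a handy sanity check on the table above.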

The table of how each bot performs against the different races is more interesting. This is in alphabetical order, so the number on the left is just to show how many there are. There are few random players, so that column is less informative. (Of course on SAIL, “random” only means that both players learn the bot’s race when the game starts.)

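| # | bot | race | games | overall | vT | vP | vZ | vR |
|---|---|---|---|---|---|---|---|---|
| 1 | 100382319 | terran | 40 | 0.00% | 0% | 0% | 0% | 0% |
| 2 | AILien | zerg | 484 | 66.32% | 64% | 65% | 66% | 80% |
| 3 | Alice | zerg | 233 | 19.31% | 18% | 22% | 14% | 32% |
| 4 | AndrewSmith | protoss | 479 | 66.81% | 65% | 72% | 63% | 71% |
| 5 | AndreyKurdiumov | random | 464 | 69.83% | 65% | 70% | 74% | 69% |
| 6 | Antiga | protoss | 472 | 70.97% | 60% | 62% | 83% | 79% |
| 7 | Arrakhammer | zerg | 467 | 65.31% | 69% | 59% | 71% | 53% |
| 8 | AurelienLermant | zerg | 472 | 31.36% | 36% | 38% | 16% | 48% |
| 9 | BananaBrain | protoss | 469 | 73.99% | 69% | 68% | 82% | 80% |
| 10 | Bereaver | protoss | 501 | 71.06% | 73% | 77% | 64% | 72% |
| 11 | BlackCrow | zerg | 459 | 62.09% | 59% | 61% | 68% | 55% |
| 12 | BryanWeber | zerg | 460 | 13.48% | 19% | 14% | 9% | 14% |
| 13 | CarstenNielsen | protoss | 467 | 56.32% | 50% | 57% | 58% | 65% |
| 14 | CasiaBot | zerg | 458 | 51.31% | 62% | 39% | 61% | 24% |
| 15 | CherryPi | zerg | 482 | 76.56% | 75% | 74% | 85% | 57% |
| 16 | ChrisCoxe | zerg | 476 | 70.38% | 79% | 69% | 64% | 79% |
| 17 | Cimex | zerg | 41 | 17.07% | 8% | 21% | 17% | 33% |
| 18 | cpac | zerg | 6 | 66.67% | 100% | 33% | - | - |
| 19 | CruzBot | protoss | 476 | 19.12% | 24% | 26% | 8% | 18% |
| 20 | DAIDOES | protoss | 477 | 27.25% | 17% | 29% | 33% | 25% |
| 21 | DaveChurchill | random | 474 | 60.97% | 57% | 54% | 67% | 76% |
| 22 | DawidLoranc | zerg | 458 | 43.01% | 46% | 43% | 39% | 46% |
| 23 | Ecgberht | terran | 472 | 61.23% | 59% | 68% | 55% | 68% |
| 24 | Flash | protoss | 492 | 60.37% | 45% | 64% | 67% | 61% |
| 25 | FlorianRichoux | protoss | 462 | 36.36% | 43% | 38% | 26% | 54% |
| 26 | ForceBot | zerg | 403 | 48.88% | 44% | 52% | 50% | 46% |
| 27 | GaoyuanChen | protoss | 481 | 44.49% | 50% | 46% | 39% | 45% |
| 28 | Goliat | terran | 43 | 11.63% | 12% | 18% | 0% | 0% |
| 29 | GuiBot | protoss | 50 | 32.00% | 36% | 28% | 45% | 0% |
| 30 | HannesBredberg | terran | 443 | 31.15% | 37% | 31% | 21% | 53% |
| 31 | HOLDZ | zerg | 340 | 20.00% | 33% | 22% | 7% | 26% |
| 32 | ICELab | terran | 481 | 64.03% | 73% | 79% | 43% | 71% |
| 33 | Ironbot | terran | 485 | 89.48% | 88% | 96% | 85% | 88% |
| 34 | JakubTrancik | protoss | 481 | 34.93% | 43% | 41% | 22% | 46% |
| 35 | JohanKayser | terran | 486 | 17.28% | 12% | 23% | 17% | 12% |
| 36 | Juno | protoss | 4 | 50.00% | - | 100% | 0% | 100% |
| 37 | KaonBot | terran | 460 | 31.30% | 37% | 38% | 22% | 32% |
| 38 | KillAlll | zerg | 481 | 53.01% | 59% | 64% | 37% | 54% |
| 39 | Korean | zerg | 478 | 20.50% | 25% | 30% | 7% | 29% |
| 40 | krasi0 | terran | 495 | 94.34% | 95% | 98% | 91% | 91% |
| 41 | Kruecke | terran | 41 | 24.39% | 43% | 31% | 0% | 0% |
| 42 | Locutus | protoss | 321 | 73.21% | 79% | 66% | 77% | 73% |
| 43 | LukasMoravec | protoss | 472 | 31.99% | 40% | 29% | 29% | 40% |
| 44 | MadMixP | protoss | 484 | 48.35% | 50% | 46% | 47% | 62% |
| 45 | MadMixT | terran | 452 | 30.75% | 46% | 28% | 22% | 29% |
| 46 | MadMixZ | zerg | 471 | 32.06% | 36% | 32% | 31% | 24% |
| 47 | MarekKadek | terran | 472 | 20.13% | 12% | 17% | 29% | 22% |
| 48 | MarianDevecka | zerg | 463 | 83.59% | 87% | 85% | 85% | 69% |
| 49 | MarineHell | terran | 483 | 23.60% | 13% | 29% | 22% | 29% |
| 50 | MartinRooijackers | terran | 455 | 67.91% | 52% | 81% | 69% | 61% |
| 51 | MatejIstenik | terran | 465 | 31.61% | 33% | 31% | 29% | 41% |
| 52 | MegaBot2017 | protoss | 442 | 49.32% | 60% | 53% | 38% | 47% |
| 53 | Microwave | zerg | 466 | 76.82% | 76% | 77% | 76% | 84% |
| 54 | MiddleSchoolStrats | zerg | 288 | 48.96% | 62% | 42% | 47% | 56% |
| 55 | Myscbot | protoss | 473 | 34.04% | 22% | 31% | 44% | 34% |
| 56 | NeoEdmundZerg | zerg | 475 | 73.05% | 74% | 68% | 74% | 83% |
| 57 | NielsJustesen | protoss | 315 | 31.43% | 35% | 30% | 29% | 34% |
| 58 | NiteKatP | protoss | 458 | 33.84% | 33% | 19% | 49% | 34% |
| 59 | NiteKatT | terran | 458 | 56.11% | 63% | 56% | 51% | 62% |
| 60 | NLPRbot | zerg | 460 | 62.83% | 82% | 45% | 61% | 70% |
| 61 | NUSBot | protoss | 500 | 23.60% | 40% | 18% | 16% | 25% |
| 62 | OpprimoBot | random | 308 | 13.31% | 14% | 18% | 7% | 13% |
| 63 | PeregrineBot | zerg | 471 | 45.01% | 55% | 48% | 39% | 28% |
| 64 | PineappleCactus | zerg | 490 | 38.78% | 40% | 34% | 41% | 42% |
| 65 | PurpleSpirit | terran | 441 | 50.34% | 58% | 47% | 50% | 46% |
| 66 | PurpleSwarm | zerg | 451 | 69.84% | 60% | 69% | 79% | 61% |
| 67 | PurpleWave | protoss | 456 | 80.04% | 73% | 84% | 82% | 74% |
| 68 | Randomhammer | random | 496 | 55.65% | 62% | 42% | 66% | 33% |
| 69 | RomanDanielis | protoss | 489 | 32.92% | 27% | 33% | 34% | 42% |
| 70 | SijiaXu | zerg | 479 | 57.83% | 56% | 59% | 60% | 49% |
| 71 | SimonPrins | terran | 446 | 72.20% | 61% | 79% | 74% | 74% |
| 72 | Sling | zerg | 509 | 26.52% | 15% | 34% | 25% | 37% |
| 73 | SoerenKlett | terran | 455 | 49.01% | 58% | 55% | 36% | 56% |
| 74 | Sparks | terran | 474 | 35.23% | 42% | 38% | 28% | 38% |
| 75 | SRbotOne | terran | 4 | 25.00% | 100% | 0% | 0% | - |
| 76 | Steamhammer | zerg | 466 | 76.18% | 85% | 70% | 80% | 58% |
| 77 | Stone | terran | 481 | 53.22% | 34% | 65% | 55% | 53% |
| 78 | SunggukCha | terran | 463 | 35.42% | 54% | 31% | 25% | 42% |
| 79 | TomasCere | protoss | 485 | 36.91% | 47% | 38% | 29% | 37% |
| 80 | TomasVajda | protoss | 459 | 72.11% | 62% | 81% | 72% | 74% |
| 81 | TravisShelton | random | 536 | 12.50% | 13% | 11% | 13% | 10% |
| 82 | tscmoo | terran | 489 | 72.39% | 83% | 58% | 80% | 72% |
| 83 | tscmoop | protoss | 468 | 75.64% | 78% | 76% | 73% | 78% |
| 84 | tscmoor | random | 485 | 74.85% | 81% | 74% | 68% | 91% |
| 85 | tscmooz | zerg | 446 | 54.48% | 71% | 53% | 42% | 61% |
| 86 | TyrProtoss | protoss | 477 | 67.51% | 53% | 63% | 78% | 97% |
| 87 | UC3ManoloBot | terran | 36 | 2.78% | 10% | 0% | 0% | 0% |
| 88 | UPStarCraftAI2016 | zerg | 451 | 39.25% | 49% | 46% | 27% | 34% |
| 89 | WillBot | random | 521 | 44.53% | 39% | 50% | 45% | 38% |
| 90 | WillyT | terran | 140 | 42.86% | 37% | 44% | 48% | 33% |
| 91 | WuliBot | protoss | 486 | 72.02% | 58% | 77% | 76% | 78% |
| 92 | Xelnaga | protoss | 456 | 32.24% | 38% | 36% | 25% | 30% |
| 93 | YuanhengZhu | protoss | 488 | 37.91% | 44% | 41% | 28% | 51% |
| 94 | Zercgberht | zerg | 471 | 35.03% | 37% | 39% | 31% | 30% |
| 95 | Ziabot | zerg | 478 | 53.97% | 51% | 51% | 60% | 50% |
| 96 | ZurZurZur | zerg | 483 | 61.08% | 63% | 53% | 68% | 59% |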

A lot of interesting details can be seen in which bot is better against which race. Iron does well against all races, but smashes protoss into the ground. I find it surprising that MarineHell scores better against protoss. Zerg KillAlll has a striking pattern in that it does well against terran and protoss, and can’t cope with another zerg. Zerg NLPRbot does much better against terran.

I’m pleased with Steamhammer’s numbers. In the last AIIDE tournament, Steamhammer played well against zerg, but unconvincingly against terran and protoss. This version looks good against all races, though still weaker against protoss.

Next: Data broken down by map.

both DLL and EXE?

A question for bot authors.

One of the changes Dave Churchill made to UAlbertaBot after I forked off Steamhammer was to make it work either as a DLL with direct access, or as a separate EXE. For running in a competition I think the DLL is more natural, because it should have lower overhead. But there are reasons to prefer the EXE. Dave Churchill said that it made his debugging much easier, and he strongly recommended the change.

For my debugging style, I don’t see an advantage. I could borrow code from UAlbertaBot, and unless I’m missing something it wouldn’t be much trouble to implement. Nevertheless, I haven’t been in a hurry. But maybe I don’t get it, or maybe I am some kind of freak. So my question is:

Do you see it as a useful feature? I expect I could implement it for the version after next, if people want it.

Update: After painstaking examination of the quantity and tenor of the comments, spending entire seconds of deep concentration on the task, I conclude that there is other than a groundswell of support. Steamhammer will stay DLL only.

no more enemy-specific strategies for Steamhammer

Working on the opponent model today, I made one of the key changes for the next version:

    "UseEnemySpecificStrategy" : false,
    "EnemySpecificStrategy" :
    {
    },

No more openings hand-configured for known opponents. Steamhammer has to figure out everything on its own. I’ve been working toward this for a long time, and it’s good to finally take the step. I expect play to become more varied—Steamhammer is likely to discover surprising solutions for some opponents. Play should also become stronger, especially in tournaments where opponents like to prepare specially against select enemies. They’ll have to look for ways to exploit Steamhammer’s tactical and micro mistakes, because the game plans will be too adaptive.

I also wrote the terran vulture-first recognizer for the plan recognizer today. It recognizes a plan called Factory that can only be followed by terran, and Steamhammer zerg is configured to counter the plan with the AntiFactory opening. Testing against Iron, it worked perfectly: The first game, Iron won easily. The second game, Steamhammer countered and fought back hard (and happened to win). That’s how it’s supposed to work.

The recognizer was easy to write. Maybe I should write a few more recognizers and counters.

Iron should be a good test case, because Iron is strong enough to usually defeat the counter—AntiFactory puts up a tough battle, but still mostly loses. Opening learning success looks like this: Steamhammer realizes that AntiFactory is probably best, though not all that good, and explores other openings sometimes but not too often. I think I should be able to get that right.

Will playing better games against Iron entice voters on SSCAIT? I think it might happen. If so, I will quickly grow bored with similar Iron-Steamhammer games, but stream watchers may be pleased. Iron would likely lose a few elo points to Steamhammer on average, instead of gaining as it does now.

The upcoming version 1.4.2 has important improvements for all races, including some improvements I haven’t mentioned. Strategy, macro, and micro are better. Look forward to higher rankings for Steamhammer and Randomhammer.