Steamhammer is progressing

I estimate that the Steamhammer version active on SSCAIT and BASIL, 3.5.2, is about 50-100 Elo weaker than the previous active version, 3.5. The improvements are outweighed by new issues, the most important of which came from an “optimization” of combat simulation which sometimes fed it stale data. Oops. Advice to all persons: Do not make mistakes, they can hurt.

I fixed that yesterday. The only remaining new weakness (that I can see) is a tendency to sometimes overdo it on the sunkens. I trimmed that back with limits based on additional information. It is less severe than the earlier weakness of forgetting the sunkens or leaving them until too late, and there are other improvements besides. I have the strongest Steamhammer yet.

I have time for more improvements. Early signs for Steamhammer in AIIDE 2021 look good, provided I can follow my own advice to all persons.

Comments

Dilyan on Wednesday, September 8. 2021:

Hype, hype, hype! GL

Tully Elliston on Thursday, September 9. 2021:

Good luck!

Have you ever crunched the numbers on the average game length of games won versus games lost?

The same calculation for each race matchup, and for each top opponent, might be illustrative. Depending on the number of games, map choice might need to be consistent also, but it would be nice indicator toward what area of play needs work for each match up (losses much shorter than wins = early game. Not much difference = mid game. Losses much longer than wins = late game).
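The comparison described above can be sketched in a few lines. This is only an illustration: the function name `weak_phase`, the `(length_minutes, won)` input format, and the 2-minute margin are assumptions, not anything BASIL or Steamhammer provides.

```python
from statistics import mean

def weak_phase(games, margin_minutes=2.0):
    """Guess which phase needs work from (length_minutes, won) pairs.

    The margin is a hypothetical threshold for "not much difference".
    """
    wins = [length for length, won in games if won]
    losses = [length for length, won in games if not won]
    if not wins or not losses:
        return None  # cannot compare without both wins and losses
    diff = mean(losses) - mean(wins)
    if diff < -margin_minutes:
        return "early game"  # losses much shorter than wins
    if diff > margin_minutes:
        return "late game"   # losses much longer than wins
    return "mid game"        # not much difference

# Made-up sample: two short losses, two longer wins.
print(weak_phase([(5, False), (6, False), (12, True), (14, True)]))
```

Running it separately per race matchup or per opponent only requires filtering the list of games first.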

Jay Scott on Thursday, September 9. 2021:

No, but analyzing game lengths is on my list.

MarcoDBAA on Monday, September 13. 2021:

BASIL gives us that information now btw.

I have chosen games from one year ago (to make it less random) to now for zerg:

Steamhammer win rates peak between 6 and 9 (or 10) minutes
at more than 70%, then it goes down to slightly more than 50%
There is a weakness at 4 and 5 minutes (under 50%)

Other zerg:

Crona peaks at 4 to 6 minutes at around 90%, then "crashes" to 40% or even 30% later.

Monster on the other hand has win rates of around 85% between 4 and 14 minutes (a bit less at 6 and 7 minutes), then it goes down, but only slightly, to between 70 and 75%.

McRaveZ peaks at 4 minutes at 95%, then there is another peak at 10 minutes at 73%.

Microwave peaks at 4 (or 3 but not that many games) to 6 minutes at 75%, then it falls to under or around 50%

Chris Coxe peaks between 3 (or even 2) and 4 minutes at over 95%

CherryPi is really different with an early peak at 4 to 5 minutes (under 70% overall), then it really crashes at 8 and 9 minutes (30% only, while this is peak for SH for example) to rise again to 70% at 15 and 16 minutes. After that it goes down again, but this is less severe.

Proxy has an early peak at 92% at 4 minutes, then there is another at 10 minutes (with many games) at 60%. Win rates never fall much, if you discount the early peak. It looks a bit less rushy than SH, but wins many more 4 minute games on the other hand.

Marian Devecka is similar to McRave, early peak at 4 minutes (85%), then a peak between 11 and 14 minutes at over 60% (many games ending there too). Well, they play similar too.

AILien peak is at 18 minutes at 70%, while it already loses many games before 7 minutes. I always thought, that it is really good, if it manages to go strongly into the late game, very fluid. The winrate is similar to Monster there, but of course AILien already lost against the best opponents at that time, so it is not really comparable.

Simplicity peak is at even 22 minutes at 75%. It also has a bad record in games, that end early.

Fresh Meat peak seems to be at 7 to 8 minutes between 75 and 80% (not many games for that bot).

Also because of game length:
Chris Coxe is the biggest rusher, followed closely by Crona, followed by Microwave, then Steamhammer (and Fresh Meat?), Proxy in the middle, while McRaveZ and Marian Devecka, then Monster and CherryPi, and especially AILien and Simplicity also go for and win in longer games.

P.S: tscmooz (tscmoor is of course random and statistics are not usable) is extremely absurd. Catastrophic losses between 3 and 8 minutes (0% winrate of games ending at that time), then it peaks at 90% at 14 to 16 minutes. The openings enabled in that version are just wrong of course, especially in ZvZ, so that is the reason of course.

I would like to see real statistics here. :)

Jay Scott on Monday, September 13. 2021:

Oh, nice new graph! Squeezing game count and win rate into the same bars makes it a little hard to read, though.

MarcoDBAA on Monday, September 13. 2021:

Think that it makes sense like that.

A bot might have a catastrophic winrate at 4 minutes, but if it did not lose many games, this just means, that it plays a very defensive game, and nothing is wrong at all with its play. And you see these things easily here.

Letabot (again chosen games of the last year) is a good example for this:
https://www.basil-ladder.net/bot.html?bot=Martin%20Rooijackers

Between 3 and 5 minutes the winrate is catastrophic in numbers, but not many games were lost, and therefore the defense worked very well.

Bytekeeper on Monday, September 13. 2021:

Thanks! Basically triggered by you saying you want to analyze game lengths and me thinking 'something like that would be cool for BASIL'.

I am open to suggestions regarding the readability. This kind of graph packs a lot of information in a small space: Win ratio + total amount of games. That's why I chose this way.

Jay Scott on Monday, September 13. 2021:

That’s a good reason. For now, my only suggestion is to not print “0.0%” in columns with no games.

Tully Elliston on Tuesday, September 14. 2021:

> Steamhammer win rates peak between 6 and 9 (or 10) minutes
> at more than 70%, then it goes down to slightly more than 50%
> There is a weakness at 4 and 5 minutes (under 50%)

I'm not sure this information is meaningful when the opponent varies. This could just suggest that SH crushes the field of weak bots in under 10 minutes.

MarcoDBAA on Tuesday, September 14. 2021:

Sure, win rates will go down later because of weaker bots being already defeated, but I still think, that you see something there, also if you compare bots of similar strength.

It is tricky of course. Would be helpful, if the graph was at least split into opponents of different races

Jay Scott on Tuesday, September 14. 2021:

Steamhammer does crush the weakest terrans and zergs in under 10 minutes, but it doesn’t play that many of them because of BASIL’s matchmaking. One contributor is the hydra rush wins over adias.

Games near and over 20 minutes are late game for Steamhammer, with hive tech and at or close to max economy. In large part, the wins are natural and the losses are because Steamhammer was losing the whole time and the opponent was slow about it.

Tully Elliston on Tuesday, September 14. 2021:

Looking at the existing graph more, my understanding would be that the absolute red area per column, rather than the winrate percentage per column, is the important figure when deciding where to focus attention to improve your bot's win rate.

For SH results on BASIL over the previous 6 months:

- the largest absolute numbers of losses occur at the 5th and 7th minutes.

- visible trend that games are most likely to end between 5 and 9 minutes, and if they last longer the distribution of end times becomes much wider.

- SH is most likely to win games that last between 6 to 9 minutes.

- If SH is either rushing or rushed (any pre-5 minutes win strategy in play), win rate is low.

- The numbers suggest that SH rushes (4 pool) almost never. Can tell this by looking at the Chris Coxe graph (almost 100% wins between 4th and 5th minute).
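Both readings of the graph, win rate per column and absolute losses per column, can be computed from the same tally. A minimal sketch, assuming games are available as `(end_minute, won)` pairs; the function names and the sample data are made up for illustration.

```python
from collections import defaultdict

def per_minute_stats(games):
    """Map each ending minute to (wins, losses, win_rate)."""
    tally = defaultdict(lambda: [0, 0])  # minute -> [wins, losses]
    for minute, won in games:
        tally[minute][0 if won else 1] += 1
    return {m: (w, l, w / (w + l)) for m, (w, l) in sorted(tally.items())}

def worst_minutes_by_losses(stats, top=2):
    """Minutes with the most absolute losses; ties go to the earlier minute."""
    return sorted(stats, key=lambda m: (-stats[m][1], m))[:top]

# Made-up sample data, not real BASIL results.
games = [(4, False), (5, False), (5, False), (5, True),
         (7, True), (7, True), (7, False)]
stats = per_minute_stats(games)
print(stats)
print(worst_minutes_by_losses(stats))
```

In this sample, minute 4 has a 0% win rate but only one loss, while minute 5 has the most losses, which is the distinction being drawn between the two measures.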

Jay Scott on Tuesday, September 14. 2021:

True, Steamhammer’s rushes are not as successful as they could be, so it learns to ignore them. I think it is mostly because micro is not tuned to be aggressive enough.

Tully Elliston on Tuesday, September 14. 2021:

imo 4Pool is a useful tool for a large game set, a risky choice for a small one. The main cost of neglecting it is that in allowing your learning opponents to learn you don't rush, you let them play greedy.

MarcoDBAA on Tuesday, September 14. 2021:

Both things are important.

An abysmal win rate, but few total losses, just means, that the bot is in defensive mode. Mentioned Letabot.

On the other hand, there can be many absolute losses ( red area), but if the win percentage is high for the bot compared to other times, it just means, that the bot tries to go for the throat, which may result in losing some games, but if the win rate is high in comparison to other times, it does work well, and there is no problem at that time.

Tully Elliston on Tuesday, September 14. 2021:

That's a good point (Low number of games ending at a given time indicates defensive posture of your bot). But I still think a focus should be on where the most absolute losses occur, even if the win rate is also good, as by the same logic lots of games ending at a time indicates your bot is often making aggressive plays around then. If there are a lot of absolute losses there, effort in the logic/play leading up to this period of aggression is going to change the most outcomes, Eg. better detect when aggression is the wrong move, or improve early play so you snowball into a better position ready for it.

MicroDK on Friday, September 10. 2021:

I have been thinking about using game length as a factor for the learning. So eg. favoring short wins over long wins, and favoring long losses over short losses.
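One way to express that preference is a reward that shrinks toward zero as the game gets longer. This is a sketch under assumptions: the function name, the linear scaling, the 30-minute horizon, and the 0.5 floor are all hypothetical choices, not Steamhammer's or anyone else's actual learning code.

```python
def length_weighted_reward(won, length_minutes, horizon_minutes=30.0):
    """Score a game so short wins beat long wins and long losses beat short losses.

    Wins score from 1.0 (instant) down to 0.5 (at or past the horizon).
    Losses score from -1.0 (instant) up to -0.5 (at or past the horizon).
    """
    clamped = min(max(length_minutes, 0.0), horizon_minutes)
    frac = clamped / horizon_minutes
    if won:
        return 1.0 - 0.5 * frac
    return -1.0 + 0.5 * frac

print(length_weighted_reward(True, 5), length_weighted_reward(True, 20))
print(length_weighted_reward(False, 5), length_weighted_reward(False, 20))
```

Keeping every win positive and every loss negative means the length only breaks ties between openings with similar results; it never makes a long win look worse than a loss.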

Tully Elliston on Friday, September 10. 2021:

I think learning is increasingly offering diminishing returns with the current meta. When your opponent is also capable of adjustment the edge learning delivers is often at best temporary.

If your opponent has core skills that outclass yours, your bot ends up deforming itself strategically to try and make up the shortfall by overproducing units/defences or relying overly on trying to produce a circumstance where the opponent's core skills are less important (the unit density of a high resource macro game, for example, can effectively devalue micro).

It seems to be much stronger to force an adaptation than to adapt yourself (adapting is hard). zzzkbot was a good proof of this, winning lots of games with minimal code.

Jay Scott on Friday, September 10. 2021:

It’s a plausible idea, worth a try. But my guess is it won’t give much leverage.

Ideal would be to choose openings which leave you in a strong position, regardless of what happens in the rest of the game. Then the bot can choose better builds in cases where the learning signal is weak, like when the opponent is weak and there are few losses or the opponent is strong and there are few wins. It’s not that easy to come up with an evaluation function accurately tuned to your bot’s strengths and weaknesses, though.

MicroDK on Friday, September 10. 2021:

It also seems Steamhammer has some crashes on BASIL: 8 since the last update, as I write this.

Jay Scott on Friday, September 10. 2021:

Yes. I also fixed that.

Jay Scott on Monday, September 13. 2021:

These crashes, by the way, only happen(ed) when the bot has no bases. It is losing the games regardless. Even so, I fixed it immediately once I saw it. Crashing bugs must die.

Tully Elliston on Wednesday, September 15. 2021:

Yet more interesting insights can be made looking at the stats of the top bots. Perhaps the most useful is studying which minute(s) bots are most likely to try to kick your butt by:

Stardust - 7-8
Krasi0 - 7-8
Monster - 4-5 or 12-14
Crona - 5
Banana Brain - 9 or 12
Adias - 17, 18 or 20
Locutus - 9-11
Hao Pan - 8-9
PurpleWave - 9-18
Dragon - 14-17
Microwave - 4-6

Feeding this information to your learning might be useful for selecting appropriate openings.

Eg.
- brace for a likely rush from Microwave, Crona, Monster.
- play greedy vs Adias, Dragon.
- Pick builds suited to surviving timing attacks vs Stardust, Krasi0, Hao Pan.
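As a rough sketch, the mapping from an opponent's peak window to an opening posture could look like this. The cutoffs (6, 7 to 9, and 14 minutes) are assumptions picked only to reproduce the three examples above, and the function name is hypothetical. Bots with two separate windows, like Monster, would need more than one window each and are left out.

```python
def suggested_posture(first_minute, last_minute):
    """Map an opponent's peak kill window (in minutes) to an opening posture."""
    if last_minute <= 6:
        return "brace for rush"
    if first_minute >= 14:
        return "play greedy"
    if first_minute >= 7 and last_minute <= 9:
        return "survive timing attack"
    return "no clear rule"  # wide or mid-range windows

# Windows copied from the list above; Adias (17, 18 or 20) is treated as 17-20.
windows = {
    "Microwave": (4, 6),
    "Crona": (5, 5),
    "Stardust": (7, 8),
    "Hao Pan": (8, 9),
    "Locutus": (9, 11),
    "PurpleWave": (9, 18),
    "Dragon": (14, 17),
    "Adias": (17, 20),
}
for bot, (first, last) in windows.items():
    print(bot, "->", suggested_posture(first, last))
```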

The numbers also show that a strength of Monster is unpredictability, with it kicking a lot of ass at either minute 4 or 12-14 - no general rule can easily be applied to a bot that can both rush or macro effectively.

Looking at how many top opponents do most of their beat downs before the 8 minute mark - a good case for focusing on skills that can be used in this window.

MarcoDBAA on Wednesday, September 15. 2021:

These stats are interesting to me.

I did not watch that many games this year, so I could be off of course...

krasi0 is really like Monster, in that it might rush you, or go for a long game. And it might and probably will crush you in both. M&M control is great, middle and late game armies are terrific.

I have chosen games from the first newer krasi0 update in August to now, and there is one weakness at 8 minutes, when not enough tanks are out and the positioning of single tanks can be suboptimal, I think. At least in older versions I watched more, tanks had to be at the cliffs, but they weren't, and Stardust or Bananabrain (or some zerg in general) profited from it. Late game, both these bots are (or were?) not that dangerous for krasi0, if it came out ok of early struggles, and I rated Purplewave or tscmoor (Protoss) higher (at least a year or so ago, also before that bad PW update).

Dragon graph shows, that you should attack him early, NOT playing greedy. Like all tscmoo author bots, it (relatively) struggles in early game, and becomes very strong later on. Yes, I remember, that Jay mentioned SH winning against it with a greedy strat, but well, I think it is more probable to hurt him early, and the graph shows this too, I think.

adias can be buggy in the early game and be vulnerable to rushes, but I think you can indeed try to be greedy too. If you survive the massive push later (against zerg and protoss, vs terran you need drop defense), you might have a winnable game. I am not rating its late game extremely high.

Yes, vs Hao Pan/Halo, you really need to defend, the Iron derived micro and new strategies make it nearly invulnerable early on. It tries to kill you and doesn't seem to lose to rushes against it normally. Win rates for Halo go down noticeably, if bots survive the early game. It can win later games too however.

Stardust (newest update) seems to be invulnerable in the early game too (for all existing bots). Good defense is best, as you said? Well, no one besides krasi0 and McRaveZ (and they have a losing record too) has much success against the newest update anyway. But you see what it wants to do to you, and if you find a good answer you might come out on top maybe. And I wasn't sold on its late game.

Against Locutus you want to play similar, and it is a bit easier.
Not the same code (I heard), but the same author and similar play. Stardust is even more single minded going for the dragoon rush.

Bananabrain variability shows in the graph too. Should be fun to play against it. If you can, controlled defense might be the best option? If you are good at 20 minutes (or using an advantage opportunistically), you might win. For many bots win rates stay at 50% there at least, BB struggles somewhat however, and I have seen this in game too. It was bad at controlling the map and defending its workers. BB seemed to be a bit out of its element, if it and the opponent have too many bases.

Purplewave? Graph since the last update (did not really watch that version, so it might be different than the version I knew) that improved it again (but not its zerg and terran sadly) shows good early midgame strength. Rush it or be somewhat greedy maybe?

Agree, that Monster is always dangerous and clearly the best zerg of them all. Crona might be the better rusher maybe. CherryPi might be as strong in the later mid game (stats and watching it), but not in all other situations. Ultra late a tscmooz with correct strategies might be comparable too. McRaveZ might have better Mutas, but Monster is close here too. Monster has no clear weaknesses (I don't think the same about Stardust, that rather does one "kick" extremely well, than knowing 100 kicks), and the graph clearly makes it visible (maybe problems at similar times as krasi0 to dragoon rushes too, but Stardust is just very good at them).

Crona? If you survive the rush variants, you should be good mostly according to the graph, if your bot has some decent skills. Or rush it very early maybe. Situation is a bit less clear vs Microwave, but similar, yes.

I am adding McRaveZ to it as a top zerg (also because it can hurt Stardust). Looks (games since that new update) like you win before it really gets its mutas, but don't rush too early either. Late game vs it might be another option, but it isn't that clear, that there is a weakness.

Just watched this McRaveZ Stardust game btw http://www.openbw.com/replay-viewer/?rep=https://data.basil-ladder.net/bots/McRaveZ/McRaveZ%20vs%20Stardust%20Python%20CTR_134541D6.rep

McRaveZ did everything correctly, until not returning home, when a serious attack on its sunkens started. You need to define a serious attack in the code, because you should go for the probes too (and harass in general, catching dragoons coming in), and one dragoon shooting at your sunken isn't an attack. Serious might be when a certain percentage of your sunkens take damage or are able to "kiss" :P multiple enemies (or people might have better ideas). You win the race to reach the bases, because mutas are faster, but you don't win the base race on the other hand, so you must go out and harass an undefended base, but you must return soon enough too to still create a combined sunken-muta army.
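The "serious attack" test suggested here could be sketched as a simple predicate. Everything in it is an assumption for illustration: the function name, the inputs (which a bot would have to collect itself each frame), and the thresholds of half the sunkens damaged or three enemies in sunken range. It is not McRaveZ's code.

```python
def is_serious_attack(sunkens_damaged, sunkens_total, enemies_in_sunken_range,
                      damaged_fraction=0.5, min_enemies=3):
    """Decide whether mutas should return home to fight alongside the sunkens.

    True when enough sunkens are taking damage, or when the sunkens
    can hit several enemies at once. Thresholds are hypothetical.
    """
    if sunkens_total == 0:
        return False  # no sunkens to combine with
    enough_damaged = sunkens_damaged / sunkens_total >= damaged_fraction
    enough_targets = enemies_in_sunken_range >= min_enemies
    return enough_damaged or enough_targets

# One dragoon poking one of four sunkens: keep harassing.
print(is_serious_attack(1, 4, 1))
# Half the sunkens under fire: come home.
print(is_serious_attack(2, 4, 1))
```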

Bruce Nielsen on Monday, September 20. 2021:

Pretty good analysis all around I think.

For Stardust I would add that, although it hasn't lost any games outright very early, it can still lose games to rushes where it just clings on for a few more minutes without having any chance of winning. This exemplifies a possible pitfall of reading too much into the graph, as while it shows when the game ends, that doesn't necessarily correlate to when the game was lost (i.e. the result was inevitable) - that might have been a minute before (especially in the case of bots that gg appropriately, like Monster) or 10 minutes before.

That is not to say that the data is without merit though, it's an interesting angle to analyze.
