
new Steamhammer versus old Iron

I ran a 15-game match of this year’s tournament Steamhammer versus last year’s tournament Iron. This is a true year-over-year comparison. By comparing the 2017 SSCAIT Steamhammer with the 2016 SSCAIT Iron, we can get a measure of how much Steamhammer has improved.

Over the course of 2017, Iron has consistently stayed well ahead of Steamhammer, fixing weaknesses before Steamhammer was able to attack them properly. I think Iron has even pulled ahead a little, which is impressive because Iron is stronger and therefore harder to improve. Last year’s Iron versus last year’s Steamhammer goes to Iron with a few losses; this year’s Iron versus this year’s Steamhammer goes to Iron with maybe a couple of losses, maybe none.

It turns out that Steamhammer has more than caught up with last year’s Iron. The score was 10-5, new Steamhammer over old Iron. Steamhammer won both with mutalisks (which Iron at the time thought it was ready for) and with lurkers (which old Iron was not prepared for). 2016 Iron was already tough, so it’s a good result. The result table here is from Steamhammer’s point of view.

| map | opening | w/l | notes |
|---|---|---|---|
| Benzene | OverhatchLateGas | 0 | A mass zergling opening with slow tech. It can’t touch Iron’s vulture build. |
| Destination | 4PoolSoft | 0 | Iron has zergling rushes down cold. |
| Heartbreak Ridge | 9PoolSpeed | 1 | Steamhammer struggled with the map, as it often does, but eventually won with lurkers. |
| Moon Glaive | ZvT_2HatchMuta | 1 | Smash. |
| Tau Cross | OverhatchLateGas | 0 | Bad luck for this inappropriate opening to come up a second time. Steamhammer plays it in 1% of games. |
| Andromeda | ZvT_3HatchMutaExpo | 1 | A bit of a struggle, but the mutalisks prevailed. |
| Circuit Breaker | 9PoolSpeed | 1 | Iron defended too cautiously, pulling excess SCVs, letting Steamhammer draw ahead in macro. Zerg won with lurkers. |
| Electric Circuit | ZvT_3HatchMutaExpo | 1 | Smash. |
| Empire of the Sun | OverpoolSpeed | 0 | Steamhammer got lurker tech but decided to make no lurkers. It went hydra and lost after the tank numbers built up. |
| Fighting Spirit | ZvT_3HatchMutaExpo | 1 | Smash. |
| Icarus | 2HatchLurker | 1 | Iron was not ready. Spider mines and wraiths eventually cleared the lurkers, but the command center was already lost. Mutalisks finished the game. |
| Jade | ZvT_3HatchMutaExpo | 1 | The most hard-fought game of the match. Iron held on with good turret repair and its vulture containment, but Steamhammer slid drones past the contain to take the map and eventually win with hive tech. |
| La Mancha | ZvT_2HatchMuta | 1 | Another hard fight going to hive tech. |
| Python | 3HatchLurker | 1 | Smash. |
| Roadrunner | ZvT_13Pool | 0 | One of Steamhammer’s classic openings, but objectively weaker. The initial pressure looked good but was never enough to tip Iron down the slope. |
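
As a rough aid to reading the table, here is a minimal sketch that tallies the w/l column and converts the score into the rating gap that the standard logistic Elo model would predict for that score fraction. This is not a calculation from the original post, and the function name `implied_elo_gap` is made up for illustration. With only 15 games the figure is a very noisy estimate.

```python
import math

# Results copied from the w/l column of the table, in row order:
# 1 = Steamhammer win, 0 = Steamhammer loss.
results = [0, 0, 1, 1, 0, 1, 1, 1, 0, 1, 1, 1, 1, 1, 0]


def implied_elo_gap(wins, games):
    """Rating gap at which the logistic Elo model predicts this score fraction."""
    p = wins / games
    if p <= 0.0 or p >= 1.0:
        raise ValueError("gap is unbounded for a 0% or 100% score")
    return 400.0 * math.log10(p / (1.0 - p))


wins = sum(results)
games = len(results)
gap = implied_elo_gap(wins, games)
print(f"{wins}-{games - wins}, implied gap about {gap:.0f} Elo")
```

The gap only describes this one pairing; it says nothing about how either bot would rate against a wider pool of opponents.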

The games were varied, especially considering that Iron plays a single strategy, and both sides got to show their strengths. Some were one-sided smashes for one player or the other.

Steamhammer used zerglings to limit the damage from Iron’s standard 3 vulture runby. Iron’s vulture micro at the time was not slick enough to keep the vultures safe from speedlings in the zerg base. Or to say it differently, Steamhammer’s micro has improved enough to nab the vultures when they get stuck on a building or take a wrong turn.

In a number of games, Steamhammer won by getting ahead of Iron in workers. When mutalisks fly in, Iron has turrets ready and works hard to keep them repaired. Steamhammer likes to pick off the repairing SCVs. It’s costly in mutalisks and reduces the immediate pressure, but zerg gets ahead in macro and can win later in the game.

Iron’s forward turrets protecting its vulture contain sometimes confused mutalisks and distracted them from what they should be doing, but I don’t think the effect was decisive in any game. Loss of drones that tried to exit through the vulture contain was a problem, though.

It was striking that old Iron did not scout the map in the middle game. If Steamhammer squeezed a drone past the vulture containment and the game continued long enough, zerg could set up additional bases that remained safe throughout. It happened in both of the hive tech games.

What do the last 2 days’ results mean? In the SSCAIT 2016 round robin phase, Steamhammer 0.2 finished tied for places 16-17, and the latest Steamhammer is stronger than its much-improved successor Steamhammer 1.0. Iron finished 5th with an 89% win rate, and the latest Steamhammer is stronger. Steamhammer improved tremendously in 2017, so much that it would likely have finished in the top 4 last year if we sent it backward in time. Where did I leave that time machine?

We’ll see how it does this year. The tide has risen all around. Still, 2018 had better watch out!


Comments

krasi0 on :

Could you perhaps also measure the strength improvements of other bots (their latest versions compared to their 2016 versions) by running some matches against the old / new SH, Iron, etc.?
Also, in terms of ELO, how much has Steamhammer gained in total throughout 2017?

Jay Scott on :

Hmm, I think the bots that would be interesting to compare are Krasi0, LetaBot, and Tscmoo. I don’t have archived SSCAIT 2016 versions of any of them. I don’t actually run that wide a variety of test opponents locally, so some might give me setup trouble. But if somebody can volunteer the binaries, I can try....

Jay Scott on :

Elo comparisons over time are tough, because elo depends on the opponent pool. To do a serious comparison, I would need to test new-old versions of a ton of bots. I don’t think it’s feasible. This would have to be done using the historical results data, which means that when one author uploads a stronger version of their bot, everybody else’s elo appears to decrease—elo computed this way gives good rankings at any given time, but it’s not accurate for comparisons over time.

Jay Scott on :

Here’s a way to do elo comparison over time from the history, if you have the game records and know when each fresh upload happened: Treat each upload as a different bot, and calculate its elo independently of the other uploads with the same name. The elo for each upload can be treated as static even if the bot has learning, so bayeselo is a suitable analysis program. Some bots are not updated over the whole period, and they provide fixed points to help stabilize elo over time. Then you can graph each bot’s elo over time as a sequence of horizontal lines and see when improvements were made.
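
The relabeling step described above can be sketched as follows. This is only an illustration under stated assumptions: the upload dates and game records are invented, the `name@date` labels are a made-up convention, and the function names are hypothetical. The relabeled records would still need converting to whatever input format the chosen rating program expects.

```python
from bisect import bisect_right
from datetime import date

# Hypothetical upload history (invented dates): bot name -> sorted upload dates.
uploads = {
    "Steamhammer": [date(2017, 1, 10), date(2017, 6, 1)],
    "Iron": [date(2017, 1, 1)],
}


def versioned_name(bot, played_on):
    """Label a bot by the upload that was live on the day the game was played."""
    dates = uploads[bot]
    index = bisect_right(dates, played_on)  # count of uploads on or before played_on
    if index == 0:
        raise ValueError(f"{bot} has no upload on or before {played_on}")
    return f"{bot}@{dates[index - 1].isoformat()}"


def relabel(games):
    """Turn (date, winner, loser) records into (winner, loser) pairs of upload labels."""
    return [(versioned_name(w, d), versioned_name(l, d)) for d, w, l in games]


# Invented game records, for illustration only.
games = [
    (date(2017, 2, 1), "Iron", "Steamhammer"),
    (date(2017, 7, 1), "Steamhammer", "Iron"),
]
for winner, loser in relabel(games):
    print(winner, "beat", loser)
```

Because each upload becomes its own player, a bot that is never re-uploaded keeps a single label across the whole period and serves as one of the fixed points mentioned above.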

krasi0 on :

In order to help estimate ELO advancements over time, I've started recording historical data for each bot / date. This is what feeds in the data for the two `ELO charts over time` for SSCAIT (now broken due to the running tournament).

Tully Elliston on :

Steamhammer is doing great so far: 20-1, which is the best win ratio for that number of games.

Is there somewhere I can find the replays to watch?

Jay Scott on :

The SSCAIT replay page now has OpenBW links so you can watch: https://sscaitournament.com/Replays/
