Entries by Jay Scott | Starcraft AI blog

Steamhammer wants opening learning 2: zealot rushes

Wuli’s zealot rush beats Steamhammer about 4 games out of 5. UAlbertaBot’s usually wins too. Carsten Nielsen’s wins fairly often, and Lukas Moravec sometimes plays a winning zealot rush. Steamhammer’s weakness is not the initial defense, which holds the zealots for quite a long time. It’s similar to games versus McRave; the weakness is in the transition to lair tech. Steamhammer ends up without enough drones.

Someday the strategy boss will be smart enough to understand the situation and make a safe transition. It will be easier if I improve Steamhammer’s defensive skills, which are simpleminded. But those things take time. In the short run, it’s easier to come up with an opening that beats zealot rushes. Then all Steamhammer needs is to know when to play that opening.

Bots should benefit hugely from opponent modeling, because other bots mostly play in stereotyped ways. Wuli and Carsten Nielsen never deviate from their rush builds. But Lukas Moravec, among others, plays more than 1 build. So to counter zealot rushes, a bot also wants plan recognition. If the scout sees 2 or more gates and a lack of other stuff (no gas, forge, or expansion), then the bot had better stay safe against a hard rush. If it sees a forge and cannons, it had better emphasize drones and hatcheries instead. Opponent modeling and plan recognition can be combined: Here are the plans the enemy has been seen to follow (described in some abstract way, such as the timings at which units appear). Based on scouting information, this past enemy behavior is the closest match, so let’s counter it. Since bots are predictable, simple opponent modeling is likely to give big leverage.
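The two ideas in this paragraph can be sketched in a few lines. This is only an illustration under stated assumptions: the `ScoutReport` fields, the plan names, and the use of unit timings in seconds are all hypothetical, not anything Steamhammer actually has.

```python
from dataclasses import dataclass

@dataclass
class ScoutReport:
    """What the scout has seen in the protoss base so far (hypothetical fields)."""
    gateways: int = 0
    forges: int = 0
    cannons: int = 0
    has_gas: bool = False
    has_expansion: bool = False

def recognize_plan(seen: ScoutReport) -> str:
    """Rule-based plan recognition following the two rules in the text."""
    # 2 or more gates and nothing else: assume a hard zealot rush.
    if seen.gateways >= 2 and not (seen.has_gas or seen.forges > 0 or seen.has_expansion):
        return "hard rush"
    # Forge plus cannons: the enemy is playing defensively.
    if seen.forges > 0 and seen.cannons > 0:
        return "forge and cannons"
    return "unknown"

def closest_past_plan(observed: dict, history: dict) -> str:
    """Name of the remembered plan whose unit timings best match what was scouted.

    Both arguments map unit names to the time (seconds) the unit first appeared;
    history maps a plan name to such a timing table.
    """
    def distance(plan: dict) -> float:
        shared = observed.keys() & plan.keys()
        if not shared:
            return float("inf")  # nothing in common, so no evidence for this plan
        return sum(abs(observed[u] - plan[u]) for u in shared) / len(shared)
    return min(history, key=lambda name: distance(history[name]))
```

For example, `recognize_plan(ScoutReport(gateways=2))` gives "hard rush", and a scouted gateway at 95 seconds matches a remembered 2 gate plan better than a remembered forge expand that has no gateway timing at all.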

More about opponent modeling: If you know the opponent’s range of openings, then you can figure out how best to open yourself—what to do until you get scouting information and can adapt. Lukas Moravec never tries a fast rush, so you don’t need to stay safe against rushes. If the opponent never plays proxies (like most), then you don’t have to scout for proxies. If the opponent plays a mix of fast and slow openings, then you can try to estimate the best counter-mix with game theory. There are a ton of ways to gain advantage by knowing the opponent’s habits.
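One way to read “best counter-mix with game theory” is the minimax solution of a small zero-sum game. The sketch below solves the 2x2 case in closed form. The win rates in the usage note are invented for illustration, and the function name is hypothetical.

```python
def best_mix_2x2(a: float, b: float, c: float, d: float):
    """Optimal mix for the row player in a 2x2 zero-sum game.

    Payoffs (our win rate):      enemy fast   enemy slow
        our opening 0                a            b
        our opening 1                c            d

    Returns (probability of playing opening 0, expected win rate).
    """
    row_mins = (min(a, b), min(c, d))
    col_maxes = (max(a, c), max(b, d))
    # Saddle point: one pure opening is already optimal, no mixing needed.
    if max(row_mins) == min(col_maxes):
        p = 1.0 if row_mins[0] >= row_mins[1] else 0.0
        return p, max(row_mins)
    # Otherwise mix so that the enemy is indifferent between its two openings.
    denom = a - b - c + d
    p = (d - c) / denom
    value = (a * d - b * c) / denom
    return p, value
```

With made-up win rates of 0.7 and 0.4 for a safe opening and 0.2 and 0.8 for a greedy one, `best_mix_2x2(0.7, 0.4, 0.2, 0.8)` says to play the safe opening two thirds of the time, for an expected win rate of about 53% no matter what the enemy chooses.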

tscmoop-Steamhammer hell-for-leather game

This game Tscmoo protoss vs. Steamhammer went back and forth, with both sides repeatedly decimating the other’s workers. On the one hand it’s kind of entertaining, but on the other it shows how weak both bots still are. Steamhammer was much too hesitant with its mutalisks, and Tscmoo did not make the right units to keep control of the situation. They both made plenty of mistakes.

Tscmoo went for a forge expand build, much safer than the bare expansions it has been opening with recently. Steamhammer opened 12 hatchery and got away with it because Tscmoo didn’t scout early. (If protoss scouts it in time, the scouting probe can delay the hatchery, and even if that fails, protoss can pull ahead economically by starting the nexus before any cannons, since zerglings will be delayed.) Steamhammer has a partial understanding of how to counter forge expand, and it made extra drones and went up to 4 bases. But zerg was also clumsy around the cannons and lost zerglings unnecessarily.

Tscmoo built up zealot numbers while teching. It got a templar archives and high templar, plus a stargate that it never used (but which zerg had to prepare against). Steamhammer started with hydraling on the ground since hydralisks are good against cannons, but seeing the zealots it switched to mutalisks. Zerg started its carapace upgrade shortly after protoss started attack +1, which was correct timing, and later in the game drew ahead in upgrades.

The mutalisks cleaned up probes in the protoss main, apparently putting zerg well ahead. Notice the red zealots in the middle on the minimap.

But the zealots were too strong for the ground army and returned the favor. The mutalisks indecisively moved back and forth, taking occasional swipes at protoss stuff but reacting late to the zealots. Here zerg seems to be fighting back, but at the 3 o’clock expansion zealots are ravaging drones with little opposition.

Zerg inefficiently transferred drones between bases during the fight and lost more than it should have. When the smoke cleared, zerg had 9 workers and protoss had 12. By the numbers, zerg was narrowly ahead in army, but protoss was merging archons, and if the archons maneuvered well then Steamhammer would have to back off while both sides rebuilt. The game was still on.

Well, the archons did not maneuver well. The first one tried to engage the mutalisks by itself, instead of retreating to the cannons to wait for the second archon. That would have been a difficult move for a bot to find. Protoss lost more probes and the templar archives came under fire.

At the same time, zealots returned and killed more drones. This time Steamhammer had the sense to defend its natural with a sunken, and the zealots could not land a killing blow. Steamhammer was able to restore a modest economy while Tscmoo kept losing probes, and archons, and high templar before they could merge, and finally buildings.

Tomorrow: Steamhammer wants opening learning 2.

Steamhammer wants opening learning 1: McRave

I still have some essential bugs to fix. Also I keep literally forgetting that I promised a public repository next, and having to remind myself; it’s not appealing work. But beyond all that, I’m thinking about the next step.

The next major feature will be a start on opponent modeling and opening learning. As I think about it more, I’m seeing how important a feature it is to add soon, so I don’t regret setting out down the path. I’m going to write a series of posts giving examples to show how key a feature opponent modeling is.

McRave

The current version of protoss McRave doesn’t play that strongly. Its new forge-expand strategy is not polished yet, and I think new weaknesses have been introduced. But even so, it beats Steamhammer. The live Steamhammer always plays low-econ pressure builds versus protoss, and (unlike most protoss bots, even Bereaver) McRave is a sturdy enough defender to hold off the pressure. While trying to pile on pressure, Steamhammer doesn’t make enough drones, and as the game wears on it can’t keep up.

It’s easy enough to change Steamhammer’s behavior. There is a standard opening which is slower than 12 hatch 11 pool and faster than 3 hatch before pool: It is 12 hatch 13 pool, squeezing in 2 extra drones before the spawning pool. Those 2 drones make a substantial difference in the economy, because the earlier you spawn a drone, the more it pays off. The trade is that zerglings come later and the opponent gets a window for early aggression. Anyway, this McRave version doesn’t go for early aggression, so I coded up a quick 12 hatch 13 pool into 3 hatch hydra build. Sure enough, my first draft won most test games against McRave.

I don’t want to make 12 hatch 13 pool a standard build versus protoss, because too many bots go for early attacks. Steamhammer’s low-econ openings are effective against most opponents. But I need economic openings to win against defensive opponents like McRave.

I could specify an enemy specific opening mix versus McRave and immediately turn a bunch of losses into wins, at least for a time. Steamhammer does that versus rushbots and a few others. Arrakhammer does it too; it has hand-made builds to beat Wuli and McRave, and one to give it a chance versus Iron. But it’s not satisfying. It’s not sustainable, because I have to keep updating the hand-made builds by hand as opponents appear and change. And it’s likely to fail in tournaments, because many opponents show up with surprise updates that are specifically intended to throw off opponents which tune against them.

The answer is to learn each opponent’s habits from experience.

more new bots: GOAL project, Antiga

GOAL

SSCAIT is seeing new bots from the GOAL project of the Technical University of Delft. They include AyyyLmao, which appeared on a weekly broadcast, and JEMMET. Possibly Adrian Mensing is another? At first none of them started, but now JEMMET appears to be running. I hope the secret will be passed around so more of these bots can get underway.

From the material, it seems that these are first year undergraduate students who are just learning AI and put the bots together in a 10 week course, so—expect not too much skill, but a lot of creativity. Should be fun.

The GOAL system I take to be a learning environment for agent programming. In these bots, each Starcraft unit is a separate agent that in principle makes its own decisions. It’s not what I would choose, but it’s a cool architecture. The agents do their thinking in Prolog, and talk to the game via a Java connector.

The GOAL Agent Programming Language Home
Multi-Agent Systems in StarCraft - PDF paper describing the setup

Antiga

Antiga is a Steamhammer 1.2.3 fork playing protoss. It looks as though it has been recompiled, but any source changes to the original are probably minimal, because the .dll file size is the same in bytes (though the contents are not identical). But there are some curious builds listed in the configuration file.

The version I downloaded has its openings set like this (earlier versions played different openings, and I expect later versions will change again).

versus terran - 13 nexus fast expansion
versus protoss - the Stove
versus zerg - the Stove
versus random - 40% 10-12 gate, 60% dark templar rush

It is a very strange collection of builds. Except for 10-12 gate, they should lose quickly if the opponent goes for early aggression. The Stove is a wacky opening with scouts followed by dark templar—it’s not hard to defend, but it is a weird unit mix. The bot has a variety of other openings coded into its config file but not currently configured, including (by name): 10-12GateExpo, 3GateCore, 1GateExpo, 2GateGoon, 2BaseDT.

TyrProtoss

Apparently there is a trend for established bots to come out with versions playing a different race. McRave brought Sparks, and now Simon Prins’s Tyr has brought TyrProtoss. Both off-race bots are the same code as the original, with updates as needed.

TyrProtoss, like Tyr, knows more than one build and tries to learn an appropriate one against each opponent. In its first game against an opponent, no matter the race, it opens with forge, gateway, and cannon to secure its natural, then starts the nexus. When it has built up a strong gateway army, it moves out to attack. If that build loses, in the next game against the same opponent it tries a dark templar rush. Both openings are weak against early aggression, so there should be at least one more build that I haven’t seen yet.

Looking into the jar file, I see classes called CarrierRush, CorsairHarass, DragoonPush, MassReaver, and MassRecall, among others. Nothing says that these are all complete and enabled, but TyrProtoss may have a lot to show us.

TyrProtoss still comes across to me as unsophisticated next to its terran version. I don’t see as wide a unit choice, or as many reactions. But I guess we can expect it to get smarter with time.

One good game was versus Zia, which came down to a base race in which neither side paid much attention to defense.

Big eyes

Steamhammer played a couple games against the new protoss Big eyes today. Big eyes opens with zealots, then adds some cannons while it techs to corsairs and dark templars (it plays about the same against all races). As usual for a new bot, the build is not efficient and doesn’t place much pressure on the opponent; good bots are hard to write and take a long time. But one point I notice in particular: Big eyes always makes pylons for twice as much supply as it needs. I think the author did not notice that BWAPI reports supply numbers double what Brood War shows. Fix that oversight, and the bot would suddenly become much stronger.
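The arithmetic behind the suspected oversight is easy to show. The sketch below is not Big eyes’s code, which I have not seen. The function name and the headroom parameter are hypothetical, and the inputs stand for the doubled values that BWAPI reports.

```python
import math

BWAPI_SUPPLY_SCALE = 2   # BWAPI supply values are twice the numbers shown in the game
PYLON_SUPPLY_SHOWN = 8   # supply one pylon provides, as shown in the game

def pylons_needed(supply_used_bwapi: int, supply_total_bwapi: int,
                  headroom_shown: int = 4) -> int:
    """Pylons to add so that free supply (in on-screen units) reaches the headroom."""
    # Convert BWAPI's doubled values back to on-screen supply before comparing.
    free_shown = (supply_total_bwapi - supply_used_bwapi) / BWAPI_SUPPLY_SCALE
    shortfall = headroom_shown - free_shown
    return max(0, math.ceil(shortfall / PYLON_SUPPLY_SHOWN))
```

A bot that skips the division by 2, or equivalently counts a pylon as 8 BWAPI units instead of 16, ends up planning about twice the pylons it needs.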

I would like to develop intuition about how much strength different bugs and missing features lose, so that I can do the important stuff first. Right now, Big eyes is rated 1754. I think this is a giant oversight, worth over 100 elo... but probably not as much as 150? I’ll call it 125. That’s my guess. I hope to find out how close I came. Does anybody else want to guess and help develop my intuition?

Tomorrow: TyrProtoss.

more findings on the new SparCraft version

Most of my attention is going to bugs, but I’m still poking at the new SparCraft version too (I also ran a few tests). My latest conclusions:

1. I’m still worried whether it will be fast enough. It feels slow. I’ll have to try a test on the SSCAIT server to find out.

2. It’s lying when it claims to support scourge. The provided UnitTypeSupported() call returns true, but with scourge in the sim it throws and you don’t get a result. I modified UnitTypeSupported() to exclude scourge, so that at least it’s honest about what it can do.

3. It also doesn’t support spore colonies correctly. I’ve seen it be startlingly accurate in predicting a small-scale fight of marines versus a sunken and a few zerglings (attacking with barely enough marines and scraping a win with 2 bleeding survivors), but in a fight of mutalisks versus spores it says “sure, 1 mutalisk can win, no problem.” This is the map to Suicide City; enter here and do not exit.

I decided to compensate with a crude adjustment. Don’t add enemy spore colonies to the combat sim. And for every enemy spore not added, drop 6 of our mutalisks (pro rated to the hitpoints of the spore colony). That’s about the number you need to beat a spore safely. In initial tests the adjustment seems not too silly. In reality 6 is too large a number, especially if there are separated spores, but I’m tired of the one-way trips to Suicide City.
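As a sketch, the adjustment amounts to filtering the sim input. The function name and the representation of units as (name, hit points) pairs are hypothetical, and rounding the pro-rated count up is an assumption; the text does not say how fractions are handled.

```python
MUTAS_PER_FULL_SPORE = 6   # from the text: about what it takes to beat one spore safely
SPORE_MAX_HP = 400         # full hit points of a spore colony

def adjust_for_spores(our_mutalisks, enemy_units):
    """Return (mutalisks to simulate, enemy units to simulate).

    enemy_units is a list of (unit_type_name, current_hit_points) pairs.
    Spore colonies are left out of the sim, and mutalisks are withheld instead.
    """
    spores_hp = [hp for name, hp in enemy_units if name == "Spore Colony"]
    others = [(name, hp) for name, hp in enemy_units if name != "Spore Colony"]
    # Pro-rate by remaining hit points: a half-dead spore costs 3 mutalisks.
    weighted = MUTAS_PER_FULL_SPORE * sum(spores_hp)
    withheld = -(-weighted // SPORE_MAX_HP)   # integer ceiling division (assumed rounding)
    return max(0, our_mutalisks - withheld), others
```

So 10 mutalisks facing one full-health spore and a zergling are simulated as 4 mutalisks against the zergling alone.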

Out of the huge range of unit combinations in the combat sim, I have hardly tested any. I expect to find more that SparCraft gets wrong.

Arrakhammer’s infested terrans

Arrakhammer’s description changed recently to say that it supports infested terrans. As far as I know, it is the first bot to make them. Now it has played a game with infested terrans versus Randomhammer.

The game is on Destination, a 2-player map. Arrakhammer opened with hatchery on 9, and instead of playing the opening the natural way with mass zerglings, only made a handful, aiming to tech up fast. Randomhammer opened with 2 barracks and before long had enough forces to push to the zerg natural. Zerg was forced to make sunkens, but terran did not dare to break in, and lost medics due to a bug in retreating them.

Zerg went for lurkers against the infantry, but the expensive sunkens set Arrakhammer’s tech plans far back and the lurkers were slow. Terran accumulated marines to hold a strong contain while expanding. Randomhammer had an objectively winning game; with good play, zerg should never catch up (see the worker and army counts in the picture). But Randomhammer is not a good player.

Straight infantry with only scanners for detection is not a winning combination against lurkers, especially not if you try to attack across a narrow bridge and repeatedly lose medics to the retreat bug. As long as scanner energy held out, Randomhammer held its own. But the scanners ran down and Arrakhammer hesitantly pushed out into the open map.

In this picture the lurkers have defeated most of the infantry after scanner energy ran out during a big fight. The queen has just now infested a command center, and from the flashing ramp and the production tab you can see that an infested terran has already started. Before the base was taken, terran had 5 bases to 4. Zerg had more or less caught up, and both sides had chances.

The first infested terran was shot down barely before it reached its intended victim. Steamhammer knows that an infested terran is a high-priority target, even though this is the first time it has seen one. In the overall situation, marines continued to fight lurkers without enough scan energy, and zerg was pulling ahead.

In the next picture, terran has lost another base, but the marines gave a good account of themselves, clearing the attack and killing the infested command center before it could produce. Losing the base was down to poor tactical skill. The marine army was larger and had 1-1 upgrades versus 0-0 for zerg, and repeatedly fought well until collapsing when scanner energy ran out. The queen, by the way, broodlinged an SCV before the zerg attack.

Infested terrans are difficult units to use well. They’re similar to scourge: Suicide units that do a ton of damage, but are fragile and costly in gas. If you walk them into an enemy army, the army should normally kill them without much risk, as above—it only took 4 marines. If they walk in the open, they should attack SCVs or defenseless buildings. A few infesteds can clear out a dense mineral line, or demolish a substantial block of supply depots; they’re efficient for that. You can drop them on tanks, or on dense concentrations of units that can’t shoot up. Otherwise you need dark swarm to attack an army with infested terrans. You go to a lot of trouble to get them, and they’re a specialty unit which is complex to use.

Of course bots don’t know how to react, so walking into an army may work in practice. Steamhammer, for example, doesn’t react at all when retreating, so if you attack while the army is running away, you can do massive damage. 2 infested terrans did this, with excellent cost-efficiency, by catching the army in retreat:

Notice that even after the carnage, terran is ahead in army, though it has fallen behind in economy. Zerg is winning but has some fighting ahead.

The upcoming version of Randomhammer, by the way, has TvZ improvements so that Steamhammer can be a tough test opponent for itself. The upcoming version would have put up a fiercer fight, without the medic retreat bug and with a vessel and tanks against the lurkers.

1000 comments

Joseph Huang aka bftjoe posted the 1000th published blog comment. (The number of rejected spam comments is much higher, unfortunately. So far I have rejected only 1 non-spam comment for content.) This blog entry is the 314th, so it averages a little more than 3 comments per entry.

Community successfully engaged!

fitting tournaments into the calendar

About a dozen bots have been updated in the last week, and a bunch of new bots have been uploaded this month. We’re going through a burst of activity similar to the one around the previous SSCAI tournament.

The reason might be that the CIG tournament is coming up. The timing fits, but other evidence is not as clear. I don’t see an upcoming tournament as a good reason to release new rushbots, or to put out an instance of McRave playing terran (Sparks). But only the bot authors know their own motivations. Maybe some will comment?

A tournament in April or thereabouts might be a good way to maintain interest, if tournaments really are good motivation. Thinking about the timescale on which interest waxes and wanes, I guess April would be about right, depending on how long the tournament runs. The CIG and AIIDE tournaments are tied to academic conferences and can’t be moved around the calendar, and SSCAIT traditionally runs starting near the end of the year. It’s possible that a gap-filling tournament, if people see it as important, might keep the scene lively. Well, it’s a speculative idea, but spacing tournaments around the calendar makes sense. Authors should have time to make updates before the next submission.

Sparks, by the way, has a pretty good strategy. It is McRave set to play terran, and obviously some thought has been put into the terran, though McRave’s protoss play is more sophisticated (so it will presumably play as protoss in tournaments). I imagine it is named after “sparks terran,” which was traditionally a sunken bust timed for just before mutalisks came out. (Sparks terran is a strategy I’ve been keeping in mind as tough for a zerg bot to counter when following a mutalisk plan. Luckily for Steamhammer, it’s not easy to implement well. Tscmoo comes closest but is missing skills.) Today people often say “sparks terran” and mean nothing more than straight infantry play.

Anyway, Sparks the bot opens with 2 barracks and puts its first marines in the mineral line for safety. When it has enough it transfers them to the ramp, arranging them in a nice arc. When stim is done, upgrades are started, and medics are available, it moves out to attack. A terran should stop the attack easily; a protoss shouldn’t have much trouble if it saw what was coming; but a zerg needs to pay attention and defend smartly to hold. It’s a good attack timing.

recommended deep learning textbook

The recent textbook Deep Learning looks excellent to me. The authors are Ian Goodfellow, Yoshua Bengio, and Aaron Courville. There is an expensive edition, but it is also readable for free at the website. It’s a much-recommended book, and I recommend it too.

It is a theory book more than a practical book. I would say it is for people who have a computer background and perhaps don’t have deep math experience yet but aren’t afraid of math and are willing to dig in. I think it should be a good book for an early grad student, or an undergrad with strong interest, or a bright high schooler. The first part of the book presents the math knowledge you’ll need, like linear algebra and probability theory, so it is possible to start if you don’t know much. As always, the more background you have, the easier it gets.

To become expert, you have to know the theory and have experience applying it. If you want practical exercises to work your way into the technology, I think your approach should be to pick a software framework first (for example TensorFlow, Torch, Caffe) and then seek out tutorials or sample projects specific to the framework.

Everyone has their own learning style. If I were getting into deep learning from scratch, my approach would be: Read the whole book once through quickly to get an idea of the shape of things. With an overview in my mind, I could pick out parts that I needed to step through slowly and carefully. It takes time and practice to make unfamiliar concepts familiar, so if I hit topics where I felt weak or awkward I might seek out other resources.

map analysis plans

One of my goals for Steamhammer is to remove the dependency on BWTA, a large, slow and troublesome library. Its startup time to analyze a new map for the first time is ridiculous, and I want full power to analyze map blocks. I could go to BWEM, and I may yet change my mind and do that. But I’m not 100% satisfied; I might have to modify the library to simplify hypothetical reasoning and add features like pathfinding for units which are pushed through minerals. Another idea is to start with UAlbertaBot’s existing distance maps. It calculates ground distances at build tile resolution (32 x 32 pixel tiles) rather than the walk tile resolution of the map (8 x 8 pixel tiles), but this doesn’t cause any obvious problems and is easy to change if necessary. UAlbertaBot doesn’t actually rely on many BWTA features.

I was leaning toward the native distance map solution already. It calls for writing more code, but not that much, and the final solution would end up simpler and better tailored. Now Dave Churchill has made exactly that change to UAlbertaBot. It smooths my path since I can borrow code, and I plan to follow.

Dave Churchill removed BWTA big-bang style from UAlbertaBot, dropping the dependency and replacing its major uses in one step. I think the largest piece was to add a BaseLocation class—not a large piece at all. My implementation strategy will be the opposite. I’ll replace uses of BWTA item by item, testing as I go, and when I’m satisfied I’ll drop BWTA. The development process should be gradual and stable.

As a first step I taught Steamhammer to pay attention to map blocks in calculating ground distance maps. It doesn’t notice when blocks are removed, or route units around blocks, it only calculates distances more accurately based on the map’s initial state. Even this small step improves play, since Steamhammer decides on expansion locations partly based on ground distance. Since I reflexively clean up any code that I touch, I also fixed a bug in checking whether a tile is walkable and reduced the unnecessarily large size of the data structures. Distance maps are better, faster, and cheaper—there should be less memory traffic in calculating a map, and lookups will be more compact in cache.
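A ground distance map of this kind is essentially a breadth-first search over tiles. The following is a minimal sketch, not the UAlbertaBot or Steamhammer code: the grid representation, the 4-direction movement, and the names are assumptions made for illustration.

```python
from collections import deque

def ground_distance_map(walkable, blocked, start):
    """Breadth-first ground distances, in tiles, from start to every tile.

    walkable: rows of booleans, indexed walkable[y][x], at build tile resolution
    blocked:  set of (x, y) tiles covered by map blocks in the map's initial state
    start:    (x, y) tile to measure from
    Unreachable tiles keep the value -1.
    """
    height, width = len(walkable), len(walkable[0])
    dist = [[-1] * width for _ in range(height)]
    sx, sy = start
    dist[sy][sx] = 0
    frontier = deque([start])
    while frontier:
        x, y = frontier.popleft()
        for nx, ny in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            if (0 <= nx < width and 0 <= ny < height
                    and dist[ny][nx] == -1
                    and walkable[ny][nx]
                    and (nx, ny) not in blocked):
                dist[ny][nx] = dist[y][x] + 1
                frontier.append((nx, ny))
    return dist
```

On an open 3 x 3 grid the far end of the top row is 2 tiles from the corner. With a block covering the two tiles in between, the path has to go around and the distance becomes 6, which is the kind of difference that changes the choice of expansion.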

new LetaBot human and machine tournament

Martin Rooijackers aka LetaBot is running a Mini team melee + bots tournament in a week, on next Saturday. As usual there are curious experimental rules, this time team melee games for human players. Bots will play ordinary 1v1 games and will notice nothing out of the ordinary.

These tournaments are small and super easy to participate in. If you let LetaBot know you want to play, he can take care of everything. Afterward you can download the replays. I recommend it.

I really want Steamhammer to play, but it has a deadly bug: It can’t play on LAN latency (or higher). It drops steps from its opening build and gets completely messed up, and there’s no point in playing like that. Maybe I can fix it in the next week....

Update: The tournament has been postponed. See the original announcement thread.

tricky bugs

Bugs can be deeply interconnected in obscure ways. Sometimes one appears after changes that seem to have no relation.

If you watch the latest Steamhammer, you’ll sometimes see idle drones in its base, sitting on the creep doing nothing. It happens especially when the bot has been holding off heavy pressure for a long time, as if its APM were not enough to keep up with managing its base. And I haven’t seen the bug in older versions.

It’s actually a primordial UAlbertaBot design flaw that happens to manifest now because of changes in Steamhammer that have nothing to do with drones. When ProductionManager sees that a building is coming up next, it checks whether it can save time by moving a worker to the building location immediately, so that construction can start as soon as resources are available. If you then insert something into the production queue ahead of the building—which parent UAlbertaBot will do when it realizes a sudden need for detection—then the building will come up again in the queue and the bot may send another drone, leaving the previous one idle. There is no tracking except the order of items in the production queue, which is unstable. The newest Steamhammer triggers it more often because its urgent reactions, the things it does when under heavy pressure, more often insert stuff into the queue ahead of buildings. Then of course the bug causes mining to slow down, so that the pressure breaks through, and the reactions end up backfiring.

By the way, ProductionManager ought to also check when tech for the building will be available. If you watch Steamhammer build its spire, sometimes you’ll see the building drone waiting in place, twiddling its zergy thumbs instead of working, well before the lair is finished. ProductionManager only checks the resources, so it thinks the spire can start earlier and sends the drone too soon. I’ll fix it eventually, but it probably doesn’t lose more than 24 minerals.

I have seen the same bug in Arrakhammer. Microwave solves the bug by catching idle drones and putting them back to work. One older Steamhammer version did the same. Unfortunately, a drone that was about to start a building may be put back to work instead, causing a construction delay—that’s why I undid the change in Steamhammer.

Another solution would be to return drones when the queue is messed with. Steamhammer sometimes decides that an upcoming production item is useless and should be dropped; if it drops a building from the queue, which it sometimes does, that could cause the same bug. Messing with the queue behind the scenes needs to notify ProductionManager.

A more thorough fix would be to delegate all the work to BuildingManager. It’s awkward for drone pre-positioning to be in ProductionManager while the rest is in BuildingManager; better to keep the related parts together. If buildings are sent to BuildingManager before they can be constructed and BuildingManager is responsible for positioning workers, then the existing BuildingManager state (with the addition of an “in preparation” label for buildings that can’t be started yet) can keep track of which worker has been assigned to each building and avoid some construction delays that happen now when workers move around unnecessarily before starting the building. It’s more complicated, because messing with the queue then needs to notify BuildingManager. So I might go with the simpler solution.
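The core of the thorough fix is to key the worker assignment on the building request itself rather than on its position in the queue. Here is a minimal sketch of that bookkeeping. The class and method names are hypothetical and far simpler than the real BuildingManager.

```python
class BuildingTracker:
    """Remembers which worker is pre-positioned for each planned building."""

    def __init__(self):
        self._worker_for = {}   # building request id -> worker id

    def assign(self, request_id, pick_worker):
        """Return the worker for this building, choosing one only the first time."""
        # If the queue was reordered and the building comes up again,
        # the same worker is reused instead of a second drone being sent.
        if request_id not in self._worker_for:
            self._worker_for[request_id] = pick_worker()
        return self._worker_for[request_id]

    def cancel(self, request_id):
        """The building was dropped from the queue; return the worker to release, if any."""
        return self._worker_for.pop(request_id, None)
```

Whatever edits the queue calls `cancel` when it drops a building, and the returned worker goes back to mining instead of sitting idle on the creep.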

Getting all the errors out of the infrastructure is hard. I have fixed about 10 bugs for the next version, but there are other basic infrastructure bugs that are as bad as this one, and I have fixed zero of them. They’re tricky. Meanwhile, the zerg emergency reactions also indirectly cause other errors in the strategy boss, sometimes preventing necessary tech switches....

the latest new bots

We’ve had an “unscheduled” influx of new bots. Here are the latest.

Goliat - Terran. Doesn’t start.

ZergYue - Zerg. Doesn’t start.

zhandong - Zerg. UAlbertaBot fork (not an exact clone) playing 4 pool. Because of UAlbertaBot limitations, the opening is not efficient. It plays 4 pool, drone scout (leaving 2 drones mining), then 2 new drones. The 2 extra drones delay the zerglings, but UAlbertaBot never returns the scout drone and sends 1 drone to chase the enemy scout, so you have to make that many if you want to keep 3 drones mining and produce zerglings constantly. Still, it’s not a convincing build.

Black Crow - Zerg. This one I like. Its strategy makes me think of a modernized, improved Bjorn P. Mattson. It opens 9 pool speed, so it should be safe against rushes. (Idea: Overpool should still be safe against rushes and would reach a strong economy faster. It might require a little defensive skill, though.) Like the old zerg, it builds up its drone count and hatcheries to flood the enemy with zerglings. Unlike Bjorn P. Mattson it stages zergling forces on its ramp for safety and unleashes them in waves, and of course it gets speed so the lings are more dangerous.

Black Crow is made from scratch. The commit message “First global strategy implemented” suggests that it may evolve into something fancier. And in fact the tech tree code and what-unit-to-build-next choices look fairly general in design, nothing like “durr zergling bot make zergling.” Of this crop, I vote Black Crow the Most Likely To Succeed.

in other news

PurpleWave has been headed up the rankings. It used to be weaker than its first draft PurpleCheese, and is now considerably stronger, rated over 2100. I think it is the bot with the greatest recent improvement. I predicted that the -ave name is a sign of success, and it looks as though the prediction is coming true.

a few SparCraft test results

I ran test games of the new SparCraft version versus the old one, using Steamhammer versions that are otherwise identical. So far I ran a 15-game test match with a fast mutalisk strategy, plus a few scattered test games with other strategies.

The bad news: The new SparCraft is slower. Both versions have spiky, unpredictable CPU usage; small differences in the input can make large differences in the runtime. It’s common to see most runs taking microseconds and occasional runs taking milliseconds or tens of milliseconds. New SparCraft seems to spike up more often and several times higher. I often saw it spike up 4 times higher. I’m worried that in a complex situation it could bust the time limit and lose. If so, it will need an annoying mitigation like being run in a separate thread, or cutting the search short when it threatens to run too long.
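Cutting the search short could look roughly like the sketch below. It assumes the search can be broken into small steps that each return the best answer so far, which may or may not be practical inside SparCraft. The function and parameter names are hypothetical.

```python
import time

def search_with_budget(step, budget_seconds, clock=time.perf_counter):
    """Run step() repeatedly until it reports done or the time budget is used up.

    step(best_so_far) must return (new_best, done).
    Returns (best result found, whether the search finished normally).
    """
    deadline = clock() + budget_seconds
    best = None
    while True:
        best, done = step(best)
        # The check happens between steps, so one slow step can still overrun.
        if done or clock() >= deadline:
            return best, done
```

The caller gets whatever answer was ready when time ran out, plus a flag saying whether to trust it fully. The weakness is noted in the comment: if a single step spikes, the budget is still blown, and a separate thread would be the heavier alternative.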

The good news: The new SparCraft is sometimes much more accurate, and I haven’t found a case where it is worse. I tested by fighting the 2 versions against each other, both playing the same opening. It’s only good for testing restricted sets of units, but it is the most sensitive real game test I can do. Other tests would require running more games to make differences show over the noise. The new version scored:

11-4 with 11 gas 10 pool opening, zerglings into fast mutalisks
stalemated games that take too long to finish with a mass ling strategy
2-1 with a mass hydra-ling strategy
2-1 with a dragoon strategy

Only the first result is meaningful in itself. The new SparCraft is much, much more successful with air-to-air battles between mutalisks. New SparCraft claims to support scourge (the function to test whether a unit type is supported returns true), but I can’t tell from the results. Muta vs scourge fights are as confused as ever; mutas don’t know when to attack and when to run. Also muta versus spore colony is as useless as before, to my surprise; the air units still suicide into static defense. Only muta versus muta is improved, but the improvement is giant. The new version pulled out some wins from behind, which was impressive to see.

The other 3 results only mean that the new SparCraft is not glaringly worse in those cases. It might be better, it might be the same, it’s less likely to be downright awful. I didn’t collect enough data to say more.
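A quick binomial calculation shows why the sample sizes matter. It asks how likely each score would be if the two versions were really equal, so that every game is a coin flip. That null hypothesis and the function name are assumptions of this sketch, not part of the original test.

```python
from math import comb

def p_at_least(wins: int, games: int, p: float = 0.5) -> float:
    """Chance of at least `wins` wins in `games` games, each won with probability p."""
    return sum(comb(games, k) * p**k * (1 - p)**(games - k)
               for k in range(wins, games + 1))

print(round(p_at_least(11, 15), 3))   # 11-4 between equal players
print(round(p_at_least(2, 3), 3))     # 2-1 between equal players
```

Equal players would produce an 11-4 score or better only about 6% of the time, while a 2-1 score or better happens half the time, so a 2-1 result carries almost no information.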

Of course I tested only a few unit mixes. There may be many more unit combinations it loves like muta-vs-muta, and many it hates like muta-vs-spore. I hope that by looking at melee units, ranged ground units, and ranged air units, I covered everything at least lightly, but who knows what quirks may be lurking?

My test with identical players playing identically except for the SparCraft version was designed to amplify differences between the SparCraft versions. If I didn’t restrict both sides to play the same opening, the results would not be 11-4, but probably closer. Differences in the openings would add their own effect. The difference in real world muta-vs-muta fights is probably not as big as the 3:1 that the match result suggests, because of all the other differences between bots.

To find out the real world difference, I’ll have to try it on the server.