
solid versus daring

A game player of a given strength is solid if it wins reliably against weaker opponents, and daring if it loses more games to weaker opponents and makes up for it by winning some against the stronger. I think the term solid is common. I decided for myself that its opposite should be daring.

The idea applies to all games of skill with winners and losers. You can always find more solid and more daring players, unless the game is so constraining that it leaves no room for stylistic differences. From the point of view of a player with a fixed level of skill, you could say that being solid means that your style of play aims to reduce your risk of losing, while playing daringly means you try to increase your chance of winning. From the point of view of an author, you could say that trying to make your bot more solid means working to reduce exploitable weaknesses that cause losses, while trying to make it more daring means creating strengths that will catch out some opponents (like timing attacks or unusual rushes or tech switches). It makes sense for authors of weak bots to focus on daringly beating the stronger, and for authors of strong bots to focus on solidly beating the weaker. (Of course it also makes sense to do whatever is more fun.)

I’ve never seen a statistical measure of solidness, in the same way that Elo is a statistical measure of strength. It seems widely useful, so I hope somebody has worked one out, or will work one out now that they know about it. A good one seems complicated, though. You could do something like estimate the winning chances each player has against each opponent with a method like that of bayeselo, then try to fit a measure of deviation from flatness over the range for each player. Does the difference between predicted and measured winning chance vary systematically depending on the predicted winning chance?
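Here is a rough sketch of that idea in Python, under some loud assumptions: the results come in as a plain list of (winner, loser) pairs, and the crude logistic fit stands in for bayeselo, which does the rating job properly. For each player it fits a line to (actual − predicted) against the predicted winning chance; a negative slope (overperforming when the prediction is low, underperforming when it is high) would read as daring, a flat line as solid.

    import math
    from collections import defaultdict

    def fit_ratings(results, iters=200, lr=0.1):
        """Crude Elo-like maximum-likelihood fit (ratings in natural log
        units). A stand-in for bayeselo, not a replacement for it."""
        rating = defaultdict(float)
        for _ in range(iters):
            grad = defaultdict(float)
            for w, l in results:
                # predicted chance that w beats l
                p = 1.0 / (1.0 + math.exp(rating[l] - rating[w]))
                grad[w] += 1.0 - p   # winner surprised us: push rating up
                grad[l] -= 1.0 - p   # loser surprised us: push rating down
            for name, g in grad.items():
                rating[name] += lr * g
        return rating

    def daring_slope(results, rating, player):
        """Least-squares slope of (actual - predicted) against predicted
        winning chance. Negative suggests daring, near zero suggests solid."""
        xs, ys = [], []
        for w, l in results:
            if player not in (w, l):
                continue
            opp = l if w == player else w
            p = 1.0 / (1.0 + math.exp(rating[opp] - rating[player]))
            xs.append(p)
            ys.append((1.0 if w == player else 0.0) - p)
        if len(xs) < 2:
            return 0.0
        mx = sum(xs) / len(xs)
        my = sum(ys) / len(ys)
        var = sum((x - mx) ** 2 for x in xs)
        cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
        return cov / var if var else 0.0

Fitting a single slope is only one choice of “deviation from flatness”; binning games by predicted chance, or fitting a curve, would serve the same purpose.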

Here’s one simple measure for the top finishers in the SSCAIT round robin: what proportion of a bot’s losses came against the top 16? If most losses are against strong opponents, the bot is solid. The measure is approximately statistically fair only for the top few bots. We can see that Iron is solid and that Tscmoo and McRave are much more daring, while Killerbot and Bereaver are more solid than Tscmoo and McRave. I don’t think this number gives us much insight into whether Iron is more solid than Bereaver. A small code sketch of the computation follows the table.

#  bot        losses vs top 16   rate
1  Iron       7/10               70%
2  Tscmoo     4/14               28%
3  McRave     5/15               33%
4  Killerbot  9/19               47%
5  Bereaver   11/22              50%
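The computation is mechanical. A minimal sketch, assuming a loss_ranks list holding the finishing rank of each opponent a bot lost to (a data shape I made up for the example):

    def top16_loss_rate(loss_ranks):
        """Proportion of a bot's losses that came against top-16 finishers."""
        vs_top16 = sum(1 for rank in loss_ranks if rank <= 16)
        return vs_top16, len(loss_ranks), round(100.0 * vs_top16 / len(loss_ranks))

    # Iron's row above: top16_loss_rate of its ten loss ranks gives (7, 10, 70).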

Another simple measure for the stronger bots is: what’s the weakest opponent that you lost to in the SSCAIT round robin? The measure will be noisy, and comparisons only work for players that are close in strength. Also, extremely daring lower-ranked players like Oleg Ostroumov can distort it. But it’s quick to figure out, and that counts for a blog post. I read the results from the unofficial crosstable. Again, a small sketch follows the table.

#  bot          worst loss
1  Iron         #31 PurpleCheese
2  Tscmoo       #56 NUS Bot
3  McRave       #69 FTTankTER
4  Killerbot    #60 Oleg Ostroumov
5  Bereaver     #35 Dawid Loranc
6  Steamhammer  #44 Lukas Moravec
7  Wuli         #61 Marine Hell
8  CherryPi     #60 Oleg Ostroumov
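With the same hypothetical loss_ranks shape, this second measure is a one-liner:

    def worst_loss(loss_ranks):
        """Rank of the weakest opponent the bot lost to. Noisy, and
        distorted by daring low-rank opponents like Oleg Ostroumov."""
        return max(loss_ranks)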

My feeling is that Killerbot and Wuli are more solid than this noisy measure gives them credit for, and otherwise the numbers give a rough but fair idea. Iron is more solid than Tscmoo or McRave. Bereaver and Steamhammer are more solid than, say, McRave and CherryPi. In Steamhammer I’ve worked toward solidness, so I’m pleased to have it.


Comments

McRave:

Interesting analysis! I have to agree with your solid vs daring comparisons. Oleg is an interesting bot that works very rarely, but can pack a punch versus zergs. Very daring indeed.

Jay Scott:

In the round robin, Oleg actually scored more upsets over protoss than over zerg. The strategy is risky and not well executed, but even so it is genuinely dangerous when it catches you off guard. I’ve seen master defender Krasi0 lose to it.

McRave:

Wouldn't expect that!

Thomas Peck:

Elo is defined vaguely as the mean of a skill distribution. What you want for the daringness score is something like the standard deviation of the same distribution.
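One way to cash out that suggestion (my reading, not necessarily what Thomas Peck has in mind): back a per-game performance level out of each result and report its standard deviation. A toy sketch, where margin is an arbitrary stand-in for the usual performance-rating bonus:

    import statistics

    def performance_spread(games, rating, margin=1.0):
        """games: list of (opponent, won) pairs for one player; rating: dict
        of log-unit ratings. A win counts as performing `margin` above the
        opponent's level, a loss as `margin` below. Needs at least two games."""
        perfs = [rating[opp] + (margin if won else -margin)
                 for opp, won in games]
        return statistics.stdev(perfs)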

krasi0:

Well, that's the thing with opening learning - you've gotta break some eggs to make an omelette... Oleg crashing during start in 95% of the games against me doesn't really help learning either...

Jay Scott:

Do any bots try to recognize opponent crashes? I think that if the opponent leaves the game and you are not ahead, you might prefer not to learn from that game. Is there any other way to recognize an opponent crash, or is it indistinguishable from the opponent giving up and leaving the game? MegaBot2017 claims to have crash discounting, but when I read the code I found out that it was discounting its own component strategies when they crashed, to help avoid the bad ones.
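For concreteness, the discounting rule described above could be as small as this sketch; the inputs are hypothetical, and deciding whether the bot is “ahead” when the opponent leaves is the hard part:

    def should_learn_from_game(opponent_left_early, we_were_ahead):
        """Skip the learning update when the opponent left (or crashed)
        while we were not ahead: the result says little about our strategy."""
        return not (opponent_left_early and not we_were_ahead)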

krasi0:

I have thought about it a little and decided that it's not worth pondering too much. I would just let it consider opponent crashes as our victories. I mean opponents that crash way too often won't win a Bo9 against me anyway, so why bother. :)

Sq:

One potential daringness score: rank all the bots by their win percent, then take (wins against higher% bots + losses against lower% bots) / (total games). Note that the scores will tend higher towards the middle of the pack; being on the top or bottom of the list requires some consistency.
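In code, Sq’s score might look like this sketch, assuming a crosstable dict mapping (winner, loser) to a game count and a winpct dict holding each bot’s overall win percentage (both data shapes assumed):

    def sq_daringness(crosstable, winpct, player):
        """(wins vs higher-percent bots + losses vs lower-percent bots) / games."""
        upsets = downsets = total = 0
        for (winner, loser), games in crosstable.items():
            if winner == player:
                total += games
                if winpct[loser] > winpct[player]:
                    upsets += games      # wins against higher-percent bots
            elif loser == player:
                total += games
                if winpct[winner] < winpct[player]:
                    downsets += games    # losses against lower-percent bots
        return (upsets + downsets) / total if total else 0.0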
