solid versus daring
A game player of a given strength is solid if it wins reliably against weaker opponents, and daring if it loses more games to weaker opponents and makes up for it by winning some against the stronger. I think the term solid is common. I decided for myself that its opposite should be daring.
The idea applies to all games of skill with winners and losers. You can always find more solid and more daring players, unless the game is so constraining that it leaves no room for stylistic differences. From the point of view of a player with a fixed level of skill, you could say that being solid means that your style of play aims to reduce your risk of losing, while playing daringly means you try to increase your chance of winning. From the point of view of an author, you could say that trying to make your bot more solid means working to reduce exploitable weaknesses that cause losses, while trying to be more daring means creating strengths that will catch out some opponents (like timing attacks or unusual rushes or tech switches). It makes sense for authors of weak bots to focus on daringly beating the stronger, and authors of strong bots to solidly beat the weaker. (Of course it also makes sense to do whatever is more fun.)
I’ve never seen a statistical measure of solidness, in the same way the elo is a statistical measure of strength. It seems widely useful, so I hope somebody has worked one out, or will work one out now that they know about it. A good one seems complicated, though. You could do something like estimate the winning chances each player has against each opponent with a method like that of bayeselo
, then try to fit a measure of deviation from flatness over the range for each player. Does the difference between predicted and measured winning chance vary systematically depending on the predicted winning chance?
Here’s one simple measure for the top finishers in the SSCAIT round robin: What proportion of a bot’s losses came against the top 16? If most losses are against strong opponents, the bot is solid. The measure is approximately statistically fair only for the top few bots. We can see that Iron is solid and Tscmoo and McRave much more daring, while Killerbot and Bereaver are more solid than Tscmoo and McRave. I don’t think this number gives us much insight into whether Iron is more solid than Bereaver.
# | bot | top16 loss rate | % |
---|---|---|---|
1 | Iron | 7/10 | 70% |
2 | Tscmoo | 4/14 | 28% |
3 | McRave | 5/15 | 33% |
4 | Killerbot | 9/19 | 47% |
5 | Bereaver | 11/22 | 50% |
Another simple measure for the stronger bots is: What’s the weakest opponent that you lost to in the SSCAIT round robin? The measure will be noisy, and comparisons only work for players that are close in strength. Also extremely daring lower-rank players like Oleg Ostroumov can distort it. But it’s quick to figure out and that counts for a blog post. I read the results from the unofficial crosstable.
# | bot | worst loss |
---|---|---|
1 | Iron | #31 PurpleCheese |
2 | Tscmoo | #56 NUS Bot |
3 | McRave | #69 FTTankTER |
4 | Killerbot | #60 Oleg Ostroumov |
5 | Bereaver | #35 Dawid Loranc |
6 | Steamhammer | #44 Lukas Moravec |
7 | Wuli | #61 Marine Hell |
8 | CherryPi | #60 Oleg Ostroumov |
My feeling is that Killerbot and Wuli are more solid than this noisy measure gives them credit for, and otherwise the numbers give a rough but fair idea. Iron is more solid than Tscmoo or McRave. Bereaver and Steamhammer are more solid than, say, McRave and CherryPi. In Steamhammer I’ve worked toward solidness, so I’m pleased to have it.
Comments
McRave on :
Jay Scott on :
McRave on :
Thomas Peck on :
krasi0 on :
Jay Scott on :
krasi0 on :
Sq on :