tournaments - 20 | Starcraft AI blog

how bots like maps

I decided to analyze the maps to see how bots feel about them overall. This data is derived from yesterday’s big table of how much bots like each SSCAIT map. The “spread” column is the mean of the absolute value of the Elo deviation numbers for a given map, across all the bots. I thought of calling it “controversy”; it measures how much bots like or dislike the map. Maps that all bots do OK on get low numbers; maps that some bots love and others hate get high numbers.

The “RMS” column is the root mean square of the same data. Statistically, it’s a fairer measure of the differences. It’s bigger because it puts more weight on outliers. The two measures don’t agree closely.
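As a minimal sketch of the two measures, the snippet below computes both for a handful of Elo deviations. The function names are made up for illustration, and the sample is only the Benzene numbers of the first five bots in the per-map table, not a full column.

```python
import math

def spread(deviations):
    """Mean of the absolute values of the deviations."""
    return sum(abs(d) for d in deviations) / len(deviations)

def rms(deviations):
    """Root mean square of the deviations; squaring weights outliers more."""
    return math.sqrt(sum(d * d for d in deviations) / len(deviations))

# Benzene deviations for krasi0, Iron bot, Marian Devecka,
# Martin Rooijackers and tscmooz (a small subset, for illustration only).
sample = [53, 139, -57, -25, -23]

print("spread:", round(spread(sample), 1))
print("RMS:   ", round(rms(sample), 1))
```

The RMS can never be smaller than the mean absolute value, and the single large value (139) pulls it up further.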

Destination is the most “controversial” map, with 60 Elo spread. If you pick one bot that likes Destination and one bot that dislikes it, on average the bot that likes it will have a 60 Elo advantage, which means a 59% win rate if the bots are otherwise even, nothing devastating. Neo Moon Glaive, the least controversial map, has Elo spread 41, or about a 56% win rate, not much different. Even if you go with the RMS number, the peak 81 Elo RMS difference (Heartbreak Ridge) means a 61% win rate, still not much different.
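The win rates quoted here follow from the standard Elo expected-score formula. A small sketch, with a function name chosen only for this example:

```python
def win_probability(elo_diff):
    """Expected score of the stronger side for a given Elo advantage
    (standard logistic Elo formula on a 400-point scale)."""
    return 1.0 / (1.0 + 10.0 ** (-elo_diff / 400.0))

# The Elo differences mentioned in the text.
for diff in (41, 60, 81):
    print(diff, f"{win_probability(diff):.0%}")
```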

map	spread	RMS
Benzene	45	57
Destination	60	78
HeartbreakRidge	53	81
NeoMoonGlaive	41	53
TauCross	51	74
Andromeda	49	70
CircuitBreaker	54	69
EmpireoftheSun	50	69
FightingSpirit	51	72
Icarus	46	60
Jade	50	64
LaMancha1.1	49	65
Python	47	60
Roadrunner	46	63

Bottom line: On this analysis, the maps don’t seem to be distorting the competition. No highly “controversial” maps are introducing widespread unfairness.

Elo rating variations by map

From the SSCAIT data, I calculated the Elo advantage or disadvantage that each bot sees on each map. If it played all its games on that map, its Elo would change by that much. More or less; there isn’t as much data, so the advantage numbers are less accurate than the original Elo. I increased the Elo K factor to account for the smaller amount of data.
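For readers unfamiliar with the K factor, here is a sketch of one textbook Elo update. It is not the code used for the table; the K values 16 and 32 are placeholders picked for illustration, since the actual values are not given here.

```python
def expected_score(rating, opponent):
    """Standard Elo expected score of `rating` against `opponent`."""
    return 1.0 / (1.0 + 10.0 ** ((opponent - rating) / 400.0))

def elo_update(rating, opponent, score, k):
    """Return the new rating after one game.
    score is 1.0 for a win and 0.0 for a loss; k scales the step size."""
    return rating + k * (score - expected_score(rating, opponent))

# One win between equal players: a larger K moves the rating further,
# so fewer games are needed for it to travel a given distance.
for k in (16, 32):
    print(k, elo_update(1500, 1500, 1.0, k))
```

A larger K lets a rating respond faster to a small sample, at the cost of more noise.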

The 14 active maps:

(2)Benzene.scx
(2)Destination.scx
(2)HeartbreakRidge.scx
(3)NeoMoonGlaive.scx
(3)TauCross.scx
(4)Andromeda.scx
(4)CircuitBreaker.scx
(4)EmpireoftheSun.scm
(4)FightingSpirit.scx
(4)Icarus.scm
(4)Jade.scx
(4)LaMancha1.1.scx
(4)Python.scx
(4)Roadrunner.scx

The Elo ratings are repeated from the original post. I dropped 5 bots of the 103 for lack of data. The top number in each colored cell is the advantage or disadvantage that bot sees when playing on that map, in Elo points. You can look up winning rates for a given advantage in the Elo table. The bottom number is the count of games. Some bots have few games, like the new ZerGreenBot. A few bots have not played on every map, and get “-” instead of numbers.

bot	Elo	Benz	Dest	Hear	NeoM	TauC	Andr	Circ	Empi	Figh	Icar	Jade	LaMa	Pyth	Road	earliest	latest
krasi0	2163 2128	53 164	57 169	26 146	7 159	18 143	26 126	-76 144	-35 158	-73 140	-31 158	23 156	-63 155	-6 156	72 154	2015 Nov 30	2016 Sep 27
Iron bot	2081 1990	139 141	-15 144	47 134	-35 157	53 140	-34 133	-79 129	-36 142	17 145	-12 142	17 148	1 150	-78 137	15 148	2015 Nov 27	2016 Sep 26
Marian Devecka	2065 4117	-57 320	-18 255	70 303	-12 287	-37 322	104 273	45 285	-22 297	23 301	-41 289	-72 309	28 310	26 284	-38 282	2014 Oct 29	2016 Sep 27
Martin Rooijackers	2011 6449	-25 462	56 473	-67 478	0 450	40 477	44 449	88 478	-37 458	-15 463	42 480	-156 446	0 429	31 475	-2 431	2014 Oct 29	2016 Sep 27
tscmooz	1991 4972	-23 380	120 354	-20 323	-52 316	50 354	-50 393	21 354	-9 341	-78 370	43 364	-6 354	29 377	-54 328	29 364	2015 Feb 27	2016 Sep 27
tscmoo	1978 5682	-8 359	44 355	-15 438	39 389	27 445	30 402	12 447	-14 420	-97 410	53 408	-70 385	29 397	41 396	-71 431	2015 Jan 22	2016 Sep 27
LetaBot CIG 2016	1932 444	120 26	33 38	-126 35	5 29	-81 30	40 30	-12 33	-64 26	100 23	2 29	-92 35	14 37	35 45	27 28	2016 Aug 01	2016 Sep 27
WuliBot	1871 984	-9 73	-58 83	66 56	77 77	-32 68	14 59	-22 71	36 84	12 64	-21 64	54 80	-37 67	-55 68	-26 70	2016 Apr 19	2016 Sep 26
Simon Prins	1867 5400	-20 388	-25 410	51 381	67 432	19 356	-18 412	-110 374	97 376	-21 396	-25 385	95 375	-54 324	17 373	-73 418	2015 Jan 25	2016 Sep 27
ICELab	1865 6078	-21 441	-69 442	73 458	-127 377	-52 447	-17 398	-2 455	3 440	-13 425	-25 471	49 421	34 437	72 435	96 431	2014 Oct 29	2016 Sep 27
FlashTest	1863 204	-71 13	-106 12	-117 13	149 22	94 18	31 14	-121 15	2 20	78 15	-57 10	79 13	59 13	4 12	-23 14	2016 Mar 22	2016 Jul 27
Sijia Xu	1849 2313	22 171	30 155	19 182	-8 166	-2 148	-49 164	19 166	-25 165	-45 138	24 171	36 161	-72 153	51 201	0 172	2015 Oct 10	2016 Sep 27
LetaBot SSCAI 2015 Final	1813 416	-10 28	83 31	-41 22	31 34	15 27	-223 37	199 28	-44 29	-96 33	-44 36	-32 27	1 29	79 25	81 30	2016 Aug 04	2016 Sep 27
Dave Churchill	1804 6023	-29 473	200 428	-82 436	18 413	19 433	-74 412	-50 417	72 429	47 445	-68 396	-53 408	100 438	-46 455	-53 440	2014 Oct 29	2016 Sep 27
Chris Coxe	1800 2195	31 153	188 169	66 153	-106 165	-25 149	-79 149	-34 166	-7 167	-18 153	115 146	-124 153	-114 154	114 156	-8 162	2015 Sep 03	2016 Sep 27
Tomas Vajda	1790 6088	-2 441	9 439	21 441	-77 449	-14 443	-18 421	34 398	-7 458	55 422	17 425	61 440	-80 424	-6 439	7 448	2014 Oct 29	2016 Sep 27
Flash	1777 991	-19 70	-163 65	16 59	7 68	-70 59	-12 85	3 76	72 69	23 69	64 87	71 84	-25 61	30 75	3 64	2016 Apr 18	2016 Sep 27
LetaBot IM noMCTS	1766 1226	61 79	127 88	-116 83	39 106	85 87	-114 93	-106 89	-24 93	-28 89	58 93	-66 70	18 86	-2 92	68 78	2016 May 18	2016 Aug 01
Zia bot	1757 536	54 36	-63 40	15 39	-22 41	50 39	-63 43	93 42	-7 34	20 36	-78 33	-58 30	-55 40	66 39	48 44	2016 Jul 07	2016 Sep 27
A Jarocki	1741 932	20 63	121 66	125 62	-42 72	-48 74	18 53	-104 78	74 81	25 76	-93 59	-110 64	-39 55	52 67	3 62	2015 Oct 04	2016 Jan 26
PeregrineBot	1728 1262	-21 104	94 89	-28 95	123 76	-133 90	77 95	-4 77	10 85	-67 94	-32 83	-51 94	11 90	14 103	9 87	2016 Feb 09	2016 Sep 10
tscmoop	1721 1982	155 139	56 145	47 140	50 127	-7 130	53 154	-68 127	13 141	-17 140	-54 171	-59 128	-44 158	-51 148	-74 134	2015 Nov 11	2016 Sep 26
Andrew Smith	1718 6160	63 460	-37 387	46 445	-10 457	-11 452	33 449	-21 463	89 480	22 441	-58 455	-25 445	-29 418	-58 400	-3 408	2014 Oct 29	2016 Sep 27
Florian Richoux	1716 5970	-35 408	-107 425	120 441	42 411	-43 446	-46 443	37 397	152 468	66 416	-25 451	46 433	-90 410	-29 391	-88 430	2014 Oct 29	2016 Sep 27
Carsten Nielsen	1695 4683	-19 361	27 341	-65 334	33 315	-21 330	-1 319	-27 353	53 336	88 369	-35 318	6 318	-78 282	26 340	12 367	2015 Mar 17	2016 Sep 27
Soeren Klett	1687 6002	39 459	35 426	-12 420	63 396	-108 440	39 481	-20 402	-25 451	-4 454	36 433	-63 407	-19 406	71 410	-32 417	2014 Oct 29	2016 Sep 27
Vaclav Horazny	1686 4140	33 291	78 291	-17 302	18 295	-22 274	-46 332	33 290	-31 308	4 284	-38 270	58 313	-96 315	25 296	2 279	2014 Oct 29	2015 Nov 18
La Nuee	1662 558	-66 36	41 24	81 40	100 35	57 44	-11 41	-82 44	-21 36	50 52	-7 34	-78 43	-11 36	68 55	-121 38	2015 Dec 13	2016 Mar 18
Jakub Trancik	1657 6136	102 427	79 454	1 434	-32 443	-64 445	-10 459	-16 438	-106 424	18 439	-35 413	-23 404	-40 463	63 445	63 448	2014 Oct 29	2016 Sep 27
Marek Suppa	1655 4397	61 322	78 334	1 330	14 321	-48 304	-59 328	-88 315	-6 291	25 336	56 326	6 292	-25 300	-41 302	25 296	2015 Jan 05	2016 Mar 18
Krasimir Krystev	1653 4292	-24 322	-59 283	92 304	49 298	8 327	-56 295	-51 296	-43 318	-76 326	-9 318	-24 280	132 318	-58 299	117 308	2014 Oct 29	2016 Mar 10
ASPbot2011	1652 222	-22 21	-178 17	-27 12	-13 17	98 20	154 11	-29 13	87 17	17 10	75 14	-59 17	-135 15	21 18	11 20	2015 Jan 29	2016 Feb 25
Marcin Bartnicki	1633 1377	59 97	43 106	63 88	-54 110	-33 109	-20 110	-21 92	-46 96	-20 97	-12 92	53 92	-38 99	75 102	-48 87	2014 Nov 28	2016 Mar 18
Tomas Cere	1631 6131	-27 446	-41 419	-2 444	-46 443	109 459	14 448	2 429	2 424	48 433	-24 442	-69 417	49 480	-12 426	-2 421	2014 Oct 29	2016 Sep 27
MegaBot	1630 419	-22 29	27 27	6 34	36 27	-27 26	-21 25	-28 21	-1 28	44 41	64 28	-18 37	-2 37	-66 34	9 25	2016 Aug 01	2016 Sep 27
Aurelien Lermant	1622 3673	-30 258	-8 249	15 268	-32 247	30 260	-37 287	-46 250	-73 266	87 262	38 253	51 255	80 285	-20 261	-53 272	2015 Jun 22	2016 Sep 27
Matej Kravjar	1619 983	-29 69	22 67	-96 75	16 71	-266 75	12 65	78 73	74 79	-25 59	20 75	82 75	68 58	105 78	-60 64	2014 Oct 29	2015 Feb 18
Daniel Blackburn	1605 4591	-61 327	-83 332	74 337	8 326	-8 306	66 330	-56 354	-26 350	74 327	-23 312	17 311	-3 319	-10 345	31 315	2014 Oct 29	2016 Jan 26
Gabriel Synnaeve	1584 18	-3 1	-38 3	31 1	24 1	-26 2	-	-	19 1	-	-5 2	13 1	13 2	-9 2	-19 2	2015 Jan 30	2015 Nov 24
David Milec	1566 49	-	-11 5	48 2	-47 4	-73 4	9 4	-2 3	-26 6	-13 2	16 4	25 3	31 5	70 5	-26 2	2015 Jan 13	2015 Jan 20
Odin2014	1565 5602	-60 416	-40 377	16 396	-40 399	-38 400	42 402	-45 380	-68 401	-29 424	58 416	52 393	66 411	14 365	73 422	2014 Dec 21	2016 Sep 11
Gaoyuan Chen	1559 5106	-18 387	-40 348	-20 357	-64 387	59 379	48 352	-18 330	75 359	2 391	-24 362	-53 383	33 371	18 345	1 355	2015 Feb 10	2016 Sep 27
Henri Kumpulainen	1553 881	62 75	-128 64	37 68	51 64	-23 69	33 69	-48 58	15 67	-36 62	22 74	27 48	1 58	-11 54	-1 51	2016 Jan 13	2016 May 31
Martin Dekar	1533 2627	-58 195	14 186	25 189	5 196	-105 168	-27 189	73 178	-3 206	30 174	18 207	72 202	-9 166	-53 201	17 170	2014 Oct 29	2016 Jan 25
Serega	1505 3802	-35 280	104 257	131 262	51 260	-3 278	26 261	-23 275	-114 285	-76 276	6 270	-4 268	-1 279	-100 291	38 260	2015 Jan 31	2016 Jan 26
Chris Ayers	1481 1520	-34 115	-40 124	11 112	-28 106	0 106	-72 111	65 88	69 105	38 113	-52 117	49 109	-44 89	-44 102	82 123	2015 Aug 10	2016 Jan 26
Nathan a David	1481 991	13 57	0 61	-91 77	31 88	-7 65	-34 75	-23 72	103 61	68 65	43 72	-26 78	-110 70	-34 64	68 86	2016 Feb 23	2016 Aug 08
DAIDOES	1471 485	131 36	39 32	47 31	-27 30	1 28	-25 42	123 39	-159 28	29 44	32 34	-64 34	-80 36	-51 30	5 41	2016 Jun 13	2016 Sep 08
Igor Lacik	1454 5852	19 420	48 399	13 447	-28 408	-38 375	31 418	-95 461	111 418	-28 386	-9 442	12 415	-54 405	26 429	-7 429	2014 Oct 29	2016 Sep 08
Matej Istenik	1449 6054	-63 412	18 457	44 458	12 421	-20 429	-9 435	-109 417	-4 472	8 458	88 382	-3 414	8 426	-2 456	33 417	2014 Oct 29	2016 Sep 27
EradicatumXVR	1443 4539	-41 340	18 324	62 309	66 319	-14 322	-5 315	60 322	-90 303	-24 331	-39 330	11 340	24 311	19 347	-47 326	2014 Nov 04	2016 Jan 23
Tomasz Michalski	1432 433	6 29	34 30	-109 34	-35 23	142 27	30 40	-16 27	-92 31	174 32	-54 44	-5 23	12 31	-24 28	-63 34	2015 Dec 22	2016 Mar 18
Oleg Ostroumov	1431 1345	10 92	38 83	63 69	-33 116	74 96	22 98	60 96	-1 116	101 109	-82 80	-104 97	-89 90	80 102	-138 101	2014 Oct 29	2016 Jan 26
NUS Bot	1426 3333	78 216	-20 210	-525 233	64 236	29 249	25 221	-19 257	100 241	23 257	-39 254	133 240	104 232	36 257	11 230	2015 May 19	2016 Sep 06
Martin Pinter	1425 1580	53 114	-54 110	69 108	7 110	-44 122	0 123	8 101	77 108	-60 97	26 131	-31 139	61 112	-53 95	-59 110	2014 Oct 29	2015 Dec 11
Roman Danielis	1417 2945	-82 222	35 209	23 198	18 206	-9 202	13 221	89 202	-5 206	38 207	10 200	-18 225	-32 211	-29 226	-50 210	2014 Oct 29	2016 Sep 26
ZerGreenBot	1416 36	-	-21 3	-13 3	-6 2	13 3	42 2	-29 3	8 1	6 1	-73 5	45 4	-7 4	36 4	0 1	2016 Sep 22	2016 Sep 27
Marek Kadek	1413 5246	41 406	75 382	8 392	5 385	-28 383	-24 357	-12 369	-62 394	-49 359	-24 367	18 379	37 350	18 348	-3 375	2014 Oct 29	2016 May 22
Ian Nicholas DaCosta	1404 2928	-38 217	-31 210	55 214	29 213	-134 222	-62 192	5 193	10 218	-17 205	-18 213	73 233	-47 199	90 186	84 213	2015 Apr 27	2016 Sep 08
AwesomeBot	1403 473	-19 32	-24 31	-11 32	-23 49	20 29	149 29	-18 26	-3 30	9 53	-67 36	4 36	33 30	-12 32	-39 28	2016 Jun 16	2016 Sep 08
Radim Bobek	1390 1151	-85 70	11 96	-36 68	54 77	-39 94	-95 81	-11 78	0 89	184 87	43 91	61 70	-58 75	-72 100	44 75	2015 Oct 01	2016 Mar 06
Adrian Sternmuller	1375 4379	8 316	78 313	-85 330	-74 321	73 325	79 335	73 287	-42 315	-59 330	69 293	-115 288	33 331	-109 305	72 290	2014 Oct 30	2016 Jul 22
Martin Strapko	1366 1144	-55 98	-2 103	-21 82	12 65	-6 83	88 89	20 73	-51 76	23 72	69 80	-57 81	-108 80	50 81	38 81	2014 Oct 29	2016 Jan 26
Maja Nemsilajova	1363 4117	73 301	-73 292	73 322	7 309	31 298	-51 302	-77 269	-10 290	76 264	-36 302	-82 299	14 309	44 270	10 290	2014 Nov 04	2015 Nov 29
Johan Kayser	1361 413	-58 24	153 19	22 33	-50 29	-2 29	27 36	-156 19	-150 28	-59 40	66 28	21 27	128 28	26 42	34 31	2016 Jul 29	2016 Sep 27
UPStarcraftAI	1360 600	-16 43	56 48	6 41	45 48	11 36	12 43	-76 56	22 40	-28 44	52 37	37 43	23 32	-47 40	-98 49	2015 Dec 24	2016 Apr 13
Martin Vlcak	1353 1210	-49 68	-83 97	-51 81	-26 86	-25 98	-31 87	88 60	119 94	35 79	-24 102	16 90	45 90	33 83	-46 95	2016 Feb 16	2016 Sep 07
Johannes Holzfuss	1351 674	-46 52	98 39	24 36	59 51	-42 47	-27 54	-73 57	92 64	-82 49	-6 33	-4 57	-82 52	80 44	8 39	2016 Mar 05	2016 Jun 15
Vojtech Jirsa	1350 2759	-87 212	-79 182	32 200	-8 207	-127 205	6 174	49 184	83 211	27 199	-30 186	79 207	117 193	31 214	-93 185	2015 Jan 12	2015 Sep 05
JompaBot	1349 1043	-151 67	107 78	-180 70	-44 67	-52 83	81 61	95 88	1 83	71 75	82 80	-9 70	-72 77	68 73	2 71	2016 Feb 04	2016 Aug 13
Rob Bogie	1346 651	42 48	-313 54	-193 38	135 45	365 49	-361 47	246 38	-333 43	-418 52	291 55	273 45	298 43	-306 40	274 54	2016 May 14	2016 Sep 06
Christoffer Artmann	1344 395	30 25	-123 23	14 22	57 29	-45 31	-155 22	-36 27	-52 34	143 30	109 32	28 41	-17 31	-51 26	99 22	2016 Aug 07	2016 Sep 27
Marek Gajdos	1331 1370	2 91	-102 100	90 95	-13 102	-139 99	93 87	42 81	39 107	-78 107	-30 106	-7 97	9 94	79 101	15 103	2016 Jan 30	2016 Sep 11
Travis Shelton	1314 1212	38 72	-4 78	77 84	-31 80	-1 70	47 90	-25 105	2 92	-27 100	-30 104	-9 93	44 86	-87 82	5 76	2016 Feb 28	2016 Sep 06
Peter Dobsa	1307 3015	27 213	26 205	-45 218	-71 199	-19 215	82 228	-4 232	81 197	58 212	-28 224	-32 207	42 204	-65 237	-54 224	2015 Jan 11	2015 Oct 02
VeRLab	1304 888	-75 65	-5 52	-16 64	25 75	-20 56	-27 63	84 51	-6 79	42 71	57 77	99 51	6 52	-32 54	-131 78	2016 Feb 28	2016 Aug 01
Bjorn P Mattsson	1295 4432	18 303	75 328	48 340	-4 303	-20 307	39 302	6 333	-26 317	12 304	-58 315	-96 326	-50 345	113 280	-57 329	2015 Apr 05	2016 Sep 27
Lukas Sedlacek	1293 63	48 3	-80 10	74 3	-78 5	2 4	-14 6	-53 9	39 3	77 3	29 4	-15 2	7 3	-18 5	-20 3	2015 Jan 12	2015 Jan 20
Sergei Lebedinskij	1293 1083	-16 59	11 71	-19 77	69 77	47 72	-97 83	70 70	46 69	3 86	-85 85	55 71	36 106	-51 74	-68 83	2015 May 28	2015 Sep 03
Vladimir Jurenka	1278 6041	-87 429	-66 435	-33 454	-28 432	-15 402	11 438	44 453	77 435	-96 443	33 406	15 467	77 435	57 413	10 399	2014 Nov 04	2016 Sep 27
neverdieTRX	1272 334	65 29	-3 27	56 21	-9 26	-38 19	13 20	-82 27	-50 28	44 26	-159 27	67 21	35 25	-3 24	62 14	2016 Jul 19	2016 Sep 10
OpprimoBot	1256 1994	8 131	-23 138	14 144	122 146	70 143	-8 160	33 131	30 153	-70 135	-88 134	38 140	1 149	-93 139	-35 151	2015 Nov 18	2016 Sep 27
Marek Kruzliak	1255 399	-1 31	-92 35	66 23	-46 27	146 28	111 27	-123 23	112 25	-43 36	-99 32	80 26	15 30	-25 25	-101 31	2014 Nov 28	2015 Jan 20
Sungguk Cha	1250 697	-9 47	-38 46	34 54	-148 40	50 52	-62 51	130 48	-23 41	-3 65	-63 51	117 43	76 48	-44 61	-17 50	2016 Jun 05	2016 Sep 27
Jacob Knudsen	1247 1244	-13 79	-32 89	92 99	-20 89	36 81	34 75	70 87	-87 88	-77 81	-5 99	55 87	-7 103	3 94	-47 93	2016 Feb 23	2016 Sep 10
Ludmila Nemsilajova	1228 409	52 27	15 21	13 22	74 25	-90 27	-9 27	65 46	-71 37	-51 32	-75 28	1 30	-24 29	58 30	43 28	2014 Nov 28	2015 Jan 21
Karin Valisova	1226 1067	159 84	-42 86	-11 71	-2 72	-28 74	0 79	-62 68	-40 76	6 72	-16 66	4 68	-29 85	31 82	31 84	2014 Nov 04	2016 Jan 26
HoangPhuc	1209 300	-54 28	-46 32	25 22	3 16	83 21	-88 24	-21 16	51 20	-52 24	56 17	-70 25	-27 15	-37 25	178 15	2016 Jul 18	2016 Sep 07
Sebastian Mahr	1182 1191	-2 67	-61 89	-18 83	-34 96	14 71	118 74	95 87	25 81	25 79	-13 97	-63 94	-40 89	17 97	-61 87	2016 Jan 13	2016 Aug 08
Jan Pajan	1179 997	-15 85	58 71	14 64	-49 77	14 67	37 72	-61 78	2 61	-29 67	-19 70	-77 91	52 72	21 62	53 60	2014 Nov 04	2016 Jan 05
Pablo Garcia Sanchez	1174 579	26 33	11 53	-28 42	-74 33	3 45	35 39	84 34	-49 43	52 39	26 37	24 50	-21 46	-37 51	-51 34	2015 Dec 24	2016 Apr 13
Ivana Kellyerova	1131 1499	-59 115	-89 113	2 99	-20 113	71 111	38 108	-39 95	-6 99	19 125	30 92	53 106	-54 110	-5 97	60 116	2014 Nov 04	2015 Apr 01
Lucia Pivackova	1090 717	-69 50	-1 53	-32 55	20 50	94 41	-13 58	-25 47	23 42	-57 47	39 49	-42 55	-32 50	48 59	46 61	2014 Oct 30	2015 Jan 20
Tae Jun Oh	1036 138	43 11	21 8	-7 10	-40 7	96 8	-18 9	95 6	-35 9	-90 17	129 7	-40 14	-95 11	-21 9	-38 12	2016 Mar 22	2016 Apr 11
Denis Ivancik	1022 418	-65 37	46 30	-23 25	-34 26	109 29	78 21	-26 36	96 26	-90 43	4 28	-35 27	-28 18	-43 34	11 38	2014 Nov 28	2015 Jan 20
ButcherBoy	970 422	38 21	38 23	-68 32	-30 35	-43 34	100 31	46 29	-40 29	-6 35	-40 31	-49 34	128 23	7 30	-81 35	2016 Jun 21	2016 Sep 06
Jon W	964 790	-30 58	47 66	-100 62	6 59	83 45	27 67	4 52	-44 50	7 60	35 47	-9 57	79 49	-66 62	-39 56	2015 Apr 30	2015 Jul 09
Matyas Novy	885 1693	77 103	-76 132	-60 133	-69 122	-2 110	5 120	1 107	67 119	110 104	-59 145	-20 122	66 126	-83 124	44 126	2015 Feb 04	2015 Jul 09

There are some interesting things to see in the chart, but first look at Rob Bogie! That’s the bot MaasCraft. All the bots have preferences, some have strong preferences, but MaasCraft loves some maps and hates others. Why is that? If it could be made to love all the maps....

SSCAIT top 10 crosstables

Krasi0 passed Marian Devecka’s Killerbot to become #1 on SSCAIT on 18 August. A 100 by 100 crosstable is too big, but here’s a crosstable of the games played among the top 10 SSCAIT bots from 17 August to 27 September. The top 10 are chosen based on their Elo ratings at the end of the period. The top number in each box is the winning rate of the bot in that row against the bot in that column; the bottom number is the count of games. The overall column is the winning rate against the other 9 top bots; it doesn’t have to be closely related to the Elo rating, which is computed with all 100 bots.

since 17 Aug	overall	kras	Iron	Mari	Mart	tscmooz	tscmoo	Leta	Wuli	Simo	ICEL
krasi0	66.67%		20% 10	57% 14	100% 17	0% 14	55% 11	100% 7	100% 9	100% 8	100% 9
Iron bot	71.28%	80% 10		78% 9	36% 11	89% 9	64% 14	40% 15	100% 9	90% 10	100% 7
Marian Devecka	65.38%	43% 14	22% 9		100% 6	86% 7	58% 12	82% 11	80% 5	50% 6	100% 8
Martin Rooijackers	56.82%	0% 17	64% 11	0% 6		64% 14	78% 9	70% 10	100% 8	75% 4	100% 9
tscmooz	54.43%	100% 14	11% 9	14% 7	36% 14		50% 8	38% 8	43% 7	100% 6	100% 6
tscmoo	48.75%	45% 11	36% 14	42% 12	22% 9	50% 8		50% 8	0% 4	100% 6	100% 8
LetaBot CIG 2016	55.06%	0% 7	60% 15	18% 11	30% 10	62% 8	50% 8		80% 10	100% 14	67% 6
WuliBot	19.12%	0% 9	0% 9	20% 5	0% 8	57% 7	100% 4	20% 10		0% 8	25% 8
Simon Prins	31.94%	0% 8	10% 10	50% 6	25% 4	0% 6	0% 6	0% 14	100% 8		100% 10
ICELab	11.27%	0% 9	0% 7	0% 8	0% 9	0% 6	0% 8	33% 6	75% 8	0% 10
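A crosstable like this can be built from a plain list of game results. The sketch below assumes each game is a (winner, loser) pair; the bot names and games are invented, and the function names are just for this example.

```python
from collections import defaultdict

def crosstable(games):
    """games: iterable of (winner, loser) pairs.
    Returns (wins, played), both nested dicts indexed [row_bot][column_bot]."""
    wins = defaultdict(lambda: defaultdict(int))
    played = defaultdict(lambda: defaultdict(int))
    for winner, loser in games:
        wins[winner][loser] += 1
        played[winner][loser] += 1
        played[loser][winner] += 1
    return wins, played

def overall(bot, wins, played):
    """Winning rate over all of the bot's games, i.e. weighted by game count,
    not the plain average of the per-opponent percentages."""
    total = sum(played[bot].values())
    return sum(wins[bot].values()) / total if total else None

# Invented results among three hypothetical bots.
games = [("A", "B"), ("A", "B"), ("B", "A"), ("A", "C"), ("C", "B")]
wins, played = crosstable(games)

for bot in ("A", "B", "C"):
    print(bot, f"{overall(bot, wins, played):.2%}")
```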

It’s amazing that Tscmoo zerg scored 100% versus Krasi0 during the period. It’s equally amazing that former champion IceBot has a 0% score against 7 of the other 9 top bots; apparently once you’re better, you’re a lot better. Bots have improved that much in the last 2 years.

Here’s the same crosstable, except starting 1 January 2016.

since 1 Jan	overall	kras	Iron	Mari	Mart	tscmooz	tscmoo	Leta	Wuli	Simo	ICEL
krasi0	45.47%		49% 57	26% 57	78% 64	13% 61	16% 73	100% 10	83% 35	55% 56	55% 51
Iron bot	53.37%	51% 57		41% 37	39% 66	58% 43	54% 59	40% 15	85% 27	55% 47	64% 50
Marian Devecka	63.12%	74% 57	59% 37		57% 100	72% 67	53% 95	79% 14	83% 12	80% 95	49% 106
Martin Rooijackers	53.71%	22% 64	61% 66	43% 100		57% 168	47% 230	64% 11	74% 31	44% 162	81% 164
tscmooz	62.23%	87% 61	42% 43	28% 67	43% 168		66% 142	27% 11	57% 23	77% 138	84% 128
tscmoo	62.43%	84% 73	46% 59	47% 95	53% 230	34% 142		43% 14	53% 17	88% 154	84% 161
LetaBot CIG 2016	56.64%	0% 10	60% 15	21% 14	36% 11	73% 11	57% 14		69% 13	100% 17	75% 8
WuliBot	24.63%	17% 35	15% 27	17% 12	26% 31	43% 23	47% 17	31% 13		16% 19	19% 26
Simon Prins	33.01%	45% 56	45% 47	20% 95	56% 162	23% 138	12% 154	0% 17	84% 19		38% 136
ICELab	33.73%	45% 51	36% 50	51% 106	19% 164	16% 128	16% 161	25% 8	81% 26	62% 136

The number of games in each cell depends on the lifetimes of both bots; not all were active over the whole period. The cells are mostly in paler colors, because many of the bots were updated during the year—they were neither always weak nor always strong.

I want to create a breakdown by map, but the data doesn’t support it. There are 14 active maps. If I use data from the shorter period, there aren’t enough games for every map to be played in each pairing. If I use the longer period, there are usually enough games, but the bots vary in strength over time, so we’ll see a smear.

To break out crosstables by map, I’ll have to combine data one way or another. I could combine over time, using the longer period. I could combine bots by race, or try to compare groups of macro bots versus rush bots. I could lump together maps by number of starting spots. Or I could just do bot-map instead of bot-opponent-map. What do you think is useful? Maybe I should start with bot-map and then break it down further to bot-some group of opponents-map?

comparing AIIDE 2015 and CIG 2016 Elo ratings

The cool technique I had in mind to compare ratings across tournaments turned out not to work. Not cool after all. But 6 bots played unchanged in both AIIDE 2015 and CIG 2016, and we can compare their relative ratings. In this table the subtract column gives the AIIDE 2015 rating minus the CIG 2016 rating.

bot	AIIDE Elo	CIG Elo	subtract	normalize
UAlbertaBot	1895	1778	117	35
Overkill	1890	1796	94	12
Aiur	1784	1687	97	15
TerranUAB	1372	1338	34	-48
OpprimoBot	1231	1154	77	-5
Bonjwa	1171	1099	72	-10
average			82	0

As you might expect, two tournaments with different maps and different opponents give different ratings. UAlbertaBot and Overkill swapped ranks among the 6. But after correcting for the 82 point offset (since only rating differences matter), the ratings turn out to be quite close between the tournaments. The biggest difference is for TerranUAB. Look up 48 points in the Elo table—it says that TerranUAB has a 57% probability of beating itself, not a drastic error.
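The subtract and normalize columns can be reproduced in a few lines. The ratings are copied from the table; the variable names are only for this sketch.

```python
# Ratings of the 6 unchanged bots, copied from the table.
aiide = {"UAlbertaBot": 1895, "Overkill": 1890, "Aiur": 1784,
         "TerranUAB": 1372, "OpprimoBot": 1231, "Bonjwa": 1171}
cig = {"UAlbertaBot": 1778, "Overkill": 1796, "Aiur": 1687,
       "TerranUAB": 1338, "OpprimoBot": 1154, "Bonjwa": 1099}

# "subtract": AIIDE 2015 rating minus CIG 2016 rating.
subtract = {bot: aiide[bot] - cig[bot] for bot in aiide}

# The average difference is the offset between the two rating scales.
offset = sum(subtract.values()) / len(subtract)

# "normalize": what is left after removing the common offset.
normalize = {bot: round(diff - offset) for bot, diff in subtract.items()}

print("offset:", round(offset))
for bot in aiide:
    print(bot, subtract[bot], normalize[bot])
```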

You can try to convert a CIG 2016 rating into a rough estimate of an AIIDE 2015 rating by adding 82. For example, tscmoo terran earned a CIG rating of 1888, which corresponds to an AIIDE rating of 1888+82 = 1970, whereas the tscmoo zerg that played in AIIDE earned a rating there of 2026. The two are different bots, so the 56-point gap is not a direct test of the estimate, but it is about the size of the largest error in the table above. Estimates made this way are likely to be closer for bots near the middle of the pack.

Next: Another mass of colorful crosstables.

CIG 2016 Bayesian Elo ratings

Same as yesterday, Bayesian Elo ratings calculated by bayeselo, this time for CIG 2016. I included both the qualifier and the final, of course. That gives the best possible ratings, so that confidence is higher for the 8 finalists. But the “score” column becomes difficult to interpret, because part of the score of the top 8 bots comes from the final when they faced tougher opposition. You can’t directly compare the scores of bots 1-8 with the scores of 9-16, only the ratings.

Also, with this analysis it doesn’t make sense to compare the rating values between tournaments. Each tournament is independently scaled to have an average rating of 1500. Only the relative ratings of bots in the same tournament can be compared. Ratings are relative.
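A quick check of the scaling, using the Elo column of the CIG 2016 table. The sketch also shows why only differences matter: shifting every rating by the same constant leaves all the gaps unchanged. The shift of 100 is an arbitrary value for illustration.

```python
# Elo column of the CIG 2016 table, in rank order.
cig_ratings = [1888, 1864, 1827, 1796, 1790, 1778, 1746, 1687,
               1679, 1500, 1338, 1158, 1154, 1119, 1099, 579]

average = sum(cig_ratings) / len(cig_ratings)
print("average rating:", average)

def gaps(ratings):
    """Rating difference between each bot and the one ranked below it."""
    return [a - b for a, b in zip(ratings, ratings[1:])]

# Shift the whole scale by an arbitrary constant: the gaps do not change.
shifted = [r + 100 for r in cig_ratings]
print(gaps(cig_ratings) == gaps(shifted))
```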

	bot	score	Elo	95% conf.	better?
1	tscmoo	73%	1888	1872-1904	98.5%
2	Iron	71%	1864	1848-1880	99.9%
3	LetaBot	68%	1827	1811-1843	99.7%
4	Overkill	65%	1796	1781-1812	70.9%
5	ZZZKBot	64%	1790	1775-1805	86.8%
6	UAlbertaBot	63%	1778	1763-1793	99.8%
7	MegaBot	60%	1746	1731-1761	99.9%
8	Aiur	54%	1687	1671-1702	72.7%
9	Tyr	62%	1679	1659-1699	100%
10	Ziabot	46%	1500	1479-1521	100%
11	TerranUAB	34%	1338	1316-1360	100%
12	SRbotOne	22%	1158	1133-1183	59.1%
13	OpprimoBot	22%	1154	1128-1179	97.1%
14	XelnagaII	21%	1119	1092-1145	86.3%
15	Bonjwa	19%	1099	1072-1125	100%
16	Salsa	1%	579	510-636	-

The official results have LetaBot a hair ahead of ZZZKBot, then Overkill following. bayeselo has ZZZKBot and Overkill reversed, saying that LetaBot is clearly superior to Overkill, which is fairly likely to be superior to ZZZKBot. The difference comes about because, of course, the official results include only the final. Martin Rooijackers was justified after all in saying that ZZZKBot had fallen from the top 3. All other results agree with the official ranking. The trailing finalist Aiur is 72.7% likely to be superior to Tyr, so there is some doubt that the best finalists won through (in general the doubt can’t be avoided, though).

The tail-ender Salsa has a wide and asymmetrical confidence interval. It takes more evidence to pin down an extreme rating than a middle-of-the-road rating.

Tomorrow: I’ll try an analysis in which the ratings of unchanged bots are carried over from AIIDE 2015 to CIG 2016, so that we can compare between tournaments. I’m not sure how well it will work, or even if I can get it to work at all, but it will be interesting to try.

AIIDE 2015 Bayesian Elo ratings

Krasi0 asked me to calculate ratings for tournaments using Rémi Coulom’s excellent bayeselo program. Here are ratings for AIIDE 2015.

bayeselo does not calculate basic Elo ratings like my little code snippets. It can’t calculate an Elo curve over time. It assumes that the players are fixed and have one true rating, and it crunches a full-on Bayesian statistical analysis to not only find the rating as accurately as possible, but also a 95% confidence interval so you can see how accurate the rating is. The ratings for the bots that learn, which aren’t fixed in strength as bayeselo assumes, can be seen as measuring the average strength over the tournament—the tournament score is no different in that respect.

The last column of the table is the probability of superiority, bayeselo’s calculated probability that the bot truly is better than the bot ranked immediately below it. The last bot doesn’t get one, of course. (bayeselo calculates this for all pairs, but in a tournament this long it rounds off to 100% for most.)

	bot	score	Elo	95% conf.	better?
1	tscmoo	89%	2026	2002-2050	81.0%
2	ZZZKBot	88%	2011	1988-2035	99.9%
3	UAlbertaBot	80%	1895	1874-1916	61.2%
4	Overkill	81%	1890	1870-1911	99.9%
5	Aiur	73%	1784	1765-1803	99.9%
6	Ximp	68%	1712	1694-1731	99.9%
7	Skynet	64%	1666	1648-1684	50.7%
8	IceBot	64%	1666	1648-1684	88.4%
9	Xelnaga	63%	1650	1632-1668	81.4%
10	LetaBot	61%	1638	1620-1656	99.9%
11	Tyr	54%	1553	1534-1572	96.0%
12	GarmBot	52%	1531	1513-1549	100%
13	NUSBot	39%	1380	1362-1398	73.1%
14	TerranUAB	38%	1372	1354-1390	99.8%
15	Cimex	36%	1335	1316-1353	99.6%
16	CruzBot	32%	1299	1280-1317	99.9%
17	OpprimoBot	28%	1231	1211-1250	96.7%
18	Oritaka	26%	1205	1185-1225	84.0%
19	Stone	25%	1190	1170-1210	91.3%
20	Bonjwa	23%	1171	1151-1191	100%
21	Yarmouk	9%	913	885-939	95.0%
22	SusanooTricks	8%	882	853-910	-

In the official results, Overkill came in ahead of UAlbertaBot with a higher tournament score. bayeselo ratings are more accurate than score because they take into account more information, and bayeselo says UAlbertaBot > Overkill with probability 61%. As explained in the original results, it’s a statistical tie, but bayeselo says it’s not an even tie but a little tilted in a counterintuitive way.

Skynet looks dead even with IceBot in the rounded-off numbers above. bayeselo says that Skynet > IceBot with probability 50.7%, a hair off dead even. Even the large number of games in this tournament could not rank all the bots accurately.

Tomorrow: The same for CIG 2016.

7 eras of SSCAIT champions

The player with the highest rating on each day I’ll call the “champion” of the day. It turns out that the SSCAIT daily champions change frequently. I calculated all the daily champions and manually divided the time period into eras that looked to me to have different sets of champions.
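Finding the daily champions is mechanical; only the division into eras was done by hand. A sketch under assumptions: ratings are held as a dict from an ISO date string to a dict of bot ratings, and the bot names and numbers below are invented.

```python
from itertools import groupby

def daily_champions(ratings_by_day):
    """ratings_by_day: {date: {bot: rating}}.
    Returns a list of (date, champion) in date order.
    On an exact tie, max() keeps whichever bot it sees first."""
    return [(day, max(ratings, key=ratings.get))
            for day, ratings in sorted(ratings_by_day.items()) if ratings]

def reigns(champions):
    """Collapse consecutive days with the same champion into
    (champion, first_day, last_day) tuples."""
    result = []
    for bot, run in groupby(champions, key=lambda pair: pair[1]):
        days = [day for day, _ in run]
        result.append((bot, days[0], days[-1]))
    return result

# Invented ratings for two hypothetical bots over three days.
ratings_by_day = {
    "2016-08-16": {"X": 2000, "Y": 1990},
    "2016-08-17": {"X": 2001, "Y": 1995},
    "2016-08-18": {"X": 1998, "Y": 2005},
}
print(reigns(daily_champions(ratings_by_day)))
```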

from	to	champions
2013 12 25	2015 02 16	usually ICELab, sometimes Tomas Vajda (XIMP)
2015 02 17	2015 02 25	Tomas Vajda and Dave Churchill (UAlbertaBot)
2015 02 26	2015 06 17	mostly tscmoo, sometimes Tomas Vajda or ICELab
2015 06 21	2015 11 19	largely tscmooz, sometimes Tomas Vajda or ICELab, occasionally Dave Churchill or Florian Richoux (AIUR), 3 days of Sijia Xu (Overkill)
2015 11 20	2016 02 08	Marian Devecka (Killerbot), 2 days of tscmooz
2016 02 09	2016 08 17	mostly Marian Devecka, sometimes tscmoop, 4 days of Iron
2016 08 18	2016 09 26	mostly Krasi0, sometimes Marian Devecka

You can download the SSCAIT champions file in .csv format.

SSCAIT Elo ratings over time

Here it is, the great chart of SSCAIT Elo ratings over time. Well, not here actually; I put it on a separate page so that not every blog visitor has to load the mass of JavaScript and data.

SSCAIT interactive ratings chart for 100 bots

The chart is generated from this csv file. Spreadsheet software or stats software should open it right up, if you want to poke the data yourself. It’s 950 lines of 101 columns each, a date and ratings for 100 bots.

Data in the csv file is filled in for each day from the bot’s first to its last game in the original raw data file (which is just a list of games), and left blank on other days. There may be an off-by-one error causing some bots to miss their last day of data; I didn’t bother to verify it since it’s hardly visible. Some bots have short lifetimes and only appear on the graph as a brief squiggle. Some bots have inactive periods in between their first and last games; the inactive periods with no games graph as flat lines. In excluding the 3 bots with insufficient games, I also removed them from the rating calculation, which improves the ratings to a tiny degree. The rankings stay exactly the same for all 100 bots, though.
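If you would rather read the file with a script than a spreadsheet, something like the following should work. It is a sketch under assumptions: that the first row is a header with the bot names, that the first column is the date, and that blank cells mean the bot was outside its lifetime. The sample text is invented, not taken from the real file.

```python
import csv
import io

def load_ratings(text):
    """Parse csv text into {bot: [(date, rating), ...]}, skipping blank cells."""
    reader = csv.reader(io.StringIO(text))
    header = next(reader)
    bots = header[1:]               # assumed layout: date column first, then one column per bot
    series = {bot: [] for bot in bots}
    for row in reader:
        date, cells = row[0], row[1:]
        for bot, cell in zip(bots, cells):
            if cell.strip():        # blank means no rating that day
                series[bot].append((date, float(cell)))
    return series

# Invented three-day sample with two hypothetical bots.
sample = (
    "date,botA,botB\n"
    "2016-01-01,1500,\n"
    "2016-01-02,1510,1490\n"
    "2016-01-03,,1495\n"
)
series = load_ratings(sample)
for bot, points in series.items():
    print(bot, points[0][0], "to", points[-1][0])
```

For the real file, pass the contents of the downloaded csv instead of `sample`.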

a few preliminary Elo charts

The SSCAIT data includes 103 bots, and 3 of them have 10 or fewer games, leaving exactly 100 with useful rating curves. I’ve crunched and formatted the data, and now all I have to do is draw it. I hope to create a humongalicious zoomable graph of daily rating data for all 100 bots—if I can find a way to draw that many lines on a graph in a way that’s usable. Well, I’ll think of something. I chose powerful graphing software that’s fully capable of doing the job, but it’s complicated and my skill and patience may be less than fully capable....

Anyway, another appetizer. Here are static rating graphs for 2016 for 3 of the top-rated terran bots, all of which had many updates this year. The graphs run from 1 January 2016 to 27 September 2016. The authors may be interested in comparing their updates with movements in their graph. Krasi0 shows steady improvement since April, while the other two look more irregular.

graph of Krasi0’s rating in 2016

graph of Iron’s rating in 2016

graph of tscmoo terran’s rating in 2016

SSCAIT initial and current Elo ratings

I’m still working on Elo curves over time, but today I have Elo ratings for each bot in the SSCAIT data at the beginning and end of its career. Here is yesterday’s table plus the new info, now sorted by decreasing current rating—the bot’s real strength yesterday as best we can measure. The topmost ratings are, to my surprise, exactly in the order I expected!

To make the ratings easier to interpret, I added two columns labeled “expect”. These are the expected winning rate of the bot against the average opponent. The rating system is designed so that the average Elo rating is constant at 1500, and it’s easy to compute the expected winning rate against an opponent rated 1500. The constant average rating, by the way, means that a bot which remains the same can see its rating decline over time if its opponents improve.
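Why the average stays constant, and why an unchanged bot can sink: in textbook Elo with the same K for both players, whatever the winner gains the loser gives up. The sketch below assumes exactly that; K = 32, the bot names, and the 20-game run are made up for illustration and are not taken from the SSCAIT calculation.

```python
def expected_score(rating, opponent):
    """Standard Elo expected score of `rating` against `opponent`."""
    return 1.0 / (1.0 + 10.0 ** ((opponent - rating) / 400.0))

def play(ratings, winner, loser, k=32):
    """Apply one zero-sum Elo update in place."""
    gain = k * (1.0 - expected_score(ratings[winner], ratings[loser]))
    ratings[winner] += gain
    ratings[loser] -= gain   # the loser drops by exactly what the winner gains

# Three hypothetical bots, all starting at the 1500 average.
ratings = {"improver": 1500.0, "static": 1500.0, "bystander": 1500.0}

# The improving bot keeps beating the unchanged one.
for _ in range(20):
    play(ratings, "improver", "static")

average = sum(ratings.values()) / len(ratings)
print(ratings)
print("average:", round(average, 6))
```

The unchanged bot ends up below 1500 even though its play never got worse, while the average does not move.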

Ratings are not accurate for bots with a very small number of games. I plan to exclude those bots from the curves over time.

bot	win %	initial Elo	initial expect	current Elo	current expect	games	earliest	latest
krasi0	68.77%	1593	63.07%	2163	97.85%	2142	2015 Nov 30	2016 Sep 27
Iron bot	77.74%	1580	61.31%	2081	96.59%	1999	2015 Nov 27	2016 Sep 26
Marian Devecka	58.66%	1790	84.15%	2065	96.28%	6289	2013 Dec 25	2016 Sep 27
Martin Rooijackers	68.50%	1840	87.62%	2011	94.99%	7290	2014 Jul 28	2016 Sep 27
tscmooz	79.80%	1823	86.52%	1991	94.41%	5006	2015 Feb 27	2016 Sep 27
tscmoo	72.06%	1838	87.50%	1978	94.00%	5719	2015 Jan 22	2016 Sep 27
LetaBot CIG 2016	75.68%	1748	80.65%	1932	92.32%	444	2016 Aug 01	2016 Sep 27
WuliBot	72.76%	1773	82.80%	1871	89.43%	984	2016 Apr 19	2016 Sep 26
Simon Prins	55.48%	1513	51.87%	1867	89.21%	5431	2015 Jan 25	2016 Sep 27
ICELab	81.12%	2189	98.14%	1865	89.10%	8344	2013 Dec 25	2016 Sep 27
FlashTest	69.44%	1744	80.29%	1863	88.99%	216	2016 Mar 22	2016 Jul 27
Sijia Xu	71.65%	1850	88.23%	1849	88.17%	2328	2015 Oct 10	2016 Sep 27
LetaBot SSCAI 2015 Final	65.87%	1710	77.01%	1813	85.84%	416	2016 Aug 04	2016 Sep 27
Dave Churchill	75.48%	1985	94.22%	1804	85.19%	8275	2013 Dec 25	2016 Sep 27
Chris Coxe	73.10%	1754	81.19%	1800	84.90%	2201	2015 Sep 03	2016 Sep 27
Tomas Vajda	79.37%	2169	97.92%	1790	84.15%	8372	2013 Dec 25	2016 Sep 27
Flash	65.69%	1458	43.98%	1777	83.13%	991	2016 Apr 18	2016 Sep 27
LetaBot IM noMCTS	60.93%	1645	69.73%	1766	82.22%	1226	2016 May 18	2016 Aug 01
Zia bot	52.24%	1568	59.66%	1757	81.45%	536	2016 Jul 07	2016 Sep 27
A Jarocki	62.77%	1711	77.11%	1741	80.02%	932	2015 Oct 04	2016 Jan 26
PeregrineBot	57.29%	1692	75.12%	1728	78.79%	1276	2016 Feb 09	2016 Sep 10
tscmoop	78.16%	1895	90.67%	1721	78.11%	1992	2015 Nov 11	2016 Sep 26
Andrew Smith	65.00%	1705	76.50%	1718	77.81%	8391	2013 Dec 25	2016 Sep 27
Florian Richoux	62.11%	1770	82.55%	1716	77.62%	8203	2013 Dec 25	2016 Sep 27
Carsten Nielsen	66.08%	1708	76.81%	1695	75.45%	4711	2015 Mar 17	2016 Sep 27
Soeren Klett	63.62%	2068	96.34%	1687	74.58%	8277	2013 Dec 25	2016 Sep 27
Vaclav Horazny	37.35%	1066	7.60%	1686	74.47%	6455	2013 Dec 25	2015 Nov 18
La Nuee	51.61%	1499	49.86%	1662	71.76%	558	2015 Dec 13	2016 Mar 18
Jakub Trancik	45.08%	1755	81.27%	1657	71.17%	8416	2013 Dec 25	2016 Sep 27
Marek Suppa	51.85%	1746	80.47%	1655	70.94%	4413	2015 Jan 05	2016 Mar 18
Krasimir Krystev	70.52%	2033	95.56%	1653	70.70%	6510	2013 Dec 25	2016 Mar 10
ASPbot2011	49.78%	1671	72.80%	1652	70.58%	227	2015 Jan 29	2016 Feb 25
Marcin Bartnicki	60.42%	1855	88.53%	1633	68.26%	1435	2014 Nov 28	2016 Mar 18
Tomas Cere	61.11%	1888	90.32%	1631	68.01%	8373	2013 Dec 25	2016 Sep 27
MegaBot	49.40%	1576	60.77%	1630	67.88%	419	2016 Aug 01	2016 Sep 27
Aurelien Lermant	58.26%	1688	74.69%	1622	66.87%	3687	2015 Jun 22	2016 Sep 27
Matej Kravjar	49.57%	1723	78.31%	1619	66.49%	3234	2013 Dec 25	2015 Feb 18
Daniel Blackburn	43.79%	1651	70.46%	1605	64.67%	6883	2013 Dec 25	2016 Jan 26
Gabriel Synnaeve	45.96%	1737	79.65%	1584	61.86%	1658	2013 Dec 25	2015 Nov 24
David Milec	49.09%	1552	57.43%	1566	59.39%	55	2015 Jan 13	2015 Jan 20
Odin2014	55.65%	1659	71.41%	1565	59.25%	5648	2014 Dec 21	2016 Sep 11
Gaoyuan Chen	48.05%	1582	61.59%	1559	58.41%	5118	2015 Feb 10	2016 Sep 27
Henri Kumpulainen	38.81%	1447	42.43%	1553	57.57%	894	2016 Jan 13	2016 May 31
Martin Dekar	33.14%	1429	39.92%	1533	54.73%	4910	2013 Dec 25	2016 Jan 25
Serega	48.20%	1771	82.64%	1505	50.72%	3803	2015 Jan 31	2016 Jan 26
Chris Ayers	35.53%	1610	65.32%	1481	47.27%	1520	2015 Aug 10	2016 Jan 26
Nathan a David	39.34%	1446	42.29%	1481	47.27%	1004	2016 Feb 23	2016 Aug 08
DAIDOES	34.02%	1370	32.12%	1471	45.84%	485	2016 Jun 13	2016 Sep 08
FlashZerg	0.00%	1474	46.27%	1459	44.13%	7	2016 Apr 24	2016 May 12
Igor Lacik	39.32%	1608	65.06%	1454	43.42%	8073	2013 Dec 25	2016 Sep 08
Matej Istenik	44.74%	1709	76.91%	1449	42.71%	8297	2013 Dec 25	2016 Sep 27
EradicatumXVR	40.88%	1537	55.30%	1443	41.87%	4687	2013 Dec 25	2016 Jan 23
Ibrahim Awwal	30.57%	1510	51.44%	1437	41.03%	530	2013 Dec 25	2014 Mar 24
Tomasz Michalski	27.02%	1314	25.53%	1432	40.34%	433	2015 Dec 22	2016 Mar 18
Oleg Ostroumov	48.75%	1714	77.41%	1431	40.20%	3641	2013 Dec 25	2016 Jan 26
NUS Bot	35.72%	1482	47.41%	1426	39.51%	3337	2015 May 19	2016 Sep 06
Martin Pinter	28.98%	1409	37.20%	1425	39.37%	3740	2013 Dec 25	2015 Dec 11
Roman Danielis	45.63%	1688	74.69%	1417	38.28%	5155	2013 Dec 25	2016 Sep 26
ZerGreenBot	22.22%	1404	36.53%	1416	38.14%	36	2016 Sep 22	2016 Sep 27
Rafael Bocquet	0.00%	1450	42.85%	1415	38.01%	10	2015 Jun 23	2015 Jun 26
Flashrelease	0.00%	1449	42.71%	1413	37.73%	8	2016 Apr 24	2016 Apr 24
Marek Kadek	37.29%	1557	58.13%	1413	37.73%	7641	2013 Dec 25	2016 May 22
Ian Nicholas DaCosta	37.12%	1394	35.20%	1404	36.53%	2928	2015 Apr 27	2016 Sep 08
AwesomeBot	29.81%	1326	26.86%	1403	36.39%	473	2016 Jun 16	2016 Sep 08
Radim Bobek	23.37%	1315	25.64%	1390	34.68%	1151	2015 Oct 01	2016 Mar 06
Adrian Sternmuller	26.89%	1436	40.89%	1375	32.75%	4529	2013 Dec 25	2016 Jul 22
Martin Strapko	19.76%	1388	34.42%	1366	31.62%	3386	2013 Dec 25	2016 Jan 26
Maja Nemsilajova	23.81%	1365	31.49%	1363	31.25%	4246	2013 Dec 25	2015 Nov 29
Johan Kayser	24.46%	1294	23.40%	1361	31.00%	413	2016 Jul 29	2016 Sep 27
UPStarcraftAI	24.75%	1346	29.18%	1360	30.88%	610	2015 Dec 24	2016 Apr 13
Martin Vlcak	28.92%	1370	32.12%	1353	30.02%	1224	2016 Feb 16	2016 Sep 07
Johannes Holzfuss	35.04%	1531	54.45%	1351	29.78%	685	2016 Mar 05	2016 Jun 15
Vojtech Jirsa	14.14%	1186	14.09%	1350	29.66%	2786	2015 Jan 12	2015 Sep 05
JompaBot	21.99%	1316	25.75%	1349	29.54%	1055	2016 Feb 04	2016 Aug 13
Rob Bogie	31.34%	1335	27.89%	1346	29.18%	651	2016 May 14	2016 Sep 06
Christoffer Artmann	20.51%	1289	22.89%	1344	28.95%	395	2016 Aug 07	2016 Sep 27
Marek Gajdos	22.69%	1251	19.26%	1331	27.43%	1384	2016 Jan 30	2016 Sep 11
Travis Shelton	23.59%	1390	34.68%	1314	25.53%	1221	2016 Feb 28	2016 Sep 06
Peter Dobsa	13.25%	1227	17.20%	1307	24.77%	3027	2015 Jan 11	2015 Oct 02
VeRLab	17.06%	1241	18.38%	1304	24.45%	897	2016 Feb 28	2016 Aug 01
Andrej Sekac	11.76%	1359	30.75%	1296	23.61%	68	2013 Dec 25	2014 Jan 04
Bjorn P Mattsson	22.22%	1351	29.78%	1295	23.50%	4442	2015 Apr 05	2016 Sep 27
Lukas Sedlacek	22.86%	1344	28.95%	1293	23.30%	70	2015 Jan 12	2015 Jan 20
Sergei Lebedinskij	13.30%	1178	13.55%	1293	23.30%	1083	2015 May 28	2015 Sep 03
Vladimir Jurenka	38.45%	1635	68.51%	1278	21.79%	6167	2013 Dec 25	2016 Sep 27
neverdieTRX	20.66%	1265	20.54%	1272	21.21%	334	2016 Jul 19	2016 Sep 10
OpprimoBot	21.85%	1321	26.30%	1256	19.71%	2009	2015 Nov 18	2016 Sep 27
Marek Kruzliak	14.45%	1151	11.83%	1255	19.62%	934	2013 Dec 25	2015 Jan 20
Sungguk Cha	18.65%	1207	15.62%	1250	19.17%	697	2016 Jun 05	2016 Sep 27
Jacob Knudsen	20.53%	1083	8.31%	1247	18.90%	1257	2016 Feb 23	2016 Sep 10
Ludmila Nemsilajova	16.04%	1133	10.79%	1228	17.28%	505	2013 Dec 25	2015 Jan 21
Karin Valisova	17.68%	1238	18.12%	1226	17.12%	1171	2013 Dec 25	2016 Jan 26
HoangPhuc	15.67%	1132	10.73%	1209	15.77%	300	2016 Jul 18	2016 Sep 07
Sebastian Mahr	15.06%	1205	15.47%	1182	13.82%	1202	2016 Jan 13	2016 Aug 08
Jan Pajan	14.48%	1210	15.85%	1179	13.61%	1119	2013 Dec 25	2016 Jan 05
Pablo Garcia Sanchez	12.20%	1123	10.25%	1174	13.28%	590	2015 Dec 24	2016 Apr 13
Ivana Kellyerova	11.47%	1129	10.57%	1131	10.68%	1630	2013 Dec 25	2015 Apr 01
Lucia Pivackova	13.29%	1111	9.63%	1090	8.63%	835	2013 Dec 25	2015 Jan 20
Tae Jun Oh	4.55%	1069	7.72%	1036	6.47%	154	2016 Mar 22	2016 Apr 11
Denis Ivancik	10.76%	1102	9.19%	1022	6.00%	502	2013 Dec 25	2015 Jan 20
ButcherBoy	4.74%	921	3.45%	970	4.52%	422	2016 Jun 21	2016 Sep 06
Jon W	5.06%	920	3.43%	964	4.37%	790	2015 Apr 30	2015 Jul 09
Matyas Novy	6.32%	1130	10.62%	885	2.82%	1693	2015 Feb 04	2015 Jul 09

How did I get the initial ratings? I had a cute idea. One of the issues with computing Elo ratings over time is: How do you initialize the ratings? Most systems either start everybody with the same rating, which makes an ugly graph, or use a different and less accurate method to estimate the rating in early games. But in this case I have the whole data set in hand. I set the final rating of every bot to the same rating and computed ratings backwards in time to find an initial rating. Then I threw away everything except the initial rating, and calculated the real ratings forward in time to find the ratings over time and the final ratings. That way every data point is equally good, from beginning to end. I doubt I’m the first to think of it, but it’s a cute idea and I’m pleased.
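The two-pass idea can be sketched in a few lines of Python. This is one reading of it, not the code that produced the table: I assume “backwards in time” means applying the ordinary Elo update to the games in reverse order, and the K factor of 16 and the function names are placeholders of mine.

```python
def expected(r_a, r_b):
    """Expected score of a player rated r_a against one rated r_b."""
    return 1.0 / (1.0 + 10.0 ** ((r_b - r_a) / 400.0))


def run_elo(games, start, k=16.0):
    """Apply standard Elo updates in the order given; return the end ratings.

    games: iterable of (winner, loser) pairs. k is a HYPOTHETICAL K factor.
    """
    ratings = dict(start)
    for winner, loser in games:
        gain = k * (1.0 - expected(ratings[winner], ratings[loser]))
        ratings[winner] += gain
        ratings[loser] -= gain
    return ratings


def backward_then_forward(games, base=1500.0, k=16.0):
    """Estimate initial ratings with a reverse pass, then rate forward."""
    games = list(games)  # chronological order, oldest first
    flat = {bot: base for game in games for bot in game}
    # Pass 1: everyone ends at the same rating; replay newest game first.
    initial = run_elo(reversed(games), flat, k)
    # Pass 2: keep only the initial ratings and rate forward in time.
    final = run_elo(games, initial, k)
    return initial, final


# Toy data, not SSCAIT results: A always wins, C always loses.
toy = [("A", "B"), ("A", "C"), ("B", "C"), ("A", "B")]
initial, final = backward_then_forward(toy)
print(initial)
print(final)
```

Because every update is zero-sum, the average rating stays at the base value through both passes, which fits the constant 1500 average mentioned earlier.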

Next: I’ll find some sensible way to plot the curves. Stand by!

SSCAIT career records

Krasimir Krastev aka Krasi0 sent me a file of game results from SSCAIT, including 141,163 games recorded between 25 December 2013 and today. (Obviously it doesn’t include all games played today.) He’s particularly interested in the evolution of Elo ratings over time and my colorful crosstables per map.

It may take me a while to get to that stuff. Here’s a down payment. First, the career record of the 103 bots in the data, with win rates and dates. The top career win rate is IceBot from ICELab, followed by Tscmoo zerg and Tomas Vajda’s XIMP. Of course career win rate is not a fair comparison for bots which improved greatly over their careers, or which have shorter careers.

bot	win %	games	earliest	latest
A Jarocki	62.77%	932	2015 Oct 04	2016 Jan 26
Adrian Sternmuller	26.89%	4529	2013 Dec 25	2016 Jul 22
Andrej Sekac	11.76%	68	2013 Dec 25	2014 Jan 04
Andrew Smith	65.00%	8391	2013 Dec 25	2016 Sep 27
ASPbot2011	49.78%	227	2015 Jan 29	2016 Feb 25
Aurelien Lermant	58.26%	3687	2015 Jun 22	2016 Sep 27
AwesomeBot	29.81%	473	2016 Jun 16	2016 Sep 08
Bjorn P Mattsson	22.22%	4442	2015 Apr 05	2016 Sep 27
ButcherBoy	4.74%	422	2016 Jun 21	2016 Sep 06
Carsten Nielsen	66.08%	4711	2015 Mar 17	2016 Sep 27
Chris Ayers	35.53%	1520	2015 Aug 10	2016 Jan 26
Chris Coxe	73.10%	2201	2015 Sep 03	2016 Sep 27
Christoffer Artmann	20.51%	395	2016 Aug 07	2016 Sep 27
DAIDOES	34.02%	485	2016 Jun 13	2016 Sep 08
Daniel Blackburn	43.79%	6883	2013 Dec 25	2016 Jan 26
Dave Churchill	75.48%	8275	2013 Dec 25	2016 Sep 27
David Milec	49.09%	55	2015 Jan 13	2015 Jan 20
Denis Ivancik	10.76%	502	2013 Dec 25	2015 Jan 20
EradicatumXVR	40.88%	4687	2013 Dec 25	2016 Jan 23
Flash	65.69%	991	2016 Apr 18	2016 Sep 27
Flashrelease	0.00%	8	2016 Apr 24	2016 Apr 24
FlashTest	69.44%	216	2016 Mar 22	2016 Jul 27
FlashZerg	0.00%	7	2016 Apr 24	2016 May 12
Florian Richoux	62.11%	8203	2013 Dec 25	2016 Sep 27
Gabriel Synnaeve	45.96%	1658	2013 Dec 25	2015 Nov 24
Gaoyuan Chen	48.05%	5118	2015 Feb 10	2016 Sep 27
Henri Kumpulainen	38.81%	894	2016 Jan 13	2016 May 31
HoangPhuc	15.67%	300	2016 Jul 18	2016 Sep 07
Ian Nicholas DaCosta	37.12%	2928	2015 Apr 27	2016 Sep 08
Ibrahim Awwal	30.57%	530	2013 Dec 25	2014 Mar 24
ICELab	81.12%	8344	2013 Dec 25	2016 Sep 27
Igor Lacik	39.32%	8073	2013 Dec 25	2016 Sep 08
Iron bot	77.74%	1999	2015 Nov 27	2016 Sep 26
Ivana Kellyerova	11.47%	1630	2013 Dec 25	2015 Apr 01
Jacob Knudsen	20.53%	1257	2016 Feb 23	2016 Sep 10
Jakub Trancik	45.08%	8416	2013 Dec 25	2016 Sep 27
Jan Pajan	14.48%	1119	2013 Dec 25	2016 Jan 05
Johan Kayser	24.46%	413	2016 Jul 29	2016 Sep 27
Johannes Holzfuss	35.04%	685	2016 Mar 05	2016 Jun 15
JompaBot	21.99%	1055	2016 Feb 04	2016 Aug 13
Jon W	5.06%	790	2015 Apr 30	2015 Jul 09
Karin Valisova	17.68%	1171	2013 Dec 25	2016 Jan 26
krasi0	68.77%	2142	2015 Nov 30	2016 Sep 27
Krasimir Krystev	70.52%	6510	2013 Dec 25	2016 Mar 10
La Nuee	51.61%	558	2015 Dec 13	2016 Mar 18
LetaBot CIG 2016	75.68%	444	2016 Aug 01	2016 Sep 27
LetaBot IM noMCTS	60.93%	1226	2016 May 18	2016 Aug 01
LetaBot SSCAI 2015 Final	65.87%	416	2016 Aug 04	2016 Sep 27
Lucia Pivackova	13.29%	835	2013 Dec 25	2015 Jan 20
Ludmila Nemsilajova	16.04%	505	2013 Dec 25	2015 Jan 21
Lukas Sedlacek	22.86%	70	2015 Jan 12	2015 Jan 20
Maja Nemsilajova	23.81%	4246	2013 Dec 25	2015 Nov 29
Marcin Bartnicki	60.42%	1435	2014 Nov 28	2016 Mar 18
Marek Gajdos	22.69%	1384	2016 Jan 30	2016 Sep 11
Marek Kadek	37.29%	7641	2013 Dec 25	2016 May 22
Marek Kruzliak	14.45%	934	2013 Dec 25	2015 Jan 20
Marek Suppa	51.85%	4413	2015 Jan 05	2016 Mar 18
Marian Devecka	58.66%	6289	2013 Dec 25	2016 Sep 27
Martin Dekar	33.14%	4910	2013 Dec 25	2016 Jan 25
Martin Pinter	28.98%	3740	2013 Dec 25	2015 Dec 11
Martin Rooijackers	68.50%	7290	2014 Jul 28	2016 Sep 27
Martin Strapko	19.76%	3386	2013 Dec 25	2016 Jan 26
Martin Vlcak	28.92%	1224	2016 Feb 16	2016 Sep 07
Matej Istenik	44.74%	8297	2013 Dec 25	2016 Sep 27
Matej Kravjar	49.57%	3234	2013 Dec 25	2015 Feb 18
Matyas Novy	6.32%	1693	2015 Feb 04	2015 Jul 09
MegaBot	49.40%	419	2016 Aug 01	2016 Sep 27
Nathan a David	39.34%	1004	2016 Feb 23	2016 Aug 08
neverdieTRX	20.66%	334	2016 Jul 19	2016 Sep 10
NUS Bot	35.72%	3337	2015 May 19	2016 Sep 06
Odin2014	55.65%	5648	2014 Dec 21	2016 Sep 11
Oleg Ostroumov	48.75%	3641	2013 Dec 25	2016 Jan 26
OpprimoBot	21.85%	2009	2015 Nov 18	2016 Sep 27
Pablo Garcia Sanchez	12.20%	590	2015 Dec 24	2016 Apr 13
PeregrineBot	57.29%	1276	2016 Feb 09	2016 Sep 10
Peter Dobsa	13.25%	3027	2015 Jan 11	2015 Oct 02
Radim Bobek	23.37%	1151	2015 Oct 01	2016 Mar 06
Rafael Bocquet	0.00%	10	2015 Jun 23	2015 Jun 26
Rob Bogie	31.34%	651	2016 May 14	2016 Sep 06
Roman Danielis	45.63%	5155	2013 Dec 25	2016 Sep 26
Sebastian Mahr	15.06%	1202	2016 Jan 13	2016 Aug 08
Serega	48.20%	3803	2015 Jan 31	2016 Jan 26
Sergei Lebedinskij	13.30%	1083	2015 May 28	2015 Sep 03
Sijia Xu	71.65%	2328	2015 Oct 10	2016 Sep 27
Simon Prins	55.48%	5431	2015 Jan 25	2016 Sep 27
Soeren Klett	63.62%	8277	2013 Dec 25	2016 Sep 27
Sungguk Cha	18.65%	697	2016 Jun 05	2016 Sep 27
Tae Jun Oh	4.55%	154	2016 Mar 22	2016 Apr 11
Tomas Cere	61.11%	8373	2013 Dec 25	2016 Sep 27
Tomas Vajda	79.37%	8372	2013 Dec 25	2016 Sep 27
Tomasz Michalski	27.02%	433	2015 Dec 22	2016 Mar 18
Travis Shelton	23.59%	1221	2016 Feb 28	2016 Sep 06
tscmoo	72.06%	5719	2015 Jan 22	2016 Sep 27
tscmoop	78.16%	1992	2015 Nov 11	2016 Sep 26
tscmooz	79.80%	5006	2015 Feb 27	2016 Sep 27
UPStarcraftAI	24.75%	610	2015 Dec 24	2016 Apr 13
Vaclav Horazny	37.35%	6455	2013 Dec 25	2015 Nov 18
VeRLab	17.06%	897	2016 Feb 28	2016 Aug 01
Vladimir Jurenka	38.45%	6167	2013 Dec 25	2016 Sep 27
Vojtech Jirsa	14.14%	2786	2015 Jan 12	2015 Sep 05
WuliBot	72.76%	984	2016 Apr 19	2016 Sep 26
ZerGreenBot	22.22%	36	2016 Sep 22	2016 Sep 27
Zia bot	52.24%	536	2016 Jul 07	2016 Sep 27

Also the maps. Games on 2014 October 24 and earlier did not specify the map; it is blank in the file. The first game with a map specified was 2014 October 29, so there’s a gap in the records (maybe downtime, or tournament stuff). Anyway, we can see that the maps are the usual SSCAIT map pack plus BGH for a small number of games on April Fools.

It is Most Curious that Electric Circuit has far fewer games than the other regular maps: 975, against more than 8,000 for each of the others. It was last played on 2015 Feb 3, though I still see it in the map pack that they distribute.

map	games	earliest	latest
(2)Benzene.scx	8257	2014 Oct 29	2016 Sep 27
(2)Destination.scx	8137	2014 Oct 29	2016 Sep 27
(2)HeartbreakRidge.scx	8249	2014 Oct 29	2016 Sep 27
(3)NeoMoonGlaive.scx	8157	2014 Oct 29	2016 Sep 27
(3)TauCross.scx	8182	2014 Oct 29	2016 Sep 27
(4)Andromeda.scx	8233	2014 Oct 29	2016 Sep 27
(4)CircuitBreaker.scx	8083	2014 Oct 29	2016 Sep 27
(4)ElectricCircuit.scx	975	2014 Oct 29	2015 Feb 03
(4)EmpireoftheSun.scm	8318	2014 Oct 29	2016 Sep 26
(4)FightingSpirit.scx	8288	2014 Oct 29	2016 Sep 27
(4)Icarus.scm	8237	2014 Oct 29	2016 Sep 27
(4)Jade.scx	8154	2014 Oct 29	2016 Sep 27
(4)LaMancha1.1.scx	8130	2014 Oct 29	2016 Sep 27
(4)Python.scx	8175	2014 Oct 29	2016 Sep 27
(4)Roadrunner.scx	8172	2014 Oct 29	2016 Sep 27
(8)BGH.scm	463	2015 Apr 01	2016 Apr 02
[none specified]	24953	2013 Dec 25	2014 Oct 24

CIG 2016 - crosstables per map

Today I crush you under a mass of charts, crosstables for each of the 5 maps in CIG 2016. This is 4-dimensional data (bot 1, bot 2, map, winning rate) and I imagine there’s a clearer way to present it, but I don’t know what it is so you get it in the first form I thought of. At the end is a link to the software.

As a reminder, here are the maps.

(2)RideofValkyries1.0
(3)Alchemist1.0
(3)TauCross1.1
(4)LunaTheFinal2.3
(4)Python1.3

With 100 rounds and 5 maps, for each pairing 20 games were played on each map (minus a few games missing due to errors). So the percentages vary in steps of 5% (or more if games are missing), and the error bars are wide.
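To put a number on “wide”, here is a small Python sketch that computes a 95% Wilson score interval for a win rate from 20 games. The Wilson interval is my choice of method for illustration; the crosstables themselves carry no error bars.

```python
from math import sqrt


def wilson_interval(wins, games, z=1.96):
    """Wilson score interval for a binomial win rate (z=1.96 gives ~95%)."""
    p = wins / games
    denom = 1.0 + z * z / games
    center = (p + z * z / (2 * games)) / denom
    half = z * sqrt(p * (1 - p) / games + z * z / (4 * games * games)) / denom
    return center - half, center + half


# Each cell in a per-map table comes from about 20 games.
for wins in (10, 15, 20):
    low, high = wilson_interval(wins, 20)
    print(f"{wins}/20 wins: {low:.0%} to {high:.0%}")
```

An even 10-10 split on one map is consistent with a true win rate anywhere from roughly 30% to 70%, so a single cell in the per-map tables should be read loosely.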

The qualifier tables are big. The first is the full tournament for comparison, the rest are the subtournaments played on each map.

	overall	Iron	tscm	Leta	Over	Mega	UAlb	ZZZK	Aiur	Tyr	Ziab	Terr	SRbo	Oppr	Xeln	Bonj	Sals
Iron	79.20%		56%	39%	56%	38%	63%	53%	91%	99%	95%	100%	100%	98%	100%	100%	100%
tscmoo	76.97%	44%		48%	53%	75%	87%	82%	43%	45%	81%	100%	100%	100%	97%	100%	100%
LetaBot	74.07%	61%	52%		81%	28%	60%	51%	31%	72%	78%	100%	100%	99%	99%	99%	100%
Overkill	70.98%	44%	47%	19%		84%	32%	56%	79%	31%	88%	98%	95%	93%	99%	100%	100%
MegaBot	70.11%	62%	25%	72%	16%		52%	7%	66%	85%	99%	96%	95%	87%	91%	99%	100%
UAlbertaBot	69.25%	37%	13%	40%	68%	48%		55%	77%	42%	88%	100%	89%	85%	97%	100%	100%
ZZZKBot	69.18%	47%	18%	49%	44%	93%	45%		69%	29%	49%	100%	100%	100%	94%	100%	100%
Aiur	63.15%	9%	57%	69%	21%	34%	23%	31%		44%	89%	86%	96%	99%	94%	96%	100%
Tyr	61.64%	1%	55%	28%	69%	15%	58%	71%	56%		24%	74%	98%	94%	88%	95%	99%
Ziabot	46.43%	5%	19%	22%	12%	1%	12%	51%	11%	76%		59%	98%	100%	32%	100%	100%
TerranUAB	33.51%	0%	0%	0%	2%	4%	0%	0%	14%	26%	41%		81%	76%	90%	70%	98%
SRbotOne	22.15%	0%	0%	0%	5%	5%	11%	0%	4%	2%	2%	19%		19%	92%	74%	99%
OpprimoBot	22.10%	2%	0%	1%	7%	13%	15%	0%	1%	6%	0%	24%	81%		56%	27%	99%
XelnagaII	20.71%	0%	3%	1%	1%	9%	3%	6%	6%	12%	68%	10%	8%	44%		56%	83%
Bonjwa	18.95%	0%	0%	1%	0%	1%	0%	0%	4%	5%	0%	30%	26%	73%	44%		100%
Salsa	1.47%	0%	0%	0%	0%	0%	0%	0%	0%	1%	0%	2%	1%	1%	17%	0%

Valkyries	overall	Iron	tscm	Leta	Over	Mega	UAlb	ZZZK	Aiur	Tyr	Ziab	Terr	SRbo	Oppr	Xeln	Bonj	Sals
Iron	72.67%		35%	10%	40%	25%	75%	40%	75%	95%	100%	100%	100%	95%	100%	100%	100%
tscmoo	80.67%	65%		55%	60%	80%	90%	80%	30%	55%	95%	100%	100%	100%	100%	100%	100%
LetaBot	72.00%	90%	45%		95%	30%	60%	10%	10%	85%	55%	100%	100%	100%	100%	100%	100%
Overkill	71.33%	60%	40%	5%		85%	25%	50%	95%	45%	70%	95%	100%	100%	100%	100%	100%
MegaBot	70.33%	75%	20%	70%	15%		30%	0%	65%	90%	100%	95%	95%	100%	100%	100%	100%
UAlbertaBot	69.67%	25%	10%	40%	75%	70%		50%	85%	35%	90%	100%	75%	90%	100%	100%	100%
ZZZKBot	77.67%	60%	20%	90%	50%	100%	50%		100%	55%	40%	100%	100%	100%	100%	100%	100%
Aiur	63.00%	25%	70%	90%	5%	35%	15%	0%		70%	70%	80%	95%	100%	90%	100%	100%
Tyr	59.67%	5%	45%	15%	55%	10%	65%	45%	30%		65%	70%	100%	100%	95%	95%	100%
Ziabot	51.33%	0%	5%	45%	30%	0%	10%	60%	30%	35%		65%	100%	100%	90%	100%	100%
TerranUAB	34.67%	0%	0%	0%	5%	5%	0%	0%	20%	30%	35%		80%	80%	90%	80%	95%
SRbotOne	24.33%	0%	0%	0%	0%	5%	25%	0%	5%	0%	0%	20%		25%	85%	100%	100%
OpprimoBot	20.00%	5%	0%	0%	0%	0%	10%	0%	0%	0%	0%	20%	75%		75%	20%	95%
XelnagaII	12.00%	0%	0%	0%	0%	0%	0%	0%	10%	5%	10%	10%	15%	25%		40%	65%
Bonjwa	17.67%	0%	0%	0%	0%	0%	0%	0%	0%	5%	0%	20%	0%	80%	60%		100%
Salsa	3.00%	0%	0%	0%	0%	0%	0%	0%	0%	0%	0%	5%	0%	5%	35%	0%

Alchemist	overall	Iron	tscm	Leta	Over	Mega	UAlb	ZZZK	Aiur	Tyr	Ziab	Terr	SRbo	Oppr	Xeln	Bonj	Sals
Iron	77.00%		80%	45%	80%	40%	40%	0%	95%	100%	75%	100%	100%	100%	100%	100%	100%
tscmoo	69.90%	20%		35%	75%	50%	80%	75%	15%	35%	75%	100%	100%	100%	89%	100%	100%
LetaBot	72.33%	55%	65%		55%	15%	40%	80%	15%	75%	90%	100%	100%	100%	100%	95%	100%
Overkill	66.67%	20%	25%	45%		90%	10%	55%	60%	45%	85%	100%	95%	75%	95%	100%	100%
MegaBot	70.67%	60%	50%	85%	10%		45%	15%	60%	90%	100%	85%	90%	95%	80%	95%	100%
UAlbertaBot	77.00%	60%	20%	60%	90%	55%		40%	85%	80%	95%	100%	85%	85%	100%	100%	100%
ZZZKBot	72.67%	100%	25%	20%	45%	85%	60%		65%	50%	45%	100%	100%	100%	95%	100%	100%
Aiur	68.00%	5%	85%	85%	40%	40%	15%	35%		60%	90%	80%	90%	100%	100%	95%	100%
Tyr	51.33%	0%	65%	25%	55%	10%	20%	50%	40%		10%	60%	90%	80%	80%	85%	100%
Ziabot	47.33%	25%	25%	10%	15%	0%	5%	55%	10%	90%		50%	100%	100%	25%	100%	100%
TerranUAB	36.67%	0%	0%	0%	0%	15%	0%	0%	20%	40%	50%		80%	80%	90%	80%	95%
SRbotOne	23.67%	0%	0%	0%	5%	10%	15%	0%	10%	10%	0%	20%		5%	100%	80%	100%
OpprimoBot	22.33%	0%	0%	0%	25%	5%	15%	0%	0%	20%	0%	20%	95%		35%	20%	100%
XelnagaII	25.08%	0%	11%	0%	5%	20%	0%	5%	0%	20%	75%	10%	0%	65%		75%	90%
Bonjwa	18.33%	0%	0%	5%	0%	5%	0%	0%	5%	15%	0%	20%	20%	80%	25%		100%
Salsa	1.00%	0%	0%	0%	0%	0%	0%	0%	0%	0%	0%	5%	0%	0%	10%	0%

Tau Cross	overall	Iron	tscm	Leta	Over	Mega	UAlb	ZZZK	Aiur	Tyr	Ziab	Terr	SRbo	Oppr	Xeln	Bonj	Sals
Iron	82.33%		65%	40%	75%	15%	55%	95%	90%	100%	100%	100%	100%	100%	100%	100%	100%
tscmoo	77.33%	35%		35%	40%	100%	90%	95%	45%	35%	85%	100%	100%	100%	100%	100%	100%
LetaBot	80.67%	60%	65%		90%	25%	80%	80%	60%	50%	100%	100%	100%	100%	100%	100%	100%
Overkill	73.33%	25%	60%	10%		100%	40%	55%	95%	20%	95%	100%	100%	100%	100%	100%	100%
MegaBot	72.33%	85%	0%	75%	0%		55%	15%	85%	95%	100%	100%	100%	80%	95%	100%	100%
UAlbertaBot	67.33%	45%	10%	20%	60%	45%		65%	75%	35%	85%	100%	95%	75%	100%	100%	100%
ZZZKBot	57.67%	5%	5%	20%	45%	85%	35%		25%	5%	55%	100%	100%	100%	85%	100%	100%
Aiur	61.20%	10%	55%	40%	5%	15%	25%	75%		15%	95%	95%	100%	100%	95%	95%	100%
Tyr	68.33%	0%	65%	50%	80%	5%	65%	95%	85%		10%	80%	100%	100%	90%	100%	100%
Ziabot	43.00%	0%	15%	0%	5%	0%	15%	45%	5%	90%		65%	95%	100%	10%	100%	100%
TerranUAB	33.44%	0%	0%	0%	0%	0%	0%	0%	5%	20%	35%		80%	90%	100%	70%	100%
SRbotOne	21.00%	0%	0%	0%	0%	0%	5%	0%	0%	0%	5%	20%		20%	95%	70%	100%
OpprimoBot	20.67%	0%	0%	0%	0%	20%	25%	0%	0%	0%	0%	10%	80%		55%	20%	100%
XelnagaII	22.00%	0%	0%	0%	0%	5%	0%	15%	5%	10%	90%	0%	5%	45%		70%	85%
Bonjwa	18.33%	0%	0%	0%	0%	0%	0%	0%	5%	0%	0%	30%	30%	80%	30%		100%
Salsa	1.00%	0%	0%	0%	0%	0%	0%	0%	0%	0%	0%	0%	0%	0%	15%	0%

Luna	overall	Iron	tscm	Leta	Over	Mega	UAlb	ZZZK	Aiur	Tyr	Ziab	Terr	SRbo	Oppr	Xeln	Bonj	Sals
Iron	81.33%		35%	55%	50%	50%	65%	70%	95%	100%	100%	100%	100%	100%	100%	100%	100%
tscmoo	81.00%	65%		40%	35%	70%	95%	90%	65%	75%	85%	100%	100%	100%	95%	100%	100%
LetaBot	74.00%	45%	60%		70%	25%	55%	55%	30%	85%	85%	100%	100%	100%	100%	100%	100%
Overkill	68.90%	50%	65%	30%		70%	35%	55%	65%	0%	95%	100%	80%	89%	100%	100%	100%
MegaBot	71.24%	50%	30%	75%	30%		70%	0%	65%	75%	95%	100%	95%	90%	95%	100%	100%
UAlbertaBot	66.22%	35%	5%	45%	65%	30%		60%	75%	35%	75%	100%	95%	80%	95%	100%	100%
ZZZKBot	68.33%	30%	10%	45%	45%	100%	40%		80%	30%	55%	100%	100%	100%	90%	100%	100%
Aiur	60.67%	5%	35%	70%	35%	35%	25%	20%		35%	90%	85%	95%	95%	90%	95%	100%
Tyr	63.21%	0%	25%	15%	100%	25%	65%	70%	65%		30%	75%	100%	100%	85%	100%	95%
Ziabot	45.33%	0%	15%	15%	5%	5%	25%	45%	10%	70%		60%	95%	100%	35%	100%	100%
TerranUAB	31.67%	0%	0%	0%	0%	0%	0%	0%	15%	25%	40%		85%	65%	80%	65%	100%
SRbotOne	20.67%	0%	0%	0%	20%	5%	5%	0%	5%	0%	5%	15%		10%	80%	65%	100%
OpprimoBot	24.16%	0%	0%	0%	11%	10%	20%	0%	5%	0%	0%	35%	90%		60%	35%	100%
XelnagaII	22.07%	0%	5%	0%	0%	5%	5%	10%	10%	15%	65%	20%	20%	40%		45%	90%
Bonjwa	19.67%	0%	0%	0%	0%	0%	0%	0%	5%	0%	0%	35%	35%	65%	55%		100%
Salsa	1.01%	0%	0%	0%	0%	0%	0%	0%	0%	5%	0%	0%	0%	0%	10%	0%

Python	overall	Iron	tscm	Leta	Over	Mega	UAlb	ZZZK	Aiur	Tyr	Ziab	Terr	SRbo	Oppr	Xeln	Bonj	Sals
Iron	82.67%		65%	45%	35%	60%	80%	60%	100%	100%	100%	100%	100%	95%	100%	100%	100%
tscmoo	75.92%	35%		75%	55%	75%	80%	70%	60%	25%	65%	100%	100%	100%	100%	100%	100%
LetaBot	71.33%	55%	25%		95%	45%	65%	30%	40%	65%	60%	100%	100%	95%	95%	100%	100%
Overkill	74.67%	65%	45%	5%		75%	50%	65%	80%	45%	95%	95%	100%	100%	100%	100%	100%
MegaBot	66.00%	40%	25%	55%	25%		60%	5%	55%	75%	100%	100%	95%	70%	85%	100%	100%
UAlbertaBot	66.00%	20%	20%	35%	50%	40%		60%	65%	25%	95%	100%	95%	95%	90%	100%	100%
ZZZKBot	69.57%	40%	30%	70%	35%	95%	40%		75%	5%	53%	100%	100%	100%	100%	100%	100%
Aiur	62.88%	0%	40%	60%	20%	45%	35%	25%		40%	100%	90%	100%	100%	95%	95%	100%
Tyr	65.67%	0%	75%	35%	55%	25%	75%	95%	60%		5%	85%	100%	90%	90%	95%	100%
Ziabot	45.12%	0%	35%	40%	5%	0%	5%	47%	0%	95%		53%	100%	100%	0%	100%	100%
TerranUAB	31.10%	0%	0%	0%	5%	0%	0%	0%	10%	15%	47%		80%	65%	90%	55%	100%
SRbotOne	21.07%	0%	0%	0%	0%	5%	5%	0%	0%	0%	0%	20%		35%	100%	55%	95%
OpprimoBot	23.33%	5%	0%	5%	0%	30%	5%	0%	0%	10%	0%	35%	65%		55%	40%	100%
XelnagaII	22.41%	0%	0%	5%	0%	15%	10%	0%	5%	10%	100%	10%	0%	45%		50%	85%
Bonjwa	20.74%	0%	0%	0%	0%	0%	0%	0%	5%	5%	0%	45%	45%	60%	50%		100%
Salsa	1.33%	0%	0%	0%	0%	0%	0%	0%	0%	0%	0%	0%	5%	0%	15%	0%

The final tables are small. Again, the first is the full tournament, the rest are the maps.

	overall	tscm	Iron	Leta	ZZZK	Over	UAlb	Mega	Aiur
tscmoo	65.14%		52%	44%	79%	71%	77%	83%	50%
Iron	54.43%	48%		38%	49%	49%	74%	30%	93%
LetaBot	53.71%	56%	62%		49%	81%	69%	30%	29%
ZZZKBot	53.08%	21%	51%	51%		42%	35%	93%	78%
Overkill	51.43%	29%	51%	19%	58%		43%	81%	79%
UAlbertaBot	49.07%	23%	26%	31%	65%	57%		76%	66%
MegaBot	38.00%	17%	70%	70%	7%	19%	24%		59%
Aiur	35.14%	50%	7%	71%	22%	21%	34%	41%

Valkyries	overall	tscm	Iron	Leta	ZZZK	Over	UAlb	Mega	Aiur
tscmoo	68.57%		60%	30%	75%	75%	85%	90%	65%
Iron	47.86%	40%		25%	40%	50%	60%	25%	95%
LetaBot	50.71%	70%	75%		0%	85%	70%	30%	25%
ZZZKBot	67.86%	25%	60%	100%		40%	55%	95%	100%
Overkill	52.14%	25%	50%	15%	60%		40%	100%	75%
UAlbertaBot	50.00%	15%	40%	30%	45%	60%		90%	70%
MegaBot	35.71%	10%	75%	70%	5%	0%	10%		80%
Aiur	27.14%	35%	5%	75%	0%	25%	30%	20%

Alchemist	overall	tscm	Iron	Leta	ZZZK	Over	UAlb	Mega	Aiur
tscmoo	49.29%		45%	35%	65%	80%	55%	50%	15%
Iron	44.29%	55%		50%	0%	35%	60%	30%	80%
LetaBot	51.43%	65%	50%		75%	60%	65%	20%	25%
ZZZKBot	61.43%	35%	100%	25%		45%	45%	95%	85%
Overkill	51.43%	20%	65%	40%	55%		45%	75%	60%
UAlbertaBot	52.14%	45%	40%	35%	55%	55%		75%	60%
MegaBot	42.14%	50%	70%	80%	5%	25%	25%		40%
Aiur	47.86%	85%	20%	75%	15%	40%	40%	60%

Tau Cross	overall	tscm	Iron	Leta	ZZZK	Over	UAlb	Mega	Aiur
tscmoo	61.43%		25%	45%	90%	45%	100%	95%	30%
Iron	66.43%	75%		55%	95%	50%	85%	15%	90%
LetaBot	61.43%	55%	45%		85%	90%	80%	35%	40%
ZZZKBot	29.29%	10%	5%	15%		50%	25%	75%	25%
Overkill	52.14%	55%	50%	10%	50%		40%	70%	90%
UAlbertaBot	45.00%	0%	15%	20%	75%	60%		75%	70%
MegaBot	37.86%	5%	85%	65%	25%	30%	25%		30%
Aiur	46.43%	70%	10%	60%	75%	10%	30%	70%

Luna	overall	tscm	Iron	Leta	ZZZK	Over	UAlb	Mega	Aiur
tscmoo	71.43%		70%	60%	80%	75%	75%	90%	50%
Iron	55.00%	30%		30%	55%	50%	80%	40%	100%
LetaBot	51.43%	40%	70%		65%	75%	65%	20%	25%
ZZZKBot	52.14%	20%	45%	35%		60%	25%	100%	80%
Overkill	48.57%	25%	50%	25%	40%		45%	85%	70%
UAlbertaBot	50.00%	25%	20%	35%	75%	55%		70%	70%
MegaBot	40.00%	10%	60%	80%	0%	15%	30%		85%
Aiur	31.43%	50%	0%	75%	20%	30%	30%	15%

Python	overall	tscm	Iron	Leta	ZZZK	Over	UAlb	Mega	Aiur
tscmoo	75.00%		60%	50%	85%	80%	70%	90%	90%
Iron	58.57%	40%		30%	55%	60%	85%	40%	100%
LetaBot	53.57%	50%	70%		20%	95%	65%	45%	30%
ZZZKBot	54.68%	15%	45%	80%		15%	26%	100%	100%
Overkill	52.86%	20%	40%	5%	85%		45%	75%	100%
UAlbertaBot	48.20%	30%	15%	35%	74%	55%		70%	60%
MegaBot	34.29%	10%	60%	55%	0%	25%	30%		60%
Aiur	22.86%	10%	0%	70%	0%	0%	40%	40%

The charts are full of small insights, more than I have time to examine. See for example how XelnagaII’s upset of Ziabot occurred on all maps except Ride of Valkyries; I’m sure that says something about at least one of those bots. We can tease out which pairings the map imbalances spring from. ZZZKBot did poorly on Tau Cross, which Martin Rooijackers attributes to the long rush distance. And so on.

My strongest impression is how much results vary from map to map. I still think 5 maps are not enough to judge strength fairly. To my eye, the datapoint that stands out most is that ZZZKBot defeated the powerful Iron 100% of the time on Alchemist, in both the qualifier and the final, although otherwise Alchemist was a mediocre map for ZZZKBot. It looks as though Iron has a strategy bug on that map which ZZZKBot exploits. All bot authors who competed may want to eye the charts for hints about weaknesses to fix.

Download a zip file of the perl scripts with documentation.

CIG 2016 - the final hidden in the qualifier

Yesterday I claimed that the final stage of CIG 2016 produced little new information, because it was equivalent to drawing a subset from the qualifiers. Is it true? I wrote a script to render crosstables from subsets of game results.

Here’s my rendition of the real finals. I liked the red and green color coding of win rates in the original, but some people are red-green colorblind so my version has red and blue instead. I also went with a more contrasty color curve.

	overall	tscm	Iron	Leta	ZZZK	Over	UAlb	Mega	Aiur
tscmoo	65.14%		52%	44%	79%	71%	77%	83%	50%
Iron	54.43%	48%		38%	49%	49%	74%	30%	93%
LetaBot	53.71%	56%	62%		49%	81%	69%	30%	29%
ZZZKBot	53.08%	21%	51%	51%		42%	35%	93%	78%
Overkill	51.43%	29%	51%	19%	58%		43%	81%	79%
UAlbertaBot	49.07%	23%	26%	31%	65%	57%		76%	66%
MegaBot	38.00%	17%	70%	70%	7%	19%	24%		59%
Aiur	35.14%	50%	7%	71%	22%	21%	34%	41%

Here is the crosstable of the final hidden in the qualifier, which is to say the qualifier games played between finalists.

	overall	tscm	Iron	Leta	ZZZK	Over	UAlb	Mega	Aiur
tscmoo	61.71%		44%	48%	82%	53%	87%	75%	43%
Iron	56.57%	56%		39%	53%	56%	63%	38%	91%
LetaBot	52.00%	52%	61%		51%	81%	60%	28%	31%
ZZZKBot	52.14%	18%	47%	49%		44%	45%	93%	69%
Overkill	51.57%	47%	44%	19%	56%		32%	84%	79%
UAlbertaBot	48.29%	13%	37%	40%	55%	68%		48%	77%
MegaBot	42.86%	25%	62%	72%	7%	16%	52%		66%
Aiur	34.86%	57%	9%	69%	31%	21%	23%	34%
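The subset crosstable above boils down to filtering the game list and counting. Here is a minimal Python sketch of that step, assuming the results are available as (winner, loser) pairs; the real scripts are in Perl and read the tournament’s own results format, so the function and the toy data are hypothetical.

```python
from collections import defaultdict


def crosstable(games, bots):
    """Win rate of each bot against each other bot, using only games
    in which both players are in `bots`.

    games: iterable of (winner, loser) pairs.
    Returns {(a, b): fraction of a-vs-b games won by a}.
    """
    keep = set(bots)
    wins = defaultdict(int)
    for winner, loser in games:
        if winner in keep and loser in keep:
            wins[(winner, loser)] += 1
    table = {}
    for a in bots:
        for b in bots:
            if a == b:
                continue
            played = wins[(a, b)] + wins[(b, a)]
            if played:  # skip pairings with no recorded games
                table[(a, b)] = wins[(a, b)] / played
    return table


# Toy data, not tournament results: Z is dropped from the subset.
toy_games = [("X", "Y"), ("X", "Y"), ("Y", "X"), ("X", "Z"), ("Z", "Y")]
print(crosstable(toy_games, ["X", "Y"]))
```

Passing the full list of entrants gives the qualifier crosstable; passing only the 8 finalists gives the hidden final.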

Overall results match closely. LetaBot and ZZZKBot have switched ranks, but that’s not a surprise because their scores were extremely close.

The 2 table cells with the largest differences are Tscmoo vs Overkill and MegaBot vs UAlbertaBot. The Tscmoo-Overkill numbers are within the expected range of statistical variation, according to spot checks with Fisher’s Exact Test, but the MegaBot-UAlbertaBot numbers are highly surprising, far outside the expected range. (The right way to do this would test both whole tables as a sample of samples of samples. :-) So there’s an indication that something may be afoot.
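For readers who want to repeat the spot check, here is a self-contained Python sketch of a two-sided Fisher’s exact test. It is not the procedure I actually ran, and the inputs are assumptions: exactly 100 games per pairing in each stage, with win counts read off the rounded percentages in the two tables (a few games were in fact missing).

```python
from math import comb


def fisher_exact_two_sided(wins_a, games_a, wins_b, games_b):
    """Two-sided Fisher's exact test on a 2x2 table of wins and losses.

    Sums the hypergeometric probabilities of every table with the same
    margins that is no more likely than the observed one.
    """
    total_wins = wins_a + wins_b
    denom = comb(games_a + games_b, total_wins)

    def prob(k):
        # Probability that sample A holds exactly k of the total wins.
        return comb(games_a, k) * comb(games_b, total_wins - k) / denom

    lowest = max(0, total_wins - games_b)
    highest = min(games_a, total_wins)
    observed = prob(wins_a)
    tolerance = 1 + 1e-9  # guard against floating-point ties
    return sum(p for p in map(prob, range(lowest, highest + 1))
               if p <= observed * tolerance)


# ASSUMPTION: 100 games per pairing per stage; counts are (final, qualifier).
p_tscmoo_overkill = fisher_exact_two_sided(71, 100, 53, 100)
p_megabot_ualberta = fisher_exact_two_sided(24, 100, 52, 100)
print(f"Tscmoo vs Overkill:     p = {p_tscmoo_overkill:.4g}")
print(f"MegaBot vs UAlbertaBot: p = {p_megabot_ualberta:.4g}")
```

Keep in mind that these two cells were picked because they differ the most out of 28 pairings, so a p-value that looks small in isolation is less alarming than it would be for a single planned comparison.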

I had a new thought. It’s theoretically possible that differences are caused by learning bots which generalize across opponents. Tscmoo and MegaBot are both learning bots (I verified it: they both wrote stuff to their learning files) and both seem as though they might be able to generalize across opponents. (Overkill is a learning bot but does not generalize.) So my original claim is not 100% true: The qualifiers don’t entirely duplicate the final in the presence of learning bots which generalize across opponents. Alternatively, there could have been a problem with a big effect on that pairing (such as a bug in MegaBot related to its learning, an example which is equivalent to mis-generalizing across opponents). We have the source and the replays, so a sufficiently deep dig should turn up the issue if it is in the bots. There’s a chance that the issue is with the tournament operations, or with my script.

Here I combine the qualifier results with the final results to get the best numbers available. The organizers for whatever reason explicitly decided not to do this. Luckily, it doesn’t change the ranking of the bots.

	overall
tscmoo	63.43%
Iron	55.50%
LetaBot	52.86%
ZZZKBot	52.61%
Overkill	51.50%
UAlbertaBot	48.68%
MegaBot	40.43%
Aiur	35.00%

Tomorrow: More map analysis. Also I’ll release the script for others to play with.

CIG 2016 results discussion

I got ahead of myself yesterday—I should step back and talk about the CIG 2016 results more generally! Martin Rooijackers aka LetaBot sent me a few observations by e-mail. They mostly match up with my observations, and I’ll add a few of my own.

• Terran Renaissance confirmed, as predicted (probably by everybody who cared to predict).

• The top 3 winners, besides being terran, are all bots with many updates over the last several months.

• 3 bots of the final 8 are carryovers from past years (#5 Overkill, #6 UAlbertaBot, and #8 AIUR). They scored in the lower half. #4 ZZZKBot seems to have been only slightly updated. The long work put into the top 3 paid off in playing strength.

• Martin Rooijackers observes that #7 MegaBot is the highest-scoring brand new bot. It’s true if you count Iron as a continuation of Stone. And given MegaBot’s self-description as a meta-bot that uses the strategies of others, MegaBot is arguably not brand new either. In any case, the point is that it seems to take a long period of work to get to the top. The competition is fierce.

• None of the final 8 bots dominated the others. Even tail-ender AIUR had an equal record against winner Tscmoo and a winning record against LetaBot. The CIG 2016 finals crosstable has upsets throughout. Comparing to the AIIDE 2015 crosstable with 22 participants, the rate of upsets of bots near each other in rank seems visually similar, so with only 8 final bots the upsets run all the way through. Generally, bot #n is not clearly better than bot #n+1; the ranking is not stable at that level. In the qualifying stage, the rate of upsets visually looks steady down to #9 Tyr and then falls. AIIDE 2015 did not have that pattern.

• I predicted that ZZZKBot still had a chance to make it into the top 3. It didn’t, but it scored 53.08% to make #4 in the finals versus #3 LetaBot’s 53.71%. I think the prediction was justified. This was its last chance, though, without big updates.

• The qualifier results and finals results look different. Iron was narrowly on top in the qualifiers, but Tscmoo pulled well ahead in the finals (a surprise to me). Apparently Tscmoo is better tuned to defeat strong opponents.

• The slides on the result page include a chart of win rates over time which shows that learning helps some, but (as in the past) not as much as you’d hope. To learn more we need smarter learning. I’ll drop a few suggestions in a future post.

The bottom line is that we’re making good progress, though we’re still not far along the path. Tscmoo’s long short-term memory is a pioneering idea and Tscmoo finished #1, but we don’t know much about it. Did the memory help results? Meanwhile, LetaBot finished #3 here, and is in a strong position as Martin Rooijackers tries to pioneer a next step in another direction, a tactical search derived from MaasCraft. Will the search lead to the hoped-for jump in strength? Tune in next time!

I question the tournament design. They ran a 100-round round robin with 16 bots and used the results to accept half of the entrants into the final—a staged design with qualifier and finals. That’s perfectly reasonable; it says that they’re more interested in who beats the strong than who consistently beats the weak. Having selected the finalists, they discarded the qualifier results and ran an independent final with 100 more rounds on the same maps for the 8 finalists. They even discarded bot learning files from the qualifier, so that nothing carried over. The final duplicated the qualifiers, only with fewer bots, and produced little new information. They could have saved the time and extracted the final results from the qualifier stage. It would have been equivalent.

In a staged tournament, each stage should produce new information. It could add to the qualifier results. It could have more rounds. It could include seeded opponents that skipped the qualifiers (though I wouldn’t recommend that for an academic tournament). It could include different maps. It could follow harsher rules. But something!

I can understand why they didn’t pass the qualifier results through to the final stage. They had the software they had, and an organizer’s time is always short. But this final had no point. I hope future tournaments will remember the lesson.

map balance - bot balance in CIG 2016

CIG 2016 reported its results in the same format as AIIDE 2015 (I’m sure they used the same software), so I was able to compute the map balance with a few adjustments to my script. The tournament was run in two halves, qualifiers and finals, each with 100 rounds. With 5 maps, that makes 20 times through the map pool. They could have used twice as many maps without any disadvantage that I see.
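
The schedule arithmetic can be sketched in a few lines. This assumes a plain round robin in which every pair of bots meets once per round and each round is played on one map, rotating through the pool; the function names are introduced here for illustration.

```python
from math import comb

def round_robin_games(bots, rounds):
    """Games in a round robin where every pair meets once per round."""
    return comb(bots, 2) * rounds

def passes_through_pool(rounds, maps):
    """How many times each map comes up when rounds rotate through the pool."""
    return rounds // maps

print(round_robin_games(16, 100))   # qualifier stage
print(round_robin_games(8, 100))    # final stage
print(passes_through_pool(100, 5))  # 5-map pool
print(passes_through_pool(100, 10)) # a pool twice as large
```

With 10 maps, each map would still come up 10 times per stage, which is the sense in which a larger pool costs nothing in schedule terms.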

The qualifiers, with 16 bots playing 12,000 games total (minus a few lost to errors):

map	TvZ		ZvP		PvT
	wins	n	wins	n	wins	n
(2)RideofValkyries.scx	49%	640	61%	240	57%	480
(3)Alchemist.scm	50%	640	45%	240	60%	479
(3)TauCross.scx	56%	640	43%	240	53%	479
(4)LunaTheFinal.scx	53%	637	47%	240	53%	480
(4)Python.scx	49%	638	45%	240	50%	478
overall	51%	3195	48%	1200	55%	2396

The 3 races came out remarkably even! We already know that’s more due to the strength distribution of bots in the tournament than to the fairness of the game. The low-high spread in TvZ was 56% - 49% = 7%; in ZvP, 61% - 43% = 18%; and in PvT, 60% - 50% = 10%. Ride of Valkyries had strikingly different ZvP results than the other maps. I don’t know why. Can anybody guess? The human balance also showed one map standing out in ZvP, but it was Alchemist.
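
The low-high spread is just the largest per-map win rate minus the smallest, taken matchup by matchup. A small sketch using the rounded percentages from the qualifier table above (`low_high_spread` is a name made up for this example):

```python
# Qualifier win percentages per map, copied from the table above
# in map order: Ride of Valkyries, Alchemist, Tau Cross, Luna, Python.
qualifier = {
    "TvZ": [49, 50, 56, 53, 49],
    "ZvP": [61, 45, 43, 47, 45],
    "PvT": [57, 60, 53, 53, 50],
}

def low_high_spread(percentages):
    """Difference between the best and worst map for one matchup."""
    return max(percentages) - min(percentages)

spreads = {matchup: low_high_spread(p) for matchup, p in qualifier.items()}
for matchup, spread in spreads.items():
    print(f"{matchup}: {spread} points")
```

Because the table entries are already rounded to whole percents, each spread is only good to about a point.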

The final, with 8 bots playing 2800 games, looks considerably different:

map	TvZ		ZvP		PvT
	wins	n	wins	n	wins	n
(2)RideofValkyries.scx	54%	120	92%	80	45%	120
(3)Alchemist.scm	52%	120	79%	80	63%	120
(3)TauCross.scx	76%	120	65%	80	49%	120
(4)LunaTheFinal.scx	67%	120	84%	80	46%	120
(4)Python.scx	66%	120	94%	80	34%	120
overall	63%	600	83%	400	48%	600

Here, protoss did poorly because the protoss bots came out on the bottom this time. It’s interesting that the middle-of-the-table zergs did more to hold down the protoss than the winning terrans (but it fits with the game storyline :-). Beyond that, I’m reluctant to draw conclusions from this smaller number of games with fewer players.

I feel vindicated: Map balance can make a difference, even though we don’t understand what the difference is!

map balance - comparing pro and bot balance

I started to think about fancy ways to normalize map balance data so that the numbers could be compared—and then I realized, who the hell cares? The data’s not good enough in the first place, at least the bot data, which is based on only 21 bots with idiosyncratic play styles and big race imbalances regardless of the maps. We can only get a general idea of the comparison anyway.

So I decided on a simple subtraction of the average from each map balance number, so that a map with average balance has normalized balance 0%. Then we can compare maps to see if they have similar relative balance for pros and bots. After normalization, TvZ > 0 means that terran did better than average on that map, and TvZ < 0 means that terran did worse.

map	TvZ		ZvP		PvT
	pro	bot	pro	bot	pro	bot
Benzene	10.8%	-2.9%	-5.3%	1.3%	-5.1%	-0.4%
Destination	-1.0%	-1.9%	2.6%	2.3%	0.7%	-0.4%
Heartbreak Ridge	-4.7%	3.1%	2.2%	-0.7%	5.3%	-3.4%
Aztec	-14.3%	-1.9%	-4.4%	1.3%	11.6%	-0.4%
Tau Cross	-3.3%	-0.9%	-4.4%	-0.7%	-1.8%	-0.4%
Andromeda	-10.6%	1.1%	4.4%	-1.7%	3.8%	-4.4%
Circuit Breaker	-0.4%	-0.9%	-2.6%	-1.7%	-0.8%	2.6%
Empire of the Sun	10.9%	-4.9%	-4.4%	-1.7%	-2.7%	0.6%
Fortress	11.0%	8.1%	12.3%	-0.7%	-2.6%	4.6%
Python	1.9%	1.1%	-0.5%	2.3%	-8.0%	1.6%

There’s no “overall” row because, after normalization, it’s just a row of zeroes. Also, as I mentioned, the sizes of the imbalances can’t be compared directly. A relative balance of -5% in the bot ZvP column (average balance 71%) doesn’t mean the same thing as -5% in the pro ZvP column (average balance 54.4%).
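
The normalization is a plain mean subtraction per column. A minimal sketch, using made-up raw win rates rather than the actual pro or bot figures; it also shows why an “overall” row would be nothing but zeroes:

```python
def normalize(balances):
    """Subtract the column average so an average map comes out at 0."""
    mean = sum(balances) / len(balances)
    return [b - mean for b in balances]

# Hypothetical raw win percentages for one matchup on four maps.
raw = [60.0, 50.0, 55.0, 45.0]
relative = normalize(raw)

print(relative)                       # per-map relative balance
print(sum(relative) / len(relative))  # the would-be "overall" entry
```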

No convincing pattern is visible. The pro and bot columns have the same sign in 12 of the 30 cases, which is not distinguishable from the 15 cases (50%) that chance alone would give. Sometimes a pro map with a large imbalance has a large imbalance for bots too; sometimes not. Here’s a scatter chart with relative pro balance on the x-axis and relative bot balance on the y-axis. Remember that the signs are arbitrary: we chose to compare TvZ rather than ZvT, and the opposite choice would flip every + and - in that column. If your eyes think they see a pattern, flip one or two of the symbol sets around one axis or the other before you decide it’s real.

scatter chart showing the lack of relationship between map balance for pros and for bots
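
The sign comparison can be checked directly from the table. The sketch below counts the pro/bot pairs that share a sign and then runs a two-sided sign test against a fair coin. The test is one simple way to put a number on “not distinguishable from 50%”; it is an illustration added here, not necessarily how the comparison was originally made, and it treats the 30 cells as independent, which they are not quite.

```python
from math import comb

# (pro, bot) relative balance in percent, copied from the table above.
pairs = {
    "TvZ": [(10.8, -2.9), (-1.0, -1.9), (-4.7, 3.1), (-14.3, -1.9), (-3.3, -0.9),
            (-10.6, 1.1), (-0.4, -0.9), (10.9, -4.9), (11.0, 8.1), (1.9, 1.1)],
    "ZvP": [(-5.3, 1.3), (2.6, 2.3), (2.2, -0.7), (-4.4, 1.3), (-4.4, -0.7),
            (4.4, -1.7), (-2.6, -1.7), (-4.4, -1.7), (12.3, -0.7), (-0.5, 2.3)],
    "PvT": [(-5.1, -0.4), (0.7, -0.4), (5.3, -3.4), (11.6, -0.4), (-1.8, -0.4),
            (3.8, -4.4), (-0.8, 2.6), (-2.7, 0.6), (-2.6, 4.6), (-8.0, 1.6)],
}

def same_sign(pro, bot):
    """True when both values lie on the same side of zero."""
    return pro * bot > 0

def two_sided_sign_test(agree, total):
    """Two-sided binomial p-value for `agree` successes against a fair coin."""
    k = min(agree, total - agree)
    tail = sum(comb(total, i) for i in range(k + 1)) / 2 ** total
    return min(1.0, 2 * tail)

flat = [pair for column in pairs.values() for pair in column]
agree = sum(same_sign(pro, bot) for pro, bot in flat)
p_value = two_sided_sign_test(agree, len(flat))

print(agree, "of", len(flat), "pairs agree in sign")
print("two-sided p-value:", round(p_value, 2))
```

The p-value comes out at roughly 0.36, so 12 agreements out of 30 is well within what coin flips would produce.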

What does it all mean in practice? There are some maps with apparent imbalances, which means we should have map pools large enough that imbalances tend to average out. Most maps are not far from balanced, so 10 maps should be enough; the 5 maps of CIG 2016 do not seem enough. Other than that, there’s no reason to change how we select maps. We don’t know whether last year’s relative map balances will carry over to this year, when the skill of the top bots is greater and they are terran rather than zerg. The main conclusion is the same as the conclusion of all studies since the invention of science: More research is needed!