CIG 2016 - the final hidden in the qualifier
Yesterday I claimed that the final stage of CIG 2016 produced little new information, because it was equivalent to drawing a subset from the qualifiers. Is that true? To check, I wrote a script to render crosstables from subsets of game results.
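The script itself isn’t in this post (I’ll release it later, see below), but the idea is simple enough to sketch. Here is a minimal Python sketch of the approach, not the real script: it assumes the game results come as (winner, loser) name pairs, which is a simplification for illustration, and the overall column is the plain mean of per-opponent win rates, which appears to match the tables below but may differ slightly from the tournament’s own scoring.

```python
from collections import defaultdict

def crosstable(games, bots):
    """Print a win-rate crosstable for the games between bots in `bots`.

    `games` is an iterable of (winner, loser) name pairs; games involving
    any bot outside `bots` are ignored, so passing a subset of bots
    renders the crosstable of just that subset.
    """
    wins = defaultdict(int)      # wins[(a, b)] = games a won against b
    games_vs = defaultdict(int)  # games_vs[(a, b)] = games a played against b
    for winner, loser in games:
        if winner in bots and loser in bots:
            wins[(winner, loser)] += 1
            games_vs[(winner, loser)] += 1
            games_vs[(loser, winner)] += 1

    def rate(a, b):
        return 100.0 * wins[(a, b)] / games_vs[(a, b)] if games_vs[(a, b)] else 0.0

    # Overall score: mean of the per-opponent win rates.
    overall = {a: sum(rate(a, b) for b in bots if b != a) / (len(bots) - 1)
               for a in bots}
    order = sorted(bots, key=overall.get, reverse=True)

    print(' | overall | ' + ' | '.join(b[:4] for b in order))
    print('|'.join(['---'] * (len(order) + 2)))
    for a in order:
        row = ['-' if a == b else f'{rate(a, b):.0f}%' for b in order]
        print(f'{a} | {overall[a]:.2f}% | ' + ' | '.join(row))
```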
Here’s my rendition of the real finals. I liked the red and green color coding of win rates in the original, but some people are red-green colorblind, so my version uses red and blue instead. I also went with a higher-contrast color curve (a sketch of the kind of color mapping I mean follows the table).
 | overall | tscm | Iron | Leta | ZZZK | Over | UAlb | Mega | Aiur
---|---|---|---|---|---|---|---|---|---
tscmoo | 65.14% | - | 52% | 44% | 79% | 71% | 77% | 83% | 50%
Iron | 54.43% | 48% | - | 38% | 49% | 49% | 74% | 30% | 93%
LetaBot | 53.71% | 56% | 62% | - | 49% | 81% | 69% | 30% | 29%
ZZZKBot | 53.08% | 21% | 51% | 51% | - | 42% | 35% | 93% | 78%
Overkill | 51.43% | 29% | 51% | 19% | 58% | - | 43% | 81% | 79%
UAlbertaBot | 49.07% | 23% | 26% | 31% | 65% | 57% | - | 76% | 66%
MegaBot | 38.00% | 17% | 70% | 70% | 7% | 19% | 24% | - | 59%
Aiur | 35.14% | 50% | 7% | 71% | 22% | 21% | 34% | 41% | -
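Here’s the kind of color mapping I mean, as a standalone sketch rather than the actual code behind the colored table: win rates above 50% shade toward blue, rates below 50% toward red, and a gamma value below 1 pushes middling rates toward the extremes for more contrast.

```python
def winrate_color(rate, gamma=0.6):
    """Map a win rate in [0, 100] to a hex color on a red-white-blue scale."""
    t = (abs(rate - 50.0) / 50.0) ** gamma  # 0 at an even record, 1 at 0% or 100%
    fade = round(255 * (1.0 - t))           # non-dominant channels fade as t grows
    if rate >= 50.0:
        r, g, b = fade, fade, 255           # winning records: blue
    else:
        r, g, b = 255, fade, fade           # losing records: red
    return f'#{r:02x}{g:02x}{b:02x}'
```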
Here is the crosstable of the final hidden in the qualifier, which is to say the qualifier games played between finalists (a usage sketch of the filtering step follows the table).
 | overall | tscm | Iron | Leta | ZZZK | Over | UAlb | Mega | Aiur
---|---|---|---|---|---|---|---|---|---
tscmoo | 61.71% | - | 44% | 48% | 82% | 53% | 87% | 75% | 43%
Iron | 56.57% | 56% | - | 39% | 53% | 56% | 63% | 38% | 91%
LetaBot | 52.00% | 52% | 61% | - | 51% | 81% | 60% | 28% | 31%
ZZZKBot | 52.14% | 18% | 47% | 49% | - | 44% | 45% | 93% | 69%
Overkill | 51.57% | 47% | 44% | 19% | 56% | - | 32% | 84% | 79%
UAlbertaBot | 48.29% | 13% | 37% | 40% | 55% | 68% | - | 48% | 77%
MegaBot | 42.86% | 25% | 62% | 72% | 7% | 16% | 52% | - | 66%
Aiur | 34.86% | 57% | 9% | 69% | 31% | 21% | 23% | 34% | -
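Producing that table is just a matter of feeding the qualifier games and the set of finalists to the crosstable sketch above; the variable name for the parsed qualifier results is a placeholder.

```python
finalists = {'tscmoo', 'Iron', 'LetaBot', 'ZZZKBot',
             'Overkill', 'UAlbertaBot', 'MegaBot', 'Aiur'}

# qualifier_games: list of (winner, loser) pairs parsed from the qualifier
# results. crosstable() drops any game involving a non-finalist, which is
# exactly the "final hidden in the qualifier".
crosstable(qualifier_games, finalists)
```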
The two crosstables match closely overall. LetaBot and ZZZKBot have switched ranks, but that’s not a surprise because their scores were extremely close.
The two table cells with the largest differences are tscmoo vs Overkill and MegaBot vs UAlbertaBot. The tscmoo-Overkill numbers are within the expected range of statistical variation, according to spot checks with Fisher’s exact test, but the MegaBot-UAlbertaBot numbers are highly surprising, far outside the expected range. (The right way to do this would be to test both whole tables against each other, as a sample of samples of samples. :-) So there’s an indication that something may be afoot.
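The spot check itself is a 2x2 Fisher’s exact test on a single pairing: wins and losses in the final versus wins and losses in the qualifier. Here is a sketch with scipy; the game counts are placeholders, since I’m assuming on the order of 100 games per pairing per stage rather than quoting the exact numbers.

```python
from scipy.stats import fisher_exact

def pairing_pvalue(final_wins, final_games, qual_wins, qual_games):
    """P-value that one bot's record vs. one opponent differs between stages."""
    table = [[final_wins, final_games - final_wins],
             [qual_wins, qual_games - qual_wins]]
    _, p = fisher_exact(table)
    return p

# Placeholder counts for MegaBot vs UAlbertaBot: 24% in the final and 52%
# in the qualifier (percentages from the tables above), assuming roughly
# 100 games per pairing in each stage.
print(pairing_pvalue(24, 100, 52, 100))
```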
I had a new thought. It’s theoretically possible that the differences are caused by learning bots which generalize across opponents. Tscmoo and MegaBot are both learning bots (I verified it: they both wrote to their learning files), and both seem as though they might be able to generalize across opponents. (Overkill is a learning bot but does not generalize.) So my original claim is not 100% true: the qualifiers don’t entirely duplicate the final in the presence of learning bots which generalize across opponents. Alternatively, there could have been a problem with a big effect on that pairing, such as a bug in MegaBot related to its learning, which amounts to the same thing as mis-generalizing across opponents. We have the source and the replays, so a sufficiently deep dig should turn up the issue if it is in the bots. There’s also a chance that the issue is with the tournament operations, or with my script.
Here I combine the qualifier results with the final results to get the best numbers available (a one-line sketch of the combination follows the table). The organizers, for whatever reason, explicitly decided not to do this. Luckily, it doesn’t change the ranking of the bots.
 | overall
---|---
tscmoo | 63.43%
Iron | 55.50%
LetaBot | 52.86%
ZZZKBot | 52.61%
Overkill | 51.50%
UAlbertaBot | 48.68%
MegaBot | 40.43%
Aiur | 35.00%
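With the sketch above, the combination is a one-liner: pool the raw game records from both stages and recompute, rather than averaging percentages (the two agree only when both stages have the same number of games per pairing). Again, the variable names are placeholders.

```python
# final_games and qualifier_games: (winner, loser) pairs from each stage.
crosstable(qualifier_games + final_games, finalists)
```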
Tomorrow: More map analysis. Also I’ll release the script for others to play with.