Overkill’s new learning 4 - the model and its features

What does it mean to have a linear model with binary features? “Linear” means that each feature comes with a number, its weight, so that with binary features you find Q(s,a) by adding up the weights for each feature that is present. Usually only a small proportion of all the features are present, so it’s not as crazy as it may sound.
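To make that concrete, here is a minimal sketch of evaluating a linear model over binary features. It is not Overkill's code; the names Weights and qValue are made up for illustration, and it assumes a feature with no learned weight simply contributes nothing.

```cpp
#include <map>
#include <set>
#include <string>

// Hypothetical type: one weight per feature name.
using Weights = std::map<std::string, double>;

// Q is the sum of the weights of the features that are present.
// Absent features contribute nothing, and (by assumption) neither
// do present features that have no learned weight yet.
double qValue(const Weights& weights, const std::set<std::string>& present)
{
    double q = 0.0;
    for (const std::string& name : present)
    {
        auto it = weights.find(name);
        if (it != weights.end())
        {
            q += it->second;
        }
    }
    return q;
}
```

The cost is proportional to the number of features present, not to the size of the model, which is why a model with thousands of weights can still be cheap to evaluate.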

Overkill gives its features long multi-part names, which it implements throughout as strings accessed via maps. (I was surprised to see that in a real-time program, but it’s probably easier.) The feature names are written out plainly in the I/O files. Here are a few scattered samples from the file feature_valueAiur, which lists 9638 features altogether:

action_battle_combine_state_battle_feature:enemyKeyBuilding_hasP_robotics_facility*hydraBuild:0.13396
action_battle_combine_state_battle_feature:enemyKeyBuilding_hasP_robotics_facility*mutaBuild:0.07588
action_battle_combine_state_battle_feature:enemyKeyBuilding_hasP_robotics_facility*zerglingBuild:0.06963
action_battle_combine_state_battle_feature:enemyKeyBuilding_hasP_stargate*hydraBuild:0.05439
action_battle_combine_state_battle_feature:enemyKeyBuilding_hasP_stargate*mutaBuild:0.10049
action_battle_combine_state_battle_feature:enemyKeyBuilding_hasP_stargate*zerglingBuild:0.26210

state_raw_combine_feature:enemyP_cannon_1*ourHydra_6:-0.21410
state_raw_combine_feature:enemyP_cannon_1*ourHydra_12:-0.43786
state_raw_combine_feature:enemyP_cannon_1*ourHydra_18:-0.08806
state_raw_combine_feature:enemyP_cannon_1*ourHydra_24:0.24174
state_raw_combine_feature:enemyP_cannon_1*ourHydra_36:0.42465
state_raw_combine_feature:enemyP_cannon_1*ourHydra_48:0.39939
state_raw_combine_feature:enemyP_cannon_1*ourHydra_60:0.52629
state_raw_combine_feature:enemyP_cannon_1*ourHydra_max:0.59403

state_tech_feature:ourKeyUpgrade_zerglingsAttackSpeed:2.33542
state_tech_feature:ourTechLevel_hatchery:2.28803
state_tech_feature:ourTechLevel_lair:0.25170
state_tech_feature:ourTechLevel_hive:1.48611

You can guess what the feature names mean: enemyP_cannon_1*ourHydra_6 means the enemy has 1 cannon and we have up to 6 hydralisks, for example. Pairing up counts like this is how it got so many features!
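Judging from the samples, each line of the file looks like category:name:weight. Here is a rough sketch of splitting such a line, assuming that neither the category nor the feature name contains a colon. That assumption is mine, inferred only from the lines shown above, and FeatureLine and parseFeatureLine are hypothetical names, not anything in Overkill.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical record for one line of a feature_value file.
struct FeatureLine
{
    std::string category;
    std::string name;
    double weight = 0.0;
};

// Split "category:name:weight" at the first and last colon.
// Returns false if the line does not have that shape.
bool parseFeatureLine(const std::string& line, FeatureLine& out)
{
    const std::size_t first = line.find(':');
    const std::size_t last = line.rfind(':');
    if (first == std::string::npos || first == last)
    {
        return false;  // fewer than two colons
    }
    try
    {
        out.weight = std::stod(line.substr(last + 1));
    }
    catch (const std::logic_error&)
    {
        return false;  // weight field is not a usable number
    }
    out.category = line.substr(0, first);
    out.name = line.substr(first + 1, last - first - 1);
    return true;
}
```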

Each opponent’s file seems to list a different number of features, probably leaving out features that never came up, so 9638 is not the total number of features. But there’s something here I don’t understand. 9638 is not divisible by 3. Each line gives one weight—shouldn’t there be 3 weights for each state, so that the 3 actions can all be evaluated?

Here’s the routine that calculates Q(s,a). Its arguments are reversed—it puts the action before the state.

double StrategyManager::calActionFeature(std::string curAction, std::map<std::string, std::map<std::string, int>>& features)
{
	for (auto categoryStateFeature : features)
	{
		if (categoryStateFeature.first == "state_raw_combine_feature" || categoryStateFeature.first == "state_building_feature")
		{
			for (auto stateFeature : categoryStateFeature.second)
			{
				std::string combineFeatureName = stateFeature.first + "*" + curAction;
				features["action_battle_combine_state_battle_feature"][combineFeatureName] = 1;
			}
		}
	}

	if (features["state_tech_feature"].find("ourKeyUpgrade_zerglingsAttackSpeed") != features["state_tech_feature"].end())
	{
		std::string combineFeatureName = std::string("ourKeyUpgrade_zerglingsAttackSpeed") + "*" + curAction;
		features["action_battle_combine_state_battle_feature"][combineFeatureName] = 1;
	}

	double curQValue = 0;
	for (auto categoryFeature : features)
	{
		for (auto curfeature : categoryFeature.second)
		{
			int curfeatureValue = curfeature.second;
			if (parameterValue.find(categoryFeature.first) != parameterValue.end() && parameterValue[categoryFeature.first].find(curfeature.first) != parameterValue[categoryFeature.first].end())
			{
				double curParameterValue = parameterValue[categoryFeature.first][curfeature.first];
				curQValue += curParameterValue * curfeatureValue;
			}
		}
	}
	return curQValue;

}

parameterValue holds the model. curAction is the action, and the features map with its nested type is the state. Having read this, I still don’t understand. The action name is coded into some feature names and not others; that is the + "*" + curAction we see above. The list of actions:

	stateActions = {"zerglingBuild", "hydraBuild", "mutaBuild"};

Here’s the call, the bit of code which chooses the action with the highest Q value. (Below this is another bit where it changes the action if it feels like exploring.)

		for (auto action : stateActions)
		{
			std::map<std::string, std::map<std::string, int>> actionFeatureValue = featureValue;
			double curQValue = calActionFeature(action, actionFeatureValue);

			if (curQValue > maxQValue)
			{
				maxQValue = curQValue;
				maxAction = action;
				maxFeatureValue = actionFeatureValue;
			}
		}

The call does nothing to differentiate actions. As far as I can tell, only the features which include the action in their names can be used to tell actions apart, and the other features are irrelevant constants that happen to be added in.
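To spell out the reasoning: if Q(s,a) is a sum that is the same for every action plus a sum that depends on the action, then the shared part cannot change which action comes out on top. A small sketch, with the hypothetical helper bestAction standing in for the loop above:

```cpp
#include <map>
#include <string>

// Hypothetical helper: choose the action with the largest
// Q(s,a) = sharedSum + actionSums[a], where sharedSum is the total
// weight of the features whose names do not mention the action.
std::string bestAction(double sharedSum,
                       const std::map<std::string, double>& actionSums)
{
    std::string best;
    double bestQ = 0.0;
    bool first = true;
    for (const auto& entry : actionSums)
    {
        const double q = sharedSum + entry.second;
        if (first || q > bestQ)
        {
            bestQ = q;
            best = entry.first;
            first = false;
        }
    }
    return best;
}
```

With the three robotics facility weights from the sample file as the per-action sums, hydraBuild wins whether sharedSum is 0, 2.28803, or anything else (floating-point rounding aside). The shared features shift every Q value by the same amount, so they affect the size of Q but not the choice.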

$ grep hydraBuild feature_valueAiur | wc -l
    2176
$ grep mutaBuild feature_valueAiur | wc -l
    2267
$ grep zerglingBuild feature_valueAiur | wc -l
    2403

So 2176+2267+2403 = 6846 features out of 9638 encode the build name in the I/O file for AIUR. As far as I can tell, the other 2792 features are irrelevant. And those 2792 features include some that look important. Surely you want to pay attention to what upgrades you have when you choose which units to make!

The number of features is different for each action. That means two things. 1. The fact that the total number of features is not divisible by 3 is meaningless. 2. Not all actions have been explored in the different states. As expected, the games played against AIUR were not enough to fill in the model.

Either I’ve misunderstood something, or Overkill’s learning has flaws. (I wouldn’t go so far as to say bugs; it is only a loss of effectiveness, not an error.) Can anybody correct me? I’ll contact Sijia Xu.

Next: How it fits into the rest of the program.
