Monday, June 30, 2008

latest NPR Weekend Edition puzzle

It was really easy, and can be found here, under the heading Next Week's Challenge.

From a 19th century trade card advertising Bassetts Horehound Troches, a remedy for coughs and colds: A man buys 20 pencils for 20 cents and gets three kinds of pencils in return. Some of the pencils cost 4 cents each, some are two for a penny and the rest are four for a penny. How many pencils of each type does the man get?


The facts can be related as (the sum of all three kinds of pencils is 20) and (the sum of the costs for the pencils is 20 cents). When you do elimination on these two equations, resulting in for the last equation, there's still a free variable! But if you step through all possible values for , you'll find that the only one where and are both integers (under the reasonable assumption that he can't have a fraction of any pencil) is two. If , then and .

Thursday, June 26, 2008

normal variates are krieg

Normally, I only really like word problems that force you to model the solution. This one, from a book I don't even like that much, was sufficiently cute to merit posting it along with its solution.

A random variable is normally distributed. If and , determine and .


The data appear pretty scant, but don't let that upset you, it's not too hard to think through. Notice that is normally distributed. In other words, (a linear transformation of the standardized normal distribution ), and so . If you multiply that term out, the expectation is .

Because the expectation is one (can be proven with calculus, but I'm leaving it out), the expectation of the first term is of course . The expectatation is obviously zero, because is standardized normal. And the last term is expectation of a constant, which is always that constant! So, all told, .

The next clue given is the value of the cdf at ten. Again, because is a normal variate, you can use this value to deduce further constraints on the values of and , which are the standard deviation and mean of , respectively. Since , do a reverse lookup on a table of values for . The value where is one. So solve for in and substitute its value in terms of into the expectation from earlier. Since , the problem boils down beautifully to middle school algebra from here. The roots of are eight and two. As such, it can be true either that (and thus that ) or that (and thus that ). And of course and (the variance).

At the bottom of it, everything is pretty basic. Problem is finding out how.

Wednesday, June 18, 2008

Tuesday, June 17, 2008

AI algorithm idea

I was playing Battle for Wesnoth earlier and became interested in the problem of partitioning all the enemy units according to which defensive position they're headed for. That interest was actually motivated by watching my enemy react to the movement of my units.

I don't know what approach Wesnoth uses ... it's probably much more advanced, but I proposed a simple solution that should be pretty accurate.

After the enemy units take a turn, compare the current position of each enemy unit to its previous position. If the new position reduces the distance from the enemy unit to one or more defensive position, take whichever one is closest and assume that the enemy is moving towards that defensive position.

The first solution you might think of is picking arbitrary units and doing a kind of breadth-first search around them until it finds clusters of enemies, then tracking these clusters. The simple yes/no question of whether a unit occupies a given hex grid asked repeatedly by such an algorithm would make it a lot cheaper than having to number crunch the distance for each enemy unit and defensive position ... when enemies move together in tight groups. But in a game like Wesnoth, units can spread out when using attrition tactics and even when using conventional tactics, if certain units have much greater movement rates than others. So the BFS approach doesn't work generally.

Still, this approach isn't computationally terrible. Since the total number of distances calculated is defensive positions times enemies, doubled because the current position has to be compared with the previous position in each pairing of defensive positions with enemy units. So the algorithm consistently runs at .

That's the basic approach, but there are a few ways it could be improved. A few suggestions from myself and others:

One of the considerations I had immediately after coming up with the algorithm is that feints and even honest temporary retreats sometimes break up an approach. A simple way to handle this fact is to assign each defensive position a number and keep a running average of how many times an enemy unit attempts to approach which defensive positions. The expected value, then, is the most likely target of the enemy unit. For instance, say a castle is labeled zero and an outlying keep is labeled one. If an enemy unit is seen to approach the castle four times but has to head away from powerful guards towards the keep one time, the expected value is .20. Round that down to zero and the enemy is considered to be approaching the castle.

Neural networks would be ideal for detecting feints and other maneuvers, but I don't want to get bogged down in that shit right now. Maybe when I'm doing grad research.

If parts of the terrain are impassible, it's reasonable to assume that enemy units moving in a certain corridor are headed in a certain direction. The way to track enemies according to those data would be to flag passages in between large tracts of difficult terrain—passages that are shorter than the paths around the masses that surround them—and take note whenever an enemy crosses them. That might not even be worth the effort though.

Another idea is to look at the algorithm sailors use to anticipate collisions, Closest Point of Approach. A variant of this algorithm can be very useful when the defenders are moving and when movement of units is continuous. (On a game board, time and space are discrete.)

Someone else suggested that, if there's any ambiguity in whether an enemy unit is approaching a given position (i.e., might be approaching two or more positions), there should be a probability of attack attached to each position to clear up the ambiguity. Calculating that probability might be difficult though. A castle is worth more for the enemy than a small keep, but, for example, a few lesser units stand no chance against a hardened facility. There are so many factors to take into account and it would probably be necessary to analyze games for each possible approach. Overall, I feel that this algorithm should be limited to gathering data on movement; threat assessment should be a separate task altogether.

Monday, June 16, 2008

explaining the variance of the Poisson distribution

I'm reading a lot about probability and statistics these days. It's very interesting and practical. Many of the formulae and proofs involved are pretty ugly, but one of the cuter proofs is the proof of the variance of the Poisson distribution. I guess if you read this far, you know what 'variance' and 'Poisson distribution' are, but for a brief reminder, 'variance' is the expected value of the square of the difference between each point on the pdf and the mean, i.e., . The Poisson distribution is a special case of the binomial distribution which tends to be useful when events happen sporadically, but with more or less the same frequency over time. A common example of such a real-life distribution would be how often errors occur on a noisy connection.

Anyway, on to the explanation of the variance itself. Recall that, for the Poisson distribution, , where is the expected value multiplied by units of time. (I'm using instead of because the event in the Poisson distribution is said to happen times.) Also recall that the Taylor expansion of is . (And you can look up the proofs for those things on your own.) Letting in the Taylor expansion above equal , you'll notice that dividing the series by results in , the sum of the terms of the Poisson pdf. Appropriately, they add up to one. That'll be useful in a moment.

Now, consider trying to figure out the variance using the naïve method. (remember, using ) can't possibly work because ! So one would essentially expect zero. Since expectation of a constant is always that constant and different Poisson distributions obviously have different variances, that doesn't work. Putting the expectation in the form doesn't work either, because (skip because it adds nothing to the series) turns into an ugly sequence , which can't be manipulated into the Taylor expansion above. The solution is to find the expectation . The sequence for ( is the first where the factorial term is valid) converges neatly. is the same as . The bracketed term of course converges to one, as seen above, and so is (or ). Then is . So the variance of the Poisson distribution is the same as the mean!