Monday, November 7, 2016

A Fourier combination week

The Oct 3 - Oct 9 week was calmer. Intel Code Challenge Final Round has gathered the top competitors for 3 hours instead of the usual 2 we see on Codeforces (problems, results, top 5 on the left, analysis). Maybe TooDifficult did not notice this, as he finished all problems within the first 2 hours anyway, and thus won easily :) Congratulations!

On Sunday, Open Cup 2016-17 Grand Prix of SPb has completed the 3-week run (results, top 5 on the left). Reminding us how things usually go in Open Cup contests, Past Glory team have won convincingly, and were also the only team to solve problem I - great job!

Here's what the problem was about. You're given the following random number generator, an implementation of the Fischer-Yates shuffle, and a program using them:

Function random (range):
  state := (state * 1664525 + 1013904223) mod 232;
  return (state * range) div 232;

Function shuffle (array):
  n := length of array;
  for i := 0, 1, 2, ..., n - 1:
    j := random (i + 1);
    swap(array[i], array[j]);
  return array;

state := seed;
n := 10000;
array := [1, 2, ..., n];
array := shuffle (array);
array := shuffle (array);

In other words, we shuffle the numbers from 1 to n twice. The particular generator used here is well-known, so no particular weaknesses could be expected except of course the fact that it's a linear congruential generator. The task is: given the final state of the array, find what the seed was.

In my previous summary, I've mentioned a tricky AtCoder combinatorics problem: You are given a tree with n<=200000 vertices. Now we consider all C(n,k) ways to pick k vertices of the tree, and for each of them we consider the "convex hull" of the k vertices: the smallest part of the tree that connects all of them together. Your goal is to find the sum of the sizes of those convex hulls over all C(n,k) ways. What's more, you need to find n sums: for each value of k between 1 and n. Each sum must be computed modulo 924844033.

Even though the problem statement doesn't mention probabilities and expected values, the solution starts with an observation that is very similar to the linearity of expectation: in order to find the sum of the sizes of the convex hulls, we will find for each vertex the number of convex hulls containing it, and then add up those numbers.

In order to find the latter, let's find the number of convex hulls not containing it, and subtract it from C(n,k). In order for the convex hull to not contain a given vertex, all k chosen vertices must lie completely within one of the subtrees formed by removing this vertex from the tree. So if the sizes of all 2(n-1) subtrees of the original tree are a1, a2, ..., a2(n-1), then we need to find ΣiC(ai, k). Finding all ai is a standard task solved by a traversal of the tree, so we can solve our problem for one value of k in O(n). However, since we have n possible values of k to process, the total running time will be O(n2) which is too slow.

The idea to speed up such computation from O(n2) to O(nlogn) using Fast Fourier Transform is not new, but I don't think I've described it in this blog before. Let's reformulate the problem slightly: let bi be the number of subtrees of size i. We need to find ΣiC(ik)*bi for each k. Now let's transform our sum:

ΣiC(ik)*bi = Σibi*i!/k!/(i-k)! = 1/k!*Σi(bi*i!)*(1/(i-k)!)

Now we can notice that the remaining sum is nothing else but multiplication of polynomials, which can be done fast using FFT. To make it completely clear, let's denote ui=bi*i!, and vj=1/(n-j)!, and multiply two polynomials with those coefficients. The (n+k)-th coefficient of the product will be Σiuivn+k-i = Σi(bi*i!)*(1/(n-(n+k-i)!) = Σi(bi*i!)*(1/(i-k)!), just as we need.

To conclude, it also seems that this technique can be applied to a wide variety of sums, not just those involving C(n,k). The only property that we need from the function being summed is that it must decompose as a product of functions of n, k, and n-k. However, C(n,k) seems to be the only practically relevant function with this property. Do you have an example of a different such function in mind, or maybe you can even point me to other problems solved using this trick?

Thanks for reading, and check back for more!

No comments:

Post a Comment