Sunday, February 21, 2016

A vertex-dominating week

The Feb 8 - Feb 14 week featured the 8VC Venture Cup 2016 Elimination Round by Codeforces (problems, results, top 5 on the left). With 7 problems there was plenty to choose from, and I think that problem E was the most difficult algorithmically, despite not being valued the highest. You are given a set of 200000 numbers, and need to find the subset where the difference between the mean and the median is the largest. At first glance, it seems that either some greedy approach could work, or the problem would require at least O(n^2) time to solve. However, it turns out neither is true!

The same week also featured the Open Cup 2015-2016 Grand Prix of China (results, top 5 on the left). Problem B had a very simple statement and yet an interesting solution: given a bipartite graph, find how many subsets of its vertices are vertex-dominating. A subset of vertices is vertex-dominating if every vertex of the graph either lies in this subset, or has a neighbor in this subset, or both. It is of course the constraints that make this problem interesting. The Open Cup problem stated that the graph has at most 30 vertices, but I encourage you to solve a slightly tougher problem: let's say the number of vertices is at most 45.
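
To make the definition concrete, here is a brute-force reference (just a sketch, with names of my own choosing, not from the problem): it checks every subset against the definition directly, so it's only feasible for much smaller graphs, say up to 20 vertices, and doesn't help with the actual challenge.

```cpp
#include <cstdint>
#include <cstdio>
#include <vector>

// Count vertex-dominating subsets by brute force in O(2^n * n);
// adj[v] is the bitmask of v's neighbors.
long long countDominating(int n, const std::vector<uint32_t>& adj) {
  long long result = 0;
  for (uint32_t s = 0; s < (1u << n); s++) {
    bool dominating = true;
    for (int v = 0; v < n && dominating; v++)
      // v must lie in s itself, or have a neighbor in s
      dominating = (s & (adj[v] | (1u << v))) != 0;
    result += dominating;
  }
  return result;
}

int main() {
  // A path on 3 vertices, 0 - 1 - 2: the vertex-dominating subsets are
  // {1}, {0,1}, {1,2}, {0,2} and {0,1,2}, five in total.
  std::vector<uint32_t> adj = {0b010, 0b101, 0b010};
  printf("%lld\n", countDominating(3, adj));  // prints 5
}
```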

In my previous post, I mentioned two interesting problems. The first one was about processing a sequence of 10 million numbers using just 1 MB of memory. To be more precise, you're given a way to generate the sequence: start with a given number X_0, then repeatedly compute X_i = ((X_{i-1} xor A) + B) mod 2^50, where A and B are also given. You're looking to find the radius of each element in the sequence, defined as the largest number k such that the previous k elements and the next k elements are all strictly less than the given element, and to return the sum of the radii of all elements.

There were two key observations required to solve this problem. First, one needed to pay close attention to the way the sequence is generated. In most cases such random generators are given in problem statements simply to avoid handling large input data. However, in this case the generator had one additional important property: it is reversible. Not only can we generate the number that follows a given number, we can also find the number that precedes it!
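
In code, both directions are one-liners. Here's a sketch, assuming A, B and X_0 are all below 2^50 (the function names are mine): to invert a step, we undo the addition first and the xor second.

```cpp
#include <cstdint>

const uint64_t MASK = (1ULL << 50) - 1;  // all arithmetic is modulo 2^50

// Forward: X_i = ((X_{i-1} xor A) + B) mod 2^50.
uint64_t nextX(uint64_t x, uint64_t a, uint64_t b) { return ((x ^ a) + b) & MASK; }

// Backward: undo the +B, then undo the xor with A. The unsigned
// wrap-around of x - b is harmless, since 2^50 divides 2^64.
uint64_t prevX(uint64_t x, uint64_t a, uint64_t b) { return ((x - b) & MASK) ^ a; }
```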

The second observation is that the answer can't be very big. More precisely, for a sequence of length N it's O(N log N): it's not hard to see that the elements that cover a given position with their radii lie exponentially further away (each next covering element must be larger than the previous one, and therefore must lie beyond the previous one's radius, at least doubling the distance). Together with the first observation, we can now write an extremely simple solution: just iterate over the sequence, and find the radius of each element in the most straightforward way, expanding to the left and to the right simultaneously using the ability to generate the next and previous elements! This solution runs in O(N log N) time and needs only O(1) memory.
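
Here's a sketch of the whole scan, reusing nextX and prevX from above. The one detail that matters for the running time is to grow the candidate radius on both sides simultaneously, so that element i costs O(radius_i + 1) work instead of a potentially long one-sided run.

```cpp
#include <cstdint>

uint64_t sumOfRadii(uint64_t x0, uint64_t a, uint64_t b, int64_t n) {
  uint64_t total = 0;
  uint64_t x = x0;  // the current element X_i
  for (int64_t i = 0; i < n; i++) {
    uint64_t left = x, right = x;
    int64_t k = 0;
    // Extend one step to the left and one step to the right at a time;
    // stop as soon as either neighbor is >= X_i or we hit a boundary.
    while (i - (k + 1) >= 0 && i + (k + 1) < n) {
      left = prevX(left, a, b);    // element X_{i-(k+1)}
      right = nextX(right, a, b);  // element X_{i+(k+1)}
      if (left < x && right < x) k++;
      else break;
    }
    total += k;
    x = nextX(x, a, b);
  }
  return total;
}
```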

The other interesting problem was about speeding up exponential dynamic programming. You're given a 10x10 chessboard with some cells removed. You need to find any connected set of cells on it that has exactly W white cells and exactly B black cells, or report that there isn't any.

One could readily imagine a dynamic programming solution that goes row by row, remembering how many white cells we've taken so far, how many black cells we've taken so far, which cells are taken among the last jagged row, and which connected components they form. However, such a DP has too many states (roughly 10000 states of the jagged row with connected components, times 100 steps, times 50 white cells, times 50 black cells equals 2.5 billion, and although not all those states are reachable, a big fraction of them are) and will never run in time.

In order to speed it up, instead of remembering both the number of white and the number of black cells taken, we will find the maximum number of black cells for each number of white cells. This reduces the number of states by a factor of 50, and the solution now fits into the time limit. However, this doesn't give us the answer we need: in the end we need a concrete number of black cells, and we'll only know the maximum.

Here comes the main idea, which in my opinion is amazing. If we have a connected set with a given number of white cells, but with more black cells than we need, we can start removing black cells while keeping the set connected. We can't do this forever, though - at some point removing any black cell will cause the set to become disconnected. However, we can notice that in such a state the number of white cells is always strictly greater than the number of black cells! (Indeed, take a spanning tree of the set: every leaf must be white, as a black leaf could be removed; and since every edge connects a white cell to a black cell, counting the tree's white + black - 1 edges from the black side, where every black cell now has degree at least 2, gives white + black - 1 >= 2 * black, i.e., white > black.) In other words, this simple solution will work just fine if the number B of black cells we need is at least the number W of white cells we need. And what to do if it's less? Well, we can then swap the colors and apply the same solution!
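
Here's a sketch of that trimming step, under an encoding of my own (cells as (row, column) pairs, a cell being black when row + column is odd): repeatedly remove any black cell whose removal keeps the set connected. By the argument above, when B >= W this cannot get stuck before the black count drops to exactly B.

```cpp
#include <queue>
#include <utility>
#include <vector>
using namespace std;

const int N = 10;  // board size

// Is the set still connected after dropping cells[skip]?
bool connectedWithout(const vector<pair<int, int>>& cells, int skip) {
  int remaining = cells.size() - 1;
  if (remaining <= 1) return true;
  vector<vector<int>> id(N, vector<int>(N, -1));
  for (int i = 0; i < (int)cells.size(); i++)
    if (i != skip) id[cells[i].first][cells[i].second] = i;
  // BFS from any remaining cell and count how many we can reach.
  vector<bool> seen(cells.size(), false);
  queue<pair<int, int>> q;
  int start = (skip == 0) ? 1 : 0;
  seen[start] = true;
  q.push(cells[start]);
  int reached = 1;
  int dr[] = {1, -1, 0, 0}, dc[] = {0, 0, 1, -1};
  while (!q.empty()) {
    auto [r, c] = q.front(); q.pop();
    for (int d = 0; d < 4; d++) {
      int nr = r + dr[d], nc = c + dc[d];
      if (nr < 0 || nr >= N || nc < 0 || nc >= N) continue;
      int j = id[nr][nc];
      if (j >= 0 && !seen[j]) { seen[j] = true; reached++; q.push({nr, nc}); }
    }
  }
  return reached == remaining;
}

// Remove black cells one by one until exactly targetBlack remain;
// as argued above, with B >= W a removable black cell always exists.
void trimBlack(vector<pair<int, int>>& cells, int targetBlack) {
  int black = 0;
  for (auto& [r, c] : cells) black += (r + c) % 2;
  while (black > targetBlack)
    for (int i = 0; i < (int)cells.size(); i++) {
      auto [r, c] = cells[i];
      if ((r + c) % 2 == 1 && connectedWithout(cells, i)) {
        cells.erase(cells.begin() + i);
        black--;
        break;
      }
    }
}
```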

Thanks for reading, and check back soon for last week's summary!
