Algorithms Weekly by Petr Mitrichev: A caterpillar week

Codeforces Round 635 was the first round of last week (problems, results, top 5 on the left, analysis). Problems A, B, C and E1 were solved by all top competitors, but the next step was less clear. maroonrk went for the hardest problem F, solving it with just two minutes remaining. However, this was not enough to beat boboniu who did not notice the additional difficulty in E2, solving both E1 and E2 together, and then wrapped up their victory with D. Congratulations to both!

TopCoder Open 2020 Round 1A then started a busy weekend (problems, results, top 5 on the left, parallel round results, analysis). Most top competitors got a bye into Round 2, therefore the scoreboard looked more yellow this time. Pasqual45's points just from the coding phase would be enough for the first place, but they made sure of the win with 100 more challenge points. Well done!

Open Cup 2019-20 Grand Prix of Tokyo was next (problems, results, top 5 on the left, analysis). All teams solving 4+ problems got E, F, H and I, but the remaining problems were significantly more difficult. Even though 9 problems were solved by at least one team, the best teams only got 6. Our team was close to 6 problems as well, as our last submission on K was correct algorithmically but had an off-by-one error. Congratulations to teams ext71, Polish Mafia and USA1 on actually getting to 6!

Solving problem E involved this wonderful feeling where a relatively natural statement hides a solution with multiple interesting layers that "click" together. You are given k (k<=200) distinct positive integers a_i (a_i<=10⁵). Consider all sequences of length n (n<=10¹⁸) with all elements equal to one of the given integers. Count the number of those sequences with the given sum s (s<=10¹⁸). Is that number odd or even?

And finally, Google Code Jam 2020 Round 1B wrapped up the last week (problems, results, top 5 on the left). xll114514 was the first to finish all problems, but mnbvmar was the fastest of the European competitors who didn't want to wake up at 3am for Round 1A, and had one less incorrect attempt than xll114514. Congratulations on the win!

In my previous summary, I have mentioned two problems. The first one came from TopCoder: you are given a directed tournament with n<=23 vertices. You are then constructing a "power" of this tournament with n^k vertices (k<=1000), with each vertex described with a k-tuple of vertices of the original tournament. The direction of the edge between two k-tuples is given by the direction of the edge between the vertices that appear in the first position the k-tuples differ. For example, the direction of the edge between (a, b, a, c) and (a, b, d, e) is the same as the direction of the edge between a and d in the original tournament. How many subsets of vertices of this big graph induce a strongly connected subgraph, modulo 998244353?

First, let's deal with the k-th power. Consider a set S of vertices of the power graph. Each of those vertices is a k-tuple of vertices of the original graph. Let's consider the first position p in those tuples which has at least two different vertices. For example, if S has vertices (a, b, a, c), (a, b, a, d) and (a, b, b, c), then we're interested in the third position since both a and b appear there, while in the first position we always have a, and in the second position we always have b.

Now let's consider the set T of all vertices of the original graph that appear in this position. If S is strongly connected, then T is also strongly connected, because any path can be "projected" to T and still be a valid path due to the construction of the power graph.

Now suppose T is strongly connected. It turns out that then S is strongly connected as well, because when there's an edge from a to b in the original graph, there's an edge from any vertex with a in position p to any vertex with b in position p, therefore in order to find a path from any vertex in S to any other vertex in S it suffices to find a path between the corresponding vertices in the p-th position, and then pick any matching vertices for the intermediate stopping points. Note that we rely on the fact that T has at least two vertices here, as we need a cycle in T to build a path between two vertices in S that have the same vertex in the p-th position.

We have found that the strong connectivities (is there such a word?) of S and T are equivalent. Therefore if we can solve our problem for k=1, we can solve it for any k: consider a subset T of size m>=2 of the original graph that is strongly connected. How many sets S correspond to T in the p-th position? For each of the m vertices of T we have n^k-p vertices of the power graph with this vertex in the p-th position, and we need to take any non-empty subset of those, so in total we get (2^{n^k-p}-1)^m subsets.

Here comes the more interesting part of the problem: what to do for k=1? More specifically, how to count the number sets of vertices of each size that induce a strongly connected subgraph in a given graph with n<=23 vertices?

My approach was based on the Floyd-Warshall algorithm. We would like to run it on all 2ⁿ subsets, which would take O(2ⁿ*n³), which is too slow for n=23. We can replace the inner loop with one bitwise or operation, bringing the running time down to O(2ⁿ*n²) — still too slow.

However, we can now merge the iteration over subsets with the Floyd-Warshall algorithm! The outer loop of Floyd-Warshall iterates over intermediate vertices, so if we take a recursive algorithm that iterates over the 2ⁿ subsets, we can do one iteration of Floyd-Warshall outer loop as soon as we've decided that a vertex belongs to the subset. Therefore we will need to do at most one iteration of the Floyd-Warshall outer loop at each level of recursion, bringing our running time to O(2ⁿ*n+2^n-1*n+2^n-2*n+...)=O(2ⁿ*n)!

Here is the relevant part of my contest submission with some comments:

// Recursive search: we have decided that from the first
// "at" vertices we have taken the subset "mask",
// and the intermediate state of Floyd-Warshall after
// iterating over them is in "preach".
private void rec(int at, int mask, int[] preach) {
if (at >= n) {
      // Check if the graph is strongly connected.
      // "preach" already has the transitive closure for the
      // chosen subset, so we check if it's complete.
int count = 0;
for (int i = 0; i < n; ++i) if ((mask & (1 << i)) != 0) {
++count;
if ((preach[i] & mask) != mask) {
return;
}
}
if (count >= 2) {
++res[count];
}
return;
}
// Do not take at-th vertex.
rec(at + 1, mask, preach);
// Take at-th vertex, meaning that we need to do one more
// iteration of Floyd-Warshall outer loop.
// To avoid allocating new arrays in Java too often,
// I have pre-allocated one array per level of recursion
// which I overwrite here.
int[] nreach = reach[at + 1];
System.arraycopy(preach, 0, nreach, 0, n);
for (int i = 0; i < n; ++i) if ((nreach[i] & (1 << at)) != 0) {
nreach[i] |= nreach[at];
}
rec(at + 1, mask | (1 << at), nreach);
}

The second problem was from Codeforces: you are given a tree with n<=10⁵ vertices. You want to draw n closed curves on the plane, one per vertex, in such a way that two curves intersect if and only if the corresponding vertices are adjacent in the tree. What is the maximum size of a chain of curves that are nested in one another (but do not intersect) in such drawing?

The space of possibilities for drawing arbitrary curves on the plane is enormous, so it's not clear how to approach this problem. First, we can notice that the curves for adjacent vertices intersect, and therefore can't be nested in one another. Is that the only restriction? In other words, can we choose the curves in such a way that any two curves either intersect (when the corresponding vertices are adjacent) or are nested (if they're not)? At first I believed it was always possible, then finding the size of the longest chain is just finding the size of the largest independent set in a tree, which can be done with a simple greedy from the leaves. I made a quick submission, which turned out to be incorrect :)

Then I started to think when does this approach fail. After some doodling on paper, I have found a relatively small counterexample (on the left). Moreover, I have also come up with a description of cases where it does in fact work: they all seemed to be caterpillar trees! After some more doodling, I was able to build the set of curves that achieves the perfect nesting for any caterpillar tree (on the right, the curves corresponding to the main path of the caterpillar are red, and the dotted blue curves corresponding to the legs of the caterpillar could be multiple at each level).

Finding the size of the largest independent set in all sub-caterpillars of the given tree is slightly more tricky, but still doable with dynamic programming on the tree. However, is it the correct solution? I did not have a proof, but after so much doodling that didn't provide a counterexample it felt right, so I decided to try with another submission, which worked. You can find the proof in the editorial :)

Thanks for reading, and check back for this week's summary!

Algorithms Weekly by Petr Mitrichev

Sunday, April 26, 2020

A caterpillar week

No comments:

Post a Comment