Algorithms Weekly by Petr Mitrichev: July 2016

Sunday, July 24, 2016

A fixed-parameter-tractable week

This week had quite a few competitions, culminating in the ultimate TCO elimination round where one had to place in top 4. But before we come to that, let's start with TopCoder SRM 695 on Tuesday (problems, results, top 5 on the left, my screencast).

The medium problem was from a relatively new but growing family of problems inspired by the theoretical concept of fixed-parameter tractability: you are given an undirected graph with at most 1000 vertices and at most 1000 edges, such that each vertex has at most three adjacent edges. Each vertex also has a weight, and you need to pick at most K edges maximizing the total weight of vertices covered by (adjacent to at least one of) the picked edges. The interesting aspect is that K is really small: at most 7. So you're allowed algorithms exponential in K, but polynomial in the size of the graph. Can you come up with one?

After a 30-minute break, Codeforces Round 363 followed (problems, results, top 5 on the left, my screencast, analysis). Some of the problems in this round were from the VK Cup onsite finals, but I've not yet seen the mapping between this round's problems and the onsite final round problems. Does anybody have it?

The hardest problem in this round was a great example of slowly and gradually (one might even say tediously :)) unwinding a problem, combining ideas from combinatorics and arithmetic - as opposed to problems that require just one, if brilliant, idea to be solved. The problem statement was relatively simple: we define a permutation of integers between 1 and n to be coprime when a pair of its elements is coprime if and only if their 1-based indices are coprime. Given a partially filled permutation (some numbers known, some arbitrary), how many ways are there to complete it as a coprime permutation?

Codeforces Round 364 on Friday was also based on VK Cup onsite finals problems (problems, results, top 5 on the left). Ainta has solved the very difficult problem D in just under 40 minutes, and was able to cruise to an easy victory afterwards. Congratulations!

I'm still curious how to solve the even more difficult problem E, as string problems were always my weak spot and I'd love to learn more about them. You are given a string with 200000 characters. You need to form the longest possible sequence of strings starting with the given string, and such that each following string appears as a substring at least twice in the preceding string (those two occurrences may partially overlap).

And finally, we come to TopCoder Open 2016 Round 3A, where only the top 4 participants would earn a spot in the onsite competition (problems, results, top 5 on the left, my screencast). In the aftermath of this round, an interesting discussion about appropriate problems for a high-level tournament round ensued. What's your view?

I've enjoyed all three problems in this round, but the 250 seems to be the most well-received by everybody. You are given a permutation of integers between 0 and 2n-1. You need to come up with any balanced parenthesis sequence that stays balanced when we apply the given permutation to it (i.e., if the 5-th element of the permutation is 9, then the parenthesis on position 5 of the first sequence moves to position 9 in the second sequence). n is up to 25.

Thanks for reading, and check back next week!

A lull week

Last week was the lull before the storm, with just one contest among the usual suspects: Codeforces Round 362 (problems, results, top 5 on the left, analysis). TooDifficuIt has won and confirmed his third place in the overall rating list - in fact, he's now much closer to the top two than the fourth place is to him. Congratulations!

Thanks for reading this super short summary, and check back soon for this week's contests which are aplenty.

Monday, July 11, 2016

A testcase preparation week

CodeChef SnackDown 2016 Final Round in Mumbai has continued the run of two-person team contests this week (problems, results, top 5 on the left). The winning team was 50% the same as last week - congratulations Gennady and Boris!

TopCoder SRM 694 followed a few hours later (problems, results, top 5 on the left, my screencast). The problems were quite standard, and jcvb and xudyh showed great mastery of flow algorithms to solve the 900 very fast and keep the battle for the first place just between themselves. In the end jcvb emerged victorious by a mere 0.09 points - congratulations!

Unusually for TopCoder there was quite some time to spare at the end of the coding phase, and thus one had the opportunity to prepare for the challenge phase thoroughly. The required mindset is quite similar to that of a problemsetter: one needs to come up with testcases that would fail various potential incorrect solutions, and ideally with ones that are likely to fail even unforeseen incorrect solutions, too.

Here's the approach I've chosen this time for the easy problem (you can see it in more detail, but a bit slowly, on the screencast): let's construct random testcases that have many parts, and yet exactly one way to group the parts to achieve the maximum score. An incorrect solution in this problem is likely to be some kind of greedy instead of dynamic programming, and having exactly one solution makes sure that there are many more ways for the greedy to make mistakes. Having the testcase random instead of manually crafted ensures that it doesn't become too well-structured, as greedy solutions might actually be correct for well-structured cases.

More precisely, I've started with just three numbers which would be the xors of the groups in the final answer, and then repeatedly tried to replace any number x with (x xor y, y), where y is a random number, checked if there's still just one way to achieve the maximum, and if yes, then kept the replacement. I've generated a few cases using this approach, and picked the one with more components, and without components that looked too simple (powers of 2).
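The splitting procedure above can be sketched roughly as follows. This is only an illustration under loud assumptions: the text does not spell out the scoring rule, so the sketch assumes the parts are split into three non-empty groups and the score is the sum of the group xors. The names count_best_groupings and generate_case, the value range and the attempt limit are all made up for this example, and the uniqueness check is a plain brute force that only works for a handful of parts.

```python
import random
from itertools import product


def count_best_groupings(parts):
    """Count the splits of parts into three non-empty groups that reach the
    maximum score. ASSUMPTION: the score is the sum of the three group xors."""
    best, count = -1, 0
    for labels in product(range(3), repeat=len(parts)):
        # Count every grouping once: labels must first appear in the order 0, 1, 2.
        # This also guarantees that all three groups are non-empty.
        if list(dict.fromkeys(labels)) != [0, 1, 2]:
            continue
        xors = [0, 0, 0]
        for value, group in zip(parts, labels):
            xors[group] ^= value
        score = sum(xors)
        if score > best:
            best, count = score, 1
        elif score == best:
            count += 1
    return count


def generate_case(target_parts=7, attempts=300, max_value=255, seed=1):
    """Start from three numbers and keep splitting x into (x xor y, y)
    for as long as the best grouping stays unique."""
    rng = random.Random(seed)
    parts = [rng.randint(1, max_value) for _ in range(3)]
    for _ in range(attempts):
        if len(parts) >= target_parts:
            break
        i = rng.randrange(len(parts))
        y = rng.randint(1, max_value)
        if parts[i] ^ y == 0:
            continue  # skip splits that would create a zero part
        candidate = parts[:i] + [parts[i] ^ y, y] + parts[i + 1:]
        if count_best_groupings(candidate) == 1:
            parts = candidate  # keep the replacement
    return parts
```

Running generate_case with a few different seeds and then picking the output by eye mirrors the manual selection step described above.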

When I opened a greedy solution during the challenge phase (screencast pointer), I could then successfully use the testcase prepared in advance without crafting a case specifically against that solution. My other challenge (screencast pointer) was a bit more straightforward :)

How did you prepare for the challenge phase this time? Or maybe you remember a useful challenge phase preparation trick you've used earlier?

Codeforces held the online mirror of the 2016 Helvetic Coding Contest on Sunday (problems, results, top 5 on the left). The onsite version took place quite a while ago, and had slightly different problemset and rules. Congratulations to team Zg on earning the victory with more than an hour to spare, and on coming head and shoulders ahead of all onsite teams as well!

Thanks for reading, and see you next week!

Saturday, July 9, 2016

An under 23 week

The last week was exclusively on Codeforces. First, Codeforces Round 360 on Wednesday provided top competitors ample time both for solving all problems and for challenging the solutions of others (problems, results, top 5 on the left, analysis). The top 3 stood out because of the challenges, but TooDifficult has also executed everything in the right order - solving before spending too much time on challenging - and thus claimed the first place. Congratulations!

VK Cup 2016 Finals was the main event of the weekend (results, top 5 on the left). Unlike last year, the winning team, built from the top two teams of 2015, didn't really leave the others any chance. Congratulations on the super convincing victory, Adam and Gennady!

My previous summary included a nice TopCoder problem: you are given just one integer k between 1 and 10⁹, and need to construct a bipartite graph with exactly k perfect matchings. Not any such graph will do: it must have at most 20 vertices in each part, and at most 120 edges. There can be more than one edge between the same two vertices, and all those edges are distinct for the purpose of counting the number of perfect matchings. Apart from those restrictions, you're free to construct the graph in any way you like.

I think there are two main approaches to this kind of problem. One is to start with a random solution, and then keep modifying it towards the required property. I don't have a good feeling whether it would've worked in this problem, although I wouldn't be surprised if it did. The other is to learn how to build answers for some values of k, and how to combine those together, and then represent the required number as a sum or product of primitives we know how to build.

Those primitives in this problem are somewhat atypical: the values of k=3ⁿ. It's quite straightforward to construct a graph with exactly 3ⁿ perfect matchings: it will have n vertices in each part, and for each i it will have 3 parallel edges connecting the i-th vertex in the first part with the i-th vertex in the second part. We haven't yet exceeded our limit of 20 vertices in each part, as 3²⁰ is more than 10⁹. Every other value of k can be represented in base-3 as a sum of powers of 3 (each taken 0, 1 or 2 times), but how do we achieve addition of the numbers of perfect matchings?

Well, addition in combinatorial problems usually means that there's a choice: variant a leads to x possibilities, variant b leads to y possibilities, so if we have to choose exactly one of a or b we have x+y possibilities. And matchings lend themselves well to having choices: for each vertex, we have to choose which vertex it will match. So if we want to build a graph with 3ⁿ+3^m perfect matchings, we can make one of its vertices have exactly two adjacent edges: picking one of them would lead to 3ⁿ possibilities of matching the rest of the graph, and the other would give 3^m.

And here comes the "pull an idea out of the blue sky" moment: let's start with a graph with 19 vertices in each part and 3 parallel edges connecting the i-th vertex in the first part with the i-th vertex in the second part, as before. Now let's add a 20-th vertex to both parts, without adding any edges initially. Then we add one edge from the i-th vertex of the first part to the (i+1)-th vertex of the second part for all i between 1 and 19. The 20-th vertex in the first part still has no adjacent edges: this vertex will be the pivot representing addition. If we want to add 3ⁱ perfect matchings for any i between 0 and 19, we'll add an edge from the 20-th vertex in the first part to the (i+1)-th vertex in the second part. Were a perfect matching to include this edge, it's not hard to see that the matching of vertices with numbers (i+1) and up is uniquely determined - we have to use the diagonal edges. The matching of vertices with numbers up to i, on the other hand, must use the horizontal edges, with 3 choices for each, and thus we have exactly 3ⁱ choices in total.

We can represent any sum of powers of 3 between 3⁰ and 3¹⁹, possibly with repetitions, in this way. The base-3 representation of any k up to 3²⁰-1 will require at most 2*20=40 summands, which require 40 edges in our graph. In addition to those, we have 3*19 horizontal edges and 19 diagonal edges, so the total number of edges doesn't exceed 40+4*19=116, which is good enough. Also note that the boundary is quite tight, so other similar constructions might not work.
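To make the construction concrete, here is a minimal sketch of it together with a small counter that verifies the number of perfect matchings. The function names and the representation (a 20x20 matrix of edge multiplicities, rows for the first part, columns for the second, 0-based) are choices made for this example only. The counter is a generic row-by-row subset DP, which is fast here only because almost every row has just two non-zero entries.

```python
def build_graph(k, size=20):
    """Return a size x size matrix of edge multiplicities whose bipartite
    graph has exactly k perfect matchings (needs 1 <= k < 3**size)."""
    assert 1 <= k < 3 ** size
    m = [[0] * size for _ in range(size)]
    for i in range(size - 1):
        m[i][i] = 3      # three parallel "horizontal" edges
        m[i][i + 1] = 1  # one "diagonal" edge
    # The last row is the pivot: the base-3 digit of 3**i becomes the
    # number of edges from the pivot to column i.
    i = 0
    while k > 0:
        m[size - 1][i] = k % 3
        k //= 3
        i += 1
    return m


def count_perfect_matchings(m):
    """Count perfect matchings, treating parallel edges as distinct."""
    n = len(m)
    ways = {0: 1}  # bitmask of used columns -> number of ways
    for row in m:
        nxt = {}
        for mask, cnt in ways.items():
            for j, mult in enumerate(row):
                if mult and not (mask >> j) & 1:
                    key = mask | (1 << j)
                    nxt[key] = nxt.get(key, 0) + cnt * mult
        ways = nxt
    return ways.get((1 << n) - 1, 0)
```

For example, count_perfect_matchings(build_graph(10**9)) gives back 10**9, and sum(map(sum, build_graph(10**9))) stays within the limit of 120 edges.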

Thanks for reading, and check back for this week's summary!

Monday, July 4, 2016

A perfectly matching week

The June 20 - June 26 week was much calmer than the previous one. There were two "regular" rounds, the first being Codeforces Round 359 on Thursday (problems, results, top 5 on the left, analysis).

The second regular round was TopCoder SRM 693 on Saturday (problems, results, top 5 on the left, my screencast).

The medium problem was from the rising category of "constructive" problems. You are given just one integer k between 1 and 10⁹, and need to construct a bipartite graph with exactly k perfect matchings. Not any such graph will do: it must have at most 20 vertices in each part, and at most 120 edges. There can be more than one edge between the same two vertices, and all those edges are distinct for the purpose of counting the number of perfect matchings. Apart from those restrictions, you're free to construct the graph in any way you like. Can you solve this one?

Challenge24 in Budapest was the onsite competition of the week, challenging the brave with 24 hours of problem solving (problems, results, top 5 on the left). The scores were extremely close in the final standings, and the winner was team Dandelion - congratulations!

In last week's summary, I once again mentioned quite a few interesting problems - let's analyze the IPSC one here. As a reminder, we're studying two permutations of 2n objects, called L and R. The permutation L is constructed like this: the first n objects in the old order go to odd-numbered positions in the new order without shuffling, and the last n objects in the old order go to even-numbered positions in the new order, without shuffling. The permutation R does almost the same, but the first n objects go to even-numbered positions, and last n go to odd-numbered positions. You are given three numbers: n, a and b. An object is currently on position a in the permutation of 2n objects, and we want to put it to position b using only operations L and R. Construct any shortest sequence of those two operations that achieves it.

Solving this problem involved studying the properties of operations L and R, trying to understand how they work. Such exploration involved several dead ends which I won't describe here, so it might seem that we're progressing almost miraculously :)

First, we will solve this problem from the end towards the start - in other words, let's consider the reverse operations L' and R'. How do they transform our current position x? It's not hard to see that L' moves odd-numbered positions x to x/2 and even-numbered positions x to x/2+n, and R' maps odd positions to x/2+n and even positions to x/2, where "/" stands for integer division (rounding down), and positions are numbered from 0 to 2n-1.

In other words, we repeatedly do the following: remove the last bit of our number, then choose to add or not to add n. Adding n after removing i last bits can also be viewed as adding n*2ⁱ in the very beginning. Assuming we do k steps in total, we can add n*2, n*4, ..., n*2^k, and that means we can add n*m for any even number m between 0 and 2^(k+1)-2.

After all the additions are done, we remove k last bits, and want to obtain the given number a. It means that before removing the k last bits, our number has to be between a*2^k and (a+1)*2^k-1.

Which means that the problem has boiled down to: does there exist an even number m between 0 and 2^(k+1)-2 such that a*2^k <= b+n*m <= (a+1)*2^k-1?

This is of course easy to check for any given k. Moreover, we can see now that once n<=2^k and b/(a+1)<2^k there will always be such m, so we need to try only a logarithmic number of k values before we find the answer! Reconstructing the sequence of L and R operations knowing the values of k and m is an exercise in carefully traversing the above reasoning.
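The check for a given k is just a bit of integer arithmetic, and a sketch of it might look like this. The function names are invented for the example, positions are 0-based as above, and the upper limit on k in the loop is a deliberately generous assumption of this sketch rather than the exact bound from the text. Reconstructing the actual sequence of L and R operations is not attempted here.

```python
def even_multiplier(n, a, b, k):
    """Return the smallest even m in [0, 2^(k+1)-2] such that
    a*2^k <= b + n*m <= (a+1)*2^k - 1, or None if there is none."""
    lo = a << k
    hi = ((a + 1) << k) - 1
    step = 2 * n  # writing m = 2*t gives b + n*m = b + step*t
    t_min = max(0, -((b - lo) // step))          # ceil((lo - b) / step)
    t_max = min((1 << k) - 1, (hi - b) // step)  # floor((hi - b) / step)
    return 2 * t_min if t_min <= t_max else None


def shortest_length(n, a, b):
    """Try k = 0, 1, 2, ... and return the first (k, m) that works."""
    # Generous cap: once 2^k >= 2n a suitable m always exists.
    for k in range((2 * n).bit_length() + 1):
        m = even_multiplier(n, a, b, k)
        if m is not None:
            return k, m
    return None
```

With a equal to b the loop stops immediately at k=0 and m=0, matching the fact that no operations are needed.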

I find the way the solution flows from combinatorics to arithmetic in this problem quite beautiful. The solution explained in the official analysis skips the "from the end" part, and thus ends up being even simpler - but then it's not as satisfying to make it work :)

Thanks for reading, and check back soon for this week's summary!

Sunday, July 3, 2016

A half-plane week

The June 13 - June 19 week contained a lot of important qualification rounds for various onsite competitions. It started with Yandex.Algorithm 2016 Round 3 on Monday morning (problems requiring Yandex login, results, top 5 on the left, analysis). For the third time in a row exactly one contestant could solve all problems, and this time with 5 blind submissions - big congratulations to umnik2296!

However, most of the other results suggest that this problemset strongly favored submitting problems in the open; in particular, problem C had fewer than 10% of submissions correct. Unfortunately I was one of those caught out by the trick in this problem, and thus couldn't qualify for the final round - good luck to all those who did!

I think problem D was the most beautiful in this set, and it was also quite tricky with just 34% success rate. You were given... nothing, but you could ask questions. In each question, you could name an integer not exceeding 10¹⁸ by absolute value, and you were told whether this number is black or white. You also knew that the structure of black/white numbers is very regular: more precisely, there exists a positive integer l not exceeding 10¹⁷ such that black/white numbers form alternating blocks of l consecutive numbers each: numbers from a to a+l-1 are white, numbers from a+l to a+2l-1 are black, numbers from a+2l to a+3l-1 are white again, and so on both in positive and negative directions. However, you don't know the values of a and l. Your goal is to determine the value of l using at most 2000 questions.

Internet Problem Solving Contest 2016 is the only non-elimination round in this summary, but it stands out in many other ways, attracting many retired contestants together with the active ones (problems, results, top 5 on the left, analysis). Big congratulations to the unnamed team from Taiwan on winning it by a 2-point margin!

Problem F in this round was an interesting exercise in researching and gradually gaining an understanding until the solution becomes clear. The research subject is two permutations of 2n objects, called L and R. The permutation L is constructed like this: the first n objects in the old order go to odd-numbered positions in the new order without shuffling, and the last n objects in the old order go to even-numbered positions in the new order, without shuffling. The permutation R does almost the same, but the first n objects go to even-numbered positions, and last n go to odd-numbered positions. You are given three numbers: n, a and b. An object is currently on position a in the permutation of 2n objects, and we want to put it to position b using only operations L and R. Construct any shortest sequence of those two operations that achieves it.

TopCoder Open 2016 Round 2C has finalized the list of 120 Round 3 participants who will compete for just 8 onsite spots (problems, results, top 5 on the left, parallel round results, my screencast, analysis). Congratulations to liymsheep on the convincing victory!

Here are the 120 qualified contestants grouped by country:
Russian Federation - 30: Petr, Merkurev, kuniavski, ariacas, lhic, AMashrabov, ifsmirnov, knightL, Dembel, Egor, UdH-WiNGeR, Burunduk1, Vercingetorix, Kankuro, Endagorion, Um_nik, qwerty787788, VArtem, eatmore, 2rf, Michael_Levin, -XraY-, Enot, pashka, niyaznigmatul, RiaDWaW, HellKitsune, Copymaster, LoRd_TaPaKaH, KalininN
China - 26: jcvb, SanSiroWaltz, maopao, jiry_2, liympanda, jinzhao, edly01, dnvtmf, BSBandme, lxhimo, hzt1, quailty, zhuojie, panjf1987, zyxwvu164, ftiasch, liymsheep, ACRush, Blue.Mary, Herzu, cvcvb_lyp, matthew99a, Ronnoc, xudyh, xyz111, wenhanhuang
Japan - 17: snuke, semiexp, sugim48, rng_58, camypaper, sky58, yosupo, uwi, rickytheta, j_gui0121, chokudai, LayCurse, tozangezan, logicmachine, anta, hogloid, natsugiri
Poland - 12: embe, marek.cygan, dasko, Marcin_smu, fruwajacybyk, tom612pl, Errichto, Swistakk, mnbvmar, Stonefeang, krismaz, dj3500
Ukraine - 8: K.A.D.R, sdya, mgch, Vasyl[alphacom], LeBron, dzhulgakov, MrDindows, Milanin
Taiwan - 5: peter50216, eddy1021, ShikXD, aaaaajack, dreamoon
Belarus - 4: tourist, Arterm, subscriber, Ra16bit
South Korea - 4: ainu7, ainta, jaydoubleuel, Kriii
Australia - 2: izrak, John Dethridge
United States - 2: scott_wu, ksun48
Canada - 2: FatalEagle, azneye
Netherlands - 1: krijgertje
Brazil - 1: ffao
Switzerland - 1: W4yneb0t
Germany - 1: pwahs
Croatia - 1: ikatanic
South Africa - 1: bmerry
Indonesia - 1: azaky
Viet Nam - 1: skyvn97

The solution for the easy problem relied on a cute geometric observation. The problem went like this: you are given 1500 points on the plane. You can move from any point to any other point if there are no more points on the segment connecting them. For every pair of points you need to determine the smallest number of moves needed to get from one to the other - and of course to do it faster than the standard O(n³) all-pairs-shortest-paths algorithms.

Russian Code Cup 2016 Elimination Round has chosen the 50 contestants for the final round, which might or might not happen onsite (problems, results, top 5 on the left, my screencast, analysis).

The hardest problem F forced one to pick a good tradeoff between studying too many cases on paper and implementing more logic in the solution. You were given at most 100000 integers a_i, each between -C and C, and a goal d. You needed to find such integers x_i, also between -C and C, that the sum of a_i*x_i is equal to d. Additionally, none of the x_i may be equal to zero.

It's not hard to see that without the "between -C and C, and non-zero" restriction a solution exists if and only if the greatest common divisor of all a_i divides d. It turns out that for sufficiently large values of C nothing changes. This problem in particular had C always equal to 1000000, so solving it was a matter of careful reasoning and implementation.
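As a tiny illustration of the unrestricted condition only, here is a sketch that checks whether the gcd of all a_i divides d. The function name is made up for this example, and the sketch says nothing about the actual problem, where x_i must be non-zero and bounded by C.

```python
from functools import reduce
from math import gcd


def solvable_without_bounds(a, d):
    """True if sum(a_i * x_i) == d has an integer solution when the x_i
    are completely unrestricted (zero and arbitrarily large values allowed)."""
    g = reduce(gcd, a, 0)  # gcd of all a_i; 0 if every a_i is 0
    if g == 0:
        return d == 0
    return d % g == 0
```

For instance, a = [6, 10] and d = 4 is solvable, since 6*(-1) + 10*1 = 4, while d = 3 is not, because every combination of 6 and 10 is even.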

What is the largest value of C for which the condition mentioned above is not sufficient?

Finally, CodeChef SnackDown 2016 Elimination Round selected the 25 two-person teams qualifying for the onsite round in Mumbai (problems, results, top 5 on the left). Congratulations to the team deepdark on the victory, and to all those who qualified!

I've mentioned a few nice problems in the previous week's summary. One solution was explained in a separate post, so let me cover another one here. The problem was: you are given a convex polygon with 100000 sides. For each point strictly within the polygon we can define its asymmetry value: the maximum ratio of the two segments between this point and the boundary of the polygon along any line. For example, if that point is the center of symmetry of the polygon, its asymmetry value is 1. What is the smallest asymmetry value over all points inside the given polygon?

The first step is quite usual: instead of finding the point with the smallest asymmetry value directly, we'll learn to check if there are points with asymmetry value below the given value, relying on binary search to then produce the final answer.

Then we notice that when determining the asymmetry value of a given point, we can consider only the lines passing through a vertex of the polygon, and where the longer of the two resulting segments is between the point and the vertex (assuming it's not, we can show that turning the line in one of the two directions will not decrease the ratio).

Now let's look at our picture from the point of view of one of the vertices A of the polygon. Which points O inside the polygon have the property that AO/OY<=m, where Y is the other point of intersection of line AO with the polygon, and m is the asymmetry value limit we're testing? It's not hard to see that those are exactly the points of a smaller polygon that is obtained by compressing our polygon m/(m+1) times towards A (see the picture on the right).

That observation reduces our problem to the following question: consider the n smaller polygons obtained by compressing our polygon m/(m+1) times towards each of its n vertices. Do they intersect?

An intersection of a few convex polygons can also be thought of as an intersection of all half-planes containing their sides. However, we have n n-sided polygons to intersect here, so we have n² half-planes, and that's too much.

However, all those n² half-planes were obtained by shifting one of the n sides towards one of the n vertices. As we see on the picture on the left, instead of looking at all n images of one side, it's enough to consider the one where it's shifted the furthest - in other words, towards the vertex that's most distant from the line containing given side! We can find the most distant vertex for each side once using the rotating calipers algorithm, and then we have only n half-planes to intersect each time.
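Here is a rough sketch of how those n half-planes could be built for a given limit m. It assumes the polygon is strictly convex and given in counter-clockwise order, represents a half-plane as a triple (a, b, c) meaning a*x + b*y <= c, and uses a simple two-pointer walk as the rotating calipers step. The function name and the representation are choices of this example, not something fixed by the problem.

```python
def shifted_halfplanes(polygon, m):
    """polygon: strictly convex, counter-clockwise list of (x, y) points.
    Returns one half-plane (a, b, c), meaning a*x + b*y <= c, per side:
    the side compressed m/(m+1) times towards its most distant vertex."""
    n = len(polygon)
    s = m / (m + 1.0)

    def outward_normal(i):
        (x1, y1), (x2, y2) = polygon[i], polygon[(i + 1) % n]
        return y2 - y1, x1 - x2  # points outside for counter-clockwise order

    def height(i, j):
        # Proportional to the distance from vertex j to the line of side i.
        a, b = outward_normal(i)
        return a * (polygon[i][0] - polygon[j][0]) + b * (polygon[i][1] - polygon[j][1])

    result = []
    j = 1
    for i in range(n):
        # The most distant vertex only moves forward as the side advances.
        while height(i, (j + 1) % n) > height(i, j):
            j = (j + 1) % n
        a, b = outward_normal(i)
        c = a * polygon[i][0] + b * polygon[i][1]
        far = a * polygon[j][0] + b * polygon[j][1]
        # Image of the line a*x + b*y = c under compression towards vertex j.
        result.append((a, b, s * c + (1 - s) * far))
    return result
```

For the square with corners (0, 0), (2, 0), (2, 2), (0, 2) and m=1 the four half-planes meet exactly in the center (1, 1), and for any smaller m they have no common point, which agrees with the asymmetry value of 1 for a centrally symmetric polygon.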

And that means we have reduced the original problem to a standard one: check if n half-planes have a non-empty intersection in O(n log n) or faster. This problem has a tedious standard solution and a beautiful randomized solution. Let me describe the latter.

The randomized algorithm inductively builds the answer to the following question: what is the point (x, y) that lies in the intersection of the given half-planes and maximizes the value of ax+by, where a and b are some arbitrary constants, for example the normal vector to the first half-plane's boundary (such a choice will make sure that the value ax+by in that half-plane is bounded from above)?

Assume we already know such point (x, y) for some set of half-planes and want to add one more half-plane. There are two possibilities: either the point (x, y) belongs to the new half-plane, or not. In the former case, nothing needs to be done - the old maximum is also the maximum now.

In the latter case, it's easy to see that the new point (x, y) must lie on the boundary of the new half-plane. Knowing that, we can find it by intersecting all previous half-planes with that boundary, and then solving the 1D half-line intersection problem by finding the bottom-most downwards and top-most upwards half-lines.

That's the entire algorithm - we will add all half-planes in random order, and either find a point (x, y) in their intersection or learn that there's no such point.

The interesting part is, of course, its running time. In the worst case, we'll have to solve the 1D problem every time, for the total running time of 1+2+...+n=O(n²). However, since we consider the half-planes in random order, it's unlikely that we will always run into the worse of two cases. More precisely, consider the i-th step of the algorithm. The point (x, y) after this step lies on the intersection of the boundaries of some two of the first i half-planes. This point was also the answer on the previous step i-1 unless the i-th half-plane is one of those two. That means that the probability that we need to solve the 1D problem on the i-th step is at most 2/i. Solving the i-th 1D problem takes O(i), and thus the expected total running time is O(1)/1+O(2)/2+O(3)/3+...+O(n)/n=O(1)+O(1)+...+O(1)=O(n) - linear!
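A minimal sketch of this incremental algorithm could look as follows. It is an untuned illustration with several assumptions of its own: half-planes are triples (a, b, c) meaning a*x + b*y <= c with (a, b) non-zero and of moderate size, the objective is simply x+y, and boundedness is ensured by a large artificial bounding box instead of the choice of objective described above. The function name, the bound and the epsilon are all made up for the example.

```python
import random


def feasible_point(halfplanes, bound=1e6, eps=1e-9, seed=0):
    """Return a point inside all half-planes (and inside the box
    [-bound, bound]^2), or None if the intersection is empty."""
    # ASSUMPTION: an artificial bounding box keeps the maximum finite.
    done = [(1, 0, bound), (-1, 0, bound), (0, 1, bound), (0, -1, bound)]
    planes = list(halfplanes)
    random.Random(seed).shuffle(planes)  # random insertion order
    ox, oy = 1.0, 1.0                    # objective: maximize x + y
    x, y = bound, bound                  # optimum for the box alone
    for a, b, c in planes:
        if a * x + b * y > c + eps:
            # The old optimum is cut off, so the new one lies on the line
            # a*x + b*y = c; parametrize it as (px, py) + t * (dx, dy).
            norm = a * a + b * b
            px, py = a * c / norm, b * c / norm
            dx, dy = -b, a
            lo, hi = float("-inf"), float("inf")
            for a2, b2, c2 in done:
                coef = a2 * dx + b2 * dy
                rhs = c2 - (a2 * px + b2 * py)
                if abs(coef) < eps:
                    if rhs < -eps:
                        return None  # parallel line entirely outside
                elif coef > 0:
                    hi = min(hi, rhs / coef)
                else:
                    lo = max(lo, rhs / coef)
            if lo > hi + eps:
                return None          # the 1D problem has no solution
            t = hi if ox * dx + oy * dy > 0 else lo
            x, y = px + t * dx, py + t * dy
        done.append((a, b, c))
    return x, y
```

For the unit square given as x>=0, y>=0, x<=1, y<=1 this returns the corner (1, 1), while for the contradictory pair x<=0, x>=1 it returns None.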

This is the first time I describe this algorithm and I haven't yet implemented it myself, so please correct me if there are any errors or ways to implement it even simpler!

And of course, thanks for reading, and check back soon for more.