Tuesday, May 30, 2017

A 7-time week

ACM ICPC 2017 World Finals headlined the last week (problems, results, top 12 on the left, our stream, text analysis, video analysis). The ITMO team was leading for quite some time, but they did not manage to solve problem J in time, which gave a chance to the other teams. They did not take advantage of that chance, however, and ITMO became 7-time world champions. Congratulations!

Problem D, while being a bit on the "professional" side, was quite cute. You are given 500000 top-left corners and 500000 bottom-right corners on the plane, and need to pick one of each to obtain a valid rectangle with maximum possible area.

Here's its analysis video, in case you give up :)

AtCoder Grand Contest 015 took place on Saturday, when most World Finals contestants should have already got back home (problems, results, top 5 on the left, my screencast with commentary, analysis). When just two last problems remained, I went for the harder one, and almost got it (got accepted in upsolving after about 5 more minutes) - but not quite. sky58, on the other hand, chose the right strategy and won - congratulations!

Problem C allowed multiple different solutions, each with a non-trivial observation and thus quite exciting to get. You are given a set of blue cells on the 2000x2000 grid that forms a forest with regard to 4-connectivity, and 200000 queries. Each query asks: if we take a certain sub-rectangle of our grid, how many connected components of blue cells are there if we look just at that sub-rectangle?

A few hours later, Yandex.Algorithm 2017 Round 2 provided another chance to score place points towards qualification for the final round (problems, results, top 5 on the left, my screencast, analysis). Tourist threw all strategy considerations out of the window by solving all problems with 15 minutes remaining, while others have barely managed solve solve 5 out of 6. Amazing performance!

The solution to problem E relied on a standard idea which I forgot, so maybe explaining the solution in my blog will help me remember :) Here's what it was about: consider all sequences of balls of k<=15 colors, with exactly ai<=15 balls of i-th color, and no two adjacent balls of the same color. Let's arrange them in lexicographical order. What is the number of the given sequence in this order?

Finally, Codeforces held the online mirror of Helvetic Coding Contest 2017 which I have mentioned earlier (problems, onsite results, online results, online top 5 on the left, analysis). Congratulations to the sweet team on the victory (and their penalty time is better than ours from the onsite contest, too)!

In my previous summary, I have mentioned a TopCoder problem: you are given the distances from two vertices to all others in an unknown undirected graph with 50 vertices. You need to construct any graph with such distances from the first two vertices.

Consider an arbitrary pair of vertices. If their distances to vertex 1 differ by at least 2, then we can't have an edge between them. The same is true for their distances to vertex 2. This is relatively easy to spot, but here comes the hard part: if neither of the above is true, in other words if both pairs of distances differ by at most 1, then we can assume to always have this edge in our graph. Because if we don't have it, we can add it and distances to vertices 1 and 2 will not be affected for the vertices we just connected, and thus for all other vertices as well.

Now, since for each edge we can determine whether we have it in our graph or not, all that remains is to construct the graph, and check if the distances to vertices 1 and 2 come out as expected.

Thanks for reading, and check back next week!

Wednesday, May 24, 2017

ACM ICPC 2017 World Finals stream

ACM ICPC 2017 World Finals start tomorrow at 9:00 local time. There will be quite a few ways to follow the event, the most prominent being ICPC Live. Together with tourist and Endagorion, we have decided to provide another perspective on this competition: we'll try to solve the problems as soon as we get the statements (the rumor has it, we'll be able to submit on Kattis as well), and will stream our screen and our discussions on Youtube! Tune in tomorrow around the time World Finals starts, although we might start a bit later.

We're not sure how this will work out, and would appreciate any advice in comments! One thing that we're not yet sure about is whether we should talk in English or in Russian. The latter should be more productive and thus more realistic, but will naturally be harder to follow for most of the audience. On the other side, the sound quality might be so bad that it won't even matter :)

Monday, May 22, 2017

An almost Rapid week

Last week was relatively calm, with just two competitions that I want to mention, both on Saturday. First, TopCoder Open 2017 Round 2A has significantly raised the stakes compared to Round 1, with just 40 top finishers qualifying (problems, results, top 5 on the left, my screencast). I have enjoyed the medium problem of this round the most, as it is quite rewarding to come up with an easy-to-code beautiful solution after wasting some time coding a very tricky one that comes to one's mind first. Especially rewarding step is removing all code written for the tricky solution (screencast position) :)

Here's what the problem was about: you are given the distances from two vertices to all others in an unknown undirected graph with 50 vertices. You need to construct any graph with such distances from the first two vertices.

With just a few minutes break, Codeforces hosted its Round 415 (problems, results, top 5 on the left). With the only successful solution to problem E coming from a contestant with no other solved problems, it was the speed that decided the winner, and Radewoosh was almost half an hour faster than the rest. Congratulations!

In my previous summary, I have mentioned a Codeforces problem: you are given a connected undirected graph with at most 300000 edges. You suspect that this graph was constructed in the following manner: we started with a graph with no edges and assigned each vertex an integer label, then connected all pairs of vertices for which labels differed by at most one. Your goal is to return a set of labels that could have been used to construct the given graph, or report that there isn't any.

First of all, shifting all labels by a constant does not change the answer, so let's pick a vertex A and say that its label is 0. Now, the labels for all other vertices are almost uniquely determined: it's not hard to see that for all vertices labeled not 0, the absolute value of the label is equal to the shortest distance from A. So, we just need to determine which sign will each label have, and which vertices (out of those at distance 1) will have label 0.

Here we can see that vertices at distance 1 from A are the most tricky part, so let's concentrate on them. We can assign three labels to them: let's say those with label 1 are set X, with label 0 are set Y, and with label -1 are set Z. By the problem statement, all those sets must be cliques, and additionally we must have all edges between X and Y, and between Y and Z, but no edges between X and Z.

Let's assume we have a representative B from X, and a representative C from Z. Then the label of each vertex can be determined trivially: if it's connected only to B, it's 1, only to C, then -1, to both, then 0.

It doesn't matter which representatives we pick - in fact, it's not hard to see that we need to pick any two vertices B and C that are connected to A but not between themselves. If we remember that A can also be picked freely, our goal now is to find a chain of two edges such that its endpoints are not connected.

And this, in turn, can be done like this: first, let's find any missing edge. The graph is connected, so there's a path between its ends. If this path is of length 2, we're found what we need. If it's longer, consider its next-to-last vertex. If it's connected to its first vertex, we've found what we need. If not, then we can remove the last vertex and obtain a shorter path such that its ends are not connected. By repeating the process, we will eventually find the required path of length 2.

Now we have solved the problem for vertices with labels -1, 0 and 1, but how do we determine the sign of the label for the remaining vertices? Well, for vertices with label 2/-2, we can use connectivity to any vertex with label 1 as the differentiating factor, and so on.

Finally, having determined all labels, we need to check if our graph does in fact correspond to those labels. The simplest way to do that seems to be: let's check that for all edges in our graph the difference between the labels of the ends is at most one, and also check that the total number of edges in our graph matches the total number of pairs of vertices with labels differing by at most 1. After the first check, the only way we can have an incorrect graph would be having not all required edges, and the second check takes care of that.

Thanks for reading, and check back here and in my Twitter for news from ACM ICPC World Finals this week!

Another speaking week

Just like the previous week, the fun of May 8 - May 14 week started on Thursday with a Codeforces round, this time with Playrix Codescapes Cup (problems, results, top 5 on the left, analysis). Even an incorrect submission for E could not stop tourist, as he still won thanks to solving problem G and having much more points than everybody else on F. Well done!

The next round of the week was also a named Codeforces round - this time Tinkoff Challenge Final Round (problems, results, top 5 on the left, analysis, my screencast with commentary). This time explaining everything aloud did not lead to a disastrous performance for me (finally!). Maybe the quality of explanations suffered :) V--o_o--V was still significantly faster, so congratulations on the victory!

Problem D was a nice exercise in discovering a reliable way to detect a relatively simple pattern. You are given a connected undirected graph with at most 300000 edges. You suspect that this graph was constructed in the following manner: we started with a graph with no edges and assigned each vertex an integer label, then connected all pairs of vertices for which labels differed by at most one. Your goal is to return a set of labels that could have been used to construct the given graph, or report that there isn't any.

Later on Saturday, Google Code Jam Round 2 has narrowed the field to just 500 competitors (problems, results, top 5 on the left, analysis). Congratulations to jsannemo on the victory - quite impressive form for the KTH team before the upcoming World Finals, with this win and simonlindholm's win a week earlier.

Yandex.Algorithm 2017 Round 1 took place early on Sunday (problems, results, top 5 on the left, analysis, my screencast). Um_nik was flawless on the five easier problems and correctly noticed the fact that problem E was also, in fact, not very hard. Well done!

Just 80 minutes later, Russian Code Cup 2017 Elimination Round has revealed the 55 finalists (problems, results, top 5 on the left, analysis, my screencast). LHiC did not make any mistakes, and that turned out to be the key to getting the first place. Congratulations!

And finally, Distributed Google Code Jam Round 1 wrapped up the week (problems, results, top 5 on the left, analysis). mk.al13n was ten minutes faster than the rest of the pack in this still quite unusual format with parallel computations. Great job on the victory!

In my previous summary, I have mentioned an AtCoder problem: you are given two trees on the same set of vertices, one blue and one red. You want to convert the blue tree into the red one, step-by-step. At each step, you must take any path consisting of blue edges, add a red edge connecting its endpoints, but remove one of the edges of the path. After n-1 steps all blue edges will be removed, and n-1 red edges will be added, and you want those edges to form the required red tree.

The key idea in this problem is: let's look at the process from the end. Before the last step, we have just one blue edge connecting vertices, say, A and B, so our only option is to remove that edge and add a red edge connecting A and B. Now in the next-to-last step, we must either do the same, or make use of the last blue edge: for example, we can remove blue edge A-C, and add red edge B-C. After some staring at the paper, one can figure out what does this mean: first, we need to find an edge that is both blue and red for the last step, and then we need to contract it - unit its ends together into one vertex. Then, we need to find an edge that is both blue and red in the resulting graph (it might either be blue and red in the beginning, or be a result of a merge of different blue and red edges during the contraction), and contract it again, and so on until the graph is just one vertex.

Now it becomes clear that it doesn't really matter in which order we do the contractions, as they never make things worse. So we should just repeatedly perform any available contraction. There's some technical mastery involved in making the process run in O(n*polylog(n)) time, but that part is relatively standard and I don't want to focus on it too much. You can check the analysis for more details.

Thanks for reading, and check back soon for the last week's summary!

Sunday, May 21, 2017

A plus-minus week

The first contest in May was the Codeforces Round 411 (problems, results, top 5 on the left, analysis). This was one of those rare rounds where finding bugs in the solutions of others turned out to be a better strategy than solving more problems - but just barely. Congratulations to AlexDmitriev for finding 18 challenges and finishing less than one challenge above the second place!

And then, the weekend. First off, AtCoder Grand Contest 014 (problems, results, top 5 on the left, analysis, my screencast). The round seemed to have been decided in the first 45 minutes thanks to the amazing performance by w4yneb0t, but with just 20 seconds left simonlindholm overcame all tricks in the last problem and claimed the victory. Huge congratulations!

Problem E looked tedious at first, but coming up with the right idea helped make the implementation really easy. You are given two trees on the same set of vertices, one blue and one red. You want to convert the blue tree into the red one, step-by-step. At each step, you must take any path consisting of blue edges, add a red edge connecting its endpoints, but remove one of the edges of the path. After n-1 steps all blue edges will be removed, and n-1 red edges will be added, and you want those edges to form the required red tree.

In a few ours after the end of AGC 014, TopCoder Open 2017 Round 1C provided the last chance to qualify into Round 2 (problems, results, top 5 on the left). The results were not very surprising, with all contestants with a "red" rating who took part advancing to the next round. Nevertheless, the fight for the first place was extremely heated with several changes of the leader. In the end Baz93 has emerged on top thanks to 9 successful challenges, including one made 7 seconds before the end of the challenge phase. Congratulations!

On Sunday, TopCoder hosted a TopCoder Open 2017 Regional event in St Petersburg (problems, results, top 5 on the left, SRM 714 results with 2 same problems out of 3, my screencast). The medium problem (easy in SRM) was another example of extremely easy implementation after coming up with the right idea. You are given a string of at most 2500 parentheses which is a valid parentheses sequence. You repeatedly do the following: remove the first character of the string (which is always an opening parenthesis), and remove any closing parenthesis from the string. The remaining string must also be a valid parentheses sequence. You repeat this step until everything has been removed. How many ways are there to complete the entire process, modulo 109+7?

Finally, Codeforces hosted VK Cup 2017 Round 3 (problems, results, top 5 on the left, parallel round results, analysis, my screencast). This time it was another team of Moscow students who dominated the proceedings, with over 500 points margin. Congratulations to the sinful team!

In my previous summary, I have mentioned an interactive Open Cup problem. You are given six six-sided dice, each having some digits and mathematical symbols written on it, as follows:

Die 1: = < > != <= >=
Die 2: 4 + - ( ( )
Die 3: 0 / / / 8 +
Die 4: 2 3 4 5 - )
Die 5: + - * / 1 9
Die 6: 6 7 + - ( )

You need to pick a die, then roll it (by interacting with the judge program), and record the symbol that appears. Then you can do it again, using either the same or a different die, and so on, doing at most 1000 rolls. At some point you need to stop rolling, and form a correct mathematical expression (equality or inequality) by concatenating all recorded symbols in some order. No need to optimize anything - just build any correct expression after at most 1000 rolls.

There must be a ton of different approaches in this problem, as any valid expression suffices. The first idea lies on the surface: since the first die always gives us a comparison, and there must be exactly one comparison in each expression, we can start by rolling the first die once. This will give us the type of comparison we're building. In the remainder of the explanation I will assume we get "=", as that is the most tricky case - the rest can be handled in the same manner.

But then, things get not so easy. First, we should get enough symbols to form any syntactically valid expression: we should get the same amount of opening and closing parentheses, and the amount of operations we get should be two less (one for each side) than the amount of numbers we have (after the contest I've noticed that the numbers can be joined together to form multiple-digit numbers, but I did not notice this in the heat of the moment). Then, of course, we need to form not just syntactically valid expression but a correct equality.

The next idea is: the "correct" part is actually much easier than "syntactically valid" part. Assuming we have only digits and +/- signs, then we just need to split the digits into two groups with the same sum, and with enough different digits this is always possible.

Now, we need to find a way to get the right left-right parenthesis balance, and the right operation-digit balance. The two key dice that help accomplish that are, for example, die 3 and die 2. Let's assume we already rolled some dice, and we have more closing parentheses than opening parentheses. Then we can repeatedly roll die 2 until we get the right parentheses balance. After that, let's assume we have more digits than operations. Then we can repeatedly roll die 3 until we get the right balance (and this won't affect the parentheses).

So now we only need to prepare the ground for rolling dies 2 and 3 - we need to roll some dice in such a way that we have more closing than opening parentheses,  and much more digits than operations (so that even rolling die 2 a few times does not cancel that), and enough different digits and +/- operations to build our equality in the end. Die 3 also gives us "/" operations, but to avoid complications we'll just divide 0 by some numbers to get rid of those.

It's easy to see now that die 4 is exactly what we need. Let's roll it a few (say, 100) times. We will have a few (~16) closing parentheses, and a lot more digits than operations, and those digits will be from a wide variety, so we're guaranteed to get two equal sums. Now we also need a 0 to handle the divisions, so let's roll die 3 until we get it. After this we can switch to "die 2, then die 3" strategy described above to get the right balances, and finally form our equation.

How did your team solve this problem? Maybe there's a simpler approach?

Saturday, May 13, 2017

A week modulo 3

TopCoder SRM 713 opened the last week of April (problems, results, top 5 on the left). Once again, nobody has managed to solve the hard problem. Endagorion was the fastest on the first two, and defended his lead during the challenge phase with a +50. Well done!

Yandex.Algorithm 2017 Qualification Round was available as a virtual contest for the whole Saturday (problems, results, top 5 on the left, analysis). Egor has dominated the round thanks to very fast problem solving, and the most appropriate strategy: get one problem accepted in the open to ensure qualification, and then submit the rest blindly to minimize the penalty time. Very well executed!

Russian Code Cup 2017 Qualification Round 3 also took place on Saturday (problems, results, top 5 on the left, analysis). -XraY- fought back after missing the top 200 in the first qualification round and solved all problems cleanly and so fast that his penalty time is more than 2x smaller than that of the second place - amazing!

The final Open Cup round of the 2016-17 season took place on Sunday, the Grand Prix of Ural (results, top 5 on the left, overall Open Cup results, top 5 on the left). Team Past Glory did not really leave much doubt as to who would win this round, solving 12 problems 1.5 hours before the end of the contest, and having all the time in the world to solve the tricky geometric problem F. Congratulations on the victory!

Problem E was a great example of a non-obvious problem that still requires almost pure creativity to solve, instead of mathematical theorems, algorithms or data structure knowledge. It is an interactive problem. You are given six six-sided dice, each having some digits and mathematical symbols written on it, as follows:

Die 1: = < > != <= >=
Die 2: 4 + - ( ( )
Die 3: 0 / / / 8 +
Die 4: 2 3 4 5 - )
Die 5: + - * / 1 9
Die 6: 6 7 + - ( )

You need to pick a die, then roll it (by interacting with the judge program), and record the symbol that appears. Then you can do it again, using either the same or a different die, and so on, doing at most 1000 rolls. At some point you need to stop rolling, and form a correct mathematical expression (equality or inequality) by concatenating all recorded symbols in some order. No need to optimize anything - just build any correct expression after at most 1000 rolls.

How would you approach this problem?

In keeping with the new trend of having multiple competitions at the same time, Google Code Jam 2017 Round 1C took place in the middle of the Open Cup (problems, results, top 5 on the left, analysis). It was EgorKulikov who followed Eryx's strategy from Round 1A this time, submitting the super tricky hardest problem much earlier than everybody else. Congratulations on the victory!

In my previous summary, I have mentioned another Open Cup problem: construct a completely multiplicative function f(x) such that f(x)=1 or -1, f(a*b)=f(a)*f(b), and that for each n between 1 and 1000000 the prefix sum f(1)+f(2)+...+f(n) does not exceed 20 by absolute value.

When trying to solve it, I was approaching it in the way that many recent "constructive" problems were meant to be solved: try a few things on the computer, maybe do some local optimizations, and it should work. However, I did not manage to find success on this path.

It turns out that there is a solution that can be easily done on paper without using any computer time. Let's define f(3k+1)=1, f(3k+2)=-1, f(3k)=f(k). The multiplicativity follows from the fact that powers of 3 do not impact the value of the function, and numbers 1 and 2 modulo 3 are the same as 1 and -1 modulo 3. Finally, almost all numbers in the prefix sum cancel out: if we look at positions not divisible by 3, 1 and -1 alternate, so the prefix sum of those positions is always 1 or 0. For positions divisible by 3, but not by 9, the same argument applies, and so on; thus each prefix sum will never exceed log31000000 (one for each power of 3), which is less than 20.

Thanks for reading, and check back soon for the last week's summary!

An OEIS week

TopCoder SRM 712 took place on Tuesday, April 18 (problems, results, top 5 on the left). The results remind me of the second SRM that I prepared myself - two accepted solutions on the medium, and none on the hard :)

The reason for such crushing difficulty of the medium problem was the fact that the most obvious solution did not work because of catastrophic cancellation in floating-point computations. I do not want to go into the details of this particular problem here, so I will refer you to the post-match discussion for the details of how my solution overcame this obstacle.

More generally, I think understanding how floating-point numbers work is not that hard, and it pays off not just in such black-and-white cases like this problem, but also in more subtle situations, for example when thinking about whether an eps is required or not in comparisons in a geometry problem, and how big it should be if it's required. I think at some point this misof's tutorial was the eye-opener for me with regard to floating-point computations, so I heartily recommend it. At the same time, it's quite likely that even more useful tutorials on floating-point numbers have been published since then, so please point those out in comments!

Yandex.Algorithm 2017 started with its Warm-up round on Saturday (problems, results, top 5 on the left, analysis). It was enough to solve any problem to advance, and thus the first place did not hold much tournament value; nevertheless, it was still the first place. Congratulations to kusas2018 on the victory!

Google Code Jam 2017 Round 1B followed in a little over an hour (problems, results, top 5 on the left, analysis). The problems were not as tricky as in Round 1A, and JAPLJ had to be really fast to beat the second place by less than a minute!

No April weekend is complete without an Open Cup round, this time the penultimate Grand Prix of Moscow Workshop (results, top 5 on the left, analysis). The SPb ITMO 1 team has clinched the first place in the overall standings with a win in this round - the first season in a while where Gennady's team could not win the Open Cup. Big congratulations to Ivan, Ilya and Vladimir!

Problem B was a nice mathematical puzzle that did not crack for our team: one needed to construct a completely multiplicative function f(x) such that f(x)=1 or -1, f(a*b)=f(a)*f(b), and that for each n between 1 and 1000000 the prefix sum f(1)+f(2)+...+f(n) does not exceed 20 by absolute value. Do you see a way?

And finally, Codeforces held the Elimination Round for another company-sponsored tournament: the Tinkoff Challenge (problems, results, top 5 on the left, analysis). LHiC held the first place despite failing one of the problems during the system test, thanks to being the only one to solve both problems E and F. Well done!

In my previous summary, I have mentioned several problems, and this AtCoder problem allows me to share an interesting experience, so I will explain it. We are given a long (109) segment. Some (at most 105) integer points on the segment are special. We consider all ways to take some set of non-special integer points on the segment. Each such set splits the segment into smaller segments. Let's find the product p of their lengths, and then compute the sum of the squares p2 of those products over all ways. What number are we going to get, modulo 109+7?

For quite some time, I had no clue how to approach it. I've started to think about an easier version: what if there are no special points? Even for that problem, I could not come up with a solution that's fast enough for 109. However, I could come up with a somewhat straightforward O(n2) dynamic programming approach: to find the answer for a given length, we will simply iterate over all possibilities for the last segment and use the answers for smaller lengths that were computed previously.

I've implemented this solution quickly, obtained the answers for small values of n, and entered them into the OEIS search which yielded me this sequence. For a moment it seemed that the OEIS entry is not very helpful, but then I noticed the generating function (x+1)/(1-4x+2x2-x3), which is in a form of polynomial fraction, which means that the elements of the sequence correspond to a linear recurrence an=4an-1-2an-2+an-3. In order to see that, we can rewrite the equation for the generating function like this:

(a0x0+a1x1+...)*(1-4x+2x2-x3)=(x+1)

Since the right part is a finite polynomial, so is the left part, and it means that the coefficient near xn in that product is 0 for almost all values of n, and expanding the product allows us to find that the coefficient near xis an-4an-1+2an-2-an-3, which directly yields the recurrence.

Finding the 109-th element of a linear recurrence is a standard task that can be solved using fast matrix exponentiation, so we have now solved the version of the problem without the special points. The answer is given by (some element of) the product Mnv where M is some matrix and v is some vector.

Now suppose we have exactly one special point y. We need to subtract the decompositions that use this special point, and that means that if f(n) is the answer without special points, then the answer with one special point is equal to g(n)=f(n)-f(y)*f(n-y). We can now rewrite it using the formula for f(n) that we have: g(n)=Mnv-f(y)*Mn-yv=Mnv-f(y)*MnM-yv=Mn(v-f(y)*M-yv). In other words, g(n) has the same form: the n-th power of the matrix M times some vector!

It's not hard to generalize this to any amount of special points. I.e., with the second special point placed at z we will have h(n)=g(n)-g(z)*f(n-z) which transforms in exactly the same way, and so on. Each special point is handled in logarithmic time (to compute f(y) and M-y), so the overall running time is O(mlogn), where m is the number of special points.

This story is quite the opposite to the approach I have shown in two previous posts: here we do not build any meaningful intuition about the problem, and instead try to almost mechanically build something that works using random facts found on the Internet.

Of course, this problem has a more sensible solution, which you can find in the official analysis. Thanks for reading, and check back soon!

Wednesday, May 10, 2017

A zero pigeon week

The April 15-16 Easter weekend was even more packed with contests. First off, Google Code Jam Round 1A took place very early on Saturday (problems, results, top 5 on the left, analysis). Eryx has solved all problems 40 minutes faster than anybody else, but he might've been taking a big gamble, as the last problem was exceptionally tricky (8 correct out of 124 attempts). In any case, congratulations on the victory!

I think problem A was a great example of a beautiful easy problem. You are given a grid where some cell contain letters, and some are empty, such that each letter appears at most once. Your goal is to write letters to all empty cells in such a way that the cells with each letter form a grid-aligned rectangle. You're only allowed to use letters that are already present in the grid. Any valid solution suffices.

For example,
G??
?C?
??J
can be solved as
GGJ
CCJ
CCJ

A few hours later AtCoder held its Grand Contest 013 (problems, results, top 5 on the left, analysis). Nobody has managed to get all problems right, and yosupo was the fastest to solve all but one. Well done!

I've had the most fun with problem E. We are given a long (109) segment. Some (at most 105) integer points on the segment are special. We consider all ways to take some set of non-special integer points on the segment. Each such set splits the segment into smaller segments. Let's find the product p of their lengths, and then compute the sum of the squares p2 of those products over all ways. What number are we going to get, modulo 109+7?

Open Cup 2016-17 Grand Prix of America took place on Sunday (results, top 5 on the left, other results on same problems). Huge congratulations to team Deep Dark Fantasy on solving the hardest problem I and winning the round!

We didn't solve the very nice problem K during the round. We are given a number k, a parentheses sequence of length 105, and also for each position you're given the cost of changing the parenthesis in this position to the opposite one. Our goal is to produce a parentheses sequence that is k-unbalanced: it requires changing at least k+1 parentheses to arrive at a balanced parentheses sequence. What is the smallest total cost to achieve that?

Right in the middle of the Open Cup round, Russian Code Cup 2017 held its second Qualification Round (problems, results, top 5 on the left, analysis). Congratulations to AndreiNet on beating the others to the first place by just two minutes of penalty time!

And to wrap up the Sunday, Codeforces hosted VK Cup 2017 Round 2 (problems, results, top 5 on the left, parallel round results, analysis). The goose team stood out by being the only one to solve all five problems - congratulations!

In my previous summary, I have mentioned an interactive Open Cup problem: two players are playing a game using a grid with n+1 rows and n columns. Each cell of the grid will contain a 0 or a 1, but the contents of the cells are initially undetermined. The first player, let's call him Dirichlet, can ask questions of form "what is the sum modulo 2 in this subset of cells", for any subset of cells. The second player, let's call him Pigeon, answers the questions - each answer is either 0 or 1. Pigeon can answer questions in any way - for example, he does not have to imagine a concrete grid and use it to compute the answers.

Dirichlet's goal is to get one of the three outcomes:
1) Find a contradiction in Pigeon's answers. More precisely, find a set of questions such that each cell of the grid appears an even number of times in this set, and thus the sum of all answers must be 0 modulo 2, but the sum of Pigeon's answers is 1 modulo 2. One simple case of a contradiction is when he asked about the same set twice and got different answers, but there are much more complicated situations possible.
2) Find two different cells in the same column that he can prove to contain 1. In order to prove that a certain cell contains 1, he can use a set of his questions such that the interesting cell appears an odd number of times in them, and all other cells appear an even number of times in them, and the sum of Pigeon's answers for those questions is 1 modulo 2. One simple way to prove is to ask about the set containing just the interesting cell and receive answer 1, but there are also other ways.
3) Find a row such that he can prove that all its cells contain 0. Proof for each cell must be done similarly to the previous case.

Dirichlet knows that sooner or later one of those three things will happen, since the grid has more rows than columns, and thus it either has a row of 0s or two 1s in the same column. One way to certainly arrive at one of the outcomes is to ask about each cell of the grid separately, using n*(n+1) questions. But Dirichlet wants to win faster: using at most 5*(n+1) questions. Pigeon's goal is to prevent Dirichlet from arriving at one of the three outcomes so fast. You need to implement a program that plays for Dirichlet. n is at most 70.

Let's ask about the sum modulo 2 in each column. First, consider the case where all answers are 0 (it's not a special case in the solution, but it helps understand the general case). Then we can do the following: let's pick any row, and ask about its cells one by one. In case all of those come back as 0 as well, we have found a row of zeroes so we're done. In case one of those comes back as 1, we know there's another 1 in its column, since the sum of each column is 0 modulo 2, so we can just ask about other cells in that column one by one to find the second 1, and we're done.

Now, let's solve the general case: when some columns have an odd number of ones. Let's split all rows into two almost equal halves A and B arbitrarily, and let's ask about sum mod 2 in rows from A in each column with total sum 1 (we also learn the sum in each such column in rows from B, by subtraction). Since the total sum in each "interesting" column is 1, exactly one of its sums in A and in B will be 0, and exactly one will be 1. Now we want to pick either A or B, and continue recursively with this set of rows and only with columns that have sum 1 in it, until at some point we have no columns with sum 1 anymore. Which one to pick between A and B? Well, in order for the recursive calls to converge, we need to maintain the invariant: the number of rows in our set is strictly greater than the number of interesting columns. Initially it's true (n+1>n), and since we split the rows into two parts and columns into two parts, it's not hard to see that it will still be true either for A, or for B — and that's what we should pick.

What do we have after the dust settles? We have used at most n (initial column queries) + n (column queries on first split) + n/2 (column queries on second split, since we split the rows in two and the number of interesting columns keeps being less than the number of remaining rows) +n/4+... <= 3n queries. And we have the following artifact: we have a row such that for each column there's a segment of cells in that column that has sum 0 modulo 2 and contains the given row (whenever a column becomes "not interesting", we find such a segment for this column).

Now we can just do the same thing we did for the case where all columns had sum 0: we iterate over our row, find any 1, and then find the second 1 in its column (it exists, since we have a segment with sum 0 there), spending at most 2n more queries, so overall we fit exactly in 5n as needed.

How to come up with this solution? Once again, I think it was about slowly building up enough intuition about the problem. I've kept asking myself questions, and there were two that combined to make a breakthrough. One was: when does any knowledge about a sum of a set of many (more than 1) cells lead to an improvement versus a naive approach of just asking about all cells in some order? And the other was: how can we defeat Pigeon's strategy that always answers 0 until the last moment when that would lead to Dirichlet finding a row of zeros?

At this point I realized that knowing that the sum of a column is 0 is actually very valuable, since then it suffices to find just one 1 in this column instead of two. That led to the solution for the case where the answer for all columns is 0, and then the "parallel binary search" for the general case was a somewhat standard follow-up approach.

Thanks for reading, and check back soon for the next week's summary!

Monday, May 8, 2017

A double Dirichlet week

Google Code Jam 2017 Qualification Round spanned 27 hours over the April 8-9 weekend (problems, results, top 5 on the left, analysis). Excited to see so many participants, and hope you enjoyed the problems!

TopCoder Open 2017 Round 1B took place later on Saturday (problems, results, top 5 on the left). While the problem were a bit on the easy side, xudyh was still very impressive in solving all three in a little over 9 minutes, and adding 200 challenge points to boot. This result has earned him a spot in the official record book, and could've placed him second in this unofficial one had it still been up. Well done!

Open Cup 2016-17 Grand Prix of Two Capitals continued the weekly rush of spring Open Cup rounds (results, top 5 on the left). The interactive problem G "Pigeonhole Principle" was very beautiful, so please try to bear with the long statement :)

Two players are playing a game using a grid with n+1 rows and n columns. Each cell of the grid will contain a 0 or a 1, but the contents of the cells are initially undetermined. The first player, let's call him Dirichlet, can ask questions of form "what is the sum modulo 2 in this subset of cells", for any subset of cells. The second player, let's call him Pigeon, answers the questions - each answer is either 0 or 1. Pigeon can answer questions in any way - for example, he does not have to imagine a concrete grid and use it to compute the answers.

Dirichlet's goal is to get one of the three outcomes:
1) Find a contradiction in Pigeon's answers. More precisely, find a set of questions such that each cell of the grid appears an even number of times in this set, and thus the sum of all answers must be 0 modulo 2, but the sum of Pigeon's answers is 1 modulo 2. One simple case of a contradiction is when he asked about the same set twice and got different answers, but there are much more complicated situations possible.
2) Find two different cells in the same column that he can prove to contain 1. In order to prove that a certain cell contains 1, he can use a set of his questions such that the interesting cell appears an odd number of times in them, and all other cells appear an even number of times in them, and the sum of Pigeon's answers for those questions is 1 modulo 2. One simple way to prove is to ask about the set containing just the interesting cell and receive answer 1, but there are also other ways.
3) Find a row such that he can prove that all its cells contain 0. Proof for each cell must be done similarly to the previous case.

Dirichlet knows that sooner or later one of those three things will happen, since the grid has more rows than columns, and thus it either has a row of 0s or two 1s in the same column. One way to certainly arrive at one of the outcomes is to ask about each cell of the grid separately, using n*(n+1) questions. But Dirichlet wants to win faster: using at most 5*(n+1) questions. Pigeon's goal is to prevent Dirichlet from arriving at one of the three outcomes so fast. You need to implement a program that plays for Dirichlet. n is at most 70.

In the previous summary, I have mentioned a cute Open Cup problem: you are given two numbers n and k. You need to choose the smallest amount of pairs (a,b) such that 1<=a,b<=k for each pair, and such that any sequence of n not necessarily distinct numbers, each between 1 and k, will contain at least one of your chosen pairs as a subsequence.

First of all, consider a sequence of n equal numbers shows that we mush include all k (a,a) pairs in our answer. Moreover, when n>k, we don't need any other pairs thanks to our old friend Dirichlet. But what to do when n<=k? Here one needs to build some intuition first in order to see the right approach.

Let's consider the case n=k. After taking the (a,a) pairs we need to additionally cover all permutations of k numbers. Let's consider one of them: 1,2,...,k. Without loss of generality, we can assume that the pair (1,2) is one of its subsequences present in the answer, so we have at k+1 pairs in our answer now. Is that enough? No, since there exist permutations that do not contain (1,2) as a subsequence. Moreover, there exists a permutation that does not have any common subsequence with our first permutation: k,k-1,...,1. This shows that we need at least k+2 pairs. Which pair should we cover the decreasing permutation with? A natural choice is to use the (2,1) pair. In fact, it's not hard to see that any permutation of k numbers contains either the (1,2) pair or the (2,1) pair as a subsequence, so our k+2 pairs cover all sequences.

Now let's turn to the n=k-1 case. Here the reasoning becomes less formal and more intuitive. Our solution for n=k does not cover sequences that do not contain 1 or 2, for example 2,3,...,k. Let's say this sequence will be covered by (3,4) pair (a pair that contains 2, like (2,3) is less useful since we can replace 2 with 1 and get another uncovered sequence). Again, its reverse will need to be covered, so let's take (4,3) pair as well. Now we have k+4 pairs: (1,1), (2,2), ..., (k,k), (1,2), (2,1), (3,4), (4,3). Dirichlet can easily notice that these k+4 pairs are enough to cover all sequences of length k-1, since any such sequence either contains a repeat, or is missing just one number, and thus either contains both 1 and 2, or contains both 3 and 4.

The cases above have hopefully helped build some intuitive understanding of the problem that allows to generalize the construction to any n: we will split all k numbers into n-1 groups of equal or almost equal size, and will include all pairs of elements from one group into our answer. By the pigeonhole principle, each sequence of length n will contain two elements from the same group, and thus will be covered. As an example, when n=3 and k=5, we will output pairs (1,1), (1,2), (1,3), (2,1), (2,2), (2,3), (3,1), (3,2), (3,3), (4,5), and (5,4).

A formal proof that this answer is optimal is a bit tedious, so I will just refer you to the Wikipedia article. During the actual contest, the intuition built on simple cases helped this solution to "click" in my head, and thus I submitted it without having a formal proof.

Thanks for reading, and check back soon!

Sunday, April 9, 2017

A dominator week

First of all, please note that there's still a bit more than 3 hours left in the Google Code Jam qualification. You can still hop on the Code Jam train!

The last week was of the extremely busy type. First off, Codeforces Round 407 took place on Wednesday (problems, results, top 5 on the left, analysis). -XraY- and Um_nik stood out from the pack by solving all problems, and slightly better speed has earned -XraY- his first - but definitely not his last - victory of the week. Congratulations!

A Swiss-resident-only Helvetic Coding Contest 2017 took place on Saturday (results, top 5 on the left). Just like last year, the problems will be used for an online mirror some time later, so I will not spoil you the fun of solving them.

AtCoder Grand Contest 012 took place at the same time (problems, results, top 5 on the left, analysis). -XraY-'s second (but still not his last) victory was more clear-cut than the first one, as he managed to solve five problems fifteen minutes faster than everybody else, and was the only one at the top without any penalty minutes. Well done!

A few hours later TopCoder Open 2017 has taken off with its Round 1A (problems, results, top 5 on the left). With the top 250 active coders receiving a bye into Round 2, the contest has once again given semi-retired veterans a chance to shine, and it seems that ACRush did not forget how to win TopCoder rounds :) Congratulations!

Sunday took off with Open Cup 2016-17 Grand Prix of Tatarstan (problemsresults, top 5 on the left), where -XraY- has earned his third and final victory of the week, this time together with his team - amazing!

Problem E was of the kind that I prefer to give on my own contests, as it makes preparing the testcases very easy: with just two numbers in the input :) On a more serious side, here's what it was about: you are given two numbers n and k. You need to choose the smallest amount of pairs (a,b) such that 1<=a,b<=k for each pair, and such that any sequence of n not necessarily distinct numbers, each between 1 and k, will contain at least one of your chosen pairs as a subsequence.

Can you see the solution? Can you prove it?

Finally, Russian Code Cup 2017 Qualification Round 1 wrapped up the week on Sunday evening (problems, results, top 5 on the left, analysis). Even though the main goal here was to get into the top 200, there was still a scoreboard and it was possible to get the first place - which eatmore did thanks to flawless execution without any incorrect attempts. Way to go!

In my previous summary, I have mentioned a couple of problems. The first one went like this: you are given 300 trees on the same set of 300 vertices. You want to pick exactly one edge in each tree, remove those, then add the edge you removed from i-th tree to the (i+1)-th tree, for all i (the edge from the last tree is added to the first one), in such a way that the resulting graphs are still trees. How many ways are there to do that, modulo 109+7?

Let's say we picked the edge x-y in the first tree. After we add it to the second tree, a cycle consisting of this edge and the (only) simple path from x to y in the second tree will be formed, so we need to remove any edge in the said simple path to obtain a tree again. This edge in turn will define a path on the third tree where we need to pick an edge to remove, and so on. After we make a full cycle, we need to get back to the first edge.

If we fix the edge we pick in the first tree, we can use dynamic programming to find the number of ways to pick the edge to remove in the first a trees, such that b-th edge of the a-th tree is removed. This dynamic programming has O(n2) states, each state can be processed in O(n) (we need to traverse the corresponding path in the next tree), and we have O(n) outside iterations for the edge of the first tree, so the total running time is O(n4).

However, we can notice that the innermost O(n) is used to support an operation "add a constant to all edges of the tree along the given path". Here's how we can do it faster. Every path in a tree always goes towards the root until we reach the lowest common ancestor, then away from root. Now, instead of having values on edges and adding a constant c to the entire path from x to y, we can have values in vertices in such a way that the value for each edge is computed as the sum of values of vertices in the corresponding subtree, and then in order to add a constant to a path we need to add c to vertices x and y, and subtract 2c from lca(x, y) - just 3 numbers to be updated. This makes handling of each dynamic programming state O(1), and the total running time is now O(n3).

The second problem was: you are given a directed acyclic graph with 200000 vertices and 500000 arcs. We're going to remove one of its arcs. For each vertex v of the graph, you have to answer a question: is it true that v is guaranteed to be reachable from vertex 1, no matter which arc we remove?

This problem is closely related to the concept of dominator tree. Here, we would define that arc a dominates vertex v, if one must pass through a when going from 1 to v. Our goal is to tell if there are any dominating arcs for each vertex.

It turns out the following approach (corrected by Shubham in comments below - thanks!) works: assuming we have topologically sorted our graph from left to right, let's compute the leftmost dominating arc (or the fact there's no dominating arc) for each vertex. The computation will also go from left to right, and work like this for vertex u: if for each vertex reachable from 1 that have arcs into u the leftmost dominating arc is the same, then this arc is the leftmost dominating arc for u as well. If that's not the case, but there's just one vertex v reachable from 1 that has an arc into u, then the arc from v to u is the leftmost dominating arc for u (note that in this case v has no dominating arcs).

So far, everything is somewhat intuitive and easy to prove, but here comes the insight: in all other cases, there are no dominating arcs for u! Suppose it's not the case, and some arc a dominates u. We have at least two vertices reachable from 1 that have arcs into u, and the don't have the same dominating arc. Thus a can't be directly incoming to u, and at least one of the vertices that have arcs to u doesn't have a as its dominating arc, which leads to a contradiction since then we can bypass a on our way to u, too.

Thanks for reading, and check back next week!