Sunday, October 8, 2017

A randomized week

The Aug 14 - Aug 20 week contained the usual suspects: a TopCoder SRM and a Codeforces round. First off, TopCoder SRM 719 took place in the early Tuesday hours (problems, results, top 5 on the left, analysis). yanQval was a lot faster than everybody else and won the round - well done!

Codeforces Round 429 followed on Friday (problems, results, top 5 on the left, analysis). anta and LHiC have both solved their final fifth problem with only a few minutes to spare and thus obtained quite a margin at the top of the scoreboard - way to go!

Problem D in this round allowed for a cute randomized solution that in my view is a bit easier than the one from the official analysis. The problem was: you are given an array of 300000 numbers, and 300000 queries of the following form: a segment [l;r] and a number k, for which you need to find the smallest number that appears on at least 1/k of all positions in the given segment of the array. Note that k is at most 5, so the number must be quite frequent within the segment. Can you see how to use that fact to implement a simple randomized solution?

And finally, I've checked out CodeChef August 2017 Cook-Off on Sunday (problems, results, top 5 on the left, my screencast). EgorK was at the top of the standings with 4 problems for quite some time, but gennady.korotkevich has then submitted his fourth and fifth problems in rapid succession and won the competition. Congratulations!

In my previous summary, I have mentioned a problem that I have set for Google Code Jam 2017 finals: you are given a number k between 3 and 10000. Output any simple undirected graph with at most 22 vertices that has exactly k spanning trees.

The approach that works the most frequently in such problems is to learn several transitions that help cover all required numbers. For example, if we could come up with a way to transition from a graph with k spanning trees to graphs with 2k and 2k+1 spanning trees by adding one vertex, the problem would be solved. Alas, this does not seem possible, as it's not clear how to achieve that extra "+1" spanning tree.

As explicit construction turns out to be hard to impossible, we have to turn to more experimental approaches. One thing that stands out is that there's an enormous number of graphs on 22 vertices, and while they have at most 22^20 spanning trees, many have under 10000 spanning trees, and thus we can hope that for each k in the given range there are many possible answers. So the next idea is to try generating random graphs and checking how many spanning trees those graphs have using the matrix tree theorem, hoping to find all numbers between 3 and 10000 at least once.
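To make the counting step concrete, here is a minimal Python sketch of the matrix tree theorem: the number of spanning trees equals the determinant of the graph Laplacian with one row and the matching column removed. The function name count_spanning_trees and the edge-list input are assumptions made for illustration, and exact Fraction arithmetic is simply a convenient way to avoid rounding, not necessarily what a contest solution would use.

```python
from fractions import Fraction


def count_spanning_trees(n, edges):
    """Count spanning trees of a simple undirected graph on vertices 0..n-1."""
    # Build the Laplacian: degrees on the diagonal, -1 for every edge.
    lap = [[Fraction(0)] * n for _ in range(n)]
    for u, v in edges:
        lap[u][u] += 1
        lap[v][v] += 1
        lap[u][v] -= 1
        lap[v][u] -= 1
    # Delete row 0 and column 0, then take the determinant by elimination.
    m = [row[1:] for row in lap[1:]]
    size = n - 1
    det = Fraction(1)
    for col in range(size):
        pivot = next((r for r in range(col, size) if m[r][col] != 0), None)
        if pivot is None:
            return 0  # singular minor: the graph is disconnected
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            det = -det
        det *= m[col][col]
        for r in range(col + 1, size):
            factor = m[r][col] / m[col][col]
            for c in range(col, size):
                m[r][c] -= factor * m[col][c]
    return int(det)


cycle13 = [(i, (i + 1) % 13) for i in range(13)]
print(count_spanning_trees(13, cycle13))
```

For graphs on 22 vertices this is a 21 by 21 determinant, so each check is cheap enough to run a very large number of times.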

It turns out that some amounts of trees are more difficult to achieve than others, so this approach by itself is not sufficient to generate all answers within reasonable time. However, there are numerous different ideas that help it find all required numbers faster, and thus help get the problem accepted. I won't be surprised if everybody who got this problem accepted at the Code Jam has used a different approach :) My experimentation led me down the following path:

  • Since some numbers are hard to get, we try to steer the random graphs towards those numbers. Suppose we have a "goal" number we're trying to achieve. The easiest way to do that is to avoid recreating the graph from scratch, but instead take the graph from the previous attempt, and check if it has fewer trees than the goal, or more. In the former case, we add a random edge to it, and otherwise remove one. As a result, we're more likely to hit the goal, and as soon as we do, we pick another goal from the yet unseen numbers and repeat the process.
  • This process has an unfortunate bad state: when we remove an edge, we can make the graph disconnected, thus having 0 spanning trees. At this point we start adding random edges, but if the graph stays disconnected, it keeps having 0 spanning trees. Finally, we add an edge that connects the graph back, but since we have added so many other edges in the meantime, suddenly the graph has a lot of spanning trees, way more than we need. So we need to remove a lot of edges to get back to reasonable numbers, and may well jump back to 0 in the process. Having noticed this happening, I've modified the process to never remove an edge that would make the graph disconnected.
  • Finally, we get almost all numbers we need! After a minute or so, my solution has generated all numbers between 3 and 10000 except 13 and 22 (if I remember correctly). The final idea is that small numbers can still be generated by an explicit construction: a cycle with k vertices has k spanning trees. Since we're allowed at most 22 vertices, 13 and 22 can be done using a cycle.
That's all for my solution, but I have one more question for you: why is 22 so hard to get? In particular, does there exist a simple undirected graph with at most 21 vertices that has 22 spanning trees? I strongly suspect that the answer is no, but don't know for sure.
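The steering process from the list above can be sketched roughly as follows. This is a hedged illustration, not the actual contest code: the names steer, count_trees and is_connected, the fixed vertex count n, the step limit, and the choice to start from a cycle are all assumptions, and the sketch also records any unseen number it happens to pass through, not only the current goal.

```python
import random
from fractions import Fraction
from itertools import combinations


def count_trees(n, edges):
    """Matrix tree theorem: determinant of the Laplacian without row/column 0."""
    m = [[Fraction(0)] * (n - 1) for _ in range(n - 1)]
    for u, v in edges:
        for a, b in ((u, v), (v, u)):
            if a:
                m[a - 1][a - 1] += 1
                if b:
                    m[a - 1][b - 1] -= 1
    det = Fraction(1)
    for col in range(n - 1):
        pivot = next((r for r in range(col, n - 1) if m[r][col] != 0), None)
        if pivot is None:
            return 0
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            det = -det
        det *= m[col][col]
        for r in range(col + 1, n - 1):
            factor = m[r][col] / m[col][col]
            for c in range(col, n - 1):
                m[r][c] -= factor * m[col][c]
    return int(det)


def is_connected(n, edges):
    adj = {v: [] for v in range(n)}
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    seen, stack = {0}, [0]
    while stack:
        for u in adj[stack.pop()]:
            if u not in seen:
                seen.add(u)
                stack.append(u)
    return len(seen) == n


def steer(n, goals, steps, seed=0):
    """Random walk over graphs on n vertices, nudged towards unseen goals."""
    rng = random.Random(seed)
    all_edges = list(combinations(range(n), 2))
    edges = {(i, i + 1) for i in range(n - 1)} | {(0, n - 1)}  # start from a cycle
    remaining, found, goal = set(goals), {}, None
    for _ in range(steps):
        trees = count_trees(n, edges)
        if trees in remaining:
            found[trees] = sorted(edges)
            remaining.discard(trees)
        if not remaining:
            break
        if goal not in remaining:
            goal = rng.choice(sorted(remaining))
        if trees < goal:
            absent = [e for e in all_edges if e not in edges]
            if absent:
                edges.add(rng.choice(absent))
        else:
            # Never remove an edge whose removal disconnects the graph.
            removable = [e for e in sorted(edges) if is_connected(n, edges - {e})]
            if removable:
                edges.remove(rng.choice(removable))
    return found


found = steer(8, range(3, 40), steps=300, seed=1)
print(sorted(found))
```

The connectivity guard in the removal branch is what keeps the walk away from the "0 spanning trees" trap described in the second bullet.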

Thanks for reading, and check back later!

Wednesday, September 27, 2017

A week of 22

The Aug 7 - Aug 13 week was centered around Google Code Jam final rounds. First off, the 21 finalists competed in the Distributed Code Jam final round on Thursday (problems, results, top 5 on the left, analysis). ecnerwala has snatched the first place with just 3 minutes left in the contest on his 6th attempt on D-small - here's to perseverance :) Congratulations!

One day later, the traditional Google Code Jam final round saw its 26 finalists compete for the first place (problems, results, top 5 on the left, analysis, commentary stream). Gennady.Korotkevich has won the Code Jam for the fourth year in a row - well done! He has given everybody else a chance this time by solving only two large inputs, but his competitors could not catch up for various reasons. SnapDragon was leading going into the system test phase, but he knew himself that his solution for D-large was incorrect as it was not precise enough. Second-placed zemen could have claimed the victory had he solved C instead of E-small or F-small or both in the last hour.

I have set the said problem C for this round, which went like this: you are given a number k between 3 and 10000. Output any simple undirected graph with at most 22 vertices that has exactly k spanning trees. I don't think it is solvable just in one's head, but if you have some time to experiment on a computer, I think it's an interesting problem to try!

In my previous summary, I have mentioned a TopCoder problem: given a tree with n<=250 vertices, you need to solve a system of at most 250 equations on this tree with m<=250 variables x_1, x_2, ..., x_m, where each variable must correspond to some vertex in the tree. Several variables can correspond to the same vertex. Each equation has the following form: in the path from vertex x_{a_i} to vertex x_{b_i} the vertex closest to vertex p_i must be the vertex q_i.

If we look at the tree as rooted at vertex p_i, the corresponding equation means that both x_{a_i} and x_{b_i} must be in the subtree of the vertex q_i, and there must be no child of q_i such that both x_{a_i} and x_{b_i} are in the subtree of this child. At this point one had to strongly suspect this problem will reduce to a 2-SAT instance with boolean variables corresponding to "this variable is in this subtree". The equations are already in 2-SAT form on these boolean variables as described above, but the tricky part is to come up with additional clauses that guarantee that the values of boolean variables are consistent and we can pick one vertex for each variable.

The first difficulty is that the tree is rooted differently for each equation. However, we can notice that each subtree of any re-rooting of the given tree is either a subtree of this tree when rooted at the first vertex, or a complement of such a subtree. So if we have boolean variables for "x_i is in the subtree of vertex y when rooted at vertex 1", then the complementary subtrees can be expressed simply as negations of those variables, as each x_i must go to some vertex.

The second difficulty is ensuring that when we look at all subtrees which contain a given x_i according to our boolean variables, they have exactly one vertex in their intersection that we can assign x_i to. It turns out that this can be achieved using simple clauses "if x_i is in the subtree of a vertex, it's also in the subtree of its parent, and not in subtrees of its siblings", which are handily in 2-SAT form. We can then find the deepest vertex containing x_i in its subtree according to the values of the boolean variables, and it's not hard to see that all clauses will be consistent when x_i is placed in that vertex.

Thanks for reading, and check back later for more competitive programming news!

Sunday, September 24, 2017

A 2-SAT week

The contests of July 31 - Aug 6 week started with the second competition day of IOI 2017 on Tuesday (problems, overall results, top 5 on the left). As expected, only the three highest scorers of day 1 had a real chance for the first place, and Yuta Takaya has grabbed this chance with the amazing score of 300 out of 300. Well done!

On Saturday, my fruitless attempts to qualify for the TopCoder Open onsite in Buffalo started with TopCoder Open 2017 Round 3A (problems, results, top 5 on the left, analysis). tourist has claimed the first place thanks to solving both the easy and the hard problem - congratulations - but qualification was also possible with easy+challenges, as Scott and Igor have demonstrated.

I could not figure out the solution for the hard problem in time, even though I suspected it would involve a reduction to 2-SAT from the very beginning. Can you see the way?

You were given a tree with n<=250 vertices and need to solve a system of at most 250 equations on this tree with m<=250 variables x_1, x_2, ..., x_m, where each variable must correspond to some vertex in the tree. Several variables can correspond to the same vertex. Each equation has the following form: in the path from vertex x_{a_i} to vertex x_{b_i} the vertex closest to vertex p_i must be the vertex q_i.

Somewhat orthogonally to this particular problem, I felt like my general skill of reduction to 2-SAT could use some improvement. It seems that there are some common tricks that allow replacing complex conditions with just "or"s of two variables. For example, suppose we want a constraint "at most one of this subset of variables is true". It could naively be translated into n^2 2-SAT clauses. However, we can introduce auxiliary variables to get by with only O(n) clauses: "b_i must be true when at least one of the first i original variables a_i is true" with clauses b_i->b_{i+1} and a_i->b_i. Disallowing two true variables is then done via the clause b_i->!a_{i+1}.
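As a sanity check of the "at most one" trick, the following Python sketch generates the implications for n variables and verifies by brute force that the auxiliary variables can be filled in exactly when at most one original variable is true. The literal representation (name, index, polarity) and the function names are assumptions chosen for illustration; a real solution would feed these implications into a 2-SAT solver instead of enumerating assignments.

```python
from itertools import product


def at_most_one_clauses(n):
    """Implications (premise, conclusion) over a_0..a_{n-1} and auxiliary b_0..b_{n-1}."""
    clauses = []
    for i in range(n):
        clauses.append((("a", i, True), ("b", i, True)))           # a_i -> b_i
        if i + 1 < n:
            clauses.append((("b", i, True), ("b", i + 1, True)))   # b_i -> b_{i+1}
            clauses.append((("b", i, True), ("a", i + 1, False)))  # b_i -> !a_{i+1}
    return clauses


def extendable(a, clauses):
    """Is there a choice of the auxiliary b values satisfying every implication?"""
    for b in product([False, True], repeat=len(a)):
        values = {"a": a, "b": b}

        def holds(literal):
            name, index, polarity = literal
            return values[name][index] == polarity

        if all((not holds(p)) or holds(q) for p, q in clauses):
            return True
    return False


n = 4
clauses = at_most_one_clauses(n)
print(len(clauses))  # 3n - 2 implications, so O(n)
for a in product([False, True], repeat=n):
    assert extendable(a, clauses) == (sum(a) <= 1)
```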

What are the other cool 2-SAT reduction tricks?

Thanks for reading, and check back as I continue to remember August and September :)

Sunday, July 30, 2017

A week with two trees

This week's contests were clustered on Sunday. In the morning, the first day of IOI 2017 took place in Tehran (problems, results, top 5 on the left, video commentary). The "classical format" problems wiring and train were pretty tough, so most contestants invested their time into the marathon-style problem nowruz. Three contestants have managed to gain a decent score there together with 200 on the other problems, so it's quite likely that they will battle for the crown between themselves on day 2. Congratulations Attila, Yuta and Mingkuan!

A few hours later, Codeforces Round 426 gave a chance to people older than 20 :) (problems, results, top 5 on the left). The round contained a bunch of quite tedious implementation problems, and Radewoosh and LHiC have demonstrated that they are not afraid of tedious implementation by solving 4 out of 5. Well done!

In my previous summary, I have raised a sportsmanship question, asking whether it's OK to bail out of an AtCoder contest without submitting anything. The poll results came back as 42% OK, 28% not OK, and 30% saying that the contest format should be improved. As another outcome of that post, tourist and moejy0viiiiiv have shared their late submission strategies that do not consider bailing out as a benefit, so I was kind of asking the wrong question :) I encourage you to read the linked posts to see how the AtCoder format actually makes the competition more enjoyable.

I have also mentioned a bunch of problems in that post, so let me come back to one of them: you are given two rooted trees on the same set of 10^5 vertices. You need to assign integer weights to all vertices in such a way that for each subtree of each of the trees the sum of weights is either 1 or -1, or report that it's impossible.

First of all, we can notice that the value assigned to each vertex can be computed as (the sum of its subtree) - (the sum of subtrees of all its children). Since all said sums are either -1 or 1, the parity of this value depends solely on the parity of the number of children. So we can compare said parities in two trees, and if there's a mismatch, then there's definitely no solution.

Now, after playing a bit with a few examples, we can start to believe that in case all parities match, there is always a solution. During the actual contest I've implemented a brute force to make sure this is true. Moreover, we can make an even stronger hypothesis: there is always a solution where all values are -1,0 or 1. Again, my brute force has confirmed that this is most likely true.

For vertices where the value must be even, we can assume it's 0. For vertices where the value must be odd, we have two choices: -1 or 1. Let's call such vertices odd vertices.

Our parity check guarantees that each subtree of each tree will have an odd number of odd vertices, in order to be able to get -1 or 1 in total. So we must either have k -1's and k+1 1's, or k+1 -1's and k 1's. Here comes the main idea: in order to achieve this, it suffices to split all odd vertices into pairs in such a way that in each subtree all odd vertices but one form k whole pairs, and each pair is guaranteed to have one 1 and one -1.

Having said that idea aloud, we can very quickly find a way to form the pairs: we would just go from leaves to the root of the tree, and each subtree will pass at most one unpaired odd vertex to its parent. There we would pair up as many odd vertices as possible, and send at most one further up.

Since we have not one, but two trees, we will get two sets of pairs. Can we still find an assignment of 1's and -1's that satisfies both sets? It turns out we can: the necessary and sufficient condition for such assignment to exist is for the graph formed by the pairs to be bipartite. And sure enough, since the graph has two types of edges, and each vertex has at most one edge of each type, any cycle must necessarily have the edge types alternate, and thus have even length. And when all cycles have even length, the graph is bipartite.
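Putting the parity check, the pairing and the two-coloring together gives roughly the following Python sketch. The parent-array input (with -1 marking the root) and the names pair_up and assign_weights are assumptions made for this illustration, not the format of the original problem.

```python
from collections import deque


def pair_up(parent, odd):
    """Pair odd vertices so that every subtree leaves exactly one of them unpaired."""
    n = len(parent)
    children = [[] for _ in range(n)]
    root = parent.index(-1)
    for v, p in enumerate(parent):
        if p >= 0:
            children[p].append(v)
    order, stack = [], [root]
    while stack:  # preorder; reversed, it visits children before parents
        v = stack.pop()
        order.append(v)
        stack.extend(children[v])
    pending = [[] for _ in range(n)]  # unpaired odd vertices handed up to v
    pairs = []
    for v in reversed(order):
        if odd[v]:
            pending[v].append(v)
        while len(pending[v]) >= 2:
            pairs.append((pending[v].pop(), pending[v].pop()))
        if pending[v] and parent[v] >= 0:
            pending[parent[v]].append(pending[v][0])
    return pairs


def assign_weights(parent1, parent2):
    n = len(parent1)
    kids1, kids2 = [0] * n, [0] * n
    for kids, parent in ((kids1, parent1), (kids2, parent2)):
        for p in parent:
            if p >= 0:
                kids[p] += 1
    if any((a - b) % 2 for a, b in zip(kids1, kids2)):
        return None  # parity mismatch: no solution
    odd = [k % 2 == 0 for k in kids1]  # value parity is 1 + number of children
    adj = [[] for _ in range(n)]
    for parent in (parent1, parent2):
        for a, b in pair_up(parent, odd):
            adj[a].append(b)
            adj[b].append(a)
    value = [0] * n
    for start in range(n):  # two-color the graph of pairs with +1 / -1
        if odd[start] and value[start] == 0:
            value[start] = 1
            queue = deque([start])
            while queue:
                v = queue.popleft()
                for u in adj[v]:
                    if value[u] == 0:
                        value[u] = -value[v]
                        queue.append(u)
    return value


print(assign_weights([-1, 0, 0], [1, -1, 1]))
```

The coloring step relies on the alternating-cycle argument above: every vertex has at most one pair edge per tree, so the breadth-first coloring never meets a conflict.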

Looking back at the whole solution, we can see that the main challenge is to restrict one's solution in the right way. When we work with arbitrary numbers, there are just too many dimensions in the problem. When we restrict the numbers to just -1, 0, and 1, the problem becomes more tractable, and it becomes easier to see which ideas work and which don't. And when we add the extra restriction in form of achieving the necessary conditions by pairing up the vertices, the solution flows very naturally.

Thanks for reading, and check back next week!

Sunday, July 23, 2017

An unsportsmanlike week

Yandex.Algorithm 2017 Final Round took place both in Moscow and online on Tuesday (problems, results, top 5 on the left). My contest started like a nightmare: I was unable to solve any of the given problems during the first hour, modulo a quick incorrect heuristic submission on problem F. As more and more people submitted problems D (turned out to be a straightforward dynamic programming) and E (turned out to be about finding n-th Catalan number), I grew increasingly frustrated, but still couldn't solve them. After my wits finally came back to me after an hour, all problems suddenly seemed easy, and I did my best to catch up with the leaders, submitting 5 problems. Unfortunately, one of them turned out to have a very small bug, and I was denied that improbable victory :)

Congratulations to tourist, W4yneb0t and rng.58 on winning the prizes!

Here's the Catalan number problem that got me stumped. Two people are dividing a heap of n stones. They take stones in turns until none are left. At their turn, the first person can take any non-zero amount of stones. The second person can then also take any non-zero amount of stones, but with an additional restriction that the total amount of stones he has after this turn must not exceed the total amount of stones the first person has. How many ways are there for this process to go to completion, if we also want the total amount of stones of each person to be equal in the end?

TopCoder Open 2017 Round 2C was the first event of the weekend (problems, results, top 5 on the left, parallel round results, my screencast). The final 40 contestants for Round 3 were chosen, and kraskevich advanced with the most confidence of all, as he was the only contestant to solve all problems. Congratulations!

This round contained a very cute easy problem. Two teams with n players each are playing a game. Each player has an integer strength. The game lasts n rounds. In the first round, the first player of the first team plays the first player of the second team, and the stronger player wins. In the second round, the first two players of the first team play the first two players of the second team, and the pair with the higher total strength wins. In the i-th round the strengths of the first i players in each team are added up to determine the winner. You know the strengths of each player, and the ordering of the players in the second team. You need to pick the ordering for the players in the first team in such a way that the first team wins exactly k rounds. Do you see the idea?

And finally, AtCoder Grand Contest 018 took place on Sunday (problems, results, top 5 on the left, analysis, my screencast with commentary). tourist has adapted his strategy to the occasion, submitting the first four problems before starting work on the last two, but in the end that wasn't even necessary as he also solved the hardest problem and won with a 15 minute margin. Well done!

As you can see in the screencast, I was also trying out this late submission strategy this time, and when I solved the first four and saw that Gennady has already submitted them, I was quite surprised. There was more than an hour left, so surely I'd be able to solve one more problem? And off I went, trying to crack the hardest problem because it gave more points and seemed much more interesting to solve than the previous one.

I have made quite a bit of progress, correctly simplifying the problem to the moment where the main idea mentioned in the analysis can be applied, but unfortunately could not come up with that idea. That left me increasingly anxious as the time ran out: should I still submit the four problems I have (which turned out to be all correct in upsolving), earning something like 40th place instead of 10th or so that I would've got had I submitted them right after solving? Or should I avoid submitting anything, thus not appearing in the scoreboard at all and not losing rating, but showing some possibly unsportsmanlike behavior?

I have to tell you, this is not a good choice to have. Now I admire people who can pull this strategy off without using the escape hatch even more :) As a reminder, the benefits of this strategy, as I see them (from comments in a previous post), are:
1) Not giving information to other competitors on the difficulty of the problems.
2) Not allowing other competitors to make easy risk/reward tradeoffs, as if they know the true scoreboard, they might submit their solutions with less testing when appropriate.

I ended up using the escape hatch, which left me feeling a bit guilty, but probably more uncertain than guilty. Do you think this is against the spirit of competition, as PavelKunyavskiy suggests? Please share your opinion in comments, and also let's have a vote:

Is it OK to bail out of a contest after a poor performance with late submit strategy at AtCoder?

Here's the hardest problem that led to all this confusion: You are given two rooted trees on the same set of 10^5 vertices. You need to assign integer weights to all vertices in such a way that for each subtree of each of the trees the sum of weights is either 1 or -1, or report that it's impossible. Can you do that?

In my previous summary, I have mentioned a hard VK Cup problem. You are given an undirected graph with 10^5 vertices and edges. You need to assign non-negative integer numbers not exceeding 10^6 to the vertices of the graph. The weight of each edge is then defined as the product of the numbers at its ends, while the weight of each vertex is equal to the square of its number. You need to find an assignment such that the total weight of all edges is greater than or equal to the total weight of all vertices, and at least one number is nonzero.

The first idea is: as soon as the graph has any cycle, we can assign number 1 to all vertices in the cycle, and 0 to all other vertices, and we'll get what we want. So now we can assume the graph is a forest, and even a tree.

Now, consider a leaf in that forest, and assume we know the number x assigned to the only vertex this leaf is connected to. If we now assign number y to this leaf, then the difference between the edge weight and the vertex weight will increase by x*y-y*y. We need to pick y to maximize this difference, which is finding the maximum of a quadratic function, which is achieved when y=x/2, and the difference increases by x^2/4.

Now suppose we have a vertex that is connected to several leaves, and to one other non-leaf vertex for which we know the assigned number x. After we determine the number y for this vertex, all adjacent leaves must be assigned y/2, so we can again compute the difference as the function of y and find its maximum. It will be a quadratic function again, and the solution will look like x*const again.

Having rooted the tree in some way, this approach allows us to actually determine the optimal number for all vertices going from leaves to the root as multiples of the root number, and we can check if the overall delta is non-negative or not.
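The leaves-to-root computation can be sketched in Python with exact fractions. The sketch tracks, for each vertex, a coefficient g such that the best achievable delta of its subtree is g*x^2, where x is the number at its parent: with children coefficients summing to G, the delta x*y - (1-G)*y^2 is maximized at y = x/(2*(1-G)) and equals x^2/(4*(1-G)), which matches x^2/4 for a leaf. The parent-array input and the function name are assumptions, and the sketch deliberately ignores the bound on the size of the integers.

```python
from fractions import Fraction


def has_nonnegative_assignment(parent):
    """Can a tree get total edge weight >= total vertex weight, ignoring size bounds?"""
    n = len(parent)
    children = [[] for _ in range(n)]
    root = parent.index(-1)
    for v, p in enumerate(parent):
        if p >= 0:
            children[p].append(v)
    order, stack = [], [root]
    while stack:  # preorder; reversed, it goes from leaves to the root
        v = stack.pop()
        order.append(v)
        stack.extend(children[v])
    gain = [Fraction(0)] * n  # best subtree delta is gain[v] * x^2, x = parent's number
    for v in reversed(order):
        total = sum((gain[c] for c in children[v]), Fraction(0))
        if total >= 1:
            # (total - 1) * y^2 >= 0 already: use this subtree alone and put 0
            # everywhere else, in particular at the parent of v.
            return True
        gain[v] = 1 / (4 * (1 - total))
    return False


print(has_nonnegative_assignment([-1, 0, 0, 0, 0]))  # star with 4 leaves
print(has_nonnegative_assignment([-1, 0, 1, 2, 3]))  # path with 5 vertices
```

For the star with four leaves the optimum is the center at 2 and the leaves at 1, giving edge weight 8 and vertex weight 8, so the delta is exactly zero.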

There's an additional complication caused by the fact that the resulting weights must be not very big integers. We can do the above computation in rational numbers and then scale the result to make all numbers integer, but they might become quite big. However, if we look at the weight differences that some small trees get, we can notice that almost all trees can get a non-negative difference, and come up with a case study that finds a subtree in a given tree which still has a non-negative difference but can be solved with small numbers. I will leave this last part as an exercise for the readers.

Wow, it feels good to have caught up with the times :) Thanks for reading, and check back next week!

A red-black week

Last week, Codeforces presented the problems from the VK Cup finals as two regular contests. First off, Codeforces Round 423 took place on Tuesday (problems, results, top 5 on the left, analysis). W4yneb0t has continued his string of excellent Codeforces performances with the best possible result - a victory, which has also catapulted him to the second place in the overall rating list. Well done!

In between the Codeforces rounds, TopCoder held its SRM 718 very early on Thursday (problems, results, top 5 on the left, analysis). The round seems to have been developing in a very exciting manner: three people submitted all three problems during the coding phase, then two of them lost points during the challenge phase, and the last remaining person with three problems failed the system test on the easiest problem! When the dust settled, snuke still remained in the first place thanks to his solution to the hard problem holding up. Congratulations on the second SRM victory!

Codeforces Round 424 rounded up the week's contests (problems, results, top 5 on the left, analysis). TakanashiRikka was the fastest to solve the first four problems, and (probably not a sheer coincidence :)) the only contestant to have enough time to consider all cases in the hardest problem correctly while solving at least one other problem. Congratulations on the well-deserved first place!

Here's that tricky problem. You are given an undirected graph with 10^5 vertices and edges. You need to assign non-negative integer numbers not exceeding 10^6 to the vertices of the graph. The weight of each edge is then defined as the product of the numbers at its ends, while the weight of each vertex is equal to the square of its number. You need to find an assignment such that the total weight of all edges is greater than or equal to the total weight of all vertices, and at least one number is nonzero.

In my previous summary, I have mentioned a difficult IPSC problem: we start with a deck of 26 red and 26 black cards, and a number k (1<=k<=26). The first player takes any k cards from the deck, and arranges them in any order they choose. The second player takes any k cards from the remaining deck, and arranges them in any order they choose, but such that their sequence is different from the sequence of the first player. The remaining 52-2k cards are shuffled and dealt one by one. As soon as the last k dealt cards exactly match one of the players' sequences, that player wins. In case no match happens after the cards run out, we toss a coin, and each player wins with probability 50%. What is the probability of the first player winning, assuming both play optimally?

Solving this problem required almost all important programming contest skills: abstract mathematical reasoning, knowledge of standard algorithms, coming up with new ideas, good intuition about heuristics, and of course the programming skill itself.

We start off by noticing that the second player has a very simple way to achieve 50% winrate: he can just choose a sequence that is a complement of the first player's sequence (replace red cards by black and vice versa), and then everything is completely symmetric.

How can the second player achieve more? He has two resources: first, he can choose a string that is more likely to appear in the sequence of the remaining cards. Second, he can choose a string that, when it appears together with the string of the first player, tends to appear earlier.

The strings that are more likely to appear are those that leave an equal proportion of reds and blacks (after taking out the string of the first player once and the string of the second player twice), and have no borders (prefixes that are equal to suffixes). This is because we can count the number of ways a given string can appear by multiplying the number of positions it can appear in by the number of ways to place the remaining characters after the matching part is fixed. The number of ways to place the remaining characters is maximized when the remaining characters have equal numbers of blacks and reds. This slightly overcounts the number of ways because in some cases the string can appear more than once; the lack of borders minimizes the number of such occurrences.

The strings that tend to appear earlier when both appear are those which have a suffix which matches a prefix of the first player's string. At best, if the first player string is s+c, where s is a string of length k-1 and c is a character, the second player should pick his string from 'r'+s and 'b'+s. In this case as soon as there's a match of the first player's string not in the first position, we can have a >50% chance to have a match of our string one position before.

Now we can already make the first attempt at a solution: let's try likely candidates for the first player's best move - it should likely be among the strings that have the most appearances; the second player should then choose either another string with lots of appearances, or a string that counter-plays the first player's string in the manner described above. However, this is not enough to solve the problem - we will get a wrong answer.

As part of implementing the above solution, we had to also implement the function to count the sought probability for the given pair of strings. It's also not entirely trivial, and can be done by using dynamic programming where the state is the number of remaining red and black cards, and the state in the Aho-Corasick automaton of the two strings.
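For small k, the probability for a fixed pair of strings can also be computed with a much simpler memoized recursion that remembers the last k dealt cards directly instead of an Aho-Corasick state. The Python sketch below is that swapped-in simplification, not the approach described above: its state space grows like 2^k, so it is only meant for experimenting with short strings. The function name and the 'r'/'b' string encoding are assumptions.

```python
from fractions import Fraction
from functools import lru_cache


def first_player_win_probability(s1, s2):
    """Exact win probability of the first player for strings s1, s2 over 'r'/'b'."""
    k = len(s1)
    reds = 26 - s1.count("r") - s2.count("r")
    blacks = 26 - s1.count("b") - s2.count("b")

    @lru_cache(maxsize=None)
    def win(r, b, tail):
        # tail holds the last (at most k) dealt cards.
        if tail == s1:
            return Fraction(1)
        if tail == s2:
            return Fraction(0)
        if r + b == 0:
            return Fraction(1, 2)  # no match at all: coin toss
        total = Fraction(0)
        if r:
            total += Fraction(r, r + b) * win(r - 1, b, (tail + "r")[-k:])
        if b:
            total += Fraction(b, r + b) * win(r, b - 1, (tail + "b")[-k:])
        return total

    return win(reds, blacks, "")


print(first_player_win_probability("rb", "br"))
print(float(first_player_win_probability("rbr", "rrb")))
```

Using Fraction keeps the symmetric cases exactly equal to 1/2, which makes it easy to spot mistakes in the transition logic.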

So, where do we go from there? Since we already have the function that computes the probability, we can now run it on all pairs of strings for small values of k and try to notice a pattern. We get something like this:

1 0.5 r b
2 0.5 rb br
3 0.3444170488792196 rbr rrb
4 0.35992624362382514 rrbb rrrb
5 0.3777939526283981 rrbrb brrbr
6 0.413011479190688 rbrbrr brbrbr
7 0.45319632265323256 rrbrrbb brrbrrb
8 0.4782049196004824 rrbbrrbb brrbbrrb

No obvious pattern seems to appear. However, we can notice that for large values of k, more precisely when 3k>52, the answer will be 0.5 simply because there are not enough remaining cards for either string to appear. So we only need to research the values of k between 9 and 17 now.

And here comes another key idea: we need to believe that by cutting enough branches early, our exhaustive search solution can run in a few minutes for all those values. At first, this seems improbable. For example, for k=16 we have 65536 candidates for each string, and four billion combinations in total, not to mention the Aho-Corasick on the inside. However, from our previous attempts at a solution we have some leads. More precisely, we know which strings of the second player are the most likely good answers for each string of the first player.

This allows us to get a good upper bound on the first player's score for each particular string reasonably quickly, which leads to the following optimization idea: let's run the search for all strings of the first player at the same time, and at each point we will take the "most promising" string - the one with the highest upper bound so far - and make one more step of the search for it, in other words try one more candidate for the second player's string, which may lower its upper bound. We continue this process until we arrive at the state where the most promising candidate does not have anything else to try, because we already ran through all possible second player strings for it - and this candidate then gives us the answer.
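The "most promising first" loop can be sketched generically in Python with a priority queue. Everything named here is hypothetical: score(move, reply) stands for the probability function, ordered_replies(move) stands for the heuristic ordering of the second player's candidates, and the initial upper bound of 1.0 is an assumption that holds because the scores are probabilities.

```python
import heapq


def best_first_maximin(first_moves, ordered_replies, score):
    """Find the move maximizing min over replies of score(move, reply), lazily."""
    replies = [ordered_replies(m) for m in first_moves]
    # Heap entries: (-upper bound, move index, index of the next reply to try).
    heap = [(-1.0, i, 0) for i in range(len(first_moves))]
    heapq.heapify(heap)
    evaluations = 0
    while heap:
        neg_bound, i, j = heapq.heappop(heap)
        if j == len(replies[i]):
            # The most promising move has nothing left to try, so its bound is
            # exact and no other move can beat it.
            return first_moves[i], -neg_bound, evaluations
        bound = min(-neg_bound, score(first_moves[i], replies[i][j]))
        evaluations += 1
        heapq.heappush(heap, (-bound, i, j + 1))
    return None


table = {
    "A": [0.5, 0.2, 0.4],
    "B": [0.45, 0.3, 0.35],
    "C": [0.1, 0.9, 0.9],
}
move, value, evaluations = best_first_maximin(
    list(table), lambda m: [0, 1, 2], lambda m, r: table[m][r]
)
print(move, value, evaluations)
```

On this toy table the loop stops after evaluating only part of the 9 pairs, because the bounds of "A" and "C" drop below the final answer early, which is the effect the pruning relies on.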

This search runs relatively fast because for most first player strings, our heuristics will give us an upper bound that is lower than the ultimate answer very quickly, and we will stop considering those strings further. It is still quite slow for larger values of k, so we need a second optimization on top: we can skip the Aho-Corasick part in the simple case where there are simply not enough cards of some color for the second player's string to appear. With those two optimizations, we can finally get all the answers in a few minutes.

Thanks for reading, and check back soon for this week's summary!

Sunday, July 16, 2017

A postcard week

The July 3 - July 9 week had a pretty busy weekend. On Saturday, IPSC 2017 has gathered a lot of current contestants together with veterans that come out of retirement just for this contest every year (problems, results, top 5 on the left, analysis). The (relatively) current contestants prevailed this time, with team Past Glory winning with a solid 2 point gap. Well done!

They were one of only two teams that managed to solve the hardest problem L2. It went like this: we start with a deck of 26 red and 26 black cards, and a number k (1<=k<=26). The first player takes any k cards from the deck, and arranges them in any order they choose. The second player takes any k cards from the remaining deck, and arranges them in any order they choose, but such that their sequence is different from the sequence of the first player. The remaining 52-2k cards are shuffled and dealt one by one. As soon as the last k dealt cards exactly match one of the players' sequences, that player wins. In case no match happens after the cards run out, we toss a coin, and each player wins with probability 50%. What is the probability of the first player winning, assuming both play optimally?

Just an hour later, TopCoder Open 2017 Round 2B selected another 40 lucky advancers (problems, results, top 5 on the left, analysis, parallel round results, my screencast). dotory1158 had a solid point margin from his solutions, and did not throw it all away during the challenge phase, although the contest became much closer :) Congratulations on the win!

The final round of VK Cup 2017 took place in St Petersburg on Sunday (problems, results, top 5 on the left). Continuing the Snackdown trend from the last week, two-person teams were competing. The xray team emerged on top thanks to a very fast start - in fact, they already got enough points for the first place at 1 hour and 38 minutes into the contest, out of 3 hours. Very impressive!

And finally, AtCoder hosted its Grand Contest 017 on Sunday as well (problems, results, top 5 on the left, analysis). Once again the delayed submit strategy has worked very well for tourist, but this time the gap was so huge that the strategy choice didn't really matter. Way to go, Gennady!

Problem D in this round was about the well-known game of Hackenbush, more precisely green Hackenbush: you are given a rooted tree. Two players alternate turns; in each turn a player removes an edge together with the subtree hanging on this edge. When a player cannot make a move (only the root remains), they lose. Who will win when both players play optimally?

If you haven't seen this game before, then I encourage you to try solving the problem before searching for the optimal strategies on the Internet (which has them). I think the solution is quite beautiful!

Thanks for reading, and check back for this week's summary.

A hammer week

TopCoder has hosted two SRMs during the June 26 - July 2 week. First off, SRM 716 took place on Tuesday (problems, results, top 5 on the left, analysis). ACRush came back to the SRMs after more than a year, presumably to practice for the upcoming TCO rounds. He scored more points than anybody else from problem solving - but it was only good enough for the third place, as dotory1158 and especially K.A.D.R shined in the challenge phase. Well done!

Later on the same day, Codeforces held its Round 421 (problems, results, top 5 on the left, analysis). There was a certain amount of unfortunate controversy with regard to problem A, so the results should be taken with a grain of salt - nevertheless, TakanashiRikka was the best on the remaining four problems and got the first place. Congratulations!

The second TopCoder round of the week, SRM 717, happened on Friday (problems, results, top 5 on the left, analysis, my screencast). I had my solution for the medium problem fail, but even without that setback I would finish behind Deretin - great job on the convincing win!

Here's what that problem was about. You are given two numbers n (up to 10^9) and m (up to 10^5). For each i between 1 and m, you need to find the number of permutations of n+i objects such that the first i objects are not left untouched by the permutation. As an example, when n=0 we're counting derangements of each size between 1 and m.

My solution for this problem involved the Fast Fourier Transform because, well, I had a hammer, and the problem was not dissimilar enough to a nail :) And it failed because I've reused FFT code from my library, which I've optimized heavily to squeeze under the time limit in a previous Open Cup round, and to which I've introduced a small bug during that optimization :(

And on the weekend, two-person teams competed for the fame and monetary prizes at the onsite finals of CodeChef Snackdown 2017 (problems, results, top 5 on the left, analysis). Last year's winners "Messages compress" were in the lead for long periods of time, but in the last hour they tried to get three problems accepted but got zero, which gave other teams a chance. Team Dandelion seized that chance and won by solving 9 problems out of 10. Way to go!

Thanks for reading, and check back for the next week's summary.

FBHC2017 Finals

There was one quite important competition that I forgot to mention two summaries back: Facebook Hacker Cup 2017 onsite finals took place on June 14 (problems, results, top 5 on the left, analysis, my screencast). In this round I have managed to make three different bugs in my solution to the second problem and the way those bugs combined led to my program passing the samples almost by accident, but of course it did not pass the final testing. On the bright side, not spending more time on this problem allowed me enough time to solve all other problems, so maybe the three bugs were actually a crucial component of the victory :)

Thanks for reading, and check back for the hopefully more complete next week's summary!

Saturday, July 15, 2017

A dynamic nimber week

The June 19 - June 25 week did not actually have any rounds that I'd like to mention, so let me turn back to the problem from the previous week's summary.

You are given an acyclic directed graph with n<=15 vertices and m arcs. Vertices 1 and 2 each contain a chip. Two players take alternating turns, each turn consisting of moving one of the chips along an arc of the graph. The player who can't make a valid move loses. We want to know which player wins if both play optimally. Now consider all 2^m subsets of the graph's arcs; for how many of them does the first player win if we keep only the arcs from this subset?

Without the subset part, the problem is pretty standard. We need to compute the nimbers for all vertices of the graph, and the first player wins if and only if the nimbers for the first two vertices are different.
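For a single fixed graph, this standard check can be sketched as below. The function name and the input format (1-based vertices, arcs as a list of pairs) are just assumptions for the example.

```python
from functools import lru_cache

def first_player_wins(n, arcs):
    """arcs: list of (u, v) pairs with 1-based vertices; the graph must be acyclic."""
    out = {v: [] for v in range(1, n + 1)}
    for u, v in arcs:
        out[u].append(v)

    @lru_cache(maxsize=None)
    def nimber(v):
        # The nimber of a vertex is the mex of the nimbers of its successors.
        seen = {nimber(w) for w in out[v]}
        g = 0
        while g in seen:
            g += 1
        return g

    # Two independent chips: the first player wins iff the XOR of the two
    # nimbers is non-zero, in other words iff they differ.
    return nimber(1) != nimber(2)
```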

However, we do not have the time to even iterate over all subsets, and thus we cannot apply this naive algorithm. Dynamic programming comes to the rescue, allowing us to reuse some computations. The dynamic programming idea that is closest to the surface is: go in a topological ordering, and keep computing the nimbers. The nimber for a vertex depends on which arcs going from this vertex are included in the subset, and on the values of nimbers for the vertices reachable from this one, so our dynamic programming state should include only the values of nimbers for the already processed vertices, reducing the running time from 2^m to something around the n-th Bell number. That is still not too little, but it turns out this approach could be squeezed to pass.

However, it turns out there is another beautiful dynamic programming idea that helps us move to the running time on the order of 3^n. Instead of going vertex-by-vertex, we will now go nimber-by-nimber. For each consecutive nimber, we will try all possibilities for a subset of vertices having this nimber. What are the requirements on such a subset? From the definition of nimbers we get:
  1. For each vertex in this subset, there must be an arc to at least one vertex with each smaller nimber.
  2. There must be no arcs within this subset.
Condition 2 is pretty easy to check, but condition 1 is not. However, we can notice that we can instead check:
  1. For each vertex that still has no assigned nimber (and thus will eventually have a higher nimber), there must be an arc to at least one vertex in this subset.
  2. There must be no arcs within this subset.
The new condition 1 guarantees that all higher nimbers will have arcs to all lower nimbers, so by the time we reach a certain nimber, we don't need to care about arcs to lower nimbers anymore. Now we can see that the state of our dynamic programming can simply be the last placed nimber and the set of vertices that do not yet have a nimber.

Finally, we can notice that the last placed nimber does not actually take part in the computations at all, so we can reduce the number of states n times more to just remembering the set of vertices that do not yet have a nimber, yielding overall complexity of O(n*3^n).
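Here is a minimal sketch of this nimber-by-nimber dynamic programming. A few things in it are my assumptions for the example rather than part of the problem: vertices are 0-based with the chips on vertices 0 and 1, all arcs are distinct, the answer is an exact integer with no modulus, and the code counts the subsets where the two nimbers are equal and subtracts that from 2^m. In pure Python this would likely be too slow for n=15, so treat it as an illustration of the state and transitions only.

```python
def count_first_player_wins(n, arcs):
    """Count arc subsets in which the chips on vertices 0 and 1 get different nimbers."""
    out = [0] * n  # out[u] = bitmask of the targets of arcs leaving u
    for u, v in arcs:
        out[u] |= 1 << v

    def popcount(x):
        return bin(x).count("1")

    # any_cfg[S] = number of ways to choose arcs with both ends inside S.
    any_cfg = [1 << sum(popcount(out[u] & S) for u in range(n) if (S >> u) & 1)
               for S in range(1 << n)]

    def ways_between(U, T):
        # T gets the current smallest nimber, U = vertices still without a nimber.
        w = 1
        for v in range(n):
            if (U >> v) & 1:
                w *= (1 << popcount(out[v] & T)) - 1  # at least one arc into T
            elif (T >> v) & 1:
                w <<= popcount(out[v] & U)            # arcs from T into U are free
        return w  # arcs inside T are forbidden, which contributes a factor of 1

    # equal[S] = arc choices inside S that give vertices 0 and 1 the same nimber.
    equal = [0] * (1 << n)
    for S in range(1 << n):
        if (S & 3) != 3:
            continue
        total = 0
        T = S
        while T:  # enumerate non-empty subsets T of S
            if (T & 3) == 3:    # both chips get the current nimber: rest is free
                total += ways_between(S ^ T, T) * any_cfg[S ^ T]
            elif (T & 3) == 0:  # both chips still need an equal higher nimber
                total += ways_between(S ^ T, T) * equal[S ^ T]
            T = (T - 1) & S
        equal[S] = total
    return (1 << len(arcs)) - equal[(1 << n) - 1]
```

For example, with three vertices and arcs 0->2 and 1->2, exactly the two subsets containing a single arc are winning for the first player, so the function returns 2.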

Thanks for reading, and check back for the next week's summary!

An all-at-once week

Codeforces came back during the June 12 - June 18 week with its Round 419 (problems, results, top 5 on the left, analysis). Only Radewoosh and yutaka1999 could get all problems right, and Radewoosh has booked his (semi-) permanent place on the front page of Codeforces with this victory: he is now in top 10 by rating. Well done!

AtCoder Grand Contest 016 took place on the next day (problems, results, top 5 on the left, analysis, my screencast). It's not as if tourist needs an unusual strategy to win, but he successfully demonstrated that the AtCoder rules actually make it reasonable to withhold submissions until one has a solution for the last problem they intend to submit. As a theory, here's how this strategy might have helped here: maybe Gennady already had all solutions implemented by the 68th minute, but he saw that I had an incorrect attempt for one of the problems, and he was not sure about his solution for problem D. So he submitted everything else, and started testing the solution for D more thoroughly, as he knew that he'd have five minutes after I solve my last problem to still get the first place. Gennady, is this theory at least remotely close to reality? :)

The hardest problem of the round presented a peculiar combination of dynamic programming and nimbers, one that I don't recall seeing before. You are given an acyclic directed graph with n<=15 vertices and m arcs. Vertices 1 and 2 each contain a chip. Two players take alternating turns, each turn consisting of moving one of the chips along an arc of the graph. The player who can't make a valid move loses. We want to know which player wins if both play optimally. The problem so far would be very simple, of course, so here comes the twist: consider all 2^m subsets of the graph's arcs; for how many of them does the first player win if we keep only the arcs from this subset?

Thanks for reading, and check back soon for the next week's summary!

Sunday, July 9, 2017

A Dublin week

The June 5 - June 11 week was dominated by the main Google Code Jam elimination rounds.

First off, Code Jam Round 3 took place on Saturday (problems, results, top 5 on the left, analysis). The top 26 have qualified for the finals in Dublin, and kevinsogo was the only contestant to solve the very tricky last problem and still have time left for two more - congratulations on the first place!

I have contributed to the constructive problem trend with problem B: you are given a directed graph with n<=1000 vertices, and need to output any nowhere-zero flow in it with edge flows not exceeding n^2 by absolute value. Seymour's theorem shows that we can actually make do with values between -6 and 6, but such frugality was not required :)

One day later, not just two, but 21 more tickets to Dublin were up for grabs in Distributed Code Jam Round 2 (problems, results, top 5 on the left, analysis). Solving everything was not required to qualify, but it was certainly required to get into the screenshot on the left. Congratulations to fagu on being the fastest!

Thanks for reading, and check back for the next week's summary!

Monday, July 3, 2017

A week**7

TopCoder SRM 715 was the first round of the May 29 - June 4 week (problems, results, top 5 on the left, my screencast). It was nice to reduce the number of 3am rounds thanks to my United States trip :)

The medium problem continued the "constructive" trend on TopCoder. You are given four numbers: k, n1, n2, n3. You need to construct a valid Towers of Hanoi configuration that requires exactly k moves to be solved, has n1 disks on the first rod, n2 on the second one, and n3 on the third one.

Yandex.Algorithm 2017 Round 3 wrapped up the week, and also completed the selection of the 25 finalists (problems, results, top 5 on the left, analysis, overall standings). Despite the addition of a marathon round which should theoretically be less correlated with the algorithm rounds, the finalist cutoff just increased more or less proportionally, from 32 points from 3 rounds last year to 40 points from 4 rounds this year. Congratulations to all finalists!

In my previous summary, I have mentioned a problem from Round 2 of the same competition: consider all sequences of balls of k<=15 colors, with exactly a_i<=15 balls of the i-th color, and no two adjacent balls of the same color. Let's arrange them in lexicographical order. What is the number of the given sequence s in this order?

Finding the number of s is equivalent to counting the number of sequences coming before s in lexicographical order. Coming before in lexicographical order, in turn, means that some prefix of such a sequence would be equal to the corresponding prefix of s, and the next number will be smaller than the corresponding number of s. That allows us to split our problem into 15 simpler subproblems, each looking like: how many sequences of balls of k colors exist, with exactly b_i<=a_i balls of the i-th color, no two adjacent balls of the same color, and the first ball has color less than c, and not equal to d?

Here comes the main idea that I keep forgetting. Let's add balls into our sequence color-by-color. In order to not have adjacent balls of the same color in the end, it suffices to simply remember how many pairs of adjacent balls of the same color we have. In other words, having placed some amount of colors, for a total of t balls, we have t+1 positions where the balls of the next color can be placed, and some of those positions are special: we must place at least one ball in that position eventually, to avoid having two adjacent balls of the same color in the final position. We need to remember just the number of special positions, and do not need to remember which ones exactly are special.

When placing a new color which has a_i balls, we iterate over the number m of blocks of consecutive balls of this color we're going to have, and the number p of those blocks that will be inserted into special positions. Now we need to multiply several combination numbers (to choose p special positions, to choose m-p non-special positions, and to split a_i balls into m non-empty blocks), and we also know the new number of special positions which changes by -p+(a_i-m).

Finally, in order to deal with the requirements on the color of the first ball, we can start by processing the colors the first ball can be, and continue with the colors it can't be, and disallow placing balls into the first position on the second stage, which just reduces the number of available non-special positions by one.

Assuming we have k colors and at most a balls of each color, the running time of this approach is a product of:
  • k*a for iterating over the position of the first difference,
  • k for iterating over colors,
  • k*a for iterating over the number of special positions so far,
  • a for iterating over the number of blocks we form with the new color,
  • a for iterating over the amount of said blocks that go into special positions,
for a total of O(k^3*a^4).
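The core color-by-color step can be sketched as below. This simplified version is my own illustration: it only counts all sequences with no two adjacent balls of the same color, leaving out the restriction on the first ball and the outer loop over the position of the first difference, and the function name is made up for the example. It needs Python 3.8+ for math.comb.

```python
from math import comb

def count_no_adjacent_equal(counts):
    """counts[i] = number of balls of color i; returns the number of valid sequences."""
    dp = {0: 1}  # dp[j] = ways to arrange the colors so far with j special positions
    t = 0        # total number of balls placed so far
    for a in counts:
        if a == 0:
            continue
        nxt = {}
        for j, ways in dp.items():
            for m in range(1, a + 1):        # number of blocks of the new color
                split = comb(a - 1, m - 1)   # split a balls into m non-empty blocks
                for p in range(min(m, j) + 1):   # blocks put into special positions
                    place = comb(j, p) * comb(t + 1 - j, m - p)
                    if place == 0:
                        continue
                    nj = j - p + (a - m)     # each block of size s adds s-1 new pairs
                    nxt[nj] = nxt.get(nj, 0) + ways * split * place
        dp = nxt
        t += a
    return dp.get(0, 0)  # no special positions may remain at the end
```

For example, count_no_adjacent_equal([2, 2]) gives 2, corresponding to ABAB and BABA.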

Thanks for reading, and check back for the next week's summary!

Tuesday, May 30, 2017

A 7-time week

ACM ICPC 2017 World Finals headlined the last week (problems, results, top 12 on the left, our stream, text analysis, video analysis). The ITMO team was leading for quite some time, but they did not manage to solve problem J in time, which gave a chance to the other teams. They did not take advantage of that chance, however, and ITMO became 7-time world champions. Congratulations!

Problem D, while being a bit on the "professional" side, was quite cute. You are given 500000 top-left corners and 500000 bottom-right corners on the plane, and need to pick one of each to obtain a valid rectangle with maximum possible area.

Here's its analysis video, in case you give up :)

AtCoder Grand Contest 015 took place on Saturday, when most World Finals contestants should have already got back home (problems, results, top 5 on the left, my screencast with commentary, analysis). When just two last problems remained, I went for the harder one, and almost got it (got accepted in upsolving after about 5 more minutes) - but not quite. sky58, on the other hand, chose the right strategy and won - congratulations!

Problem C allowed multiple different solutions, each with a non-trivial observation and thus quite exciting to get. You are given a set of blue cells on the 2000x2000 grid that forms a forest with regard to 4-connectivity, and 200000 queries. Each query asks: if we take a certain sub-rectangle of our grid, how many connected components of blue cells are there if we look just at that sub-rectangle?

A few hours later, Yandex.Algorithm 2017 Round 2 provided another chance to score place points towards qualification for the final round (problems, results, top 5 on the left, my screencast, analysis). Tourist threw all strategy considerations out of the window by solving all problems with 15 minutes remaining, while others have barely managed to solve 5 out of 6. Amazing performance!

The solution to problem E relied on a standard idea which I forgot, so maybe explaining the solution in my blog will help me remember :) Here's what it was about: consider all sequences of balls of k<=15 colors, with exactly a_i<=15 balls of the i-th color, and no two adjacent balls of the same color. Let's arrange them in lexicographical order. What is the number of the given sequence in this order?

Finally, Codeforces held the online mirror of Helvetic Coding Contest 2017 which I have mentioned earlier (problems, onsite results, online results, online top 5 on the left, analysis). Congratulations to the sweet team on the victory (and their penalty time is better than ours from the onsite contest, too)!

In my previous summary, I have mentioned a TopCoder problem: you are given the distances from two vertices to all others in an unknown undirected graph with 50 vertices. You need to construct any graph with such distances from the first two vertices.

Consider an arbitrary pair of vertices. If their distances to vertex 1 differ by at least 2, then we can't have an edge between them. The same is true for their distances to vertex 2. This is relatively easy to spot, but here comes the hard part: if neither of the above is true, in other words if both pairs of distances differ by at most 1, then we can assume to always have this edge in our graph. Because if we don't have it, we can add it and distances to vertices 1 and 2 will not be affected for the vertices we just connected, and thus for all other vertices as well.

Now, since for each edge we can determine whether we have it in our graph or not, all that remains is to construct the graph, and check if the distances to vertices 1 and 2 come out as expected.
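A minimal sketch of this construct-and-verify approach follows. The function name, the 0-based numbering (the two special vertices are 0 and 1) and the convention of returning None when no graph exists are assumptions made for the example, as is the assumption that all given distances are finite.

```python
from collections import deque

def build_graph(d1, d2):
    """d1[v], d2[v] = required distances from vertices 0 and 1; returns adjacency lists or None."""
    n = len(d1)
    adj = [[] for _ in range(n)]
    for u in range(n):
        for v in range(u + 1, n):
            # Add every edge that cannot contradict either distance array.
            if abs(d1[u] - d1[v]) <= 1 and abs(d2[u] - d2[v]) <= 1:
                adj[u].append(v)
                adj[v].append(u)

    def bfs(src):
        dist = [-1] * n
        dist[src] = 0
        queue = deque([src])
        while queue:
            x = queue.popleft()
            for y in adj[x]:
                if dist[y] < 0:
                    dist[y] = dist[x] + 1
                    queue.append(y)
        return dist

    # The maximal graph either reproduces both distance arrays or nothing does.
    if bfs(0) != list(d1) or bfs(1) != list(d2):
        return None
    return adj
```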

Thanks for reading, and check back next week!

Wednesday, May 24, 2017

ACM ICPC 2017 World Finals stream

ACM ICPC 2017 World Finals start tomorrow at 9:00 local time. There will be quite a few ways to follow the event, the most prominent being ICPC Live. Together with tourist and Endagorion, we have decided to provide another perspective on this competition: we'll try to solve the problems as soon as we get the statements (rumor has it, we'll be able to submit on Kattis as well), and will stream our screen and our discussions on Youtube! Tune in tomorrow around the time World Finals starts, although we might start a bit later.

We're not sure how this will work out, and would appreciate any advice in comments! One thing that we're not yet sure about is whether we should talk in English or in Russian. The latter should be more productive and thus more realistic, but will naturally be harder to follow for most of the audience. On the other hand, the sound quality might be so bad that it won't even matter :)

Monday, May 22, 2017

An almost Rapid week

Last week was relatively calm, with just two competitions that I want to mention, both on Saturday. First, TopCoder Open 2017 Round 2A has significantly raised the stakes compared to Round 1, with just 40 top finishers qualifying (problems, results, top 5 on the left, my screencast). I have enjoyed the medium problem of this round the most, as it is quite rewarding to come up with an easy-to-code beautiful solution after wasting some time coding a very tricky one that comes to one's mind first. An especially rewarding step is removing all code written for the tricky solution (screencast position) :)

Here's what the problem was about: you are given the distances from two vertices to all others in an unknown undirected graph with 50 vertices. You need to construct any graph with such distances from the first two vertices.

With just a few minutes break, Codeforces hosted its Round 415 (problems, results, top 5 on the left). With the only successful solution to problem E coming from a contestant with no other solved problems, it was the speed that decided the winner, and Radewoosh was almost half an hour faster than the rest. Congratulations!

In my previous summary, I have mentioned a Codeforces problem: you are given a connected undirected graph with at most 300000 edges. You suspect that this graph was constructed in the following manner: we started with a graph with no edges and assigned each vertex an integer label, then connected all pairs of vertices for which labels differed by at most one. Your goal is to return a set of labels that could have been used to construct the given graph, or report that there isn't any.

First of all, shifting all labels by a constant does not change the answer, so let's pick a vertex A and say that its label is 0. Now, the labels for all other vertices are almost uniquely determined: it's not hard to see that for all vertices labeled not 0, the absolute value of the label is equal to the shortest distance from A. So, we just need to determine which sign each label will have, and which vertices (out of those at distance 1) will have label 0.

Here we can see that vertices at distance 1 from A are the most tricky part, so let's concentrate on them. We can assign three labels to them: let's say those with label 1 are set X, with label 0 are set Y, and with label -1 are set Z. By the problem statement, all those sets must be cliques, and additionally we must have all edges between X and Y, and between Y and Z, but no edges between X and Z.

Let's assume we have a representative B from X, and a representative C from Z. Then the label of each vertex can be determined trivially: if it's connected only to B, it's 1, only to C, then -1, to both, then 0.

It doesn't matter which representatives we pick - in fact, it's not hard to see that we need to pick any two vertices B and C that are connected to A but not between themselves. If we remember that A can also be picked freely, our goal now is to find a chain of two edges such that its endpoints are not connected.

And this, in turn, can be done like this: first, let's find any missing edge. The graph is connected, so there's a path between its ends. If this path is of length 2, we've found what we need. If it's longer, consider its next-to-last vertex. If it's connected to its first vertex, we've found what we need. If not, then we can remove the last vertex and obtain a shorter path such that its ends are not connected. By repeating the process, we will eventually find the required path of length 2.
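The shrinking process can be sketched like this. The function name, 0-based vertices, and the (B, A, C) return order are assumptions for the example; the code also assumes a connected simple graph, and uses BFS to get the initial path even though any path between the ends of the missing edge would do.

```python
from collections import deque

def find_open_triple(n, edges):
    """Return (B, A, C) with edges B-A and A-C but no edge B-C, or None for a complete graph."""
    adj = [set() for _ in range(n)]
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    # Any vertex of degree below n-1 is an end of some missing edge.
    first = next((u for u in range(n) if len(adj[u]) < n - 1), None)
    if first is None:
        return None
    last = next(v for v in range(n) if v != first and v not in adj[first])
    # Find some path from first to last (the graph is assumed connected).
    parent = {first: None}
    queue = deque([first])
    while queue:
        x = queue.popleft()
        for y in adj[x]:
            if y not in parent:
                parent[y] = x
                queue.append(y)
    path = []
    x = last
    while x is not None:
        path.append(x)
        x = parent[x]
    path.reverse()
    # Invariant: the two ends of path are not connected by an edge.
    while len(path) > 3:
        if path[-2] in adj[path[0]]:
            return (path[0], path[-2], path[-1])
        path.pop()
    return tuple(path)
```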

Now we have solved the problem for vertices with labels -1, 0 and 1, but how do we determine the sign of the label for the remaining vertices? Well, for vertices with label 2/-2, we can use connectivity to any vertex with label 1 as the differentiating factor, and so on.

Finally, having determined all labels, we need to check if our graph does in fact correspond to those labels. The simplest way to do that seems to be: let's check that for all edges in our graph the difference between the labels of the ends is at most one, and also check that the total number of edges in our graph matches the total number of pairs of vertices with labels differing by at most 1. After the first check, the only way we can have an incorrect graph would be having not all required edges, and the second check takes care of that.
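This final check can be sketched as follows, assuming the edges are given as distinct pairs of 0-based vertices without self-loops (the function name is made up for the example). Counting the pairs of vertices with labels differing by at most 1 only needs the number of vertices carrying each label.

```python
from collections import Counter

def labels_match_graph(labels, edges):
    """labels[v] = integer label of vertex v; edges = list of distinct (u, v) pairs."""
    # First check: every existing edge joins labels differing by at most one.
    if any(abs(labels[u] - labels[v]) > 1 for u, v in edges):
        return False
    # Second check: the number of edges equals the number of pairs that must be connected.
    cnt = Counter(labels)
    expected = sum(c * (c - 1) // 2 + c * cnt.get(x + 1, 0) for x, c in cnt.items())
    return expected == len(edges)
```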

Thanks for reading, and check back here and in my Twitter for news from ACM ICPC World Finals this week!

Another speaking week

Just like the previous week, the fun of the May 8 - May 14 week started on Thursday with a Codeforces round, this time with Playrix Codescapes Cup (problems, results, top 5 on the left, analysis). Even an incorrect submission for E could not stop tourist, as he still won thanks to solving problem G and having many more points than everybody else on F. Well done!

The next round of the week was also a named Codeforces round - this time Tinkoff Challenge Final Round (problems, results, top 5 on the left, analysis, my screencast with commentary). This time explaining everything aloud did not lead to a disastrous performance for me (finally!). Maybe the quality of explanations suffered :) V--o_o--V was still significantly faster, so congratulations on the victory!

Problem D was a nice exercise in discovering a reliable way to detect a relatively simple pattern. You are given a connected undirected graph with at most 300000 edges. You suspect that this graph was constructed in the following manner: we started with a graph with no edges and assigned each vertex an integer label, then connected all pairs of vertices for which labels differed by at most one. Your goal is to return a set of labels that could have been used to construct the given graph, or report that there isn't any.

Later on Saturday, Google Code Jam Round 2 has narrowed the field to just 500 competitors (problems, results, top 5 on the left, analysis). Congratulations to jsannemo on the victory - quite impressive form for the KTH team before the upcoming World Finals, with this win and simonlindholm's win a week earlier.

Yandex.Algorithm 2017 Round 1 took place early on Sunday (problems, results, top 5 on the left, analysis, my screencast). Um_nik was flawless on the five easier problems and correctly noticed the fact that problem E was also, in fact, not very hard. Well done!

Just 80 minutes later, Russian Code Cup 2017 Elimination Round has revealed the 55 finalists (problems, results, top 5 on the left, analysis, my screencast). LHiC did not make any mistakes, and that turned out to be the key to getting the first place. Congratulations!

And finally, Distributed Google Code Jam Round 1 wrapped up the week (problems, results, top 5 on the left, analysis). mk.al13n was ten minutes faster than the rest of the pack in this still quite unusual format with parallel computations. Great job on the victory!

In my previous summary, I have mentioned an AtCoder problem: you are given two trees on the same set of vertices, one blue and one red. You want to convert the blue tree into the red one, step-by-step. At each step, you must take any path consisting of blue edges, add a red edge connecting its endpoints, but remove one of the edges of the path. After n-1 steps all blue edges will be removed, and n-1 red edges will be added, and you want those edges to form the required red tree.

The key idea in this problem is: let's look at the process from the end. Before the last step, we have just one blue edge connecting vertices, say, A and B, so our only option is to remove that edge and add a red edge connecting A and B. Now in the next-to-last step, we must either do the same, or make use of the last blue edge: for example, we can remove blue edge A-C, and add red edge B-C. After some staring at the paper, one can figure out what this means: first, we need to find an edge that is both blue and red for the last step, and then we need to contract it - unite its ends together into one vertex. Then, we need to find an edge that is both blue and red in the resulting graph (it might either be blue and red in the beginning, or be a result of a merge of different blue and red edges during the contraction), and contract it again, and so on until the graph is just one vertex.

Now it becomes clear that it doesn't really matter in which order we do the contractions, as they never make things worse. So we should just repeatedly perform any available contraction. There's some technical mastery involved in making the process run in O(n*polylog(n)) time, but that part is relatively standard and I don't want to focus on it too much. You can check the analysis for more details.

Thanks for reading, and check back soon for the last week's summary!