Tuesday, September 16, 2014

More on the Hamiltonian Plumber

This Sunday, I've launched a website with the goal of learning about good testcases that break a heuristic solution for a decisive TopCoder Open problem that can be reduced to Traveling Salesman: http://hamiltonianplumber.appspot.com/

So far, nobody except myself has submitted a testcase that makes the solution do at least two iterations — in other words, greedy improvement works in all submitted testcases. This is quite disappointing, so I'm wondering what's the reason for that. It could be:
  1. The challenge is not interesting enough, so people either don't bother at all or try a few manual tests.
  2. The website is confusing, so people don't understand what's really going on.
  3. The website is broken, so people submit good testcases but they are scored incorrectly.
  4. Something else? Please share what you don't like about the challenge!
Let me also share a simple strategy that allows to get a score higher than 1.0. It's actually quite straightforward - we can just try random big tests until we find one that scores more than 1.0. One might need to try several tests before that happens, and doing that using the website is slow and clumsy. However, the website actually has the code available for download (http://hamiltonianplumber.appspot.com/source.html), so one can quickly craft a local stress-test and find a case that requires many attempts. Since you don't know the hidden random seed, a testcase that requires two attempts on your machine might still be solvable in one on the server, but if you find a testcase that needs ten attempts, chances are your score on the server will be more than 1.0, too.

Sunday, September 14, 2014

This week in competitive programming

There were no regular competitions this week, so it's a good time to return to a past problem.

Let's talk about TopCoder Open 2014 Round 3A hard problem "PlumbersCoins" (previous post).

In short, the problem statement is: you are given at most 50 interesting points on a line that you need to visit in any order starting from point 0. In addition to moving along the line at one unit per second, you can use teleports between the given pairs of locations. The teleports are non-interleaving (if one teleport connects points a < b, and the other connects points c < d, then either b < c or d < a), and there are at most 25 teleports, each taking exactly one second to use. What's the shortest time required to visit all interesting points?

During the competition, I didn't notice that the teleports are non-interleaving, and thus the problem became much harder. The only thing I could come up with was reducing the problem to the Traveling Salesman problem, so I coded a heuristic solution for the Traveling Salesman problem that I remembered from IOI 2002 practice problem "red": start with a random path and repeatedly reverse its subpaths while the solution improves, then repeat everything until we run out of time. Surprisingly, this solution passed the system tests!

But then I started to think: is this really surprising? Does there exist a valid testcase that makes this solution fail?

I'm relying on you to find out :) Here's a website where you can try your testcases:

http://hamiltonianplumber.appspot.com/

Please remember to sign in at the bottom of the page to be included in the scoreboard (and so that others can't see your last testcase :))

Monday, September 8, 2014

This week in competitive programming

Last Monday, the Summer 2014 Petrozavodsk training camp came to its conclusion (day 1, day 2, day 3, day 4, day 5, day 6, day 7, day 8, day 9, overall stats, top 5 on the left). Some of the best Russian, Ukrainian, Belarusian, Romanian, Armenian and Kazakhstani teams started their preparation for the new ACM ICPC season that will end with the World Finals in Marrakech in May 2015. Unfortunately, for the first time in my memory, several very strong teams including the World Finals gold medalists Moscow State University team did not participate due to a schedule conflict: many contestants opted for summer internships in technology companies and could not come to Petrozavodsk. Nevertheless, the remaining teams still tried their best to solve problemsets that are often harder than the World Finals themselves. The team of Gennady Korotkevich (he was joined by different ITMO students at each contest) got a clear overall first place by winning most contests, but note that there's still no decision whether Gennady will take part in official ACM ICPC contests in this season (or this decision is a tightly kept secret). The Moscow Institute of Physics and Technology team with the team members from two different last year teams placed clear second, winning two contests and performing well in most others — they will definintely be a team worth watching.

TopCoder SRM 632 took place on Friday afternoon (problems, results, top 5 on the left, my screencast). The medium problem involved a nice graph theory observation: one needed to find the maximum flow in a graph which has different integer powers of 3 as edge lengths. Can you see how we can use the fact that the edge lengths are so exponentially different to create a simple solution?

Codeforces Round 265 rounded up the week on Sunday night (problems, results, top 5 on the left). Tourist has continued his 2014 unbeaten run in Codeforces — great job! — but this time the fight was really close. Both myself and Gennady followed essentially the same strategy during the round, solving 4 out of 5 problems in A, C, B, D order, and looking to find bugs in solutions of other contestants for problem A in between to score some challenge points. Gennady has managed to find 6 incorrect solutions for A while I stopped at 3 incorrect solutions for A and one incorrect solution for B, and those two extra challenges have earned him the first place. The system test showed that there were no incorrect solutions for problem A in my room left standing, so I didn't have a chance to catch Gennady just by challenging solutions for problem A.

However, that same room scoreboard does show my chance to get ahead of Gennady: 3 submissions have failed the system test for problem C. During the round, I have submitted a very simple solution for problem C and thus didn't expect others to submit incorrect ones — it was a big mistake in my thinking. The solutions for C were very short and easy to understand, and thus spotting the incorrect ones could have been easy. What's more, several contestants scored a lot of challenge points by challenging problem C, so I could have noticed that problem C is vulnerable by looking at their stats during the round — but it did not occur to me to do that. Lesson learned: next time I'll do my best to use all information that I have access to, including which problems are being actively challenged by others.

In other news, the round featured a very difficult technical problem E that was solved by just one contestant. Coincidentally, this problem also featured a graph with exponential edge lengths: you were given a graph with at most 100000 vertices, 100000 edges, and each edge length was an integer power of two between 2 and 2100000 (this time the edge lengths were not necessarily different). You were asked a very simple question: to find the shortest path from one vertex to another, suggesting that Dijkstra's algorithm is the way to go. But how does one handle such enormous edge lengths?

Thanks for reading, and check back next week!

Monday, September 1, 2014

This week in competitive programming

Last week had two international programming contests. The first one was Codeforces Round 263 on Tuesday (problems, results, top 5 on the left), won by WJMZBMR who was the only one able to solve all problems without mistakes. Great job!

Then, TopCoder SRM 631 took place on Saturday night (problems, results, top 5 on the left). The hardest problem was a data structure problem: you were given a rooted tree, and had to handle two types of requests: pick any subtree of the tree, detach it from the tree and attach it somewhere else in the tree; given two nodes in the tree such that one node is an ancestor of the other, find the maximum number of a vertex on the path between them.

Such problems usually have several working approaches:
  • implement all operations in a straightforward O(n) manner, so that the entire problem needs O(n2). This is too slow, but we can then optimize the constant hidden in the O-notation a lot and squeeze it under the time limit.
  • implement all operations using a complex data structure so that they take O(log(n)), so that the entire problem needs O(n*log(n)). This is usually the fastest approach, but it might require a very complex data structure - in this case, one could use a link/cut tree.
  • use sqrt-decomposition so that each operation takes O(sqrt(n)), for the total running time of O(n*sqrt(n)). In this particular problem sqrt-decomposition means splitting all queries into blocks of sqrt(n), and shrinking the tree to only contain interesting vertices for each block of queries.
I like sqrt-decomposition more for this problem since it's really easy to implement.

Thanks for reading, and check back next week!

Monday, August 25, 2014

This week in competitive programming

Last week was very quiet, with just the TopCoder SRM 630 on Friday (problems, results, top 5 on the left). Egor has beaten the SRM-at-5am odds and won the round by just 1.02 points - congratulations! I was not brave enough to take part in the middle of the night, so I don't have any insights to share about the problems.

Thanks for reading this short summary, check back next week for a more interesting post :)

Saturday, August 16, 2014

This week in competitive programming

On Tuesday, TopCoder SRM 629 has fulfilled two characteristic features of a good ICPC contest: nobody solved all problems, but each problem was solved by somebody (problems, results, top 5 on the left, my screencast). These properties are not as important for the TopCoder format, of course, since the score of a problem decreases with time, and thus even if several people solve all problems, they are still differentiated quite well.

Two things made this possible: the easy problem had a very subtle trick, while the hard problem was just hard.

The easy problem had a real-world statement: you have a rectangular hole in the ground, and want to cover it completely using several rectangular boards, placing each board in such a way that its sides are parallel to the sides of the hole. In order for the boards not to fall into the hole, you want to place the boards in such a way that all four corners of each board are not inside the hole and not on its boundary. Given the size of the hole and the sizes of the boards you have, what is the smallest number of boards you need to cover the entire hole?

When a problem has a statement that relies on common sense instead of formal maths, one needs to formalize the statement before solving the problem. In this case, we need to formalize the "all four corners are outside the hole" part and the "covers completely" part. And the main trick is that those two are formalized differently: for all corners to be outside the hole, we need the height of the board to be strictly greater than the height of the hole (or the same for width), while covering the hole completely requires that sum of the widths of the boards is greater than or equal to the width of the hole (or the same for height). It was very easy to miss the "or equal" in the second formalization, and many competitors including the match winner K.A.D.R made that mistake. And because the formalization step is usually quite simple and boring, we're simply not used to looking for mistakes in that step - so even re-checking the solution won't help since one would compare the program to the formalized version of the problem in one's head, and find no discrepancies.

The hard problem asked about a very simple thing: given N, K and MOD, calculate S(N,K) modulo MOD, where S(N,K) is the sum of all possible products of exactly K different integers, each between 1 and N. For example, S(3,2) = 1*2+1*3+2*3 = 11. It's not hard to come up with the following reccurrence relation: S(N,K) = S(N-1,K)+N*S(N-1,K-1) - it represents whether we take the number N into the product or not. However, N was up to 109, and K was up to 2000, so simply using that formula would be too slow.

K.A.D.R has come up with a brilliant insight which I want to share with you. He has noticed that when we fix the value of K, S(N,K) is a polynomial in N. For example, S(N,1)=N*(N+1)/2=1/2*N2+1/2*N. It's not hard to prove by induction that S(N,K) is a polynomial of degree 2K. One might now start constructing this polynomial in order to find its value for the given N - but that is quite hard, too. What Iaroslav did instead was to simply compute the value of this polynomial in points 1, 2, ..., 2K: S(1,K), S(2,K), ..., S(2K,K) can be computed in O(K2) time using the above recurrence relation very easily. And computing the value of a polynomial of degree 2K in point N given its values in 2K points can be done using a standard interpolation algorithm in O(K2), too!

Congratulations Iaroslav on this amazing solution and on winning the match!

On Friday, the Google Code Jam became yet another major contest won by Gennady in 2014 (problems, results, top 5 on the left). The Code Jam's scoreboard has loosely followed the pattern of the recent Yandex contest: at the end of the competition, Gennady was in 3rd place behind Evgeny 'eatmore' and Ivan 'mystic'. However, as the system test results for the large inputs were announced, it turned out that they were both incorrect in their attempts at solving F-large, while all of Gennady's submissions stood and he won the contest. Once again, making sure that one's solution are correct paid off - congratulations!

Second-placing Evgeny was really close to winning - the code he used for F-small looks good in general and should be able to solve F-large just as well - but his submission failed, indicating that he probably has some very small bug somewhere.

Among the problems of the final round, I've enjoyed problem D the most. You have up to 100 different candies, and for each pair of candies you prefer one of them to the other. Your preferences are not necessarily transitive - you might prefer A to B, B to C, and C to A. You're trying to pick your most favorite candy by considering all candies one by one and comparing each candy with the "current best" candidate. It's not hard to see that the candy chosen to be the most favorite depends on the order you consider the candies in. For example, in the above triangle, processing the candies in order ABC will lead to candy C being declared the most favorite, while ACB will make B look like the best candy. Given the matrix of preferences, and one specific candy, can you make that candy your favorite by considering all candies in the right order? If yes, you also need to find the lexicographically smallest such order.

The picture of Gennady on the right is from this article (which does have a few inaccuracies - but good job on publishing the article so quickly!)

Thanks for reading, and see you next week!

Tuesday, August 12, 2014

This week in competitive programming

Tourist has started the week in his usual fashion by winning Codeforces Round 260 on Friday (problems, results, top 5 on the left). He has participated in just 4 regular Codeforces rounds in 2014 - and he has won them all. Amazing consistency!

TopCoder Open Algorithm Round 3B, like most TCO rounds, took place on Saturday (problems, results, top 12 on the left, my screencast of the parallel round). Twelve more spots in one of the most prestigious tournaments in competitive programming were given out. The problemset was destined to make the contestants very nervous :) The 500 problem did not require any sophisticated algorithms or ideas, but required a small insight and a lot of attention to detail. Most contestants tried to solve it - but just 4 contestants managed to solve it correctly (3 out of those 4 are in the top 12 to the left), while many top rated favorites had a bug in their solutions and failed. This problem also played a very important part for some people who didn't solve it, as they were able to challenge a lot of incorrect solutions and gain a spot in top 12 that way.

The 1000 problem, on the other hand, had a really simple problem statement but required great combinatorial prowess. 4 out of 5 people who submitted it got it right, so it was a safe ticket to the TopCoder Open - provided you could actually get a solution that passes the example cases. Congratulations to wata on solving it very quickly and winning the round with a commanding margin!

The problem asked to count the number of trees on the given vertices (at most 50) that share at least the given number of edges with one specific given tree. Can you do it?

Finally, MemSQL Start[c]UP 2014 Final Round was held on Sunday (problems, onsite results, online results, top 5s on the left). The onsite round was for those willing to travel to the MemSQL office in San Francisco, and the online round was for the rest of us, using the same problemset. This time the problemsetters didn't worry that much about having easy problems in the set, so the first three problems all had a hundred thousand numbers in the input and required O(n) or at least O(nlogn) solutions. This has unexpectedly bitten the contestants using Java since two out of those three problems involved sorting hundred thousand integers - and it turned out that using Arrays.sort is not good enough, since it uses a deterministic quicksort algorithm and thus can be made to work in O(n2) by a specially crafted testcase. Sure enough, some contestants did come up with such testcase. Lessons learned: either shuffle the array before sorting using time-based randomness, or use Arrays.sort on arrays of objects, not primitives, since that version is not quicksort.

The hardest problem was not solved, is still not solved by anybody in the problem archive, and I still have no clue how to solve it. You are given a rooted tree of at most 250 vertices, and each node of the tree is either a leaf or has exactly two children. Each leaf has a number associated with it. Two players, maximizer and minimizer, are playing a game by making turns in order. At each turn, a player must pick a node with two children that are both leaves, erase those leaves, and pick one of their two numbers as the number for the node. The game ends when just the root is left in the tree, and the score of the game is the number picked for it. Maximizer wants to maximize the score, while minimizer moves in the other direction. What will be the score if both players do their best?

Thanks for reading, and see you next week!