Monday, September 29, 2014

This week in competitive programming

Another quite typical week drifted past, with a TopCoder round and a Codeforces round. First, TopCoder SRM 634 took place very early on Friday morning (problems, results, top 5 on the left). The early start was too much for me, but I want to congratulate mikhailOK and piob on solving all three problems and thus leaving the rest of the competition far behind — great job!

Codeforces Round 270 occupied Sunday evening (problems, results, top 5 on the left). The problemset had a very nice common background story: each problem explained how one can come up with problems of this type — consider reading them all if you're looking for inspiration for creating new problems.

The hardest problem brought an interesting aspect of programming competitions into the limelight: sometimes the expected algorithm is so complicated that the constant hidden in the O-notation of its running time becomes large enough for straightforward, efficiently implemented solutions with worse asymptotic complexity to run faster. In this particular problem, the reference solution from the editorial has complexity O((n log n)^1.5), while most, if not all, accepted solutions have complexity O(n^2). With n as high as 200000, one might think that the extra square root of n would dominate, but since the O(n^2) solutions are very straightforward and thus easy to optimize, for example using bitwise operations, this is not the case.
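This is not the contest problem itself, but here's a tiny self-contained illustration of the kind of bitwise optimization involved: storing sets as std::bitset lets an O(n^2) pairwise loop process 64 elements per machine word, and count() typically compiles down to hardware popcount instructions. The sizes and data below are made up for the sake of the example.

// Illustration only: an O(n^2) pairwise intersection count that is fast in
// practice because each inner step handles 64 bits at a time via std::bitset.
#include <bitset>
#include <cstdio>
#include <vector>

const int MAXN = 2000;  // illustrative size, not the contest limit

int main() {
  int n = MAXN;
  std::vector<std::bitset<MAXN>> rows(n);
  // Fill the bitsets with an arbitrary pattern just to have some data.
  for (int i = 0; i < n; i++)
    for (int j = 0; j < n; j++)
      if ((i * 31 + j * 17) % 3 == 0) rows[i][j] = 1;

  long long total = 0;
  // n^2 pairs, but each pair is processed in roughly n/64 machine words;
  // count() is usually implemented with hardware popcount.
  for (int i = 0; i < n; i++)
    for (int j = i + 1; j < n; j++)
      total += (long long)(rows[i] & rows[j]).count();

  std::printf("%lld\n", total);
  return 0;
}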

ilyakor's accepted solution is the fastest, solving the worst testcase in just 2.5 out of 7 seconds by reusing publicly available SSE3-based code for counting the number of elements in a bitset — a great use of all available resources to solve the problem! But as far as programming competitions in general are concerned, we've mostly given up on separating O(n^2 * polylog(n)) algorithms from O(n^3) ones, and it's sometimes even difficult to separate O(n * polylog(n)) algorithms from O(n^2) ones. It's a pity since we eliminate a large class of quite beautiful algorithms and data structures this way :( Any ideas on reversing the trend?

Thanks for reading, and check back next week!

Monday, September 22, 2014

This week in competitive programming

TopCoder SRM 633 took place on Wednesday (problems, results, top 5 on the left). The hard problem was nice, but the medium problem was even nicer — kudos to the problemsetter, cgy4ever! You were given two trees on the same set of vertices, each vertex had a (possibly negative) score assigned to it, and you needed to find a subset of vertices that is connected according to both trees and has the highest total score. The question is very simply stated, and yet the solution is quite challenging and creative — that's what great programming contest problems look like! Can you see it?

A flawless performance by Gennady guaranteed him a clear first place. Great job!

Codeforces Round 268 happened on Saturday (problems, results, top 5 on the left). I skipped this round, but I can still see that Pavel achieved a commanding victory thanks to 10 challenges in a contest where many top scorers couldn't find any — amazing!

We've also had some very nice developments in http://hamiltonianplumber.appspot.com/ during the week: +Boris Minaev found a creative way to break the 2-opt heuristic with random restarts. To quote him:

"my idea is very simple - creating a big test is hard, so we can create a small test (for example with 10 vertices) and then repeat it 5 times. By repeating I mean placing vertices of different groups far from each other. In such test case the correct order will be (visit all vertices of 1st group) -> (visit all vertices of 2nd group) -> ... (visit all vertices of 5th group). If we now look at number of iterations that 2-opt solution need to find a correct answer, it would be something like (number of iterations it needs to find a solution for one group)^5.

So we just need to create a small test where 2-opt solution works bad. This we can do with just a stress-testing. It's easy to find a test which needs ~10 iterations of 2-opt, which gives ~10^5 iterations in total."

This testcase forces the heuristic to make 230932 attempts before finding the shortest path for the first time:

{690, 9932, 10000690, 10009932, 20000690, 20009932, 30000690, 30009932, 40000696, 40009883}
{1175, 1327, 1564, 2263, 2715, 7246, 7674, 7997, 8334, 8511, 10001175, 10001327, 10001564, 10002263, 10002715, 10007246, 10007674, 10007997, 10008334, 10008511, 20001175, 20001327, 20001564, 20002263, 20002715, 20007246, 20007674, 20007997, 20008334, 20008511, 30001175, 30001327, 30001564, 30002263, 30002715, 30007246, 30007674, 30007997, 30008334, 30008511, 40001663, 40003713, 40003852, 40007375, 40008436, 40009682, 40009842}

// please don't upload it to the server just for the sake of getting to the scoreboard — it actually requires quite a lot of resources to judge!
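To make the construction concrete, here is a rough sketch of the "replicate a small hard test with large offsets" step. The small test and the output format are placeholders (the real input also contains teleport endpoints, which would be shifted the same way); the 10^7 offset matches the example test above.

// Sketch of the construction: take a small test that is already hard for
// 2-opt and repeat it several times, shifting each copy by a large offset
// so that the groups end up far apart on the line.
#include <cstdio>
#include <vector>

int main() {
  // Placeholder: a small group of coordinates that already needs ~10
  // restarts locally (found by stress-testing).
  std::vector<long long> smallTest = {100, 2300, 4700, 5100, 6400,
                                      7200, 8300, 8900, 9300, 9800};
  const long long OFFSET = 10000000;  // keeps the copies far from each other
  const int COPIES = 5;

  std::vector<long long> bigTest;
  for (int c = 0; c < COPIES; c++)
    for (long long x : smallTest)
      bigTest.push_back(x + c * OFFSET);

  for (long long x : bigTest) std::printf("%lld ", x);
  std::printf("\n");
  return 0;
}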

If you want to test your Hamiltonian path implementation, you can construct the actual graph using the first part of the code in http://hamiltonianplumber.appspot.com/source.html.

And finally, it's time for some celebration as this is the 52nd post titled "This week in competitive programming" (here's the first one), meaning that this weekly programming contest review has been going for an entire year! I guess the expected question is: what do you think I should improve in this blog?

Hoping for sincere feedback, and see you next week in any case!

Tuesday, September 16, 2014

More on the Hamiltonian Plumber

This Sunday, I launched a website with the goal of learning about good testcases that break a heuristic solution for a decisive TopCoder Open problem that can be reduced to the Traveling Salesman Problem: http://hamiltonianplumber.appspot.com/

So far, nobody except me has submitted a testcase that makes the solution do at least two iterations — in other words, greedy improvement works in all submitted testcases. This is quite disappointing, so I'm wondering why. It could be:
  1. The challenge is not interesting enough, so people either don't bother at all or try a few manual tests.
  2. The website is confusing, so people don't understand what's really going on.
  3. The website is broken, so people submit good testcases but they are scored incorrectly.
  4. Something else? Please share what you don't like about the challenge!
Let me also share a simple strategy that allows one to get a score higher than 1.0. It's actually quite straightforward: we can just try random big tests until we find one that scores more than 1.0. One might need to try several tests before that happens, and doing that through the website is slow and clumsy. However, the website has the code available for download (http://hamiltonianplumber.appspot.com/source.html), so one can quickly craft a local stress-test and find a case that requires many attempts; see the sketch below. Since you don't know the hidden random seed, a testcase that requires two attempts on your machine might still be solvable in one attempt on the server, but if you find a testcase that needs ten attempts, chances are your score on the server will be more than 1.0, too.
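Here is a rough sketch of such a local stress-test driver. Both helpers are placeholders: randomBigTest() stands for a generator of random inputs in the site's format, and attemptsNeeded() stands for running the code downloaded from source.html and counting how many random restarts it needed before first finding the optimal answer.

// Sketch of a local stress-test: keep generating random tests and remember
// the one that makes the heuristic need the most restarts.
#include <cstdio>
#include <random>
#include <vector>

std::mt19937 rng(20140916);

// Placeholder: generate ~50 random interesting points on the line.
std::vector<long long> randomBigTest() {
  std::vector<long long> points(50);
  for (long long &p : points) p = rng() % 50000000;
  return points;
}

// Placeholder: plug in the solution from source.html here and return how
// many random restarts it needs before it first finds the optimal answer.
int attemptsNeeded(const std::vector<long long> &test) {
  (void)test;
  return 1;  // stub
}

int main() {
  std::vector<long long> best;
  int bestAttempts = 0;
  for (int iter = 0; iter < 1000; iter++) {
    std::vector<long long> test = randomBigTest();
    int attempts = attemptsNeeded(test);
    if (attempts > bestAttempts) {
      bestAttempts = attempts;
      best = test;
      std::printf("new best: %d attempts\n", attempts);
    }
  }
  // A test that needs ~10 attempts locally is likely to score above 1.0 on
  // the server even though the hidden random seed differs.
  return 0;
}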

Sunday, September 14, 2014

This week in competitive programming

There were no regular competitions this week, so it's a good time to return to a past problem.

Let's talk about TopCoder Open 2014 Round 3A hard problem "PlumbersCoins" (previous post).

In short, the problem statement is: you are given at most 50 interesting points on a line that you need to visit in any order, starting from point 0. In addition to moving along the line at one unit per second, you can use teleports between given pairs of locations. The teleports are non-interleaving (if one teleport connects points a < b, and another connects points c < d, then either b < c or d < a), and there are at most 25 teleports, each taking exactly one second to use. What's the shortest time required to visit all interesting points?

During the competition, I didn't notice that the teleports are non-interleaving, which made the problem much harder. The only thing I could come up with was reducing the problem to the Traveling Salesman Problem, so I coded a heuristic solution that I remembered from the IOI 2002 practice problem "red": start with a random path and repeatedly reverse its subpaths while the solution improves, then repeat everything from a new random path until we run out of time. Surprisingly, this solution passed the system tests!
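Here's a minimal sketch of that heuristic (random restarts plus 2-opt segment reversal), not the actual contest code: it assumes a symmetric distance matrix has already been built between the points to visit.

// Sketch of 2-opt with random restarts over a precomputed distance matrix.
#include <algorithm>
#include <chrono>
#include <random>
#include <vector>

double pathLength(const std::vector<int> &order,
                  const std::vector<std::vector<double>> &dist) {
  double len = 0;
  for (size_t i = 0; i + 1 < order.size(); i++)
    len += dist[order[i]][order[i + 1]];
  return len;
}

// Returns the best path (not a cycle) found before timeLimit seconds elapse.
std::vector<int> twoOptWithRestarts(const std::vector<std::vector<double>> &dist,
                                    double timeLimit) {
  int n = (int)dist.size();
  std::mt19937 rng(12345);
  auto start = std::chrono::steady_clock::now();
  auto elapsed = [&] {
    return std::chrono::duration<double>(std::chrono::steady_clock::now() -
                                         start).count();
  };

  std::vector<int> best;
  double bestLen = 1e18;
  while (elapsed() < timeLimit) {
    // Random restart: begin from a fresh random order of the vertices.
    std::vector<int> order(n);
    for (int i = 0; i < n; i++) order[i] = i;
    std::shuffle(order.begin(), order.end(), rng);

    // 2-opt: keep reversing subpaths while that shortens the path.
    bool improved = true;
    while (improved && elapsed() < timeLimit) {
      improved = false;
      for (int i = 0; i + 1 < n; i++)
        for (int j = i + 1; j < n; j++) {
          // Reversing order[i..j] only changes the two boundary edges
          // (distances are assumed symmetric).
          double delta = 0;
          if (i > 0)
            delta += dist[order[i - 1]][order[j]] - dist[order[i - 1]][order[i]];
          if (j + 1 < n)
            delta += dist[order[i]][order[j + 1]] - dist[order[j]][order[j + 1]];
          if (delta < -1e-9) {
            std::reverse(order.begin() + i, order.begin() + j + 1);
            improved = true;
          }
        }
    }

    double len = pathLength(order, dist);
    if (len < bestLen) {
      bestLen = len;
      best = order;
    }
  }
  return best;
}

(The actual problem additionally requires starting at point 0 and building the distance matrix with teleports taken into account; the sketch skips those details.)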

But then I started to think: is this really surprising? Does there exist a valid testcase that makes this solution fail?

I'm relying on you to find out :) Here's a website where you can try your testcases:

http://hamiltonianplumber.appspot.com/

Please remember to sign in at the bottom of the page to be included in the scoreboard (and so that others can't see your last testcase :))

Monday, September 8, 2014

This week in competitive programming

Last Monday, the Summer 2014 Petrozavodsk training camp came to its conclusion (day 1, day 2, day 3, day 4, day 5, day 6, day 7, day 8, day 9, overall stats, top 5 on the left). Some of the best Russian, Ukrainian, Belarusian, Romanian, Armenian and Kazakhstani teams started their preparation for the new ACM ICPC season that will end with the World Finals in Marrakech in May 2015. Unfortunately, for the first time in my memory, several very strong teams, including the World Finals gold medalist team from Moscow State University, did not participate due to a schedule conflict: many contestants opted for summer internships in technology companies and could not come to Petrozavodsk. Nevertheless, the remaining teams still tried their best to solve problemsets that are often harder than the World Finals themselves. The team of Gennady Korotkevich (he was joined by different ITMO students at each contest) took a clear overall first place by winning most contests, but note that there's still no decision on whether Gennady will take part in official ACM ICPC contests this season (or the decision is a tightly kept secret). The Moscow Institute of Physics and Technology team, with members drawn from two different teams of last year, placed a clear second, winning two contests and performing well in most others — they will definitely be a team worth watching.

TopCoder SRM 632 took place on Friday afternoon (problems, results, top 5 on the left, my screencast). The medium problem involved a nice graph theory observation: one needed to find the maximum flow in a graph whose edge capacities are distinct integer powers of 3. Can you see how we can use the fact that the capacities are so exponentially different to create a simple solution?

Codeforces Round 265 rounded out the week on Sunday night (problems, results, top 5 on the left). Tourist has continued his 2014 unbeaten run on Codeforces — great job! — but this time the fight was really close. Both Gennady and I followed essentially the same strategy during the round, solving 4 out of 5 problems in A, C, B, D order, and looking for bugs in other contestants' solutions for problem A in between to score some challenge points. Gennady managed to find 6 incorrect solutions for A while I stopped at 3 incorrect solutions for A and one incorrect solution for B, and those two extra challenges earned him first place. The system tests showed that no incorrect solutions for problem A were left standing in my room, so I didn't have a chance to catch Gennady just by challenging solutions for problem A.

However, that same room scoreboard does show the chance I had to get ahead of Gennady: 3 submissions failed the system test for problem C. During the round, I had submitted a very simple solution for problem C and thus didn't expect others to submit incorrect ones — that was a big mistake in my thinking. The solutions for C were very short and easy to understand, and thus spotting the incorrect ones could have been easy. What's more, several contestants scored a lot of challenge points by challenging problem C, so I could have noticed that problem C was vulnerable by looking at their stats during the round — but it did not occur to me to do that. Lesson learned: next time I'll do my best to use all the information I have access to, including which problems are being actively challenged by others.

In other news, the round featured a very difficult technical problem E that was solved by just one contestant. Coincidentally, this problem also featured a graph with exponential edge lengths: you were given a graph with at most 100000 vertices, 100000 edges, and each edge length was an integer power of two between 2 and 2^100000 (this time the edge lengths were not necessarily different). You were asked a very simple question: to find the shortest path from one vertex to another, suggesting that Dijkstra's algorithm is the way to go. But how does one handle such enormous edge lengths?

Thanks for reading, and check back next week!

Monday, September 1, 2014

This week in competitive programming

Last week had two international programming contests. The first one was Codeforces Round 263 on Tuesday (problems, results, top 5 on the left), won by WJMZBMR, who was the only one to solve all problems without mistakes. Great job!

Then, TopCoder SRM 631 took place on Saturday night (problems, results, top 5 on the left). The hardest problem was a data structure problem: you were given a rooted tree with a number written on each vertex, and had to handle two types of requests: pick any subtree, detach it from the tree and attach it somewhere else in the tree; and, given two nodes such that one is an ancestor of the other, find the maximum of the numbers written on the vertices of the path between them.

Such problems usually have several working approaches:
  • implement all operations in a straightforward O(n) manner, so that the entire problem needs O(n^2). This is too slow in theory, but we can then optimize the constant hidden in the O-notation a lot and squeeze it under the time limit (see the sketch at the end of this post).
  • implement all operations using a complex data structure so that each takes O(log(n)), and the entire problem needs O(n*log(n)). This is usually the fastest approach, but it might require a very complex data structure - in this case, one could use a link/cut tree.
  • use sqrt-decomposition so that each operation takes O(sqrt(n)), for a total running time of O(n*sqrt(n)). In this particular problem, sqrt-decomposition means splitting the queries into blocks of sqrt(n) and, for each block, shrinking the tree so that it only contains the vertices relevant to that block.
I like sqrt-decomposition more for this problem since it's really easy to implement.
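For contrast, here's a minimal sketch of the straightforward approach from the first bullet, assuming a plain parent-pointer representation: moving a subtree is a single pointer change, and an ancestor-to-descendant maximum query is a walk up the tree, O(n) in the worst case, for O(n^2) overall.

// Sketch of the straightforward approach: parent pointers plus a walk up
// the tree for each query.
#include <algorithm>
#include <vector>

struct NaiveTree {
  std::vector<int> parent;  // parent[v] == -1 for the root
  std::vector<int> value;   // the number written on each vertex

  NaiveTree(int n) : parent(n, -1), value(n, 0) {}

  // Detach the subtree rooted at v and attach it under newParent.
  // (The problem guarantees newParent is not inside that subtree.)
  void moveSubtree(int v, int newParent) { parent[v] = newParent; }

  // Maximum value on the path from ancestor anc down to descendant v.
  int pathMax(int anc, int v) const {
    int best = value[v];
    while (v != anc) {
      v = parent[v];
      best = std::max(best, value[v]);
    }
    return best;
  }
};

The constant factor here is tiny, which is exactly what lets such solutions be squeezed under the time limit despite the worse asymptotics.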

Thanks for reading, and check back next week!