Sunday, March 6, 2016

A London week

TopCoder SRM 683 was the first round of this week (problems, results, top 5 on the left). Two successful solutions for the hard problem seems like an unusually high number these days :) jiry_2 has won with a big margin because of being the only one to solve all three - congratulations!

On Thursday, 25 finalists gathered in London for the Facebook Hacker Cup 2016 Onsite Final Round (problems with Facebook login, results, top 5 on the left). The round contained 6 solvable problems, but with quite a few tricks and extremely simple example cases, so the room for failure was huge. Well done to Makoto on avoiding the failures and winning convincingly!

Bruce, who was one of the contestants to solve all 6 problems, only to end up with 0 accepted solutions, explains what happened to his solutions here, and what are the correct approaches here. Let me cover my part of the failure story:

In the problem "Boomerang Crews", I've made a mistake that comes up again and again for me: tried to use a TreeSet as a multiset. In the problem "RNG", I've checked if a vertex has any outgoing edges before actually reading the edges from the input - so all vertices got marked as "no outgoing edges". Finally, in problem "Grundy Graph" my solution was a bit further from the correct one - I forgot to add the reverse edges, and to check if one Bob's vertex is reachable from another.

Failing like this in an important round naturally makes one think about better ways to avoid such mistakes. Bruce mentions that one of his failures was due to an integer overflow, and I've protected myself against that via CHelper plugin's ability to use Cojac library to detect integer overflows in Java at runtime. However, one wonders whether any other bugs mentioned above by myself or by Bruce are preventable by building better tools?

One relatively straightforward idea I had was to define a TreeSet class in my library which raises an exception when trying to add duplicates, and tune IDEA's auto-completion to offer it above the system one. In most cases where ignoring duplicates is a good thing, a HashSet is a more natural fit, and adding duplicates to a TreeSet seems to be a mistake more often than not.

Please share your suggestions!

Thursday, March 3, 2016

A strategic week

The last week started with the middle-of-the-night SRM 682 on Tuesday (problems, results, top 5 on the left). Continuing the recent trend, the hardest problem did not crack, and the winner was decided during the challenge phase, where subscriber has squeezed the victory - congratulations!

The week ended with two contests on Sunday: first, the Open Cup Grand Prix of Bashkortostan catered to the team contest lovers (results, top 5 on the left). Past Glory team was unstoppable once again, making just one incorrect attempt over 12 solved problems!

Problem F's statement is so short that I'll quote it verbatim: "For each vertice of given undirected weighted graph calculate the length of shortest simple cycle, which contains this vertice". The graph had 300 vertices. Is this a textbook problem?

And finally, Codeforces hosted the Final Round of 8VC Venture Cup 2016 for those who prefer to compete on their own (problems, results, top 5 on the left). The round had a $2500 prize for the first place, which tourist has won convincingly thanks to some super fast problem solving - great job!

My story in this round is all about strategy and tactics. Having solved problem A, I've read the statement for problem B first, which looked quite tedious, so I've decided to check other problems. Having read problem C, I've realized that it's a well-known problem which I don't remember the solution to, so solving it at this point would put me at a disadvantage. Finally, problem D also looked tedious but at least brought 2x more points than problem B.

I've glanced at the scoreboard at this point, and it turned out that tourist has already submitted B. It was pretty obvious at this point that following his strategy would not give me a win, since he was quite a bit ahead in time, so in order to win I needed a different strategy, so I've decided to implement D, then E, and then attempted F. My hope when starting F with almost 1 hour to go was to solve it, and then come back to either B or C, while tourist would not be able to finish F in time because he'd spend a lot of time on A-E.

My prediction about tourist's result turned out to be correct, but I couldn't execute on my strategy - problem F did not crack in 1 hour :( After the contest, I required 20-30 more minutes to get my solution to work.

Here's the tricky problem: you are given a tree with n nodes, one of the nodes being empty and other nodes containing all numbers between 1 and n-1 exactly once. One is allowed to add at most one arbitrary edge to the tree, and then move the numbers along the resulting graph, where at each move we can move a number to an adjacent empty node, and the goal is to reach the given ending state from the given starting state in the least number of moves. Implementation details aside, can you find the algorithmic solution in 1 hour?

I hope I won't need to rely on unusual strategies in today's Hacker Cup final round, which starts in less than 2 hours :) There should be at least a scoreboard, and maybe more live coverage at the competition page.

A solution week

The Feb 15 - Feb 21 week, somewhat surprisingly, didn't have any algorithmic contests among the ones that I usually cover. However, let's use this opportunity to dive into the solutions of the problems I mentioned before.

The first one was: you are given a set with 200000 numbers, and need to find the subset of that set where the difference between the mean and the median is the highest. The most tricky part of this problem is the median: unlike the mean, it's hard to update incrementally, so the natural idea is to start by iterating over which number will be the median (let's assume we'll pick an odd-sized subset for now). Having fixed the median, we need to add k numbers smaller than it and k numbers larger than it, maximizing the mean. Now it's easy to see that those numbers should be as large as possible: k larger numbers, plus k largest numbers right before out median in the sorted order. Now, if we iterate over all possible values of k, we obtain a O(n2) solution.

In order to speed up this solution, let's look more closely what happens when we increase k. We add two numbers to our set - the kth largest one, and the kth largest one among the ones before the median. Let's call those two numbers a and b. The mean before adding those numbers is c/d (where d is 2k+1 or something like that), and now it becomes (c+a+b)/(d+2). Is it smaller or larger than c/d? Well, the mediant inequality tells that this number lies between c/d and (a+b)/2. So if (a+b)/2 is larger than the current mean, the mean will increase, otherwise it will not. The final step is to notice that (a+b)/2 is always going down since we pick a smaller number for both a and b at each subsequent step. Because of this, we can see that the mean is a bitonic function of k: first increases, then decreases.

This allows us to use binary search on k, comparing c/d with (a+b)/2 inside to see if we should go to the left or to the right (of course, we could just compare two adjacent means, but that might overflow the 64-bit integer type), and obtaining a O(nlogn) solution for the problem.

The second one was: given a bipartite graph with 45 vertices, find how many subsets of its vertices are vertex-dominating. A subset of vertices is vertex-dominating if any vertex of the graph either lies on this subset, or has a neighbor in this subset, or both.

Given that we have 45 vertices, the logical first step is to pick the smaller of its two parts, which will contain at most 22 vertices. Now we need to find something that works in O(2n) or similar complexity.

We can now afford to iterate over which subset of this part will be included in the dominating set. Having fixed that, we still have to cover:

  1. The vertices in the first part that are not included in the subset.
  2. The vertices in the second part that are not adjacent to any vertex from the subset.
Part 2 is easy: we must just include all those vertices into our dominating set. Those vertices, in turn, will cover some vertices from part 1, but there might be some more left over. Now our state is: we have already included some subset A of the first part and some subset B of the second part in the dominating set, the entire second part is covered, and in the first part there's some uncovered subset C which we're still to cover using the some subset of the set D of the vertices of the second part, where D is the complement of B.

This could again be solved via dynamic programming in O(|D|*2|C|) time, yielding overall O(n*3n/2) complexity for the entire problem. However, for n=45 that is too slow.

Here comes the most magical moment of this solution: actually, the set D doesn't matter that much, so we can do this dynamic programming just once instead of separately for every A. One might say that of course D does matter, as we can otherwise pick a vertex from B when covering C, and overcount. However, from the construction above we can notice that vertices from B and C don't have any edges between them, and thus don't affect the cover - we can take an arbitrary subset of B in addition to some subset of D. Because of this, we'll actually count each solution exactly 2|B| times, and we can just divide the result of the inner dynamic programming by 2|B| to get the required count! Now we have a O(n*2n/2) solution.

Thanks for reading, and please cheer for me in the upcoming Hacker Cup onsite final round, which is scheduled to begin at 13:30 London time today!