Wednesday, January 1, 2020

A Houston week

The Nov 11 - Nov 17 week was the week of TopCoder Open finals in Houston.

The first semifinal took place on Thursday (problems, results on the left, broadcast, my Twitter thread, analysis). The easy problem was quite tricky and the sample cases did not really help check correctness, so the first five submissions in this photo were all incorrect. tourist, Egor and scott_wu later resubmitted, and this turned out crucial for Egor who was able to advance on the easy and medium. tourist, lyrically and uwi solved the hard as well, and therefore it did not actually matter if their easy was correct. Congratulations to the four finalists!

The second semifinal followed on Friday (problems, results on the left, broadcastanalysis). Five competitors solved everything during the coding phase, and I was in fourth place but with less than 25 point gap to the fifth. During the intermission, I have correctly concluded that it only makes sense for me to challenge one of the other three contestants, since if any solution of the first five fails, and it's not my solution, then I would advance anyway. However, right after the challenge phase started I've opened Um_nik's 1000, noticed that it does not handle a special case that I had to handle in my solution, and challenged with it. The challenge was unsuccessful, and I dropped to fifth place.

Luckily, it did not end that way and got even curioser after ecnerwal found two unsuccessful challenges of his own, and dropped below me — imagine how surprised and relieved I was at that point :) It turned out that he already knew that his medium was going to fail, so the challenge phase did not matter in the end as everybody who solved three problems advanced to the finals. Congratulations to xudyh, Um_nik and mnbvmar!

Here is the 1000-pointer that got me so excited that I failed to exercise good challenge phase judgement: you are given three integers n (1<=n<=1011), d (40<=d<=120) and b (5<=b<=62). Your goal is to return any base-b number that has exactly d digits when written without leading zeros, is divisible by n, and has at most four distinct digits when written in base-b without leading zeros.

Can you see a way to solve it without having to handle any special cases?

The final round completed the competition on Saturday (problems, results on the left, broadcast, my screencast, analysis). I got stuck in the easy problem, trying to be too clever and to avoid constructing the graph explicitly backfired, and I could not debug my solution during the round. I am notoriously bad at giving up on a problem, and it always felt that I was so close :) I have eventually switched to the hard problem and almost got it right — but only almost. The solution I submitted at the end did not even pass the samples, and it turned out that even though it had 5 different bugs, all were of "j instead of i" kind and probably preventable if I was coding it in less of a rush.

This is exactly what lyrically did, as she started with the hard problem and spent most of the contest solving it, and ended up being the only competitor to do so. In fact, the situation was a bit more complicated as her solution has initially failed my challenge case as well, but it turned out that my challenge case has exposed an issue with the problem itself: it seemed virtually impossible to return an output that would pass the checker because of floating-point precision issues. The admins were put in a difficult spot, and ultimately decided to let me keep the 50 challenge points, but also lowered the required precision and let lyrically's solution pass.

I did not realize myself that the problem was broken, and simply tried to put together a challenge case that could expose potential precision issues in other solutions.

In any case, all this did not affect the first place as tourist was the undisputed champion with good times on easy and medium. Huge congratulations!

Right after the final round, scott_wu and ecnerwal ran a fun double-elimination competition in a new 1:1 format called Lockout (more info, results on the left), trying to make competitive programming more exciting to watch. The competition was indeed very fast-paced and had a lot of plot twists depending on the strategy choices. The format before the last round has placed quite a lot of emphasis on the speed of solving easy problems, though, as all problems cost 1 point. The last round of the competition, the Grand Finals where tourist met Um_nik, took place much later, had different point values and a very nice broadcast — check it out! And of course congratulations to tourist on winning the whole thing :)

In my previous summary, I have mentioned an Open Cup problem: there are n drones (n<=500000), the i-th drone initially in the point (xi,yi,0). Some pairs of drones are connected with cables, those cables are straight lines and they do not intersect except at endpoints. We are then performing k moves (k<=500000), in each move we change the z-coordinate of one of the drones (x- and y-coordinates always remain constant). The cables connected to this drone have to change their length correspondingly, and each cable has a known maximum length — in case it needs to become longer than that, it breaks and stays broken for the remainder of the process. Finally, you are given q (q<=500000) queries, each query being a pair of drones. For each query, you need to find the move which caused the two given drones to become disconnected (using the cables), if any.

If we find out for each cable at which move does it break, the processing of q "when did these two vertices become disconnected" queries offline is a standard problem that is solved by going in reverse order of time and keeping a set of connected components and for each component keeping a list of queries with at least one end in it, and traversing this list for the smaller component when merging two of them.

Finding out the moment each cable breaks is the more interesting part of the problem. A naive approach would traverse all cables connected to the vertex being moved at each step, but that would of course be quadratic in case our graph is a star.

How can we improve the performance of our solution on the star graph? We can notice that we can process the outer vertices of the star quickly as the have only one incident edge, the slowdown only arises when we move the center vertex. In order to process the movements of the center vertex faster, we can notice that for each particular edge, the values of z-coordinate where the corresponding cable does not break form a segment, therefore there are two interesting values — the endpoints of this segment — which should trigger the breakage. We can put all interesting values for the center vertex into a data structure, then whenever we move the center vertex we can just query the data structure to retrieve all interesting values that we have passed when moving from the old value of z-coordinate to the new one, and break the corresponding edges.

Since every edge breaks at most once, this will have the total running time of O(m*logm) where m is the number of edges, and assuming the data structure can retrieve the next edge to break in O(logm). We can use a balanced tree or a pair of priority queues as the data structure.

Note that we should also update this data structure for the center vertex when processing the outer vertices, as the interesting values for the corresponding edge will change, but since each outer vertex has only one incident edge this is also O(logm).

How do we generalize this improvement from the star graph to an arbitrary graph? Let's choose an orientation for each edge. Whenever we move a vertex, we will process its outgoing edges directly. For the incoming edges, we will maintain the data structure with the interesting values of z-coordinate and query it for the new breakages. We also need to update this data structure for the neighboring vertices for each outgoing edge. If the maximum out-degree of a vertex is d, this solution runs in O((k*d+m)*logm).

Now the standard idea would be to orient each edge from the vertex of the smaller degree to the vertex of the larger degree. This would mean that the out-degree of each vertex is at most sqrt(2m), since there can't be more than sqrt(2m) vertices each of degree sqrt(2m) adjacent to a vertex, since the sum of all degrees is 2m. I have managed to squeeze this solution in, but it required some extra optimizations.

However, we can make use of the fact that the cables do not intersect, in other words that our graph is planar. Each planar graph has a vertex of degree at most 5. So we can do the following: repeatedly take the vertex with the smallest number of adjacent non-oriented edges, and orient those edges from this vertex. This will give us an orientation where all out-degrees do not exceed 5, so our complexity becomes just O((k+m)*logm) and no squeezing is required :)

Finally, now is a good time to remind you about the New Year Prime contest, where you can try your hand at solving the problems from 2019 that were not solved by anybody in contest time. Best of luck, and Happy New Year!


  1. Funnily enough, orienting each edge from the vertex with smaller number of changes (not degree!) to the vertex with the larger number of changes is always optimal, because the problem basically boils down to picking one of two integers for each edge, while minimizing the sum; obviously you should pick the minimum for each edge.

    The planarity argument can be used to prove the fact that a good orientation exists, hence the optimal orientation is even better (but it is not immediately obvious that the optimal orientation gives a good bound without considering planarity). In fact, the existence of orientation by degree is usually used to show that the method described in the first paragraph has O(sqrt(m)) complexity "per query" in an arbitrary graph.

    1. Right, that's a very good point! In this and many other problems I keep forgetting the fact that we have all requests and can do things offline rather than online.