Tuesday, May 31, 2011

Reflections on ACM ICPC 2011 World Finals Problems

I've taken some time to reflect on the World Finals problems - hopefully that will be interesting for you to read. If you were not following the World Finals closely, you should probably skip this post :)

The problems are at http://cm.baylor.edu/digital/icpc2011.pdf, the unofficial solutions from Per at http://www.csc.kth.se/~austrin/icpc/finals2011solutions.pdf.

I've really liked the problemset, so please don't take my words below as criticism - I've just tried to look at the problems from different angles and classify them in different ways, and maybe suggest some improvements for the future.

Problem A - To Add or to Multiply - ad-hoc (number theory?) (by ad-hoc I mean "need creative thinking to solve but no necessary prior algorithm/mathematics knowledge or experience solving similar problems"), most difficulty in corner cases and finding the lexicographically smallest answer.
Problem B - Affine Mess - exhaustive search (do what's described in the problem statement) + solving a system of 2 linear equations, most difficulty in corner cases.
Problem C - Ancient Messages - ad-hoc, out-of-format problem, as there's no precise definition of valid input.
Problem D - Chips Challenge - maximum flow (reduction to flow is the most difficult part).
Problem E - Coffee Central - exhaustive search (do what's described in the problem statement) + the набившая оскомину idea about using inclusion-exclusion for rectangle sums. Most difficulty in finding the lexicographically smallest answer.
Problem F - Machine Works - DP plus incremental convex hull within one quadrant. The key to producing NlogN solution is the somewhat well-known idea about incremental convex hull to find the maximum of a set of linear functions. Most difficulty is inventing this speedup (if not known in advance) and implementing incremental convex hull.
Problem G - Magic Sticks - ad-hoc, geometry. The solution has two main parts - inventing and then proving (or believing) the key mathematical fact about throwing away the longest edge, and solving a sub-problem where you have to maximize the area of the polygon with the given sides. The sub-problem is again quite well-known and was a significant advantage to the teams who saw it before. However, I believe the most difficult part was the longest edge fact. There was also a catch about one edge taking more than Pi angle which many teams forgot to account for.
Problem H - Mining Your Own Business - graph theory, ad-hoc. The most mathematically beautiful problem of the contest in my opinion. The main difficulty is constructing the "if and only if" condition for the set of escape shafts.
Problem I - Mummy Madness - interval trees (or priority queues), ad-hoc. The main difficulty is realizing (a.k.a proving or believing) the problem can be reduced to checking if a square is completely covered by a set of squares. The NlogN solution to that sub-problem is very well-known and should be very easy for top teams.
Problem J - Pyramids - quite straightforward DP (backtracking should also work). I don't see any significant difficulty at all here, but if I were to choose the main difficulty, it would be building the lexicographically largest answer in the DP solution.
Problem K - Trash Removal - a well-known problem. Solvable in NlogN using rotating calipers, but since the constraints were so low, that solution is an overkill. The only difficulty with such small constraints was to realize that the line should be parallel to the line connecting some pair of vertices. The main difficulty is the geometric formulas, but those are very easy as well.

Problem H was really standing out from the problemset, in my view, because it required graph theory thinking but at the same time involved no heavyweight algorithms or implementation difficulties. I would love to see more problems like this, but they are very hard to come up with. Problem C is unusual (so one might argue it gets close to the boundary of teams might reasonably expect) but still good. Problems E, F, I, J, K are "professional" problems - the most difficult part of each lies in a "standard set of tools" that most competitive programmers hone from contest to contest. This is not a negative thing - those ideas form the standard set of tools exactly because they're most useful "building blocks" of various algorithms. Problem K may be too well-known though. Problem G is borderline as it combines a beautiful ad-hoc part with quite difficult algorithm that might be well-known to some teams but not others. Problem D is borderline (but still very good in my opinion) because the "reduce to maximum flow" trickery can also be viewed as a part of "standard set of tools", and thus the problem is quite "professional" too. Problems A, E, J require relatively little thought, and have main difficulty in figuring out the corner cases and careful implementation, which is not exciting, and I'd say 3 such problems is too much (but maybe I'm too subjective after several teams I've supported got buried in those issues).

To spice up the long text, here's my picture with the World Champions:


Please share your thoughts :)

Monday, May 30, 2011

ACM ICPC 2011 World Finals

Here's my (hopefully) live commentary from the ACM ICPC 2011 World Finals. This post will be updated as the contest goes on.

8:31 - there's a big countdown screen that shows 28 minutes before the start of the contest. The rumor has it that the Peabody ducks will walk on the contest floor before the contestants :)

8:38 - 21 minutes before the contest start, the teams are coming in, right after 5 ducks.

8:43 - 16 minutes before, everyone is seated. I guess it's now up to Bill to entertain everybody.

8:50 - 9 minutes before, Bill is trying hard.

8:53 - to Adrian: I would expect after about 1 hour of the contest. We might try to publish the photos if we get them earlier :)

8:53 - 1 minute before! It looks like they decided to skip some minutes.

8:55 - the contest has started! It looks like there's 11 colored balloons.

8:56 - unlike yesterday, the scoreboard on the big screen is working. Teams are sorted alphabetically for now.

8:58 - no problem statements for spectators yet.

8:59 - no submits yet.

9:02 - live scoreboard from Kattis appears at http://scrool.se/icpc/wf2011/.

9:05 - we got the problem statements from Tatyana Churina!

9:13 - SPbSU and SPbSU ITMO solved K! No other submissions were visible before the scoreboard went down.

9:14 - according to zibada's chat, Perm solved K as well.

9:19 - photos of problem statements are being uploaded to Picasa. Sorry for the poor quality, hopefully it's still readable.

9:22 - 13 teams have solved K, no other submits. Time to look at the problem statements…

9:24 - problem K is classical: find the two parallel lines closest to each other that will fit the given polygon (maybe rotated) between them. It's not hard to see that the line must be parallel to the line connecting some pair of vertices, so you can just try them all. For even faster solution, we should just check adjacent vertices of the convex hull.

9:28 - problem C is quite straightforward as well. You are given 6 pictures of hieroglyphs, and need to write an OCR that will recognize them. The constraints say that the hieroglyphs in input may be distorted by the shape will be "topologically equivalent" to one of the 6 given pictures. The pictures are chosen such that one can just check the number of connected components of the white pixels (kudos to andrewzta for noticing that).

9:30 - meanwhile, there are lots of teams that solved 1 problem. All solved K except Tsinghua, who solved C, and Shanghai, who solved E.

9:33 - problem E is: given at most 500K points on a 1000x1000 grid, and at most 20 queries X, find the X by X square in (x+y, x-y) coordinates that has the most points inside.

9:34 - the constraints are so low that I believe the straightforward solution should pass: using inclusion-exclusion we can find the number of points in each square in O(1), then we can just check 20 million squares.

9:36 - it looks like official problem statements are up at http://cm.baylor.edu/digital/icpc2011.pdf. I hope you've enjoyed my photos in the meantime :)

9:39 - problem J, which several teams have attempted but failed, is: given a number C, represent it as a sum of numbers of form a_i=i*(i+1)*(2i+1)/6, b_i=4*i*(i+1)*(2i+1)/6, c_i=a_(2i+1)-b_i (sums of squares, sums of even squares, sums of odd squares). Each a_i, b_i, c_i may be used at most once. If there are several representations, choose the one with the smallest number of addends. If there are still several ones, pick the lexicographically largest one.

9:43 - meanwhile, three teams have 2 problems: SPbSU and SaratovSU have J and K, Tsinghua has C and K.

9:44 - 6 teams with 2 problems now: NNSU and Hangzhou have E+K, Taurida has J+K.

9:45 - I would expect that simple backtracking (trying all possible choices for the next number in decreasing order, and truncating the search when current best answer is less than remaining sum divided by the maximum possible value for the rest) works for J, since the numbers are quite dense and thus the be split will always be found very quickly.

9:48 - 3 problems solved for Tsinghua, C+J+K.

9:50 - problem A, which is the only remaining problem that has attempts on it, is the following: given two numbers a and m, and two segments [p, q], [r, s], one must create such sequence of operations +a and *m that will transform every number in [p, q] into some number in [r, s].

9:51 - moreover, one should output the program of minimum length, and smallest lexicographically among those with minimum length.

9:59 - here's one idea: even if we add before we multiply, we still add a multiple of a. So in order to check what is the minimal number of steps required to transform a given segment [q, w] to become a segment inside [r, s], we can just iterate over how many times k we multiply, check that there's a multiple k*a of a that, after being added, will move the result inside [r, s], and the number of steps will be the number of multiplications plus the sum of digits in k when written in m-ary system (roughly). So in order to solve this we must solve a sub=problem: given a segment [l, r], what is the number with the smallest sum of digits (in m-ary system) inside that segment? That is a standard problem that is solved by considering the number of digits of that number that coincides with corresponding digits of l (or r).

10:04 - The above solution yields the lexicographically smallest representation as well if implemented carefully. This looks quite tricky to implement, though. KAIST is the only team to solve the problem.

10:06 - The top of the scoreboard is: Saratov, Zheijang, NNSU, Taurida, Tsinghua - all with 3 problems.

10:10 - Problem G is: you are given N segments connected in one row via flexible joints, and you need to assemble one or more polygons in such a way that every segment is a side of at most one polygon and the segments are not allowed to intersect except at endpoints. Your goal is to maximize the total area of the polygons.

10:14 - Tsinghua got 4! A+C+J+K.

10:16 - There are at most 500 segments in problem G. I guess the first question is does it ever make sense to form more than one polygon?

10:19 - Yes, obviously it does. Suppose one middle segment is much, much longer than the others so no polygon can be formed using it. In this case, the segments to the left and to the right of this segment should be treated independently. We can then build similar examples in each part to have the best answer include even more polygons.

10:21 - What doesn't make sense is to have one polygon have non-consecutive set of segments as its sides - we obviously can increase the total area by joining this polygon with the adjacent one then. So it seems that the solution should be: for every [l, r] part of the given sequence of segments, find the best polygon (this is a standard problem). Then, run a standard DP to find the best total area.

10:24 - The only trick I guess is to solve the "standard problem" for so many cases. The standard solution, as Google tells us, is that the optimal polygon with the given sides is a polygon inscribed in a circle, and we can find the radius of the circle by binary search. But that gives a running time of O(n^3*log(precision)) which may be too big.

10:30 - I would expect that it is fast enough, since n^3 is divided by 6 (the number of segments is n^2/2, and the average length of one is n/3). With the super-fast computers, that should be fast enough.

10:32 - Three teams have 4 problems: Zhejiang, Taurida, Tsinghua.

10:48 - problem H is: given a graph, you need to mark the smallest number of its vertices in such a way that after removal of any vertex, there's a path from any vertex to some marked vertex. Unless I'm missing some corner case, this is reasonably straightforward (suggested by Burunduk1): first, we find the tree of biconnected components of the graph, then we choose one vertex in each of the leaves of this tree (in each biconnected component that is a leaf, we must choose a vertex that is not the articulation point that connects this component to the rest of the graph). In case the entire graph is biconnected, we must mark any two different vertices. This is obviously less than the optimal answer since all those marks are required (what happens if we disconnect a leave of the biconnected component tree?), and it's not hard to see they are sufficient.

10:48 - SPbSU, NNSU, Warsaw and Ural also have 4 problems solved.

10:56 - as discussed offline, I expect the winner to solve 8 or 9 problems.

11:02 - problem D: you are given a 40x40 grid of ".", "/" and "C" letters. Your goal is to replace as many "."s by "W"s as possible in such a way that: the total number of "C"s and "W"s in each row and column is at most the given fraction of total number of "C"s and "W"s, and the total number of "C"s and "W"s in the first row is equal to the total number of "C"s and "W"s in the first column, the same for second row and second column, and so on.

11:14 - NNSU has 5.

11:18 - problem F: you are given 100000 machines, each described by the day it appears on (D_i), the price to buy it (P_i), the money you get when you get rid of it (R_i), and the profit it brings on each day (G_i). You may own at most one machine at a time, you can get rid of an old machine and buy a new one in one day, a machine brings profit only on days between the one when you bought it and the one when you got rid of it (exclusive). You can only buy a machine exactly on the day it appears on, and you start with the given amount of money C. What is the maximum possible profit?

11:21 - this actually looks solvable: first, let's start with straightforward N^2 DP: we solve the problem "what is the most money we can earn from the previous machines if we buy k-th machine on the day it appears" by iterating over which machine will we buy before k-th. Then, we will optimize the solution to work in O(N*polylog(N)). More specifically, suppose the previous machine is m-th. Then the maximum profit to get before buying k-th machine is ans[m]-P_m+R_m+G_m*(D_k-D_m-1), and we can only do this if ans[m]+C>=P_m. The second condition doesn't depend on k at all, and the first only depends on D_k. So essentially when choosing the best m for the given k we're choosing the highest among several linear functions of D_k. We can find that quickly, for example, by keeping track of the convex hull of the existing functions.

11:34 - meanwhile, Zhejiang, Tsinghua and Saratov also got 5.

11:36 - and Tsinghua got 6.

11:49 - problem B is: you are given integer coordinates of 3 points before and after the following transformation: first, you rotate the plane such that Ox axis passes through an integer point on the border of [-10,10]x[-10,10] square. Then, you round the coordinates (up in absolute value if the fractional part is 0.5). Then, you multiply all coordinates in each axis by a non-zero integer (different for different axes). And then, you add an integer constant to all coordinates in each axis (different for different axes). You're asked whether there exists such transformation for the 3 given points, and if yes, whether those transformations transform each integer point in the same way. It looks like one can just iterate over all possible rotations, then find scaling and translation numbers based on linear equations, and then check the resulting transformation. The biggest difficulty I guess would be handling corner cases where all points have the same coordinates and suchlike.

11:50 - meanwhile, Zhejiang also got 6 and is in second place.

11:56 - SPbSU solves 6 and moves into second place.

11:58 - problem I is: you are standing in a cell of an infinite plane, and you have 100000 mummies going after you. Each mummy stands in some cell. You make moves in turn: first, you can move to any 8-adjacent cell. Then, each mummy moves to a 8-adjacent cell that is closest to you. You lose if at any point a mummy is in the same cell as you. What is the maximal number of moves you can survive?

12:02 - Moscow solves 5th problem but it's G! Now go for the easy ones! Saratov got 6.

12:32 - had a lunch break. Lots of teams with 6, teams with "standard" set of problems have better penalty time.

12:34 - and Tsinghua solves F. I didn't expect them fixing the problem so quickly.

12:47 - problem I solution together with FedorTsarev: first, let's assume that once the mummy catches us, it follows all our moves. It means that we need to find the largest time X such that we can be in a point distinct from all mummies at time X, and that can be done by binary search. To check a specific X, we believe the following proposition works: we take a 2X+1 times 2X+1 square with center in our position, and check if it's completely covered by 2X+1 times 2X+1 squares centered at mummies'. This relies on assumption that the way the mummies move in the problem statement is "optimal", but this is kind of obvious, since their move always provide strictly optimal distance to us both by x and by y. And checking if a square is covered by other squares can be done in O(NlogN) using the sweeping line method and keeping an interval tree that stores how many times each cell is covered.

12:47 - meanwhile, 1 team has 7 problems and 9 teams have 6 problems.

12:56 - the standings are frozen. The only problem for me to solve is D. I'm going to the contest room to watch the balloons brought to the teams.

13:13 - three teams in the 7/6 group have submitted problems, but none of them seems to have the corresponding balloon, so it looks those were wrong submissions.

13:20 - still no new top teams with non-standard balloon colors. Maybe they've stopped delivering them?..

13:30 - it looks like new balloons are indeed not delivered to top teams at all. Warsaw team were congratulating each other twice, so I assume they have 7.

13:32 - it looks like Moscow got A!

13:43 - no further updates. The only way to figure something out is to look at team's exctited/disappointed reaction.

13:45 - or maybe Warsaw actually has 6?

13:55 - the contest is over! Still no news.


13:59 - Saratov has 7.

14:08 - Taurida has 6.

14:15 - SPbSU, Saratov, Moscow, Nizhny Novgorod, FAU, Donetsk, Ural, Krakow all have 7.

14:17 - Zhejiang has 8! Warsaw and ITMO have 6.

14:20 - Waterloo has 7!

14:21 - Michigan has at least 7!

14:23 - most likely results: Zhejiang, Tsinghua, SpbSU, Nizhny, Saratov, FAU, Donetsk, Krakow, Moscow, Michigan, Ural, Waterloo.

14:29 - no, it's not! Zhejiang - 8, Michigan - 8, Tsinghua, SpbSU, Nizhny, Saratov, FAU, Donetsk, Krakow, Moscow, Ural, Waterloo - all 7.

14:40 - that's it. I'll probably not update this post anymore, unless some interesting comments surface. Thanks for following my blog today!

ACM ICPC 2011 World Finals - pre-contest

ACM ICPC 2011 World Finals are happening these days in Orlando. Tomorrow, as usual, I will be doing live commentary on the tasks and on the contest itself. Today, about 9 hours before the start of the contest, I write about what happened before the contest - using style borrowed from Mike's notes.

On Friday, we went to see the ocean and the Kennedy Space Center, where most American space launches happen. The space center was very disappointing - it looks like the only reason to go there would be to view an actual launch (and the next launch is in July), that should be very impressive and maybe even educational. What we found there was:

Real-size models of most American rockets, Russian ships and the Shuttle:


Space-themed and not-so-space themed attractions and fountains (really nice in 30C heat):


The ocean is nice and warm, but comes with some jellyfish bites :)

On Saturday, the organizers took us to SeaWorld resort. The resort welcomes you with a traffic jam on the parking entrance:


They started with an IBM history recap given on a stadium where the killer whale show (Shamu) takes place. Not everyone was excited:


Then we were given lots of time to explore SeaWorld, which was quite exciting. The most exciting part was the dolphin show, but I'm yet to download it from the camera, so here goes the still part of SeaWorld:


On Saturday, the Opening Ceremony and two Practice Sessions took place. Here's how the contest area feels:


And here's how the teams look:


Good luck to all teams, and especially to the team in the last picture - and stay tuned for the live commentary tomorrow right in this blog! The contest starts around 9am local time (click the link for other timezones).