Algorithms Weekly by Petr Mitrichev: A random week

The Feb 24 - Mar 1 week had two Codeforces rounds and an Open Cup. The first one was called Kotlin Heroes: Episode 3 and accepted solutions in Kotlin only (problems, results, top 5 on the left, analysis). Only Egor and tourist managed to solve all problems, with the former seemingly writing in Java and converting the code to Kotlin automatically right before the submission, and the latter using more idiomatic Kotlin style. The Java way was the better way this time, in large part thanks to tourist's 6 incorrect submissions. Congratulations to Egor!

The Open Cup round was called the Grand Prix of North America, and it was based on the problems from ICPC North American Championship (problems, results, top 5 on the left, onsite results, video analysis). Team Almost Retired really aced it this time, solving 12 problems from 12 attempts, congratulations! The performance of Gennady Korotkevich is also worth mentioning, as he solved all problems in order despite the fact that two most difficult problems were D and E, most likely because he did not know it would become an Open Cup round when he was solving the online mirror of NAC.

I found problem I educational: you are given a string s of length up to 10, and a positive integer d also up to 10. How many strings consisting of letters A-Z are at the Levenshtein distance of exactly d from s? Compute the answer modulo 998244353.

The second Codeforces round of the week was the Round 625 (problems, results, top 5 on the left, parallel round results with more problems, analysis). ksun48 was the only contestant to solve all problems, and 300iq achieved the same feat in the parallel round. If one excludes the problems A and C from the parallel round that did not appear in Round 625, ksun48 and 300iq would be neck and neck. Congratulations to both!

In my previous summary, I have mentioned two problems. The first one came from the Open Cup: find three points in three-dimensional space with integer coordinates up to 10⁶ that would present a hard testcase for floating-point geometry code. More specifically, let's project these three points to the unit sphere (x/sqrt(x²+y²+z²), y/sqrt(x²+y²+z²), z/sqrt(x²+y²+z²)). The plane passing through those projections must pass very close to the origin, but not exactly through it: the distance from the origin to this plane must not exceed 1.5*10^-19. Moreover, probably for the sake of the numeric stability of the checker, the three points have to be far apart on the unit sphere: at least 1.7 from each other (note that the diameter of the unit sphere is 2).

This problem was solved by Ilya Kornakov in our team, so I'm explaining his solution. First, let's find some way to generate random 3x3 matrices with determinant 1/-1 and values up to a 10⁶ by absolute value. We did it by starting with a random upper triangular matrix with ones or negative ones on the diagonal, and then trying 100 times to add or subtract one of the rows/columns to another row/column (which does not change the determinant), only doing the addition/subtraction if the values stay small enough.

Then, it turns out that such matrix has a reasonable probability of being our answer! To see why, note that the matrix having determinant 1 means that the corresponding pyramid has volume 1/6. Its height can be computed using the formula h=3*volume/base_area. Since volume is 1/6, we have h=0.5/base_area. base_area can be computed as one half of a determinant of a 2x2 matrix with values up to 2*10⁶, so it is up to 4*10¹², therefore its height could be as small as 0.125*10^-12.

Now, the problem asks for the height of a different pyramid: the one obtained by projecting the original three points onto the unit sphere. Let's split this projection into two parts:

Projecting all points into a sphere with radius equal to minimum distance l from origin to one of our three points (one of the points remains unchanged in this step).
Contracting that sphere l times.

The second part reduces the height of the pyramid l times, but what about the first part? We claim that it's very likely that the height of the pyramid decreases during this step! I don't have a strict proof here, but the picture on the left demonstrates two-dimensional analogy: the height AF of triangle ABD is bigger than the height AE of triangle ABC because it intersects the segment BC, and AE is the shortest path from A to BC. This argument breaks if F lands outside of the segment BD (more precisely, the line BD has to intersect the dotted circle, which is an even stronger requirement). Since the problem requires the three points to be at least 1.7 apart on the unit sphere, they form a pretty big triangle and therefore it's likely that the height still passes through it after the projection.

Since l can be up to sqrt(3)*10⁶, the final pyramid could have its height as small as 0.125*10^-12/(sqrt(3)*10⁶) which is approximately 7*10^-20. Our goal is 1.5*10^-19, so we have a ~2x gap, and after a sufficient number of attempts we will find a good example.

Note that the argument above also gives us a way to obtain a numerically stable upper estimate of the height: h=0.5/base_area/l. Computing the required height directly is hard to do in double or even long double, but this estimate is computed very precisely as there's no subtraction of close numbers at all. If this estimate is below 1.5*10^-19, the true height is even lower and therefore is also small enough.

Of course the fact that the first projection above does not increase the height is the most shaky part of this solution. To our defense, I can say that I've ran it with three different random seeds and it got accepted every time :)

What was your approach? Is there a way to avoid shaky arguments?

The second problem I mentioned came from Codeforces: you are given a complete directed graph on n vertices (n<=80), with each arc having its own non-negative cost. Your goal is to find a cyclic route of length exactly k (k<=10) from vertex 1 to itself with the lowest possible cost such that this route does not have sub-cycles of odd length (therefore k is always even).

The route not having sub-cycles of odd length is equivalent to saying that the vertices of the route can be colored with two colors such that the colors always alternate along the route. Since the length of the route is at most 10, it has at most 9 intermediate vertices. Therefore if we pick a random coloring of all vertices, with probability 1/2⁹=1/512 we will guess the right colors for the route from the optimal solution!

This leads us to the solution: let's repeatedly pick a random coloring of all vertices, and find the shortest cycle of length k where the colors alternate. We will do this until the time runs out, and print the shortest cycle that we found among all attempts. Finding the shortest cycle of length k can be done in O(n²*k) in a pretty standard fashion.

Since the probability of failure of each particular attempt is 511/512, if we manage to do t attempts our probability of failure will be (511/512)^t. For t=10000, this gives about 3*10^-9, which is good enough, and 10000 runs of a O(n²*k) algorithm should be fast enough for n<=80 and k<=10.

I find the way randomization helps us make 9 tough decisions pretty amusing :)

Thanks for reading, and check back for more!

Algorithms Weekly by Petr Mitrichev

Saturday, April 4, 2020

A random week

No comments:

Post a Comment