Algorithms Weekly by Petr Mitrichev: A week to make 1

The Oct 28 - Nov 3 week had two weekend rounds. TopCoder SRM 770 was the first (problems, results, top 5 on the left, my screencast with commentary). bqi343 has demonstrated excellent raw speed and was ahead by a bit less than 100 points before the challenge phase, but then tourist has added his excellent challenge form to the mix and overtook bqi343. Congratulations to both!

I was struggling quite a bit in this round, and ended up solving both the 450 and the 900 with the solutions that were more complex than the intended ones. For the 450, instead of a direct min-cost-max-flow formulation, I've used a combination of min-cost matching of the given size and greedy which might or might not be equivalent.

For the 900, I've taken the solution that was intended to fail because of precision issues and made it pass using quite some squeezing, including switching to C++ to use long double and having a magic constant of 1.02 that allowed to balance between time limit exceeded and floating point overflow (it turned out that it was possible to squeeze in this solution in Java using doubles, too — but only with the magic constant equal to 1.007, while 1.008 led to overflows and 1.005 to time limit exceeded).

AtCoder Grand Contest 040 followed on Sunday (problems, results, top 5 on the left, my screencast with commentary in Russian, analysis). The same two contestants occupied the top two spots as in the SRM, but the margin was much bigger this time, with tourist solving five problems a good half an hour before the others. Well done!

This was already the fifth AGC authored by maroonrk this year — amazing productivity at such high level! My GP30 scores for those are 29, 100, 0, 0, 2 :) While it started nicely, I seem to be unable to crack his problems in most recent rounds (I think I did skip one of the rounds altogether, though).

I was mostly stuck with problem D in this round, but let me highlight problem C which uses an apparently well-known but still beautiful trick: a string of even length of characters A, B and C is called valid if you can reduce it to an empty string by repeating the following operation: take any substring of length 2 that is neither AB nor BA, and erase it (pushing the two remaining parts back together). You are given an even number n up to 10⁷. How many strings of length n are valid, modulo 998244353?

In my previous summary, I have mentioned a Codeforces problem: you have n<=16 positive integers such that their sum does not exceed 2000 (let's denote that constraint as m), and another integer k, 2<=k<=2000. All n integers are not divisible by k. In one step, we can erase two numbers and instead write their sum divided by k repeatedly until the result is no longer divisible by k. For example, if k=2 and we have numbers 5 and 19, we can erase them and instead write 3 as 3=(5+19)/2/2/2 and 3 is not divisible by 2. Our goal is to end up with having just the number 1.

The solution with a 3ⁿ term in its complexity is somewhat straightforward: for each subset of numbers, we will compute which integers we can obtain from them. To process a subset, we will iterate over all ways to split it into two subsets which will be collapsed into the last two numbers we erase for this subset to obtain its final number, and over all possibilities for those two numbers. Since all numbers we get do not exceed m, and the number of (set, subset) pairs is 3ⁿ, this solution runs in O(3ⁿ*m²) which is too slow.

The key to obtaining a faster solution lies in an observation that looks almost trivial from the first glance: since each number is divided by k one or more times, either by itself or after being added to some other numbers, in the end we represent one as 1=sum(a_i*k^-b_i). It turns out that the inverse is also true: if we can find the values of b_i that satisfy the above equation, then there is a way to collapse everything into 1. To see that, we can find the highest value among b_i, notice that it will be repeated at least twice (as otherwise sum(a_i*k^-b_i) will not be an integer), and therefore we can take the two a_i's corresponding to the highest value among b_i and combine them according to the procedure to the problem statement to get a smaller set of values a'_i and b'_i where 1=sum(a'_i*k^-b'_i) still holds.

We have reformulated our problem in a more mathematical way, but did we actually change anything compared to the original formulation? It turns out we did, as now our search can disregard the tree-like order in which we combine the elements in the original formulation. Instead, let's accumulate the terms of sum(a_i*k^-b_i) in decreasing order of b_i. Note that this is not equivalent to the original tree-like order even though the first two elements being joined are the same (two with the highest value of b_i). The joins after those first two elements can be different, as it can happen that we divide their sum by k a lot of times.

Accumulating in decreasing order of b_i can be reformulated in the following way: we have our current sum s that starts with 0, and can do two things with it: either we take a number a_i that was not previously taken and add it to the sum (s+=a_i), or we divide the sum by k (s/=k). Note that we can only divide the sum by k when the current sum is divisible, but unlike the original process from the problem statement we are not forced to divide if it is.

This means that the dynamic programming which computes "what value of s can be obtained if we have taken the numbers from a given set" does not need to iterate over subsets, and instead can add items one by one, leading to O(n*2ⁿ*m) complexity.

Thanks for reading, and check back for more!

Algorithms Weekly by Petr Mitrichev

Friday, December 27, 2019

A week to make 1

No comments:

Post a Comment