Monday, January 22, 2018

A mean median week

This week Codeforces hosted its Round 458, also known as CodeCraft-18 (problems, results, top 5 on the left, analysis). Syloviaely has made a solid claim to the first place by submitting the 3000-point problem F just 13 minutes after the contest started (was it a previously known problem, or was it really fast solving?..), but SkyDec has gathered a whopping 800 points from challenges on two different problems, 450 on B and 350 on C, and thus finished first with quite a margin. Congratulations!

This round also had a great side story — the top-rated contestant, Um_nik, decided to go for the hardest problem H, and that bet did not play out well as he spent too much time on it and as I understood also discovered an incorrect output in one of the samples (which means an incorrect reference solution?..). The admins have decided to make the round unrated for him because of the incorrect sample issue, but he asked them to reverse that decision (and thus make him lose rating) because he deemed the issue did not impact him significantly. Kudos to Um_nik for the admirable honesty!

Last week, I have mentioned an AtCoder problem: you are given n<=2000 positive integers, each up to a<=2000. Consider all 2n-1 nonempty subsets, and compute the sum of the numbers in each subset. Now sort those sums, and find the median sum — the 2n-1-th in the sorted order. You need to output this median sum. Time limit is 2 seconds.

At first sight, the problem appears somewhat standard: we can run the knapsack dynamic programming that computes the number of ways to get each possible sum, and then go from lower sums to higher sums until we get 2n-1 in total.

This has two issues, though: first, the maximum sum is n*a, so the dynamic programming runs in O(n2*a). Second, the number ways to get each sum can be on the order of 2n, so we need big integers, and the running time after accounting for that is more like O(n3*a). Even O(n2*a) is too slow.

Here comes the insight: notice that all sums can be split in pairs. If the sum of all numbers is s, then for each subset with sum x we'll have the complementary subset with sum s-x (let's also include the empty subset). One of those sums is <=s/2, and the other is >=s/2, so the median is roughly equal to s/2. More precisely, the median will always be the smallest sum that is >=s/2.

This allows us to get rid of the big integers, as now we merely need to know which sums are possible, and don't care about the number of ways anymore. Moreover, we can use bitmasks to speed up this O(n2*a) solution ~32-64 times which makes it fast enough.

Thanks for reading, and check back next week!

No comments:

Post a Comment