Wednesday, April 18, 2018

A second extremal week

The last week of March featured more contests from the regular platforms. First off, TopCoder SRM 732 took place very early on Friday (problems, results, top 5 on the left). Both the medium and the hard problem involved asymmetric games requiring quite some insight to solve, and thus only two contestants were able to solve each (top 4 in the table to the left). Congratulations to all four, and especially to pashka on the win!

Then AtCoder held its Grand Contest 022 on Saturday (problems, results, top 5 on the left, analysis). The contest had three relatively easy problems, and three very hard ones. Only a few contestants were able to solve one of the three (I couldn't solve any, for that matter), so it is extremely impressive that Um_nik got all three — huge congratulations on a very well-deserved victory!

Problem E was the most approachable of the three. Consider a string of 0s, 1s and ?s of odd length up to 300000. First, we need to replace each ? with a 0 or a 1. Then, we repeatedly do the following operation: take any 3 consecutive characters, and replace them with the most frequent character among them. For example, 001 is replaced with 0. In the end we end up with a string of length 1, and our goal is to get the string 1. How many ways are there to replace the ?s so that it's possible to get the 1 after all reductions?

And finally, Open Cup 2017-18 Grand Prix of Moscow on Sunday wrapped up the week, but its results have not yet been published.

In my previous summary, I have mentioned an Open Cup problem: you are given n points p1, p2, ..., pn on the plane and q queries (n, q <= 100000). Each query is defined by two numbers a, b, and you need to print the size of the smallest square with sides parallel to coordinate axes that contains all points from a-th to b-th (from the list pa, pa+1, ..., pb) except maybe one.

Assuming we forget about the "except maybe one" part, we need to find the bounding box of a segment of points, which can be done by finding the leftmost, rightmost, topmost and bottom-most points using interval trees in O(n*log(n)).

Now we can notice that when the skipped point is not one of the four extremal points, then the bounding box does not change, so we need to check at most four possibilities for the skipped point. In order to be able to find extremal points after skipping one point, our interval trees will need to hold two extremal points instead of one, but otherwise the solution stays the same.
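Here is a minimal sketch of one of the four trees, assuming the x-coordinates are stored in an array: each node keeps the two smallest values in its range, so a range query returns both the leftmost point and the runner-up that takes over when the leftmost point is skipped. The class and method names are my own, and a full solution would also track the indices of the extremal points and build symmetric trees for the other three directions.

import java.util.Arrays;

// Segment tree where every node stores the two smallest values in its range.
class TwoMinSegmentTree {
    private final int n;
    private final long[][] best; // best[node] = {smallest, second smallest} in the node's range

    TwoMinSegmentTree(long[] xs) {
        n = xs.length;
        best = new long[4 * n][];
        build(1, 0, n - 1, xs);
    }

    private static long[] merge(long[] left, long[] right) {
        long[] all = {left[0], left[1], right[0], right[1]};
        Arrays.sort(all);
        return new long[] {all[0], all[1]};
    }

    private void build(int node, int lo, int hi, long[] xs) {
        if (lo == hi) {
            best[node] = new long[] {xs[lo], Long.MAX_VALUE};
            return;
        }
        int mid = (lo + hi) / 2;
        build(2 * node, lo, mid, xs);
        build(2 * node + 1, mid + 1, hi, xs);
        best[node] = merge(best[2 * node], best[2 * node + 1]);
    }

    // Two smallest x-coordinates among points a..b (inclusive, 0-indexed), in O(log(n)).
    long[] query(int a, int b) {
        return query(1, 0, n - 1, a, b);
    }

    private long[] query(int node, int lo, int hi, int a, int b) {
        if (b < lo || hi < a) return new long[] {Long.MAX_VALUE, Long.MAX_VALUE};
        if (a <= lo && hi <= b) return best[node];
        int mid = (lo + hi) / 2;
        return merge(query(2 * node, lo, mid, a, b), query(2 * node + 1, mid + 1, hi, a, b));
    }
}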

Thanks for reading, and check back for more!

ACM ICPC 2018 World Finals stream on Twitch

We have figured out a working setup for our stream, so tune in to https://www.twitch.tv/petrmitrichev around 10:00 Beijing time tomorrow!

Tuesday, April 17, 2018

A group embedding week

VK Cup 2018 Round 2 took place during the Mar 19 - Mar 25 week (problems, results, top 5 on the left, parallel round results, analysis). Team Нижний Магазин SU: BZ would be first by far even without solving the last problem, but getting all problems accepted in the final minutes of the contest was of course the icing on the cake. Congratulations!

Open Cup 2017-18 Grand Prix of America wrapped up that week (results, top 5 on the left, parallel round results), with another two World Finals favorites in top 5: SPb ITMO 1 and Moscow SU Red Panda. The first place, however, went to team Past Glory — once again thanks to their incredible accuracy. Congratulations!

Problem K in this round was quite educational, if a bit professional. You are given n points p1, p2, ..., pn on the plane and q queries (n, q <= 100000). Each query is defined by two numbers a, b, and you need to print the size of the smallest square with sides parallel to coordinate axes that contains all points from a-th to b-th (from the list pa, pa+1, ..., pb) except maybe one. Can you dissect this problem into standard pieces?

In my previous summary, I have mentioned a Yandex.Algorithm problem: you are given a string s with 100000 characters, each a, b or c. You must swap exactly two distinct letters to obtain a new string t. How many ways are there to do that in such a way that the string t is good? In this problem we define a good string somewhat similarly to a valid parentheses sequence: empty string is good, if a string u is good then the strings aua, bub and cuc are good as well, and if two strings u and v are good, then their concatenation uv is also good.

The key insight in this problem is to learn to check if a string is good easily. Let's pick a group and 3 elements of order 2 in it: aa=1, bb=1, cc=1. We will then map each string to the product of the corresponding elements of the group. We can see that any good string will map to 1. Moreover, if our group is general enough, in other words does not have any other non-derived products that are equal to 1 (or if such products are simply unlikely to be equal to 1), then the opposite is also true: when a string maps to 1, it is good. As an example group that works in this problem we can use the group of movements of 3D space that keep the origin fixed, where a, b and c correspond to reflections through three random planes passing through the origin. This way, we obtain a compressed representation for a string of any length as a single 3x3 matrix (modulo some prime number, to avoid dealing with floating point).
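Here is a sketch of that mapping; the prime and the class and method names are my own choices. Each letter becomes a random reflection matrix R = I - 2*v*v^T/(v*v) modulo the prime (such an R satisfies R*R = I), and a string folds into the product of its characters' matrices, so a good string maps to the identity matrix.

import java.util.Random;

// Compressed representation of strings over {a, b, c} as products of random 3x3 reflections mod P.
class GroupHash {
    static final long P = 1_000_000_007L;
    final long[][][] letter = new long[3][][]; // matrices for 'a', 'b' and 'c'

    GroupHash(Random rnd) {
        for (int i = 0; i < 3; i++) letter[i] = randomReflection(rnd);
    }

    // Reflection through a random plane with normal vector v: R = I - 2*v*v^T/(v*v) modulo P.
    static long[][] randomReflection(Random rnd) {
        while (true) {
            long[] v = {rnd.nextInt((int) P), rnd.nextInt((int) P), rnd.nextInt((int) P)};
            long norm = (v[0] * v[0] + v[1] * v[1] + v[2] * v[2]) % P;
            if (norm == 0) continue; // v*v must be invertible modulo P
            long coef = 2 * modPow(norm, P - 2) % P; // 2 / (v*v)
            long[][] r = new long[3][3];
            for (int i = 0; i < 3; i++)
                for (int j = 0; j < 3; j++) {
                    r[i][j] = (P - coef * (v[i] * v[j] % P) % P) % P;
                    if (i == j) r[i][j] = (r[i][j] + 1) % P;
                }
            return r;
        }
    }

    static long[][] mul(long[][] a, long[][] b) {
        long[][] c = new long[3][3];
        for (int i = 0; i < 3; i++)
            for (int j = 0; j < 3; j++)
                for (int k = 0; k < 3; k++)
                    c[i][j] = (c[i][j] + a[i][k] * b[k][j]) % P;
        return c;
    }

    static long modPow(long x, long e) {
        long res = 1;
        for (x %= P; e > 0; e >>= 1, x = x * x % P) if ((e & 1) == 1) res = res * x % P;
        return res;
    }

    // Product of the matrices of all characters of s; a good string maps to the identity matrix.
    long[][] hash(String s) {
        long[][] res = {{1, 0, 0}, {0, 1, 0}, {0, 0, 1}};
        for (char ch : s.toCharArray()) res = mul(res, letter[ch - 'a']);
        return res;
    }
}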

We can then use the divide-and-conquer approach: we start by splitting our string in two in the middle, and consider the case where the two positions being swapped are in different halves. If we also try all 6 possibilities for the letters being swapped, we have a sub-problem of the following kind: we have two strings, and need to replace a single a with b in the first string, and a single b with a in the second string, so that after doing those replacements and concatenating the strings we get a good one.

We can compute the compressed representation of each candidate for the first half using prefix and suffix products, then compute the compressed representation of each candidate for the second half in the same way, and then for each representation of the first half look up its inverse element among the representations of the second half, thus processing the case where the swapped elements are in different halves in O(n).

In order to handle the case where both swapped elements are in the same half, we follow the divide-and-conquer approach and recursively execute the same algorithm for each half, with the only difference that the target product we want to get in each half is not 1, but rather the inverse of the product of the other half. This allows us to complete the solution of the entire problem in O(n*log(n)).

Thanks for reading, and check back soon!

A favorites week

The Mar 12 - Mar 18 week started with Yandex.Algorithm Round 2 on Tuesday (problems with Yandex login, results, top 5 on the left, my screencast). The hardest problem A was left unsolved despite 78 attempts, and yet the actual implementation is quite straightforward once one gets the idea. You are given a string s with 100000 characters, each a, b or c. You must swap exactly two distinct letters to obtain a new string t. How many ways are there to do that in such a way that the string t is good? In this problem we define a good string somewhat similarly to a valid parentheses sequence: empty string is good, if a string u is good then the strings aua, bub and cuc are good as well, and if two strings u and v are good, then their concatenation uv is also good.

TopCoder SRM 731 took place on Saturday (problems, results, top 5 on the left). mjhun was the fastest during the coding phase, but Gassa was close enough to require just one successful challenge to climb into the first place. Congratulations to both of you!

Finally, the Open Cup 2017-18 Grand Prix of Belarus happened on Sunday (problems, results, top 5 on the left). Team Past Glory was once again the fastest — well done! This round also provided a World Finals prediction perspective, as there are two teams in the top 5 who will be competing in Beijing: Seoul NU and Peking U. This year's World Finals looks to be extremely competitive, with top teams like Seoul NU, Peking U, SPb ITMO, Moscow SU, Warsaw U all of comparable strength (did I miss a team that you think is one of the favorites to win? Sorry, and please share in comments!) I'm looking forward to the final showdown on Thursday!

In my previous summary, I have mentioned a cute Open Cup problem: you are given an n times n matrix A of integers modulo a prime number p. You need to find the smallest positive integer k such that A^k = 0 (modulo p), or report that it does not exist, in O(n^3).

The answer never exceeds n, which in very rough terms can be derived from looking at Jordan normal forms (is there a simpler argument?..). This enables the following O(n^4) approach: just compute the first n powers of A. It can be sped up to O(n^3*log(n)) by using binary search, since once a power of A is zero, all following powers are zero as well.

In order to speed it up to O(n^3), we need to come up with a beautiful idea: instead of computing just A^k, we will compute A^k*v for some vector v of size n. If A^k = 0, then A^k*v = 0 as well. It turns out the opposite is almost always true: A^k*v over all v defines a linear subspace, which has at least one dimension in case A^k is not zero, and thus we can just pick v randomly and the probability of A^k*v being zero will not exceed 1/p.

Since we can multiply a matrix by a vector in O(n^2), we can compute v, A*v, A^2*v, ..., A^n*v in sequence in O(n^3) overall.
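A minimal sketch of this randomized check (the class and method names are my own), assuming p fits in an int and using the fact established above that the answer, if it exists, is at most n: multiply a random vector by A repeatedly and report the first power that sends it to zero.

import java.util.Random;

class NilpotencyIndex {
    // Returns the smallest k in [1, n] with A^k * v = 0 for a random v (which equals the smallest k
    // with A^k = 0 with probability at least 1 - n/p), or -1 if A is not nilpotent.
    static int smallestZeroPower(long[][] a, long p, Random rnd) {
        int n = a.length;
        long[] v = new long[n];
        for (int i = 0; i < n; i++) v[i] = rnd.nextInt((int) p);
        for (int k = 1; k <= n; k++) {
            long[] next = new long[n]; // next = A * v, computed in O(n^2)
            for (int i = 0; i < n; i++) {
                long s = 0;
                for (int j = 0; j < n; j++) s = (s + a[i][j] * v[j]) % p;
                next[i] = s;
            }
            v = next;
            boolean allZero = true;
            for (long x : v) if (x != 0) { allZero = false; break; }
            if (allZero) return k;
        }
        return -1; // A^n is nonzero, so A is not nilpotent and no such k exists
    }
}

One could repeat the whole check with a few independent random vectors to drive the error probability down further.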

Thanks for reading, and check back for more!

ACM ICPC 2018 World Finals stream

ACM ICPC 2018 World Finals take place this Thursday at 10:00 Beijing time (click for other timezones). Just like last year, we'll try to solve it in parallel with tourist and Endagorion, and stream the process, assuming the problems will be available for submission on Kattis.

I'm not yet sure if it's going to be on Youtube, which does not work well in China, or somewhere else. I will post an update here and on Twitter once we figure out the right setup. Stay tuned!

Saturday, April 14, 2018

A binary block week

The Mar 5 - Mar 11 week had two Codeforces rounds. Round 469 took place on Friday (problems, results, top 5 on the left, analysis). Since I'm writing this post on the plane to Beijing for ICPC 2018 World Finals, I can't help but look at the scoreboard from the ICPC angle: the winner dotorya will be participating in the Seoul NU team, and the third-placed Syloviaely is part of the host Peking U team. Congratulations on the great Codeforces result, and best of luck in the World Finals!

VK Cup 2018 Round 1 took place one day later (problems, results, top 5 on the left, parallel round results, analysis). Team VK Cup 24329020081766400008 from Saratov was already among the fastest in coding, but managed to stand out thanks to the challenges, even despite their solution for the hardest problem failing systests. Well done!

Finally, after a week's hiatus the Open Cup came back for another five-week marathon which started with the Grand Prix of Baltic (problems, results, top 5 on the left). Team Past Glory was not as fast as t.me/umnik_team, but they were more accurate, and thus won by a few penalty minutes. Congratulations to both teams on the great performance!

Problem E had a very simple statement and a very cute solution. You are given an n times n matrix A of integers modulo a prime number p. You need to find the smallest positive integer k such that A^k = 0 (modulo p), or report that it does not exist. It is relatively straightforward to do in O(n^3*log(n)), but your solution needs to run in O(n^3).

In my previous summary, I have mentioned a difficult Codeforces problem: consider an unknown binary string of length k, k is up to a billion. You're given up to 100000 segments [ai,bi], and know that each of those segments contains at least one 0. In a similar vein, you're also given up to 100000 segments [ci,di], and know that each of those segments contains at least one 1. How many such binary strings exist, modulo 10^9+7?

First, how can we approach this problem if we forget that k is very big? Let's split our string into blocks of 0s and 1s, and compute pj — the number of ways to choose the first j characters ending with a block of 0s, and qj — the number of ways to choose the first j characters ending with a block of 1s. To compute pj, we iterate over the ending point t of the previous block of 1s, and we see that pj is the sum of qt over the valid values of t. t must be at most j-1, and at least max(ci) over all i such that di<=j, to make sure that no [ci,di] segment contains only 0s. We can compute qj in a symmetric way.

Since max(ci) only ever increases as we increase j, we can maintain a sliding window of qt's together with their sum, and thus obtain an O(k) amortized time solution.
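Here is a sketch of that O(k) solution for the case when k is small enough to iterate over; the modulus 10^9+7 comes from the statement, while the array names and the use of prefix sums in place of the explicit sliding windows are my own.

// zeroSeg[i] = {a_i, b_i} must contain at least one 0, oneSeg[i] = {c_i, d_i} at least one 1; positions are 1..k.
static long countStrings(int k, int[][] zeroSeg, int[][] oneSeg) {
    final long MOD = 1_000_000_007L;
    // low1[j] = max c_i over one-segments with d_i <= j: the previous block of 1s must end at position >= low1[j].
    // low0[j] is the symmetric bound coming from the zero-segments.
    int[] low1 = new int[k + 1], low0 = new int[k + 1];
    for (int[] s : oneSeg) low1[s[1]] = Math.max(low1[s[1]], s[0]);
    for (int[] s : zeroSeg) low0[s[1]] = Math.max(low0[s[1]], s[0]);
    for (int j = 1; j <= k; j++) {
        low1[j] = Math.max(low1[j], low1[j - 1]);
        low0[j] = Math.max(low0[j], low0[j - 1]);
    }
    // p[j]: valid prefixes of length j ending with a block of 0s; q[j]: ending with a block of 1s.
    // sp[j] = p[0] + ... + p[j-1] and sq[j] = q[0] + ... + q[j-1] are prefix sums.
    long[] p = new long[k + 1], q = new long[k + 1];
    long[] sp = new long[k + 2], sq = new long[k + 2];
    p[0] = q[0] = 1; // q[0] stands for "the prefix consists of 0s only", and symmetrically for p[0]
    sp[1] = sq[1] = 1;
    for (int j = 1; j <= k; j++) {
        p[j] = (sq[j] - sq[low1[j]] + MOD) % MOD; // sum of q[t] for t in [low1[j], j-1]
        q[j] = (sp[j] - sp[low0[j]] + MOD) % MOD;
        sp[j + 1] = (sp[j] + p[j]) % MOD;
        sq[j + 1] = (sq[j] + q[j]) % MOD;
    }
    return (p[k] + q[k]) % MOD;
}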

Now let's look more closely at this process in case we don't encounter any new segment endpoints. Suppose we have just found pj and qj, and now are looking to find p(j+1) and q(j+1) while the left boundaries of our summation max(ai) and max(ci) stay the same. p(j+1) is the same sum as pj, but with the extra term qj added, so p(j+1) = pj + qj. Similarly, q(j+1) = pj + qj as well. So we get two equal numbers at the (j+1)-th step, and on each further step these numbers will just keep multiplying by two until we encounter a new segment endpoint and the summation left boundaries change.

This allows us to obtain an O(n*log(n)) solution, where n is the number of segments: instead of computing all individual values of pj and qj, we will split our string into blocks between the segment endpoints, and in each block we will only store the first number, since each following number is just twice the previous one. The formula for the sum of a geometric progression allows us to handle such blocks quickly.

Thanks for reading, and check back soon for more summaries and for some ICPC blogging!

Sunday, April 8, 2018

Google Code Jam qualification - last chance

There's about 4 hours left in the Google Code Jam 2018 qualification round. There were some stability issues with the new system earlier in the day, but they seem to have been resolved, so now's your chance to join the Code Jam if you haven't already. As usual I'm setting some problems this year, and I hope you'll find them interesting!

Monday, March 5, 2018

A power of two week

This week's contests were concentrated on the weekend. First, Yandex.Algorithm 2018 Round 1 has brought back the "open/blind" submission format on Saturday (problems with Yandex login, results, top 5 on the left, my screencast, analysis). It turned out that it doesn't really matter whether to choose open or blind if one gets all problems correct on the first try, and is the only one to solve everything :) Congratulations to tourist!

Problem E has a deceptively easy solution in the analysis, and yet it seems really hard to come up with. You are given a sequence of 100000 positive integers up to a billion. You need to find any integer k such that it's possible to transform our sequence into a strictly increasing sequence of positive integers by subtracting some of its elements from k (in other words, by replacing xi by k-xi for some set of i's). Can you see why this problem is actually quite simple?

On Sunday Codeforces hosted its Round 468 (problems, results, top 5 on the left, my screencast). It was V--o_o--V's turn to be head and shoulders above the competition, finishing all problems in less than an hour. Well done!

The hardest problem E involved finding an appropriate dynamic programming angle that allows "coordinate compression" to work. Consider an unknown binary string of length k, k is up to a billion. You're given up to 100000 segments [ai,bi], and know that each of those segments contains at least one 0. In a similar vein, you're also given up to 100000 segments [ci,di], and know that each of those segments contains at least one 1. How many such binary strings exist, modulo 10^9+7?

Last week, I have mentioned a TopCoder combinatorics problem: find the sum of 2^x1 (two to the power of the minimum number) over all possible ways to choose k distinct positive integers up to n: 1<=x1<x2<...<xk<=n, where n is up to a billion, and k is up to a million.

The key idea is: instead of summing 2^x1 for each combination, which seems hard to do, we will further subdivide each combination into 2^(x1-1) different objects, so that we just need to count the number of objects and multiply by 2. More specifically, if we have k distinct positive integers, and the smallest is x1, then there are exactly 2^(x1-1) ways to add some subset of the numbers between 1 and x1-1 to it. As a result, we get a set of at least k distinct positive integers. And moreover, we can obtain all such sets: each such set is obtained exactly once, from the combination formed by its k largest numbers. So we need to return 2 times the number of ways to choose at least k objects out of n, which is much easier to compute.
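A sketch of that final count, assuming the answer is wanted modulo 10^9+7 (the exact modulus is my assumption, as are the method names): the number of subsets of {1..n} of size at least k equals 2^n minus the sum of C(n,i) for i < k, and since n is up to a billion we build C(n,i) incrementally rather than from factorials.

static long countDoubled(long n, int k) {
    final long MOD = 1_000_000_007L; // assumed modulus
    long smaller = 0, binom = 1; // binom = C(n, i) mod MOD, starting from i = 0
    for (int i = 0; i < k; i++) {
        smaller = (smaller + binom) % MOD;
        // C(n, i + 1) = C(n, i) * (n - i) / (i + 1)
        binom = binom * ((n - i) % MOD) % MOD * modPow(i + 1, MOD - 2, MOD) % MOD;
    }
    long atLeastK = (modPow(2, n, MOD) - smaller + MOD) % MOD;
    return 2 * atLeastK % MOD;
}

static long modPow(long x, long e, long mod) {
    long res = 1;
    for (x %= mod; e > 0; e >>= 1, x = x * x % mod) if ((e & 1) == 1) res = res * x % mod;
    return res;
}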

Thanks for reading, and check back next week!

Monday, February 26, 2018

An infinite ratio week

This week had a round from each of the platforms I usually cover. First off, TopCoder held SRM 730 on Tuesday (problems, results, top 5 on the left, analysis). The competition was ultimately decided during the challenge phase where I was basically very lucky: my room had 8 incorrect solutions for the 250, and I've managed to bring down 4 of those; ACRush had only 4 incorrect solutions in his room, one of those was challenged by somebody else, and he got the other 3. Nevertheless, amazing comeback after an almost half-year break for ACRush!

Quite unusually, the hard problem was solved by ksun48 in less than 5 minutes. The problem asked to find the sum of 2^x1 (two to the power of the minimum number) over all possible ways to choose k distinct positive integers up to n: 1<=x1<x2<...<xk<=n, where n is up to a billion, and k is up to a million.

Next up was AtCoder Grand Contest 021 on Saturday (problems, results, top 5 on the left, analysis, my screencast). tourist was nearly unstoppable this time, earning his first place with about 25 minutes left in the contest. Egor in 3rd place had the most realistic chance to overtake him thanks to a creative strategy, as he decided to avoid spending more time to earn the fairly technical final 300 points in the partial-scoring problem F, and instead focused on problem E which would give 1200 points if solved. Unfortunately 25 minutes were still not enough for that. Nevertheless, congratulations to both!

Problem B was quite cute. You are given 100 points on the plane. For each given point, consider the part of the plane where it is the closest of the given points (a Voronoi diagram). Some of the parts will be finite, and some will be infinite. What are the relative areas of the infinite parts (see the problem statement for the formal description)?

Sunday started with the Open Cup Grand Prix of Saratov (problems, results, top 5 on the left). Six teams were able to solve 11 this time, but team Past Glory has managed to do that without a single incorrect attempt — amazing!

After a short break, Codeforces Round 467 wrapped up the week (problems, results, top 5 on the left, my screencast). mnbvmar has earned his first place by solving 4 problems in just over an hour, 15 minutes faster than anybody else — and then protected his lead from Syloviaely's surge by finding 5 successful challenges. Very well done!

Problem C provided a level playing field for experienced and inexperienced competitors, as it had nothing to do with standard algorithms or even standard ideas. You are given a string s of length 2000, and need to transform it into a string t of the same length using at most 6100 operations, or report that it's impossible. In one operation, you can split the current string into two, reverse the second part, and then put the first part after the reversed second part: for example, you can obtain the string fedcab from the string abcdef in one operation. Either part is allowed to be empty. Can you see a way to do the transformation of length n in at most 3n operations? Somewhat surprisingly, there are many working approaches in this problem.

Thanks for reading, and check back next week!

Sunday, February 18, 2018

A Fenwick bound week

Codeforces was quite active this week with two Division 1 rounds. The 462nd one took place on Wednesday (problems, results, top 5 on the left, analysis). Um_nik was the only one able to solve all problems, submitting one in the last minute — but he would've won without that one anyway. Congratulations on the great performance!

Round 463 took place on Thursday (problems, results, top 5 on the left, analysis). It was Radewoosh's turn to be the only competitor to solve all problems, with 7 minutes to spare this time. Well done!

He will be representing his university in the upcoming ICPC World Finals in Beijing, along with a few others that you can see in the screenshots on the left, so the competition there promises to be very interesting :)

Finally, the Open Cup Grand Prix of Gomel on Sunday provided another glimpse into full team performances (results, top 5 on the left). Moscow SU team Red Panda has found the winning ways again, this time thanks to being the only team to solve problem I — and doing it in the beginning of the third hour of the contest, which is quite unusual for a problem with only one solver. Congratulations!

Last week, I have mentioned a data structure problem from the Open Cup. n people are coming to lunch and forming a queue. The people are split into disjoint groups. When the i-th person comes to lunch, if there's nobody from their group ci already in the queue, they stand at the back of the queue. When there is somebody from their group already in the queue, they can also stand next to any such person (directly before or after), but only if they would be cutting in front of at most ai people by doing so. If they have multiple positions where they can join the queue according to the above restrictions, they always pick the front-most one. Initially the queue is empty. Given the values of ci and ai in the order the people come to lunch, what will the final queue look like?

One could immediately start thinking about standard approaches applicable in this problem. The fact that we need to be able to run a query against the last ai people in the queue suggests that we should probably use a balanced tree that supports quick splitting, like a treap, and maintain the size of the subtree plus some additional structures necessary to answer queries in each node. This approach can probably be made to work, but the nice thing about the problem is that we can obtain a much easier to code solution.

Since every person only ever joins the queue either in the last position, or next to a person from their group, if we maintain the queue as a list of segments of people from the same group, in other words as a list of (group, counter) pairs, then the only operations we need to support are incrementing a counter and adding a new group to the end of the list. Now in order to find a place for the next person we need to find the set of groups corresponding to the last ai+1 people, and then find the first group of the required type in that set.

In order to find the boundary on the group level, we need to be able to find the largest prefix for which the sum of counters does not exceed a given value. This can be done in O(log(n)) by maintaining the counters in a Fenwick tree; I will give more details below.

And in order to find the first group of the required kind after the given one, we can simply maintain a vector of indices of groups for each kind. Since we only ever create new groups at the end, these indices never change, and a simple binary search can find the first group of the given kind after a given index.

Finally, after we know what happens with the groups, we know at which position each person enters the queue, but we still need to output the final order. This can be done using the same Fenwick tree with the "largest prefix with sum not exceeding x" operation we already used: we go from the end and maintain the free spots in the Fenwick tree.

So instead of a balanced tree with extra information in nodes, we've managed to get by using two very easy to code data structures: a Fenwick tree, and a vector. The solution is O(n*log(n)), and the constant hidden in O() is also really small.

This solution used the following non-standard operation on a Fenwick tree: find the largest prefix with sum not exceeding x. A naive implementation using binary search and Fenwick prefix sums would give O(log^2(n)) complexity, which would most likely still be fast enough to get the problem accepted. However, tgehr has pointed out to me that the standard Fenwick tree allows an extremely simple O(log(n)) way to answer this question.

Suppose our Fenwick tree has n elements, and 2^k is the largest power of 2 not exceeding n. The (2^k-1)-th element of the Fenwick tree array contains the sum s of elements on the segment [0;2^k-1], so by comparing x with just this number we know if our answer is below or above 2^k. Let's suppose it's above 2^k. Then we notice that the (2^k+2^(k-1)-1)-th element of the Fenwick tree array contains the sum of elements on the segment [2^k;2^k+2^(k-1)-1], so by comparing x-s with just this number we know if our answer is below or above 2^k+2^(k-1). We continue traversing the powers of two in the same manner, and it turns out that the standard Fenwick tree stores exactly the sums needed to execute this binary search! Here's the actual code:

// Finds the length of the largest prefix whose sum does not exceed x. The array f is a 0-indexed
// Fenwick tree: f[i] stores the sum of the segment of length lowbit(i+1) ending at position i.
private int upperBound(int[] f, int x) {
    int res = 0;
    // max is the exponent of the largest power of two not exceeding f.length.
    int max = Integer.numberOfTrailingZeros(
        Integer.highestOneBit(f.length));
    for (int k = max; k >= 0; --k) {
        int p = res + (1 << k) - 1;
        if (p < f.length && f[p] <= x) {
            // f[p] is the sum of the block [res; res + 2^k - 1]: the block fits, take it whole.
            x -= f[p];
            res += 1 << k;
        }
    }
    return res;
}
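
As a usage sketch of the "go from the end" reconstruction mentioned above (the helper and variable names are my own): positionWhenInserted[i] is the number of people standing in front of person i at the moment they join the queue, and, processing people from the last to the first, each person takes the free slot of the final queue that has exactly that many free slots in front of it.

private void add(int[] f, int i, int delta) {
    // Point update matching the 0-indexed Fenwick tree layout used by upperBound.
    for (; i < f.length; i |= i + 1) f[i] += delta;
}

private int[] restoreFinalOrder(int[] positionWhenInserted) {
    int n = positionWhenInserted.length;
    int[] f = new int[n];
    for (int i = 0; i < n; i++) add(f, i, 1); // every slot of the final queue is initially free
    int[] finalOrder = new int[n];
    for (int i = n - 1; i >= 0; i--) {
        // The free slot with exactly positionWhenInserted[i] free slots before it.
        int slot = upperBound(f, positionWhenInserted[i]);
        finalOrder[slot] = i;
        add(f, slot, -1); // the slot is now occupied
    }
    return finalOrder;
}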

Thanks for reading, and check back next week.

Monday, February 12, 2018

A leafy week

This week featured two weekend contests. First, TopCoder SRM 729 took place on Saturday (problems, results, top 5 on the left, my screencast with commentary). If you sum up the problem columns in the screenshot on the left, you can notice that the sum doesn't match the score column, and that's because the match presented ample opportunities for challenging. You can see on the screencast how I tried to prepare an uber-challenge-case for the 450 during the intermission, and then spent the beginning of the challenge phase getting it to work, while many solutions were already being challenged. uwi has made the best use of the challenge opportunities and thus claimed the first place. Well done!

Most of the challenge opportunities presented themselves in the medium problem, which looked very standard at first glance. You are given a 1000x1000 grid. In one jump, you can move from a cell to any other cell that's at least the given distance d away — you can't jump very close. What is the smallest number of jumps required to get from one given cell to another given cell?

We could use a standard breadth-first search to solve this problem, but we have 1 million cells and potentially 1 million jumps from each cell, so the total running time would be on the order of 10^12 which is too slow. Can you see at least one way to speed this solution up? (there are many!)

The other weekend contest was the Open Cup 2017-18 Grand Prix of Khamovniki (results, top 5 on the left). Unlike the previous round, the active ICPC teams from Russia were not at the top of the standings, with only two ICPC teams from Asia and a veteran team Past Glory able to solve 10 problems — congratulations, and especially to Seoul National U 2 team for another Open Cup victory!

Problem D "Lunch Queue" was from the rare species of data structure problems that I enjoyed solving. n people are coming to lunch and forming a queue. The people are split into disjoint groups. When i-th person comes to the lunch, if there's nobody from their group ci already in the queue, they stand at the back of the queue. When there is somebody from their group already in the queue, they can also stand next to any such person (directly before or after), but only if they would be cutting in front of at most ai people by doing so. If they have multiple positions where they can join the queue according to the above restrictions, they always pick the front-most one. Initially the queue is empty. Given the values of ci and ai in the order the people come to lunch, how will the final queue look like?

In my previous summary, I have mentioned a hard AtCoder problem: you are given a rooted tree with n vertices, and each vertex contains an integer between 1 and n, each number appearing exactly once. Your goal is to rearrange the numbers in such a way that vertex 1 has number 1, vertex 2 has number 2, and so on. You're allowed to do the following transformation: take the path connecting the root of the tree with any vertex v, and cyclically rotate it, placing the number that was in root into v, the number that was in v into the parent of v, and so on — essentially a generalization of insertion sort from a single path to an arbitrary tree. You need to sort the tree in at most 25000 operations for n=2000.

I did not solve the problem myself, so I will describe the solution from the official analysis. First, we can notice that given any leaf, we can put the correct value into it in at most n operations (first pull it up to the root, then put it into the leaf in one rotation, and then never touch this leaf again), so we could sort everything in n^2 operations, which is too much.

However, we can notice that in this approach we have quite a lot of freedom for choosing the moves that pull the given number up. We could try to use this freedom to pull up the numbers for all leaves, not just for one leaf. But even that would not be enough, as when our rooted tree is a chain it only has one leaf. However, we know that for a chain a simple solution is possible, called the normal insertion sort: we'll do exactly n operations, and after k operations we'd have the k bottom numbers of the chain already sorted in the correct order, on each turn inserting the new number into the appropriate place.

Now we need to combine the idea of pulling the required number up from anywhere in the tree with the idea of filling a chain in one stage of O(n) operations, in such a way that the number of stages is O(log(n)) for any rooted tree. More precisely, we will find all leaf chains in the tree — chains that hang at the bottom with nothing else attached to them, and fill them all with correct values in one stage. This guarantees O(log(n)) stages since the number of leaves at least halves after each stage.

Whenever the root of the tree has a useful value — one that must be in one of the leaf chains — we will send it there, inserting it into the correct place relative to other values that we've sent to the same leaf chain, just like the insertion sort approach above. And when the root has a useless value, we need to send it somewhere, so let's send it to any position which contains a useful number, but such that all its subtree contains either only useless numbers, or useful numbers that are already placed into the corresponding leaf chain (in other words, the numbers that we don't need to get to the root anymore). This will push this useful number towards the root, and create a subtree that doesn't have any numbers that we want to get to the root.

Why does this cycle finish eventually, and more importantly why does it run in O(n) operations? Each useful value passes through the root only once. A useless value, after passing through the root, ends up being in a dormant subtree, and would never pass through the root again, unless we need to touch this subtree because we're putting a useful value into it. In each such operation, at most one value moves from a dormant subtree back into circulation, and thus can reach the root again. So the total number of times a useless value that we've already seen once returns to the root does not exceed the total number of times we put a useful value into its place, meaning that the total number of operations for one stage is at most n plus the total size of the leaf chains, so O(n).

Thanks for reading, and check back next week!

Saturday, February 10, 2018

A Red Panda week

Last week started with Codeforces Round 459 on Monday (problems, results, top 5 on the left, analysis). Once again the hardest problem was too much to handle, and OO0OOO00O0OOO0O00OOO0OO was quite a lot faster on the first four. Congratulations on the win!

On Saturday, AtCoder hosted an unusually long 5-hour contest called AtCoder Petrozavodsk Contest 001 (problems, results, top 5 on the left, my screencast, analysis). There was a huge difficulty gap between the first six problems, all of which were solved by 52 contestants, and the last four. Only two contestants have managed to solve two problems of the more difficult kind. Congratulations to Marcin and vepifanov on this amazing feat!

The most approachable problem of the difficult kind, problem H, went like this: you are given a rooted tree with n vertices, and each vertex contains an integer between 1 and n, each number appearing exactly once. Your goal is to rearrange the numbers in such a way that vertex 1 has number 1, vertex 2 has number 2, and so on. You're allowed to do the following transformation: take the path connecting the root of the tree with any vertex v, and cyclically rotate it, placing the number that was in root into v, the number that was in v into the parent of v, and so on — essentially a generalization of insertion sort from a single path to an arbitrary tree. You need to sort the tree in at most 25000 operations for n=2000.

Finally, the Open Cup season came back from the winter break on Sunday with the Grand Prix of Korea (results, top 5 on the left). Moscow State University team Red Panda have won the contest a whole problem ahead of the other super strong university and veteran teams. Congratulations on the victory!

Since the problems for this contest came from Korea, I can't help but wonder if top Korean/Asian ICPC teams have also solved this problemset elsewhere, or will do that later. Does anybody have some scoreboards to share?

Last week I have mentioned a 250-point TopCoder problem: you are given 50 integers, each up to a billion. In one step, you can take any integer that is greater than 1 and divide it by 2. If it was odd, you can choose whether to round up or down. What is the smallest number of steps required to make all integers equal?

My thinking went like this: instead of deciding immediately whether we'd round up or down, let's keep a set of possible values for each integer, which will always be a segment [min;max]. After dividing some numbers by 2 a few times, we'll have such segment for each number. Whenever these segments all have a non-empty intersection, we're done.

For example, if our initial numbers are 3, 5, and 13, then we start with segments [3;3], [5;5] and [13;13]. After dividing the last number by 2 we have [3;3], [5;5] and [6;7]. After doing it again we have [3;3], [5;5] and [3;4]. After dividing the second number by 2 we have [3;3], [2;3] and [3;4]. Now all our segments intersect in point 3, so we can make all numbers equal to 3.

The only remaining thing is to decide which numbers to divide by 2, but it actually flows naturally from the above observation. What does it mean when all segments do not have an intersection? It means that there are two segments [a;b] and [c;d] that do not intersect (this is not true for arbitrary sets, but it is true for segments), without loss of generality we have b<c. But then we must divide the segment [c;d] by 2, otherwise it will never intersect with [a;b] or its descendants. So as long as we don't have an intersection, we can always find a segment to divide by 2 this way.
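Here is a minimal sketch of that process, with my own variable names: keep the segment [lo;hi] of reachable values for each number, and while the segments have no common point, halve the segment with the largest lower bound, since it does not intersect the segment with the smallest upper bound and therefore must be divided.

static int minSteps(int[] numbers) {
    int n = numbers.length;
    int[] lo = new int[n], hi = new int[n];
    for (int i = 0; i < n; i++) lo[i] = hi[i] = numbers[i];
    int steps = 0;
    while (true) {
        int maxLo = 0, minHi = 0;
        for (int i = 0; i < n; i++) {
            if (lo[i] > lo[maxLo]) maxLo = i;
            if (hi[i] < hi[minHi]) minHi = i;
        }
        if (lo[maxLo] <= hi[minHi]) return steps; // all segments share a common point
        // Dividing by 2 with either rounding turns [lo;hi] into [lo/2; (hi+1)/2].
        lo[maxLo] = lo[maxLo] / 2;
        hi[maxLo] = (hi[maxLo] + 1) / 2;
        steps++;
    }
}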

Thanks for reading, and check back in a couple of days for this week's summary!

Sunday, January 28, 2018

A TopCoder week

TopCoder SRM 728 was this week's contest (problems, results, top 5 on the left, analysis, my screencast). The hard problem was a relatively straightforward application of the Burnside's lemma, which I described in this blog more than 9 (!) years ago. The easy problem allowed a ton of completely different correct approaches, and thus in a sense was more interesting: you are given 50 integers, each up to a billion. In one step, you can take any integer that is greater than 1 and divide it by 2. If it was odd, you can choose whether to round up or down. What is the smallest number of steps required to make all integers equal?

I've also just realized that I forgot to cover the previous SRM, TopCoder SRM 727 which took place two weeks ago very early in the morning (problems, results, top 5 on the left, analysis). lych_cys, participating in Division 1 for the first time, has squeezed the victory during the challenge phase from wxh010910, who was doing only his third Division 1 round himself, while the rest of the pack was more than 100 points behind. Congratulations to the winning newcomers!

Thanks for reading, and check back next week!

Monday, January 22, 2018

A mean median week

This week Codeforces hosted its Round 458, also known as CodeCraft-18 (problems, results, top 5 on the left, analysis). Syloviaely has made a solid claim to the first place by submitting the 3000-point problem F just 13 minutes after the contest started (was it a previously known problem, or was it really fast solving?..), but SkyDec has gathered a whopping 800 points from challenges on two different problems, 450 on B and 350 on C, and thus finished first with quite a margin. Congratulations!

This round also had a great side story — the top-rated contestant, Um_nik, decided to go for the hardest problem H, and that bet did not play out well as he spent too much time on it and as I understood also discovered an incorrect output in one of the samples (which means an incorrect reference solution?..). The admins have decided to make the round unrated for him because of the incorrect sample issue, but he asked them to reverse that decision (and thus make him lose rating) because he deemed the issue did not impact him significantly. Kudos to Um_nik for the admirable honesty!

Last week, I have mentioned an AtCoder problem: you are given n<=2000 positive integers, each up to a<=2000. Consider all 2^n-1 nonempty subsets, and compute the sum of the numbers in each subset. Now sort those sums, and find the median sum — the 2^(n-1)-th in the sorted order. You need to output this median sum. Time limit is 2 seconds.

At first sight, the problem appears somewhat standard: we can run the knapsack dynamic programming that computes the number of ways to get each possible sum, and then go from lower sums to higher sums until we get 2^(n-1) in total.

This has two issues, though: first, the maximum sum is n*a, so the dynamic programming runs in O(n^2*a). Second, the number of ways to get each sum can be on the order of 2^n, so we need big integers, and the running time after accounting for that is more like O(n^3*a). Even O(n^2*a) is too slow.

Here comes the insight: notice that all sums can be split in pairs. If the sum of all numbers is s, then for each subset with sum x we'll have the complementary subset with sum s-x (let's also include the empty subset). One of those sums is <=s/2, and the other is >=s/2, so the median is roughly equal to s/2. More precisely, the median will always be the smallest sum that is >=s/2.

This allows us to get rid of the big integers, as now we merely need to know which sums are possible, and don't care about the number of ways anymore. Moreover, we can use bitmasks to speed up this O(n^2*a) solution ~32-64 times, which makes it fast enough.
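A sketch of that bitmask knapsack, with my own names: bit i of the bitset says whether sum i is achievable by some subset, 64 sums are packed into each long, each number shifts the whole bitset left, and the answer is the smallest achievable sum that is at least s/2.

static long medianSubsetSum(int[] numbers) {
    long total = 0;
    for (int x : numbers) total += x;
    int words = (int) (total / 64) + 1;
    long[] bits = new long[words];
    bits[0] = 1; // sum 0 is achievable (the empty subset)
    for (int x : numbers) {
        // bits |= bits << x, processed from the top so that each number is used at most once
        int wordShift = x / 64, bitShift = x % 64;
        for (int i = words - 1; i >= wordShift; i--) {
            long shifted = bits[i - wordShift] << bitShift;
            if (bitShift > 0 && i - wordShift - 1 >= 0)
                shifted |= bits[i - wordShift - 1] >>> (64 - bitShift);
            bits[i] |= shifted;
        }
    }
    for (long sum = (total + 1) / 2; sum <= total; sum++)
        if ((bits[(int) (sum / 64)] >>> (sum % 64) & 1) == 1) return sum;
    return -1; // unreachable: the sum of all numbers is always achievable
}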

Thanks for reading, and check back next week!

Monday, January 15, 2018

An Um_nik week

This week's contests shared one of the problemsetters — and the winner. On Monday, Codeforces hosted its Hello 2018 round (problems, results, top 5 on the left, analysis). With the hardest problem having too much piecewise polynomial integration, the competition focused on the remaining seven. Only Um_nik could get them all, and even with 24 minutes to spare — congratulations!

On Sunday, AtCoder Grand Contest 020 started the race to Japan (problems, results, top 5 on the left, my screencast, analysis). Once again the hardest problem by tourist was a tiny bit too hard (and I even solved it after the contest by integrating polynomials, probably inspired by the analysis of the previous contest), and once again Um_nik was way faster than everybody else on the remaining problems — big congratulations again!

I found problem C quite cute. You are given n<=2000 positive integers, each up to 2000. Consider all 2^n-1 nonempty subsets, and compute the sum of the numbers in each subset. Now sort those sums, and find the median sum — the 2^(n-1)-th in the sorted order. You need to output this median sum. Time limit is 2 seconds.

In my last post I've discussed what was arguably the hardest contest problem of 2017: you are given an undirected graph with at most 200000 vertices and edges. Your goal is to split its vertices into four disjoint groups A, B, C, D in such a way that:
  • The subgraph formed by groups A+B is connected and has at least 2 vertices.
  • The subgraph formed by groups C+D is connected and has at least 2 vertices.
  • Each vertex in A has an edge to each vertex in C.
  • There are no other edges between subgraphs A+B and C+D.
If there are many possible solutions, you need to output any one of them.

Thanks to great ideas and pointers by Swistakk, mnbvmar and lewin, we were able to put together an O(n*log(n)) solution — you can check it out in this comment thread.

Thanks for reading, and check back next week!

Monday, January 8, 2018

An Otherland week

The first week of 2018 was quite calm, with the most devoted taking stabs at the still-ongoing Prime New Year contest by snarknews. Only one problem remains unsolved there, and it looks like it might take some team effort to crack it. I think it's within the spirit of the contest to discuss its problems publicly (after all, many even have analysis posted online), so I propose to do just that. Please skip the following paragraphs if you don't want spoilers!

The problem goes like this: you are given an undirected graph with at most 200000 vertices and edges. Your goal is to split its vertices into four disjoint groups A, B, C, D in such a way that:
  • The subgraph formed by groups A+B is connected and has at least 2 vertices.
  • The subgraph formed by groups C+D is connected and has at least 2 vertices.
  • Each vertex in A has an edge to each vertex in C.
  • There are no other edges between subgraphs A+B and C+D.
If there are many possible solutions, you need to output any one of them.

I have been thinking about the problem for some time, but only have the following not very helpful observations:
  • If the graph has an articulation point, then we can take this point as set A, any of the components it separates as set B, and the rest as C+D. So we can assume the graph is vertex-biconnected.
  • The answer is not always "Yes" - the smallest counterexample is the graph with 4 vertices and 5 edges. This example also shows that if a biconnected graph has a vertex of degree 2, then this vertex and both adjacent vertices must be in the same half of our splitting, since both ends of a splitting edge must have degree of at least 3 (two splitting edges plus one inside the half).
  • There are also counterexamples without vertices of degree 2.
  • There might be some hashing trick involved. For example, suppose we assign a random number ai to each vertex, and compute for each vertex the value sum(ai-aj), where the summation goes over all adjacent vertices j. Then if we add up those values within one half of the graph (A+B in the above terminology), then almost everything will cancel out, leaving us with |C|*aA-|A|*aC, where aA is the sum of ai over the vertices in A (and aC is defined similarly). But where to go from here?..
Please share your ideas!

In my last post, I have held a vote for the best problem of 2017. And the winner is this AtCoder problem by maroonrk: you are given two rooted trees on the same set of 10^5 vertices. You need to assign integer weights to all vertices in such a way that for each subtree of each of the trees the sum of weights is either 1 or -1, or report that it's impossible. The solution is explained in this post. Congratulations to the problemsetter and to AtCoder, and thanks to all problemsetters of 2017!

Thanks for reading, and check back next week.