Saturday, May 26, 2018

An unpublished week

The May 7 - May 13 week was the time for the first regional event of TopCoder Open 2018 in Warsaw. The problems and results seem to be unavailable on the TopCoder website, but you can read the recap here.

On Sunday of the same week the 2017-18 season of the Open Cup was wrapped up with the Grand Prix of Bashkortostan (results with Open Cup login, top 5 on the left). The overall season standings are not yet available, but there's no doubt about the first place of team Past Glory. In this round they have reiterated how huge the gap between them and the rest of the field really is, solving all problems on the first attempt in 2 hours and 24 minutes, and finishing with almost half the penalty time of the second-placed team. Huge congratulations!

Problem I in this round was relatively standard, and yet there was a lot of room for coming up with a solution that is easy to implement, as the constraints were relatively low. You are given the coordinates of the vertices of a convex polyhedron in three-dimensional space. You need to determine a horizontal plane that splits the polyhedron's volume in a 1:9 ratio. The number of vertices is at most 100. How would you implement this?

Thanks for reading, and check back for more!

Friday, May 25, 2018

A prefix xor week

The Apr 30 - May 6 week was about early qualification rounds for the major tournaments. TopCoder Open 2018 Round 1B took place on Thursday (problems, results, top 5 on the left, parallel round results, analysis). The medium problem provided ample challenge opportunities, and Michael_Levin capitalized on them and earned the top spot by quite a margin — congratulations!

In the parallel round, kotamanegi was able to find even more challenges, and thus earned the top spot in the all-time high score list — congratulations as well :)

Google Code Jam 2018 Round 1C followed on Saturday (problems, results, top 5 on the left, analysis). The somewhat unusual interactive second problem did not delay Eryx much, as he finished the round in just over 35 minutes (8 seconds faster than austrin in 3rd place). Well done!

In my previous summary, I have mentioned a cute AtCoder problem: you are given a tree with 200000 vertices, each containing a number, either 0 or 1. We need to put all its vertices in some order in such a way that each vertex except the root appears to the right of its parent. We will then write out a sequence of 0s and 1s corresponding to numbers in the vertices in this order. What is the minimum possible number of inversions in this sequence? An inversion is a pair of positions such that the number in the left position is 1, and the number in the right position is 0.

The key idea that is widely applicable in this type of problem is called the exchange argument. Suppose we have some way to put the vertices in order, which results in some sequence of 1s and 0s. Let's consider two consecutive blocks of numbers in the resulting sequence: say the first block contains a 1s and b 0s, and the following block contains c 1s and d 0s. What happens to the number of inversions if we swap those blocks? The inversions within each block and with the rest of the sequence stay the same, while the inversions between the two blocks change from a*d to c*b, so the total changes by cb-ad. Thus in an optimal sequence, for every pair of consecutive blocks that are swappable (with respect to the parent-child constraint) we must have cb-ad>=0, which we can rewrite as c/d-a/b>=0.

Now suppose we have identified the optimal sequence for all children of some vertex, and want to find the optimal sequence for the vertex itself. The number for this vertex needs to go first, and then we need to interleave the sequences for the children somehow. In theory we could also reorder the optimal sequences for the children, but intuitively it is not necessary, and it will become clearer after we complete the solution.

Interleaving the sequences for the children can be viewed as splitting each child sequence into blocks, and then interleaving those blocks. From the exchange argument above, we need to take the block with the lowest fraction of 1s every time (lowest value of a/b).

There are many ways to split each child sequence into blocks, but we can notice that the first block must be at least the prefix p with the lowest fraction of 1s: in case we take a smaller prefix, the remaining part of p has an even lower fraction of 1s, and thus by the exchange argument above we'll be able to drag it to the beginning and improve the solution.

By applying the same argument repeatedly, we arrive at the following solution: each child sequence is split into blocks by taking the prefix with the lowest fraction of 1s as the first block, then the prefix with the lowest fraction of 1s of the remaining sequence as the second block, and so on. It's not hard to see that the fractions of 1s in those blocks for one child sequence will be nondecreasing, so in order to interleave those blocks we just sort them by the fraction, and blocks within one sequence will keep the same order.

It is, however, too slow to split each child sequence into blocks for every vertex, as potentially each sequence has a linear number of blocks, so the overall algorithm would be quadratic. However, since we interleave the blocks in nondecreasing order of fraction, we can notice that the sequence of blocks for each vertex is obtained by interleaving the sequences for the children — we don't need to re-split it. Then we prepend either a 0 or a 1, corresponding to the vertex itself. If it's a 0, we can simply make it a separate block and we'll still get a nondecreasing sequence. If it's a 1, it will "eat" a few following blocks until our sequence becomes nondecreasing.

This leads to the following approach: for each vertex, we'll build a sequence of blocks, ordered by the fraction of 1s, stored as a balanced search tree or a priority queue. In order to build the sequence for a vertex, we need to merge the sequences from its children, add a new block in the beginning corresponding to the value in the vertex, and then repeatedly merge that block with the next block while its fraction is higher. When merging, we apply another relatively standard trick and insert the blocks from the smaller sequence into the larger sequence, which guarantees that each block is inserted at most log(n) times, and thus the overall running time is O(n*log^2(n)).
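
Here is a rough sketch of this approach in Java (my own reconstruction, not reference code). It assumes the vertices are numbered so that parent[v] < v for every non-root vertex, with vertex 0 as the root, and that value[v] holds the 0 or 1 written in vertex v; both array names are hypothetical.

import java.util.*;

public class TreeInversions {
    // A sketch of the block-merging solution described above: a block is a
    // {ones, zeros} pair, and heap[v] holds the blocks of the sequence built
    // so far for the subtree of v, ordered by increasing fraction of ones.
    static long minInversions(int n, int[] parent, int[] value) {
        Comparator<long[]> byFraction =
                (a, b) -> Long.compare(a[0] * b[1], b[0] * a[1]);
        List<PriorityQueue<long[]>> heap = new ArrayList<>();
        for (int v = 0; v < n; v++) heap.add(new PriorityQueue<>(byFraction));
        long answer = 0;

        for (int v = n - 1; v >= 0; v--) {           // children before parents
            PriorityQueue<long[]> h = heap.get(v);   // children are already merged in
            long[] cur = value[v] == 1 ? new long[]{1, 0} : new long[]{0, 1};
            // the vertex's own value becomes the front block and "eats" every
            // following block with a strictly smaller fraction of ones
            while (!h.isEmpty() && byFraction.compare(h.peek(), cur) < 0) {
                long[] b = h.poll();
                answer += cur[0] * b[1];             // 1s of the front part precede 0s of b
                cur[0] += b[0];
                cur[1] += b[1];
            }
            h.add(cur);
            if (v > 0) {                             // small-to-large merge into the parent
                int p = parent[v];
                if (heap.get(p).size() < h.size()) { // keep the larger heap at the parent
                    heap.set(v, heap.get(p));
                    heap.set(p, h);
                }
                heap.get(p).addAll(heap.get(v));
                heap.get(v).clear();
            }
        }
        // the remaining inversions are between different blocks of the root sequence
        PriorityQueue<long[]> root = heap.get(0);
        long ones = 0;
        while (!root.isEmpty()) {
            long[] b = root.poll();
            answer += ones * b[1];
            ones += b[0];
        }
        return answer;
    }
}

Each block moves between heaps at most log(n) times, and every move costs another log(n) inside the priority queue, which is where the O(n*log^2(n)) bound above comes from.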

I derive additional pleasure from the fact that while building this solution, we have essentially understood the mechanics of the problem in great detail: while initially we had a great deal of freedom in writing out the sequence, and it was not clear why we can even find the optimal sequence in polynomial time, now we realize that we just interleave blocks and merge them.

I have also mentioned a Codeforces problem: you are given 100000 numbers ai, each up to 2^60. You need to put them in such an order that the xors of all prefixes of the resulting sequence form a strictly increasing sequence themselves.

The solution to this problem is quite the opposite of the previous one: instead of building a detailed understanding, we come up with an argument that works somewhat magically. It is not hard to prove that the solution works, but it keeps looking unexpected and magical.

Let's look at the highest bit set in at least one of the ai's. Suppose it is set in some subset of numbers. Then the prefix xors will have 0 in this bit before the first such number, 1 in this bit after the first such number but before the second such number, 0 in this bit after the second such number, and so on. Since the resulting sequence must be strictly increasing, we must actually have exactly one such number. Let's start forming our resulting sequence by putting just this number there. Now we have two parts of the sequence, before this number and after it (including the number itself), which are somewhat independent: no matter which numbers go to the before part, its xor has 0 in the highest bit, and after xoring in our number the highest bit becomes 1, so we're guaranteed to have a strict increase on the boundary of the parts.

Now let's look at the next highest bit set in at least one of the remaining ai's, and look at the numbers where this bit is set. Just as before, the prefix xors will have 0 in this bit before the first such number, 1 in this bit after the first such number but before the second such number, 0 in this bit after the second such number, and so on. The only difference is that the set of numbers with this bit set might include the number that we already have in the sequence. If this is the case, then even though our bit will change from 1 to 0, we'd still have an increasing sequence, since a higher bit will change from 0 to 1 at the same point. So in case the already placed number has 1 in our bit, we can add two numbers with our bit as the highest bit: one in the beginning, and one after the already placed number. And in case the already placed number has 0 in our bit, we can add at most one number with our bit as the highest bit. Once again, after doing that, our sequence is guaranteed to be increasing at the boundaries of already placed numbers no matter what we insert between them.

A general step of our algorithm looks like this: we already have all numbers with highest bit higher than x placed, and want to place the numbers where the highest bit is x. We insert one such number in the beginning, and one after each already placed number where bit x is set, going from left to right, until we have no more numbers where the highest bit is x. If we run out of places before we run out of such numbers, there's no solution.
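
As a minimal sketch of this greedy (my own illustration; the function name is made up, and the numbers are assumed to be positive, since a zero would make a strictly increasing sequence of prefix xors impossible anyway):

import java.util.*;

// Process bits from high to low; the numbers whose highest set bit is x go to
// the front and after already placed numbers that have bit x set, left to right.
static long[] orderWithIncreasingPrefixXors(long[] a) {
    List<Deque<Long>> byBit = new ArrayList<>();
    for (int b = 0; b < 63; b++) byBit.add(new ArrayDeque<>());
    for (long v : a) byBit.get(63 - Long.numberOfLeadingZeros(v)).add(v);

    List<Long> placed = new ArrayList<>();
    for (int x = 62; x >= 0; x--) {
        Deque<Long> cur = byBit.get(x);
        if (cur.isEmpty()) continue;
        List<Long> next = new ArrayList<>();
        next.add(cur.pollFirst());               // the prefix xor is 0 before everything
        for (long v : placed) {
            next.add(v);
            // after a placed number with bit x set, the prefix xor has bit x = 0
            // again, so another number with highest bit x can go right here
            if (!cur.isEmpty() && ((v >> x) & 1) == 1) next.add(cur.pollFirst());
        }
        if (!cur.isEmpty()) return null;         // ran out of places: no valid order
        placed = next;
    }
    long[] result = new long[placed.size()];
    for (int i = 0; i < result.length; i++) result[i] = placed.get(i);
    return result;
}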

Note that the solution works just barely: for example, we can't put the new numbers after any subset of already placed numbers where bit x is set: we must go from left to right instead, to guarantee that the prefix xor does not have bit x set at each insertion point.  That's why I can't get rid of the feeling that this solution works almost by accident.

Thanks for reading, and check back for more!

Monday, May 14, 2018

A uniform combination week

Most teams have returned home from Beijing by the end of the Apr 23 - Apr 29 week, and the other contests returned in full swing. AtCoder Grand Contest 023 took place on Saturday (problems, results, top 5 on the left, analysis). The round was won by none other than the newly minted World Champion cospleermusora (also known as V--o_o--V and overtroll). yutaka1999 was also able to solve all problems, but required 30 more minutes to do so. Congratulations to both!

Problem F was very cute. You are given a tree with 200000 vertices, each containing a number, either 0 or 1. We need to put all its vertices in some order in such a way that each vertex except the root appears to the right of its parent. We will then write out a sequence of 0s and 1s corresponding to numbers in the vertices in this order. What is the minimum possible number of inversions in this sequence? An inversion is a pair of positions such that the number in the left position is 1, and the number in the right position is 0.

Maybe I enjoyed this problem because I have set a problem in the past which involved the same strategy for producing a string from a tree.

VK Cup 2018 Round 3 happened on Sunday (problems, results, top 5 on the left, parallel round results, analysis). The ICPC champions, this time two of them, kept with their champion-y ways and solved two more problems than everybody else. Unbelievable!

I found problem C exceedingly beautiful. You are given 100000 numbers up to 2^60. You need to put them in such an order that the xors of all prefixes of the resulting sequence form a strictly increasing sequence themselves.

Right after the Codeforces round ended, Google ran Code Jam 2018 Round 1B (problems, results, top 5 on the left, analysis). overtroll has continued his impressive form (see above) with another victory, this time with a healthy margin of 12 minutes. Well done!

In my previous summary, I have mentioned a World Finals problem: there are n people, each starting with 1 gem. The following operation is repeated d times: take one of the gems uniformly at random, and split it into two gems (so the person holding it will have one more gem). After doing all that we order all people by the number of gems they have in decreasing order, and add up the number of gems of the first r people in that order. What is the expected value of that sum? n and d are at most 500.

The first part of the solution is understanding the sequence generation process. At first sight, it looks rather complicated, with the probability to get a new gem for each person changing along the way. However, let's look at the process from the following angle: let's put all gems for all people in one sequence, and every time a gem is divided we'll insert a new gem to the right of it. We'll also distinguish the original gems — the ones people start with — from the newly generated ones.

We start by having n original gems, and we can notice now that at each step we simply insert a new gem into a position in our sequence that is picked uniformly at random, except from the position before the first original gem which is never used. This in turn makes it clear that the resulting sequence starts with an original gem, and then has a sequence of n-1 original gems and d new gems picked uniformly at random from all C(n-1+d,d) such sequences. Each person gets all gems from their original gem to the next original gem in this sequence.
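
To make this reformulation concrete, here is a small Monte Carlo sanity check in Java (an illustration I added, not part of any official solution): it simulates both descriptions of the process for small hypothetical values of n, d and r and compares the expected answers, which should agree up to sampling noise.

import java.util.*;

public class GemCheck {
    public static void main(String[] args) {
        int n = 4, d = 6, r = 2, trials = 200000;
        Random rng = new Random(1);
        double direct = 0, shuffled = 0;
        for (int t = 0; t < trials; t++) {
            // direct simulation: pick a gem uniformly, its owner gains a gem
            int[] gems = new int[n];
            Arrays.fill(gems, 1);
            int total = n;
            for (int step = 0; step < d; step++) {
                int pick = rng.nextInt(total), owner = 0;
                while (pick >= gems[owner]) pick -= gems[owner++];
                gems[owner]++;
                total++;
            }
            direct += topSum(gems, r);

            // reformulation: a uniformly random arrangement of n-1 original gems
            // ('O') and d new gems ('N'), with one extra original gem in front
            char[] seq = new char[n - 1 + d];
            Arrays.fill(seq, 0, n - 1, 'O');
            Arrays.fill(seq, n - 1, seq.length, 'N');
            for (int i = seq.length - 1; i > 0; i--) {   // Fisher-Yates shuffle
                int j = rng.nextInt(i + 1);
                char tmp = seq[i]; seq[i] = seq[j]; seq[j] = tmp;
            }
            int[] counts = new int[n];
            int person = 0;
            counts[0] = 1;                               // the leading original gem
            for (char c : seq) {
                if (c == 'O') counts[++person] = 1;
                else counts[person]++;
            }
            shuffled += topSum(counts, r);
        }
        System.out.println(direct / trials + " vs " + shuffled / trials);
    }

    // sum of the r largest values
    static int topSum(int[] a, int r) {
        int[] b = a.clone();
        Arrays.sort(b);
        int s = 0;
        for (int i = 0; i < r; i++) s += b[b.length - 1 - i];
        return s;
    }
}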

A uniformly chosen combination is a well-known object which is easy to work with, which allows us to proceed with solving the problem using either dynamic programming or more combinatorics, as outlined in the semi-official analysis.

Thanks for reading, and check back around the next weekend for more!

Sunday, May 13, 2018

An alma mater week

The Sächsilüüte week started with Codeforces Round 475 (problems, results, top 5 on the left, analysis). Three contestants managed to solve all problems, but some were faster than the others :) In particular, V--o_o--V has finished after just 81 minutes, and thus won with a 500-point margin. Well done!

One of the main competitive programming events of the year, ACM ICPC 2018 World Finals took place on Thursday (problems, results, top 13 on the left, our screencast with commentary, official broadcast, analysis). The deciding events of the competition happened in the last few minutes, when the Moscow State University team managed to squeeze in a solution to problem E with an extra log factor by changing the number of iterations in binary search and getting something like TLE-TLE-TLE-OK-WA-WA-WA from seven attempts with different values of the magic constant.

That OK might well turn out to be TLE or WA as well, but fortune favors the bold, and what essentially happened was that the Moscow State University team did great in using all their resources and creativity up to the last minute and got a well-deserved victory. Really happy for the team and for my alma mater to finally get the cup that I and many others could not deliver in the past :)

Big congratulations to all other medalists on the great result, too!

Problem D brought the most excitement for me in this problemset. There are n people, each starting with 1 gem. The following operation is repeated d times: take one of the gems uniformly at random, and split it into two gems (so the person holding it will have one more gem). After doing all that we order all people by the number of gems they have in decreasing order, and add up the number of gems of the first r people in that order. What is the expected value of that sum? n and d are at most 500.

I'm also really interested in hearing what you think about our stream and about the official broadcasts, if you had a chance to check them out, and I'm sure the ICPCLive team is very interested as well. Please share your observations or ideas!

Finally, TopCoder Open 2018 Round 1A took place on Saturday (problems, results, top 5 on the left, analysis). With the problems quite a bit on the easy side, the challenge phase was instrumental in determining the winner — congratulations to Dembel on finding the +125 and the victory!

Thanks for reading, and check back for more!

Changes to commenting system

I've changed the commenting system in this blog from HyperComments (thanks to its authors for making it!) to built-in Blogger comments, because HyperComments is discontinuing free usage. In the past years Blogger has added support for threaded replies which was the main motivation for me to switch to HyperComments in the past, and using an external commenting system brought its own pains, so switching back to the default seems to be the logical thing to do.

Please tell me if you encounter any issues with commenting using the new (old?) system!

One side effect is that all comments made through HyperComments are not visible now. I have an xml export of all of them, but it's not clear at this point how to import those back into the Blogger system. Please share ideas if you have them :)

A +300 week

The week before the ICPC World Finals featured two competitions, both on Saturday. First off, Google Code Jam 2018 Round 1A took place very early (problems, results, top 5 on the left, analysis). This was the first round under the completely new rules where the penalty time mattered, as it did not in the qualification round. The optimal strategy, of course, stayed more or less the same — just solve all problems quickly and without bugs :) vepifanov has executed it really well, congratulations on the victory!

TopCoder SRM 733 followed later that day (problems, results, top 5 on the left, analysis). Kriii really dominated the proceedings, spending a bit over 18 minutes on all three problems in total. Compare that to the bit over 41 minutes that kuniavski, who came in second, needed! An amazing performance by Kriii.

This was the third SRM which doesn't list cgy4ever in its authors, marking the transition to misof as the new TopCoder problem coordinator. Congratulations to misof on the new role, looking forward to the next SRMs!

Thanks for reading this [short] post, check back soon for more!

Friday, May 11, 2018

An elimination week

The Apr 2 - Apr 8 week had a couple of competitions on the weekend. Codeforces Round 474 took place on Saturday (problems, results, top 5 on the left, analysis). Solving eight problems correctly in just over two hours is an amazing feat — OO0OOO00O0OOO0O00OOO0OO was on top of his game this time. Well done!

Yandex.Algorithm 2018 Round 3 on Sunday (problems with Yandex login, results, top 6 on the left, analysis) has completed the Elimination stage. Unlike the first two rounds, this time blind submissions were actually necessary for the victory — congratulations to Merkurev for pulling four of those off!

Here is the final Elimination stage scoreboard (top 5 on the left), with top 25 advancing to the finals held both online and in St Petersburg.

Also on Sunday, Open Cup 2017-18 continued its streak with the Grand Prix of Warsaw (results, top 5 on the left). SPb ITMO U 1 team demonstrated impressive form, solving all 11 problems at the moment when all other teams had at most 9. Three other teams got to 10 problems in the remaining time, but nobody could approach the first place. Congratulations to the ITMO 1 team!

Thanks for reading, and check back for more!

Wednesday, April 18, 2018

A second extremal week

The last week of March featured more contests from the regular platforms. First off, TopCoder SRM 732 took place very early on Friday (problems, results, top 5 on the left). Both the medium and the hard problem involved asymmetric games requiring quite some insight to solve, and thus only two contestants were able to solve each (top 4 in the table to the left). Congratulations to all four, and especially to pashka on the win!

Then AtCoder held its Grand Contest 022 on Saturday (problems, results, top 5 on the left, analysis). The contest had three relatively easy problems, and three very hard ones. Only a few contestants were able to solve even one of the three hard ones (I couldn't solve any, for that matter), so it is extremely impressive that Um_nik got all three — huge congratulations on a very well deserved victory!

Problem E was the most approachable of the three. Consider a string of 0s, 1s and ?s of odd length up to 300000. First, we need to replace each ? with a 0 or a 1. Then, we repeatedly do the following operation: take any 3 consecutive characters, and replace them with the most frequent character among them. For example, 001 is replaced with 0. In the end we end up with a string of length 1, and our goal is to get the string 1. How many ways are there to replace the ?s so that it's possible to get the 1 after all reductions?

And finally, Open Cup 2017-18 Grand Prix of Moscow on Sunday wrapped up the week, but its results have not yet been published.

In my previous summary, I have mentioned an Open Cup problem: you are given n points p1, p2, ..., pn on the plane and q queries (n, q <= 100000). Each query is defined by two numbers a, b, and you need to print the size of the smallest square with sides parallel to coordinate axes that contains all points from a-th to b-th (from the list pa, pa+1, ..., pb) except maybe one.

Assuming we forget about the "except maybe one" part, we need to find the bounding box of a segment of points, which can be done by finding the leftmost, rightmost, topmost and bottommost points using interval trees in O(n*log(n)).

Now we can notice that when the skipped point is not one of the four extremal points, the bounding box does not change, so we need to check at most four possibilities for the skipped point. In order to be able to find extremal points after skipping one point, our interval trees will need to hold two extremal points instead of one, but otherwise the solution stays the same.
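
Here is a rough sketch of such a tree in Java (my own illustration, with hypothetical names): every node keeps the two best point indices in its range according to a supplied comparator, so that a range query can still return the best index while excluding one given point.

import java.util.*;

class Top2SegTree {
    final int n;
    final Comparator<Integer> better;   // better.compare(a, b) < 0 means index a is better
    final int[][] best;                 // best[node] = the two best indices, -1 if absent

    Top2SegTree(int n, Comparator<Integer> better) {
        this.n = n;
        this.better = better;
        best = new int[4 * n][];
        build(1, 0, n - 1);
    }

    private int[] merge(int[] a, int[] b) {
        // keep the two best of up to four candidates
        List<Integer> all = new ArrayList<>();
        for (int x : a) if (x >= 0) all.add(x);
        for (int x : b) if (x >= 0) all.add(x);
        all.sort(better);
        return new int[]{all.isEmpty() ? -1 : all.get(0),
                         all.size() < 2 ? -1 : all.get(1)};
    }

    private void build(int node, int l, int r) {
        if (l == r) {
            best[node] = new int[]{l, -1};
            return;
        }
        int m = (l + r) / 2;
        build(2 * node, l, m);
        build(2 * node + 1, m + 1, r);
        best[node] = merge(best[2 * node], best[2 * node + 1]);
    }

    // the best index in [ql, qr] different from skip (pass skip = -1 to exclude nothing)
    int query(int ql, int qr, int skip) {
        return query(1, 0, n - 1, ql, qr, skip);
    }

    private int query(int node, int l, int r, int ql, int qr, int skip) {
        if (qr < l || r < ql) return -1;
        if (ql <= l && r <= qr) {
            for (int x : best[node]) if (x >= 0 && x != skip) return x;
            return -1;
        }
        int m = (l + r) / 2;
        int a = query(2 * node, l, m, ql, qr, skip);
        int b = query(2 * node + 1, m + 1, r, ql, qr, skip);
        if (a < 0) return b;
        if (b < 0) return a;
        return better.compare(a, b) <= 0 ? a : b;
    }
}

One would build four such trees, for example new Top2SegTree(n, Comparator.comparingInt(i -> xs[i])) for the leftmost point and similarly for the other three directions, find the up to four extremal indices of the query segment, and for each candidate to skip recompute the four extremes with that index excluded, keeping the smallest resulting bounding square.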

Thanks for reading, and check back for more!

ACM ICPC 2018 World Finals stream on Twitch

We have figured out a working setup for our stream, so tune in to https://www.twitch.tv/petrmitrichev around 10:00 Beijing time tomorrow!

Tuesday, April 17, 2018

A group embedding week

VK Cup 2018 Round 2 took place during the Mar 19 - Mar 25 week (problems, results, top 5 on the left, parallel round results, analysis). Team Нижний Магазин SU: BZ would have been first by far even without solving the last problem, but getting all problems accepted in the final minutes of the contest was of course the icing on the cake. Congratulations!

Open Cup 2017-18 Grand Prix of America wrapped up that week (results, top 5 on the left, parallel round results), with another two World Finals favorites in top 5: SPb ITMO 1 and Moscow SU Red Panda. The first place, however, went to team Past Glory — once again thanks to their incredible accuracy. Congratulations!

Problem K in this round was quite educational, if a bit professional. You are given n points p1, p2, ..., pn on the plane and q queries (n, q <= 100000). Each query is defined by two numbers a, b, and you need to print the size of the smallest square with sides parallel to coordinate axes that contains all points from a-th to b-th (from the list pa, pa+1, ..., pb) except maybe one. Can you dissect this problem into standard pieces?

In my previous summary, I have mentioned a Yandex.Algorithm problem: you are given a string s with 100000 characters, each a, b or c. You must swap exactly two distinct letters to obtain a new string t. How many ways are there to do that in such a way that the string t is good? In this problem we define a good string somewhat similarly to a valid parentheses sequence: empty string is good, if a string u is good then the strings aua, bub and cuc are good as well, and if two strings u and v are good, then their concatenation uv is also good.

The key insight in this problem is to learn to check if a string is good easily. Let's pick a group and 3 elements of order 2 in it: aa=1, bb=1, cc=1. We will then map each string to the product of the corresponding elements of the group. We can see that any good string will map to 1. Moreover, if our group is general enough, in other words does not have any other non-derived products that are equal to 1 (or if such products are simply unlikely to be equal to 1), then the opposite is also true: when a string maps to 1, it is good. As an example of a group that works in this problem we can use the group of movements of 3D space that keep the origin fixed, where a, b and c correspond to reflections through three random planes passing through the origin. This way, we obtain a compressed representation for a string of any length as a single 3x3 matrix (modulo some prime number, to avoid dealing with floating point).
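
As an illustration of this idea (a sketch of my own, not the reference solution; the class and method names are made up), one can generate the three reflections modulo a prime and test a single string like this:

import java.util.Random;

public class GoodStringHash {
    static final long P = 1_000_000_007L;

    // R = I - 2 * v * v^T / (v·v) is a reflection, so R*R = I (mod P)
    static long[][] randomReflection(Random rng) {
        while (true) {
            long[] v = {rng.nextInt((int) P), rng.nextInt((int) P), rng.nextInt((int) P)};
            long dot = (v[0] * v[0] + v[1] * v[1] + v[2] * v[2]) % P;
            if (dot == 0) continue;                        // need v·v invertible mod P
            long inv = modPow(dot, P - 2);
            long[][] r = new long[3][3];
            for (int i = 0; i < 3; i++)
                for (int j = 0; j < 3; j++) {
                    long val = (P - 2) * v[i] % P * v[j] % P * inv % P;
                    if (i == j) val = (val + 1) % P;
                    r[i][j] = val;
                }
            return r;
        }
    }

    static long[][] mul(long[][] a, long[][] b) {
        long[][] c = new long[3][3];
        for (int i = 0; i < 3; i++)
            for (int j = 0; j < 3; j++) {
                long s = 0;
                for (int k = 0; k < 3; k++) s = (s + a[i][k] * b[k][j]) % P;
                c[i][j] = s;
            }
        return c;
    }

    static long modPow(long b, long e) {
        long r = 1;
        for (b %= P; e > 0; e >>= 1, b = b * b % P) if ((e & 1) == 1) r = r * b % P;
        return r;
    }

    // a string is good (up to a tiny false positive probability) iff the
    // product of the matrices of its letters is the identity
    static boolean isProbablyGood(String s, long[][][] letter) {
        long[][] prod = {{1, 0, 0}, {0, 1, 0}, {0, 0, 1}};
        for (char c : s.toCharArray()) prod = mul(prod, letter[c - 'a']);
        for (int i = 0; i < 3; i++)
            for (int j = 0; j < 3; j++)
                if (prod[i][j] != (i == j ? 1 : 0)) return false;
        return true;
    }
}

With letter[0], letter[1] and letter[2] initialized by three calls to randomReflection, the actual solution of course does not rescan the string for every candidate swap, but instead works with prefix and suffix products of these matrices, as described next.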

We can then use the divide-and-conquer approach: we start by splitting our string in two in the middle, and consider the case where the two positions being swapped are in different halves. If we also try all 6 possibilities for the letters being swapped, we have a sub-problem of the following kind: we have two strings, and need to replace a single a with b in the first string, and a single b with a in the second string, so that after doing those replacements and concatenating the strings we get a good one.

We can compute the compressed representation of each candidate for the first half using prefix and suffix products, then compute the compressed representation of each candidate for the second half in the same way, and then for each representation of the first half count the representations of the second half equal to its inverse (for example, by putting the second-half representations into a hash table), thus processing the case where the swapped elements are in different halves in O(n).

In order to handle the case where both swapped elements are in the same half, we follow the divide-and-conquer approach and recursively execute the same algorithm for each half, with the only difference that the target product we want to get in each half is not 1, but rather the inverse of the product of the other half. This allows us to complete the solution of the entire problem in O(n*log(n)).

Thanks for reading, and check back soon!

A favorites week

The Mar 12 - Mar 18 week started with Yandex.Algorithm Round 2 on Tuesday (problems with Yandex login, results, top 5 on the left, my screencast). The hardest problem A was left unsolved despite 78 attempts, and yet the actual implementation is quite straightforward once one gets the idea. You are given a string s with 100000 characters, each a, b or c. You must swap exactly two distinct letters to obtain a new string t. How many ways are there to do that in such a way that the string t is good? In this problem we define a good string somewhat similarly to a valid parentheses sequence: empty string is good, if a string u is good then the strings aua, bub and cuc are good as well, and if two strings u and v are good, then their concatenation uv is also good.

TopCoder SRM 731 took place on Saturday (problems, results, top 5 on the left). mjhun was the fastest during the coding phase, but Gassa was close enough to require just one successful challenge to climb into the first place. Congratulations to both of you!

Finally, the Open Cup 2017-18 Grand Prix of Belarus happened on Sunday (problems, results, top 5 on the left). Team Past Glory was once again the fastest — well done! This round also provided a World Finals prediction perspective, as there are two teams in the top 5 who will be competing in Beijing: Seoul NU and Peking U. This year's World Finals looks to be extremely competitive, with top teams like Seoul NU, Peking U, SPb ITMO, Moscow SU, Warsaw U all of comparable strength (did I miss a team that you think is one of the favorites to win? Sorry, and please share in comments!) I'm looking forward to the final showdown on Thursday!

In my previous summary, I have mentioned a cute Open Cup problem: you are given an n times n matrix A of integers modulo a prime number p. You need to find the smallest positive integer k such that A^k=0 (modulo p), or report that it does not exist, in O(n^3).

The answer never exceeds n, which in very rough terms can be derived from looking at Jordan normal forms (is there a simpler argument?..), which enables the following O(n^4) approach: just compute the first n powers of A. It can be sped up to O(n^3*log(n)) by using binary search, since once a power of A is zero, all following powers are zero as well.

In order to speed it up to O(n^3), we need to come up with a beautiful idea: instead of computing just A^k, we will compute A^k*v for some vector v of size n. In case A^k=0, then A^k*v=0 as well. It turns out the opposite is almost always true: A^k*v over all v forms a linear subspace (the image of A^k), which has at least one dimension in case A^k is not zero, and thus we can just pick v randomly and the probability of A^k*v being zero will not exceed 1/p.

Since we can multiply a matrix by a vector in O(n^2), we can compute v, A*v, A^2*v, ..., A^n*v in sequence in O(n^3) overall.
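
A compact sketch of this randomized check in Java (my own illustration; the function name is made up and the matrix entries are assumed to be already reduced modulo p):

import java.util.Random;

// Smallest positive k with A^k = 0 (mod p), or -1 if there is none: pick a
// random vector v and look for the first zero among A*v, A^2*v, ..., A^n*v.
static int smallestNilpotentPower(long[][] a, long p) {
    int n = a.length;
    Random rng = new Random();
    long[] v = new long[n];
    for (int i = 0; i < n; i++) v[i] = Math.floorMod(rng.nextLong(), p);
    for (int k = 1; k <= n; k++) {
        long[] next = new long[n];                  // one matrix-vector product, O(n^2)
        for (int i = 0; i < n; i++) {
            long s = 0;
            for (int j = 0; j < n; j++) s = (s + a[i][j] * v[j]) % p;
            next[i] = s;
        }
        v = next;
        boolean zero = true;
        for (long x : v) if (x != 0) { zero = false; break; }
        if (zero) return k;                         // A^k * v = 0, so almost surely A^k = 0
    }
    return -1;                                      // no power up to n is zero, so none is
}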

Thanks for reading, and check back for more!

ACM ICPC 2018 World Finals stream

ACM ICPC 2018 World Finals take place this Thursday at 10:00 Beijing time (click for other timezones). Just like last year, we'll try to solve it in parallel with tourist and Endagorion, and stream the process, assuming the problems will be available for submission on Kattis.

I'm not yet sure if it's going to be on Youtube, which does not work well in China, or somewhere else. I will post an update here and on Twitter once we figure out the right setup. Stay tuned!

Saturday, April 14, 2018

A binary block week

The Mar 5 - Mar 11 week had two Codeforces rounds. Round 469 took place on Friday (problems, results, top 5 on the left, analysis). Since I'm writing this post on the plane to Beijing for ICPC 2018 World Finals, I can't help but look at the scoreboard from the ICPC angle: the winner dotorya will be participating in the Seoul NU team, and the third-placed Syloviaely is part of the host Peking U team. Congratulations on the great Codeforces result, and best of luck in the World Finals!

VK Cup 2018 Round 1 took place one day later (problems, results, top 5 on the left, parallel round results, analysis). Team VK Cup 24329020081766400008 from Saratov was already among the fastest in coding, but managed to stand out thanks to the challenges, even despite their solution for the hardest problem failing systests. Well done!

Finally, after a week's hiatus the Open Cup came back for another five-week marathon which started with the Grand Prix of Baltic (problems, results, top 5 on the left). Team Past Glory was not as fast as t.me/umnik_team, but they were more accurate, and thus won by a few penalty minutes. Congratulations to both teams on the great performance!

Problem E had a very simple statement and a very cute solution. You are given an n times n matrix A of integers modulo a prime number p. You need to find the smallest positive integer k such that A^k=0 (modulo p), or report that it does not exist. It is relatively straightforward to do in O(n^3*log(n)), but your solution needs to run in O(n^3).

In my previous summary, I have mentioned a difficult Codeforces problem: consider an unknown binary string of length k, k is up to a billion. You're given up to 100000 segments [ai,bi], and know that each of those segments contains at least one 0. In a similar vein, you're also given up to 100000 segments [ci,di], and know that each of those segments contains at least one 1. How many such binary strings exist, modulo 10^9+7?

First, how can we approach this problem if we forget that k is very big? Let's split our string into blocks of 0s and 1s, and compute pj — the number of ways to choose the first j characters ending with a block of 0s — and qj — the number of ways to choose the first j characters ending with a block of 1s. To compute pj, we iterate over the ending point t of the previous block of 1s, and see that pj is the sum of qt over the valid values of t: t must be at most j-1, and at least max(ci) over all i such that di<=j, to make sure that no [ci,di] segment contains only 0s. We can compute qj in a symmetric way.

Since max(ci) only ever increases as we increase j, we can maintain a sliding window of the qt's together with their sum, and thus obtain an O(k) amortized time solution.
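
Here is what this O(k) version could look like in Java (my own sketch of the recurrence, only usable when k is small; the parameter names are made up, the segments are passed as 1-based [left, right] pairs, zeroSeg being the segments that must contain a 0 and oneSeg the ones that must contain a 1):

static long countBinaryStrings(int k, int[][] zeroSeg, int[][] oneSeg) {
    final long MOD = 1_000_000_007L;
    // lowOne[j]: every "contains a 1" segment [c, d] with d <= j forces the last
    // block of 1s to end at t >= c; lowZero[j] is the symmetric bound for 0s
    int[] lowOne = new int[k + 1], lowZero = new int[k + 1];
    for (int[] s : oneSeg) lowOne[s[1]] = Math.max(lowOne[s[1]], s[0]);
    for (int[] s : zeroSeg) lowZero[s[1]] = Math.max(lowZero[s[1]], s[0]);
    for (int j = 1; j <= k; j++) {
        lowOne[j] = Math.max(lowOne[j], lowOne[j - 1]);
        lowZero[j] = Math.max(lowZero[j], lowZero[j - 1]);
    }
    long[] p = new long[k + 1], q = new long[k + 1];
    p[0] = q[0] = 1;                       // formal base: the empty prefix
    long sumQ = 0, sumP = 0;               // sliding-window sums of q[t] and p[t]
    int fromQ = 0, fromP = 0;              // current left ends of the two windows
    for (int j = 1; j <= k; j++) {
        sumQ = (sumQ + q[j - 1]) % MOD;    // t = j-1 enters the windows
        sumP = (sumP + p[j - 1]) % MOD;
        while (fromQ < lowOne[j]) sumQ = (sumQ - q[fromQ++] + MOD) % MOD;
        while (fromP < lowZero[j]) sumP = (sumP - p[fromP++] + MOD) % MOD;
        p[j] = sumQ;                       // position j ends a block of 0s
        q[j] = sumP;                       // position j ends a block of 1s
    }
    return (p[k] + q[k]) % MOD;
}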

Now let's look more closely at this process in case we don't encounter any new segment endpoints. Suppose we have just found pj and qj, and now are looking to find pj+1 and qj+1 while the left boundaries of our summation max(ai) and max(ci) stay the same. pj+1 is the same sum as pj, but with the extra term qj added, so pj+1=pj+qj. Similarly, qj+1=pj+qj as well. So we get two equal numbers at the (j+1)-th step, and on each further step these numbers will just keep multiplying by two until we encounter a new segment endpoint and the summation left boundaries change.

This allows us to obtain an O(n*log(n)) solution, where n is the number of segments: instead of computing all individual values of pj and qj, we will split our string into blocks between the segment endpoints, and in each block we will only store the first number, with every following number being twice the previous one. The formula for the sum of a geometric progression allows us to handle such blocks quickly.

Thanks for reading, and check back soon for more summaries and for some ICPC blogging!

Sunday, April 8, 2018

Google Code Jam qualification - last chance

There's about 4 hours left in the Google Code Jam 2018 qualification round. There were some stability issues with the new system earlier in the day, but they seem to have been resolved, so now's your chance to join the Code Jam if you haven't already. As usual I'm setting some problems this year, and I hope you'll find them interesting!

Monday, March 5, 2018

A power of two week

This week's contests were concentrated on the weekend. First, Yandex.Algorithm 2018 Round 1 has brought back the "open/blind" submission format on Saturday (problems with Yandex login, results, top 5 on the left, my screencast, analysis). It turned out that it doesn't really matter whether to choose open or blind if one gets all problems correct on the first try, and is the only one to solve everything :) Congratulations to tourist!

Problem E has a deceptively easy solution in the analysis, and yet it seems really hard to come up with. You are given a sequence of 100000 positive integers up to a billion. You need to find any integer k such that it's possible to transform our sequence into a strictly increasing sequence of positive integers by subtracting some of its elements from k (in other words, by replacing xi by k-xi for some set of i's). Can you see why this problem is actually quite simple?

On Sunday Codeforces hosted its Round 468 (problems, results, top 5 on the left, my screencast). It was V--o_o--V's turn to be head and shoulders above the competition, finishing all problems in less than an hour. Well done!

The hardest problem E involved finding an appropriate dynamic programming angle that allows "coordinate compression" to work. Consider an unknown binary string of length k, k is up to a billion. You're given up to 100000 segments [ai,bi], and know that each of those segments contains at least one 0. In a similar vein, you're also given up to 100000 segments [ci,di], and know that each of those segments contains at least one 1. How many such binary strings exist, modulo 10^9+7?

Last week, I have mentioned a TopCoder combinatorics problem: find the sum of 2^x1 (two to the power of the minimum number) over all possible ways to choose k distinct positive integers up to n: 1<=x1<x2<...<xk<=n, where n is up to a billion, and k is up to a million.

The key idea is: instead of summing 2^x1 for each combination, which seems hard to do, we will further subdivide each combination into 2^(x1-1) different objects, so that we just need to count the number of objects and multiply by 2. More specifically, if we have k distinct positive integers, and the smallest is x1, then there are exactly 2^(x1-1) ways to add some subset of the numbers between 1 and x1-1 to it. As a result, we get a set of at least k distinct positive integers. Moreover, we can obtain all such sets, each exactly once: a set is obtained by subdividing the combination formed by its k largest numbers. So we need to return 2 times the number of ways to choose at least k objects out of n, which is much easier to compute.
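
A tiny brute-force check of this bijection for small parameters (my own illustration; element i of {1..n} corresponds to bit i-1 of a mask):

public class SubsetCheck {
    // Verifies that the sum of 2^x1 over all k-subsets of {1..n} equals
    // 2 * (number of subsets of size at least k).
    public static void main(String[] args) {
        int n = 10, k = 4;
        long sumOfPowers = 0, atLeastK = 0;
        for (int mask = 1; mask < (1 << n); mask++) {
            int size = Integer.bitCount(mask);
            if (size >= k) atLeastK++;
            // the smallest chosen element x1 is the lowest set bit plus one
            if (size == k) sumOfPowers += 1L << (Integer.numberOfTrailingZeros(mask) + 1);
        }
        System.out.println(sumOfPowers + " = " + 2 * atLeastK);
    }
}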

Thanks for reading, and check back next week!

Monday, February 26, 2018

An infinite ratio week

This week had a round from each of the platforms I usually cover. First off, TopCoder held SRM 730 on Tuesday (problems, results, top 5 on the left, analysis). The competition was ultimately decided during the challenge phase where I was basically very lucky: my room had 8 incorrect solutions for the 250, and I've managed to bring down 4 of those; ACRush had only 4 incorrect solutions in his room, one of those was challenged by somebody else, and he got the other 3. Nevertheless, amazing comeback after an almost half-year break for ACRush!

Quite unusually, the hard problem was solved by ksun48 in less than 5 minutes. The problem asked to find the sum of 2^x1 (two to the power of the minimum number) over all possible ways to choose k distinct positive integers up to n: 1<=x1<x2<...<xk<=n, where n is up to a billion, and k is up to a million.

Next up was AtCoder Grand Contest 021 on Saturday (problems, results, top 5 on the left, analysis, my screencast). tourist was nearly unstoppable this time, earning his first place with about 25 minutes left in the contest. Egor in 3rd place had the most realistic chance to overtake him thanks to a creative strategy, as he decided to avoid spending more time to earn the fairly technical final 300 points in the partial-scoring problem F, and instead focused on problem E which would give 1200 points if solved. Unfortunately 25 minutes were still not enough for that. Nevertheless, congratulations to both!

Problem B was quite cute. You are given 100 points on the plane. For each given point, consider the part of the plane where it is the closest of the given points (a Voronoi diagram). Some of the parts will be finite, and some will be infinite. What are the relative areas of the infinite parts (see the problem statement for the formal description)?

Sunday started with the Open Cup Grand Prix of Saratov (problems, results, top 5 on the left). Six teams were able to solve 11 this time, but team Past Glory has managed to do that without a single incorrect attempt — amazing!

After a short break, Codeforces Round 467 wrapped up the week (problems, results, top 5 on the left, my screencast). mnbvmar has earned his first place by solving 4 problems in just over an hour, 15 minutes faster than anybody else — and then protected his lead from Syloviaely's surge by finding 5 successful challenges. Very well done!

Problem C provided a level playing field for experienced and inexperienced competitors, as it had nothing to do with standard algorithms or even standard ideas. You are given a string s of length 2000, and need to transform it into a string t of the same length using at most 6100 operations, or report that it's impossible. In one operation, you can split the current string into two, reverse the second part, and then put the first part after the reversed second part: for example, you can obtain the string fedcab from the string abcdef in one operation. Either part is allowed to be empty. Can you see a way to do the transformation of length n in at most 3n operations? Somewhat surprisingly, there are many working approaches in this problem.

Thanks for reading, and check back next week!

Sunday, February 18, 2018

A Fenwick bound week

Codeforces was quite active this week with two Division 1 rounds. The 462nd one took place on Wednesday (problems, results, top 5 on the left, analysis). Um_nik was the only one able to solve all problems, submitting one in the last minute — but he would've won without that one anyway. Congratulations on the great performance!

Round 463 took place on Thursday (problems, results, top 5 on the left, analysis). It was Radewoosh's turn to be the only competitor to solve all problems, with 7 minutes to spare this time. Well done!

He will be representing his university in the upcoming ICPC World Finals in Beijing, along with a few others that you can see in the screenshots on the left, so the competition there promises to be very interesting :)

Finally, the Open Cup Grand Prix of Gomel on Sunday provided another glimpse into full team performances (results, top 5 on the left). Moscow SU team Red Panda has found the winning ways again, this time thanks to being the only team to solve problem I — and doing it in the beginning of the third hour of the contest, which is quite unusual for a problem with only one solver. Congratulations!

Last week, I have mentioned a data structure problem from the Open Cup. n people are coming to lunch and forming a queue. The people are split into disjoint groups. When the i-th person comes to lunch, if there's nobody from their group ci already in the queue, they stand at the back of the queue. When there is somebody from their group already in the queue, they can also stand next to any such person (directly before or after), but only if they would be cutting in front of at most ai people by doing so. If there are multiple positions where they can join the queue according to the above restrictions, they always pick the front-most one. Initially the queue is empty. Given the values of ci and ai in the order the people come to lunch, what will the final queue look like?

One could immediately start thinking about standard approaches applicable in this problem. The fact that we need to be able to run a query against the last ai people in the queue suggests that we should probably use a balanced tree that supports quick splitting, like a treap, and maintain the size of the subtree plus some additional structures necessary to answer queries in each node. This approach can probably be made to work, but the nice thing about the problem is that we can obtain a much easier to code solution.

Since every person only ever joins the queue either in the last position, or next to a person from their group, if we maintain the queue as a list of segments of people from the same group, in other words as a list of (group, counter) pairs, then the only operations we need to support are incrementing a counter and adding a new group to the end of the list. Now in order to find a place for the next person we need to find the set of groups corresponding to the last ai+1 people, and then find the first group of the required type in that set.

In order to find the boundary on the group level, we need to be able to find the largest prefix for which the sum of counters does not exceed a given value. This can be done in O(log(n)) by maintaining the counters in a Fenwick tree; I will give more details below.

And in order to find the first group of the required kind after the given one, we can simply maintain a vector of indices of groups for each kind. Since we only ever create new groups at the end, these indices never change, and a simple binary search can find the first group of the given kind after a given index.

Finally, after we know what happens with groups, we know at which position each person enters the queue, but we still need to output the final order. This can be done using the same Fenwick tree with the "largest prefix with sum not exceeding x" operation we already used: we go from the end and maintain the free spots in the Fenwick tree.

So instead of a balanced tree with extra information in the nodes, we've managed to get by using two very easy to code data structures: a Fenwick tree and a vector. The solution is O(n*log(n)), and the constant hidden in the O() is also really small.

This solution used the following non-standard operation on a Fenwick tree: find the largest prefix with sum not exceeding x. A naive implementation using binary search and Fenwick prefix sums would give O(log^2(n)) complexity, which would most likely still be fast enough to get the problem accepted. However, tgehr has pointed out to me that the standard Fenwick tree allows an extremely simple O(log(n)) way to answer this question.

Suppose our Fenwick tree has n elements, and 2^k is the largest power of 2 not exceeding n. The (2^k-1)-th element of the Fenwick tree array contains the sum s of the elements on the segment [0;2^k-1], so by comparing x with just this number we know whether our answer is below or above 2^k. Let's suppose it's above 2^k. Then we notice that the (2^k+2^(k-1)-1)-th element of the Fenwick tree array contains the sum of the elements on the segment [2^k;2^k+2^(k-1)-1], so by comparing x-s with just this number we know whether our answer is below or above 2^k+2^(k-1). We continue traversing the powers of two in the same manner, and it turns out that the standard Fenwick tree stores exactly the sums needed to execute this binary search! Here's the actual code:

private int upperBound(int[] f, int x) {
    // returns the number of elements in the largest prefix with sum not exceeding x
    int res = 0;
    int max = Integer.numberOfTrailingZeros(
        Integer.highestOneBit(f.length));
    for (int k = max; k >= 0; --k) {
        // f[res + 2^k - 1] holds the sum of the 2^k elements starting at position res
        int p = res + (1 << k) - 1;
        if (p < f.length && f[p] <= x) {
            x -= f[p];
            res += 1 << k;
        }
    }
    return res;
}
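
For completeness (my own addition, not from the original post), the point update and prefix sum that go together with this 0-indexed layout, where f[i] covers the elements from i - lowbit(i+1) + 1 to i, look like this:

// adds delta to element i
private void update(int[] f, int i, int delta) {
    for (; i < f.length; i |= i + 1) {
        f[i] += delta;
    }
}

// returns the sum of elements 0..i
private int prefixSum(int[] f, int i) {
    int sum = 0;
    for (; i >= 0; i = (i & (i + 1)) - 1) {
        sum += f[i];
    }
    return sum;
}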

Thanks for reading, and check back next week.

Monday, February 12, 2018

A leafy week

This week featured two weekend contests. First, TopCoder SRM 729 took place on Saturday (problems, results, top 5 on the left, my screencast with commentary). If you sum up the problem columns in the screenshot on the left, you can notice that the sum doesn't match the score column, and that's because the match presented ample opportunities for challenging. You can see on the screencast how I try to prepare an uber-challenge-case for the 450 during the intermission, and then spend the beginning of the challenge phase getting it to work, while many solutions were already being challenged. uwi has made the best use of the challenge opportunities and thus claimed the first place. Well done!

Most of the challenge opportunities presented themselves in the medium problem, which looked very standard at first glance. You are given a 1000x1000 grid. In one jump, you can move from a cell to any other cell that's at least the given distance d away — you can't jump very close. What is the smallest number of jumps required to get from one given cell to another given cell?

We could use a standard breadth-first search to solve this problem, but we have 1 million cells and potentially 1 million jumps from each cell, so the total running time would be on the order of 10^12 which is too slow. Can you see at least one way to speed this solution up? (there are many!)

The other weekend contest was the Open Cup 2017-18 Grand Prix of Khamovniki (results, top 5 on the left). Unlike the previous round, the active ICPC teams from Russia were not at the top of the standings, with only two ICPC teams from Asia and a veteran team Past Glory able to solve 10 problems — congratulations, and especially to Seoul National U 2 team for another Open Cup victory!

Problem D "Lunch Queue" was from the rare species of data structure problems that I enjoyed solving. n people are coming to lunch and forming a queue. The people are split into disjoint groups. When the i-th person comes to lunch, if there's nobody from their group ci already in the queue, they stand at the back of the queue. When there is somebody from their group already in the queue, they can also stand next to any such person (directly before or after), but only if they would be cutting in front of at most ai people by doing so. If there are multiple positions where they can join the queue according to the above restrictions, they always pick the front-most one. Initially the queue is empty. Given the values of ci and ai in the order the people come to lunch, what will the final queue look like?

In my previous summary, I have mentioned a hard AtCoder problem: you are given a rooted tree with n vertices, and each vertex contains an integer between 1 and n, each number appearing exactly once. Your goal is to rearrange the numbers in such a way that vertex 1 has number 1, vertex 2 has number 2, and so on. You're allowed to do the following transformation: take the path connecting the root of the tree with any vertex v, and cyclically rotate it, placing the number that was in root into v, the number that was in v into the parent of v, and so on — essentially a generalization of insertion sort from a single path to an arbitrary tree. You need to sort the tree in at most 25000 operations for n=2000.

I did not solve the problem myself, so I will describe the solution from the official analysis. First, we can notice that given any leaf, we can put the correct value into it in at most n operations (first pull it up to the root, then put it into the leaf in one rotation, and then never touch this leaf again), so we could sort everything in n^2 operations, which is too much.

However, we can notice that in this approach we have quite a lot of freedom in choosing the moves that pull the given number up. We could use this freedom to pull up the numbers for all leaves, not just for one leaf. But even that would not be enough, as when our rooted tree is a chain it only has one leaf. However, we know that for a chain a simple solution is possible, the normal insertion sort: we do exactly n operations, and after k operations the k bottom numbers of the chain are already in the correct order, in each step inserting the new number into the appropriate place.

Now we need to combine the idea of pulling the required number from anywhere in the tree with the idea of filling a chain in one O(n) stage, in such a way that the number of stages is O(log(n)) for any rooted tree. More precisely, we will find all leaf chains in the tree (chains that hang at the bottom with nothing else attached to them) and fill them all with correct values in one stage. This guarantees O(log(n)) stages since the number of leaves is divided by at least two after each stage.

Whenever the root of the tree has a useful value — one that must be in one of the leaf chains — we will send it there, inserting it into the correct place relative to other values that we've sent to the same leaf chain, just like the insertion sort approach above. And when the root has a useless value, we need to send it somewhere, so let's send it to any position which contains a useful number, but such that all its subtree contains either only useless numbers, or useful numbers that are already placed into the corresponding leaf chain (in other words, the numbers that we don't need to get to the root anymore). This will push this useful number towards the root, and create a subtree that doesn't have any numbers that we want to get to the root.

Why does this cycle finish eventually, and more importantly why does it run in O(n) operations? Each useful value passes through the root only once. A useless value, after passing through the root, ends up in a dormant subtree, and would never pass through the root again, unless we need to touch this subtree because we're putting a useful value into it. In each such operation, at most one value moves from a dormant subtree back into circulation, and thus can reach the root again. So the total number of times a useless value that we've already seen once returns to the root does not exceed the total number of times we put a useful value into its place, meaning that the total number of operations for one stage is at most n plus the total size of the leaf chains, which is O(n).

Thanks for reading, and check back next week!