Tuesday, December 31, 2019

An exponential week

Before we come to the Nov 4 - Nov 10 week, I've realized that I forgot to cover the Open Cup 2019-20 Grand Prix of Siberia that took place on Nov 3 (problems, results, top 5 on the left). Team USA1 was once again the fastest, but Team Past Glory got their third stage win thanks to solving 11 problems in 11 submissions. Amazing accuracy!

One other very notable thing from the Oct 28 - Nov 3 week that I forgot to mention was this post by ouuan which highlighted this post by min_25. It turns out that the "successive shortest paths" min-cost-max-flow algorithm that I was taught a long time ago and was implementing ever since is actually exponential, and can time out on graphs as small as 50 vertices! Before this post, I have assumed that it's actually polynomial, and even if the power of that polynomial is high, it is fast enough on graphs with thousands of edges. I guess some people in the community already knew this as the original post in Japanese is from 1.5 years ago, but it was news for me and maybe for you as well :)

Coming back to the main topic of this summary, the Nov 4 - Nov 10 week, Codeforces Round 599 took place on Wednesday (problems, results, top 5 on the left, analysis). Problems D and E were both quite hard to get, and Benq got E, which gives more points, reasonably fast. It means that he won the round and jumped to the first place in the rating list — huge congratulations!

The Open Cup returned with the Grand Prix of Poland on Sunday (problems, results, top 5 on the left, analysis in Polish). This time team Past Glory actually had more incorrect attempts than the only other team that solved all problems, but they had enough speed advantage to overcome that. Congratulations to them and to the team japan02!

Problem E used a nice trick that I did not know before. There are n drones (n<=500000), the i-th drone initially in the point (xi,yi,0). Some pairs of drones are connected with cables, those cables are straight lines and they do not intersect except at endpoints. We are then performing k moves (k<=500000), in each move we change the z-coordinate of one of the drones (x- and y-coordinates always remain constant). The cables connected to this drone have to change their length correspondingly, and each cable has a known maximum length — in case it needs to become longer than that, it breaks and stays broken for the remainder of the process. Finally, you are given q (q<=500000) queries, each query being a pair of drones. For each query, you need to find the move which caused the two given drones to become disconnected (using the cables), if any.

The time limit is 30 seconds, so some O(n*sqrt(n)) or even O(n*sqrt(n*log(n))) solutions could pass. Can you see a O(n*polylog(n)) approach, though?

In my previous summary, I have mentioned an AtCoder problem: a string of even length of characters A, B and C is called valid if you can reduce it to an empty string by repeating the following operation: take any substring of length 2 that is neither AB nor BA, and erase it (pushing the two remaining parts back together). You are given an even number n up to 107. How many strings of length n are valid, modulo 998244353?

The problem becomes much easier if we make the following substitution: let's take our string, and in every second position (for example, in all even positions) we replace every A with B and vice versa. Each substring that used to be AB or BA will then become AA or BB! Therefore we can now erase any substring of length 2 that is neither AA nor BB, in other words we can erase any substring of length 2 which has at most 1 occurrence of both A and B. It is relatively easy to notice that it is impossible only if more than half of all characters of our string are A, or more than half of all characters of our string are B. Counting such strings of the given length can be done using combination numbers.

Thanks for reading, and check back for more!

Friday, December 27, 2019

A week to make 1

The Oct 28 - Nov 3 week had two weekend rounds. TopCoder SRM 770 was the first (problems, results, top 5 on the left, my screencast with commentary). bqi343 has demonstrated excellent raw speed and was ahead by a bit less than 100 points before the challenge phase, but then tourist has added his excellent challenge form to the mix and overtook bqi343. Congratulations to both!

I was struggling quite a bit in this round, and ended up solving both the 450 and the 900 with the solutions that were more complex than the intended ones. For the 450, instead of a direct min-cost-max-flow formulation, I've used a combination of min-cost matching of the given size and greedy which might or might not be equivalent.

For the 900, I've taken the solution that was intended to fail because of precision issues and made it pass using quite some squeezing, including switching to C++ to use long double and having a magic constant of 1.02 that allowed to balance between time limit exceeded and floating point overflow (it turned out that it was possible to squeeze in this solution in Java using doubles, too — but only with the magic constant equal to 1.007, while 1.008 led to overflows and 1.005 to time limit exceeded).

AtCoder Grand Contest 040 followed on Sunday (problems, results, top 5 on the left, my screencast with commentary in Russian, analysis). The same two contestants occupied the top two spots as in the SRM, but the margin was much bigger this time, with tourist solving five problems a good half an hour before the others. Well done!

This was already the fifth AGC authored by maroonrk this year — amazing productivity at such high level! My GP30 scores for those are 29, 100, 0, 0, 2 :) While it started nicely, I seem to be unable to crack his problems in most recent rounds (I think I did skip one of the rounds altogether, though).

I was mostly stuck with problem D in this round, but let me highlight problem C which uses an apparently well-known but still beautiful trick: a string of even length of characters A, B and C is called valid if you can reduce it to an empty string by repeating the following operation: take any substring of length 2 that is neither AB nor BA, and erase it (pushing the two remaining parts back together). You are given an even number n up to 107. How many strings of length n are valid, modulo 998244353?

In my previous summary, I have mentioned a Codeforces problem: you have n<=16 positive integers such that their sum does not exceed 2000 (let's denote that constraint as m), and another integer k, 2<=k<=2000. All n integers are not divisible by k. In one step, we can erase two numbers and instead write their sum divided by k repeatedly until the result is no longer divisible by k. For example, if k=2 and we have numbers 5 and 19, we can erase them and instead write 3 as 3=(5+19)/2/2/2 and 3 is not divisible by 2. Our goal is to end up with having just the number 1.

The solution with a 3n term in its complexity is somewhat straightforward: for each subset of numbers, we will compute which integers we can obtain from them. To process a subset, we will iterate over all ways to split it into two subsets which will be collapsed into the last two numbers we erase for this subset to obtain its final number, and over all possibilities for those two numbers. Since all numbers we get do not exceed m, and the number of (set, subset) pairs is 3n, this solution runs in O(3n*m2) which is too slow.

The key to obtaining a faster solution lies in an observation that looks almost trivial from the first glance: since each number is divided by k one or more times, either by itself or after being added to some other numbers, in the end we represent one as 1=sum(ai*k-bi). It turns out that the inverse is also true: if we can find the values of bi that satisfy the above equation, then there is a way to collapse everything into 1. To see that, we can find the highest value among bi, notice that it will be repeated at least twice (as otherwise sum(ai*k-bi) will not be an integer), and therefore we can take the two ai's corresponding to the highest value among bi and combine them according to the procedure to the problem statement to get a smaller set of values a'i and b'i where 1=sum(a'i*k-b'i) still holds.

We have reformulated our problem in a more mathematical way, but did we actually change anything compared to the original formulation? It turns out we did, as now our search can disregard the tree-like order in which we combine the elements in the original formulation. Instead, let's accumulate the terms of sum(ai*k-bi) in decreasing order of bi. Note that this is not equivalent to the original tree-like order even though the first two elements being joined are the same (two with the highest value of bi). The joins after those first two elements can be different, as it can happen that we divide their sum by k a lot of times.

Accumulating in decreasing order of bi can be reformulated in the following way: we have our current sum s that starts with 0, and can do two things with it: either we take a number ai that was not previously taken and add it to the sum (s+=ai), or we divide the sum by k (s/=k). Note that we can only divide the sum by k when the current sum is divisible, but unlike the original process from the problem statement we are not forced to divide if it is.

This means that the dynamic programming which computes "what value of s can be obtained if we have taken the numbers from a given set" does not need to iterate over subsets, and instead can add items one by one, leading to O(n*2n*m) complexity.

Thanks for reading, and check back for more!

Thursday, December 26, 2019

A 3-person week

Facebook Hacker Cup 2019 Final in Dublin was the main event of the Oct 21 - Oct 27 week (problems, results, top 5 on the left, results announcement video, recap videoanalysis). Gennady was so much faster solving the first few problems that he could afford to test his solution for the last one extensively, giving myself a glimmer of hope as I was the first to finish all solutions. It turned out to be just that, a glimmer of hope :) Congratulations to Gennady!

Codeforces Round 596 happened a day later (problems, results, top 5 on the left, analysis). Benq ended up winning the round after he found 10 successful challenges and wxhtxdy's solution for F exceeded the time limit. There was a bit of luck involved as not all rooms even had so many incorrect submissions available for challenging, but still finding and executing those challenges is no small feat. Congratulations to Benq!

I have ruined my contest by trying to squeeze in a solution for E that was just too slow. I got into this very bad state where I thought repeatedly that after one more small optimization it will work, while having no proof that it will be the case. My solution had a 3n term in its complexity, while the correct one only has a 2n :(

Here is the problem: you have n<=16 positive integers such that their sum does not exceed 2000, and another integer k, 2<=k<=2000. All n integers are not divisible by k. In one step, we can erase two numbers and instead write their sum divided by k repeatedly until the result is no longer divisible by k. For example, if k=2 and we have numbers 5 and 19, we can erase them and instead write 3 as 3=(5+19)/2/2/2 and 3 is not divisible by 2. Our goal is to end up with having just the number 1. Can you find a way to do that without iterating over 3n of something?

In my previous summary, I have mentioned a bunch of problems. The first one came from Codeforces: you are given two strings of even length, a and b, consisting of characters 0 and 1. You start with the string a. In one step, you can choose any prefix of the current string that has even length and reverse it. Your goal is to obtain the string b in at most n+1 steps. You don't need to minimize the number of steps. The length of each string is even and at most 4000.

Since we only ever reverse prefixes of even length, it's natural to see what happens to blocks of two consecutive characters. Blocks of 00 and 11 stay the same, 01 turns into a 10 and vice versa. Therefore, in order for this transformation to be possible at all, we need the number of 00 blocks to be the same between the two strings, and the number of 11 blocks to be the same between the two strings (and therefore the total number of remaining blocks, which are 01 and 10, will also be the same).

It turns out that this is both necessary and sufficient for a transformation to exist. One way to do it is to keep converting the string a into the string b from right to left. Consider the rightmost block of b, for example it's 00. Then we can find the same block anywhere in a, reverse the prefix ending with this block so that it moves to the first position, and then reverse the entire string a so that it moves to the end, and then we can forget about the last two characters of a and repeat the process.

We have handled two characters in two reversals, so it looks like we will finish in at most n steps. However, there is one tricky case: when the block we need is a 01, and a does not have any 01s, just 10s (or vice versa). In this case we might need an extra reversal in the middle to turn one into the other when it's in the first position of a, so we will end up spending 3 reversals per 2 characters, potentially using up to 1.5n steps which is too much.

The official solution modifies this process a bit to make sure we need at most one such extra reversal — check it out. I have tried to come up with something in that vein but could not make it work, so I've used a randomized approach: I've noticed that in some cases we can only spend 1 reversal per 2 characters, namely when the required block is already in the beginning of a (in reversed form). If the number of times we spend 1 reversal is the same or at most one less than the number of times we spend 3 reversals, then we're good. And intuitively the "1 reversal" situation is much more likely to happen as we just need the right type of block in the beginning, while for the "3 reversals" situation to happen we need some type of block to not exist at all. Since we have quite some freedom in choosing which block to move to the last place at each step, making this choice randomly, and then starting from scratch in case we exceed n+1 steps turned out to be good enough.

The second problem was from TopCoder: you are given a number n between 1 and 1000, and need to construct any tree with n vertices that has at least 2.62n colorings of the following kind: each vertex is colored red, green or blue in such a way that the distance between any two red vertices is at least 3 (there are no constraints on green and blue vertices).

My solution, which ended up being the only one that passed the system tests, was very simple: we start with a tree that looks like a chain. Then we repeatedly make random changes to the tree, keeping the change if the number of colorings increases, and reverting the change if it decreases, until we reach the required number of colorings.

This still requires to solve the sub-problem of counting the number of colorings, but it is solved with a relatively standard dynamic programming algorithm.

Since there were only 1000 possible inputs, I've ran my solution on all of them and was sure that it works when submitting it. And then I ended up resubmitting it with just seconds left in the round :) When running it on the 1000 inputs for the first time, I've noticed that it takes some time to converge to the required number of colorings for inputs under ~700, but the solution becomes instant for inputs larger than that. Somehow I did not think this was bad at the time, but of course it was. The reason for this was that the number of colorings overflowed the double floating-point type, so it was immediately finding a tree with "positive infinity" colorings :)

Finally, I have also mentioned an Open Cup problem: there is a hidden array a with at most 250 elements, and you need to determine its contents after asking at most 30 queries. Initially, you just know that size of a and that each element of a is an integer between 1 and 109, inclusive. In one query, you can either ask for the value of a particular element of a, or choose a set of indices of elements of a, and get back the multiset of k*(k-1)/2 absolute values of pairwise differences between elements with those indices (where k is the number of indices you've chosen). Note that for a query of the second type you get back a multiset, in other words there is no order and you don't know which difference corresponds to which pair.

The first crucial trick that our team came up with is the following: if we send two queries of the second type for a set S and for a set S+{x}, and then find the difference between the returned multisets, then you get a set of absolute values of differences between x and all elements of S.

Now, if we somehow magically knew an element x which is the global minimum or maximum, we could solve the problem in the following way: let's repeat the above trick 8 times, on the i-th try asking for differences between x and a set of indices which have the i-th bit set to 1.

Since we have less than 28 elements, and all differences will be unique thanks to x being the global minimum or maximum, this will allow us to uniquely identify the element corresponding to each difference by looking at whether each bit in its index is 0 or 1. The difference corresponding to all bits being 0 will never appear in our output, but we can just start the indexes from 1 to avoid this issue.

Now we just need to find the value of x and whether it is the minimum or the maximum, which we can do with two queries of the first type, and all other elements can be obtained by adding (when x is the minimum) or subtracting (when x is the maximum) the corresponding difference. We have spent only 8*2+2=18 queries in this part.

How to find the global maximum or minimum using the remaining 12 queries? It turns out that we can craft a binary search using the presence of the overall maximum absolute difference as our test. First, we send a query of the second type for all indices, and find the overall maximum absolute difference — it will be the difference between the global maximum and the global minimum.

Then, suppose we have identified a subset S which contains the global maximum, the global minimum, or both (initially S will be equal to all indices), and let T be its complement (initially empty). Let's split S into two equal parts A and B. Let's ask send a query of the second type for the set A+T. In case this query returns the overall maximum absolute difference, then A contains at least one of the global maximum and minimum (because otherwise T would have to contain both, which is a contradiction). In case it does not, then B must contain at least one of them. So we can reduce our set of candidates in half using one query of the second type, and therefore find our global maximum or minimum in 1+8=9 queries, solving the entire problem in 27 queries.

This problem was one of the few cases where we have actually used the whole Open Cup 3-person team to discuss it and make incremental progress until all pieces of the puzzle fell together. I found the "uncertain" binary search quite cool in particular, as we manage to look for one of the two elements without knowing which one of the two it is, and without even knowing if the remaining subset contains only one of the elements or both :)

Thanks for reading, and check back for more!

Tuesday, December 24, 2019

A perfectly balanced week

Codeforces Global Round 5 started the Oct 14 - Oct 20 week (problems, results, top 5 on the left, analysis). Radewoosh solved 8 problems with almost an hour left in the round, and since nobody was able to get all 9, his first place was quite clear. Well done!

I was stuck trying to squeeze my solution for problem H into the constraints for quite some time, and as the round was coming to its end I've decided to make it do multiple passes with some randomization until it fits and submitted. It turns out that this approach worked, and I think it's likely impossible to create a counter-test.

Here's what the problem was about: you are given two strings of even length, a and b, consisting of characters 0 and 1. You start with the string a. In one step, you can choose any prefix of the current string that has even length and reverse it. Your goal is to obtain the string b in at most n+1 steps. You don't need to minimize the number of steps. The length of each string is even and at most 4000.

Can you see a deterministic, or at least a randomized solution?

TopCoder SRM 769 followed on Saturday (problems, results, top 5 on the left). Unlike the previous round, the hard problem in this round had only 1000 possible inputs, so this time I went for a randomized solution intentionally as I could verify that it works on all inputs before submitting. This turned out to be the right approach :)

Here's the problem: you are given a number n between 1 and 1000, and need to construct any tree with n vertices that has at least 2.62n colorings of the following kind: each vertex is colored red, green or blue in such a way that the distance between any two red vertices is at least 3 (there are no constraints on green and blue vertices).

Open Cup 2019-20 Grand Prix of Southeastern Europe took place on Sunday (problems, results, top 5 on the left, onsite resultsanalysis). Team USA1 needed just half of the contest to solve all problems,  amazing performance! The stage win counts for the 3 strongest veteran teams are now at 2-2-2.

Solving the interactive problem C was a lot of fun. There is a hidden array a with at most 250 elements, and you need to determine its contents after asking at most 30 queries. Initially, you just know that size of a and that each element of a is an integer between 1 and 109, inclusive. In one query, you can either ask for the value of a particular element of a, or choose a set of indices of elements of a, and get back the multiset of k*(k-1)/2 absolute values of pairwise differences between elements with those indices (where k is the number of indices you've chosen). Note that for a query of the second type you get back a multiset, in other words there is no order and you don't know which difference corresponds to which pair.

Finally, Codeforces Round 594 happened right in the middle of the Open Cup round (problems, results, top 5 on the left, analysis). tlwpdus was able to claim the first place even though they solved one problem less than rhrnald, thanks to the fast solutions for D and E and having no incorrect attempts compared to rhrnald's five. Congratulations to both!

Thanks for reading, and check back for more.


A trigonometric week

The Oct 7 - Oct 13 week had TopCoder SRM 768 (problems, results, top 5 on the left, analysis). My solution to the medium problem turned out to be a bit simpler than the author's approach. Here is the problem statement: n players are competing in a tournament. Each player has an integer strength which is 1, 2 or 3. In one round of the tournament, we pick two random remaining players, and the one with the lower strength is eliminated. In case their strengths are equal, we eliminate one of them at random (each with probability 0.5). In the end only one player remains, and the order of elimination is the ranking of the players. What is the expected rank of a player with strength 2? Your input is three numbers, the number of players with strength 1, 2 and 3 respectively, each up to 2000.

The fifth Open Cup stage, the Grand Prix of Korea, took place on Sunday (problems, results, top 5 on the left, analysis). Team Polish Mafia was way ahead of everybody else this time, solving 10 problems to the second place's 8, and bringing the number of stage victories between the three strongest veteran teams to 2-2-1 (as a sneak peek into the present, no other team managed to win a stage in 2019). Congratulations!

In my previous summary, I have mentioned an AtCoder problem: you are given 3000 points on the circumference of the unit circle. What is the expected position of the center of the circle inscribed in a triangle formed by a random triple of those points?

A cubic solution is straightforward: iterate over all triples, find the center of the incircle, find the average of those. However, it will not fit in the time limit.

The most typical way to speed up this type of computation is to somehow decompose the inner expression into parts which depend on only one or two out of three points, and then depending on what "decompose" means exactly we can use partial sums, or FFT, or some other trick to find the sum while only iterating over points or pairs of points, therefore obtaining a quadratic solution.

At this point I've looked at the Wikipedia article to find a formula that can be decomposed. The barycentric coordinates sin(A):sin(B):sin(C) caught my eye for the following reason: since in this problem all points lie on a circle, the angle A only depends on the locations of the points B and C and not on the location of point A (this is a middle/high school geometry theorem; more precisely, there are two possibilities for the angle A depending on which side of BC does the point A lie). This is exactly the "depends on only two out of three points" property we're looking for.

The barycentric coordinates formula looks like

(xA,yA)*sin(CAB)/(sin(CAB)+sin(ABC)+sin(BCA))+... (the rest is symmetric terms for B and C)

We need to sum this over all triples, so the total coefficient next to (xA,yA) is the sum of

sin(CAB)/(sin(CAB)+sin(ABC)+sin(BCA))

over all values of B and C. Each individual sin() depends only on two points, but they are still combined together in one term, so we need to decompose further.

Let's denote the angle ABC as β, and the angle BCA as γ, the angle CAB can be found as π-β-γ, and since sin(π-β-γ)=sin(β+γ) we have

sin(β+γ)/(sin(β)+sin(γ)+sin(β+γ))

The main difficulty here lies in the fact that we have a sum in denominator, but we want a product, since those are easier to decompose. We can use another fact from middle/high school (which I googled of course), this time from trigonometry, and replace sin(β)+sin(γ) with a product like this:

sin(β+γ)/(2*sin((β+γ)/2)*cos((β-γ)/2)+sin(β+γ))

Now we have trigonometric functions of (β+γ)/2 and of β+γ, to make things simpler we can replace sin(β+γ) using another middle/high school formula to get:

2*sin((β+γ)/2)*cos((β+γ)/2)/(2*sin((β+γ)/2)*cos((β-γ)/2)+2*sin((β+γ)/2)*cos((β+γ)/2))

Here we get a confirmation that we are on the right track: the multiplier 2*sin((β+γ)/2) appears both in the numerator and in the denominator and cancels out! So we have just

cos((β+γ)/2)/(cos((β-γ)/2)+cos((β+γ)/2))

Now we can use yet another trigonometric formula, for the cosine of sum, to get:

(cos(β/2)*cos(γ/2)-sin(β/2)*sin(γ/2))/(cos(β/2)*cos(γ/2)+sin(β/2)*sin(γ/2)+cos(β/2)*cos(γ/2)-sin(β/2)*sin(γ/2))

The term sin(β/2)*sin(γ/2) cancels out in the denominator, so we have just:

(cos(β/2)*cos(γ/2)-sin(β/2)*sin(γ/2))/(2*cos(β/2)*cos(γ/2))

Or even simpler

1/2*(1-tan(β/2)*tan(γ/2))

Finally, this is decomposed enough. Even though β and γ are not chosen completely independently (the points have to be in the right order on a circle), for each particular value of β (in other words, after fixing the points A and C) we need to find a prefix sum of tan(γ/2) over a range of possible points B, therefore we can precompute those prefix sums after fixing the value of A and obtain a quadratic solution (here is my code).

I agree that this does look a bit like bashing the formula until some parts magically cancel out, but for me it felt like that both choosing the formula to bash (the one with angles, since those depend only on two points), and the way of bashing (trying to get a product instead of a sum in the denominator) were actually guided by the algorithmic constraints of the problem, and therefore were not so dependent on luck.

Thanks for reading, and check back for more!

A fork week

AtCoder Grand Contest 039 was the highlight of the Sep 30 - Oct 6 week (problems, results, top 5 on the left, analysis). ksun48 was the fastest on five problems, but ecnerwal was also able to finish problem F in about an hour, something that other top finishers could not match. Congratulations on the win!

Problem D has gathered quite some heat (see this comment and the thread below), but I think the problem was actually very nice: you are given 3000 points on the circumference of the unit circle. What is the expected position of the center of the circle inscribed in a triangle formed by a random triple of those points?

In my previous summary, I have mentioned a Codeforces problem: consider a bipartite graph with 6 vertices in each half, and each edge existing with a certain probability (these 36 probabilities are given to you). What is the probability that this graph has a perfect matching? The probability needs to be computed modulo 109+7, and the time limit is 7 seconds.

My approach is already described in this comment, but let me repeat it here since it's short anyway. Let us run the normal dfs-based algorithm for finding maximum matching, but whenever the algorithm needs to check if an edge exists, we will fork the search, one branch continuing as if it does, and one branch continuing as if it does not. Why would something like this work fast: the intuition is, since dfs stops right after finding any augmenting path, in most branches we won't care about the state of many edges, therefore the total amount of branches won't be too big.

Here is my implementation, the number of branches it considers for a 6x6 graph is 16404365 (much less than 236, as expected), practically it means getting accepted in 5.3s out of 7s time limit.

This approach is quite tricky to get right, though. Since we have to branch a recursive algorithm (dfs), we need to rewrite it to maintain the stack in arrays we control so that we can copy those. This process is quite tedious and error-prone, and it took me quite a lot of time to find a bug in that part of the code (if I remember correctly, I was missing line 113 "stack[sp - 1] = cur;").

Here's my question: is there a programming language or an approach where implementing such branching search on a recursive algorithm would be easy, and that would still fit into the 7 second time limit for 16 million branches? fork() in C/C++ comes to mind as the first choice, but I think its overhead is way too high for such number of branches to be practical.

Thanks for reading, and check back for more!

Monday, December 23, 2019

A bridge week

Codeforces Round 588 started the Sep 23 - Sep 29 week (problems, results, top 5 on the left, onsite results with one more easy problemanalysis). Um_nik solved six problems almost half an hour faster than other top finishers, therefore it's even surprising that his advantage is only 233 points. Congratulations on the convincing victory!

I have enjoyed coming up and implementing a weird approach to problem E1: consider a bipartite graph with 6 vertices in each half, and each edge existing with a certain probability (these 36 probabilities are given to you). What is the probability that this graph has a perfect matching? The probability needs to be computed modulo 109+7, and the time limit is 7 seconds.

How would you approach this problem?

Almost six days later, Open Cup 2019-20 Grand Prix of Eurasia has wrapped up the week (results, top 5 on the left). The winners of the first three stages lined up on the podium this time, team Past Glory prevailing once again thanks to having much, much fewer incorrect attempts than its competitors. Congratulations!

In my previous summary, I have mentioned a supposedly medium-difficulty AtCoder problem: does there exist a simple connected undirected graph with exactly n vertices and exactly m edges that satisfies the q given constraints (nm, q<=105) of the following two types:
  • for a pair of vertices, there must be exactly one simple path between them
  • for a pair of vertices, there must be at least two simple paths between them
For most of the time during the round, I was trying to start from constraints of the second type, incorrectly assuming that such pairs of vertices must be in the same biconnected component. In fact, this is not necessary, since the two paths do not have to be completely disjoint, they just need to differ at some point.

The other approach direction turns out to be more fruitful: consider the (only) path between two vertices from a constraint of the first type. All its edges must necessarily be bridges in our graph, as otherwise we could remove one of them and find another path. Now if we group the vertices into connected components using the constraints of the first type, each such connected component is connected with bridges only (possibly via some intermediate vertices). Therefore for any pair of vertices in such connected component there is exactly one simple path between them, which in turn means that if any two vertices in a constraint of the second type are in one such component, then there is no solution.

Notwithstanding the requirement to have exactly m edges, the above case is the only one when there is no solution at all, because of the following construction: let's take the connected components using the constraints of the first type, build an arbitrary tree of bridges inside each such component, then pick one vertex in each component and connect all of them in one big cycle. In such a construction, for any two vertices inside one component there is exactly one path between them, and for any two vertices from different components there is at least two paths between them.

This construction has n edges (since it has exactly one cycle), which is the minimum possible number of edges if we have at least one constraint of the second type (since we must have at least one cycle then). Now, if instead of connecting the chosen vertices in one big cycle we connect them using a complete graph, we get another valid solution but with a lot more edges: if we have k components, the number of edges is n-k+k*(k-1)/2=n+k*(k-3)/2. It is also easy to see how to achieve any number of edges in between those two values, as we can keep adding edges to the cycle until it becomes the complete graph.

The only remaining step is to notice that we can't have more edges than that. The reason for that is that we can't have more than one edge between two components (otherwise one of the component's edges would not be a bridge, as we could bypass it), and adding more vertices to a component only makes things worse as we're replacing a lot of edges with a single new bridge.

Thanks for reading, and check back for more!

Sunday, December 22, 2019

A bfs week

Codeforces hosted its Round 586 during the Sep 16 - Sep 22 week (problems, results, top 5 on the left, analysis). The somewhat unusual problem difficulty distribution meant the differences between the top competitors were quite tiny, but still mnbvmar was the fastest of all solving six problems in 37 minutes. Well done!

AtCoder Grand Contest 038 took place on Saturday (problems, results, top 5 on the left, analysis). It was one of those days where tourist solved six problems in 79 minutes, and I ended up with three problems after the whole 110-minute round :) Congratulations to him but also to ksun48 and hbi1998 on solving all problems, too!

Problem D has got me stumbled: does there exist a simple connected undirected graph with exactly n vertices and exactly m edges that satisfies the q given constraints (nm, q<=105) of the following two types:
  • for a pair of vertices, there must be exactly one simple path between them
  • for a pair of vertices, there must be at least two simple paths between them
The general approach was somewhat clear, but I could not dig through all details. Can you?


For the third week in a row, Open Cup was the last round of the week, this time the Grand Prix of SPb (results, top 5 on the left). Even though team Past Glory was the fifth team to solve all problems, more than an hour later than team USA1, they have returned to the top of the standings thanks to having fewer incorrect attempts. Congratulations!

In my previous summary, I have mentioned a Codeforces problem: you are given a connected graph with m<=105 edges, numbered with consecutive integers from 1 to m. For any path, we obtain its cost by writing down the numbers of its edges in sequence without spaces to obtain one huge number. What is the smallest cost of a path from vertex 1 to each other vertex? You need to return the costs modulo 109+7 (but the smallest cost is chosen before the modulo).

The first idea is to notice that multiple-digit numbers are unwieldy, and that we can avoid them by replacing each undirected edge with two directed edges and then inserting auxiliary vertices in the middle of each directed edge splitting it into single-digit edges: instead of a-1204-b, we would have a->1->u->2->v->0->w->4->b and b->1->x->2->y->0->z->4->a. This does not change the costs of simple paths in the original graph, and since our cost function still satisfies the property that taking a subset of edges in a path results in smaller cost, we do not need to consider non-simple paths.

Now we have a directed graph with only single-digit edges, and need to find the shortest path from vertex 1 to each other vertex, breaking ties by taking the lexicographically smallest one. It turns out that we can modify breadth-first search to do it: in the breadth-first search queue, we will maintain which pairs of adjacent elements have exactly the same distance from vertex 1, and which have different distances. Now, when we need to take the next vertex out of the queue to process, we will instead take out the whole block of vertices with the same distance from 1, and then first process all 0 edges from all vertices from the block, then all 1 edges from all vertices from the block, and so on. This allows us to keep adding vertices to the queue in the correct order (by length, then lexicographically), and to maintain the "blocks by equality" in it.

Note that we never need to compare the actual path costs in this solution, therefore we can store them modulo 109+7.

Thanks for reading, and check back for more!

A Dasha week

Welcome to the Christmas marathon :)

TopCoder SRM 766 was the first round of the Sep 9 - Sep 15 week (problems, results, top 5 on the left, analysis). With his 44th SRM win, tourist has matched SnapDragon's second place in the record book (he has won two more since that week, but I will tell that slightly sad story later :)). Congratulations!

Codeforces Round 584 followed on Saturday (problems, results, top 5 on the left, my screencast, analysis). Problem F in this round reminded about a nice, if a bit standard, trick: you are given a connected graph with m<=105 edges, numbered with consecutive integers from 1 to m. For any path, we obtain its cost by writing down the numbers of its edges in sequence without spaces to obtain one huge number. What is the smallest cost of a path from vertex 1 to each other vertex? You need to return the costs modulo 109+7 (but the smallest cost is chosen before the modulo).

Open Cup 2019-20 Grand Prix of Warsaw was the last event of the week (problems, results, top 5 on the left, analysis), with another relatively new veteran team taking the top spot: team USA1 shares 2/3 with the last year's team MIT THREE, but this is the first season in the [ICPC] veteran status for them. Congratulations on the win and keep it going!

Thanks for reading, and check back for more.

Saturday, November 30, 2019

A mafia week

The Sep 2 - Sep 8 week had two Codeforces rounds, starting with Codeforces Round 583 (problems, results, top 5 on the left, analysis). wxhtxdy solved all problems with just 3 minutes remaining, under pressure from Radewoosh who kept submitting the last problem for the last 20 minutes of the contest but still couldn't make it work. Congratulations to wxhtxdy!

The second Codeforces round was called Kotlin Heroes: Episode 2 (problems, results, top 5 on the left, analysis). tourist, abacabadabacaba and Benq solved everything, but tourist was a whole 40 minutes faster. Well done!

Finally, the 2019-20 season of the Open Cup started with the Grand Prix of Kazan (problems, results, top 5 on the left). Team Polish Mafia showed how to come back as a veteran team in style, winning the contest by a whole problem. Congratulations, and looking forward to competing against you in the future rounds!

Thanks for reading, and check back for more.

A Petrozavodsk week

There were no major public contests during the Aug 26 - Sep 1 week, in part because many top ICPC teams were busy at the Petrozavodsk training camp (overall standings, top 5 on the left), which traditionally kicks off the new ICPC season. The results showed that while NNSU team looks pretty dominant in the renamed Northern Eurasia region, Seoul NU team could pretty much keep up with them, setting us up for an exciting season!

Incidentally, the Northern Eurasia Finals take place tomorrow, tune in to the broadcast, or try your luck at the same problemset in the Codeforces mirror :)

Thanks for reading, and check back for more.

Friday, November 15, 2019

A regular week

The Aug 19 - Aug 25 week had TopCoder SRM 765 (problems, results, top 5 on the left, my screencast). I've spent quite a lot of time on each of the first two problems, but managed to recoup those losses by challenging a lot of solutions to the easy problem, which required a lot of accuracy when coding and ended up having just 20% success rate.

Codeforces hosted a round called "Manthan, Codefest 19" (problems, results, top 5 on the left, analysis). Nobody except tourist could get the hardest problem accepted, and therefore the challenge fight was only for the second place. Congratulations to Gennady on the win!

In the previous summary, I have mentioned an AtCoder problem: you are given a n times m grid (n, m <= 100), containing each number between 1 and nm once. Your goal is to sort the grid in row-major order in three steps. In the first step, you can rearrange the numbers within each row arbitrarily. In the second step, you can rearrange the numbers within each column arbitrarily. In the third step, you can rearrange the numbers within each row again.

Before the third step, we need each row to contain the numbers it should have in the end, in any order. Let's color numbers in n colors, such that all numbers that end up in the same row have the same color. Then before the first step we must have all numbers of one color in the corresponding row.

This means that before the second step, each column must have exactly one number of each color. In the first step, we are rearranging the rows arbitrarily, so let's assume that we're filling columns one by one, so every time we need to take one from the remaining colors in each row in such a way that we get one of each color.

This subproblem is a matching one: we can make a bipartite graph where the left part contains rows, the right part contains colors, and there's an edge for each number in the initial grid. Since this graph is regular, it has a matching, and moreover it keeps being regular after we remove all edges of a matching, so we can just repeat this process m times to split the entire graph into n matchings, which gives us the necessary rearrangement of numbers for the first step.

Thanks for reading, and check back for more!

Thursday, November 14, 2019

A dragon week

All contests during the Aug 12 - Aug 18 week took place on the weekend. AtCoder Grand Contest 037 was the first (problems, results, top 5 on the left, my screencast, analysis). I was quite fast in solving the harder problems in this round, but got stuck for 25 minutes in the supposedly easy problem B, allowing apiad and Um_nik to take the first two spots. Well done!

Problem D in this round was very nice: you are given a n times m grid (n, m <= 100), containing each number between 1 and nm once. Your goal is to sort the grid in row-major order in three steps. In the first step, you can rearrange the numbers within each row arbitrarily. In the second step, you can rearrange the numbers within each column arbitrarily. In the third step, you can rearrange the numbers within each row again. Can you see why this is always enough, and how to actually find one way to do it?

Just a couple of hours later, TopCoder hosted the TCO19 Wildcard Round (problems, results, top 5 on the left, parallel round results). Despite failing the hard problem for a not very good reason, xudyh was still much faster than everybody else and qualified to the onsite finals together with uwi. Congratulations to both!

Codeforces Round 580 wrapped up the week (problems, results, top 5 on the left, analysis). TLE went for problem F while skipping some easier problems, and this turned out to be the right approach as he had much more points in the end. Well done!

In the previous summary, I have mentioned my Code Jam problem: there are several kings on an infinite chessboard. You don't know neither the number nor the locations of the kings, only that there are at most 10 and their coordinates are at most a million in absolute value. You can send at most 1000 requests, in each request you choose a cell on the chessboard and are told the sum of distances between this cell and the locations of the kings. The distance between two cells (a,b) and (c,d) for a king is max(abs(a-c),abs(b-d)). Your goal is to learn to answer such requests yourself: after you've done sending requests, the judging system will in turn send you 1000 such requests, and for each request you need to reply with the sum of distances from the locations of the kings to the request location.

The solution has two main parts. First, we need to notice that the distance metric used in this problem, the Chebyshev distance, is actually equivalent to the Manhattan distance if we make a substitution x'=x+y, y'=x-y, so it is enough to solve the problem for the Manhattan distance function: abs(a-c)+abs(b-d).

Now, suppose we send two requests: (a,b) and (a+1,b). The difference between the answers will tell us the difference between the number of points to with x>=a+1 and the number of points with x<=a. If we also know the total number of points, this allows us to tell how many points have x<=a, and we can find the total number of points by using this with a very large value of a.

We can then use the binary search on a using the "the number of points that have x<=a is at least i" predicate to find the i-th x-coordinate, and by repeating this for each i we can find all x-coordinates. We can find all y-coordinates in the same fashion.

However, we don't actually know how do the x-coordinates correspond to the y-coordinates, what are we going to do about that? It turns out that we don't need to know: the coordinates are summed independently in the abs(a-c)+abs(b-d) formula, so just knowing all coordinates is enough to be able to answer the requests! Moreover, the same argument shows that it's actually impossible to determine how the x-coordinates match the y-coordinates from such requests :)

There is one more technical complication arising from the fact that after rotating by 45 degrees we can no longer send arbitrary requests in the new coordinate system, only those where x and y have the same parity, so we can't actually send (a,b) and (a+1,b). However, we can fix that by using a very large b and sending (a,b) and (a+1,b+1) instead. The contribution from y to this difference will always be equal to the total number of points, and we can just subtract that.

I have also mentioned a TopCoder problem: given four squares that sum to an even number, find four squares that sum to half that number.

My approach to solving this problem was googling, as it seemed like it should be some well-known trick. And indeed after playing with keywords a bit I stumbled upon Euler's four-square identity. Substituting a1=1, a2=1, a3=0, a4=0 we get a very simple system of equations to solve.

Thanks for reading, and check back for more!

A six-in-a-row week

Google Code Jam 2019 Final Round was the main event of the Aug 5 - Aug 11 week (problems, results, top 5 on the left). There was a lot of movement at the top of the scoreboard during the round, since we did not know which submissions were solving just the visible input, and which tried to get both. After the dust had settled, things were much less suprising :) Gennady has won it for the 6th time in a row, incredible!

I have prepared the easiest problem in this round, Board Meeting: there are several kings on an infinite chessboard. You don't know neither the number nor the locations of the kings, only that there are at most 10 and their coordinates are at most a million in absolute value. You can send at most 1000 requests, in each request you choose a cell on the chessboard and are told the sum of distances between this cell and the locations of the kings. The distance between two cells (a,b) and (c,d) for a king is max(abs(a-c),abs(b-d)). Your goal is to learn to answer such requests yourself: after you've done sending requests, the judging system will in turn send you 1000 such requests, and for each request you need to reply with the sum of distances from the locations of the kings to the request location.

Can you see how to solve the problem, and why did we use such elaborate two-phase process instead of just asking to determine the locations of the kings?

TopCoder SRM 764 took place on the weekend (problems, results, top 5 on the left, my screencast, analysis). There were five problems instead of the usual three, and the divisions were combined. I can't say I enjoyed this format too much, but I did enjoy solving the cute mathematical puzzle that was the 500-point problem: given four squares that sum to an even number, find four squares that sum to half that number. Can you find a way to do it?

Thanks for reading, and check back for more!