## Wednesday, May 10, 2017

### A zero pigeon week

The April 15-16 Easter weekend was even more packed with contests. First off, Google Code Jam Round 1A took place very early on Saturday (problems, results, top 5 on the left, analysis). Eryx has solved all problems 40 minutes faster than anybody else, but he might've been taking a big gamble, as the last problem was exceptionally tricky (8 correct out of 124 attempts). In any case, congratulations on the victory!

I think problem A was a great example of a beautiful easy problem. You are given a grid where some cell contain letters, and some are empty, such that each letter appears at most once. Your goal is to write letters to all empty cells in such a way that the cells with each letter form a grid-aligned rectangle. You're only allowed to use letters that are already present in the grid. Any valid solution suffices.

For example,
G??
?C?
??J
can be solved as
GGJ
CCJ
CCJ

A few hours later AtCoder held its Grand Contest 013 (problems, results, top 5 on the left, analysis). Nobody has managed to get all problems right, and yosupo was the fastest to solve all but one. Well done!

I've had the most fun with problem E. We are given a long (109) segment. Some (at most 105) integer points on the segment are special. We consider all ways to take some set of non-special integer points on the segment. Each such set splits the segment into smaller segments. Let's find the product p of their lengths, and then compute the sum of the squares p2 of those products over all ways. What number are we going to get, modulo 109+7?

Open Cup 2016-17 Grand Prix of America took place on Sunday (results, top 5 on the left, other results on same problems). Huge congratulations to team Deep Dark Fantasy on solving the hardest problem I and winning the round!

We didn't solve the very nice problem K during the round. We are given a number k, a parentheses sequence of length 105, and also for each position you're given the cost of changing the parenthesis in this position to the opposite one. Our goal is to produce a parentheses sequence that is k-unbalanced: it requires changing at least k+1 parentheses to arrive at a balanced parentheses sequence. What is the smallest total cost to achieve that?

Right in the middle of the Open Cup round, Russian Code Cup 2017 held its second Qualification Round (problems, results, top 5 on the left, analysis). Congratulations to AndreiNet on beating the others to the first place by just two minutes of penalty time!

And to wrap up the Sunday, Codeforces hosted VK Cup 2017 Round 2 (problems, results, top 5 on the left, parallel round results, analysis). The goose team stood out by being the only one to solve all five problems - congratulations!

In my previous summary, I have mentioned an interactive Open Cup problem: two players are playing a game using a grid with n+1 rows and n columns. Each cell of the grid will contain a 0 or a 1, but the contents of the cells are initially undetermined. The first player, let's call him Dirichlet, can ask questions of form "what is the sum modulo 2 in this subset of cells", for any subset of cells. The second player, let's call him Pigeon, answers the questions - each answer is either 0 or 1. Pigeon can answer questions in any way - for example, he does not have to imagine a concrete grid and use it to compute the answers.

Dirichlet's goal is to get one of the three outcomes:
1) Find a contradiction in Pigeon's answers. More precisely, find a set of questions such that each cell of the grid appears an even number of times in this set, and thus the sum of all answers must be 0 modulo 2, but the sum of Pigeon's answers is 1 modulo 2. One simple case of a contradiction is when he asked about the same set twice and got different answers, but there are much more complicated situations possible.
2) Find two different cells in the same column that he can prove to contain 1. In order to prove that a certain cell contains 1, he can use a set of his questions such that the interesting cell appears an odd number of times in them, and all other cells appear an even number of times in them, and the sum of Pigeon's answers for those questions is 1 modulo 2. One simple way to prove is to ask about the set containing just the interesting cell and receive answer 1, but there are also other ways.
3) Find a row such that he can prove that all its cells contain 0. Proof for each cell must be done similarly to the previous case.

Dirichlet knows that sooner or later one of those three things will happen, since the grid has more rows than columns, and thus it either has a row of 0s or two 1s in the same column. One way to certainly arrive at one of the outcomes is to ask about each cell of the grid separately, using n*(n+1) questions. But Dirichlet wants to win faster: using at most 5*(n+1) questions. Pigeon's goal is to prevent Dirichlet from arriving at one of the three outcomes so fast. You need to implement a program that plays for Dirichlet. n is at most 70.

Let's ask about the sum modulo 2 in each column. First, consider the case where all answers are 0 (it's not a special case in the solution, but it helps understand the general case). Then we can do the following: let's pick any row, and ask about its cells one by one. In case all of those come back as 0 as well, we have found a row of zeroes so we're done. In case one of those comes back as 1, we know there's another 1 in its column, since the sum of each column is 0 modulo 2, so we can just ask about other cells in that column one by one to find the second 1, and we're done.

Now, let's solve the general case: when some columns have an odd number of ones. Let's split all rows into two almost equal halves A and B arbitrarily, and let's ask about sum mod 2 in rows from A in each column with total sum 1 (we also learn the sum in each such column in rows from B, by subtraction). Since the total sum in each "interesting" column is 1, exactly one of its sums in A and in B will be 0, and exactly one will be 1. Now we want to pick either A or B, and continue recursively with this set of rows and only with columns that have sum 1 in it, until at some point we have no columns with sum 1 anymore. Which one to pick between A and B? Well, in order for the recursive calls to converge, we need to maintain the invariant: the number of rows in our set is strictly greater than the number of interesting columns. Initially it's true (n+1>n), and since we split the rows into two parts and columns into two parts, it's not hard to see that it will still be true either for A, or for B — and that's what we should pick.

What do we have after the dust settles? We have used at most n (initial column queries) + n (column queries on first split) + n/2 (column queries on second split, since we split the rows in two and the number of interesting columns keeps being less than the number of remaining rows) +n/4+... <= 3n queries. And we have the following artifact: we have a row such that for each column there's a segment of cells in that column that has sum 0 modulo 2 and contains the given row (whenever a column becomes "not interesting", we find such a segment for this column).

Now we can just do the same thing we did for the case where all columns had sum 0: we iterate over our row, find any 1, and then find the second 1 in its column (it exists, since we have a segment with sum 0 there), spending at most 2n more queries, so overall we fit exactly in 5n as needed.

How to come up with this solution? Once again, I think it was about slowly building up enough intuition about the problem. I've kept asking myself questions, and there were two that combined to make a breakthrough. One was: when does any knowledge about a sum of a set of many (more than 1) cells lead to an improvement versus a naive approach of just asking about all cells in some order? And the other was: how can we defeat Pigeon's strategy that always answers 0 until the last moment when that would lead to Dirichlet finding a row of zeros?

At this point I realized that knowing that the sum of a column is 0 is actually very valuable, since then it suffices to find just one 1 in this column instead of two. That led to the solution for the case where the answer for all columns is 0, and then the "parallel binary search" for the general case was a somewhat standard follow-up approach.

Thanks for reading, and check back soon for the next week's summary!