Sunday, September 26, 2021

A 202 week

Facebook Hacker Cup 2021 Round 2 narrowed the field to 500 qualifiers on Saturday (problems, results, top 5 on the left, analysis). maroonrk was a lot faster than everybody else, and he also bravely skipped writing a dedicated solution for C1, submitting problem C2 first and then quickly adapting its solution for C1. Congratulations on the clear first place!

This contest ended up being quite a struggle for me. First, I did run into the stack overflow issues that are a recurring topic for the Hacker Cup. It seems that I have changed my setup compared to the last year, and now even about 15 minutes of googling during the round and another half an hour after the round did not help me find a way to increase the stack size in the new setup. In the end I just gave up and submitted by compiling with g++ in the command line (in hindsight, this decision should have come earlier).

However, I'm still curious how to increase the stack size in my setup, which is CLion that uses WSL as the toolchain, compiling with clang++. I've tried various compiler/linker flags such as -Wl,-stack_size and  -Wl,--stack, but none of them worked. I've tried editing /etc/security/limits.conf inside WSL, but that did not seem to have any effect. I've found answers suggesting that the stack size can only be changed in WSL 2, and I double-checked that I do in fact have WSL 2. I've tried calling setrlimit from within my code, but it did not help either. I've found mentions that calling sudo prlimit --stack=unlimited --pid $$; ulimit -s unlimited would help, but it's not clear how to do it when I'm executing my program from within CLion, as I could not find a way to specify that some commands need to be run before my program. Putting that command in .profile did not help.

So, I'm turning to the readers of my blog for help. Do you know how to increase the stack size in the CLion+WSL+clang setup?

EDIT: this has been resolved in comments, it turns out that I did not double-check enough and I had WSL1. With WSL2 /etc/security/limits.conf does the trick.

This was not the end of my fight with C++, however. After implementing the solution for C2, I've tried to check it before submitting by feeding it the input from C1 augmented with many random changes. I've noticed that it did agree with my solution for C1 in debug mode, but it produced a different answer in release mode! I went on to debug this for around half an hour, could not find any issues, and finally noticed that compiling from the command line with g++ with optimization on (the release mode mentioned above was compiling with clang++ from within CLion) produces the correct answers, so I've used that to make my submission which turned out to be correct.

After the contest I discovered what the problem was using binary search on clang++ flags: in release mode, CLion (which inherits this behavior from CMake as I understand) passes -DNDEBUG flag, which makes assert(f()) statements not execute f() at all, and I've used assertions to verify that the elements that I try to erase from sets are actually found: assert(set.erase(x)). I guess everyone has to learn this lesson once :)

Finally, I was quite excited to come up with a cool solution for problem D inspired by the algorithm described in this post from 9 years ago (see the second solution to problem F). However, I missed the fact that even though I only discarded at most 18 numbers in my solution, some of those discarded numbers are differences of, or differences of differences of original numbers (and so on), so we can discard much more than 18 original numbers. Therefore my solution has failed, but luckily the first four problems were enough to advance. Funnily enough, if I'm reading the feedback page correctly, this solution has almost passed, only failing one testcase where it discarded 31 numbers (it was only allowed to discard 25).

Open Cup 2021-22 Grand Prix of XiAn followed on Sunday (results, top 5 on the left, analysis). Team USA1 has returned to the winning ways, even though they were not the first team to reach 10 problems: Ildar Gainullin did this 20 minutes earlier, but his penalty time was ruined by the +24 in problem I. Congratulations to both teams!

In my previous summary, I have mentioned an Open Cup problem: you are given a positive integer k up to 106, and need to produce any set of at most 30 integers with absolute values not exceeding 1016 such that the number of its subsets with zero sum, including the empty subset, is exactly k.

Since we did not have anything else to code, I got to play with this problem using a computer during the round. First, I've tried a greedy approach where we start with a set containing just the number 1, and then we keep adding numbers that give us as many additional zero-sum subsets as possible (but not exceeding k). This approach found a solution of size ~100 for large values of k if I remember correctly. When looking at the solution, I've noticed that after starting with a few 1s and -1s, it starts using bigger and bigger numbers, which, while locally optimal, means that on later steps we have many different sums, and therefore not so many subsets with the same sum.

I've tried to improve the solution by prescribing it to use more 1s and -1s before falling back to the greedy approach, which brought the solution size to ~70, still quite far from the required 30.

Then I've tried to think how a more constructive solution to this problem look like. One idea that came to mind is to split the solution into two parts: in the first part, we use only positive numbers, and therefore do not get any zero-sum subsets. Then, we use only negative numbers, and we have some kind of a knapsack problem: for each negative number, we know the number of ways to build its negation as the sum of some positive numbers, and we need those numbers to add up to k. It's not exactly a knapsack though since there might also be zero-sum subsets with more than one negative number.

However, this line of thought allowed me to come back to my greedy approach with a new idea: instead of starting with 1s ands -1s, I've tried starting with 1s and 2s first (just iterating how many of each we have, up to a total of 30), and then greedily adding only negative numbers, each time maximizing the number of new zero-sum subsets, but not exceeding k. This has dramatically improved its performance, finding solutions with less than 30 numbers for k=106 and k=106-1. I'm not sure if this rationalization is correct, but I think the reason for such improvement might be the fact that now we do the "exponential blowup" only once instead of twice (using only positive numbers for it).

This solution did not pass, but it passed quite a few tests with 1000 cases in each one, so I've modified it to also try having some 3s, which was enough :)

Thanks for reading, and check back next week!

4 comments:

  1. It seems many people in the recent round had been struggling with stack size problem

    ReplyDelete
  2. I know that ulimit -s unlimited works for me when using Wsl2 and compiling from terminal.

    Maybe try something like system("ulimit -s unlimited") on first line of main?

    Another thing maybe is call ulimit from clion's terminal ui?

    ReplyDelete
    Replies
    1. Thanks for your suggestions! They did not seem to help, but then I decided to double-check and realized that I checked the version wrong, and I did in fact have WSL1. After upgrading to WSL2 and adding

      * soft stack unlimited
      * hard stack unlimited

      to /etc/security/limits.conf everything started working!

      Delete
  3. It seems stack size is an architecture-specific flag in LLD. I have some old version at hand and only macOS and Windows variants of LLD offer a stack size flag; it might be different in newer LLD but I doubt it.

    ReplyDelete