Wednesday, October 22, 2008

XXI St. Petersburg State University Championship - problem C

I've participated in a contest this weekend in St. Petersburg, and I'd like to explain (at least) one problem from that contest - I think it may be interesting even for people not very much into programming competitions.

The problem is: how many parallelograms are there with vertices having integer coordinates, x from 0 to n, y from 0 to m (both inclusive)? n and m are given as input, and both don't exceed 2000 (so an O(n^2) algorithm is almost surely OK, while O(n^3) or more is almost surely too much).

This is kind of classical problem if we replace parallelograms with triangles. But it turns out that parallelograms allow for some more interesting solutions. There's an approach that is the same for both parallelograms and triangles. What's the problem with counting those parallelograms directly? Well, there may be too much - on the order of n^6 (the parallelogram is basically defined by 3 of its vertices) - so we need to somehow count many at once. How can we do that? The most obvious way would be to count parallelograms that differ only by a shift together. But even that would only get us n^4 variants. To do better, we need to somehow group different parallelograms together. The most straightforward way to do this is to count all parralelograms that have the same bounding box together. Why? Because all those parallelograms have the same amount of possible shifts (if the bounding box is [x1,x2]*[y1,y2], then we have (n-x2+x1+1)*(m-y2+y1+1) possible shifts), so the answer for the problem can be calculated as: sum (w from 0 to n) of sum (h from 0 to m) of num(0,0,w,h)*(n-w+1)*(m-h+1), where num(x1,x2,y1,y2) is the number of different parallelograms with the given bounding box. If we can get that number in O(1) , then we're done - we have a O(n^2) solution!

For a parallelogram ABCD to have the given rectangle (w times h) as a bounding box, the parallelogram must have a vertex on each of the sides of this rectangle. There're two choices:
1) one of the vertices of the parallelogram (say, A) coincides with a vertex of the rectangle. Then it's easy to see that the opposite vertex C of the parallelogram coincides with the opposite vertex of the rectangle, and the rest of the parallelogram is defined by the position of the third vertex B of the parallelogram. The number of possibilities for B is just the number of points inside the rectangle, minus the 2 points already taken, and minus the points that lie on the segment AC. And we need to divide the resulting number by 2 to account for B and D being interchangeable, and multiply it by 2 to account for two possible choices for which vertices of the rectangle A and C go to. Thus, we get: ((w+1)*(h+1)-gcd(w,h)-1)/2*2=(wh+w+h-gcd(w,h)). (The reason why gcd appears here is left to the reader)
2) Each side of the rectangle has one vertex of parralleogram strictly inside it. Then, the parallelogram is defined by two of its vertices, A and B, as C and D will be located in symmetric positions, so there're (w-1)*(h-1) choices here.

So, we'd have a O(n^2) solution if we knew the values for gcd(w,h). But, we can calculate all such gcd's in O(n^2) using Dynamic Programming and the fact that gcd(w,h)=gcd(w-h,h) for w>h. So we're done!

There's another O(n^2) solution that groups parallelograms in a different manner: we can group them by the location of their center, which always has half-integer coordinates. Given the location of the center, the parallelogram is defined by two of its vertices, and those two vertices can be placed arbitrarily inside some rectangle, minus the cases where they lie on the same line with the center.

But explaining those solutions is not the main point of this post. I'm writing this because, when I was asked recently at a TC Spotlight Session what kind of mathematical background is needed to succeed in programming competitions, my answer was "One thing that is absolutely crucial is to be able to think 'mathematically'", and I had difficulty explaining what that means. I believe this problem, and this solution, is an excellent example, and explanation, of that phrase. It's crucial to be able to do several quite easy reduction steps until your problem splits into several quite easy problems, and to carefully do the reductions without omitting any important cases (like when the gcd appeared in the above solution). To be able to 'explore' some given direction deeply, and check if it yields a solution or not (instead of just trying to apply some known ideas one by one).

Does this make sense?

And also, can you solve the above problem faster than O(n^2)?


  1. yes, this kind of problems are educative.
    PS: In my opinion, another good example is counting quadrilaterals inside an rectangular grid (though the method is similar).

    Petr, would you mind sending an email to ? I got a favor to ask, regarding programming contests, but I can't find your contact information. Thank you in advance!

    - Rujia Liu

  2. Hi petr
    great post, I find it very educational
    thanks and keep educating dummies like me!

  3. Hey Petr, I set the parallelogram problem in a Romanian contest a while ago. I had the n^2 bounding box solution.

    Another one that I set was counting the number of rectangles on a nxn grid. My solution was O(n^3).

  4. I believe I've got it linear.
    Assuming n less than m,
    result = a[n] + b[m],
    where a[0]=1, a[i+1]=4*a[i]+(i+1)^2+i*(i+1)*2*2=4*a[i]+(i+1)*(5i+1),
    b[n]=a[n], b[i+1]=4*b[i]+(i+1)*(n+1).

    I.e., 'a' for

    x00 xx0 xxx
    000 -> xx0 -> xxx -> ... ,
    000 000 xxx

    and 'b' completes n x n square to n x m rectangle.

    Does it makes a sense?

  5. meant

    x000 xx00 xxx0
    0000 xx00 xxx0
    0000 0000 xxx0
    0000 0000 0000

  6. arrgghh,
    meant b[i+1]=2*b[i]+(i+1)*n

  7. crap, crap, crap...
    no more fixes...
    but it does look linear

  8. Nice blog :)

    @Cosmin : That can be done in O ( lg N ) ;-)

  9. I want to learn the new language Leonardo - hlc.
    Any one have exp in it?