Saturday, January 15, 2011

Facebook Hacker Cup team: please read for your post-mortem

Facebook have launched their annual Hacker Cup programming competition, basically a clone of Google's code jam. The idea is: using any programming language of your choice, implement algorithms that solve a set of given problems, then run them on a set of test cases and submit the results and your source code. For example, one of the problem sets of last week's qualification round was called "Double Squares". In this simple problem, a test case consisted of a single integer number X, and the requirement was:
For each value of X, you should output the number of ways to write X as the sum of two squares.
Sounds easy? It is: just check each integer number M between 0 and sqrt(X) whether X-M*M is a square of another integer, and make sure not to double-count number pairs.

The real challenge is to spend the right amount of up-front thinking about the solution before diving into the implementation. Since Facebook will grade your submission not only based on correctness, but also on time, you'll want to start hacking as soon as possible. But without a clear plan on what you're doing, you might easily overlook the easiest approach. Or, even implement a non-optimal algorithm that doesn't perform well enough on larger input sets. I made both mistakes with "Double Squares": instead of spending a few extra minutes on up-front thinking, I immediately jumped into coding and started by creating an array of all square numbers up to X (of course I will need them!), and then got stuck in some kind of brute-force algorithm which would probably would have run for days .

Now, you really wanna make sure your algorithm works and performs well enough even for large input sets: Facebook won't give you the "real" test cases with the problem statement, just a few sample ones. Only once you're confident enough in your solution is when you should click the "download input file" link - a timer will start, and you have precisely six minutes to run your code against the test cases and submit the results.

A couple of things went wrong during the qualification round: for one thing, while the instructions mentioned the 6 minute time window, they were easy to overlook. So, according to the angry comments of many participants, a lot of people downloaded all the test cases up-front, only to find their time limit exceeded at submission time. Also, there were reports of technical problems, in particular from people using Chrome who weren't able to save the input file to disk.

Facebook responded in a reasonable way: after a couple of hours, they lifted the six minute limit and allowed all participants to submit their solutions, although with a hefty time penalty. Fair enough, since solving one of the problems was sufficient to succeed to round 1. Also, they promised:

We're aware that things haven't gone very smoothly during the qualification round, and we're sorry that some of you experienced some frustration. I wanted to let you know what we're doing to improve things from here on. (...) First off, we've compiled a list of bugs in the interface and things that people have been confused by. We'll address all of these points before the next round. 
The next round, which was supposed to consist of three different sub-rounds at different time slots, was announced for this weekend. But things didn't quite go as expected... while the risk of accidentally starting the six minute timer was mitigated by a confirmation popup (good job on that one), here are some of the problems that I and probably hundreds of others experienced:

  1. "After the Dance Battle" was a maze problem which required finding the shortest path from start to exit, with some obstacles such as walls and the ability to "warp" from one field to another one with the same color. Fairly easy to solve if you've ever done a breadth-first search on a graph.

    Problems: the example input had 5 test cases, the example output had 6. So this was already confusing. But to make things worse, the actual input file was inconsistent with the example input, and your code was likely to fail with something like an "index out of bounds" exception. Good luck fixing it within six minutes, and make sure not to panic..

  2. "Power Overwhelming" was basically a constrained optimization problem. Given a budget M and the cost of a warrior (W) and a "shield generator" (G), find the optimal number of warriors and shields to buy to maximize damage to an opponent. Damage created was the number of generators times the number of warriors.

    Problems: The original problem statement was wrong and asked to output the number of warriors for the optimal solution - which was inconsistent with the sample output. After about an hour, Facebook silently updated the page, which now asked to output the number of generators of the optimal solution. Wow. Now suddenly, my algorithm worked with the test input set and returned the expected results for all test cases - except one, which also turned out to be wrong.

  3. Submitting the results and source code would only work occasionally, mostly clicking the submit button would not provide any feedback to the user - while the six minute timer was running down to zero. Again, don't panic!

  4. Facebook realized that there were technical problems with the submission, and I can't believe what their response was: about 30 minutes before the end the round, they announced:
Given the high rate of submission failures we are experiencing, we are shutting down the first subround a little early.

Wow. I can only imagine the angry and frustrated faces of all the participants who, despite all the challenges,  managed to come up with a working implementation ready for submission - only to find the competition close prematurely...

 Anyways, here's my unsolicited advice to the team at Facebook who manages the Hacker Cup:


  • Minimize the risk of faulty problem statements by applying the same sound engineering principles used to write software: seek multiple peer reviews, ask people to "test" the problems (including the input/output data) end to end. As soon as there is a strong draft of a new problem, use version control (no kidding) and review every diff carefully and by multiple people. Once a problem is finalized, "ship" it. No more last-minute-changes!

  • Do a dry run of the competition. Simulate real load on your back-ends to minimize the risk of technical problems during submission. Also, make sure the UI is always responsive ("Submitting...").


  • No real-time changes after a round has started. Even if a problem statement is faulty, stick to it - figure out how to grade the submissions later. Don't change any other basic conditions, such as the end time of the round: it makes people angry and frustrated. Only exception: extending a deadline to compensate for technical problems is acceptable. But, don't make these decisions ad-hoc: think through possible scenarios up-front and have contingency plans available.


That all being said, I think the actual problems were pretty cool and if it weren't for all these difficulties, it could have been lots of fun. Given that my qualification remains valid and Hacker Cup doesn't get completely cancelled, I'm looking forward to the next round.

3 comments:

  1. Nothing to add, just enjoyed reading the article :) good work.

    ReplyDelete
  2. fb shud give everyone a fair share of the prize money n be done with it... they suck

    ReplyDelete
  3. You could have conveyed your frustration ab it more :)

    ReplyDelete