Friday, February 11, 2011

Back to the Party

I recently posed the party problem while interviewing a candidate. There are many good solutions to the problem, and his approach was on target, if conventional. Typically, he pointed out, these problems are solved by forming the distributed processes, ahem, party guests, into a simple topology, like a ring. Most textbooks on distributed algorithms treat this case at some length because it lends itself nicely to inductive proofs. Of course, a fixed topology wasn't part of the original problem statement, so any algorithm that answers the original question, What word is on the card whose number is 50342?, would need to include a sub-algorithm to first organize the guests into a ring where each party-goer has exactly two neighbors.

Let's describe one such algorithm in more detail. I'd like to defer, until a later post, the part of the problem that allows guests to swap cards in any way they choose. We'll assume for now that each guest continues to hold the same deck of cards they are given when they enter the room. Our initial goal is to introduce each new guest to two others, whom we'll call the R-neighbor and the L-neighbor. A guest's R-neighbor is the one person in the room who has the smallest ID that is larger than his own. Similarly, the L-neighbor is the one who has the largest ID that is smaller than his. The only exception is where the tail of the ring joins the head, at which point the guest with the largest ID in the entire room has as his R-neighbor the guest with the smallest ID, and vice versa. Note, this is a logical ring, not a physical one. If we don't allow the guests to eat and drink, the algorithm will surely fail.

Organizing the guests into such a logical ring isn't too difficult. We could say each person, when she enters the room, chooses a four-digit ID number that is unlikely to be chosen by anyone else and writes it down. (Notice how randomness enters the story, literally, at the front door.) If the guest is the first to arrive, there's nothing more to do; she comprises a degenerate ring of one and is her own R-neighbor and L-neighbor. Subsequent guests all choose their own, probably unique, ID. Each one then chooses some other person in the room, it doesn't matter who, and hands them a message that says, My ID is XXXX. Please find my place in the ring. Upon receiving such a message, written on a slip of paper, each guest checks to see if the new guest's ID falls in between its own ID and the ID of its R-neighbor. If not, he passes the message on to his R-neighbor, who does exactly the same thing. If the new ID does fall between my own ID and that of my R-neighbor's, the new guest is inserted into the ring, which is done by sending messages to the new guest and to his original R-neighbor, introducing them to each other and informing them of the new neighborhood arrangement.

Once the logical ring is in place, the original question can be answered with each guest passing slips of paper to one of only two other guests, its neighbors in the ring. Here's one such algorithm. Upon receiving a message, each guest performs the following sequence of actions (remember, we're looking for a symmetric solution, so every guest must behave the same way) starting with whomever receives the original note from our host.

1. If there isn't already an ID at the top of the message, write my own ID there.
2. If there is already an ID, and it's mine, we're finished. Hand the answer back to the host and skip the remaining steps.
3. Search through my deck of cards looking for the number 50342.
4. If I find a card with that number on it, write the associated word on the slip of paper; this is the answer.
5. Pass the message on to my R-neighbor.

If we discount the possibility that two guests might choose the same ID at the front door, this algorithm is deterministic, both for the positive and negative cases. That is, if the card in question exists in any guest's deck, the algorithm will discover it and answer the question, and if there is no card, the algorithm will report that it doesn't exist. This particular version of the algorithm will provide both answers after sending exactly n+2 messages (counting the two in which the host is involved), where n is the total number of guests at the party. There are various ways to reduce the total number of messages required, but in all cases the number of messages needed will grow as the number of guests increases. Still, this is a pretty darn good algorithm that behaves just the way we like our computer software to behave — deterministically.

Just for grins, lets also describe a probabilistic algorithm that solves the same problem given the same constraints. This one doesn't presuppose any particular topology, so we can skip the organizational first phase. Instead, we proceed as follows. When the host hands the slip of paper with the question on it to a random guest, he shouts to everybody in the room,

"Attention! What word is on the card whose number is 50342?"

Upon receiving this message, each guest searches through his deck of cards and, if he finds the number 50342, writes the associated word on a slip of paper and hands it to the guest who shouted the question, who then passes the answer along to the host. If no one answers within some reasonable amount of time, we assume the card does not exist and send a message back to the host to that effect.

Which approach is "better?" This second algorithm is better in a number of regards. It's certainly a lot simpler. It's also more efficient, requiring only four messages in the positive case — the original question from the host, the shout, the reply from the question, and the reply to the host — and only three messages in the negative case. It also scales better, since neither of these performance metrics grows as the number of guests increases.

But the probabilistic algorithm has a major drawback that would prevent its use by most any red-blooded programmer: it is, well, probabilistic. There is some chance, however small, that the guest holding the requested card does not hear the shout and therefore fails to answer. The shouting guest, in that case, will wait a respectable period of time and then declare, the card doesn't exist. To be precise, he can only truthfully claim that he doesn't know whether it exists or not, since the negative case and the failure case are indistinguishable.

Another drawback of the latter algorithm, also related to its probabilistic nature, is in choosing the right "reasonable amount of time" to wait before declaring the card can't be found. If the chosen timeout is too short, some of the guests may not have enough time to search through their card decks to find the card. If it's too long, the nominal performance of the negative case suffers. Yes, we could have the guest repeat the shout after a short timeout in order to increase the chance that everybody hears it and to give everyone a longer period in which to search their individual card decks. However, there will always be some remaining possibility of a false negative.

Does that mean the probabilistic algorithm is fatally flawed and should be avoided in all cases? More on that later...