Description: In 'Secret Santa', people draw names from a ...



Probability of Derangements

Brian Parsonnet

Description: In "Secret Santa", people draw names from a hat to determine who they will buy a present for. If a person picks his own name, he picks another name and throws his own name back in. If the last person draws his own name, there's a problem. What is the probability of that problem as a function of the number of people participating?

The solution is most easily explained by way of example. Let’s say there are 5 people, named 1 through 5. For the last person to choose 5, the first four people must draw 1-4 as a derangement (where no one gets his own name), and there 9 ways of doing so (sequence A047920):

1

1 0

2 1 1

6 4 3 2

24 18 14 11 9

Reading the bottom two rows, of the 24 possible arrangements, 6 of 24 are not possible because person 1 puts his own name back in. Similarly, 4 of the 18 remaining are eliminated by person 2, and 3 of 14 remaining are eliminated by person 3, 2 of 11 by person 4, and 9 derangements succeed getting through the first 4 names, corresponding to 9 ways for the 5th person to get his own name. Also, 9 = !5 = round( 5! / e ).

But what is the probability of each of these 9 derangements, or sequences? It boils down to understanding how many choices exist at each successive draw. The first person (i=1) can draw from (N-i) possibilities (namely, 2,3,4,5). The second person nominally has 3 to choose from (N-i, i=2), unless the first person drew number 2, in which case person 2 may draw from 4 possibilities (1,3,4,5, or N-i+1). Similarly, the 3rd person has 2 or 3 to draw from, the 4th has 1 or 2, and the last person has 0 or 1 valid choices. (Zero choices means the last person is forced to choose his own name, which is “the problem.”)

Continuing with this example, the probability of sequence 2, 1, 4, 3, 5 can be determined as follows:

1st person: “2” is drawn from 4 possibilities (2,3,4,5 / 1 not yet taken)

2nd person: “1” is drawn from 4 possibilities (1,3,4,5 / 2 already taken)

3rd person: “4” is drawn from 2 possibilities (4,5 / 1-2 taken, 3 not yet taken)

4th person: “3” is drawn from 2 possibilities (3,5 / 1, 2, and 4 already taken)

5th person: “5” is forced (5 not yet taken)

…with probability 1/4 * 1/4 * 1/2 * 1/2 = 1/64. This will be better represented as 9/576, because the LCM of all sequences is 576, or (N-1)!^2.

There is another sequence with the same probability, namely 2, 4, 1, 3, 5. It’s the same because in both sequences, 2 is drawn before 2’s turn, 4 is drawn before 4’s turn, but 1, 3 and 5's numbers are not drawn before their turns, respectively.

So, if there are N people, at the i-th turn (i = 1..N), person i has either (N-i) or (N-i+1) choices, depending on whether his own name is chosen yet. A way to represent the underlying logic of the two cases above is “01010,” where a 0 indicates that the person’s number is not yet drawn, and a 1 indicates that it is already drawn.

For the cases of interest in which the Nth person is forced to choose his own name, the last digit of this pattern must be 0, by definition. Similarly, the 1st digit must be a 0, since no numbers are yet chosen. And similarly, the second to last digit must be a 1. So all the “problem” patterns start with 0 and end with 10.

Using the example of 5 people, the 2nd and 3rd digits can each be a 0 or 1, creating 4, or 2^(N-3), target patterns:

0-00-10

0-01-10

0-10-10

0-11-10

A simple way to calculate the probability of any sequence with such a patterns is:

01010 (pattern)

43210 (nominal number of choices per position)

44220 (add the above two together for total choices per position)

Multiplying the non-zero digits together (4x4x2x2=64) gives 1/64 or 9/576, as above. Using 576 as the denominator, the probability for any sequence of a given pattern is:

0-00-10 12 / 576

0-01-10 8 / 576

0-10-10 9 / 576

0-11-10 6 / 576

So for N=5, the 9 derangements must be distributed over these 4 patterns. By enumeration, that distribution can be shown to be:

0-00-10 1 sequence 41235

0-01-10 5 sequences 31425 34125 34215 43125 43215

0-10-10 2 sequences 21435 24135

0-11-11 1 sequence 23415

9 total sequences.

In the table above, let’s call {1, 5, 2, 1} the frequency vector. The length of this vector is always 2^(N-3), as a consequence of the process being followed. A direct calculation of these frequency vectors for the first few N is shown here:

|People: |3 |4 |5 |6 |7 |

|Frequencies: |1 |1 |1 |1 |1 |

| | |1 |5 |13 |29 |

| | | |2 |6 |14 |

| | | |1 |13 |73 |

| | | | |2 |6 |

| | | | |6 |42 |

| | | | |2 |18 |

| | | | |1 |29 |

| | | | | |2 |

| | | | | |18 |

| | | | | |8 |

| | | | | |14 |

| | | | | |2 |

| | | | | |6 |

| | | | | |2 |

| | | | | |1 |

Summarizing, for N=5, there is 1 case of 0-00-10 at probably 12/576, 5 of 0-01-10 at 8/576, 2 of 0-10-10 at 9/576, and 1 of 0-11-10 at 6/576. The total probability then that person 5 gets his own name is the sum of the frequencies of each pattern times the probability defined by each pattern, or {1,5,2,1} * {12,8,9,6} / 576, or 76 / 576.

Listing just the numerators of the probability ratios is trivial:

|People: |3 |4 |5 |6 |7 |

|Row 0 |1 |3 |12 |60 |360 |

|1 | |2 |8 |40 |240 |

|2 | | |9 |45 |270 |

|3 | | |6 |30 |180 |

|4 | | | |48 |288 |

|5 | | | |32 |192 |

|6 | | | |36 |216 |

|7 | | | |24 |144 |

|8 | | | | |300 |

|9 | | | | |200 |

|10 | | | | |225 |

|11 | | | | |150 |

|12 | | | | |240 |

|13 | | | | |160 |

|14 | | | | |180 |

|15 | | | | |120 |

Each value can be simply determined from its associated pattern, as already shown. As a second example, for N=7, the pattern 0110010, or 0-1100-10, is represented above in column 7, row 12 (binary 1100), indicating that each case of that pattern has probability 240 / 6!^2, or 240 / 518400.

But the frequency table which indicates how many sequences have this pattern is more tricky, and fascinating. But first, back to answer the original question…

Sequence: 1, 0, 1, 5, 76, 1624, 52116, 2298708, 133929216, 9961180416, 921248743680, 103715841415680, 13967643016085800…

Description: this series represents the numerators of a ratio, the denominator being (n-1)!^2 (A001044), in which the ratio gives the exact probability that the last person draws his own name from a set of names, given that each other person in the set has previously drawn randomly and in succession, but rejected his own name.

Formula: See attached Excel file for calculation of the above sequence. Here is a pass at translating it into symbolic notation, but I still need to verify it for accuracy in this format. Value = sum( i=0..2^(N-3)-1, F(i, N-2) * X(i, N-2) ), where N is the number of people, and F(r,c) = sum( j=0..c-L(r)-1, F(T(r),L(r)+j) * M(c-T(r)-1,j) ), F(1,) = 1, and X(r,c) = product(k = 0..c-2, (3 + k - b(r,k))), and M(x,y) = binomial distribution {x+2,y} when x+1 > y, and binomial{x+2,y}-1 when y+1 ................
................

In order to avoid copyright disputes, this page is only a partial summary.

Google Online Preview   Download