In math and economics, there is a long, proud history of placing imaginary prisoners into nasty, complicated scenarios. We have, of course, the classic Prisoner’s Dilemma, as well as 100 prisoners and a light bulb. Add to that list the focus of this post, 100 prisoners and 100 boxes.

In this game, the warden places 100 numbers in 100 boxes, at random with equal probability that any number will be in any box. Each convict is assigned a number. One by one they enter the room with the boxes, and try to find their corresponding number. They can open up to 50 different boxes. Once they either find their number or fail, they move on to a different room and all of the boxes are returned to exactly how they were before the prisoner entered the room.

The prisoners can communicate with each other before the game begins, but as soon as it starts they have no way to signal to each other. The warden is requiring that all 100 prisoners find their numbers, otherwise she will force them to listen to hundreds of hours of non-stop, loud rock musician interviews. Can they avoid this fate?

The first thing you might notice is that if every prisoner opens 50 boxes at random, they will have a [latex]0.5[/latex] probability of finding their number. The chances that all of them will find their number is [latex](\frac{1}2)^{100}[/latex], which is approximately as rare as finding a friendly alien with small eyes. Can they do better?

Of course they can. Otherwise I wouldn’t be asking the question, right? Before I explain how, and go into a Monte Carlo simulation in R, you might want to think about how they can do it. No Googling!

All set? Did you find a better way? The trick should be clear from the code below, but if not skip on to the explanation.

```
# How many times should we run this experiment?
iters = 1000
results = rep(0,iters)
for(i in 1:iters) {
# A random permutation:
boxes = sample(1:100,100)
# Labels for our prisoners
prisoners = 1:100
# Track how many "winners" we have
foundIt = 0
# Main loop over the prisoners
for(prisoner in prisoners) {
# Track the prisoners path
path = c(prisoner)
tries = 1
# Look first in the box that matches your own number
inBox = boxes[prisoner]
while(tries < 50) { path = c(path, inBox) if(inBox == prisoner) { #cat("Prisoner", prisoner, "found her number on try", tries, "\n") foundIt = foundIt + 1 break; } else { # Follow that number to the next box inBox = boxes[inBox] } tries = tries+1 } # cat("Prisoner", prisoner, "took path", paste(path, collapse=" -> "), "\n")
}
# How many prisoners found their numbers?
cat("A total of", foundIt, "prisoners found their numbers.\n")
flush.console()
results[i] = foundIt
}
hist(results, breaks=100, col="blue")
```

Here is what one of my plots looked like after running the code:

Out of the 1000 times I ran the experiment, on 307 occasions every single prisoner found his number. The theoretical success rate is about 31%. So, if it’s not clear from the code, what was the strategy employed by the prisoners and how does it work?

One way to look at the distribution of numbers in boxes is to see it as a permutation of the numbers from 1 to 100. Each permutation can be partitioned into what are called cycles. A cycle works like this: pick any number in your permutation. Let’s say it’s 23. Then you look at the number the 23^{rd} place (ie the number in the 23rd box, counting from the left). If that number is 16, you look at the number in the 16^{th} place. If that number is 87, go open box number 87 and follow that number. Eventually, the box you open up will have the number that brings you back to where you started, completing the cycle. Different permutations have different cycles.

The key for the prisoner is that by starting with the box that is the same place from the left as his number, and by following the numbers in the boxes, the prisoner guarantees that if he is in a cycle of length less than 50, he will eventually open the box with his number in it, which would complete the cycle he began. One way to envision cycles of different lengths is to think about the extreme cases. If a particular permutation shifted every single number over one to the left (and wrapped number 1 onto the end), you would have a single cycle of length 100. Box 1 would contain number 2, box 2 number 3 and so on. On the other hand, if a permutation flipped every pair of consecutive numbers, you would have 50 cycles, each of length 2: box 1 would have number 2, box 2 would have number 1. Of course if your permutation doesn’t change anything you have 100 cycles of length 1.

As you can see from the histogram, when using this strategy you can never have between 50 and 100 winning prisoners. Anytime you have a single cycle of length greater than 50, for example 55, then all 55 prisoners who start on that cycle will fail to find their number. If no cycles are longer than 50, everyone wins. Just how rare are different cycles of different lengths? For the math behind that check out this excellent explanation by Peter Taylor of Queen’s University.

Before moving on I wanted to visualize these cycles. Try running the code below:

```
# Unit circle
plot(0,0,xlim=c(-1,1),ylim=c(-1,1),col="white",ann=FALSE, xaxt="n", yaxt="n")
for(i in 1:100) {
points(cos(2*i/100*pi), sin(2*i/100*pi),pch=20,col="gray")
}
mySample = sample(1:100,100)
for(i in 1:100) {
found = FALSE
nextItem = i
# Pick a random color for this cycle
color = sample(c(0:9,"A","B","C","D","E","F"),12,replace=T)
lineColor = paste("#", paste(color[1:6],collapse=""),sep="")
while(!found) {
# Draw the cycle
segments(cos(nextItem/50*pi), sin(nextItem/50*pi), cos(mySample[nextItem]/50*pi), sin(mySample[nextItem]/50*pi),col=lineColor,lwd=2)
Sys.sleep(.4)
if(mySample[nextItem] == i) {
found = TRUE
} else {
nextItem = mySample[nextItem]
}
}
}
```

You can adjust the “Sys.sleep()” parameter to make the animation faster. I recommend running the code to see how the cycles “develop” over time, but here’s a snapshot of what I got:

Tags: 100 prisoners

It would be more helpfu if you put arrows on the graph. You can use gplot.arrow in library sna.

1) Very fun post – thank you đź™‚

2) I just wanted to add to the list the wonderful books of Raymond smullyan who has made fabulous math tales (a variety of which includes prisoners, see for example “the lady or the tiger”)

Um, I think you need to mention at the beginning of the problem that the boxes are numbered, or in an obvious linear sequence with an obvious first element.

Very cool trick.

It works, but on the unstated condition that the prisoners know of a surjective function from the 100 numbers the warden places in the 100 boxes to those 100 boxes, for example if the prisoners know that the warden will choose the first 100 counting numbers.

Neat trick. There are other tricks one can play too. I’ll give a worse one. I call this “2 prisoners, 2 boxes”, and is a smaller version of yours.

Random strategy: 25% odds of success

Optimal strategy: Prisoner 1 looks in one box. Prisoner 2 looks in the other box. 50% chance of success.

Worst strategy: Both prisoners look in the same box. 0% chance of success.

So my follow-up question: Is what you described optimal, or is there something better?

@JL:

Good suggestion, but I’m not sure it makes the graph look much better. For those who want to try it, add:

library(sna)

to the beginning of the code, then replace

segments(…)

with

gplot.arrow(…)

and remove the “lwd” parameter.

@Douglas McClean:

I suppose I should state that the warden has put each of the prisoners numbers in one of the boxes. Of course if any prisoner’s number is missing then the chance of them winning is 0% regardless of the strategy.

@peter:

Apparently the strategy I outlined above has been proven to be the very best the prisoners can do, given their situation. See â€śThe locker puzzleâ€ť, The Mathematical Intelligencer, Volume 28, Number 1 (March 2006)

Nice Posting. I thought the same as Jonathan mentioned ealier, you need a linear ordering, maybe wooden boxes with numbers at them, too.

I’ll have a look at that math article â€śThe locker puzzleâ€ť, thx.

Matt,

I meant that, if the warden elects to assign arbitrary “numbers”, as opposed to the integers from 1 to 100, to the prisoners, and the prisoners aren’t informed of the other prisoners’ numbers, then the strategy breaks down. Looking in the first box and discovering the number 37 + sqrt(11) which does not match his number 6, the first prisoner does not know what to do next.

To echo Jonathan, I assumed that the prisoners would not be able to assume that the boxes in the room were laid out in a line.

Maybe you could restate this contrived problem using doors down a corridor instead of boxes?

This seems to be a good way to get the maximum chance of everyone being freed. I would imagine the prisoners being a bit more selfish, what if they wanted to maximize the chance that each individual is freed?

The first strategy I thought of guarantees that exactly 50% of the prisoners would be freed, every time, and it might be possible to improve upon it, with a small change to the box selection:

The prisoners all agree to each open the 50 leftmost boxes. This way, 50 prisoners will find their number, since they all open the same box.

Nevermind, I missed the crucial bit about all of them being required to find their number.

Not only that, there is also a history of mathematics being developed in prison. Sheaf Theory was invented by a guy in the officer’s prison during WW2. And I believe “Trillian arithmetic” or … forget the exact name.

I wrote about this way of gerrymandering probability on my blog here: http://gcanyon.posterous.com/gerrymandering-probability

Since posterous shut down, here’s the equivalent page on wordpress: http://gcanyon.wordpress.com/2009/03/20/gerrymandering-probability/

You say that it isn’t possible for the prisoners to signal each other during the exercise, so this may not be valid, but you also say that the prisoners can discuss strategy beforehand so it could be possible.

Suppose that, instead of iterating over a single prisoner’s available operations, you iterate over the entire group?

foreach operation:

all prisoners open a closed box

prisoners who find their number leave the box open

prisoners who do not find their number leave the box closed

Other than this minor alteration, the rest of your code remains pretty much the same. As prisoners find their numbers, the search space is reduced because they only open *closed* boxes.

Scratch that – it actually works against this scheme, because the prisoners stop following a path through the boxes. It works out at about 17% efficient.

Thinking about it some more, why define a cycle at all? Suppose we define our boxes like this:

Box 1 – 2 – 3 – 4 – 5

Value 5 – 3 – 4 – 1 – 2

Prisoner 1’s cycle would look like this:

1 – 5 – 2 – 3 – 4

Prisoner 5’s cycle would look like this:

5 – 2 – 3 – 4

They follow the same cycle, but start at different points.

If we assume that all prisoners will basically follow the same path through the boxes, but starting at different locations, we can simplify the algorithm greatly:

Prisoner moves to the box with the index that matches his own number

While he has not found the box that contains his value:

Prisoner moves one box to the right

In the 5 box example, prisoner 1’s cycle looks like this:

1 – 2 – 3 – 4

Prisoner 5’s cycle looks like this (when he hits the last box, he wraps around to the first):

5 – 1

Testing this over 1000 iterations gives around a 36% success rate of all prisoners finding their box.

tl;dr – The complex cycle scheme is irrelevant. Any cycle through the boxes followed by all prisoners who each take a unique starting point, even if that cycle is just to iterate sequentially over the boxes, results in the same chance of finding the correct boxes.

Just practical thinking (or misunderstanding): If prisoners do their cycling sequentially theyÂ´d communicate boxes’ numbers without saying a word. If the first prisoners opens a box and then goes to the next one that got the number he found in the first box, everybody know what the number of the second box was. So in reality this would be probably some kind of memory….

Hi ran2,

The way the problem is stated usually implies that every time a new prisoner enters the room, all of the boxes are put back to exactly how they were at the beginning.

Cheers,

Matt

It’s nice to keep track of cycles so that one does not revisit them:

@Brian: It’s impossible for a cycle to happen since the prisoner started with his own number… if a cycle happens, then the prisoner has by definition found their own number and the loop ends anyway. No other cycle can exist because that would require opening a box with a number that has already been found… per the parameters of the game this is impossible (each number exists only once in the 100 boxes).

This is a variation of Pollard’s Rho, which is a good google fodder phrase for this topic.

This fact is interesting to @Bob’s assertion that the prisoners keep checking each adjacent box rather than the box with the number they find. I’m not certain if your results would be the same, but the fact that the neat chart that has a spike near 100% is easier to prove if you use the cycle method (the next box selected is the number of the box inside). This is because as the blog post indicated, we can only fail if the boxes (treated as a directional graph) contain a cycle of length > 50.

I’ve read the problem and solution on several web sites. Here is a counterexample where no loop-cycle combina-tricks gets the expected survivability to 31%.

2 prisoners, 2 boxes, each opens 1.

Odds they survive? 25%

25% < 31%

Oh, they agree in advance to look in a different box each, thereby making it 50% – both succeed or both fail.