Oh, well done, CodingHorror! Jeff Atwood posted a statistics puzzle and its solution, and his readers’ reaction was just about identical to what happened in the probability & statistics course I attended 15+ years ago, when a mathematically very similar problem was set on our weekly exercise sheet. Jeff’s two posts have drawn more than 1500 comments in about 3 days. People are vociferously defending their reasoning and accusing “the other side” of idiocy, lack of maths skills, and boneheadedness, while some careful and constructive chaps have written simulations… a nice bit of mayhem.
Back in my statistics class, getting us students all riled up and passionate about solving an exercise was of course the intended effect.
Jeff’s puzzle was formulated this way: “Let’s say, hypothetically speaking, you met someone who told you they had two children, and one of them is a girl. What are the odds that person has a boy and a girl?”
(My old statistics exercise was given in a form that’s harder to grasp intuitively: You’re on a game show, three closed doors in front of you. You know that behind one of them is a car and behind the others are goats. If the door you’ll be opening has a car behind it, you get to keep the car; if it’s a goat, you lose. You pick a door, but don’t open it yet. Then, the game show host opens a different door, one you haven’t picked, and reveals that there is a goat behind it. You now have a choice between sticking to your original selection and switching to the other still closed door. What should you do? We didn’t have Wikipedia then, or Google.)
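For anyone who, like Jeff’s more constructive commenters, would rather simulate than argue, here is a minimal sketch of the goat game in Python. It assumes the host knows where the car is and always opens a goat door you did not pick; the names `play` and `win_rate` are mine, made up for illustration. Under that assumption, sticking should win roughly a third of the time and switching roughly two thirds.

```python
import random


def play(switch, rng):
    """Play one round; return True if the contestant ends up with the car."""
    doors = [0, 1, 2]
    car = rng.choice(doors)
    pick = rng.choice(doors)
    # Assumption: the host knows where the car is and always opens a goat
    # door that the contestant did not pick.
    opened = rng.choice([d for d in doors if d != pick and d != car])
    if switch:
        # Move to the one door that is neither the first pick nor opened.
        pick = next(d for d in doors if d != pick and d != opened)
    return pick == car


def win_rate(switch, rounds=100_000, seed=1):
    """Fraction of rounds won when always sticking or always switching."""
    rng = random.Random(seed)
    return sum(play(switch, rng) for _ in range(rounds)) / rounds


if __name__ == "__main__":
    print("stick: ", win_rate(switch=False))
    print("switch:", win_rate(switch=True))
```

If the host behaved differently (say, opening a random door that merely happened to hide a goat), the numbers would change, which is exactly why the host’s rule has to be spelled out.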
Intuition about probabilities is a tricky thing. There are studies that show, for example, that people’s intuitive judgments are much more often correct if they are asked to reason about frequencies (how often something will happen, out of a number of repeated tries) than in terms of probabilities (a number between 0 and 1), even though the two are mathematically equivalent.
It’s the same for Jeff’s problem. My reaction to reading it was “well 50% of course… one second… there’s something fishy about it… YOU’RE DOING IT WRONG…”. Turns out, how easy it is to see the correct answer depends on what sort of interpretation you put on the original formulation. So let’s take this apart and develop the puzzle into a longer narrative:
- “You’re a teacher, and overhear a father enrolling his two children, Sam and Robin. You didn’t catch all of the conversation, so you don’t know the sex of the kids. However, the father is picking up some leaflets for after-school activities and you hear him exclaim ‘You have a girl geek computer camp? Great, I’ve got someone at home who will love this!’ What are the odds that this father has a boy and a girl?” The solution is straightforward: One of the two, Sam or Robin, is a (computer-loving) girl, and for the other we don’t know: they may be male, or dislike computers, or be the wrong age for the group, or not be interested for any other reason. The possibilities are: Sam is a girl, and Robin is also a girl; Sam is a girl and Robin is a boy; Sam is a boy and Robin is a girl. As the probability of a random child being male or female is roughly 50%, these three combinations are equally probable. Two of the three combinations are mixed sex. Therefore, the odds of the father having a boy and a girl are approximately 67%. This is the intended correct interpretation and solution of the puzzle.
- A lot of readers, however, form a different mental representation of the problem, which translates into a scenario that’s subtly different from the above: What you hear the father exclaim is “You have a girl geek computer camp? Great, my Sam will love this!” In this case, you know which of the children is female, so the possible combinations are reduced to: Sam female – Robin male, and Sam female – Robin female. That gives 50% odds of the family having a boy and a girl. However, nothing in the original formulation indicates that you know which kid is female, only that (at least) one of them is. This is why this solution is incorrect.
- There is yet another, “common sense, normal English” interpretation: If I meet someone and they tell me, outright, they have two children and one of them is a girl, I can pretty much assume that the other one isn’t. This, of course, would make this not a maths puzzle, but a trick question, with 100% odds that the family has two kids of different sex. Correct or incorrect? You decide. (I’m leaning towards “correct”.)
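The difference between the first two scenarios can be checked by brute force. The sketch below lists the four equally likely (Sam, Robin) combinations, assuming each child is independently a girl or a boy with probability 1/2, and counts the mixed-sex families that survive each piece of information. The helper name `p_mixed` is made up for this illustration.

```python
from fractions import Fraction
from itertools import product

# All equally likely (Sam, Robin) combinations, assuming each child is
# independently a girl ("G") or a boy ("B") with probability 1/2.
FAMILIES = list(product("GB", repeat=2))


def p_mixed(condition):
    """Probability of one boy and one girl, given that `condition` holds."""
    kept = [f for f in FAMILIES if condition(f)]
    mixed = [f for f in kept if set(f) == {"G", "B"}]
    return Fraction(len(mixed), len(kept))


# Scenario 1: "I've got someone at home", so at least one child is a girl.
at_least_one_girl = p_mixed(lambda f: "G" in f)

# Scenario 2: "my Sam will love this", so Sam specifically is a girl.
sam_is_girl = p_mixed(lambda f: f[0] == "G")

if __name__ == "__main__":
    print(at_least_one_girl)  # 2/3
    print(sam_is_girl)        # 1/2
```

The third, “normal English” reading needs no code: if “one of them is a girl” means exactly one, only the two mixed combinations remain.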
Our frequently wrong intuition about problems like this, even among people with mathematical training and jobs that require formal reasoning, is an interesting feature of cognitive psychology. The Wikipedia article about the “Monty Hall problem” (the one with the goats) contains and links to fascinating material. Jeff’s commenters also reveal the pitfall in their mapping of the formal problem to a mental picture:
So are you actually saying that once you’ve had one girl child the odds of having a boy increase????
This example is rubbish. The question is simply “What is the probability that my other child is a boy”. The answer is 50%. The sex of the first child has no bearing on the sex of the second.
If the person has two children, there are three possible combinations of gender: BB BG GG. If we rule out BB that leaves us with two options — BG and GG — which results in 50% chance of the other kid being a boy.
Solution is flat out wrong, it assumes a dependency that does not exist. If I flip a coin, its a 50% chance of being heads, and a 50% chance of being tails. This is independent of, and completely regardless of the outcome of a previous coin toss. This is exactly the same. Having one child that is a girl has absolutely no bearing on the sex of the other child, its still the same completely random 50% chance for each sex.
I’ve just written some code to test this out, and in the procces (and result) I’ve come to the 2/3 chance (25% no girls, 50% boy+girl, 25% boy boy) conclusion. […] I’m still quite confused by the implications. It feels like I’m being told that past events influence future events.
I’d say it’s 50%. If you have two children, you have BB, BG, GB, or GG. One of them is a girl then it’s not BB. When you mention that one of the children is a girl it’s 25% chance you are talking about a girl in BG, 25% a girl in GB and 50% (25%+25%) a girl in GG. (Four girls and you’re talking about one of them with equal probability.) Then chance to have a girl and a boy is 50% because it’s either BG or GB, not GG.
Fundamentally, analysing a statistical problem in terms of equally likely, atomic combinations is not something that comes naturally to us.
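To see how much hangs on choosing the atomic combinations, take the last quoted comment at face value. It implicitly assumes something the puzzle never states: that the parent mentions one of the two children at random. The sketch below is my reading of that comment, not Jeff’s puzzle. It enumerates the eight equally likely outcomes under that assumption.

```python
from fractions import Fraction
from itertools import product

# Atomic outcomes: (first child, second child, index of the child mentioned).
# Assumption: all 8 outcomes are equally likely, i.e. the parent mentions
# one of the two children at random.
OUTCOMES = list(product("GB", "GB", (0, 1)))


def p_mixed_given_girl_mentioned():
    """Probability of a mixed-sex family, given the mentioned child is a girl."""
    mentioned_girl = [o for o in OUTCOMES if o[o[2]] == "G"]
    mixed = [o for o in mentioned_girl if o[0] != o[1]]
    return Fraction(len(mixed), len(mentioned_girl))


if __name__ == "__main__":
    print(p_mixed_given_girl_mentioned())  # 1/2
```

Under that extra assumption the commenter’s 50% does follow. It answers a different question from “at least one of them is a girl”, which is the reading that gives 2/3.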
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
Really nicely written, thanks. To me the problem is due also to the tendency in combinatorial settings to “forget” sources of information, which make the problem not as combinatorially simple as it seems.
Congratulations on spotting the problem with this problem: the English statement of the problem allows for multiple interpretations.
Many statements of the Monty Hall problem are in fact unsolvable: we need to know Monty’s universal behavior, but the problem is often stated in a way that only describes his behavior in one instance.
I suspect that what we have is some combination of sloppy language and a desire to not tip off the answer.
Is this correct reasoning: if you know nothing other than that the person has two children, boy-girl is 2 of 4 combinations, or .5 probability. When you are told that one of the children is a girl, you are eliminating one of the possibilities, leaving 2 boy-girl of 3 combinations, for the 2/3 chance.