Twenty questions and conditional probability
The previous post compared bits of information to answers in a game of Twenty Questions.
The optimal strategy for playing Twenty Questions is for each question to split the remaining possibilities in half. There are a couple ways to justify this strategy: mixmax and average.
The minmax approach is to minimize the worse thing that could happen. The worst thing that could happen after asking a question is for your subject to be in the larger half of possibilities. For example, asking whether someone is left-handed is not a good strategy: the worst case scenario is "no," in which case you've only narrowed your possibilities by 10%. The best mixmax approach is to split the population in half with each question.
The best average approach is also to split the possibilities in half each time. With the handedness example, you learn more if the answer is "yes," but there's only a 10% change that the answer is yes. There's a 10% chance of gaining 3.322 bits of information, but a 90% chance of only gaining 0.152 bits. So the expected number of bits, the entropy, is 0.469 bits. Entropy is maximized when all outcomes are equally likely. That means you can't learn more than 1 bit of information on average from a yes/no question, and you learn the most when both possibilities are equally likely.
Now suppose you want to ask about height and sex. As in the previous post, we assume men and women's heights are normally distributed with means 70 and 64 inches respectively, and both have standard deviation 3 inches.
If you ask whether a person is taller than 67 inches, you split a mixed population of men and women in half. You will learn 1 bit of information from this question, but you've put yourself in a suboptimal position for the next question. A height of at least 67 inches half of the adult population in general, but it selects a majority of men and a minority of women. And as we discussed above, uneven splits are suboptimal, in the worst case and on average.
If you're going to ask about height and sex, ask about sex first. If the person is female, ask next whether her height is above 64 inches. But if the person is male, ask whether his height is above 70 inches. That is, you want to split the population evenly at each step conditioning on your previous answer. A cutoff of 67 inches is optimal unconditionally, but suboptimal if you condition on sex.
The optimal strategy for Twenty Questions is to ask a question with probability of 1/2 being true, conditional on all previous data. You might get lucky with uneven conditional splits, but on average, and in the worst case, you won't do as well.