Class 14: Cookie experiment II and the Monty Hall problem
Discussion
Choose a random order for the numbers from 1 to 10 by shuffling and dealing a pack of cards numbered from 1 to 10. Then choose a cookie from each of the bags numbered from 1 to 10 and taste them in the sequence given by your random order. Rate each on a scale from 1 to 5, with 5 being Excellent, 3 Average, and 1 Poor.
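If no cards are at hand, the shuffle-and-deal step can be imitated on a computer. The following is only a sketch: the function name `random_tasting_order` and its `seed` parameter are made up for illustration, and Python's `random.shuffle` stands in for the physical shuffle.

```python
import random

def random_tasting_order(n_bags=10, seed=None):
    """Return the bag numbers 1..n_bags in a random order.

    `seed` is an illustrative extra: pass a number to reproduce an order,
    or leave it as None for a fresh shuffle each time.
    """
    rng = random.Random(seed)
    order = list(range(1, n_bags + 1))  # the "cards" numbered 1 to 10
    rng.shuffle(order)                  # shuffle the pack in place
    return order

# Deal the cards: taste the bags in this order.
print(random_tasting_order())
```

Each of the 10! possible orders is intended to be equally likely, which is the same property the card shuffle is meant to give.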
Princeton individual cookies data
Princeton average cookies data
Dartmouth individual cookies data
Dartmouth average cookies data
The Monty Hall Problem
Here is the infamous Monty Hall problem, as it appeared in Parade Magazine on 9 September 1990:
Suppose you're on a game show, and you're given the choice of three doors: Behind one door is a car; behind the others, goats. You pick a door, say number 1, and the host, who knows what's behind the doors, opens another door, say number 3, which has a goat. He then says to you, "Do you want to pick door number 2?" Is it to your advantage to switch your choice?
Discussion
1. Should you stick or switch?
2. Design and carry out experiments to check your conclusion.
3. What assumptions does your answer depend on?
4. Discuss other plausible assumptions, and how they would affect your answer.
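One way to carry out the experiment in question 2 is by computer simulation. The sketch below is one possible design, not the only one, and its assumptions matter (questions 3 and 4). With `host_knows=True` the host always opens an unchosen door hiding a goat, choosing at random when two are available. With `host_knows=False` the host opens an unchosen door at random, and games in which he accidentally reveals the car are thrown out. The names `play_once`, `win_rate`, and `host_knows` are invented for this illustration.

```python
import random

def play_once(switch, rng, host_knows=True):
    """Play one game. Return True for a win, False for a loss,
    or None if an ignorant host reveals the car (game discarded)."""
    doors = [1, 2, 3]
    car = rng.choice(doors)    # the car is placed at random
    pick = rng.choice(doors)   # the contestant's first choice
    others = [d for d in doors if d != pick]
    if host_knows:
        # Assumption: the host never reveals the car.
        opened = rng.choice([d for d in others if d != car])
    else:
        # Alternative assumption: the host opens an unchosen door blindly.
        opened = rng.choice(others)
        if opened == car:
            return None
    if switch:
        # Move to the one door that is neither picked nor opened.
        pick = next(d for d in others if d != opened)
    return pick == car

def win_rate(switch, host_knows=True, trials=100_000, seed=0):
    """Fraction of non-discarded games won under the given strategy."""
    rng = random.Random(seed)
    results = [play_once(switch, rng, host_knows) for _ in range(trials)]
    valid = [r for r in results if r is not None]
    return sum(valid) / len(valid)

print("host knows,  stick :", win_rate(switch=False))
print("host knows,  switch:", win_rate(switch=True))
print("host blind,  stick :", win_rate(switch=False, host_knows=False))
print("host blind,  switch:", win_rate(switch=True, host_knows=False))
```

Under the first set of assumptions the simulated win rates should come out near 1/3 for sticking and 2/3 for switching. Under the second they should both come out near 1/2, which shows how much the answer depends on what the host is assumed to do.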
Comments on the second set of journal questions
I (Laurie) enjoyed reading your journals this time. The second period journal assignments had the following components:
Class 5: Comment on the Nightline article (4 points),
Class 7: AIDS calculations, estimation of the number of HIV positive people in the U.S., and Simpson's Paradox (6 points),
Class 8: Meaning of margin of error and your feelings about polls (6 points),
Comments on other issues of interest to you (up to 4 points).
Most of you thought it was reasonable to require boxers to be tested for HIV. Many of you also thought those who are HIV positive should not be allowed to compete in boxing matches. One of you argued that the mere knowledge that an opponent is HIV positive might change the way a boxer fights, to his disadvantage.
On the other hand, most of you also felt there were some pretty silly statements made on the broadcast. There was particular concern about how one would estimate the risk of becoming HIV positive as a result of contact in a sporting event. Some of you thought that these extremely small numbers that were given were just made up.
Most of you seem to understand the method of finding the probability of a person being HIV positive given that a single ELISA test is positive. Some are still not clear on the fact that the probability of testing positive given that a person is HIV positive is not the same as the probability that a person is HIV positive given a positive test. This is a special case of the fact that P(A given B) is not, in general, the same as P(B given A). For example, if you toss a coin twice, the probability that two heads turn up given heads on the first toss is 1/2, but the probability of heads on the first toss given that two heads turn up is 1.
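Both points can be checked with a few lines of code. The first part below enumerates the four equally likely outcomes of two coin tosses and counts. The second part applies Bayes' rule to a screening test. The prevalence, sensitivity, and specificity used there are hypothetical round numbers chosen only to illustrate the calculation; they are not the figures used in class or real ELISA data.

```python
from fractions import Fraction
from itertools import product

# All outcomes of two tosses of a fair coin: HH, HT, TH, TT.
outcomes = list(product("HT", repeat=2))

def cond_prob(event, given):
    """P(event given `given`), by counting equally likely outcomes."""
    given_outcomes = [o for o in outcomes if given(o)]
    both = [o for o in given_outcomes if event(o)]
    return Fraction(len(both), len(given_outcomes))

def two_heads(o):
    return o == ("H", "H")

def first_head(o):
    return o[0] == "H"

print(cond_prob(two_heads, first_head))  # 1/2
print(cond_prob(first_head, two_heads))  # 1

def p_infected_given_positive(prevalence, sensitivity, specificity):
    """Bayes' rule: P(infected given a positive test)."""
    true_pos = prevalence * sensitivity               # infected and positive
    false_pos = (1 - prevalence) * (1 - specificity)  # healthy but positive
    return true_pos / (true_pos + false_pos)

# HYPOTHETICAL inputs: 1 person in 1000 infected, and a test that is
# right 99% of the time for infected and uninfected people alike.
print(p_infected_given_positive(0.001, 0.99, 0.99))
```

With these made-up inputs, P(positive given infected) is 0.99 while P(infected given positive) is only about 0.09, because the few true positives are swamped by false positives from the much larger uninfected group.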
Most of you gave too simplistic an answer on the meaning of margin of error, saying, for example, that a poll that estimated a percentage of 24% for a candidate with a margin of error of 4% on the basis of a sample of 700 voters means that the true outcome will be between 20% and 28%. Of course, it cannot be that simple. In a sample of 700 it would certainly be possible to get only 125 voters (about 18%) in favor of the candidate. We will talk more later about this and about how you find the margin of error, but here is an informal description that the New York Times includes for its polls. Most of the leading newspapers include something similar.
How the Poll Was Conducted
The latest New York Times/CBS News Poll is based on telephone interviews
conducted Feb. 22 to 24 with 1,223 adults throughout the United States.
The sample of telephone exchanges called was randomly selected by a
computer from a complete list of active residential exchanges in the country.
The list of more than 36,000 residential exchanges is maintained by Marketing
Systems Group of Philadelphia.
Within each exchange, random digits were added to form a complete
telephone number, thus permitting access to both listed and unlisted numbers.
Within each household, one adult was designated by a random procedure to be
the respondent for the survey.
The results have been weighted to take account of household size and number
of telephone lines into the residence and to adjust for variations in the sample
relating to geographic region, race, sex, age, and education.
In theory, in 19 cases out of 20 the results based on such samples will differ
by no more than three percentage points in either direction from what would
have been obtained by seeking out all American adults.
For smaller subgroups the potential sampling error is larger. For example, it is
plus or minus five percentage points for those who say they are likely to vote in
a Republican primary or caucus this year.
In addition to sampling error, the practical difficulties of conducting any
survey of public opinion may introduce other sources of error into the poll.
Variations in question wording or the order of questions, for instance,
can lead to somewhat different results.
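The "19 cases out of 20" and "three percentage points" in the description above can be roughly reproduced with the usual textbook approximation for a sample proportion. This is a sketch under simplifying assumptions: a simple random sample, the normal approximation, and the worst case p = 0.5. The Times poll is weighted, so its own calculation may differ. The function name `margin_of_error` is invented for this illustration.

```python
import math

def margin_of_error(n, p=0.5, z=1.96):
    """Approximate 95% margin of error for a proportion from n responses.

    Assumes simple random sampling; z = 1.96 corresponds to
    "19 cases out of 20" under the normal approximation.
    """
    return z * math.sqrt(p * (1 - p) / n)

# The poll quoted above: 1,223 adults.
print(round(100 * margin_of_error(1223), 1))  # 2.8 percentage points

# The journal example: a sample of 700 voters.
print(round(100 * margin_of_error(700), 1))   # 3.7 percentage points
```

Rounded to whole percentage points these give 3 and 4. The statement is about what happens in 19 out of 20 such samples, so an individual poll can still miss by more than its margin of error.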
You were quite divided on the question of how seriously we should take the study that claimed that complex writing protected against getting Alzheimer's disease. In any case, most of you felt that Linda should not worry too much. Some thought that her ability to do complex mathematics would substitute for complex sentences, and one of you thought she was safe because she was not a nun.
On other issues there were a number of helpful suggestions about the class. The most obviously helpful comment was to get the clock fixed so that it is not always five minutes fast and we do not feel we have to quit so early. We had requests that we discuss the infamous Marilyn vos Savant Monty Hall problem, and we will do that Friday. A number of you mentioned the article in the Princetonian about drinking and thought that they did a good job of reporting the results of the poll. One person pointed out, for all the reasons that we have been talking about, that the correlation between drinking and grades should not automatically be thought of as causation, and that care should be taken in developing policies based on a causal relation.
A number of you commented on a new awareness of probability and statistics in your daily lives, including the chances of getting a job, the meaning of a basketball player's shooting percentage, etc. A particularly interesting example was an advertisement, for a product recommended to one of you, that made a number of statistical claims. An interesting project would be to consider the role of statistics in advertising. A couple of you made interesting comments on the problem of rating colleges and suggested that we pursue this further in class. A tour guide said that the question often asked is "What are the chances of my daughter/son getting into Princeton?" and remarked that surely we should be able to answer this rather simple question.
Perhaps the most challenging suggestion for a chance course was that we, as the saying goes, put our money where our mouth is. Here is the suggestion:
We ought possibly make our probability assertions mean more. We can make
wagers on our beliefs using class points/no homework etc.... You decide.
But for example, we could have bet class points on our ability to taste
coke from pepsi, or we could have placed bets on other groups after this
statistical analysis was done. This might make us appreciate more the
value statistics play in our society.