John Finn Shunhui Zhu
1S Bradley 316 Bradley
Class 4 Average and Standard Deviation
Class 5 Standard Deviation; Stephen Jay Gould on batting .400
Class 6 Introduction to Probability
Class 7 The Binomial Distribution
Class 9 Polling, Standard Error, Normal Distribution
Class 10 Standard Error, Normal Approximation; Projects
Class 11 Surveys and Data Collection
Class 12 Correlation and Regression
Class 13 Correlation & Regression, and Immigration Statistics
Class 14 Correlation & Regression; Conditional Probability
Class 15 Economics, Streaks and Tversky
Thursday 10 October
Thursday 24 October
Thursday 7 November
Thursday 21 November
Tuesday 3 December
Homework
To supplement the discussion in class and assignments to be written about in your
journals, we will assign readings from your text FPPA, together with accompanying
homework. When you write the solutions to these homework problems, you should keep
them separate from your journals. Homework assignments will be assigned once a week and
should be handed in on Thursdays.
Final project
We will not have a final exam for the course, but in its place, you will undertake
a major project. This project may be a paper investigating more deeply some topic
we touch on lightly in class. Alternatively, you could design and carry out your
own study. Or you might choose to do a computer-based project. To give you some ideas,
a list of possible projects will be circulated. You can also look at some previous
projects on the Chance
Database. However, you are also encouraged to come up with your own ideas for projects.
Chance Fair
At the end of the course we will hold a Chance Fair, where you will have a chance
to present your project to the class as a whole, and to demonstrate your mastery
of applied probability by playing various games of chance. The Fair will be held
during the final examination time assigned by the registrar.
Resources
Materials related to the course will be kept on our web site
and on the Kiewit PUBLIC server (PUBLIC: Courses & Support: Academic Departments & Courses:
Math: Chance). In addition, supplementary readings will be kept on reserve in Baker Library.
On Table 1 from "Effects of Cigarette Smoking on Lung Function in Adolescent Boys and Girls" in The New England Journal of Medicine, 26 September 1996.
Study the table.
What conclusions would you draw from the table?
Is there any relation between maternal smoking and child smoking?
Do you think there are significant differences between boys and girls with respect to smoking?
What confounding factors might explain these differences between boys and girls?
a. What kinds of bias are discussed in the article?
b. Are they measurement biases?
Homework:
Chapter 3, all review exercises.
Chapter 4, Review Problems: 1-3, 5, 7-10, 13, 14.
Read Chapter 6.
Class discussion: PSAT modifications.
Read "College Board Revises Test to Improve Chances for Girls", by Karen W. Arenson; The New York Times. Wednesday 2 October 1996
Discussion questions:
Introducing Average and Standard Deviation
More on using JMP
Journal assignment
Read "Incarceration Is a Bargain", by Steve Hanke, in The Wall Street Journal, Monday, 23 September 1996.
Discussion questions:
Standard Deviation
Discussion of Stephen Jay Gould's "Why the Death of 0.400 Hitting Records Improvement of Play" (from his Full House, Harmony Books, 1996).
Journal assignment
Don't forget your two article summaries, which are due Tuesday. When you summarize an article, be sure to include the publication it comes from and the date.
Homework: Read chapters 13 and 14 ("What Are the Chances" and "More about Chance") in FPPA. Do the even Review Exercises in chapter 13, and the odd Review Exercises in chapter 14.
Coincidences in Airplane Crashes
Coin Classing Experiment
Journal assignment: Think of coincidences in your own life. What is the likelihood of these being random chance occurrences? Are they really events that have occurred against all odds?
a. Break into groups of four.
b. Identify a member of your group who claims to be able to tell the difference between Pepsi and Coke. (Coke Classic, that is; accept no substitutes!)
c. Design an experiment to test whether this is true. Remember that one swallow doth not a summer make: Don't certify your taste-tester just on the basis of one taste. Write down exactly what data you will collect and what you will do with the data before you start collecting it.
d. What is being tested?
e. Carry out the experiment.
- When you design an experiment like this you should ask several questions. First, what do you want to test? Do you want to test if a person can tell given a single cup whether it contains Coke or Pepsi? Can a person decide which of two cups is Coke and which is Pepsi? Can a person who is given two cups simply decide if they have the same or different drinks? These are all testing slightly different abilities.
- What does your experiment test? Is that what you want it to test?
f. Record your results.
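One way to decide in advance what you will do with the data (step c) is to ask how often pure guessing would score at least as well as your taster. The sketch below rests on assumptions that are ours, not part of the activity: it supposes n independent single-cup trials in which a guesser is right with chance 1/2, and it computes the binomial chance of k or more correct answers. The function name and the 10-cup example are hypothetical.

```python
from math import comb

def chance_of_at_least(k, n, p=0.5):
    """Chance of k or more correct answers in n independent trials,
    each correct with probability p (p = 0.5 models pure guessing)."""
    return sum(comb(n, j) * p**j * (1 - p)**(n - j) for j in range(k, n + 1))

# Hypothetical result: the taster names 9 of 10 cups correctly.
# Under pure guessing this happens with chance 11/1024, about 0.011.
print(chance_of_at_least(9, 10))
```

A different design (two cups at a time, or "same or different") would need a different chance of a correct guess, so the value of p should match what your experiment actually tests.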
The Binomial Distribution
Don't forget your two article summaries, which are due Tuesday. When you summarize an article, be sure to include the publication it comes from and the date.
Homework assignment. In FPPA:
1. The CNN Tracking Poll for October 19-20 interviewed 732 likely voters. They reported that 55% favored Clinton, 34% favored Dole and 6% favored Perot with a sampling error of + or - 4% (sampling error is also called margin of error).
(1) Read the NYT article "Misreading the Gender Gap" by Carol Tavris (September 17, 1996). What do you think of her explanation of the gender gap in the current election? (2) How would you explain "margin of error" to a friend who had not had a statistics course?
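As a rough check on the reported sampling error, the sketch below computes an approximate margin of error from the poll's own figures. It assumes a simple random sample and uses two standard errors as the margin (roughly 95% confidence). Real tracking polls use more elaborate designs, so treat this as an approximation. The function name and the `multiplier` parameter are illustrative.

```python
from math import sqrt

def margin_of_error(percent, n, multiplier=2):
    """Approximate margin of error, in percentage points, for a sample
    percentage from a simple random sample of size n.
    multiplier=2 corresponds to roughly 95% confidence."""
    p = percent / 100
    standard_error = sqrt(p * (1 - p) / n)
    return 100 * multiplier * standard_error

# Figures reported by the CNN Tracking Poll: 732 likely voters.
for candidate, percent in [("Clinton", 55), ("Dole", 34), ("Perot", 6)]:
    print(candidate, round(margin_of_error(percent, 732), 1))
```

Under these assumptions the margin for the 55% figure comes out a little under 4 points, which is consistent with the reported "+ or - 4%". The margin is smaller for percentages far from 50%.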
Speaker: Tami Buhr from Harvard University will speak on her experiences in polling.
Standard Error, Normal Distribution.
Homework Assignment: Read Chapters 19, 20, 21.
(We will be discussing a couple of the ideas from Chapter 18 in class. If you miss the class or need additional support, you may want to look through that chapter.)
Preliminary Project Proposal:
Please hand in a separate sheet with a brief description of your project proposal next Thursday. We will talk more about this in class Wednesday.
Journal assignment
Comments and reflections on speaker's talk.
Standard Error and Normal Approximation
by John Finn
And some review of the mathematics that has come up so far. We're going to try to make firm the mathematics behind chance quantities, particularly sums of draws from a box.
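As a small illustration of sums of draws from a box, the sketch below computes the expected value and standard error of the sum of n draws made at random with replacement, using the usual rules: expected value = n × (average of box), SE = √n × (SD of box). The box contents and the function names are made up for the example.

```python
from math import sqrt

def box_average(box):
    """Average of the tickets in the box."""
    return sum(box) / len(box)

def box_sd(box):
    """SD of the tickets in the box (dividing by the number of tickets)."""
    avg = box_average(box)
    return sqrt(sum((ticket - avg) ** 2 for ticket in box) / len(box))

def sum_of_draws(box, n):
    """Expected value and standard error for the sum of n draws
    made at random with replacement from the box."""
    return n * box_average(box), sqrt(n) * box_sd(box)

# Illustrative box: average 3, SD 2.
box = [0, 2, 3, 4, 6]
expected_value, standard_error = sum_of_draws(box, 100)
print(expected_value, standard_error)
```

For this box, 100 draws give an expected sum of 300 with a standard error of 20, so a sum between 260 and 340 would not be surprising.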
About Your Chance Project
Remember that you're to hand in a brief description of your project proposal on Thursday. We'll talk about ideas for projects and our policies on them.
Guest Speaker: Nancy Mathiowetz of the University of Maryland will speak on Surveys and Data Collection.
Confidence Intervals and Standard Deviation
Homework assignment:
(If you need a review of how to plot lines, find slopes, etc., read Chapter 7)
Journal assignment
Don't forget your two article summaries, which are due Tuesday. When you summarize an article, be sure to include the publication it comes from and the date.
We read about a poll taken to estimate what percentage of the population are voting for each of several candidates. The results say that "57% are for Millard Fillmore, with a 3% margin of error". What does this mean, and how do the pollsters come up with it?
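One way to see what a "3% margin of error" means is to simulate many polls from a population whose true percentage is known, and to count how often the sample percentage lands within 3 points of the truth. The sketch below assumes a simple random sample, a true percentage of 57, and a sample size of 1089. The sample size is our assumption, chosen so that two standard errors is about 3 points; the poll description gives no sample size.

```python
import random

def simulate_poll(true_percent, n, rng):
    """Sample percentage from one simulated poll of n respondents."""
    hits = sum(rng.random() < true_percent / 100 for _ in range(n))
    return 100 * hits / n

def coverage(true_percent, n, margin, trials=1000, seed=1):
    """Fraction of simulated polls whose sample percentage falls
    within `margin` points of the true percentage."""
    rng = random.Random(seed)
    inside = sum(
        abs(simulate_poll(true_percent, n, rng) - true_percent) <= margin
        for _ in range(trials)
    )
    return inside / trials

# Assumed population: 57% support; assumed sample size: 1089.
print(coverage(57, 1089, 3))
```

The printed fraction should be close to 0.95: about 19 polls in 20 land within the margin, and about 1 in 20 misses by more.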
Class discussion: You be the judge: did regression analysis reveal a voting fraud, and was the fraud decisive?
Read "Probability Experts May Decide Pennsylvania Vote" (The New York Times, 11 April 1994).
Discussion questions:
Scatter Diagrams, and Correlation and Regression
Quantifying the degree of association between two variables:
- the scatter diagram;
- the correlation coefficient;
- the SD line;
Guest Speaker: Prof. Richard Wright: The Statistics of Immigration. Prof. Wright is the Chair of Dartmouth's geography department.
Human Subjects
1. Elizabeth Bankert, Assistant Director of Grants & Contracts here at Dartmouth, will talk about guidelines for carrying out projects that involve human subjects (which include any sort of survey), and Dartmouth's regulations on these matters.
Correlation and Regression.
What are the SD line and the regression line of a scatter diagram? How do we determine them, and what do they tell us about the data?
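As a concrete companion to these questions, the sketch below computes both lines for a small made-up data set. It follows the usual definitions: both lines pass through the point of averages; the SD line has slope ±(SD of y)/(SD of x), with the sign of r; the regression line has slope r × (SD of y)/(SD of x). SDs divide by n, as in FPPA. The data and function name are illustrative only.

```python
from math import sqrt

def line_summary(xs, ys):
    """Correlation coefficient plus the slopes of the SD line and the
    regression line of y on x. Both lines pass through the point of averages."""
    n = len(xs)
    avg_x, avg_y = sum(xs) / n, sum(ys) / n
    sd_x = sqrt(sum((x - avg_x) ** 2 for x in xs) / n)
    sd_y = sqrt(sum((y - avg_y) ** 2 for y in ys) / n)
    # r is the average of the products of the values in standard units.
    r = sum((x - avg_x) * (y - avg_y) for x, y in zip(xs, ys)) / (n * sd_x * sd_y)
    sign = 1 if r >= 0 else -1  # convention chosen here for r = 0
    return {
        "point_of_averages": (avg_x, avg_y),
        "r": r,
        "sd_line_slope": sign * sd_y / sd_x,
        "regression_slope": r * sd_y / sd_x,
    }

# Made-up data for illustration.
summary = line_summary([1, 2, 3, 4, 5], [1, 3, 2, 5, 4])
for name, value in summary.items():
    print(name, value)
```

For this data r is about 0.8, so the regression line is flatter than the SD line: one SD increase in x goes with only about 0.8 SD increase in the predicted y.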
Homework Assignment
Correlation and Regression.
What are the SD line and the regression line of a scatter diagram? How do we determine them, and what do they tell us about the data?
Class discussion: Conditional probability and false positives.
1. In one of Marilyn vos Savant's columns in Parade magazine a reader asked
Suppose we assume that 5% of the people are drug-users. A test is 95% accurate, which we'll say means that if a person is a user, the result is positive 95% of the time; and if she or he isn't, it's negative 95% of the time. A randomly chosen person tests positive. Is the individual highly likely to be a drug-user?
Marilyn's answer was:
Given your conditions, once the person has tested positive, you may as well flip a coin to determine whether she or he is a drug-user. The chances are only 50-50.
How can Marilyn's answer be correct?
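A short calculation can help with this question. The sketch below applies the conditional probability rule to the reader's figures: among all people who test positive, what share are actual users? The function and parameter names are ours, chosen for illustration.

```python
def chance_user_given_positive(prevalence, sensitivity, specificity):
    """Chance that a person who tests positive is a user.
    prevalence:  fraction of the population who are users
    sensitivity: chance a user tests positive
    specificity: chance a non-user tests negative"""
    true_positive = prevalence * sensitivity
    false_positive = (1 - prevalence) * (1 - specificity)
    return true_positive / (true_positive + false_positive)

# The reader's conditions: 5% users, test 95% accurate both ways.
print(chance_user_given_positive(0.05, 0.95, 0.95))  # about 0.5
```

The two kinds of positive results are equally common here: 5% × 95% of people are users who test positive, and 95% × 5% are non-users who test positive.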
2. An article in The New York Times some time ago reported that college students are beginning to routinely ask to be tested for the AIDS virus.
The standard test for the HIV virus is the Elisa test that tests for the presence of HIV antibodies. It is estimated that this test has a 99.8% sensitivity and a 99.8% specificity. 99.8% specificity means that, in a large scale screening test, for every 1000 people tested who do not have the virus we can expect 998 people to have a negative test and 2 to have a false positive test. 99.8% sensitivity means that for every 1000 people tested who have the virus we can expect 998 to test positive and 2 to have a false negative test.
The Times article remarks that it is estimated that about 2 in every 1000 college students have the HIV virus. Assume that a large group of randomly chosen college students, say 100,000, are tested by the Elisa test. If a student tests positive, what is the chance this student has the HIV virus? What would this probability be for a population at high risk where 5% of the population have the HIV virus?
If a person tests positive on an Elisa test, then another Elisa test is carried out. If it is positive then one more confirmatory test, called the Western blot test, is carried out. If this is positive the person is assumed to have the HIV virus. In calculating the probability that a person who tests positive on the set of three tests has the disease, is it reasonable to assume that these three tests are independent chance experiments?
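For the single Elisa test on a group of 100,000, it can help to count expected true and false positives directly. The sketch below does this for the two prevalence figures given above. It covers only one test per person and says nothing about the three-test sequence or the independence question. The function name is illustrative.

```python
def screening_counts(group_size, prevalence, sensitivity, specificity):
    """Expected numbers of true positives and false positives when
    everyone in the group is tested once."""
    infected = group_size * prevalence
    not_infected = group_size - infected
    true_positives = infected * sensitivity
    false_positives = not_infected * (1 - specificity)
    return true_positives, false_positives

for label, prevalence in [("college students", 0.002), ("high-risk group", 0.05)]:
    tp, fp = screening_counts(100_000, prevalence, 0.998, 0.998)
    share = tp / (tp + fp)  # chance a positive test is a true positive
    print(label, round(tp), round(fp), round(share, 3))
```

With 2 in 1000 infected, true and false positives are about equally numerous (roughly 200 each), so a positive test is right only about half the time. With 5% infected, true positives far outnumber false ones.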
Journal assignment
Read and comment on the Manchester, NH Union Leader story "Exit Poll Wrong Call in Senate Race Leaves Anger, Hurt, Red Faces". There are a couple of discussion questions at the end of the article.
Guest speaker: Professor Michael Knetter of Dartmouth's Economics Department will speak on the role of statistics in economics.
Activity: recognizing streaks
We will demonstrate a computer simulation of three coins:
The streaky coin is more likely to produce streaks of H's and of T's (like HHHHHHTTTT: a streak of 6 H's followed by a streak of 4 T's) than the ordinary coin, which is in turn more likely to produce streaks than the vacillating coin.
We'll tell you the probabilities for each of the coins. Your mission, should you decide to accept it, is to repeatedly look at a sequence of 20 tosses, and guess which coin is producing it. For instance, if we get HHHHHHHHHHTTTTTTTTHH, you'd probably guess the streaky coin.
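If you want to practice before class, the sketch below simulates three coins of this kind. It is not the program used in class. It assumes each toss repeats the previous outcome with a fixed chance, and the values 0.7, 0.5 and 0.3 are made up for illustration; the actual probabilities will be given in class.

```python
import random

def toss_sequence(p_repeat, n=20, rng=random):
    """n tosses in which each toss after the first repeats the previous
    outcome with chance p_repeat (0.5 gives an ordinary fair coin)."""
    tosses = [rng.choice("HT")]
    while len(tosses) < n:
        previous = tosses[-1]
        if rng.random() < p_repeat:
            tosses.append(previous)
        else:
            tosses.append("T" if previous == "H" else "H")
    return "".join(tosses)

def longest_streak(sequence):
    """Length of the longest run of identical outcomes."""
    if not sequence:
        return 0
    best = run = 1
    for a, b in zip(sequence, sequence[1:]):
        run = run + 1 if a == b else 1
        best = max(best, run)
    return best

# Assumed repeat probabilities, for illustration only.
COINS = {"streaky": 0.7, "ordinary": 0.5, "vacillating": 0.3}
rng = random.Random(0)
for name, p_repeat in COINS.items():
    sequence = toss_sequence(p_repeat, rng=rng)
    print(name, sequence, longest_streak(sequence))
```

Running it several times with different seeds shows how often an ordinary coin produces a sequence that looks streaky, which is the point of the guessing game.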
Class discussion: Read the New York Times article "'Hot Hands' Phenomenon: a Myth?", on Stanford psychologist Amos Tversky's study of streaks in basketball.
Discussion questions: