Prepared by J. Laurie Snell, Bill Peterson and Charles Grinstead, with help from Fuxing Hou and Joan Snell.
Please send comments and suggestions for articles to
Back issues of Chance News and other materials for teaching a Chance course are available from the Chance web site:
Chance News is distributed under the GNU General Public License (so-called 'copyleft'). See the end of the newsletter for details.
Chance News is best read using Courier 12pt font.
All who drink his remedy recover in a short time, except those whom it does not help, who all die. Therefore, it is obvious that it fails only in incurable cases.
Galen (A.D. 13-200)
Contents of Chance News 8.07
Note: As you know, we have put a series of Chance Lectures on the Chance web site. These are lectures by experts in areas of probability and statistics that occur regularly in the news. We have also made a CD-ROM containing these lectures. Viewing the lectures from the CD-ROM still requires a browser, but the browser reads the files locally, so no internet connection is needed. If you would like one of these CD-ROMs, send an e-mail message to firstname.lastname@example.org giving the address where it should be sent, and we will send it to you when we get back from vacation on September 5.
In Chance News 8.06 we mentioned a note in the New England Journal of Medicine reporting that studies that estimate false positive rates often get the definition of "false positive" wrong. Carol Hart and Thurman Wenzl noticed that the editors of NEJM and Chance News did not do much better. Carol Hart wrote: In your table for calculating false positive rates, I believe you mislabeled the left-hand column--should read "negative result" and "positive result" rather than "true negative" and "true positive." The NEJM made the correction in the next issue and said that they regretted this error and so do we.
Bob Hayden mentioned an interesting lesson series designed for K-12 students but, in fact, of interest to anyone using newspaper articles in their teaching. These are math lesson plans based on the day's news, by authors Alison Zimbalist of the New York Times Learning Network and Lorin Driggs of The Bank Street College of Education in New York City. The lessons are based on articles in the New York Times, and the text of the related articles is available. The lessons emphasize the use of graphics.
Here are examples of lessons that use statistical concepts.
Understanding the value of numbers in the newspaper.
Interpreting opinion polls.
Analyzing baseball hall of fame statistics.
Learning about company mergers through interpreting and creating graphs.
All choked up by smoking statistics.
Using maps, statistics, and written texts to recognize the H.I.V. explosion in Africa.
Determining criteria of greatness.
A lesson in comparative economics.
Researching epidemics.
The other math lessons are also interesting--especially "Unmasking mathematical concepts in the art world" based on an article by our colleague Dan Rockmore. The lessons are at Browse Lesson Plans By Topic.
Dan Rockmore also suggested the first two articles and provided a number of sleepless nights by asking in each case: how do they know that?
Tierney makes a plea for abolishing the penny. He asks: how can there be a shortage of something that nobody wants? Tierney remarks that prompt action is necessary since the U.S. Mint is working around the clock to re-supply banks that have run out. The Mint is producing more than a billion pennies a month, most of which he claims are destined for oblivion. Philip N. Diehl, director of the Mint, stated that:
The majority of pennies don't circulate. They make a one-way trip from us to penny jars, sock drawers, piggy banks and the spaces between couch cushions. Two-thirds of the pennies produced in the last 30 years have dropped out of circulation.
This is the point at which Dan Rockmore asked: How in the world do they know that?
We found what seemed to be the answer to Dan's question in a 1996 report of the United States General Accounting Office: "Future of the Penny, Options for Congressional Considerations" GAO/T-GGD-96-153. This was prepared for testimony before the House Subcommittee on Domestic and International Monetary Policy. The report gives the pros and cons of discontinuing the use of pennies in American commerce.
To understand the problem we are interested in, you have to realize that the Mint just makes the pennies. They then send them out to regional Federal Reserve Banks, who in turn send them to commercial banks as needed. Commercial banks also send pennies back to a Federal Reserve Bank when they have more than they need.
In the report we read:
Federal Reserve System data showed that the penny does not circulate as much as other coins. ...Our calculations showed that, in 1995, the circulation rates were 34 percent for pennies and 88 percent for quarters. These numbers tell us that for two-thirds of the billions of pennies produced, the trip from Mint to the Federal Reserve to the commercial banks and finally to the customers is a "one way" trip --- they are not seen again in circulation.
A footnote states:
We calculated the circulation rate by dividing the number of coins received by the Federal Reserve Banks by the number of coins paid out by the Federal Reserve Banks to commercial banks.
We didn't really understand how this works and, after infinite arguments among ourselves, we decided to leave it to a discussion question to see if anyone could make sense out of this.
Since we could not understand this method, we suggest in the discussion questions a sampling method to estimate the attrition rate for pennies. Peter Doyle tried this method out for us and got an attrition rate of 6.2 percent, which is reasonably close to the average attrition rate of 5.5 percent that was given by the people at the U.S. Mint without any explanation of how they got it.
Peter, with the help of his daughter Helen, used a data set of 156 pennies collected by Dan Rockmore. The confidence interval for such a small data set is wide, so now we have to start collecting more pennies. We suggest you try this also and let us know the results. It is important that your pennies have really been circulating and have not come from a bank or from someone who collects old coins.
There were a number of letters to the editor about this article, expressing opinions about doing away with the penny. Some did not trust the merchants not to just round up. Another suggested that they should round up and give the difference to the government as a new form of tax. Another suggested that the math of combining the tax and then rounding to the nearest 5 cents was too difficult.
Interestingly, the GAO report states that a bill, H.R. 3761, introduced in 1989, suggested a rounding system whereby cash purchases would be rounded down to the nearest 5-cent price when the total transaction amount, including sales taxes, ended in 1, 2, 6 or 7 cents and rounded up when the total amount ended in 3, 4, 8, or 9 cents.
(1) Do you understand the argument given in the GAO report for the 67 percent attrition rate? If so, send it to us.
(2) Why do you think the bill in Congress did not just say: round down if the total amount ended with 1, 2, 3, 4 and up if it ended with 6, 7, 8, 9?
(3) One of the activities in "Activity Based Statistics" is to have the students bring in random pennies and combine them to get a histogram of the dates of a random sample of pennies. The purpose of the activity is to have a very non-normal distribution to illustrate the Central Limit Theorem. Even samples of size 4 work pretty well. We can use similar data to try to estimate the attrition rate for pennies. Here are the data on Dan Rockmore's pennies as carefully recorded by Helen Doyle.
Year  No.
99    17
98    16
97    16
96     8
95     8
94     7
93     6
92     9
91     2
90     4
89     7
88     5
87     7
86     6
85     8
84     2
83     5
82     9
81     6
80     3
79     5
78     4
77     2
76     3
75     3
74     2
72     1
71     1
69     1
68     3
67     1
65     1
64     1
62     1
61     2
60     2
59     2
56     1
52     1
46     1
45     1
Assume that there is a fixed probability p that a penny will disappear in any one year and that the same number of pennies are made every year. What should the distribution be for the date of a randomly chosen coin? What value of p gives the best fit for Helen's data?
(4) Well, of course the number of pennies made each year is not the same. We give below the number of pennies made each year since 1962. How would you change your model for the distribution of the date of a randomly chosen penny taking this data into account? Do this and get a better estimate for the attrition rate for pennies.
Year  Number of pennies made by U.S. Mint
98    10257508500
97     9199355000
96    13123260000
95    13540000000
94    13632615000
93    12111355571
92     9097578300
91     9324382076
90    11774659553
89    12607002711
88    11346550443
87     9561856445
86     8934262191
85    10935889813
84    13720317906
83    14219554428
82    16725504368
81    12864985677
80    12554803660
79    10157872254
78     9838838400
77     8618992300
76     8895884881
75     9956751442
74     8879277751
73     7597759222
72     5978526504
71     5355665654
70     5480313904
69     5684117200
68     4852420571
67     3048667100
66     3679666100
65     3063318100
(5) In one of our many quests to understand all this, we got the following information from the U.S. Mint.
In a 30-year production period 316.9 billion coins were made. Daily commerce requires 114 billion coins for normal transactions. Estimated Average Annual Attrition rate is 5.5 percent. Median Time in Circulation stock is 10.0 years.
Assuming that a 5.5 percent attrition rate means a .055 probability that a coin disappears in a year, is that consistent with the result you got in the previous discussion questions? If you toss a coin with probability .055 for heads, what is the expected time for heads to turn up for the first time? What is the median?
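Readers who want to try discussion questions (3) to (5) numerically might start from a sketch like the following, written in Python. It is only one possible reading of the questions, not necessarily the method Peter Doyle or the Mint used. The function names, the choice to measure a coin's age from 1999, and the simple grid search are all assumptions made for illustration. For question (4) the sketch uses only pennies dated 1965 to 1998, the years covered by the Mint table.

```python
import math

# Dates (two-digit years) and counts of Dan Rockmore's pennies, from the table above.
COUNTS = {
    99: 17, 98: 16, 97: 16, 96: 8, 95: 8, 94: 7, 93: 6, 92: 9, 91: 2, 90: 4,
    89: 7, 88: 5, 87: 7, 86: 6, 85: 8, 84: 2, 83: 5, 82: 9, 81: 6, 80: 3,
    79: 5, 78: 4, 77: 2, 76: 3, 75: 3, 74: 2, 72: 1, 71: 1, 69: 1, 68: 3,
    67: 1, 65: 1, 64: 1, 62: 1, 61: 2, 60: 2, 59: 2, 56: 1, 52: 1, 46: 1, 45: 1,
}

# Pennies made per year (1965-1998), from the Mint table above.
MINTAGE = {
    98: 10257508500, 97: 9199355000, 96: 13123260000, 95: 13540000000,
    94: 13632615000, 93: 12111355571, 92: 9097578300, 91: 9324382076,
    90: 11774659553, 89: 12607002711, 88: 11346550443, 87: 9561856445,
    86: 8934262191, 85: 10935889813, 84: 13720317906, 83: 14219554428,
    82: 16725504368, 81: 12864985677, 80: 12554803660, 79: 10157872254,
    78: 9838838400, 77: 8618992300, 76: 8895884881, 75: 9956751442,
    74: 8879277751, 73: 7597759222, 72: 5978526504, 71: 5355665654,
    70: 5480313904, 69: 5684117200, 68: 4852420571, 67: 3048667100,
    66: 3679666100, 65: 3063318100,
}

CURRENT_YEAR = 99  # assumption: a coin's age is measured from 1999


def mle_equal_mintage(counts):
    """Question (3): if the same number of pennies is made every year, then
    P(age = k) = p * (1 - p)**k for k = 0, 1, 2, ...  (a geometric law),
    and the maximum-likelihood estimate of p is n / (n + total age)."""
    n = sum(counts.values())
    total_age = sum((CURRENT_YEAR - year) * c for year, c in counts.items())
    return n / (n + total_age)


def log_likelihood_weighted(p, counts, mintage):
    """Question (4): P(date = y) is proportional to mintage[y] * (1 - p)**age.
    Only pennies whose date appears in the mintage table are used."""
    norm = sum(m * (1 - p) ** (CURRENT_YEAR - y) for y, m in mintage.items())
    total = 0.0
    for year, c in counts.items():
        if year in mintage:
            age = CURRENT_YEAR - year
            total += c * (math.log(mintage[year]) + age * math.log(1 - p)
                          - math.log(norm))
    return total


def mle_weighted(counts, mintage, step=0.001):
    """Crude grid search for the p that maximizes the weighted likelihood."""
    grid = [k * step for k in range(1, 300)]
    return max(grid, key=lambda p: log_likelihood_weighted(p, counts, mintage))


def geometric_mean_and_median(p):
    """Question (5): waiting time T for the first head, P(T = k) = p(1-p)**(k-1).
    The median is the smallest k with P(T <= k) >= 1/2."""
    mean = 1 / p
    median = math.ceil(math.log(0.5) / math.log(1 - p))
    return mean, median


if __name__ == "__main__":
    print("equal-mintage estimate of p:", round(mle_equal_mintage(COUNTS), 4))
    print("mintage-weighted estimate of p:", mle_weighted(COUNTS, MINTAGE))
    print("mean and median for p = .055:", geometric_mean_and_median(0.055))
```

For p = .055 the last function gives a mean waiting time of 1/.055, about 18.2 tosses, and a median of 13 tosses. The two estimates of p depend on the modeling choices listed above and will change as more pennies are collected.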
This article discusses the addition of new words to dictionaries.
The article gives ten new words for Webster's New World College Dictionary, Fourth Edition (published July 1999), ten for the Oxford American Dictionary and Language Guide (to be published this fall), ten for the Random House Webster's College Dictionary (to be published this fall) and ten words being tracked for future issues of The American Heritage Dictionary.
We checked the 40 new words listed for the 1999 or later editions. There was no overlap in the four sets of ten words. The Webster dictionary is the only one out and we checked that it indeed did not have most of the other thirty words. The article seemed to suggest that each dictionary had its own personality when it came to adding words. This all seemed too mysterious and led to Dan asking: how does an editor choose the words to be added to a dictionary?
As an example, we are told that editor Jesse Sheidlower at Random House became convinced that "yada yada yada" had staying power not only by its history but because of its parallel expression "blah blah blah", so he put it in the 1996 Random House Webster's Collegiate Dictionary. However, Michael Agnes, editor of Webster's New World College Dictionary, did not put it in their 1996 edition and said he is glad that he did not, since he would have to take it out of this year's edition.
Using Lexis-Nexis, we found the number of times yada yada yada appeared in a newspaper article in this database. The results by period are:
Period                       Total   With       Without
                                     Seinfeld   Seinfeld
1 Jan 1999 to 31 May 1999     168      36        132
1 Jun 1999 to 31 Dec 1999     406     100        306
1 Jan 1998 to 31 May 1998     710     435        275
1 Jun 1997 to 31 Dec 1998     258     107        151
1 Jan 1997 to 31 May 1997      44      25         19
1 Jun 1997 to 31 Dec 1997      15
1 Jan 1996 to 31 May 1996       9
1 Jun 1996 to 31 Dec 1996      11
1 Jan 1995 to 31 May 1995       9
1 Jun 1995 to 31 Dec 1995      11
1 Jan 1994 to 31 May 1994       9
1 Jun 1994 to 31 Dec 1994       7
1 Jan 1993 to 31 May 1993       6
1 Jun 1993 to 31 Dec 1993       6
We see from this that in 1996, when Sheidlower decided to include this expression, yada yada yada rarely occurred in newspapers. When Seinfeld introduced it into his popular television program on 24 April 1997, it took off. But, as Agnes observed, its use is falling off, though it is still far ahead of its position in 1996. We counted citations with and without Seinfeld mentioned in the article, and we see that, though Seinfeld was surely responsible for the increased use, the usage is substantial even outside the context of Seinfeld's program. Seinfeld's program ended in May 1998, and reports of the final episode probably account for the large number of citations in that period.
There is still some disagreement about whether it should be yada yada yada or yadda yadda yadda or even yata yata yata. There were 960 citations that used yadda yadda yadda as compared to about 1600 that used yada yada yada. Yata yata yata occurred in the 40's in a song sung by Judy Garland and Bing Crosby. The lyrics went like this:
When I've got my arm around you
and we're going for a walk
must you ya-ta-ta, ya-ta-ta
talk talk talk?
The article did not really answer Dan's question so we asked Mr. Sheidlower how he did it. He referred us to a very nice article he had written exactly to answer this question. The article is: Principles for the Inclusion of New Words in College Dictionaries by Jesse T. Sheidlower, Dictionaries, 16 (1995) pp. 32ff.
Sheidlower remarks that editors nowadays have an embarrassment of resources at their disposal: dictionaries of new words; journal and newspaper articles that discuss new words, such as William Safire's "On Language" in the New York Times Magazine; and databases to search, such as Lexis-Nexis and Dialog. One that he does not mention is his own web site, where Sheidlower answers questions about a word each day and encourages those who visit the site to send him new words to consider.
There are four factors that must be considered paramount in any policy governing inclusion of a new word: the number of citations, the range of use of the citations, the time span in which the citations are found, and what might be called the word's "cruciality", or the need to have the word in the language in the first place.
The article goes on to explain in more detail what these four factors mean and how Sheidlower uses them to select words. This process is illustrated by numerous examples of words that Sheidlower chose and words that he rejected.
This volume of "Dictionaries" was entirely devoted to neology (the practice of creating new words or giving new meanings to existing words). Perhaps the most interesting article from a statistical point of view would be the one on "The Use of On-Line Databases in Neology". Unfortunately, at the time this was written we had access only to the Sheidlower article and a similar brief article by Michael Agnes. Agnes gave criteria quite similar to those given by Sheidlower so, as usual, "the devil lies in the details".
(1) Do you think that editors formally collect data on each new word they consider?
(2) Do you think Agnes would have taken out yada yada yada this year if he had put it in in 1996? Do you think yada yada yada will continue its decline to oblivion or level off?
(3) How do you think an editor decides on the spelling of a word like yada yada yada if it originates in spoken form?
For some time there has been concern that the S.A.T. exams are biased against black and Hispanic students, who on average score below whites on the tests, and against women, who on average score below men. In June, the U.S. Education Department Office of Civil Rights began circulating draft legal guidelines outlining what it considers bias. It is felt that colleges using the SAT may face legal action because of differences in the scores of different groups.
ETS reports that they have made every effort to avoid bias and make exams fair to all who take them. According to the article:
The company tests some questions each year on a section of the SAT that doesn't count toward the total score and evaluates how students of various races and sexes fare in answering them. If one group answers a question significantly better than other groups do, the question is banished from future tests.
The article provides the following examples:
Below are questions that were tried out on students who took the SAT exams in 1998. All of the questions were shown to be biased and will not appear on future SAT exams.
Verbal
a. beach: ocean
b. drift: snow
c. wave: tide
d. rainbow: color
e. fault: earthquake
Correct Answer: B.
Results: 23 percent more whites than African-Americans and 26 percent more whites than Hispanics answered the question correctly.
Hypothesis: Regional variations. High proportions of African-Americans and Hispanics live in the south and southwest areas of the country, where there is less familiarity with terms associated with extreme winter weather.
Because Barbara McClintock's identification of 'jumping' genes represented a turning point in genetics, it is considered a ____ event.
Correct Answer: C.
Results: 9 percent more men than women answered this question correctly.
Hypothesis: Although the question is about a woman scientist, the science terms may have made women more uncomfortable than men.
The actor's bearing on stage seemed ____; her movements were natural and her technique ____.
a. unremitting. . .blase
b. fluid. . .tentative
c. unstudied. . .uncontrived
d. eclectic. . .uniform
e. grandiose. . .controlled
Correct Answer: C.
Results: 9 percent more women than men answered this question correctly. 8 percent more African-Americans than whites answered this question correctly.
Hypothesis: Women and African-Americans tend to do better on questions dealing with the humanities or the arts.
Math
For all positive numbers a and b, [a+b] is defined by [a+b] = (a+b)(ab). If 2a+2b = k[a+b], what is the value of k? (See discussion question (2).)
Correct Answer: 8
Results: 4 percent more females answered this question correctly than men.
Hypothesis: Females tend to do relatively better than males on questions, like the one above, that are from the curriculum or textbooks.
If 80 percent of X is equal to 20 percent of Y, then Y is equal to what percent of X?
a. 16 percent
b. 25 percent
c. 40 percent
d. 250 percent
e. 400 percent
Correct Answer: E.
Results: 12 percent more men than women answered this question correctly.
Hypothesis: Females tend to do worse than men on questions involving percentages, particularly non-rounded number percentages or percentages over 100.
If the square root of 2x is an integer, which of the following must also be an integer?
a. The square root of x
d. x squared
e. 2x squared
Correct Answer: C.
Results: 7 percent more African-Americans than whites answered this question correctly.
Hypothesis: African-Americans tend to do better on problems that come from the standard math curriculum and don't involve applied math.
At North Industries, 1,200 employees are in the health plan and half of all company employees are in the savings plan. Of all the company savings plan members, 800 are in the health plan and 250 are not in the health plan. How many employees of North Industries are not in the health plan?
Correct Answer: 900.
Results: 16 percent more whites than Asian-Americans answered this question correctly.
Hypothesis: Asian-Americans tend to do worse on word problems applied to real life situations than whites of the same ability.
Source: Educational Testing Service.
Not surprisingly, some letters to the editor argued that this process itself was not fair. A representative of Educational Testing Service wrote:
Educational Testing Service
The purpose of the SAT is to provide test takers, college admission officers and other test users with information that allows them to make valid inferences. To this end, we use a number of procedures that eliminate irrelevant factors from the test. One of these procedures, Fairness Review, asks test developers to review every SAT question and each test to eliminate offensive content and ensure that the test includes positive references to both genders and various ethnic groups. Another process, Differential Item Functioning, produces statistics that identify large differences in group performance on individual test questions; it helps us determine whether these differences relate to what the SAT is designed to measure (verbal and mathematical reasoning) or whether they could be caused by unrelated factors. The DIF procedure first matches test takers from different groups, e.g., males/females, on the basis of their overall test performance, and then flags any test question on which these matched groups perform in substantially different ways.
Fairness Review and DIF maintain the integrity and high quality of the SAT. Groups perform differently on the SAT. Our job is to ensure that score differences reflect only relevant differences in the knowledge and skills that the test is designed to measure.
Paul A. Ramsey
School and College Services
(1) What do you think about the Hypotheses?
(2) As you will see if you try to work it, the first math problem is obviously still wrong. Given that the answer 8 is correct, what do you think the original problem was?
(3) How many of the questions do you think involve what is traditionally thought of as bias?
Britain has built a huge Millennium Dome with many "zones" including such things as futuristic science centers, art galleries, theaters, a traffic-free cycle network and even an indoor tropical rain forest. An article about the Dome in the Guardian mentioned that it would have 675 loos. In a letter (July 27) reader Brian P. Moss wrote:
Does Dr. John Kilburn (Letters, July 26) assume sex discrimination in the Millennium Dome because the number of toilets, 675, is not divisible by two? An alternative explanation might be that the architects have at last realized that women have to sit down for both types of toilet call, so prolonging their average visit, and have provided a higher number of women's loos.
In this note Robert Matthews writes:
The New Millennium Experience may take comfort in the hundreds of loos they have provided in the dome for both sexes (Letters, July 28), but queuing theory backs Mr. Moss's point (Letters, July 27). If there are two types of people, one of whom is dealt with X times more slowly than the other, their queue will on average be at least a factor X-squared longer. Women take about 2.3 times longer to use the loo than men, and can thus expect queues at least 2.3-squared, i.e. about five times, longer than those experienced by men.
The effect of race and sex on physicians' recommendation for
The New England Journal of Medicine, 25 February, 1999, 618-26
K. A. Shulman et al.
The article by Shulman et al. reported a study in which they asked primary care physicians attending two national meetings to make recommendations for treatment, given video-recorded interviews presenting descriptions of patients with chest pain. The patients were in fact actors and, using a multimedia computer program, presented 144 descriptions covering all possible combinations of six experimental factors, including race and sex.
Shulman and his colleagues claimed to show that "race and sex of a patient independently influence how physicians manage chest pain." Their article received a great deal of press coverage including Nightline and the major newspapers. All of these made statements equivalent to the following reported in the Wall Street Journal and New York Times based on an Associated Press report.
Doctors are only 60 percent as likely to order cardiac catheterization for women and blacks as for men and whites.
Schwartz and her colleagues point out two serious statistical problems with this study. First, the 60 percent is a mistake. It should be 93 percent. Shulman and his colleagues found that 84.7 percent of blacks and 90.6 percent of whites were referred for catheterization. They chose to report this in terms of an odds ratio of (.847/.153)/(.906/.094) = .6. The press, being unaccustomed to working with odds ratios, interpreted this to mean that doctors are only 60 percent as likely to order referral for women and blacks as for men and whites, when in fact they are 93 percent as likely to do so (.847/.906 = .93).
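The difference between the two figures is easy to check by hand or by machine. The following Python sketch, using only the two referral rates quoted above, computes both the odds ratio that the authors reported and the ratio of probabilities (the "relative risk") that the press description actually calls for. The function and variable names are ours, chosen for illustration.

```python
def odds(p):
    """Convert a probability p into odds p / (1 - p)."""
    return p / (1 - p)


# Referral rates quoted in the text above.
rate_blacks = 0.847
rate_whites = 0.906

# What the authors reported: a ratio of odds.
odds_ratio = odds(rate_blacks) / odds(rate_whites)

# What "60 percent as likely" would have to mean: a ratio of probabilities.
risk_ratio = rate_blacks / rate_whites

print("odds ratio:", round(odds_ratio, 2))
print("risk ratio:", round(risk_ratio, 2))
```

When both rates are close to 1, as they are here, the odds are very sensitive to small changes in the rate, which is why the odds ratio (about .6) looks so much more dramatic than the ratio of the rates themselves (about .93).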
The next problem was that the authors' data showed that the rate of referral was the same for white men, black men, and white women (90.6 percent); only black women had a different rate of referral (78.8 percent). Because the analysis considered only the two categories, gender and race, this single difference made it appear that there was a difference in referral rate by race and by gender when in fact the difference was due entirely to the lower referral rate for black women. Media reports about the study focused on sexism and racism (as did the authors of the original paper). As Schwartz remarked in our local newspaper:
It's hard to understand, if it's racism, why only black women, and not black men had a lower referral rate, and, if it's sexism, why only black women and not white women had a lower referral rate.
The authors of the original paper in their reply to comments about their paper said:
Our study hypotheses, as stated in the original grant application, were that blacks would be less likely to be referred for cardiac catheterization than whites and that women would be less likely to be referred than men. Our reporting of the sizes of the main effects of race and sex is therefore consistent with fundamental statistical principles.
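To see how one unusual cell can produce two apparent "main effects", consider the following Python sketch. It takes the four referral rates quoted above and averages them by race and by sex. It assumes that the four race-by-sex groups are the same size, which is our assumption for illustration and is not stated in the text; the names used are also ours.

```python
# Referral rates for the four race-by-sex groups, as quoted in the text above.
rates = {
    ("white", "man"): 0.906,
    ("black", "man"): 0.906,
    ("white", "woman"): 0.906,
    ("black", "woman"): 0.788,
}


def marginal_rate(rates, index, value):
    """Average the rates of all groups whose key[index] equals value.

    Assumes (hypothetically) that the four groups are equally large,
    so that a plain average gives the combined rate."""
    selected = [r for key, r in rates.items() if key[index] == value]
    return sum(selected) / len(selected)


by_race = {race: marginal_rate(rates, 0, race) for race in ("white", "black")}
by_sex = {sex: marginal_rate(rates, 1, sex) for sex in ("man", "woman")}

print("by race:", by_race)
print("by sex:", by_sex)
```

Under the equal-size assumption the combined rate for blacks, and likewise for women, is (.906 + .788)/2 = .847, which agrees with the 84.7 percent reported for blacks. A single low rate for black women is thus enough to make both the race comparison and the sex comparison show a gap.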
(1) What do you think about the authors' explanation of why they combined gender and race and did not report the more detailed results?
(2) Who deserves most of the blame for this mistake in reporting the results of the study, the authors or the media?
(3) Could the whole explanation be that the black women actors chosen were just not good actors?
Indefatigable Robin Lock gave a record number of talks at this meeting and at the AMS meeting in Baltimore. Robin's talks on web resources are available at Robin Lock's Page. Robin does an incredible job of organizing links to interesting statistics web sites and keeping them up-to-date.
No meeting on statistical education is complete without the wisdom of George Cobb. For this meeting George gave us an after-dinner speech in which he asked the question:
Are we paying so much attention to "how" we teach -- computers, the web, activities, groups, projects -- that we are neglecting "what" we teach?
At the AMS meeting in Baltimore the "Deming Lecture" was given by Kenneth Prewitt, director of the Census. The title of his talk was "The Census: Political Questions, Scientific Answers." Prewitt gave an interesting history of the Census and his understanding of the responsibilities of the Census Bureau in carrying out the Census. He stated that, in response to the Supreme Court ruling that sampling could not be used to determine the population for the purpose of apportioning congressional seats, the Census Bureau plans to make every effort to achieve as complete an enumeration as possible and report the result as required by Dec. 31, 2000.
The Census Bureau will then continue its efforts to get a progressively more accurate count using sampling, with the final count reported April 1, 2001. Prewitt stated that this is the job of the Bureau. How these numbers will be used in giving money to the states, redistricting, etc., will be up to the various government officials who have the responsibility for making these decisions. Prewitt hoped, and expressed some confidence, that these officials would base their decisions on the population estimate that the Census Bureau indicates is the most accurate.
There were many other interesting new things presented at the AMS meeting. Three that we will review in the next Chance News are: the new statistical software package Fathom produced by Key Curriculum Press and two new Chance books: "Chance Rules" by Brian Everitt and "What is Random?" by Edward J. Beltrami--both new books published by Springer Verlag.
Finally, we come to our most interesting summer experience. We attended a workshop organized by Joan Garfield and Dani Ben-Zvi: The First International Research Forum on Statistical Reasoning, Thinking and Literacy (SRTL), held at Kibbutz Be'eri, Israel, July 18 - 23, 1999. The papers presented at this workshop, as well as background papers, can be found at SRTL.
In the background papers you can find an article by Joan Garfield surveying the definitions of statistical thinking and literacy (especially the former) suggested by experts in statistical education. You will also find the article "Statistical literacy: Conceptual and instructional issues" by Iddo Gal, University of Haifa, Israel. In this article, Gal discusses how a statistical literacy course should differ from a standard introductory statistics course.
Iddo argues that such a course should be aimed at a consumer of statistics rather than a producer of statistics. The basic statistical concepts taught are mostly the same but the emphasis should be different. For example, a much broader discussion of types of experiments is essential to understanding reports in the news on medical experiments. Students need to understand the different interpretations of probabilities (subjective and objective) and risk (relative and absolute). They need to know not only that correlation does not imply causation but also how causation is demonstrated.
Iddo would like students to come away from a statistical literacy course with an ability to read a news article and almost automatically ask a set of questions like:
Iddo's article is about a quantitative literacy course for adults, but his comments apply equally to a course like our Chance course.
(1) How would you define statistical literacy?
(2) Is there a difference between statistical thinking and statistical literacy? If so, what is the difference?
(3) As Iddo remarks, there is nothing sacred about his list. Give a shorter list, say five or so items, that you would want anyone reading a news account of a new study to be thinking about.
This work is freely redistributable under the terms of the GNU General Public License as published by the Free Software Foundation. This work comes with ABSOLUTELY NO WARRANTY.