CHANCE News 6.06

(11 April 1997 to 10 May 1997)


Prepared by J. Laurie Snell and Bill Peterson, with help from Fuxing Hou, Ma.Katrina Munoz Dy, Kathryn Greer, and Joan Snell, as part of the Chance Course Project supported by the National Science Foundation.

Please send comments and suggestions for articles to jlsnell@dartmouth.edu.

Back issues of Chance News and other materials for teaching a Chance course are available from the Chance web site:

http://www.geom.umn.edu/locate/chance

============================================================

"They're falling in a Poisson distribution," says Pointsman in a small voice, as if it was open to challenge.
Thomas Pynchon, Gravity's Rainbow


Note: The Dartmouth Math Across The Curriculum NSF Project is giving two workshops this summer from June 26 to June 28. You can find out more about these workshops from their announcement at the end of this Chance News.

If you are having difficulty using Applets on the web--the network is slow or they crash your machine--you might try some of the software written in the free Xlisp-Stat language available for all platforms. These programs are run on your machine independent of the web. Of course, the same could be true for Applets if the developers would provide the sources but they seldom do. Here are two web sites where you can obtain interesting Xlisp-Stat software.

Teach Modules:

Here you will find modules covering several important statistical concepts including: the central limit theorem, confidence intervals, sample means, and regression. Each module asks the student to explore these concepts with interactive Xlisp-Stat programs. The module describes the concept, how to explore it with the software, and provides exercises for the students. You can read about how the authors use this software in their article: Using Graphics and Simulations to Teach Statistical Concepts, Marasinghe, Meeker, Cook, and Shin, "The American Statistician" Vol. 50, No. 4, November 1996, p. 342.


ViSta is an ambitious project that provides a statistical package written with the Xlisp-Stat language and a text to go with it. ViSta is designed to allow the user to choose the appropriate level of expertise. If you are a student or other non-expert, you are offered a guided tour on how to use the package to explore data sets supplied by ViSta or the user. The text material is available in Acrobat pdf format.

Michael Olinick provided the following three accounts of a mathematical model for finding a mate.

Next time, Jim, try counting to 13 first...
Daily Mail, 5 April 1997, p. 7
Jenny Hope

According to a computer model developed by Dr. Peter Todd of the Max Planck Institute for Psychological Research in Munich, looking at a dozen possible mates gives a person enough information to make a rational choice. His conclusion shows that the strategy of assessing 12 potential partners, and then accepting the next person whose "attraction rating" meets the minimum criteria derived from your experience, gives you a 90% chance of settling for someone in the top 10% of people you will ever meet. Searching a lifetime for the perfect partner is not advisable; the law of diminishing returns implies that, after a point, your chances of improving your choice increase only slightly.


(1) The article states that, after looking at 12, "...the next best person encountered--lucky number 13--is statistically likely to be a good match;" this is also implied in the title. But does Todd really advise settling on the 13th?

(2) How do you think the 90% and 10% figures were calculated?

Mathematical model finds love the quick way.
The Herald (Glasgow), 5 April 1997, p.6

Here Todd's argument is summarized as follows: "Most people could base their choice on assessing just 12 people chosen at random and establishing a 'list of preferences.'"


Does choosing prospective mates "at random" sound like a reasonable model of actual dating patterns?

Maths, love and man's best friend.
The Independent, 5 April 1997, p. 7
Glenda Cooper

In what sounds like an application of the classical "secretary problem", the article reports that once an employer has seen 37% (sounds like 1/e) of job applicants, a coherent picture of the ideal employee is built up and the next person to fulfill these criteria is the one who gets the job.

The article summarizes Dr. Todd's recommendations as follows: first estimate how many people you are likely to meet in life, assess the first 37%, remember the best, and then take the next person who meets that standard. Unfortunately, the article adds, you would have to search through 75% of potential acquaintances to do so, a daunting task for "most of us who meet thousands of people."


(1) Where do you think the 75% comes from? In the "secretary problem" the goal is to maximize the chance of finding the single best person in the applicant pool. Is this the same as Todd's goal?

(2) Are the "applicants" for partner really the thousands of people we meet in a lifetime?

(3) From these stories, do you think you understand what Todd is recommending?

(4) In the secretary problem you can rank the applicants among those you have seen, but, if you don't hire the candidate before passing on to the next one, you lose this candidate. Your aim is to choose the best candidate. Show that if you have 100 candidates to interview you can have at least a 25% chance of getting number one by interviewing the first 50 and then choosing the first candidate after this who is better than all of the first 50. Can you adapt this argument to the problem discussed by Todd?
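
A quick simulation is one way to check the bound in question (4) before proving it. The sketch below assumes the candidates arrive in uniformly random order and that we take the first candidate after the skipped group who beats everyone seen so far; the function name, trial count, and seed are illustrative choices, not anything taken from Todd's model.

```python
import random

def success_rate(n=100, skip=50, trials=20000, seed=1):
    """Estimate the chance of picking the single best of n candidates
    when we pass over the first `skip` and then take the first candidate
    who beats everyone seen so far (taking nobody counts as a failure)."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(trials):
        ranks = list(range(n))          # n - 1 is the best candidate
        rng.shuffle(ranks)
        benchmark = max(ranks[:skip]) if skip > 0 else -1
        chosen = next((r for r in ranks[skip:] if r > benchmark), None)
        if chosen == n - 1:
            wins += 1
    return wins / trials

half = success_rate(skip=50)    # the rule in question (4)
e_rule = success_rate(skip=37)  # the "37%" rule reported by The Independent
print(half, e_rule)
```

Both estimates should come out above 25%; the bound in the question is deliberately conservative because it only counts the case where the second-best candidate falls in the first half and the best in the second.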

Paul Velleman suggested the following article and provided the discussion questions that he used in his class.

A census in cyberspace.
Business Week Online, April 24
Amy Cortese

Business Week commissioned a Baruch College-Harris poll to see how many households use the Internet or the Web or both. The poll, based on a survey of 1003 adults, estimated that 21% of adults are in the Net population. This would be about 40 million people, or about half the number of people who use computers and twice as many as estimated a year ago.

The gender gap is decreasing in the Net population now estimated to be 41% female. The Net population is still dominated by whites. Blacks and Hispanics each account for 6% of Net users which is only slightly larger than a year ago. The Net population is also skewed towards the affluent.

The principal use of the Net is to search for information: 82% said they use it for looking for information, 72% use it for education, 68% for news, and 61% for entertainment. Only 9% use it for shopping and only 1% shop on the Net regularly. Older people tend to shop more than younger folks.

Young folks use the Net more for entertainment and socializing and older people for more serious matters such as checking on their stocks. Young people surf more than older people.

The survey and the results are available from this web site. At the end of the survey it is stated that: 1003 adults were surveyed, including 259 Internet/World Wide Web users, with a sampling error of +/-3%.


(1) The first question of the poll is: Do you use a computer or not? 43% said yes and 57% said no.

(a) Explain what Business Week means by sampling error of 3%. It might help to show the calculation the poll takers are making to obtain their estimate of 3% error.

(b) Properly carried out, the calculation you gave (or should have given) in (a) does come out to be almost exactly 3%. The next question addresses only the 431 respondents who have computers.

(2) Question 2 is: Do you use your computer to access the Internet or World Wide Web?

259 of the original 1003 respondents say that they use the Internet or WWW. The remainder of the poll is addressed only to these 259 respondents. For example, when asked what they do on the Internet 50% (of the 259) responded that they often did research.

Is the error in this (50%) estimated proportion larger, smaller, or about the same as the 3% error in the proportion estimated for the first question? Why? Is Business Week correct to claim a 3% error in its estimates?
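
The arithmetic behind these questions can be sketched in a few lines. This assumes simple random sampling, the worst-case proportion p = 0.5, and the conventional 95% multiplier 1.96; the article does not say which formula the pollsters actually used, so treat the function below as an illustration only.

```python
import math

def margin_of_error(n, p=0.5, z=1.96):
    """Half-width of an approximate 95% confidence interval for a proportion
    estimated from a simple random sample of size n."""
    return z * math.sqrt(p * (1 - p) / n)

full_sample = margin_of_error(1003)  # all respondents (question 1)
net_users = margin_of_error(259)     # Internet/WWW users only (later questions)
print(f"n=1003: +/-{full_sample:.1%}   n=259: +/-{net_users:.1%}")
```

Comparing the two printed values shows how much the margin changes when an estimate rests on the 259 Net users rather than on all 1003 respondents.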


Ken Steele sent the following remarks on the study mentioned in the last Chance News about the value of listening to Mozart before taking a test. References for the papers he mentions can be found at the end of this Chance News.

In the last Chance newsletter, reference was made to a study by Rauscher, Shaw, & Ky (1993) who reported in the journal "Nature" that students showed an increase in scores the equivalent of 8 to 9 IQ points after listening to a Mozart piano sonata for approximately 10 minutes. Understandably, such a finding generated a great deal of interest in the public as well as other researchers. However, I think it is important to point out to Chance readers that support for the existence of this effect has not fared well in replications.

Below I list the effect size scores for the original report and published attempts to replicate the effect. These scores are for the contrast between performance after Music vs. after a Silence control condition.

Study                                  d score    N of Subjects

Rauscher, Shaw & Ky (1993)               .62           36
Stough, Kerkin, Bates, & Mangan (1994)   .05           30
Kenealy & Monsef (1994)                  .04           24
Carstens, Huskins, & Hounshell (1995)    .08           51
Newman, Rosenbach, Burns, Latimer,
Matocha, & Vogt (1995)                  -.07           78
Steele, Ball, & Runk (1997)             -.04           36

Effect size is supposed to be a measure of what may be called the degree of departure from the null hypothesis in standard deviation units. Positive values are assigned to scores consistent with the experimental hypothesis, negative values are assigned to results inconsistent with the hypothesis.

Wolf describes one way of interpreting d scores. If you go to a normal curve table and look up the area under the curve associated with a d value then you will obtain the percentage of the experimental group which exceeds the upper half of the control group. For example, the Rauscher et al value of .62 corresponds to the area under the curve of .7324 and could be interpreted as meaning that the average person in the experimental condition would have a score higher than 73.24% of the scores in the control condition.
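
For readers without a normal table at hand, the lookup Wolf describes can be done with the error function in Python's standard library. This is only a sketch of the table lookup; the helper name `normal_cdf` is ours.

```python
import math

def normal_cdf(z):
    """Area under the standard normal curve to the left of z."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

area = normal_cdf(0.62)  # d score reported for Rauscher, Shaw & Ky (1993)
print(round(area, 4))
```

The same function can be applied to any of the d scores in the table above, including the negative ones.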


(1) What is the mean effect score for the replications? What does this value correspond to in terms of area under the normal curve? What is the interpretation associated with that value?

(2) Should one compute a mean effect size score? Or should effect sizes of studies be computed and evaluated individually? How should the size of the study be incorporated into considering effect size?

(3) How many negative results does it take to outweigh a positive result?
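
As a starting point for questions (1) and (2), the sketch below computes both a simple mean and a mean weighted by number of subjects for the five replications in the table. Weighting by N is just one simple choice we assume here for illustration; it is not necessarily the method Steele or Wolf would recommend.

```python
# (d score, number of subjects) for the five replications in the table
replications = [(0.05, 30), (0.04, 24), (0.08, 51), (-0.07, 78), (-0.04, 36)]

simple_mean = sum(d for d, _ in replications) / len(replications)

total_n = sum(n for _, n in replications)
weighted_mean = sum(d * n for d, n in replications) / total_n

print(f"simple mean d = {simple_mean:.3f}")
print(f"N-weighted mean d = {weighted_mean:.4f} (total N = {total_n})")
```

Notice that the two averages need not even have the same sign, which is one reason question (2) is worth discussing.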

Norton Starr suggested the following article:

Experts see bias in drug data.
The New York Times, 29 April 1997, C1
Lawrence K. Altman

For several years, a pharmaceutical company prevented Dr. Betty J. Dong from reporting a study showing that one of its key drugs was no more effective than less expensive generic versions. (See Chance News 5.05). Dr. Dong had signed a contract in 1987 giving the sponsoring company the right to veto any publication. The study, completed in 1990, was finally permitted to be published in the April 16 issue of JAMA.

Altman, by interviewing experts, attempts to find out if the fact that drug companies sponsor research on drugs seriously biases the outcomes.

Dr. Pendergast, a Deputy Commissioner of Food and Drugs, said: "Now whether it is because the companies have contracted with the researchers to give the companies control, or whether because oftimes the stuff that is not published is negative data, or it is because journals were not interested in publishing it, ultimately, from our perspective it has the same result -- that there is a discordance between the full news about a new therapy and that which is published in the scientific literature in many cases."

Dr. Robert M. Califf, an authority on conducting independent clinical trials, remarked: "If you were a company doing your own studies and you did five studies that showed that your product was no good and one that it was good, you would only publish the favorable one."

Altman remarks that an example of subtle pressure can be found in the Framingham study of heart disease when it was run by the National Institutes of Health. In May, "The Journal of Clinical Epidemiology" will publish an article by Dr. Carl C. Selzer explaining how, in 1972, the Institutes would not permit him to submit for publication a paper showing that men who consumed moderate amounts of alcohol had a lower risk of heart disease than those who did not drink.

Dr. Mildred K. Cho, an ethicist at the University of Pennsylvania, reported, in the "Annals of Internal Medicine", March 1, 1996, a study designed to compare the quality of studies sponsored by drug companies with those not sponsored by a drug company. The study found that 98%, or 39 of 40, drug-company-sponsored articles, published in several journals, had outcomes favoring the drug of interest, whereas 79 percent, or 89 of 112, without acknowledged drug company support favored the drug of interest.
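
One way to see whether a gap like 39 of 40 versus 89 of 112 could plausibly be chance is a two-proportion z-test. The sketch below uses the pooled normal approximation; this is our choice for illustration, and the article does not say which test Cho used. With only one unfavorable article among the 40, the approximation is rough and an exact test would be a safer choice.

```python
import math

def two_proportion_z(x1, n1, x2, n2):
    """Pooled two-proportion z-test; returns both proportions, z, and a
    two-sided p-value from the normal approximation."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    p_value = math.erfc(abs(z) / math.sqrt(2))  # equals 2 * (1 - Phi(|z|))
    return p1, p2, z, p_value

p1, p2, z, p_value = two_proportion_z(39, 40, 89, 112)
print(f"{p1:.1%} vs {p2:.1%}: z = {z:.2f}, two-sided p = {p_value:.4f}")
```

A small p-value says only that the difference is unlikely to be chance; it does not distinguish among the explanations Cho offers for the difference.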

Altman remarks: "The findings suggested that suppression of negative data could be occurring, Dr. Cho said in an interview, but others contend that drug companies are more likely to support studies of drugs that are likely to be beneficial."


Norton Starr found this last paragraph puzzling and suggested looking at what Cho said in her article about her finding that studies supported by drug companies had a significantly larger proportion of positive results. Cho writes:

Our finding that articles acknowledging drug company support are significantly more likely to report positive results than articles with no drug company support confirms findings of others and could be due to several factors. First, drug companies may be less likely than other investigators to undertake or sponsor drug studies unless evidence already suggests that a drug is effective. Second, the study design may explain this. For example, because drug manufacturers are required to compare a drug with placebo to obtain FDA approval, industry sponsored studies might be more likely than non-industry-sponsored studies to use placebo controls. A drug might be more likely to appear efficacious compared with a placebo rather than an alternative treatment. She mentions that other articles have suggested that negative results are less apt to be submitted for publication and, if submitted, less likely to be accepted for publication.

(1) Which of her explanations seems most plausible to you? Are Cho's remarks quoted by Altman consistent with what she wrote in her paper?

(2) Cho found that a higher proportion of the studies done by drug companies were randomized studies rather than observational studies. Why do you think this was the case?

Jeff Eisenman suggested the following article and provided the discussion questions that he used in his class.

Prostate placebo is found to work.
The Chattanooga Times, 17 April, 1997
Janet McConnaughey, The Associated Press

It has long been known that many of those who take placebo pills in a controlled medical study find that the pills have some positive effect. A recent study suggests that this effect may be long-lasting.

Dr. J. Curtis Nickel reported on a study carried out in 28 medical centers in Canada to test the effect of the drug finasteride (Proscar) on men diagnosed to have an enlarged prostate. The drug is believed to shrink the prostate in men with enlarged prostate.

The study involved 613 patients for a period of two years. Nickel says: "One of the things we noted was that the patients were continuing to do very well on placebo. Some didn't want to stop taking the pills."

When they looked at the 303 men who had placebo pills they found that they really were doing better. The men's urine really flowed faster. However, while the finasteride-treated prostates shrank more than 21 percent, the placebo prostates grew an average of 8.4%.

Another article on this study (The Washington Post, 22 April) reported that 243 patients receiving placebo reported side effects that they ascribed to their medicine. The most common complaints were impotence and reduced sex drive, but some blamed the pills for a wide range of problems such as nausea, diarrhea, skin rashes and aching testicles.


(1) What information contained in the article at least partially qualifies Dr. Nickel's claim that patients "were continuing to do very well on placebo"?

(2) What information gathered by the researchers but not appearing in the article would help you understand which patients were experiencing which effects?

(3) What information not gathered by the researchers would help you understand what effects could be attributed to the placebo pills?

(4) If you were to conduct a follow-up study, what modifications in the study design would you make to gain a clearer understanding of (a) the results that can be attributed to the placebo, and (b) the characteristics of the patients most likely to experience such gains?

(5) Construct one or more figures or tables with labels for the row headings, column headings, vertical axis, horizontal axis, or whatever, with imaginary data to illustrate the kinds of findings that might emerge from conducting the study you just described.

Overdosing on health risks.
New York Times Magazine, 4 May 1997, pp. 44-45
Marcia Angell

From her position as executive editor of "The New England Journal of Medicine," Angell sees the impact of the journal's articles on the public. "No sooner do we publish a study on diet or life style than news of its conclusions, though virtually none of its qualifying details, hits the airwaves. Within 24 hours, millions of people consider eating fewer egg yolks or more oat bran to fend off disease."

Angell remarks that the public is not good at differentiating between significant and insignificant risks. She illustrates this in terms of the mammogram controversy. In January, the National Institutes of Health panel of experts concluded that regular mammograms for women in their 40's would at best save the life of 1 in every 1,000 women screened. Increased risk of cancer from radiation might occur for about 3 in every 10,000 screened. Because pre-menopausal women have denser breast tissue, mammograms are more difficult to interpret in this age group than in those over 50. Most women with suspicious-looking mammograms end up not having breast cancer. However, as a result of the mammogram test they might end up undergoing unnecessary surgery. Angell remarks: "In short, so small would be the payoff of regular mammograms at this age that the risk of driving the car to get them might well outweigh the benefits of the test. Yet the reaction of many to the conclusions of the panel was that they were callously sentencing large numbers of women to a death sentence."

Angell illustrates how a small risk can appear big by the study that found post-menopausal estrogen is associated with a 30 percent increase in the risk of breast cancer. She suggests that the same risk could have been expressed in a less threatening way. Since we know that 3 or 4 percent of post-menopausal women will get breast cancer in the next 10 years, we could say that this study shows that estrogen increases that risk to about 4 or 5 percent. Put another way, if you are a post-menopausal woman trying to decide whether to take estrogen, this study shows that your chances of remaining free of breast cancer for 10 years would decrease from over 96 percent to about 95 percent.
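
The translation from a relative increase to absolute risks is simple enough to check directly. The sketch below applies the 30 percent increase to baselines of 3 and 4 percent, the range given above; the function name is ours.

```python
def ten_year_risk(baseline, relative_increase=0.30):
    """Apply a relative increase to a baseline absolute risk."""
    return baseline * (1 + relative_increase)

for baseline in (0.03, 0.04):
    raised = ten_year_risk(baseline)
    print(f"baseline {baseline:.1%} -> {raised:.1%}; "
          f"cancer-free {1 - baseline:.1%} -> {1 - raised:.1%}")
```

The same 30 percent figure looks very different depending on whether it is reported as a relative increase, an absolute risk, or a chance of staying healthy.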

Angell gives examples of small risks which do have a significant impact on the entire population, while for an individual they appear to be very small risks. For example, hearing the national recommendations to lower their cholesterol level, individuals automatically felt that they should do this, making large changes in their diets for very little individual gain.

Angell concludes by encouraging increased skepticism in what you read in the news. She says that, with a few exceptions such as giving up smoking, many of the changes in lifestyle suggested will produce small effects and "there's more to life than fretting about health risks".


Should Laurie quit buying yogurt cones when he really prefers ice-cream cones?

Midwives deliver healthy babies with fewer interventions.
The New York Times, 18 April, 1997 p. 15
Tamar Lewin

Midwives are Better.
All Things Considered: National Public Radio, 20 April 1997
Daniel Zwerdling

Dr. Roger Rosenblatt of the University of Washington Medical Center has completed a study on the differences in costs and treatments of women who go to midwives instead of doctors. He has found that women who do go to midwives have "less expensive, less invasive procedures."

The 1,322 women in the study were healthy, low-risk patients. Women were excluded from the study if they had serious problems in past pregnancies, a major medical condition, or no obstetrical treatment in the first trimester, or if they were older than 34 or younger than 18.

The results were quite favorable for the midwives. The major differences occurred during labor--the pre-natal care was nearly identical. The women using midwives tended to use fewer drugs to induce labor, while doctors tended to use drugs more liberally. Midwives also used less anesthesia, especially epidural anesthesia, and performed fewer episiotomies.

A major discovery was a significant difference in the number of cesarean sections performed. Midwives had about forty percent fewer cesarean sections than the doctors. Midwives had a cesarean section rate of 8.8 percent vs. 13.6 percent for obstetricians and 15.1 percent for family doctors. Rosenblatt commented that: "It is striking that the patients of the midwives had a rate under 10 percent, and in some sense it gives us a target that we can consider obtainable."

On average, patients of midwives probably save about 12 percent of what doctors' patients pay for the services during pregnancy. If 3 million women deliver babies a year in the United States, that would add up to about 3 or 4 billion dollars in savings.

So should women run out and find a midwife immediately? Well, the situation is much more complicated than just the raw results of the study. In his interview on NPR, Rosenblatt tells us that most of the women who choose to go to midwives tend to be "tougher". They are less likely to ask the midwife for drugs during labor. They would generally be more likely to "tough it out" for a longer time. Also, the midwives tend to form a closer bond with the patients, better understanding how their bodies may have different reactions to various treatments. They might be better able to determine alternatives for women besides drugs and surgery. Perhaps, if the usage of midwives spreads in the United States, the cesarean section rates can be reduced along with costs of pregnancies. However, this would take time. Midwifery has come a long way. Possible patients must be informed that using a midwife can be as safe as using a doctor.


Can you conclude from this study that there is too much intervention in the delivery of babies by the obstetricians and family doctors?

ActivStats.
CD-ROM, Addison-Wesley, July 1997.
Paul Velleman
ISBN 0-201-31071-6, $42.50

Concern is often expressed that most students do not get their first introduction to statistics from a statistician. This CD-ROM will go far toward improving this situation. Students who have ActivStats will learn what statistician Paul Velleman thinks statistics is all about.

They will not hear him for an hour at a time but more like three or four minutes at a time. The segments are short because they are limited to a single concept. The goal is to motivate, explain, visualize and reinforce each individual concept before going on to the next -- something we can't afford to do in a lecture.

In a typical segment, Paul tells the students the importance of understanding the context of the data. He gives examples from advertisements and asks the students to answer the questions who? what? and why? relative to the collection of the data. Key points are presented on a virtual blackboard, and the students are encouraged to take notes.

In between Paul's segments, students are asked to carry out a variety of tasks related to the topic Paul has introduced. For example, students may be asked to analyze a data set using the built in statistical package Data Desk, read a paragraph from a standard text such as David Moore's "Basic Practice of Statistics," or test their understanding of the concept by trying to make a correct transformation between a set of words and a paragraph with missing words. Wrong words tumble back to the word source and final victory is rewarded by a pleasant cheer.

Students may also be invited to see a segment from the popular "Against all Odds" and "Decisions Through Data" video series. For example, in the unit on "understanding relationships", students view the segment relating to the Boston Beanstalk Tall Club. If they have a web browser on their computer and click on the WEB button, they will be taken to the Boston Tall Club's homepage.

When studying correlation, the WEB button takes them to articles in Chance News that use correlation. When studying confidence intervals, clicking on the TOOL button gives the student a chance to draw samples from a population and visualize the Central Limit Theorem.

A student making the mistake of clicking on the WORK button will be given a set of homework problems relating to the unit being studied. Clicking on the PROJ button provides them with a mini-project to carry out that relates to the topic being studied.

Authoring tools will allow instructors to add their own lessons to the end of a page of the lesson book. Those lessons could be as simple as some text to read, could be datasets that launch a statistics package (including one other than Data Desk, if they prefer), or a page with a URL -- for example to administer a quiz or provide more information from a course home site.

It is hard to think of a subject which lends itself better to multi-media presentation than statistics. Having Paul Velleman, DataDesk, "Against all Odds", numerous activities, and the Web at your fingertips will present real competition to the conventional way of learning statistics.

It will be interesting to see how this CD-ROM will be used. For a Chance course, it would permit the instructor to spend most of class time discussing news articles and carrying out activities, letting Paul teach the basic concepts. One can even imagine that some students at Dartmouth, who pay about $3000 to take an introductory statistics course, might choose to save $2958 by learning their statistics from Paul!

ActivStats was available and tested this Spring with a Mac version. A fully cross-platform Mac/Win95/WinNT release will be available in July. Teachers can now obtain a preliminary version of the July release by contacting Bill Danon at awi-info@aw.com or 617-944-3700x2563 (e-mail preferred).

Disclosure: Paul was one of our favorite Dartmouth undergraduates who acted as John Kemeny's statistical advisor in introducing coeducation at Dartmouth.


Students at Dartmouth this year paid about $22,000 for tuition and $27,000 for tuition, room and board. They typically take three courses a term but they pay the same tuition no matter how many they take. They are required to take 35 courses to graduate. If ActivStats could replace one of these courses, how would you estimate the saving to the student?
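
Two back-of-envelope ways of pricing a single course are sketched below. The tuition, courses per term, and courses required come from the question; the number of terms per year and the number of years are our assumptions and are marked as such in the code.

```python
tuition_per_year = 22_000   # from the question
courses_per_term = 3        # from the question
courses_to_graduate = 35    # from the question
terms_per_year = 3          # ASSUMPTION: not stated in the question
years_to_graduate = 4       # ASSUMPTION: not stated in the question

# Price of one course as a share of one year's tuition
per_course_in_year = tuition_per_year / (courses_per_term * terms_per_year)

# Price of one course as a share of the whole degree
per_course_over_degree = tuition_per_year * years_to_graduate / courses_to_graduate

print(f"{per_course_in_year:.2f} vs {per_course_over_degree:.2f} dollars per course")
```

Neither figure is a saving unless skipping a course actually reduces what the student pays, which is the point of the question, since tuition is the same no matter how many courses are taken in a term.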

Probability books have been slower to provide interactive material for their courses, but here is one that does.

Interactive Probability.
Kyle T. Siegrist
Wadsworth, Boston 1997
ISBN: 0-534-26568-5, $32.95

The aim of this book is to introduce probability by having the students experiment with a series of basic probability problems and models and to teach basic probability concepts in the context of these models. A software package comes with the book that runs on a PC. It includes interactive programs covering such experiments as: Buffon's needle experiment to estimate Pi, probabilities of getting the various poker hands, dice and urn experiments, the infamous Monty Hall problem, Bernoulli trials, Poisson processes, random walk, and interacting particle systems representing the spread of a fire and a voter model.
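
To give a flavor of this kind of experiment, here is a minimal sketch of Buffon's needle in Python. It is our own illustration, not Siegrist's program, and it assumes a needle no longer than the spacing between the lines. It also uses the value of pi to draw a random angle, which is a convenience rather than a requirement of the experiment.

```python
import math
import random

def buffon_pi(throws=200_000, needle=1.0, spacing=1.0, seed=0):
    """Estimate pi by dropping a needle on a floor ruled with parallel lines.
    Requires needle <= spacing, so that P(cross) = 2*needle / (pi*spacing)."""
    rng = random.Random(seed)
    crossings = 0
    for _ in range(throws):
        centre = rng.uniform(0, spacing / 2)  # distance from needle centre to nearest line
        angle = rng.uniform(0, math.pi / 2)   # acute angle between needle and the lines
        if centre <= (needle / 2) * math.sin(angle):
            crossings += 1
    return 2 * needle * throws / (spacing * crossings)

estimate = buffon_pi()
print(estimate)
```

Changing `throws` shows how slowly the estimate settles down, which is the kind of exploration the book's software is meant to encourage.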

The simulations follow a consistent format that allows the student to change the number of simulations, what is printed out, what is graphed etc. The student can export the output as data and analyze it using a standard statistical package. Siegrist hopes, by this, to show students that, just as understanding statistics requires an understanding of probability, understanding probability models requires an understanding of statistics.

In the book you find a discussion of each model, the probability ideas introduced by the model, and related exercises. An appendix contains a systematic account of the basic concepts of probability with exercises for the student.

Interactive Probability could be used as the sole text for an introductory probability course or as a probability laboratory for a more standard text.

The software accompanying this book sets a high standard and hopefully will encourage other authors to include similar laboratory materials in their texts.

Just as we all have to express our indebtedness to Feller when we write a probability book, so too we all, as does Siegrist, express our indebtedness to David Griffeath and Bob Frisch for their pioneering probability software GASP.

You can see an Applet version of Siegrist's Buffon's needle experiment and the beginnings of a similar statistical project by going to his homepage: http://www.math.uah.edu/~siegrist/

Of course, to keep up with the statisticians, probabilists should also consider data from real life applications and discuss, in their texts, some of the real world practical problems in applying probability models. The current issue of "Chance Magazine" shows several examples that would enhance an introductory probability course. Here are two of these:

A statistician reads the sports page.
Chance Magazine, Vol. 10, No. 1, Winter 1997 p. 38
Hal S. Stern

Hal Stern describes a Markov Chain model for baseball. This Markov Chain has 25 states: the 24 states corresponding to the 8 possibilities for the three bases being occupied or not and the 3 possibilities for the number of outs, 0, 1, or 2 and a 25th state corresponding to the end of the inning. The transition matrix is estimated from 1989 data. This matrix is slightly different for the American and National League because of differences in the rules. Using this model, you can calculate the probability of scoring and the expected number of runs scored for each starting state. As a check on the model, Stern uses his model to find the expected number of runs in a game and shows that these are consistent with the average number of runs per game for the 1989 season.

Stern discusses strategic decisions on which the model can shed some light. For example, if there are runners on first and second and no outs, the model gives a 56% chance of scoring and an expected number of runs scored of 1.36. A successful sacrifice would lead to the state with players on 2nd and 3rd and one out. From this state, the model predicts a 72% chance of scoring and an expected number of runs of 1.42. This shows that a successful sacrifice would increase both the chance of scoring and the expected number of runs. Of course one needs to take into account the chance that the sacrifice succeeds. Stern observes that, if the chance of a successful sacrifice is significantly less than 75%, then a sacrifice is not a good idea. If the chance is 75%, then attempting a sacrifice will lead to an increase in the probability of scoring but a decrease in the expected number of runs, so some additional judgment is required.

Stern points out limitations of the model; for example, it neglects known variation in the performance of batters. But he argues that it is a good place to start in considering baseball strategies.


How could you calculate the expected number of runs scored when the Markov chain is in a particular state?
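
One possible approach to this question: let Q be the matrix of transition probabilities among the 24 non-final states and let r(i) be the expected number of runs scored on the next plate appearance from state i. The expected number of runs v(i) for the rest of the inning then satisfies v = r + Qv. The Python sketch below illustrates the idea on a deliberately tiny chain whose states are just the number of outs. The probabilities and the names P_OUT and RUNS_IF_NO_OUT are made up for illustration; they are not Stern's 1989 estimates.

```python
# Toy absorbing Markov chain: the transient states are 0, 1, 2 outs
# (bases are ignored). All numbers are hypothetical, NOT Stern's estimates.
P_OUT = 0.7           # assumed chance that a batter makes an out
RUNS_IF_NO_OUT = 0.4  # assumed average runs scored when the batter is not out

# Q[i][j]: chance of moving from i outs to j outs in one plate appearance.
# From 2 outs, an out leads to the end-of-inning state, which is not listed.
Q = [
    [1 - P_OUT, P_OUT, 0.0],
    [0.0, 1 - P_OUT, P_OUT],
    [0.0, 0.0, 1 - P_OUT],
]

# r[i]: expected runs scored on the next plate appearance from state i.
r = [(1 - P_OUT) * RUNS_IF_NO_OUT] * 3


def expected_runs(Q, r, tol=1e-12, max_iter=10_000):
    """Solve v = r + Q v by repeated substitution (fixed-point iteration)."""
    n = len(r)
    v = [0.0] * n
    for _ in range(max_iter):
        new_v = [r[i] + sum(Q[i][j] * v[j] for j in range(n)) for i in range(n)]
        if max(abs(a - b) for a, b in zip(new_v, v)) < tol:
            return new_v
        v = new_v
    raise RuntimeError("iteration did not converge")


runs = expected_runs(Q, r)
for outs, value in enumerate(runs):
    print(f"{outs} outs: expected runs for the rest of the inning = {value:.3f}")
```

With the full 25-state chain the same equation applies; only Q and r become larger, and a linear-equation solver could replace the iteration.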

To wait or not to wait: that is the question.
Chance Magazine, Vol. 10, No. 1, Winter 1997 p. 38
Nico M. van Dijk

This article discusses, in an amazingly simple way, the basic concepts of queuing theory and provides some real life problems involving waiting times.

For example, the famous bus paradox is explained as simply the curse of variation. If buses arrive exactly every 20 minutes at your corner and you arrive at a random time, you will have to wait an average of 10 minutes. However, the slightest variation in the arrival times, keeping the same average of 20 minutes between buses, increases your average waiting time. If the variation is large, your average waiting time can increase significantly. For example, if the times between arrivals are 3, 3, and 54 minutes in an hour, the average time between arrivals is still 20 minutes. But if you come at a random time in this hour, your average waiting time will be 3/60 x 1.5 + 3/60 x 1.5 + 54/60 x 27 = 24.45 minutes. Van Dijk observes that this example can be used to explain why cancers discovered by screening tend to be the more slowly growing ones.
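
The waiting-time calculation can be checked with a few lines of Python. This is a minimal sketch that assumes the rider arrives uniformly at random over the hour; the function name mean_wait is ours, not Van Dijk's. A rider lands in a gap of length g with probability g/total and then waits g/2 on average, and evaluating the weighted sum for gaps of 3, 3, and 54 minutes gives 24.45 minutes.

```python
def mean_wait(gaps):
    """Average wait for a rider arriving uniformly at random.

    The rider falls in a gap of length g with probability g / total
    and then waits g / 2 on average.
    """
    total = sum(gaps)
    return sum((g / total) * (g / 2) for g in gaps)


regular = mean_wait([20, 20, 20])   # buses exactly 20 minutes apart
irregular = mean_wait([3, 3, 54])   # same 20-minute average gap, high variation

print(f"regular schedule:   {regular:.2f} minutes")
print(f"irregular schedule: {irregular:.2f} minutes")
```

Long gaps count twice: they are more likely to be hit, and they mean a longer wait once hit.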

Van Dijk shows how the desire to operate at near-peak capacity explains why you often have to wait so long when you go to your friendly doctor. He suggests that clinics would do better to allow for an occasional free moment for the doctor. The doctor could probably find ways to productively fill this free time and his patients would be in a much better mood when he sees them.

Van Dijk shows that you can use variation to fight variation. He describes the success of a Dutch supermarket that electronically counts the number of people entering. At regular intervals, cashiers are sent on duty or taken off duty depending on whether the number entering in this period is over or under the average arrival rate. Whenever customers cannot find a checkout with fewer than three customers standing in line, they are given all their items free! When this policy was implemented, gross sales increased by 20%.


Why does it so often appear that you chose the wrong line in the supermarket?

Classroom Capsules: What is the margin of error of a poll?
The College Mathematics Journal, May 1997, p. 201
Bennett Eisenberg

Eisenberg notes that opinion polling stories often report a difference in percentages. For example, during the 1996 presidential campaign, Clinton's lead over Dole was often estimated. At one point, "USA Today" showed a 20-point difference, while on the same day CNN found a 10-point difference. Each claimed a margin of error of +/- 3.5%. Eisenberg's local newscaster observed that someone's margin must be wrong.

The +/- 3.5% applies to individual percentages for each candidate, not for the difference between candidates. Eisenberg presents a mathematical analysis for the true margin of the latter. For an individual proportion 'p', he derives the usual bound of 1/sqrt(n) on the margin of error for a 95% interval. For a difference 'p - q', however, he shows that the bound becomes 2/sqrt(n). But this is for differences within one poll. He goes on to show that, if 'p' and 'q' are close to 1/2, then the estimates for 'p - q' from two different polls can be expected to differ by 2*sqrt(2)/sqrt(n) or more about 5% of the time.

For the example cited earlier, Eisenberg estimates from the +/- 3.5% individual margins that 'n' is around 900. Then 2*sqrt(2)/sqrt(n) is roughly .10, so the ten percentage point difference in the estimated lead does not immediately suggest that someone's margin is wrong!
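
The three bounds are easy to tabulate. The Python sketch below simply evaluates the formulas quoted above for n = 900; the function names are ours, and the formulas are taken on trust from the summary rather than rederived. For n = 900 they come to about 3.3, 6.7, and 9.4 percentage points.

```python
import math


def margin_single(n):
    """Bound on the 95% margin of error for one proportion p."""
    return 1 / math.sqrt(n)


def margin_difference(n):
    """Bound on the margin for the lead p - q within a single poll."""
    return 2 / math.sqrt(n)


def margin_between_polls(n):
    """Gap between two polls' estimates of p - q that is exceeded
    about 5% of the time when p and q are near 1/2."""
    return 2 * math.sqrt(2) / math.sqrt(n)


n = 900
print(f"single proportion:   +/- {100 * margin_single(n):.1f} points")
print(f"lead within a poll:  +/- {100 * margin_difference(n):.1f} points")
print(f"lead across 2 polls: {100 * margin_between_polls(n):.1f} points")

# Sample size implied by treating 3.5% as exactly 1/sqrt(n).
print(f"n implied by a 3.5% margin: {(1 / 0.035) ** 2:.0f}")
```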

For additional discussion of this problem see:

Poll Faulting.
Chance Magazine, Winter 1993
Stephen Ansolabehere and Thomas R. Belin

Letter to the Editor.
Chance Magazine, Winter 1994, p. 3
Peter Lachenbruch

Sampling errors in political polls.
Teaching Statistics, Autumn 1994, pp. 71-73
Zbigniew Kmietowicz


Does the presence of a third candidate in a race affect the margin of error for a candidate? What about the margin of error for the difference between the two leading candidates?

Big MacCurrencies:
The Economist, 12 April 1997

For more than 10 years now, the "Economist" has published an annual guide to whether international currencies are over- or under-valued. The theory of "purchasing power parity" (PPP) holds that currencies should move to equalize the prices of an identical basket of goods in each country. The Economist's basket consists of a single good: a McDonald's Big Mac sandwich! Tongue-in-cheek, the article notes that McDonald's recent discounting strategy threatens to undermine this key currency benchmark.

Data are provided comparing the US with 32 countries. Reproduced below is a portion of a table from the article.

                  Local         Implied      April 7 
Country           price         PPP          exchange rate
US                  $ 2.42      ---            ---
China            Yuan 9.70      4.01          8.33
Switzerland       SFr 5.90      2.44          1.47
Japan               Y 294       121           126

Thus, in Beijing a Big Mac costs 9.70 Yuan, compared with $2.42 in the US. This gives an implied parity of 9.70/2.42 = 4.01 Yuan to the dollar. Compared to the actual April 7 exchange rate of 8.33 Yuan to the dollar, this suggests that the Yuan is undervalued by 52%. This is the most extreme case of undervaluation. At the other extreme, the Swiss franc appears to be overvalued by 66%. Over the last two years the dollar has moved closer to its PPP against most currencies. For example, two years ago the index showed the Japanese yen was 100% overvalued against the dollar. As seen above, the yen is now close to its PPP of 121 Yen.
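
The index arithmetic for the three rows of the table can be reproduced as follows. This is a small Python sketch using only the figures quoted above; the function names and the sign convention (positive means overvalued against the dollar) are our own choices.

```python
US_PRICE = 2.42  # dollars, from the table

# country: (local Big Mac price, April 7 exchange rate in local units per dollar)
table = {
    "China": (9.70, 8.33),
    "Switzerland": (5.90, 1.47),
    "Japan": (294, 126),
}


def implied_ppp(local_price, us_price=US_PRICE):
    """Exchange rate that would equalize the Big Mac price."""
    return local_price / us_price


def valuation(local_price, rate, us_price=US_PRICE):
    """Fraction by which the local currency is over (+) or under (-) valued."""
    return implied_ppp(local_price, us_price) / rate - 1


for country, (price, rate) in table.items():
    print(f"{country:12s} PPP = {implied_ppp(price):7.2f}   "
          f"valuation = {100 * valuation(price, rate):+.0f}%")
```

The output agrees with the 52% undervaluation of the Yuan and the 66% overvaluation of the Swiss franc, and puts the yen about 4% under its PPP.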

Over the past 10 years, the Big Mac index has predicted the direction of exchange-rate movements for 8 of 12 currencies in large industrial economies. It was right in 6 of the 7 cases where the rate moved more than 10%. The article notes that this is better than the record of some highly-paid currency forecasters.


(1) How impressed should we be by the 8 out of 12 record over the last ten years? Do you think the 6 out of 7 subset represents "data-snooping", getting it right when it matters, or getting it right when it's easy?

(2) Do you find this discussion to be more or less serious than the ongoing "Darts vs. Pros" contest for picking stocks?
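
For question (1), one rough benchmark is a forecaster who guesses the direction of each currency by flipping a fair coin. The Python sketch below computes how often such a forecaster would do at least as well as the index. Treating the currencies as independent 50-50 guesses is our simplifying assumption, not something claimed in the article.

```python
from math import comb


def tail_prob(k, n, p=0.5):
    """P(at least k successes in n independent trials with success chance p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


p_8_of_12 = tail_prob(8, 12)
p_6_of_7 = tail_prob(6, 7)

print(f"P(8 or more right out of 12) = {p_8_of_12:.3f}")
print(f"P(6 or more right out of 7)  = {p_6_of_7:.3f}")
```

Under this assumption a coin flipper gets 8 or more of 12 right about 19% of the time, and 6 or more of 7 right about 6% of the time.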

Study gives women guide on hormones.
The Boston Globe, 9 April 1997, p. A1
Peter J. Howe

A study published in the "Journal of the American Medical Association" finds that more than 99 in 100 healthy women undergoing menopause would benefit from hormone therapy. Researchers have devised a checklist of complicating factors like blood pressure, family breast cancer history, and risk of hip fracture. In consultation with their doctors, women could then predict whether they would increase their life expectancy more than 6 months by taking hormone pills.

Hormone therapy has been controversial because, while it can reduce the risk of heart disease, osteoporosis, and perhaps Alzheimer's disease, it also increases the risk of breast (and sometimes uterine) cancer. The researchers acknowledge the widespread fear of breast cancer but point out that there are only 43,000 breast cancer deaths per year, compared to 65,000 deaths from hip fractures and 223,000 from heart disease.

Dr. Graham Colditz of the Harvard School of Public Health headed a 1995 study that found taking hormones at menopause increases a woman's lifetime risk of breast cancer by 30 to 40%, from an absolute risk of 1-in-400 to 1-in-270. While agreeing with the present study's findings, he observes that they are premised on the assumption that a woman would not care whether she has a heart attack or breast cancer. Or, to put it on a societal level, how many extra breast cancers per 1000 women should be tolerated so that the overall group lives longer?


(1) How exactly does one go about translating an increase in absolute risk from 1-in-400 to 1-in-270 into a percentage increase?

(2) Is it possible to strike a balance between overall increase in life expectancy with an increase in breast cancer rates? Should the medical profession issue a recommendation? Is it up to each person to decide?

(3) Though it does not say so explicitly, the article seems to set a 6-month gain in life expectancy as the threshold at which the therapy should be considered seriously. What do you think of this?
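
For question (1), the most direct calculation is the ratio of the two absolute risks. The Python sketch below works it out with exact fractions; the variable names are ours. The ratio gives an increase of about 48%, which is not the 30 to 40% quoted, and the article as summarized here does not say how the two figures are reconciled.

```python
from fractions import Fraction

baseline = Fraction(1, 400)       # lifetime risk without hormones, as quoted
with_hormones = Fraction(1, 270)  # lifetime risk with hormones, as quoted

# Relative increase: how much larger the new risk is than the old one.
relative_increase = with_hormones / baseline - 1

# Absolute increase, expressed as extra cases per 1000 women.
extra_per_1000 = (with_hormones - baseline) * 1000

print(f"relative increase: {float(relative_increase):.1%}")
print(f"extra cases per 1000 women: {float(extra_per_1000):.2f}")
```

The second figure, a little more than one extra case per 1000 women, is the quantity that Colditz's societal question is about.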

A taxpayer's guide to crime and punishment.
US News & World Report, 21 April 1997, p. 38
Ted Gest

The lead sentence asserts: "The Federal government fights random violence with random spending." A report commissioned by the Justice Department finds that a wide range of programs, from prison construction to night basketball for youths, have been funded without any meaningful attempt to assess their effectiveness. Thus, while a recent survey has found a 12.4% drop in violence, it is not clear what is responsible for the improvement.

A general finding from the Justice Dept. report is that focused programs, which target specific types of crimes or convicts, tend to be more effective than broad-based programs. For example, community crime watches have not shown conclusive results, whereas renters have had success using anti-nuisance laws against landlords who permit drug sales on their property.

In conclusion, the researchers recommend that 10% of anti-crime funds be dedicated to scientific study of what measures actually do work.


(1) What does the term "random violence" mean? In particular, do you think it can be addressed by measures that target specific types of crimes?

(2) How do you think effectiveness of programs was compared here (take the neighborhood watch vs. the anti-nuisance law example)? Isn't there a danger of bias in favor of programs producing short-term improvement?

Surge in teen-age smoking left an industry vulnerable.
The New York Times, 20 April 1997, A1
Barnaby J. Feder

The most recent edition of the University of Michigan's Monitoring the Future Survey has found that teen-age smoking rates are on the rise. The survey found that, while smoking rates for the 1990's are still lower than in the 1970's, the percentage of 12th graders who smoked daily rose 20% from its 1991 level, to 22%. The rate among 10th graders jumped 45% to 18.3% and the rate for 8th graders was up 44% to 10.4%.
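
It can help students to back out the 1991 rates implied by these figures. The Python sketch below assumes that each quoted increase is a relative change in the rate and not a change in percentage points; the summary does not say so explicitly, and the function name is ours.

```python
def earlier_rate(current_pct, relative_increase_pct):
    """Rate before the rise, if the rise is a relative (not point) increase."""
    return current_pct / (1 + relative_increase_pct / 100)


# grade: (current daily smoking rate in %, quoted increase in %)
quoted = {
    "12th": (22.0, 20),
    "10th": (18.3, 45),
    "8th": (10.4, 44),
}

implied_1991 = {grade: earlier_rate(now, rise) for grade, (now, rise) in quoted.items()}

for grade, rate in implied_1991.items():
    now = quoted[grade][0]
    print(f"{grade} grade: about {rate:.1f}% in 1991, "
          f"a rise of {now - rate:.1f} percentage points")
```

Under this reading, the 20% rise for 12th graders corresponds to an increase of fewer than 4 percentage points.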

Rising youth smoking rates have been cited by the FDA and President Clinton as evidence that the industry is marketing its products to youths and should be restricted by the FDA. The rates are also fueling demands in many states and nationally for higher taxes on tobacco, based on research showing that price increases typically discourage teenage smokers.

Just what has caused the teen-age smoking rate to rise so sharply remains unknown. The tobacco industry says the increase is due to a broad range of social forces and also notes teen-agers' naturally rebellious reaction to the increasing efforts to stop them from smoking. While critics acknowledge the rebelliousness argument, they say that the industry itself is still the most important factor. According to the Federal Trade Commission, the industry's spending on domestic advertising and promotions soared to $4.83 billion in 1994, up 250%, after adjusting for inflation, from $361 million in 1970. While tobacco companies claim that much of the money goes into promotions to encourage retailers to run sales or to display particular brands and signs more prominently, critics like John Pierce, head of the Cancer Prevention Center at the University of California at San Diego, claim that spending rose most rapidly in the 1980's, when the decline in youth smoking was halted. In addition, the surge in teen-age smoking in the 1990's coincided with a sharp expansion by both Reynolds and Philip Morris in giveaways of items, like T-shirts, in return for coupons accumulated by buying their cigarettes. Research showed that the companies had limited success in preventing distribution of their merchandise to children -- 30% of teen-age smokers have it -- and that the items are just as appealing to teen-agers as to adults.

The increased smoking rates since 1991 are expected to translate into tens of thousands of additional early deaths, because 1 out of 3 teen-age smokers is expected to develop a fatal tobacco-related illness. With these increases and the current smoking rates, five million people now younger than 18 will die of tobacco-related illnesses, according to the most recent projections from the Centers for Disease Control and Prevention in Atlanta.

References for the remarks by Ken Steele.

Cohen, J. (1977). Statistical power analysis for the behavioral sciences (rev. ed.). NY: Academic Press.

Carstens, C. B., Huskins, E. & Hounshell, G. W. (1995). Listening to Mozart may not enhance performance on the Revised Minnesota Paper Form Board Test. Psychological Reports, 77, 111-114.

Kenealy, P., & Monsef, A. (1994). Music and IQ tests. The Psychologist, 7, 346.

Newman, J., Rosenbach, J. H., Burns, K. L., Latimer, B. C., Matocha, H. R., & Vogt, E. E. (1995). An experimental test of "the Mozart effect": does listening to his music improve spatial ability? Perceptual & Motor Skills, 81, 1379-1387.

Rauscher, F. H., Shaw, G. L., & Ky, K. N. (1993). Music and spatial task performance. Nature, 365, 611.

Steele, K. M., Ball, T. N., & Runk, R. (1997). Listening to Mozart does not enhance backwards digit span performance. Perceptual & Motor Skills, 84, 1179-1184.

Stough, C., Kerkin, B., Bates, T., & Mangan, G. (1994). Music and spatial IQ. Personality & Individual Differences, 17, 695.

Wolf, F. M. (1986). Meta-analysis: Quantitative methods for research synthesis. Beverly Hills, CA: Sage.

Please send comments and suggestions to jlsnell@dartmouth.edu.



MATC Workshops

The Math Across The Curriculum (MATC) Project at Dartmouth College announces two concurrent workshops for June 26-28, 1997. Priority for these workshops will be given to teams of two or more individuals from different departments in the same institution. Room and board will be covered by Dartmouth, as will the travel expenses of the second member of each pair of participants coming from the same institution.

Mathematics and Art together in the Classroom:

Faculty of both mathematics and art departments are welcome to attend. There are two foci. "Pattern," a course in symmetry and design, introduces students to some of the concepts in group theory and design. "Geometry in Art and Architecture" relates these two subjects historically. This workshop will be in a participative, hands-on format. Participants will devote a substantial portion of the time to reworking these ideas for use at their own institutions. Materials Fee: $100.

Math, Philosophy and Literature:

This workshop has two components. One explores the mathematics and philosophy of infinity as it has developed over time. The other explores Renaissance science fiction, astronomy and mathematics. Participants will receive a complete reader for each of these courses, as well as exercises and other activities that the students experienced. Discussion will center both on the issues central to these topics and on the ways in which this kind of material can be used in the classroom. Materials Fee: $100.

For more information: see website at:
or send e-mail to: