CHANCE News 6.03

(4 February 1997 to 20 February 1997)


Prepared by J. Laurie Snell, with help from Bill Peterson, Fuxing Hou, Ma.Katrina Munoz Dy, Kathryn Greer, and Joan Snell, as part of the Chance Course Project supported by the National Science Foundation.

Please send comments and suggestions for articles to jlsnell@dartmouth.edu.

Back issues of Chance News and other materials for teaching a Chance course are available from the Chance web site:


NOTE: Part 2 of Chance News 6.03 will be sent out later this week.

The current issues of Skeptic (Vol. 4, No. 4, 1996) and Skeptical Inquirer (Vol. 21, No. 2, March/April 1997) have tributes to Carl Sagan, who died December 20th. Perhaps the most impressive tribute can be found in Skeptic's "In his own words". Here are some of these words from his last book, "The Demon-Haunted World", p. 28.


We will always be mired in error. The most each generation can hope for is to reduce the error bars a little, and to add to the body of data to which error bars apply. The error bar is a pervasive, visible self-assessment of the reliability of our knowledge. You can often see error bars in public opinion polls. Imagine a society in which every speech in the "Congressional Record", every television commercial, every sermon had an accompanying error bar or its equivalent.

Carl Sagan

James Randi's tribute to Sagan reminded us of our favorite Nova video "Secrets of the Psychics." In this video Randi explains how a seemingly paranormal phenomenon can be duplicated with trickery. For example, Uri Geller claimed to be able to bend a spoon by pure thought. Randi shows that not even much thought is necessary. The video ends with Randi strolling through the streets of Moscow after his meetings with Russian psychics. He leaves us with the philosophical conclusion: "Reality isn't so awful. Make the most of it." The tape is available from WGBH at 1-800-255-9424.

Contents of Part 1


A growing number of applets illustrate basic concepts of probability and statistics. These applets can be run from any of the standard browsers. In the Chance Database under "teaching resources" you will find links to a number of applets that we have found to work well and that we feel would be useful in an introductory course.

Paul Caprioli sent us the following article:

A professor divides his class in two to test value of on-line instruction.
The Chronicle of Higher Education, 21 February 1997, pA23
Kelly McCollum

Jerald Schutte, a sociology professor at California State University at Northridge, has run an experiment to assess the value of on-line instruction. He randomly divided his statistics class into two groups. One half took the course in a traditional classroom setting. The other half completed a web-based course, in which problems were assigned by e-mail, students collaborated in small groups and consulted with the professor only through on-line "chat rooms." These virtual students went to the classroom only for the midterm and final exams, on which they out-performed the traditional students by an average of 20%!

One of the virtual students commented that she appreciated not having to feel intimidated by other students in the classroom. Schutte concurs, noting that it is easier to ask questions in the relative anonymity of the chat rooms. Another student, however, commented that workload was daunting. Indeed, Schutte expressed surprise that none of the virtual students had dropped the course in the face of the increased load.


(1) The article doesn't say how many students were in the course. How large would the groups need to be for the 20% difference in averages to be convincing?

(2) The article mentions that Schutte now wants to find out whether the perceived improvement was due to the on-line nature of the course or to the fact that the virtual students spent more time collaborating with classmates than the traditional students did. How would you design a study to determine which it was?

(3) Suppose future research shows that you can learn just as well taking courses on the web as in a traditional college class. Do you think this would have any effect on college enrollments?

Paul Alper suggested the next article and an article in Part 2, and provided ideas for discussion questions for these articles.

Student proves that S.A.T. can be: (D) wrong.
The New York Times, 7 Feb. 1997, A1
Mary B. Tabor

Colin Rizzio, a high school student from Peterborough, N.H., wondered why his answer to one of the math questions on his S.A.T. test was marked incorrect. He consulted his teacher, who said he was correct, so he sent an e-mail message to ETS. He did not hear anything for a long time, so he forgot about it. The next thing he knew, he was an instant celebrity, appearing on the ABC program "Good Morning America". On this program he was given a letter of acceptance from the president of Clarkson University, where he had applied. The people at ETS had finally read their e-mail and agreed that, for the first time since 1982, they had made a mistake. The tests of the 43,000 students had to be re-scored. Here is the problem in question:

Directions: The following question relates to two quantities: one in Box A and one in Box B. You are to compare the two quantities and on the answer sheet fill in oval

A if the quantity in Box A is greater,

B if the quantity in Box B is greater,

C if the two quantities are equal,

D if the relationship cannot be determined from the information given.

Consider the sequence: 1, a, a^2, a^3,...,a^n

The first two terms of the sequence are 1 and a, and each succeeding term is the product of a and the preceding term.

  Box A:  The median of the sequence if n is a positive even integer

  Box B:  a^(n/2)

The authors of the test meant to require that a be positive but did not say so. If a is positive, the median of the set of numbers 1, a, a^2, a^3,...,a^n for n even is the middle number a^(n/2). Thus ETS considered C the correct answer. Rizzio realized that this is only correct if a is assumed to be positive and, since this was not assumed, the correct answer is D.

Brian O'Reilly, a spokesman for the board, said that, since the board felt the question was flawed, all students who took the test will get credit for the question. (This includes those who did not answer the question.) But he said that only 45,000 (about 13% of those who took the test) will see their scores go up because raw scores are rounded and top and bottom scores are affected far more by individual answers. Most of the increases will be 10 points, but a few at the very top and bottom will increase 20 or 30 points. O'Reilly stated that he expects the error will have "little or no effect on college admission decisions."
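Rizzio's point is easy to check numerically. The Python sketch below is only an illustration: the function name compare_boxes and the sample values of a and n are our own choices, not part of the test. It computes the median of 1, a, ..., a^n and compares it with a^(n/2).

```python
from statistics import median

def compare_boxes(a, n):
    """Return the answer-sheet letter (A, B or C) for integer a and positive even n."""
    if n <= 0 or n % 2 != 0:
        raise ValueError("n must be a positive even integer")
    sequence = [a ** k for k in range(n + 1)]  # 1, a, a^2, ..., a^n
    box_a = median(sequence)                   # n + 1 terms (odd count), so the median is a term
    box_b = a ** (n // 2)
    if box_a > box_b:
        return "A"
    if box_b > box_a:
        return "B"
    return "C"

# Sample values chosen for illustration only.
for a, n in [(2, 4), (-2, 4), (-2, 2)]:
    print(a, n, compare_boxes(a, n))
```

With a = 2 the two boxes agree, but with a = -2 either box can hold the larger quantity, depending on n. That is why D is the right answer when a is unrestricted.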

This error caused a lot of people to discuss what the median of a set of numbers is, and we suspect that a lot more people understand the complexities of the median now than before February 7.

In trying to understand why the change affected the scores of students so differently, we realized we would have to understand how SAT scores are determined from the raw scores. With the help of Charlie Lewis and Ida Lawrence at ETS we learned something about this, but also that it is a complicated question. Here is a simplified description gleaned from their explanations.

The scoring of an exam determines a "raw score" using the following rules:

ETS gives several forms of the same test during the year. They try to make these as similar as possible, as far as difficulty is concerned, over the whole range of abilities. Obviously this is not completely possible, so they go through a process called "equating" to adjust the raw scores before determining the transformation from raw scores to SAT scores between 200 and 800.

Each test includes an "anchoring section" which is a miniature form of the total test. Students answer this part but it does not count towards their score. ETS then looks at two samples, one for the new test and one for a previous test which is linked to the new test by having the same anchoring section. They use the earlier test, already equated, to adjust the raw scores for the new test, to take into account that the new test might have been a bit easier or harder than the previous tests and the students might have been a bit better or worse. From the mapping of raw scores to SAT scores, already determined for the previous test, they determine a similar mapping from the adjusted raw scores of the new test to SAT scores. They repeat this for several earlier tests linked to the new test and average the results to get a final mapping.


(1) The Dartmouth admission director was also quoted in our local paper as saying that the 30 point difference was not significant in the admission process. Do you believe this?

(2) Does it still seem a little weird to you that giving all students credit for this single problem could have no effect on the S.A.T. score for one student and increase the score by 30 points for another?

(3) Another obvious solution for ETS would have been to re-score the test taking D as the correct answer. Why do you think ETS did not do this?

(4) Bob Norman pointed out that the problem asks for the median of a "sequence" of numbers. He remarked that, while a "set" of numbers has a median, he does not believe that a sequence of numbers does. Do you agree?

(5) John Kemeny (and probably others) suggested a definition of a central value K for any subset A of a set U on which a distance d(a,b) between points a and b has been defined. (d satisfies the usual axioms for a distance.) K is defined to be any x in U which minimizes the sum of distances d(x,y) over all y in A. (Of course, Kemeny didn't call it K.)

(a) If A is a subset of the real numbers and d(a,b) = |a-b|, what is K? What if d(a,b) = (a-b)^2?

(b) Let U be the set of all points in the United States and d(a,b) the ordinary distance. Where do you think K would be if we take A to be the set of all the people in the United States? In other words, where should a national town meeting be held to minimize the total travel? Answer the same question if d(a,b) is the square of the ordinary distance from a to b. How might you interpret this? Which K would be nearest Chicago?
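One way to explore question (5a) experimentally is a brute-force search. The Python sketch below is a minimal illustration under our own assumptions: the data set A is made up, the candidate set U is restricted to the integers 0 to 100, and the helper name central_value is hypothetical.

```python
def central_value(points, candidates, distance):
    """Return the candidate x that minimizes the total distance to the points."""
    return min(candidates, key=lambda x: sum(distance(x, y) for y in points))

A = [1, 2, 3, 4, 100]   # hypothetical data, chosen to include one outlier
U = range(0, 101)       # assumed candidate locations: the integers 0..100

k_abs = central_value(A, U, lambda a, b: abs(a - b))     # d(a,b) = |a-b|
k_sq = central_value(A, U, lambda a, b: (a - b) ** 2)    # d(a,b) = (a-b)^2
print(k_abs, k_sq)
```

For this data the first search returns 3, the median of A, and the second returns 22, the mean of A. The single outlier pulls the second value far more than the first, which is worth keeping in mind for question (5b).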


Coping with public perception.
Newsday, 4 February 1997, pB22
Sylvia Adcock

Air travel has been growing at nearly 6% a year, and some experts expect that the number of flights worldwide will double or triple over the next twenty years. The accident rate, measured in number of plane crashes per million flights, has been stable for the last ten years. Even if this rate stays unchanged--in other words, current safety levels are maintained--then in the future we may be facing one plane crash a week. Would the public tolerate such a figure? The article suggests that under this scenario it would be difficult to convince anyone that air travel is the safest mode of transportation.
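The arithmetic behind this projection can be sketched in a few lines of Python. This is a rough check, not the article's own calculation: it assumes that the number of flights grows at the same 6% rate as air travel, that the crash rate per flight stays constant, and it takes as a baseline the 1995 figure of one fatal commercial crash every 2.4 weeks quoted in the discussion questions. The variable names are ours.

```python
import math

annual_growth = 0.06               # growth rate quoted in the article
years = 20                         # horizon quoted in the article
weeks_between_crashes_1995 = 2.4   # 1995 figure for fatal commercial crashes

# Compound growth in the number of flights (assumed equal to growth in air travel).
growth_factor = (1 + annual_growth) ** years
doubling_time = math.log(2) / math.log(1 + annual_growth)

# With a constant crash rate per flight, crashes become more frequent in proportion.
weeks_between_crashes_future = weeks_between_crashes_1995 / growth_factor

print(round(growth_factor, 2), round(doubling_time, 1),
      round(weeks_between_crashes_future, 2))
```

Under these assumptions traffic slightly more than triples in twenty years (doubling roughly every twelve years), and the interval between fatal crashes shrinks from 2.4 weeks to about three-quarters of a week, the same order of magnitude as the "one plane crash a week" in the article.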

The aircraft manufacturer Boeing predicts that there will be a "major hull loss" accident (a crash that effectively "totals" an airplane) every seven to ten days as soon as 2005. Boeing's vice president for safety worries that this could erode public confidence in air travel to the point that industry growth might stop.

Arnold Barnett of MIT points out that the right reason to work on cutting down the accident rate is that accidents are inherently horrible, not public perception. He adds that most of the additional crashes will occur outside the US, where they will receive little attention from the US media, and thus may not panic the US public.


(1) The article notes that in 1995 a fatal commercial crash occurred somewhere in the world, on average, once every 2.4 weeks. Do you find this surprising? What is the relationship between "airline accidents", "fatal crashes" and "major hull losses"?

(2) A New Yorker article abstracted in CHANCE News (5.02) introduced the notion of "risk homeostasis." One example noted that airlines compensate for safety improvements in their planes and air traffic control systems by increasing the number of flights. Do you believe the public has a risk homeostasis level for air travel? Does it depend on the accident rate or on the total number of accidents?

(3) The New York Times recently reported (Business Travel, 5 February, pD3) that beginning February 28, the FAA will post safety data on its web site. However, there will be no overall ranking of airlines--individuals will have to sort through the data on their own. What effect do you think this will have on public perception? Why do you think the FAA chose not to give overall ranking? Do you think, given this data, you could come up with an overall rating?

Prevention overshadowed by treatment in heart study.
The Boston Globe, 19 February 1997, pA1.
Richard A. Knox

An analysis presented in this week's "Journal of the American Medical Association" found that US cardiac deaths dropped by 1/3 during the 1980s, but concludes that most of the progress is due to improved treatment rather than prevention. The improvement translated into a saving of 127,000 lives in 1990, but only 32,000 of these were attributed to lifestyle changes such as exercise, healthy eating and quitting smoking. This amounted to only 25% of the progress over the decade. Treatments such as clot-busting drugs and bypass surgery accounted for 43%; drugs to lower cholesterol and blood pressure accounted for 29%.

Cardiovascular deaths have been declining for three decades, and there has been a long debate over the relative contributions of healthier lifestyles and better medical management of the disease. Milton Weinstein of the Harvard School of Public Health, a co-author of the analysis, said: "It is fashionable in the public health community to bash treatment. I think our data show that treatment is making a difference. In this era of managed care...consumers have to be vigilant to be sure they are getting access to surgery."


(1) If a person at risk for cardiac disease goes on blood pressure medication and also starts eating healthier foods, how do you think researchers decide which saved his life?

(2) Accepting the breakdowns presented, should we conclude that in the long term, treatment is more effective than prevention for saving lives?

Car Talk.
Public Radio, 16 Feb. 1997
Tom and Ray Magliozzi

Tom and Ray, on their well-known public radio program, each week include a "puzzler" and the audience is invited to submit solutions. For their puzzler for the week of 2/14, after explaining why black pearls from the Seychelles are so expensive, Ray posed the problem:

I get 50 of these black pearls and I put them in a cigar box. And I get 50 faux white pearls. So, I've got 50 black pearls from the Seychelles and then I have 50 white pearls in another cigar box. And I tell my wife, "Look, I'm going to put these cigar boxes in front of you. You will be blindfolded and you will instruct me to open one or the other, either A or B. You will then pick a pearl out and, if it's a black one, you get the black pearls, and, if it's a white one, you get the cheap pearls that I had intended to buy you in the first place." So it's obvious, since there are 50 of each pearl, her chances are 50/50. But she can mix them up. She can put all the pearls in one box, she can put half the white ones in one box and half the black ones in the same box so she has 25 of each color in one box. Is there any way she can mix up these pearls to improve her chances beyond 50/50? That's the question.

This has always been one of our favorite problems because of its simple solution (put one black pearl in box A and all the remaining pearls in box B) but also because the proof that this is the optimal solution is so simple and elegant (see the discussion questions).

We (Laurie) first heard this problem at Cornell in 1950. A psychology graduate student was writing a thesis on how mathematicians attack problems. She had a theory that they were too prone to look for symmetry arguments. She gave the problem to the two renowned probabilists Doob and Feller. They both said they could not do any better than 1/2. When she gave the solution, they said they had assumed you had to put the same number in each box. She had made her point!
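The solution can also be confirmed by brute force. The Python sketch below enumerates every way of splitting the 50 black and 50 white pearls between the two boxes. It is an illustration under our own assumptions: each box is chosen with probability 1/2, choosing an empty box counts as not getting a black pearl, and the function name p_black is hypothetical.

```python
from fractions import Fraction

def p_black(black_a, white_a, black=50, white=50):
    """Chance of drawing a black pearl when box A holds (black_a, white_a)
    and box B holds the rest; each box is chosen with probability 1/2.
    Assumption: choosing an empty box counts as not getting a black pearl."""
    total = Fraction(0)
    for b, w in [(black_a, white_a), (black - black_a, white - white_a)]:
        if b + w > 0:
            total += Fraction(1, 2) * Fraction(b, b + w)
    return total

# Every possible content of box A; box B gets whatever is left.
allocations = [(b, w) for b in range(51) for w in range(51)]
best = max(allocations, key=lambda bw: p_black(*bw))
print(best, p_black(*best))
```

The search returns one black pearl alone in box A, with probability 74/99, or about 0.75. The equal split of 25 black and 25 white in each box gives exactly 1/2.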


(1) If we put an equal number of pearls in each box, the probability of getting a black pearl is only 1/2. Since putting one black pearl in box A and the rest in box B does better than 1/2, the optimal solution must have more black pearls than white in one box and more white pearls than black in the other. Show that the proposed solution maximizes the probability of getting a black pearl both in the box with more black pearls than white and in the box with more white pearls than black.

(2) We taught a probability course once with psychologist George Wolford. One day, when discussing an urn problem, we drew our traditional urns with black and white balls on the blackboard. George asked why we always called the black balls white and the white balls black. Why did he ask that?

Nurses' health study has worldwide impact.
The Boston Globe, 17 February 1997, pA1
Richard A. Knox

For the twenty years it has existed, the Nurses' Health Study has been a leader in women's health studies, and its ever-expanding scope has led to findings in a variety of health issues, such as heart attacks and breast cancer.

The study is a research-by-mail project that owes its success to 110,000 registered nurses across the nation who, for two decades, have filled out questionnaires mailed at two-year intervals. The questionnaires cover medical histories, diets, and life events. The study does not intervene in its subjects' diet or treatment.

The nurses' study has been cited by researchers worldwide. Out of its data 265 scientific papers have emerged. Many of these studies, while providing useful information, have also served to confuse and frighten many Americans. Confusing results sometimes occur because of the nature of "observational" studies, which can find links between a factor, such as a particular food habit, and the subsequent development of disease but cannot prove cause and effect.

Findings on issues such as postmenopausal estrogen and moderate drinking have been confusing. Such results, however, have been offset by positive evidence about genetic links to breast cancer and prevention of adult-onset diabetes.


What do you think it means to say the results on moderate drinking have been confusing?

First proof: driving while talking on phone is a hazard.
The New York Times, Feb. 13 1997, p. A-30
Gina Kolata

The good news is that cellular phones do not cause brain cancer. The bad news? They are unfortunately a cause of car accidents. In fact, according to research by Drs. Donald A. Redelmeier and Robert J. Tibshirani, driving while using a cell phone can be likened to driving with a blood alcohol level of about 0.10 per cent. The chances of getting into a car accident while using a cellular phone are nearly equal to the chances of having an accident while slightly drunk.

Another factor in the chance of having an accident is the topic and content of the conversation: "A person who is not talking on the phone cannot become distracted by a shouting match with a boss or a significant other." Thus, the risk gets even greater if the person is discussing something important or, especially, is in the middle of an argument.

The researchers, a professor of medicine at the University of Toronto and a statistician at the same school, ran a study of 699 drivers who had cellular phones and who had also been involved in an accident. They analyzed 26,798 phone calls on cellular phones and determined that the risk of having an accident quadrupled when drivers spoke on the phone. With heated discussions, this rate is even higher.

Dr. Malcolm Maclure and Dr. Murray A. Mittleman of the Harvard Medical School have concluded that, with the growing usage of car phones, accidents caused by cell phones could account for between 0.6 and 1.2 per cent of all car accidents by the year 2000.

On the other hand, several people used their cellular phones to call the police after their accidents. The Cellular Telecommunications Industry Association made sure to note this point. Though they saw the statistical evidence of usage causing accidents, they also felt that the benefits of being able to get hold of the police immediately following an accident should also be considered.

Bradley Efron, professor of statistics at Stanford University, was doubtful about the possibility of deciding whether or not cellular phones truly pose a threat to drivers. Problems included how "to determine whether phones were being used at or near the time of car accidents and whether the confluence of phone use and car crashes were more than coincidence." However, after reading the researchers' paper, he was convinced that it was indeed possible.

Dr. Redelmeier decided to do the study after one of his patients was in an accident while talking to him on her cell phone. He came to the realization that people just do not realize the "limitations of their attention." With this new information out, hopefully more people will realize these limitations--and leave the cell phone off unless in an emergency.

The article remarks that other researchers praised the elegant and novel method used. We decided to see, from the original article (New England Journal of Medicine February 17, 1997), how the study was carried out.

The researchers used a "case cross-over" method. This is a case-control study in which the controls are the same people as the cases. This avoids many questions about possible differences between the people in the control group and those in the case group. In this study, the cases were people identified as drivers who use a cellular phone and have had an accident within a certain interval of time. The controls were the same people observed during the same interval on a previous day.

The idea is to see whether those who were also driving at the same time the day before, when they did not have an accident, used their telephones significantly less than they did during the time just before their accidents. If so, this would suggest that the use of a cellular phone is a risk factor for an accident.
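To make the comparison concrete, here is a minimal Python sketch of one standard way to analyze paired data of this kind, an exact McNemar-type test on the "discordant" drivers (those who used the phone in one period but not the other). This is not the authors' analysis. The function name exact_paired_test is hypothetical, and the counts are made up purely for illustration.

```python
from math import comb

def exact_paired_test(phone_before_crash_only, phone_day_before_only):
    """One-sided exact test on the discordant drivers.
    Under the null hypothesis each discordant driver is equally likely to
    fall in either group, so the first count is Binomial(n, 1/2)."""
    n = phone_before_crash_only + phone_day_before_only
    tail = sum(comb(n, k) for k in range(phone_before_crash_only, n + 1))
    return tail / 2 ** n

# Made-up counts for illustration only; these are NOT the study's data.
crash_only, control_only = 40, 10
print("ratio of discordant counts:", crash_only / control_only)
print("one-sided p-value:", exact_paired_test(crash_only, control_only))
```

Drivers who used the phone in both periods, or in neither, carry no information about the comparison and drop out of the test. The ratio of the two discordant counts serves as a rough estimate of the relative risk.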

The original paper is very clearly written and the authors discuss in detail how they tried to make the study as rigorous as possible and also some of the limitations of such a study.

Discussion Questions:

(1) As usual, even if a relation between use of a cellular telephone and accidents is established, phone use may not be the cause of the accident. What other possible causes might be considered?

(2) How might you decide whether the proportion of people who used their cellular phones while driving on the day before their accidents was significantly less than the proportion who used them on the day of their accidents?

Unconventional wisdom: The high price of selling out.
The Washington Post, Feb. 9 1997, C5
Richard Morin

How much does it cost to make someone sell out? Economist Robert Frank decided to find out. He asked a group of seniors at Cornell University how much more it would cost to have them work for less "socially responsible" companies than more socially conscious ones. Frank found out that these less "in" companies have to pay "a moral 'reservation premium'--call it conscience money--to attract top workers."

On the other hand, some think that the results of Frank's survey would differ at other universities. Also, they wonder if the price would go down once the students were out on their own, in the real world.

Below are some examples of how much extra certain companies would have to pay (on average) the Cornell seniors to employ them. The percentage who said they would prefer the latter job is in parentheses.

So, it obviously does take some extra money to make people work for less socially conscious companies. On the other hand, this indicates that some people put a value on their moral feelings.

Discussion Questions:

(1) Is this survey really a good indication of the price of selling out?

(2) Do you think that the results of this survey would be vastly different at another university? After these students have graduated?

Please send comments and suggestions to jlsnell@dartmouth.edu

