CHANCE News 6.02

(4 January 1997 to 3 February 1997)

Prepared by J. Laurie Snell, with help from Bill Peterson, Fuxing Hou, Ma.Katrina Munoz Dy, Katherine Greer, and Joan Snell, as part of the Chance Course Project supported by the National Science Foundation.

Please send comments and suggestions for articles to

Back issues of Chance News and other materials for teaching a Chance course are available from the Chance web site:



Further, it cannot escape anyone that for judging in this way about any event at all, it is not enough to use one or two trials, but rather a great number of trials is required. And sometimes the stupidest man--by some instinct of nature per se and by no previous instruction (this is truly amazing)--knows for sure that the more observations of this sort that are taken, the less the danger will be of straying from the mark.

Jakob Bernoulli (Law of Large Numbers)

In the last Chance News, we referred to an article suggesting that, when you buy a box of Animal Crackers, you would find that more prey animals are broken than predators. Reader Roxy Peck has supplied the reference for this article: it is entitled "Survival Strategies Among Animal Crackers" and appeared in the Journal of Irreproducible Results, Jan/Feb 1991, p. 15 (one of his favorite journals). It is an amusing one-page letter that we will include at the end of this newsletter. Our own modest experiments have suggested that this is, indeed, an irreproducible result.

Also in the last Chance News we reviewed Bernstein's book "Against the Gods". Chris Ferrel wrote that, at least from our review, it appeared that Bernstein did not understand the way economists, past and present, interpret risk. He remarked that economists themselves are to blame for this, not having been effective in conveying their subject to others. Chris offered to write something for us to clarify some of the economic issues of risk discussed in Bernstein's book.

Chris has a home page Econ 351B related to his econometrics course that readers might enjoy.

You will find here Chris's lecture notes for an econometrics course and much more.

Another homepage that might interest our readers is provided by Robert Matthews:
Welcome to the Home Page of Robert Matthews
Matthews divides his time between writing science articles for the Sunday Telegraph, New Scientist, and other science journals and doing research in computer science. A search of back issues of Chance News will show that he has provided us with a number of interesting articles (Murphy's law, Should you take an umbrella in London, the prosecutor's paradox, etc.). He was recently awarded an Ig Nobel Prize for his work on toast and Murphy's law (see Chance News 5.04).

You will find here more detailed discussions of some of the topics that Matthews covers in his science writings.

Jerry Grossman wrote about an ad that really "irks" him. He writes:

The province of Ontario, trying to attract American tourists, has half-page advertisements (e.g., in newspapers in Detroit) with the headlines: "Ontario 35% off" and "Your dollar goes 35% further in Ontario." Assuming that 1 US dollar = 1.35 Canadian dollars, the second statement makes some sort of sense (at least with poetic license), but the first is just wrong (it's really about "26% off" granting the same poetic license). I wrote to them, pointing out that using their logic, if the exchange rate were 1 US dollar = 2 Canadian dollars, then they'd have to say that Ontario was 100% off, so we could come enjoy their country for free; but they never replied. The ad has been running for several years.


(1) If 1 US dollar = 1.35 Canadian dollars, why is it more accurate to say "Ontario 26% off"?

(2) Why is poetic license required?
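The arithmetic behind Grossman's complaint can be sketched in a few lines. This is only an illustration: the function name is ours, and it assumes (this is the "poetic license") that an item carries the same price tag in Canadian dollars in Ontario as it does in US dollars at home.

```python
def discount_from_rate(cad_per_usd):
    """Fraction a US visitor saves when a price quoted in Canadian
    dollars is paid for with US dollars at the given exchange rate."""
    return 1.0 - 1.0 / cad_per_usd

# Rate quoted in the ad, then Grossman's hypothetical rate of 2.
print(f"{discount_from_rate(1.35):.1%}")  # 25.9%
print(f"{discount_from_rate(2.00):.1%}")  # 50.0%
```

At a rate of 2 Canadian dollars per US dollar the visitor pays half price, not nothing, which is the point of Grossman's letter.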

Anthony Rossini sent the following contribution:

In the fall of 1996, the Citadel started to encourage women to enroll at the college. On Sunday, January 26, 1997, "The State", a local Columbia, South Carolina, paper, ran a story claiming that the Citadel was worried about increasing attrition rates among the knobs, which is the name given to the freshman class. The administration says it is worried about the annual retention rate of freshmen, and it provided the data used for the claim. How valid is this claim? (While reasonable, it doesn't look too strong, I personally think!)

School      Freshman       Dropouts        Dropouts
 Year       Enrolled      by January        by May

1987--88      646             85             111
1988--89      655             94             115
1989--90      535             76             101
1990--91      645            103             139
1991--92      622             97             126
1992--93      627            111             154
1993--94      609             87             121
1994--95      601            120             138
1995--96      591             96             102
1996--97      581            113              NA


(1) Do you think the data support the Citadel administration's concern about the attrition rate of their knobs?
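One way to start on this question is to turn the counts into rates, since enrollment varies from year to year. The sketch below does only that, using the figures from the table above; the variable and function names are ours, and it makes no attempt at a formal test.

```python
# (school year, freshmen enrolled, dropouts by January, dropouts by May)
citadel = [
    ("1987-88", 646, 85, 111), ("1988-89", 655, 94, 115),
    ("1989-90", 535, 76, 101), ("1990-91", 645, 103, 139),
    ("1991-92", 622, 97, 126), ("1992-93", 627, 111, 154),
    ("1993-94", 609, 87, 121), ("1994-95", 601, 120, 138),
    ("1995-96", 591, 96, 102), ("1996-97", 581, 113, None),
]

def rate(dropouts, enrolled):
    """Dropouts as a percentage of the entering class."""
    return 100.0 * dropouts / enrolled

for year, enrolled, jan, may in citadel:
    may_text = "NA" if may is None else f"{rate(may, enrolled):.1f}%"
    print(f"{year}  {rate(jan, enrolled):5.1f}%  {may_text:>6}")
```

Comparing the January rate for 1996--97 with the spread of January rates in earlier years gives a feel for how unusual the current year is.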

Jim Baumgartner suggested our first cartoon for Chance News. The January 20 and 21st Dilbert cartoons featured the famous "bell curve." Of course you should look at the cartoons themselves but here is the text to give you a flavor of these cartoons:

Boss: Bad news on your performance review Wally. Everyone performed the same. But I'm required to rank the group on a bell curve. I had to make up some flaws to move you down the curve. Here's a pen. Sign it.

Wally reading: "Employee does not wash hands after using the rest room". I can't sign this performance review. It is full of alleged misdeeds that you invented to lower my rating.

Boss: Yes, but I think it reflects the sort of things you MIGHT do. I had to make all the reviews fit a bell curve.

Wally: I am NOT selling crack from my cubicle!!!


Eunice Goldberg suggested the next article and the discussion questions relating to it.

Education ratings find few stars as states get mostly C's in 4 vital areas.
New York Times, 17 Jan. 1997, A14
Peter Applebome

A study prepared for Education Week evaluated the states using 75 specific measurements from class size to teacher qualifications to total funds allocated for each student.

The report gave a grade of A to F in the following six areas:

Standards and assessment.
Overall grade B

Quality of teaching.
Overall grade C

School climate.
Overall grade C-

Are states allocating enough money to do the job?
Overall grade C+

Do states make sure that everyone gets a fair share?
Overall grade B-

Do states spend their money on the right things?
Overall grade C-

As you can see, the authors of this report have not heard of grade inflation. The article includes a geographic report card for four of these categories: standards and assessments, school climate, quality of teaching, and equity between rich and poor districts.

Of these four, only "standards and assessment" got any A's and, in fact 22 states got an A for this area. These A's are scattered all over the country and include: Oregon, Texas, Missouri, Alabama, New York and Maine. F's were given out to Wyoming and Iowa for not developing statewide standards.

States like Alabama and Mississippi got B's for "equity between rich and poor districts". The writer remarks that this is because spending is so low there is little room for disparity.

Looking at these pictures, you can see that there was a lot of variation even within states. For example, Texas got an A in "standards and assessments", a B in "school climate" and "quality of teaching", and a D in "equity between rich and poor districts".

The report gives comparative scores on the National Assessment of Educational Progress, which the article states is the closest thing to an American national test. It is pointed out that the states that did the best on this test, such as Maine and Iowa, did so more because of a lack of urban areas than because of a forward-thinking policy.

West Virginia and Kentucky, which were forced by court order to improve their education systems, got some of the highest grades in this survey.

The report, called "Quality Counts: a Report Card on the Condition of Public Education in the 50 States", can be found at Education Week on the Web.


(1) How do you think the researchers measured "quality of education?"

(2) Are the methods for assessing schools similar to or different from those for assessing students? Should students be graded on several criteria, as the states were in this study?

(3) What interesting facts about differences in the schools between states and within geographic areas can you see from the geographic displays provided in the article? Is this a useful form of graphical display for information of this kind?

Michael Olinick sent us the following article:

The age of unreason: welcome to the factual free-for-all.
The New Yorker, 3 February 1997, P. 40
Kurt Andersen

Andersen's essay laments the decline in the standards of evidence in journalism and popular discourse. There is no longer consensus about facts, he says, nor is there faith that truth will emerge from them. He cites a number of examples from major news stories of the past year. We don't know who was responsible for the epidemic of arson attacks on black churches, nor is there agreement that there even was an epidemic (see Chance News 5.08). We still don't know who was responsible for the pipe bomb at last summer's Olympics, nor what was responsible for the explosion of TWA Flight 800. In the absence of hard evidence, we are presented with disputes over how the stories were reported and who believes them. Andersen acknowledges that skepticism is sometimes justified, but his first example is curiously worded: he says that the last NYT/CBS poll before the presidential election "overestimated Clinton's margin of victory by 100%." He adds that we now hear that inflation has been miscalculated for years, and that TV networks now doubt the accuracy of the Nielsen ratings.

Andersen worries that if a story is scary enough, chances are it will be picked up by the media with its premises unchecked. His prime example is the panic over missing children. In the years following the 1979-81 murders of 23 children in Atlanta, the media regularly published stories about the numbers of children abducted each year by strangers. Estimates ran from 20,000 to 50,000 to more than 100,000. Andersen says that in 1984, as a reporter for "Time" magazine, he decided to do some checking. Noting that the low-end estimate would correspond to 12 abductions per week in New York City alone, he called a half dozen urban police departments around the country ("more or less at random") and asked how many abduction cases they had had in the last year. Zero, one or two were typical answers. From this, Andersen figures the national number was probably only in the hundreds. It now turns out, according to the current "Washington Monthly," that the 50,000 figure was invented by the father of Adam Walsh during interviews in the weeks following Adam's 1981 abduction and murder in Florida.

Another 1980s scare story concerned cocaine use. A frequently cited figure was that every day 5000 Americans were trying cocaine for the first time. The number was eventually cited in 1987 among the official data of the National Institute on Drug Abuse. But in 1988, when "Science" magazine tried to track down the source, they reported that the statistic had been calculated on the basis of 500 calls to a New Jersey cocaine hotline. Not quite, says Andersen, who claims he himself constructed the estimate for a 1983 "Time" cover story! Taking the difference between "credible" estimates of the 1982 and 1980 numbers of Americans who had used cocaine, and dividing by 730 (365 times 2), gave the 5000 new users a day figure.
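Andersen's construction can be run backwards to see what it implies. The article does not give the two "credible" estimates themselves, so the sketch below recovers only their difference; the constant names are ours.

```python
DAYS = 365 * 2             # the divisor Andersen describes (730 days)
NEW_USERS_PER_DAY = 5000   # the widely quoted figure

# Difference between the 1982 and 1980 estimates that would
# produce 5000 new users a day under this calculation.
implied_increase = NEW_USERS_PER_DAY * DAYS
print(f"{implied_increase:,}")  # 3,650,000
```

Note that the calculation treats the whole difference between two survey estimates as first-time users, so any error in either estimate passes straight into the daily figure.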

In the second half of the article, Andersen explains how the World Wide Web, where everyone with a home-page becomes a publisher, has become a prime medium for the dissemination of pseudo-facts. He cites examples of postings, on sites he finds as professional-looking as those of prominent news-media, which propound theories that HIV is a byproduct of a US biowarfare program or that Flight 800 was downed by a rift in the space-time continuum. This is certainly food for thought for those of us actively using the Web in our courses. How should we teach our students to separate the serious reporting from the fantastic quantities of junk?


(1) What does Andersen mean when he says that Clinton's margin was overestimated by 100%?

(2) Check Andersen's arithmetic on the abduction story. Does 20,000 annual abductions for the US translate into 12 per week in New York City?

(3) Can you reconstruct the reasoning leading from Andersen's informal survey data on abductions to his estimate of a national total in the hundreds? How would you go about getting a more accurate answer?

(4) Do you think Andersen is defending his original 5000 new-cocaine-users-a-day estimate?
Do you find this calculation credible?
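For question (2), a rough check is to scale the national figure by New York City's share of the population. The population figures below are our own round-number assumptions for the mid-1980s, not numbers from the article, and the sketch assumes abductions are spread in proportion to population.

```python
US_POPULATION = 236e6    # assumption: approximate US population, 1984
NYC_POPULATION = 7.2e6   # assumption: approximate New York City population
ANNUAL_ABDUCTIONS = 20_000   # low-end estimate quoted in the article

nyc_share = NYC_POPULATION / US_POPULATION
nyc_per_week = ANNUAL_ABDUCTIONS * nyc_share / 52
print(f"NYC share of population: {nyc_share:.1%}")
print(f"Implied abductions per week in NYC: {nyc_per_week:.1f}")
```

Changing the assumed populations by a few percent moves the weekly figure only slightly, so the check does not depend heavily on the exact values chosen.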

Horses, mollusks and the evolution of bigness.
The New York Times, 21 Jan. 1997, C1
John Noble Wilford

In our review of Stephen Jay Gould's new book "Full House" (see Chance News 5.11) we mentioned that Gould discussed Cope's rule. Cope's rule says that evolution favors larger species (presumably because larger animals have a mating advantage, can resist changes in environment, can defend themselves better, etc.).

In his book, Gould argues against this rule claiming that a random walk model with a left hand barrier at 0 (size of animals cannot be negative) for evolution of species would account for an apparent increase in size of species over time without assuming that nature favors large over small.

This article discusses a study by paleontologist David Jablonski at the University of Chicago which was reported in the January 16 issue of "Nature." The study shows that the variation in the size of different species of mollusks is better explained by a random walk model than by Cope's principle.

Jablonski studied the size of mollusk fossils over a 16-million-year interval of the Cretaceous period. The Cretaceous ended about 65 million years ago with a mass extinction that is thought to have done in the dinosaurs.

Jablonski determined the size of 1,086 species of mollusks, as measured by the geometric mean of the height and the length. He then grouped the species by the next level of classification: the genus. He recorded the minimum and maximum size of the species in each genus, at the beginning and at the end of the time period studied.

If Cope's rule applies, we would expect that, for a majority of the genera, the maximum and minimum size for its species would increase. Under the assumption of a random walk model we would expect, by the symmetry in the changes in size, about an equal number of genera in the group where both the maximum and minimum size species increased and in the group where both the maximum and minimum size species decreased. Jablonski found that each group contained about 27% of his 191 genera.

Because of the increased variation over time in the random walk model, we would expect a larger number in the group where the maximum size increased and the minimum size decreased (the range of sizes widened) than in the group where the maximum decreased and the minimum increased (the range narrowed). Jablonski found about 30% of the genera in the group where the range widened and only about 10% in the group where it narrowed. Thus, Jablonski's results are consistent with the random walk model but not with Cope's rule.

In an accompanying article in the same issue of Nature, Stephen Jay Gould discusses the history of Cope's rule and remarks that it is another case of the natural bias we have for "bigger is better." He reminds us of Little Buttercup's admonition to Captain Corcoran in H.M.S. Pinafore, that "things are seldom what they seem".


We were, as were Gould and Jablonski, rather vague about what the random walk model for species would really look like. It should be some kind of random branching process. See if you can describe such a model with a left hand barrier at 0. Are the findings of Jablonski consistent with your model?
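As one possible starting point, here is a deliberately crude simulation. It is our own sketch, not the model used by Gould or Jablonski: it has no branching or extinction, each genus simply holds a fixed number of species whose sizes take independent symmetric steps, and a size is never allowed to fall below a floor. The number of species, the starting sizes, the number of steps and the seed are all arbitrary assumptions.

```python
import random

def simulate_genus(rng, n_species=5, n_steps=100, barrier=1):
    """Return (start_min, start_max, end_min, end_max) for one genus."""
    sizes = [rng.randint(20, 40) for _ in range(n_species)]
    start_min, start_max = min(sizes), max(sizes)
    for _ in range(n_steps):
        # Each species moves one unit up or down with equal chance;
        # a size is never allowed to drop below the barrier.
        sizes = [max(barrier, s + rng.choice((-1, 1))) for s in sizes]
    return start_min, start_max, min(sizes), max(sizes)

def classify(start_min, start_max, end_min, end_max):
    """Sort a genus into the groups discussed in the article."""
    if end_min > start_min and end_max > start_max:
        return "both up"
    if end_min < start_min and end_max < start_max:
        return "both down"
    if end_min < start_min and end_max > start_max:
        return "range widened"
    if end_min > start_min and end_max < start_max:
        return "range narrowed"
    return "other"   # at least one extreme unchanged

def run(n_genera=2000, seed=1997):
    rng = random.Random(seed)
    counts = {}
    for _ in range(n_genera):
        label = classify(*simulate_genus(rng))
        counts[label] = counts.get(label, 0) + 1
    return counts

counts = run()
total = sum(counts.values())
for label, n in sorted(counts.items()):
    print(f"{label:15s} {n / total:.1%}")
```

Readers might try starting the species closer to the barrier, or adding speciation and extinction, to see how the proportions in the four groups respond.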

Overcoming junk science.
Wall Street Journal, Jan. 9, 1997, A12
John Stossel

John Stossel is a consumer reporter for ABC and appears regularly on the program 20/20. He has written numerous stories on "new risks." For example, in 1982, he wrote about several Tylenol poisonings, causing a great stir. However, only 7 people died from these poisonings, while 100 people die each day from car accidents. Stossel remarks that, for some reason, these car accidents just are not exciting enough to be big news. He realized the silliness of some of these scares when a producer came into his office excited about BIC lighter explosions. A total of four people were killed by these explosions. Compare that to the 50 deaths per year caused by buckets (usually children falling in and drowning), which do not seem to be quite as shocking.

With the help of Bernard Cohen from the University of Pittsburgh, Stossel made a chart of the average amount of time off the end of one's life taken by various risk factors.

From this chart we see that a toxic waste site would take 0 to 4 days from your life. Pesticide residues take 0 to 4 as well. Flying in a plane takes an average of 4 days off your life, while driving takes 182 days off. House fires shave off 18 days, and being 25% overweight takes an average of 303 days away. Smoking takes 2,580 days off one's life while living in poverty takes off 3,600 days. So, some of the big "scares" we have had in the past few years, namely toxic waste sites causing disease, pesticide residues harming people, flying in planes, and fires in households are less likely to shorten one's life than driving, being overweight, smoking, or living in poverty--these problems we see often but do not think about nearly as much as we do the "scares."
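To put the larger entries on a more familiar scale, the figures quoted above can be converted from days to years. The dictionary below simply restates the numbers in the paragraph (using the upper end of the two "0 to 4" ranges); the names are ours.

```python
# Average days of life lost, as quoted in the article.
days_lost = {
    "toxic waste site (upper end)": 4,
    "pesticide residues (upper end)": 4,
    "flying": 4,
    "house fires": 18,
    "driving": 182,
    "25% overweight": 303,
    "smoking": 2580,
    "poverty": 3600,
}

for risk, days in sorted(days_lost.items(), key=lambda item: item[1]):
    print(f"{risk:32s} {days:5d} days  = {days / 365.25:5.2f} years")
```

On this scale smoking and poverty cost on the order of seven to ten years, while the headline "scares" cost days.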

Stossel remarks that hyping scares can actually lead to greater problems than leaving them unexposed. For example, when people begin to fear flying in planes, they may choose to drive in a car instead. This increases their chances of crashing or having some form of an accident. Though it is easy to be swayed by various studies, Stossel suggests some principles by which we can avoid getting caught in the web of junk science:

Stossel asks us to remember that science is politicized: for example, the claim that "crack babies", children born to mothers who take cocaine, are handicapped for life was not completely true. However, liberals use this as justification for anti-drug campaigns, while conservatives use this as justification to demonize cocaine users.

Babies are born deformed purely by chance: about one in 5 pregnancies ends in miscarriage. Between 2 and 3% of babies are born with inexplicable birth defects. These are no one's fault, but 80% of obstetricians have been sued in the US.

Though it is easy to be taken in by numbers in a survey, one cannot take everything in the newspaper as gospel. We must think about what numbers tell us and then decide for ourselves what we believe and what is nothing but "junk science."

On the same day that this article appeared Stossel had a "special" on ABC from 10:00 to 11:00 called "What you know may not be so." On this program he discussed in more detail, with site visits and interviews of experts and others, a number of the issues mentioned in this article, including government recommendations on salt, the breast implant controversy, dioxin, and crack babies. You can get a video tape of this program by calling 1-800-913-3434. The cost is $29 plus shipping.


(1) Is it reasonable to assume that we all have the same time taken off our lives by driving, by traveling by airplane etc.?

(2) Which do you think commits more of the sins mentioned by Stossel, the newspapers or T.V.?

Survey reports rise in binge drinking.
The Dartmouth, 7 Jan. 1997, p1
Jess Jacob

The Core Institute at Southern Illinois University gives a questionnaire to college students nationally to determine their drinking habits. For five of the past six years Dartmouth has given this survey to Dartmouth students.

The article reports the results of the 1996 Dartmouth survey. This survey showed that the percentage of "binge" drinkers and the percentage of non-drinkers have both risen in the freshman class. However, the percentage of binge drinkers in the college as a whole has dropped. Binge drinking was defined for men as having 5 or more drinks at one time more than once in the past two weeks. For women it is 4 or more drinks at one time. The article states that the percentage of freshman students who reported binge drinking rose from 33 percent in 1995 to 41 percent in 1996. This puts the percentage of freshman binge drinkers at Dartmouth near the national average of 43%. However, the percentage of freshmen who reported abstinence from drinking rose as well: from 12 percent in 1995 to 18 percent in 1996. The Dean of the College, Lee Pelton, suggested that an increase in alcohol use in high schools helps explain the increase in the percentage of student binge drinkers. He also suggested that the rise in alcohol abstinence could be accounted for by an increase in the educational programs at the college and the many non-alcohol-related events sponsored by the college.

Binge drinking has gone down in the school as a whole. The percentage has fallen from 35 percent of all students in 1994 to 29 percent in 1996. The percentage of fraternity and sorority members who do no drinking rose from 19% in 1995 to 29% in 1996. And binge drinking among the leaders of fraternities and sororities decreased from 68 percent in 1995 to 59 percent in 1996.

Sexual intercourse as a result of alcohol consumption has dropped from 21 percent in 1994 to 10 percent in 1995 and finally to 9 percent in 1996. Other sexual acts induced by drinking alcohol dropped from 47 percent to 31 percent to 26 percent in 1996. The number of students who abandoned safe sex practices because of too much drinking has been reduced from 13% in 1994 to 9% in 1996. Though these figures look promising, Dean Pelton says that it is too early in the study of alcohol consumption at Dartmouth to determine if the survey results are truly a trend.

DISCUSSION QUESTIONS:

(1) The article states that the survey was sent to 1,200 students and 45% or 544 students returned the survey. Assuming that 1/4 of the responses are from freshmen, is the increase in the percentage of binge drinkers from 33% in 1995 to 41% in 1996 a statistically significant increase in binge drinking?

(2) Do you think that leaders of fraternities and sororities drink significantly more than the typical Dartmouth student?

(3) Here are some results relating to the number of drinks per week from the five years that the Core survey was given to Dartmouth students.

    Year              1991       1992       1994       1995       1996
    Responses          278        273        499        609        544
    Median               4          5          4          2          3
    Mean                 8          9          8          6          5
    0 drinks           22%        21%        26%        29%        25%
    1 drink             9%        10%        11%        13%        13%
    20+ drinks         15%        13%        12%        10%         9%

Why are the medians so much lower than the means?

Do you think the College can see any encouraging trends?

(4) The University of Michigan Institute for Social Research has been carrying out a survey of the drug use of high school students. They have given their survey to about 15,000 students since 1975. From their survey we can see what the situation is with binge drinking in the senior year of high school. Here are the percentages for the various years of the students who said that they had 5+ drinks in a row in the past two weeks.

   1975  1976  1977  1978  1979  1980  1981  1982  1983  1984  1985
   36.8  37.1  39.4  40.3  41.2  41.2  41.4  40.5  40.8  38.7  36.7

   1986  1987  1988  1989  1990  1991  1992  1993  1994  1995  1996
   36.8  37.5  34.7  33.0  33.2  29.8  27.9  27.5  28.2  29.8  30.2

(1) Do these results support Dean Pelton's statement that "national studies have shown that alcohol abuse in the high school has increased"? Do you see any trends? How would you determine if these trends are genuine?

(2) Here are the results from these questionnaires about the percentage of seniors who smoked daily.

    1975  1976  1977  1978  1979  1980  1981  1982  1983  1984  1985
    26.9  28.8  28.8  27.5  25.4  21.3  20.3  21.1  21.2  18.7  19.5

    1986  1987  1988  1989  1990  1991  1992  1993  1994  1995  1996
    18.7  18.7  18.1  18.9  19.1  18.5  17.2  19.0  19.4  21.6  22.2

Do you see any trends in cigarette smoking? Is the amount of smoking related to the level of drinking?
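Returning to question (1) on the Dartmouth survey, one standard approach is a two-sample test for proportions. The sketch below is ours and rests on several assumptions: a quarter of the 609 responses in 1995 and of the 544 responses in 1996 came from freshmen, the two years are independent simple random samples, and the normal approximation is adequate. It ignores the fact that fewer than half of those surveyed responded.

```python
import math

def two_proportion_z(p1, n1, p2, n2):
    """z statistic and two-sided p-value for H0: equal proportions."""
    pooled = (p1 * n1 + p2 * n2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p2 - p1) / se
    # Two-sided tail area under the standard normal curve.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

n_1995 = 609 // 4   # assumed number of freshman responses in 1995
n_1996 = 544 // 4   # assumed number of freshman responses in 1996
z, p_value = two_proportion_z(0.33, n_1995, 0.41, n_1996)
print(f"z = {z:.2f}, two-sided p-value = {p_value:.2f}")
```

With samples of about 150 freshmen in each year, an 8 percentage point difference gives a z value of roughly 1.4, which falls short of the usual 5% significance threshold.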

Shaky statistics are driving the air bag debate.
Wall Street Journal, 22 Jan. 1997, B1
Asra Q. Nomani and Jeffrey Taylor

The introduction of air bags has saved some lives and lost some. The National Highway Traffic Safety Administration (NHTSA) has made several attempts to estimate the number of lives air bags saved since their introduction. Using computer models, the NHTSA estimated that they have saved over 1700 lives.

A study that compared the number of drivers and front-seat passengers who died in crashes of 2,880 cars equipped with air bags with the number who died in crashes of 5,237 cars without air bags concluded that there were about 10% fewer fatalities in cars equipped with air bags. From this they estimated that there would have been 1,136 more fatalities if air bags had not been introduced. However, the database used for this study did not have good information on the use of seat belts, so the two groups may have differed significantly in seat belt use. This makes it hard to determine the contribution of air bags.

It has been felt that seat belts are the chief protector, and safety experts have considered air bags supplementary. But air bags were designed to inflate with enough force to save a 169-pound man who is not wearing a seat belt. When this rule was established, few people wore seat belts, but now 70% do.

The NHTSA estimates that 145 people have been saved on the passenger side by air bags, all of whom are adults, while air bags have killed 36 people, all but one of whom are children. The NHTSA is proposing that the force of the bags be reduced and "smart" bags be introduced that can adjust for small children.

The changes in air bag regulations will have the effect of changing the ratio of the number of people saved to killed. The fact that these involve different groups, adults and children, makes this a difficult regulatory decision. The lack of reliable estimates for the number saved by air bags and for the use of seat belts makes this decision even more difficult.
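The two sets of figures in this article can be put side by side with a little arithmetic. The sketch below is a back-of-envelope reading, not the study's method: it assumes the 1,136 lives are exactly the 10% reduction applied to the deaths that would have occurred without air bags, and the variable names are ours.

```python
# Passenger-side figures quoted from the NHTSA.
saved_passenger_side = 145
killed_passenger_side = 36
ratio = saved_passenger_side / killed_passenger_side
print(f"saved per death, passenger side: {ratio:.2f}")

# Study figures: about 10% fewer fatalities, 1,136 lives saved.
reduction = 0.10
lives_saved = 1136
deaths_without_bags = lives_saved / reduction
deaths_with_bags = deaths_without_bags - lives_saved
print(f"implied deaths without air bags: {deaths_without_bags:,.0f}")
print(f"implied deaths with air bags:    {deaths_with_bags:,.0f}")
```

The passenger-side ratio of about 4 to 1 compares adults saved with children killed, which is why the article calls the regulatory decision difficult.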


(1) The article states that the NHTSA is not required to formally consider the ratio of lives saved to killed--in part because regulations aren't supposed to result in any deaths. What about environmental regulations?

(2) Do you think it is necessary to consider the ratio of people saved to killed when making air bag regulations? If not, what criteria would you use?

Women's firm question: Why?
The Boston Globe, p. A1, 05 January 1997
Richard Saltus

Consider the following grim statistics about breast cancer:

Despite these harsh numbers and the amount of research going on, no one knows what exactly causes breast cancer and, more importantly, how to prevent it. Family history and age may greatly determine who will get the disease, but other risk factors, such as environmental pollution and exposure to estrogen have not been unequivocally proven to contribute to causing the disease.

One cause of breast cancer, found in one in ten women who get the disease, is a mutation of either the BRCA1 or BRCA2 gene. Nine out of 10 victims, however, are born with healthy genes which, somewhere along the line, are converted into cancerous ones. Researchers suspect chemicals with estrogen-like effects that are found in DDT, pesticides, plastics, and other compounds.

Others believe that certain risk factors increase a woman's chance of contracting breast cancer. For instance, an increase in a woman's lifetime exposure to her own reproductive hormones, such as estrogen, can increase her risk of cancer. Early menarche and delaying childbirth increase the body's exposure to estrogen. Estrogen stimulates the breast cells to divide and proliferate and with every additional cell division comes a chance for errors to creep into the genetic code.

Breast cancer rates have been steadily climbing. The rate of new cases has risen about 1% per year since national record-keeping began in 1973 and stands at 109 new cases a year per 100,000 population. Some of the increase was due to better detection procedures that caught the disease in its early stages. Better detection does not explain, however, the increase in the number of younger women, those in their 30s and 40s, who have been diagnosed with the disease.
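A rise of 1% per year compounds, so it is worth seeing what it adds up to. The sketch below assumes the 109 per 100,000 figure refers to about 1996 and that the growth was a steady 1% each year; neither assumption comes from the article, and the names are ours.

```python
RATE_NOW = 109.0        # new cases per 100,000 per year, from the article
ANNUAL_GROWTH = 0.01    # "about 1% per year"
YEARS = 1996 - 1973     # assumption: the current figure is for about 1996

growth_factor = (1 + ANNUAL_GROWTH) ** YEARS
implied_1973_rate = RATE_NOW / growth_factor
print(f"growth over {YEARS} years: {growth_factor - 1:.0%}")
print(f"implied 1973 rate: {implied_1973_rate:.0f} per 100,000")
```

Under these assumptions the rate has grown by roughly a quarter since record-keeping began.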

Television ratings system under attack as never before.
How the Nielsen rating system works.
The Associated Press, 30 Dec. 1996

According to this report, Nielsen Media Research, better known as the Nielsens, is plagued by research so faulty that critics say no one can really tell for sure who's out there watching what shows.

Nielsen selects 5,000 households at random and asks them to join its system. Televisions in these homes are fitted with a device that records what programs are watched and sends that information to the Nielsens' computer. Viewers are asked to punch in to a "people meter" to indicate which household members are watching television at any given time. In addition, four times a year participants are asked to keep a diary of what they watch.

Networks, which are the primary clients of the Nielsens, believe the system has to change to fit the new landscape of network television and cable channels. Networks complain that, if the data are faulty, then the problem may lie not in the quality of the shows presented but in the sample being used. Nielsen provides television and advertising executives with statistics each week that determine how much a network can charge for advertising and whether certain shows should continue or not. A TV network can gain $100 million in extra revenue if its rating increases by 1/10 of a point over a season.

Some complaints are the following:

(1) CBS and ABC claim that Nielsen measures a greater proportion of homes with cable television than there is in the population, which depresses the ratings of the general broadcast networks. By contrast, cable networks claim Nielsen doesn't have a handle on how many homes subscribe to satellite TV services, which would give them more viewers.

(2) Two measurement samplings taken for David Letterman's one-day road trip to Boston in November showed a whopping 33% difference in the estimated number of viewers.

(3) With such a tiny sample, relatively few people can have a huge impact on a show's ratings and on a network's overall performance. For instance, this October, the Nielsens had 958 teenagers providing ratings information each day. If just 10 teen-agers decided to do something other than watch TV, it could make a difference of a full ratings point.
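The sample-size complaint in (3) is easy to check, and the same reasoning gives a feel for how much a household rating can move by chance alone. In the sketch below the 10% rating is a hypothetical value chosen for illustration, and the standard error formula assumes a simple random sample of 5,000 households, which the real panel may not be.

```python
import math

# Complaint (3): 10 teenagers out of a panel of 958.
teen_panel = 958
teen_shift_points = 100.0 * 10 / teen_panel
print(f"10 of {teen_panel} teenagers = {teen_shift_points:.2f} rating points")

# Sampling error for a household rating (hypothetical 10% rating).
households = 5000
rating = 0.10
se_points = 100.0 * math.sqrt(rating * (1 - rating) / households)
print(f"standard error at a 10% rating: {se_points:.2f} rating points")
```

Under these assumptions the chance variation in a single rating is several times larger than the 1/10 of a point said to be worth $100 million over a season, although averaging over a season reduces that variation.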

Nielsen is currently exploring alternative ways to measure the population's TV viewing habits. These include an "electronic eye" in sample households that could tell who is watching, and program codes that can be easily measured.

What crime statistics don't tell you.
Wall Street Journal, Jan. 8, 1997, A22
John J. DiIulio Jr., Anne Morrison Piehl

The Federal Bureau of Investigation states that there was a 3% decrease in crime in the first six months of 1996. However, this may be misleading. What rate exactly is this down from?

The federal government has two main ways of measuring crime: the Uniform Crime Reports (UCR) of the FBI and the National Crime Victimization Survey (NCVS) of the Bureau of Justice Statistics. Both undercount crimes.

The UCR has seven "index crimes": murder, forcible rape, robbery, aggravated assault, burglary, motor vehicle theft, and larceny theft. Other crimes simply are not counted. The information for these crimes comes from 95% of the population. Though a large percentage of the population is covered, a hierarchical counting method excludes some crimes from the count. Only the "important" or "serious" crimes are counted. Also, if a person is the victim of more than one crime at one time, only the most serious crime is counted. For example, if a woman is raped and beaten up and then her car is stolen, this would be counted only as a rape. The stolen car and the beating would be excluded.

In Philadelphia in 1995, crime went up about 15%. This was not primarily due to a rise in crime itself but to a breakdown of the hierarchical counting method. Now the old counting methods have returned, so the crime rates should at least appear to fall significantly, though this will not be a real decrease.

The NCVS is a newer crime-counting method established in 1989, but its findings were not published until 1995. According to the survey, only an estimated 35% of crimes are reported to the police. According to the old NCVS methods, about 34 million people were victims of crimes in 1992. According to the newer method, however, the total was more like 43 million, an increase of over 25%. Also in 1992, the old method counted 6.6 million victims of violent crime. The new method counted 10.3 million, a 55% increase. Thus, drops in crime rates seem less impressive than before.

Problems still exist with the NCVS system. Crimes committed against children 12 and under are not included in the survey, even though an estimated one in six rapes is the rape of a child. That is a lot of crime to be ignored. No surveys are taken in jails, public hospitals, homeless shelters, or other institutional settings, places where many victims are located. The survey also does not count serial victimizations. For example, if a woman is beaten by her husband and does not remember how many times she was beaten, it is counted as one crime only.

We still have great problems with crime. Until we can find a better and more exact means of counting crimes, we cannot let ourselves fall into a false sense of security in regard to crime.
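The percentage increases quoted for the old and new NCVS methods can be checked from the rounded counts given in the abstract. The sketch below uses only those rounded figures, so small differences from the article's percentages (which were presumably computed from unrounded counts) are to be expected. The last line is our own reading of the 35% reporting figure, not a number from the article.

```python
def percent_increase(old: float, new: float) -> float:
    """Percentage change going from old to new."""
    return 100.0 * (new - old) / old


# Counts for 1992 in millions of victims, as quoted in the abstract.
ALL_VICTIMS_OLD, ALL_VICTIMS_NEW = 34.0, 43.0
VIOLENT_OLD, VIOLENT_NEW = 6.6, 10.3
REPORTED_SHARE = 0.35  # estimated share of crimes reported to the police

all_increase = percent_increase(ALL_VICTIMS_OLD, ALL_VICTIMS_NEW)
violent_increase = percent_increase(VIOLENT_OLD, VIOLENT_NEW)
# If only 35% of crimes are reported, each police report stands for this many crimes.
crimes_per_report = 1.0 / REPORTED_SHARE

print(f"All victims: {all_increase:.1f}% increase")
print(f"Violent crime victims: {violent_increase:.1f}% increase")
print(f"Crimes per reported crime: {crimes_per_report:.2f}")
```

With the rounded inputs, the change in counting method alone produces increases of roughly 26% and 56%, far larger than the 3% decrease reported by the FBI.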

This is an op-ed piece. Mr. DiLulio is co-author of two recent books on crime and poverty, and Ms. Piehl is on the faculty of Harvard's School of Government.

Discussion Questions:

(1) Why is it so difficult to estimate the crime rate? What would you suggest to improve matters?

(2) Why do you think the issues raised here are not mentioned in articles dealing with the current decrease in the crime rate in the major cities?

On campus; Fighting the rankings of a college guide
The New York Times, 5 January, 1997, p1
Peter Applebome

This article describes the backlash on campuses against popular publications which rank colleges, especially the top-selling US News and World Report Guide to Colleges. Alma (Michigan) College recently released a survey of 158 college administrators, who ranked the US News guide last in quality among college guidebooks. Some 90 percent of respondents said the magazine's ranking was important, but only 7.6 percent of those surveyed said it was the one that most accurately described their school.

A letter to US News from Stanford president Gerhard Casper is quoted in part: "I hope I have the standing to persuade you that much about these rankings -- particularly their specious formulas and spurious precision -- is utterly misleading." Critics charge US News with contributing to a horse-race mentality surrounding the rankings, noting that it doesn't make much sense for, say, Harvard to be #1 in one year and #3 the next. Johns Hopkins has experienced wild swings in the last three years, going from #22 up to #10 then down to #15. Furthermore, the rankings process appears to be biased in favor of private institutions: the University of Michigan and University of California at Berkeley, widely regarded for excellence, do not appear among the US News Top 20 National Universities.

Mel Elfin, editor of the US News guide, defends the ratings as follows: "When you buy a VCR for 200 bucks, you can buy Consumer Reports to find out what's out there. When you spend 100 grand on four years of college, you should have some independent method of comparing different colleges. That's what our readers want, and they've voted at the newsstand in favor of what we're doing."


Discussion Questions:

(1) Do you think colleges can be compared the same way as automobiles or cleaning products? Do you think that the public is demanding that they be compared in this way?

(2) What does (Stanford President) Casper mean by "specious formulas and spurious precision"? Do you agree with him?

(3) In the article, Elfin attributes the perceived volatility in the rankings to the guide's willingness to listen to critics and adjust its methodology. He adds that the differences between the top schools are so small that focusing on the changes in numerical ranking misses the degree to which the rankings have been consistent and accurate. For instance, in the current rating of best national universities, Yale ranks first at 100.0, Princeton second at 99.8 and Harvard third at 99.6. "We've produced a list that puts Harvard, Yale and Princeton, in whatever order, at the top," Mr. Elfin said. "This is a nutty list? Something we pulled out of the sky?" Are consistency and accuracy the same thing?

(4) US News added a new category to the rankings last fall --"value added"--which is the difference between actual and predicted graduation rate. The magazine notes that the predicted rate is based on the median of average SAT or ACT scores of the school's students, and the educational expenditure per student. How do you think the predicted rate is calculated from these figures? Is a high value-added obviously a good thing?

Here is a series of three letters to the editor concerning the EPA's clean air standards.

The EPA's clean air-ogance.
The Wall Street Journal, 7 January 1997, A16
Steven J. Milloy and Michael Gough

Steven Milloy's Junk Science home page (there is a pointer from the Chance web site) contains critiques of scientific studies used to support policy decisions. Michael Gough is director of science and risk studies at the Cato Institute, a conservative think-tank. Here these authors accuse the Environmental Protection Agency (EPA) of biasing the benefit-cost analysis for proposed new restrictions on air pollution from ozone and particulate matter.

The EPA estimates that the particulate matter standards will save 20,000 lives per year. Since, according to Milloy and Gough, the EPA values a human life at $5 million, this translates into a $100 billion benefit to society. This far exceeds the $10 billion per year that compliance with the regulations will cost. However, the authors dispute the epidemiological evidence that was used to support the 20,000 figure. Based on data from 1982-1989 involving 550,000 adults in 151 metropolitan areas, the EPA study found that the most polluted communities had a 17% higher death rate than the least polluted. Milloy and Gough note that an observed association does not demonstrate that pollution is responsible for the difference in death rates. They charge that researchers failed to measure even one individual person for actual exposure to pollution but instead simply "guessed." Furthermore, the subjects certainly differed in behavioral, occupational and genetic factors that were not considered by the researchers. Adjusting for any of these factors could easily negate the 17% difference in risk. Finally, the authors note that "nobody has demonstrated how airborne particulates could cause higher death rates."
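The benefit-cost arithmetic in this paragraph is simple enough to lay out explicitly. The sketch below uses only the three figures quoted above. The break-even calculation is our own addition, intended to show how sensitive the conclusion is to the disputed 20,000 figure; it is not part of the EPA's or the authors' analysis.

```python
LIVES_SAVED_PER_YEAR = 20_000      # EPA estimate quoted in the letter
VALUE_PER_LIFE = 5_000_000         # dollars, as attributed to the EPA
COMPLIANCE_COST = 10_000_000_000   # dollars per year


def total_benefit(lives: int, value_per_life: int) -> int:
    """Dollar benefit of the lives saved in one year."""
    return lives * value_per_life


def break_even_lives(value_per_life: int, cost: int) -> float:
    """Number of lives per year at which the benefit just equals the cost."""
    return cost / value_per_life


benefit = total_benefit(LIVES_SAVED_PER_YEAR, VALUE_PER_LIFE)
net = benefit - COMPLIANCE_COST
threshold = break_even_lives(VALUE_PER_LIFE, COMPLIANCE_COST)

print(f"Benefit: ${benefit:,}")
print(f"Net benefit: ${net:,}")
print(f"Break-even lives saved per year: {threshold:,.0f}")
```

On these numbers the regulations would pay for themselves as long as at least 2,000 lives per year are saved, one tenth of the EPA estimate, so the dispute over the epidemiological evidence matters mainly if the true figure is far below 20,000.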

More generally, the authors conclude that the EPA needs to be held to stricter standards to prevent scientific rules of evidence from being bent to suit policy agendas. They claim that, among epidemiologists, increases in risks of less than 100% are considered "weak associations" which are inherently difficult to assess accurately.


Discussion Questions:

(1) In assessing the EPA study, should we be concerned that individual people were not monitored for exposure to pollution? What about the failure to control for such things as occupational factors?

(2) Do you believe that no one has demonstrated how airborne particulates could be unhealthful? What problems are involved in translating this into numbers of lives saved?

(3) Accepting that 20,000 lives will be saved, the benefit-cost analysis then hinges on the $5 million benefit accrued for each life. How do you suppose this monetary value was derived? What do you think of making trade-offs on this basis?

How foul is the air we breathe?
The Wall Street Journal, 28 January 1997, A17
Barry S. Levy, M.D.

Dr. Levy is the president of the American Public Health Association in Washington. He counters Milloy and Gough's charges of "junk science," labeling these "junk opinion!" He says that particulate air pollution is associated with bronchitis, chronic cough, respiratory distress and premature death, while ozone reduces lung function and defenses to respiratory infection. These assertions, he says, are backed up by numerous well-designed studies around the world, which have been very consistent in their results. Furthermore, he notes that most epidemiological studies do not require a doubling of risk for the results to be considered significant. He adds that, while the 17% represents a small increase in relative risk, it represents a large absolute risk in terms of number of people affected.

DISCUSSION QUESTION: Comment on the issue of statistical significance vs. practical significance, in the context of Dr. Levy's comments.

Dr. Levy's letter is immediately followed by one from S. Fred Singer of the Science and Environmental Policy Project. Singer backs up Milloy and Gough, adding that the case against ozone is even weaker than that against particulate matter. He even notes that the EPA has neglected the benefits of protection from skin cancer, since ground level ozone screens out UV rays. In any case, no matter how low the ozone limits are set, there will always be some people who will be adversely affected. He recommends that vulnerable individuals should simply curtail strenuous activities during "the occasional days when meteorological conditions cause ozone alerts."

Noting that benefit-cost estimates have differed widely, Singer supplies his own as follows:

"Tighter ozone standards at ground level would remove about 5% of the amounts of ozone that are feared to be removed (in the stratosphere) by CFCs. I have used the EPA's rather improbable benefit figure of $32 trillion (!) for phasing out CFC production, cited by EPA assistant administrator Mary Nichols in congressional testimony on Aug. 1 and Sept. 20, 1995. Assuming that 40% of the US population, in urban areas, is affected by smog, the disbenefits of the tighter ozone standard would reach the astounding level of $640 billion, far exceeding any benefits cited by the EPA."


Discussion Questions:

(1) Taking the letter at face value, can you follow Singer's calculations? Do you think he intends to be taken seriously?

(2) How do you think the $32 trillion was estimated?
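For readers who want to check their answer to question (1), here is one reading of Singer's arithmetic. It assumes he simply multiplied the three quantities in the quoted passage; the letter as quoted does not spell out the steps, so this is a reconstruction rather than his stated method.

```python
CFC_PHASEOUT_BENEFIT = 32_000_000_000_000  # dollars, the EPA figure Singer cites
OZONE_REMOVED_PCT = 5                      # ground-level ozone removed, relative to CFC losses
POPULATION_AFFECTED_PCT = 40               # share of the US population affected by smog

# Integer arithmetic keeps the result exact: benefit * 5% * 40%.
disbenefit = (
    CFC_PHASEOUT_BENEFIT * OZONE_REMOVED_PCT * POPULATION_AFFECTED_PCT // (100 * 100)
)

print(f"Implied disbenefit: ${disbenefit:,}")
```

The product matches the $640 billion in the letter, which suggests that the figure depends entirely on accepting the $32 trillion that Singer himself calls improbable.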

Is the CPI accurate? Ask the federal sleuths who get the numbers.
The Wall Street Journal, 16 January 1997, A1
Christina Duff

In recent Chance News (5.13), we discussed stories that the CPI (consumer price index) has been systematically overstating the inflation rate. The implications for tax rates, Social Security and balancing the federal budget are enormous. The present article says that the crux of the technical-sounding economic debate comes down to the following: does the index properly account for the times when consumers substitute less expensive goods (e.g., chicken for beef) when prices rise? Or when they buy a computer that is similarly priced yet much more powerful than what was available a few years before?

About 300 Bureau of Labor Statistics (BLS) employees are responsible for gathering the data that go into the monthly CPI estimates. This article chronicles some of the challenges faced by several of these workers. One of them, Sabina Bloom, travels 900 miles each month to visit some 150 sites, where she collects data on price changes. The sites are selected through BLS surveys indicating popular stores and categories of purchases. One such category might be "women's tops." Mrs. Bloom's job is then to interview a store-keeper to identify an item, size and style (short- or long-sleeve; tank-top or turtle-neck; etc.) for the comparison.

Taking discounts into account can be problematic; it is not uncommon to find sales of the form "save 45%-60% when you take an additional 30% off permanently reduced merchandise--discounts taken at register." When an exact item cannot be found, price-takers must use their judgment to find a substitute. In many cases, the price-takers must rely on the memory of local experts. For example, the price of a bacon, lettuce and tomato sandwich in a restaurant may not have changed, but the number of strips of bacon may have been reduced to compensate for a rise in the price of bacon.
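To see why the sale sign quoted above is hard to price, it helps to work out how stacked discounts combine. The sketch below assumes that "save 45%-60%" refers to the total saving off the original price and that the additional 30% is applied to the already reduced price; the sign itself does not say, and that ambiguity is part of the price-taker's problem.

```python
def combined_discount(first: float, additional: float) -> float:
    """Total fraction off the original price when a second discount
    is applied to the already reduced price."""
    return 1.0 - (1.0 - first) * (1.0 - additional)


def implied_first_discount(total: float, additional: float) -> float:
    """Permanent reduction needed for the stacked discounts to reach `total`."""
    return 1.0 - (1.0 - total) / (1.0 - additional)


ADDITIONAL = 0.30  # extra discount taken at the register

low = implied_first_discount(0.45, ADDITIONAL)
high = implied_first_discount(0.60, ADDITIONAL)

print(f"Permanent reductions implied: {low:.1%} to {high:.1%}")
print(f"Check: {combined_discount(low, ADDITIONAL):.0%} to "
      f"{combined_discount(high, ADDITIONAL):.0%} total")
```

Under this reading the "permanently reduced" prices were marked down by roughly 21% to 43% before the register discount. Discounts stack multiplicatively, so 30% on top of 30% is a 51% saving, not 60%.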

An additional problem with the CPI is that the master list of categories to be priced is updated only once every ten years. For example, cellular phones are too new to be included. Even seemingly standard items like television sets are problematic. How much of an observed price increase is due to inflation, and how much is due to quality improvements, such as adding stereo sound, cable capability or power efficiencies?


Discussion Questions:

(1) The article notes that, in the interviews with shop-keepers, items which generate the highest revenue have the best chance of being picked. What biases might be introduced by this practice?

(2) What are the pros and cons of basing monthly categories on surveys of consumer behavior?

Mammogram talks prove indefinite; mammogram panel refrains from recommending screening for women in their 40's.
The New York Times, 24 January 1997, A1
Gina Kolata

A panel of 13 impartial medical experts and consumer representatives convened by the National Institutes of Health concluded that it could not recommend that all women in their 40's have mammograms. It decided that there was no persuasive evidence that regular mammograms for healthy women under 50 would save lives. The recommendation was made despite the findings of a recent conference in Falun, Sweden that concluded that mammograms for women from 40 to 49 could reduce the breast cancer death rate by 16%.

Some experts claim that any benefit for women in their 40's is outweighed by the risk that false alarms in mammogram results will lead to unnecessary surgery and other treatment. For women over 50, however, there is unanimous agreement that mammograms are beneficial because the risk of the disease rises and cancerous tumors are easier to detect than in younger women.

The panel said mammograms for women between the ages of 40 and 50 might save no lives at all or, at best, might save the lives of 10 women out of every 10,000 who underwent the screening every year. That benefit is offset by risks, such as a 30% chance over a decade of being falsely told that there might be a tumor present, and a risk of detecting a tiny lump that might or might not be cancerous but would require treatment as if it were a real cancer. About 40% of the lumps found by mammograms in women between 40 and 49 are an ambiguous sort of tiny tumor, called intraductal carcinoma in situ. No one knows which of these precancerous lumps will turn into cancer and which will not but, in the face of uncertainty, doctors treat them as if they were cancer. Some women treated for them have lumpectomies, either with or without radiation treatment, but 40% have the entire breast removed.

In addition, mammogram radiation itself might cause a small number of cancers. For every 10,000 women between the ages of 40 and 49 who have annual mammograms, about 3 might develop breast cancer from the X-rays used in mammography. Mammograms can also give false reassurances. They miss about 25% of all invasive breast cancers in women in their 40's, versus 10% in women 50 and older.
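The panel's figures are easier to compare when they are put on a common scale. The sketch below tallies them for a hypothetical group of 10,000 women in their 40's screened annually. It assumes that the lives-saved, radiation and false-alarm figures all refer to the same group over a comparable period, which the article does not make explicit, so the ratio at the end should be read as a rough illustration only.

```python
COHORT = 10_000  # hypothetical group of women aged 40 to 49 screened annually

LIVES_SAVED_BEST_CASE = 10       # per 10,000 women, the panel's upper estimate
LIVES_SAVED_WORST_CASE = 0       # the panel's lower estimate
RADIATION_CANCERS = 3            # per 10,000 women, from the X-rays themselves
FALSE_ALARM_RATE_DECADE = 0.30   # chance of a false alarm over a decade
MISS_RATE_40S = 0.25             # invasive cancers missed, women in their 40's
MISS_RATE_50_PLUS = 0.10         # invasive cancers missed, women 50 and older

false_alarms = round(COHORT * FALSE_ALARM_RATE_DECADE)
false_alarms_per_life = false_alarms / LIVES_SAVED_BEST_CASE
detected_40s = 1.0 - MISS_RATE_40S
detected_50_plus = 1.0 - MISS_RATE_50_PLUS

print(f"Women with a false alarm over a decade: {false_alarms:,}")
print(f"Lives saved: {LIVES_SAVED_WORST_CASE} to {LIVES_SAVED_BEST_CASE}")
print(f"Cancers possibly caused by X-rays: {RADIATION_CANCERS}")
print(f"False alarms per life saved (best case): {false_alarms_per_life:.0f}")
print(f"Invasive cancers detected: {detected_40s:.0%} (40's) vs {detected_50_plus:.0%} (50+)")
```

Even in the best case, this reading gives hundreds of false alarms for each life saved, which may help explain why the panel declined to make a blanket recommendation.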


Discussion Questions:

(1) After the announcement of this panel was made, there was a terrific outcry from women's groups, radiologists, and other experts claiming that the recommendations of the panel were irresponsible and incorrect. Panel members said they were amazed by this reaction. Do you think they should have been amazed?

(2) The panel recommended that women between the ages of 40 and 50 should, in consultation with their doctors, make their own decision whether to have a mammogram or not. Do you think this is a reasonable recommendation?

(3) Do you think the panel's recommendation will have an effect on the willingness of insurance companies to pay for mammograms for women between the ages of 40 and 50?

Ask Marilyn
Parade Magazine, 26 Jan. 1997, p. 13
Marilyn vos Savant

Marilyn is asked:

I play the lottery by buying one $1 ticket each week, using the same six numbers each time. The range of numbers is 1 through 49, inclusive. How long will I have to live to be mathematically certain that my number will come up at least once?

---William J., Kansas City, MO.

Marilyn says that there is no such time: the reader would have to live for an infinite amount of time.
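Marilyn's answer can be made concrete with a short calculation. The sketch below assumes a standard 6-of-49 lottery in which the jackpot requires matching all six numbers and the weekly draws are independent; the reader's letter does not describe the rules in that much detail. Under those assumptions, the chance of at least one win in n weeks is 1 - (1 - 1/N)^n, where N is the number of possible tickets. This approaches 1 as n grows but never reaches it.

```python
import math

POSSIBLE_TICKETS = math.comb(49, 6)  # ways to choose 6 numbers from 49
LOG_P_LOSE = math.log1p(-1.0 / POSSIBLE_TICKETS)  # log of the weekly chance of losing


def p_win_at_least_once(weeks: int) -> float:
    """Chance that the same six numbers come up at least once in `weeks` draws."""
    return -math.expm1(weeks * LOG_P_LOSE)


def weeks_needed(target: float) -> int:
    """Smallest number of weekly draws giving at least `target` chance of a win."""
    return math.ceil(math.log1p(-target) / LOG_P_LOSE)


eighty_years = 52 * 80
print(f"Possible tickets: {POSSIBLE_TICKETS:,}")
print(f"Chance of a win in 80 years of play: {p_win_at_least_once(eighty_years):.6f}")
print(f"Years for a 50% chance: {weeks_needed(0.5) / 52:,.0f}")
print(f"Years for a 99% chance: {weeks_needed(0.99) / 52:,.0f}")
```

No finite number of weeks gives certainty, but the function `weeks_needed` shows how long one would have to play for any chosen probability short of 1.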


Discussion Question:

How would you have answered the question?

Well, if you have gotten this far, we will reward you with the Animal Cracker Story.

Survival Strategies Among Animal Crackers.
Journal of Irreproducible Results Jan./Feb. 1991, p 15
Blackwell Scientific Publications.

Nabisco Brands, Inc.
Attn: Manager, Customer Service
Barnum Animals
East Hanover, NJ 07936

Dear Sir or Madam:

We bought a box of Barnum's animal crackers today and would like to make several comments. We hope you will accept these in the spirit of constructive criticism:

1. We could not help noticing that many of the crackers were in several pieces. While we did not make a detailed survey, it appeared that this was particularly true of the prey. Also, a few of the predators seemed larger than others of their kind. We think the problem here is that you have packaged the predators and prey in the same small box. A simple textbook on ecology will explain why this results in fragmented prey.

2. You indicate on the bottom of the box, "When writing to us, please enclose the top of the package." We are not doing this, however, because the top of the package has an attractive folding clown on it. We spent quite a while rummaging through the animal crackers boxes at the A&P, trying to decide on whether we wanted a clown, ringmaster, etc. The clown was clearly superior in our judgment, and we want to keep it. Furthermore, we feel that this clown has been through enough as it is, for he suffered greatly when we tried to open the package according to your instructions, "To open, pinch here."

As we said, we don't want you to think we are just being negative, so here are some positive comments and suggestions for improvements:

1. You are to be congratulated for having the clown on the outside of the box, and not on the inside, since there's no telling what the predators would do to the clown, given what they've done to the prey.

2. Several possible solutions to the predator-prey problem come to mind. You might have a larger box, perhaps with compartments. Also, do you feed the predators just before packing? Another possibility is to have several different types of products, such as "Barnum's Clowns," "Barnum's Predators," "Barnum's Don't Cares," etc.

Thank you for your interest and concern on these matters.

Mr. and Mrs. Robert L. Feldman and family (David and Elana)
Ithaca, New York


Please send comments and suggestions to jlsnell@dartmouth.edu.

