CHANCE News 9.01

(November 17, 1999 to January 2, 2000)


Prepared by J. Laurie Snell, Bill Peterson and Charles Grinstead, with help from Fuxing Hou, and Joan Snell.

Please send comments and suggestions for articles to

Back issues of Chance News and other materials for teaching a Chance course are available from the Chance web site:

Chance News is distributed under the GNU General Public License (so-called 'copyleft'). See the end of the newsletter for details.

Chance News is best read using Courier 12pt font.


To understand God's thoughts we must study statistics,
for these are the measure of His purpose.

Florence Nightingale


Contents of Chance News 9.01


Note: If you requested a CD-ROM of the Chance Lectures and have not received it, please send another request to jlsnell@dartmouth.edu with the address where it should be sent. Others who would like this CD-ROM are also invited to send such a request. There is no charge.

Norton Starr has passed on the latest Forsooth items from the RSS News (Dec. 99, Vol. 27, No. 4):

...it was revealed that seven people from the city (Sheffield) were fatally injured at work last year.

...the figures are for one year only and may not be typical, but they may reflect a disturbing trend.

Sheffield Star
2 March 1999

Professor Robert Winston is obviously a very clever man, but I disagree with the calculations of the probability of the Jim twins' amazing life stories (The Secret Life of the Twins, 21 July).

If the probability of owning the car they both owned is 1 in 7, the probability of both twins owning the same car is 1 in 7, not 1 in 49 as stated by the professor.
Reply from programme's producer:
The 1 in 49 odds quoted were correct for two men, by chance, driving a Chevrolet. The odds of two men both driving any make of car depends on the number of makes of cars and sales of each made available at the time. The script was checked with a statistician.

The Radio Times (letter)
14-20 August 1999

The number of automatic plant shutdowns (scrams) remained at a median of zero for the second year running, with 61% of plants experiencing no scrams.

Nuclear Europe Worldscan

Since it is not always completely clear why a particular item makes the Forsooth column, Norton Starr has volunteered to give his explanation for any items you find mysterious. His e-mail address is NSTARR@amherst.edu.

The "Statistical Education Research Newsletter" is a new publication of the International Association for Statistical Education (IASE). This newsletter is an outgrowth of a similar newsletter started in 1996 by our Chance colleague Joan Garfield. The newsletter will include summaries of recent research papers, books, dissertations, bibliographies on specific topics, information about recent and future conferences, and interesting internet resources. It will also include short research papers. You can learn more about this newsletter and find the first issue at the homepage for The IASE Statistical Education Research Group.

We would like to mention some other sources of chance news.

Warren Page edits a column "Media Highlights" in "The College Mathematics Journal" published by the Mathematical Association of America. In the November 1999 issue (Vol. 30, No.5) there are two articles of interest to our readers. Here they are.

Surgical Audit: Statistical Lessons from Nightingale and Codman, David J. Spiegelhalter. Statistics in Society (Journal of the Royal Statistical Society, Series A) 162:1 (1999) 45-58.

After giving the Florence Nightingale quote we used for this Chance News, reviewer Tom Short writes:

Ernest Codman (1869-1940) was a Boston surgeon who championed what Spiegelhalter calls the "End Results Idea" for monitoring surgical outcomes. Codman's "clinical" approach was to follow every patient's case history after surgery and to identify individual surgeons' errors on specific patients. In contrast, Nightingale advocated an "epidemiological" approach to surgical audits, focusing on summary statistics of mortality rates and demographics. As modern health care systems face the challenge of surgical audits, Spiegelhalter reminds us that the tension between examining individual cases and summary statistics dates back more than a century. His article guides us through the statistical careers of Nightingale and Codman. There is a fascinating case study that tracks one surgeon's success rate for a difficult surgery, and concludes with a recommendation for a synthesis between the epidemiological and clinical approaches to reporting surgical outcomes. A historical perspective may not relieve the statistical tension in the health care industry, but it should help to provide context and direction for the arguers.

The second article also reviewed by Tom Short is:

Developing a Census Data System in China, Jianfa Shen, David Chu, Qingpu Zhang, and Weimin Shang. International Statistical Review 67:2 (1999) 173-186.

In his review Tom writes:

It is refreshing to read an article about a census that does not argue the legal and political ramifications of urban undercount in the U.S., but it is surprising to learn just how different the issues of concern are for the census in China.
Tom goes on to indicate what some of these differences are.

Next we remind our readers to watch John Paulos' monthly column for on-line ABC news. John's column appears on the first of each month. His Jan 1, 2000 column is based on the recent National Academy of Sciences well-publicized report stating that each year, between 40,000 and 100,000 deaths in this country are a consequence of medical error. As usual, Paulos has an interesting and original way to look at this problem. You can find his discussion at: Who's Counting?

Finally, Chance Magazine has a column called "Chance Musings From the Press." In the latest issue, Fall 1999, we find several interesting articles discussed. While most are serious, this one might be considered humorous:

"Play the Odds on the New York Subway"

Al O'Leary who handles public relations for the New York City Transit Authority, was surprised at all the phone calls about the dead man who rode the subway Monday morning. "It's not unusual," he said. "We move 4.2 million people every day. How could you not find a diseased person now and then?"
     The Washington Post, June 16, 1999

In this same issue of Chance Magazine we enjoyed the following article:

A graphical investigation of the scourge of Vietnam.
Chance Magazine, Fall 1999, Visual Revelations, p. 44
Howard Wainer

Howard begins his article by telling us that Fred Mosteller is supposed to have said that by leaving the Princeton math department to join the math department at Harvard he succeeded in raising the IQ in both places. Alas, we thought we had tracked the history of this great statistical ploy to the former San Francisco Chronicle writer Herb Caen (see Chance News 5.03). We must admit that Fred Mosteller is a more likely person to have thought it up.

Wainer says:

a good display has many purposes but it reaches its highest value when it forces you to see something you weren't expecting.

This happened to Wainer when he plotted a graph showing the number of SAT exams taken from 1960 to 1999. He noted that the number seemed to be picking up in recent years, after being more or less constant (about a million a year) for a number of years. However, he also saw a large bulge (a sixty percent increase followed by a similar decrease) occurring in the 60's during the period of the Vietnam War.

To look further, Wainer plotted graphs of the SAT scores and the percent of male recruits who scored above the median on the Armed Forces Qualifying Exam (AFQE). In each case the graphs showed a significant decrease during the Vietnam war period. The decrease in the AFQE presumably occurred because some of the cream of the military was siphoned off into the college population by the Vietnam war. The lowering of the SAT scores might then be attributed to the Mosteller effect: the Vietnam war lowered the average intelligence of both the college students and the military. Wainer notes that this appears to be peculiar to the Vietnam war, since similar data suggest an opposite effect during World War II.
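The arithmetic behind the Mosteller effect is easy to check with a toy example. The sketch below uses scores that we have invented purely for illustration (they are not Wainer's data): moving someone who is above the average of one group but below the average of the other lowers both averages.

```python
from statistics import mean

# Hypothetical test scores, invented purely for illustration.
military = [40, 50, 60, 70, 80]
college = [85, 90, 95, 100]

before = (mean(military), mean(college))

# Move the top military scorer into the college group. This person is
# above the military average but below the college average.
mover = max(military)
military_after = [s for s in military if s != mover]
college_after = college + [mover]

after = (mean(military_after), mean(college_after))

print("averages before the move (military, college):", before)
print("averages after the move  (military, college):", after)
```

Moving a person in the opposite direction (below the average of the group left, above the average of the group joined) would raise both averages, which is the version of the story told about Mosteller.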

Bob Hayden sends us statistical cartoons: His latest is a Dilbert cartoon with the text:

Dilbert is responding to a paid interactive commercial on the tube.

Studies have shown that monkeys can pick stocks better than most professionals.

That's why the Dogbert Mutual Fund employs only monkeys.

Yes, our fees are high, but I don't apologize for hiring the best.
Here's a Dilbert cartoon we noticed in this morning's paper (Jan 3, 2000):
Dogbert consults.

You need to do data mining to uncover hidden treasures.

If you mine the data hard enough you can also find messages from God.

...sales to left handed squirrels are up... and God says your tie doesn't go with that vest.

In the last Chance News, Harold Brooks sent us a note about a recent NBC Evening News program with Tom Brokaw in which the incidence of breast cancer and the death rate from breast cancer on Long Island were compared to national averages, with the implication that they were significantly higher. Harold wondered how it was determined that the differences are significant. Milton Eisner, a health statistician at the National Cancer Institute, sent us an answer to Harold's question. It was provided by the explanatory notes to the SEER Cancer Statistical Review. The review which includes these notes is available here.

Here is how these notes say that significance is tested:

The percent difference (PD) between the individual states and the rate for the total U.S. is based on the formula

PD = 100 (State rate - Total U.S. rate) / Total U.S. rate.

The standard errors provided for each age-adjusted rate are calculated based on the assumption that, for each age-adjusted rate, the number of deaths is a Poisson variable, with the variance of the age-adjusted rate being a linear combination of the variances of the age-specific rates. The difference between each age-adjusted state rate and the age-adjusted total U.S. rate is tested for statistical significance by calculating a Z statistic from the following formula:

Z = (State rate - Total U.S. rate) / SE_d

The notes point out sources of error that can enter into these rate values. Errors in the numerator come from misclassification of the disease, under-registration of deaths, etc. Errors in the denominator can occur from under- and over-enumeration in the census. This would be particularly serious when comparing rates by race.

The notes also discuss a number of issues in interpreting the results of the significance tests. They warn that these rate significance tests are not appropriate for assessing geographic clustering.

Finally, they note that for many cancers the District of Columbia is found to have the highest mortality rates. In some cases this can be explained simply by the fact that mortality rates tend to be higher in urban areas and DC is predominately urban.

Milt Eisner also pointed out that the death rates in the L.A. Times article "Trying to map elusive N.Y. cancer source" that we discussed in the last Chance News were incorrect. He writes:

The female breast cancer death rate for the US for the period 1992-1996 was 25.4 deaths due to breast cancer per year per 100,000 females of all ages (not "25.4%" as stated.) The corresponding rates for Nassau County and Suffolk County were 30.5 and 31.1, respectively.
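As a rough check, the percent-difference and Z formulas quoted from the SEER notes can be applied to the rates Milt gives. The sketch below is ours, not SEER's code. The notes apply the formulas to states and we apply them to counties, and the standard error SE_d is not reported in anything we have quoted, so the value used here is a made-up placeholder and the resulting Z values mean nothing by themselves.

```python
def percent_difference(local_rate, us_rate):
    """PD = 100 (local rate - U.S. rate) / U.S. rate."""
    return 100.0 * (local_rate - us_rate) / us_rate


def z_statistic(local_rate, us_rate, se_d):
    """Z = (local rate - U.S. rate) / SE_d."""
    return (local_rate - us_rate) / se_d


# Deaths per year per 100,000 females, 1992-1996, as reported by Eisner.
US_RATE = 25.4
county_rates = {"Nassau": 30.5, "Suffolk": 31.1}

# HYPOTHETICAL standard error of the difference: not given in the source.
HYPOTHETICAL_SE_D = 1.0

for county, rate in county_rates.items():
    pd = percent_difference(rate, US_RATE)
    z = z_statistic(rate, US_RATE, HYPOTHETICAL_SE_D)
    print(f"{county}: PD = {pd:.1f}%, Z (with made-up SE_d) = {z:.1f}")
```

With the real standard errors in place of the placeholder, Z would be compared with the usual normal critical values.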

The reader can find further information on the Long Island Breast Cancer Study Project


Discussion Question:

As the District of Columbia example shows, care must be taken in interpreting a significant result. How do you think they go about trying to decide the cause of a significant increase in cancer rates at a specific location?

The next article was provided by Norton Starr.

Year 2000 computer problems may get an alibi.
The New York Times, 14 December, 1999, C1
Barnaby J. Feder

In today's world, we rely on electronics in many ways. It is well-known that complicated systems, such as telephone service and electric power distribution, are subject to occasional problems. This article reports on the breakdown rates for many activities. Such information will help determine whether problems that occur on January 1, 2000 are the result of Y2K problems.

Some examples of failure rates are: ten percent of automated teller transactions fail on their first attempt, usually because of customer errors; in the last five years, tens of thousands of residents of Canada and the United States have lost power in late December or early January; emergency 911 service is disrupted somewhere in the U.S. on the average of once per week; pipelines carrying hazardous materials averaged 16 reportable disruptions for the period of December 31 to January 3 over the last 3 years.

These rates form a 'baseline' to which the actual rates of different types of problems occurring on January 1 will be compared. While such information will do little to convince the average person that a problem is not caused by Y2K, it will help the authorities identify serious Y2K problems more quickly.
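One simple way to put such baselines to work is sketched below. The Poisson model is our assumption, not something the article proposes: it treats disruptions as independent events occurring at a constant average rate, which real outages may well violate. Under that assumption we can ask how likely a given count would be if nothing unusual were going on.

```python
import math


def poisson_tail(k, mean):
    """P(X >= k) for X ~ Poisson(mean); the Poisson model is an assumption."""
    cdf = sum(math.exp(-mean) * mean**i / math.factorial(i) for i in range(k))
    return 1.0 - cdf


# 911 service: about one disruption per week somewhere in the U.S., so a
# four-day window (December 31 to January 3) has a mean of 4/7 disruptions.
p_911 = poisson_tail(1, 4 / 7)
print(f"P(at least one 911 disruption in 4 days) = {p_911:.3f}")

# Pipelines: baseline average of 16 reportable disruptions for the period.
for observed in (16, 20, 25, 30):
    p = poisson_tail(observed, 16.0)
    print(f"P(at least {observed} pipeline disruptions) = {p:.4f}")
```

A small tail probability would flag a count as hard to explain by ordinary bad luck, which is the sense in which the baseline helps separate Y2K problems from normal ones.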

It is less clear whether the authorities will know how to use these baseline rates. William Ulrich, a Y2K expert in Soquel, California, says that "most people will have experts who know what's normal in their command posts but 90 percent will be doing their assessments based on gut feeling."

The article concludes with a somewhat mysterious quote from another Y2K expert, Ann Coffou, in Cambridge, Massachusetts: "Too much information is just as bad as not enough."

Discussion Questions:

(1) What is the difficulty of too much information in such a context?

(2) What is a useful way to combine the insight of experts relying on their gut feeling, with modern statistical methods?

(3) Is it reasonable to think that there will be fewer problems over the period considered this year than there were in previous years?

Is complexity interlinked with disaster? Ask on Jan. 1.
The New York Times, 11 December, 1999, B11
Laurence Zuckerman

This article describes a theory, called "normal accident theory," that will be tested during the first few days and months of the year 2000. In a 1984 book, "Normal Accidents: Living With High-Risk Technologies," the author, Charles Perrow, argued that disasters such as the near meltdown of the Three Mile Island nuclear reactor and the explosion of the Space Shuttle Challenger should not be thought merely to be the result of "human error." This theory posits that the emergence of more and more intricate and interconnected systems causes such accidents, hence they should be thought of as "normal accidents".

Of course, the reason for this article at this time is that it is expected that many complicated systems will be tested by the date change on January 1, 2000. By the time you read this, much will have been reported about how well or how poorly such systems fared.

A competing view of systems is called the "high-reliability" view. People who subscribe to this view believe that by building many backup systems, safety can be enhanced. The normal accident theorists argue that the existence of complicated backup systems can actually increase the likelihood of an accident.

It is very interesting to consider the area of commercial aviation with respect to the above theories. Although it would seem that flying is an inherently risky system, in fact it is highly reliable. This area therefore seems to support the high-reliability view. The National Transportation Safety Board and the Federal Aviation Administration can be thought of as backup systems, and of course are quite intricate in their workings. The planes themselves are extremely complicated, and are filled with backup systems. The reader might consider how a normal accident theorist would respond to the apparent facts that aviation is complex and yet safe. In this connection you might like to read the article "Blowup" by Malcolm Gladwell, The New Yorker, Jan. 26, 1996, pp. 32-36. (see Chance News 5.02).


Discussion Questions:

(1) What do you think Charles Perrow learned from the Y2K experience?

(2) Do you agree that airplane experience seems to support the "high-reliability" theory? What do you think Charles Perrow would say about this?

Dan Rockmore suggested the following article from our local newspaper:

Tough times.
Valley News, Dec 12, 1999, D1
Bruce Wood

This article is based on an interview with the Dartmouth football coach John Lyons. In his first six years as coach the Dartmouth football team did very well, winning two conference titles and being a serious contender in the other years. However, in the past two years the Dartmouth team has had a 3-11 record in the Ivy League conference. In this interview, Lyons discusses factors that he feels played a role in this dramatic change in the fortunes of the football team. Lyons discusses a number of factors but the most interesting from our point of view has to do with the effect of the recent increase in the average SAT scores at Dartmouth.

In the early eighties the Ivy League established a set of rules for recruiting athletes in the so-called money sports: football, men's basketball, and hockey. Athletic scholarships have never been allowed in the Ivy League but this agreement added an academic requirement. We had difficulty finding an exact formulation of these rules and so have relied on the description given in the book "A is for Admission", by Michele A. Hernandez, Warner Books, 1997. Hernandez is a former Dartmouth admission officer.

The new requirement involved the introduction of an academic index defined as follows:

The academic index (AI) is the sum of three factors:

(1) the average of the student's highest SAT math and verbal scores, converted to a two-digit score (for example, 720 becomes 72),

(2) the average of the student's three highest SAT subject tests (also converted to a two-digit score) and

(3) the student's converted rank score (CRS).

The CRS is computed by a complicated algorithm and is based on the student's high school rank in class or whatever related information the school gives.

The AI is at most 240 and the Dartmouth average in 1997 was around 212. Under the Ivy League agreement the money teams must maintain an average academic index that is no more than one standard deviation below the college's average academic index, and individual recruits must have an academic index of at least 169. Hernandez remarks that football has more specific "bands" (or ranges of athletes that it can accept) related to the individual college's average academic index.
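To make the definition concrete, here is a minimal sketch of the calculation as we understand it. We assume that each SAT average is put on an 80-point scale by dividing by 10, which is what makes the maximum of 240 come out right, and we take the CRS as a given number of at most 80 since we do not know the algorithm that produces it. The function name and the sample scores are ours.

```python
MAX_AI = 240
MIN_INDIVIDUAL_AI = 169  # floor for individual recruits, from the text


def academic_index(sat_math, sat_verbal, subject_scores, crs):
    """Sum of the three AI factors, under the scaling assumption above."""
    # Factor 1: average of the best math and verbal scores, on an 80-point scale.
    sat_part = (sat_math + sat_verbal) / 2 / 10
    # Factor 2: average of the three highest subject tests, same scale.
    top_three = sorted(subject_scores, reverse=True)[:3]
    subject_part = sum(top_three) / len(top_three) / 10
    # Factor 3: converted rank score, supplied by the caller.
    return sat_part + subject_part + crs


# A perfect record reaches the stated maximum.
print(academic_index(800, 800, [800, 800, 800], 80))

# A made-up recruit, checked against the individual floor.
recruit_ai = academic_index(700, 720, [650, 700, 680, 600], 70)
print(round(recruit_ai, 1), recruit_ai >= MIN_INDIVIDUAL_AI)
```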

The coaches submit their lists to the admission office and get feedback in the form of likely, unlikely, possible etc. On the basis of this they adjust their lists and submit new ones -- a process that Hernandez calls "long and tiresome" both for the coaches and the athletes.

The article points out that, until recently, the average SAT scores of Dartmouth students were between those of Harvard, Princeton and Yale, and those of Brown, Columbia, Cornell, and Pennsylvania. This gave the team a nice niche for the "bands" from which it could recruit. But alas, along came a new president, James Freedman, who was determined to raise the intellectual climate of Dartmouth. He succeeded so well that now the average SAT scores are comparable to those of Harvard, Princeton and Yale, and the coach feels that this has made it harder to recruit good football players.


Discussion Questions:

(1) Hernandez writes:

Have you ever wondered why you'll see a really brilliant student sitting on the bench but not seeing much playing time?

What do you think she has in mind?

(2) According to the US News & World Report college issue, the 25th-75th percentiles for SAT scores are Dartmouth (1,350-1,520), Princeton and Yale (1,360-1,540) and Harvard (1,400-1,580), while below these we have Cornell (1,260-1,450), Columbia (1,290-1,490), Brown (1,290-1,500) and Penn (1,300-1,480). On the basis of this, do you think there is much difference in the standard deviations for such scores between the schools?

(3) Obviously, the coach would like to see as large a standard deviation as possible. Does this mean that, in fact, he should help the admission office recruit really brilliant students whether or not they are interested in athletics?

Losing strategies can win by Parrondo's Paradox.
Nature, vol. 402, 23/30 December, 1999, pg. 864
Gregory P. Harmer, Derek Abbott

This article reports on a new paradox discovered recently by Juan Parrondo. We start with two games, each of which is biased against the player. Game A consists of flipping a coin that has probability 1/2 - c of coming up heads, where c is a small positive number (in the article, c is taken to be .005). The player wins $1 if the coin comes up heads; otherwise she loses $1. In Game B, there are two coins; the first has probability 1/10 - c of coming up heads, and the second coin has probability 3/4 - c of coming up heads. If the player's current holdings are a multiple of 3, then she next tosses the first coin; otherwise she next tosses the second coin. In either case, she wins or loses $1 depending upon whether the coin comes up heads or not.

Game A is clearly biased against the player. It is not obvious, but it is the case, that Game B is also biased against the player. We will give an intuitive argument, and then a more mathematically rigorous one. The key quantity to estimate is the long-term probability of winning any particular flip. If, in the long run, the player's holdings are 0, 1, or 2 (mod 3) with probabilities p_1, p_2, p_3, then the probability of winning a particular flip is just

(1/10 - c) p_1 + (3/4 - c) p_2 + (3/4 - c) p_3.

This is just the weighted average of the heads probabilities of the two coins, where the weights are the percentages of times that the holdings are 0, 1, or 2 (mod 3).

The intuitive argument runs as follows. If the holdings are 0 (mod 3), then it is very likely that after the next flip, the holdings will be 2 (mod 3). If the holdings are 1 (mod 3), then it is very likely that after the next flip, the holdings will be 2 (mod 3). Finally, if the holdings are 2 (mod 3), then it is very likely that after the next flip, the holdings will be 0 (mod 3). From this it is reasonable that the holdings are much more likely to be either 0 or 2 (mod 3) than 1 (mod 3). But from this it follows that the probability of winning any particular flip will perhaps be not much more than the average of (1/10 - c) and (3/4 - c). The average of these two numbers is 17/40 - c, which is less than 1/2, so perhaps the probability of winning any particular flip is also less than 1/2. In this case, the game is unfavorable.

Parrondo's paradox is that if play switches between the two games in a suitable pattern, then the composite game is favorable to the player. Different sequences of plays of the two games lead to different biases in favor of (or against) the player. For example, the game is more favorable if the sequence repeats the block A, A, A, B, B, than if it repeats the block A, B. In addition, if the games are played according to a random sequence (with probability p of playing Game A), then for many values of p (including p = 1/2), the composite game is favorable.

We now give a more rigorous argument for the statement that Game B is unfavorable. This argument can be generalized to the case where the games are played according to a random sequence. We consider Game B to be a Markov chain, with states 0, 1, and 2, corresponding to the holdings (mod 3). The various transition probabilities, i.e. the probabilities of moving from one state to another, are given by the probabilities that the two coins come up heads or tails. For example, the probability that the chain moves from state 0 to state 2 is (9/10 + c).

Standard Markov chain theory tells us that in the long run, the chain will be in the various states certain fractions of the time; these fractions are given by the fixed vector of the chain, which in this case is approximately (.384, .154, .462). (Note that these fractions are in line with our intuitive argument above.) Using these fractions, one can compute that the probability of winning any particular toss, in the long run, is about .4957. Thus, the game is unfavorable.

If the two games are played randomly, then we again have a Markov chain, this time with 6 states; the states are labeled by the game that will next be played (A or B) and the holdings (mod 3) (either 0, 1, or 2). If, for example, the games are equally likely to be played, then one can compute that the probability of winning any particular toss, in the long run, is about .5079. This means that the game is favorable.
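Readers who want to check these numbers can do so with a few lines of code. The sketch below is ours and makes one simplification: because the choice of game in the random version does not depend on the holdings, we collapse the 6-state chain to 3 states by averaging the two games' win probabilities in each state. The fixed vector is found by repeatedly applying the transition matrix rather than by solving equations.

```python
C = 0.005  # the bias c used in the article


def transition_matrix(p_win):
    """p_win[i] is the chance of winning $1 when holdings are i (mod 3)."""
    P = [[0.0] * 3 for _ in range(3)]
    for i, p in enumerate(p_win):
        P[i][(i + 1) % 3] = p          # a win moves the holdings up by one
        P[i][(i - 1) % 3] = 1.0 - p    # a loss moves the holdings down by one
    return P


def fixed_vector(P, steps=2000):
    """Approximate the long-run state fractions by iterating v -> vP."""
    v = [1 / 3] * 3
    for _ in range(steps):
        v = [sum(v[i] * P[i][j] for i in range(3)) for j in range(3)]
    return v


def long_run_win_probability(p_win):
    v = fixed_vector(transition_matrix(p_win))
    return sum(v[i] * p_win[i] for i in range(3))


game_a = [0.5 - C] * 3
game_b = [0.1 - C, 0.75 - C, 0.75 - C]
# Random play with p = 1/2: average the two games state by state.
mixed = [(a + b) / 2 for a, b in zip(game_a, game_b)]

print("Game B fixed vector:", [round(x, 3) for x in fixed_vector(transition_matrix(game_b))])
print("Game B win probability:", round(long_run_win_probability(game_b), 4))
print("Random mix win probability:", round(long_run_win_probability(mixed), 4))
```

Changing the weights in `mixed` gives the corresponding result for other values of p.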

You can find out more about this by going to Parrondo's web site

Here you find the following explanation for why they are interested in these games:

Brownian ratchets can be used to harness the random thermal fluctuations of molecules or very small particles to get directed movement. If we introduce a couple of games originally devised by Parrondo, we can see that their mechanics work in a very similar fashion to that of the Brownian ratchet.

Ask Marilyn.
Parade Magazine, 26 December, 1999, p 18
Marilyn vos Savant

Marilyn received the following letter:

To say that women make less than men mainly because of time taken to rear children, as you implied in your column, ignores a mountain of research on why the wage gap persists. Will you please address this issue again?

Evelyne Konolle
Program Coordinator,
National Committee on Pay
Equity, Washington, D.C.

Marilyn remarked that in the previous column she also said:

Consider this: if [women's] work is equal, why aren't employers slashing their payroll costs by hiring women instead of men? In a free market, businesses are highly competitive and if they're paying men more than women -- there must be a reason. The most important question is "What is that reason?"

She invites her readers, men and women, to fill out a survey about the difference of the sexes on their jobs. Most of the questions have three possible answers:

(a) a woman

(b) a man and

(c) it makes no difference at all.

Here are four such questions:

Whom would you rather hire as a full-time baby-sitter while you work?

Whose voice do you trust more when you ask for computer support?

Whom would you prefer to pilot your plane when you travel?

Whom would you prefer to perform your heart surgery?


Discussion Question:

What do you think Marilyn expects to learn from such a survey? Will it add significantly to the mountain of research on this issue?

Who counts?
Margo J. Anderson and Stephen E. Fienberg
Russell Sage Foundation, 1999

Historian Margo Anderson and statistician Stephen Fienberg combine forces to give us a historical perspective of the Census from the first census in 1790 to the upcoming Census 2000.

The book begins with a history of the Census. The story starts with the founding fathers' establishing a representative form of government that required a census to make it work. We learn that the political problems we face today with the census have been with us from the very beginning: Who should be counted? How should they be counted?

We also learn that throughout the history of the census leading statisticians have played a major role in trying to answer these questions. It would be hard to think of another statistical problem that has received as much attention by the statistical community as that of carrying out the census. This is remarkable considering that they work under the constraint of continually changing political winds.

Following the historical introduction, the authors give a detailed account of the last three censuses. Of course the undercount problem and the associated legal battles play a major role in this account.

The reason that the undercount problem is a political issue is fairly obvious since adjustment tends to increase the count of minorities and minorities tend to vote Democratic. Also large sums of Federal money to the states are affected by the outcome of the census.

The statistical issues are more complicated but center around whether the assumptions required for the methods used are satisfied and whether the attempts to adjust the census cause more errors than they correct.

In the 1990 court cases we find well-known statisticians on each side of the issue. For example, in a New York lawsuit that led to not using the adjustment in 1990 we find Stephen Fienberg, John Rolph, and John Tukey supporting the use of the undercount adjustment and Paul Meier and David Freedman testifying against its use.

For Census 2000 the Supreme Court has ruled that sampling methods cannot be used for apportionment. However, evidently they can be used for redistricting and for determining the amount of money each state receives in Federal grants.

Thus the Census Bureau plans to do a traditional enumeration and present this information for apportionment to the President on December 31, 2000 as required by law. They will then carry out the undercount adjustment and give what they believe will be a more accurate count to the states for redistricting by March 31, 2001.

For any of us wishing to follow these developments it is advisable to have a clear idea of how the adjustment for the undercount is carried out. The authors provide this information in their book. However, as a test of whether we understand how it works, we will give our understanding of how the Census Bureau plans to carry out the adjustment.

The Census Bureau divides the population into blocks with a block having about the number of housing units you would find in a typical city block. For the purpose of estimating the undercount, the country is also divided into groups that are similar with respect to race, Hispanic origin, region of the country, gender and whether the family owns or rents the house. There are about 1300 such sub-regions called post strata. The role they play will be clear soon.

After the traditional enumeration is completed for the census, a second enumeration called the post enumeration survey will be carried out independently of the census for a stratified sample of about 60,000 blocks. Within a particular post stratum the two enumerations are compared and the people who appeared in both are identified. Suppose, for example, that the census count for this stratum was 10,000 and for the post enumeration survey it was 9,000. Assume that there were 7,000 who were counted in both the census and the survey. Then since 7/9 of those counted in the survey were also counted in the census, it is assumed that the census identified 7/9 of the people in this stratum. Thus the adjusted estimate for the number in the stratum would be 10,000*9/7 = 12,857.

This estimate is based on the classical capture-recapture method usually described in terms of estimating the number of fish in a lake. A sample of fish is caught and tagged and then later a second sample is caught. The proportion of tagged fish in the lake is assumed to be the same as the proportion of tagged fish in the second sample.
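In code, the estimate in the census example is a one-line calculation, often called the Lincoln-Petersen estimator. The function and argument names in this sketch are ours.

```python
def capture_recapture_estimate(first_count, second_count, in_both):
    """Estimate the population size from two independent counts.

    The fraction of the second sample already seen in the first
    (in_both / second_count) estimates the fraction of the whole
    population that the first sample caught.
    """
    return first_count * second_count / in_both


# The example from the text: census 10,000, survey 9,000, both 7,000.
estimate = capture_recapture_estimate(10_000, 9_000, 7_000)
print(round(estimate))
```

In fish terms, `first_count` is the number tagged, `second_count` is the size of the second catch, and `in_both` is the number of tagged fish in that catch.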

This model assumes that within each sample all the fish have the same probability of being captured. Also the events of being captured in the two samples are independent. It is further assumed that all the counts are accurate and the tags do not fall off, so the identification is correct, etc.

For the census application of capture-recapture the census plays the role of the first sample (the tagged fish) and the post enumeration survey the role of the second sample (the re-captured fish). The assumption that within each sample each person has the same probability of being counted is called the assumption of homogeneity. The independence assumption is related to a concept called correlation bias. It is hoped that, at least within a post stratum, these assumptions are reasonably justified, though the pro and anti adjustment statisticians do not agree on this point.

Some of the other assumptions are clearly not satisfied, so the Census Bureau has to adjust for this. The Census Bureau cannot use unique identification markers such as social security numbers because it feels that this would make some people reluctant to be counted. Thus it must match people by comparing facts about them -- where they lived, their race, gender, etc. -- and mistakes in this matching are easy to make. People can be counted twice in the census -- a college student might be counted both at his home and at his school. Also, between the time of the census and the post enumeration survey some people die, some babies are born and some people move out of the area of the sample. The Census Bureau has to deal with these problems.

In an attempt to deal with many of these problems, the Census Bureau re-examines all census enumerations in the sampled blocks to obtain a true count for this area for the census. Let's return to our example in which the census count was 10,000 and the post enumeration survey counted 9,000 people. Suppose that after checking for duplications and other errors, 9,800 remain for the true census count. Then our census count of 10,000 should first have been reduced by a factor of 98/100. This gives us an overall adjustment factor of (9/7)*(98/100) = 882/700 = 1.26. Thus our adjusted number for this post-stratum would be 10,000*1.26 = 12,600.
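The two-step adjustment can be checked with exact fractions. This is a minimal sketch of the arithmetic in the example only; the variable names are assumptions of the sketch, and the Census Bureau's actual procedure involves more steps than are shown here.

```python
from fractions import Fraction

# Example numbers from the text (hypothetical post-stratum).
census_count = 10_000      # raw census count
survey_count = 9_000       # post enumeration survey count
matched = 7_000            # people found in both
corrected_census = 9_800   # census count after removing duplicates and errors

coverage_factor = Fraction(survey_count, matched)             # 9/7
correction_factor = Fraction(corrected_census, census_count)  # 98/100
adjustment_factor = coverage_factor * correction_factor       # 63/50

adjusted_count = census_count * adjustment_factor

print(adjustment_factor, float(adjustment_factor), int(adjusted_count))
# prints 63/50 1.26 12600
```

Equivalently, the adjusted count is the capture-recapture estimate computed from the corrected census count: 9,800*9/7 = 12,600.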

This procedure is carried out to obtain an adjustment factor for each post-stratum. Sampling causes excessive variation in these factors, so a final statistical process of "smoothing" is carried out. These factors are then applied to adjust the estimate for the number of people in each block in the country. From this we can obtain adjusted census counts for cities, towns, or any particular area needed by simply adding the block counts.

It is generally admitted that these estimates may not be very accurate at the block level, but when the results are added for larger groups many of these inaccuracies will cancel out. Critic Kenneth Darga provided a novel way to check this claim as it related to the 1990 census. He looked at the proportion of boys among children under the age of 10 in nine demographic groups. He found that before adjustment the proportions of boys were all 51 percent, as would be expected. However, for the adjusted counts these proportions varied from 48 percent to 56 percent. This led him to conclude that the adjusted counts were not accurate.
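To get a feeling for how large a distortion these percentages imply, one can ask how different the adjustment factors for boys and for girls would have to be to move the share of boys from 51 percent to 48 or 56 percent. The sketch below is an illustration only, not Darga's computation; it assumes a single adjustment factor for the boys and a single factor for the girls in a group, and the function names are invented for the example.

```python
def adjusted_share(raw_share, factor_boys, factor_girls):
    """Share of boys after multiplying boys and girls by separate factors."""
    boys = raw_share * factor_boys
    girls = (1 - raw_share) * factor_girls
    return boys / (boys + girls)


def implied_factor_ratio(raw_share, target_share):
    """Ratio factor_boys / factor_girls needed to move raw_share to target_share."""
    raw_odds = raw_share / (1 - raw_share)
    target_odds = target_share / (1 - target_share)
    return target_odds / raw_odds


for target in (0.48, 0.56):
    ratio = implied_factor_ratio(0.51, target)
    print(target, round(ratio, 3))
```

Under these assumptions the ratio is roughly 0.89 for 48 percent and roughly 1.22 for 56 percent. In other words, boys would have to be adjusted upward about 22 percent more than girls in the same group to reach 56 percent, which is hard to explain by a real difference in how well young boys and girls were counted.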

This book will be read by science writers, lawyers, judges and politicians as the inevitable court cases for Census 2000 approach. We hope that people who read this book will come away with the feeling that there has to be a better way than fighting it out in the courts to figure out how to carry out this important and challenging statistical task.

Key health habits linked to life expectancy
Boston Globe, December 1999, A3
John Tierney

A study in the current issue of JAMA (Journal of the American Medical Association) focused on three major risk factors for heart disease: smoking, high cholesterol and high blood pressure. The conclusion was that people at low risk for each of these factors had 6 to 10 years of increased life expectancy. Low risk was defined as non-smoking, cholesterol less than 200 milligrams per deciliter and blood pressure below 120-over-80. The researchers were encouraged because these appear to be realistic goals for much of the population.

Data were obtained from 360,330 men and 6,229 women who had enrolled in two major prospective studies beginning between 1967 and 1972. The result was the first dataset large enough to include a substantial number of people at low risk for all three categories, and with follow-up long enough to include sufficient numbers of deaths to estimate life expectancies. Men who were between 18 and 39 years old when they enrolled and met the low risk criteria were estimated to have between 6.3 and 9.5 years of additional life expectancy compared to other men their age. Men aged 40 to 59 had six additional years, and women aged 40 to 59 had 5.8 additional years. The benefits extended across socioeconomic and racial groups.


According to the article: "The studies did not record dietary or exercise habits, but Stamler and his colleagues suggested that cholesterol level and blood pressure may be not only risk-reducing factors in themselves, but also indicators of healthier lifestyles and more exercise." Does this mean that reducing cholesterol is beneficial in and of itself, without changes in lifestyle or exercise? How would they know this?

Study: Arctic Sea ice is rapidly dwindling; global warming likely cause.
Washington Post, 3 December 1999, A1
Curt Suplee

As 1999 draws to a close, many people are looking at temperature records and speculating about global warming. Another potential indicator of warming is the disappearance of Arctic ice. An international study appearing in the journal Science has combined 46 years of data from 5 independent data sets, which include both ground-based measurements and satellite observations. The data show that Arctic sea ice is decreasing by 14,000 square miles per year on average.

The authors of the study infer that the melting is attributable to human activity rather than natural variability in the Arctic climate. They report only a 2% chance that the melting over twenty years represents normal climate variation, and a 0.1% chance that the whole 46 year record is attributable to normal variation. Furthermore, the observed data were found to be consistent with figures generated by computer climate models that simulate the effects of greenhouse gas emissions.

The article points out that 50 years is a short time period for assessing global climate changes, so it is hard to assess whether the observed melting is unusual. The lack of longer range data places additional importance on the comparison with computer models. The particular model used here was developed by the National Oceanic and Atmospheric Administration's Geophysical Fluid Dynamics Laboratory in Princeton, and is widely respected. Still, the article reports that this is the first time the model has been specifically applied to Arctic ice. One critic of the study, Richard Moritz of the University of Washington, said "I am not convinced that the natural variability of the ice extent simulated by the model is realistic."


The melting rate is expressed in terms of area. Can you see any potential difficulties with this?

Police, medical workers say there's truth in old wives' tale.
Addison County [VT] Independent, 4 December 1999, 2
Associated Press

Does the full moon affect human behavior? This article quotes a number of law enforcement officials expressing belief in the so-called "werewolf effect." An Illinois police lieutenant named Michael Roberts says "People are a little bit weirder. Nurses, cops, fireman--we all believe it, whether it's been scientifically proven or not."

Medical workers in obstetrics also express belief that more babies are born during full moons. The article quotes Betty Fennema of Provena St. Joseph Medical Center. Fennema, who has spent most of her 34-year career in obstetrics, says "I don't have any statistical data, just practical knowledge. Those of us who have been around obstetrics a long time just expect a rise in the census."


(1) What do you think of the distinction being drawn between "scientific data" and "practical knowledge?" Propose some ways to collect "scientific data" to check these theories.

(2) According to the article, "Illinois State Police Sgt. Jeff Hanford says it's superstition, but police expect to be busy [during a full moon]. 'It's usually after the fact that we notice it. We'll have a busy night and then someone will notice it's a full moon.'" Comment.

Beta carotene pills flunk another test.
Boston Globe, 15 December 1999, A11
Associated Press

Observational data indicate that diets high in fruits and vegetables containing beta carotene are associated with lower incidence of cancer and heart disease. But is beta carotene itself responsible? In Chance News 5.02, we described research that questioned the value of beta carotene dietary supplements.

There is similar news in today's issue of the Journal of the National Cancer Institute. It reports results from a four-year study involving 19,939 women who received beta carotene and 19,937 who took a placebo. During the study period, there were 378 cancers in the beta carotene group and 369 in the control group. Over the same period, there were 42 heart attacks in the beta carotene group and 50 in the placebo group. These differences were not found to be statistically significant.
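One rough way to see why such differences are not significant is a two-proportion z-test on the reported counts. The sketch below is a textbook approximation, not the analysis used in the study. It assumes each reported event corresponds to a different person and uses the normal approximation with a pooled proportion.

```python
import math


def two_proportion_z(events_a, n_a, events_b, n_b):
    """Return (z, two-sided p-value) for the difference of two proportions."""
    p_a = events_a / n_a
    p_b = events_b / n_b
    pooled = (events_a + events_b) / (n_a + n_b)
    standard_error = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / standard_error
    # Two-sided tail probability of a standard normal beyond |z|.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Counts reported in the article: beta carotene group first, placebo second.
cancer_z, cancer_p = two_proportion_z(378, 19_939, 369, 19_937)
heart_z, heart_p = two_proportion_z(42, 19_939, 50, 19_937)

print(f"cancer: z = {cancer_z:.2f}, p = {cancer_p:.2f}")
print(f"heart attacks: z = {heart_z:.2f}, p = {heart_p:.2f}")
```

Both z values are well inside the usual cutoff of 1.96 in absolute value, so under these assumptions neither difference would be called significant at the 5 percent level.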


Saying there were "378 cancers" in the treatment group does not make it clear whether 378 different people in the group got cancer, or if some people suffered multiple cancers. Does it matter?

We woz wrong.
The Economist, 18 December 1999, 47-48

Newspapers often print predictions, but follow them up much less often. Here the Economist confesses two conspicuous blunders: first, a string of warnings that the American stock market bubble was ready to burst, and second, a prediction that oil prices were headed for further declines. In its cover story of 6 March 1999, the magazine reported that the world was "drowning in oil." At that time, the price of crude oil had dropped to $10 a barrel, and the story predicted that it might soon drop as low as $5. Alas, a mere four days later, OPEC agreed to cut production; within two weeks the price had risen 30%. By December, it had reached $25 a barrel!

How, the article asks, could the prediction be so far off? Three partial explanations are offered. First, the oil prediction was not made in isolation; it was based in part on the idea that slow growth in the world economy would keep demand for oil down. Thus, failing to anticipate the Asian economic recovery contributed to the error on oil prices. Second, the forecast included speculation that the Saudis might not go along with OPEC, but would instead increase their own production to enhance revenues. But there are pitfalls in trying to guess what the Saudi leaders are thinking, given that their decision-making process does not include public debate. Third, the forecast itself may have increased the resolve of the OPEC leaders!

Given the difficulties in making predictions, the article raises the rhetorical question of whether it might be better to simply give up. But it points out that predictions are an essential part of policy discussions. Two complicated examples are discussed at some length: NATO's decision not to commit ground troops to drive the Serbs out of Kosovo, and President Clinton's decision not to resign during the scandal that culminated in his impeachment trial. The article notes that every opinion expressed about a policy issue implicitly contains a forecast about the consequences of either following or not following some course of action. It concludes that publications like the Economist should set for themselves the goal of making their predictions explicit in order to hold them up for debate.


Take a current newspaper story about policy and identify the predictions implicitly or explicitly made there (for example, you might consider the Washington Post article on Arctic sea ice presented in this Chance News). How well are they supported?

Chance News
Copyright © 1998 Laurie Snell

This work is freely redistributable under the terms of the GNU General Public License as published by the Free Software Foundation. This work comes with ABSOLUTELY NO WARRANTY.

