CHANCE News 7.10

(11 October to 14 November 1998)


Prepared by J. Laurie Snell, Bill Peterson and Charles Grinstead, with help from Fuxing Hou, and Joan Snell.

Please send comments and suggestions for articles to

Back issues of Chance News and other materials for teaching a Chance course are available from the Chance web site:

Chance News is distributed under the GNU General Public License (so-called 'copyleft'). See the end of the newsletter for details.

Chance News is best read using Courier 12pt font.


No good statistician doesn't believe in sampling; it's an oxymoron.

Kenneth Prewitt, Director of the US Census Bureau,
quoted in the Boston Globe, 11 November 1998.


Note: We had too much material for this Chance News so we will have another short Chance News soon.

Contents of Chance News 7.10



We are having our 2nd annual Chance Lecture series at Dartmouth December 11-12 1998. We have again invited speakers who are experts on chance topics that appear in the news. We will make their talks available on the Chance web site.

If you would like to attend these lectures, you will find the program and details on how to register at the end of this Chance News. Last year's lectures were very successful and we expect they will be again this year. We continue to be interested in hearing from readers who have tried to use the Chance Lecture videos with or without success.


Our answer to the Mount Washington Weather Notebook question "What are the chances of someone breathing the same air as Copernicus" (see Chance News 7.09) was the winning answer and appeared on the Weather Notebook public radio program from Mount Washington on Thursday November 19th. This program, as well as the discussion of a number of other interesting weather questions, can be found at the weatherbook web site.


Peter Doyle called our attention to the following interesting new web site: Computer Animated Statistics

This site is for Professor A. M. Garsia's statistics course which is described as: Standard topics of statistics and probability are presented in a novel and visual way using computer animated images.

On this page you will find several elegant applets. In one of these, students are asked to estimate, by simulation, the number of boxes of cereal they need to buy to have a fifty percent chance of getting a complete set of prizes. The simulation is remarkably fast and the student can, after an initial estimate, quickly zero in on the solution. A similar fortune cookie experiment allows the student to compare sampling with and without replacement.

An applet, simulating betting on red at roulette, allows students to vividly experience the difference between using the bold strategy and the timid strategy when their objective is to obtain a fixed amount of money playing an unfair game.

Of course, you will also find the infamous Monty Hall problem but with a novel geometric way to see the solution. You can also find here the New York Times interview with Monty Hall.


Jerry Grossman wrote to us that the Kevin Drake Game in the Small-world network article of the last Chance News should have been called the Kevin Bacon Game. Kevin Bacon is a movie actor. Jerry also suggested that our readers would enjoy an earlier version of this kind of game played by mathematicians based on the "Erdos number". Paul Erdos was a wandering mathematician who enjoyed writing papers with other mathematicians. He wrote about 500 such papers. You have Erdos number 1 if you have written a paper with Erdos, Erdos number 2 if you have written a paper with someone who has written a paper with Erdos, etc. Our (Laurie's) Erdos number is 3. You will find a discussion of the Erdos number and lots of information about mathematicians' favorite mathematician Paul Erdos at The Erdös Number Project.

The Fall 1998 edition of Chance magazine arrived and this is the final issue under the editorship of George Styan. George attracted some wonderful articles during his period as editor, and you can be reminded of this by viewing the Chance Lecture "Hidden Codes in Chance Magazine" by Stephen Samuels. The new editor, Hal Stern, is well known to Chance News readers for his many interesting articles on the statistics of sports. You can also see Hal in action in his Chance Lecture: "Statistics in Sports". Here is an item that Hal wrote in this Fall issue of Chance magazine.

How accurate are the posted odds?
Chance Magazine Fall 1998 17-21
Hal Stern

Psychologists are fond of pointing out situations where people are not very good at estimating probabilities. Hal considers how they do in estimating probabilities when they make bets on horse racing, football, and baseball. He finds that they do a pretty good job here.

Betting on horse racing uses the "pari-mutuel" method, which makes the payoff odds on each horse inversely proportional to the amount of money wagered on that horse. The track operators remove about 17% of the total wagered on each race and redistribute the rest among the bettors selecting the winning horses. Thus the odds are based on a consensus of the bettors' subjective probabilities for the horses winning the race. Hal checked how good these subjective probabilities are by looking at data from 3,785 races in Hong Kong. He computes the expected number of wins for the horses based on the bettors' subjective probabilities and compares this with the observed number of winners. He finds that the fit is quite good. However, Hal finds a tendency for bettors to underestimate the probability that a horse with a high win average will win and overestimate the probability that horses with a low win average will win. This is consistent with the well-known bias that makes people underestimate the chance of a very likely event and overestimate the chance of a very unlikely event such as winning the lottery.
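The pari-mutuel arithmetic described above can be sketched in a few lines. This is only an illustration: the function name `parimutuel` and the pool amounts are hypothetical, and the 17% take is the approximate figure quoted in the article. Real tracks also round payoffs, which is ignored here.

```python
def parimutuel(pool, take=0.17):
    """Return {horse: (implied win probability, payoff per $1 bet)}.

    pool maps a horse's name to the total dollars bet on it to win.
    take is the fraction removed by the track (about 17% per the article).
    """
    total = sum(pool.values())
    payable = (1 - take) * total  # money returned to the winning bettors
    result = {}
    for horse, amount in pool.items():
        implied_prob = amount / total         # bettors' consensus probability
        payoff_per_dollar = payable / amount  # includes the returned stake
        result[horse] = (implied_prob, payoff_per_dollar)
    return result


# Hypothetical win pool, for illustration only.
pool = {"A": 5000.0, "B": 3000.0, "C": 2000.0}
for horse, (p, payoff) in parimutuel(pool).items():
    print(f"{horse}: implied p = {p:.2f}, $1 returns ${payoff:.2f}")
```

If the implied probabilities were exactly right, every $1 bet would return 83 cents on average whichever horse was chosen, since the implied probability times the payoff is always one minus the take.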

In football, the bookmaker introduces a point-spread in an attempt to make each bet a fair bet. A point-spread of 3.5 points in favor of team A means that if you bet on team A you win if A wins by more than 3 points and otherwise you lose. Bettors are given odds of 10:11 on their bets to provide a profit for the bookies. The point-spread changes as the betting proceeds because the bookmakers want to make the amount bet on each team about the same to avoid unexpected losses. As Hal remarks, losers' bets pay the winners and everyone pays the bookmaker. The point-spread ends up reflecting a consensus of the public's subjective probabilities for the outcomes of the games.

To see if these subjective probabilities are reasonable, it is sufficient to see if the point-spreads are consistent with data. Hal looked at this problem in an earlier publication: "On the probability of winning a football game," The American Statistician, 45, 179-183. He found the difference between the point-spread and the outcome of the games to be approximately normally distributed with mean 0 and standard deviation 13.5, again showing that the players' subjective probabilities are consistent with the data.
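As a rough illustration of what the normal model implies, the sketch below converts a point-spread into a win probability and computes the break-even rate for a 10:11 bet. It assumes the margin of victory is normal, centered on the point-spread, with the standard deviation of 13.5 quoted above. The function names are ours, not Hal's.

```python
import math

SIGMA = 13.5  # standard deviation quoted in the text


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def prob_favorite_wins(spread, sigma=SIGMA):
    """P(favorite wins the game) if margin ~ Normal(spread, sigma)."""
    return normal_cdf(spread / sigma)


def break_even_probability(risk=11.0, win=10.0):
    """Fraction of bets you must win to break even risking 11 to win 10."""
    return risk / (risk + win)


print(f"3.5-point favorite wins with probability {prob_favorite_wins(3.5):.3f}")
print(f"break-even rate against the spread: {break_even_probability():.4f}")
```

Under this model a bet against the spread is a coin toss, so a bettor needs to pick winners 11 times out of 21 (a little over 52%) just to cover the bookmaker's margin.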

This football example provides a nice way to introduce the concept of an "efficient" stock market. Any knowledge about a team, that is not insider's knowledge, that you could use to increase your chance of winning, would also be known to others and would be taken into account in the final point-spread.

An economics professor at Dartmouth used this idea in one of our chance courses. He asked all members of the class to make their choices on ten games and also answer three questions aimed at indicating their knowledge of football. He then reported back on the results showing that those with a knowledge of football did not do significantly better than the others, and the class as a whole did not do significantly better than if they had just tossed coins to make their decisions.

Finally, we mention Dartmouth's contribution to football betting strategy, the "Evil Twin strategy", used in the math department football pool (see http://math.ucsd.edu/~doyle/docs/twin for an explanation of why this strategy works). Unfortunately, this strategy is not working too well this year for us as we have not won yet, but this is because our colleague Mary Wood, whom we persuaded to use the evil twin strategy, has already won 3 times.


The Dartmouth Football pool, with three games and three participants, can be modeled as follows: three players, Laurie, Joan and Dan, each put $1 in the pot and submit the results of tossing a coin three times. The God of Football tosses a coin three times, and the player whose tosses have the most matches with those of the God of Football wins the pot. In case of a tie, the pot is split.

(1) What is Laurie's expected winning?

(2) Suppose Joan and Dan toss a coin but Laurie chooses the mirror image of Joan's choice, i.e. if Joan's coins came up HTH Laurie would choose THT (the evil twin strategy). Joan and Laurie agree to split the pot if either of them wins. Now what is Laurie's expected winning?
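Readers who want to check their answers can enumerate the equally likely outcomes of this model exactly. The sketch below does so with exact fractions. It takes "expected winning" to mean the expected amount received from the $3 pot, before subtracting the $1 stake, and all function names are ours.

```python
from fractions import Fraction
from itertools import product

PICKS = list(product("HT", repeat=3))  # the 8 possible three-game picks
POT = 3  # three players put in $1 each


def matches(pick, result):
    """Number of games on which a pick agrees with the actual result."""
    return sum(a == b for a, b in zip(pick, result))


def payouts(picks, result):
    """Split the pot equally among the picks with the most matches."""
    scores = [matches(p, result) for p in picks]
    best = max(scores)
    winners = scores.count(best)
    return [Fraction(POT, winners) if s == best else Fraction(0) for s in scores]


def mirror(pick):
    """The evil twin's pick: the opposite choice on every game."""
    return tuple("T" if c == "H" else "H" for c in pick)


def laurie_independent():
    """Question (1): all three players toss their own coins."""
    cases = list(product(PICKS, repeat=4))  # result, Laurie, Joan, Dan
    total = sum(payouts([l, j, d], r)[0] for r, l, j, d in cases)
    return total / len(cases)


def laurie_evil_twin():
    """Question (2): Laurie mirrors Joan and they split what either wins."""
    cases = list(product(PICKS, repeat=3))  # result, Joan, Dan
    total = Fraction(0)
    for r, j, d in cases:
        pay = payouts([mirror(j), j, d], r)
        total += (pay[0] + pay[1]) / 2
    return total / len(cases)


print(laurie_independent(), laurie_evil_twin())
```

Under these assumptions the enumeration gives 1 for question (1) and 9/8 for question (2), so the twins each gain 12.5 cents per pool on average at Dan's expense.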


Dan Rockmore suggested the following related article.

Horse sense
New York Times Magazine, 4 Nov. 1998, 48-51
Linda Greenhouse

Andrew Beyer writes a racing column for the Washington Post and has written four very successful books on how to pick winners at a horse race. Much of Beyer's success with his books and his own betting rests on his studies on measuring the speed of horses.

The article reports that, before Beyer's work, the speed factor of a horse had been under-used by bettors because, while the raw times were published, these did not accurately reflect how the horses would do on tracks of different lengths and different conditions. Beyer developed a way to compute speed estimates for a horse for every race, taking into account the distance and conditions of the track. For the past six years Beyer has been providing the Racing Form his speed figures for each horse in every race.

Of course, as in an efficient stock market, now that Beyer's information is common knowledge, it has been taken into account in the odds, and Beyer has lost the obvious advantage of his knowledge in his own betting. However, he and his colleagues still study the performance of the horses, to find horses that appear to be running faster than their Beyer speed would estimate and try to take advantage of this finding in their betting in the same way that stock experts look for stocks that might perform better than expected.


It is claimed that Beyer still makes money with his betting. How do you think his betting winnings compare with the money he makes from the Racing Form and from his books? Remember, he pays 17% to the bookmakers when he bets on the horses.


A poll-watcher's guide to stalking the elusive likely voter
The New York Times, 18 October 1998, Section 4, p. 5.
Michael R. Kagay

In the 1996 presidential election, only 49% of eligible voters actually went to the polls; for the 1994 midterm election, the figure was only 39%. As this year's elections were approaching, political pundits noted that the effects of voter turnout might be even more important than usual. The reason was the White House sex scandal. If President Clinton's opponents turned out in force, the Republicans would have the advantage. On the other hand, if the public was truly upset about impeachment proceedings, then Democrats would have the edge. In hindsight, it appears the latter view was correct. But this article is still of interest, because it focuses explicitly on how polling organizations try to assess who in their sample is "likely to vote."

Each polling organization has its own procedure. The Gallup poll rates voters on a point system with scores from zero to seven. Points are given for having voted in past elections, current intention to vote, and knowledge of where to go to vote. The article reports that "Gallup then picks those respondents with the most points, working down until they reach a percentage of the total group that is equal to the expected nationwide turnout (expected to be 39% this year)." David Moore of the Gallup Organization saw no reason to adjust the likely voter criteria this year. He expected that those people most affected by the debate over the Clinton scandal are strong Republicans or Democrats who would be expected to vote in any case.
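The cutoff procedure quoted above can be sketched as follows. This is only our reading of the article's one-sentence description, not Gallup's actual method: the function name, the respondent ids and scores, and the handling of ties (earlier respondents first) are all assumptions.

```python
def likely_voters(scores, expected_turnout=0.39):
    """Pick the highest-scoring respondents until the turnout share is reached.

    scores maps a respondent id to a likelihood score from 0 to 7.
    Ties are broken by the order in which respondents were entered,
    which is an arbitrary choice made for this sketch.
    """
    target = round(expected_turnout * len(scores))
    ranked = sorted(scores, key=lambda rid: scores[rid], reverse=True)
    return ranked[:target]


# Hypothetical scores for ten respondents, for illustration only.
scores = {"r1": 7, "r2": 3, "r3": 5, "r4": 7, "r5": 0,
          "r6": 6, "r7": 2, "r8": 4, "r9": 1, "r10": 6}
print(likely_voters(scores))
```

The sketch makes the dependence on the expected turnout explicit: change the 39% and the set of "likely voters" changes with it.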

The New York Times/CBS Poll defines "likely" voters as those who voted in either 1996 or 1994, report paying attention to this year's campaign, and say they will definitely vote this year. "More likely" voters are those who say they voted in both 1996 and 1994. It was found that results based only on responses from "likely" and "more likely" voters came out more in favor of Republicans than when the whole sample was used. Interestingly, Gallup did not find so strong a pattern. According to the article, Gallup's finding implies "that the contest remains close at all levels of turnout."

The article discusses a number of other issues related to polling, including sampling techniques and phrasing of questions. It cites the following example of how the phrasing can affect responses. When checking on sentiment towards incumbents, Gallup asked "Please tell me whether or not you think each of the following political officeholders deserves to be reelected. First, members of the House of Representatives; second, your own representative." The NYT/CBS poll tied reelection to job performance, and also asked if it was time to give new candidates a chance. The Gallup results were more in favor of incumbents.


(1) Does something about Gallup's procedure sound circular? How do they know to expect that 39% will vote this year?

(2) Why do you think the NYT/CBS poll found a stronger likely voter effect than Gallup?

(3) If Gallup reversed the phrasing of its own question--so that voters first rated their own representative and then the House in general--how would you expect the results to change?


Science and technology: Trial and error
The Economist, 31 October 1998, pp. 87-88

October 30 was the 50th anniversary of the first clinical trial: the test of streptomycin to treat tuberculosis. The article salutes the tremendous medical advances attributable to clinical trials, but its primary goal is to remind readers of the shortcomings of this approach. Four major areas of concern are discussed.

First, while trials work well for drugs, they are not easily adapted to other kinds of treatments. For example, it is not clear how to use them to evaluate psychotherapy or minor surgery. With psychotherapy, there is no obvious placebo for comparison. For surgery there are obvious difficulties with having a placebo group. However, there actually is a trial now underway to test arthroscopic surgery for knee injuries. Some doctors feel the surgery is no better than simply allowing the knee time to heal. Patients in the control group will be given a memory-blocking drug, and will undergo a minor incision designed to look like a surgical scar.

Second, since private firms are driven by profits, there is reluctance to test herbal treatments, which cannot be patented. Similarly, there is less incentive to investigate new uses of a drug like aspirin, which is now out of patent but has been shown beneficial for new uses such as treating heart attack patients. Even more ominous was the 1996 story of a University of California researcher who was forced by Boots Pharmaceuticals to withdraw a paper because it showed that the firm's drug for thyroid disease was no more effective than less expensive existing treatments.

Third, trials may be conducted under conditions that will not be reproduced in the outside world, which limits their relevance. The article cites a trial called ENRICHD, which is investigating the use of psychotherapy to improve the health of heart attack survivors who are suffering from depression. The concern is that the quality of care in the trial is much higher than what can be expected in the real world. Such concerns are compounded when studies are conducted in poor countries, whose typical standard of care is far below anything likely to be experienced in a drug trial. For example, AZT treatment for HIV infection is seen as prohibitively expensive for many other countries. And, as the article points out, 50 years after the trial of streptomycin there are still 3 million annual deaths from tuberculosis.

Finally, while research initiatives tend to focus on physiological measures of success, there is comparatively little emphasis on quality of the patients' lives. Hilda Bastian of the Consumers' Health Forum of Australia cites data indicating that fewer than 5% of trials published from 1980 to 1997 measured emotional well-being or social function of the participants.


What do you think could be done to increase the value of clinical trials?


David Dorman suggested the following article.

Statistician says probability of other intelligent life in universe is rooted in numbers.
Seattle Times, 11 November 1998, p. A12
Dietra Henderson

This is a discussion of Amir Aczel's book "Probability 1: Why There Must be Intelligent Life in the Universe." (Harcourt Brace, 1998), which was reviewed in the last issue of Chance News. According to the article, the idea for the book was originally suggested to Harcourt Brace by Carl Sagan. But Sagan died before the project was realized. The idea of picking up the thread was suggested to Aczel by an executive editor who had rejected some of Aczel's other book proposals. The following passage from Sagan's book "Pale Blue Dot: A vision of the Human Future in Space" is quoted as presenting the initial ideas:
If I had to guess, I would guess that the universe is filled with beings far more intelligent, far more advanced than we are. But of course I might be wrong. Such a conclusion is at best based on a plausibility argument, derived from the numbers of planets, the ubiquity of organic matter, the immense time scales available for evolution, and so on.

The article summarizes this passage as follows: "Or, as Aczel says via mathematical equation, probability, expressed by upper case P, is one."

Aczel's statistical development is given by analogy with a Galton board, where balls tumble down across a pegboard, each colliding with a peg deflecting balls to the right or left with equal probability. Aczel wrote:
Assume that every element and every condition--temperature, pressure, density, presence of a catalyst and so on--is a little ball. The unique balls fall randomly into an array of open boxes below. Each time a ball falls into a box below, a condition for life is realized...If there are enough elemental particles in the universe, and science tells us that there are...then at least some planets outside the Earth would be showered with enough of the little balls supplying all of the ingredients necessary to allow life to evolve.

In fact, his calculations based on this model produce a probability that is "indistinguishable from 1.00."


(1) Do you think that the Sagan passage is equivalent to the statement "P = 1"?

(2) The article says that "in reaching his 100 percent probability, Aczel relied on the rule for the union of independent events, devised by mathematicians Blaise Pascal and Pierre de Fermat." What rule do you think is meant here?

(3) Describing the peg board experiment, the article says "as they fill the boxes below, the balls illustrate what's called the central limit theorem, or the law of averages." Which is it, and how is it relevant to Aczel's argument?


Sampling and Census 2000
SIAM News, November 1998, p. 1 and 10
Morris L. Eaton, David A. Freedman, Stephen P. Klein, Kenneth W. Wachter, Richard A. Olshen, and Donald Ylvisaker.

This article is based on Technical Report No. 537, Department of Statistics, University of California at Berkeley, which was referenced in an earlier Chance News. It summarizes very clearly the authors' principal concerns about the proposed sampling plans for adjusting the Census.

The article begins by pointing out that there are two proposed uses of sampling for 2000. The first involves identified housing units from which responses are not obtained during the mail-in enumeration phase of the census. In the past, the bureau attempted follow-up visits to all of these. The new plan proposes using a sample of such residences to estimate the total number of non-respondents. The second use of sampling, called the Integrated Coverage Measure (ICM), is the analog of the Post-Enumeration Survey (PES) proposed for the 1990 Census. It would conduct a nationwide survey to check the accuracy of the enumeration phase and then statistically adjust the first phase for under- and over-counting within demographic groups.

The authors cite the following four general weaknesses with the proposals:

(1) Many somewhat arbitrary technical decisions will have to be made. Some of them may have a substantial influence on the results.

(2) Many of ICM's statistical assumptions are shaky.

(3) There is ample opportunity for error.

(4) The errors are hard to detect.

The article provides specific details on all of these. For example, on July 15, 1991 the PES estimated an undercount of 5 million. However, the authors assert that 50-80% of this figure reflects errors in the estimation process itself, not in the enumeration phase of the Census. Indeed, the Bureau later discovered an error in its PES analysis which had added 1 million people to the undercount and shifted a Congressional seat from Pennsylvania to Arizona! In any case, the net undercounts in 1980 and 1990 were in the 1-2% range. To improve the count, any new techniques would need to have errors of less than 1%. But the authors feel this level is not attainable with available survey techniques.

The proposed ICM would use a cluster sample of 60,000 Census blocks, representing 750,000 housing units and about 1.7 million people. Census officials would then attempt to match data for every residence in the sample blocks to data from the Census. An ICM record without a match may represent a "gross omission" in the Census, that is, a person who should have been counted but was not, whereas a Census record without a match may represent an "erroneous enumeration" in the Census. Finally, some people will not be counted either time. Their number is estimated by adapting the capture-recapture method, with Census records being the "captured and tagged" group, and ICM records being the recapture group. However, such estimates are flawed by "correlation bias" because the two groups are not really independent samples: people hard to find during the Census are also likely to be hard to find during the ICM. Moreover, even cases that are "resolved" through ICM field-work will still create errors if the respondents do not provide accurate information. People who have moved, for example, may not give accurate information about their place of residence or household size on the official Census day (April 1).
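For readers who have not seen the capture-recapture idea, here is a minimal sketch of the basic textbook estimator (often called Lincoln-Petersen) applied to a single block. It is not the Bureau's actual procedure, the function name is ours, and the counts are made up for illustration.

```python
def lincoln_petersen(n_census, n_icm, n_matched):
    """Basic capture-recapture estimate of the true population size.

    n_census:  people recorded by the Census (the "tagged" group)
    n_icm:     people recorded by the follow-up survey (the recapture)
    n_matched: people who appear in both lists
    Assumes the two lists are independent samples of the same population.
    """
    if n_matched <= 0:
        raise ValueError("need at least one matched record")
    return n_census * n_icm / n_matched


# Hypothetical counts for one block, for illustration only.
n_census, n_icm, n_matched = 950, 940, 900
estimate = lincoln_petersen(n_census, n_icm, n_matched)
seen_at_least_once = n_census + n_icm - n_matched
print(f"estimated population: {estimate:.1f}")
print(f"estimated missed both times: {estimate - seen_at_least_once:.1f}")
```

Correlation bias enters through the independence assumption: if people missed by the Census also tend to be missed by the survey, the two lists overlap more than independent samples would, and the estimate of those missed both times comes out too low.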

The capture-recapture methodology alluded to above actually needs to be modified because the undercount is known to differ according to demographic and geographic variables. Therefore, the population is divided into "post strata" and a different adjustment factor is estimated for each such stratum. For example, one post stratum might be Hispanic male renters age 30-49 in California. The authors object to the "homogeneity assumption", under which factors computed from the ICM are applied to all blocks in the country.

The authors also worry about how the first use of sampling--namely sampling to follow up for non-response in the mail-in phase--will interact with the ICM procedure. One immediate concern is that the ICM sample will find people who did not mail in their Census forms but were not chosen for the follow-up sample. Such people will be accounted for twice: once in the ICM adjustment, and once in the non-response adjustment. To avoid this, the Census Bureau proposes 100% follow-up for non-response in the blocks that will later comprise the ICM sample. The authors note that this makes two assumptions: "(1) census coverage will be the same whether follow-up is done on a sample basis or a 100% basis, and (2) residents of the ICM sample blocks do not change their behavior as a result of being interviewed more than once." Failure of these two assumptions results in "contamination error," and the authors state that the magnitude of this error is not known.


(1) Which of the concerns, "correlation bias" or "homogeneity," do you think would be the more serious?

(2) Obviously, there is going to be some error in the under-count method. What level of error would you think would make this process worse than not doing any adjustment?


Lack of census consensus held to hurt count
The Boston Globe, 11 November 1998, Section 4, p. 5
Michael R. Kagay

Kenneth Prewitt was confirmed by the Senate in October as the new director of the US Census Bureau. Prewitt is a Clinton appointee and shares the administration's position that sampling is needed to address the undercount problem. He says the Republicans' idea that the Census Bureau could manipulate the data to favor Democrats is "very, very far-fetched." But with Congress still far from agreement and the Supreme Court not expected to rule on sampling before March, Prewitt warns that the lost time is compromising his ability to lay the groundwork needed to ensure an accurate census. Decisions on everything from the census telephone system to the advertising campaign are on hold until the details of the methodology are worked out.

In describing the difficulties with traditional enumeration, the article asserts that "by the bureau's own estimation, this counting method in 1990 missed about 10 percent of the US population, with the most dramatic undercount among African-Americans and Hispanics." This figure is far larger than the 1.8% undercount figure that has been widely reported since 1990. It may actually refer to the figure for the total number of errors made in 1990 which, according to the New York Times ("You fill up my census, even if I can't find you" 30 August 1998), was estimated to be 26 million. This figure comprises people missed, people counted more than once and people counted in the wrong place. The undercount is a net figure, representing total people missed minus total erroneous enumerations, regardless of where they were counted.


Why do you think we hear more in the press about the net undercount figure than about the total number of errors? Which do you find more disturbing?


Investment Dartboard: A Brief History of our Contest
The Wall Street Journal, 7 October, 1998, C1
Georgette Jasen

For many years, the Wall Street Journal has sponsored a contest among three contestant groups. The goal of the contest is to pick stocks that will do well in the near future. The three groups are a set of four professional stock brokers, the Dow Jones index, and a set of dart throwers. Since July of 1990, the contest has been run as follows. The stock brokers each pick a single stock. The dart throwers do just that; they throw darts at a list (that has been put in random order) of all of the stocks that are traded on the three major exchanges.

At the end of six months, the average returns for the professionals and the dart throwers are computed. In addition, the return of the Dow Jones Industrial Average is computed. Whichever of these three numbers is greatest wins the contest.

The contest came about as a result of the book "A Random Walk Down Wall Street," written by the Princeton University economics professor Burton Malkiel. In this book, Malkiel explains the "efficient market theory" which states that "all available information is quickly reflected in the price of a stock, so that all stocks present equal chances for a gain." Malkiel goes on to state that if this idea is taken to extremes, a "blindfolded monkey throwing darts at a newspaper's financial pages could select a portfolio that would do just as well as one carefully selected by experts."

The Wall Street Journal has run six-month contests continuously since July, 1990. The hundredth contest has just been completed. In these 100 contests, the average return on the stocks picked by the professionals is 10.9%, compared with 4.5% for the dart throwers and 6.8% for the Dow Jones average. The pros came in first in 44 of these 100 contests.

Another way to see how well the three groups did is to pretend that you started three accounts in July 1990, the pro account, the darts account, and the Dow account. You invested $1000 each month in each group's pick (so you are investing $3000 in new money each month). At the end of each contest, all of the money that you have gained in that contest is reinvested in the new contest's picks (keeping the pro money with the new pro pick, etc.). At the end of 100 contests, the three accounts would be worth $327,259, $189,897, and $205,059, respectively.

Does this mean that the pros can pick stocks that will outperform the Dow? Not necessarily. Malkiel points out that the pros tend to pick stocks with beta values higher than one. The beta value of a stock is a number which is based upon past performance of the stock and measures how strongly the stock's returns move with those of some relevant index, such as the S&P 500 or the Dow: it is the covariance of the stock's returns with the index's returns divided by the variance of the index's returns. In a bull market, the stocks with beta values greater than one tend to go up more than the indices do (and hence more than the dart throwers' picks, as well). However, in a bear market, the opposite tends to be true. In the past 8 years, we have seen an almost uninterrupted bull market, which might explain why the pros have won more often than the dart throwers or the Dow. It would be very interesting to compare the pros' picks with random picks chosen from sets of stocks with similar beta values.

The article provides the average percent gain or loss for the experts, the darts, and the Dow for each of the 100 contests. You can obtain this data from the Chance web site by going to Teaching-aids and then Data.


(1) How would you test if the experts are doing significantly better than the darts or the Dow?

(3) Another objection Malkiel made to the contest was that the experts were so well known that their decisions alone produced a short-term gain in the stocks they chose. As we have mentioned, he also objected that the experts chose especially volatile stocks since it was not really their money. What do you think about his objections?
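One crude starting point for question (1) is a sign-type test on the number of contests the pros won. The sketch below computes the chance of 44 or more first-place finishes in 100 contests if each of the three groups were equally likely to win each contest. Both the null value of 1/3 and the independence of the contests are our assumptions. The second is doubtful, since 100 six-month contests in about eight years must overlap. The function name is ours.

```python
from math import comb


def binomial_upper_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p), computed exactly."""
    return sum(comb(n, j) * p**j * (1 - p) ** (n - j) for j in range(k, n + 1))


# 44 wins for the pros in 100 contests, against a null chance of 1/3.
p_value = binomial_upper_tail(44, 100, 1 / 3)
print(f"P(44 or more wins out of 100 | p = 1/3) = {p_value:.4f}")
```

A test that uses the size of each contest's return, such as a paired comparison on the per-contest data available from the Chance web site, would use more of the information, but it would face the same overlap problem.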


Mathematics and the Media, October 8-10, 1998
Mathematical Sciences Research Institute

The Mathematical Sciences Research Institute (MSRI) sponsored a series of lectures related to mathematics and the media and made these lectures available, in video form, from their web site. We found the session in which science writers explain how they get and how they prepare their stories particularly interesting.

Curt Suplee, of the Washington Post, estimated that 75% of the science articles in their paper come from 5 sources: Science, Nature, New England Journal of Medicine, Journal of the American Medical Association, and the Hubble telescope.

The science writers stressed that they rely on the co-operation of their sources to provide (a) topics the sources consider new and newsworthy, (b) advance copies of the publication of the research, and (c) names of experts they can consult to help them understand the issues involved. They said that physics and chemistry do a good job of this but mathematics does very little. Sharon Begley, science writer for Newsweek, commented that one exception was that mathematicians were very helpful in her writing the story about the Bible Codes.

Science writers also stressed that the news media, by its very nature, is interested in stories that can be considered to be new developments. Ivars Peterson remarked that sometimes he gets around this by interpreting new as "new to his readers". One of our favorite science writers, Tom Siegfried of the Dallas Morning News, said that the new news requirement is less of a problem if the paper has a science section such as the Dallas Morning News Monday Science section. He attributed the lack of newspapers with science pages, not to the public's lack of interest in science, but rather to the advertisers' lack of enthusiasm about putting their ads on a science page.

Princeton mathematician Peter Sarnak was assigned the job of talking about a topic that is of current interest at a level that the science writers could understand. He gave a nice talk about random matrices and their connection with quantum theory and the Riemann Hypothesis. The discussion was opened by a science writer who said that, listening to the talk, she felt the same way she did when listening to a talk in German knowing only a few German words. This led to an interesting discussion of what you really have to do, if you want science writers to be able to understand a topic and prepare an article in a way that their readers can understand it.

As part of the symposium, Persi Diaconis gave a talk on coincidences. He discussed a number of examples from his classic paper with Fred Mosteller, "Methods for Studying Coincidences", Journal of the American Statistical Association, Dec. 1989, Vol. 84, No. 408, 853-861. He started with an example from the work of the well-known psychiatrist C. G. Jung. Jung felt that coincidences occurred far too often to be attributed to chance. Diaconis considered one of Jung's examples where Jung observed, in one day, six incidents having to do with fish. To show that this could occur by chance, Diaconis suggested that we model the occurrence of fish stories by a Poisson process over a period of six months with a rate of one fish story per day. Then we plot the times at which fish stories occur and move a 24-hour window over this period to see if the window ever includes 6 or more events, i.e., fish stories. Diaconis remarked that finding the probability of this happening is not an easy problem but can be done. The answer is that there is about a 22% chance that the window will cover 6 or more events.
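
Readers who would like to explore a number like this by simulation might start from the following Python sketch. It is only an illustration, not the method Diaconis used: the function names, the choice of 182 days for "six months", and the number of trials are all assumptions. It builds the Poisson process from exponential waiting times and checks whether any 6 consecutive events lie within one day of each other, which is the same as asking whether some 24-hour window covers 6 or more events.

```python
import random


def simulate_poisson_times(rate, horizon, rng):
    """Event times of a Poisson process with the given rate on [0, horizon)."""
    times = []
    t = rng.expovariate(rate)  # waiting times between events are exponential
    while t < horizon:
        times.append(t)
        t += rng.expovariate(rate)
    return times


def has_cluster(times, k, window):
    """True if some interval of length `window` contains at least k events.

    `times` must be sorted. k events fit in one window exactly when the
    first and last of k consecutive events are at most `window` apart.
    """
    return any(times[i + k - 1] - times[i] <= window
               for i in range(len(times) - k + 1))


def estimate_cluster_probability(rate=1.0, horizon=182.0, k=6, window=1.0,
                                 trials=5000, seed=0):
    """Monte Carlo estimate of the chance of seeing such a cluster."""
    rng = random.Random(seed)
    hits = sum(
        has_cluster(simulate_poisson_times(rate, horizon, rng), k, window)
        for _ in range(trials)
    )
    return hits / trials


if __name__ == "__main__":
    # horizon=182 days is our assumed reading of "six months".
    print(estimate_cluster_probability())
```

The estimate depends on these choices, so it is worth varying the horizon and the number of trials and comparing the result with the figure quoted in the talk.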

Diaconis next discussed the birthday problem and a number of variations of this famous problem. For example, he remarked that when you have 7 people at a dinner table there is a 50% chance that two have their birthdays within a week of each other. Finally, he discussed some of his own experiences dealing with issues such as ESP. This talk was accessible to science writers and your students would enjoy watching it.
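
The "within a week" version of the birthday problem is easy to try by simulation. The Python sketch below is an illustration under stated assumptions: birthdays are uniform over a 365-day year, the year is treated as a circle (so 30 December and 2 January are 3 days apart), and "within a week" is read as "at most 7 days apart". The function names are ours.

```python
import random


def has_near_match(birthdays, max_gap, days_in_year=365):
    """True if some two birthdays are at most max_gap days apart (circular year)."""
    if len(birthdays) < 2:
        return False
    ordered = sorted(birthdays)
    # The closest pair must be neighbours in sorted order, or the
    # last and first birthdays wrapped around the end of the year.
    gaps = [b - a for a, b in zip(ordered, ordered[1:])]
    gaps.append(ordered[0] + days_in_year - ordered[-1])
    return min(gaps) <= max_gap


def estimate_near_match(people=7, max_gap=7, trials=20000, seed=0):
    """Monte Carlo estimate of the chance of a near-birthday match."""
    rng = random.Random(seed)
    hits = sum(
        has_near_match([rng.randrange(365) for _ in range(people)], max_gap)
        for _ in range(trials)
    )
    return hits / trials


if __name__ == "__main__":
    for people in range(4, 10):
        print(people, estimate_near_match(people=people))
```

Changing max_gap to 0 gives the classical birthday problem, and changing it to 1 gives the "within a day" variation.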

The last video is of Arthur Benjamin's Mathemagics lecture. Don't miss it!


Our colleague David Maslen obtained, from Professor Geoffrey Berresford, the data for the number of birthdays in the U.S. for each day in the year 1978 that Berresford used in his article "The uniformity assumption in the birthday problem", Math. Mag. 53 1980, no. 5, 286-288. We have put this data on the Chance web site (go to Teaching Aids and then Data) and we recommend that you make a time series plot of these numbers. It shows dramatically the weekly periodicity, with fewer birthdays on Saturday, and even fewer on Sunday, more on Monday and Friday and then a dip on Wednesday. Do doctors also play golf on Wednesday? Of course, over a period of years this periodicity will get washed out so the uniformity assumption is probably pretty good. It would be interesting to get data for several years to see how this works out.
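
Besides a time series plot, the weekly pattern can be summarized by averaging the counts for each day of the week. Here is a minimal Python sketch of that calculation. It assumes the data have been read into a list of 365 daily counts in calendar order; the function name is ours, and the counts in the usage example are a made-up placeholder, not Berresford's data.

```python
from datetime import date, timedelta

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]


def mean_by_weekday(daily_counts, year=1978):
    """Average the daily counts separately for each day of the week.

    daily_counts[0] is taken to be 1 January of `year`, and so on.
    """
    totals = {name: 0 for name in WEEKDAYS}
    days = {name: 0 for name in WEEKDAYS}
    day = date(year, 1, 1)
    for count in daily_counts:
        name = WEEKDAYS[day.weekday()]  # date.weekday(): Monday is 0
        totals[name] += count
        days[name] += 1
        day += timedelta(days=1)
    return {name: totals[name] / days[name] for name in WEEKDAYS if days[name]}


if __name__ == "__main__":
    # Placeholder only: replace with the 365 counts from the data file.
    placeholder_counts = [100] * 365
    for name, mean in mean_by_weekday(placeholder_counts).items():
        print(f"{name:9s} {mean:8.1f}")
```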

Brendan McKay sent us the following contribution, which appeared on the Critical Criminology web site:

From: Tom Gill (based on published reports) Subject: bizarre death penalty facts (fwd)

In the course of preparing a death penalty case for trial in September 1997, attorney William Linka noticed patterns in the names of persons convicted of capital murder and eligible for the death penalty. He analyzed hundreds of cases from state legal records and published U.S. Supreme Court opinions. Linka noted:

8.3% of the persons were named Lee
5.4% were named Dwayne, Duane, or DeWayne
19.4% had no middle name
13% were a "Jr." or "Sr."
7% had two or more of the above factors

An alarming 41% of people convicted of capital murder met one of the criteria above. Therefore, please do not name your child DeWayne Lee Jr.


Which of the observations noted do you think are significantly different from what you would expect from a random sample of the U.S. population? Carry out a test to defend your answers.
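
One possible way to carry out such a test is a one-sample z-test for a proportion. The Python sketch below shows the arithmetic only. The report does not give the exact number of cases or the corresponding proportions in the U.S. population, so the values in the usage example are hypothetical placeholders that you would need to replace with real figures. The test also assumes that the cases can be treated as a simple random sample and that the normal approximation is reasonable.

```python
from math import erfc, sqrt


def one_proportion_z_test(successes, n, p0):
    """Two-sided z-test of an observed proportion against a reference value p0.

    Returns the z statistic and the two-sided p-value from the
    normal approximation.
    """
    p_hat = successes / n
    standard_error = sqrt(p0 * (1 - p0) / n)
    z = (p_hat - p0) / standard_error
    p_value = erfc(abs(z) / sqrt(2))  # equals 2 * P(Z > |z|)
    return z, p_value


if __name__ == "__main__":
    # HYPOTHETICAL inputs, for illustration only:
    n_cases = 300                 # the report says only "hundreds of cases"
    observed = round(0.194 * n_cases)   # 19.4% had no middle name
    population_share = 0.10       # placeholder, not a real U.S. figure
    z, p = one_proportion_z_test(observed, n_cases, population_share)
    print(f"z = {z:.2f}, two-sided p-value = {p:.4f}")
```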


We thought it might be useful to discuss one article in terms of the library resources that we use for Chance News. This might help our readers see what is becoming possible with the availability of electronic newspapers and journals.

The most useful database for finding articles for Chance News is Lexis-Nexis. This database has full text of articles from national and regional newspapers going back about ten years. It does not have The Wall Street Journal but this is available from a similar database, the Dow Jones Interactive, which your school may have. It will certainly have it if you have a business school.

The Dartmouth library, like many other colleges and universities, provides the Academic-Universe version of Lexis-Nexis which can be accessed on the web by faculty and students.

The newspaper articles that we review are usually based on articles from one of the most popular science and medical journals. These include the science journals: Science, Nature, Science News, New Scientist and the medical journals: New England Journal of Medicine, Journal of the American Medical Association (JAMA), Lancet, and British Medical Journal. All of these have some kind of web version. For most of these, you or your library must have a regular subscription to be able to view and download full text of articles. We will provide at the end of this discussion more details on what is available for these electronic journals.

Of course, there are wonderful resources that you can use on the web that are not part of the library resources. One of the most useful for us has been National Public Radio. NPR covers most major science stories and keeps a well indexed archive of all its programs.

We will illustrate how we used these resources in preparing our final item for this issue of Chance News.

We heard, on NPR's program "Sounds Like Science", a story about monkeys being taught to understand when one number is bigger than another. You can hear this by going to Sounds Like Science 10/31/98.

The program begins with the story of Clever Hans, the horse that seemed to learn to count but, in fact, was just getting cues from its owner. This famous experiment has made people suspicious of studies that claimed animals could do math. Then you will hear a discussion with a co-author of a current study that showed that monkeys could do some math. The program reports that two monkeys named Rosencrantz and Macduff were trained to be able to choose, in order, pictures with 1, 2, 3, and 4 objects. They were then, without any training, able to order numbers from 5 to 9.

This led us to search on Lexis-Nexis for articles about this study. In this version of Lexis-Nexis you must pick a word or words for a topic which must occur in the title or in about the first 50 words. Then you can specify key words that can occur anywhere in the article. We chose "counting or numbers" as the topic and "monkeys" as a key word. We chose "major newspapers" as the source and "last month" as the time period. From this we found that the study was widely covered, though, curiously, not by the New York Times. We chose an article from the Washington Post by a staff writer and an article from the Financial Times written by a psychologist. These two articles nicely complemented each other.

Count monkeys among the numerate.
The Washington Post, 23 Oct. 1998, A1
Rick Weiss

This article explained where the monkey experiment fit into the long history of attempts to show that animals behaved as if they were using mathematical concepts such as addition. It provided quotations from other experts in the field praising the design of the study and testifying to the importance of its results. One said "This is still an eye-popper for most philosophers and mathematicians".

No monkeying about with numbers.
Financial Times (London), 7 Nov. 1998, p. 2
Andrew Derrington

This article discussed the relation between the achievements of the monkeys and those of children. The writer stated that while infants can tell when two numbers are different, there is no evidence that they can tell whether one is bigger than the other until they have learned to talk.

None of these sources gave a very careful discussion of the experiments themselves so we turned to the original Science article to see the details.

Ordering of the numerosities 1 to 9 by monkeys
Science, 23 October 1998, pp 746-749
Elizabeth M. Brannon and Herbert S. Terrace

Here we learn that the experiments were more complicated than reported in the media. The monkeys were first trained to order the numbers from 1 to 4. However, after this they were not asked to order the numbers from 5 to 9 in sequence but rather to order all the 36 pairs of numbers from 1 to 9. The researchers divided these pairs into three groups: familiar-familiar, familiar-novel, and novel-novel where familiar meant a number from 1 to 4 and novel a number from 5 to 9. The monkeys performed better than chance on all three groups though the novel-novel group was their real achievement.
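
To see how the 36 pairs split among the three groups, the following short Python sketch enumerates them. It is only a counting illustration of the design described above; the names used are ours.

```python
from collections import Counter
from itertools import combinations

FAMILIAR = range(1, 5)   # numerosities used in training: 1 to 4
NOVEL = range(5, 10)     # numerosities not seen in training: 5 to 9


def classify(pair):
    """Label a pair by how many of its two numbers are novel."""
    novel_count = sum(n in NOVEL for n in pair)
    return ("familiar-familiar", "familiar-novel", "novel-novel")[novel_count]


pairs = list(combinations(range(1, 10), 2))
counts = Counter(classify(pair) for pair in pairs)

print(len(pairs))                    # 36
print(counts["familiar-familiar"])   # 6
print(counts["familiar-novel"])      # 20
print(counts["novel-novel"])         # 10
```

So only 10 of the 36 pairs involve two numbers that the monkeys had never seen in training.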

This difference does not seem important but shows that, to have the story exactly right, sometimes you have to go to the source.

This Science article shows the possibilities for the future of electronic journals. In the articles mentioned in the references, links are provided that take you to PubMed, a version of Medline on the web. PubMed then provides abstracts for these articles in the references. In time these links will lead, instead of to abstracts, to the journals themselves where you can access the full text of the articles if the journal is one where you have these privileges. This is already the case for some of the articles appearing in references of articles in the New England Journal of Medicine.


(1) Why do you think the experimenters did not just see if the monkeys could order the sequence from 5 to 9 instead of working with pairs?

(2) Charles Grinstead's son, looking at the data in the Science article, wondered which monkey was the smartest. Go to the article and see if you can help him.

(3) Why do you think the authors used such an archaic word as numerosities instead of numbers? Did they mean something different than numbers?

If you are very adventuresome, you can even see Rosencrantz and Macduff in action and meet the authors of the research at the CNN web site. (You can search on monkeys or go to: Monkeys can count, new study finds.) You will have to download Windows Media if you don't already have it and you may have to try it at a slack time on the web.


Details on electronic sources for important Newspapers and Journals

Major Newspapers

Online newspapers generally have photos and graphics in the current issue but not in the archived articles.

The New York Times

Today's paper is free. Archives include the past 365 days. You can search for an article but if you want to download it you have to pay $2.50. Of course, the Tuesday Science Times is a particularly valuable source for Chance News articles.

The Boston Globe

You get today's paper and you can search through the last fifteen years of Globe-staff written stories. You can obtain a full text of such stories for $2.50 between 6:00 A.M. and 6:00 P.M. EST and for $1.95 at other times.

Washington Post

You can search and get full text of articles from the past two weeks free. Articles more than two weeks old going back to Sept. 1996 are available for the same prices as the Boston Globe.

Los Angeles Times

Today's paper is free. Archives go back to 1990. You can search these but downloading a story costs $1.50.

Wall Street Journal:
The Wall Street Journal is the only major newspaper you cannot read free on the web. The web version costs $59 or $29 if you also subscribe to the paper.

Of course, for these newspapers, there is no point in paying for past articles if you or your library has access to Lexis-Nexis.

Science Magazines


Science

You can register free and obtain contents, search, and staff-written summaries of research papers and news stories back to Oct. 1995. Full text of the articles is available only to subscribers of "Science" for an additional $12. Libraries can obtain site licenses for the whole school ($1500 to $5500 depending on size of school) or workstation licenses at $25 each. At Dartmouth you can access this through the librarians.


Nature

You can freely access tables of contents of current issues and back issues of "Nature" back to June 1997. Subscribers of "Nature" can have access to search and full text. Nature does not yet have a library policy but they are experimenting with a number of schools including Dartmouth so you can access the electronic version of Nature through a librarian.

Science News:

Table of contents of current issue and back issues of "Science News" back to May 1996. Some articles available in full text. Access is free.

New Scientist:

Table of contents of current issues and back issues of "New Scientist" back to 5 April 1997. Some articles (about 30%) available in full text. Access is free.

Medical Journals

New England Journal of Medicine

Abstracts of articles are available for all users. Subscribers to the journal can access full text of articles back to Jan. 1993. These also use GenMed to access many of the references that occur in articles.

Journal of the American Medical Association (JAMA)

Here you will find contents and abstracts of all articles in JAMA back to July 5 1995. Articles that JAMA considers newsworthy are summarized in JAMA's Science News Update. Full text of articles is not available here, but access is free.

Lancet Interactive

You can register free to obtain table of contents and selected full text of articles in Lancet back to 6 Dec. 1997. Subscribers of Lancet can obtain full text of articles back to this date. Libraries that subscribe to Lancet can activate online subscriptions on behalf of their users. They can register as many users as they wish, but only three users per subscription can have simultaneous access to The Lancet Interactive.

British Medical Journal

This site contains the full text of all articles published in the weekly BMJ from January 1997. Access is free at least until the end of 1998.

All of these web versions of journals enhance their web sites with information not available in the printed journal. Increasingly, they are trying to establish links with other journals so that you can access references to their articles. In general, this will only work if you or your institution has the right to access the other electronic journal.


Chance Lectures

Invitation to Chance Lectures

Hi -

Buoyed by last year's success, Laurie Snell and I would like to announce and invite you all to the *Second* Chance Lectures, December 11-12, 1998.

Once again, our goal is that these lectures will provide a basic reference for educators who wish to use current news items in teaching probability and statistics (as does the "Chance" course). The lectures will be videotaped and will eventually be placed on the web. If you would like to see last year's program and lectures, go to the Chance web site and click on "Chance Lectures".

We have reserved a block of 15 rooms at the Hanover Inn for out-of-town participants and have limited funding available on a first-come-first-served basis to support up to 30 people (assuming two people to a room). We are also able to provide meals for all participants (dinner Friday through lunch on Saturday).

We need to have some idea of how many people to accommodate, and housing is limited. So, if you would like to attend, please register on the web, with login: chance and passwd: CoinToss

There is no registration fee.

We hope to see you there.

Best, Dan Rockmore and Laurie Snell

Second Chance Lectures
Friday December 11, 1998

Registration 5:00 - 6:00  (Hayward Lounge in Hanover Inn)

Dinner: 6:00 - 7:30  (Catered) Wheelock room in Hanover Inn

All talks will be in 1 Rockefeller Center. 
    7:45 - 8:45 PM  Clark Chapman (Southwest Research Inst.)
                    The Risk to Civilization from 
                    Extraterrestrial Objects

Saturday December 12

    8:30 - 9:00 AM Breakfast (Catered) Rockefeller Center

    9:00 - 9:55    John Baron (Dartmouth Medical School)
                   Topic to be announced.

   10:00 - 10:55   Tom Gilovich (Cornell)
                   Streaks in Sports

   11:00 - 11:15   Coffee

   11:15 - 12:10   David Freedman
                   Statistical Issues in Census 2000

   12:15 - 1:45    Lunch (Catered) Rockefeller Center

    1:45 - 2:40    Cindi McCary (American Viatical Testing Company)
                   Approaches and Social Aspects of Viaticals  

    2:45 - 3:40    John Paulos (Temple University)
                   Once Upon a Number - Mathematics and Narrative

    3:45 - 4:00    Coffee

    4:00 - 4:55    Jeff Norman (ZAIS Group)
                   Hedge Funds and Gambling in The Capital Markets

                        Conference Ends


Chance News
Copyright © 1998 Laurie Snell

This work is freely redistributable under the terms of the GNU General Public License as published by the Free Software Foundation. This work comes with ABSOLUTELY NO WARRANTY.

