CHANCE News 10.05
April 12, 2001 to May 20, 2001
Prepared by J. Laurie Snell, Bill Peterson, Jeanne Albert, and Charles Grinstead, with help from Fuxing Hou and Joan Snell.
We are now using a listserv to send out Chance News. You can sign on or off or change your address at this Chance listserv.
This listserv is used only for mailing and not for comments on Chance News. We do appreciate comments and suggestions for new articles. Please send these to:
Back issues of Chance News and other materials for teaching a Chance course are available from the Chance web site:
Chance News is distributed under the GNU General Public License (so-called 'copyleft'). See the end of the newsletter for details.
Chance News is best read with Courier 12pt font and 6.5" margin.
Are not the improbabilities of a race horse, impossibilities for the draft horse?
Lawyers in the Howland Will Case 1867
Contents of Chance News 10.05
2. A compendium of probability distributions on the web.
3. Why colleges shouldn't dump the SAT.
4. Wolves and moose on Isle Royale.
5. Population growth, technology and tricky graphs.
6. Oscar winners tend to live longer.
7. A nice application of graphics in the news.
8. Three sisters give birth on the same day.
9. The holiday birthday problem.
10. Victim's race affects decisions on killers' sentence.
11. New attention for the idea that abortion averts crime.
12. An early application of probability to the law.
13. Duelling idiots and other probability puzzlers.
14. Turning numbers into knowledge.
15. The 2001 World Series of Poker.
16. What has the hat problem got to do with codes?
17. Solution to the Holiday problem.
Note: If you would like to have a CD-ROM of the 1997 and 1998 Chance Lectures, also available on the Chance web site, send a request to
with your address. There is no charge.
You can also download the Chance Lectures from the Chance web site to view
them locally using your internet browser. See the Chance Lectures homepage for
A forsooth item:
Reader Sandy MacRae wrote us saying that the quote we used for Chance News 10.04:
That the ten digits do not occur with equal frequency
must be evident to any one making much use of logarithmic
tables and noticing how much faster the first pages wear
out than the last ones.
Simon Newcomb 1881
should surely have been a forsooth item. We agree so we give it that status in this Chance News.
Here are Forsooth items from the April 2001 RSS News:
Anne Robinson: How many novels had Barbara
Cartland written by the time of her death?
Over 300 or over 700?
Contestant: Over 300
Anne Robinson: Wrong: over 700
The Weakest Link (BBC TV)
1 March 2001
ATL [Association of Teachers and Lecturers]
challenged the DfEE's [Department of Education
and Employment] definition of "expected level".
The original meaning of this term was that national
curriculum levels would represent the "average
expectation". Hence the average expectation of an
eleven year old was level 4.
However, DfEE has now informed the ATL that the
expected level is the minimum standard at which
the vast majority of pupils should be operating.
ATL is pressing DfEE to explain when a decision
was made to redefine 'expected level' and by whom.
2 December 2000
Experts estimate that global temperatures could
rise as much as 5.8 Celsius (42.5 degrees Fahrenheit)
Times of India website
Michael McLaughlin called our attention to his Compendium of Probability Distributions. It is similar to the popular books on distributions, which always seem to be checked out of our library; this compendium can remain on your computer. It covers the standard distributions and some not-so-standard ones, including ways to simulate them when feasible. It is available separately but is also part of a Macintosh software package for probability modeling, which also looks interesting and is freely available from the same web site.
Jack Kaplan sent us a commentary on the following article:
Why colleges shouldn't dump the SAT.
Business Week, 9 April 2001
Robert J. Barro.
Barro is a professor of economics at Harvard and a senior fellow at the Hoover Institution. In this article he explains why he is opposed to dropping the SAT scores as a requirement for admission to college. He reports that Richard Atkinson, president of the University of California, has proposed eliminating SAT scores as a required part of the college application. (Actually Atkinson recommended dropping the aptitude test SAT 1 but he recommends requiring the SAT II tests which test mastery of a specific subject.)
Every three years the Education Department conducts the National Postsecondary Student Aid Study (NPSAS), which includes students' grade point averages, admission test scores (including the SATs), and other family and school variables. Barro reports that he used the NPSAS studies for 1990, 1993, and 1996, which provided 33,000 observations, to study how well the admission tests predict college performance. He concludes from this study:
In this sample, admissions-test scores strongly
predict college grades, though much of the individual
variation in grades remains unexplained. Taking into
account many other factors (including college attended,
race and gender variables, and parental income and
education), the t-statistic -- a measure of how closely
two variables move together -- for the admissions test
is 60. In comparison, researchers customarily regard a
result as significant if this statistic exceeds 2.
Therefore, admissions tests have strong predictive power
for college grades. They are as good for senior as for
Jack Kaplan writes:
Apparently Barro doesn't understand that the size of
the t-statistic measures the degree of certainty that
there is some relationship between SAT scores and
college grades, but it does not measure the strength
of that relationship. A weak relationship with a large
sample size can have the same t-statistic as a strong
relationship with a small sample size. This is something
that's taught -- or ought to be -- in any introductory
statistics class. Put another way, statistical significance
does not mean significant in the sense of being important.
Note that Barro also does not tell us what t-test he is carrying out.
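Kaplan's point can be illustrated with a small calculation. The sketch below uses the textbook relation between a simple correlation r from n observations and its t-statistic, t = r*sqrt(n-2)/sqrt(1-r^2). This is only an assumption about the kind of test involved: Barro's statistic comes from a regression with many other variables, so the numbers are an illustration and not a reconstruction of his analysis. The function names and the sample values of r and n are ours.

```python
import math

def t_from_r(r, n):
    """t-statistic for testing zero correlation, given r and sample size n."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r * r)

def r_from_t(t, n):
    """Invert the relation: the correlation that yields a given t at size n."""
    return t / math.sqrt(t * t + n - 2)

# A weak relationship in a large sample and a strong one in a small sample.
weak_large = t_from_r(0.10, 33000)
strong_small = t_from_r(0.90, 80)
print(round(weak_large, 1), round(strong_small, 1))

# The simple correlation that would give t = 60 with 33,000 observations.
print(round(r_from_t(60, 33000), 2))
```

In this illustration a correlation of 0.10 with 33,000 observations and a correlation of 0.90 with 80 observations both give a t-statistic of about 18, and under the same simplification t = 60 corresponds to a correlation of only about 0.31.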
There has been a large amount of research done on the predictive power of SAT I tests. If you do a simple regression at your college you will probably get a correlation between .3 and .4 which is not very impressive. However, the College Board has carried out studies that correct for the "restricted range" of the data and something called "criterion unreliability." Doing this they find correlations of about .65. You can find a summary of their predictive studies and further references here: and the results for a sample university here.
Barro discusses the results of his study for women and minorities. His results here are consistent with other predictive studies of admission tests. One result that does not seem consistent with other studies is his report that the mathematics part of the admission test is nearly twice as good as the verbal part as a predictor of college grades. If you look at the results of the sample university referred to above, you will see that the correlation for SAT Verbal was .62 and for SAT Math was .65.
(1) What t-test do you think Barro was referring to?
(2) What do you think "criterion unreliability" means?
For those who follow the saga of the wolf-moose herd on our beloved Isle Royale, we have put Rolf Peterson's 2000-2001 moose-wolf report on the Chance web site. In addition to beautiful pictures of moose and wolves, you will find graphs showing the results of a predator-prey relationship as observed for 42 years.
Roger Pinkham suggested the next article.
Population growth, technology and tricky graphs.
American Scientist, May-June 2001, pp 209-211
Peter Schulze and Jack Mealy
In writing about the human population in the September 1960 issue of Scientific American, noted ecologist Edward S. Deevey Jr. employed a graph which suggested that the world population was leveling off in the current industrial age. The authors of this article say this graph has been responsible for a serious misconception about population growth. The graph shows the population growth over a million-year period according to major periods of human development: "toolmaking" from one million to 10,000 years ago, "agriculture" from 10,000 to 300 years ago, and "industrial" from 300 years ago to 1950.
When you look at a specific period you see a graph resembling a parabola suggesting a leveling off in each period. Schulze and Mealy give references to seven books and articles, most published after 1990, where the graph has been reproduced and in some cases used to argue that the population has become stabilized during the last few decades. They point out that the population has certainly not stabilized and, in fact, has doubled since the time of Deevey's graph. So they ask: what is wrong with this picture? Their answer is that Deevey used logarithmic scales for both population and the time axes. They write:
What sort of growth would be needed to show an obvious
increase on a graph with two logarithmic axes? To rise as a
straight line, a variable must make its next order-of-magnitude
increase in an order-of-magnitude less time than the previous
one. For example, consider a population that went from 100
to 1,000 in 100 years, an annual increase of 2.33 percent. To
plot as a straight upward-sloping line, the population would
have to reach 10,000 in the next 10 years and then hit
100,000 in the following year. So regardless of the actual
situation, all plausible positive rates of growth will appear
to plateau on Deevey's graph.
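The arithmetic in this example is easy to check. The following sketch (the function name is ours) computes the constant annual growth rate implied by each of the three steps the authors describe.

```python
def annual_rate(start, end, years):
    """Constant annual growth rate that takes `start` to `end` in `years`."""
    return (end / start) ** (1.0 / years) - 1.0

# The three steps in the example: each tenfold increase takes a tenth of the time.
steps = [(100, 1000, 100), (1000, 10000, 10), (10000, 100000, 1)]
rates = [annual_rate(a, b, y) for a, b, y in steps]
for (a, b, y), r in zip(steps, rates):
    print(f"{a} -> {b} in {y} years: {100 * r:.2f} percent per year")
```

The first step reproduces the 2.33 percent in the quotation; the second and third require roughly 26 percent and 900 percent per year, which is why no plausible growth rate keeps rising as a straight line on such a graph.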
Schulze and Mealy show another example by plotting the Dow-Jones industrial average from 1930 to the present. Using simple linear scales we see that the Dow increased slowly from about 100 to about 1000 in 1980 and then increased much more rapidly to about 11,000 at the present time. Plotting the graph using logarithmic scales for both axes gives a completely different picture. The graph again looks like a parabola suggesting a gradual increase and leveling off.
The authors then ask how such graphs should be drawn. They point out that it depends on the questions you are asking. The linear scales are useful when you want to read numbers directly from the graph. However, as the graph of the Dow shows, the compression of the vertical scale can obscure interesting changes that happen during periods of low values -- for example the 1929 stock-market crash. Referring again to population graphs the authors write:
Figures with a linear time axis and a logarithmic
population axis have a convenient feature: A
constant percentage rate of increase plots as a
constant slope. This attribute makes it easy to
detect changes in the rate of growth, which
appear as shifts in the slope of the line. Using
such a graph for the Dow would show clearly
that the rate of increase during the 80's and 90's
was greater than during the 60's and 70's.
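The constant-slope property follows from log(P(1+r)^t) = log P + t log(1+r). Here is a minimal sketch with two made-up growth rates (3 and 12 percent per year are our illustrative values, not figures for the Dow or for population).

```python
import math

def log10_series(start, rate, years):
    """log10 of a quantity growing at a constant annual percentage rate."""
    return [math.log10(start * (1 + rate) ** t) for t in range(years + 1)]

slow = log10_series(100.0, 0.03, 10)   # hypothetical 3 percent per year
fast = log10_series(100.0, 0.12, 10)   # hypothetical 12 percent per year

# Year-to-year change in the logarithm, i.e. the slope on a semi-log plot.
slow_slopes = [b - a for a, b in zip(slow, slow[1:])]
fast_slopes = [b - a for a, b in zip(fast, fast[1:])]
print(round(slow_slopes[0], 4), round(fast_slopes[0], 4))
```

Within each series every yearly step in the logarithm is the same, and the faster-growing series has the steeper slope, so a change in growth rate shows up as a bend in the line.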
Thus while there are places for logarithmic scales the authors remark that the object of a graph is to show patterns and so logarithmic scales should not be used when they obscure a pattern as they did in Deevey's graph of the world population. They conclude:
Arguments about when and how the population will stop
increasing are more important now than ever before.
People should not let these tricky graphs cloud the
debates they are intended to illuminate.
(1) The Dow Jones data is available from Yahoo (see Historical Quotes under Research and Education). The ticker symbol for the Dow is ^DJI. Get the weekly data from 1928 to the present and experiment with different combinations of scales, logarithmic and linear. Comment on the effects of your choice of scales.
(2) Deevey gives the source of his data in his articles. See if you can reconstruct his data from these sources. If successful try different scales, logarithmic and linear, and comment on the effects of your choice of scales.
Movie Buff Dan Rockmore suggested the next article.
Study: Oscar winners tend to live longer.
Nando Times, 14 May 2001
Available at Nando Times until 27 May 2001
Survival in Academy Award--winning actors and actresses.
Annals of Internal Medicine, Vol. 134, No 10, 15 May 2001
Donald A Redelmeier, Sheldon M. Singh
The abstract for this study as presented in the Annals of Internal Medicine described the study as follows:
Background: Social status is an important predictor
of poor health. Most studies of this issue have focused
on the lower echelons of society.
Objective: To determine whether the increase in status
from winning an academy award is associated with
long-term mortality among actors and actresses.
Design: Retrospective cohort analysis.
Setting: Academy of Motion Picture Arts and Sciences.
Participants: All actors and actresses ever nominated
for an academy award in a leading or a supporting role
were identified (n = 762). For each, another cast member
of the same sex who was in the same film and was born
in the same era was identified (n = 887). Measurements:
Life expectancy and all-cause mortality rates.
Results: All 1649 performers were analyzed; the median
duration of follow-up time from birth was 66 years, and
772 deaths occurred (primarily from ischemic heart disease
and malignant disease). Life expectancy was 3.9 years
longer for Academy Award winners than for other, less
recognized performers (79.7 vs. 75.8 years; P = 0.003).
This difference was equal to a 28% relative reduction in
death rates (95% CI, 10% to 42%). Adjustment for birth
year, sex, and ethnicity yielded similar results, as did
adjustments for birth country, possible name change,
age at release of first film, and total films in career.
Additional wins were associated with a 22% relative
reduction in death rates (CI, 5% to 35%), whereas
additional films and additional nominations were not
associated with a significant reduction in death rates.
Conclusion: The association of high status with increased
longevity that prevails in the public also extends to celebrities,
contributes to a large survival advantage, and is partially
explained by factors related to success.
Those who win more than one Academy Award are counted only once, which explains the difference between the number nominated (762) and the number of controls (887).
As seen above, Oscar-winners have a life expectancy of 3.9 years (79.7 vs. 75.8 yrs) greater than the matched controls from the same movie who have not been nominated for Oscars. The researchers report that the winners have about the same advantage over those who were nominated but did not get an Oscar -- 79.7 vs. 76.
The researchers comment that this increase of about 4 years in longevity is equal to the estimated societal consequence of curing all cancers in all people for all time. They do not have any simple explanation for this increase in longevity. But they suggest the following possible explanations: Oscar winners are under greater scrutiny that may lead to a more controlled life to maintain their image; they may have managers with a vested interest in their reputation and enforce high standards of behavior; they may have more resources and so can avoid stress and have access to special privileges that others do not.
The usual explanations for the longer life expectancy of the rich compared with the poor (better schooling, better health care, etc.) do not seem to apply here.
The Nando Times remarks:
Examples of long-lived Oscar-winners abound. John
Gielgud who won for "Arthur," was 96 when he died
last year. George Burns ("The Sunshine Boys") lived
to 100. Leading lady Greer Garson ("Mrs. Miniver")
reached 92. So did Helen Hayes ("The Sin of Madelon
Claudet," "Airport"). Katharine Hepburn - with a record
four Academy Awards - turned 94 on Saturday.
(1) What reasons can you think of for this increased longevity?
(2) In discussing the limitations of their study the authors say that they should have had more biographical information about those in the study. What would they have looked for if they had?
(3) Would the examples in the Nando Times by themselves suggest that Oscar winners live longer than those who do not win?
John Towse wrote us:
The BBC news server has a web page about the foot and
mouth outbreak among livestock in the UK: As with other
material on their site, I don't know how long it will last.
However, some chance people might find it useful to have
some topical data to work with, in terms of how data are
presented (the comparison of cumulative vs. frequency
distributions) and how epidemiological models of the
progress of the virus correspond to reported incidents etc.
This is, indeed, a very nice illustration of the use of graphics in current news. It's still there today and we hope it will still be there when you read this.
The Spring 2001 issue of Chance Magazine arrived and it is a great issue. All the articles and columns are first-rate. Here is an interesting article related to chance in the news.
Three sisters give birth on the same day.
Chance Magazine, Spring 2001, pp 23-25
On March 14 the Stevens Point Journal asked: What is the probability that three sisters would give birth on the same day? This event happened when sisters Karralee Morgan, Marrianne Asay and Jennifer Hone all gave birth on March 11, 1998 in American Fork, Utah. The Stevens Point Journal article included the comment:
If the three mothers hadn't all gotten pregnant about
the same time, then you could say that each baby
had a one-in-365 chance of having March 11 as
his or her birthday. The probability of the occurrence
of the three successive births on the same date would
be about one in 50 million. (or (1/365)^3)
We have often been asked questions about coincidences like this and have never felt very comfortable about our answers. Wetzel shows us why this might be by considering five different versions of this problem and indicating how we might estimate the odds for each of the five. Here are his five versions with his estimates:
1. Given three particular sisters who each will give
birth to a baby this year, what is the probability that
they will all give birth on March 11? Ans. (1/365)^3
or about 1 in 50 million
2. Given three particular sisters who each will give
birth to a baby this year, what is the probability they
will all give birth on the same day? Ans. 365/365^3
or 1 in 133,225
3. Before they had their babies, what was the probability
that the three sisters referenced in the Stevens Point
Journal newspaper article would give birth on March 11?
Ans. about 1 in 500,000
4. Before they had their babies, what was the probability
that the three sisters referenced in the Stevens Point
Journal newspaper article would all give birth on the
same day? Ans. about 1 in 6,000
5. What is the probability that somewhere in the
United States there will be three sisters who all
give birth on the same day sometime during the
next year? Ans. lower bound; about 1 in 407
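The first two answers can be checked directly. The sketch below assumes 365 equally likely and independent birthdays (ignoring leap years), the same simplification the newspaper used; versions 3 to 5 depend on further information and are not attempted here.

```python
from fractions import Fraction

DAYS = 365  # assumes 365 equally likely birthdays, ignoring leap years

# Version 1: all three births fall on one specified day (March 11).
p_specific_day = Fraction(1, DAYS) ** 3

# Version 2: all three births fall on the same day, whichever day it is.
p_same_day = DAYS * p_specific_day

print(p_specific_day.denominator)  # 48627125
print(p_same_day.denominator)      # 133225
```

So version 1 is 1 in 48,627,125, which rounds to the "1 in 50 million" quoted, and version 2 is exactly 1 in 365^2 = 133,225.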
Wetzel remarks: The last question is the closest to what we believe people mean when they ask the question "What is the probability that the three sisters would give birth on the same day?" It is also the hardest to estimate since you have to estimate the number n of families in which there are three sisters all of childbearing age and all becoming pregnant this year. He encourages readers who have a better lower bound for the answer to version 5 or a better way to estimate n to submit their solutions to email@example.com with "three sisters" in the subject line. Wetzel will review the solutions and prepare a follow-up article for Chance Magazine.
What assumptions did Wetzel have to make, and what information did he need, to answer each of his five questions?
Chance Magazine, Spring 2001, p 60
Editor Gabor Szekely
This column invites readers to provide solutions to interesting probability problems with the possibility of winning a one year extension to their Chance Magazine subscription. This issue provided the solution to an interesting problem posed in the Fall 2000 column. Charles Grinstead pointed out to us that it is also problem 34 in Fred Mosteller's classic "Fifty challenging problems in probability." Here is the way Mosteller states the problem:
Labor laws in Erewhon require factory owners to give
every worker a holiday whenever one of them has a
birthday and to hire without discrimination on grounds
of birthdays. Except for these holidays they work a
365-day year. The owners want to maximize the
expected total number of man-days worked per
year in a factory. How many workers do factories
have in Erewhon?
This problem has a pleasing solution: either 364 or 365. If you get stuck you can find Mosteller's solution at the end of this Chance News. His solution is a great example of the use of the additive property of expected values.
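Readers who would like to confirm the stated answer numerically before reading the solution can try the following sketch. It assumes that birthdays are independent and equally likely over 365 days, and the function name is ours.

```python
DAYS = 365

def expected_man_days(n):
    """Expected man-days worked per year with n workers.

    A given day is a working day only if none of the n workers has a
    birthday on it, which has probability (1 - 1/DAYS)**n; on each working
    day all n workers work.
    """
    return DAYS * n * (1 - 1 / DAYS) ** n

best = max(range(1, 1001), key=expected_man_days)
print(best, round(expected_man_days(364), 2), round(expected_man_days(365), 2))
```

The search lands on 364 or 365, and the two values agree up to rounding error, since going from 364 to 365 workers multiplies the expectation by (365/364)*(364/365) = 1.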
Victim's race affects decisions on killer's sentence, study finds.
The New York Times, April 20, 2001, page A10
This short article describes the findings of a study of homicide cases in North Carolina that occurred from 1993 to 1997. Among cases where the death penalty was possible, rates for receiving the death penalty are broken down by the race of the victim and defendant. (Actually, only the categories "white" and "nonwhite" are used, but the article states that nonwhite in North Carolina means mostly African-American and some Hispanic.) The study found that people convicted of killing a white person are more likely to receive the death penalty than those convicted of killing a person who isn't white.
The relevant information is presented in a table and graph, summarized below, where
W = white, NW = not white
Defendant   Victim   # of cases   % receiving the death penalty
NW          W        284          11.6
W           W        541           6.1
W           NW        80           5.0
NW          NW       616           4.7
One can immediately see that the # of cases column is also quite interesting, but this is not discussed in the article.
Using the above numbers, one finds that 7.99% of defendants who were convicted of killing a white person received the death penalty, versus 4.73% for defendants convicted of killing a person who isn't white. Other studies in recent years have also shown that the race of the victim affects sentencing (see for example Chance News 4.04). Interestingly, this study also looked at what led to these different rates and found that apparently prosecutors were more likely to let defendants plead guilty in exchange for a lesser sentence if the victim wasn't white.
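The two percentages are case-weighted averages of the rows with the same victim category. A short sketch of that calculation, using only the four rows of the table (the variable and function names are ours):

```python
# (defendant, victim, number of cases, percent receiving the death penalty)
rows = [
    ("NW", "W", 284, 11.6),
    ("W", "W", 541, 6.1),
    ("W", "NW", 80, 5.0),
    ("NW", "NW", 616, 4.7),
]

def death_rate_by_victim(victim):
    """Case-weighted percent receiving the death penalty for a victim group."""
    selected = [(n, pct) for _, v, n, pct in rows if v == victim]
    cases = sum(n for n, _ in selected)
    sentences = sum(n * pct / 100 for n, pct in selected)
    return 100 * sentences / cases

print(round(death_rate_by_victim("W"), 2))   # 7.99
print(round(death_rate_by_victim("NW"), 2))  # 4.73
```

Because the published percentages are rounded to one decimal place, the implied numbers of death sentences are not whole numbers; the weighted averages still reproduce the 7.99% and 4.73% quoted above.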
(1) What percentage of defendants received the death penalty?
(2) Explain how to find the 7.99% and 4.73% given above.
(3) (a) Among defendants who received the death penalty, which of the four scenarios do you think is most likely? Least likely? (b) Find these percentages.
(4) What do you make of the distribution of the number of cases? Do the numbers surprise you?
New attention for the idea that abortion averts crime.
The New York Times, April 14, 2001, B7
The impact of legalized abortion on crime.
Quarterly Journal of Economics, May 2001
John J. Donohue III and Steven D. Levitt
John J. Donohue III of Stanford Law School and Steven D. Levitt, a professor of economics at the University of Chicago, have attempted to establish a link between the legalization of abortion in the early 1970's and the drop in crime in the U.S. that began in the early 1990s. As Levitt puts it, "The thesis is about as simple as you could have. A difficult home environment leads to an increased risk of criminal activity. Increased abortion reduced unwantedness and therefore lower[ed] criminal activity." It shouldn't be surprising that the study and its authors have received a fair amount of criticism.
Donohue and Levitt examined two main connections between abortion and reduced crime. The first connection arises from the simple fact that fewer births mean fewer potential criminals. In addition, children who are most at risk for criminal behavior as adults are also born to mothers whose rates of abortion have been highest (for example, teenagers and unmarried women.) The second link between abortion and reduced crime is that children who are born are more likely to be wanted and to grow up in an improved environment. Therefore, the argument goes, they will be less likely to commit crimes later in life. The authors also analyzed the five states -- Alaska, California, Hawaii, New York, and Washington-- that legalized abortion in 1970, three years prior to Roe v. Wade. These states saw drops in crime before the rest of the nation.
Some of the many questions raised in the article are: Do unwanted children really commit more crimes? What is the effect of the decrease in the illegal drug trade or the availability of handguns in some cities? What roles do increased incarceration rates and policing play? Other questions and starting points for discussion (see below) arise from Donohue and Levitt's paper, "The Impact of Legalized Abortion on Crime."
(1) The authors acknowledge that it is difficult to determine the abortion rate prior to 1973 (when it became legal). Why is knowing this rate important? What other information might you use (e.g., adoption rates) to try to determine the difference between the rates before and after 1973?
(2) To find a link between abortion and reduced crime in, say, 1995, Donohue and Levitt can't use abortion rates in 1995 (why?!). Instead, they define an "Effective legal abortion rate" as the sum over all ages, a, of (the legal abortion rate in year 1995 - a)*(% of arrests in 1995 by people age a). Replacing 1995 by t gives the effective legal abortion rate in year t. (a) What does this sum mean? (b) Show that if the abortion rate is constant from year to year, then the effective abortion rate equals this constant rate. (c) Under what circumstances will the effective abortion rate be less than the actual abortion rate?
(3) The time-series graphs on crime that the authors provide show that crime rates have fluctuated rather dramatically since 1973. In particular, there was a sharp drop in the early eighties similar to the drop in the nineties, followed by a sharp rise in the later eighties. The authors rarely mention these fluctuations and occasionally group data in such a way that the differences over time are masked. Why might this be?
(4) How would you design a study to determine if abortion reduces crime? Ted Joyce, an economist critical of the study, who is quoted in the Times article suggests that "you would need to study 50,000 women, half of whom terminated their pregnancies, half of whom wanted to but did not, and follow them and their children over time." What do you think of this strategy?
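For those who want to experiment with the "effective legal abortion rate" in question (2), here is a minimal sketch of the weighted sum as described there. The arrest shares and abortion rates are made-up numbers chosen only to show the mechanics, and the function names are ours; none of them come from the Donohue and Levitt paper.

```python
def effective_abortion_rate(year, abortion_rate, arrest_share):
    """Sum over ages a of (abortion rate in year - a) * (share of arrests at age a).

    abortion_rate: function mapping a calendar year to a legal abortion rate.
    arrest_share:  dict mapping age a to the fraction of arrests in `year`
                   of people aged a (the fractions should sum to 1).
    """
    return sum(abortion_rate(year - a) * share for a, share in arrest_share.items())

# Hypothetical arrest shares by age; these are made-up numbers for illustration.
shares = {15: 0.2, 20: 0.4, 25: 0.3, 30: 0.1}

def constant_rate(year):
    return 300.0  # hypothetical rate, the same in every year

def legal_since_1973(year):
    return 300.0 if year >= 1973 else 0.0  # hypothetical step at legalization

print(effective_abortion_rate(1995, constant_rate, shares))
print(effective_abortion_rate(1995, legal_since_1973, shares))
```

With a constant rate the weights simply sum to one, while with the step function only the age groups born after legalization contribute, so the effective rate falls below the current rate.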
Goran Djuknic suggested the next article.
She had to have it.
The New Yorker, April 23&30,2001, pp 62-70
Benjamin Peirce and the Howland Will
JASA Vol. 75, No. 371, 497-506 (Available from JSTOR)
Paul Meier and Sandy Zabell
The New Yorker article tells the story of the life of Hetty Green. Hetty has been ranked as the world's greatest miser in the Guinness Book of World Records with the following citation:
If meanness is measurable as a ratio between expendable
assets and expenditure then Henrietta Howland Green
(1835-1916), who kept a balance of over $31,400,000
in one bank alone, was the all-time world champion. Her
son had to have his leg amputated because of her delays
in finding a free medical clinic. She herself ate cold porridge
because she was too thrifty to heat it. Her estate proved to
be worth $320 million, equivalent to $3,816 million in 1996.
Hetty's grandfather, Isaac Howland Jr., started a whaling business at a time when the principal source of oil was whale oil. Hetty inherited money from this business and used it to make investments, especially in railroads and real estate. Her estate of $320 million put her right up there with J. Pierpont Morgan, who was worth $80 million when he died, and John D. Rockefeller, who was worth $900 million when he died.
Hetty was called the Witch of Wall Street, a title suggested perhaps by her shabby clothes and shrewd investments. She once said: "When I fight there is usually a funeral, and it isn't mine." However, she is, perhaps, best known for a lawsuit over her Aunt Sylvia Ann Howland's will. This lawsuit involved one of the first applications of probability theory in the courts in the United States.
Hetty's Aunt Sylvia was the last of the Howland partners in the whaling business. When she died in 1865, a will, written in 1863, stipulated that about half be given in bequests to various individuals and corporations and the remaining, $1,132,929, be placed in trust, with the income to go to Hetty during her lifetime and the principal to be distributed to lineal descendants of her grandfather after Hetty's death.
Hetty had expected to receive the principal directly and filed a bill of complaint in federal court claiming that the 1863 will did not represent her aunt's real wishes. She produced a will, written in 1862 and signed by her Aunt Sylvia and three witnesses, which left essentially all of her aunt's estate directly to her. The will was accompanied by an "extra page" which said that Aunt Sylvia revoked "all wills made by me before or after this one." Hetty stated that she had written this page and her Aunt Sylvia had signed it. She said that there were no witnesses to this page because it was part of a "mutual will" designed to assure that none of the Howland money would go to Hetty's father Edward Robinson. Hetty and her aunt wanted to keep this a secret, using it only if necessary. There were two copies of this extra page, one kept by Hetty and the other by her Aunt Sylvia.
The two signatures on the extra page looked remarkably similar to the signature on the proper will. These two signatures also appeared at exactly the same place on the two pages. The executor of the estate, Thomas Mandell, claimed that Hetty had drawn up the second page herself, without her aunt knowing about it, and then traced her aunt's signature at the bottom of the pages.
This led to a trial that became known as the Howland Will Case, in which the principal issue was the authenticity of the signatures on the supplementary page to the 1862 will. Both sides acknowledged the authenticity of the will itself and Sylvia's signature on the will.
Hetty and Mandell spared no money in providing the court with expert witnesses to support their side of the argument. It took more than a year to hear the evidence and this resulted in a thousand pages of evidence for submission to the United States Circuit Court.
Hetty's lawyers presented a number of handwriting experts both from government agencies and from colleges. All stated that, in their informed opinion, the signatures on the "second page" had been written by Sylvia Ann Howland. One of the more impressive witnesses was John Quincy Adams, the grandson of President John Quincy Adams. Adams presented 110 returned checks from his grandfather's papers. He provided these to an experienced engraver, J. C. Crossman, who compared them one by one and labeled all 5,995 comparisons. The twelve that appeared most similar were photographed on transparent paper so that one could be superimposed for comparison. Crossman and other experts testified that they were remarkably similar and several were considered closer to each other than Sylvia's signature on the extra page was to the signature on her will.
Mandell's lawyers responded that President John Quincy Adams was famous for the uniformity of his writing and remarked "Are not the improbabilities of a race horse impossibilities for the draft horse?" Hetty's lawyers responded by giving examples from bankers and others showing that it was not unusual to have such similar signatures. In addition, Hetty's lawyers had distinguished Harvard scientists, including Oliver Wendell Holmes Sr. of the Harvard Medical School, testify that examinations with microscopes found no evidence of the signatures being traced from the original signature.
To match Adams' testimony, Mandell's lawyers felt they needed a superstar to testify to the impossibility of the signatures in question being authentic. They found this in Harvard mathematician and astronomer Benjamin Peirce and his son Charles Sanders Peirce. Benjamin Peirce was one of the first American research mathematicians. One often sees his quotation: "Mathematics is the science that draws necessary conclusions". His son Charles worked in several different fields. He is best known for his contributions to philosophy, where he founded the field of semiotics, a field of study devoted to the nature, varieties, and uses of signs. The Peirces were also among the first Americans to make significant contributions to statistics. These contributions are discussed in "Mathematical statistics in the early states," Stephen M. Stigler, The Annals of Statistics, 1978, Vol. 6, No. 2, 239-265 (available from JSTOR).
Charles Peirce testified on the results of an experiment suggested by his father. His father had the idea of comparing two signatures in terms of the downward strokes made on each letter. For example S and Y have two such downward strokes, d has one and o none. He proposed measuring the similarity of two signatures by the number of corresponding downward strokes that could not be distinguished.
They obtained 42 signatures of Sylvia from documents she signed in her later life. They had these 42 signatures enlarged and printed on transparent paper to allow them to superimpose pairs and count the number of downward strokes that could be considered to match.
Since there were choose(42,2) = 861 possible pairs of signatures, this resulted in 30*861 = 25,830 downward strokes to compare. They found that 5,325 of the 25,830 could be considered a match. They concluded from this that the probability of a match for a particular downward stroke on two signatures is about 1/5.
Comparing Sylvia's signature on the 1862 will with her signature on the extra page, they found a match on all 30 downward strokes. They concluded that the chance of this happening was 1 in 5^30, which they said was 1 chance in 2,666,000,000,000,000,000,000. (This number should have been 1 in 931,000,000,000,000,000,000.)
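The arithmetic behind these figures is easy to check. The short sketch below uses only the counts quoted in the account of the Peirces' experiment (42 signatures, 30 downward strokes, 5,325 matches); the variable names are ours.

```python
from math import comb

signatures = 42      # signatures of Sylvia collected by the Peirces
strokes = 30         # downward strokes compared in each pair
matches = 5325       # stroke comparisons the Peirces judged to match

pairs = comb(signatures, 2)      # distinct pairs of signatures
comparisons = strokes * pairs    # total stroke-by-stroke comparisons
p_hat = matches / comparisons    # observed match rate, close to 1/5
odds = 5 ** strokes              # "1 chance in 5^30" if strokes were independent

print(pairs, comparisons)        # 861 25830
print(round(p_hat, 4))           # 0.2062
print(f"{odds:.3e}")             # 9.313e+20
```

The last line shows that 5^30 is about 9.31 x 10^20, the corrected figure, and not the 2.666 x 10^21 quoted in court.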
After his son described these results the father addressed the court on his interpretation of the results and stated:
It is practically impossible. Such evanescent shadows of
probability cannot belong to actual life. They are unimaginably
less than those least things which the law cares for... Under
a solemn sense of the responsibility involved in the assertion,
I declare that the coincidence which has here occurred must
have had its origin in an intention to produce it... it is utterly
repugnant to sound reason to attribute this coincidence to any
cause but design.
But the Peirces did more than just this one calculation. They also provided a table and a graph to support the claim that the data could be described by a binomial distribution with p = .2. Here is the table:
Number of Agreements    Observed    Expected
         3                 97         67.6
         4                131        114.1
         5                147        148.3
         6                143        154.5
         7                 99        132.4
         8                 88         95.2
         9                 55         58.2
        10                 34         30.5
        11                 17         13.9
        12                 15          5.5
The expected numbers were obtained by multiplying the binomial probabilities B(30,.2,k) by 861 for k = 0 to 30. Note that the Peirces gave only the middle values: k = 3 to 12. They referred to the other 35 as "undistributed values". Evidently there were 15 below 3 (all 2's) and 20 above 12. The expected number below 3 is 38 and above 12 is 2.7. Thus there were too few small values and too many large values.
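Readers who want to reproduce the expected column can do so with a few lines of code. This is a minimal sketch that assumes, as the Peirces did, a binomial distribution with 30 trials and p = .2 applied to the 861 pairs; the observed counts are copied from the table and the function name is ours.

```python
from math import comb

n, p, pairs = 30, 0.2, 861

def binom_pmf(k):
    """Probability of exactly k matches among n independent strokes."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

# Expected number of pairs showing k agreements, for every possible k
expected = {k: pairs * binom_pmf(k) for k in range(n + 1)}

# Observed counts as given in the Peirces' table (k = 3 to 12 only)
observed = {3: 97, 4: 131, 5: 147, 6: 143, 7: 99,
            8: 88, 9: 55, 10: 34, 11: 17, 12: 15}

for k in sorted(observed):
    print(f"{k:2d}  {observed[k]:4d}  {expected[k]:6.1f}")

below = sum(expected[k] for k in range(3))          # expected count for k < 3
above = sum(expected[k] for k in range(13, n + 1))  # expected count for k > 12
undistributed = pairs - sum(observed.values())      # pairs left out of the table
print(round(below, 1), round(above, 1), undistributed)
```

The last line gives the expected counts in the two tails and the number of "undistributed values", which can be compared with the 15 and 20 actually observed.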
Of course today we would wonder about the assumption of independence required for the Bernoulli trials computations. But in his article, Stigler points out that, even though probability and statistics were actively being developed in England and Europe, the US was slow to become interested in mathematics in general and probability and statistics in particular.
In their paper, Meier and Zabell remark that the other great mathematician-astronomer of the time, Simon Newcomb, provided the product rule for the Universal Cyclopedia (1900) without mentioning independence. Of course, these two examples themselves may not be independent since Newcomb was a student of Benjamin Peirce.
The fit in the corresponding graph probably looked reasonable to the court. Of course there was no formal theory of goodness of fit at this time. It is tempting to carry out a chi-square test, but Meier and Zabell point out that the assumptions for this test are not met. Assuming that the number of agreements in a single pair of signatures has a binomial distribution, then for the chi-square test we should have 861 independent pairs of signatures. But, in fact, the pairs of signatures are not independent since they were all chosen from just 42. However, Meier and Zabell provide other tests that do apply and conclude that the data do not support Peirce's assumption of a binomial distribution.
In discussing the assumptions needed for a binomial distribution, Meier and Zabell point out that knowing that there is agreement in, for example, positions 1 and 3 would seem to make it much more likely that there is agreement in position 2. The effect of such dependence would be to increase the variance, which is consistent with the larger number of high values observed than expected.
They also suggest that there may be correlation between signatures made close in time. Signatures made at the same sitting might be more similar than those made days or months apart. Of course, it would be natural to carry out the Peirce experiment on the 110 signatures of John Quincy Adams. However, this would require an estimate for p, the probability of a match for a particular stroke. We asked Sandy Zabell if he knew if these signatures were still available. He said that when writing their article they had visited a federal records depository near Cambridge, MA and he thought that these signatures were there. This would make a nice project for someone.
Having gone to the source of the court testimony and looked at the data, Meier and Zabell asked what the amateur sleuth could conclude: did Hetty forge the signatures? They remark that "no unequivocal conclusion seems possible."
Finally, what did the court do? The result of the trial is reported in The American Law Review (1870, Vol. 4, pp. 656-663). At the time of the court proceedings, Massachusetts law treated a beneficiary under a will as disqualified (on account of self-interest) from testifying about the circumstances surrounding the will, unless called to testify by the opposite party or by the court. The court allowed Hetty to testify but reserved judgment on whether this testimony could be used in the final decision. The court ruled that it could not, and without it the evidence was insufficient to support Hetty's claim that the 1862 will was in fulfillment of a contract. On this ground, the court found for the defendant, Thomas Mandell. Meier and Zabell comment:
It can only be conjectured whether it might have ruled
otherwise had it considered the contested signatures to
(1) Do you think the signature was forged?
(2) If you had been asked to be an expert witness for the court what suggestions would you have to determine the authenticity of the signatures? <<<========<<
Duelling Idiots and other Probability Puzzlers.
Princeton University Press, 2000, Hardcover, 256 p., Amazon $19.96
Paul J. Nahin
This book contains the statements and solutions of 21 probability problems. Many of these problems are not to be found in standard elementary probability texts. As is frequently the case in probability theory, some of the problems have surprising solutions. Such topics as Bernoulli trials, geometric probability, recurrence relations, and Markov chains are introduced in the solutions.
Solutions to all the problems are provided. MATLAB programs are also provided that show how simulations can be used to test and verify many of the results.
This book should certainly be considered as a supplement to a standard text in a first probability course.
As one example of a problem that is considered in this book, suppose that two players play an N-game match in chess, where the probabilities that the two players win a given game are p and r, respectively, and the probability of a tie is q=1-p-r. Estimate the probability that the match ends in a tie. <<<========<<
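For readers who want to experiment with the chess-match problem from Nahin's book, here is a minimal sketch of an exact calculation. It assumes that "the match ends in a tie" means the two players win the same number of the N games and that games are independent; the function name and the sample values of N, p and r are ours, not the book's.

```python
from math import comb

def match_tie_probability(N, p, r):
    """Exact probability that both players win the same number of games."""
    q = 1 - p - r                        # probability that a single game is tied
    total = 0.0
    for k in range(N // 2 + 1):          # k = number of games won by each player
        # choose the k games won by the first player, then k for the second
        ways = comb(N, k) * comb(N - k, k)
        total += ways * (p * r) ** k * q ** (N - 2 * k)
    return total

# Hypothetical illustration: a 24-game match with p = 0.4 and r = 0.3
print(match_tie_probability(24, 0.4, 0.3))
```

A simulation in the spirit of the book's MATLAB programs could be compared against this exact sum.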
Turning Numbers into Knowledge.
Analytics Press, 2001, Hardcover, 244 p., Amazon $34.95
Jonathan G. Koomey
While students have opportunities to learn statistical techniques for analyzing data, they seldom have the opportunity to learn the complete process by which numbers are transformed into knowledge. Koomey remarks in his preface that this is an art that is seldom taught. His book is aimed at teaching this art to policy makers or more generally to anyone who wishes to improve their quantitative literacy.
The book is the result of Koomey's own experiences and so it is useful to look at the kind of problems that he works on. Koomey is a staff scientist at the Lawrence Berkeley National Laboratory. He leads the End-Use Forecasting group, which analyzes products and technologies with respect to their energy efficiency and impact on the environment. The group develops recommendations for the U.S. Environmental Protection Agency and the U.S. Department of Energy.
Here is a problem from the current news that is typical of the problems that Koomey's group works on. The Washington Post, in a recent article ("Air conditioning standards chilled", 8 May, 2001, E01, Cindy Skrzycki) reports:
On April 13, the Department of Energy announced it
wanted manufacturers of air-conditioning units to increase
their efficiency by 20 percent -- instead of the 30 percent
the Clinton rule ordered.
Manufacturers argue that the 30 percent requirement would be counter-productive because it would drive small manufacturers out of business. On the other hand, an energy expert states that the increase from 20 to 30 percent would be equivalent to taking 2.5 million cars off the road. How should we decide which percentage would be more effective? Most of us would say 30% if we are Democrats and 20% if we are Republicans. Statisticians would say that we have to look at the data. Mathematicians would say that we have to develop computer models. But Koomey would say that much more is involved in going from numbers to knowledge.
Koomey's book takes us through this process of critical thinking in 31 short chapters with such inviting titles as: Don't be Intimidated, Explore your Ideology, Put Facts at your Fingertips, All Numbers are not Created Equal, How Guesses Become Facts, Be a Detective, Reuse Old Envelopes and many more.
In his book Koomey practices what he preaches: decide what is important, know your audience, show your stuff, and keep it simple. We thoroughly enjoyed reading this book, but the best way to decide if this book is for you is to read the sample chapters available at the book's web site. If you like these you'll like this book.
After you have read the book you might enjoy applying what you have learned to trying to figure out who is right in another policy decision in the news that Koomey's group has worked on. In the 21 Jan. 2001 Observer there is a discussion of the cause of the California electricity crisis. We read:
A recent study suggests the internet now accounts for
8 per cent of national electricity consumption - and
could come to consume between 30 and 50 per cent
of it in 20 years. This statistic has been seized on by
President George W. Bush as he seeks to justify an
energy policy that will allow oil drilling and coal mining
in the pristine wilderness of Alaska - but the figures are
spurious, say some analysts.
The estimate that internet usage was responsible for 8% of the national electrical consumption came from a study carried out by Mark Mills in 1999. This study was the subject of an influential article in Forbes Magazine (Dig more coal -- the PCs are coming, Peter W. Huber, 31 May 1999). Mills estimated that 8% of the national electricity consumption was due to the use of computers while connected to the internet and 13% due to all uses of computers. In 2000, Koomey's group made a similar study. They estimated that internet usage was responsible for 1% of the national electrical consumption and the use of computers more generally was responsible for 3% of the national electricity. The factor of 8 reduction in the internet contribution is quite striking. We would expect there to be small differences between assumptions in calculations of this sort, but not this much! Much of the difference between these estimates comes from the fact that Mills estimated the active power of a personal computer plus monitor to be 1000 watts, while Koomey's group estimated it to be about 150-200 watts.
The Forbes article had a significant impact on policy makers and led to Mills testifying before Congress. You can find Mills' testimony together with a detailed rebuttal by Koomey here. You will also find here links to the Forbes article and up-to-date information on estimating the contribution of computers to national electricity consumption.
(1) Do you really think that using your computer is like having ten 100-watt bulbs going in your office? How could you test this?
(2) At the end of Koomey's chapter "Use Forecasts with Care" we find the following exercise:
Find a forecast on a topic you care about, and learn
how it was created. Talk to the author and read any
supporting document. Do you still find it plausible?
Carry out this exercise for Mills' forecast that the electricity consumption due to the internet could come to consume between 30 and 50 percent in 20 years.
Pythagoras, Pi and Poker.
Los Angeles Times, 14 May 2001,
Mortensen claims poker championship.
Las Vegas Sun, 19 May 2001
The World Series of Poker was played this year from Monday, May 14 to Friday, May 18. There were a record 613 participants. For an account of how the tournament is run and a detailed description of the game "Texas Hold 'em" that is played, see Chance News 8.05.
Here is how the Sun describes the final day.
The championship game in the World Series is "Texas
Hold 'em," in which each player is dealt two cards face
down. Five cards are placed face up in the middle,
and players make their poker hands from these cards
and the cards in their hand.
The game is "no-limit," meaning that players can bet and
raise up to the total amount of chips they have. The buy-in
for the final tournament is $10,000.
The showdown between Mortensen and Tomko was
set after Tomko ousted Stan Schrier, an auto dealership
owner from Omaha, Neb., with a pair of kings. Tomko
had made his way up from sixth place through skillful
play, but faced a huge challenge in Mortensen, who
had amassed a mountain of chips totaling more than
$4 million, giving him a 2-to-1 chip advantage.
The two players jabbed back and forth for about
15 minutes, neither one willing to take the other on.
Usually an aggressive bet or raise was enough to
convince the other to fold, often before a single
community card had been dealt.
But 20 minutes into their heads-up match, the two
took each other on. After the three of clubs, 10
of clubs and jack of diamonds were dealt in the
center, Mortensen bet $100,000. Tomko raised
$400,000, and after some thought, Mortensen
went "all-in." Tomko called, setting up a showdown
for the title.
Tomko flipped up a pair of aces -- the strongest
starting hand in the game of hold 'em. But Mortensen
showed a powerful hand as well -- the queen and
king of clubs. So while Mortensen trailed Tomko,
he was just one card from making either a straight
or a club flush, and capturing the tournament.
The fourth card was a three of diamonds, giving
Tomko two pairs -- aces and threes. But the
final card was a nine of diamonds, which
completed Mortensen's straight and won
him the championship.
The Los Angeles Times article discusses the final game of the World Series of Poker last year between T. J. Cloutier and Chris Ferguson (see Chance News 10.01). The author remarks: Cloutier is an old-school Texas road gambler who learned his game in the days when guns were sometimes brought to the table to settle a game's outcome. Ferguson has a doctorate in computer science/artificial intelligence from UCLA and calculates all his poker moves mathematically.
Ferguson uses the game theory of von Neumann and Morgenstern to determine an optimal strategy for bluffing and uses Bayesian statistics to estimate what cards each player might have after each new card is dealt and after each round of betting. He also practices poker, computing the probabilities for card and betting combinations he might encounter.
According to this article, an increasing number of those who enter the World Series of Poker are young poker players who have learned to play by reading books written by the experts, practicing with computer poker and participating in the discussion of poker on the Internet newsgroup rec.gambling.poker. The article stresses that luck still plays a big role in the World Series of Poker. This is illustrated by the fact that neither Cloutier nor Ferguson made it past the first day of play in this year's World Series of Poker!
One report stated that only 2 of the players in the third day of play in the 2001 World Series of Poker were also in the third day of play in the 2000 World Series of Poker. The commentator suggested that this showed how much luck was involved. Do you agree with this comment? What additional information would you need to reach this conclusion? See if you can get this information to test this conclusion.
In Chance News 10.04 we discussed the "red hat problem" presented in the New York Times (Science Times D5, 10 April 2001). Recall that in this problem n people were randomly given a red or a blue hat. As a group they were to look at the other people's hats and simultaneously either guess the color of their own hat or pass. The object of the game is for the group to devise a strategy that gives the highest probability that at least one person guesses correctly and no one guesses incorrectly. The article stated that an optimal strategy was known for n of the form n = 2^k-1. We presented this strategy in Chance News 10.04.
The Times article commented that this solution was suggested by analogies with Hamming codes. We think we now understand this analogy and so we make some remarks on the relation of the hat problem to codes. We consider words as sequences of 0's and 1's. In coding theory one studies ways to send words across a channel where errors might occur, causing one or more digits to change from a 0 to a 1 or from a 1 to a 0. A code assigns a codeword to each word to be sent. This code will be known to the receiver. We want to assign codewords so that if at most one digit gets changed the receiver can identify the codeword that was sent.
Suppose we want to send a word consisting of a single digit 0 or 1. We choose as codewords 000 and 111 and code 0 as 000 and 1 as 111. Then if a single digit is changed we will be able to recognize that this is an error and we will know what codeword the sender intended to send. For example, if we receive 010 and assume there was only one error, we can conclude that the sender meant to send 000. If we receive 000 and assume that at most one error occurred in transmission, then we know that this was the codeword intended. Thus this code satisfies our requirements. This example corresponds to the hat problem with 3 players. Recall that the optimal strategy for this case was: If you see two hats of the same color guess that your hat is the other color, otherwise pass. The codewords 000 and 111 correspond to cases where the group does not win. In fact they all guess and are wrong.
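The correspondence for 3 players can be checked by listing all 8 hat assignments. In this sketch we code a blue hat as 0 and a red hat as 1; that coding and the function names are our own choices for illustration.

```python
from itertools import product

def guesses(hats):
    """Apply the 3-player strategy; hats is a tuple of 0 (blue) / 1 (red)."""
    out = []
    for i in range(3):
        others = [hats[j] for j in range(3) if j != i]
        if others[0] == others[1]:
            out.append(1 - others[0])   # sees two alike: guess the other color
        else:
            out.append(None)            # otherwise pass
    return out

def group_wins(hats):
    """The group wins if someone guesses and nobody guesses wrong."""
    made = [(g, h) for g, h in zip(guesses(hats), hats) if g is not None]
    return len(made) > 0 and all(g == h for g, h in made)

all_hats = list(product((0, 1), repeat=3))
wins = [h for h in all_hats if group_wins(h)]
losses = [h for h in all_hats if not group_wins(h)]
print(len(wins), losses)   # 6 [(0, 0, 0), (1, 1, 1)]
```

The two losing assignments are exactly the codewords 000 and 111, and each of the other six words differs from one of them in a single digit.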
Consider now n = 2^3-1 = 7. Let's recall how the optimal strategy was defined in this case.
The group assigns each player one of the 7 sequences 001,010,011,100,101,110,111 which, as in the hat problem, we call "nimbers". Then when called on to give their answers each player adds the nimbers for those that he sees with red hats. If the sum is 000 he guesses that he has a red hat. If the sum is his nimber he guesses that he has a blue hat. Otherwise he passes.
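The 7-player strategy can be tested the same way by running through all 128 hat assignments. This sketch assumes that "adding" nimbers means adding binary digits without carrying, which is the bitwise XOR; hats are coded 1 for red and 0 for blue, and the names are ours.

```python
from itertools import product

NIMBERS = [1, 2, 3, 4, 5, 6, 7]   # 001, 010, ..., 111, one per player

def group_wins(hats):
    """hats[i] is 1 for red, 0 for blue; True if the group wins."""
    total = 0
    for nim, hat in zip(NIMBERS, hats):
        if hat:
            total ^= nim                 # nim-sum of all red-hatted players
    any_guess, all_correct = False, True
    for nim, hat in zip(NIMBERS, hats):
        # nim-sum of the red hats this player can see (his own is excluded)
        seen = total ^ nim if hat else total
        if seen == 0:
            guess = 1                    # guesses red
        elif seen == nim:
            guess = 0                    # guesses blue
        else:
            continue                     # passes
        any_guess = True
        all_correct = all_correct and (guess == hat)
    return any_guess and all_correct

wins = sum(group_wins(h) for h in product((0, 1), repeat=7))
print(wins, 2 ** 7)   # 112 128
```

The group wins in 112 of the 128 cases, that is, with probability 7/8. The 16 losing cases are those in which the nimbers of the red hats add to 000.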
For the corresponding code we want to send messages that are words of length 4 and we do it by sending codewords of length 7. We determine the codewords as follows:
We assign the nimbers 001,010,011,100,101,110,111 to the seven possible positions for the digits of a 7-letter word. We then define a codeword to be a 7-letter word in which the sum of the nimbers for the digits that are 1 is 000. For example, for the word 1001100, the nimbers assigned to the positions with 1's are 001,100,101. Since 001+100+101 = 000, this word is a codeword. With this definition 1/8 of the 128 possible 7-letter words are codewords, giving us 16 codewords. (For a proof of this see Discussion question 1 in the red hat article in Chance News 10.04.) For our code, we assign one of the 16 possible codewords to each 4-digit word we might want to send. This method of encoding a message has two properties shared by Hamming codes: (1) for any 7-letter word received there is a codeword which differs from it by at most one digit (such a code is called a perfect code), (2) if only one digit is changed in transmission the receiver will be able to detect where the error occurs and change it to determine the true message that was sent.
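Both the count of 16 codewords and property (1) can be confirmed by brute force. The sketch below assumes the nimbers are assigned to positions 1 through 7 in order, as in the 1001100 example, and that nimbers are added without carrying (bitwise XOR); the function names are ours.

```python
from itertools import product

def syndrome(word):
    """Nim-sum (XOR) of the position numbers 1..7 that hold a 1."""
    s = 0
    for position, digit in enumerate(word, start=1):
        if digit:
            s ^= position
    return s

def distance(a, b):
    """Number of positions in which two words differ."""
    return sum(x != y for x, y in zip(a, b))

words = list(product((0, 1), repeat=7))
codewords = [w for w in words if syndrome(w) == 0]

# For every 7-letter word, count the codewords within one digit of it
nearby_counts = {sum(distance(w, c) <= 1 for c in codewords) for w in words}

print(len(codewords), (1, 0, 0, 1, 1, 0, 0) in codewords, nearby_counts)
# 16 True {1}
```

The set {1} says that every one of the 128 words is within one digit of exactly one codeword.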
Let's see why this is the case. Assume that we have a word that is not a codeword. Then the sum of the nimbers for the 1's will not add to 000. Suppose that it adds to y. Then some position was assigned the nimber y. If there is a 0 at this position and we change it to a one, the sum of all the nimbers of the 1's is y+y = 000 and we have a codeword. If this digit is a 1 then the sum of the nimbers of the other 1's must be 000 so if this digit is changed to a 0 we will again have a codeword. Thus any word that is not a codeword can be changed to a codeword by changing one digit. The position that we switch to get a codeword corresponds in the hat problem to the person who guesses and guesses correctly.
Finally, to show that the code works we need to show that there is no other digit we could change to get a codeword. Suppose we had tried to change a digit whose nimber x is not equal to the sum y of the nimbers for all 1's. If the digit is a 0 and we change it to a 1 the sum of the nimbers for all 1's is x + y. But this cannot be 000 unless x = y which we have assumed is not the case. If the digit is a 1 and we change it to a zero then the sum of the resulting sequence cannot be 000 since that would again mean that x = y. Thus if there was only one error in transmission the receiver will be able to detect that there was an error and tell exactly where it occurred.
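The argument in the last two paragraphs amounts to a simple correction rule: add the nimbers of the 1's, and if the sum y is not 000, flip the digit in the position whose nimber is y. Here is a minimal sketch of that rule, again taking nimber addition to be bitwise XOR and numbering the positions 1 to 7; the function names are ours.

```python
def syndrome(word):
    """Nim-sum (XOR) of the position numbers 1..7 that hold a 1."""
    s = 0
    for position, digit in enumerate(word, start=1):
        if digit:
            s ^= position
    return s

def correct(word):
    """Return the codeword obtained by changing at most one digit."""
    y = syndrome(word)
    fixed = list(word)
    if y != 0:
        fixed[y - 1] ^= 1     # flip the digit whose nimber equals the sum y
    return fixed

sent = [1, 0, 0, 1, 1, 0, 0]  # the codeword 1001100 used as an example earlier
received = sent.copy()
received[2] ^= 1              # a single transmission error in position 3

print(syndrome(received), correct(received) == sent)   # 3 True
```

The sum of the nimbers points directly at the position of the error, so the receiver needs no table of codewords to correct it.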
DISCUSSION QUESTION: Assume that our channel transmits each digit correctly with probability .9 and that a 4-letter word was generated by tossing a coin 4 times, making a digit 1 if heads turns up and a 0 if tails comes up. What is the probability that there will be at most one error?
Mosteller's solution to the Holiday Birthday problem.
Assume, more generally, that there are N days in a year and let n be the number of workers. A given day is a work day only if none of the n workers has a birthday on it, so the probability that any one day is a work day is (1-1/N)^n. Thus the expected number of work days is N(1-1/N)^n and the expected number of man-days worked is E(n) = nN(1-1/N)^n.
A value of n which maximizes E(n) will have E(n-1) <= E(n) and E(n+1) <= E(n). Writing this out and simplifying, we find that the first inequality will be satisfied if n <= N and the second if N <= n+1. These are satisfied only for n = N or n = N-1. For N = 365 this means that the factory should have either 364 or 365 workers. With 365 workers, E(n) is 365^2(1-1/365)^365 = 48,943.5. If it did not have the holidays it would have 365^2. Taking the ratio of these two numbers, we see that the proportion of work the plant gets with holidays to what it would get without holidays is (1-1/365)^365, which is approximately 1/e = .368.
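The solution can be checked numerically by evaluating E(n) = nN(1-1/N)^n over a range of n. This is a small sketch; the function name and the search range of 1 to 1000 workers are our own choices.

```python
def expected_workdays(n, N=365):
    """E(n) = n N (1 - 1/N)^n from the solution above."""
    return n * N * (1 - 1 / N) ** n

# Search a generous range of factory sizes for the maximum of E(n)
best = max(range(1, 1001), key=expected_workdays)

print(best)
print(round(expected_workdays(364), 1), round(expected_workdays(365), 1))
print(round(expected_workdays(365) / 365 ** 2, 4))
```

Since E(364) and E(365) are equal in exact arithmetic, the search may report either one, depending on rounding. The last line gives the ratio (1-1/365)^365, which can be compared with 1/e.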
Copyright (c) 2001 Laurie Snell This work is freely redistributable under the terms of the GNU General Public License published by the Free Software Foundation. This work comes with ABSOLUTELY NO WARRANTY.