!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!

CHANCE News 9.03

(February 4, 2000 to March 6, 2000)

!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!

Prepared by J. Laurie Snell, Bill Peterson and Charles Grinstead, with help from Fuxing Hou, and Joan Snell.

Please send comments and suggestions for articles to
jlsnell@dartmouth.edu.

Back issues of Chance News and other materials for teaching a Chance course are available from the Chance web site:

Chance News is distributed under the GNU General Public License (so-called 'copyleft'). See the end of the newsletter for details.

Our quote comes from the June 1998 issue of FORSOOTH!

===========================================================

Thou shalt not answer questionnaires
Or quizzes upon World-Affairs,
   Nor with compliance
Take any test. Thou shalt not sit
With statisticians nor commit
    A social science.
W.H. Auden (Under Which Lyre)

===========================================================

Contents of Chance News

<<<========<<




>>>>>==============>
Note: If you would like to have a CD-ROM of the Chance Lectures available on the Chance web site send a request to

jlsnell@dartmouth.edu

with the address where it should be sent. There is no charge. If you have requested this CD-ROM and it has not come, please write us again.
<<<========<<




>>>>>==============>
In the last Chance News we had the following Forsooth item from RSS News, V. 27, #5, Jan. 2000, page 12:

a p-value (the probability over repeated sampling
of observing data as extreme as, or more than,
what would be observed if an impossible hypothesis
were true)...
                       The Psychologist May 1999

Reader Sandy MacRae wrote us:

As a Fellow both of the British Psychological Society,
publishers of 'The Psychologist' and of the Royal
Statistical Society, whose newsletter contains the
sometimes hilarious 'Forsooth' column, I was
sufficiently stirred by Item 1 in Chance News 9.02
to look up the context of the quotation.

It occurs on page 256 in a book review by Dan Wright, a
sophisticated mathematical psychologist and author of an
excellent statistics textbook. The phrase is, in my view,
part of an accurate expression of what it aims to be: an
encapsulation of the arguments against null hypothesis
significance testing. The whole paragraph is:

The arguments against null hypothesis statistical
testing are that a p value (the probability over
repeated sampling of observing data as extreme as,
or more than, what would be observed if an impossible
hypothesis were true) contains so little information
that it is a hindrance to scientific progress rather
than an aid.
A point null hypothesis is 'impossible' in the sense
that it is an infinitesimal part of a continuous range
of possible states of nature.

The book being reviewed is "What if there were no significance tests?" L.L. Harlow, S.A. Mulaik, J.H. Steiger (Eds), Lawrence Erlbaum Assoc. This book provides a series of articles giving the arguments for and against the use of null hypothesis testing.

In his review Wright comments:

high up in psychological organizations
people are talking seriously about a ban on null
hypothesis statistical testing in journals.
The impact of a test ban would be immense on
both the teaching of psychology and research.

DISCUSSION QUESTION:

In his review Wright also writes:

The majority of the authors feel that null hypothesis
statistical testing has very limited use (and most think
that it should always be presented as a confidence
interval rather than a p value), but they consider that
the real culprit is the grotesque misuses of the approach
that is common.
Do you agree with the majority? What do you think "grotesque misuses" might be?
<<<========<<




>>>>>==============>
If you are looking for a web site with a summary of the polls relating to the 2000 campaign, go to the Polling Report. And don't forget that you can place your bets at the Iowa Electronic Markets. Today (March 5) you can buy a share of Bradley for about 2 cents and McCain for about 8 cents. You are paid a dollar per share if your candidate wins.

DISCUSSION QUESTION:

Articles about the Iowa Electronic Markets usually point out that the market price is a good predictor for the final result. For example, in a recent article in the Star Tribune we read:

In 1996, the going price for President Clinton was only
one percentage point away from the percentage of people
who actually voted for him, a prediction that many
pollsters would envy.
Peter Doyle comments that if the polls say that Gore has 55% of the vote with a margin of error of 3%, and we believe in statistics, then we should be willing to pay significantly more than 55 cents for a Gore share. The latest Gallup Poll (Feb. 25-27) has 65% for Gore and 28% for Bradley, and at the Iowa Market a Gore share costs 96 cents while you can buy a Bradley share for only 2 cents. So Peter's theory seems correct now. Why should it not also be correct near the election?
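To see how far from 55 cents the fair price is, here is a minimal Python sketch of Peter's point. It assumes the 3% margin of error is a 95% normal interval, that sampling error is the only uncertainty, and that nothing changes between the poll and the election -- all idealizations:

    # A sketch of Doyle's argument, under the assumptions stated above.
    from math import erf, sqrt

    def win_probability(poll_share, margin_of_error):
        """P(true share > 50%) for a candidate polling at poll_share."""
        sigma = margin_of_error / 1.96       # 95% margin -> standard error
        z = (poll_share - 0.50) / sigma
        return 0.5 * (1 + erf(z / sqrt(2)))  # standard normal CDF

    print(win_probability(0.55, 0.03))       # about 0.999

Under these assumptions a candidate polling at 55% is nearly certain to win, so the fair price of a winner-take-all share is close to a dollar, not 55 cents.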
<<<========<<




>>>>>==============>
Milton Eisner provided the following item together with his commentaries on the articles. Needless to say we very much appreciate such contributions and urge others to send us their suggestions with or without commentaries.

Study: Hormone therapy's cancer risk understated.
Washington Post, Jan. 26, 2000, A1
Susan Okie

Study backs hormone link to cancer for women.
New York Times, Jan. 27, 2000, A17
Denise Grady

Pros and cons: Hormone therapy can do wonders for women, but a new cancer study
underscores the risks.
Time Magazine, Feb. 7, 2000, 68-69
J. Madeleine Nash

Menopausal estrogen and estrogen-progestin replacement therapy and breast cancer risk.
Journal of the American Medical Association, Jan 26, 2000
Catherine Schairer et al.

I first saw this story in the Washington Post version. The study compared three populations: women taking no hormones, women taking estrogen alone, and women taking both estrogen and progestin ("the combination"). Reporter Susan Okie writes:
... for each year women take the combination, their risk
of breast cancer rose by 8 percent compared with that of
nonusers. After 10 years of treatment, the risk would
thus be 80 percent higher than that of nonusers, or
almost double.
Weighing in one day later, New York Times reporter Denise Grady writes:
The researchers ... found that women who took the
hormone combination for five years had a 40 percent
increase in the risk of breast cancer, compared with
those who did not take the treatment. Women who took
estrogen alone had a 20 percent increase. Dr. Schairer
said one way to illustrate the increased risk was to
consider a group of 100,000 normal-weight women, ages
60 to 64, none of whom take hormone replacement. During
a five-year period, 350 cases of breast cancer would be
expected. But if all the women took combined hormone
replacement for 5 years, about 560 cases would be
expected.
In Time, which featured the story on its cover, J. Madeleine Nash reports the findings as follows:
The combination ... may increase one's breast-cancer risk
by 8% a year, vs. 1% for women taking estrogen alone. ...
[T]he rise in risk only became striking after four or more
years of continuous hormone use.
I went to JAMA and found the original article. The researchers studied 46,355 women for a total of 473,687 woman-years.
Results: During follow-up, 2082 cases of breast cancer
were identified. Increases in risk with estrogen only
and estrogen-progestin only were restricted to use
within the previous 4 years (relative risk [RR],1.2
[95% confidence interval {CI}, 1.0-1.4] and 1.4
[95% CI, 1.1-1.8], respectively); the relative risk
increased by 0.01 (95% CI, 0.002-0.03) with each year
of estrogen-only use and by 0.08 (95% CI, 0.02-0.16)
with each year of estrogen-progestin-only use among
recent users, after adjustment for mammographic
screening, age at menopause, body mass index (BMI),
education, and age. The P value associated with the
test of homogeneity of these estimates was .02.
Among women with a BMI of 24.4 kg/m^2 or less, increases
in RR with each year of estrogen-only use and estrogen-
progestin-only use among recent users were 0.03 (95% CI,
0.01-0.06) and 0.12 (95% CI, 0.02-0.25), respectively.
These associations were evident for the majority of
invasive tumors with ductal histology and regardless
of extent of invasive disease. Risk in heavier women
did not increase with use of estrogen only or estrogen-
progestin only.

Reading the story, I first thought the Post had made a naive mistake in multiplying 8% by 10, but after several looks at the original, I now believe the Post's interpretation is correct. The story is worthwhile because it illustrates the difference between relative risk and absolute risk. The absolute risk of getting breast cancer in a year is quite small (the study found 2,082 breast cancers in 473,687 woman-years) so the relative risks cited correspond to small increases in absolute risk.
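As a quick illustration of that last point, here is a short Python computation using the study's own counts; the relative risks are the ones quoted above, and the rest is arithmetic:

    # Relative vs. absolute risk, from the numbers in the JAMA abstract.
    cases, woman_years = 2082, 473687
    baseline = cases / woman_years          # absolute risk per woman-year
    print(baseline)                         # about 0.0044, i.e. 0.44% a year

    for rr in (1.2, 1.4):                   # estrogen only; estrogen-progestin
        extra = (rr - 1) * baseline * 100000
        print(rr, extra)                    # roughly 88 and 176 extra cases
                                            # per 100,000 woman-years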

DISCUSSION QUESTIONS:

(1) Give your critique of each of the articles.

(2) Why might the science writers have trouble reading the summary of results in the original paper?

(3) If you were writing for your local newspaper, how would you have expressed the results of the study?
<<<========<<




>>>>>==============>
Census 2000 is underway, and we remind readers that we have a number of videos on the census in the Chance Videos section of the Chance web site. We also added to our audio programs a 1998 NPR program on the census. This is an hour-long program providing a wide-ranging discussion of the census, with emphasis on the undercount problem. Participants are Harvey Choldin, author of "Looking for the Last Percent: The Controversy Over Census Undercounts"; Stephen Fienberg, Carnegie Mellon, co-author with Margo Anderson of the recent book on the census, "Who Counts? The Politics of Census-Taking in Contemporary America"; and Steven Holmes, correspondent for the New York Times.
<<<========<<




>>>>>==============>
Predicting Future Duration from Present Age: A Critical Assessment.
To appear in Contemporary Physics
Available at abstract astro-ph/0001414
Carlton M. Caves

Point, Counterpoint and the Duration of Everything.
The New York Times, Feb 8, 2000, F5
James Glanz

The second article gives a brief description of a theory, due to J. Richard Gott (See Chance News 8.06), that claims to be useful in predicting the duration of an observed activity. The first of these two articles raises serious doubts about the correctness and usefulness of this theory. This reviewer must confess that he has always felt uneasy about whether Gott's theory is probabilistically sound, so it is very interesting to find someone who shares this unease and has tried to be mathematically precise in thinking about this theory.

Suppose that we observe an event that is in the process of occurring, and we know how long ago the event started. The question is whether we can come up with an interval [a, b] such that, with high probability, the event will end sometime after a years from now but before b years from now. Gott uses a version of the Copernican principle of ordinariness: when you observe an event, your observation does not occur at a special time. From this assumption, Gott claims that, at the 95% confidence level, your observation occurs somewhere between the 2 1/2% and 97 1/2% marks of the total duration of the event. Thus, if the event started s years ago, then, at the 95% confidence level, we can assert that the event will end between s/39 and 39s years from now.
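Gott's rule is easy to put into a few lines of Python. The sketch below assumes only that the observation falls at a uniformly random fraction of the event's total duration; it reproduces the intervals quoted in the examples that follow:

    def gott_interval(age, confidence=0.95):
        """Gott's interval for the remaining duration of an event of given age."""
        f_lo = (1 - confidence) / 2     # observation early in the event's life
        f_hi = (1 + confidence) / 2     # observation late in the event's life
        return age * (1 - f_hi) / f_hi, age * (1 - f_lo) / f_lo

    print(gott_interval(8, 0.50))       # Berlin Wall in 1969: (2.67, 24) years
    print(gott_interval(200000))        # homo sapiens: (5128, 7800000) years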

Caves goes through a detailed and rigorous analysis of this argument and claims to have uncovered two serious flaws, either of which is enough to do serious damage to the theory. After giving some examples of how the theory has been applied by Gott, we will attempt to explain the flaws found by Caves.

The first example is the occasion on which, by Gott's own account, the theory was conceived. He was standing at the Berlin Wall in 1969, 8 years after it was erected. He used a 50% confidence interval to assert that the Wall would last between 2 2/3 and 24 years into the future. As is well known, the Wall came down in 1989, 20 years after Gott's calculation.

The next example involves the longevity of the human species. Using this theory, together with the knowledge that homo sapiens has been around for about 200,000 years, we obtain the statement that, at the 95% confidence level, our species will last between 5100 years and 7.8 million years into the future. This statement is quite hard to refute, of course, but it is also hard to take very seriously.

Caves gives some examples that show that this theory is far from being universally applicable. His first example involves radioactive decay. Suppose that an atom is excited to a metastable energy level at some unknown time and decays to the ground state with a decay constant of (20 min.)^(-1). Now suppose that you observe the atom 15 minutes after it was excited. Gott's theory gives a 95% confidence interval for the time until decay of [23.1 seconds, 9.75 hours]. However, it also predicts that with probability 1/5, the decay time will be at least 60 minutes in the future.

Actually, the law of exponential decay gives a 95% confidence interval of [0, 60 minutes], which is quite a bit different from the one given by Gott's theory. Caves remarks that this numerical discrepancy is "only a symptom of the real problem: Gott's rule, by including present age in the prediction of future duration, is inconsistent with the very notion of an exponential decay."
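The numbers are easy to check. The sketch below assumes the usual reading of Gott's rule, under which the remaining time T for an event of age s satisfies P(T > t) = s/(s+t); this reproduces both the 1/5 figure and the exponential bound:

    from math import log

    mean = 20.0                          # mean decay time, minutes
    print(-mean * log(0.05))             # exponential 95% bound: 59.9 minutes

    age = 15.0                           # minutes since excitation
    print(age / 39 * 60)                 # Gott lower bound: 23.1 seconds
    print(age * 39 / 60)                 # Gott upper bound: 9.75 hours
    print(age / (age + 60))              # Gott's P(decay > 60 min away) = 0.2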

Another worrisome example is the following one. Suppose you walk by a house in which a party is in progress, and one of the guests on the porch informs you that the party is in honor of someone who is celebrating her 50th birthday. Gott's theory predicts that, with 95% certainty, the celebrant will live between 1.28 and 1950 additional years. More disturbing is the fact that this theory also predicts that the woman will, with probability 1/3, live beyond 150.

The main reason that Caves is suspicious of this theory is that it does not take into account any prior information about the event under observation. For example, if we are observing a human institution, such as a certain country, it must be relevant, when attempting to predict the future survival of that country, to use knowledge about the duration of other countries. If we are attempting to predict how long someone will live, it is certainly relevant to ask how old the person is, and of course Gott's theory does ask this question. What the theory does not take into account is the fact that no one lives beyond about 120 years (and in fact, most people die before the age of 90).

Caves shows by Bayesian analysis that the theory is wrong because it assumes that your observation does not occur at a special time, and yet it does occur at a special time, namely while the event is in progress. If your observation really were made at a non-special time, it would quite likely fall before the event had started or after it had ended. He then shows that Gott's rule holds only when the amount of time that has passed since the event started is unknown. This implies that the theory has no predictive power: it is valid precisely when the one piece of data it needs, the age of the event, is unavailable.

Near the end of his article, Caves cannot resist formulating a bet with Gott (which Gott has turned down). Caves compiled a list of 24 pet dogs owned by faculty, staff, and graduate students in his department. He then chose the 6 dogs from this list that were over 10 years old. Gott's theory predicts that, with probability 1/2, a given dog will live to more than twice its present age. Caves offers to give Gott 2 to 1 odds, on each dog, that the dog will not live to twice its age. Caves claims that if Gott's theory is correct, then Gott's expected winnings are $6000, and that the probability that Gott will end up a net loser is only about .109.
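The .109 is a binomial calculation. If Gott's rule is right, each dog independently has probability 1/2 of doubling its age, and with equal stakes at 2-to-1 odds (the stake sizes are our assumption) Gott comes out behind exactly when at most one of the 6 dogs lives that long:

    from math import comb

    # With w winning dogs Gott nets 2w - (6 - w) units, negative only for w <= 1.
    p_lose = sum(comb(6, w) for w in (0, 1)) / 2**6
    print(p_lose)                        # 7/64 = 0.109375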

Caves' article is extremely well-written and is quite technical. Nevertheless, it is certainly worth trying to wade through.

DISCUSSION QUESTIONS:

(1) When the radioactive example was pointed out to Gott he answered that, under his assumptions, one would not know the rate of decay. Would this get him off the hook?

(2) Gott might say there is something special about the 6 dogs from Caves's list over 10 years old and that his theory would not apply to them. Do you think Gott would say that his estimate would apply to a randomly chosen dog?

(3) How do you think Gott would explain the fact that his theory says a 50-year-old woman has a 1/3 chance of living beyond 150?
<<<========<<




>>>>>==============>
After we prepared the previous review we learned of Gott's extensive reply posted on PhysicsWeb.

Since we learned of this reply after finishing this Chance News we will leave its assessment as a discussion question for our readers. If you send us your comments on this controversy we will summarize these in the next Chance News.
<<<========<<




>>>>>==============>
As we have previously mentioned, John Paulos writes a monthly column "Who's Counting" for ABCNews.com.

In his current column, "Who wants to be a science-savvy president?" John proposes a "science literacy" quiz for the presidential candidates. His quiz has 15 questions including statistical questions such as:

Is a carefully conducted poll of 1,500 randomly
selected American adults sufficient to determine
the percentage (plus or minus 3 percentage points)
favoring a certain policy? Is such a poll more
or less accurate than one surveying only the residents
of a small town of 5,000 from which 100 people have
been randomly selected?
and
What is a double-blind study? A placebo? Would you
be interested in a photo opportunity with the latter
at the San Diego Zoo?

Paulos also includes questions involving understanding percentages and large numbers, as well as more general questions on the nature of science. John guesses that Gore and McCain might get 11 or 12 questions right, Bradley nine or 10, and Bush seven or eight.

You might also enjoy John's February 1 column "Of ants, butterflies and economic whimsy". This column is based on a recent book: Butterfly Economics by Paul Ormerod (Amazon.com $16.80). Ormerod uses the following simple example to illustrate his explanations for a wide range of human behavior.

Two identical piles of food are set up at equal distances from a large nest of ants. Each pile is automatically replenished, so the ants have no reason to prefer one pile to the other. While we might expect some kind of equilibrium behavior, instead the number of ants going to a particular pile fluctuates wildly and never settles down -- just like the current Dow Jones.

Ormerod shows how such examples help explain changes in family structure and fashion, as well as choices that lead to Academy Awards and technology choices such as VHS vs. Betamax.

DISCUSSION QUESTIONS:

(1) See how you would do on Paulos' quiz. Then say what you think about Paulos' estimates of the number of correct answers the presidential candidates would get.

(2) Could Paulos have given confidence intervals for his estimates of how the politicians would do on his questions? If so, how would he do this?

(3) Can you give other examples of human behavior that might be explained in part by the ant example?
<<<========<<




>>>>>==============>
A trump card for unusual risks.
The New York Times
February 13, 2000, Section 3, page 2
Jane Wolfe

There are a huge number of contests in the United States every year. Many of these contests offer large prizes, but in many cases the prizes are never paid out, because winning requires a feat involving a substantial amount of luck or skill. For example, contestants can win by making a half-court shot at a basketball game or a hole-in-one on a golf course. It is here that the idea of insurance enters the picture. The contest's sponsors would feel much more comfortable if the cost of the contest were a known quantity before the contest even begins, so it makes sense for them to pay someone to insure them against the large loss they would incur if someone were to win.

Bob Hamman (who happens to be the world's best bridge player at present) offers exactly this kind of insurance. Hamman runs SCA Promotions, which charges as its fee a percentage of the value of the prize. Of course, the percentage depends on the probability that the prize will be won, and Hamman is responsible for estimating these probabilities. For example, he puts the probability that a random fan will make a half-court shot at 1/25, and the probability of a hole-in-one on a 175-yard golf hole at 1/6500.
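How might such a fee be set? Here is a minimal sketch that charges the expected payout plus a profit margin. The probabilities are the ones Hamman quotes; the prize, number of attempts, and 30% load are invented for illustration:

    def premium(p_win, prize, attempts=1, load=0.30):
        """Expected payout plus a profit margin (the 'load')."""
        expected_payout = prize * (1 - (1 - p_win) ** attempts)
        return expected_payout * (1 + load)

    print(premium(1/25, 10000))          # one half-court shot at $10,000: $520
    print(premium(1/6500, 10000))        # one 175-yard hole-in-one try: about $2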

One of the best stories in this article involves Nolan Ryan, who holds many records in Major League Baseball, including most strikeouts and most no-hit games. In 1989, SCA underwrote a contest that would have paid Little League Baseball $250,000 if Ryan were to pitch a no-hitter that year. Ryan was then 42 years old, and no one at that age had ever pitched a no-hitter. Hamman recalls that several times that season Ryan reached the eighth or ninth inning with a no-hitter going, but he never finished one. It is interesting to note that he did pitch a no-hitter in each of the next two seasons, but no contests were riding on those.

This article reminds this reviewer of a story involving Paul Erdos, one of the most prolific mathematicians of the 20th century. He was in the habit of offering monetary rewards for solutions to many of his problems. Someone asked him once what he would do if all of his problems were to be solved; he couldn't pay all of the rewards in this case. Erdos countered by pointing out that, if all of the gamblers in Monte Carlo were to win their bets, the casinos would go broke. He then added that he thought that this eventuality was much more likely than the possibility that all of his problems would suddenly be solved.
<<<========<<




>>>>>==============>
Boston inspects but doesn't grade restaurants.
Boston Globe, 8 February 2000, E3
Bruce Mohl

As reported in Chance News 7.06, the article "Dining out in L.A. Comes to Crunching Numbers" discussed a numerical rating system being used by inspectors to give letter grades to Los Angeles restaurants (A for 90-100, B for 80-89, etc.). By law, the grades must be displayed in restaurant windows; they can also be viewed on a county web site. The present article reports that, when the system was adopted in 1998, only half the restaurants got A grades. Today, 75% make A's.

Boston's restaurant inspections also result in a point score, which could in principle be converted to a letter grade. However, patrons get only pass/fail information. A restaurant failing inspection is closed down until the problems are corrected, usually only a matter of days. A pass grade certifies that it is safe to eat there, and the mayor's office argues that this is all the public needs to know. But Dr. Jonathan Fielding, the former Massachusetts commissioner of public health who now holds that title in L.A. county, disagrees. He says that B or C grades can discourage business, which gives restaurants an incentive to improve to A's rather than being content merely to keep their doors open.

Steven Grover of the National Restaurant Association notes that grades are attractive as a "quick fix" to public health concerns but warns that many components of inspections are subjective. Furthermore, just reporting a final score does not tell consumers what problems actually exist. Orange County, California, has found a middle-ground position: restaurants display notices indicating that they passed their last inspection, and copies of the full report must be available to patrons on request.

DISCUSSION QUESTIONS:

(1) The Globe visited 20 well-known Boston restaurants and found that the average score from their most recent inspection was 79.7. According to the article "that would have been on the border between a B and C by Los Angeles standards." Are you convinced that this group is worse than most L.A. restaurants?

(2) When visiting a restaurant, do you think you would be inclined to request a copy of its inspection report? What kind of information would lead you to leave without eating?
<<<========<<




>>>>>==============>
In defense of the Harvard Nurses' Health Study.
Washington Post, 8 February 2000, Z4
Letter from Walter Willett, M.D., Harvard School of Public Health

Food surveys have wide margins of error; Researchers know that questionnaire results don't always reflect actual eating habits.
The Washington Post, 1 February 2000, Z9
Lawrence Lindner

The news media frequently feature stories on the latest link found between diet and disease. The controversy reported here concerns Harvard University's Nurses' Health Study. Dr. Willett was concerned by the tone of the Post article, which criticized the methodology of food surveys in general and the Harvard study in particular.

The Post cited a number of reasons to doubt the quality of data produced by food surveys. One was the complexity of some of the questions asked. The article cites the following example from the Harvard study:

How often, on average, did you eat a quarter of a cantaloupe during the past year? One to three times a month? Once a week? Two to four times a week? Once a day? Please try to average your seasonal use over the entire year. For example, if cantaloupe is eaten four times a week during the approximately three months it is in season, then the average use would be once a week.

Even when they understand the questions, people may not accurately report their eating habits. Writing in the American Journal of Clinical Nutrition, psychologist John Blundell cited a recent study finding that obese men under-reported their calorie intake by 36%. He expects that, if people are told fat intake is bad, they will similarly begin to under-report it.

For their part, the news media do not do a good job distinguishing between correlation and causation. For example, a recent Washington Post headline announced "Study Links Hot Dogs, Cancer: Ingestion by children Boosts Leukemia Risk, Report Says." The article began: "Children who eat more than 12 hot dogs per month have nine times the normal risk of developing childhood leukemia..." Furthermore, since data for the study came from parents' recollections of their children's eating habits, they are subject to the same concerns cited above.

Nevertheless, the article does not conclude that food surveys are worthless. According to nutrition researcher James Fleet of the University of North Carolina, such research "generates new hypotheses to test in more controlled settings."

In his letter to the editor, Dr. Willett noted that the cantaloupe question was presented out of context in the article. He points out that researchers designing questionnaires give careful consideration to the ordering and wording of questions.

Furthermore, he points out that researchers do not blindly accept survey data. Efforts are made to validate reported results with already established dietary risk factors. For example, the ratio of polyunsaturated to saturated fat intake is known to be related to cardiovascular disease. The fact that this relationship has been confirmed in the Nurses' Health Study gives additional confidence in the reporting. Finally, the fact that the data have been recorded over a 20-year period strengthens conclusions about long-term effects of diet.

DISCUSSION QUESTIONS:

(1) Do you understand how the average seasonal use of once a week was computed for the cantaloupe example? Estimate your own average consumption of cantaloupe.

(2) Do you agree that the hot dog headline is too sensationalistic? How would you rewrite it to be more accurate? What about the lead sentence?
<<<========<<




>>>>>==============>
Ask Marilyn.
Parade Magazine, 5 March, 2000, p.8
Marilyn vos Savant

A reader writes:
I believe every supporter of capital
punishment ought to answer this question:
"What error rate do you consider acceptable?"
If you were a supporter, how would you reply?
Hal O. Carroll
Pinckney, Mich.
Marilyn writes:
I would say that only a 0% error rate is acceptable.
However, that doesn't lend support to your argument
that the death penalty should be abolished, as you
imply. I'm sure you feel the same way about surgery
and airline travel, but I'm equally sure you wouldn't
abolish either of them.

Also, when a man has been found to be "wrongly convicted",
it doesn't mean he has been found to be innocent: Courts
don't make that judgment. (A verdict of "not guilty"
means that guilt has not been proved to the satisfaction
of the law.) Instead, the cause can be anything from the
discovery of a technicality to the finding that key
evidence was flawed. Moreover, I've read articles in
which the reporters wrote that "DNA testing proved the
man was innocent," when, in fact, the DNA test merely
failed to prove that the man was guilty. That's different.
Marilyn concludes her discussion with:
One last note: You imply that as long as a possibility
of error exists, you are against the death penalty. This
implies that, if the possibility of error does not exist,
you would approve of capital punishment.

DISCUSSION QUESTIONS:

(1) What do you think about Marilyn's answer?

(2) It is not uncommon to hear a DNA expert say: DNA cannot prove a suspect guilty of a crime but it can prove that the suspect is not guilty. Who do you think is right about this?

(3) Comment on Marilyn's last note.

(4) In the March 2 Republican debate, Jeff Greenfield, speaking about the death penalty, asked Governor Bush:

But just yesterday, a prisoner in Texas on death row,
a man named Calvin Burdine, was released from prison
after a federal judge found that his lawyer had slept
through much of the trial. Now, in light of this,
are you still confident that the 458 prisoners on
death row have had their legal rights protected
in these life and death cases?
Bush answered:
... the question isn't about the ones that are coming
up, the question is about the ones that have been put
to death. And I'm absolutely confident that everybody
that has been put to death is two things: One, they're
guilty of the crime charged and secondly, they had full
access to our courts, both state and federal.
Do you think that statistics would suggest that Bush cannot be "absolutely confident" of this?
<<<========<<




>>>>>==============>
Assessing time trends in sex differences in swimming & running.
Chance Magazine, Winter 2000, Vol.13, No.1, p. 10
Howard Wainer, Catherine Njue, Samuel Palmer

Comment: Studying trends in sport participation by modeling results of
elite-level athletic performance.
Chance Magazine, Winter 2000, Vol.13, No.1, p. 16
David E. Martin

Comment: Can we trust a method just because we like its prediction?
Chance Magazine, Winter 2000, Vol.13, No.1, p. 18
Phillip N. Price

Sex and sports: a rejoinder.
Chance Magazine, Winter 2000, Vol.13, No.1, p. 21
Wainer, Njue, Palmer

For any article with Howard Wainer as co-author, you know that you have to see the pictures to appreciate what is going on. However, we will try to say enough about these articles to make you run to your Chance Magazine or your library to read them.

Wainer and his colleagues want to study how the performances of men and women athletes have changed over time. They start with the Boston Marathon and ask the following three questions:

1. How far behind men's marathon performances are women?

2. How long before women catch up?

3. Why are women lagging?

They start by looking at trends and linear fits for the winning times, and their ratios, for men and women in Olympic running and swimming events. These graphs suggest that women are improving faster than men but that their performance still lags behind the men's. The graphs of improvement were similar except that the women's curves were offset horizontally by a varying number of years and vertically by a percentage that varied from 2% to 9%. The horizontal offset suggests a lag caused by smaller participation rates in earlier years, and the vertical offset physiological strength differences. When they made a best fit with these two parameters they found that the vertical offset did not vary too much: it was least, 2%, in the marathon and greatest, 9%, in the 100-m freestyle swimming. The lag time, however, was very different. It was about 75 years in the marathon, and similar values applied for Olympic track events. For swimming the vertical offset was larger, around 8%, but the lag time was significantly shorter, about a decade.

The comment papers make interesting observations about these results. Martin is a physiologist. He remarks that we should be careful in using terms such as "when women will catch up" and "why women are lagging behind," because they might be misinterpreted to mean that women should catch up. He then discusses the physiological differences that suggest that they should not. He comments that the percentage differences found are quite consistent with expert opinion on what the differences should be on physiological grounds. He ends his comments with "vive la difference."

Price is a physicist and discusses various ways in which the two parameter model proposed by Wainer and his colleagues could be tested further.

In their rejoinder the authors thank the commentators for their support and end with the remark:

We are not unaware of the plausible application
of this methodology to the study of sex differences
on performance in various academic skills.
a remark perhaps suggested by the famous sentence of Crick and Watson at the end of their one-page article in Nature describing the structure of DNA:
It has not escaped our notice that the specific pairing
we have postulated immediately suggests a possible copying
mechanism for the genetic material.

DISCUSSION QUESTIONS:

(1) What explanations can you give for the lag time between men and women's records?

(2) Why do you think the lag time is shorter for swimming than running?

(3) What questions about academic skills do you think the authors have in mind in their concluding remark?
<<<========<<




>>>>>==============>
Graduated driver's licenses could put the brakes on teen cruising.
The Seattle Times, 30 January 2000, L5
Shelby Gilje

Graduated licenses save teen lives.
Better Homes and Gardens, March 2000, 106

Washington may soon join more than 30 states that have instituted graduated driver's licenses. Such measures apply to 16- to 18-year-old drivers and place restrictions on the hours they may drive and whether they can carry other teenage passengers. The laws are motivated by higher accident rates among young drivers, especially when there are other teens in the car. A bill before the Washington legislature would require 50 hours of driving experience, including 10 nighttime hours, before beginning drivers could carry other teenage passengers. (According to the Better Homes article, 63% of teenage passenger deaths occur when another teen is driving.) Furthermore, in the first six months teens would be prohibited from driving between midnight and 5:00 a.m. unless accompanied by an older driver.

The neighboring state of Oregon has had graduated licensing since 1989. Since that time, the National Highway Traffic Safety Administration (NHTSA) reports, there have been 16% fewer crashes among male drivers aged 16-17. There has been no significant change among female drivers.

The article recommends the NHTSA website as a source of further information about graduated licensing.

A search on "graduated licence" leads to a discussion of summary statistics like those in the article. You can also find link to more detailed data on automobile accidents from the agency's National Center for Statistics and Analysis.

Available there in pdf format are extensive reports on safety data from 1998.

DISCUSSION QUESTIONS:

(1) What else do you need to know to assess the Oregon statistics?

(2) According to Better Homes and Gardens, police accident data show that 36% of 16-year-olds in fatal crashes were speeding. That figure drops to 21% among drivers aged 25-49. The article cites this as evidence that younger drivers need more experience. How does this follow?

(3) The Seattle Times cites the executive director of the Washington Insurance Council as saying the industry supports graduated licensing because it has been proven to reduce teen deaths. But insurers won't commit to reducing rates for teens until data is available for the new system. Does this seem contradictory?
<<<========<<




>>>>>==============>
Trillion dollar bet.
WGBH TV: Nova, 8 February 2000
Video available for $19.95 (1-800-948-8670)

Introduction to Mathematical Finance: Options and other topics.
Sheldon M. Ross
Cambridge University Press, 1999
Amazon.com $24.47

These are two resources for discussing the Black-Scholes formula for determining the price of a stock option. This formula is a beautiful example of mathematics that is really applied: it is used in the stock market thousands of times every trading day. Scholes and Merton received the 1997 Nobel Prize in economics for this work (Fischer Black died in 1995). For a discussion of this event see The 1997 Nobel Prize in Economics.

The NOVA video features a discussion of the development of this formula by Nobel prize winners Paul Samuelson, Merton Miller, Myron Scholes, Robert Merton, and others. The "trillion dollar bet" refers to the bet made by the Long-Term Capital Management fund, organized by John Meriwether to use mathematical analysis of stock behavior in investments involving billions of dollars. Merton and Scholes were members of this company. When the Asian financial crisis came, the math did not work, and the fund lost its bet -- and the company in the process.

Nova is produced by WGBH in Boston, and at the Trillion Dollar Bet web site you will find a discussion of the Black-Scholes formula and a chance to try your hand at buying options with a program that simulates the behavior of four typical stocks.

The book by Ross gives an elementary derivation of the Black-Scholes formula--one that could be given in an introductory probability course. We give an outline of the derivation.

The derivation of the Black-Scholes formula is based on an old gambling idea called a "Dutch Book". A Dutch book is a set of bets on the outcome of a chance event that guarantees a profit (or a loss) no matter what the outcome is. We could not find out why the Dutch get blamed. One suggestion is that, like "Dutch treat", it is based on the reputed frugality of the Dutch, who might bet only on a sure thing. In his book "The Emergence of Probability", Ian Hacking remarks that the first annuities in Holland were so badly designed that they caused Dutch towns to lose significant amounts of money, so perhaps this is the source of the phrase.

The possibility of Dutch Books has been used to argue that subjective probabilities should obey the properties of a probability measure. Let's see how this argument goes.

Bets are usually expressed in terms of odds, but for our purposes it is more convenient to describe a bet as a division of a "stake" s, determined by a number p between 0 and 1. Let E be an event related to a chance experiment. If you make a bet with stake s that E will occur, you win s(1-p) if it does and lose sp if it does not. The stake s can be positive or negative; a negative stake means you are betting against the occurrence of E. If you are indifferent between betting for or against E, then p is called your "degree of belief in E".

Ramsey and de Finetti showed that, when degrees of belief do not have the properties of a probability measure, it is possible to arrange a Dutch Book using these bets.

To illustrate their argument, consider an experiment with three outcomes a, b, c with degrees of belief p, q, r. Assume that the sum of these degrees of belief is less than 1 (a situation that those who bet on the horses dream of finding). Let u = 1 - p - q - r. Make a unit-stake bet on each of the three outcomes (bet on all the horses). Then if a occurs, our net winnings are (1-p) - q - r = u > 0. The same is true for any other outcome, so we have a Dutch Book. A similar argument shows that if the sum of the degrees of belief is greater than 1, a Dutch Book is possible by betting against each of the outcomes. Thus the degrees of belief for the possible outcomes of the experiment must add to 1.
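Here is that argument in a short Python sketch, with made-up degrees of belief:

    p, q, r = 0.3, 0.3, 0.2              # degrees of belief summing to 0.8 < 1

    for winner in range(3):              # whichever horse wins...
        payoffs = [(1 - b) if i == winner else -b
                   for i, b in enumerate((p, q, r))]
        print(winner, round(sum(payoffs), 10))   # ...we net u = 0.2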

To prove that degrees of belief must be a probability measure if there cannot be a Dutch Book, we need only show, in addition, that the degrees of belief have the additive property for disjoint events.

Let E and F be disjoint events with degrees of belief s and t. Make a unit-stake bet on each. Then, since E and F cannot both occur, we have the following three possibilities for the outcome of the experiment:

Outcome             You win

E and not F         1 - s - t
F and not E         1 - t - s
not E and not F      -s - t

Thus you either win 1 - (s+t) or lose (s+t).

Now let u be your degree of belief that E or F will occur, and assume that u > s + t. Make an additional unit bet against E or F occurring. If E or F occurs, your net winnings for the three bets are

1-(s+t) - (1-u) = u - (s+t) > 0.

But your net winnings are the same if neither E nor F occurs. Thus we can make a Dutch Book if u > s + t. A similar argument shows that we can do the same if u < s + t by reversing all three bets. Finally, we need to show that if no Dutch Book is possible, then conditional bets correspond to conditional probabilities in the obvious way. We leave that to the reader, or you can consult John G. Kemeny's paper "Fair Bets and Inductive Probabilities", Journal of Symbolic Logic, Vol. 20, No. 3 (Sep. 1955), pp. 263-273, available from JSTOR, where you will find a nice discussion of this whole topic.

Now for the stock problem we need an extension of the notion of a bet and a Dutch Book. Assume that we have a chance experiment with a finite set of possible outcomes W = {1, 2, 3, ..., n}. Then a "bet" is determined by a stake s and a function X defined on W called the "payoff function": the payoff for the bet X with stake s is sX(w) for outcome w. For example, let W be the 38 places where the ball can stop when playing roulette. If you bet on the corner common to the numbers 13, 14, 16, 17, then X is the function with value 8 on these four numbers and -1 otherwise (a corner bet pays 8 to 1). If you make such a bet with stake s you are paid sX(w).

If we have a probability measure defined on W, a function X defining a bet becomes a random variable, and we can speak of the expected value of the bet. A bet is "fair" if it has expected value 0.

When the degree of belief p in an event E is expressed as a bet in this way, the bet is fair precisely when p is the probability of E under some probability measure, since then the expected payoff is p(1-p) - (1-p)p = 0.

The key result used by Ross in computation of the price of a stock option is the "arbitrage theorem" which states:

For any set of bets relating to a chance experiment with outcomes W either (a) there is a probability measure on W that makes all the bets fair or (b) a Dutch Book is possible with these bets.

Applied to degrees of belief of events, the arbitrage theorem says that, given degrees of belief for any set of events relating to a chance experiment, either a Dutch Book is possible or there is a probability measure defined on the set of outcomes of the experiment which makes these degrees of belief probabilities.

Now on to options. Consider a stock with current price $100. Then a stock option with "strike price" $150 and "exercise time t" allows you to buy a share of this stock for $150 at time t.

If at time t the price of the stock is greater than 150, you would exercise your option, buying at 150 and selling at the market price for a profit. If the price is less than or equal to 150, you would not exercise your option. Since the option can only make you money, it is reasonable that you should have to pay for it. The Black-Scholes formula provides a rational price for such an option. Ross assumes an interest rate r and uses discounted values for holding a stock at a future time. This does not change the way in which the option price is determined, so we will simplify matters by assuming r = 0.

Black and Scholes developed their formula by assuming that the logarithm of the price P(t) of a stock can be modeled by Brownian motion. In particular this assumes that log(P(s+t)/P(s)) has a normal distribution. If you look at data for stocks over a period of time you will see that this is reasonable for most stocks.

Ross starts with a discrete time approximation to the continuous time model and then passes to the limit to obtain the Black-Scholes formula. This discrete model, a multiplicative random walk, assumes that in each unit time period the price of a stock goes up by a factor u with probability p or down by a factor of d with probability 1-p.

We consider first the case of a single time period. We assume that our stock has initial price 100 and then either goes up to 200 or down to 50; thus u = 2 and d = 1/2. Assume that the price of an option with strike price 150 is c. Consider the following two bets: buy a share of the stock, and buy an option. Then W = {0, 1}, with 0 indicating that the stock went down and 1 that it went up. The payoff functions for these bets are determined by the price of a share of stock and the amount made or lost by buying an option. When the stake is positive you are buying stocks or options; when it is negative you are selling stock short or selling options. "Selling short" means that at the beginning of the time interval you borrow a share of stock and sell it, and at the end of the interval you buy a share to pay back what you borrowed. Now we want the price c to be such that no Dutch Book is possible; otherwise it would be possible for traders to make unlimited amounts of money. By the arbitrage theorem there must be a probability p for the stock going up that makes both of these bets fair.

The bet on the stock will be fair if

100 = 200p + 50(1-p)

or p = 1/3

The bet on the option will then be fair if

0 = p(50-c) - (1-p)c

which gives c = 50/3. Thus if we make these bets fair it will not be possible to have a Dutch Book by buying or selling the stock or the option.

Another way to find this cost is to try to arrange a Dutch Book. If we buy options we win if the stock goes up; if we sell shares short we win if the stock goes down. Thus we can "hedge our bets" by doing both. Suppose we sell x shares short and buy y options. If the stock goes up to 200, we lose 100x on the stocks we sold short, since we have to buy them back at 200; on the other hand, we win (50-c)y on our options. If the stock goes down to 50, we win 50x on the stocks we sold short but lose cy on our options, since we do not exercise them. Thus if the stock goes up our net gain is 50y - cy - 100x, and if it goes down it is 50x - cy. We can make these equal by choosing y = 3x. If we do this, our net gain with either outcome is x(50 - 3c). Thus if c < 50/3 we can make a Dutch Book by taking x positive, and if c > 50/3 by taking x negative. To avoid this, the cost of an option must be the 50/3 that we obtained earlier by making the bets fair.
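Here is the one-period example in code: it computes the fair-bet probability and option price, and verifies that the hedge above breaks even at c = 50/3 (a minimal sketch of the argument, with r = 0):

    s0, up, down, strike = 100, 200, 50, 150

    p = (s0 - down) / (up - down)             # fair stock bet: p = 1/3
    c = p * max(up - strike, 0) + (1 - p) * max(down - strike, 0)
    print(p, c)                               # 0.333..., 16.67 = 50/3

    x, y = 1, 3                               # short 1 share, buy 3 options
    for s1 in (up, down):
        gain = x * (s0 - s1) + y * (max(s1 - strike, 0) - c)
        print(s1, round(gain, 10))            # 0 either way: no Dutch Book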

Note that this method shows that the probability that the stock goes up is not needed to determine the price of the option. Two investors with different values for this probability should accept the same price for the option. On the other hand, if they disagreed on how much the stock could go up or down, they would not. This suggests that the variability of the stock price is important, and we shall see that the final formula does involve the volatility of the stock. This solves the problem for a single time interval.

For the general case, let the outcomes W of our chance experiment be sequences of length n of 0's and 1's, where 1 means the stock went up and 0 that it went down. Let S(0) be the initial price of the stock, K the strike price, and c the cost of the option. Then the price S(k) of the stock on day k is S(0)(u^v)(d^(k-v)), where v is the number of days the stock went up in the first k days and k-v the number of days it went down.

For our bets, each day we buy a share of stock and sell it the next day. In addition, we buy an option at time 0 with exercise time n days. For a stock bet on day k to be fair we must have S(k) = puS(k) + (1-p)dS(k), where p is the probability the stock goes up on that day. This means that 1 = pu + (1-p)d. Solving for p gives p = (1-d)/(u-d).

The value of the option after n days is max(S(n) - K, 0), and so its expected value is E(max(S(0)(u^Y)(d^(n-Y)) - K, 0)), where Y has a binomial distribution with n trials and success probability p = (1-d)/(u-d). For the bet on the option to be fair, the price of the option must equal this expected value. This determines the price of the option.
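In code, keeping the assumption r = 0 made above, the n-period price is a binomial expectation (a sketch):

    from math import comb

    def option_price(s0, k, u, d, n):
        """Fair price: E[max(S(0) u^Y d^(n-Y) - K, 0)] with Y binomial(n, p)."""
        p = (1 - d) / (u - d)
        return sum(comb(n, v) * p**v * (1 - p)**(n - v)
                   * max(s0 * u**v * d**(n - v) - k, 0)
                   for v in range(n + 1))

    print(option_price(100, 150, 2, 0.5, 1))    # 16.67: the one-period answer
    print(option_price(100, 150, 2, 0.5, 10))   # the ten-period price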

Thus we have shown how to determine the price of an option for our discrete process with n time intervals. Then Ross obtains the Black-Scholes formula for the option price as a limit of the values obtained for the discrete process. The final formula is:

C = S(0)N(w) - Ke^(-rt)N(w - s*sqrt(t))

where

w = (rt + (s^2)t/2 - log(K/S(0))) / (s*sqrt(t)).

Here N is the standard normal distribution function, S(0) the price at time 0, r the interest rate, K the strike price, t the time at which the option can be exercised, and s the volatility of the stock, so that s^2 is the variance parameter of its log price.

As previously mentioned, the final value for the cost of an option depends on the volatility of the stock. This can be estimated from the stock's price history.
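For completeness, here is the formula as a Python function, based on the reconstruction above; the sample inputs are invented:

    from math import erf, exp, log, sqrt

    def N(x):
        """Standard normal distribution function."""
        return 0.5 * (1 + erf(x / sqrt(2)))

    def black_scholes(s0, k, r, s, t):
        """Black-Scholes price of a call option; s is the volatility."""
        w = (r * t + s * s * t / 2 - log(k / s0)) / (s * sqrt(t))
        return s0 * N(w) - k * exp(-r * t) * N(w - s * sqrt(t))

    print(black_scholes(100, 150, 0.05, 0.3, 1.0))   # about 2.06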
<<<========<<




>>>>>==============>
In the last Chance News we discussed a New York Times article on new applications of ratchets. The article mentioned that physicist Sergei Maslov had applied ratchets to the study of stocks. Dr. Maslov sent us an e-mail message and two papers describing his work.

Optimal investment strategy for risky assets.
International Journal of Theoretical and Applied Finance. Vol. 1, No. 3 (1998) 377-387
Sergei Maslov, Yi-Cheng Zhang

Dynamical optimization theory of a diversified portfolio.
Physica A 253 (1998) 403-418
Marsili, Maslov, Zhang

Here is Dr. Maslov's e-mail note:

Dear Prof. Snell, Thank you for your e-mail. Contrary to what can be inferred from the NY Times, I am not concerned with applying Dr. Parrondo's paradox to portfolio management. However, in my papers (see the attached PDF files) my collaborators and I describe yet another example of what may be generally and loosely referred to as a ratchet. We show how active portfolio management can sometimes turn two losing stocks into a winning portfolio. Also, the statement from the NYT that "so far it is too early to apply his model to the real stock market because of its complexity" is not based on my words and is far from reality. Our model uses a simplified view of stock price behavior as a multiplicative random walk. As always, simplified models usually clarify the mechanisms behind the phenomenon in question but are not immediately suited for practical applications.

Probably the clearest example of how a ratchet-like investment strategy can sometimes achieve positive capital growth by investing in a losing stock is as follows: imagine a (hypothetical) asset whose price after some time period (say a month) is either multiplied by 2 (which happens with probability 1/2) or divided by 3 (also with p=1/2). If you simply "buy and hold" such a stock, then in the long run your capital almost certainly goes down together with the stock (indeed, the logarithm of the price follows a random walk with a negative drift). However, in the short run this stock has a positive return of (1/2)$2 + (1/2)($1/3) = $7/6 for each dollar invested in it. That means that after T months your AVERAGE capital would grow (7/6)^T times! The problem is that this amazing growth rate comes from exponentially unlikely events in which you gain a lot. Any sensible investor should not rely on it.

It turns out that the optimal strategy, which makes your TYPICAL capital (i.e., the median of the probability distribution) grow, is to always keep 1/4 of your capital in this stock and 3/4 in cash (or the bank). This ratio has to be actively maintained, i.e., some stock has to be sold immediately after its price goes up and vice versa. Following this strategy for T time steps, an investor should typically expect his/her capital to be multiplied by 1 << sqrt{25/24}^T << (7/6)^T. Hope this example turns out to be useful in your newsletter. I enjoyed reading its last volume!

Sincerely,

-- Dr. Sergei Maslov,

Dept. of Physics, Brookhaven National Laboratory, Upton, NY, 11973

Editors' note: This example is based on the Kelly gambling system (see Chance News 7.09, "Our answer to Peter's question"). For more details about this and related examples see Maslov's two papers mentioned above.
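Readers can check Maslov's numbers with a few lines of Python. This sketch assumes the rebalancing rule he describes: keep a fraction f of your capital in the stock, so each step your wealth is multiplied by 1+f (price doubles) or 1-2f/3 (price is divided by 3), each with probability 1/2:

    from math import exp, log

    def growth(f):
        """Median log-growth per step with fraction f kept in the stock."""
        return 0.5 * log(1 + f) + 0.5 * log(1 - 2 * f / 3)

    best = max((f / 1000 for f in range(1000)), key=growth)
    print(best)                          # 0.25: keep 1/4 in the stock
    print(exp(growth(0.25)))             # sqrt(25/24), about 1.0206 per step
    print(exp(growth(1.0)))              # buy and hold: sqrt(2/3) < 1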
<<<========<<




>>>>>==============>
Chance News
Copyright © 2000 Laurie Snell

This work is freely redistributable under the terms of the GNU General Public License as published by the Free Software Foundation. This work comes with ABSOLUTELY NO WARRANTY.

!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!

CHANCE News 9.03

(February 4, 2000 to March 6, 2000)

!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!