!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
CHANCE News 4.16
(17 November to 10 December 1995)
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
Prepared by J. Laurie Snell, with help from William Peterson,
Fuxing Hou, Ma.Katrina Munoz Dy and Joan Snell, as part of the CHANCE Course Project
supported by the National Science Foundation.
Please send comments and suggestions for articles
to jlsnell@dartmouth.edu.
Back issues of Chance News and other materials for teaching a
CHANCE course are available from the Chance Web Data Base.
http://www.geom.umn.edu/locate/chance
=====================================================
In Lake Wobegon, all the women are strong, the
men are good looking, and all the children are
above average.
Garrison Keillor
=====================================================
<<<========<<
>>>>>==============>
Note: The current Chance Magazine reports that John Rolph is finishing his term as
editor. This issue is a good example of the high standards John has brought to Chance.
We wish his successor George Styan good luck during his term. You can get more
information about Chance Magazine including how to subscribe.
We will start with three of our favorites from the current issue of Chance Magazine.
<<<========<<
>>>>>==============>
Bordeaux wine vintage quality and the weather.
Chance Magazine, Fall 1995, pp.7-14
Orley Ashenfelter, David Ashmore, and Robert Lalonde.
If you want to drink good wine you can buy new wine and let it mature in your cellar
or you can buy older wine that has matured in some dealer's cellar. To decide which
is the better strategy it is helpful to know the answer to questions like: does the
price of mature wine reflect the quality of the wine? Presumably the answer to this
is yes since by this time the quality of the wine is known and the prices reflect
this knowledge. If so, the natural next questions might be: is the price of new
wine a good predictor of the price after it has matured? If not, what is a good predictor?
This article tries to answer such questions.
The authors begin by providing the 1990-1991 London auction prices of red wines from
7 of the best-known chateaux (vineyards) that were produced in the years from 1960
to 1969. These years were picked because by 1990 they should be fully mature and
their quality known. For a given chateau, there is wide variation in these prices through
the years and, for a given year, there is wide variation in the prices between Chateaux.
Using regression techniques, the authors show that the prices of the wines when
they were new are not good predictors of their prices when they matured. On the
other hand, weather conditions are very good predictors. Great vintages for Bordeaux
wines correspond to years in which August and September are dry, the growing season
is warm, and the previous winter has been wet. Ashenfelter uses this fact to estimate
the value of new wines and provides these estimates in a newsletter he distributes
called "Liquid Assets: The International Guide to Fine Wines".
Professor Ashenfelter is a Princeton economist who is widely quoted in the newspapers
on weightier matters, but his newsletter also makes the news occasionally. This
article provides some of the more humorous remarks made by well-known wine critics
on the use of statistics to assist in judging wines.
DISCUSSION QUESTIONS:
(1) Wine critics are offended by the idea of using statistics to help judge wines. Why
do you think this is?
(2) The authors state that their results show that the market for new wines is not "efficient"
but the market for mature wines is. What do you think they mean by this?
<<<========<<
>>>>>==============>
Picturing an L.A. Bus Schedule.
Chance Magazine, Winter 1995, p. 44
Howard Wainer
Howard Wainer edits a column in Chance Magazine called "Visual Revelations". His
columns provide wonderful examples for classroom discussions of the use of graphics.
This month he considers a question from the first National Adult Literacy Survey
conducted in 1992. In this question you are given the appropriate L.A. bus schedule and
asked how long you would have to wait for the next bus on a Saturday afternoon if
you miss the 2:35 bus leaving Hancock and Buena Ventura going to Flintridge and Academy.
The schedule is typical of those we have all struggled with: columns of outbound times
and inbound times, remarks about buses that run Monday through Friday only, etc.
Wainer suggests that we should think about how to make a general purpose plot of the
bus data and then see how it serves to answer a variety of questions, including the
one on the quiz. His choice is to plot the time of day on the horizontal axis and
the various bus stops on the vertical axis. A change of scale suggests itself, and after
this change is made we have a plot that makes it easy to see regularities in the
way the buses run. The cyclic nature of the graph suggests that there is a single
bus going back and forth on the route considered, making a round trip in just under two
hours. The graph provides easy answers to a variety of reasonable questions including
the survey question.
<<<========<<
>>>>>==============>
Fuzzy logic: great hope or grating hype.
Chance Magazine, Winter 1995, pp. 15-19
Michael Laviolette
The author of this article feels that many problems currently being solved by fuzzy
set theory could be equally well solved using probability theory. To illustrate this
he considers the following simple application of fuzzy set theory to control theory.
You want to design an air conditioner-controller to make a motor go at speed y when
the temperature is x. You only have a vague feeling about when the room is cool,
just right, or warm. Fuzzy logic suggests associating these labels with appropriate
intervals of temperatures. These sets are called "fuzzy sets" and are allowed to overlap.
For example, suppose you assign the interval from 50 to 70 as the "cool" set, 60
to 80 as "just right", and 70 to 90 as "warm." Then the temperature 65 is in both
the "cool" set and the "just right" set. For each fuzzy set you define a membership function
that assigns a value between 0 and 1 to each member of the set. For example, for
the cool interval from 50 to 70, you might make the membership function increase
linearly from 0 to 1 as the temperature goes from 50 to 60 and decrease linearly from 1 to 0
as the temperature goes from 60 to 70. Then 60 is a really cool temperature, but
65 is only .5 cool.
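The membership function just described can be sketched in a few lines of Python. This is only an illustration, not code from Laviolette's article: the function name `triangular` is made up here, and placing the peak of the "just right" set at the midpoint of its interval is an assumption (the text gives the peak only for the "cool" set).

```python
def triangular(x, low, peak, high):
    """Membership rising linearly from 0 at `low` to 1 at `peak`,
    then falling linearly back to 0 at `high`; 0 outside the interval."""
    if x <= low or x >= high:
        return 0.0
    if x <= peak:
        return (x - low) / (peak - low)
    return (high - x) / (high - peak)


def cool(temp):
    # "Cool" set from the text: interval 50 to 70 with its peak at 60.
    return triangular(temp, 50, 60, 70)


def just_right(temp):
    # Assumption: the "just right" set (60 to 80) peaks at its midpoint, 70.
    return triangular(temp, 60, 70, 80)


if __name__ == "__main__":
    # Print the degree of membership in each set for a few temperatures.
    for t in (55, 60, 65, 70):
        print(t, cool(t), just_right(t))
```

Under these assumptions a temperature of 65 belongs to both sets, each with membership .5, which is the overlap the example in the text is pointing at.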
Similarly, you can determine fuzzy sets and membership functions corresponding to
intervals of speed that you consider slow, medium, and fast.
We associate "cool" with the motor being on "slow", "just right" with it being on
"medium," and "warm" with it being on "fast." This restricts how we make the correspondence
between temperatures and speeds, but at the same time it creates some conflicts.
For example, the temperature 65 is in both the "cool" and the "just right" temperature
sets, so it should correspond to a point in either the "slow" or "medium" set or
possibly both. The temperature and speed membership functions determine, by fuzzy
logic, a new speed fuzzy set and membership function for the temperature 65. When the temperature
is 65, the controller sets the speed equal to the average speed calculated using
this membership function.
For a detailed description of how this is done, consult the author's longer article
(Technometrics (1995), 37(3), 249-261). The explanation in the Chance article is
rather brief, and a key figure (the last part of Figure 2) is incorrect.
In the probabilistic approach to the problem, the membership functions are replaced
by conditional probabilities. We determine subjectively or otherwise probabilities
of the form "the probability that the room is perceived as cool given that the
temperature is 65" and probabilities of the form "the probability that the machine is running
at speed x given that it is running at medium speed." These probabilities, combined
with the rules associating temperature sets with speed sets, allow you to compute
the expected speed given a specific temperature, say 65. Then, for a given temperature x,
the controller sets the speed equal to y, where y is the expected speed with respect
to these conditional probabilities.
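The expected-speed calculation in the probabilistic approach can be sketched as follows. All the numbers are hypothetical, chosen only to make the arithmetic visible; neither the probabilities nor the mean speeds come from Laviolette's article, and the function name `expected_speed` is invented for this sketch. Only the rules pairing temperature sets with speed sets are taken from the text.

```python
def expected_speed(label_probs, mean_speed_by_label, rules):
    """Expected speed given a temperature:
    sum over temperature labels of
    P(label | temperature) * E[speed | the speed set paired with that label]."""
    total = 0.0
    for temp_label, prob in label_probs.items():
        speed_label = rules[temp_label]
        total += prob * mean_speed_by_label[speed_label]
    return total


# Rules from the text: cool -> slow, just right -> medium, warm -> fast.
RULES = {"cool": "slow", "just right": "medium", "warm": "fast"}

# Hypothetical: how a room at 65 degrees is perceived.
P_LABEL_GIVEN_65 = {"cool": 0.5, "just right": 0.5, "warm": 0.0}

# Hypothetical: mean of the speed distribution within each speed set.
MEAN_SPEED = {"slow": 20.0, "medium": 50.0, "fast": 80.0}

if __name__ == "__main__":
    print(expected_speed(P_LABEL_GIVEN_65, MEAN_SPEED, RULES))
```

With these made-up inputs the controller would set the speed halfway between the "slow" and "medium" means, since the room at 65 is equally likely to be perceived as cool or just right.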
Laviolette's article in Technometrics includes long discussions by workers in fuzzy
set theory describing their feelings about the relationship between probability and
fuzzy sets.
DISCUSSION QUESTIONS
(1) How could you estimate the probability that a room that is perceived to be warm
is at a specific temperature x?
(2) How could you estimate the probability that a motor perceived to be running at
a moderate speed is actually running at a specific speed x?
(3) Do you think that "fuzzy logic" is a good name for this subject?
<<<========<<
>>>>>==============>
Mortality associated with moderate intakes of wine, beer, or spirits.
British Medical Journal, 310, pp. 1164-1169
Morten Gronbaek et al.
A number of studies have shown a U-shaped curve for the relative risk of mortality in relation
to alcohol intake, for both men and women. This article reports the results
of a large study carried out in Denmark to assess the effects of different types
of alcoholic drinks on the risk of death from all causes and from heart attacks, taking
into account sex, age, socioeconomic conditions, smoking habits, and body mass index.
The study followed 13,285 subjects (6051 men, 7234 women) aged 30-79 from 1976 to
1988. The authors found that beer intake had little effect on the relative risk of
mortality. Spirits intake also had little effect below 3-5 drinks daily,
but at that level it was associated with a significant increase in the relative risk of mortality. On the other hand
the relative risk as a function of wine intake dropped continuously, having its lowest
value for 3-5 drinks daily. Even drinking wine only occasionally seemed to help.
This article was the basis of a segment of 60 Minutes on November 5 on the benefit
of wine in the prevention of heart disease. This is the second such discussion 60
Minutes has had. The first, four years ago, called the "French Paradox", is generally
credited in the wine business with the subsequent upsurge in red wine sales, which
continues today.
DISCUSSION QUESTIONS:
(1) What is meant by a "U-shaped" relation?
(2) The researchers worried a lot about possible confounding factors in this study.
What factors do you think they worried about?
(3) Why was the study so large? Why didn't they just study 1000 people?
(4) Will this research change your life, and why?
<<<========<<
>>>>>==============>
Many find 'remote viewing' a far-fetched science.
Washington Post, 2 December 1995, A3
Curt Suplee
Federal intelligence agencies spent $20 million in the last two decades studying
and trying to exploit a psychic ability called "remote viewing". These initiatives
began in the early 1970's, because of a concern that the Soviet Union was making
substantial progress in understanding extrasensory perception. People spoke of the
"ESP" gap.
The CIA inherited this program and asked statistician Jessica Utts and psychologist
Raymond Hyman to evaluate the results of the program. Professor Utts has been looking
at results in parapsychology for the past ten years as part of her research program.
She is known to support the possibility of extra-sensory perception. Hyman is a
psychologist and well-known skeptic who has consulted many times for government panels
and others regarding the possibility of extra-sensory perception.
The program, called Stargate, had three parts: (1) keeping tabs on what other countries
were doing, (2) actually using psychics to see if they could add anything to ordinary
intelligence, and (3) carrying out research to see if there were psychics who were
effective at remote viewing.
Most of the research was carried out by two California contractors, SRI International
and Science Applications International Corp. They carried out thousands of tests
with hundreds of subjects, resulting in the identification of a couple of dozen psychics.
In a typical experiment an assistant in another room would hold up a randomly selected
photo from a set of five different pictures, each in a sealed envelope. The subject
would be asked to describe what was in the picture.
The subject would produce verbal impressions, hand drawings or both. A judge would
rank each photo from 1 (best) to 5 (worst) according to how closely each corresponded
to the psychic's description. Utts examined years of data and
found that a substantial number of the tests turned up average ranks around 2.3, about
14 percent better than chance. She concluded that the results of these tests could
not be accounted for by chance and that remote viewing was a real phenomenon. Hyman
remains skeptical, being concerned about methodological problems that need further investigation
and about whether the results of these experiments can be replicated.
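To see what "chance" means for this ranking scheme, here is a minimal Python sketch of the null model: if the judge's ranking is unrelated to the target, the target photo's rank is equally likely to be 1, 2, 3, 4, or 5. The function names are invented for this sketch, and the number of trials used in the example is hypothetical; the article does not say how many trials went into an average of 2.3.

```python
import math
import random

RANKS = [1, 2, 3, 4, 5]


def chance_mean_and_variance():
    """Mean and variance of the target's rank when all five ranks
    are equally likely (pure guessing)."""
    mean = sum(RANKS) / len(RANKS)
    variance = sum((r - mean) ** 2 for r in RANKS) / len(RANKS)
    return mean, variance


def z_score(observed_mean, n_trials):
    """Standardized distance of an observed average rank from the chance
    mean, assuming independent trials (normal approximation)."""
    mean, variance = chance_mean_and_variance()
    return (observed_mean - mean) / math.sqrt(variance / n_trials)


def simulate_average_rank(n_trials, rng):
    """Average rank over n_trials of pure guessing."""
    return sum(rng.choice(RANKS) for _ in range(n_trials)) / n_trials


if __name__ == "__main__":
    print(chance_mean_and_variance())
    # Hypothetical: 50 trials with an average rank of 2.3.
    print(z_score(2.3, 50))
    print(simulate_average_rank(50, random.Random()))
```

A negative z-score means the average rank was better (lower) than guessing would give; how far below zero it must be to count as evidence depends on the number of trials, which is one reason replication matters here.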
The CIA has decided to drop the program in favor of watchful waiting. Utts and Hyman
wrote individual reports, and you can find them both on Professor Utts's homepage.
They are both very well written and describe many of the basic issues involved in
using statistics to establish a scientific claim.
DISCUSSION QUESTIONS:
(1) How do you think Professor Utts got the 14% chance for getting an average ranking
of 2.3 or lower?
(2) Hyman claims that establishing statistically significant results related to ESP
is a far cry from showing that there is such a thing. What does he mean by this?
(3) What do you think are some of the methodological concerns one might have about
experiments of the kind described above?
<<<========<<
>>>>>==============>
New sign of the times--speed limit fast.
The Christian Science Monitor, 7 December 1995, U.S. p1
Laurel Shaper Waters
The national speed limit has been repealed, effective December 8. Most states will
not increase limits beyond 55 or 65 mph. On the other hand, nine Western states
will move to 70 mph, while Nevada, Wyoming, and Kansas will increase to 75 mph, and
Montana will actually eliminate daytime speed limits.
The Transportation Department estimates that the federal repeal will lead to 6400
more highway deaths a year. Chuck Hurley of the Insurance Institute for Highway
Safety says he expects more honest debate at the state level, since states will have
to bear the health-care costs of auto injuries. Tom Magliozzi, co-host of National Public
Radio's "Car Talk" says that "what we have done in this country is that we have decided
that getting someplace 15 or 20 minutes earlier is more important than life itself."
DISCUSSION QUESTIONS:
(1) Is it reasonable for the Transportation Department to estimate a death toll without
waiting to see how the states respond to the repeal? How do you think the 6400 figure
was calculated?
(2) Some lobbyists who pushed for the repeal argued that the hazards of higher speeds
are offset by the many safety improvements in automobile manufacture. Does this
say in effect that the current death rate is acceptable? Should we pass legislation
tying speed limits to some index of auto safety, so that limits will automatically go
up with improvements in design?
(3) Never mind quibbling about whether 65 mph is safer than 75 mph for wide-open
western roads. We can certainly all agree that a 15 mph speed limit would save lives.
Why is this not being proposed by the safety advocates?
<<<========<<
>>>>>==============>
Conception found to be likely just 6 days a month.
The Boston Globe, 7 December 1995, p 11.
Alison Bass
A study reported in today's New England Journal of Medicine found a 6-day "window
of fertility", a shorter period than what is commonly believed. The study tracked
the experience of 221 healthy women who planned to become pregnant between 1982 and
1985. Of these, 192 actually conceived. Their dates of ovulation were estimated by measuring
hormones in urine samples. In a departure from findings of earlier studies, none
of these women became pregnant from intercourse more than five days before or a day
or two after ovulation. The probability of conception was found to be highest on the
day of ovulation itself and dropped to 10% with intercourse five days before ovulation.
Some specialists have also maintained that the timing of intercourse can influence
the sex of the baby, with times closer to ovulation favoring boys and earlier times
favoring girls. The current study found no evidence to support such claims.
DISCUSSION QUESTION:
What problems can you anticipate with the design of this experiment? What else would
you like to know?
<<<========<<
>>>>>==============>
Homing in on poll target.
The Boston Globe, 8 December 1995, p 3.
David M. Schribman
Both Democratic and Republican strategists are finding homemakers to be a key demographic
group for the coming presidential race. Polling numbers for white men and women
with jobs outside the home have been relatively stable in recent months, whereas
homemakers seem more apt to shift. Up to 1992, homemakers under 45 had favored Republicans,
perhaps responding to family values messages. But increased worries about their
financial vulnerability may be moving them towards the Democrats. Private Democratic
polls suggest that these women feel the Republican budget proposals cut too deep.
DISCUSSION QUESTION:
If these polls are correct, how might candidates take advantage of this knowledge?
<<<========<<
>>>>>==============>
Mean statistics: when is the average best?
Washington Post, 6 Dec. 1995, p. H7
John Schwartz
Schwartz remarks that politicians and others often choose a definition of average
that best suits their needs.
He tells his readers what mean, median, and mode mean and gives examples of their
use and misuse. He starts with the example of John Cannell, who noticed that his
state's school system claimed high scores on nationally standardized tests and requested
test scores from all 50 states. Cannell found that every one claimed to be "above the
national average" or the statistical "norm". He called this the "Wobegon effect".
A more detailed discussion of this example can be found in the article
Taking the tests.
Dallas Morning News, 4 Oct. 1994
Karel Holloway.
As another example, Schwartz remarks that if Bill Gates were to move to a town with
10,000 penniless people the average (mean) income would be more than a million and
might suggest that the town is full of millionaires.
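The Bill Gates example is easy to check with Python's standard `statistics` module. The income figure below is a hypothetical placeholder, not a number from Schwartz's article; any figure above roughly 10 billion gives a town mean over a million.

```python
from statistics import mean, median

# Hypothetical income for the newcomer; not a figure from the article.
GATES_INCOME = 15_000_000_000

# 10,000 penniless residents plus one very rich one.
incomes = [0] * 10_000 + [GATES_INCOME]

town_mean = mean(incomes)      # pulled far upward by the single large value
town_median = median(incomes)  # the middle resident still has nothing

if __name__ == "__main__":
    print("mean:", town_mean)
    print("median:", town_median)
```

The mean suggests a town of millionaires while the median stays at zero, which is the contrast between the two definitions of "average" that the article is describing.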
DISCUSSION QUESTIONS:
(1) How could the answers Cannell received be correct?
(2) Someone once claimed that if any one person moved from state X to state Y the
average intelligence in both states would be increased. How could this be? Can
you think of an X and a Y that might make this statement true?
<<<========<<
>>>>>==============>
Is your baby shy? feisty? expert says it may be lasting.
Sacramento Bee, 24 Nov. 1995
Deborah Blum
Jerome Kagan, a psychology professor at Harvard, has been pursuing for several decades
the idea that a significant portion of personality is in the genes. It is not that
environment does not affect personality but rather that each child begins in a different
place, predisposed to certain behaviors.
He has followed a group of 700 children comparing them from infancy. He sees which
babies are frightened when, for example, suddenly seeing a bright, moving mobile or
hearing a stranger's voice on tape. Fearful babies often grow into reserved toddlers
and the easygoing into laid-back ones.
Other researchers have made the same point about cultures. Kagan went to San Francisco
and compared 24 Chinese American and 34 European American newborns, born to middle-class
parents of comparable age. Caucasian babies were much more volatile, while the Chinese
American infants scored on the calmer and steadier side.
Following these children as they grew up, he found that the Caucasian children were
much more likely to flare up in toy disputes than the Asian toddlers. In an earlier
book he observed that these same differences were reflected in the artistic works
and tastes of the two cultures.
DISCUSSION QUESTION:
Other recent research has found that patients with certain kinds of brain damage have
lost certain aspects of personality. This suggests that, to some extent, personality
is hard-wired into our brains. What relation does this research have to that discussed
in the article? In particular, does this new research support the claim that personality
is in our genes?
<<<========<<
>>>>>==============>
France finds a reading test incomprehensible.
The New York Times, 12 Dec. 1995, A3
Marlise Simons
The Paris-based Organization for Economic Cooperation and Development carried out
a survey in 8 industrial countries to test how literacy skills relate to job success
and economic performance. The survey judged reading comprehension, using documents
such as tables, nutritional charts, and train schedules, as well as day-to-day mathematical skills.
The results varied significantly between countries. Looking at the percentage of
adults that scored in the lowest of five proficiency levels, for the portion of the
study examining the ability to understand and use written information, you find
Sweden 7.5%
Netherlands 10.5%
Germany 14.4%
Switzerland 15.3%
Canada 16.6%
United States 20.7%
France 40.1%
Poland 42.6%
When the French officials studied the preliminary results two months ago, they insisted
that all references to France be excised from the report, "Literacy, Society and
Economy" published last week.
The director of the French Education Ministry rejected the findings because he said
the methodology was flawed. He is quoted as saying: "I was all the more convinced
of the flaws when I saw the results, particularly since each country defined its
own control conditions."
He said: "Different societies and cultures emphasize different things. For example,
one exercise with a recipe asked how many eggs were needed to bake a cake for four
people. It then asked how many eggs were needed to bake a cake for six. This is
not an exercise in French schools. Anyway, if you make a mistake of one egg, your cake
will not be spoiled."
DISCUSSION QUESTION:
What might the director have meant by the comment "each country defined its own control
conditions"?
<<<========<<
>>>>>==============>
Book reviews.
Reason, Dec. 1995, 27(7) p. 55
Brian Doherty
This article provides reviews and comparisons for the following four books that deal
with interpretation of numbers we read about in the media.
The Tyranny of Numbers: Mismeasurement and Misrule.
Nicholas Eberstadt
AEI Press. 1995
Labyrinth of Prosperity.
Reuven Brenner
U. of Michigan Press 1994
Tainted Truth.
Cynthia Crossen
S&S Press 1994
A Mathematician Reads the Newspaper.
John A. Paulos
Basic 1995
We have already reviewed the books by Crossen and Paulos. Doherty admired Crossen's
book but felt she could lighten up once in a while. He obviously both admired and
enjoyed the Paulos book "by far more of a pleasure to read than the others under
discussion. It is simply nifty, larded with clever and informative tidbits as he strolls his
broad, discursive way through typical newspaper reporting".
Eberstadt, in his book, gives examples of the dangers of taking data at face value
when making policies. For example he claims we are not as ill-fed, ill-housed and
ill-nourished as we have been led to believe. Doherty was particularly impressed by
Eberstadt's discussion of the policy consequences of the CIA taking, at face value, Soviet
economic data even though these exaggerated reports were not consistent with the impressions
of poverty and squalor reported by visitors to the Soviet Union. Doherty asks why he
should take on faith data that Eberstadt presents as gospel. "Am I to take it
on faith that the U.S. Census Bureau can calculate with trustworthy accuracy the
life expectancy of the Chinese people from the 1950's to the present?"
In his book, Brenner claims that aggregate statistics, even when computed accurately,
are pretty useless in making policy. For example, Brenner points out that there
is a tremendous difference between a deficit that efficiently finances the construction
of schools and roads and one that simply wastes money on bureaucrats. There are
similar problems with averages like the consumer price index. He seems to regard
aggregation of economic data as a rather hopeless and useless activity.
I have read the books by Paulos and Crossen and agree with Doherty that they are both
outstanding books. I have not read the other two. My impression from this review
is that they have some good examples along with a fair amount of nonsense.
DISCUSSION QUESTION:
Brenner argues that aggregate data doesn't really say much about individuals and therefore
should not be used for making policy. Do you agree?
<<<========<<
>>>>>==============>
Silicone in the System.
Discover Magazine, Dec. 1995
Gary Taubes
There was a discussion on Edstat-l of the very interesting article on breast implants
in the current Discover magazine. You can read this article, "Silicone in the System",
on the Web.
Paul Bernhardt wrote:
In the current issue of Discover magazine (December), there is an interesting article
on the breast implants controversy. I recommend it for use in the stats classroom
because it describes the experimental and statistical procedure misused by a leading
expert in this area. It is clearly indicated as the flaw in the research. Quoting:
"The problem was simple. Kossovsky had reported that the ELISA scores of 9 of his
249 women with implants were significantly higher than the mean score of the 47 healthy
women or of the 39 women with autoimmune disorders. But those 9 women represented less
than 4 percent of all the women with implants he tested. What if in reality his ELISA
test was meaningless? Then he might expect 4 percent of all women to score equally
high...."
The article is an excellent discussion of drawing unsubstantiated conclusions based
on little to no data. Kossovsky's theory, connecting silicone to autoimmune disease,
is elegant and plausible but has not yet received any empirical support. Students
can clearly see the importance of data in the scientific enterprise.
This drew the following response from Patrick Fleury (pfleury@popmail.mcs.com):
Kossovsky has some other problems with statistics as well. He has been involved in
several papers where the conclusions reached based on the numbers he uses are certainly
open to question. For example, here is a set of numbers which appears in one of
the papers of which he is a co-author.
TNFALPHA      IL6
      78   308287
      65    33291
     149   124550
     451    17075
      64    22955
      79    95102
     115     5649
     618   840585
      69    58924
(There is one other case with missing data which I left out.)
Anyway, the authors point out that the correlation between the above two sets of data
is .77 and has a p-value of <.01. They don't really do much more than quote the
numbers, however, so I don't know what conclusions they draw except that these seem
to be high correlations. These also aren't the numbers I get. (I get .694 as the correlation
and .038 as the p-value but never mind that.)
If you plot the above, however, you immediately see that it's case number 8 which
is pulling the correlation up. It's an incredible outlier. If you delete it, then
the high positive correlation goes away and you get a correlation of -.249. (This
was pointed out by a friend of mine, Mark Coward.) The moral is "Plot your data" and it can
be understood by any undergraduate.
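Fleury's figures can be checked directly from the nine cases listed above. The sketch below computes the Pearson correlation by hand using only the standard library; the function name `pearson_r` is invented here, and no p-value is computed. It should give values close to the .694 and -.249 that Fleury reports, with and without case 8.

```python
import math

# The nine complete cases from the table above.
TNF_ALPHA = [78, 65, 149, 451, 64, 79, 115, 618, 69]
IL6 = [308287, 33291, 124550, 17075, 22955, 95102, 5649, 840585, 58924]


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Correlation using all nine cases.
r_all = pearson_r(TNF_ALPHA, IL6)

# Drop case number 8 (list index 7), the outlier Fleury points to.
OUTLIER_INDEX = 7
x_trimmed = TNF_ALPHA[:OUTLIER_INDEX] + TNF_ALPHA[OUTLIER_INDEX + 1:]
y_trimmed = IL6[:OUTLIER_INDEX] + IL6[OUTLIER_INDEX + 1:]
r_without_case_8 = pearson_r(x_trimmed, y_trimmed)

if __name__ == "__main__":
    print("all nine cases:", round(r_all, 3))
    print("without case 8:", round(r_without_case_8, 3))
```

A single point changes the sign of the correlation, which is the substance of the "Plot your data" moral.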
By the way, the citation for the above paper is Mena, Kossovsky, Chu, Hu, "Inflammatory
Intermediates Produced by Tissues Encasing Silicone Breast Prostheses", Journal of
Investigative Surgery, Volume 5, 1993. (The volume number might be 8, actually,
I can't make it out clearly from the Xerox.)
The paper that the Discover article refers to is:
Kossovsky, et al., Surface Dependent Antigens Identified by High Binding Avidity
of Serum Antibodies in a Subpopulation of Patients with Breast Prostheses, Journal
of Applied Biomaterials, Vol. 4, pp. 281-288, (1993).