
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!

CHANCE News 4.16
(17 November to 10 December 1995)

!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!

Prepared by J. Laurie Snell, with help from William Peterson,
Fuxing Hou, Ma. Katrina Munoz Dy and Joan Snell, as part of the
CHANCE Course Project supported by the National Science Foundation.

Please send comments and suggestions for articles
to jlsnell@dartmouth.edu.

Back issues of Chance News and other materials for teaching a
CHANCE course are available from the Chance Web Data Base.

http://www.geom.umn.edu/locate/chance

=====================================================

In Lake Wobegon, all the women are strong, the men are good looking, and all the children are above average.

Garrison Keillor

=====================================================

Contents

1. Bordeaux wine vintage quality and the weather.
2. Picturing an L.A. bus schedule.
3. Fuzzy logic: great hope or grating hype.
4. Mortality associated with alcohol consumption.
5. Evaluating 'remote viewing'.
6. New sign of the times--speed limit fast.
7. Conception likely just 6 days a month.
8. Homing in on poll target.
9. Mean statistics: when is the average best?
10. Is your baby shy? feisty? it may be lasting.
11. France finds a reading test incomprehensible.
12. Four books dealing with interpreting numbers.
13. Silicone in the system.

<<<========<<

>>>>>==============>
Note: The current Chance Magazine reports that John Rolph is finishing his term as editor. This issue is a good example of the high standards John has brought to Chance. We wish his successor George Styan good luck during his term. You can get more information about Chance Magazine including how to subscribe. We will start with three of our favorites from the current issue of Chance Magazine.
<<<========<<

>>>>>==============>
Bordeaux wine vintage quality and the weather.
Chance Magazine, Fall 1995, pp.7-14
Orley Ashenfelter, David Ashmore, and Robert Lalonde.

If you want to drink good wine you can buy new wine and let it mature in your cellar or you can buy older wine that has matured in some dealer's cellar. To decide which is the better strategy it is helpful to know the answer to questions like: does the price of mature wine reflect the quality of the wine? Presumably the answer to this is yes since by this time the quality of the wine is known and the prices reflect this knowledge. If so, the natural next questions might be: is the price of new wine a good predictor of the price after it has matured? If not, what is a good predictor? This article tries to answer such questions.

The authors begin by providing the 1990-1991 London auction prices of red wines from 7 of the best known Chateaux (vineyards) that were produced in the years from 1960 to 1969. These years were picked because by 1990 the wines should be fully mature and their quality known. For a given Chateau, there is wide variation in these prices through the years and, for a given year, there is wide variation in the prices between Chateaux.

Using regression techniques, the authors show that the prices of the wines when they were new are not good predictors of their prices when they matured. On the other hand, weather conditions are very good predictors. Great vintages for Bordeaux wines correspond to years in which August and September are dry, the growing season is warm, and the previous winter has been wet. Ashenfelter uses this fact to estimate the value of new wines and provides these estimates in a newsletter he distributes called "Liquid Assets: The International Guide to Fine Wines".

Professor Ashenfelter is a Princeton Economist who is widely quoted in the newspapers on weightier matters, but his newsletter also makes the news occasionally. This article provides some of the more humorous remarks made by well known wine critics on the use of statistics to assist in judging wines.

DISCUSSION QUESTIONS:

(1) Wine critics are offended by the idea of using statistics to help judge wines. Why do you think this is?

(2) The authors state that their results show that the market for new wines is not "efficient" but the market for mature wines is. What do you think they mean by this?
<<<========<<

>>>>>==============>
Picturing an L.A. Bus Schedule.
Chance Magazine, Winter 1995, p. 44
Howard Wainer

Howard Wainer edits a column in Chance Magazine called "Visual Revelations". His columns provide wonderful examples for classroom discussions of the use of graphics. This month he considers a question from the first National Adult Literacy Survey conducted in 1992. In this question you are given the appropriate L.A. bus schedule and asked how long you would have to wait for the next bus on a Saturday afternoon if you miss the 2:35 bus leaving Hancock and Buena Ventura going to Flintridge and Academy. The schedule is typical of those we have all struggled with: columns of outbound times and inbound times, remarks about buses that run only Monday through Friday, etc.

Wainer suggests that we should think about how to make a general purpose plot of the bus data and then see how it serves to answer a variety of questions, including the one on the quiz. His choice is to plot the time of day on the horizontal axis and the various bus stops on the vertical axis. A change of scale suggests itself, and after this change is made we have a plot that makes it easy to see regularities in the way the buses run. The cyclic nature of the graph suggests that there is a single bus going back and forth on the route considered, making a round trip in just under two hours. The graph provides easy answers to a variety of reasonable questions including the survey question.
<<<========<<

>>>>>==============>
Fuzzy logic: great hope or grating hype.
Chance Magazine, Winter 1995, pp. 15-19
Michael Laviolette

The author of this article feels that many problems currently being solved by fuzzy set theory could be equally well solved using probability theory. To illustrate this he considers the following simple application of fuzzy set theory to control theory.

You want to design an air conditioner-controller to make a motor go at speed y when the temperature is x. You only have a vague feeling about when the room is cool, just right, or warm. Fuzzy logic suggests associating these labels with appropriate intervals of temperatures. These sets are called "fuzzy sets" and are allowed to overlap. For example, suppose you assign the interval from 50 to 70 as the "cool" set, 60 to 80 as "just right", and 70 to 90 as "warm." Then the temperature 65 is in both the "cool" set and the "just right" set. For each fuzzy set you define a membership function that assigns a value between 0 and 1 to each member of the set. For example, for the cool interval from 50 to 70, you might make the membership function increase linearly from 0 to 1 as the temperature goes from 50 to 60 and decrease linearly from 1 to 0 as the temperature goes from 60 to 70. Then 60 is a really cool temperature, but 65 is only .5 cool.
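The membership function described above can be sketched in a few lines of Python. This is only an illustration: the function names are ours, and the shape of the "just right" function (peaking at 70, the midpoint of its interval) is an assumption, since the article specifies only the "cool" one.

```python
def triangular_membership(x, left, peak, right):
    """Degree of membership (0 to 1) of x in a triangular fuzzy set."""
    if x <= left or x >= right:
        return 0.0
    if x <= peak:
        # rising edge: 0 at `left`, 1 at `peak`
        return (x - left) / (peak - left)
    # falling edge: 1 at `peak`, 0 at `right`
    return (right - x) / (right - peak)


def cool(temp):
    # "cool" set from the text: 50 to 70, peaking at 60
    return triangular_membership(temp, 50, 60, 70)


def just_right(temp):
    # ASSUMPTION: same triangular shape, peaking at the midpoint 70
    return triangular_membership(temp, 60, 70, 80)


print(cool(60))        # 1.0
print(cool(65))        # 0.5
print(just_right(65))  # 0.5
```

The temperature 65 has positive membership in both sets, which is the overlap that the controller has to resolve.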

Similarly, you can determine fuzzy sets and membership functions corresponding to intervals of speed that you consider slow, medium, and fast.

We associate "cool" with the motor being on "slow", "just right" with it being on "medium," and "warm" with it being on "fast." This restricts how we make the correspondence between temperatures and speeds, but at the same time it creates some conflicts. For example, the temperature 65 is in both the "cool" and the "just right" temperature sets, so it should correspond to a point in either the "slow" or "medium" set or possibly both. The temperature and speed membership functions determine, by fuzzy logic, a new speed fuzzy set and membership function for the temperature 65. When the temperature is 65, the controller sets the speed equal to the average speed calculated using this membership function.

For a detailed description of how this is done, consult the author's longer article (Technometrics (1995), 37(3), 249-261). The explanation in the Chance article is rather brief, and a key figure (the last part of Figure 2) is incorrect.

In the probabilistic approach to the problem, the membership functions are replaced by conditional probabilities. We determine subjectively or otherwise probabilities of the form "the probability that the room is perceived as cool given that the temperature is 65" and probabilities of the form "the probability that the machine is running at speed x given that it is running at medium speed." These probabilities, combined with the rules associating temperature sets with speed sets, allow you to compute the expected speed given a specific temperature, say 65. Then, for a given temperature x, the controller sets the speed equal to y, where y is the expected speed with respect to these conditional probabilities.
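A minimal sketch of this expected-speed calculation follows. Every number in it is hypothetical, chosen only to make the arithmetic easy to follow; the article does not supply these probabilities, and the variable names are ours.

```python
# HYPOTHETICAL: probability the room is perceived as each label at 65 degrees
p_label_given_temp_65 = {"cool": 0.5, "just right": 0.5, "warm": 0.0}

# Rules from the text associating temperature sets with speed sets
rule = {"cool": "slow", "just right": "medium", "warm": "fast"}

# HYPOTHETICAL: probability of each actual speed given the speed label
p_speed_given_set = {
    "slow": {20: 0.25, 30: 0.5, 40: 0.25},
    "medium": {40: 0.25, 50: 0.5, 60: 0.25},
    "fast": {60: 0.25, 70: 0.5, 80: 0.25},
}


def expected_speed(p_label_given_temp):
    """Expected speed: sum over labels of P(label | temp) * E[speed | label]."""
    total = 0.0
    for label, p_label in p_label_given_temp.items():
        speed_dist = p_speed_given_set[rule[label]]
        mean_speed = sum(speed * p for speed, p in speed_dist.items())
        total += p_label * mean_speed
    return total


print(expected_speed(p_label_given_temp_65))  # 40.0
```

With these made-up numbers, the controller would set the speed to 40 at a temperature of 65, halfway between the mean "slow" speed (30) and the mean "medium" speed (50).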

Laviolette's article in Technometrics includes long discussions by workers in fuzzy set theory describing their feelings about the relationship between probability and fuzzy sets.

DISCUSSION QUESTIONS

(1) How could you estimate the probability that a room that is perceived to be warm is at a specific temperature x?

(2) How could you estimate the probability that a motor perceived to be running at a moderate speed is actually running at a specific speed x?

(3) Do you think that "fuzzy logic" is a good name for this subject?
<<<========<<

>>>>>==============>
Mortality associated with moderate intakes of wine, beer, or spirits.
British Medical Journal, 310, pp. 1164-1169
Morten Gronbaek et al.

A number of studies have shown a U-shaped curve for the relative risk of mortality in relation to alcohol intake, for both men and women. This article reports the results of a large study carried out in Denmark to assess the effects of different types of alcoholic drinks on the risk of death from all causes and from heart attacks, taking into account sex, age, socioeconomic conditions, smoking habits, and body mass index.

The study followed 13,285 subjects (6051 men, 7234 women) aged 30-79 from 1976 to 1988. The authors found that beer intake had little effect on the relative risk of mortality. Spirits intake also had little effect up to 3-5 drinks daily, at which point there was a significant increase in the relative risk of mortality. On the other hand, the relative risk as a function of wine intake dropped continuously, having its lowest value for 3-5 drinks daily. Even drinking wine only occasionally seemed to help.

This article was the basis of a segment of 60 Minutes on November 5 on the benefit of wine in the prevention of heart disease. This is the second such discussion 60 Minutes has had. Their first, four years ago, called the "French Paradox", is generally credited in the wine business with the subsequent upsurge in red wine sales, which continues today.

DISCUSSION QUESTIONS:

(1) What is meant by a "U-shaped" relation?

(2) The researchers worried a lot about possible confounding factors in this study. What factors do you think they worried about?

(3) Why was the study so large? Why didn't they just study 1000 people?

(4) Will this research change your life, and why?
<<<========<<

>>>>>==============>
Many find 'remote viewing' a far-fetched science.
Washington Post, 2 December 1995, A3
Curt Suplee

Federal intelligence agencies spent $20 million in the last two decades studying and trying to exploit a psychic ability called "remote viewing". These initiatives began in the early 1970's, because of a concern that the Soviet Union was making substantial progress in understanding extrasensory perception. People spoke of the "ESP gap".

The CIA inherited this program and asked statistician Jessica Utts and psychologist Raymond Hyman to evaluate its results. Professor Utts has been looking at results in parapsychology for the past ten years as part of her research program and is known to support the possibility of extrasensory perception. Hyman is a well-known skeptic who has consulted many times for government panels and others regarding the possibility of extrasensory perception.

The program, called Stargate, had three parts: (1) keeping tabs on what other countries were doing, (2) actually using psychics to see if they could add anything to ordinary intelligence, and (3) carrying out research to see if there were psychics who were effective at remote viewing.

Most of the research was carried out by two California contractors, SRI International and Science Applications International Corp. They carried out thousands of tests with hundreds of subjects, resulting in the identification of a couple of dozen psychics. In a typical experiment an assistant in another room would hold up a randomly selected photo from a set of five different pictures, each in a sealed envelope. The subject would be asked to describe what was in the picture.

The subject would produce verbal impressions, hand drawings or both. A judge would rank each photo from 1 (best) to 5 (worst) according to how closely each corresponded to the psychic's description. Statistician Jessica Utts examined years of data and found that a substantial number of the tests turned up average ranks around 2.3, about 14 percent better than chance. She concluded that the results of these tests could not be accounted for by chance and that remote viewing was a real phenomenon. Hyman remains skeptical, being concerned about methodological problems that need further investigation and about whether the results of these experiments can be replicated.
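The article does not say how the 14 percent figure was derived. As a starting point for discussion, the sketch below computes the average rank expected under pure guessing and two candidate ways of expressing the improvement; which of these (if either) matches the reported figure is our guess, not something the article confirms.

```python
ranks = [1, 2, 3, 4, 5]

# Under pure chance the target photo is equally likely to get any rank,
# so the expected rank is the plain average of 1..5.
chance_mean = sum(ranks) / len(ranks)

observed_mean = 2.3  # average rank reported in the article
gap = chance_mean - observed_mean

print(chance_mean)                   # 3.0
print(round(gap / len(ranks), 2))    # 0.14, gap as a fraction of the 5-point scale
print(round(gap / chance_mean, 2))   # 0.23, gap as a fraction of the chance mean
```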

The CIA has decided to drop the program in favor of watchful waiting. Utts and Hyman wrote individual reports, and you can find them both on Professor Utts's homepage. They are both very well written and describe many of the basic issues involved in using statistics to establish a scientific claim.

DISCUSSION QUESTIONS:

(1) How do you think the figure of "14 percent better than chance" was obtained from an average ranking of 2.3?

(2) Hyman claims that establishing statistically significant results related to ESP is a far cry from showing that there is such a thing. What does he mean by this?

(3) What do you think are some of the methodological concerns one might have about experiments of the kind described above?
<<<========<<

>>>>>==============>
New sign of the times--speed limit fast.
The Christian Science Monitor, 7 December 1995, U.S. p1
Laurel Shaper Waters

The national speed limit has been repealed, effective December 8. Most states will not increase limits beyond 55 or 65 mph. On the other hand, nine Western states will move to 70 mph, while Nevada, Wyoming, and Kansas will increase to 75 mph, and Montana will actually eliminate daytime speed limits.

The Transportation Department estimates that the federal repeal will lead to 6400 more highway deaths a year. Chuck Hurley of the Insurance Institute for Highway Safety says he expects more honest debate at the state level, since states will have to bear the health-care costs of auto injuries. Tom Magliozzi, co-host of National Public Radio's "Car Talk" says that "what we have done in this country is that we have decided that getting someplace 15 or 20 minutes earlier is more important than life itself."

DISCUSSION QUESTIONS:

(1) Is it reasonable for the Transportation Department to estimate a death toll without waiting to see how the states respond to the repeal? How do you think the 6400 figure was calculated?

(2) Some lobbyists who pushed for the repeal argued that the hazards of higher speeds are offset by the many safety improvements in automobile manufacture. Does this say in effect that the current death rate is acceptable? Should we pass legislation tying speed limits to some index of auto safety, so that limits will automatically go up with improvements in design?

(3) Never mind quibbling about whether 65 mph is safer than 75 mph for wide-open western roads. We can certainly all agree that a 15 mph speed limit would save lives. Why is this not being proposed by the safety advocates?
<<<========<<

>>>>>==============>
Conception found to be likely just 6 days a month.
The Boston Globe, 7 December 1995, p 11.
Alison Bass

A study reported in today's New England Journal of Medicine found a 6-day "window of fertility", a shorter period than what is commonly believed. The study tracked the experience of 221 healthy women who planned to become pregnant between 1982 and 1985. Of these, 192 actually conceived. Their dates of ovulation were estimated by measuring hormones in urine samples. In a departure from findings of earlier studies, none of these women became pregnant from intercourse more than five days before or a day or two after ovulation. The probability of conception was found to be highest on the day of ovulation itself and dropped to 10% with intercourse five days before ovulation.

Some specialists have also maintained that the timing of intercourse can influence the sex of the baby, with times closer to ovulation favoring boys and earlier times favoring girls. The current study found no evidence to support such claims.

DISCUSSION QUESTION:

What problems can you anticipate with the design of this experiment? What else would you like to know?
<<<========<<

>>>>>==============>
Homing in on poll target.
The Boston Globe, 8 December 1995, p 3.
David M. Shribman

Both Democratic and Republican strategists are finding homemakers to be a key demographic group for the coming presidential race. Polling numbers for white men and women with jobs outside the home have been relatively stable in recent months, whereas homemakers seem more apt to shift. Up to 1992, homemakers under 45 had favored Republicans, perhaps responding to family values messages. But increased worries about their financial vulnerability may be moving them towards the Democrats. Private Democratic polls suggest that these women feel the Republican budget proposals cut too deep.

DISCUSSION QUESTION:

If these polls are correct, how might candidates take advantage of this knowledge?
<<<========<<

>>>>>==============>
Mean statistics: when is the average best?
Washington Post, 6 Dec. 1995, p. H7
John Schwartz

Schwartz remarks that politicians and others often choose a definition of average that best suits their needs.

He tells his readers what mean, median, and mode mean and gives examples of their use and misuse. He starts with the example of John Cannell, who noticed that his state's school system claimed high scores on nationally standardized tests and requested test scores from all 50 states. Cannell found that every one claimed to be "above the national average" or the statistical "norm". He called this the "Wobegon effect". A more detailed discussion of this example can be found in the article:

Taking the tests.
Dallas Morning News, 4 Oct. 1994
Karel Holloway.

As another example, Schwartz remarks that if Bill Gates were to move to a town with 10,000 penniless people the average (mean) income would be more than a million and might suggest that the town is full of millionaires.
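The point is easy to check with a few lines of Python. The income figure used for Gates below is a hypothetical round number chosen for illustration, not a figure from the article.

```python
from statistics import mean, median

# HYPOTHETICAL income for the one newcomer; the 10,000 residents have none
gates_income = 15_000_000_000
incomes = [0] * 10_000 + [gates_income]

print(mean(incomes))    # roughly 1.5 million per person
print(median(incomes))  # 0
```

The mean is pulled above a million by a single value, while the median still describes the typical resident.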

DISCUSSION QUESTIONS:

(1) How could the answers Cannell received be correct?

(2) Someone once claimed that if any one person moved from state X to state Y the average intelligence in both states would be increased. How could this be? Can you think of an X and a Y that might make this statement true?
<<<========<<

>>>>>==============>
Is your baby shy? feisty? expert says it may be lasting.
Sacramento Bee, 24 Nov. 1995
Deborah Blum

Jerome Kagan, a psychology professor at Harvard, has been pursuing for several decades the idea that a significant portion of personality is in the genes. It is not that environment does not affect personality but rather that each child begins in a different place, predisposed to certain behaviors.

He has followed a group of 700 children from infancy, comparing them as they grow. He sees which babies are frightened when, for example, suddenly seeing a bright, moving mobile or hearing a stranger's voice on tape. Fearful babies often grow into reserved toddlers and the easygoing into laid-back ones.

Other researchers have made the same point about cultures. Kagan went to San Francisco and compared 24 Chinese American and 34 European American newborns, born to middle-class parents of comparable age. Caucasian babies were much more volatile, while the Chinese American infants scored on the calmer and steadier side.

Following these children as they grew up, he found that the Caucasian children were much more likely to flare up in toy disputes than the Asian toddlers. In an earlier book he observed that these same differences were reflected in the artistic works and tastes of the two cultures.

DISCUSSION QUESTION:

Other recent research has found that patients with certain kinds of brain damage have lost certain aspects of personality. This suggests that, to some extent, personality is hard-wired into our brains. What relation does this research have to that discussed in the article? In particular, does this new research support the claim that personality is in our genes?
<<<========<<

>>>>>==============>
France finds a reading test incomprehensible.
The New York Times, 12 Dec. 1995, A3
Marlise Simons

The Paris-based Organization for Economic Cooperation and Development carried out a survey in 8 industrial countries to test how literacy skills relate to job success and economic performance. The survey tested comprehension of documents such as tables, nutritional charts, and train schedules, as well as day-to-day mathematical skills.

The results varied significantly between countries. Looking at the percentage of adults that scored in the lowest of five proficiency levels, for the portion of the study examining the ability to understand and use written information, you find

Sweden 7.5%
Netherlands 10.5%
Germany 14.4%
Switzerland 15.3%
Canada 16.6%
United States 20.7%
France 40.1%
Poland 42.6%

When the French officials studied the preliminary results two months ago, they insisted that all references to France be excised from the report, "Literacy, Society and Economy" published last week.

The director of the French Education Ministry rejected the findings because he said the methodology was flawed. He is quoted as saying: "I was all the more convinced of the flaws when I saw the results, particularly since each country defined its own control conditions."

He said: "Different societies and cultures emphasize different things. For example, one exercise with a recipe asked how many eggs were needed to bake a cake for four people. It then asked how many eggs were needed to bake a cake for six. This is not an exercise in French schools. Anyway, if you make a mistake of one egg, your cake will not be spoiled."

DISCUSSION QUESTION:

What might the director have meant by the comment "each country defined its own control conditions"?
<<<========<<

>>>>>==============>
Book reviews.
Reason, Dec. 1995, 27(7) p. 55
Brian Doherty

This article provides reviews and comparisons for the following four books that deal with interpretation of numbers we read about in the media.

The Tyranny of Numbers
Mismeasurement and Misrule.
Nicholas Eberstadt
AEI Press. 1995

Labyrinth of Prosperity.
Reuven Brenner
U. of Michigan Press 1994

Tainted Truth.
Cynthia Crossen
S&S Press 1994

A Mathematician Reads the Newspaper.
John A. Paulos
Basic 1995

We have already reviewed the books by Crossen and Paulos. Doherty admired Crossen's book but felt she could lighten up once in a while. He obviously both admired and enjoyed the Paulos book, calling it "by far more of a pleasure to read than the others under discussion. It is simply nifty, larded with clever and informative tidbits as he strolls his broad, discursive way through typical newspaper reporting".

Eberstadt, in his book, gives examples of the dangers of taking data at face value when making policies. For example, he claims we are not as ill-fed, ill-housed and ill-nourished as we have been led to believe. Doherty was particularly impressed by Eberstadt's discussion of the policy consequences of the CIA taking Soviet economic data at face value, even though these exaggerated reports were not consistent with the impressions of poverty and squalor reported by visitors to the Soviet Union. Still, Doherty asks why he should take on faith data that Eberstadt presents as gospel. "Am I to take it on faith that the U.S. Census Bureau can calculate with trustworthy accuracy the life expectancy of the Chinese people from the 1950's to the present?"

In his book, Brenner claims that aggregate statistics, even when compiled accurately, are pretty useless in making policy. For example, Brenner points out that there is a tremendous difference between a deficit that efficiently finances the construction of schools and roads and one that simply wastes money on bureaucrats. There are similar problems with averages like the consumer price index. He seems to regard aggregation of economic data as a rather hopeless and useless activity.

I have read the books by Paulos and Crossen and agree with Doherty that they are both outstanding books. I have not read the other two. My impression from this review is that they have some good examples along with a fair amount of nonsense.

DISCUSSION QUESTION:

Brenner argues that aggregate data doesn't really say much about individuals and therefore should not be used for making policy. Do you agree?
<<<========<<

>>>>>==============>
Silicone in the System.
Discover Magazine, Dec. 1995
Gary Taubes

There was a discussion on Edstat-l of the very interesting article on breast implants in the current Discover magazine. You can read this article "Silicone in the System" on the Web.

Paul Bernhardt wrote:

In the current issue of Discover magazine (December), there is an interesting article on the breast implants controversy. I recommend it for use in the stats classroom because it describes the experimental and statistical procedure misused by a leading expert in this area. It is clearly indicated as the flaw in the research. Quoting: "The problem was simple. Kossovsky had reported that the ELISA scores of 9 of his 249 women with implants were significantly higher than the mean score of the 47 healthy women or of the 39 women with autoimmune disorders. But those 9 women represented less than 4 percent of all the women with implants he tested. What if in reality his ELISA test was meaningless? Then he might expect 4 percent of all women to score equally high...."

The article is an excellent discussion of drawing unsubstantiated conclusions based on little to no data. Kossovsky's theory, connecting silicone to autoimmune disease, is elegant and plausible but has not yet received any empirical support. Students can clearly see the importance of data in the scientific enterprise.

This drew the following response from pfleury@popmail.mcs.com (Patrick Fleury):

Kossovsky has some other problems with statistics as well. He has been involved in several papers where the conclusions reached based on the numbers he uses are certainly open to question. For example, here is a set of numbers which appears in one of the papers of which he is a co-author.

TNFALPHA IL6

78 308287
65 33291
149 124550
451 17075
64 22955
79 95102
115 5649
618 840585
69 58924

(There is one other case with missing data which I left out.)
Anyway, the authors point out that the correlation between the above two sets of data is .77 and has a p-value of <.01. They don't really do much more than quote the numbers, however, so I don't know what conclusions they draw except that these seem to be high correlations. These also aren't the numbers I get. (I get .694 as the correlation and .038 as the p-value but never mind that.)

If you plot the above, however, you immediately see that it's case number 8 which is pulling the correlation up. It's an incredible outlier. If you delete it, then the high positive correlation goes away and you get a correlation of -.249. (This was pointed out by a friend of mine, Mark Coward.) The moral is "Plot your data" and it can be understood by any undergraduate.
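Readers who want to check these figures can do so with the short Python sketch below, which computes the Pearson correlation for the nine cases listed above, with and without case 8. The helper function and variable names are ours; only the data come from the message.

```python
from math import sqrt

# The nine complete cases quoted above, in the order listed
tnf_alpha = [78, 65, 149, 451, 64, 79, 115, 618, 69]
il6 = [308287, 33291, 124550, 17075, 22955, 95102, 5649, 840585, 58924]


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


r_all = pearson_r(tnf_alpha, il6)

# Drop case number 8 (index 7), the point with both values largest
keep = [i for i in range(len(tnf_alpha)) if i != 7]
r_without_8 = pearson_r([tnf_alpha[i] for i in keep], [il6[i] for i in keep])

print(round(r_all, 3))        # 0.694
print(round(r_without_8, 3))  # -0.249
```

Both values agree with the ones Fleury reports, and the change of sign when a single case is removed is exactly what a scatter plot would have shown at a glance.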

By the way, the citation for the above paper is Mena, Kossovsky, Chu, Hu, "Inflammatory Intermediates Produced by Tissues Encasing Silicone Breast Prostheses", Journal of Investigative Surgery, Volume 5, 1993. (The volume number might be 8, actually, I can't make it out clearly from the Xerox.)

The paper that the Discover article refers to is:

Kossovsky, et al., Surface Dependent Antigens Identified by High Binding Avidity of Serum Antibodies in a Sub population of Patients with Breast Prostheses, Journal of Applied Biomaterials, Vol. 4, pp. 281-288, (1993).
