
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
CHANCE News 4.05 (4 March 1995 to 21 March 1995)
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!

Prepared by J. Laurie Snell, with help from William Peterson, Fuxing Hou and Ma.Katrina Munoz Dy, as part of the CHANCE Course Project supported by the National Science Foundation.

Please send comments and suggestions for articles to jlsnell@dartmouth.edu

Back issues of Chance News and other materials for teaching a CHANCE course are available from the Chance Web Data Base http://www.geom.umn.edu/docs/snell/chance/welcome.html

===========================================
The average salary of most of the members of the Sierra Club is $70,000.
Representative Don Young, R-Alaska
===========================================

FROM OUR READERS

We failed to find any studies relating to Roger Pinkham's question about physicians being less likely to be called in when girls are born than when boys are born. However, Steven Wells gave us several references relating to the original question about full moons, including:

Kelly, I. W., J. Rotton, and R. Culver. 1985. The moon was full and nothing happened. Skeptical Inquirer, 10(2) pp. 129-143.

Martens, R., I. W. Kelly, and D. H. Saklofske. 1988. Lunar phase and birthrate: A 50 year critical review. Psychological Reports, 63 pp. 923-934.

Martens and Kelly provided an update: Psychological Reports, 1994 Aug; 75(1 Pt 2):507-11. Here is an abstract of this update from Medline.

Martens, Kelly, and Saklofske in 1988 examined 21 studies considering the possible relationship between lunar periodicities and birthrate. They reported that the majority of studies uncovered no relationship and that the positive studies were inconsistent in their findings. The present update reports on six additional studies on birthrate and lunar periodicities from five different countries. None of these studies produced evidence of lunar periodicities consistent with folklore or some previous studies.
<<<========<<

>>>>>==========>> Norton Starr sent the following remark:

Regarding CHANCE News 4.05, last item, the one about Marilyn vos Savant discussing one- vs. two-engine plane failures: this is a variation on a classic example discussed by Mosteller, Rourke, & Thomas in their "Probability with Statistical Applications", 2nd ed., Addison-Wesley, 1970. See example 3, pages 140-142, where the safety of two- vs. four-engine planes is compared under varying probabilities q of failure for a single engine. A plane succeeds in its flight if at least half its engines continue to function. The four-engine plane is safer if q < 1/3 and the two-engine plane is safer if q > 1/3. (Of course, for q = 0 or 1, the probabilities are the same.)

ON THE INTERNET

"Resampling Stats" is a software package developed by Julian Simon that allows the user to simulate a wide variety of chance experiments using a simple BASIC-like programming language. The value of computer simulations in the understanding of probability and statistical concepts is generally recognized today. The way simulation is used by Simon and co-author Peter C. Bruce has been the subject of lively discussions in Chance Magazine and elsewhere. Articles by Simon and Bruce on the use of Resampling Stats are available from the Resampling Stats home page. For the other side of the coin see "Review of the Resampling Method of Teaching Statistics" by Jim Albert and Mark Berliner, American Statistician, May 1994, Vol. 48, No. 2. On the Resampling Stats home page you will find examples of the use of the resampling software, including the resampling approach to solving the problems in Fred Mosteller's well-known problem book: "Fifty challenging problems in probability with solutions", Addison-Wesley, 1965. <<<========<<
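
Norton Starr's two- vs. four-engine comparison above lends itself to the kind of computer check that the Resampling Stats item describes. The following is only a sketch in Python (standing in for the package's own language, which is not shown here). It assumes that engines fail independently, each with probability q; the function names are ours and purely illustrative. It computes the exact chance that more than half of the engines fail and then repeats the calculation by simulation.

```python
import random
from math import comb


def failure_probability(engines: int, q: float) -> float:
    """Exact chance that more than half of the engines fail.

    Assumption: engines fail independently, each with probability q.
    """
    # The flight fails when fewer than half of the engines keep working,
    # i.e. when the number of failed engines exceeds engines / 2.
    return sum(
        comb(engines, k) * q**k * (1 - q) ** (engines - k)
        for k in range(engines // 2 + 1, engines + 1)
    )


def simulated_failure(engines: int, q: float, trials: int = 100_000, seed: int = 0) -> float:
    """Estimate the same probability by simulating many flights."""
    rng = random.Random(seed)  # fixed seed so that runs are repeatable
    failed_flights = 0
    for _ in range(trials):
        failed_engines = sum(rng.random() < q for _ in range(engines))
        if failed_engines > engines / 2:
            failed_flights += 1
    return failed_flights / trials


for q in (0.1, 1 / 3, 0.5):
    print(
        f"q={q:.3f}  two-engine: {failure_probability(2, q):.5f}  "
        f"four-engine: {failure_probability(4, q):.5f}  "
        f"four-engine (simulated): {simulated_failure(4, q):.5f}"
    )
```

Under these assumptions the two failure probabilities are q^2 and 4q^3(1-q) + q^4, which agree at q = 1/3 and cross over there, as the remark states.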

>>>>>==========>> ARTICLES ABSTRACTED

1. Lead remains danger to children.
2. Dirty air can shorten your life.
3. Experiment with success.
4. Word & Image; Innumeracy<2>.
5. Lottery paradox and preface paradox.
6. Estrogen cuts risk of heart disease.
7. Measuring our nation's diversity?
8. Statisticians and the news media.
9. Bayesian analysis in medical trials.
10. Statistics as Principled Argument.

<<<========<<

>>>>>==========>> Despite reductions in exposure, lead remains danger to children. The New York Times, 21 March, 1995, C3 Jane E. Brody

This is an excellent review article on current knowledge about the incidence of lead poisoning in children and the effect it has on I.Q. and other learning abilities. In recent years there has been a dramatic decrease in the exposure to lead in the environment. At the same time, studies have led to changes in the level of lead in blood considered dangerous. In 1969 this level was 60 micrograms per deciliter of blood; it is now 10 micrograms. An estimated three million pre-school-age American children currently have more than 10 micrograms of lead per deciliter of blood. Silent lead poisoning refers to low levels that are believed to have hidden effects on brain development. Dr. John Rosen, a specialist in the pediatric effects of lead, states that the highest prevalence of silent lead poisoning occurs in African-American children aged 1 to 5 who live in large cities and whose families have marginal incomes. Among children living under these conditions, it is estimated that 33 percent of the African-American children have hazardous amounts of lead in their blood, compared with 17 percent of Hispanic-American children and 6 percent of White children. One study indicated that every 10-microgram increase in blood lead level above the danger level of 10 micrograms at age 2 years led to a 5.8 point decline in overall I.Q. and an 8.9 point decline in achievement test scores by age 10. While there are conflicting studies on the dangers of lead poisoning, recent articles reviewing the existing studies suggest that the dangers are real. However, there are researchers who believe that other explanations should be sought for observed lead-related deficits in I.Q. DISCUSSION QUESTIONS (1) What other explanations might be given for the cause of the observed lead-related deficits in I.Q.? (2) What is the relation of studies on lead poisoning and I.Q. to the issue of heritability of I.Q.? <<<========<<

>>>>>==========>> Dirty air can shorten your life. The Washington Post, 10 March 1995 Curt Suplee

An article in the March issue of the "American Journal of Respiratory and Critical Care Medicine" reports on the largest study ever conducted on the health effects of airborne particles from traffic and smokestacks. The study tracked the health histories of 552,138 adults in 151 metropolitan areas from 1982 to 1989 and compared mortality rates with the amount of fine particle matter such as soot, smoke, and sulfate particles as measured by the EPA. After controlling for age, smoking, etc., researchers found that high sulfate and fine particle levels raised the risk of premature death from all causes by about 15 percent. Death rates from heart disease, lung disease, and respiratory diseases averaged 30 percent higher in the most polluted cities as compared to the least polluted cities. On the East Coast the preponderance of fine particles are sulfates produced primarily by power plants and industrial sources, while on the West Coast most are nitrates from automobile pollution. DISCUSSION QUESTION The article comments that "those exposed to the highest concentrations of particles run a risk of premature death about one-sixth as great as if they had been smoking for 25 years". Is that a reasonable way to describe a risk? <<<========<<

>>>>>==========>> Experiment with success. The New York Times, 9 March 1995, A1 Celia W. Dugger

A youth program that worked. The New York Times, 20 March, 1995, A16 Editorial

A million-dollar program called the Quantum Opportunity Program, financed by the Ford Foundation, is reported to have had remarkable success in encouraging underprivileged children to continue their education. In the experiment 25 participants at each of four sites, Philadelphia, Oklahoma City, San Antonio, and Saginaw, Michigan, were randomly chosen from a group of students who were entering ninth grade and whose families were on welfare. These students were given special guidance for a period of four years. The students had adult supervisors who worked with them on a daily basis to help with both their academic and personal problems. They were paid $1.33 for each hour spent on extra academic study, volunteer work or cultural and educational events and an additional $100 for each 100 hours completed. Their earnings were matched in an account that could be used for college or trade school. The average four-year cost per student was $10,600. By the end of the program, 63 percent of the participants graduated from high school, 42 percent were enrolled in a post-secondary program, 23 percent dropped out of school, 24 percent had children and 2 percent had arrest records. Of a control group, 42 percent finished high school, 16 percent went on to post-secondary schools, 50 percent dropped out, 36 percent had children and 13 percent had arrest records. Starting in September, the Labor Department and the Ford Foundation will make a larger test of the program involving 700 participants in five areas. The Times editorial states that "The success of the program shows that careful investments in disadvantaged youths can work". DISCUSSION QUESTIONS. How would you determine if the results of this experiment are statistically significant? The article by Dugger described the Philadelphia program, which was considered the most successful. It was clear from the article that a particularly effective supervisor had been picked.
How could you assess the importance of this to the success of such a program? Given the success of this trial, is it ethical to have another trial in which participants are randomly chosen? <<<========<<
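
As one possible starting point for the first discussion question, here is a minimal sketch of a two-proportion z-test applied to the graduation rates (63 percent vs. 42 percent). The 100 participants come from the 25 students at each of four sites; the size of the control group is not given in the summary above, so the value of 100 used here is purely an assumption, as is the function name. The sketch also ignores the fact that students were grouped by site.

```python
from math import erfc, sqrt


def two_proportion_z(successes_1: int, n_1: int, successes_2: int, n_2: int):
    """Pooled two-proportion z-test; returns (z, two-sided p-value)."""
    p_1 = successes_1 / n_1
    p_2 = successes_2 / n_2
    # Under the null hypothesis both groups share one success rate.
    pooled = (successes_1 + successes_2) / (n_1 + n_2)
    std_error = sqrt(pooled * (1 - pooled) * (1 / n_1 + 1 / n_2))
    z = (p_1 - p_2) / std_error
    # Two-sided tail area of the standard normal: 2 * (1 - Phi(|z|)).
    p_value = erfc(abs(z) / sqrt(2))
    return z, p_value


# 63 of 100 participants graduated; the control count assumes 100 students.
ASSUMED_CONTROL_SIZE = 100  # hypothetical: not stated in the summary
z, p_value = two_proportion_z(63, 100, 42, ASSUMED_CONTROL_SIZE)
print(f"z = {z:.2f}, two-sided p-value = {p_value:.4f}")
```

With samples this small, an exact test or a permutation test would be a reasonable alternative, and the answer depends on the true control group size.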

>>>>>==========>> Word & Image; Innumeracy<2>. The New York Times, 5 March 1995, Magazine p. 24, Max Frankel

Max Frankel is former executive editor of "The New York Times" and now writes a column on communication for the "Times Magazine". On this occasion, Frankel discusses the poor job newspapers do in communicating information involving numbers. His examples are all taken from the "New York Times", though he assures us that the "Times" is better at dealing with numbers than most newspapers. Frankel starts with a series of examples with missing denominators; for example: "Clinton has reduced the Federal payroll by 98,000". The reader is not told the total number on the Federal payroll, making it difficult to assess the significance of the cut. Frankel comments that: "America needs baseball back, if only because that is the only way it learns to handle rates, probabilities, and context." Frankel next gives examples in which he feels the newspapers handle numbers in a sloppy way or give a false impression of the precision of the numbers provided. His basic example is a report in the "Times" of a study that estimated the economic costs of depression in America to be 14.7 billion dollars. One aspect of this total cost was the loss of earnings of the 18,400 suicides in a year which Frankel describes as "magically rendered as 7.5 billion ($407,608.69 each?)". The original article, in the "Journal of American Psychiatry", provided a detailed justification for this figure. It seems to us that the problem here is just that of having to summarize a lot of information in a short space. Frankel's simpler examples, such as the report that Russian miners were demanding a 150% pay increase (in one paragraph) and wanting their salaries increased by two-and-a-half times (in the next), are more convincing. Frankel discusses several examples from the forthcoming book by John Paulos, "A Mathematician Reads the Newspaper". 
He admires a conditional probability problem where a positive test gives only a 1 chance in 11 of a patient actually being sick in a situation where a test can be "rightly called 99 percent accurate". He asks "how many newspapers reporting such a study could correctly instruct their readers in the meaning of 99 percent accurate?". The next example shows Frankel's own difficulty in being clear about numbers. He writes: "How many of us had the wit to question, as Paulos does, the judgment that proportionately more blacks (95 percent) voted for David Dinkins because he is black than whites (75 percent) voted for Mayor Giuliani because he is white?" Noting that 80 percent of New York's blacks normally vote Democratic while only 50 percent of whites normally vote Republican, Paulos says it could also be argued that only 15 percent of blacks voted racially as against 25 percent of whites. Frankel should have first just said that 95% of the blacks voted for Dinkins and 75% of the whites voted for Giuliani and then gone on to give Paulos' argument that it is still possible that fewer blacks voted racially than whites. DISCUSSION QUESTION. If you were a science writer, how would you explain what "99% accurate" means, for AIDS tests for example? <<<========<<
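
The "1 chance in 11" figure for the 99 percent accurate test can be reproduced with Bayes' rule. The summary above does not give the prevalence of the disease; the sketch below assumes that 1 person in 1,000 is sick and that "99 percent accurate" means the test is right 99 percent of the time for both sick and healthy people. Both are our assumptions, chosen because they lead to the quoted answer.

```python
def positive_predictive_value(prevalence: float, sensitivity: float, specificity: float) -> float:
    """P(sick | positive test) by Bayes' rule."""
    true_positives = prevalence * sensitivity
    false_positives = (1 - prevalence) * (1 - specificity)
    return true_positives / (true_positives + false_positives)


# Assumed inputs: prevalence of 1 in 1,000 and 99% accuracy in both directions.
ppv = positive_predictive_value(prevalence=0.001, sensitivity=0.99, specificity=0.99)
print(f"P(sick | positive) = {ppv:.4f}  (about 1 in {1 / ppv:.0f})")
```

Put in terms of counts: out of 100,000 people, about 99 of the 100 sick people test positive, but so do about 999 of the 99,900 healthy people, so only about 99 of roughly 1,098 positives are actually sick.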

>>>>>==========>> The lottery paradox and the preface paradox. Pyrrhonian reflections on knowledge and justification. Appendix A, Oxford University Press, 1994 Robert J. Fogelin

Fogelin writes "Since most of our empirical knowledge claims are inductively based, their probabilities typically fall short of 1. It seems, then, that we are willing to say that something counts as empirical knowledge provided the level of probability is suitably high. But however high we set the probability, it is possible to show that fixing the probability at that level leads to paradoxical results". The lottery paradox and the preface paradox discussed in this appendix are meant to illustrate this claim. The Lottery Paradox: In a lottery the probability that any given person does not win is very high. Assume that it is above our chosen level. Then we believe of each individual that he or she will not win, and yet we also believe that someone will win. The Preface Paradox: The author of a history book states in the preface that he believes everything he has written but that he has undoubtedly made some mistakes. DISCUSSION QUESTIONS. (1) Do you regard either or both of these situations as paradoxes? (2) If they are paradoxes, are they the same paradox? (3) Assume that all the inmates on death row have been proven guilty beyond a reasonable doubt. Is it true, beyond a reasonable doubt, that all those on death row are guilty? <<<========<<
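
A little arithmetic shows why a fixed probability threshold behaves badly when many claims are combined, which bears on discussion question (3). The sketch below makes two assumptions of our own: that "beyond a reasonable doubt" corresponds to a probability of 0.99 for each claim, and that the claims are independent. Neither number comes from Fogelin.

```python
def prob_all_true(p_each: float, n_claims: int) -> float:
    """Chance that n independent claims, each true with probability p_each, are all true."""
    return p_each**n_claims


ASSUMED_THRESHOLD = 0.99  # hypothetical stand-in for "beyond a reasonable doubt"

for n_claims in (1, 10, 100, 500):
    print(f"{n_claims:4d} claims: P(all true) = {prob_all_true(ASSUMED_THRESHOLD, n_claims):.4f}")
```

Each individual claim clears the threshold, yet under these assumptions the conjunction of a hundred such claims is more likely false than true.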

>>>>>==========>> More evidence shows estrogen therapy cuts risk of heart disease. The Boston Globe, 10 March 1995, A15 Dolores King

Estrogen is taken by millions of American women to treat symptoms of menopause or prevent osteoporosis. Previous studies have indicated that it can cut the risk of heart disease but increase the risk of breast cancer. A study of nearly 100,000 women, reported at the American Heart Association meeting in San Antonio, found that the women who had estrogen therapy had a 30 percent lower risk of dying from all causes than those who had never used estrogen. This result was statistically significant. They also had a 48 percent lower risk of dying from heart disease, which was nearly statistically significant. The corresponding results for women under 75 who had estrogen therapy for at least ten years were both highly significant. Jane A. Cauley, who presented the results, cautioned that the findings may be influenced by other factors: for example, women who choose estrogen replacement may be more health conscious or more willing to comply with a doctor's advice. The article remarks that this is a case where the entire set of risk factors has to be taken into account. The fact that heart disease kills about five times as many women as breast cancer suggests the advantage of estrogen therapy would outweigh the disadvantages. On the other hand, a woman with a history of breast cancer who exercises regularly and eats a low-fat diet might not choose estrogen therapy because she is at a lower risk for heart disease than for breast cancer. <<<========<<

>>>>>==========>> How Shall we Measure our Nation's Diversity? Chance Magazine, Winter 1995, pp. 7-14. Suzann Evinger

Black, White, and Shades of Gray (and Brown and Yellow) Chance Magazine, Winter 1995, pp. 15-18. Margo Anderson and Stephen E. Fienberg

The Office of Management and Budget (OMB) provides the racial and ethnic categories used by federal agencies. These categories are used, for example, in the census, in civil rights enforcement, and in demographic studies. The current categories, established in 1977, are American Indian or Alaskan Native, Asian or Pacific Islander, Black, White, Hispanic. The OMB is reviewing these categories and a wide range of changes have been recommended to them including: Change "Black" to "African American" and "American Indian or Alaskan Native" to "Native American". Include "Native Hawaiians" as a separate category or as part of "Native American" rather than as part of the "Asian or Pacific Islander" category. Add a "Multiracial" category to the list of racial designations. In addition to these specific suggestions, more general suggestions have been made including eliminating racial and ethnic categories altogether since they appear to have no real genetic significance. These two articles discuss the many issues facing the OMB. Needless to say a revision of racial categories raises a number of interesting statistical issues such as: "Should race be self reported?" and "Would important information be lost by introducing a multiracial category?". Anderson and Fienberg make a number of recommendations which include: Do away with the labels 'race' and 'ethnicity' and substitute something like "identified population groups". Allow people to identify with more than one group. Regarding this last recommendation they remark: "We can always construct statistical rules for taking multiple responses and producing aggregate information on the categories". DISCUSSION QUESTIONS. The article continually refers to both racial and ethnic categories. Is there a difference? If so, what is it?
What advice would you give to OMB relating to the questions we have listed that they are considering? <<<========<<

>>>>>==========>> "How statisticians can help the news media do a better job?"

"How the news media can help statisticians do a better job?" Chance Magazine, Winter 1995, pp. 24-29. John C. Bailar III

These two companion articles are based on Bailar's extensive experience assisting the media in reporting science, particularly in the field of medicine. In the first article, Bailar stresses that science writers are professionals just as scientists are and "have their own interests, professional standards, technical language, time schedule and so forth". Bailar gives his own advice, as well as that of an experienced science writer, as to how to provide scientific information to a reporter and, through the reporter, to the public. Much of this advice is very relevant to our own attempts in the classroom to convey current scientific knowledge.

Bailar makes six suggestions for ways the news media can help statisticians do a better job:

1. Reporters should stress that the outcome of one study is usually only one part of a general research program and rarely should be interpreted in isolation.
2. The media should not give the impression that dead controversies are still alive -- for example, they should not allow tobacco companies to suggest that the issue of the danger to health of smoking is still being debated.
3. When a controversy is not settled, such as the possible dangers of magnetic fields, the media should bring out arguments on both sides. A recent article ("Explaining EMF: science writers did it better" by Sharon M. Friedman, "ScienceWriters", Winter 1994-95, pp. 7-8) reviewed a large number of news articles on EMF and found that the media, as a whole, did not follow Bailar's advice, but also found three excellent in-depth articles by science writers who did.
4. News media should be more skeptical when the pronouncements of individuals or organizations, including the government, tend to serve some ulterior purpose. An example would be the National Cancer Institute announcing dramatic new progress in curing cancer.
5. News media should be skeptical about claims that are not backed by extensive data. Here he mentions the flurry of news suggesting a connection between cellular telephone use and brain cancer.
6. The news media should continue to educate the public in the ways that science progresses -- i.e., the big picture.

<<<========<<

>>>>>==========>> Placing Trials in Context Using Bayesian Analysis. JAMA, March 15, 1995, pp. 871-875 James M. Brophy and Lawrence Joseph

The authors point out that the standard use of p-values and confidence intervals to judge the outcome of a medical trial often leads a doctor to a decision about a drug that does not take into account information from previous studies. They argue that a Bayesian analysis can be used to take into account previous information and can lead to a different decision about the effectiveness of a drug. The authors illustrate this in terms of a recent study called the GUSTO study that compared two different treatments for heart attacks, tissue-type plasminogen activator (t-PA) and streptokinase (SK). The study involved 41,021 patients and t-PA had a statistically significant lower mortality rate (6.3% vs 7.3%, respectively; p = .001). Since t-PA costs about $2000 and SK only $200, and two similar previous studies indicated no significant difference, some doctors have been hesitant to use the more expensive treatment despite the latest significant result. A Bayesian analysis is carried out by choosing a prior distribution for the difference in mortality rates for the two drugs. The authors suggest that people might differ in how they weight the previous studies. They give three prior distributions for the mortality difference corresponding to weighting the previous studies by 10%, 50%, or 100%. For each of these three prior distributions they obtain the posterior distribution for the difference in the two drugs given the results of the GUSTO trial. From the posterior distribution for the mortality difference, it is possible to calculate the probability that t-PA has a lower mortality rate than SK. In addition, it is possible to calculate the probability that one drug is "clinically superior". In this example the authors say that a drug is clinically superior if the mortality rate is at least 1% lower. The 50% weighting of the previous studies leads to a 44% probability that t-PA has a lower mortality rate and a negligible probability that the difference is clinically significant.
Ignoring previous results altogether, t-PA has a very high probability (99.95%) of having a lower mortality rate but still only a 48% chance of being clinically significant. Thus, a wide range of weightings of the previous results all provide some justification for not immediately switching to the more expensive drug. DISCUSSION QUESTION. You have a coin that you initially assume is a fair coin. Jones tosses the coin ten times and gets nine heads. You then toss the coin ten times and get five heads. You want to find the posterior probability that the coin is a fair coin weighting Jones' tosses fifty percent. How would you do this? <<<========<<
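
One way to approach the coin question is sketched below. It is not the method of Brophy and Joseph; it borrows only their idea of down-weighting earlier data. The assumptions are ours: a prior probability of 0.5 that the coin is fair, a uniform prior on the heads probability if it is not, and Jones' tosses counted at a fraction `weight` of their face value by raising their likelihood to that power. All names are hypothetical.

```python
from math import exp, lgamma, log


def log_beta(a: float, b: float) -> float:
    """Logarithm of the Beta function B(a, b)."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)


def posterior_fair(weight: float, prior_fair: float = 0.5,
                   jones=(9, 1), own=(5, 5)) -> float:
    """Posterior probability that the coin is fair.

    jones and own are (heads, tails) counts; Jones' counts are
    multiplied by `weight` (0 ignores them, 1 counts them fully).
    """
    heads = weight * jones[0] + own[0]
    tails = weight * jones[1] + own[1]
    # Likelihood of the (weighted) data if the coin is fair.
    log_like_fair = (heads + tails) * log(0.5)
    # Likelihood averaged over a uniform prior on the heads probability.
    # Binomial coefficients are the same under both hypotheses and cancel.
    log_like_biased = log_beta(heads + 1, tails + 1)
    odds = prior_fair / (1 - prior_fair) * exp(log_like_fair - log_like_biased)
    return odds / (1 + odds)


for weight in (0.0, 0.5, 1.0):
    print(f"weight {weight:.1f}: P(fair | data) = {posterior_fair(weight):.3f}")
```

Different priors, or a different way of discounting Jones' tosses, would give different answers, which is the point of the exercise.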

>>>>>==========>> Statistics as Principled Argument. Publisher: Lawrence Erlbaum Associates, Hillsdale, NJ, 1995 Robert P. Abelson

In the preface of this new book, Abelson suggests students learn to do the statistical analysis for a study but do not learn what he calls the "narrative" part of the study. "Ask a student the question, 'If your study were reported in the newspaper, what would the headline be?' and you are likely to receive in response a rare exhibition of incoherent mumblings." Pursuing the headline question he arrived at the thesis of his book: "The purpose of statistics is to organize a useful argument from quantitative evidence, using a form of principled rhetoric". The book discusses the many issues involved in making valid arguments. The author assumes the reader is familiar with elementary probability, standard tests such as the t test, analysis of variance, and simple issues of research design that might be presented in a first statistics course. The first 5 chapters review basic statistical concepts in an informal manner with no formulas and always from the point of view of using them to make proper arguments. Chapters 6 through 10 discuss more general topics, such as meta-analyses, again from the point of view of making valid statistical arguments. In all cases the ideas are illustrated in terms of studies, mostly taken from Abelson's own field of research, experimental social psychology. One of Abelson's theses is that problems chosen for study should be interesting. Consistent with this, he chooses interesting examples. Here are three of them.

A study found that the average life expectancy of famous orchestral conductors was 73.4 years, significantly higher than the life expectancy for males, 68.5, at the time of the study. Jane Brody in her "New York Times" health column reported that this was thought to be due to arm exercise. J. D. Carroll gave an alternative suggestion, remarking that it was reasonable to assume that a famous orchestra conductor was at least 32 years old. The life expectancy for a 32 year old male was 72 years, making the 73.4 average not at all surprising.

A curious reporter found there were an unexpected number of births on Monday and Tuesday exactly 9 months after the famous New England blackout of 1965. He wrote an article suggesting the obvious causal effect. The detective work of a curious statistician found that more children are generally born on Mondays and Tuesdays, by about the same number that the reporter found. Further, he found that doctors prefer to schedule induced labor or Cesarean operations at the beginning of the week, which provided an explanation for the excess births on Mondays and Tuesdays.

In 1968 Rosenthal and Jacobson designed a study for what they called the "Pygmalion effect in the classroom". They told elementary school teachers that certain students in their class, called "bloomers", had been identified by a special test as being likely to display future excellence. These students were, in fact, randomly chosen from the class. The researchers hypothesized that the bloomers would receive extra attention and do better in future tests. This was verified. One observation was that the mean I.Q. scores of the bloomers were 4 to 6 points higher than those of the control group, which caused a great deal of controversy. The meta-analysis of 18 further studies did not, initially, support this I.Q. finding; but further analysis did in the studies where the teachers had not had previous contact with the students. This last example, used to illustrate meta-analyses, has been discussed recently in connection with the book "The Bell Curve".

In his preface Abelson remarks: "I have always wanted to write a statistics book, full of tips, wisdom and wit." He has certainly succeeded! On the back cover Abelson remarks he is not also the Robert P. Abelson who sings in the Yiddish theater in New York.

!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
CHANCE News 4.05 (4 March 1995 to 21 March 1995)
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!

Please send suggestions to: jlsnell@dartmouth.edu

>>>==========>>|<<==========<<<
