Prepared by J. Laurie Snell, Bill Peterson and Charles Grinstead, with help from Fuxing Hou, and Joan Snell.
Please send comments and suggestions for articles to
jlsnell@dartmouth.edu.
Back issues of Chance News and other materials for teaching a Chance course are available from the Chance web site:
Chance News is distributed under the GNU General Public License (so-called 'copyleft'). See the end of the newsletter for details.
===========================================================
It is difficult to forecast, especially about the future.
Believed to have first been said in the Danish
parliament during the period 1930-39.
===========================================================
Contents of Chance News 7.09
<<<========<<
>>>>>==============>
Note: In the last Chance News we mentioned that David Freedman and
his colleagues were preparing a more up-to-date paper on their
concerns about the sampling techniques proposed for the Census
2000. They have done so and it is available as
Technical Report no. 537
<<<========<<
>>>>>==============>
Beth Chance suggested the following article showing once more that grade school children can do statistics.
California and the West; sixth-grade science project inspires
legislature.
The Los Angeles Times, 8 Sept. 1998, A3
Sarah Yang
Ellie Lammer's sixth-grade science project was inspired by an experience of her mother, who put an hour's worth of coins into a Berkeley meter and got only 45 minutes -- and a $20 ticket.
Armed with $5 worth of nickels and three stopwatches, Ellie tested 50 randomly chosen meters around Berkeley. She found that only three of the meters provided the correct amount of time. In fact, while 28% of the meters shortchanged the motorist, 66% gave more than the time specified.
Ellie displayed her findings with spreadsheets and pie charts in an exhibit at her sixth-grade science fair in April. Reporters and then politicians were soon at her door. Senator Quentin Kopp asked Ellie to show her project to legislators in Sacramento. Kopp then proposed a bill authorizing county sealers of weights and measures to test and certify the accuracy of all parking meters within their jurisdiction. The bill was overwhelmingly approved by the Assembly and unanimously approved by the Senate. The Governor is expected to approve the bill, and in any case the wide margin of support assures its becoming law even if he were not to.
After Ellie's testimony, Assemblyman Tom Woods told Ellie:
At 12 years old, you've done for $5 a perfectly infallible study. By the time you get through college you'll learn how to do this for several hundred thousand dollars.
DISCUSSION QUESTIONS:
(1) How do you think Ellie actually carried out the experiment?
(2) Do you think there is a germ of truth in Tom Woods' remark?
<<<========<<
Most people cannot do the odds. Which is a better deal over a year: a 100%-safe return with 5% interest or a 90%-safe return with a 20% return? For the first deal, your return will be 5%. For the second, your expected return will be 8%.
We received the following letter from Peter Doyle:
To the editor of Chance News:
Faced with a 100%-safe investment returning 5% and a 90%-safe investment returning 20%, you should invest 20% of your funds in the risky investment and 80% in the safe investment. This gives you an effective return of roughly 5.31962607969968292 percent.
QUESTIONS:
1) Derive this answer by considering what happens over the long haul if at each period you allocate a fraction x of your money to the risky investment, and maximizing with respect to x.
2) Use the same method to lay to rest the so-called St. Petersburg paradox.
3) Why is this method of analyzing investments not discussed in introductory probability texts like the one by Grinstead and Snell?
Peter
We think you will enjoy answering the first two questions. As penance for not including this topic in our book, we put our solution to Peter's first two questions at the end of this Chance News.
Norton Starr sent us three items from the October 1998 RSS News, published by the Royal Statistical Society. Recall that RSS News has a column "Forsooth!" which provides quotes from the press that the editors feel deserve a "Forsooth!".
Forsooth!
... on the potential health risks of chloroform and dichloroacetate ... these are the first proposed risk-assessments to recommend less-than-zero human exposure for compounds found to be carcinogenic in laboratory animals.
ILSI News (International Life Sciences Institute), May/June 1998
In the study, men who began taking light exercise in their sixties reduced their chances of dying by about 45 per cent compared with those who stayed inactive.
Liverpool Echo, 29 May 1998
Infant mortality (deaths at ages under one year per 1000 births):
1948 - 26,766
1994 - 3,979
The Independent 4 July 1998
DISCUSSION QUESTION:
For each example, explain why you think the RSS gave them a "Forsooth!".
<<<========<<
The placebo effect can be regarded as a confounding factor in experiments designed to expose the potency of drugs. This Times article discusses recent evidence for the biological bases of the effect: "But now scientists... are... beginning to discover the biological mechanisms that cause it (the placebo effect) to achieve results that border on the miraculous. Using new techniques of brain imagery, they are uncovering a host of biological mechanisms that can turn a thought, belief or desire into an agent of change in cells, tissues and organs. They are learning that much of human perception is based not on information flowing into the brain from the outside world but what the brain, based on previous experience, expects to happen next."
Readers teaching statistics may use the issues surrounding the placebo effect in various ways: (1) some articles give enough numerical data to allow students to run comparisons of proportions; (2) the present article suggests the question of how to design experiments to avoid confounding via the placebo effect; and (3) the ethics of placebo use can be a source for classroom discussions.
Below are some references on the subject.
H. K. Beecher, Measurement of subjective responses; quantitative effects of drugs, Oxford U. Pr., New York, 1959. (See esp. pp. 66-67.)
Louis Lasagna, "Placebos", Scientific American, 193:68, 1956.
B. Roueche, "Annals of medicine: placebo", New Yorker, October 15, 1960
"Placebo Studies Are Not Just 'All in Your Mind'", New York Times, News of the Week in Review, Jan. 6, 1980. Interesting report on work of Levine, Gordon & Fields (The mechanism of placebo analgesia, Lancet, September 23, 1978, 654-657) and Herbert Benson et al., (Cf. The placebo effect: a neglected asset in the care of patients, J. Amer. Med. Ass. 232:12, June 23, 1975). "Every remedy for angina introduced during the last 200 years, Dr. Benson wrote in The New England Journal of Medicine... was effective in 70 to 90 percent of patients who received it when it was new. But as soon as another therapy appeared, the effectiveness of the old one fell to 30 or 40 percent. This pattern prompted the 19th century French physician Armand Trousseau to recommend, 'You should treat as many patients as possible with the new drugs while they still have the power to heal.'"
Placebo effect is shown to be twice as powerful as expected, New York Times, Aug. 17, 1993, C3.
Placebo effect can last for years, New York Times, Apr. 16, 1997, C8.
<<<========<<
Certainly one of the most intriguing and important unanswered questions is whether we are alone in the Universe. This book is an attempt to answer this question by using probability theory, our knowledge of the processes of life on this planet, and observations about the way the universe works.
The book begins with a discussion of the famous Drake equation, named after the astronomer Frank Drake, who was one of the first people to attempt to quantify research concerning extraterrestrial life. His equation describes how the number of civilizations in our galaxy that are capable of communicating with other civilizations depends on certain facts about our galaxy, such as the number of stars, the fraction of these stars that have planets, the fraction of these planets that might be 'habitable' by life, and so on. See Chance News 6.13 for a more detailed account of the Drake equation and other references to estimating the probability of life out there.
There have been many attempts to estimate the various facts which enter into the Drake equation. Unfortunately, it is still the case that most of these estimates are very rough, since many depend upon knowledge about planets outside of the solar system, and very few such planets have been detected.
The author does a good job of giving some of the history of the search for extraterrestrial life. He also discusses the evolution of life on this planet and the gradual increase in intelligence during this process. (However, as our Congress has recently demonstrated, we must be prepared for some dips in this intelligence curve.)
At this point, the reader is informed of several notorious problems in probability theory, namely the bus paradox (called the inspection paradox in the book) and the birthday problem. The first of these concerns the following situation: Suppose that the average time between arrivals of buses at a given bus stop is A minutes. Now suppose that someone arrives at the bus stop at some random time. What is the average length of time that he or she should expect to wait for the bus? The first 'obvious' answer is A/2, and this answer is correct, but only if the buses come every A minutes on the dot. If the interarrival times are exponentially distributed, the answer is A, and it is the case that, for certain distributions, the answer is greater than A. The reason for this is that, if the interarrival times have large variance, then the person who arrives at a random time is more likely to arrive in a long inter-bus interval than in a short one.
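Here is a short simulation (our own sketch, not from the book) illustrating the inspection paradox; the mean gap A and the run lengths are arbitrary choices.

    import random
    from bisect import bisect_right

    A = 10.0            # mean time between buses, in minutes
    HORIZON = 200_000   # riders arrive uniformly in [0, HORIZON)
    RIDERS = 50_000

    def mean_wait(draw_gap):
        """Average wait for a rider arriving at a uniformly random time,
        when successive interarrival gaps are drawn from draw_gap()."""
        arrivals, t = [], 0.0
        while t < HORIZON + 100 * A:   # run past the horizon so every rider sees a bus
            t += draw_gap()
            arrivals.append(t)
        total = 0.0
        for _ in range(RIDERS):
            r = random.uniform(0, HORIZON)
            total += arrivals[bisect_right(arrivals, r)] - r
        return total / RIDERS

    # Buses exactly every A minutes: the average wait comes out near A/2 = 5.
    print(mean_wait(lambda: A))
    # Exponential gaps with mean A: the average wait comes out near A = 10.
    print(mean_wait(lambda: random.expovariate(1 / A)))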
What has this to do with extraterrestrial life? If we replace the interarrival times by lengths of time that life has existed on planets in the universe, we might ask ourselves what is the average of these numbers? The point is that, on this planet, this length is about 3.5 billion years, and it is only after that much time has passed that intelligence has increased to the point where one thinks about intelligent life. If a life form is sent 'at random' to live on some planet, it is more likely to be sent to a planet where life has long duration than to one where life has only been in existence for a short time. The author concludes from this argument that, even if other life exists in our galaxy, it is likely that we are among the most advanced because it is likely that life has existed on our planet longer than on most planets.
The second problem from probability theory that is brought to bear on the central question in this book is the birthday problem. The problem can be stated as follows: Assume that each of the 365 possible days of the year (ignore February 29) is equally likely to be the birthday of a randomly chosen person. How large must a random set of people be in order that there be a greater than even chance that two people in the set share a birthday? As all of our readers know, the answer in this case is surprisingly small; it is 23. Of what relevance is this problem to the author's arguments?
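A quick way to check the answer 23 (a sketch of the standard computation, not taken from the book):

    # Multiply up the chance that n people all have distinct birthdays and
    # stop as soon as it drops below 1/2.
    prob_all_distinct = 1.0
    n = 0
    while prob_all_distinct > 0.5:
        n += 1
        prob_all_distinct *= (365 - (n - 1)) / 365
    print(n, 1 - prob_all_distinct)   # prints 23 and about 0.507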
In this reviewer's opinion, the relevance is weak. In fact, the author really uses the idea of independent events (on which the birthday problem rests) rather than the birthday problem itself. Since one can posit that DNA, and hence life as we know it, was formed by a (very long) sequence of random events, one can imagine ascribing a certain probability p that, around a random star, a similar sequence of events occurred which led to the formation of DNA (or even some other life-generating molecule). Since life has occurred around at least one star in the universe, this probability is positive. Since we assume that the presence or absence of life around different stars are independent events, it follows that the probability that no other life exists in the universe is (1-p)^n, where n is the number of stars in the universe. So far, so good.
It is here, at the punch line of the book, that a problem develops. It is possible to determine, within a few orders of magnitude, the number n in the above expression; it is about 10^(22). It is not possible to get any clear idea, even to within many orders of magnitude, of what p is. The author suggests a value of 10^(-14) or so, and with this value the probability that other life exists, 1-(1-p)^n, is essentially 1. However, if this reviewer were to suggest another value for p, say 10^(-28) or so, that probability is now essentially 0. It is not clear that there is any way at the present time to decide which of these two proposed values of p is closer to the 'right' one. The author adds the remark that, if the number of stars in the universe is infinite, then this probability is exactly 1. However, if one subscribes to the Big Bang theory concerning the creation of the universe, then of course the number of stars in the universe is not infinite.
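To see how sharply the punch line depends on the guessed value of p, note that (1-p)^n is essentially exp(-np). A two-line check (ours, not the author's calculation):

    import math

    # Probability that no other life exists, (1-p)^n ~ exp(-n*p), with n = 10^22 stars.
    n = 1e22
    for p in (1e-14, 1e-28):
        print(p, math.exp(-n * p))   # ~0 for p = 1e-14, ~0.999999 for p = 1e-28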
DISCUSSION QUESTIONS:
(1) Physicist Enrico Fermi supported the idea of extraterrestrial life but, at lunch with a friend, made the offhand comment: "Where are they?". This became known as the Fermi paradox and caused a major reconsideration of the issue of extraterrestrial life. It even resulted in a conference called "Where are they?" How does Aczel take care of this concern?
(2) From Chance News 6.13 we read (referring to a book by Steven J. Dick): Dick cites an estimate by physicist Harold Morowitz that the probability of creating a bacterium -- the simplest living organism -- through random molecular collisions is 1 in 10^100,000,000. Fred Hoyle raises this chance to a more optimistic 1 in 10^40,000. Biochemist Robert Shapiro estimates that the probability of chance formation of a short strand of self-replicating RNA is considerably greater -- as "large" as 1 in 10^992. How do you think Aczel would respond to these estimates?
<<<========<<
It appears that this problem originated with the physicist James Jeans, who wrote on page 32 of his book "An Introduction to the Kinetic Theory of Gases":
A man is known to breathe out about 400 c.c. of air at each breath, so that a single breath of air must contain about 10^22 molecules. The whole atmosphere of the earth consists of about 10^44 molecules. Thus one molecule bears the same relation to a breath of air as the latter does to the whole atmosphere of the earth. If we assume that the last breath of, say, Julius Caesar has by now become thoroughly scattered through the atmosphere, then the chances are that each of us inhales one molecule of it with every breath we take. A man's lungs hold about 2000 c.c. of air so that the chances are that in the lungs of each of us there are about five molecules from the last breath of Julius Caesar.
To answer the Mt. Washington question we note that, from Jeans' remarks, the probability that a particular molecule in our breath came from Caesar's last breath is 10^22/10^44 = 1/10^22. Thus the probability that none of the 10^22 molecules in our breath came from Caesar's last breath is (1 - 1/10^22)^(10^22), which is approximately 1/e = .368, resulting in about a 63% chance that you have at least one of the molecules from Caesar's last breath.
It is assumed here that all the molecules that were in the atmosphere in Caesar's time are still around. We have checked this with two experts on the chemistry of the atmosphere, Richard P. Wayne at Oxford University and Akkihebbal Ravishankara at NOAA (National Oceanic and Atmospheric Administration). They both commented that air is about 78% nitrogen (by volume) and nitrogen is very stable, so that almost all of the nitrogen in the atmosphere in Caesar's day will still be there today. If we restrict ourselves to nitrogen molecules, we find that there is still a 54% chance that our breath will contain one of these molecules.
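For readers who want to check these figures, here is a short Python sketch of the calculation, using the numbers quoted above (10^22 molecules per breath, 10^44 in the atmosphere, 78% nitrogen) and the approximation (1 - 1/N)^(xN) = exp(-x):

    import math

    breath, atmosphere = 1e22, 1e44
    p = breath / atmosphere                  # chance a given molecule is from Caesar's last breath
    print(1 - math.exp(-p * breath))         # about .632: at least one molecule
    print(1 - math.exp(-p * 0.78 * breath))  # about .542: nitrogen molecules only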
DISCUSSION QUESTIONS:
(1) How did Jeans estimate that there are about 5 molecules in our lungs?
(2) We have also assumed that the molecules in the atmosphere in Caesar's days have become well mixed. Do you think this is a reasonable assumption?
NOTE: This famous problem has found its way into several articles and books. Paulos used it in his best seller "Innumeracy". In an article "Thoughts on Innumeracy: Mathematics versus the world? (Mathematical Monthly, October 1993, p 732) Peter Renz argued that Paulos' answer was wrong and further that, by not pointing out the assumptions needed to solve the problem, Paulos was himself contributing to innumeracy. In the same article, Paulos replied to the criticism and asked Renz to "lighten up."
Chapter 12 of "The Fermi Solution: Essays on Science" by Hans Christian von Baeyer (Random House, 1993) is entitled "Caesar's Last Breath." The chapter starts by describing Edward Tufte's work on improving the graphical display of data. He quotes Tufte's goal of helping "to give visual access to the subtle and the difficult--that is,[to reveal] the complex." Von Baeyer sees this as directly analogous to the science writer's goal of giving verbal access to the subtle and difficult.
In this spirit, Von Baeyer views the "Caesar's Last Breath" example as a lucid illustration of the enormity of Avogadro's number (6.02 x 10^23). According to von Baeyer, if you used a bottle with the same volume as your lungs to measure the volume of the atmosphere, then the number of bottles counted would equal the number of molecules in one breath. Each figure is about one-tenth of Avogadro's number. "This," he says, "is probably the best verbal description that I have encountered of a large number."
Other examples described in this chapter are "The Royal Chessboard" illustration of geometric growth (one grain of rice on the first square, two on the second, four on the third...) and Einstein's elevator analogy for describing general relativity theory.
<<<========<<
It is a well-known phenomenon that any two people in the U. S. (and indeed, in the world) can be 'connected' to each other by a relatively short chain of friendships. This idea has been the subject of a play, Six Degrees of Separation, and a game, the Kevin Bacon game. You can learn about this game at www.cs.virginia.edu/oracle/. You will find probabilist David Griffeath in their hall of fame.
The network of friendships has at least one property that is not shared by 'random' networks.
A network is just a set of points, with certain pairs of points connected by lines (usually called edges). Networks can be used to model situations such as the one above, where the points represent people and the edges represent friendships between pairs of people. They are also used to model situations such as power- transmission grids and neural regions in animals.
A network is 'random' if, for any two points, there is an edge between them with probability p, where p is a constant for the entire network. In addition, the presence or absence of edges between different pairs are usually taken to be mutually independent events.
Given two points in a network, we can define the shortest path between them in the obvious way. We can then average the lengths of the shortest paths over all pairs of points, obtaining a value that we will call L.
Another parameter that can be defined for a network is something we will call C. For each point in the network, the set of neighbors of that point is just the set of points that are connected to the given point by an edge. For each point P, we can calculate the fraction of possible edges between points in the set of neighbors of P that actually exist. If we average this over all points in the network, we obtain C.
The friendship network has a large value of C and a small value of L. That L is small might seem somewhat surprising, in view of the fact that the size of the network is large. If one tries to model this behavior with a random network, one finds that, for small values of p (the edge probability), C is small rather than large. We note that p must be small, since most pairs of people are not friends. The article is mute on whether L is small for small values of p, but in this reviewer's opinion, L will still be fairly small, even when p is small. So L probably does not provide a reason for rejecting the random network as a model.
Duncan Watts and Steven Strogatz, at Cornell University, have created a model that has relatively few edges and has high C and low L values. This model, called the 'small-world' model, can be constructed as follows: One starts with the points spaced around a circle, and one connects each point to the k nearest points on the circle, where k is a parameter. Then one looks at each edge in this network, and, with probability p, changes one of the endpoints to another, randomly chosen point. It turns out that doing this, even for small values of p, introduces some 'long-range' links into the network, thus drastically decreasing the value of L. Nonetheless, for small values of p, the value of C does not change much by this randomization procedure.
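Here is a rough Python sketch (our own, using only the standard library) of the construction just described, which also computes C and L as defined above; the choices n = 300, k = 10 and the grid of p values are arbitrary.

    import random
    from collections import deque

    def small_world(n, k, p):
        """Ring of n points, each joined to its k nearest points (k/2 on each
        side); each edge is then rewired with probability p.  Duplicate edges
        created by rewiring simply collapse, which is fine for a sketch."""
        edges = set()
        for i in range(n):
            for j in range(1, k // 2 + 1):
                edges.add(frozenset((i, (i + j) % n)))
        rewired = set()
        for e in edges:
            a, b = tuple(e)
            if random.random() < p:
                b = random.randrange(n)
                while b == a or frozenset((a, b)) in rewired:
                    b = random.randrange(n)
            rewired.add(frozenset((a, b)))
        adj = {i: set() for i in range(n)}
        for e in rewired:
            a, b = tuple(e)
            adj[a].add(b)
            adj[b].add(a)
        return adj

    def clustering(adj):
        """C: average fraction of possible edges among a point's neighbors
        that actually exist."""
        total = 0.0
        for v, nbrs in adj.items():
            d = len(nbrs)
            if d >= 2:
                links = sum(1 for a in nbrs for b in nbrs if a < b and b in adj[a])
                total += links / (d * (d - 1) / 2)
        return total / len(adj)

    def avg_path_length(adj):
        """L: average shortest-path length over all reachable pairs (BFS)."""
        total, pairs = 0, 0
        for s in adj:
            dist, queue = {s: 0}, deque([s])
            while queue:
                v = queue.popleft()
                for w in adj[v]:
                    if w not in dist:
                        dist[w] = dist[v] + 1
                        queue.append(w)
            total += sum(dist.values())
            pairs += len(dist) - 1
        return total / pairs

    for p in (0.0, 0.01, 0.1):
        net = small_world(300, 10, p)
        print(p, round(clustering(net), 3), round(avg_path_length(net), 2))

With p = 0 this is just the regular ring lattice; already at p = 0.01 the printed value of L should drop noticeably while C barely moves, which is the point of the construction.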
The article points out that this 'small-world' network model can be used to model certain networks from real life. As an example, the neural network of the worm Caenorhabditis elegans (which has the distinction of being the only organism whose neural network is completely known) has small L and large C values.
DISCUSSION QUESTIONS:
(1) In a random network on n points with edge probability p, what is the expected value of C? (Hint: every edge in a random network has probability p of existing.)
(2) In the regular network, before randomization occurs, what is C? (Hint: draw a picture of a portion of the circle that contains a point and its neighbors. Calculate the proportion of possible edges between neighbors of the given point that are actually there.)
(3) Now suppose that p is very small and apply the randomization procedure to the regular network. Explain why the edges between the neighbors do not change very much. Explain why this means that the value of C does not change much.
(4) In the regular network, with parameter k, what is the value of L? (Hint: Given a point P in the network, explain why the length of the shortest path between P and another point Q which is t positions away from P on the circle is about 2t/k. Then average this value for t from 1 to n/2.)
(5) Explain why it is reasonable that the randomization procedure in the creation of the 'small-world' network should significantly lower the value of L.
<<<========<<
In this article, science writer Robert Matthews argues that the reliance on classical p-values and tests of significance in medical trials leads to over-optimistic conclusions, providing false hopes and public confusion. He begins the article by giving a number of examples where clinical trials suggested major breakthroughs in preventing and treating major diseases, such as heart disease, and where the treatment was later shown to be of very little value.
He then goes on to explain, in very general terms, how significance testing can easily double the apparent effectiveness of a drug. He is referring to the fact that saying a drug produced an effect significant at the 5% level is often misinterpreted as saying that there is only a 5% chance that the drug is not effective. For the latter probability it is necessary to use a Bayesian analysis, which quite generally gives a significantly higher probability that the drug has no effect.
For example, we often ask students in a Chance course to see if they can tell the difference between Pepsi and Coke. We might give them ten trials where, on each trial, the choice of Pepsi and Coke is randomly made. Suppose that we tell them they will establish their claim if they succeed in telling the difference on 8 or more of the 10 trials. The chance of doing this by guessing is .055 so we could reject the hypothesis of guessing at the .055 level if they get 8 or more correct.
For a Bayesian analysis we need to assume a prior distribution for the probability p that the student gets a trial correct. Assume, for example, that we assign probability 1/2 to the hypothesis that the student is guessing, p = 1/2, and distribute the remaining probability uniformly between 1/2 and 1. Then if the student gets exactly 8 correct,
P(guessing | 8 correct) = (1/2)(.044) / ((1/2)(.044) + Integral{1/2, 1} C(10,8) p^8 (1-p)^2 dp) = approximately .2,
where .044 = C(10,8)(1/2)^10 is the chance of exactly 8 correct when guessing. Thus the probability that the student was guessing is more than three times the significance level .055.
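Here is a short Python sketch of this Bayesian calculation (ours, with the integral done numerically): prior mass 1/2 on guessing (p = 1/2), the other 1/2 uniform on (1/2, 1).

    from math import comb

    def likelihood(p, n=10, k=8):
        # chance of exactly k correct out of n trials when each trial succeeds with probability p
        return comb(n, k) * p ** k * (1 - p) ** (n - k)

    steps = 20_000
    dx = 0.5 / steps
    # midpoint rule for the integral over the uniform part of the prior (density 1 on (1/2, 1))
    alt = sum(likelihood(0.5 + (i + 0.5) * dx) * dx for i in range(steps))
    guess = 0.5 * likelihood(0.5)
    print(guess / (guess + alt))   # roughly .2, versus a significance level of about .055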
That a Bayesian analysis quite generally makes the null hypothesis more likely than the p value might suggest has been pointed out many times in the literature, starting in the 60's with a review article by Savage and colleagues in Psychological Review, Vol. 70, No. 3, p. 193 (1963). When these authors were ignored, the concerns were repeated in the 80's by Berger and Sellke in a review article in JASA, Vol. 82, p. 112 (1987). Now Matthews is attempting to bring the battle to the public.
Matthews' most charitable explanation of the excessive confidence in p values is the objection to using subjective probabilities. However, as he points out, the above authors establish lower bounds for P(null hypothesis | data) that are significantly greater than the significance level, for large classes of prior distributions. Matthews' less charitable explanation is that classical methods lead to more apparently successful experiments, which in turn leads to more support for the researchers' work.
Matthews discusses all this in detail in a more technical working paper "Facts versus Factions: the use and abuse of subjectivity in scientific research" prepared for the European Science and Environment Forum.
DISCUSSION QUESTIONS:
(1) Do you think that Matthews, by bringing this debate out into the public, will have any more luck than previous Bayesian statisticians have had?
(2) Why do you think that Bayesian statistics is rarely taught in a beginning statistics course? Do you think it should be?
(3) In our example, we actually know from past experiments that only about 20% of the students can establish significance. What would happen to our probability P(guessing | data) if we assume that the prior probability of guessing is .2 and the rest of the prior probability is uniformly distributed over 1/2 to 1?
<<<========<<
Fred Hoppe sent us the URL of WebMath where we find the infamous birthday problem as a poll.
You are asked to vote on the following:
There are 23 people in a room. Would you consider it a good bet that two or more of these 23 people have the same birthday?
When we looked at the site, 45% had answered yes and 55% no. After you vote you can see the answer computed and the conclusion:
So, the odds that 2 or more of the 23 people in the room have the same birthday is about 50%. Odds as good as betting heads on a coin toss! Yes--It's a safe bet!
DISCUSSION QUESTIONS:
(1) Do you regard this as a "safe bet"?
(2) Are you surprised by the number who consider it a good bet?
<<<========<<
Steve Simon is a statistician who works at Children's Mercy Hospital in Kansas City, where he helps with the planning of medical research studies. He also runs statistics training classes and has developed web pages from some of his teaching materials: STATS.
One feature you will find there is "Ask Doctor Mean." Doctor Mean is Steve's alter ego who answers questions from people in need of non-technical explanations for statistical terms. Given that so much of the statistics we read in the news concerns medical studies, Steve's web site should be a welcome reference!
<<<========<<
The gap between black students and white students on standardized tests, such as the SAT, has long been a source of concern for educators. A study sponsored by the Mellon Foundation tried to identify traits shared by high-scoring students. The study looked at questionnaire data from the College Board from some 100,000 blacks who took the SAT in 1996, focusing on those with a combined math and verbal score of 1200 or better.
As a group, high-scoring blacks tended to be economically advantaged as compared to lower-scoring blacks, though they were still not as well-off as high-scoring whites. There was some effect of school-type, with students from private and parochial schools scoring higher. But the biggest difference between high and low scorers was not school type, but rather the specific courses taken and extracurricular activities pursued. The majority of high scorers had taken calculus and honors English. Also high scorers were more likely to participate in extracurricular activities that have an intellectual focus, such as journalism and debate.
Stanford psychologist Claude Steele warns that we should not focus exclusively on SATs. He estimates that the SAT measures only 18% of the things that influence freshman grades. Still, he concedes that standardized tests continue to play an important role in admissions and says that, as long as the tests are around, it is important that blacks do well on them.
A data table accompanying the article presents the following figures comparing African Americans with whites.
African-Americans                          all    >1200 SAT
Percent taking calculus                    11%       59%
Percent taking honors English              28%       76%
Percent of neighbors with B.A.             18%       26%
Median income of neighborhood (in $1000s)  34.8      41.6

Whites                                     all    >1200 SAT
Percent taking calculus                    23%       58%
Percent taking honors English              40%       74%
Percent of neighbors with B.A.             25%       30%
Median income of neighborhood (in $1000s)  43.5      47.3
DISCUSSION QUESTIONS:
(1) What is meant by the statement that the SAT exam "measures only 18% of the things that influence freshman grades"?
(2) What similarities and differences (between African Americans and Whites) do you see in the data presented in the table?
<<<========<<
This article reports on a new theory on what triggers wars. This theory is being advanced by two psychologists, Christian G. Mesquida and Neil I. Wiener, at York University in Toronto. In a nutshell, the theory says that wars are triggered by societies that are 'bottom-heavy with young, unmarried and violence-prone males.'
The researchers considered the pattern of wars and rebellions over the last decade, and compared this to the population demographics. It was found that there was a strong correlation between the set of societies in which wars and rebellions had occurred and those societies in which there was a large population of unmarried males between the ages of 15 and 29.
Evolutionary biology seeks to explain much of human behavior, both individual and aggregate, by considering whether the behavior imparts any evolutionary advantage. In this case, the explanation is made that war is 'a form of intrasexual male competition among groups, occasionally to obtain mates but more often to acquire the resources necessary to attract and retain mates.' Of course, the theory applies to the offensive, and not the defensive, side in a war. For example, the United States was drawn into World War II, so, in this case, this theory would apply to Germany, Italy, and Japan, rather than the U. S.
Almost half of the countries in Africa have young, unmarried male populations in excess of 49 percent of the entire population. In the last 10 years, there have been at least 17 major civil wars in countries in Africa, along with several conflicts involving more than one country. In contrast, in Europe there are almost no countries with the young, unmarried male population making up as high a percentage as 35, and in the last 10 years there has been only one major civil war (in Yugoslavia, which has more than 42 percent young, unmarried males).
DISCUSSION QUESTION:
Is it possible that the two variables 'probability of conflict' and 'percentage of unmarried males between the ages of 15 and 29' are positively correlated but not causally related? Can you think of some other variables that might give rise to high values in both of these variables?
<<<========<<
A letter from Judith Alexander of Chicago reads as follows: "A reader asked if you believe that spelling ability is a measure of education, intelligence or desire. I was fascinated by the survey you published in response. The implication of the questions is that you believe spelling ability may be related to personality. What were the results? I'm dying to know."
The "biggest news," according to Marilyn, is that poor spelling has no relationship to general intelligence. On the other hand, she is sure that education, intelligence or desire logically must have something to do with achieving excellence in spelling. But her theory is that, even if one has "the basics," personality traits can get in the way of good spelling.
She bases her conclusions on results of a write-in poll, to which 42,063 of her readers responded (20,188 by postal mail and 22,415 by e-mail). Participants were first asked to provide a self-assessment of their spelling skills, on a scale of 1 to 100. They then ranked other personal traits on the same scale. For each respondent, Marilyn identified the quality or qualities that were ranked closest to spelling. She considers this quality to be most closely related to spelling ability. She calls this process "self-normalization," explaining that matching up the ratings within each individual respondent overcomes the problem that respondents differ in how accurately they can assess their own abilities relative to those of other people.
The trait that she found most frequently linked to spelling in this analysis was: ability to follow instructions. Next was "ability to solve problems," followed by "rank as an organized person." The first two were related very strongly for strong spellers but hardly related at all for weak spellers. She reports that only 6% of the weak spellers ranked their ability to follow instructions closest to their spelling ability, and only 5% ranked their ability to solve problems closest to their spelling ability. On the other hand the relationship with organizational ability showed up at all spelling levels, with top spellers being the most organized, and weak spellers being the least organized.
Marilyn says she asked for a ranking of leadership abilities in order to validate her methods. She did not believe this trait was related to spelling, and, indeed, leadership was linked least often to the spelling in the data. Similarly, creativity also appeared to be unrelated to spelling ability.
The article includes a sidebar discussing how Marilyn dealt with the limitations in her polling methodology. Because of possible response bias, and because the ratings are self-assessments, she did not report "n percent of Americans are good spellers."
DISCUSSION QUESTIONS:
(1) How does the self-assessment scheme allow Marilyn to distinguish weak spellers from strong spellers?
(2) What other concerns do you have about the polling methods used here? Do the qualifications noted in the last paragraph adequately respond to them?
(3) In what sense do the results on the leadership trait "validate" Marilyn's methodology?
(4) Marilyn asserts that ability to follow instructions is a personality trait interacting with general intelligence, whereas ability to solve problems represents general intelligence alone, and organization is a personality trait alone. How do you think she arrived at this classification? Do you agree with her?
<<<========<<
Linda Gillick was convinced that there must be a reason why, in an ordinary patch of New Jersey seashore, an unusual number of children, including her son, had similar types of cancer. She organized a movement which has persuaded the state to make a comprehensive environmental study of her town, Toms River, and of surrounding Dover Township in Ocean County. The township has had 90 cases of childhood cancers over a 12-year period, compared with the 67 cases that New Jersey health officials said could be expected.
Cancer clusters have been difficult to document with standard statistical analysis, and even when "a true excess" is found it is hard to explain, according to scientists at the National Cancer Institute.
However, the public demands have led to studies of particular clusters. This, in turn, has led to an increased effort to integrate two sets of data that are readily available: toxic chemical releases and geographic variations in incidences of cancer and other diseases. Attempts are being made to look at the "big picture" and try to see if there are patterns of risk that should be looked at.
DISCUSSION QUESTIONS:
(1) The well-known expert Bruce Ames, who directs the National Institute of Environmental Health Sciences Center at Berkeley, is quoted as saying: "There are so many types of cancer and birth defects that just by chance alone you'd expect any one of them to be high in any one little town." Do you think he actually said this? What did he mean to say?
(2) How would you go about trying to decide if the Cancer Cluster in Dover Township could be attributed to chance?
<<<========<<
A probability distribution is said to obey a power law if there are constants C and k such that the distribution has the form Cx^k. This article examines the distribution of the sizes of forest fires, both by using real data concerning fires in the United States and Australia, and by simulation. In both cases, it is seen that the distribution obeys a power law.
In the case of actual data, it is shown that if we let f(A) denote the number of fires per year of area A, then f(A) is approximately equal to CA^k, where k ranges from -1.49 to -1.31, depending upon which data set is used. The data come from such places as Alaska, the western United States, and Australia. One of the data sets covers the time period from 1150 to 1960 A. D. (much of this data was gathered by looking at tree rings).
The simulation of forest fires was carried out as follows. Trees are planted on a square grid at successive time steps, and, after a certain fixed number of steps, a lighted match is dropped on the grid. The match burns the tree (if any exists) at the site where it drops, and the fire spreads through adjacent trees, until no adjacent unburned trees exist. The fixed number of steps between match drops is called the sparking frequency, and changing this parameter affects the distribution of fire sizes. One can think of a low sparking frequency as corresponding to the policy of fire suppression (as was followed in Yellowstone National Park and other places) before the 1988 Yellowstone fire. Not surprisingly, the number of large fires increases if the sparking frequency decreases.
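Here is a minimal Python sketch of this kind of simulation (our own illustration; the grid size, run length, and sparking frequency are arbitrary choices, and the details may differ from the model used in the article).

    import random
    from collections import Counter, deque

    N = 100                 # the grid is N x N
    SPARK_EVERY = 200       # "sparking frequency": steps between match drops
    STEPS = 400_000

    grid = [[False] * N for _ in range(N)]      # True = tree present
    fire_sizes = Counter()

    for step in range(1, STEPS + 1):
        # plant a tree at a randomly chosen site (no effect if one is already there)
        i, j = random.randrange(N), random.randrange(N)
        grid[i][j] = True
        if step % SPARK_EVERY == 0:
            # drop a lighted match at a random site
            i, j = random.randrange(N), random.randrange(N)
            if grid[i][j]:
                # burn the whole connected cluster of trees
                burned, q = 0, deque([(i, j)])
                grid[i][j] = False
                while q:
                    x, y = q.popleft()
                    burned += 1
                    for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                        nx, ny = x + dx, y + dy
                        if 0 <= nx < N and 0 <= ny < N and grid[nx][ny]:
                            grid[nx][ny] = False
                            q.append((nx, ny))
                fire_sizes[burned] += 1

    # Tabulate fire counts in doubling size bins; on a log-log plot the counts
    # should fall roughly on a straight line, i.e. a power law.
    lo = 1
    while fire_sizes and lo <= max(fire_sizes):
        count = sum(c for s, c in fire_sizes.items() if lo <= s < 2 * lo)
        print(f"fires of size {lo}-{2 * lo - 1}: {count}")
        lo *= 2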
The results reported on in this article could be used in several ways. The frequencies of small- and medium-sized fires can be used to estimate the frequency of very large fires (this method is used to predict the frequencies of large earthquakes). Also, one can use the power-law distribution to help decide whether it is better to suppress small fires or to let them burn.
DISCUSSION QUESTIONS:
(1) Show that if f(A), the number of fires per year of area A, is of the form CA^k, where k < -1, then most of the area that is burned by all of the fires is burned in small fires.
(2) Suppose that you were given a list of forest fires that burned in a given area during a given time period, together with the amount of area burned for each fire. How would you estimate C and k in the power-law model discussed above?
<<<========<<
This article discusses various problems that can occur when the public is asked to deal with predictions. The first is the problem of relying on point estimates of unknown parameters without taking into account the margin of error of the estimate. An example of this occurred in the spring of 1997, when the Red River of the North was rising quickly. The National Weather Service forecast that the river would crest at 49 feet. Unfortunately, the river actually crested at 54 feet. Because many of the residents of Grand Forks, N.D. relied on the point estimate of 49 feet, they eventually faced a situation that required them to leave their homes quickly. In this case, it would have been much better for the weather service to have broadcast an error bar along with the point estimate. In this reviewer's opinion, it is entirely possible that the weather service did publish an error bar (this is not made clear in the article), but the public failed to pay sufficient attention to this piece of information.
Another example in which insufficient attention is paid to the size of the possible error is in the field of global warming. The models used to predict the rise in the average surface temperature of the Earth in the next century are full of assumptions and uncertainty. Yet, when the results are published for public consumption, too little emphasis is placed on the sizes of the possible prediction errors.
With some types of predictions, people have become used to the idea that the predictions are not always right. An example of this is the prediction of the amount of rain that will fall in a given area in the next 24 hours. Earthquake prediction has been so inaccurate that no attempt is made at present to predict any individual earthquake; instead, long-range earthquake potentials for a given region are published.
Another situation in which predictions should be viewed with a jaundiced eye occurs when the group that is making the prediction uses the prediction to make policy, money, or both. For example, if a lumber company predicts that clear-cutting a section of forest will have minimal adverse effects on the river that runs through the forest, it is clear-cut that this prediction, if used to form policy, will have a positive effect on the company's bottom line.
DISCUSSION QUESTIONS:
(1) Do you think that in some cases it might be better to leave the point estimate out entirely, when publishing a prediction, and instead just to publish a range of possibilities?
(2) Do you think the media or the National Weather Service was to blame for the Grand Forks problem?
(3) Can you think of other areas where predictions could be improved by including the probable error range?
(4) Some weather experts feel that instead of forecasting the high and low temperature they should forecast probable intervals for these quantities. Do you think this would be a good idea?
<<<========<<
This is a discussion with mathematicians and science writers about why there is so little mathematics reported in the media. Its interest to Chance News comes from remarks of Persi Diaconis.
Diaconis joins David Freedman and other statisticians in expressing concern that the sampling methods proposed by the Census Bureau for the undercount problem are flawed and might do more harm than good. He indicates that their objections are falling on deaf ears.
In the discussion about how mathematics can become more attractive to the media, Diaconis remarks that talking about serious probability problems such as "how many times should you shuffle a deck of cards to have them well mixed" can be livened up by throwing in some magic tricks. He provides the following example:
Suppose I mailed you a deck of cards and wrote: Dear Ira, it was exciting to be on the program and I thought you might like to see a card trick. Notice I've enclosed a deck of cards with this letter. If you don't mind, take the cards out of their case, give them a cut, give them a riffle-shuffle, so you cut them about in half and you go rrrip the way you do, you know. Give them another cut, give them another shuffle, give them a few more cuts. Take the top card off, look at it, remember it, poke it back into the deck, and give the deck another shuffle. Give it a few more cuts, mail me back the deck. Oh yes, and at 7 o'clock every evening concentrate on your card.
And you mail me back the deck and in a week you get back a note saying it was the six of hearts. And that was the card you picked. So that's a sort of amazing trick.
FLATOW: How do you do that?
The deck I send you isn't just any old mixed up deck, it's arranged. And just for purposes of this sentence, let's say it's ace of hearts, deuce of hearts, three of hearts from the top on down. I might do something more subtle 'cause you might look. Now, picture a deck of cards being riffle-shuffled once. What do you do? You cut off about half the cards and you kind of riffle them together. So the top half of the deck gets interlaced with the bottom half of the deck and the deck has two rising sequences. That is, the cards in the top half of the deck are still in their same relative order, the ace is above the deuce, and the cards in the bottom half are still in the same order, but the two halves are mixed together.
After two shuffles there are four rising sequences, and after three shuffles, there are eight rising sequences. The chosen card, poked back into the deck at a random spot, will typically form a ninth rising sequence of length one.
So what do I do? I play solitaire. I start turning them up one at a time. Say the top card is the seven of spades. If the next card's the eight of spades, I play it on the seven. If it's another card, the nine of hearts, I start a new pile.
And if you do that, just undoing the piles in order, what you'll find is you get eight piles, each about an eighth of the deck, and a ninth pile of size one, which would be the card.
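Here is a rough Python simulation of the trick (our own sketch: the cuts are ignored, and the standard Gilbert-Shannon-Reeds model is used for a riffle shuffle); running it gives an estimate of how often the solitaire procedure pins down the chosen card uniquely.

    import random

    def riffle(deck):
        """One Gilbert-Shannon-Reeds riffle shuffle."""
        cut = sum(random.random() < 0.5 for _ in deck)   # binomial cut point
        left, right = deck[:cut], deck[cut:]
        out = []
        while left or right:
            if random.random() < len(left) / (len(left) + len(right)):
                out.append(left.pop(0))
            else:
                out.append(right.pop(0))
        return out

    def trick_succeeds():
        deck = list(range(52))            # 0, 1, 2, ... is the arranged order
        for _ in range(2):
            deck = riffle(deck)
        chosen = deck.pop(0)              # spectator looks at the top card...
        deck.insert(random.randrange(len(deck) + 1), chosen)   # ...and pokes it back
        deck = riffle(deck)               # one more shuffle
        # "Solitaire": put each card on a pile whose top card immediately
        # precedes it in the arranged order; otherwise start a new pile.
        piles = {}                        # top card -> pile size
        for c in deck:
            if c - 1 in piles:
                piles[c] = piles.pop(c - 1) + 1
            else:
                piles[c] = 1
        singletons = [top for top, size in piles.items() if size == 1]
        return singletons == [chosen]     # success only if the chosen card stands alone

    trials = 20_000
    print(sum(trick_succeeds() for _ in range(trials)) / trials)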
DISCUSSION QUESTION:
Is Persi's trick sure to work? If not, how likely is it to fail?
<<<========<<
Here are our answers (with help from Peter) to his first two questions mentioned at the beginning of this Chance News.
Peter is using a money management system due to J. L. Kelly (A new interpretation of information rate, Bell System Technical Journal, 35 (1956)). Kelly was interested in finding a rational way to invest your money when faced with one or more investment schemes, each of which has a positive expected gain. He did not think it reasonable to try simply to maximize your expected return. If you did this in the Motley Fool example as suggested by Mark Brady, you would choose the risky investment and might well lose all your money the first year. We will illustrate what Kelly did propose in terms of the Motley Fool example.
We start with an initial capital, which we choose for convenience to be $1, and invest a fraction r of this money in the risky investment and a fraction 1-r in the sure thing. Then for the next year we use the same fractions to invest the money that resulted from the first year's investments, and we continue in this way, using the same r each year. Assume, for example, that in the first and third years we win the gamble and in the second year we lose it. Then after 3 years our investment returns an amount f(r) where
f(r) = (1.2r + 1.05(1-r))(1.05(1-r))(1.2r + 1.05(1-r)).
After n years, we would have n such factors, each corresponding to a win or a loss of our risky investment. Since the order of the terms does not matter our capital will be
f(r,n) = (1.2r + 1.05(1-r))^W * (1.05(1-r))^L
where W is the number of times we won the risky investment and L the number of times we lost it. Now Kelly calls the quantity
G(r) = lim(n->oo) log(f(r,n))/n
the exponential rate of growth of our capital. In terms of G(r), our capital should grow like a^n where a = e^(G(r)). In our example,
log(f(r,n))/n = (W/n) log(1.2r + 1.05(1-r)) + (L/n) log(1.05(1-r)).
Since we have a 90% chance of winning our gamble, the law of large numbers tells us that as n tends to infinity this will converge to
G(r) = .9 log(1.2r + 1.05(1-r)) + .1 log(1.05(1-r)).
It is a simple calculus problem to show that G(r) is maximized by r = .2 with G(.2) = .05183. Then exp(.05183) = 1.0532, showing that our maximum rate of growth is 5.32% as claimed by Peter.
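A quick numerical check of this claim (our own sketch, using a grid search rather than calculus):

    from math import exp, log

    # G(r) = .9*log(1.2r + 1.05(1-r)) + .1*log(1.05(1-r)), maximized over r in [0, 1).
    def G(r):
        return 0.9 * log(1.2 * r + 1.05 * (1 - r)) + 0.1 * log(1.05 * (1 - r))

    best_r = max((i / 10_000 for i in range(10_000)), key=G)
    print(best_r, G(best_r), exp(G(best_r)))   # about 0.2, 0.05183, 1.0532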
The attractiveness of the Kelly investment scheme is that, in the long run, any other investment strategy (including those where you are allowed to change your investment proportions at different times) will do less well. More precisely, "in the long run" and "less well" mean that the ratio of your return under the Kelly scheme to your return under any other strategy will tend to infinity.
For the second question Peter asks us to use the Kelly system to "put to rest" the St. Petersburg paradox. Recall that this paradox arises by considering a game where you toss a coin until the first time heads turns up and, if this occurs on the kth toss, you are paid off 2^k dollars. Since the expected winning is infinite you should be willing to pay any amount to make a sequence of plays of this game. In fact we have found that few people are willing to pay even $10 to play.
Peter has in mind our considering the St. Petersburg paradox as an opportunity to invest in a St. Petersburg stock and to re-invest our money at the end of each period (say a year). We assume that a unit of St. Petersburg stock costs c dollars and in a single time period it pays back 2^k dollars with probability 1/2^k. Now suppose that we start with $1 and decide to invest a fraction r of our capital in St. Petersburg stock, and in successive periods continue to invest a proportion r of our capital at the beginning of each year. Then after n years our capital is
f(r,n) = Product over k of ((1-r) + (r/c)2^k)^(n(k)),
where n(k) is the number of periods in which the stock paid off 2^k. Note that r/c is the number of shares of stock we are able to buy with the proportion of our capital we are investing. Then
log(f(r,n))/n = Sum over k of (n(k)/n) log((1-r) + (r/c)2^k).
As n tends to infinity this tends to
G(r,c) = Sum over k of (1/2^k) log((1-r) + (r/c)2^k).
Assume now that c = 10. Using Mathematica we find that this is maximized at r = .0072 and G(.0072,10) = .00102. Thus the effective rate of return for this investment is exp(.00102) = 1.00102. Therefore, if the cost of a share of St. Petersburg stock is $10 we should invest a very small fraction of our money in this stock, namely, about .72 percent, and our returns will grow at a rate of about .1 percent. If the cost is only $5 then we find that we should invest 17.88 percent of our money and we would get an effective return of 4.37 percent. Thus under the Kelly scheme of investment there is a value to the St. Petersburg game for any cost of playing and there is no longer a paradox.
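Here is a Python sketch (ours) of this calculation; the infinite sum defining G(r,c) is truncated at k = 60, which does not affect the digits shown, and a grid search stands in for Mathematica's optimizer.

    from math import exp, log

    def G(r, c, kmax=60):
        # invest a fraction r in the stock (r/c shares at c dollars each), hold the rest as cash
        return sum((0.5 ** k) * log((1 - r) + (r / c) * 2 ** k)
                   for k in range(1, kmax + 1))

    for c in (10, 5):
        best_r = max((i / 10_000 for i in range(10_000)), key=lambda r: G(r, c))
        print(c, round(best_r, 4), round(exp(G(best_r, c)), 4))
    # prints roughly: 10  0.0072  1.001   and   5  0.1788  1.0437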
Of course, the answer to why Grinstead and Snell did not put this solution in their probability book is that they did not know anything about this money management system.
DISCUSSION QUESTIONS:
(1) The Kelly system is an asymptotic result. It is interesting to see how it would do in a small number of investment intervals. For this we have to look at the actual distribution of our returns on our investment. In our first example, if we invest 20% of our money (starting with $1) in the risky stock as Kelly suggests, we find the following distribution for the effective per-period growth rate of our capital (the 10th root of our final capital) after 10 time periods.

Number of successes   Probability   Effective growth rate
         0               .0000           .84
         1               .0000           .8614
         2               .0000           .8833
         3               .0000           .9058
         4               .0001           .9288
         5               .0015           .9525
         6               .0112           .9767
         7               .0574          1.0016
         8               .1937          1.0271
         9               .3874          1.0532
        10               .3487          1.08

Expected effective growth rate = 1.053
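The table above can be reproduced with a few lines of Python (our own sketch):

    from math import comb

    # With r = .2, W wins in 10 periods give final capital 1.08^W * 0.84^(10-W);
    # the effective per-period growth rate is its 10th root.
    expected = 0.0
    for W in range(11):
        prob = comb(10, W) * 0.9 ** W * 0.1 ** (10 - W)
        rate = (1.08 ** W * 0.84 ** (10 - W)) ** 0.1
        expected += prob * rate
        print(W, round(prob, 4), round(rate, 4))
    print("expected effective growth rate:", round(expected, 3))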
Would you be satisfied with a 26% chance of your doing less well than the safe investment?
(2) Paul Samuelson wrote an article, The "fallacy" of maximizing the geometric mean in long sequences of investing or gambling (Proc. Nat. Acad. Sci., Vol. 68, No. 10, pp. 2493-2496), in which he criticizes the Kelly criterion. He says that people think, incorrectly, that the Kelly result also means that their expected effective return should be bigger under the Kelly scheme than under any other investment scheme. How would you calculate the expected effective return for our Motley Fool example? What value of r maximizes the expected long-run effective return in this example? What do you think about this strategy? What do you think about Samuelson's objection?
This work is freely redistributable under the terms of the GNU General Public License as published by the Free Software Foundation. This work comes with ABSOLUTELY NO WARRANTY.
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!