The following paper was submitted as my final project for a Hampshire College cognitive science course. It has since been revised into a second draft, and has been edited and reformatted significantly from its original version for publication on this site.
"Numbers never lie" is a phrase etched into our collective minds at a young age. Unfortunately, this could not be further from the truth, especially in regard to statistics. In fact, one professor went so far as to say that statistics is often classified as a subcategory of lying (Moore, 1985). As powerful tools of inference, statistical tests are able to give us insights from the data we collect about the world around us. However, these insights are generalizations and are open to interpretation; just as a translation can only ever be an interpretation, statistical results are only interpretations of data and variables.
Before this paper begins, it is important to start with a disclaimer. By no means am I stating that the statistical inferences we have today are the "be-all and end-all" of data assessment. If anything, they are just another perspective from which to assimilate data, one that makes the data easier to "work" with. For instance, Student's t-test is much more conservative in terms of data analysis than the z-test, and although there are criteria for using each, even these criteria can be vague. One textbook I have states that you should use the t-test when you have a sample size under 30 (Diez, 2012). But this is by no means a standardized rule, and can be up to the researcher's discretion.
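To make that difference concrete, here is a minimal Python sketch (assuming numpy and scipy are available; the sample is randomly generated and purely illustrative) comparing the p-values a t-test and a z-test assign to the very same data. The t-distribution's heavier tails always produce the larger, more cautious p-value, which is why the t-test is called conservative.

```python
import numpy as np
from scipy import stats

# A small hypothetical sample (n = 10); every number is invented.
rng = np.random.default_rng(0)
sample = rng.normal(loc=0.6, scale=1.0, size=10)

n = len(sample)
se = sample.std(ddof=1) / np.sqrt(n)
stat = sample.mean() / se  # test statistic against a null mean of 0

# Two-sided p-values under each reference distribution.
p_t = 2 * stats.t.sf(abs(stat), df=n - 1)  # t-test: heavier tails
p_z = 2 * stats.norm.sf(abs(stat))         # z-test: normal tails

print(f"t-test p-value: {p_t:.4f}")
print(f"z-test p-value: {p_z:.4f}")  # always smaller than the t p-value
```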
As a result of this amalgamation of what statistics can mean, different groups of people perceive their intended use differently: statistical analyses mean very different things to the layperson, the media, students, and academics. I would like to explore what these perceptions are. Before that, I would also like to present some ways in which statistics can be altered maliciously in order to suit the needs of a researcher. While these are not the only ways one can be dishonest in the academic literature, presenting them will demonstrate how easy such feats can be.
A very simple way one could change statistically insignificant results into statistically significant results would be to change a two-tailed test into a one-tailed test. Doing so would lower the p-value (the probability of getting a result at least as extreme as yours if the null hypothesis is true), possibly changing the outcome of your study. This is dishonest because you need to start off with a good reason for using a one- or two-tailed test. One-tailed tests are for studies where we know that there will be no sort of adverse result, or studies where we are not interested in that result for a valid reason. For example, if we were testing the effects of meditation on blood pressure, we would not expect meditation to increase a subject's blood pressure. As a result, your alpha level (the threshold your p-value needs to fall under in order to reject the null) sits entirely on one side of the distribution and would be five percent. However, if we were to test some new way for students to study, we would want to see whether our independent variable increased or decreased subjects' scores. Therefore we would split the 0.05 alpha level between the two tails of the distribution, so that each tail gets 0.025. Suppose we ran our "new study method" experiment and the probability of our result in the observed tail came out to 0.033: not significant against the 0.025 two-tailed criterion we committed to, but significant against the 0.05 one-tailed criterion. Deciding after the fact to recast our study as a one-tailed experiment would be dishonest.
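As an illustration, here is a small Python sketch of how one experiment can straddle the significance threshold depending on which tail convention is reported. The group means, sample sizes, and random seed are all invented for demonstration:

```python
import numpy as np
from scipy import stats

# Hypothetical exam scores: a control group vs. students trying
# the "new study method" (all numbers invented for illustration).
rng = np.random.default_rng(7)
control = rng.normal(loc=70, scale=10, size=20)
new_method = rng.normal(loc=75, scale=10, size=20)

# The two-tailed test asks: "did scores change in either direction?"
t_stat, p_two = stats.ttest_ind(new_method, control)

# The one-tailed p-value for "scores increased" is half the
# two-tailed value whenever the observed effect points that way.
p_one = p_two / 2 if t_stat > 0 else 1 - p_two / 2

print(f"two-tailed p = {p_two:.3f}")
print(f"one-tailed p = {p_one:.3f}")
# Choosing between these two lines after seeing the data is the dishonest move.
```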
One of the more obvious ways to misinterpret statistics is to literally change the perspective of the visual representation of your data. In the following example, the slices closest to us would be considered larger, because they appear larger.
[Figure: a three-dimensional pie chart viewed at an angle, which inflates the apparent size of the nearest slices (Chartingcontrol.com)]
A verbal example of the same phenomenon would be how you express your figure. Would you like to say that you had a one percent return on sales, a fifteen percent return on investment, a ten-million-dollar profit, or a sixty percent decrease from last year (Huff, 1954)? This topic alone could be covered in volumes (such as the one just cited); however, I am trying to show that there are many different ways one can misrepresent and misinterpret statistical information. One could also write multiple papers on each of these subjects (committing logical fallacies such as asking loaded questions, throwing out data, manipulating data, using biased samples, etcetera). I just wanted to show how one can present data in such a way that it would appear justified to an uninformed reader.
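To see how all four of Huff's framings can describe the same company at the same moment, consider this short Python sketch; the revenue, investment, and profit figures are invented solely so that the percentages work out:

```python
# One hypothetical company, four truthful-sounding framings
# (all dollar figures invented for illustration).
revenue = 1_000_000_000        # total sales
investment = 66_700_000        # capital invested
profit = 10_000_000            # this year's profit
last_year_profit = 25_000_000  # last year's profit

print(f"return on sales:      {profit / revenue:.0%}")     # 1%
print(f"return on investment: {profit / investment:.0%}")  # 15%
print(f"profit:               ${profit:,}")                # $10,000,000
change = (profit - last_year_profit) / last_year_profit
print(f"change vs. last year: {change:.0%}")               # -60%
```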
There are many real-world examples of studies that misrepresent their data or have had their data misrepresented by others, whether through malice or simple mistake. Duncan MacDougall's attempt in the early twentieth century to find the weight of the human soul is something that has entered Americans' minds as a facet of pop-science. MacDougall reported that the human soul might have an average weight of about twenty-one grams, but the truth is that his methodology was seriously lacking. MacDougall had a sample size of only six, two of which were discarded, and with the rest there was trouble determining the exact time of death. The weight of subjects often fluctuated after death, and contemporaries provided a laundry list of physiologically plausible alternative reasons why the subjects would have lost weight (Evans, 1947). Nevertheless, MacDougall continued his research, and his results found no difference in weight with dying dogs.
Still today, MacDougall's research is something others have related to me with conviction as scientific fact. I personally had a high-school history teacher relate this faux factoid to us. "A scientist once measured the human soul," she began. A few students backed her up; they too had heard about the doctor who was able to weigh the human soul, and because he was a scientist, there had to be some validity to the point being made. While this example is mostly harmless, being confined to everyday "did you know?" situations, misinterpreted studies can become toxic in the realm of news media.
Take the television show Ancient Aliens as another example many should be familiar with. While there is virtually no manipulation of statistics in the show, there is vast falsification of information and misinterpretation of current archaeological evidence, producing absurd claims such as that the pyramids were carved with lasers and that ancient peoples had the ability to build aircraft (Heiser, 2012). An example more related to neuroscience and statistics would be the studies on average brain size often distributed by ignorant racists to support their belief that certain races of people are more intelligent. Citing such papers (an example would be Witelson, 2005), these racists claim that other races have smaller brains, and then insinuate that this somehow means they must be less intelligent. Beyond the fact that the results of such studies are almost always either insignificant or too vague to support a solid conclusion, it is a red herring to say that brain size is related to or correlated with intelligence. However, this is not an inference that an average layperson may make, and they could be led to believe that certain bigoted viewpoints have some sort of validity. Here is a different example from May of this year (2012 at the time of submission): CNBC ran a story with the headline "The Inflation of Life - Cost of Raising a Child Has Soared" (CNBC). The article states that the cost of raising a child has risen 25%. What it does not state is that this rise can be accounted for by ten years of inflation (Olitsky, 2012).
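The arithmetic behind that omission is simple compounding: a modest annual inflation rate, sustained for a decade, produces a rise of about 25% on its own. A quick Python check, using an assumed rate of 2.3% per year (an illustrative figure, not the official CPI):

```python
# Compound an assumed 2.3% annual inflation rate over ten years.
annual_rate = 0.023
years = 10

cumulative = (1 + annual_rate) ** years - 1
print(f"cumulative rise over {years} years: {cumulative:.1%}")  # about 25.5%
```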
Students are also prone to these mistakes. Anyone who spends some time in a psychology class is eventually bound to hear something along the lines of "therefore this study proved X." I am not accusing these students of being malicious, and it would be unfair to hold them to the same standard to which we should hold the media, specifically news outlets and a television channel that claims to have history as its primary subject matter. This may partially be due to the fact that statistics is rarely taught outside of higher education, and even there it may not be offered or required outside of certain programs. One may argue over why a high-school student or English major would need to take statistics, but when statistics are used every day, a more general understanding of what they mean and how they are used is important.
We can take a step further and examine who exactly is teaching these statistics courses. Chamont Wang, the author of the book Sense and Nonsense of Statistical Inference, claims that many researchers do not even understand the statistical methods they are applying. He cites a paper estimating that, at the time, roughly half of the articles published in medical journals used statistical methods incorrectly (Wang, 1993; Glantz, 1980). And at this level of academic research, a major problem is conflict of interest. Many professors are in a situation where they are pressured to publish significant research in order to keep their jobs. Dubbed "publish or perish," this environment is lethal to the integrity of science, as it puts many academics in situations where they can easily manipulate their data just to keep their jobs or to continue their funding. A simple Google search of "scientific misconduct" will yield thousands of studies that have been deemed invalid or fraudulent in the last decade alone.
In closing, there are an exhausting number of ways to misrepresent data, and what has been presented here is a brief overview that does not even begin to scratch the surface of misinterpreting statistics. I would have liked to expand further on each of these subjects, but I do not know that I would have known when to stop. While this paper has been rather bleak, I would like to note that one of the beautiful things about science is that it is always open to discussion and criticism, and in due time we may hopefully be able to cast out instances of bad research as examples of what is not to be done.
Works Cited
Ancient Aliens Debunked.
Dir. Mike Heiser. N.p., 30 Sept. 2012. Web. 5 Dec. 2012.
Diez, David M., Christopher D. Barr, and Mine Çetinkaya-Rundel.
OpenIntro Statistics. Lexington,
KY: CreateSpace, 2012. Print.
Evans, Bergen. The Natural History of Nonsense. New York: A. A. Knopf, 1947. Print.
Glantz, S. A. "Biostatistics: How to Detect, Correct and Prevent Errors in the Medical Literature." Circulation 61 (1980): 1-7. Print.
Huff, Darrell, and Irving Geis. How to Lie with
Statistics. New York:
Norton, 1954. Print.
Moore, David S. Statistics:
Concepts and Controversies. San
Francisco: W.H. Freeman, 1985. Print.
Olitsky, Morris. "Misuse of Statistics a National Problem." Amstat News, 1 Aug. 2012. Web. 10 Dec. 2012.
Pie3d. N.d. Photograph. Charting Control. ChartingControl.com, 2012. Web. 9 Dec. 2012.
"The Inflation of Life - Cost of
Raising a Child Has Soared." Yahoo! Finance. CNBC, 7 May 2012. Web.
12 Dec. 2012.
Wang, Chamont. Sense and Nonsense of Statistical
Inference: Controversy, Misuse, and Subtlety. New York: Marcel Dekker, 1993. Print.
Witelson, S. F.
"Intelligence and Brain Size in 100 Postmortem Brains: Sex, Lateralization
and Age Factors." Brain 129.2 (2005): 386-98. Print.