Nowadays, we are all becoming aware that using statistics in the media is a good way to persuade people. The only problem is that statistics can be used to manipulate data in whatever way we want. As a result, people who believe statistics from the media are in danger of being manipulated. Some of the techniques for data manipulation are going to be uncovered in this article. Mark Twain once said: “there are three kinds of lies: lies, damned lies, and statistics”.
So the main argument of this essay is: “statistical analysis is a mathematical way of making some inference about the data or summarizing it, hence data analysed using formal methods is unbiased”. In the first passage I would like to discuss the fact that averages hide a lot of information. Afterwards, I would like to present the case of O. J. Simpson and the incorrect interpretation of conditional probabilities, which is a good example of unwarranted assumptions, in particular black-and-white thinking. The final argument is how important it is to make correct assumptions, that is, to avoid false or misleading presuppositions.
The first argument is that: “there are several types of averages, such as median, mode and mean, so averages hide a lot of information”. Even though in most cases the arithmetic mean is used as the average, the other two averages can be used as well. To prove the argument above, let’s assume we are given a collection of values {1,2,3,4,20,20,20}. The arithmetic mean is the sum of all values in the collection divided by the number of values. The mode is the most frequent value in the collection. The median is the middle value once all the values in the collection are sorted.
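The three definitions above can be checked with a short Python sketch. It uses the standard `statistics` module and the collection of values from this essay; the variable names are my own.

```python
from statistics import mean, median, mode

# The collection of values used in the essay.
values = [1, 2, 3, 4, 20, 20, 20]

# Three different "averages" of the same data.
averages = {
    "mean": mean(values),      # sum of the values divided by their count
    "median": median(values),  # middle value of the sorted collection
    "mode": mode(values),      # most frequent value
}

for name, value in averages.items():
    print(f"{name}: {value}")
```

Each of the three numbers printed could honestly be reported as “the average” of the same data.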
In this particular example, the arithmetic mean, mode and median are 10, 20, and 4 respectively. Therefore, any of these three values can be reported as the average. Let’s also consider Simpson’s paradox, where the rate for the aggregate is very different from the rates for the sub-groups, which is another good illustration that averages hide a lot of information. As a result, we have at least two examples of how statistics can mislead people. The second argument is that: “the majority of people are not aware of how to use formal methods, so common sense is a good substitute for them”.
This argument is unsound because it is invalid, as will be shown below. A good example is the O. J. Simpson murder case: Simpson, a former American football star and actor, was brought to trial for the 1994 murder of his ex-wife and her friend. All the evidence was against him. The main statement used to defend O. J. Simpson was as follows: according to official statistics from a reliable source for that period, only 30% of husbands who beat their wives go on to kill them. Therefore, the majority of the jury members were under the impression that 70% of husbands who beat their wives don’t kill them.
This claim played a crucial role in the final decision of the jury, and O. J. Simpson was acquitted in 1995. Let’s analyse this case more formally, instead of using common sense. For that we need to introduce some concepts and theorems from probability theory. Conditional probability is quite a simple concept in probability theory, even though there are cases when people misinterpret it, which leads to invalid conclusions. The conditional probability P(A|B) is the probability of event A occurring given that event B has occurred.
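To make the definition concrete, here is a minimal sketch that computes conditional probabilities by counting outcomes, using P(A|B) = P(A and B) / P(B). The fair six-sided die and the helper names `prob` and `cond_prob` are my own illustration and have nothing to do with the trial.

```python
from fractions import Fraction

# Faces of a fair six-sided die, all equally likely.
outcomes = range(1, 7)

def prob(event):
    """Probability of an event, given as a predicate over outcomes."""
    favourable = [o for o in outcomes if event(o)]
    return Fraction(len(favourable), len(outcomes))

def cond_prob(a, b):
    """P(A|B) = P(A and B) / P(B)."""
    return prob(lambda o: a(o) and b(o)) / prob(b)

def is_six(o):
    return o == 6

def is_even(o):
    return o % 2 == 0

p_six_given_even = cond_prob(is_six, is_even)  # P(six | even)
p_even_given_six = cond_prob(is_even, is_six)  # P(even | six)

print(p_six_given_even, p_even_given_six)
```

The two results differ, which shows that P(A|B) and P(B|A) are not interchangeable. Confusing them is one of the common ways conditional probabilities get misinterpreted.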
The theorem that we are going to use in our analysis is the Law of Total Probability, which states that, for events A and B, the probability of A is the probability of A given B, weighted by the probability of B, plus the probability of A given not B, weighted by the probability of not B, i.e.
P(A) = P(A|B) P(B) + P(A|not B) P(not B)
The statistic used by the defence concerns only husbands who beat their wives, so every probability below is conditional on the beating. The fact that the statistic leaves out is that Simpson’s ex-wife was murdered. Applying the Law of Total Probability to that event, we get:
P(murdered|beat) = P(murdered|killed, beat) P(killed|beat) + P(murdered|not killed, beat) P(not killed|beat)
Here “killed” means that the husband killed his wife, so P(murdered|killed, beat) = 1.
By the axioms of probability, the probabilities of an event and its complement add up to 1. Therefore,
1 = P(killed|beat) + P(not killed|beat)
Substituting the statistic used for defending O. J. Simpson in the trial gives P(not killed|beat) = 1 - 0.3 = 0.7, which is the figure the jury had in mind. However, if we want to calculate the probability that Simpson is innocent, we must also condition on the murder. Writing m = P(murdered|not killed, beat) for the probability that a beaten wife is murdered by someone other than her husband, the definition of conditional probability gives
P(not killed|beat, murdered) = P(murdered|not killed, beat) P(not killed|beat) / P(murdered|beat)
Hence,
P(not killed|beat, murdered) = 0.7 m / (0.3 + 0.7 m)
This probability tends to zero as m tends to zero, that is, when beaten wives are rarely murdered by anyone other than their husbands. In the example above I showed that the probability of O. J. Simpson being innocent can tend to zero when we interpret conditional probabilities in a formal way, instead of using our common sense. So now we can apply the “should” pattern as follows. Using formal methods for interpreting conditional probabilities, instead of common sense, will achieve a justified conclusion. When it comes to interpreting conditional probabilities, formal methods are the best known way to achieve a justified conclusion. All things considered, using formal methods for interpreting conditional probabilities (and achieving a justified conclusion) is better than not achieving a justified conclusion.
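The effect of conditioning on the murder can also be checked numerically. The sketch below takes this essay’s 30% figure for P(killed|beat) as given. The parameter `m`, standing for P(murdered|not killed, beat), and the values tried for it are hypothetical, chosen only for illustration; they are not figures from the trial. The function name is my own.

```python
def p_innocent_given_murder(p_kill_given_beat, m):
    """P(husband did not kill | he beat her, she was murdered).

    m is the assumed probability that a beaten wife is murdered by
    someone other than her husband (a hypothetical input, not trial data).
    """
    p_not_kill = 1.0 - p_kill_given_beat
    # Law of Total Probability: P(murdered | beat).
    # If the husband killed her, she was murdered with probability 1.
    p_murdered = p_kill_given_beat * 1.0 + p_not_kill * m
    # Definition of conditional probability (Bayes' theorem).
    return p_not_kill * m / p_murdered

# Hypothetical values of m, from large to small.
for m in (0.5, 0.1, 0.01, 0.001):
    print(m, round(p_innocent_given_murder(0.3, m), 4))
```

Under these assumptions, the probability of innocence shrinks towards zero as `m` gets smaller. It only equals the jury’s 0.7 in the extreme case `m = 1`.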