Small samples can have a significant effect on the analysis, as a single anomalous result or piece of data can completely change the outcome. When presenting data from a small sample, the mean can be distorted by an outlier: if, for example, a set of results all lie close to fifteen but one value is forty-nine or four, and the deviation from the mean is not taken into account, the mean will not reflect the true results and will therefore be misleading.
Larger sample sizes are more desirable because they better reflect the likely range within the population; a single error, whether due to technician error, equipment error, the learning effect or biological variance (see Appendix 3), is less likely to distort the analysis. A drawback of large samples is that the median will not always be reliable, as it may not represent a true average of the scores and cannot account for a large data range (A.H.R.Q., 2003; Gratton and Jones (c), 2004; h2g2, 2003).
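The effect of an outlier on the mean, and the contrast with the median, can be sketched with a hypothetical set of results like the one described above (values clustered near fifteen with a single score of forty-nine):

```python
from statistics import mean, median

# Hypothetical results: most scores cluster near fifteen, one outlier at forty-nine.
scores = [14, 15, 15, 16, 15, 14, 16, 49]

print(mean(scores))    # the outlier pulls the mean well above the cluster: 19.25
print(median(scores))  # the median stays near the true cluster of results: 15.0
```

Here the mean (19.25) misrepresents a data set in which every value but one lies between fourteen and sixteen, whereas the median (15.0) does not.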
Many factors must be taken into account when viewing and researching statistics: although figures do not lie, people know how to lie through figures. When reviewing data, four questions must be asked: Who carried out the study? Who was being studied? Where was the study taken from? What is the data being compared to? If the viewer is unsure whether the statistical analysis is trustworthy, they may choose to carry out research themselves to compare their own findings with the original study, or to seek a second opinion from a statistical analyst, a professional in the area of the study, or the author of the study.
Appendix 1 – Example of how an association between two variables is not evidence that one causes the other

When people choose a gym to attend, they will mostly choose the one with the lowest monthly fees, as people like encouraging numbers to back up a decision. Although the figures appear best for one gym, the statistic does not reveal other factors that may persuade an individual towards one company, such as a compulsory joining or cancellation fee, extra class costs, employee/environment friendliness, facilities, opening times or any extra costs for peak times.
People will assume that because that specific cost is advertised, it is the best deal currently available; statistics are most commonly misinterpreted by people who do not understand how to read them.

Appendix 2 – Sampling Techniques

Random sampling involves every member of the population having an equal chance of being selected for the study, providing the most reliable results for analysis.
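Simple random sampling can be sketched as follows; the population of one hundred numbered participants is hypothetical, and the seed is fixed only so the sketch is repeatable:

```python
import random

random.seed(1)  # fixed seed so the sketch is repeatable

# Hypothetical population of 100 numbered participants.
population = list(range(1, 101))

# Simple random sample: every member has an equal chance of selection,
# and no member can be selected twice.
sample = random.sample(population, 10)
print(sample)
```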
Stratified random sampling is used if only certain subgroups of the population are required. For example, if a study compared how often children in school years five to six in public schools and children in school years five to six in state schools play on a computer console in a week, stratified random sampling could be used to exclude children from school years reception to four and years seven and above. This method ensures that only the required populations are studied, reducing timescales and possible expenditure.
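A minimal sketch of this stratified approach, assuming a hypothetical school register of (name, school type, year group) records: pupils outside years five and six are excluded first, and each school type is then sampled separately so both subgroups are represented:

```python
import random

random.seed(2)  # fixed seed so the sketch is repeatable

# Hypothetical register: (name, school_type, year_group) records.
register = [
    ("A", "public", 5), ("B", "public", 6), ("C", "public", 3),
    ("D", "state", 5), ("E", "state", 6), ("F", "state", 8),
    ("G", "public", 5), ("H", "state", 6), ("I", "public", 6),
]

# Keep only the required subgroups (years five and six), grouped by school type.
strata = {}
for name, school, year in register:
    if year in (5, 6):
        strata.setdefault(school, []).append(name)

# Draw a random sample from each stratum independently.
sample = {school: random.sample(pupils, 2) for school, pupils in strata.items()}
print(sample)
```

Pupils C (year three) and F (year eight) never enter the strata, so time and expense are not spent studying populations the research question does not cover.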
Cluster sampling samples groups selected at random, rather than individuals. For example, a study of attitudes towards intimidatory behaviour in an under-sixteen netball league could use cluster sampling. Teams would be selected at random, and all the players in each selected team would then be studied, rather than selecting a random sample of individuals from across the league. By performing the study in this way, the researcher could identify relationships between the teams, their positions in the league, and their attitudes towards intimidatory behaviour and the reasons behind them.
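The netball example can be sketched as below; the league, team names and player lists are all hypothetical. Whole teams are the sampling units, and every player in a selected team is studied:

```python
import random

random.seed(3)  # fixed seed so the sketch is repeatable

# Hypothetical under-sixteen league: team name -> list of players.
league = {
    "Team A": ["p1", "p2", "p3"],
    "Team B": ["p4", "p5", "p6"],
    "Team C": ["p7", "p8", "p9"],
    "Team D": ["p10", "p11", "p12"],
}

# Select whole teams (clusters) at random...
chosen_teams = random.sample(list(league), 2)

# ...then study every player in each selected team,
# rather than sampling individuals from across the league.
respondents = [player for team in chosen_teams for player in league[team]]
print(chosen_teams, respondents)
```

Because respondents stay grouped by team, team-level comparisons (league position against attitude, for instance) remain possible in the analysis.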
Systematic sampling selects from the population using a fixed, systematic method. For instance, instead of questioning every player in the league, the researcher can question every third and eighth name on the list, provided that the names on the list are in random order (Gratton and Jones (b), 2004).

Appendix 3 – Possible Error Techniques

Technician error refers to the reliability of the researcher's technique; as the duration of the experiment increases, the results become less reliable (Williams and Wragg (c), 2004).
Machines may produce different results for two identical samples, resulting in misleading results through equipment error; when using machines to record data, the machines must always be tested against a known constant before each use. For example, if weighing scales are the measuring device, a fixed weight such as a five kilogram dumbbell can be used before each test to ensure that the machine is working correctly, providing identical results for identical samples (Kose et al., 2007; Williams and Wragg (c), 2004). A test of human performance will almost always portray the learning effect.
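The dumbbell check above can be sketched as a simple calibration test; the function name, known weight and tolerance are all assumptions for illustration, not values from the source:

```python
# Hypothetical calibration check against a known constant.
KNOWN_WEIGHT_KG = 5.0  # e.g. a five kilogram dumbbell
TOLERANCE_KG = 0.05    # assumed acceptable drift before re-calibration

def is_calibrated(reading_kg, known_kg=KNOWN_WEIGHT_KG, tolerance=TOLERANCE_KG):
    """Return True if the machine reproduces the known constant."""
    return abs(reading_kg - known_kg) <= tolerance

print(is_calibrated(5.02))  # within tolerance: True
print(is_calibrated(4.80))  # equipment error, flag before testing subjects: False
```

Running the same check before every session means a drifting machine is caught before it contaminates the data rather than after.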
For example, the more often the respondent carries out the technique or strategy, the more they will improve each time, even though the improvement can be unrelated to the concept being tested. An example of this is the Cooper Run, where aerobic fitness is tested; the respondent may improve their performance by learning to run at an optimal pace throughout the process, so although their aerobic fitness may be unchanged, their performance will increase. To try to prevent this error occurring, the researcher should allow the respondent to become familiar with the study by practising the Cooper Run before being tested (Williams and Wragg (c), 2004).
Biological variance needs to be controlled by managing certain conditions relating to the subjects, due to biological processes that occur within the human body. For example, body weight varies throughout the day depending on diet and fluid intake; the time of day the test is conducted can be controlled, or the diet and fluid intake of the subject can be measured, to help prevent biological variance creating error in the analysis (Williams and Wragg (c), 2004).