Summary Note. Introduction To Statistical Inference


INTRODUCTION TO STATISTICAL INFERENCE

Population and sample


• A population is the entire group that you want to draw conclusions about.
• A sample is the specific group that you will collect data from. The size of the sample is always less
than the total size of the population.
• In research, a population doesn’t always refer to people. It can mean a group containing elements
of anything you want to study, such as objects, events, organizations, countries, species,
organisms, etc.
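
The distinction can be sketched in a few lines of Python. The population here is synthetic (simulated heights, not real data), and the names `population` and `sample` are illustrative assumptions only.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is reproducible

# Hypothetical population: 10,000 simulated heights in cm (not real data).
population = [random.gauss(170, 10) for _ in range(10_000)]

# The sample is the subset we actually measure, drawn without replacement.
sample = random.sample(population, 200)

population_mean = statistics.mean(population)  # usually unknown in practice
sample_mean = statistics.mean(sample)          # what we can actually compute

print(f"population size: {len(population)}, sample size: {len(sample)}")
print(f"population mean: {population_mean:.1f} cm")
print(f"sample mean:     {sample_mean:.1f} cm")
```

In real research the population mean is not available, which is why the sample mean is used to draw conclusions about it.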

Hypotheses
• A hypothesis is a statement that can be tested by scientific research. If you want to test a
relationship between two or more things, you need to write hypotheses before you start
your experiment or data collection.

Hypothesis Testing

1. State your null and alternate hypothesis


- After developing your initial research hypothesis (the prediction that you want to
investigate), it is important to restate it as a null (Ho) and alternate (Ha) hypothesis so that
you can test it mathematically.
- The alternate hypothesis is usually your initial hypothesis that predicts a relationship
between variables. The null hypothesis is a prediction of no relationship between the
variables you are interested in.
- You want to test whether there is a relationship between gender and height. Based on
your knowledge of human physiology, you formulate a hypothesis that men are, on
average, taller than women. To test this hypothesis, you restate it as:
- Ho: Men are, on average, not taller than women.
- Ha: Men are, on average, taller than women.
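
In symbols, the same pair of hypotheses can be written as below. The notation is an assumption introduced here, not taken from the sources: the two \mu terms stand for the population mean heights of men and women.

```latex
H_0:\ \mu_{\text{men}} \le \mu_{\text{women}}
\qquad
H_a:\ \mu_{\text{men}} > \mu_{\text{women}}
```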

2. Collect Data
- For a statistical test to be valid, it is important to perform sampling and collect data in a
way that is designed to test your hypothesis. If your data are not representative, then you
cannot make statistical inferences about the population you are interested in.
- To test differences in average height between men and women, your sample should have
an equal proportion of men and women, and cover a variety of socio-economic classes
and any other variables that might influence average height.
- You should also consider your scope (Worldwide? For one country?). A potential data
source in this case might be census data, since it includes data from a variety of regions
and social classes and is available for many countries around the world.
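
One way to obtain equal numbers of men and women is stratified sampling. The sketch below is only illustrative: the sampling frame is simulated, and the field names (`id`, `gender`, `region`) and the helper `stratified_sample` are hypothetical.

```python
import random
from collections import defaultdict


def stratified_sample(records, key, per_group, seed=0):
    """Draw `per_group` records at random from each stratum defined by `key`."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for record in records:
        strata[record[key]].append(record)
    sample = []
    for group in sorted(strata):
        # random.sample raises ValueError if a stratum has too few records.
        sample.extend(rng.sample(strata[group], per_group))
    return sample


# Hypothetical sampling frame of 1,000 people (simulated, not census data).
frame_rng = random.Random(42)
frame = [
    {
        "id": i,
        "gender": frame_rng.choice(["man", "woman"]),
        "region": frame_rng.choice(["north", "south", "east", "west"]),
    }
    for i in range(1000)
]

sample = stratified_sample(frame, key="gender", per_group=100)
counts = {g: sum(1 for r in sample if r["gender"] == g) for g in ("man", "woman")}
print(counts)
```

The same helper could be applied to `region` or any other variable that might influence height, assuming each stratum has enough records.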

3. Perform a Statistical Test


- There are a variety of statistical tests available, but many common ones are based on comparing
within-group variance (how spread out the data are within a category) with between-group
variance (how different the categories are from one another).
- If the between-group variance is large enough that there is little or no overlap between
groups, then your statistical test will reflect that by showing a low p-value. This means that
a difference this large would be unlikely to arise by chance alone if the null hypothesis were true.
- Alternatively, if there is high within-group variance and low between-group variance, then
your statistical test will reflect that with a high p-value. This means that the difference
you measured between groups is consistent with chance variation.
- Your choice of statistical test will be based on the type of data you collected.
- Based on the type of data you collected, you perform a one-tailed t-test to test whether
men are in fact taller than women. This test gives you:
• an estimate of the difference in average height between the two groups.
• a p-value showing how likely you are to see a difference at least this large if the null
hypothesis of no difference is true.
- Your t-test shows an average height of 175.4 cm for men and an average height of 161.7
cm for women, with an estimated range for the true difference from 10.2 cm to infinity.
The p-value is 0.002.
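
The comparison of between-group difference and within-group spread can be sketched with a Welch-style t statistic, using only the Python standard library. The data are simulated and will not reproduce the figures quoted above. As a stated simplification, the p-value and the lower bound use a normal approximation rather than the exact t distribution, which is reasonable only for fairly large samples.

```python
import math
import random
import statistics
from statistics import NormalDist

random.seed(1)  # fixed seed so the sketch is reproducible

# Hypothetical samples of heights in cm (simulated, not real measurements).
men = [random.gauss(175, 7) for _ in range(50)]
women = [random.gauss(162, 7) for _ in range(50)]


def welch_t(a, b):
    """Return (difference in means, standard error, t statistic) for a vs b."""
    diff = statistics.mean(a) - statistics.mean(b)  # between-group difference
    # The standard error is built from the within-group sample variances.
    se = math.sqrt(statistics.variance(a) / len(a) + statistics.variance(b) / len(b))
    return diff, se, diff / se


diff, se, t_stat = welch_t(men, women)

# One-tailed p-value for Ha "men are taller": P(T >= t_stat) under the null.
# Assumption: standard normal used in place of the t distribution.
p_value = 1 - NormalDist().cdf(t_stat)

# Approximate one-sided 95% lower bound; the interval runs from here to infinity.
lower_bound = diff - NormalDist().inv_cdf(0.95) * se

print(f"estimated difference: {diff:.1f} cm")
print(f"t statistic: {t_stat:.2f}")
print(f"one-tailed p-value (normal approx.): {p_value:.3g}")
print(f"difference is at least about {lower_bound:.1f} cm")
```

A large t statistic arises when the difference between group means is large relative to the spread within the groups, which is what produces a small p-value.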

4. Decide whether the null hypothesis is supported or refuted


- Based on the outcome of your statistical test, you will have to decide whether to reject
your null hypothesis or fail to reject it.
- In most cases you will use the p-value generated by your statistical test to guide your
decision. And in most cases, your cutoff for rejecting the null hypothesis will be 0.05, that
is, when there is a less than 5% chance that you would see results at least this extreme if
the null hypothesis were true.
- In your analysis of the difference in average height between men and women, you find
that the p-value of 0.002 is below your cutoff of 0.05, so you decide to reject your null
hypothesis of no difference.
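
The decision rule itself is a single comparison. A minimal sketch follows; the function name `decide` and its wording are hypothetical, and the default cutoff of 0.05 is the conventional value mentioned above.

```python
def decide(p_value: float, alpha: float = 0.05) -> str:
    """Compare a p-value with the chosen cutoff (significance level)."""
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("p_value must lie between 0 and 1")
    if p_value < alpha:
        return "reject the null hypothesis"
    return "fail to reject the null hypothesis"


print(decide(0.002))  # prints "reject the null hypothesis"
```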

5. Present your Findings


- The results of hypothesis testing will be presented in the results and discussion sections
of your research paper.
- In the results section you should give a brief summary of the data and a summary of the
results of your statistical test (for example, the estimated difference between group means
and associated p-value). In the discussion, you can discuss whether your initial
hypothesis was supported or refuted.

P-Value
• The p-value, or probability value, tells you how likely it is that your data could have occurred under
the null hypothesis. It does this by calculating the probability of a test statistic at least as extreme
as yours, where the test statistic is the number calculated by a statistical test using your data.
• The p-value tells you how often you would expect to see a test statistic as extreme or more
extreme than the one calculated by your statistical test if the null hypothesis of that test were true.
The p-value gets smaller as the test statistic calculated from your data gets further away from the
range of test statistics predicted by the null hypothesis.
• The p-value is a proportion: if your p-value is 0.05, that means that 5% of the time you would see a
test statistic at least as extreme as the one you found if the null hypothesis were true.
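
The "proportion" reading can be made concrete with a permutation test. This is a different technique from the t-test discussed earlier, used here only as an illustration. The sketch shuffles the group labels many times and counts how often the shuffled difference in means is at least as large as the observed one. The data and the function name `permutation_p_value` are hypothetical.

```python
import random
import statistics


def permutation_p_value(a, b, n_permutations=5000, seed=0):
    """Estimate the one-tailed p-value for mean(a) - mean(b) by shuffling labels."""
    rng = random.Random(seed)
    observed = statistics.mean(a) - statistics.mean(b)
    pooled = list(a) + list(b)
    at_least_as_extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)  # under the null, group labels are interchangeable
        diff = statistics.mean(pooled[:len(a)]) - statistics.mean(pooled[len(a):])
        if diff >= observed:
            at_least_as_extreme += 1
    # Proportion of shuffles with a difference as extreme or more extreme.
    return at_least_as_extreme / n_permutations


# Hypothetical heights in cm (illustrative values, not real measurements).
group_a = [172, 180, 169, 177, 175, 181]
group_b = [160, 165, 158, 163, 167, 161]

p = permutation_p_value(group_a, group_b)
print(f"estimated one-tailed p-value: {p:.4f}")
```

A small result means that shuffled data rarely produce a difference as large as the observed one, which is the sense in which the p-value is a proportion.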

_____________________________________________________________________________________________
Sources:
https://www.scribbr.com/statistics/p-value/
https://www.scribbr.com/methodology/hypothesis-testing/
https://www.scribbr.com/methodology/population-vs-sample/
https://www.scribbr.com/research-process/hypotheses/
