1 Prepared by Drashti Jasani
Differentiate between the two general classes of significance tests. Which statistical
technique is appropriate when the testing involves two samples, the samples are
independent, and the data are interval? Why?
Data analysis is the methodical application of statistical measures to describe, analyze,
and evaluate data. Researchers analyze patterns and relationships among variables.
Univariate, bivariate, and multivariate analysis are the major statistical techniques of data analysis.
• A variable is a condition or a category that the data fall under. For instance,
the analysis may look at the demographic variable “age” or “weight.”
Univariate analysis considers one variable at a time, i.e., either “age” or
“weight.”
• The univariate method is commonly used to analyze data where
there is a single variable for each element in a data sample, or where there
are multiple variables in a data set and each is examined separately.
• The patterns that are identified from the univariate analysis can be described in the
following ways:
• Central tendency (mean, median, and mode)
• Dispersion (range, variance)
• Quartiles (interquartile range)
• Standard deviation
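The univariate measures listed above can be computed with Python's standard `statistics` module. This is only a minimal sketch: the `ages` values are made up for illustration and are not data from these notes.

```python
import statistics

# Hypothetical sample of a single variable ("age"); values are illustrative only.
ages = [23, 25, 25, 29, 31, 34, 38, 41]

# Central tendency
mean_age = statistics.mean(ages)
median_age = statistics.median(ages)
mode_age = statistics.mode(ages)

# Dispersion
value_range = max(ages) - min(ages)
variance = statistics.variance(ages)   # sample variance (n - 1 denominator)
std_dev = statistics.stdev(ages)       # sample standard deviation

# Quartiles and interquartile range (default "exclusive" method)
q1, q2, q3 = statistics.quantiles(ages, n=4)
iqr = q3 - q1

print("mean:", mean_age, "median:", median_age, "mode:", mode_age)
print("range:", value_range, "variance:", variance, "std dev:", std_dev)
print("Q1:", q1, "Q3:", q3, "IQR:", iqr)
```

Only one variable is summarized here, which is what makes the analysis univariate.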
• Inferences on population characteristics (or parameters) are often made on the basis of
sample observations, especially when the population is large and it may not be possible to
enumerate all the sampling units belonging to the population.
• In doing so, one relies on certain assumptions (or hypothetical values) about
the characteristics of the population, if some such information is available. Such a hypothesis
about the population is termed a statistical hypothesis, and it is tested on the
basis of sample values.
• The procedure enables one to decide on a certain hypothesis and test its significance.
• “A claim or hypothesis about the population parameters is known as Null Hypothesis and is
written as, H0.”
• The null hypothesis is then tested against the available evidence, and a decision is made whether to
reject it or not. If the null hypothesis is rejected, then we accept the
alternative hypothesis.
• The alternative hypothesis is written as H1. For testing a hypothesis (a test of significance) we use both
parametric tests and non-parametric (distribution-free) tests.
• Parametric tests assume certain properties of the population from which we draw samples.
• Such assumptions may be about population parameters, sample size, etc. In the case of
non-parametric tests, we do not make such assumptions.
• Non-parametric tests assume only nominal or ordinal data.
• Important parametric tests used for testing of hypotheses are:
• (i) z-test
• (ii) t-test
• (iii) χ² test; and
• (iv) F-test
• When the χ² test is used as a test of goodness of fit or
as a test of independence, it is a non-parametric test.
• As has been stated earlier, all parametric tests used for
testing of hypotheses are based on the assumption of
normality, i.e., the population is considered to be normally
distributed.
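For the case raised in the opening question (two independent samples, interval data), the parametric test usually chosen from the list above is the independent-samples t-test, because interval data allow means and variances to be computed and the samples are unrelated. The sketch below computes the pooled-variance t statistic using only the standard library. The sample values and the function name `pooled_t_statistic` are hypothetical, and the critical value 2.306 is the two-tailed 5% value for 8 degrees of freedom from a standard t table.

```python
import math
import statistics

def pooled_t_statistic(sample_a, sample_b):
    """Return (t, degrees of freedom) for a two-sample t-test with pooled variance.

    Assumes independent samples, interval data, roughly normal populations,
    and equal population variances.
    """
    n_a, n_b = len(sample_a), len(sample_b)
    mean_a, mean_b = statistics.mean(sample_a), statistics.mean(sample_b)
    var_a, var_b = statistics.variance(sample_a), statistics.variance(sample_b)

    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * var_a + (n_b - 1) * var_b) / df
    std_error = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean_a - mean_b) / std_error, df

# Hypothetical scores for two independent groups (illustrative only).
group_a = [12, 15, 14, 10, 13]
group_b = [18, 20, 17, 19, 21]

t_stat, df = pooled_t_statistic(group_a, group_b)

# Two-tailed critical value for alpha = 0.05 and df = 8, taken from a t table.
T_CRITICAL_DF8 = 2.306
reject_null = abs(t_stat) > T_CRITICAL_DF8

print("t =", round(t_stat, 3), "df =", df, "reject H0:", reject_null)
```

With large samples and a known population variance, the z-test would play the same role. If the data were only ordinal, a non-parametric test would be the safer choice.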
• The chi-square test and Analysis of Variance (ANOVA) are both inferential statistical
tests. Inferential statistics are used to determine whether the data we observe
in a sample (i.e., the data we collect) differ from what one would expect
by chance alone. Put more simply, we want to determine whether the
relationships among variables or differences between groups that we see in
our sample data also hold in the entire population.
• That said, chi square is used when we have two categorical variables (e.g.,
gender and alive/dead) and want to determine if one variable is related to
another. In ANOVA, we have two or more group means (averages) that we
want to compare. In an ANOVA, one variable must be categorical and the other
must be continuous. For example, we may want to examine if marijuana use (0
to 25 times) differs by grade level (9th grade, 10th grade, 11th grade).
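As a rough illustration of the chi-square test of independence described above, the sketch below computes the χ² statistic for a 2×2 table of two categorical variables (gender by alive/dead, as in the example). The counts and the function name `chi_square_statistic` are hypothetical. The critical value 3.841 is the 5% value for 1 degree of freedom from a standard χ² table.

```python
def chi_square_statistic(table):
    """Return (chi-square statistic, degrees of freedom) for a contingency table.

    `table` is a list of rows of observed counts. Expected counts are computed
    under the null hypothesis that the row and column variables are independent.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)

    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand_total
            chi2 += (observed - expected) ** 2 / expected

    df = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, df

# Hypothetical counts: rows = gender, columns = alive / dead.
observed_counts = [
    [30, 20],
    [20, 30],
]

chi2, df = chi_square_statistic(observed_counts)

# Critical value for alpha = 0.05 and df = 1, taken from a chi-square table.
CHI2_CRITICAL_DF1 = 3.841
related = chi2 > CHI2_CRITICAL_DF1

print("chi-square =", chi2, "df =", df, "variables related:", related)
```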
• Two-Way ANOVA
• A two-way ANOVA (also called a factorial ANOVA) refers to an ANOVA using
two independent variables. For example, a two-way ANOVA can
examine differences in IQ scores (the dependent variable) by Country
(independent variable 1) and Gender (independent variable 2). Two-way ANOVA
can be used to examine the interaction between the two independent variables.
Interactions indicate that differences are not uniform across all categories of the
independent variables. For example, females may have higher IQ scores overall
compared to males, but this difference could be greater (or less) in European
countries compared to North American countries.
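The idea of an interaction can be shown with cell means alone. The sketch below is not a full two-way ANOVA, since it involves no F-ratio or significance test. It only computes the "difference of differences" that an interaction term captures. All cell means are invented for illustration.

```python
# Hypothetical mean scores for each (region, gender) cell; illustrative only.
cell_means = {
    ("Europe", "female"): 104.0,
    ("Europe", "male"): 100.0,
    ("North America", "female"): 102.0,
    ("North America", "male"): 101.0,
}

def gender_gap(region):
    """Female-minus-male difference in mean score within one region."""
    return cell_means[(region, "female")] - cell_means[(region, "male")]

gap_europe = gender_gap("Europe")
gap_north_america = gender_gap("North America")

# If there were no interaction, the gap would be the same in both regions
# and this contrast would be zero.
interaction_contrast = gap_europe - gap_north_america

print("gap in Europe:", gap_europe)
print("gap in North America:", gap_north_america)
print("interaction contrast:", interaction_contrast)
```

A two-way ANOVA tests whether a contrast like this is larger than chance variation would produce.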
• N-Way ANOVA
• A researcher can also use more than two independent variables; this is an
n-way ANOVA (with n being the number of independent variables). For
example, potential differences in IQ scores can be examined by Country, Gender,
Age group, Ethnicity, etc., simultaneously.
• General Purpose and Procedure
• Omnibus ANOVA test:
• The null hypothesis for an ANOVA is that there is no difference
among the group means. The alternative hypothesis is that at
least one group mean differs from the others. After cleaning the
data, the researcher must test the assumptions of ANOVA.
• They must then calculate the F-ratio and the associated probability value
(p-value). In general, if the p-value associated with the F-ratio is smaller than
.05, then the null hypothesis is rejected and the alternative hypothesis is
supported.
• If the null hypothesis is rejected, one concludes that not all of the
group means are equal. Post-hoc tests tell the researcher which groups are
different from each other.
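The omnibus procedure above can be sketched for a one-way ANOVA as follows. The data (use counts for three grade levels) and the function name `one_way_anova_f` are hypothetical. Rather than computing a p-value, the sketch compares F with the 5% critical value for (2, 6) degrees of freedom, 5.14 from a standard F table, which leads to the same decision as checking p < .05.

```python
import statistics

def one_way_anova_f(groups):
    """Return (F-ratio, df_between, df_within) for a one-way ANOVA."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total

    # Variation of the group means around the grand mean.
    ss_between = sum(
        len(g) * (statistics.mean(g) - grand_mean) ** 2 for g in groups
    )
    # Variation of the observations around their own group mean.
    ss_within = sum(
        (x - statistics.mean(g)) ** 2 for g in groups for x in g
    )

    df_between = k - 1
    df_within = n_total - k
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return f_ratio, df_between, df_within

# Hypothetical use counts for three grade levels (illustrative only).
grade_9 = [2, 3, 4]
grade_10 = [4, 5, 6]
grade_11 = [6, 7, 8]

f_ratio, df_between, df_within = one_way_anova_f([grade_9, grade_10, grade_11])

# Critical value for alpha = 0.05 with (2, 6) degrees of freedom, from an F table.
F_CRITICAL_2_6 = 5.14
reject_null = f_ratio > F_CRITICAL_2_6

print("F =", f_ratio, "df =", (df_between, df_within), "reject H0:", reject_null)
```

A rejection here says only that not all group means are equal. Identifying which grades differ would still require a post-hoc test.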