Discriminant Function Analysis
Content
I.  Characterisation of Discriminant Function Analysis (DFA)
II. Assumptions underlying Discriminant Function Analysis
    1. Sample size
    2. Linearity
    3. Univariate and multivariate normality
    4. Homogeneity of variance-covariance matrices
    5. Multicollinearity
III.
IV.
V.
Reference: The SPSS practice example used here can be found in: Brace, N., Kemp, R., & Snelgar, R. (2009). SPSS for Psychologists (4th ed.). UK: Palgrave Macmillan.
The predominant aim of DFA is to predict group membership on a categorical DV from a number of predictor variables. Whilst techniques such as MANOVA investigate how groups differ, DFA identifies the variables that best discriminate between two or more groups. The discriminant functions generated by DFA are linear combinations of the predictor variables that best discriminate between the groups. Discriminant functions resemble components or factors in Factor Analysis in that the first function explains the largest amount of variance and subsequent functions account for successively smaller portions of the remaining variance. The discriminant functions are independent of each other (i.e. uncorrelated). When there are two groups, only one discriminant function is needed; where there are more than two groups, additional functions are generated and only the statistically significant discriminant functions are relevant. The generation of discriminant functions is a means of describing data sets conceptually, somewhat similar to the way Factor Analysis accounts for the structure of data sets.

As in MLRA, there are three methods of entering the predictor variables in DFA: simultaneous, hierarchical and stepwise. As with MLRA, the method used is determined by the theoretical rationale underpinning the analysis. DFA performs significance tests for each function; in addition, it can generate one-way ANOVAs to examine the group differences on each predictor variable.
Discriminant Function Analysis is a two-stage process:
1. Analysis Stage, in which the discriminant function(s) that best differentiate(s) between the groups is/are derived. This stage can be used for purely descriptive purposes.
2. Classification Stage, which utilises the discriminant function(s) to identify group membership for cases of previously unknown membership.
The outlined two-stage process is analogous to MLRA, which first identifies the equation that best predicts the criterion variable and subsequently uses that equation to predict the criterion (DV) for particular cases. As a mathematical procedure, DFA assigns group membership based on how an individual's scores on the set of predictor variables relate to the group centroids.
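The two stages can also be sketched outside SPSS. The following is a minimal illustration using scikit-learn in Python with made-up data (not the prison data set used later in this handout): fitting the model corresponds to the analysis stage, predicting membership for new cases to the classification stage.

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

# Illustrative data: 60 cases, 3 predictor variables, 2 groups (0/1)
rng = np.random.default_rng(42)
X = rng.normal(size=(60, 3))
y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=1.0, size=60) > 0).astype(int)

# Analysis stage: derive the discriminant function (a linear combination
# of the predictors) that best separates the two groups.
lda = LinearDiscriminantAnalysis()
lda.fit(X, y)
print("Function coefficients:", lda.coef_)                    # weights of the linear combination
print("First discriminant scores:", lda.transform(X)[:5].ravel())  # one function for two groups

# Classification stage: use the derived function to assign group
# membership to cases of previously unknown membership.
new_cases = rng.normal(size=(3, 3))
print("Predicted group membership:", lda.predict(new_cases))
```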
2. Linearity As with most multivariate statistical techniques applied in Psychology and the Social Sciences, DFA assumes linear relationships between the predictor variables and between the predictor variables and the criterion (outcome) variable. The degree of linearity within each separate group or category can be assessed by inspecting correlation matrices and scatterplots.
3. Univariate and multivariate normality Like all multivariate techniques, DFA is sensitive to outliers. Univariate outliers can be examined using the Explore command in SPSS. Multivariate outliers can be investigated using the Mahalanobis distances generated in MLRA. The Mahalanobis distance statistic is evaluated against the chi-square distribution, with degrees of freedom equal to the number of predictor variables. Mahalanobis distances measure how much a case's values on the predictor variables differ from the average of all cases, with large distances identifying cases with extreme values. Chi-square tables can be used to look up the critical values for such extreme cases (with an alpha level of .001). Large sample sizes increase robustness; robustness can be assumed with n ≥ 20 in the smallest group and a parsimonious number of predictor variables.
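As a sketch of this outlier check (assuming the predictor scores sit in a NumPy array X), Mahalanobis distances can also be computed directly and compared against the chi-square critical value with df equal to the number of predictors at alpha = .001:

```python
import numpy as np
from scipy.stats import chi2

# X: cases x predictors matrix (illustrative data; in practice use your own predictors)
rng = np.random.default_rng(1)
X = rng.normal(size=(100, 4))

mean_vec = X.mean(axis=0)
cov_inv = np.linalg.inv(np.cov(X, rowvar=False))
diff = X - mean_vec

# Squared Mahalanobis distance of each case from the centroid of all cases
d2 = np.einsum('ij,jk,ik->i', diff, cov_inv, diff)

# Critical value: chi-square with df = number of predictors, alpha = .001
critical = chi2.ppf(1 - 0.001, df=X.shape[1])
outliers = np.where(d2 > critical)[0]
print(f"Critical value: {critical:.2f}; multivariate outliers at rows: {outliers}")
```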
4. Homogeneity of variance-covariance matrices Box's M can be used to assess the assumption of homogeneity of the variance-covariance matrices. Box's M must be non-significant; in light of the sensitivity of this test, an alpha level of .001 is recommended.
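SPSS reports Box's M directly (it is requested in the SPSS steps later in this handout). Purely as an illustration of what the test compares, the sketch below implements the usual chi-square approximation from the group covariance matrices; it is not a substitute for the SPSS output.

```python
import numpy as np
from scipy.stats import chi2

def box_m(groups):
    """Chi-square approximation to Box's M for a list of (n_i x p) score matrices."""
    k = len(groups)
    p = groups[0].shape[1]
    ns = np.array([g.shape[0] for g in groups])
    N = ns.sum()
    covs = [np.cov(g, rowvar=False) for g in groups]
    # Pooled within-groups covariance matrix
    pooled = sum((n - 1) * S for n, S in zip(ns, covs)) / (N - k)
    M = (N - k) * np.log(np.linalg.det(pooled)) - sum(
        (n - 1) * np.log(np.linalg.det(S)) for n, S in zip(ns, covs))
    c = ((2 * p**2 + 3 * p - 1) / (6 * (p + 1) * (k - 1))) * (
        np.sum(1 / (ns - 1)) - 1 / (N - k))
    stat = M * (1 - c)
    df = p * (p + 1) * (k - 1) / 2
    return stat, df, chi2.sf(stat, df)

# Illustrative data: two groups, four predictors each
rng = np.random.default_rng(0)
g1, g2 = rng.normal(size=(40, 4)), rng.normal(size=(60, 4))
stat, df, p_value = box_m([g1, g2])
print(f"Box's M approx. chi-square = {stat:.2f}, df = {df:.0f}, p = {p_value:.3f}")
# The assumption holds if the test is non-significant at the recommended alpha of .001
```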
5. Multicollinearity Multicollinearity refers to high inter-correlations between predictor variables (r > .80). The predictor variables must be related to the grouping variable (DV) but should be fairly independent of each other. The extent to which the predictor variables are inter-correlated should always be inspected prior to conducting DFA: if multicollinearity is high (r > .80), discarding redundant variables from the analysis should be considered.
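A quick way to screen for such inter-correlations outside SPSS is sketched below, assuming the predictors are held in a pandas DataFrame; pairs with |r| > .80 are flagged as candidates for removal.

```python
import numpy as np
import pandas as pd

# Illustrative predictor data; in practice load your own predictors
rng = np.random.default_rng(2)
df = pd.DataFrame(rng.normal(size=(100, 3)), columns=["x1", "x2", "x3"])
df["x4"] = df["x1"] * 0.95 + rng.normal(scale=0.2, size=100)  # nearly redundant with x1

corr = df.corr()
# Flag predictor pairs with |r| > .80 (candidates for removal as redundant)
high = [(a, b, round(corr.loc[a, b], 2))
        for i, a in enumerate(corr.columns)
        for b in corr.columns[i + 1:]
        if abs(corr.loc[a, b]) > 0.80]
print(corr.round(2))
print("Highly inter-correlated pairs:", high)
```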
Groups of unequal size require adjustments to the chance-level classification rate, as shown in the following example:

Group 1: n1 = 10; Group 2: n2 = 15; Group 3: n3 = 35 (N = 60)
Prior probabilities: p1 = 10/60 = .17; p2 = 15/60 = .25; p3 = 35/60 = .58

Expected correct classifications by chance:
Group 1 = 10 x .17 =  1.70
Group 2 = 15 x .25 =  3.75
Group 3 = 35 x .58 = 20.30
Total              = 25.75

In conclusion, a classification rate in excess of approximately 43% (25.75/60) reflects a better-than-chance result for these three unequal groups (not 33% as for three equal groups).
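The same chance-level calculation can be written out explicitly using the group sizes given above:

```python
# Group sizes from the example above
n = [10, 15, 35]
N = sum(n)                                           # 60 cases in total

priors = [ni / N for ni in n]                        # prior probabilities p1, p2, p3
expected = [ni * pi for ni, pi in zip(n, priors)]    # cases correct by chance per group
chance_rate = sum(expected) / N                      # proportional chance criterion

print([round(p, 2) for p in priors])                 # [0.17, 0.25, 0.58]
print(round(sum(expected), 2), f"{chance_rate:.1%}") # about 43% correct by chance
```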
Variables in the example data set:
Age in years (age)
Previous convictions (precons)
Reconvicted (recon)
2nd Criminal History (crimhist2)*
2nd Education and Employment (educemp2)*
*) The variables crimhist2 and educemp2 have been derived from the LSI-R (Level of Service Inventory – Revised), a scale used by psychologists working in prisons. The LSI-R measures various aspects of an offender's life such as previous offending, drug use, family history, education, employment and behaviour in prison.
Task: Using Discriminant Function Analysis (DFA), we want to examine whether and to what extent age, previous convictions, criminal history and education and employment history can be used to predict reconviction; more precisely, the aim is to determine group membership (reconvicted/not reconvicted).
1. Select the Analyze menu.
2. Click on Classify, then Discriminant to open the Discriminant Analysis dialogue box.
3. Select the grouping variable (recon) and move it into the Grouping Variable box.
4. Click on Define Range... and enter the range values (here: 0 in the Minimum box and 1 in the Maximum box).
5. Click on Continue.
6. Select the predictors by highlighting the four variables (a) age, (b) precons, (c) crimhis2 and (d) educemp2 and moving them into the Independents box.
7. Click on Statistics to open the Discriminant Analysis: Statistics sub-dialogue box. In the Descriptives box, ensure that Means, Univariate ANOVAs and Box's M are selected. In the Matrices box, ensure that Within-groups correlation is selected. These options allow the homogeneity of variance and multicollinearity assumptions to be tested. Click on Continue.
8. Click on Classify. In the Plots box, select Combined-groups. In the Display box, ensure that Summary table is selected.
9. Click on Continue and then OK.
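For readers who wish to cross-check the SPSS output, a rough Python equivalent of steps 1-9 is sketched below. The file name prisoners.csv and the assumption that recon is coded 0 = no and 1 = yes are illustrative, not part of the original example; the column names mirror the SPSS variables.

```python
import pandas as pd
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

# Hypothetical file and column names mirroring the SPSS example
data = pd.read_csv("prisoners.csv")
predictors = ["age", "precons", "crimhis2", "educemp2"]
X, y = data[predictors], data["recon"]              # recon assumed coded 0 = no, 1 = yes

# Descriptives per group and within-set correlations (multicollinearity screen)
print(data.groupby("recon")[predictors].agg(["mean", "std"]))
print(data[predictors].corr().round(2))

# Derive the discriminant function and classify the original cases
lda = LinearDiscriminantAnalysis().fit(X, y)
data["predicted"] = lda.predict(X)
print(pd.crosstab(y, data["predicted"], margins=True))   # classification table
print(f"Correctly classified: {lda.score(X, y):.1%}")
```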
Multicollinearity High inter-correlations (r > .80) shown in the within-groups correlation matrix are of concern and warrant discarding redundant variables.
Number of cases by group The DFA output also provides information on the number of cases, group means and standard deviations for each of the four predictor variables.
Wilks' Lambda One-way comparisons are reported using the Wilks' Lambda statistic and show whether or not there are significant differences between the yes and no reconviction categories on the four predictor variables. There are significant differences in our example (p < .001); hence, the null hypothesis can be rejected.
Canonical discriminant functions An examination of the canonical discriminant functions output indicates how many discriminant functions have been produced (one in a two-group case). In addition, the eigenvalue, percentage of variance explained and significance of this function are reported.
Structure Matrix The structure matrix shows the correlation of each variable with the function. These resemble factor loadings in factor analysis or beta weights in multiple regression analysis.
Standardised canonical discriminant function coefficients This matrix is used to calculate predicted group membership using the products of the raw scores and the function coefficients, in a fashion similar to the B weights in multiple regression analysis.
Group centroids The DFA output also reports the mean discriminant function value (centroid) for each of the groups, in our example yes/no reconviction, which helps to interpret the differences between the two groups.
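In a two-group case the classification logic can be illustrated as follows: a case's discriminant score is a weighted sum of its predictor scores, and the case is assigned to the group whose centroid lies nearest. The coefficients, constant and centroids below are hypothetical and are not the values produced by this example.

```python
# Hypothetical function coefficients, constant and group centroids (illustration only)
coefficients = {"age": -0.05, "precons": 0.12, "crimhis2": 0.30, "educemp2": 0.25}
constant = -0.80
centroids = {"no": -0.35, "yes": 0.85}   # mean discriminant score per group

case = {"age": 24, "precons": 6, "crimhis2": 4, "educemp2": 3}

# Discriminant score: linear combination of the case's scores plus the constant
score = constant + sum(coefficients[v] * case[v] for v in coefficients)

# Assign the case to the group with the nearest centroid
predicted = min(centroids, key=lambda g: abs(score - centroids[g]))
print(f"Discriminant score = {score:.2f}; predicted group: {predicted}")
```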
Classification results This is the most important output because the classification of cases into groups is central to DFA. Therefore, a core measure of the outcome of a DFA is the extent to which it correctly assigns predicted group membership. DFA uses the raw scores to predict to which group a case should be assigned, then compares this with actual group membership and calculates the percentage of cases correctly classified.
Classification Results(a)

                                 Predicted Group Membership
Reconvicted                          no       yes      Total
Original   Count      no            104        53        157
                      yes            16        48         64
           %          no           66.2      33.8      100.0
                      yes          25.0      75.0      100.0

a. 68.8% of original grouped cases correctly classified.
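The overall hit rate follows directly from the counts in the table:

```python
# Counts from the classification table above (rows = actual group, columns = predicted group)
table = {("no", "no"): 104, ("no", "yes"): 53,
         ("yes", "no"): 16, ("yes", "yes"): 48}

correct = table[("no", "no")] + table[("yes", "yes")]   # 152 cases correctly classified
total = sum(table.values())                             # 221 cases in total
print(f"Overall correct classification: {correct / total:.1%}")   # about 68.8%
```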
Reportage
Using a sample of 221 prison inmates, a DFA was conducted with reconviction as the DV and age, number of previous convictions, criminal history, and education and employment history as potential predictor variables. Reconvicted and non-reconvicted prisoners differed significantly on each of the four predictor variables, as revealed by univariate ANOVAs. The DFA yielded a significant discriminant function (chi-square = 30.42, df = 4, p < .001). The discriminant function and the correlations between the predictor variables and the DV (future conviction/no future conviction) suggested that age and criminal history were the strongest predictors of future conviction. As age was negatively correlated with the discriminant function value, older prisoners showed a decreased likelihood of future conviction. Prisoners with a higher number of previous convictions showed an increased likelihood of future conviction (positive correlation). Overall, the discriminant function successfully predicted the outcome for 68.8% of cases, with 66.2% correct predictions of
the prisoners with no future reconvictions and 75% accurate predictions for those who were reconvicted.