RM File Deep Mehrotra Bba B& I 417


Guru Gobind Singh Indraprastha University

Dwarka, Delhi – 110064

Maharaja Surajmal Institute


Janakpuri, Delhi – 110058

R.M Lab ASSIGNMENT


SESSION:2018-21

Submitted to: Submitted by:

Dr. Ruchika Ma’am Deep Mehrotra


B.B.A (B&I)
4th Semester

41714901818
TABLE OF CONTENTS
S.NO MODULE NAME DATE
1. INTRODUCTION TO SPSS
2. MANAGE DATA IN SPSS
3. CODING AND RECODING IN SPSS
4. SELECTING, SORTING AND ANALYSING DATA IN SPSS
5. MISSING VALUES, RECODING THE SAME VARIABLE
6. DESCRIPTIVE STATISTICS
7. CORRECTING DATA PROBLEMS
8. ONE WAY ANOVA
9. CHI - SQUARE TEST
10. REGRESSION
11. CORRELATION
12. ONE SAMPLE T TEST
13. INDEPENDENT SAMPLE T TEST
14. PAIRED SAMPLE T TEST
MODULE 1
INTRODUCTION TO SPSS
SPSS stands for "Statistical Package for the Social Sciences" and was first launched in 1968.
Since SPSS was acquired by IBM in 2009, it has officially been known as IBM SPSS Statistics, but
most users still refer to it simply as "SPSS".
SPSS is software for editing and analyzing all sorts of data. These data may come from
almost any source: scientific research, a customer database, Google Analytics or even the
server log files of a website.
SPSS can open all file formats that are commonly used for structured data such as:
• Spreadsheets from MS Excel or OpenOffice
• Plain text files (.txt or .csv)
• Relational (SQL) databases
• Stata and SAS data files

It is used by researchers in educational institutes, research organizations, government,
marketing firms etc.

Launching SPSS
To start SPSS go to
START>Programs>SPSS
A dialog box will open listing several options to choose from. The following options are
available.
▪ Run the tutorial
▪ Type in data
▪ Run an existing query
▪ Open an existing data source
▪ Open another type of file

SPSS has two working windows:


• SPSS data editor
• SPSS output viewer
SPSS Data Editor
This section describes how to enter a string or a numeric variable in an SPSS data file.
The Data Editor allows the user to edit a data set. An SPSS data
file always has two tabs in the bottom left corner.
• DATA VIEW is where we inspect our actual data
• VARIABLE VIEW is where we see additional information about our data

DATA VIEW
VARIABLE VIEW

The first step is to open the Variable View window of the Data Editor and define the variables. Let
us consider an example where the employee data of an organization needs to be analyzed. The
objective is to create a small data file for the employees that consists of six variables.

RECODE COMMAND
SPSS provides the Transform menu to convert one variable into another. A variable can be
converted using either of the following tools:
• Recode into same variables
• Recode into different variables

FOR EX: To create a new variable "gender_group", USE THE FOLLOWING STEPS:

• Go to Transform on the TOP MENU > a drop down menu will appear > SELECT
"Recode into Different Variables" to store the recoded values under a new variable name.
• Now click on the GENDER variable to highlight it and then click the arrow
button to move it across to the Numeric Variable > Output Variable box.
• In the Output Variable section, give the variable a new name and label.
• Click on the Change button.
• Now click the Old and New Values button.

Transform > Recode into Different Variables > new name and label > Change.

OLD AND NEW VALUES


• A new dialog box will appear.
• In the Old Value column, enter the three ranges: lowest through "30", the range "30-50", and
"50" through highest. In the New Value column enter "1", "2" and "3" respectively.
• For missing values, click on the "System- or user-missing" option and enter the new value
"9".
• Now click on "CONTINUE".

This will create a new variable whose data will appear in the Data Editor window.
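The same recoding rule can be sketched outside SPSS. The snippet below is only an illustration in Python: the function name and sample values are hypothetical, and it assumes (as SPSS does for overlapping ranges) that a value is recoded by the first rule that matches it, so 30 falls in group 1 and 50 in group 2.

```python
def recode_to_group(value):
    """Map a raw value to a group code, mimicking the Old and New Values rules."""
    if value is None:      # system- or user-missing value
        return 9
    if value <= 30:        # lowest through 30
        return 1
    if value <= 50:        # 30 through 50 (30 itself was already matched above)
        return 2
    return 3               # 50 through highest

raw_values = [22, 30, 45, 50, 61, None]   # made-up sample data
recoded = [recode_to_group(v) for v in raw_values]
print(recoded)
```

The original column is left untouched and the codes go into a new list, which is the idea behind "Recode into Different Variables".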

VALUE LABEL
• After the recode command, a new variable "gender_group" is created.
• In Variable View, click on the Values column corresponding to gender_group.
• Click on the dots and the "Value Labels" dialog box will open.
• Add each value and its label in the respective fields, i.e. value 1 represents MALE; add
the other values similarly.
• Now click on "OK" to continue.
FREQUENCY DISTRIBUTION
The Analyze menu is used to produce a frequency distribution in SPSS.
FOR EX: To create a frequency distribution for the variable GENDER, these steps
should be followed.
• From the top menu, go to ANALYZE > DESCRIPTIVE STATISTICS > FREQUENCIES.
• Click on the variable, i.e. "GENDER", in the left column of the dialog box, move
it to the "Variable(s)" box on the right and press OK.

• The frequency distribution will appear as output.
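What the frequency table contains can be illustrated with a few lines of Python. This is a minimal sketch with made-up gender values, not the SPSS output itself; it computes the count, percent and cumulative percent for each category.

```python
from collections import Counter

gender = ["Male", "Female", "Female", "Male", "Female"]   # made-up sample data
counts = Counter(gender)
total = sum(counts.values())

table = []
cumulative = 0.0
for category in sorted(counts):
    percent = 100 * counts[category] / total
    cumulative += percent
    # One row per category: (value, frequency, percent, cumulative percent)
    table.append((category, counts[category], round(percent, 1), round(cumulative, 1)))

for row in table:
    print(row)
```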


CROSS TABULATION
Cross tabulation is used to describe the relationship between two categorical variables.
The Analyze menu is used to create a crosstab in SPSS.
FOR EX: To create a crosstab for the relationship between the variables Year and Gender, USE
THE FOLLOWING STEPS.
• From the top menu, go to ANALYZE > DESCRIPTIVE STATISTICS > CROSSTABS.

• In the Crosstabs dialog box, add YEAR in ROWS and GENDER in COLUMNS and
then click OK.

• The crosstab will appear in the output window.
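A crosstab is simply a count of cases for every combination of the row and column variable. The sketch below shows this in Python with made-up (year, gender) pairs; the variable names are illustrative only.

```python
from collections import Counter

# Each record is (year in school, gender); the data are made up.
records = [(1, "Male"), (1, "Female"), (2, "Male"), (2, "Male"), (3, "Female")]

cell_counts = Counter(records)
years = sorted({year for year, _ in records})
genders = sorted({gender for _, gender in records})

# Rows are years, columns are genders; Counter returns 0 for empty cells.
crosstab = {y: {g: cell_counts[(y, g)] for g in genders} for y in years}

for year, row in crosstab.items():
    print(year, row)
```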


DESCRIPTIVE
• To produce descriptives, click on ANALYZE > DESCRIPTIVE STATISTICS >
DESCRIPTIVES.

• Click on CASTE and YEAR to highlight them, use the arrow to move them into the "Variable(s)"
box and then click OK.
• The descriptives will appear in the output window.

GRAPHS
The Graphs menu is used to make a graph for any variable.
• To make a bar graph, go to GRAPHS > LEGACY DIALOGS > BAR.
• A new dialog box will appear on the screen. Select the appropriate option for the type of
data in the chart:
✓ Summaries for groups of cases.
✓ Summaries of separate variables.
✓ Values of individual cases.
• After making the selection, click on the DEFINE button.

• Now move the variable to be represented by the bars from the left column to the
right, along with the variable whose values should be shown as category labels.
• Click OK to generate the graph.
• In the output window, a graph will appear representing NAME OF STUDENTS on the
horizontal axis and GENDER OF STUDENTS on the vertical axis.
PIE CHART
• To create a pie chart for GENDER, go to GRAPHS > LEGACY DIALOGS > PIE.
• Define the slices in the dialog box by selecting the variable.
• Click OK to generate the pie chart in the output window.
MODULE 2
MANAGE DATA IN SPSS
FINDING OUT THE CASE SUMMARY
To understand the nature of the data one will use case summaries

ON THE BASIS OF GENDER


• Go to Analyze > Reports > Case Summaries.
• Add marks obtained in final year as the variable and gender as the grouping variable.
• Click on the Statistics option, add Mean to "CELL STATISTICS" and press Continue.
• Limit cases to 120 and press the OK button.

ANALYZE > REPORTS > CASE SUMMARIES > ADD MARKS IN FINAL YEAR AS
VARIABLE AND GENDER AS GROUPING VARIABLE > STATISTICS > ADD
MEAN TO CELL STATISTICS > CONTINUE > LIMIT CASES TO 120 > PRESS OK.
Output statistics viewer shows case summaries of final year marks based on
gender.
ON THE BASIS OF CASTE
Similar to above case , case summaries for final year marks based on caste can be found out
by adding caste as grouping variable instead of gender.
ANALYZE>REPORT>CASE SUMMARIES> ADD MARKS OBTAINED IN FINAL AS
VARIABLES AND CASTE TO GROUPING VARIABLE> ADD MEAN TO CELL
STATISTICS> CONTINUE> LIMIT CASES TO 120> OK.

Output statistics viewer shows case summaries of final year marks based on caste.
COMPUTING NEW VARIABLE
• Go to Transform > Compute Variable.
• Type midterm as the target variable.
• Select Mean from "Functions and Special Variables".
• Add midterm 1 to midterm 5 to the numeric expression.
• Press OK.

TRANSFORM > COMPUTE VARIABLE > MIDTERM AS TARGET VARIABLE >
FUNCTION GROUP: ALL > FUNCTIONS AND SPECIAL VARIABLES: MEAN >
NUMERIC EXPRESSION.
The midterm column holds, for every individual, the mean of the marks scored in the five
midterms.
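The computed column can be sketched in Python as follows. The scores are made up and the function name is hypothetical; the sketch follows the behaviour of the SPSS MEAN function, which averages only the non-missing arguments.

```python
from statistics import mean

def midterm_mean(scores):
    """Mean of the available midterm scores; None marks a missing score."""
    valid = [s for s in scores if s is not None]
    return mean(valid) if valid else None

student_scores = [6, 7, None, 8, 5]   # midterm 1 to midterm 5, made-up data
print(midterm_mean(student_scores))
```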
MODULE 3
CODING AND RECODING
IN SPSS
RECODING INTO DIFFERENT VARIABLES OLD AND NEW
VALUE
• Go to Transform > Recode into Different Variables.
• Add final as the input variable, give the output variable the name and label GRADE and press Change.

• Click on Old and New Values and check "Output variables are strings".
• Assign 'C' to the range "0 thru 49", 'B' to "50 thru 59" and 'A' to "60 thru 75".
• Another column named GRADE appears, showing grades on the basis of the final score.

TRANSFORM > RECODE INTO DIFFERENT VARIABLES > ADD FINAL AS INPUT
VARIABLE > NAME AND LABEL OUTPUT VARIABLE AS GRADE > CHANGE >
OLD AND NEW VALUES > CHECK OUTPUT VARIABLES ARE STRINGS > ASSIGN
THE RANGE 0 TO 49 'C', 50 TO 59 'B', 60 TO 75 'A' > CONTINUE.
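The grading rule can be written as a small Python function for illustration. The function name is hypothetical; scores outside the three ranges given in the text get no grade here, because no rule is stated for them.

```python
def grade(final_score):
    """Return the letter grade for a final score using the three ranges above."""
    if 0 <= final_score <= 49:
        return "C"
    if 50 <= final_score <= 59:
        return "B"
    if 60 <= final_score <= 75:
        return "A"
    return None   # no range was defined for this score

finals = [35, 50, 59, 60, 75]   # made-up final scores
grades = [grade(score) for score in finals]
print(grades)
```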
RECODING INTO SAME VARIABLE : OLD AND NEW
VALUE
• Go to Transform and select Recode into Same Variables.
• Select midterm as the numeric variable.
• Click on the Old and New Values option.
• Add '3' for 0 to 5, '2' for 5.1 to 6 and '1' for 6.1 to 10.
• Press Continue.
• The midterm column reappears in the Data Editor with the new values.

TRANSFORM > RECODE INTO SAME VARIABLES > OLD AND NEW VALUES >
ADD '3' FOR 0 TO 5, '2' FOR 5.1 TO 6, '1' FOR 6.1 TO 10.
MODULE 4
SELECTING , SORTING AND ANALYSING
THE DATA IN SPSS
SELECT CASES:
• Go to the Data tab and choose "Select Cases" from the drop down menu.
• The "Select Cases" dialog box will appear; select 'If condition is satisfied'.
• Click the "If" button, type gender = 1 in the condition box and then click Continue.
• In the Data Editor, the unselected cases will be marked. Male students will remain
unmarked since they are the selected cases.
CASE SUMMARIES
• Go to the Analyze tab > Reports > Case Summaries from the drop down menu.

• A "Summarize Cases" dialog box will appear. Select "marks in final" in the Variables
column and "gender" in the Grouping Variable column, and limit the cases to 120.

• Click on "Statistics"; the "Summary Statistics Report" dialog box will appear. Choose
"mean, median etc." in the Cell Statistics column and click Continue.
• In the output viewer, the marks obtained in the final exam by males will appear.
SORT CASES
• Go to the Data tab > Sort Cases from the drop down menu.
• Select "marks in final".
• Select Descending under Sort Order and click OK.

In the Data Editor, the cases will be arranged in descending order of marks.
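The Select Cases and Sort Cases steps of this module amount to a filter and a sort. The Python sketch below illustrates both on a made-up list of students; the field names and the coding gender = 1 for male are assumptions taken from the example above.

```python
students = [   # made-up sample data
    {"name": "A", "gender": 1, "final": 62},
    {"name": "B", "gender": 2, "final": 71},
    {"name": "C", "gender": 1, "final": 55},
    {"name": "D", "gender": 2, "final": 48},
]

# Select Cases: keep only the cases that satisfy gender = 1.
selected = [s for s in students if s["gender"] == 1]

# Sort Cases: arrange all cases by marks in final, in descending order.
sorted_cases = sorted(students, key=lambda s: s["final"], reverse=True)

print([s["name"] for s in selected])
print([s["name"] for s in sorted_cases])
```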


MODULE 5
FINDING OUT THE MISSING VALUES
AND RECODING THE SAME VARIABLE
AFTER FILLING THE MISSING VALUES
ACCORDING TO GROUP: SPLITTING
FILE

MISSING VALUES

• Go to the Analyze tab > Descriptive Statistics > Frequencies from the drop down menu.

• The "Frequencies" dialogue box will appear. Choose "marks in final exam" in the variable
column and then click on 'Statistics'.
• Choose Mean from the options in the "Frequencies: Statistics" dialogue box and click
Continue.
• Now click "OK".

In the output viewer, "Final exam" will appear with its mean as shown. The Statistics table also lists the number of valid and missing cases.

RECODING INTO SAME VARIABLE: OLD AND NEW VALUES

• Go to the Transform tab > Recode into Same Variables from the
drop down menu.
• The "Recode into Same Variables" dialog box will appear. Select '4-year resale value'
as the numeric variable as shown.
• Click on the "Old and New Values" option.
• Select "System-missing" in the Old Value box and enter '60.134' in the New Value box. Then click
Continue as shown.

Then in the Data Editor, all the missing values of the variable will be replaced by a common mean,
i.e. 60.134, as shown.
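Replacing missing values with the mean of the valid values can be sketched in Python as below. The numbers are made up for illustration and do not reproduce the 60.134 obtained in the lab data.

```python
from statistics import mean

values = [55.0, None, 62.5, 63.0, None]   # None marks a system-missing value

# Mean of the valid cases only, as reported by the Frequencies output.
valid = [v for v in values if v is not None]
fill_value = round(mean(valid), 3)

# Every missing value is replaced by the same mean.
filled = [fill_value if v is None else v for v in values]
print(fill_value, filled)
```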
SPLITTING FILE

• Go to the Data tab > Split File from the drop down menu as shown.
• The "Split File" dialogue box will appear. Select "gender" in the "Groups
Based on" box and click "OK" as shown.

• In the Output Viewer, the results will appear in separate splits based on "gender of students",
with their mean values as shown.
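Splitting the file by gender means that every later analysis is repeated once per group. A minimal Python sketch of that idea, with made-up marks, computes the mean separately for each group.

```python
from collections import defaultdict
from statistics import mean

rows = [("Male", 60), ("Female", 70), ("Male", 50), ("Female", 80)]   # made-up data

# Collect the marks of each gender into its own group.
groups = defaultdict(list)
for gender, marks in rows:
    groups[gender].append(marks)

# Run the same analysis (here the mean) once per group.
group_means = {gender: mean(marks) for gender, marks in groups.items()}
print(group_means)
```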
RECODING INTO SAME VARIABLE: OLD AND NEW
VALUES (SELECTING THE PARTICULAR BRAND NAME
AFTER SPLITTING THE FILE)

• Go to the Transform tab > Recode into Same Variables from
the drop down menu.

• The "Recode into Same Variables" dialog box will appear. Select 'score out of 10 in
midterm' in Numeric Variables as shown.

• Click on the "Old and New Values" option.

• Select "System-missing" in the Old Value box and enter '6.04' in the New Value box. Then
click Continue as shown.

• Press "OK".
• In the Data Editor, all the missing values of Ford will be replaced by the common
mean, i.e. 6.04, as shown.
MEAN VALUE OF FORD, ONCE MISSING VALUE IS
REMOVED

• Once the missing value is replaced by 6.04, go to the Transform tab > Recode into Same
Variables from the drop down menu.
• The "Recode into Same Variables" dialog box will appear as shown.

• Choose the 'If' option from the dialogue box; a new dialog box named "Recode into Same
Variables: If Cases" will appear. Select the option "Include if case satisfies condition", type
name = "liza" in the blank area and click "Continue" as shown.
In the output viewer, the mean, frequency, percentage and cumulative percentage change accordingly.
MODULE 6
DESCRIPTIVE STATISTICS
The Analyze menu is used to produce descriptive statistics in SPSS. These steps
should be followed:
1. Click on Analyze, select Descriptive Statistics from the drop down menu and select
Descriptives.

2. The 'Descriptives' dialogue box will appear. Move all the variables except Name into the
variable list.

3. Click on Options and select mean, standard deviation, kurtosis and skewness.
4. Now click Continue and then OK.
5. In the output viewer, we can see the descriptive statistics for all the selected variables.
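The four statistics selected in step 3 can be computed by hand as a cross-check. The Python sketch below uses the sample-adjusted skewness and kurtosis formulas that, to my understanding, SPSS reports; treat that correspondence as an assumption. The data are made up and at least four values are needed.

```python
from statistics import mean, stdev

def describe(x):
    """Mean, sample standard deviation, adjusted skewness and excess kurtosis."""
    n = len(x)
    m = mean(x)
    s = stdev(x)
    z3 = sum(((v - m) / s) ** 3 for v in x)
    z4 = sum(((v - m) / s) ** 4 for v in x)
    skewness = n / ((n - 1) * (n - 2)) * z3
    kurtosis = (n * (n + 1) / ((n - 1) * (n - 2) * (n - 3)) * z4
                - 3 * (n - 1) ** 2 / ((n - 2) * (n - 3)))
    return {"mean": m, "std": s, "skewness": skewness, "kurtosis": kurtosis}

result = describe([1, 2, 3, 4, 5])   # made-up, perfectly symmetric data
print(result)
```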
MODULE 7
Correcting data problems
EXPLORE
• Go to the Analyze tab > Descriptive Statistics > Explore from the drop down menu.

• The "Explore" dialogue box will appear. Choose "marks in midterm 1" in the Dependent List
and then click on 'Statistics'.
• The Explore: Statistics dialogue box will appear.
• Select Descriptives.

• Click Continue.
• Now click on PLOTS. The Explore: Plots dialogue box will appear.
• Click None under Boxplots and Histogram under Descriptive.

• Click on Continue, then OK.

• The output viewer will appear.
After inspecting the chart, we can use Transform and then Compute Variable to create
new variables and find the logarithm of midterm 2, the fractional rank, sqrt, inverse and
IDF of the midterm by applying the same method. One of the outputs is as follows:
(Using: logarithms of midterm 1)
The main Data View and Variable View in SPSS now look like the ones below:
MODULE 8
ONE WAY ANOVA
1. First the data is entered. Then Analyze is selected.
2. From Analyze, the option Compare Means is selected and from there
One-Way ANOVA.

3. The One-Way ANOVA dialogue box appears. The marks in
midterm 1 are put in the Dependent List and the year in school of the
students in Factor.
4. Once the dialogue box appears, we click on Post Hoc.
5. The output of the post hoc test is shown below.

6. Click on Options, select Descriptive and Homogeneity of variance test. Click on
Continue. The output will be as shown below.
7. Here the Tukey test is selected and then Continue is clicked. The significance
is 0.818, which is greater than 0.05, so the null hypothesis is not
rejected.
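The F statistic behind the one-way ANOVA table can be reproduced by hand. The Python sketch below is an illustration with made-up groups; it returns F and the two degrees of freedom but not the significance value, since that requires the F distribution, which the standard library does not provide.

```python
from statistics import mean

def one_way_anova(groups):
    """Return (F, df_between, df_within) for a list of groups of scores."""
    all_values = [v for g in groups for v in g]
    grand_mean = mean(all_values)
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of the scores around their own group mean.
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

# Made-up midterm marks for three years in school.
f_stat, df_b, df_w = one_way_anova([[1, 2, 3], [2, 3, 4], [3, 4, 5]])
print(f_stat, df_b, df_w)
```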
Module 9
Chi-square test
1. First the data is entered. Then we click on Analyze, then Descriptive
Statistics and then Crosstabs.
2. The Crosstabs dialogue box appears, and caste of students
and section of students are placed in Row and Column respectively.

3. After that we click on Statistics and tick Chi-square.

4. After this we click on Cells in the main dialogue box and the Cells dialogue box appears.

5. The output is as below.

6. The significance here is 0.536, which is more than 0.05, and hence the null
hypothesis is not rejected.
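The Pearson chi-square statistic compares the observed cell counts with the counts expected if the two variables were independent. A minimal Python sketch with a made-up 2 x 2 table is given below; it returns the statistic and its degrees of freedom only, not the significance value.

```python
def chi_square(table):
    """Return (chi-square statistic, degrees of freedom) for a table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / n
            statistic += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, df

statistic, df = chi_square([[10, 20], [20, 10]])   # made-up counts
print(statistic, df)
```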
Module 10
REGRESSION

1. After the data is entered, we first go to Graphs and select Chart Builder. The Chart
Builder dialogue box appears.
2. We select Scatter/Dot and then click OK. The output is as follows.

3. After this, select Analyze, then Regression and then Linear.
4. The Linear Regression dialogue box appears as above. Score in midterm 1 of students is
taken as the dependent variable and the year in school of students is taken as the independent variable.
From there the output is as below.

5. The significance of .000 (that is, p < .001) is below 0.05, which means there is a statistically
significant relationship between the year in school of the students and the score obtained by them in midterm 1.
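The coefficients of a simple linear regression can be checked with the least-squares formulas. The Python sketch below uses made-up data in which the score is exactly 1 + 2 × year; the function name is hypothetical.

```python
from statistics import mean

def fit_line(x, y):
    """Least-squares slope and intercept for y = intercept + slope * x."""
    mean_x, mean_y = mean(x), mean(y)
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

year = [1, 2, 3, 4]     # independent variable (made-up)
score = [3, 5, 7, 9]    # dependent variable (made-up)
slope, intercept = fit_line(year, score)
print(slope, intercept)
```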
MODULE 11
CORRELATION
To determine the correlation between number of sessions and level of
satisfaction.

• Click Graphs in the top menu.

• Then select Legacy Dialogs.

• Click on Scatter/Dot.
• Select Simple Scatter and click on Define.
• Move the dependent variable to the y-axis (MARKS IN FINAL) and the independent variable to the
x-axis (YEAR IN SCHOOL). Click OK.

• A scatter plot will be displayed.

• To fit the trend line, double-click on the graph.
• The Chart Editor will open. Select the fit line option.
Calculation of correlation coefficient

• Go to Analyze in the top menu.

• Select Correlate, then Bivariate.

Move the two variables into the Variables box. Tick the Pearson correlation coefficient. Then
click OK.
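The Pearson coefficient reported by SPSS can be cross-checked with the defining formula. The Python sketch below uses made-up data with a perfect positive linear relationship, so the coefficient comes out as 1 (up to rounding).

```python
from math import sqrt
from statistics import mean

def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long lists."""
    mean_x, mean_y = mean(x), mean(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

x = [1, 2, 3, 4, 5]      # made-up data
y = [2, 4, 6, 8, 10]
r = pearson_r(x, y)
print(r)
```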
OUTPUT

Comparing correlation coefficients for two groups

Objective: To compare the correlation coefficient for satisfaction among male and female
groups separately.
• Go to Data in the top menu. Then select Split File.
• Tick Compare groups.

• Follow the previous steps:

• Go to Analyze in the top menu.
• Select Correlate, then Bivariate.
• Move the two variables into the Variables box. Tick the Pearson correlation
coefficient. Then click OK.
MODULE 12
ONE SAMPLE T TEST

1. Click on Analyze, then Compare Means and select One-Sample T Test.

2. The One-Sample T Test dialogue box appears as given below.

The significance here is .000 (that is, p < .001), which is below 0.05. This means that the
means of the test variables (final marks of students and year in school of the students)
differ significantly from the test value.
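The t value in the One-Sample T Test output is the difference between the sample mean and the test value, divided by the standard error of the mean. A minimal Python sketch with made-up marks and a hypothetical test value of 2 is shown below; the significance value is not computed because it needs the t distribution.

```python
from math import sqrt
from statistics import mean, stdev

def one_sample_t(sample, test_value):
    """Return (t statistic, degrees of freedom) for a one-sample t test."""
    n = len(sample)
    standard_error = stdev(sample) / sqrt(n)
    t_stat = (mean(sample) - test_value) / standard_error
    return t_stat, n - 1

t_stat, df = one_sample_t([1, 2, 3, 4, 5], 2)   # made-up data and test value
print(t_stat, df)
```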
MODULE 13
Independent sample T test

1. Go to Analyze, click on Compare Means and then select Independent-Samples T Test.

2. The Independent-Samples T Test dialogue box appears, where final marks of students is the
test variable and gender of students is the grouping variable.
3. Now click on Define Groups and enter '1' for group 1 (female) and '2' for group 2
(male).

4. After that we click on Options and the Options dialogue box appears.

5. The output is as below.
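The "equal variances assumed" row of the output is based on the pooled-variance t statistic. The Python sketch below illustrates that calculation with made-up marks for the two groups; it does not cover the "equal variances not assumed" row or the significance value.

```python
from math import sqrt
from statistics import mean, variance

def independent_t(group1, group2):
    """Return (t statistic, degrees of freedom) assuming equal variances."""
    n1, n2 = len(group1), len(group2)
    df = n1 + n2 - 2
    # Pooled variance: weighted average of the two sample variances.
    pooled = ((n1 - 1) * variance(group1) + (n2 - 1) * variance(group2)) / df
    t_stat = (mean(group1) - mean(group2)) / sqrt(pooled * (1 / n1 + 1 / n2))
    return t_stat, df

t_stat, df = independent_t([1, 2, 3], [2, 3, 4])   # made-up marks per group
print(t_stat, df)
```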
Module 14
Paired sample T test
1. After entering the data, we go to Analyze, click on Compare Means and then
select Paired-Samples T Test.

2. After clicking on Paired-Samples T Test, the dialogue box appears.
3. We select the 2 variables, score in midterm 1 of the students and mean score of the
students, as pair 1, and score in midterm 2 and midterm 3 as pair 2.

4. The output would be as below:

The significance in pair 1 of the score in midterm 1 of students and mean score of students is
.240, which is greater than 0.05, so the difference between the two means is not statistically
significant. The significance in pair 2 of score in midterm 3 and midterm 4 is 0.581, which is
also greater than 0.05, so that difference is not significant either.
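The paired-samples t statistic is a one-sample t test on the differences within each pair. A minimal Python sketch with made-up scores for two midterms is shown below; as before, only the t value and degrees of freedom are computed, not the significance.

```python
from math import sqrt
from statistics import mean, stdev

def paired_t(first, second):
    """Return (t statistic, degrees of freedom) for paired scores."""
    differences = [a - b for a, b in zip(first, second)]
    n = len(differences)
    standard_error = stdev(differences) / sqrt(n)
    return mean(differences) / standard_error, n - 1

midterm_a = [5, 6, 7, 8]   # made-up scores of four students
midterm_b = [4, 6, 5, 7]
t_stat, df = paired_t(midterm_a, midterm_b)
print(t_stat, df)
```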
