R Working Manuals Students
Shankar MM
Mentor – Research & Training
[email protected]
CARES, Bangalore
emp<-read.csv('emp.csv')
3. Exports in R
Export to txt
write.table(emp, file = 'emp.txt', sep="\t")
Export to CSV
write.csv(dataset, file = 'newdataset.csv')
install.packages("foreign")
library(foreign)
Export to SPSS
write.foreign(emp, 'Pathname\\emp.txt', 'Pathname\\emp.sps',
package = "SPSS")
Export to SAS
write.foreign(emp, "Pathname\\emp.txt", "Pathname\\emp.sas",
package="SAS")
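A quick way to confirm that an export worked is to read the file back in. The sketch below is an illustration only: it uses a small made-up data frame (toy) in place of emp and writes to temporary files instead of a real path. The row.names = FALSE argument is an addition not used above; without it, write.table and write.csv also write the row names as an extra column.

```r
# Toy data frame standing in for emp (hypothetical values)
toy <- data.frame(id = 1:3,
                  gender = c("f", "m", "f"),
                  salary = c(21000, 45000, 30500))

# Temporary files, so nothing is written into the working directory
txt_path <- tempfile(fileext = ".txt")
csv_path <- tempfile(fileext = ".csv")

# Export as tab-delimited text and as CSV, without the row-name column
write.table(toy, file = txt_path, sep = "\t", row.names = FALSE)
write.csv(toy, file = csv_path, row.names = FALSE)

# Read both files back and compare with the original
toy_txt <- read.delim(txt_path)
toy_csv <- read.csv(csv_path)
print(toy_txt)
print(toy_csv)
```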
> dim(emp)
[1] 474 10
> table(emp$jobcat)
  1   2   3
363  27  84
> prop.table(table(emp$jobcat))
         1          2          3
0.76582278 0.05696203 0.17721519
> round(prop.table(table(emp$jobcat)), 2)
   1    2    3
0.77 0.06 0.18
> table(emp$gender, emp$jobcat)
      1   2   3
  f 206   0  10
  m 157  27  74
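The frequency, proportion and two-way tables above can be tried on a small example. This is a sketch with made-up data for ten employees. The labels clerical, custodian and manager for codes 1, 2 and 3 are inferred from the matching counts in the CrossTable output further down; the factor() step is an addition for readability.

```r
# Hypothetical job category codes and genders for ten employees
jobcat <- c(1, 1, 1, 1, 1, 1, 2, 3, 3, 3)
gender <- c("f", "f", "f", "f", "m", "m", "m", "f", "m", "m")

# Attach readable labels to the numeric codes
jobcat_f <- factor(jobcat, levels = 1:3,
                   labels = c("clerical", "custodian", "manager"))

counts <- table(jobcat_f)               # frequency of each category
props  <- round(prop.table(counts), 2)  # proportions, rounded to 2 decimals
cross  <- table(gender, jobcat_f)       # gender by job category

print(counts)
print(props)
print(cross)
```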
# cross table
install.packages("gmodels")
library(gmodels)
> CrossTable(emp$jobcat,emp$gender)
   Cell Contents
|-------------------------|
|                       N |
| Chi-square contribution |
|           N / Row Total |
|           N / Col Total |
|         N / Table Total |
|-------------------------|
             | emp$gender
  emp$jobcat |         f |         m | Row Total |
-------------|-----------|-----------|-----------|
    clerical |       206 |       157 |       363 |
             |     9.956 |     8.335 |           |
             |     0.567 |     0.433 |     0.766 |
             |     0.954 |     0.609 |           |
             |     0.435 |     0.331 |           |
-------------|-----------|-----------|-----------|
   custodian |         0 |        27 |        27 |
             |    12.304 |    10.301 |           |
             |     0.000 |     1.000 |     0.057 |
             |     0.000 |     0.105 |           |
             |     0.000 |     0.057 |           |
-------------|-----------|-----------|-----------|
     manager |        10 |        74 |        84 |
             |    20.891 |    17.490 |           |
             |     0.119 |     0.881 |     0.177 |
             |     0.046 |     0.287 |           |
             |     0.021 |     0.156 |           |
-------------|-----------|-----------|-----------|
Column Total |       216 |       258 |       474 |
             |     0.456 |     0.544 |           |
-------------|-----------|-----------|-----------|
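Each cell of the CrossTable output can be reproduced with base R, which helps when reading the table. The sketch below re-enters the six counts from the output by hand (the object names obs, expected and contrib are illustrative) and computes the chi-square contribution as (observed - expected)^2 / expected, where expected = row total * column total / table total.

```r
# Observed counts copied from the CrossTable output
obs <- matrix(c(206, 157,
                  0,  27,
                 10,  74),
              nrow = 3, byrow = TRUE,
              dimnames = list(jobcat = c("clerical", "custodian", "manager"),
                              gender = c("f", "m")))

# Expected counts under independence of job category and gender
expected <- outer(rowSums(obs), colSums(obs)) / sum(obs)

# Chi-square contribution of each cell
contrib <- (obs - expected)^2 / expected

row_prop   <- prop.table(obs, 1)  # N / Row Total
col_prop   <- prop.table(obs, 2)  # N / Col Total
table_prop <- prop.table(obs)     # N / Table Total

print(round(contrib, 3))
print(round(row_prop, 3))
```

For the clerical/f cell, for example, the expected count is 363 * 216 / 474, about 165.4, which gives the contribution of 9.956 shown in the table.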
Data Management using Cut, breaks and Add labels
> label<-c('0-50000','50000-100000','100000-150000')
> emp_label<-data.frame(emp, label=cut(emp$salary,
c(0,50000,100000,150000), labels = label))
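To see how cut assigns values to the labelled bands, it helps to run it on a few known salaries. This is a minimal sketch with made-up values; the breaks and labels are the ones used above. By default cut uses intervals that are open on the left and closed on the right, so 50000 falls in the first band and a value of exactly 0 would become NA.

```r
# Hypothetical salaries, including one that sits exactly on a break
salary <- c(21000, 50000, 75000, 120000)

label <- c('0-50000', '50000-100000', '100000-150000')

# Intervals are (0, 50000], (50000, 100000], (100000, 150000]
band <- cut(salary, breaks = c(0, 50000, 100000, 150000), labels = label)

print(data.frame(salary, band))
print(table(band))

# A value on the lowest break is outside every interval
print(cut(0, breaks = c(0, 50000, 100000, 150000), labels = label))
```

Passing include.lowest = TRUE to cut would place a salary of 0 in the first band instead.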
lapply
> lapply(emp[,c(6,7)],sum)
$salary
[1] 16314875
$salbegin
[1] 8065625
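lapply always returns a list, one element per column. The sketch below repeats the call on a small made-up data frame (toy is hypothetical; the column names follow the output above) and compares it with sapply, which simplifies the same result to a named vector. Selecting columns by name rather than by position, as done here, is a stylistic choice.

```r
# Toy data frame with the two salary columns (hypothetical values)
toy <- data.frame(salary   = c(21000, 45000, 30500),
                  salbegin = c(12000, 20000, 15000))

# lapply: list with one total per column
totals_list <- lapply(toy[, c("salary", "salbegin")], sum)

# sapply: same totals, simplified to a named numeric vector
totals_vec <- sapply(toy[, c("salary", "salbegin")], sum)

print(totals_list)
print(totals_vec)
```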
tapply
> em<-as.list(emp)
> tapply(em$salary, em$gender, mean)
f m
26031.92 41441.78
> tapply(em$jobcat, em$gender, sum)
f m
236 433
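tapply splits the first argument by the groups in the second and applies the function to each group. It also accepts data frame columns directly, so the as.list step is optional. A minimal sketch on made-up data (toy is hypothetical):

```r
# Toy data: two women and two men (hypothetical salaries)
toy <- data.frame(gender = c("f", "m", "f", "m"),
                  salary = c(20000, 40000, 30000, 50000))

# Mean salary per gender, computed straight from the data frame columns
means <- tapply(toy$salary, toy$gender, mean)

# Number of observations per gender
counts <- tapply(toy$salary, toy$gender, length)

print(means)
print(counts)
```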
Aggregate function
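aggregate computes a summary statistic per group and returns the result as a data frame. The sketch below is an illustration on made-up data, not code from the manual; toy and its values are hypothetical, while the column names mirror those of emp.

```r
# Toy data (hypothetical values)
toy <- data.frame(gender = c("f", "m", "f", "m"),
                  jobcat = c(1, 1, 3, 3),
                  salary = c(20000, 40000, 30000, 50000))

# Mean salary by gender
by_gender <- aggregate(salary ~ gender, data = toy, FUN = mean)

# Mean salary by gender and job category
by_both <- aggregate(salary ~ gender + jobcat, data = toy, FUN = mean)

print(by_gender)
print(by_both)
```

Unlike tapply, the result keeps the grouping variables as columns, which makes it easy to merge or export.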
One-way ANOVA
anova_emp<-aov(emp$salary~emp$jobcat)
summary(anova_emp)
             Df    Sum Sq   Mean Sq F value Pr(>F)
emp$jobcat    2 8.944e+10 4.472e+10   434.5 <2e-16 ***
Residuals   471 4.848e+10 1.029e+08
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
TukeyHSD(anova_emp)
Tukey multiple comparisons of means
95% family-wise confidence level
$`emp$jobcat`
                        diff       lwr       upr     p adj
custodian-clerical  3100.349 -1657.805  7858.503 0.2768689
manager-clerical   36139.258 33251.225 39027.291 0.0000000
manager-custodian  33038.909 27761.979 38315.839 0.0000000
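The same aov and TukeyHSD steps can be run without emp.csv. The sketch below swaps in PlantGrowth, a dataset that ships with R (30 plant weights in three groups), purely as a stand-in. It also uses the data = argument, an assumption about preferred style, so the output is labelled group rather than emp$jobcat. The grouping variable has to be a factor: the 2 degrees of freedom for emp$jobcat above indicate it was treated as a three-level factor.

```r
# PlantGrowth is built in: weight (numeric) and group (factor: ctrl, trt1, trt2)
str(PlantGrowth)

# One-way ANOVA of weight across the three groups
fit <- aov(weight ~ group, data = PlantGrowth)
print(summary(fit))

# Pairwise comparisons with family-wise 95% confidence intervals
tukey <- TukeyHSD(fit)
print(tukey)
```

In the Tukey table, a pair differs significantly at the 5% level when its interval (lwr, upr) excludes 0, as it does for the two manager comparisons above.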
Simple Linear regression
Dataset used: women (R built-in dataset)
summary(womenreg)
Check the R-squared, residual standard error, F value, p-value, and coefficient values to assess the model's
goodness of fit and significance.
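The call that creates womenreg does not appear in the text. A minimal sketch, assuming the model regresses weight on height (the two columns of women), shows where each quantity named above lives in the summary object:

```r
# Assumed model: weight explained by height (the formula is not given in the manual)
womenreg <- lm(weight ~ height, data = women)
s <- summary(womenreg)

r2    <- s$r.squared    # R-squared
rse   <- s$sigma        # residual standard error
fstat <- s$fstatistic   # named vector: value, numdf, dendf

# Overall p-value of the model, from the F statistic
p_model <- pf(fstat[["value"]], fstat[["numdf"]], fstat[["dendf"]],
              lower.tail = FALSE)

print(coef(s))          # estimates, standard errors, t values, p-values
print(c(r2 = r2, rse = rse, p_model = p_model))
```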
Model training
Model Testing
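The two headings above carry no code. One common reading is to fit the model on part of the data and check its predictions on the rest. The sketch below is a guess at that workflow, not the manual's own procedure: the 10/5 split, the seed, and the use of RMSE as the error measure are all assumptions.

```r
set.seed(1)  # arbitrary seed, only so the split is reproducible

# Model training: fit on 10 randomly chosen rows of women
train_idx <- sample(nrow(women), size = 10)
train <- women[train_idx, ]
test  <- women[-train_idx, ]
fit <- lm(weight ~ height, data = train)

# Model testing: predict the 5 held-out rows and measure the error
pred <- predict(fit, newdata = test)
rmse <- sqrt(mean((test$weight - pred)^2))

print(data.frame(height = test$height, actual = test$weight, predicted = pred))
print(rmse)
```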