R Studio Lab Summary Sheet


Getting R Studio Started
Select from the Start menu:
    RStudio > RStudio

Creating Variables
A variable name can be a letter, a word, or a combination of letters, numbers and punctuation (only . and _ are allowed, and the name should start with a letter). Names are case sensitive and cannot contain spaces. Values are assigned using "<-" or "=".

eg. a=5  b=-3  c=15 are numeric variables
animal='dog' is a character variable
z=c(1,2,3,4) is a combination of numeric values belonging to the variable z.
animals=c('dog','cat','bird') is a combination of character values belonging to the variable animals.

Collect variables together into a dataset using
dataset<-data.frame(variable1,variable2,…)

Working with Datasets
Dataset names are case sensitive and cannot have spaces.
names(dataset)    gives the variable names
head(dataset)     gives the first 6 observations
tail(dataset)     gives the last 6 observations

Calling a variable within a dataset:
dataset$variable

Saving a dataset as an .RData file:
save(dataset,file='filename.RData')
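
As a minimal sketch of the variable and dataset commands, the following builds a small dataset and inspects it. The names animals, legs and pets and their values are made up for illustration, and the file is saved to a temporary location (an assumption, so that nothing in your working folder is overwritten).

```r
# Hypothetical example data: three animals and their number of legs
animals <- c('dog', 'cat', 'bird')   # character variable
legs    <- c(4, 4, 2)                # numeric variable

# Collect the variables together into a dataset
pets <- data.frame(animals, legs)

names(pets)    # variable names in the dataset
head(pets)     # first observations (only 3 exist here)
pets$legs      # call one variable within the dataset

# Save as an .RData file; a temporary file is used here as an assumption
path <- tempfile(fileext = '.RData')
save(pets, file = path)
```

To keep the file, replace path with a name of your own, such as 'pets.RData'.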

Calculating using R Studio
+     addition
-     subtraction
*     multiplication
/     division
^     powers
()    to ensure the BEDMAS rules are applied correctly

eg. a=5  b=-3  c=15
(a*b)/c will give -1

Datasets
A dataset can have numeric or character variables.

Existing datasets within R Studio:
library(help='datasets')    lists the available datasets
dataset                     prints the dataset
help(dataset)               describes the dataset

Reading text files:
Within R Studio, choose
Environment > Import Dataset > Text file…

Reading RData files:
In the Files/Plots window go to the folder containing the dataset. Double click on the dataset. A dialogue box will ask to load the .RData file. Click Yes.

Creating your own data:
Enter the values as variables, then collect the variables together into a dataset with data.frame().

Categorical Data
table(variable)               one-way table
table(variable1,variable2)    two-way table
percent=round(100*table(variable)/sum(table(variable)))
pie(table(variable))          for a pie chart
barplot(table(variable))      for a bar chart

Numerical Data
min(variable)
max(variable)
mean(variable)
median(variable)
IQR(variable)
sd(variable)

summary(variable) will give you the minimum, Q1, median, mean, Q3, maximum for a variable.

by(numeric,category,function)
function can be any of min, max, mean, median, IQR, sd or summary.

dotplot(variable)       (load library(lattice) first)
stripchart(variable)
stem(variable)
hist(variable)
boxplot(variable)
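
The table and summary commands can be tried on mtcars, one of the datasets that ships with R. Treating cyl (number of cylinders) as the category and mpg (miles per gallon) as the numeric variable is a choice made for this sketch, not something taken from the sheet.

```r
# mtcars is a built-in dataset; cyl is used as the category, mpg as the numeric variable
counts  <- table(mtcars$cyl)                   # one-way table of counts
percent <- round(100 * counts / sum(counts))   # percentage in each category

table(mtcars$cyl, mtcars$gear)                 # two-way table
barplot(counts)                                # bar chart of the counts

summary(mtcars$mpg)                            # min, Q1, median, mean, Q3, max
by(mtcars$mpg, mtcars$cyl, mean)               # mean mpg within each category
boxplot(mtcars$mpg)                            # boxplot of the numeric variable
```

Dividing the table by its own total gives proportions, so multiplying by 100 before rounding gives whole-number percentages.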


library(aplpack) to obtain back to back stem plots
stem.leaf.backback(variable1,variable2)

Copying a graph from R Studio to Word
In the Files/Plots window use the Export pull down menu, and Copy to Clipboard. A dialogue box will open, click Copy Plot, and paste in your Word file.

Commands for both Categorical and Numerical Data
plot(dataset)                gives scatterplots for each pair of variables (both categorical and numeric)
plot(category)               gives a bar chart
plot(numeric)                gives an index plot
plot(numeric~category)       gives boxplots by category
boxplot(numeric~category)    gives boxplots by category

summary(category) will give you the number in each category.
summary(numeric) will give you the minimum, Q1, median, mean, Q3, maximum for a variable.
summary(dataset) will give you a numerical or categorical summary for variables within a dataset.

category must be stored as a factor for plot() and summary() to treat it as categorical; use factor(category) if it is stored as text.

Creating titles, axes labels
The following labelling options can be used in any graphing command:
main='main title'
xlab='x axis label'
ylab='y axis label'

windows() to create a bigger graph window (Windows only; dev.new() works on any system)

Customising your graphs
col='red'    yellow, green, blue, etc
cex=3        point size, 1 (default), 2, 3, etc
pch=16       plotting character (search for Quick-R for the full list)

Normal Distribution
qqnorm(variable) gives a normal quantile plot
qqline(variable) puts a line through the quantile plot to see if the values fit a normal distribution.

pnorm(upper,mean=,sd=,lower.tail=TRUE) for P(X<a)
pnorm(lower,mean=,sd=,lower.tail=FALSE) for P(X>a)
diff(pnorm(c(lower,upper),mean=,sd=,lower.tail=TRUE)) to find P(a<X<b)

qnorm(c(probability),mean=,sd=,lower.tail=TRUE) for P(X<?)=probability
qnorm(c(probability),mean=,sd=,lower.tail=FALSE) for P(X>?)=probability
qnorm(c(probability1,probability2),mean=,sd=,lower.tail=TRUE) for P(?<X<?)=central probability, where probability1 and probability2 are the areas below each limit (eg. 0.025 and 0.975 for a central probability of 0.95)

Sampling
sample(1:population size, sample size)

Experimental Design
library(lattice)
dotplot(numeric~category,data=dataset)
stripchart(numeric~category,data=dataset)
boxplot(numeric~category,data=dataset)

library(gplots)
plotmeans(numeric~category,data=dataset)

Regression
plot(response~explanatory,data=dataset)
cor(dataset)                 (all variables in the dataset must be numeric)
model=lm(response~explanatory,data=dataset)
coef(model)                  gives the coefficients for the regression equation
summary(model)
abline(model)

res=residuals(model)         gives the residuals for the model
plot(res~explanatory,data=dataset)
abline(0,0)
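
A worked sketch of the Normal Distribution commands, assuming a variable X that is normal with mean 100 and sd 15. These values, and the limits 85 and 115, are chosen only for illustration.

```r
# Assumed distribution for illustration: X ~ Normal(mean = 100, sd = 15)
m <- 100
s <- 15

p_below <- pnorm(115, mean = m, sd = s, lower.tail = TRUE)    # P(X < 115)
p_above <- pnorm(115, mean = m, sd = s, lower.tail = FALSE)   # P(X > 115)

# pnorm() on two values returns two cumulative probabilities;
# their difference is the probability between the limits
p_between <- diff(pnorm(c(85, 115), mean = m, sd = s))        # P(85 < X < 115)

# Value with 90% of the distribution below it
q90 <- qnorm(0.90, mean = m, sd = s, lower.tail = TRUE)

# Limits enclosing the central 95%: 2.5% is left in each tail
central <- qnorm(c(0.025, 0.975), mean = m, sd = s)
```

Because the normal curve is symmetric, the two values in central sit the same distance either side of the mean.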
A good graph should have…
• A meaningful title
• Labels for axes, with measurements or times
• A layout that is easy to read
• No 3-D effects if the data is only 2-dimensional
• The same scale as any graph it is compared with

Also check:
• Do the numbers "add up"?
• Does the picture "match" the data?
• Do measurements or times given on axes have the same interval width, or the same order?
• Do variables need definitions or a legend (key)?
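
To illustrate the checklist, here is a sketch of a regression scatterplot with a title, labelled axes with units, and the fitted line. It uses the built-in cars dataset (speed in mph, stopping distance in ft); the title and label wording is made up for this example.

```r
# cars is a built-in dataset: speed (mph) and stopping distance (ft)
model <- lm(dist ~ speed, data = cars)

plot(dist ~ speed, data = cars,
     main = 'Stopping distance against speed',   # meaningful title
     xlab = 'Speed (mph)',                       # axis labels with units
     ylab = 'Stopping distance (ft)',
     col = 'blue', pch = 16)                     # colour and plotting character
abline(model)                                    # add the regression line

coef(model)                                      # intercept and slope
```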
