Crash Course


Some small examples in R

Henrik Andersson December 13, 2005

Contents
1 Getting help
2 Entering data
  2.1 By hand
  2.2 Regular sequences
3 Spreadsheet data
  3.1 Delimited files
  3.2 Clipboard
4 Using data frames
  4.1 Sorting data frames
5 Interpolation
  5.1 Linear Interpolation
6 A graph with two y-axes
7 Nonlinear curve fitting
8 Multiple graphs
  8.1 Subplots
9 Using colors
10 Mixing of watermasses - Linear algebra

1 Getting help

To get help about a function, use ?function; e.g. ?plot brings up the help page for plot, the function used for scatter plots. If you don't know the name of the function you need, try help.search("useful phrase").

2 Entering data

There are several ways to get your data into R, including cut-and-paste from the clipboard. Importing large data sets is quite easy, and entering small data sets by hand is not too difficult either.

2.1 By hand

To create a vector, use the command c.

> n = c(1, 5, 7)
> n
[1] 1 5 7

Or use the command scan.
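A minimal sketch of the scan alternative. Called without arguments in an interactive session, scan() reads numbers typed at the console until an empty line is entered. To keep the example runnable as a script, the values are passed through the text argument instead; the variable name n2 is chosen here for illustration.

```r
# Read whitespace-separated numbers from a string instead of the console.
# quiet = TRUE suppresses the "Read 3 items" message.
n2 <- scan(text = "1 5 7", quiet = TRUE)
n2
```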

2.2 Regular sequences

Some examples of sequences:

> x = 1:10
> x
 [1]  1  2  3  4  5  6  7  8  9 10
> y = seq(0, 1, length = 11)
> y
 [1] 0.0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1.0
> F = rep("A", 2)
> F
[1] "A" "A"

Combining them:

> G = rep(c("A", "B"), 3)
> G
[1] "A" "B" "A" "B" "A" "B"
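Two variations that often come up alongside the examples above, shown as a short sketch: seq can take a step size (by) instead of a length, and rep can repeat each element in turn (each) rather than the whole vector. The variable names are chosen here for illustration.

```r
# Step size instead of number of points
quarters <- seq(0, 1, by = 0.25)           # 0.00 0.25 0.50 0.75 1.00

# Repeat each element three times, instead of the whole vector three times
blocks <- rep(c("A", "B"), each = 3)       # "A" "A" "A" "B" "B" "B"

# Different repeat counts per element
uneven <- rep(c("A", "B"), times = c(1, 2))  # "A" "B" "B"
```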

3 Spreadsheet data

3.1 Delimited files

To import data from spreadsheets (e.g. MS Excel), first save your file as comma-separated values (for Excel: File->Save as->CSV). To read a CSV file (data.csv) into an R data frame, use the following:

> data = read.csv("data.csv")
> data
  A      V  Q    N
1 1  Pelle  5 1.50
2 2  Kalle  6 0.90
3 3  Nisse  7 0.70
4 4    Eva 12 0.90
5 5 Gunnar 18 1.08
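Since data.csv itself is not provided, the following sketch writes a small CSV file to a temporary location and reads it back, so the round trip can be tried without a spreadsheet. The values mirror the table shown in this section; the names people and csv_path are hypothetical.

```r
# Build a small data frame with the same columns as the example table
people <- data.frame(
  A = 1:5,
  V = c("Pelle", "Kalle", "Nisse", "Eva", "Gunnar"),
  Q = c(5, 6, 7, 12, 18),
  N = c(1.50, 0.90, 0.70, 0.90, 1.08)
)

# Write it as CSV to a temporary file (row.names = FALSE avoids an
# extra unnamed column), then import it again with read.csv
csv_path <- tempfile(fileext = ".csv")
write.csv(people, csv_path, row.names = FALSE)
data <- read.csv(csv_path)

str(data)  # inspect column names and types after the import
```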

3.2 Clipboard

For smaller data sets it is also possible to read data from the clipboard, using read.delim(file = "clipboard").

4 Using data frames

In section 3 you imported data into a data frame. To create a subset of this data frame, select rows and columns according to this scheme: subset = data[rows, columns]. The expression for the rows can be either row numbers, e.g. 1:10, or a selection criterion, i.e. a logical condition on the columns.
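The selection scheme can be sketched as follows. The data frame is rebuilt inline so the example runs on its own; the names first_two, large_q and both are chosen here for illustration.

```r
data <- data.frame(
  V = c("Pelle", "Kalle", "Nisse", "Eva", "Gunnar"),
  Q = c(5, 6, 7, 12, 18),
  N = c(1.50, 0.90, 0.70, 0.90, 1.08)
)

# Rows by number, all columns (note the trailing comma)
first_two <- data[1:2, ]

# Rows by criterion, selected columns by name
large_q <- data[data$Q > 6, c("V", "N")]

# The subset() function expresses the same idea without repeating data$
both <- subset(data, Q > 6 & N < 1)
```

Leaving the row or column position empty selects everything along that dimension.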

4.1 Sorting data frames

To sort the following (generated) data with order, e.g. first on stn, then on pos and then on time:

> data <- expand.grid(stn = c(140, 300), pos = c(1, 2), time = c(2,
+     5), depth = c("shallow", "deep"))
> i <- order(data$stn, data$pos, data$time)
> data[i, ]
   stn pos time   depth
1  140   1    2 shallow
9  140   1    2    deep
5  140   1    5 shallow
13 140   1    5    deep
3  140   2    2 shallow
11 140   2    2    deep
7  140   2    5 shallow
15 140   2    5    deep
2  300   1    2 shallow
10 300   1    2    deep
6  300   1    5 shallow
14 300   1    5    deep
4  300   2    2 shallow
12 300   2    2    deep
8  300   2    5 shallow
16 300   2    5    deep
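A variation on the sort above, as a sketch: to sort one numeric key in descending order while keeping the others ascending, negate that key inside order. (Passing decreasing = TRUE would instead reverse all keys at once.) The name sorted is chosen here for illustration.

```r
data <- expand.grid(stn = c(140, 300), pos = c(1, 2), time = c(2, 5),
                    depth = c("shallow", "deep"))

# stn descending, then pos and time ascending
i <- order(-data$stn, data$pos, data$time)
sorted <- data[i, ]
head(sorted, 4)
```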

5 Interpolation

In contrast to curve fitting, interpolation only serves to derive data points that weren't measured, not to fit parameters of interest. Curve fitting typically fits one or a few curves to the whole data domain, while interpolation fits lines or curves between every pair of adjacent data points. Smoothing functions can also be applied; these are then based on more than two data points.

5.1 Linear Interpolation

The example uses approx on sediment trap data measured at eight times during a year and interpolated to daily values.

> x = seq(1, 365, length = 8)
> y = c(10, 15, 60, 40, 50, 30, 15, 10)
> xout = seq(1, 365, 1)
> yout = approx(x, y, xout)
> plot(y ~ x, pch = 16, cex = 1.5)
> lines(yout)
> youtc = approx(x, y, xout, method = "constant")
> lines(youtc, lty = 2)
> legend(max(x), max(y), lty = 1:2, legend = c("linear", "constant"),
+     xjust = 1)

[Figure: the eight measurements (points) with linear (solid line) and constant (dashed line) interpolation; x-axis: x, y-axis: y.]
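As a small check on what approx computes, the sketch below evaluates both interpolation methods at a single day. Day 27 is chosen here for illustration because it lies halfway between the first two sampling times (days 1 and 53); the names mid_linear, mid_constant and at_samples are not from the original.

```r
x <- seq(1, 365, length = 8)   # eight sampling times, 52 days apart
y <- c(10, 15, 60, 40, 50, 30, 15, 10)

# Halfway between (1, 10) and (53, 15): linear interpolation gives 12.5
mid_linear <- approx(x, y, xout = 27)$y

# method = "constant" holds the value of the point to the left: 10
mid_constant <- approx(x, y, xout = 27, method = "constant")$y

# At the sampling times themselves the measurements are reproduced
at_samples <- approx(x, y, xout = x)$y
```

This is the practical difference from curve fitting: an interpolant passes through every measured point, whereas a fitted curve generally does not.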

6 A graph with two y-axes

A figure with two y-axes is useful when two different quantities share the same x-axis, for instance the concentration and the isotopic composition of a chemical species. The vectors x, y1 and y2 are not defined in the listing; the axis labels suggest y1 = x and y2 = 1/x.

> par(mar = c(5.1, 4.1, 4.1, 4.1))
> plot(y1 ~ x, bty = "u", ylab = "y=0+1*x")
> par(new = TRUE)
> plot(y2 ~ x, bty = "u", ylab = "", yaxt = "n", pch = 16)
> axis(4)
> mtext("y=1/x", side = 4, line = 3)
> legend(5, 0.8, c("y=x", "y=1/x"), pch = c(1, 16))
> title(main = "A plot with two y-axes")

[Figure: "A plot with two y-axes". Left axis: y=0+1*x (open circles); right axis: y=1/x (filled circles); x-axis: x.]
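A self-contained sketch of the two-axis technique. The data are an assumption: the original does not show how x, y1 and y2 were created, so they are reconstructed here from the axis labels (y1 = x, y2 = 1/x) with x running from 1 to 10.

```r
# Assumed data, inferred from the axis labels of the figure
x  <- 1:10
y1 <- 0 + 1 * x
y2 <- 1 / x

par(mar = c(5.1, 4.1, 4.1, 4.1))           # widen the right margin for the second axis
plot(y1 ~ x, bty = "u", ylab = "y=0+1*x")  # first series, left axis
par(new = TRUE)                            # draw the next plot on top of the current one
plot(y2 ~ x, bty = "u", ylab = "", yaxt = "n", pch = 16)  # second series, no y-axis yet
axis(4)                                    # add its axis on the right-hand side
mtext("y=1/x", side = 4, line = 3)         # label the right axis
```

The key steps are par(new = TRUE), which stops the second plot call from clearing the figure, and yaxt = "n" followed by axis(4), which moves the second scale to the right.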

7 Nonlinear curve fitting

The example shows asymptotic growth, fitted with nls:

$$L(t) = L_\infty - (L_\infty - L_0)\, e^{-rt} \tag{1}$$

And this is the data:

> a = c(0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1)
> l = c(10, 21, 50, 70, 83, 90, 92, 95, 99, 100, 105)
> plot(l ~ a)

[Figure: scatter plot of l against a.]

> length.nl = nls(l ~ Linf - (Linf - L0) * exp(-r * a), start = list(Linf = 100,
+     L0 = 10, r = 5))
> summary(length.nl)

Formula: l ~ Linf - (Linf - L0) * exp(-r * a)

Parameters:
     Estimate Std. Error t value Pr(>|t|)
Linf 109.4566     5.0720  21.581 2.24e-08 ***
L0     4.1270     4.5993   0.897  0.39575
r      3.0251     0.4504   6.717  0.00015 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 5.127 on 8 degrees of freedom

Correlation of Parameter Estimates:
      Linf      L0
L0  0.3393
r  -0.8872 -0.5621

> plot(l ~ a, pch = 16, cex = 1.5)
> aa = seq(min(a), max(a), length = 100)
> lines(aa, predict(length.nl, list(a = aa)))

[Figure: the data points with the fitted curve; x-axis: a, y-axis: l.]
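To work with the fitted parameters directly, they can be extracted with coef. The sketch below refits the model and checks that evaluating equation (1) by hand gives the same value as predict. The helper growth, the evaluation point 0.5 and the names fit, est, manual and auto are introduced here for illustration.

```r
a <- c(0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1)
l <- c(10, 21, 50, 70, 83, 90, 92, 95, 99, 100, 105)

fit <- nls(l ~ Linf - (Linf - L0) * exp(-r * a),
           start = list(Linf = 100, L0 = 10, r = 5))

est <- coef(fit)   # named vector with elements Linf, L0 and r

# Equation (1) written as a plain function of time t and parameters p
growth <- function(t, p) {
  p[["Linf"]] - (p[["Linf"]] - p[["L0"]]) * exp(-p[["r"]] * t)
}

manual <- growth(0.5, est)              # evaluated by hand
auto   <- predict(fit, list(a = 0.5))   # evaluated by R
```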

8 Multiple graphs

Graphs are often grouped together in a single window or figure file, and there are two ways to do this.

8.1 Subplots

Use this when you need to group different kinds of graphs in a single file. If the graph is just conditioned on a factor use the method below. The objects nh4, z, heffa and f are assumed to exist already.

> par(mfrow = c(1, 2))
> plot(nh4, z, type = "b")
> boxplot(heffa ~ f)

[Figure: two panels side by side: z against nh4 drawn as connected points (left) and a boxplot of heffa by f (right).]
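A runnable sketch of the same layout. The data are entirely made up for illustration, since the original does not show how nh4, z, heffa and f were created; only the plotting calls follow the listing.

```r
set.seed(1)                                # reproducible random numbers

# Hypothetical data, invented for this example
z     <- seq(0, 20, by = 5)                # e.g. depths
nh4   <- c(50, 120, 300, 450, 600)         # e.g. concentrations
f     <- factor(rep(c("a", "b"), each = 10))
heffa <- rnorm(20, mean = ifelse(f == "a", 10, 12))

old_par <- par(mfrow = c(1, 2))            # one row, two columns of plots
plot(nh4, z, type = "b")                   # left panel
bp <- boxplot(heffa ~ f)                   # right panel; returns its statistics
par(old_par)                               # restore the previous layout
```

Saving the return value of par and restoring it afterwards keeps later plots from silently inheriting the two-panel layout.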

9 Using colors

If you make a graph intended for a poster or presentation, colors can be very helpful and also make your graphs look more interesting. You can specify colors by name, e.g. "green", or in hexadecimal notation. The latter is best generated by specialized functions such as rainbow.

> par(mfrow = c(2, 2))
> barplot(rep(c(1, 2), 5), col = rainbow(2))
> barplot(rnorm(5), col = heat.colors(5))
> barplot(1:10, col = rainbow(10))
> pie(1:10, col = terrain.colors(10))

[Figure: four panels: three bar plots colored with rainbow(2), heat.colors(5) and rainbow(10), and a pie chart colored with terrain.colors(10).]
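A short sketch of how color names and hexadecimal notation relate. col2rgb and rgb are standard R functions; the variable names green_hex and three are chosen here for illustration.

```r
# A color name resolves to red/green/blue components on a 0-255 scale
col2rgb("green")

# rgb() builds the hexadecimal string from components on a 0-1 scale
green_hex <- rgb(0, 1, 0)   # "#00FF00"

# Palette functions such as rainbow() return hexadecimal strings too
three <- rainbow(3)

# Names and hexadecimal strings can be mixed freely in col =
barplot(1:3, col = c("green", green_hex, three[3]))
```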

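10 Mixing of watermasses - Linear algebra

$$A x = b \tag{2}$$

$$\begin{pmatrix} 20 & 30 & 25 \\ 4 & 4 & 5 \\ 1 & 1 & 1 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} = \begin{pmatrix} 22 \\ 4.2 \\ 1 \end{pmatrix} \tag{3}$$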

> (A <- rbind(c(20, 30, 25), c(4, 4, 5), c(1, 1, 1)))
     [,1] [,2] [,3]
[1,]   20   30   25
[2,]    4    4    5
[3,]    1    1    1
> (b <- c(22, 4.2, 1))
[1] 22.0  4.2  1.0
> (x <- solve(A, b))
[1] 0.7 0.1 0.2

which means that the mixture consists of 70% of the first, 10% of the second and 20% of the third watermass.
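A quick way to check such a solution is to multiply it back, sketched below. The third row of A consists of ones, so the third equation states that the fractions sum to 1. The first two rows presumably hold two measured properties of the three watermasses; this is an interpretation, as the text does not name them. The name back is chosen here for illustration.

```r
A <- rbind(c(20, 30, 25), c(4, 4, 5), c(1, 1, 1))
b <- c(22, 4.2, 1)
x <- solve(A, b)

# Multiplying back should reproduce b up to rounding error
back <- as.vector(A %*% x)
all.equal(back, b)

# The mixing fractions sum to one (third equation)
sum(x)
```

Comparing with all.equal rather than == avoids false alarms from floating-point rounding.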

Index
approx, c, nls, order, scan
