Package 'ggplot2', October 25, 2018
LazyData true
Collate 'ggproto.r' 'ggplot-global.R' 'aaa-.r' 'aes-calculated.r'
'aes-colour-fill-alpha.r' 'aes-group-order.r'
'aes-linetype-size-shape.r' 'aes-position.r' 'utilities.r'
'aes.r' 'legend-draw.r' 'geom-.r' 'annotation-custom.r'
'annotation-logticks.r' 'geom-polygon.r' 'geom-map.r'
'annotation-map.r' 'geom-raster.r' 'annotation-raster.r'
'annotation.r' 'autolayer.r' 'autoplot.r' 'axis-secondary.R'
'backports.R' 'bench.r' 'bin.R' 'compat-quosures.R' 'coord-.r'
'coord-cartesian-.r' 'coord-fixed.r' 'coord-flip.r'
'coord-map.r' 'coord-munch.r' 'coord-polar.r'
'coord-quickmap.R' 'coord-transform.r' 'data.R' 'facet-.r'
'facet-grid-.r' 'facet-null.r' 'facet-wrap.r' 'fortify-lm.r'
'fortify-map.r' 'fortify-multcomp.r' 'fortify-spatial.r'
R topics documented:
+.gg
aes
aes_
aes_colour_fill_alpha
aes_group_order
aes_linetype_size_shape
aes_position
annotate
annotation_custom
annotation_logticks
annotation_map
annotation_raster
autolayer
autoplot
borders
coord_cartesian
coord_fixed
coord_flip
coord_map
coord_polar
coord_trans
cut_interval
diamonds
economics
expand_limits
expand_scale
facet_grid
facet_wrap
faithfuld
fortify
geom_abline
geom_bar
geom_bin2d
geom_blank
geom_boxplot
geom_contour
geom_count
geom_crossbar
geom_density
geom_density_2d
geom_dotplot
geom_errorbarh
geom_freqpoly
geom_hex
geom_jitter
geom_label
geom_map
geom_path
geom_point
geom_polygon
geom_qq_line
geom_quantile
geom_raster
geom_ribbon
geom_rug
geom_segment
geom_smooth
geom_spoke
geom_violin
ggplot
ggproto
ggsave
ggsf
ggtheme
guides
guide_colourbar
guide_legend
hmisc
labeller
labellers
label_bquote
labs
lims
luv_colours
margin
mean_se
midwest
mpg
msleep
position_dodge
position_identity
position_jitter
position_jitterdodge
position_nudge
position_stack
presidential
print.ggplot
print.ggproto
qplot
resolution
scale_alpha
scale_colour_brewer
scale_colour_continuous
scale_colour_gradient
scale_colour_grey
scale_colour_hue
scale_colour_viridis_d
scale_continuous
scale_date
scale_identity
scale_linetype
scale_manual
scale_shape
scale_size
scale_x_discrete
seals
sec_axis
stat
stat_ecdf
stat_ellipse
stat_function
stat_identity
stat_sf_coordinates
stat_summary_2d
stat_summary_bin
stat_unique
summarise_plot
theme
theme_get
txhousing
vars
Index
+ is the key to constructing sophisticated ggplot2 graphics. It allows you to start simple, then get
more and more complex, checking your work at each step.
e1 %+% e2
To replace the current default data frame, you must use %+%, due to S3 method precedence issues.
You can also supply a list, in which case each element of the list will be added in turn.
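As a minimal sketch of the points above, the plot below is built up one component at a time, a list of components is added in one step, and %+% swaps in a new default data frame. The mtcars columns and the `heavy` subset are illustrative choices only, not part of the original examples.

```r
library(ggplot2)

# Start simple, then add components one at a time.
base <- ggplot(mtcars, aes(wt, mpg))
p <- base + geom_point()
p <- p + labs(x = "Weight (1000 lbs)", y = "Miles per gallon")

# A list is added element by element, in order.
extras <- list(geom_smooth(method = "lm", se = FALSE), theme_minimal())
p_full <- p + extras

# %+% replaces the default data frame while keeping layers and mappings.
heavy <- mtcars[mtcars$wt > 3, ]  # illustrative subset
p_heavy <- p_full %+% heavy
```

Printing `p`, `p_full` and `p_heavy` in turn shows the effect of each step.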
Aesthetic mappings describe how variables in the data are mapped to visual properties (aesthetics)
of geoms. Aesthetic mappings can be set in ggplot() and in individual layers.
aes(x, y, ...)
x, y, ... List of name value pairs giving aesthetics to map to variables. The names for
x and y aesthetics are typically omitted because they are so common; all other
aesthetics must be named.
This function also standardises aesthetic names by converting color to colour (also in substrings,
e.g. point_color to point_colour) and translating old style R names to ggplot names (e.g. pch to
shape, cex to size).
A list with class uneval. Components of the list are either quosures or constants.
aes() is a quoting function. This means that its inputs are quoted to be evaluated in the context of
the data. This makes it easy to work with variables from the data frame because you can name those
directly. The flip side is that you have to use quasiquotation to program with aes(). See a tidy
evaluation tutorial such as the dplyr programming vignette to learn more about these techniques.
See Also
vars() for another quoting function designed for faceting specifications.
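aes(x = mpg, y = wt)
aes(mpg, wt)
# Aesthetics can also be mapped to constants
aes(x = 1, colour = "smooth")
# A wrapper with named arguments captures them with enquo() and
# unquotes them inside aes() with !!
scatter_by <- function(data, x, y) {
  x <- enquo(x)
  y <- enquo(y)
  ggplot(data) + geom_point(aes(!!x, !!y))
}
# Note that users of your wrapper can use their own functions in the
# quoted expressions and all will resolve as it should!
cut3 <- function(x) cut_number(x, 3)
scatter_by(mtcars, cut3(disp), drat)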
Aesthetic mappings describe how variables in the data are mapped to visual properties (aesthetics)
of geoms. aes() uses non-standard evaluation to capture the variable names. aes_ and aes_string
require you to explicitly quote the inputs either with "" for aes_string(), or with quote or ~ for
aes_(). (aes_q is an alias to aes_). This makes aes_ and aes_string easy to program with.
aes_(x, y, ...)
aes_string(x, y, ...)
aes_q(x, y, ...)
x, y, ... List of name value pairs. Elements must be either quoted calls, strings,
one-sided formulas or constants.
aes_string and aes_ are particularly useful when writing functions that create plots because you
can use strings or quoted names/calls to define the aesthetic mappings, rather than having to use
substitute() to generate a call to aes().
I recommend using aes_(), because creating the equivalents of aes(colour = "my colour") or
aes(x = `X$1`) with aes_string() is quite clunky.
Life cycle
All these functions are soft-deprecated. Please use tidy evaluation idioms instead (see the
quasiquotation section in aes() documentation).
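The sketch below shows one possible tidy evaluation replacement for aes_string(), assuming the column names arrive as character strings. The helper name `scatter_chr` is hypothetical; rlang::sym() turns each string into a symbol and !! unquotes it inside aes().

```r
library(ggplot2)

# Hypothetical helper: takes column names as strings, as aes_string() did.
scatter_chr <- function(data, x_name, y_name) {
  x <- rlang::sym(x_name)  # "wt" becomes the symbol wt
  y <- rlang::sym(y_name)
  ggplot(data, aes(!!x, !!y)) + geom_point()
}

p_chr <- scatter_chr(mtcars, "wt", "mpg")
```

The resulting mapping is the same as the one produced by aes(wt, mpg), so the plot behaves like any other ggplot object.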
This page demonstrates the usage of a sub-group of aesthetics: colour, fill and alpha.
Aesthetics: grouping
# By default, the group is set to the interaction of all discrete variables in the
# plot. This often partitions the data correctly, but when it does not, or when
# no discrete variable is used in the plot, you will need to explicitly define the
# grouping structure, by mapping group to a variable that has a different value
# for each group.
# For most applications you can simply specify the grouping with
# various aesthetics (colour, shape, fill, linetype) or with facets.
# Using fill
a <- ggplot(mtcars, aes(factor(cyl)))
a + geom_bar()
a + geom_bar(aes(fill = factor(cyl)))
a + geom_bar(aes(fill = factor(vs)))
# Using linetypes
rescale01 <- function(x) (x - min(x)) / diff(range(x))
ec_scaled <- data.frame(
date = economics$date,
plyr::colwise(rescale01)(economics[, -(1:2)]))
ecm <- reshape2::melt(ec_scaled, id.vars = "date")
f <- ggplot(ecm, aes(date, value))
f + geom_line(aes(linetype = variable))
# Using facets
k <- ggplot(diamonds, aes(carat, stat(density))) + geom_histogram(binwidth = 0.2)
k + facet_grid(. ~ cut)
# There are three common cases where the default is not enough, and we
# will consider each one below. In the following examples, we will use a simple
# longitudinal dataset, Oxboys, from the nlme package. It records the heights
# (height) and centered ages (age) of 26 boys (Subject), measured on nine
# occasions (Occasion).
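One such case, several individuals measured repeatedly, can be sketched as follows. This assumes the nlme package is installed; mapping `group = Subject` asks for one line per boy rather than a single line through all observations.

```r
library(ggplot2)

# Drop the groupedData class and keep a plain data frame.
oxboys <- as.data.frame(nlme::Oxboys)
h <- ggplot(oxboys, aes(age, height))

# No discrete variable is mapped, so every observation falls into one group
# and a single path is drawn through all of them.
single_line <- h + geom_line()

# One line per boy.
per_boy <- h + geom_line(aes(group = Subject))
```

Comparing `single_line` with `per_boy` shows why an explicit group is needed when no discrete variable is otherwise mapped.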
Differentiation related aesthetics: linetype, size, shape
This page demonstrates the usage of a sub-group of aesthetics; linetype, size and shape.
# Line types should be specified with either an integer, a name, or with a string of
# an even number (up to eight) of hexadecimal digits which give the lengths in
# consecutive positions in the string.
# 0 = blank, 1 = solid, 2 = dashed, 3 = dotted, 4 = dotdash, 5 = longdash, 6 = twodash
# Data
df <- data.frame(x = 1:10 , y = 1:10)
f <- ggplot(df, aes(x, y))
f + geom_line(linetype = 2)
f + geom_line(linetype = "dotdash")
# An example with hex strings, the string "33" specifies three units on followed
# by three off and "3313" specifies three units on followed by three off followed
# by one on and finally three off.
f + geom_line(linetype = "3313")
# Size examples
# Should be specified with a numerical value (in millimetres),
# or from a variable source
p <- ggplot(mtcars, aes(wt, mpg))
p + geom_point(size = 4)
p + geom_point(aes(size = qsec))
p + geom_point(size = 2.5) +
geom_hline(yintercept = 25, size = 3.5)
# Shape examples
# Shape takes four types of values: an integer in [0, 25],
# a single character-- which uses that character as the plotting symbol,
# a . to draw the smallest rectangle that is visible (i.e., about one pixel)
# an NA to draw nothing
p + geom_point()
p + geom_point(shape = 5)
p + geom_point(shape = "k", size = 3)
p + geom_point(shape = ".")
p + geom_point(shape = NA)
aes_position Position related aesthetics: x, y, xmin, xmax, ymin, ymax, xend, yend
This page demonstrates the usage of a sub-group of aesthetics; x, y, xmin, xmax, ymin, ymax, xend,
and yend.
# Using annotate
p <- ggplot(mtcars, aes(wt, mpg)) + geom_point()
p + annotate("rect", xmin = 2, xmax = 3.5, ymin = 2, ymax = 25,
fill = "dark grey", alpha = .5)
# Geom_segment examples
p + geom_segment(aes(x = 2, y = 15, xend = 2, yend = 25),
arrow = arrow(length = unit(0.5, "cm")))
p + geom_segment(aes(x = 2, y = 15, xend = 3, yend = 15),
arrow = arrow(length = unit(0.5, "cm")))
p + geom_segment(aes(x = 5, y = 30, xend = 3.5, yend = 25),
arrow = arrow(length = unit(0.5, "cm")))
This function adds geoms to a plot, but unlike a typical geom function, the properties of the geoms
are not mapped from variables of a data frame, but are instead passed in as vectors. This is useful
for adding small annotations (such as text labels) or if you have your data in vectors, and for some
reason don’t want to put them in a data frame.
annotate(geom, x = NULL, y = NULL, xmin = NULL, xmax = NULL,
ymin = NULL, ymax = NULL, xend = NULL, yend = NULL, ...,
na.rm = FALSE)
geom name of geom to use for annotation
x, y, xmin, ymin, xmax, ymax, xend, yend
positioning aesthetics - you must specify at least one of these.
... Other arguments passed on to layer(). These are often aesthetics, used to set
an aesthetic to a fixed value, like colour = "red" or size = 3. They may also
be parameters to the paired geom/stat.
na.rm If FALSE, the default, missing values are removed with a warning. If TRUE,
missing values are silently removed.
Note that all position aesthetics are scaled (i.e. they will expand the limits of the plot so they are
visible), but all other aesthetics are set. This means that layers created with this function will never
affect the legend.
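A small sketch of that distinction, using an arbitrarily chosen annotation position: the text at x = 8 lies outside the range of wt, so the x scale expands to include it, while the fixed colour produces no legend. Reading the limits back with layer_scales() is a choice made for this illustration.

```r
library(ggplot2)

p <- ggplot(mtcars, aes(wt, mpg)) + geom_point()

# x = 8 is beyond max(mtcars$wt); the position scale is trained on it.
p_note <- p + annotate("text", x = 8, y = 25, label = "far right", colour = "red")

# Limits of the x scale after the annotation has been added.
x_limits <- layer_scales(p_note)$x$get_limits()
```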
p <- ggplot(mtcars, aes(x = wt, y = mpg)) + geom_point()
p + annotate("text", x = 4, y = 25, label = "Some text")
p + annotate("text", x = 2:5, y = 25, label = "Some text")
p + annotate("rect", xmin = 3, xmax = 4.2, ymin = 12, ymax = 21,
alpha = .2)
p + annotate("segment", x = 2.5, xend = 4, y = 15, yend = 25,
colour = "blue")
p + annotate("pointrange", x = 3.5, y = 20, ymin = 12, ymax = 28,
colour = "red", size = 1.5)
p + annotate("text", x = 4, y = 25, label = "italic(R) ^ 2 == 0.75",
parse = TRUE)
p + annotate("text", x = 4, y = 25,
label = "paste(italic(R) ^ 2, \" = .75\")", parse = TRUE)
This is a special geom intended for use as static annotations that are the same in every panel. These
annotations will not affect scales (i.e. the x and y axes will not grow to cover the range of the grob,
and the grob will not be modified by any ggplot settings or mappings).
annotation_custom(grob, xmin = -Inf, xmax = Inf, ymin = -Inf,
ymax = Inf)
grob grob to display
xmin, xmax x location (in data coordinates) giving horizontal location of the grob
ymin, ymax y location (in data coordinates) giving vertical location of the grob
Most useful for adding tables, inset plots, and other grid-based decorations.
annotation_custom expects the grob to fill the entire viewport defined by xmin, xmax, ymin,
ymax. Grobs with a different (absolute) size will be center-justified in that region. Inf values can be
used to fill the full plot panel (see examples).
# Dummy plot
df <- data.frame(x = 1:10, y = 1:10)
base <- ggplot(df, aes(x, y)) +
geom_blank()
# Inset plot
df2 <- data.frame(x = 1 , y = 1)
g <- ggplotGrob(ggplot(df2, aes(x, y)) +
geom_point() +
theme(plot.background = element_rect(colour = "black")))
base +
annotation_custom(grob = g, xmin = 1, xmax = 10, ymin = 8, ymax = 10)
This annotation adds log tick marks with diminishing spacing. These tick marks probably make
sense only for base 10.
annotation_logticks(base = 10, sides = "bl", scaled = TRUE,
short = unit(0.1, "cm"), mid = unit(0.2, "cm"), long = unit(0.3,
"cm"), colour = "black", size = 0.5, linetype = 1, alpha = 1,
color = NULL, ...)
base the base of the log (default 10)
sides a string that controls which sides of the plot the log ticks appear on. It can be set
to a string containing any of "trbl", for top, right, bottom, and left.
scaled is the data already log-scaled? This should be TRUE (default) when the data is
already transformed with log10() or when using scale_y_log10. It should be
FALSE when using coord_trans(y = "log10").
short a grid::unit() object specifying the length of the short tick marks
mid a grid::unit() object specifying the length of the middle tick marks. In base
10, these are the "5" ticks.
long a grid::unit() object specifying the length of the long tick marks. In base 10,
these are the "1" (or "10") ticks.
colour Colour of the tick marks.
size Thickness of tick marks, in mm.
linetype Linetype of tick marks (solid, dashed, etc.)
alpha The transparency of the tick marks.
color An alias for colour.
... Other parameters passed on to the layer
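The choice of `scaled` can be sketched with two versions of the same plot. The object names `p_scale` and `p_coord` are illustrative; the first transforms the data with log scales, the second leaves the data in original units and transforms the coordinate system instead.

```r
library(ggplot2)

base <- ggplot(msleep, aes(bodywt, brainwt)) + geom_point(na.rm = TRUE)

# Log scales: the layer data are already in log10 units, so scaled = TRUE.
p_scale <- base + scale_x_log10() + scale_y_log10() +
  annotation_logticks(scaled = TRUE)

# Transformed coordinates: the data stay in original units, so scaled = FALSE.
p_coord <- base + coord_trans(x = "log10", y = "log10") +
  annotation_logticks(scaled = FALSE)
```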
# Make a log-log plot (without log ticks)
a <- ggplot(msleep, aes(bodywt, brainwt)) +
  geom_point(na.rm = TRUE) +
  scale_x_log10(
    breaks = scales::trans_breaks("log10", function(x) 10^x),
    labels = scales::trans_format("log10", scales::math_format(10^.x))
  ) +
  scale_y_log10(
    breaks = scales::trans_breaks("log10", function(x) 10^x),
    labels = scales::trans_format("log10", scales::math_format(10^.x))
  ) +
  theme_bw()
# Hide the minor grid lines because they don't align with the ticks
a + annotation_logticks(sides = "trbl") + theme(panel.grid.minor = element_blank())
# Another way to get the same results as 'a' above: log-transform the data before
# plotting it. Also hide the minor grid lines.
b <- ggplot(msleep, aes(log10(bodywt), log10(brainwt))) +
geom_point(na.rm = TRUE) +
scale_x_continuous(name = "body", labels = scales::math_format(10^.x)) +
scale_y_continuous(name = "brain", labels = scales::math_format(10^.x)) +
theme_bw() + theme(panel.grid.minor = element_blank())
b + annotation_logticks()
Display a fixed map on a plot.
annotation_map(map, ...)
map data frame representing a map. Most map objects can be converted into the right
format by using fortify()
... other arguments used to modify aesthetics
if (require("maps")) {
  usamap <- map_data("state")
  seal.sub <- subset(seals, long > -130 & lat < 45 & lat > 40)
  ggplot(seal.sub, aes(x = long, y = lat)) +
    annotation_map(usamap, fill = NA, colour = "grey50") +
    geom_segment(aes(xend = long + delta_long, yend = lat + delta_lat))
}
This is a special version of geom_raster() optimised for static annotations that are the same in
every panel. These annotations will not affect scales (i.e. the x and y axes will not grow to cover
the range of the raster, and the raster must already have its own colours). This is useful for adding
bitmap images.
annotation_raster(raster, xmin, xmax, ymin, ymax, interpolate = FALSE)
raster raster object to display
xmin, xmax x location (in data coordinates) giving horizontal location of raster
ymin, ymax y location (in data coordinates) giving vertical location of raster
interpolate If TRUE interpolate linearly, if FALSE (the default) don’t interpolate.
# Generate data
rainbow <- matrix(hcl(seq(0, 360, length.out = 50 * 50), 80, 70), nrow = 50)
ggplot(mtcars, aes(mpg, wt)) +
geom_point() +
annotation_raster(rainbow, 15, 20, 3, 4)
# To fill up whole plot
ggplot(mtcars, aes(mpg, wt)) +
  annotation_raster(rainbow, -Inf, Inf, -Inf, Inf) +
  geom_point()
autolayer uses ggplot2 to draw a particular layer for an object of a particular class in a single
command. This defines the S3 generic that other classes and packages can extend.
autolayer(object, ...)
object an object, whose class will determine the behaviour of autolayer
... other arguments passed to specific methods
a ggplot layer
See Also
autoplot(), ggplot() and fortify()
autoplot uses ggplot2 to draw a particular plot for an object of a particular class in a single
command. This defines the S3 generic that other classes and packages can extend.
autoplot(object, ...)
object an object, whose class will determine the behaviour of autoplot
... other arguments passed to specific methods
a ggplot object
See Also
autolayer(), ggplot() and fortify()
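To illustrate how a class might extend these generics, here is a minimal sketch for a hypothetical S3 class `walk`. The class, its constructor `new_walk` and both methods are invented for this example. Inside a package the methods would also need to be registered as S3 methods.

```r
library(ggplot2)

# Hypothetical class holding a series of steps.
new_walk <- function(steps) structure(list(steps = steps), class = "walk")

# autolayer() method: returns a single layer for the object.
autolayer.walk <- function(object, ...) {
  df <- data.frame(t = seq_along(object$steps), pos = cumsum(object$steps))
  geom_line(aes(t, pos), data = df, ...)
}

# autoplot() method: returns a complete plot, reusing the layer above.
autoplot.walk <- function(object, ...) {
  ggplot() + autolayer(object, ...) + labs(x = "step", y = "position")
}

w <- new_walk(c(1, -1, 1, 1, -1, 1))
p_walk <- autoplot(w)
```

Splitting the work this way lets users either call autoplot(w) for a finished plot or add autolayer(w) to a plot of their own.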
This is a quick and dirty way to get map data (from the maps package) on to your plot. This is
a good place to start if you need some crude reference lines, but you’ll typically want something
more sophisticated for communication graphics.
borders(database = "world", regions = ".", fill = NA,
colour = "grey50", xlim = NULL, ylim = NULL, ...)
database map data, see maps::map() for details
regions map region
fill fill colour
colour border colour
xlim, ylim longitudinal and latitudinal ranges for extracting map polygons, see maps::map()
for details.
... Arguments passed on to geom_polygon
mapping Set of aesthetic mappings created by aes() or aes_(). If specified
and inherit.aes = TRUE (the default), it is combined with the default
mapping at the top level of the plot. You must supply mapping if there is
no plot mapping.
data The data to be displayed in this layer. There are three options:
If NULL, the default, the data is inherited from the plot data as specified in
the call to ggplot().
A data.frame, or other object, will override the plot data. All objects will
be fortified to produce a data frame. See fortify() for which variables
will be created.
A function will be called with a single argument, the plot data. The return
value must be a data.frame, and will be used as the layer data.
stat The statistical transformation to use on the data for this layer, as a string.
position Position adjustment, either as a string, or the result of a call to a
position adjustment function.
show.legend logical. Should this layer be included in the legends? NA, the
default, includes if any aesthetics are mapped. FALSE never includes, and
TRUE always includes. It can also be a named logical vector to finely select
the aesthetics to display.
inherit.aes If FALSE, overrides the default aesthetics, rather than combining
with them. This is most useful for helper functions that define both data
and aesthetics and shouldn’t inherit behaviour from the default plot
specification, e.g. borders().
na.rm If FALSE, the default, missing values are removed with a warning. If
TRUE, missing values are silently removed.
if (require("maps")) ggplot() + borders("state")
The Cartesian coordinate system is the most familiar, and common, type of coordinate system.
Setting limits on the coordinate system will zoom the plot (like you’re looking at it with a magnifying
glass), and will not change the underlying data like setting limits on a scale will.
coord_cartesian(xlim = NULL, ylim = NULL, expand = TRUE,
default = FALSE, clip = "on")
xlim, ylim Limits for the x and y axes.
expand If TRUE, the default, adds a small expansion factor to the limits to ensure that
data and axes don’t overlap. If FALSE, limits are taken exactly from the data or
xlim/ylim.
default Is this the default coordinate system? If FALSE (the default), then replacing this
coordinate system with another one creates a message alerting the user that the
coordinate system is being replaced. If TRUE, that warning is suppressed.
clip Should drawing be clipped to the extent of the plot panel? A setting of "on" (the
default) means yes, and a setting of "off" means no. In most cases, the default
of "on" should not be changed, as setting clip = "off" can cause unexpected
results. It allows drawing of data points anywhere on the plot, including in
the plot margins. If limits are set via xlim and ylim and some data points fall
outside those limits, then those data points may show up in places such as the
axes, the legend, the plot title, or the plot margins.
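The difference between limiting a scale and limiting the coordinate system can be checked on the built layer data. This is a sketch under the assumption that counting non-missing x values in layer_data() is an adequate proxy for "data kept"; the object names are illustrative.

```r
library(ggplot2)

p <- ggplot(mtcars, aes(disp, wt)) + geom_point()

by_scale <- p + scale_x_continuous(limits = c(325, 500))
by_coord <- p + coord_cartesian(xlim = c(325, 500))

# Scale limits discard out-of-range x values before drawing.
n_scale <- sum(!is.na(layer_data(by_scale)$x))

# A coordinate zoom leaves the underlying data untouched.
n_coord <- sum(!is.na(layer_data(by_coord)$x))
```

Here `n_coord` equals the number of rows in mtcars, while `n_scale` counts only the cars whose disp lies within the limits. This matters for layers such as geom_smooth(), which are then fitted to different data.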
# There are two ways of zooming the plot display: with scales or
# with coordinate systems. They work in two rather different ways.
p <- ggplot(mtcars, aes(disp, wt)) +
  geom_point() +
  geom_smooth()
# Setting the limits on a scale converts all values outside the range to NA.
p + scale_x_continuous(limits = c(325, 500))
# We can also use expand = FALSE to turn off expansion with the
# default limits
p + coord_cartesian(expand = FALSE)
# The same difference shows up with a 2d histogram
d <- ggplot(diamonds, aes(carat, price)) +
  stat_bin2d(bins = 25, colour = "white")
# When zooming the scale, we get 25 new bins that are the same
# size on the plot, but represent smaller regions of the data space
d + scale_x_continuous(limits = c(0, 1))
A fixed scale coordinate system forces a specified ratio between the physical representation of data
units on the axes. The ratio represents the number of units on the y-axis equivalent to one unit on
the x-axis. The default, ratio = 1, ensures that one unit on the x-axis is the same length as one
unit on the y-axis. Ratios higher than one make units on the y axis longer than units on the x-axis,
and vice versa. This is similar to MASS::eqscplot(), but it works for all types of graphics.
coord_fixed(ratio = 1, xlim = NULL, ylim = NULL, expand = TRUE,
clip = "on")
ratio aspect ratio, expressed as y / x
xlim, ylim Limits for the x and y axes.
expand If TRUE, the default, adds a small expansion factor to the limits to ensure that
data and axes don’t overlap. If FALSE, limits are taken exactly from the data or
xlim/ylim.
clip Should drawing be clipped to the extent of the plot panel? A setting of "on" (the
default) means yes, and a setting of "off" means no. In most cases, the default
of "on" should not be changed, as setting clip = "off" can cause unexpected
results. It allows drawing of data points anywhere on the plot, including in
the plot margins. If limits are set via xlim and ylim and some data points fall
outside those limits, then those data points may show up in places such as the
axes, the legend, the plot title, or the plot margins.
# ensures that the ranges of axes are equal to the specified ratio by
# adjusting the plot aspect ratio
# Resize the plot to see that the specified aspect ratio is maintained
Flip Cartesian coordinates so that horizontal becomes vertical, and vertical, horizontal. This is primarily
useful for converting geoms and statistics which display y conditional on x, to x conditional
on y.
coord_flip(xlim = NULL, ylim = NULL, expand = TRUE, clip = "on")
# Very useful for creating boxplots, and other interval
# geoms in the horizontal instead of vertical position.
coord_map projects a portion of the earth, which is approximately spherical, onto a flat 2D plane
using any projection defined by the mapproj package. Map projections do not, in general, preserve
straight lines, so this requires considerable computation. coord_quickmap is a quick approximation
that does preserve straight lines. It works best for smaller areas closer to the equator.
coord_map(projection = "mercator", ..., parameters = NULL,
orientation = NULL, xlim = NULL, ylim = NULL, clip = "on")
projection projection to use, see mapproj::mapproject() for list
..., parameters
Other arguments passed on to mapproj::mapproject(). Use ... for named
parameters to the projection, and parameters for unnamed parameters. ... is
ignored if the parameters argument is present.
orientation projection orientation, which defaults to c(90, 0, mean(range(x))). This
is not optimal for many projections, so you will have to supply your own. See
mapproj::mapproject() for more information.
xlim, ylim Manually specify x/y limits (in degrees of longitude/latitude)
clip Should drawing be clipped to the extent of the plot panel? A setting of "on"
(the default) means yes, and a setting of "off" means no. For details, please see
coord_cartesian().
expand If TRUE, the default, adds a small expansion factor to the limits to ensure that
data and axes don’t overlap. If FALSE, limits are taken exactly from the data or xlim/ylim.
In general, map projections must account for the fact that the actual length (in km) of one degree of
longitude varies between the equator and the pole. Near the equator, the ratio between the lengths
of one degree of latitude and one degree of longitude is approximately 1. Near the pole, it tends
towards infinity because the length of one degree of longitude tends towards 0. For regions that span
only a few degrees and are not too close to the poles, setting the aspect ratio of the plot to the appropriate
lat/lon ratio approximates the usual Mercator projection. This is what coord_quickmap does,
and it is much faster (particularly for complex plots like geom_tile()) at the expense of correctness.
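The behaviour of the lat/lon ratio described above can be checked with a short base R calculation. This is a sketch that assumes a perfectly spherical earth, where one degree of longitude shrinks by the cosine of the latitude. The helper latlon_ratio is hypothetical, and coord_quickmap() itself may derive its ratio differently.

```r
# Hypothetical helper (not a ggplot2 function): ratio between the length of one
# degree of latitude and one degree of longitude on a sphere.
latlon_ratio <- function(lat_deg) {
  1 / cos(lat_deg * pi / 180)
}

latlon_ratio(0)    # 1 at the equator
latlon_ratio(-41)  # about 1.3 at a latitude typical of New Zealand
latlon_ratio(89)   # about 57 close to the pole
```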
if (require("maps")) {
  nz <- map_data("nz")
  # Prepare a map of NZ
  nzmap <- ggplot(nz, aes(x = long, y = lat, group = group)) +
    geom_polygon(fill = "white", colour = "black")
  nzmap + coord_quickmap()
  # Other projections
  nzmap + coord_map("cylindrical")
  nzmap + coord_map("azequalarea", orientation = c(-36.92, 174.6, 0))
  nzmap + coord_map("lambert", parameters = c(-37, -44))
}
The polar coordinate system is most commonly used for pie charts, which are a stacked bar chart in
polar coordinates.
coord_polar(theta = "x", start = 0, direction = 1, clip = "on")
theta variable to map angle to (x or y)
start offset of starting point from 12 o’clock in radians
direction 1, clockwise; -1, anticlockwise
clip Should drawing be clipped to the extent of the plot panel? A setting of "on"
(the default) means yes, and a setting of "off" means no. For details, please see
coord_cartesian().
# NOTE: Use these plots with caution - polar coordinates have
# major perceptual problems. The main point of these examples is
# to demonstrate how these common plots can be described in the
# grammar. Use with EXTREME caution.
# Wind rose
doh + geom_bar(width = 1) + coord_polar()
# Race track plot
doh + geom_bar(width = 0.9, position = "fill") + coord_polar(theta = "y")
coord_trans is different to scale transformations in that it occurs after statistical transformation and
will affect the visual appearance of geoms - there is no guarantee that straight lines will continue to
be straight.
coord_trans(x = "identity", y = "identity", limx = NULL,
limy = NULL, clip = "on", xtrans, ytrans)
x, y transformers for x and y axes
limx, limy limits for x and y axes. (Named so for backward compatibility)
clip Should drawing be clipped to the extent of the plot panel? A setting of "on"
(the default) means yes, and a setting of "off" means no. For details, please see
coord_cartesian().
xtrans, ytrans Deprecated; use x and y instead.
Transformations only work with continuous values: see scales::trans_new() for a list of transformations,
and instructions on how to create your own.
# cf.
ggplot(diamonds, aes(carat, price)) +
geom_point() +
geom_smooth(method = "lm")
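The ordering difference between a scale transformation and coord_trans can be illustrated in base R without drawing anything. In this simplified sketch, mean() stands in for whatever statistic a layer computes; the variable names are made up for the illustration.

```r
x <- c(1, 10, 100, 1000)

# Scale-style: the data are transformed first, then the statistic is computed
stat_on_transformed <- mean(log10(x))   # 1.5

# Coord-style: the statistic is computed on the raw data, then transformed
transformed_stat <- log10(mean(x))      # about 2.44
```

The two results differ, which is why a smoother or summary drawn under coord_trans generally does not match the one obtained with a transformed scale.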
cut_interval makes n groups with equal range, cut_number makes n groups with (approximately)
equal numbers of observations; cut_width makes groups of width width.
cut_interval(x, n = NULL, length = NULL, ...)
x numeric vector
n number of intervals to create, OR
length length of each interval
... Arguments passed on to base::cut.default
breaks either a numeric vector of two or more unique cut points or a single
number (greater than or equal to 2) giving the number of intervals into
which x is to be cut.
labels labels for the levels of the resulting category. By default, labels are constructed
using "(a,b]" interval notation. If labels = FALSE, simple integer
codes are returned instead of a factor.
right logical, indicating if the intervals should be closed on the right (and open
on the left) or vice versa.
dig.lab integer which is used when labels are not given. It determines the number
of digits used in formatting the break numbers.
ordered_result logical: should the result be an ordered factor?
width The bin width.
center, boundary
Specify either the position of edge or the center of a bin. Since all bins are
aligned, specifying the position of a single bin (which doesn’t need to be in the
range of the data) affects the location of all bins. If not specified, uses the "tile
layers algorithm", and sets the boundary to half of the binwidth.
To center on integers, use width = 1 and center = 0 (or, equivalently, boundary = 0.5).
closed One of "right" or "left" indicating whether right or left edges of bins are
included in the bin.
Randall Prium contributed most of the implementation of cut_width.
table(cut_interval(1:100, 10))
table(cut_interval(1:100, 11))
table(cut_number(runif(1000), 10))
table(cut_width(runif(1000), 0.1))
table(cut_width(runif(1000), 0.1, boundary = 0))
table(cut_width(runif(1000), 0.1, center = 0))
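The difference between equal-range and equal-count grouping is easiest to see on skewed data. The sketch below uses only base R (cut() and quantile()) to mimic the two ideas; it is an illustration of the concepts, not the implementation used by cut_interval() or cut_number().

```r
x <- c(1:11, 100)   # eleven small values and one large one

# Equal range (the idea behind cut_interval): 4 bins of equal width
equal_range <- cut(x, breaks = 4, include.lowest = TRUE)
table(equal_range)   # counts 11, 0, 0, 1

# Equal counts (the idea behind cut_number): breaks placed at quantiles
q <- quantile(x, probs = seq(0, 1, length.out = 5))
equal_count <- cut(x, breaks = q, include.lowest = TRUE)
table(equal_count)   # counts 3, 3, 3, 3
```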
A dataset containing the prices and other attributes of almost 54,000 diamonds. The variables are
as follows:
A data frame with 53940 rows and 10 variables:
This dataset was produced from US economic time series data available from http://research. economics is in "wide" format, economics_long is in "long" format.
A data frame with 478 rows and 6 variables
Sometimes you may want to ensure limits include a single value, for all panels or all plots. This
function is a thin wrapper around geom_blank() that makes it easy to add such values.
... named list of aesthetics specifying the value (or values) that should be included
in each scale.
p <- ggplot(mtcars, aes(mpg, wt)) + geom_point()
p + expand_limits(x = 0)
p + expand_limits(y = c(1, 9))
p + expand_limits(x = 0, y = 0)
This is a convenience function for generating scale expansion vectors for the expand argument of
scale_*_continuous and scale_*_discrete. The expansion vectors are used to add some space
between the data and the axes.
expand_scale(mult = 0, add = 0)
mult vector of multiplicative range expansion factors. If length 1, both the lower and
upper limits of the scale are expanded outwards by mult. If length 2, the lower
limit is expanded by mult[1] and the upper limit by mult[2].
add vector of additive range expansion constants. If length 1, both the lower and
upper limits of the scale are expanded outwards by add units. If length 2, the
lower limit is expanded by add[1] and the upper limit by add[2].
# No space below the bars but 10% above them
ggplot(mtcars) +
geom_bar(aes(x = factor(cyl))) +
scale_y_continuous(expand = expand_scale(mult = c(0, .1)))
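To make the arithmetic behind mult and add concrete, the following base R sketch applies both kinds of expansion to a continuous range. The helper expand_range_sketch is hypothetical and written from the argument descriptions above; it is not a ggplot2 function.

```r
# Hypothetical helper (not part of ggplot2): expand a range multiplicatively
# and additively, with separate amounts for the lower and upper limits.
expand_range_sketch <- function(limits, mult = 0, add = 0) {
  mult <- rep_len(mult, 2)   # length 1 means the same amount on both sides
  add  <- rep_len(add, 2)
  width <- diff(limits)
  c(limits[1] - (mult[1] * width + add[1]),
    limits[2] + (mult[2] * width + add[2]))
}

# Bar heights running from 0 to 14: no space below, 10% above
expand_range_sketch(c(0, 14), mult = c(0, 0.1))   # 0 and 15.4

# 5% on both sides plus one extra unit at the top
expand_range_sketch(c(0, 14), mult = 0.05, add = c(0, 1))   # -0.7 and 15.7
```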
facet_grid() forms a matrix of panels defined by row and column faceting variables. It is most
useful when you have two discrete variables, and all combinations of the variables exist in the data.
facet_grid(rows = NULL, cols = NULL, scales = "fixed",
space = "fixed", shrink = TRUE, labeller = "label_value",
as.table = TRUE, switch = NULL, drop = TRUE, margins = FALSE,
facets = NULL)
rows, cols A set of variables or expressions quoted by vars() and defining faceting groups
on the rows or columns dimension. The variables can be named (the names are
passed to labeller).
For compatibility with the classic interface, rows can also be a formula with the
rows (of the tabular display) on the LHS and the columns (of the tabular display)
on the RHS; the dot in the formula is used to indicate there should be no faceting
on this dimension (either row or column).
scales Are scales shared across all facets (the default, "fixed"), or do they vary across
rows ("free_x"), columns ("free_y"), or both rows and columns ("free")?
space If "fixed", the default, all panels have the same size. If "free_y" their height
will be proportional to the length of the y scale; if "free_x" their width will be
proportional to the length of the x scale; or if "free" both height and width will
vary. This setting has no effect unless the appropriate scales also vary.
shrink If TRUE, will shrink scales to fit output of statistics, not raw data. If FALSE, will
be range of raw data before statistical summary.
labeller A function that takes one data frame of labels and returns a list or data frame of
character vectors. Each input column corresponds to one factor. Thus there will
be more than one with formulae of the type ~cyl + am. Each output column gets
displayed as one separate line in the strip label. This function should inherit from
the "labeller" S3 class for compatibility with labeller(). See label_value()
for more details and pointers to other options.
as.table If TRUE, the default, the facets are laid out like a table with highest values at the
bottom-right. If FALSE, the facets are laid out like a plot with the highest value
at the top-right.
switch By default, the labels are displayed on the top and right of the plot. If "x", the
top labels will be displayed to the bottom. If "y", the right-hand side labels will
be displayed to the left. Can also be set to "both".
drop If TRUE, the default, all factor levels not used in the data will automatically be
dropped. If FALSE, all factor levels will be shown, regardless of whether or not
they appear in the data.
margins Either a logical value or a character vector. Margins are additional facets which
contain all the data for each of the possible values of the faceting variables.
If FALSE, no additional facets are included (the default). If TRUE, margins are
included for all faceting variables. If specified as a character vector, it is the
names of variables for which margins are to be created.
facets This argument is soft-deprecated, please use rows and cols instead.
p <- ggplot(mpg, aes(displ, cty)) + geom_point()
p + facet_grid(. ~ cyl)
p + facet_grid(drv ~ .)
p + facet_grid(drv ~ cyl)
# If scales and space are free, then the mapping between position
# and values in the data will be the same across all panels. This
# Margins ----------------------------------------------------------
# Margins can be specified logically (all yes or all no) or for specific
# variables as (character) variable names
mg <- ggplot(mtcars, aes(x = mpg, y = wt)) + geom_point()
mg + facet_grid(vs + am ~ gear, margins = TRUE)
mg + facet_grid(vs + am ~ gear, margins = "am")
# when margins are made over "vs", since the facets for "am" vary
# within the values of "vs", the marginal facet for "vs" is also
# a margin over "am".
mg + facet_grid(vs + am ~ gear, margins = "vs")
facet_wrap wraps a 1d sequence of panels into 2d. This is generally a better use of screen space
than facet_grid() because most displays are roughly rectangular.
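The idea of wrapping a 1d sequence into 2d can be sketched in base R as follows. The helper wrap_layout is hypothetical and only illustrates row-by-row filling with a fixed number of columns; it is not how ggplot2 computes its layout internally.

```r
# Hypothetical sketch: assign n panels to rows and columns, filling each row
# from left to right before starting the next one.
wrap_layout <- function(n, ncol) {
  idx <- seq_len(n) - 1
  data.frame(panel = seq_len(n),
             row = idx %/% ncol + 1,
             col = idx %% ncol + 1)
}

wrap_layout(7, ncol = 3)   # 7 panels occupy 3 rows: 3 + 3 + 1
```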
facet_wrap(facets, nrow = NULL, ncol = NULL, scales = "fixed",
shrink = TRUE, labeller = "label_value", as.table = TRUE,
switch = NULL, drop = TRUE, dir = "h", strip.position = "top")
facets A set of variables or expressions quoted by vars() and defining faceting groups
on the rows or columns dimension. The variables can be named (the names are
passed to labeller).
For compatibility with the classic interface, can also be a formula or character
vector. Use either a one sided formula, ~a + b, or a character vector,
c("a", "b").
nrow, ncol Number of rows and columns.
scales Should scales be fixed ("fixed", the default), free ("free"), or free in one
dimension ("free_x", "free_y")?
shrink If TRUE, will shrink scales to fit output of statistics, not raw data. If FALSE, will
be range of raw data before statistical summary.
labeller A function that takes one data frame of labels and returns a list or data frame of
character vectors. Each input column corresponds to one factor. Thus there will
be more than one with formulae of the type ~cyl + am. Each output column gets
displayed as one separate line in the strip label. This function should inherit from
the "labeller" S3 class for compatibility with labeller(). See label_value()
for more details and pointers to other options.
as.table If TRUE, the default, the facets are laid out like a table with highest values at the
bottom-right. If FALSE, the facets are laid out like a plot with the highest value
at the top-right.
switch By default, the labels are displayed on the top and right of the plot. If "x", the
top labels will be displayed to the bottom. If "y", the right-hand side labels will
be displayed to the left. Can also be set to "both".
drop If TRUE, the default, all factor levels not used in the data will automatically be
dropped. If FALSE, all factor levels will be shown, regardless of whether or not
they appear in the data.
dir Direction: either "h" for horizontal, the default, or "v", for vertical.
strip.position By default, the labels are displayed on the top of the plot. Using strip.position
it is possible to place the labels on either of the four sides by setting strip.position = c("top", "bott
p <- ggplot(mpg, aes(displ, hwy)) + geom_point()
# Control the number of rows and columns with nrow and ncol
p + facet_wrap(vars(class), nrow = 4)
# To change the order in which the panels appear, change the levels
# of the underlying factor.
mpg$class2 <- reorder(mpg$class, mpg$displ)
ggplot(mpg, aes(displ, hwy)) +
  geom_point() +
  facet_wrap(vars(class2))
# By default, the same scales are used for all panels. You can allow
# scales to vary across the panels with the `scales` argument.
# Free scales make it easier to see patterns within each panel, but
# harder to compare across panels.
ggplot(mpg, aes(displ, hwy)) +
geom_point() +
facet_wrap(~class, scales = "free")
# To repeat the same data in every panel, simply construct a data frame
# that does not contain the faceting variable.
ggplot(mpg, aes(displ, hwy)) +
  geom_point(data = transform(mpg, class = NULL), colour = "grey85") +
  geom_point() +
  facet_wrap(vars(class))
A 2d density estimate of the waiting and eruptions variables in the faithful data.
A data frame with 5,625 observations and 3 variables.
Rather than using this function, I now recommend using the broom package, which implements a
much wider range of methods. fortify may be deprecated in the future.
fortify(model, data, ...)
model model or other R object to convert to data frame
data original dataset, if needed
... other arguments passed to methods
See Also
These geoms add reference lines (sometimes called rules) to a plot, either horizontal, vertical, or
diagonal (specified by slope and intercept). These are useful for annotating plots.
geom_abline(mapping = NULL, data = NULL, ..., slope, intercept,
na.rm = FALSE, show.legend = NA)
mapping Set of aesthetic mappings created by aes() or aes_(). If specified and inherit.aes = TRUE
(the default), it is combined with the default mapping at the top level of the plot.
You must supply mapping if there is no plot mapping.
data The data to be displayed in this layer. There are three options:
If NULL, the default, the data is inherited from the plot data as specified in the
call to ggplot().
A data.frame, or other object, will override the plot data. All objects will be
fortified to produce a data frame. See fortify() for which variables will be created.
A function will be called with a single argument, the plot data. The return
value must be a data.frame, and will be used as the layer data.
... Other arguments passed on to layer(). These are often aesthetics, used to set
an aesthetic to a fixed value, like colour = "red" or size = 3. They may also
be parameters to the paired geom/stat.
na.rm If FALSE, the default, missing values are removed with a warning. If TRUE,
missing values are silently removed.
show.legend logical. Should this layer be included in the legends? NA, the default, includes if
any aesthetics are mapped. FALSE never includes, and TRUE always includes. It
can also be a named logical vector to finely select the aesthetics to display.
xintercept, yintercept, slope, intercept
Parameters that control the position of the line. If these are set, data, mapping
and show.legend are overridden.
These geoms act slightly differently from other geoms. You can supply the parameters in two
ways: either as arguments to the layer function, or via aesthetics. If you use arguments, e.g.
geom_abline(intercept = 0, slope = 1), then behind the scenes the geom makes a new
data frame containing just the data you’ve supplied. That means that the lines will be the same in all
facets; if you want them to vary across facets, construct the data frame yourself and use aesthetics.
Unlike most other geoms, these geoms do not inherit aesthetics from the plot default, because they
do not understand x and y aesthetics which are commonly set in the plot. They also do not affect
the x and y scales.
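The two ways of supplying parameters can be contrasted in a faceted plot. The sketch below assumes ggplot2 is installed; mean_mpg is a helper data frame built only for this illustration, with one row per facet.

```r
library(ggplot2)

p <- ggplot(mtcars, aes(wt, mpg)) +
  geom_point() +
  facet_wrap(~cyl)

# Parameter given as an argument: the same line appears in every facet
p_fixed <- p + geom_hline(yintercept = 20)

# Parameter given as an aesthetic: build the data frame yourself, one row per
# facet, so that each facet gets its own line (here the mean mpg per cyl)
mean_mpg <- aggregate(mpg ~ cyl, data = mtcars, FUN = mean)
p_varying <- p + geom_hline(aes(yintercept = mpg), data = mean_mpg)
```

Because the layer data contains the faceting variable cyl, each row of mean_mpg is drawn only in its matching panel.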
These geoms are drawn using geom_line(), so they support the same aesthetics: alpha, colour,
linetype and size. They also each have aesthetics that control the position of the line:
• geom_vline(): xintercept
• geom_hline(): yintercept
• geom_abline(): slope and intercept
See Also
See geom_segment() for a more general approach to adding straight line segments to a plot.
p <- ggplot(mtcars, aes(wt, mpg)) + geom_point()
# Fixed values
p + geom_vline(xintercept = 5)
p + geom_vline(xintercept = 1:5)
p + geom_hline(yintercept = 20)
There are two types of bar charts: geom_bar() and geom_col(). geom_bar() makes the height of
the bar proportional to the number of cases in each group (or if the weight aesthetic is supplied,
the sum of the weights). If you want the heights of the bars to represent values in the data, use
geom_col() instead. geom_bar() uses stat_count() by default: it counts the number of cases at
each x position. geom_col() uses stat_identity(): it leaves the data as is.
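What the two approaches compute can be sketched with base R on the built-in mtcars data. This is an illustration of the counting and summing described above, not the code that stat_count() or stat_identity() actually run.

```r
cyl <- factor(mtcars$cyl)

# geom_bar() default: number of cases at each x position
case_counts <- table(cyl)

# geom_bar() with a weight aesthetic: sum of the weights at each x position
weight_sums <- tapply(mtcars$wt, cyl, sum)

# geom_col(): values are used as they are, so any summary (here a per-group
# mean) has to be computed before plotting
mean_mpg <- aggregate(mpg ~ cyl, data = mtcars, FUN = mean)
```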
geom_bar(mapping = NULL, data = NULL, stat = "count",
position = "stack", ..., width = NULL, binwidth = NULL,
na.rm = FALSE, show.legend = NA, inherit.aes = TRUE)
mapping Set of aesthetic mappings created by aes() or aes_(). If specified and inherit.aes = TRUE
(the default), it is combined with the default mapping at the top level of the plot.
You must supply mapping if there is no plot mapping.
data The data to be displayed in this layer. There are three options:
If NULL, the default, the data is inherited from the plot data as specified in the
call to ggplot().
A data.frame, or other object, will override the plot data. All objects will be
fortified to produce a data frame. See fortify() for which variables will be created.
A function will be called with a single argument, the plot data. The return
value must be a data.frame, and will be used as the layer data.
position Position adjustment, either as a string, or the result of a call to a position adjustment
function.
... Other arguments passed on to layer(). These are often aesthetics, used to set
an aesthetic to a fixed value, like colour = "red" or size = 3. They may also
be parameters to the paired geom/stat.
width Bar width. By default, set to 90% of the resolution of the data.
binwidth geom_bar() no longer has a binwidth argument - if you use it you’ll get a
warning telling you to use geom_histogram() instead.
na.rm If FALSE, the default, missing values are removed with a warning. If TRUE,
missing values are silently removed.
show.legend logical. Should this layer be included in the legends? NA, the default, includes if
any aesthetics are mapped. FALSE never includes, and TRUE always includes. It
can also be a named logical vector to finely select the aesthetics to display.
inherit.aes If FALSE, overrides the default aesthetics, rather than combining with them.
This is most useful for helper functions that define both data and aesthetics and
shouldn’t inherit behaviour from the default plot specification, e.g. borders().
geom, stat Override the default connection between geom_bar() and stat_count().
A bar chart uses height to represent a value, and so the base of the bar must always be shown to
produce a valid visual comparison. This is why it doesn’t make sense to use a log-scaled y axis with
a bar chart.
By default, multiple bars occupying the same x position will be stacked atop one another by
position_stack(). If you want them to be dodged side-to-side, use position_dodge() or position_dodge2().
Finally, position_fill() shows relative proportions at each x by stacking the bars and then standardising
each bar to have the same height.
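The arithmetic behind stacking and filling can be sketched in base R. The example treats cyl as the x position and gear as the grouping within each bar; the drawing order of the segments in an actual plot may differ from the column order used here.

```r
# Counts of cars by cylinder group (rows, the x position) and gear (columns,
# the groups stacked within each bar)
counts <- table(cyl = mtcars$cyl, gear = mtcars$gear)

# Stacking: each segment ends at the cumulative count within its x position
stack_tops <- t(apply(counts, 1, cumsum))

# Filling: the same stack, rescaled so that every bar has total height 1
fill_heights <- prop.table(counts, margin = 1)
rowSums(fill_heights)   # every x position sums to 1
```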
geom_bar() understands the following aesthetics (required aesthetics are in bold):
• x
• y
• alpha
• colour
• fill
• group
• linetype
• size
Computed variables
count number of points in bin
prop groupwise proportion
See Also
geom_histogram() for continuous data, position_dodge() and position_dodge2() for creating
side-by-side bar charts.
stat_bin(), which bins data in ranges and counts the cases in each range. It differs from stat_count,
which counts the number of cases at each x position (without binning into ranges). stat_bin() requires
continuous x data, whereas stat_count can be used for both discrete and continuous x data.
# geom_bar is designed to make it easy to create bar charts that show
# counts (or sums of weights)
g <- ggplot(mpg, aes(class))
# Number of cars in each class:
g + geom_bar()
# Total engine displacement of each class
g + geom_bar(aes(weight = displ))
# Bar charts are automatically stacked when multiple bars are placed
# at the same location. The order of the fill is designed to match
# the legend
g + geom_bar(aes(fill = drv))
# If you need to flip the order (because you've flipped the plot)
# call position_stack() explicitly:
g +
geom_bar(aes(fill = drv), position = position_stack(reverse = TRUE)) +
coord_flip() +
theme(legend.position = "top")
# You can also use geom_bar() with continuous data, in which case
Divides the plane into rectangles, counts the number of cases in each rectangle, and then (by default)
maps the number of cases to the rectangle’s fill. This is a useful alternative to geom_point() in the
presence of overplotting.
geom_bin2d(mapping = NULL, data = NULL, stat = "bin2d",
position = "identity", ..., na.rm = FALSE, show.legend = NA,
inherit.aes = TRUE)
mapping Set of aesthetic mappings created by aes() or aes_(). If specified and inherit.aes = TRUE
(the default), it is combined with the default mapping at the top level of the plot.
You must supply mapping if there is no plot mapping.
data The data to be displayed in this layer. There are three options:
If NULL, the default, the data is inherited from the plot data as specified in the
call to ggplot().
A data.frame, or other object, will override the plot data. All objects will be
fortified to produce a data frame. See fortify() for which variables will be created.
A function will be called with a single argument, the plot data. The return
value must be a data.frame, and will be used as the layer data.
position Position adjustment, either as a string, or the result of a call to a position adjustment
function.
... Other arguments passed on to layer(). These are often aesthetics, used to set
an aesthetic to a fixed value, like colour = "red" or size = 3. They may also
be parameters to the paired geom/stat.
na.rm If FALSE, the default, missing values are removed with a warning. If TRUE,
missing values are silently removed.
show.legend logical. Should this layer be included in the legends? NA, the default, includes if
any aesthetics are mapped. FALSE never includes, and TRUE always includes. It
can also be a named logical vector to finely select the aesthetics to display.
inherit.aes If FALSE, overrides the default aesthetics, rather than combining with them.
This is most useful for helper functions that define both data and aesthetics and
shouldn’t inherit behaviour from the default plot specification, e.g. borders().
geom, stat Use to override the default connection between geom_bin2d and stat_bin2d.
bins numeric vector giving number of bins in both vertical and horizontal directions.
Set to 30 by default.
binwidth Numeric vector giving bin width in both vertical and horizontal directions. Overrides
bins if both set.
drop if TRUE removes all cells with 0 counts.
stat_bin2d() understands the following aesthetics (required aesthetics are in bold):
• x
• y
• fill
• group
Learn more about setting these aesthetics in vignette("ggplot2-specs").
Computed variables
count number of points in bin
density density of points in bin, scaled to integrate to 1
ncount count, scaled to maximum of 1
ndensity density, scaled to maximum of 1
See Also
stat_binhex() for hexagonal binning
d <- ggplot(diamonds, aes(x, y)) + xlim(4, 10) + ylim(4, 10)
d + geom_bin2d()
# You can control the size of the bins by specifying the number of
# bins in each direction:
d + geom_bin2d(bins = 10)
d + geom_bin2d(bins = 30)
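The counting step behind rectangular binning can be sketched in base R: cut each coordinate into intervals, then tabulate the pairs. This uses simulated data and illustrates the idea only; it is not the code used by stat_bin2d().

```r
set.seed(1)
x <- runif(200)
y <- runif(200)

# Five equal-width bins per direction on the unit square
edges <- seq(0, 1, length.out = 6)
x_bin <- cut(x, breaks = edges, include.lowest = TRUE)
y_bin <- cut(y, breaks = edges, include.lowest = TRUE)

# One count per rectangle; geom_bin2d() maps such counts to fill by default
cell_counts <- table(x_bin, y_bin)
sum(cell_counts)   # every point falls into exactly one cell
```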
The blank geom draws nothing, but can be a useful way of ensuring common scales between different
plots. See expand_limits() for more details.
geom_blank(mapping = NULL, data = NULL, stat = "identity",
position = "identity", ..., show.legend = NA, inherit.aes = TRUE)
mapping Set of aesthetic mappings created by aes() or aes_(). If specified and inherit.aes = TRUE
(the default), it is combined with the default mapping at the top level of the plot.
You must supply mapping if there is no plot mapping.
data The data to be displayed in this layer. There are three options:
If NULL, the default, the data is inherited from the plot data as specified in the
call to ggplot().
A data.frame, or other object, will override the plot data. All objects will be
fortified to produce a data frame. See fortify() for which variables will be created.
A function will be called with a single argument, the plot data. The return
value must be a data.frame, and will be used as the layer data.
stat The statistical transformation to use on the data for this layer, as a string.
position Position adjustment, either as a string, or the result of a call to a position adjustment
function.
... Other arguments passed on to layer(). These are often aesthetics, used to set
an aesthetic to a fixed value, like colour = "red" or size = 3. They may also
be parameters to the paired geom/stat.
show.legend logical. Should this layer be included in the legends? NA, the default, includes if
any aesthetics are mapped. FALSE never includes, and TRUE always includes. It
can also be a named logical vector to finely select the aesthetics to display.
inherit.aes If FALSE, overrides the default aesthetics, rather than combining with them.
This is most useful for helper functions that define both data and aesthetics and
shouldn’t inherit behaviour from the default plot specification, e.g. borders().
ggplot(mtcars, aes(wt, mpg))
# Nothing to see here!
The boxplot compactly displays the distribution of a continuous variable. It visualises five summary
statistics (the median, two hinges and two whiskers), and all "outlying" points individually.
geom_boxplot(mapping = NULL, data = NULL, stat = "boxplot",
position = "dodge2", ..., outlier.colour = NULL,
outlier.color = NULL, outlier.fill = NULL, outlier.shape = 19,
outlier.size = 1.5, outlier.stroke = 0.5, outlier.alpha = NULL,
notch = FALSE, notchwidth = 0.5, varwidth = FALSE, na.rm = FALSE,
show.legend = NA, inherit.aes = TRUE)
mapping Set of aesthetic mappings created by aes() or aes_(). If specified and inherit.aes = TRUE
(the default), it is combined with the default mapping at the top level of the plot.
You must supply mapping if there is no plot mapping.
data The data to be displayed in this layer. There are three options:
If NULL, the default, the data is inherited from the plot data as specified in the
call to ggplot().
A data.frame, or other object, will override the plot data. All objects will be
fortified to produce a data frame. See fortify() for which variables will be created.
A function will be called with a single argument, the plot data. The return
value must be a data.frame, and will be used as the layer data.
position Position adjustment, either as a string, or the result of a call to a position adjustment
function.
... Other arguments passed on to layer(). These are often aesthetics, used to set
an aesthetic to a fixed value, like colour = "red" or size = 3. They may also
be parameters to the paired geom/stat.
outlier.colour, outlier.color, outlier.fill, outlier.shape, outlier.size, outlier.stroke, outlier.alpha
Default aesthetics for outliers. Set to NULL to inherit from the aesthetics used for
the box.
In the unlikely event you specify both US and UK spellings of colour, the US
spelling will take precedence.
Sometimes it can be useful to hide the outliers, for example when overlaying
the raw data points on top of the boxplot. Hiding the outliers can be achieved
by setting outlier.shape = NA. Importantly, this does not remove the outliers,
it only hides them, so the range calculated for the y-axis will be the same with
outliers shown and outliers hidden.
notch If FALSE (default) make a standard box plot. If TRUE, make a notched box plot.
Notches are used to compare groups; if the notches of two boxes do not overlap,
this suggests that the medians are significantly different.
notchwidth For a notched box plot, width of the notch relative to the body (defaults to
notchwidth = 0.5).
varwidth If FALSE (default) make a standard box plot. If TRUE, boxes are drawn with
widths proportional to the square-roots of the number of observations in the
groups (possibly weighted, using the weight aesthetic).
na.rm If FALSE, the default, missing values are removed with a warning. If TRUE,
missing values are silently removed.
show.legend logical. Should this layer be included in the legends? NA, the default, includes if
any aesthetics are mapped. FALSE never includes, and TRUE always includes. It
can also be a named logical vector to finely select the aesthetics to display.
inherit.aes If FALSE, overrides the default aesthetics, rather than combining with them.
This is most useful for helper functions that define both data and aesthetics and
shouldn’t inherit behaviour from the default plot specification, e.g. borders().
geom, stat Use to override the default connection between geom_boxplot and stat_boxplot.
coef Length of the whiskers as multiple of IQR. Defaults to 1.5.
Summary statistics
The lower and upper hinges correspond to the first and third quartiles (the 25th and 75th percentiles).
This differs slightly from the method used by the boxplot() function, and may be apparent with
small samples. See boxplot.stats() for more information on how hinge positions are calculated
for boxplot().
The upper whisker extends from the hinge to the largest value no further than 1.5 * IQR from the
hinge (where IQR is the inter-quartile range, or distance between the first and third quartiles). The
lower whisker extends from the hinge to the smallest value at most 1.5 * IQR from the hinge. Data
beyond the end of the whiskers are called "outlying" points and are plotted individually.
In a notched box plot, the notches extend 1.58 * IQR / sqrt(n). This gives a roughly 95%
confidence interval for comparing medians. See McGill et al. (1978) for more details.
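As a rough illustration of the definitions above, the following base R sketch computes the hinges, whiskers, outlying points and notch limits for a small made-up vector. The data and variable names are illustrative assumptions, and quantile() with its default settings is assumed to match the quartile definition given here; this is not a copy of the internal stat_boxplot() code.

```r
# Illustrative data (an assumption for this sketch); 40 lies far from the rest.
x <- c(2, 3, 5, 7, 8, 9, 11, 12, 14, 40)
coef <- 1.5                        # whisker length as a multiple of the IQR

# Hinges and median: 25%, 50% and 75% quantiles
qs <- quantile(x, c(0.25, 0.5, 0.75), names = FALSE)
lower  <- qs[1]                    # 5.5
middle <- qs[2]                    # 8.5
upper  <- qs[3]                    # 11.75
iqr <- upper - lower

# Whiskers end at the most extreme observations inside the fences
ymin <- min(x[x >= lower - coef * iqr])
ymax <- max(x[x <= upper + coef * iqr])

# Anything beyond the whiskers is an "outlying" point
outliers <- x[x < ymin | x > ymax]

# Notch limits: median +/- 1.58 * IQR / sqrt(n)
n <- length(x)
notchlower <- middle - 1.58 * iqr / sqrt(n)
notchupper <- middle + 1.58 * iqr / sqrt(n)

print(c(ymin = ymin, lower = lower, middle = middle, upper = upper, ymax = ymax))
print(outliers)
```

Changing coef in this sketch moves the fences and therefore which observations count as outlying, which mirrors the role of the coef argument described above.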
geom_boxplot() understands the following aesthetics (required aesthetics are in bold):
• x
• lower
• upper
• middle
• ymin
• ymax
• alpha
• colour
• fill
• group
• linetype
• shape
• size
• weight
Learn more about setting these aesthetics in vignette("ggplot2-specs").
Computed variables
width width of boxplot
ymin lower whisker = smallest observation greater than or equal to lower hinge - 1.5 * IQR
lower lower hinge, 25% quantile
notchlower lower edge of notch = median - 1.58 * IQR / sqrt(n)
middle median, 50% quantile
notchupper upper edge of notch = median + 1.58 * IQR / sqrt(n)
upper upper hinge, 75% quantile
ymax upper whisker = largest observation less than or equal to upper hinge + 1.5 * IQR
McGill, R., Tukey, J. W. and Larsen, W. A. (1978) Variations of box plots. The American
Statistician 32, 12-16.
See Also
geom_quantile() for continuous x, geom_violin() for a richer display of the distribution, and
geom_jitter() for a useful technique for small data.
p <- ggplot(mpg, aes(class, hwy))
p + geom_boxplot()
p + geom_boxplot() + coord_flip()
p + geom_boxplot(notch = TRUE)
p + geom_boxplot(varwidth = TRUE)
p + geom_boxplot(fill = "white", colour = "#3366FF")
# By default, outlier points match the colour of the box. Use
# outlier.colour to override
p + geom_boxplot(outlier.colour = "red", outlier.shape = 1)
52 geom_contour
# You can also use boxplots with continuous x, as long as you supply
# a grouping variable. cut_width is particularly useful
ggplot(diamonds, aes(carat, price)) +
  geom_boxplot(aes(group = cut_width(carat, 0.25)))
# Adjust the transparency of outliers using outlier.alpha
ggplot(diamonds, aes(carat, price)) +
  geom_boxplot(aes(group = cut_width(carat, 0.25)), outlier.alpha = 0.1)
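The outlier.shape = NA technique mentioned in the geom_boxplot() arguments can be sketched as follows. The jitter width and alpha values are arbitrary choices for illustration, not recommended defaults.

```r
library(ggplot2)

# Hide the boxplot's own outlier points so they are not drawn twice
# once the raw observations are overlaid as jittered points.
p_raw <- ggplot(mpg, aes(class, hwy)) +
  geom_boxplot(outlier.shape = NA) +
  geom_jitter(width = 0.2, alpha = 0.5)  # arbitrary illustrative values
p_raw
```

Because the outliers are only hidden, not removed, the y-axis range should be the same as for the plot with outliers shown.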
ggplot2 cannot draw true 3d surfaces, but you can use geom_contour() and geom_tile() to visualise
3d surfaces in 2d. To be a valid surface, the data must contain only a single row for each unique
combination of the variables mapped to the x and y aesthetics. Contouring tends to work best when
x and y form a (roughly) evenly spaced grid. If your data is not evenly spaced, you may want to
interpolate to a grid before visualising.
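A minimal sketch of the "single row per x/y combination on an evenly spaced grid" requirement, using a made-up surface. The grid limits, resolution and the function used for z are assumptions chosen only for illustration.

```r
library(ggplot2)

# Build an evenly spaced grid: expand.grid() yields exactly one row
# for each combination of x and y.
grid <- expand.grid(
  x = seq(-2, 2, length.out = 50),
  y = seq(-2, 2, length.out = 50)
)
grid$z <- exp(-(grid$x^2 + grid$y^2))  # illustrative surface

# Check the "valid surface" condition: no duplicated (x, y) pairs
stopifnot(!anyDuplicated(grid[c("x", "y")]))

p_surface <- ggplot(grid, aes(x, y, z = z)) +
  geom_contour(bins = 6)
p_surface
```

For irregularly spaced observations, an interpolation step would be needed before this point; which interpolation method is suitable depends on the data and is not covered here.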
geom_contour(mapping = NULL, data = NULL, stat = "contour",
position = "identity", ..., lineend = "butt", linejoin = "round",
linemitre = 10, na.rm = FALSE, show.legend = NA,
inherit.aes = TRUE)
mapping Set of aesthetic mappings created by aes() or aes_(). If specified and inherit.aes = TRUE
(the default), it is combined with the default mapping at the top level of the plot.
You must supply mapping if there is no plot mapping.
data The data to be displayed in this layer. There are three options:
If NULL, the default, the data is inherited from the plot data as specified in the
call to ggplot().
A data.frame, or other object, will override the plot data. All objects will be
fortified to produce a data frame. See fortify() for which variables will be created.
A function will be called with a single argument, the plot data. The return
value must be a data.frame, and will be used as the layer data.
stat The statistical transformation to use on the data for this layer, as a string.
position Position adjustment, either as a string, or the result of a call to a position
adjustment function.
... Other arguments passed on to layer(). These are often aesthetics, used to set
an aesthetic to a fixed value, like colour = "red" or size = 3. They may also
be parameters to the paired geom/stat.
lineend Line end style (round, butt, square).
linejoin Line join style (round, mitre, bevel).
linemitre Line mitre limit (number greater than 1).
na.rm If FALSE, the default, missing values are removed with a warning. If TRUE,
missing values are silently removed.
show.legend logical. Should this layer be included in the legends? NA, the default, includes if
any aesthetics are mapped. FALSE never includes, and TRUE always includes. It
can also be a named logical vector to finely select the aesthetics to display.
inherit.aes If FALSE, overrides the default aesthetics, rather than combining with them.
This is most useful for helper functions that define both data and aesthetics and
shouldn’t inherit behaviour from the default plot specification, e.g. borders().
geom The geometric object to use to display the data
geom_contour() understands the following aesthetics (required aesthetics are in bold):
• x
• y
• alpha
• colour
• group
• linetype
• size
• weight
Computed variables
See Also
# Basic plot
v <- ggplot(faithfuld, aes(waiting, eruptions, z = density))
v + geom_contour()
# Setting bins creates evenly spaced contours in the range of the data
v + geom_contour(bins = 2)
v + geom_contour(bins = 10)
# Other parameters
v + geom_contour(aes(colour = stat(level)))
v + geom_contour(colour = "red")
v + geom_raster(aes(fill = density)) +
geom_contour(colour = "white")
geom_count 55
This is a variant of geom_point() that counts the number of observations at each location, then maps
the count to point area. It is useful when you have discrete data and overplotting.
geom_count(mapping = NULL, data = NULL, stat = "sum",
position = "identity", ..., na.rm = FALSE, show.legend = NA,
inherit.aes = TRUE)
mapping Set of aesthetic mappings created by aes() or aes_(). If specified and inherit.aes = TRUE
(the default), it is combined with the default mapping at the top level of the plot.
You must supply mapping if there is no plot mapping.
data The data to be displayed in this layer. There are three options:
If NULL, the default, the data is inherited from the plot data as specified in the
call to ggplot().
A data.frame, or other object, will override the plot data. All objects will be
fortified to produce a data frame. See fortify() for which variables will be created.
A function will be called with a single argument, the plot data. The return
value must be a data.frame, and will be used as the layer data.
position Position adjustment, either as a string, or the result of a call to a position
adjustment function.
... Other arguments passed on to layer(). These are often aesthetics, used to set
an aesthetic to a fixed value, like colour = "red" or size = 3. They may also
be parameters to the paired geom/stat.
na.rm If FALSE, the default, missing values are removed with a warning. If TRUE,
missing values are silently removed.
show.legend logical. Should this layer be included in the legends? NA, the default, includes if
any aesthetics are mapped. FALSE never includes, and TRUE always includes. It
can also be a named logical vector to finely select the aesthetics to display.
inherit.aes If FALSE, overrides the default aesthetics, rather than combining with them.
This is most useful for helper functions that define both data and aesthetics and
shouldn’t inherit behaviour from the default plot specification, e.g. borders().
geom, stat Use to override the default connection between geom_count and stat_sum.
geom_point() understands the following aesthetics (required aesthetics are in bold):
• x
• y
• alpha
• colour
• fill
• group
• shape
• size
• stroke
Computed variables
n number of observations at position
prop percent of points in that panel at that position
See Also
For continuous x and y, use geom_bin2d().
ggplot(mpg, aes(cty, hwy)) +
  geom_count() +
  scale_size_area(max_size = 10)
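To see the computed variables n and prop listed above, one option is to inspect the layer data after the stat has run. The sketch below uses layer_data() for that purpose; the object names are illustrative.

```r
library(ggplot2)

p_count <- ggplot(mpg, aes(cty, hwy)) +
  geom_count() +
  scale_size_area(max_size = 10)

# layer_data() returns the data after the stat has been applied;
# for this layer it includes the computed variables n and prop.
counts <- layer_data(p_count)
head(counts[, c("x", "y", "n", "prop")])
```

Each row corresponds to one distinct (cty, hwy) location, so the n column should add up to the number of rows in mpg.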
Various ways of representing a vertical interval defined by x, ymin and ymax. Each case draws a
single graphical object.
geom_crossbar(mapping = NULL, data = NULL, stat = "identity",
position = "identity", ..., fatten = 2.5, na.rm = FALSE,
show.legend = NA, inherit.aes = TRUE)
mapping Set of aesthetic mappings created by aes() or aes_(). If specified and inherit.aes = TRUE
(the default), it is combined with the default mapping at the top level of the plot.
You must supply mapping if there is no plot mapping.
data The data to be displayed in this layer. There are three options:
If NULL, the default, the data is inherited from the plot data as specified in the
call to ggplot().
A data.frame, or other object, will override the plot data. All objects will be
fortified to produce a data frame. See fortify() for which variables will be created.
A function will be called with a single argument, the plot data. The return
value must be a data.frame, and will be used as the layer data.
58 geom_crossbar
stat The statistical transformation to use on the data for this layer, as a string.
position Position adjustment, either as a string, or the result of a call to a position
adjustment function.
... Other arguments passed on to layer(). These are often aesthetics, used to set
an aesthetic to a fixed value, like colour = "red" or size = 3. They may also
be parameters to the paired geom/stat.
fatten A multiplicative factor used to increase the size of the middle bar in geom_crossbar()
and the middle point in geom_pointrange().
na.rm If FALSE, the default, missing values are removed with a warning. If TRUE,
missing values are silently removed.
show.legend logical. Should this layer be included in the legends? NA, the default, includes if
any aesthetics are mapped. FALSE never includes, and TRUE always includes. It
can also be a named logical vector to finely select the aesthetics to display.
inherit.aes If FALSE, overrides the default aesthetics, rather than combining with them.
This is most useful for helper functions that define both data and aesthetics and
shouldn’t inherit behaviour from the default plot specification, e.g. borders().
geom_linerange() understands the following aesthetics (required aesthetics are in bold):
• x
• ymin
• ymax
• alpha
• colour
• group
• linetype
• size
See Also
stat_summary() for examples of these geoms in use, geom_smooth() for a continuous analogue,
geom_errorbarh() for a horizontal error bar.
# Create a simple example dataset
df <- data.frame(
  trt = factor(c(1, 1, 2, 2)),
  resp = c(1, 5, 3, 4),
  group = factor(c(1, 2, 1, 2)),
  upper = c(1.1, 5.3, 3.3, 4.2),
  lower = c(0.8, 4.6, 2.4, 3.6)
)
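A short sketch of how the vertical interval geoms might be applied to a dataset like the one above. The data frame is repeated so the block runs on its own; the width values and object names are illustrative choices.

```r
library(ggplot2)

df <- data.frame(
  trt = factor(c(1, 1, 2, 2)),
  resp = c(1, 5, 3, 4),
  group = factor(c(1, 2, 1, 2)),
  upper = c(1.1, 5.3, 3.3, 4.2),
  lower = c(0.8, 4.6, 2.4, 3.6)
)

p <- ggplot(df, aes(trt, resp, colour = group))

# Each geom draws the same interval (ymin to ymax) in a different style
p_linerange  <- p + geom_linerange(aes(ymin = lower, ymax = upper))
p_pointrange <- p + geom_pointrange(aes(ymin = lower, ymax = upper))
p_crossbar   <- p + geom_crossbar(aes(ymin = lower, ymax = upper), width = 0.2)
p_errorbar   <- p + geom_errorbar(aes(ymin = lower, ymax = upper), width = 0.2)

p_crossbar
```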
geom_density 59
Computes and draws a kernel density estimate, which is a smoothed version of the histogram. This
is a useful alternative to the histogram for continuous data that comes from an underlying smooth
distribution.
geom_density(mapping = NULL, data = NULL, stat = "density",
position = "identity", ..., na.rm = FALSE, show.legend = NA,
inherit.aes = TRUE)
mapping Set of aesthetic mappings created by aes() or aes_(). If specified and inherit.aes = TRUE
(the default), it is combined with the default mapping at the top level of the plot.
You must supply mapping if there is no plot mapping.
data The data to be displayed in this layer. There are three options:
If NULL, the default, the data is inherited from the plot data as specified in the
call to ggplot().
A data.frame, or other object, will override the plot data. All objects will be
fortified to produce a data frame. See fortify() for which variables will be created.
A function will be called with a single argument, the plot data. The return
value must be a data.frame, and will be used as the layer data.
position Position adjustment, either as a string, or the result of a call to a position
adjustment function.
... Other arguments passed on to layer(). These are often aesthetics, used to set
an aesthetic to a fixed value, like colour = "red" or size = 3. They may also
be parameters to the paired geom/stat.
na.rm If FALSE, the default, missing values are removed with a warning. If TRUE,
missing values are silently removed.
show.legend logical. Should this layer be included in the legends? NA, the default, includes if
any aesthetics are mapped. FALSE never includes, and TRUE always includes. It
can also be a named logical vector to finely select the aesthetics to display.
inherit.aes If FALSE, overrides the default aesthetics, rather than combining with them.
This is most useful for helper functions that define both data and aesthetics and
shouldn’t inherit behaviour from the default plot specification, e.g. borders().
geom, stat Use to override the default connection between geom_density and stat_density.
bw The smoothing bandwidth to be used. If numeric, the standard deviation of
the smoothing kernel. If character, a rule to choose the bandwidth, as listed in
adjust A multiplicative bandwidth adjustment. This makes it possible to adjust the
bandwidth while still using a bandwidth estimator. For example, adjust = 1/2
means use half of the default bandwidth.
kernel Kernel. See list of available kernels in density().
n number of equally spaced points at which the density is to be estimated, should
be a power of two, see density() for details
trim This parameter only matters if you are displaying multiple densities in one plot.
If FALSE, the default, each density is computed on the full range of the data.
If TRUE, each density is computed over the range of that group: this typically
means the estimated x values will not line up, and hence you won’t be able to
stack density values.
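The effect of adjust on the bandwidth can be checked directly with stats::density(), which these arguments refer to. This is a sketch on simulated data; the seed and sample size are arbitrary assumptions.

```r
set.seed(1)                      # arbitrary seed, for reproducibility only
x <- rnorm(200)

d_default <- density(x)          # bandwidth chosen by the default estimator
d_half    <- density(x, adjust = 1/2)

# adjust multiplies the estimated bandwidth
print(c(default = d_default$bw, half = d_half$bw))
stopifnot(isTRUE(all.equal(d_half$bw, d_default$bw / 2)))

# Number of evaluation points (the n argument); density() defaults to 512
length(d_default$x)
```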
geom_density() understands the following aesthetics (required aesthetics are in bold):
• x
• y
• alpha
• colour
• fill
• group
• linetype
• size
• weight
Learn more about setting these aesthetics in vignette("ggplot2-specs").
Computed variables
density density estimate
count density * number of points - useful for stacked density plots
scaled density estimate, scaled to maximum of 1
ndensity alias for scaled, to mirror the syntax of stat_bin()
See Also
See geom_histogram(), geom_freqpoly() for other methods of displaying continuous distributions.
See geom_violin() for a compact density display.
ggplot(diamonds, aes(carat)) +
  geom_density(adjust = 1/5)
ggplot(diamonds, aes(carat)) +
  geom_density(adjust = 5)
# Stacked density plots: if you want to create a stacked density plot, you
# probably want to use the 'count' (density * n) variable instead of the default
# density
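A stacked density plot along the lines of the comment above might look like the following sketch. It maps y to the computed count variable with stat(), the helper this manual uses elsewhere (for example stat(level)); later ggplot2 releases provide after_stat() for the same purpose. The choice of fill = cut is an illustrative assumption.

```r
library(ggplot2)

# Map y to the computed 'count' variable (density * n) so that the
# stacked heights reflect group sizes rather than per-group densities.
p_stack <- ggplot(diamonds, aes(carat, stat(count), fill = cut)) +
  geom_density(position = "stack")
p_stack
```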
Perform a 2D kernel density estimation using MASS::kde2d() and display the results with contours.
This can be useful for dealing with overplotting. This is a 2d version of geom_density().
geom_density_2d(mapping = NULL, data = NULL, stat = "density2d",
position = "identity", ..., lineend = "butt", linejoin = "round",
linemitre = 10, na.rm = FALSE, show.legend = NA,
inherit.aes = TRUE)
mapping Set of aesthetic mappings created by aes() or aes_(). If specified and inherit.aes = TRUE
(the default), it is combined with the default mapping at the top level of the plot.
You must supply mapping if there is no plot mapping.
data The data to be displayed in this layer. There are three options:
If NULL, the default, the data is inherited from the plot data as specified in the
call to ggplot().
A data.frame, or other object, will override the plot data. All objects will be
fortified to produce a data frame. See fortify() for which variables will be created.
A function will be called with a single argument, the plot data. The return
value must be a data.frame, and will be used as the layer data.
geom_density_2d 63
position Position adjustment, either as a string, or the result of a call to a position
adjustment function.
... Other arguments passed on to layer(). These are often aesthetics, used to set
an aesthetic to a fixed value, like colour = "red" or size = 3. They may also
be parameters to the paired geom/stat.
lineend Line end style (round, butt, square).
linejoin Line join style (round, mitre, bevel).
linemitre Line mitre limit (number greater than 1).
na.rm If FALSE, the default, missing values are removed with a warning. If TRUE,
missing values are silently removed.
show.legend logical. Should this layer be included in the legends? NA, the default, includes if
any aesthetics are mapped. FALSE never includes, and TRUE always includes. It
can also be a named logical vector to finely select the aesthetics to display.
inherit.aes If FALSE, overrides the default aesthetics, rather than combining with them.
This is most useful for helper functions that define both data and aesthetics and
shouldn’t inherit behaviour from the default plot specification, e.g. borders().
geom, stat Use to override the default connection between geom_density_2d and stat_density_2d.
contour If TRUE, contour the results of the 2d density estimation
n number of grid points in each direction
h Bandwidth (vector of length two). If NULL, estimated using MASS::bandwidth.nrd().
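The n and h arguments correspond to inputs of MASS::kde2d(). The sketch below calls it directly on the faithful data to show what the underlying estimate looks like; it assumes the MASS package is installed, and n = 100 is simply a value chosen for the example.

```r
# Requires the MASS package, which provides kde2d() and bandwidth.nrd().
x <- faithful$eruptions
y <- faithful$waiting

# One bandwidth per direction, estimated as described for h = NULL
h <- c(MASS::bandwidth.nrd(x), MASS::bandwidth.nrd(y))

# n grid points in each direction
dens <- MASS::kde2d(x, y, h = h, n = 100)

str(dens)        # list with grid coordinates x, y and density matrix z
dim(dens$z)      # one density value per grid cell
```

The matrix dens$z is the kind of gridded surface that is then contoured when contour = TRUE.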
geom_density_2d() understands the following aesthetics (required aesthetics are in bold):
• x
• y
• alpha
• colour
• group
• linetype
• size
Computed variables
Same as stat_contour()
With the addition of:
See Also
geom_contour() for information about how contours are drawn; geom_bin2d() for another way
of dealing with overplotting.
m <- ggplot(faithful, aes(x = eruptions, y = waiting)) +
geom_point() +
xlim(0.5, 6) +
ylim(40, 110)
m + geom_density_2d()
dsmall <- diamonds[sample(nrow(diamonds), 1000), ]
d <- ggplot(dsmall, aes(x, y))
# If you map an aesthetic to a categorical variable, you will get a
# set of contours for each value of that variable
d + geom_density_2d(aes(colour = cut))
In a dot plot, the width of a dot corresponds to the bin width (or maximum width, depending on the
binning algorithm), and dots are stacked, with each dot representing one observation.
geom_dotplot(mapping = NULL, data = NULL, position = "identity", ...,
binwidth = NULL, binaxis = "x", method = "dotdensity",
binpositions = "bygroup", stackdir = "up", stackratio = 1,
geom_dotplot 65
mapping Set of aesthetic mappings created by aes() or aes_(). If specified and inherit.aes = TRUE
(the default), it is combined with the default mapping at the top level of the plot.
You must supply mapping if there is no plot mapping.
data The data to be displayed in this layer. There are three options:
If NULL, the default, the data is inherited from the plot data as specified in the
call to ggplot().
A data.frame, or other object, will override the plot data. All objects will be
fortified to produce a data frame. See fortify() for which variables will be created.
A function will be called with a single argument, the plot data. The return
value must be a data.frame, and will be used as the layer data.
position Position adjustment, either as a string, or the result of a call to a position
adjustment function.
... Other arguments passed on to layer(). These are often aesthetics, used to set
an aesthetic to a fixed value, like colour = "red" or size = 3. They may also
be parameters to the paired geom/stat.
binwidth When method is "dotdensity", this specifies maximum bin width. When method
is "histodot", this specifies bin width. Defaults to 1/30 of the range of the data
binaxis The axis to bin along, "x" (default) or "y"
method "dotdensity" (default) for dot-density binning, or "histodot" for fixed bin widths
(like stat_bin)
binpositions When method is "dotdensity", "bygroup" (default) determines positions of the
bins for each group separately. "all" determines positions of the bins with all the
data taken together; this is used for aligning dot stacks across multiple groups.
stackdir which direction to stack the dots. "up" (default), "down", "center",
"centerwhole" (centered, but with dots aligned)
stackratio how close to stack the dots. Default is 1, where dots just touch. Use smaller
values for closer, overlapping dots.
dotsize The diameter of the dots relative to binwidth, default 1.
stackgroups should dots be stacked across groups? This has the effect that position = "stack"
should have, but can’t (because this geom has some odd properties).
origin When method is "histodot", origin of first bin
right When method is "histodot", should intervals be closed on the right (a, b], or not
[a, b)
width When binaxis is "y", the spacing of the dot stacks for dodging.
drop If TRUE, remove all bins with zero counts
na.rm If FALSE, the default, missing values are removed with a warning. If TRUE,
missing values are silently removed.
show.legend logical. Should this layer be included in the legends? NA, the default, includes if
any aesthetics are mapped. FALSE never includes, and TRUE always includes. It
can also be a named logical vector to finely select the aesthetics to display.
inherit.aes If FALSE, overrides the default aesthetics, rather than combining with them.
This is most useful for helper functions that define both data and aesthetics and
shouldn’t inherit behaviour from the default plot specification, e.g. borders().
There are two basic approaches: dot-density and histodot. With dot-density binning, the bin
positions are determined by the data and binwidth, which is the maximum width of each bin. See
Wilkinson (1999) for details on the dot-density binning algorithm. With histodot binning, the bins
have fixed positions and fixed widths, much like a histogram.
When binning along the x axis and stacking along the y axis, the numbers on the y axis are not
meaningful, due to technical limitations of ggplot2. You can hide the y axis, as in one of the examples,
or manually scale it to match the number of dots.
geom_dotplot() understands the following aesthetics (required aesthetics are in bold):
• x
• y
• alpha
• colour
• fill
• group
Computed variables
x center of each bin, if binaxis is "x"
y center of each bin, if binaxis is "y"
binwidth max width of each bin if method is "dotdensity"; width of each bin if method is "histodot"
count number of points in bin
ncount count, scaled to maximum of 1
density density of points in bin, scaled to integrate to 1, if method is "histodot"
ndensity density, scaled to maximum of 1, if method is "histodot"
Wilkinson, L. (1999) Dot plots. The American Statistician, 53(3), 276-281.
ggplot(mtcars, aes(x = mpg)) + geom_dotplot()
ggplot(mtcars, aes(x = mpg)) + geom_dotplot(binwidth = 1.5)
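The two binning methods and the advice about hiding the y axis, described in the geom_dotplot() details, can be sketched as follows. The binwidth of 1.5 is carried over from the example above; the object names are illustrative.

```r
library(ggplot2)

base <- ggplot(mtcars, aes(x = mpg))

# Dot-density binning (the default), with the uninformative y axis hidden
p_dotdensity <- base +
  geom_dotplot(binwidth = 1.5) +
  scale_y_continuous(NULL, breaks = NULL)

# Histodot binning: fixed bin positions and widths, like a histogram
p_histodot <- base +
  geom_dotplot(method = "histodot", binwidth = 1.5) +
  scale_y_continuous(NULL, breaks = NULL)

p_dotdensity
p_histodot
```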
mapping Set of aesthetic mappings created by aes() or aes_(). If specified and inherit.aes = TRUE
(the default), it is combined with the default mapping at the top level of the plot.
You must supply mapping if there is no plot mapping.
data The data to be displayed in this layer. There are three options:
If NULL, the default, the data is inherited from the plot data as specified in the
call to ggplot().
A data.frame, or other object, will override the plot data. All objects will be
fortified to produce a data frame. See fortify() for which variables will be created.
A function will be called with a single argument, the plot data. The return
value must be a data.frame, and will be used as the layer data.
stat The statistical transformation to use on the data for this layer, as a string.
position Position adjustment, either as a string, or the result of a call to a position
adjustment function.
... Other arguments passed on to layer(). These are often aesthetics, used to set
an aesthetic to a fixed value, like colour = "red" or size = 3. They may also
be parameters to the paired geom/stat.
na.rm If FALSE, the default, missing values are removed with a warning. If TRUE,
missing values are silently removed.
show.legend logical. Should this layer be included in the legends? NA, the default, includes if
any aesthetics are mapped. FALSE never includes, and TRUE always includes. It
can also be a named logical vector to finely select the aesthetics to display.
inherit.aes If FALSE, overrides the default aesthetics, rather than combining with them.
This is most useful for helper functions that define both data and aesthetics and
shouldn’t inherit behaviour from the default plot specification, e.g. borders().
geom_freqpoly 69
• xmin
• xmax
• y
• alpha
• colour
• group
• height
• linetype
• size
df <- data.frame(
  trt = factor(c(1, 1, 2, 2)),
  resp = c(1, 5, 3, 4),
  group = factor(c(1, 2, 1, 2)),
  se = c(0.1, 0.3, 0.3, 0.2)
)
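Assuming this dataset and the xmin, xmax, y and height aesthetics listed above belong to the horizontal error bar geom, geom_errorbarh(), a minimal usage sketch could look like this. The height value is an arbitrary illustrative choice, and recent ggplot2 releases may report geom_errorbarh() as deprecated.

```r
library(ggplot2)

df <- data.frame(
  trt = factor(c(1, 1, 2, 2)),
  resp = c(1, 5, 3, 4),
  group = factor(c(1, 2, 1, 2)),
  se = c(0.1, 0.3, 0.3, 0.2)
)

# The interval runs horizontally, from resp - se to resp + se
p_h <- ggplot(df, aes(resp, trt, colour = group)) +
  geom_point() +
  geom_errorbarh(aes(xmin = resp - se, xmax = resp + se, height = 0.2))
p_h
```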
Visualise the distribution of a single continuous variable by dividing the x axis into bins and
counting the number of observations in each bin. Histograms (geom_histogram()) display the counts
with bars; frequency polygons (geom_freqpoly()) display the counts with lines. Frequency polygons
are more suitable when you want to compare the distribution across the levels of a categorical
variable.
geom_freqpoly(mapping = NULL, data = NULL, stat = "bin",
position = "identity", ..., na.rm = FALSE, show.legend = NA,
inherit.aes = TRUE)
mapping Set of aesthetic mappings created by aes() or aes_(). If specified and inherit.aes = TRUE
(the default), it is combined with the default mapping at the top level of the plot.
You must supply mapping if there is no plot mapping.
data The data to be displayed in this layer. There are three options:
If NULL, the default, the data is inherited from the plot data as specified in the
call to ggplot().
A data.frame, or other object, will override the plot data. All objects will be
fortified to produce a data frame. See fortify() for which variables will be created.
A function will be called with a single argument, the plot data. The return
value must be a data.frame, and will be used as the layer data.
position Position adjustment, either as a string, or the result of a call to a position
adjustment function.
... Other arguments passed on to layer(). These are often aesthetics, used to set
an aesthetic to a fixed value, like colour = "red" or size = 3. They may also
be parameters to the paired geom/stat.
na.rm If FALSE, the default, missing values are removed with a warning. If TRUE,
missing values are silently removed.
show.legend logical. Should this layer be included in the legends? NA, the default, includes if
any aesthetics are mapped. FALSE never includes, and TRUE always includes. It
can also be a named logical vector to finely select the aesthetics to display.
inherit.aes If FALSE, overrides the default aesthetics, rather than combining with them.
This is most useful for helper functions that define both data and aesthetics and
shouldn’t inherit behaviour from the default plot specification, e.g. borders().
binwidth The width of the bins. Can be specified as a numeric value, or a function that
calculates width from x. The default is to use the number of bins given by bins, covering the
range of the data. You should always override this value, exploring multiple widths to
find the best to illustrate the stories in your data.
The bin width of a date variable is the number of days in each bin; the bin
width of a time variable is the number of seconds.
bins Number of bins. Overridden by binwidth. Defaults to 30.
geom, stat Use to override the default connection between geom_histogram()/geom_freqpoly()
and stat_bin().
center, boundary
bin position specifiers. Only one, center or boundary, may be specified for a
single plot. center specifies the center of one of the bins. boundary specifies
the boundary between two bins. Note that if either is above or below the range
of the data, things will be shifted by the appropriate integer multiple of binwidth.
For example, to center on integers use binwidth = 1 and center = 0, even if 0 is
outside the range of the data. Alternatively, this same alignment can be specified
with binwidth = 1 and boundary = 0.5, even if 0.5 is outside the range of the
data.
breaks Alternatively, you can supply a numeric vector giving the bin boundaries. Over-
rides binwidth, bins, center, and boundary.
closed One of "right" or "left" indicating whether right or left edges of bins are
included in the bin.
pad If TRUE, adds empty bins at either end of x. This ensures frequency polygons
touch 0. Defaults to FALSE.
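The equivalence between center and boundary described for these arguments can be checked on integer-valued data. The sketch below uses mpg$cty and layer_data() to compare the resulting bins; the variable names are illustrative.

```r
library(ggplot2)

base <- ggplot(mpg, aes(cty))

# Two ways of asking for unit-width bins centred on the integers
by_center   <- base + geom_histogram(binwidth = 1, center = 0)
by_boundary <- base + geom_histogram(binwidth = 1, boundary = 0.5)

bins_center   <- layer_data(by_center)
bins_boundary <- layer_data(by_boundary)

# Both specifications should yield the same bin edges and counts
head(bins_center[, c("xmin", "xmax", "count")])
all.equal(bins_center$count, bins_boundary$count)
```

Neither 0 nor 0.5 lies inside the range of cty, which illustrates that the anchor only fixes the alignment of the bins, not their extent.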
stat_bin() is suitable only for continuous x data. If your x data is discrete, you probably want to
use stat_count().
By default, the underlying computation (stat_bin()) uses 30 bins; this is not a good default, but
the idea is to get you experimenting with different bin widths. You may need to look at a few to
uncover the full story behind your data.
geom_histogram() uses the same aesthetics as geom_bar(); geom_freqpoly() uses the same
aesthetics as geom_line().
Computed variables
count number of points in bin
density density of points in bin, scaled to integrate to 1
ncount count, scaled to maximum of 1
ndensity density, scaled to maximum of 1
See Also
stat_count(), which counts the number of cases at each x position, without binning. It is suitable
for both discrete and continuous x data, whereas stat_bin() is suitable only for continuous x data.
ggplot(diamonds, aes(carat)) +
  geom_histogram(binwidth = 0.01)
ggplot(diamonds, aes(carat)) +
  geom_histogram(bins = 200)
if (require("ggplot2movies")) {
  # Often we don't want the height of the bar to represent the
  # count of observations, but the sum of some other variable.
  # For example, the following plot shows the number of movies
  # in each rating.
  m <- ggplot(movies, aes(rating))
  m + geom_histogram(binwidth = 0.1)
  # Using log scales does not work here, because the first
  # bar is anchored at zero, and so when transformed becomes negative
  # infinity. This is not a problem when transforming the scales, because
  # no observations have 0 ratings.
  m + geom_histogram(boundary = 0) + coord_trans(x = "log10")
  # Use boundary = 0, to make sure we don't take sqrt of negative values
  m + geom_histogram(boundary = 0) + coord_trans(x = "sqrt")
  # You can also transform the y axis. Remember that the base of the bars
  # has value 0, so log transformations are not appropriate
  m <- ggplot(movies, aes(x = rating))
  m + geom_histogram(binwidth = 0.5) + scale_y_sqrt()
}
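To illustrate the point that frequency polygons suit comparisons across the levels of a categorical variable, here is a sketch using the diamonds data. The binwidth of 500 is an assumption chosen for the example; stat(density) refers to the computed density variable listed above (later releases offer after_stat() instead).

```r
library(ggplot2)

# One line per level of cut; counts depend on group size
p_counts <- ggplot(diamonds, aes(price, colour = cut)) +
  geom_freqpoly(binwidth = 500)

# Plotting the computed density instead makes groups of very
# different sizes easier to compare
p_density <- ggplot(diamonds, aes(price, stat(density), colour = cut)) +
  geom_freqpoly(binwidth = 500)

p_counts
p_density
```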
geom_hex 73
Divides the plane into regular hexagons, counts the number of cases in each hexagon, and then
(by default) maps the number of cases to the hexagon fill. Hexagon bins avoid the visual artefacts
sometimes generated by the very regular alignment of geom_bin2d().
geom_hex(mapping = NULL, data = NULL, stat = "binhex",
position = "identity", ..., na.rm = FALSE, show.legend = NA,
inherit.aes = TRUE)
mapping Set of aesthetic mappings created by aes() or aes_(). If specified and inherit.aes = TRUE
(the default), it is combined with the default mapping at the top level of the plot.
You must supply mapping if there is no plot mapping.
data The data to be displayed in this layer. There are three options:
If NULL, the default, the data is inherited from the plot data as specified in the
call to ggplot().
A data.frame, or other object, will override the plot data. All objects will be
fortified to produce a data frame. See fortify() for which variables will be created.
A function will be called with a single argument, the plot data. The return
value must be a data.frame, and will be used as the layer data.
position Position adjustment, either as a string, or the result of a call to a position
adjustment function.
... Other arguments passed on to layer(). These are often aesthetics, used to set
an aesthetic to a fixed value, like colour = "red" or size = 3. They may also
be parameters to the paired geom/stat.
na.rm If FALSE, the default, missing values are removed with a warning. If TRUE,
missing values are silently removed.
show.legend logical. Should this layer be included in the legends? NA, the default, includes if
any aesthetics are mapped. FALSE never includes, and TRUE always includes. It
can also be a named logical vector to finely select the aesthetics to display.
inherit.aes If FALSE, overrides the default aesthetics, rather than combining with them.
This is most useful for helper functions that define both data and aesthetics and
shouldn’t inherit behaviour from the default plot specification, e.g. borders().
geom, stat Override the default connection between geom_hex and stat_binhex.
bins numeric vector giving number of bins in both vertical and horizontal directions.
Set to 30 by default.
binwidth Numeric vector giving bin width in both vertical and horizontal directions. Over-
rides bins if both set.
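As a minimal sketch of the binwidth argument described above: the widths below are illustrative values in data units, not defaults, and drawing the plot assumes the hexbin package is installed.

```r
library(ggplot2)

d <- ggplot(diamonds, aes(carat, price))

# binwidth is given as c(width in x, width in y), in data units.
# Per the argument description, it overrides bins when both are set.
p_hex <- d + geom_hex(binwidth = c(0.5, 1000))
```

Printing p_hex draws the plot; each hexagon then spans 0.5 carat horizontally and 1000 price units vertically.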
geom_hex() understands the following aesthetics:
• x
• y
• alpha
• colour
• fill
• group
• linetype
• size
Computed variables
See Also
d <- ggplot(diamonds, aes(carat, price))
d + geom_hex()
# You can control the size of the bins by specifying the number of
# bins in each direction:
d + geom_hex(bins = 10)
d + geom_hex(bins = 30)
The jitter geom is a convenient shortcut for geom_point(position = "jitter"). It adds a small
amount of random variation to the location of each point, and is a useful way of handling overplot-
ting caused by discreteness in smaller datasets.
geom_jitter(mapping = NULL, data = NULL, stat = "identity",
position = "jitter", ..., width = NULL, height = NULL,
na.rm = FALSE, show.legend = NA, inherit.aes = TRUE)
mapping Set of aesthetic mappings created by aes() or aes_(). If specified and inherit.aes = TRUE
(the default), it is combined with the default mapping at the top level of the plot.
You must supply mapping if there is no plot mapping.
data The data to be displayed in this layer. There are three options:
If NULL, the default, the data is inherited from the plot data as specified in the
call to ggplot().
A data.frame, or other object, will override the plot data. All objects will be
fortified to produce a data frame. See fortify() for which variables will be created.
A function will be called with a single argument, the plot data. The return
value must be a data.frame, and will be used as the layer data.
stat The statistical transformation to use on the data for this layer, as a string.
position Position adjustment, either as a string, or the result of a call to a position adjust-
ment function.
... Other arguments passed on to layer(). These are often aesthetics, used to set
an aesthetic to a fixed value, like colour = "red" or size = 3. They may also
be parameters to the paired geom/stat.
width Amount of horizontal jitter. The jitter is added in both positive and
negative directions, so the total spread is twice the value specified here.
If omitted, defaults to 40% of the resolution of the data: this means the jitter
values will occupy 80% of the implied bins. Categorical data is aligned on the
integers, so a width or height of 0.5 will spread the data so it’s not possible to
see the distinction between the categories.
height Amount of vertical jitter. The jitter is added in both positive and
negative directions, so the total spread is twice the value specified here.
If omitted, defaults to 40% of the resolution of the data: this means the jitter
values will occupy 80% of the implied bins. Categorical data is aligned on the
integers, so a width or height of 0.5 will spread the data so it’s not possible to
see the distinction between the categories.
na.rm If FALSE, the default, missing values are removed with a warning. If TRUE,
missing values are silently removed.
show.legend logical. Should this layer be included in the legends? NA, the default, includes if
any aesthetics are mapped. FALSE never includes, and TRUE always includes. It
can also be a named logical vector to finely select the aesthetics to display.
inherit.aes If FALSE, overrides the default aesthetics, rather than combining with them.
This is most useful for helper functions that define both data and aesthetics and
shouldn’t inherit behaviour from the default plot specification, e.g. borders().
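A short sketch of how width and height can be set independently. The values 0.25 and 0 are illustrative choices, not defaults.

```r
library(ggplot2)

p <- ggplot(mpg, aes(cyl, hwy))

# cyl takes only a few distinct values, so jitter it horizontally.
# height = 0 leaves the hwy values exactly where they are.
p_jit <- p + geom_jitter(width = 0.25, height = 0)
```

Setting one direction to 0 is a common choice when only one variable is discrete and the other should be read off the axis accurately.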
geom_point() understands the following aesthetics (required aesthetics are in bold):
• x
• y
• alpha
• colour
• fill
• group
• shape
• size
• stroke
See Also
geom_point() for regular, unjittered points, geom_boxplot() for another way of looking at the
conditional distribution of a variable
p <- ggplot(mpg, aes(cyl, hwy))
p + geom_point()
p + geom_jitter()
geom_label Text
geom_text() adds text directly to the plot. geom_label() draws a rectangle behind the text, mak-
ing it easier to read.
geom_label(mapping = NULL, data = NULL, stat = "identity",
position = "identity", ..., parse = FALSE, nudge_x = 0,
nudge_y = 0, label.padding = unit(0.25, "lines"),
label.r = unit(0.15, "lines"), label.size = 0.25, na.rm = FALSE,
show.legend = NA, inherit.aes = TRUE)
mapping Set of aesthetic mappings created by aes() or aes_(). If specified and inherit.aes = TRUE
(the default), it is combined with the default mapping at the top level of the plot.
You must supply mapping if there is no plot mapping.
data The data to be displayed in this layer. There are three options:
If NULL, the default, the data is inherited from the plot data as specified in the
call to ggplot().
A data.frame, or other object, will override the plot data. All objects will be
fortified to produce a data frame. See fortify() for which variables will be created.
A function will be called with a single argument, the plot data. The return
value must be a data.frame, and will be used as the layer data.
stat The statistical transformation to use on the data for this layer, as a string.
position Position adjustment, either as a string, or the result of a call to a position adjust-
ment function.
... Other arguments passed on to layer(). These are often aesthetics, used to set
an aesthetic to a fixed value, like colour = "red" or size = 3. They may also
be parameters to the paired geom/stat.
parse If TRUE, the labels will be parsed into expressions and displayed as described in
nudge_x, nudge_y
Horizontal and vertical adjustment to nudge labels by. Useful for offsetting text
from points, particularly on discrete scales.
label.padding Amount of padding around label. Defaults to 0.25 lines.
label.r Radius of rounded corners. Defaults to 0.15 lines.
label.size Size of label border, in mm.
na.rm If FALSE, the default, missing values are removed with a warning. If TRUE,
missing values are silently removed.
show.legend logical. Should this layer be included in the legends? NA, the default, includes if
any aesthetics are mapped. FALSE never includes, and TRUE always includes. It
can also be a named logical vector to finely select the aesthetics to display.
inherit.aes If FALSE, overrides the default aesthetics, rather than combining with them.
This is most useful for helper functions that define both data and aesthetics and
shouldn’t inherit behaviour from the default plot specification, e.g. borders().
check_overlap If TRUE, text that overlaps previous text in the same layer will not be plotted.
Note that the "width" and "height" of a text element are 0, so stacking and dodging text will not
work by default, and axis limits are not automatically expanded to include all text. Obviously,
labels do have height and width, but they are physical units, not data units. The amount of space
they occupy on the plot is not constant in data units: when you resize a plot, labels stay the same
size, but the size of the axes changes.
geom_text() and geom_label() add labels for each row in the data, even if coordinates x, y are
set to single values in the call to geom_label() or geom_text(). To add labels at specified points
use annotate() with annotate(geom = "text", ...) or annotate(geom = "label", ...).
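For instance, a single label at a fixed position might be added as follows. The coordinates and label text are made up for illustration.

```r
library(ggplot2)

p <- ggplot(mtcars, aes(wt, mpg)) + geom_point()

# annotate() draws the label once, at the given coordinates,
# instead of once per row of mtcars.
p_note <- p + annotate(geom = "text", x = 5, y = 30, label = "Heavier cars, lower mpg")
```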
geom_text() understands the following aesthetics (required aesthetics are in bold):
• x
• y
• label
• alpha
• angle
• colour
• family
• fontface
• group
• hjust
• lineheight
• size
• vjust
Currently geom_label() does not support the angle aesthetic and is considerably slower than
geom_text(). The fill aesthetic controls the background colour of the label.
You can modify text alignment with the vjust and hjust aesthetics. These can either be a number
between 0 (left/bottom) and 1 (right/top) or a character ("left", "middle", "right", "bottom",
"center", "top"). There are two special alignments: "inward" and "outward". Inward always
aligns text towards the center, and outward aligns it away from the center.
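A small sketch of numeric justification, using a hypothetical one-point data frame. With hjust = 0 and vjust = 0 the left and bottom edges of the text sit on the point. With both set to 1 the right and top edges do.

```r
library(ggplot2)

# Hypothetical single point to anchor both labels
df_just <- data.frame(x = 1, y = 1)

p_just <- ggplot(df_just, aes(x, y)) +
  geom_point() +
  # left/bottom edge on the point: text extends up and to the right
  geom_text(label = "hjust = 0, vjust = 0", hjust = 0, vjust = 0) +
  # right/top edge on the point: text extends down and to the left
  geom_text(label = "hjust = 1, vjust = 1", hjust = 1, vjust = 1)
```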
p <- ggplot(mtcars, aes(wt, mpg, label = rownames(mtcars)))
p + geom_text()
# Avoid overlaps
p + geom_text(check_overlap = TRUE)
# Labels with background
p + geom_label()
# Change size of the label
p + geom_text(size = 10)
p + geom_text(aes(colour = factor(cyl))) +
  scale_colour_discrete(l = 40)
p + geom_label(aes(fill = factor(cyl)), colour = "white", fontface = "bold")
p + geom_text(aes(size = wt))
# Scale height of text, rather than sqrt(height)
p + geom_text(aes(size = wt)) + scale_radius(range = c(3,6))
# Example data (assumed here so that the code runs on its own)
df <- data.frame(
  x = factor(c(1, 1, 2, 2)),
  y = c(1, 3, 2, 1),
  grp = c("a", "b", "a", "b")
)
# ggplot2 doesn't know you want to give the labels the same virtual width
# as the bars:
ggplot(data = df, aes(x, y, group = grp)) +
  geom_col(aes(fill = grp), position = "dodge") +
  geom_text(aes(label = y), position = "dodge")
# So tell it:
ggplot(data = df, aes(x, y, group = grp)) +
  geom_col(aes(fill = grp), position = "dodge") +
  geom_text(aes(label = y), position = position_dodge(0.9))
# You can't nudge and dodge text, so instead adjust the y position
ggplot(data = df, aes(x, y, group = grp)) +
  geom_col(aes(fill = grp), position = "dodge") +
  geom_text(
    aes(label = y, y = y + 0.05),
    position = position_dodge(0.9),
    vjust = 0
  )
# Justification -------------------------------------------------------------
df <- data.frame(
  x = c(1, 1, 2, 2, 1.5),
  y = c(1, 2, 1, 2, 1.5),
  text = c("bottom-left", "top-left", "bottom-right", "top-right", "center")
)
ggplot(df, aes(x, y)) +
geom_text(aes(label = text))
ggplot(df, aes(x, y)) +
geom_text(aes(label = text), vjust = "inward", hjust = "inward")
This is pure annotation, so does not affect position scales.
geom_map(mapping = NULL, data = NULL, stat = "identity", ..., map,
na.rm = FALSE, show.legend = NA, inherit.aes = TRUE)
mapping Set of aesthetic mappings created by aes() or aes_(). If specified and inherit.aes = TRUE
(the default), it is combined with the default mapping at the top level of the plot.
You must supply mapping if there is no plot mapping.
data The data to be displayed in this layer. There are three options:
If NULL, the default, the data is inherited from the plot data as specified in the
call to ggplot().
A data.frame, or other object, will override the plot data. All objects will be
fortified to produce a data frame. See fortify() for which variables will be created.
A function will be called with a single argument, the plot data. The return
value must be a data.frame, and will be used as the layer data.
stat The statistical transformation to use on the data for this layer, as a string.
... Other arguments passed on to layer(). These are often aesthetics, used to set
an aesthetic to a fixed value, like colour = "red" or size = 3. They may also
be parameters to the paired geom/stat.
map Data frame that contains the map coordinates. This will typically be created
using fortify() on a spatial object. It must contain columns x or long, y or
lat, and region or id.
na.rm If FALSE, the default, missing values are removed with a warning. If TRUE,
missing values are silently removed.
show.legend logical. Should this layer be included in the legends? NA, the default, includes if
any aesthetics are mapped. FALSE never includes, and TRUE always includes. It
can also be a named logical vector to finely select the aesthetics to display.
inherit.aes If FALSE, overrides the default aesthetics, rather than combining with them.
This is most useful for helper functions that define both data and aesthetics and
shouldn’t inherit behaviour from the default plot specification, e.g. borders().
geom_map() understands the following aesthetics (required aesthetics are in bold):
• map_id
• alpha
• colour
• fill
• group
• linetype
• size
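# When using geom_polygon, you will typically need two data frames:
# one contains the coordinates of each polygon (positions), and the
# other the values associated with each polygon (values). An id
# variable links the two together
# Illustrative data (assumed here so that the code runs on its own):
# two unit squares side by side
values <- data.frame(id = c("a", "b"), value = c(3, 3.5))
positions <- data.frame(
  id = rep(c("a", "b"), each = 4),
  x = c(0, 1, 1, 0, 1, 2, 2, 1),
  y = c(0, 0, 1, 1, 0, 0, 1, 1)
)
ggplot(values) +
  geom_map(aes(map_id = id), map = positions) +
  expand_limits(positions)
ggplot(values, aes(fill = value)) +
  geom_map(aes(map_id = id), map = positions) +
  expand_limits(positions)
ggplot(values, aes(fill = value)) +
  geom_map(aes(map_id = id), map = positions) +
  expand_limits(positions) + ylim(0, 3)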
# Better example
crimes <- data.frame(state = tolower(rownames(USArrests)), USArrests)
crimesm <- reshape2::melt(crimes, id = 1)
if (require(maps)) {
  states_map <- map_data("state")
  ggplot(crimes, aes(map_id = state)) +
    geom_map(aes(fill = Murder), map = states_map) +
    expand_limits(x = states_map$long, y = states_map$lat)
  last_plot() + coord_map()
  ggplot(crimesm, aes(map_id = state)) +
    geom_map(aes(fill = value), map = states_map) +
    expand_limits(x = states_map$long, y = states_map$lat) +
    facet_wrap( ~ variable)
}
geom_path() connects the observations in the order in which they appear in the data. geom_line()
connects them in order of the variable on the x axis. geom_step() creates a stairstep plot,
highlighting exactly when changes occur. The group aesthetic determines which cases are connected
together.
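The difference between the three geoms is easiest to see on data whose row order differs from the order of x. The data frame below is hypothetical and exists only for this sketch.

```r
library(ggplot2)

# Hypothetical data: rows are not sorted by x
walk <- data.frame(x = c(1, 3, 2, 4), y = c(1, 2, 3, 4))
base <- ggplot(walk, aes(x, y))

p_path <- base + geom_path()  # joins rows as listed: x = 1, 3, 2, 4
p_line <- base + geom_line()  # joins points in order of x: 1, 2, 3, 4
p_step <- base + geom_step()  # stairstep plot of the same points
```

Printing each object in turn shows the path doubling back on itself, while the line and the step plot move left to right.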
mapping Set of aesthetic mappings created by aes() or aes_(). If specified and inherit.aes = TRUE
(the default), it is combined with the default mapping at the top level of the plot.
You must supply mapping if there is no plot mapping.
data The data to be displayed in this layer. There are three options:
If NULL, the default, the data is inherited from the plot data as specified in the
call to ggplot().
A data.frame, or other object, will override the plot data. All objects will be
fortified to produce a data frame. See fortify() for which variables will be created.
A function will be called with a single argument, the plot data. The return
value must be a data.frame, and will be used as the layer data.
stat The statistical transformation to use on the data for this layer, as a string.
position Position adjustment, either as a string, or the result of a call to a position adjust-
ment function.
... Other arguments passed on to layer(). These are often aesthetics, used to set
an aesthetic to a fixed value, like colour = "red" or size = 3. They may also
be parameters to the paired geom/stat.
lineend Line end style (round, butt, square).
linejoin Line join style (round, mitre, bevel).
linemitre Line mitre limit (number greater than 1).
arrow Arrow specification, as created by grid::arrow().
na.rm If FALSE, the default, missing values are removed with a warning. If TRUE,
missing values are silently removed.
show.legend logical. Should this layer be included in the legends? NA, the default, includes if
any aesthetics are mapped. FALSE never includes, and TRUE always includes. It
can also be a named logical vector to finely select the aesthetics to display.
inherit.aes If FALSE, overrides the default aesthetics, rather than combining with them.
This is most useful for helper functions that define both data and aesthetics and
shouldn’t inherit behaviour from the default plot specification, e.g. borders().
direction Direction of stairs: 'vh' for vertical then horizontal, or 'hv' for horizontal then
vertical.
An alternative parameterisation is geom_segment(), where each line corresponds to a single case
which provides the start and end coordinates.
geom_path() understands the following aesthetics (required aesthetics are in bold):
• x
• y
• alpha
• colour
• group
• linetype
• size
Learn more about setting these aesthetics in vignette("ggplot2-specs").
• If an NA occurs in the middle of a line, it breaks the line. No warning is shown, regardless of
whether na.rm is TRUE or FALSE.
• If an NA occurs at the start or the end of the line and na.rm is FALSE (default), the NA is removed
with a warning.
• If an NA occurs at the start or the end of the line and na.rm is TRUE, the NA is removed silently,
without warning.
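A minimal sketch of the first case, using a hypothetical five-row data frame with an NA in the middle:

```r
library(ggplot2)

# Hypothetical data with a missing y value in the middle of the series
df_na <- data.frame(x = 1:5, y = c(1, 2, NA, 4, 5))

# The NA breaks the line: one piece joins x = 1 and 2, another x = 4 and 5.
p_gap <- ggplot(df_na, aes(x, y)) + geom_line()
```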
See Also
geom_polygon(): Filled paths (polygons); geom_segment(): Line segments
# geom_line() is suitable for time series
ggplot(economics, aes(date, unemploy)) + geom_line()
ggplot(economics_long, aes(date, value01, colour = variable)) +
  geom_line()
# geom_path lets you explore how two variables are related over time,
# e.g. unemployment and personal savings rate
m <- ggplot(economics, aes(unemploy/pop, psavert))
m + geom_path()
m + geom_path(aes(colour = as.numeric(date)))
geom_point Points
The point geom is used to create scatterplots. The scatterplot is most useful for displaying the rela-
tionship between two continuous variables. It can be used to compare one continuous and one cat-
egorical variable, or two categorical variables, but a variation like geom_jitter(), geom_count(),
or geom_bin2d() is usually more appropriate. A bubblechart is a scatterplot with a third variable
mapped to the size of points.
geom_point(mapping = NULL, data = NULL, stat = "identity",
position = "identity", ..., na.rm = FALSE, show.legend = NA,
inherit.aes = TRUE)
mapping Set of aesthetic mappings created by aes() or aes_(). If specified and inherit.aes = TRUE
(the default), it is combined with the default mapping at the top level of the plot.
You must supply mapping if there is no plot mapping.
data The data to be displayed in this layer. There are three options:
If NULL, the default, the data is inherited from the plot data as specified in the
call to ggplot().
A data.frame, or other object, will override the plot data. All objects will be
fortified to produce a data frame. See fortify() for which variables will be created.
A function will be called with a single argument, the plot data. The return
value must be a data.frame, and will be used as the layer data.
stat The statistical transformation to use on the data for this layer, as a string.
position Position adjustment, either as a string, or the result of a call to a position adjust-
ment function.
... Other arguments passed on to layer(). These are often aesthetics, used to set
an aesthetic to a fixed value, like colour = "red" or size = 3. They may also
be parameters to the paired geom/stat.
na.rm If FALSE, the default, missing values are removed with a warning. If TRUE,
missing values are silently removed.
show.legend logical. Should this layer be included in the legends? NA, the default, includes if
any aesthetics are mapped. FALSE never includes, and TRUE always includes. It
can also be a named logical vector to finely select the aesthetics to display.
inherit.aes If FALSE, overrides the default aesthetics, rather than combining with them.
This is most useful for helper functions that define both data and aesthetics and
shouldn’t inherit behaviour from the default plot specification, e.g. borders().
The biggest potential problem with a scatterplot is overplotting: whenever you have more than a few
points, points may be plotted on top of one another. This can severely distort the visual appearance
of the plot. There is no one solution to this problem, but there are some techniques that can help. You
can add additional information with geom_smooth(), geom_quantile() or geom_density_2d().
If you have few unique x values, geom_boxplot() may also be useful.
Alternatively, you can summarise the number of points at each location and display that in some
way, using geom_count(), geom_hex(), or geom_density2d().
Another technique is to make the points transparent (e.g. geom_point(alpha = 0.05)) or very
small (e.g. geom_point(shape = ".")).
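The techniques above can be sketched side by side. The datasets ship with ggplot2, and the particular alpha value is only a starting point to tune by eye.

```r
library(ggplot2)

d <- ggplot(diamonds, aes(carat, price))

p_alpha <- d + geom_point(alpha = 0.05)  # transparency: dense regions appear darker
p_small <- d + geom_point(shape = ".")   # very small points

# Count the points at each location and map the count to point size
p_count <- ggplot(mpg, aes(cty, hwy)) + geom_count()
```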
geom_point() understands the following aesthetics (required aesthetics are in bold):
• x
• y
• alpha
• colour
• fill
• group
• shape
• size
• stroke
p <- ggplot(mtcars, aes(wt, mpg))
p + geom_point()
# For shapes that have a border (like 21), you can colour the inside and
# outside separately. Use the stroke aesthetic to modify the width of the
# border
ggplot(mtcars, aes(wt, mpg)) +
geom_point(shape = 21, colour = "black", fill = "white", size = 5, stroke = 5)
# geom_point warns when missing values have been dropped from the data set
# and not plotted, you can turn this off by setting na.rm = TRUE
mtcars2 <- transform(mtcars, mpg = ifelse(runif(32) < 0.2, NA, mpg))
ggplot(mtcars2, aes(wt, mpg)) + geom_point()
ggplot(mtcars2, aes(wt, mpg)) + geom_point(na.rm = TRUE)
geom_polygon Polygons
Polygons are very similar to paths (as drawn by geom_path()) except that the start and end points
are connected and the inside is coloured by fill. The group aesthetic determines which cases are
connected together into a polygon.
geom_polygon(mapping = NULL, data = NULL, stat = "identity",
position = "identity", ..., na.rm = FALSE, show.legend = NA,
inherit.aes = TRUE)
mapping Set of aesthetic mappings created by aes() or aes_(). If specified and inherit.aes = TRUE
(the default), it is combined with the default mapping at the top level of the plot.
You must supply mapping if there is no plot mapping.
data The data to be displayed in this layer. There are three options:
If NULL, the default, the data is inherited from the plot data as specified in the
call to ggplot().
A data.frame, or other object, will override the plot data. All objects will be
fortified to produce a data frame. See fortify() for which variables will be created.
A function will be called with a single argument, the plot data. The return
value must be a data.frame, and will be used as the layer data.
stat The statistical transformation to use on the data for this layer, as a string.
position Position adjustment, either as a string, or the result of a call to a position adjust-
ment function.
... Other arguments passed on to layer(). These are often aesthetics, used to set
an aesthetic to a fixed value, like colour = "red" or size = 3. They may also
be parameters to the paired geom/stat.
na.rm If FALSE, the default, missing values are removed with a warning. If TRUE,
missing values are silently removed.
show.legend logical. Should this layer be included in the legends? NA, the default, includes if
any aesthetics are mapped. FALSE never includes, and TRUE always includes. It
can also be a named logical vector to finely select the aesthetics to display.
inherit.aes If FALSE, overrides the default aesthetics, rather than combining with them.
This is most useful for helper functions that define both data and aesthetics and
shouldn’t inherit behaviour from the default plot specification, e.g. borders().
geom_polygon() understands the following aesthetics (required aesthetics are in bold):
• x
• y
• alpha
• colour
• fill
• group
• linetype
• size
See Also
geom_path() for an unfilled polygon, geom_ribbon() for a polygon anchored on the x-axis
# When using geom_polygon, you will typically need two data frames:
# one contains the coordinates of each polygon (positions), and the
# other the values associated with each polygon (values). An id
# variable links the two together
# Which seems like a lot of work, but then it's easy to add on
# other features in this coordinate system, e.g.:
# And if the positions are in longitude and latitude, you can use
# coord_map to produce different map projections.
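The two-data-frame layout described in the comments above might look like the following sketch. The values, positions and datapoly objects are small hypothetical examples (two unit squares), not the data from the original help page.

```r
library(ggplot2)

# Hypothetical data: one row per polygon, with the value to display
values <- data.frame(id = c("a", "b"), value = c(3, 3.5))

# Hypothetical data: one row per polygon vertex
positions <- data.frame(
  id = rep(c("a", "b"), each = 4),
  x = c(0, 1, 1, 0, 1, 2, 2, 1),
  y = c(0, 0, 1, 1, 0, 0, 1, 1)
)

# Join the per-polygon values onto the coordinates via the shared id
datapoly <- merge(values, positions, by = "id")

p_poly <- ggplot(datapoly, aes(x, y)) +
  geom_polygon(aes(fill = value, group = id))
```

Further layers can then be added to p_poly in the same coordinate system.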
geom_qq and stat_qq produce quantile-quantile plots. geom_qq_line and stat_qq_line compute
the slope and intercept of the line connecting the points at specified quartiles of the theoretical and
sample distributions.
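A minimal sketch of a Q-Q plot with its reference line. The sample is simulated for illustration and the default normal distribution is used.

```r
library(ggplot2)

# Hypothetical sample: 100 draws from a t distribution with 5 degrees of freedom
set.seed(1)
df_qq <- data.frame(y = rt(100, df = 5))

# stat_qq() plots sample quantiles against theoretical (normal) quantiles.
# stat_qq_line() adds the line through the quartiles given by line.p.
p_qq <- ggplot(df_qq, aes(sample = y)) +
  stat_qq() +
  stat_qq_line()
```

Points that bend away from the line at either end suggest heavier or lighter tails than the reference distribution.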
geom_qq_line(mapping = NULL, data = NULL, geom = "path",
position = "identity", ..., distribution = stats::qnorm,
dparams = list(), line.p = c(0.25, 0.75), fullrange = FALSE,
na.rm = FALSE, show.legend = NA, inherit.aes = TRUE)
mapping Set of aesthetic mappings created by aes() or aes_(). If specified and inherit.aes = TRUE
(the default), it is combined with the default mapping at the top level of the plot.
You must supply mapping if there is no plot mapping.
data The data to be displayed in this layer. There are three options:
If NULL, the default, the data is inherited from the plot data as specified in the
call to ggplot().
A data.frame, or other object, will override the plot data. All objects will be
fortified to produce a data frame. See fortify() for which variables will be created.
A function will be called with a single argument, the plot data. The return
value must be a data.frame, and will be used as the layer data.
geom The geometric object to use to display the data.
position Position adjustment, either as a string, or the result of a call to a position adjust-
ment function.
... Other arguments passed on to layer(). These are often aesthetics, used to set
an aesthetic to a fixed value, like colour = "red" or size = 3. They may also
be parameters to the paired geom/stat.
distribution Distribution function to use, if x not specified
dparams Additional parameters passed on to distribution function.
line.p Vector of quantiles to use when fitting the Q-Q line; defaults to c(.25, .75).
fullrange Should the Q-Q line span the full range of the plot, or just the data?
na.rm If FALSE, the default, missing values are removed with a warning. If TRUE,
missing values are silently removed.
show.legend logical. Should this layer be included in the legends? NA, the default, includes if
any aesthetics are mapped. FALSE never includes, and TRUE always includes. It
can also be a named logical vector to finely select the aesthetics to display.
inherit.aes If FALSE, overrides the default aesthetics, rather than combining with them.
This is most useful for helper functions that define both data and aesthetics and
shouldn’t inherit behaviour from the default plot specification, e.g. borders().
stat_qq() understands the following aesthetics (required aesthetics are in bold):
• sample
• group
• x
• y
Learn more about setting these aesthetics in vignette("ggplot2-specs").
stat_qq_line() understands the following aesthetics (required aesthetics are in bold):
• sample
• group
• x
• y
Learn more about setting these aesthetics in vignette("ggplot2-specs").
Computed variables
Variables computed by stat_qq:
sample sample quantiles
theoretical theoretical quantiles
Variables computed by stat_qq_line:
x x-coordinates of the endpoints of the line segment connecting the points at the chosen quantiles
of the theoretical and the sample distributions
y y-coordinates of the endpoints
This fits a quantile regression to the data and draws the fitted quantiles with lines. This is a
continuous analogue to geom_boxplot().
geom_quantile(mapping = NULL, data = NULL, stat = "quantile",
position = "identity", ..., lineend = "butt", linejoin = "round",
linemitre = 10, na.rm = FALSE, show.legend = NA,
inherit.aes = TRUE)
mapping Set of aesthetic mappings created by aes() or aes_(). If specified and inherit.aes = TRUE
(the default), it is combined with the default mapping at the top level of the plot.
You must supply mapping if there is no plot mapping.
data The data to be displayed in this layer. There are three options:
If NULL, the default, the data is inherited from the plot data as specified in the
call to ggplot().
A data.frame, or other object, will override the plot data. All objects will be
fortified to produce a data frame. See fortify() for which variables will be created.
A function will be called with a single argument, the plot data. The return
value must be a data.frame, and will be used as the layer data.
position Position adjustment, either as a string, or the result of a call to a position adjust-
ment function.
... Other arguments passed on to layer(). These are often aesthetics, used to set
an aesthetic to a fixed value, like colour = "red" or size = 3. They may also
be parameters to the paired geom/stat.
lineend Line end style (round, butt, square).
linejoin Line join style (round, mitre, bevel).
linemitre Line mitre limit (number greater than 1).
na.rm If FALSE, the default, missing values are removed with a warning. If TRUE,
missing values are silently removed.
show.legend logical. Should this layer be included in the legends? NA, the default, includes if
any aesthetics are mapped. FALSE never includes, and TRUE always includes. It
can also be a named logical vector to finely select the aesthetics to display.
inherit.aes If FALSE, overrides the default aesthetics, rather than combining with them.
This is most useful for helper functions that define both data and aesthetics and
shouldn’t inherit behaviour from the default plot specification, e.g. borders().
geom, stat Use to override the default connection between geom_quantile and stat_quantile.
quantiles conditional quantiles of y to calculate and display
formula formula relating y variables to x variables
method Quantile regression method to use. Currently only supports quantreg::rq().
method.args List of additional arguments passed on to the modelling function defined by method.
geom_quantile() understands the following aesthetics (required aesthetics are in bold):
• x
• y
• alpha
• colour
• group
• linetype
• size
• weight
Computed variables
quantile quantile of distribution
m <- ggplot(mpg, aes(displ, 1 / hwy)) + geom_point()
m + geom_quantile()
m + geom_quantile(quantiles = 0.5)
q10 <- seq(0.05, 0.95, by = 0.05)
m + geom_quantile(quantiles = q10)
geom_raster Rectangles
geom_rect and geom_tile do the same thing, but are parameterised differently: geom_rect uses
the locations of the four corners (xmin, xmax, ymin and ymax), while geom_tile uses the center
of the tile and its size (x, y, width, height). geom_raster is a high performance special case for
when all the tiles are the same size.
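To make the two parameterisations concrete, the sketch below describes the same rectangle both ways. The data frames corners and centre are hypothetical.

```r
library(ggplot2)

# The same 2 x 1 rectangle centred on (1, 1), described in two ways
corners <- data.frame(xmin = 0, xmax = 2, ymin = 0.5, ymax = 1.5)
centre <- data.frame(x = 1, y = 1, width = 2, height = 1)

# geom_rect: locations of the four corners
p_rect <- ggplot(corners) +
  geom_rect(aes(xmin = xmin, xmax = xmax, ymin = ymin, ymax = ymax))

# geom_tile: centre of the tile and its size
p_tile <- ggplot(centre, aes(x, y)) +
  geom_tile(aes(width = width, height = height))
```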
geom_raster(mapping = NULL, data = NULL, stat = "identity",
position = "identity", ..., hjust = 0.5, vjust = 0.5,
interpolate = FALSE, na.rm = FALSE, show.legend = NA,
inherit.aes = TRUE)
mapping Set of aesthetic mappings created by aes() or aes_(). If specified and inherit.aes = TRUE
(the default), it is combined with the default mapping at the top level of the plot.
You must supply mapping if there is no plot mapping.
data The data to be displayed in this layer. There are three options:
If NULL, the default, the data is inherited from the plot data as specified in the
call to ggplot().
A data.frame, or other object, will override the plot data. All objects will be
fortified to produce a data frame. See fortify() for which variables will be created.
A function will be called with a single argument, the plot data. The return
value must be a data.frame, and will be used as the layer data.
stat The statistical transformation to use on the data for this layer, as a string.
position Position adjustment, either as a string, or the result of a call to a position adjust-
ment function.
... Other arguments passed on to layer(). These are often aesthetics, used to set
an aesthetic to a fixed value, like colour = "red" or size = 3. They may also
be parameters to the paired geom/stat.
hjust, vjust horizontal and vertical justification of the grob. Each justification value should
be a number between 0 and 1. Defaults to 0.5 for both, centering each pixel over
its data location.
interpolate If TRUE interpolate linearly, if FALSE (the default) don’t interpolate.
na.rm If FALSE, the default, missing values are removed with a warning. If TRUE,
missing values are silently removed.
show.legend logical. Should this layer be included in the legends? NA, the default, includes if
any aesthetics are mapped. FALSE never includes, and TRUE always includes. It
can also be a named logical vector to finely select the aesthetics to display.
inherit.aes If FALSE, overrides the default aesthetics, rather than combining with them.
This is most useful for helper functions that define both data and aesthetics and
shouldn’t inherit behaviour from the default plot specification, e.g. borders().
geom_tile() understands the following aesthetics (required aesthetics are in bold):
• x
• y
• alpha
• colour
• fill
• group
• height
• linetype
• size
• width
Learn more about setting these aesthetics in vignette("ggplot2-specs").
# The most common use for rectangles is to draw a surface. You always want
# to use geom_raster here because it's so much faster, and produces
# smaller output when saving to PDF
ggplot(faithfuld, aes(waiting, eruptions)) +
geom_raster(aes(fill = density))
# Interpolation smooths the surface & is most helpful when rendering images.
ggplot(faithfuld, aes(waiting, eruptions)) +
geom_raster(aes(fill = density), interpolate = TRUE)
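The hjust and vjust arguments described above can be sketched as follows. This is a minimal illustration, assuming a small made-up grid (df and z are hypothetical names, not part of the manual's data); it only contrasts the default centring with zero justification.

```r
library(ggplot2)

# Small illustrative grid (hypothetical data, not from the manual)
df <- expand.grid(x = 0:5, y = 0:5)
df$z <- df$x * df$y

# Default (hjust = 0.5, vjust = 0.5): each cell is centred on its data location
p_centre <- ggplot(df, aes(x, y, fill = z)) +
  geom_raster()

# hjust = 0, vjust = 0: cells are shifted so they are no longer centred
p_shifted <- ggplot(df, aes(x, y, fill = z)) +
  geom_raster(hjust = 0, vjust = 0)
```

Printing p_centre and p_shifted side by side shows how the cells move relative to the axis breaks.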
For each x value, geom_ribbon displays a y interval defined by ymin and ymax. geom_area is a
special case of geom_ribbon, where the ymin is fixed to 0.
geom_ribbon(mapping = NULL, data = NULL, stat = "identity",
position = "identity", ..., na.rm = FALSE, show.legend = NA,
inherit.aes = TRUE)
mapping Set of aesthetic mappings created by aes() or aes_(). If specified and inherit.aes = TRUE
(the default), it is combined with the default mapping at the top level of the plot.
You must supply mapping if there is no plot mapping.
data The data to be displayed in this layer. There are three options:
If NULL, the default, the data is inherited from the plot data as specified in the
call to ggplot().
A data.frame, or other object, will override the plot data. All objects will be
fortified to produce a data frame. See fortify() for which variables will be created.
A function will be called with a single argument, the plot data. The return
value must be a data.frame, and will be used as the layer data.
stat The statistical transformation to use on the data for this layer, as a string.
position Position adjustment, either as a string, or the result of a call to a position adjust-
ment function.
... Other arguments passed on to layer(). These are often aesthetics, used to set
an aesthetic to a fixed value, like colour = "red" or size = 3. They may also
be parameters to the paired geom/stat.
na.rm If FALSE, the default, missing values are removed with a warning. If TRUE,
missing values are silently removed.
show.legend logical. Should this layer be included in the legends? NA, the default, includes if
any aesthetics are mapped. FALSE never includes, and TRUE always includes. It
can also be a named logical vector to finely select the aesthetics to display.
inherit.aes If FALSE, overrides the default aesthetics, rather than combining with them.
This is most useful for helper functions that define both data and aesthetics and
shouldn’t inherit behaviour from the default plot specification, e.g. borders().
An area plot is the continuous analogue of a stacked bar chart (see geom_bar()), and can be used to
show how the composition of the whole varies over the range of x. Choosing the order in which different
components are stacked is very important, as it becomes increasingly hard to see the individual pattern
as you move up the stack. See position_stack() for the details of the stacking algorithm.
geom_ribbon() understands the following aesthetics (x, ymin and ymax are required):
• x
• ymin
• ymax
• alpha
• colour
• fill
• group
• linetype
• size
See Also
geom_bar() for discrete intervals (bars), geom_linerange() for discrete intervals (lines), geom_polygon()
for general polygons
# Generate data
huron <- data.frame(year = 1875:1972, level = as.vector(LakeHuron))
h <- ggplot(huron, aes(year))
h + geom_ribbon(aes(ymin=0, ymax=level))
h + geom_area(aes(y = level))
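A ribbon does not have to start at zero. The sketch below, which reuses the LakeHuron data from the example above, draws a band of one unit either side of the observed level; the band width of 1 and the fill colour are arbitrary choices for illustration.

```r
library(ggplot2)

huron <- data.frame(year = 1875:1972, level = as.vector(LakeHuron))

# A band of +/- 1 unit around the observed level, with the level drawn on top
p <- ggplot(huron, aes(year)) +
  geom_ribbon(aes(ymin = level - 1, ymax = level + 1), fill = "grey70") +
  geom_line(aes(y = level))
```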
A rug plot is a compact visualisation designed to supplement a 2d display with the two 1d marginal
distributions. Rug plots display individual cases so are best used with smaller datasets.
geom_rug(mapping = NULL, data = NULL, stat = "identity",
position = "identity", ..., sides = "bl", na.rm = FALSE,
show.legend = NA, inherit.aes = TRUE)
mapping Set of aesthetic mappings created by aes() or aes_(). If specified and inherit.aes = TRUE
(the default), it is combined with the default mapping at the top level of the plot.
You must supply mapping if there is no plot mapping.
data The data to be displayed in this layer. There are three options:
If NULL, the default, the data is inherited from the plot data as specified in the
call to ggplot().
A data.frame, or other object, will override the plot data. All objects will be
fortified to produce a data frame. See fortify() for which variables will be created.
A function will be called with a single argument, the plot data. The return
value must be a data.frame, and will be used as the layer data.
stat The statistical transformation to use on the data for this layer, as a string.
position Position adjustment, either as a string, or the result of a call to a position adjust-
ment function.
... Other arguments passed on to layer(). These are often aesthetics, used to set
an aesthetic to a fixed value, like colour = "red" or size = 3. They may also
be parameters to the paired geom/stat.
sides A string that controls which sides of the plot the rugs appear on. It can be set to
a string containing any of "trbl", for top, right, bottom, and left.
na.rm If FALSE, the default, missing values are removed with a warning. If TRUE,
missing values are silently removed.
show.legend logical. Should this layer be included in the legends? NA, the default, includes if
any aesthetics are mapped. FALSE never includes, and TRUE always includes. It
can also be a named logical vector to finely select the aesthetics to display.
inherit.aes If FALSE, overrides the default aesthetics, rather than combining with them.
This is most useful for helper functions that define both data and aesthetics and
shouldn’t inherit behaviour from the default plot specification, e.g. borders().
The rug lines are drawn with a fixed size (3% of the total plot size), so they depend on the overall
scale expansion in order not to overplot existing data.
• alpha
• colour
• group
• linetype
• size
• x
• y
p <- ggplot(mtcars, aes(wt, mpg)) +
  geom_point()
p + geom_rug()
p + geom_rug(sides = "b") # Rug on bottom only
p + geom_rug(sides = "trbl") # All four sides
geom_segment draws a straight line between points (x, y) and (xend, yend). geom_curve draws
a curved line. See the underlying drawing function grid::curveGrob() for the parameters that
control the curve.
geom_segment(mapping = NULL, data = NULL, stat = "identity",
position = "identity", ..., arrow = NULL, arrow.fill = NULL,
lineend = "butt", linejoin = "round", na.rm = FALSE,
show.legend = NA, inherit.aes = TRUE)
mapping Set of aesthetic mappings created by aes() or aes_(). If specified and inherit.aes = TRUE
(the default), it is combined with the default mapping at the top level of the plot.
You must supply mapping if there is no plot mapping.
data The data to be displayed in this layer. There are three options:
If NULL, the default, the data is inherited from the plot data as specified in the
call to ggplot().
A data.frame, or other object, will override the plot data. All objects will be
fortified to produce a data frame. See fortify() for which variables will be created.
A function will be called with a single argument, the plot data. The return
value must be a data.frame, and will be used as the layer data.
stat The statistical transformation to use on the data for this layer, as a string.
position Position adjustment, either as a string, or the result of a call to a position adjust-
ment function.
... Other arguments passed on to layer(). These are often aesthetics, used to set
an aesthetic to a fixed value, like colour = "red" or size = 3. They may also
be parameters to the paired geom/stat.
arrow specification for arrow heads, as created by arrow().
arrow.fill fill colour to use for the arrow head (if closed). NULL means use colour aesthetic.
lineend Line end style (round, butt, square).
Both geoms draw a single segment/curve per case. See geom_path if you need to connect points
across multiple cases.
geom_segment() understands the following aesthetics (x, y, xend and yend are required):
• x
• y
• xend
• yend
• alpha
• colour
• group
• linetype
• size
See Also
geom_path() and geom_line() for multi-segment lines and paths.
geom_spoke() for a segment parameterised by a location (x, y), and an angle and radius.
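# Example coordinates and base plot used by the curves below
df <- data.frame(x1 = 2.62, x2 = 3.57, y1 = 21.0, y2 = 15.0)
b <- ggplot(mtcars, aes(wt, mpg)) +
  geom_point()
b + geom_curve(aes(x = x1, y = y1, xend = x2, yend = y2), data = df, curvature = -0.2)
b + geom_curve(aes(x = x1, y = y1, xend = x2, yend = y2), data = df, curvature = 1)
b + geom_curve(
  aes(x = x1, y = y1, xend = x2, yend = y2),
  data = df,
  arrow = arrow(length = unit(0.03, "npc"))
)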
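The arrow, arrow.fill and lineend arguments of geom_segment() can be combined as in the sketch below. The end points (the hypothetical data frame ends) and the colour are assumptions chosen only so that the segment falls inside the mtcars scatter.

```r
library(ggplot2)

# Hypothetical start and end points, in the units of mtcars' wt and mpg
ends <- data.frame(x1 = 2.62, y1 = 21.0, x2 = 3.57, y2 = 15.0)

p <- ggplot(mtcars, aes(wt, mpg)) +
  geom_point() +
  geom_segment(
    aes(x = x1, y = y1, xend = x2, yend = y2),
    data = ends,
    colour = "steelblue",
    # A closed arrow head, so that arrow.fill has something to fill
    arrow = arrow(length = unit(0.03, "npc"), type = "closed"),
    arrow.fill = "steelblue",
    lineend = "round"
  )
```

One row of ends produces exactly one segment, in line with the note that both geoms draw a single segment or curve per case.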
Aids the eye in seeing patterns in the presence of overplotting. geom_smooth() and stat_smooth()
are effectively aliases: they both use the same arguments. Use stat_smooth() if you want to
display the results with a non-standard geom.
geom_smooth(mapping = NULL, data = NULL, stat = "smooth",
position = "identity", ..., method = "auto", formula = y ~ x,
se = TRUE, na.rm = FALSE, show.legend = NA, inherit.aes = TRUE)
mapping Set of aesthetic mappings created by aes() or aes_(). If specified and inherit.aes = TRUE
(the default), it is combined with the default mapping at the top level of the plot.
You must supply mapping if there is no plot mapping.
data The data to be displayed in this layer. There are three options:
If NULL, the default, the data is inherited from the plot data as specified in the
call to ggplot().
A data.frame, or other object, will override the plot data. All objects will be
fortified to produce a data frame. See fortify() for which variables will be created.
A function will be called with a single argument, the plot data. The return
value must be a data.frame, and will be used as the layer data.
position Position adjustment, either as a string, or the result of a call to a position adjust-
ment function.
... Other arguments passed on to layer(). These are often aesthetics, used to set
an aesthetic to a fixed value, like colour = "red" or size = 3. They may also
be parameters to the paired geom/stat.
method Smoothing method (function) to use, accepts either a character vector, e.g. "auto",
"lm", "glm", "gam", "loess" or a function, e.g. MASS::rlm, mgcv::gam,
stats::lm, or stats::loess.
For method = "auto" the smoothing method is chosen based on the size of the
largest group (across all panels). loess() is used for fewer than 1,000 observations;
otherwise mgcv::gam() is used with formula = y ~ s(x, bs = "cs").
Somewhat anecdotally, loess gives a better appearance, but is O(N^2) in memory,
so does not work for larger datasets.
If you have fewer than 1,000 observations but want to use the same gam() model
that method = "auto" would use, then set method = "gam", formula = y ~ s(x, bs = "cs").
formula Formula to use in smoothing function, e.g. y ~ x, y ~ poly(x, 2), y ~ log(x)
Calculation is performed by the (currently undocumented) predictdf() generic and its methods.
For most methods the standard error bounds are computed using the predict() method – the ex-
ceptions are loess(), which uses a t-based approximation, and glm(), where the normal confidence
interval is constructed on the link scale and then back-transformed to the response scale.
geom_smooth() understands the following aesthetics (x and y are required):
• x
• y
• alpha
• colour
• fill
• group
• linetype
• size
• weight
• ymax
• ymin
Computed variables
y predicted value
ymin lower pointwise confidence interval around the mean
ymax upper pointwise confidence interval around the mean
se standard error
See Also
See individual modelling functions for more details: lm() for linear smooths, glm() for generalised
linear smooths, and loess() for local smooths.
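ggplot(mpg, aes(displ, hwy)) +
  geom_point() +
  geom_smooth()
# Instead of a loess smooth, you can use any other modelling function:
ggplot(mpg, aes(displ, hwy)) +
  geom_point() +
  geom_smooth(method = lm, se = FALSE)
# Smooths can also be fitted to binary data, e.g. with a logistic regression
# (the response must be numeric and lie between 0 and 1)
ggplot(rpart::kyphosis, aes(Age, as.numeric(Kyphosis) - 1)) +
  geom_jitter(height = 0.05) +
  geom_smooth(method = "glm", method.args = list(family = "binomial"))
# But in this case, it's probably better to fit the model yourself
# so you can exercise more control and see whether or not it's a good model.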
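The computed variables listed above (y, ymin, ymax, se) can be inspected directly. The sketch below assumes a quadratic linear model purely for illustration and uses layer_data() to pull out what the smooth layer computed.

```r
library(ggplot2)

p <- ggplot(mpg, aes(displ, hwy)) +
  geom_point() +
  geom_smooth(method = "lm", formula = y ~ poly(x, 2), se = TRUE)

# layer_data() returns the data computed for one layer; layer 2 is the smooth
smooth_df <- layer_data(p, 2)
head(smooth_df[, c("x", "y", "ymin", "ymax", "se")])
```

Each row is one point along the fitted curve, with ymin and ymax bounding the pointwise confidence interval around y.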
This is a polar parameterisation of geom_segment(). It is useful when you have variables that
describe direction and distance.
geom_spoke(mapping = NULL, data = NULL, stat = "identity",
position = "identity", ..., na.rm = FALSE, show.legend = NA,
inherit.aes = TRUE)
mapping Set of aesthetic mappings created by aes() or aes_(). If specified and inherit.aes = TRUE
(the default), it is combined with the default mapping at the top level of the plot.
You must supply mapping if there is no plot mapping.
data The data to be displayed in this layer. There are three options:
If NULL, the default, the data is inherited from the plot data as specified in the
call to ggplot().
A data.frame, or other object, will override the plot data. All objects will be
fortified to produce a data frame. See fortify() for which variables will be created.
A function will be called with a single argument, the plot data. The return
value must be a data.frame, and will be used as the layer data.
stat The statistical transformation to use on the data for this layer, as a string.
position Position adjustment, either as a string, or the result of a call to a position adjust-
ment function.
... Other arguments passed on to layer(). These are often aesthetics, used to set
an aesthetic to a fixed value, like colour = "red" or size = 3. They may also
be parameters to the paired geom/stat.
na.rm If FALSE, the default, missing values are removed with a warning. If TRUE,
missing values are silently removed.
show.legend logical. Should this layer be included in the legends? NA, the default, includes if
any aesthetics are mapped. FALSE never includes, and TRUE always includes. It
can also be a named logical vector to finely select the aesthetics to display.
inherit.aes If FALSE, overrides the default aesthetics, rather than combining with them.
This is most useful for helper functions that define both data and aesthetics and
shouldn’t inherit behaviour from the default plot specification, e.g. borders().
• x
• y
• angle
• radius
• alpha
• colour
• group
• linetype
• size
df <- expand.grid(x = 1:10, y=1:10)
df$angle <- runif(100, 0, 2*pi)
df$speed <- runif(100, 0, sqrt(0.1 * df$x))
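A sketch of how data like this might be drawn with geom_spoke() follows. The seed and the fixed radius of 0.5 are arbitrary assumptions; the data frame is rebuilt here so the block runs on its own.

```r
library(ggplot2)

set.seed(1)
df <- expand.grid(x = 1:10, y = 1:10)
df$angle <- runif(100, 0, 2 * pi)
df$speed <- runif(100, 0, sqrt(0.1 * df$x))

# Constant spoke length: radius is set, not mapped
p_fixed <- ggplot(df, aes(x, y)) +
  geom_point() +
  geom_spoke(aes(angle = angle), radius = 0.5)

# Spoke length mapped to the speed column
p_speed <- ggplot(df, aes(x, y)) +
  geom_point() +
  geom_spoke(aes(angle = angle, radius = speed))
```

angle is interpreted in radians, which is why it is drawn from the interval 0 to 2 * pi.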
A violin plot is a compact display of a continuous distribution. It is a blend of geom_boxplot() and
geom_density(): a violin plot is a mirrored density plot displayed in the same way as a boxplot.
geom_violin(mapping = NULL, data = NULL, stat = "ydensity",
position = "dodge", ..., draw_quantiles = NULL, trim = TRUE,
scale = "area", na.rm = FALSE, show.legend = NA,
inherit.aes = TRUE)
mapping Set of aesthetic mappings created by aes() or aes_(). If specified and inherit.aes = TRUE
(the default), it is combined with the default mapping at the top level of the plot.
You must supply mapping if there is no plot mapping.
data The data to be displayed in this layer. There are three options:
If NULL, the default, the data is inherited from the plot data as specified in the
call to ggplot().
A data.frame, or other object, will override the plot data. All objects will be
fortified to produce a data frame. See fortify() for which variables will be created.
A function will be called with a single argument, the plot data. The return
value must be a data.frame, and will be used as the layer data.
position Position adjustment, either as a string, or the result of a call to a position adjust-
ment function.
... Other arguments passed on to layer(). These are often aesthetics, used to set
an aesthetic to a fixed value, like colour = "red" or size = 3. They may also
be parameters to the paired geom/stat.
draw_quantiles If not NULL (the default is NULL), draw horizontal lines at the given quantiles of the density estimate.
trim If TRUE (default), trim the tails of the violins to the range of the data. If FALSE,
don’t trim the tails.
scale if "area" (default), all violins have the same area (before trimming the tails).
If "count", areas are scaled proportionally to the number of observations. If
"width", all violins have the same maximum width.
na.rm If FALSE, the default, missing values are removed with a warning. If TRUE,
missing values are silently removed.
show.legend logical. Should this layer be included in the legends? NA, the default, includes if
any aesthetics are mapped. FALSE never includes, and TRUE always includes. It
can also be a named logical vector to finely select the aesthetics to display.
inherit.aes If FALSE, overrides the default aesthetics, rather than combining with them.
This is most useful for helper functions that define both data and aesthetics and
shouldn’t inherit behaviour from the default plot specification, e.g. borders().
geom, stat Use to override the default connection between geom_violin and stat_ydensity.
bw The smoothing bandwidth to be used. If numeric, the standard deviation of
the smoothing kernel. If character, a rule to choose the bandwidth, as listed in stats::bw.nrd().
adjust A multiplicative bandwidth adjustment. This makes it possible to adjust the bandwidth
while still using a bandwidth estimator. For example, adjust = 1/2
means use half of the default bandwidth.
kernel Kernel. See list of available kernels in density().
geom_violin() understands the following aesthetics (x and y are required):
• x
• y
• alpha
• colour
• fill
• group
• linetype
• size
• weight
Computed variables
density density estimate
scaled density estimate, scaled to maximum of 1
count density * number of points - probably useless for violin plots
violinwidth density scaled for the violin plot, according to area, counts or to a constant maximum
n number of points
width width of violin bounding box
Hintze, J. L., Nelson, R. D. (1998) Violin Plots: A Box Plot-Density Trace Synergism. The Ameri-
can Statistician 52, 181-184.
See Also
geom_violin() for examples, and stat_density() for examples with data along the x axis.
p <- ggplot(mtcars, aes(factor(cyl), mpg))
p + geom_violin()
# Show quartiles
p + geom_violin(draw_quantiles = c(0.25, 0.5, 0.75))
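The scale and trim arguments described above can be compared with a sketch like the following. It builds one plot per setting and then inspects the computed violinwidth variable for scale = "width"; the object names are illustrative only.

```r
library(ggplot2)

p <- ggplot(mtcars, aes(factor(cyl), mpg))

p_area  <- p + geom_violin(scale = "area")   # equal areas (default)
p_count <- p + geom_violin(scale = "count")  # area proportional to n
p_width <- p + geom_violin(scale = "width")  # equal maximum widths
p_tails <- p + geom_violin(trim = FALSE)     # keep the density tails

# Maximum of the computed violinwidth variable within each violin
d <- layer_data(p_width)
max_width <- tapply(d$violinwidth, d$group, max)
max_width
```

With scale = "width" the maxima should agree across groups, which is what "the same maximum width" means in the argument description.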
ggplot() initializes a ggplot object. It can be used to declare the input data frame for a graphic and
to specify the set of plot aesthetics intended to be common throughout all subsequent layers unless
specifically overridden.
ggplot(data = NULL, mapping = aes(), ...,
environment = parent.frame())
data Default dataset to use for plot. If not already a data.frame, will be converted to
one by fortify(). If not specified, must be supplied in each layer added to the plot.
mapping Default list of aesthetic mappings to use for plot. If not specified, must be sup-
plied in each layer added to the plot.
... Other arguments passed on to methods. Not currently used.
environment DEPRECATED. Used prior to tidy evaluation.
ggplot() is used to construct the initial plot object, and is almost always followed by + to add
components to the plot. There are three common ways to invoke ggplot():
• ggplot(df, aes(x, y, other aesthetics))
• ggplot(df)
• ggplot()
The first method is recommended if all layers use the same data and the same set of aesthetics,
although this method can also be used to add a layer using data from another data frame. See the
first example below. The second method specifies the default data frame to use for the plot, but
no aesthetics are defined up front. This is useful when one data frame is used predominantly as
layers are added, but the aesthetics may vary from one layer to another. The third method initializes
a skeleton ggplot object which is fleshed out as layers are added. This method is useful when
multiple data frames are used to produce different layers, as is often the case in complex graphics.
# Generate some sample data, then compute mean and standard deviation
# in each group
df <- data.frame(
  gp = factor(rep(letters[1:3], each = 10)),
  y = rnorm(30)
)
ds <- plyr::ddply(df, "gp", plyr::summarise, mean = mean(y), sd = sd(y))
# The summary data frame ds is used to plot larger red points on top
# of the raw data. Note that we don't need to supply `data` or `mapping`
# in each layer because the defaults from ggplot() are used.
ggplot(df, aes(gp, y)) +
geom_point() +
geom_point(data = ds, aes(y = mean), colour = 'red', size = 3)
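The three ways of invoking ggplot() can be put side by side as in the sketch below. It regenerates similar sample data with a fixed seed and uses base R's aggregate() for the group means (an assumption made here to avoid the plyr dependency); all three plots are meant to draw the same picture.

```r
library(ggplot2)

set.seed(42)
df <- data.frame(
  gp = factor(rep(letters[1:3], each = 10)),
  y = rnorm(30)
)
# Group means computed with base R; the result has columns gp and y
ds <- aggregate(y ~ gp, data = df, FUN = mean)

# 1. Data and aesthetics declared up front; the second layer swaps the data
p1 <- ggplot(df, aes(gp, y)) +
  geom_point() +
  geom_point(data = ds, colour = "red", size = 3)

# 2. Default data only; each layer supplies its own aesthetics
p2 <- ggplot(df) +
  geom_point(aes(gp, y)) +
  geom_point(aes(gp, y), data = ds, colour = "red", size = 3)

# 3. Empty skeleton; each layer supplies both data and aesthetics
p3 <- ggplot() +
  geom_point(aes(gp, y), data = df) +
  geom_point(aes(gp, y), data = ds, colour = "red", size = 3)
```

The first form is the shortest when layers share data and mappings; the third is the most explicit when they do not.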
Construct a new object with ggproto, test with is.ggproto, and access parent methods/fields with ggproto_parent.
ggproto(`_class` = NULL, `_inherit` = NULL, ...)
ggproto_parent(parent, self)
_class Class name to assign to the object. This is stored as the class attribute of the
object. This is optional: if NULL (the default), no class name will be added to the object.
_inherit ggproto object to inherit from. If NULL, don’t inherit from any object.
... A list of members in the ggproto object.
parent, self Access parent class parent of object self.
x An object to test.
ggproto implements a prototype-based OO system which blurs the lines between classes and instances.
It is inspired by the proto package, but it has some important differences. Notably, it cleanly sup-
ports cross-package inheritance, and has faster performance.
In most cases, creating a new OO system to be used by a single package is not a good idea. How-
ever, it was the least-bad solution for ggplot2 because it required the fewest changes to an already
complex code base.
Calling methods
ggproto methods can take an optional self argument: if it is present, it is a regular method; if it’s
absent, it’s a "static" method (i.e. it doesn’t use any fields).
Imagine you have a ggproto object Adder, which has a method addx = function(self, n) n + self$x.
Then, to call this function, you would use Adder$addx(10) – the self is passed in automatically
by the wrapper function. self can be located anywhere in the function signature, although customarily
it comes first.
Adder <- ggproto("Adder",
  x = 0,
  add = function(self, n) {
    # Update the field in place, then return the new total
    self$x <- self$x + n
    self$x
  }
)
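The difference between regular and "static" methods, and the use of ggproto_parent(), can be sketched as follows. The objects here are hypothetical: this Adder has x = 5 and the addx method from the text, and LoudAdder is an invented child used only to show a call to the parent's method.

```r
library(ggplot2)

# Parent object with one field and two methods
Adder <- ggproto("Adder",
  x = 5,
  # Regular method: `self` gives access to the object's fields
  addx = function(self, n) n + self$x,
  # "Static" method: no `self`, so it cannot use any fields
  double = function(n) n * 2
)

# Child object that overrides addx() but reuses the parent's version
LoudAdder <- ggproto("LoudAdder", Adder,
  addx = function(self, n) {
    message("adding ", n)
    ggproto_parent(Adder, self)$addx(n)
  }
)

Adder$addx(10)      # self is supplied automatically
Adder$double(4)     # no self involved
LoudAdder$addx(10)  # runs the parent's addx() with the child as self
is.ggproto(LoudAdder)
```

Because LoudAdder inherits the field x from Adder, the parent method sees the same value of self$x when called through ggproto_parent().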
ggsave Save a ggplot (or other grid object) with sensible defaults
ggsave() is a convenient function for saving a plot. It defaults to saving the last plot that you
displayed, using the size of the current graphics device. It also guesses the type of graphics device
from the extension.
## Not run:
ggplot(mtcars, aes(mpg, wt)) + geom_point()
## End(Not run)
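A minimal sketch of saving a plot follows. It passes the plot explicitly instead of relying on the last plot displayed, and writes to a temporary directory; the file name, size and dpi are arbitrary assumptions.

```r
library(ggplot2)

p <- ggplot(mtcars, aes(mpg, wt)) + geom_point()

# Write to a temporary directory so nothing is left in the working directory;
# the ".png" extension is what selects the PNG device
out <- file.path(tempdir(), "mtcars.png")
ggsave(out, plot = p, width = 4, height = 4, units = "in", dpi = 72)

file.exists(out)
```

Giving width and height explicitly makes the output independent of the size of whatever graphics device happens to be open.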
This set of geom, stat, and coord are used to visualise simple feature (sf) objects. For simple plots,
you will only need geom_sf() as it uses stat_sf() and adds coord_sf() for you. geom_sf() is an
unusual geom because it will draw different geometric objects depending on what simple features
are present in the data: you can get points, lines, or polygons. For text and labels, you can use
geom_sf_text() and geom_sf_label().
stat_sf(mapping = NULL, data = NULL, geom = "rect",
position = "identity", na.rm = FALSE, show.legend = NA,
inherit.aes = TRUE, ...)
mapping Set of aesthetic mappings created by aes() or aes_(). If specified and inherit.aes = TRUE
(the default), it is combined with the default mapping at the top level of the plot.
You must supply mapping if there is no plot mapping.
data The data to be displayed in this layer. There are three options:
If NULL, the default, the data is inherited from the plot data as specified in the
call to ggplot().
A data.frame, or other object, will override the plot data. All objects will be
fortified to produce a data frame. See fortify() for which variables will be created.
A function will be called with a single argument, the plot data. The return
value must be a data.frame, and will be used as the layer data.
geom The geometric object to use to display the data
position Position adjustment, either as a string, or the result of a call to a position adjust-
ment function.
na.rm If FALSE, the default, missing values are removed with a warning. If TRUE,
missing values are silently removed.
show.legend logical. Should this layer be included in the legends? NA, the default, includes if
any aesthetics are mapped. FALSE never includes, and TRUE always includes.
You can also set this to one of "polygon", "line", and "point" to override the
default legend.
inherit.aes If FALSE, overrides the default aesthetics, rather than combining with them.
This is most useful for helper functions that define both data and aesthetics and
shouldn’t inherit behaviour from the default plot specification, e.g. borders().
... Other arguments passed on to layer(). These are often aesthetics, used to set
an aesthetic to a fixed value, like colour = "red" or size = 3. They may also
be parameters to the paired geom/stat.
stat The statistical transformation to use on the data for this layer, as a string.
parse If TRUE, the labels will be parsed into expressions and displayed as described in ?plotmath.
nudge_x Horizontal and vertical adjustment to nudge labels by. Useful for offsetting text
from points, particularly on discrete scales.
nudge_y Horizontal and vertical adjustment to nudge labels by. Useful for offsetting text
from points, particularly on discrete scales.
label.padding Amount of padding around label. Defaults to 0.25 lines.
label.r Radius of rounded corners. Defaults to 0.15 lines.
label.size Size of label border, in mm.
fun.geometry A function that takes a sfc object and returns a sfc_POINT with the same length
as the input. If NULL, function(x) sf::st_point_on_surface(sf::st_zm(x))
will be used. Note that the function may warn about the incorrectness of the re-
sult if the data is not projected, but you can ignore this except when you really
care about the exact locations.
check_overlap If TRUE, text that overlaps previous text in the same layer will not be plotted.
xlim Limits for the x and y axes.
ylim Limits for the x and y axes.
expand If TRUE, the default, adds a small expansion factor to the limits to ensure that
data and axes don’t overlap. If FALSE, limits are taken exactly from the data or xlim/ylim.
crs Use this to select a specific coordinate reference system (CRS). If not specified,
will use the CRS defined in the first layer.
datum CRS that provides datum to use when generating graticules
label_graticule Character vector indicating which graticule lines should be labeled where.
Meridians run north-south, and the letters "N" and "S" indicate that they should be
labeled on their north or south end points, respectively. Parallels run east-west,
and the letters "E" and "W" indicate that they should be labeled on their east
or west end points, respectively. Thus, label_graticule = "SW" would label
meridians at their south end and parallels at their west end, whereas label_graticule = "EW"
would label parallels at both ends and meridians not at all. Because meridians
and parallels can in general intersect with any side of the plot panel, for any
choice of label_graticule labels are not guaranteed to reside on only one
particular side of the plot panel.
This parameter can be used alone or in combination with label_axes.
label_axes Character vector or named list of character values specifying which graticule
lines (meridians or parallels) should be labeled on which side of the plot. Merid-
ians are indicated by "E" (for East) and parallels by "N" (for North). Default is
"--EN", which specifies (clockwise from the top) no labels on the top, none on
the right, meridians on the bottom, and parallels on the left. Alternatively, this
setting could have been specified with list(bottom = "E", left = "N").
This parameter can be used alone or in combination with label_graticule.
ndiscr number of segments to use for discretising graticule lines; try increasing this
when graticules look unexpected
default Is this the default coordinate system? If FALSE (the default), then replacing this
coordinate system with another one creates a message alerting the user that the
coordinate system is being replaced. If TRUE, that warning is suppressed.
clip Should drawing be clipped to the extent of the plot panel? A setting of "on" (the
default) means yes, and a setting of "off" means no. In most cases, the default
of "on" should not be changed, as setting clip = "off" can cause unexpected
results. It allows drawing of data points anywhere on the plot, including in
the plot margins. If limits are set via xlim and ylim and some data points fall
outside those limits, then those data points may show up in places such as the
axes, the legend, the plot title, or the plot margins.
Geometry aesthetic
geom_sf() uses a unique aesthetic: geometry, giving a column of class sfc containing simple
features data. There are three ways to supply the geometry aesthetic:
Unlike other aesthetics, geometry will never be inherited from the plot.
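As far as the ggplot2 documentation for geom_sf() describes them, the three ways are: relying on a column that is called geometry, passing an sf object as data so that its primary geometry column is used, and naming the column explicitly with aes(geometry = ...). The sketch below illustrates each, assuming the sf package is installed and using the nc shapefile that ships with it.

```r
library(ggplot2)

if (requireNamespace("sf", quietly = TRUE)) {
  nc <- sf::st_read(system.file("shape/nc.shp", package = "sf"), quiet = TRUE)

  # 1. Do nothing: nc's geometry column is called "geometry" and is picked up
  p_auto <- ggplot(nc) + geom_sf()

  # 2. Pass an sf object as layer data: its primary geometry column is used
  p_data <- ggplot() + geom_sf(data = nc)

  # 3. Name the geometry column explicitly in the layer's mapping
  p_aes <- ggplot(nc) + geom_sf(aes(geometry = geometry))
}
```

The explicit form is placed in the layer, not in ggplot(), because geometry is not inherited from the plot.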
coord_sf() ensures that all layers use a common CRS. You can either specify it using the crs
param, or let coord_sf() take it from the first layer that defines a CRS.
See Also
if (requireNamespace("sf", quietly = TRUE)) {
nc <- sf::st_read(system.file("shape/nc.shp", package = "sf"), quiet = TRUE)
ggplot(nc) +
geom_sf(aes(fill = AREA))
# If not supplied, coord_sf() will take the CRS from the first layer
# and automatically transform all other layers to use that CRS. This
# ensures that all data will correctly line up
nc_3857 <- sf::st_transform(nc, "+init=epsg:3857")
ggplot() +
geom_sf(data = nc) +
geom_sf(data = nc_3857, colour = "red", fill = NA)
# You can also use layers with x and y aesthetics: these are
# assumed to already be in the common CRS.
ggplot(nc) +
geom_sf() +
annotate("point", x = -80, y = 35, colour = "red", size = 4)
}
These are complete themes which control all non-data display. Use theme() if you just need to
tweak the display of an existing theme.
theme_grey(base_size = 11, base_family = "",
base_line_size = base_size/22, base_rect_size = base_size/22)
base_size base font size
base_family base font family
base_line_size base size for line elements
base_rect_size base size for rect elements
theme_gray The signature ggplot2 theme with a grey background and white gridlines, designed to
put the data forward yet make comparisons easy.
theme_bw The classic dark-on-light ggplot2 theme. May work better for presentations displayed
with a projector.
theme_linedraw A theme with only black lines of various widths on white backgrounds, reminiscent
of a line drawing. Serves a purpose similar to theme_bw. Note that this theme has some
very thin lines (well under 1 pt) which some journals may refuse.
theme_light A theme similar to theme_linedraw but with light grey lines and axes, to direct
more attention towards the data.
theme_dark The dark cousin of theme_light, with similar line sizes but a dark background. Use-
ful to make thin coloured lines pop out.
theme_minimal A minimalistic theme with no background annotations.
theme_classic A classic-looking theme, with x and y axis lines and no gridlines.
theme_void A completely empty theme.
theme_test A theme for visual unit tests. It should ideally never change except for new features.
mtcars2 <- within(mtcars, {
  vs <- factor(vs, labels = c("V-shaped", "Straight"))
  am <- factor(am, labels = c("Automatic", "Manual"))
  cyl <- factor(cyl)
  gear <- factor(gear)
})
p1 <- ggplot(mtcars2) +
geom_point(aes(x = wt, y = mpg, colour = gear)) +
labs(title = "Fuel economy declines as weight increases",
subtitle = "(1973-74)",
caption = "Data from the 1974 Motor Trend US magazine.",
tag = "Figure 1",
x = "Weight (1000 lbs)",
y = "Fuel economy (mpg)",
colour = "Gears")
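Applying the complete themes might look like the sketch below. It rebuilds a small plot so that it runs on its own; the base_size and base_family values are arbitrary, and the final line shows theme() used on top of a complete theme for a single tweak.

```r
library(ggplot2)

p <- ggplot(mtcars, aes(wt, mpg, colour = factor(gear))) +
  geom_point() +
  labs(colour = "Gears")

# Swap in complete themes, optionally changing the base font settings
p_bw      <- p + theme_bw()
p_minimal <- p + theme_minimal(base_size = 14)
p_classic <- p + theme_classic(base_family = "serif")

# theme() layered on top of a complete theme tweaks individual elements
p_tweaked <- p + theme_light() + theme(legend.position = "bottom")
```

The order matters in the last line: a complete theme added after theme() would replace the tweak.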
Guides for each scale can be set scale-by-scale with the guide argument, or en masse with guides().
... List of scale name-guide pairs. The guide can either be a string (e.g. "colorbar"
or "legend"), or a call to a guide function (e.g. guide_colourbar() or
guide_legend()) specifying additional arguments.
A list containing the mapping between scale and guide.
See Also
Other guides: guide_colourbar, guide_legend
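# ggplot object
dat <- data.frame(x = 1:5, y = 1:5, p = 1:5, q = factor(1:5), r = factor(1:5))
p <- ggplot(dat, aes(x, y, colour = p, size = q, shape = r)) + geom_point()
# position of guides
p + theme(legend.position = "bottom")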
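Setting guides en masse with guides(), or one at a time through a scale's guide argument, can be sketched as follows. The data frame and the titles are made up for illustration.

```r
library(ggplot2)

dat <- data.frame(x = 1:5, y = 1:5, p = 1:5, q = factor(1:5))
p <- ggplot(dat, aes(x, y, colour = p, shape = q)) + geom_point()

# Guides named as strings
p_str <- p + guides(colour = "colourbar", shape = "legend")

# Guides given as function calls, which allows extra arguments
p_fun <- p + guides(
  colour = guide_colourbar(title = "p value"),
  shape = guide_legend(title = "q level", ncol = 2)
)

# The same colourbar can instead be set on the scale itself
p_scale <- p +
  scale_colour_continuous(guide = guide_colourbar(title = "p value"))
```

guides() is convenient when several scales need adjusting at once; the guide argument keeps the setting next to the scale it belongs to.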
Colour bar guide shows continuous colour scales mapped onto values. Colour bar is available
with scale_fill and scale_colour. For more information, see the inspiration for this function:
Matlab’s colorbar function.
guide_colourbar(title = waiver(), title.position = NULL,
title.theme = NULL, title.hjust = NULL, title.vjust = NULL,
label = TRUE, label.position = NULL, label.theme = NULL,
label.hjust = NULL, label.vjust = NULL, barwidth = NULL,
barheight = NULL, nbin = 20, raster = TRUE, frame.colour = NULL,
frame.linewidth = 0.5, frame.linetype = 1, ticks = TRUE,
ticks.colour = "white", ticks.linewidth = 0.5, draw.ulim = TRUE,
draw.llim = TRUE, direction = NULL, default.unit = "line",
reverse = FALSE, order = 0, available_aes = c("colour", "color",
"fill"), ...)
title A character string or expression indicating a title of guide. If NULL, the title is
not shown. By default (waiver()), the name of the scale object or the name
specified in labs() is used for the title.
title.position A character string indicating the position of a title. One of "top" (default for a
vertical guide), "bottom", "left" (default for a horizontal guide), or "right".
title.theme A theme object for rendering the title text. Usually the object of element_text()
is expected. By default, the theme is specified by legend.title in theme().
title.hjust A number specifying horizontal justification of the title text.
title.vjust A number specifying vertical justification of the title text.
label logical. If TRUE then the labels are drawn. If FALSE then the labels are invisible.
label.position A character string indicating the position of a label. One of "top", "bottom"
(default for horizontal guide), "left", or "right" (default for vertical guide).
label.theme A theme object for rendering the label text. Usually the object of element_text()
is expected. By default, the theme is specified by legend.text in theme().
label.hjust A numeric specifying horizontal justification of the label text.
label.vjust A numeric specifying vertical justification of the label text.
barwidth A numeric or a grid::unit() object specifying the width of the colourbar.
Default value is legend.key.width or legend.key.size in theme() or theme.
barheight A numeric or a grid::unit() object specifying the height of the colourbar. De-
fault value is legend.key.height or legend.key.size in theme() or theme.
nbin A numeric specifying the number of bins for drawing the colourbar. A smoother
colourbar results from a larger value.
raster A logical. If TRUE then the colourbar is rendered as a raster object. If FALSE
then the colourbar is rendered as a set of rectangles. Note that not all graphics
devices are capable of rendering raster image.
frame.colour A string specifying the colour of the frame drawn around the bar. If NULL (the
default), no frame is drawn.
frame.linewidth A numeric specifying the width of the frame drawn around the bar.
frame.linetype A numeric specifying the linetype of the frame drawn around the bar.
ticks A logical specifying if tick marks on the colourbar should be visible.
ticks.colour A string specifying the colour of the tick marks.
ticks.linewidth A numeric specifying the width of the tick marks.
draw.ulim A logical specifying if the upper limit tick marks should be visible.
draw.llim A logical specifying if the lower limit tick marks should be visible.
direction A character string indicating the direction of the guide. One of "horizontal" or
"vertical".
default.unit A character string indicating grid::unit() for barwidth and barheight.
reverse logical. If TRUE the colourbar is reversed. By default, the highest value is on the
top and the lowest value is on the bottom.
order positive integer less than 99 that specifies the order of this guide among multiple
guides. This controls the order in which multiple guides are displayed, not the
contents of the guide itself. If 0 (default), the order is determined by a secret
algorithm.
available_aes A vector of character strings listing the aesthetics for which a colourbar can be
drawn.
... ignored.
Guides can be specified in each scale_* or in guides(). guide="legend" in scale_* is syntactic
sugar for guide=guide_legend() (e.g. scale_colour_manual(guide = "legend")). As for
how to specify the guide for each scale in more detail, see guides().
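The following sketch shows one guide_colourbar() object attached in both ways described above. The data, the title text, and the names bar, p_scale and p_guides are assumptions made for illustration.

```r
library(ggplot2)

p <- ggplot(mtcars, aes(wt, mpg, colour = hp)) +
  geom_point()

# A horizontal colour bar with more bins, a custom title, and reversed order.
bar <- guide_colourbar(
  title = "Horsepower",
  nbin = 50,
  direction = "horizontal",
  reverse = TRUE
)

# Two ways to attach the same guide.
p_scale  <- p + scale_colour_continuous(guide = bar)
p_guides <- p + guides(colour = bar)
```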
A guide object
See Also
Other guides: guide_legend, guides
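df <- reshape2::melt(outer(1:4, 1:4), varnames = c("X1", "X2"))
# Plots used below: a tile layer filled by value, plus a point layer sized by value
p1 <- ggplot(df, aes(X1, X2)) + geom_tile(aes(fill = value))
p2 <- p1 + geom_point(aes(size = value))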
# Basic form
p1 + scale_fill_continuous(guide = "colourbar")
p1 + scale_fill_continuous(guide = guide_colourbar())
p1 + guides(fill = guide_colourbar())
# Control styles
# bar size
p1 + guides(fill = guide_colourbar(barwidth = 0.5, barheight = 10))
# no label
p1 + guides(fill = guide_colourbar(label = FALSE))
# no tick marks
p1 + guides(fill = guide_colourbar(ticks = FALSE))
# label position
p1 + guides(fill = guide_colourbar(label.position = "left"))
# label theme
p1 + guides(fill = guide_colourbar(label.theme = element_text(colour = "blue", angle = 0)))
p2 +
scale_fill_continuous(guide = guide_colourbar(direction = "horizontal")) +
scale_size(guide = guide_legend(direction = "vertical"))
Legend type guide shows key (i.e., geoms) mapped onto values. Legend guides for various scales
are integrated if possible.
guide_legend(title = waiver(), title.position = NULL,
title.theme = NULL, title.hjust = NULL, title.vjust = NULL,
label = TRUE, label.position = NULL, label.theme = NULL,
label.hjust = NULL, label.vjust = NULL, keywidth = NULL,
keyheight = NULL, direction = NULL, default.unit = "line",
override.aes = list(), nrow = NULL, ncol = NULL, byrow = FALSE,
reverse = FALSE, order = 0, ...)
title A character string or expression indicating a title of guide. If NULL, the title is
not shown. By default (waiver()), the name of the scale object or the name
specified in labs() is used for the title.
title.position A character string indicating the position of a title. One of "top" (default for a
vertical guide), "bottom", "left" (default for a horizontal guide), or "right".
title.theme A theme object for rendering the title text. Usually the object of element_text()
is expected. By default, the theme is specified by legend.title in theme().
title.hjust A number specifying horizontal justification of the title text.
title.vjust A number specifying vertical justification of the title text.
label logical. If TRUE then the labels are drawn. If FALSE then the labels are invisible.
label.position A character string indicating the position of a label. One of "top", "bottom"
(default for horizontal guide), "left", or "right" (default for vertical guide).
label.theme A theme object for rendering the label text. Usually the object of element_text()
is expected. By default, the theme is specified by legend.text in theme().
label.hjust A numeric specifying horizontal justification of the label text.
label.vjust A numeric specifying vertical justification of the label text.
keywidth A numeric or a grid::unit() object specifying the width of the legend key.
Default value is legend.key.width or legend.key.size in theme().
keyheight A numeric or a grid::unit() object specifying the height of the legend key.
Default value is legend.key.height or legend.key.size in theme().
direction A character string indicating the direction of the guide. One of "horizontal" or
"vertical".
default.unit A character string indicating grid::unit() for keywidth and keyheight.
override.aes A list specifying aesthetic parameters of legend key. See details and examples.
nrow The desired number of rows of legends.
ncol The desired number of columns of legends.
byrow logical. If FALSE (the default) the legend-matrix is filled by columns, otherwise
the legend-matrix is filled by rows.
reverse logical. If TRUE the order of legends is reversed.
order positive integer less than 99 that specifies the order of this guide among multiple
guides. This controls the order in which multiple guides are displayed, not the
contents of the guide itself. If 0 (default), the order is determined by a secret
algorithm.
... ignored.
Guides can be specified in each scale_* or in guides(). guide = "legend" in scale_* is
syntactic sugar for guide = guide_legend() (e.g. scale_color_manual(guide = "legend")).
As for how to specify the guide for each scale in more detail, see guides().
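A common use of override.aes is making legend keys readable when the plotted geoms are faint. The sketch below assumes an illustrative scatterplot. The names p and p_legend and the chosen values are not from the original examples.

```r
library(ggplot2)

# Semi-transparent points would give hard-to-read legend keys.
p <- ggplot(mtcars, aes(wt, mpg, colour = factor(cyl))) +
  geom_point(alpha = 0.3)

# Draw opaque, larger keys in the legend only, laid out in a single row.
p_legend <- p +
  guides(colour = guide_legend(
    title = "Cylinders",
    override.aes = list(alpha = 1, size = 3),
    nrow = 1,
    byrow = TRUE
  ))
```

Only the legend keys change. The points in the panel keep alpha = 0.3.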
See Also
Other guides: guide_colourbar, guides
# Basic form
p1 + scale_fill_continuous(guide = guide_legend())
# Control styles
# title position
p1 + guides(fill = guide_legend(title = "LEFT", title.position = "left"))
# label position
p1 + guides(fill = guide_legend(label.position = "left", label.hjust = 1))
# label styles
p1 + scale_fill_continuous(breaks = c(5, 10, 15),
  labels = paste("long", c(5, 10, 15)),
  guide = guide_legend(
    direction = "horizontal",
    title.position = "top",
    label.position = "bottom",
    label.hjust = 0.5,
    label.vjust = 1,
    label.theme = element_text(angle = 90)
  )
)
These are wrappers around functions from Hmisc designed to make them easier to use with stat_summary().
See the Hmisc documentation for more details:
• Hmisc::smean.sdl()
• Hmisc::smedian.hilow()
mean_cl_boot(x, ...)
mean_cl_normal(x, ...)
mean_sdl(x, ...)
median_hilow(x, ...)
x a numeric vector
... other arguments passed on to the respective Hmisc function.
x <- rnorm(100)
This function makes it easy to assign different labellers to different factors. The labeller can be a
function or it can be a named character vector that will serve as a lookup table.
... Named arguments of the form variable = labeller. Each labeller is passed
to as_labeller() and can be a lookup table, a function taking and returning
character vectors, or simply a labeller function.
.rows, .cols Labeller for a whole margin (either the rows or the columns). It is passed to
as_labeller(). When a margin-wide labeller is set, make sure you don’t mention
in ... any variable belonging to the margin.
Deprecated. All supplied labellers and non-labeller functions should be able to
work with character labels.
.multi_line Whether to display the labels of multiple factors on separate lines. This is passed
to the labeller function.
.default Default labeller for variables not specified. Also used with lookup tables or
non-labeller functions.
In case of functions, if the labeller has class labeller, it is directly applied on the data frame of
labels. Otherwise, it is applied to the columns of the data frame of labels. The data frame is then
processed with the function specified in the .default argument. This is intended to be used with
functions taking a character vector such as Hmisc::capitalize().
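As a small sketch of the two kinds of labeller mentioned above, the call below gives one facetting variable a lookup table and the other a labeller function. The lookup values and the name facet_labels are illustrative assumptions.

```r
library(ggplot2)

# `vs` uses a named character vector as a lookup table;
# `am` uses a labeller function.
facet_labels <- labeller(
  vs = c(`0` = "V-shaped", `1` = "Straight"),
  am = label_both
)

p <- ggplot(mtcars, aes(wt, mpg)) +
  geom_point() +
  facet_grid(vs ~ am, labeller = facet_labels)
```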
See Also
as_labeller(), labellers
# Or whole margins:
p1 + facet_grid(
  vs + am ~ gear,
  labeller = labeller(.rows = label_both, .cols = label_value)
)
Labeller functions are in charge of formatting the strip labels of facet grids and wraps. Most of
them accept a multi_line argument to control whether multiple factors (defined in formulae such
as ~first + second) should be displayed on a single line separated with commas, or each on their
own line.
labels Data frame of labels. Usually contains only one element, but faceting over
multiple factors entails multiple label variables.
multi_line Whether to display the labels of multiple factors on separate lines.
sep String separating variables and values.
width Maximum number of characters before wrapping the strip.
label_value() only displays the value of a factor while label_both() displays both the variable
name and the factor value. label_context() is context-dependent and uses label_value() for
single factor faceting and label_both() when multiple factors are involved. label_wrap_gen()
uses base::strwrap() for line wrapping.
label_parsed() interprets the labels as plotmath expressions. label_bquote() offers a more
flexible way of constructing plotmath expressions. See examples and bquote() for details on the
syntax of the argument.
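Labeller functions can also be called directly on a data frame of labels, which makes the difference between label_value() and label_both() easy to inspect. This is only a sketch: the one-column data frame stands in for what a facet would supply, and the names labels, values and both are assumptions.

```r
library(ggplot2)

# A data frame of labels as a facet would supply it: one column per variable.
labels <- data.frame(cyl = c("4", "6"), stringsAsFactors = FALSE)

values <- label_value(labels)  # factor values only
both   <- label_both(labels)   # variable name and value, joined by `sep`
```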
See Also
labeller(), as_labeller(), label_bquote()
mtcars$cyl2 <- factor(mtcars$cyl, labels = c("alpha", "beta", "gamma"))
p <- ggplot(mtcars, aes(wt, mpg)) + geom_point()
label_bquote() offers a flexible way of labelling facet rows or columns with plotmath expressions.
Backquoted variables will be replaced with their value in the facet.
See Also
labellers, labeller()
Good labels are critical for making your plots accessible to a wider audience. Always ensure the
axis and legend labels display the full variable name. Use the plot title and subtitle to explain
the main findings. It’s common to use the caption to provide information about the data source.
tag can be used for adding identification tags to differentiate between multiple plots.
labs(..., title = waiver(), subtitle = waiver(), caption = waiver(),
tag = waiver())
... A list of new name-value pairs. The name should be an aesthetic.
title The text for the title.
subtitle The text for the subtitle for the plot which will be displayed below the title.
caption The text for the caption which will be displayed in the bottom-right of the plot
by default.
tag The text for the tag label which will be displayed at the top-left of the plot by
default.
label The title of the respective axis (for xlab() or ylab()) or of the plot (for ggtitle()).
You can also set axis and legend labels in the individual scales (using the first argument, the name).
If you’re changing other scale options, this is recommended.
If a plot already has a title, subtitle, caption, etc., and you want to remove it, you can do so by setting
the respective argument to NULL. For example, if plot p has a subtitle, then p + labs(subtitle = NULL)
will remove the subtitle from the plot.
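A short sketch of removing a label with NULL follows. The title and subtitle text and the name p_no_subtitle are illustrative assumptions.

```r
library(ggplot2)

p <- ggplot(mtcars, aes(mpg, wt)) +
  geom_point() +
  labs(title = "Weight against fuel economy", subtitle = "mtcars")

# Setting a label to NULL removes it; other labels are left alone.
p_no_subtitle <- p + labs(subtitle = NULL)
```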
p <- ggplot(mtcars, aes(mpg, wt, colour = cyl)) + geom_point()
p + labs(colour = "Cylinders")
p + labs(x = "New x label")
This is a shortcut for supplying the limits argument to the individual scales. Note that, by default,
any values outside the limits will be replaced with NA.
... A name-value pair. The name must be an aesthetic, and the value must be either
a length-2 numeric, a character, a factor, or a date/time.
A numeric value will create a continuous scale. If the larger value comes first,
the scale will be reversed. You can leave one value as NA to compute from the
range of the data.
A character or factor value will create a discrete scale.
A date-time value will create a continuous date/time scale.
See Also
For changing x or y axis limits without dropping data observations, see coord_cartesian(). To
expand the range of a plot to always include certain values, see expand_limits().
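The difference between setting limits and zooming can be checked on the built layer data. In this sketch the limits 15 and 20 and the object names are illustrative assumptions. layer_data() is used only to inspect the values after the scales have been applied.

```r
library(ggplot2)

base <- ggplot(mtcars, aes(mpg, wt)) + geom_point()

limited <- base + xlim(15, 20)                       # out-of-range mpg becomes NA
zoomed  <- base + coord_cartesian(xlim = c(15, 20))  # data kept, view cropped

# Count the values replaced with NA by the limits, and those kept when zooming.
n_dropped <- sum(is.na(layer_data(limited)$x))
n_kept    <- sum(!is.na(layer_data(zoomed)$x))
```

Because xlim() discards observations before any statistics are computed, summaries such as smoothers or boxplots can change, whereas coord_cartesian() leaves them untouched.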
# Zoom into a specified area
ggplot(mtcars, aes(mpg, wt)) +
geom_point() +
xlim(15, 20)
# reverse scale
ggplot(mtcars, aes(mpg, wt)) +
geom_point() +
xlim(20, 15)
# You can also supply limits that are larger than the data.
# This is useful if you want to match scales across different plots
small <- subset(mtcars, cyl == 4)
big <- subset(mtcars, cyl > 4)
All built-in colors() translated into Luv colour space.
A data frame with 657 observations and 4 variables:
In conjunction with the theme system, the element_ functions specify how the non-data
components of the plot are drawn.
rel() is used to specify sizes relative to the parent; margin() is used to specify the margins of
elements.
margin(t = 0, r = 0, b = 0, l = 0, unit = "pt")
t, r, b, l Dimensions of each margin. (To remember order, think trouble).
unit Default units of dimensions. Defaults to "pt" so it can be most easily scaled with
the text.
fill Fill colour.
colour, color Line/border colour. Color is an alias for colour.
size Line/border size in mm; text size in pts.
linetype Line type. An integer (0:8), a name (blank, solid, dashed, dotted, dotdash,
longdash, twodash), or a string with an even number (up to eight) of hexadecimal
digits which give the lengths in consecutive positions in the string.
inherit.blank Should this element inherit the existence of an element_blank among its par-
ents? If TRUE the existence of a blank element among its parents will cause this
element to be blank as well. If FALSE any blank parent element will be ignored
when calculating final element state.
lineend Line end style (round, butt, square)
arrow Arrow specification, as created by grid::arrow()
family Font family
face Font face ("plain", "italic", "bold", "bold.italic")
hjust Horizontal justification (in [0, 1])
vjust Vertical justification (in [0, 1])
angle Angle (in [0, 360])
lineheight Line height
margin Margins around the text. See margin() for more details. When creating a
theme, the margins should be placed on the side of the text facing towards the
center of the plot.
debug If TRUE, aids visual debugging by drawing a solid rectangle behind the complete
text area, and a point where each label is anchored.
x A single number specifying size relative to parent element.
An S3 object of class element, rel, or margin.
plot <- ggplot(mpg, aes(displ, hwy)) + geom_point()
plot + theme(
  panel.background = element_blank(),
  axis.text = element_blank()
)
plot + theme(
  axis.text = element_text(colour = "red", size = rel(1.5))
)
plot + theme(
  axis.line = element_line(arrow = arrow())
)
plot + theme(
  panel.background = element_rect(fill = "white"),
  plot.margin = margin(2, 2, 2, 2, "cm"),
  plot.background = element_rect(
    fill = "grey90",
    colour = "black",
    size = 1
  )
)
mean_se(x, mult = 1)
x numeric vector
mult number of multiples of standard error
x <- rnorm(100)
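As a worked check of what mult controls, the sketch below recomputes the interval by hand and compares it with mean_se(). It assumes the usual definition of the standard error of the mean, sqrt(var(x) / n). The seed and the names se, by_hand and from_ggplot2 are illustrative.

```r
library(ggplot2)

set.seed(1)
x <- rnorm(100)

# Standard error of the mean, computed by hand.
se <- sqrt(var(x) / length(x))
by_hand <- c(
  y    = mean(x),
  ymin = mean(x) - 2 * se,
  ymax = mean(x) + 2 * se
)

# The same interval from mean_se(), two standard errors either side of the mean.
from_ggplot2 <- mean_se(x, mult = 2)
```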
poptotal Total population
popdensity Population density
popwhite Number of whites.
popblack Number of blacks.
popamerindian Number of American Indians.
popasian Number of Asians.
popother Number of other races.
percwhite Percent white.
percblack Percent black.
percamerindan Percent American Indian.
percasian Percent Asian.
percother Percent other races.
popadults Number of adults.
percollege Percent college educated.
percprof Percent profession.
inmetro In a metro area.
mpg Fuel economy data from 1999 and 2008 for 38 popular models of car
This dataset contains a subset of the fuel economy data that the EPA makes available on http:
// It contains only models which had a new release every year between 1999
and 2008 - this was used as a proxy for the popularity of the car.
A data frame with 234 rows and 11 variables
model model name
displ engine displacement, in litres
year year of manufacture
cyl number of cylinders
trans type of transmission
drv f = front-wheel drive, r = rear wheel drive, 4 = 4wd
cty city miles per gallon
hwy highway miles per gallon
fl fuel type
class "type" of car
This is an updated and expanded version of the mammals sleep dataset. Updated sleep times and
weights were taken from V. M. Savage and G. B. West. A quantitative, theoretical framework for
understanding mammalian sleep. Proceedings of the National Academy of Sciences, 104 (3):1051-
1056, 2007.
A data frame with 83 rows and 11 variables
name common name
vore carnivore, omnivore or herbivore?
conservation the conservation status of the animal
sleep_total total amount of sleep, in hours
sleep_rem rem sleep, in hours
sleep_cycle length of sleep cycle, in hours
awake amount of time spent awake, in hours
brainwt brain weight in kilograms
bodywt body weight in kilograms
Additional variables order, conservation status and vore were added from Wikipedia.
Dodging preserves the vertical position of a geom while adjusting the horizontal position. position_dodge2
is a special case of position_dodge for arranging box plots, which can have variable widths.
position_dodge2 also works with bars and rectangles.
position_dodge(width = NULL, preserve = c("total", "single"))
width Dodging width, when different to the width of the individual elements. This
is useful when you want to align narrow geoms with wider geoms. See the
examples.
preserve Should dodging preserve the total width of all elements at a position, or the
width of a single element?
padding Padding between elements at the same position. Elements are shrunk by this
proportion to allow space between them. Defaults to 0.1.
reverse If TRUE, will reverse the default stacking order. This is useful if you’re rotating
both the plot and legend.
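The effect of preserve can be seen by comparing bar widths after dodging. This sketch relies on the fact that in mtcars the 8-cylinder cars all share one value of vs, so that position holds a single bar. The helper bar_widths() and the other names are hypothetical.

```r
library(ggplot2)

# In mtcars, 8-cylinder cars occur with only one value of `vs`,
# so that position holds one bar while the others hold two.
base <- ggplot(mtcars, aes(factor(cyl), fill = factor(vs)))

total  <- base + geom_bar(position = position_dodge(preserve = "total"))
single <- base + geom_bar(position = position_dodge(preserve = "single"))

# Hypothetical helper: bar widths after dodging, rounded to avoid
# floating-point noise.
bar_widths <- function(p) {
  d <- layer_data(p)
  round(d$xmax - d$xmin, 6)
}
```

With preserve = "total" the lone bar fills the whole slot, so two distinct widths appear. With preserve = "single" every bar has the same width.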
See Also
Other position adjustments: position_identity, position_jitterdodge, position_jitter,
position_nudge, position_stack
ggplot(mtcars, aes(factor(cyl), fill = factor(vs))) +
geom_bar(position = "dodge2")
# Box plots use position_dodge2 by default, and bars can use it too
ggplot(data = iris, aes(Species, Sepal.Length)) +
geom_boxplot(aes(colour = Sepal.Width < 3.2))
See Also
Counterintuitively, adding random noise to a plot can sometimes make it easier to read. Jittering is
particularly useful for small datasets with at least one discrete position.
width, height Amount of horizontal and vertical jitter. The jitter is added in both positive and
negative directions, so the total spread is twice the value specified here.
If omitted, defaults to 40% of the resolution of the data: this means the jitter
values will occupy 80% of the implied bins. Categorical data is aligned on the
integers, so a width or height of 0.5 will spread the data so it’s not possible to
see the distinction between the categories.
seed A random seed to make the jitter reproducible. Useful if you need to apply the
same jitter twice, e.g., for a point and a corresponding label. The random seed is
reset after jittering. If NA (the default value), the seed is initialised with a random
value; this makes sure that two subsequent calls start with a different seed. Use
NULL to use the current random seed and also avoid resetting (the behaviour of
ggplot2 2.2.1 and earlier).
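The point-and-label case mentioned for seed can be sketched as follows. The jitter amounts, the seed value 42, and the object names are illustrative assumptions.

```r
library(ggplot2)

# The same seed in both layers gives each label the same offset as its point.
jitter <- position_jitter(width = 0.2, height = 0.2, seed = 42)

p <- ggplot(mtcars, aes(factor(cyl), mpg, label = rownames(mtcars))) +
  geom_point(position = jitter) +
  geom_text(position = jitter, vjust = -1)

# Jittered coordinates of the point layer and the text layer.
points <- layer_data(p, 1)
labels <- layer_data(p, 2)
```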
See Also
Other position adjustments: position_dodge, position_identity, position_jitterdodge, position_nudge,
position_stack
# Jittering is useful when you have a discrete position, and a relatively
# small number of points. Adding slight random noise lets the points
# take up as much space as a boxplot or a bar
ggplot(mpg, aes(class, hwy)) +
  geom_boxplot(colour = "grey50") +
  geom_jitter()
This is primarily used for aligning points generated through geom_point() with dodged boxplots
(e.g., a geom_boxplot() with a fill aesthetic supplied).
position_jitterdodge(jitter.width = NULL, jitter.height = 0,
dodge.width = 0.75, seed = NA)
jitter.width degree of jitter in x direction. Defaults to 40% of the resolution of the data.
jitter.height degree of jitter in y direction. Defaults to 0.
dodge.width the amount to dodge in the x direction. Defaults to 0.75, the default position_dodge()
width.
seed A random seed to make the jitter reproducible. Useful if you need to apply the
same jitter twice, e.g., for a point and a corresponding label. The random seed is
reset after jittering. If NA (the default value), the seed is initialised with a random
value; this makes sure that two subsequent calls start with a different seed. Use
NULL to use the current random seed and also avoid resetting (the behaviour of
ggplot2 2.2.1 and earlier).
See Also
Other position adjustments: position_dodge, position_identity, position_jitter, position_nudge,
position_stack
dsub <- diamonds[ sample(nrow(diamonds), 1000), ]
ggplot(dsub, aes(x = cut, y = carat, fill = clarity)) +
geom_boxplot(outlier.size = 0) +
geom_point(pch = 21, position = position_jitterdodge())
position_nudge is generally useful for adjusting the position of items on discrete scales by a
small amount. Nudging is built in to geom_text() because it’s so useful for moving labels a small
distance from what they’re labelling.
position_nudge(x = 0, y = 0)
x, y Amount of horizontal and vertical distance to move.
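A minimal sketch of nudging on continuous scales follows, with the offset checked against the un-nudged point layer. The data values, the offset 0.25, and the names are illustrative assumptions.

```r
library(ggplot2)

df <- data.frame(
  x = c(1, 3, 2, 5),
  y = c(2, 4, 1, 3),
  label = c("a", "b", "c", "d")
)

# Points stay where they are; labels are moved up by 0.25 data units.
p <- ggplot(df, aes(x, y)) +
  geom_point() +
  geom_text(aes(label = label), position = position_nudge(y = 0.25))

# Vertical distance between each label and its point.
offset <- layer_data(p, 2)$y - layer_data(p, 1)$y
```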
See Also
Other position adjustments: position_dodge, position_identity, position_jitterdodge, position_jitter,
position_stack
df <- data.frame(
  x = c(1,3,2,5),
  y = c("a","c","d","c")
)
# Or, in brief
ggplot(df, aes(x, y)) +
geom_point() +
geom_text(aes(label = y), nudge_y = -0.1)
position_stack() stacks bars on top of each other; position_fill() stacks bars and
standardises each stack to have constant height.
position_stack(vjust = 1, reverse = FALSE)
vjust Vertical adjustment for geoms that have a position (like points or lines), not a
dimension (like bars or areas). Set to 0 to align with the bottom, 0.5 for the
middle, and 1 (the default) for the top.
reverse If TRUE, will reverse the default stacking order. This is useful if you’re rotating
both the plot and legend.
position_fill() and position_stack() automatically stack values in reverse order of the group
aesthetic, which for bar charts is usually defined by the fill aesthetic (the default group aesthetic is
formed by the combination of all discrete aesthetics except for x and y). This default ensures that
bar colours align with the default legend.
There are three ways to override the defaults depending on what you want:
1. Change the order of the levels in the underlying factor. This will change the stacking order,
and the order of keys in the legend.
2. Set the legend breaks to change the order of the keys without affecting the stacking.
3. Manually set the group aesthetic to change the stacking order without affecting the legend.
Stacking of positive and negative values is performed separately so that positive values stack
upwards from the x-axis and negative values stack downward.
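The separate handling of positive and negative values can be checked on a single stacked bar. The data below are made up for illustration, and the names df, p and stacked are assumptions.

```r
library(ggplot2)

# One bar position with two positive values and one negative value.
df <- data.frame(
  x   = "a",
  y   = c(2, -1, 3),
  grp = c("p", "q", "r")
)

p <- ggplot(df, aes(x, y, fill = grp)) + geom_col()

# ymin and ymax hold the stacked extent of each segment.
stacked <- layer_data(p)
```

The positive segments reach 2 + 3 = 5 above the axis, while the negative segment extends to -1 below it rather than being subtracted from the positive total.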
See Also
See geom_bar() and geom_area() for more examples.
Other position adjustments: position_dodge, position_identity, position_jitterdodge, position_jitter,
position_nudge
# Stacking and filling ------------------------------------------------------
# You control the stacking order by setting the levels of the underlying
# factor. See the forcats package for convenient helpers.
series$type2 <- factor(series$type, levels = c('c', 'b', 'd', 'a'))
ggplot(series, aes(time, value)) +
geom_area(aes(fill = type2))
# You can change the order of the levels in the legend using the scale
ggplot(series, aes(time, value)) +
geom_area(aes(fill = type)) +
scale_fill_discrete(breaks = c('a', 'b', 'c', 'd'))
# When stacking across multiple layers it's a good idea to always set
# the `group` aesthetic in the ggplot() call. This ensures that all layers
# are stacked in the same way.
ggplot(series, aes(time, value, group = type)) +
geom_line(aes(colour = type), position = "stack") +
geom_point(aes(colour = type), position = "stack")
# You can also stack labels, but the default position is suboptimal.
ggplot(series, aes(time, value, group = type)) +
geom_area(aes(fill = type)) +
geom_text(aes(label = type), position = "stack")
# You can override this with the vjust parameter. A vjust of 0.5
# will center the labels inside the corresponding area
ggplot(series, aes(time, value, group = type)) +
geom_area(aes(fill = type)) +
geom_text(aes(label = type), position = position_stack(vjust = 0.5))
df <- tibble::tribble(
  ~x, ~y, ~grp,
  "a", 1, "x",
  "a", 2, "y",
  "b", 1, "x",
  "b", 3, "y",
  "b", -1, "y"
)
ggplot(data = df, aes(x, y, group = grp)) +
  geom_col(aes(fill = grp))
The name of each president, the start and end dates of their term, and their party, for 11 US presidents
from Eisenhower to Obama.
A data frame with 11 rows and 4 variables
Generally, you do not need to print or plot a ggplot2 plot explicitly: the default top-level print
method will do it for you. You will, however, need to call print() explicitly if you want to draw a
plot inside a function or for loop.
## S3 method for class 'ggplot'
print(x, newpage = is.null(vp), vp = NULL, ...)
x plot to display
newpage draw new (empty) page first?
vp viewport to draw plot in
... other arguments not used by this method
Invisibly returns the result of ggplot_build(), which is a list with components that contain the
plot itself, the data, information about the scales, panels etc.
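The need for an explicit print() inside a loop can be sketched as below. The variable names, the use of the .data pronoun, and the null PDF device (chosen so that nothing is written to disk) are assumptions made for illustration.

```r
library(ggplot2)

pdf(NULL)  # null device, so nothing is written to disk in this sketch

vars <- c("class", "drv", "fl")
drawn <- 0
for (v in vars) {
  p <- ggplot(mpg, aes(.data[[v]], hwy)) + geom_boxplot()
  print(p)  # without print(), nothing would be drawn inside the loop
  drawn <- drawn + 1
}

invisible(dev.off())
```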
colours <- list(~class, ~drv, ~fl)
If a ggproto object has a $print method, this will call that method. Otherwise, it will print out the
members of the object, and optionally, the members of the inherited objects.
Dog <- ggproto(
  print = function(self, n) {
    cat(format(Dog), "\n")
  }
)
qplot is a shortcut designed to be familiar if you’re used to base plot(). It’s a convenient wrapper
for creating a number of different types of plots using a consistent calling scheme. It’s great for
allowing you to produce plots quickly, but I highly recommend learning ggplot() as it makes it
easier to create complex graphics.
qplot(x, y, ..., data, facets = NULL, margins = FALSE, geom = "auto",
xlim = c(NA, NA), ylim = c(NA, NA), log = "", main = NULL,
xlab = NULL, ylab = NULL, asp = NA, stat = NULL,
position = NULL)
x, y, ... Aesthetics passed into each layer
data Data frame to use (optional). If not specified, will create one, extracting vectors
from the current environment.
facets faceting formula to use. Picks facet_wrap() or facet_grid() depending on
whether the formula is one- or two-sided
margins See facet_grid: display marginal facets?
geom Character vector specifying geom(s) to draw. Defaults to "point" if x and y are
specified, and "histogram" if only x is specified.
xlim, ylim X and y axis limits
log Which variables to log transform ("x", "y", or "xy")
main, xlab, ylab
Character vector (or expression) giving plot title, x axis label, and y axis label
# Use data from data.frame
qplot(mpg, wt, data = mtcars)
qplot(mpg, wt, data = mtcars, colour = cyl)
qplot(mpg, wt, data = mtcars, size = cyl)
qplot(mpg, wt, data = mtcars, facets = vs ~ am)
f <- function() {
  a <- 1:10
  b <- a ^ 2
  qplot(a, b)
}
# qplot will attempt to guess what geom you want depending on the input
# both x and y supplied = scatterplot
qplot(mpg, wt, data = mtcars)
# just x supplied = histogram
qplot(mpg, data = mtcars)
# just y supplied = scatterplot, with x = seq_along(y)
qplot(y = mpg, data = mtcars)
The resolution is the smallest non-zero distance between adjacent values. If there is only one unique
value, then the resolution is defined to be one. If x is an integer vector, then it is assumed to represent
a discrete variable, and the resolution is 1.
resolution(x, zero = TRUE)
x numeric vector
zero should a zero value be automatically included in the computation of resolution
resolution((1:10) - 0.5)
resolution((1:10) - 0.5, FALSE)
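The cases described above can be compared side by side. The input vector below is an illustrative assumption, chosen so that including zero changes the answer.

```r
library(ggplot2)

x <- c(0.25, 1.25, 2.25)

res_with_zero    <- resolution(x)                # zero is added: gap from 0 to 0.25
res_without_zero <- resolution(x, zero = FALSE)  # only gaps between the values
res_integer      <- resolution(c(2L, 4L, 8L))    # integer input is treated as discrete
res_single       <- resolution(7.5)              # a single unique value
```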
Alpha-transparency scales are not tremendously useful, but can be a convenient way to visually
down-weight less important observations. scale_alpha is an alias for scale_alpha_continuous
since that is the most common use of alpha, and it saves a bit of typing.
scale_alpha(..., range = c(0.1, 1))
... Other arguments passed on to continuous_scale() or discrete_scale() as
appropriate, to control name, limits, breaks, labels and so forth.
range Output range of alpha values. Must lie between 0 and 1.
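As a quick check of how range is applied, the sketch below maps year to alpha and inspects the resulting values. The range c(0.4, 0.8) and the names p and alpha_range are illustrative.

```r
library(ggplot2)

p <- ggplot(mpg, aes(displ, hwy)) +
  geom_point(aes(alpha = year)) +
  scale_alpha(range = c(0.4, 0.8))

# The smallest and largest year are mapped to the ends of `range`.
alpha_range <- range(layer_data(p)$alpha)
```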
See Also
Other colour scales: scale_colour_brewer, scale_colour_gradient, scale_colour_grey, scale_colour_hue,
p <- ggplot(mpg, aes(displ, hwy)) +
geom_point(aes(alpha = year))
p + scale_alpha("cylinders")
p + scale_alpha(range = c(0.4, 0.8))
The brewer scales provide sequential, diverging and qualitative colour schemes from ColorBrewer.
These are particularly well suited to display discrete values on a map. See
http://colorbrewer2.org for more information.
scale_colour_brewer(..., type = "seq", palette = 1, direction = 1,
aesthetics = "colour")
... Other arguments passed on to discrete_scale() or, for distiller scales,
continuous_scale() to control name, limits, breaks, labels and so forth.
type One of seq (sequential), div (diverging) or qual (qualitative)
palette If a string, will use that named palette. If a number, will index into the list of
palettes of appropriate type
direction Sets the order of colours in the scale. If 1, the default, colours are as output by
RColorBrewer::brewer.pal(). If -1, the order of colours is reversed.
aesthetics Character string or vector of character strings listing the name(s) of the
aesthetic(s) that this scale works with. This can be useful, for example, to
apply colour settings to the colour and fill aesthetics at the same time, via
aesthetics = c("colour", "fill").
values if colours should not be evenly positioned along the gradient this vector gives
the position (between 0 and 1) for each colour in the colours vector. See
rescale() for a convenience function to map an arbitrary range to between
0 and 1.
space colour space in which to calculate gradient. Must be "Lab" - other values are
deprecated.
na.value Colour to use for missing values
guide Type of legend. Use "colourbar" for continuous colour bar, or "legend" for
discrete colour legend.
The brewer scales were carefully designed and tested on discrete data. They were not designed to
be extended to continuous data, but results often look good. Your mileage may vary.
The following palettes are available for use with these scales:
Diverging BrBG, PiYG, PRGn, PuOr, RdBu, RdGy, RdYlBu, RdYlGn, Spectral
Qualitative Accent, Dark2, Paired, Pastel1, Pastel2, Set1, Set2, Set3
Sequential Blues, BuGn, BuPu, GnBu, Greens, Greys, Oranges, OrRd, PuBu, PuBuGn, PuRd,
Purples, RdPu, Reds, YlGn, YlGnBu, YlOrBr, YlOrRd
The distiller scales extend brewer to continuous scales by smoothly interpolating 7 colours from
any palette to a continuous scale.
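The arguments described above can be sketched as follows. This is a minimal illustration, not part of the original examples: the palette names are arbitrary picks from the lists above, and set.seed() is added only so that the random sample is reproducible.

```r
library(ggplot2)

set.seed(1)  # assumption: only here to make the sample reproducible
dsamp <- diamonds[sample(nrow(diamonds), 1000), ]
d <- ggplot(dsamp, aes(carat, price)) +
  geom_point(aes(colour = clarity))

# Select a palette by name, or by type plus index
d_named <- d + scale_colour_brewer(palette = "Set1")
d_index <- d + scale_colour_brewer(type = "seq", palette = 3)

# Reverse the order of the colours
d_rev <- d + scale_colour_brewer(palette = "Blues", direction = -1)

# Distiller variant: interpolate a brewer palette for a continuous variable
v <- ggplot(faithfuld, aes(waiting, eruptions, fill = density)) +
  geom_tile() +
  scale_fill_distiller(palette = "Spectral")
```

Printing any of these objects draws the plot. The brewer palettes have a limited number of colours, so a variable with many levels may exceed what a palette provides.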
See Also
Other colour scales: scale_alpha, scale_colour_gradient, scale_colour_grey, scale_colour_hue,
dsamp <- diamonds[sample(nrow(diamonds), 1000), ]
(d <- ggplot(dsamp, aes(carat, price)) +
geom_point(aes(colour = clarity)))
d + scale_colour_brewer()
Continuous colour scales
Colour scales for continuous data default to the values of the ggplot2.continuous.colour and
ggplot2.continuous.fill options. If these options are not present, "gradient" will be used.
See options() for more information.
scale_colour_continuous(...,
  type = getOption("ggplot2.continuous.colour", default = "gradient"))
See Also
v <- ggplot(faithfuld, aes(waiting, eruptions, fill = density)) +
  geom_tile()
v + scale_fill_continuous(type = "gradient")
v + scale_fill_continuous(type = "viridis")
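The option mechanism described above might be used as in the following sketch. It sets ggplot2.continuous.fill for the session, compares the result with an explicit type = "viridis" request, and then restores the previous setting. The variable names are illustrative.

```r
library(ggplot2)

v <- ggplot(faithfuld, aes(waiting, eruptions, fill = density)) +
  geom_tile()

# Make "viridis" the session-wide default for continuous fill scales
old <- options(ggplot2.continuous.fill = "viridis")
fill_default <- layer_data(v)$fill   # colours chosen through the option

# The same scale, requested explicitly for a single plot
fill_explicit <- layer_data(v + scale_fill_continuous(type = "viridis"))$fill

options(old)  # restore the previous setting
```

Setting the option changes every plot that relies on the default scale, so an explicit scale_fill_continuous() call is usually the safer choice inside functions or packages.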
... Arguments passed on to continuous_scale
scale_name The name of the scale
palette A palette function that when called with a numeric vector with values
between 0 and 1 returns the corresponding values in the range the scale
maps to.
name The name of the scale. Used as the axis or legend title. If waiver(), the
default, the name of the scale is taken from the first mapping used for that
aesthetic. If NULL, the legend title will be omitted.
breaks One of:
• NULL for no breaks
• waiver() for the default breaks computed by the transformation object
• A numeric vector of positions
• A function that takes the limits as input and returns breaks as output
minor_breaks One of:
• NULL for no minor breaks
• waiver() for the default breaks (one minor break between each major
break)
• A numeric vector of positions
• A function that given the limits returns a vector of minor breaks.
labels One of:
• NULL for no labels
• waiver() for the default labels computed by the transformation object
• A character vector giving labels (must be same length as breaks)
• A function that takes the breaks as input and returns labels as output
limits A numeric vector of length two providing limits of the scale. Use NA to
refer to the existing minimum or maximum.
rescaler Used by diverging and n colour gradients (i.e. scale_colour_gradient2(),
scale_colour_gradientn()). A function used to scale the input values to
the range [0, 1].
oob Function that handles limits outside of the scale limits (out of bounds). The
default replaces out of bounds values with NA.
trans Either the name of a transformation object, or the object itself. Built-in
transformations include "asn", "atanh", "boxcox", "exp", "identity", "log",
"log10", "log1p", "log2", "logit", "probability", "probit", "reciprocal",
"reverse" and "sqrt".
A transformation object bundles together a transform, its inverse, and meth-
ods for generating breaks and labels. Transformation objects are defined in
the scales package, and are called name_trans, e.g. scales::boxcox_trans().
You can create your own transformation with scales::trans_new().
position The position of the axis. "left" or "right" for vertical scales, "top" or
"bottom" for horizontal scales
super The super class to use for the constructed scale
scale_colour_gradient 163
expand Vector of range expansion constants used to add some padding around
the data, to ensure that they are placed some distance away from the axes.
Use the convenience function expand_scale() to generate the values for
the expand argument. The defaults are to expand the scale by 5% on each
side for continuous variables, and by 0.6 units on each side for discrete
variables.
low, high Colours for low and high ends of the gradient.
space colour space in which to calculate gradient. Must be "Lab" - other values are
deprecated.
na.value Colour to use for missing values
guide Type of legend. Use "colourbar" for continuous colour bar, or "legend" for
discrete colour legend.
aesthetics Character string or vector of character strings listing the name(s) of the
aesthetic(s) that this scale works with. This can be useful, for example, to
apply colour settings to the colour and fill aesthetics at the same time, via
aesthetics = c("colour", "fill").
mid colour for mid point
midpoint The midpoint (in data value) of the diverging scale. Defaults to 0.
colours, colors
Vector of colours to use for n-colour gradient.
values if colours should not be evenly positioned along the gradient this vector gives
the position (between 0 and 1) for each colour in the colours vector. See
rescale() for a convenience function to map an arbitrary range to between
0 and 1.
Default colours are generated with munsell and mnsl(c("2.5PB 2/4", "2.5PB 7/10")).
Generally, for continuous colour scales you want to keep hue constant, but vary chroma and
luminance. The munsell package makes this easy to do using the Munsell colour system.
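The low, high, mid, midpoint, colours and values arguments described above could be combined as in this sketch. The colour names and the break positions passed to scales::rescale() are illustrative assumptions, and set.seed() is added only for reproducibility.

```r
library(ggplot2)

set.seed(1)  # assumption: only here to make the random data reproducible
df <- data.frame(
  x = runif(100),
  y = runif(100),
  z1 = rnorm(100),
  z2 = abs(rnorm(100))
)

# Two-colour sequential gradient
g1 <- ggplot(df, aes(x, y)) +
  geom_point(aes(colour = z2)) +
  scale_colour_gradient(low = "white", high = "black")

# Diverging gradient centred on the default midpoint of 0
g2 <- ggplot(df, aes(x, y)) +
  geom_point(aes(colour = z1)) +
  scale_colour_gradient2(low = "blue", mid = "white", high = "red", midpoint = 0)

# n-colour gradient with unevenly positioned colours; `values` must lie in [0, 1],
# so an arbitrary range is mapped there with scales::rescale()
positions <- scales::rescale(c(0, 2, 10))
g3 <- ggplot(df, aes(x, y)) +
  geom_point(aes(colour = z1)) +
  scale_colour_gradientn(
    colours = c("navy", "white", "firebrick"),
    values = positions
  )
```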
See Also
scales::seq_gradient_pal() for details on underlying palette
Other colour scales: scale_alpha, scale_colour_brewer, scale_colour_grey, scale_colour_hue,
df <- data.frame(
  x = runif(100),
  y = runif(100),
  z1 = rnorm(100),
  z2 = abs(rnorm(100))
)
# Equivalent fill scales do the same job for the fill aesthetic
ggplot(faithfuld, aes(waiting, eruptions)) +
geom_raster(aes(fill = density)) +
scale_fill_gradientn(colours = terrain.colors(10))
Based on gray.colors(). This is the black and white equivalent of scale_colour_gradient().
scale_colour_grey(..., start = 0.2, end = 0.8, na.value = "red",
aesthetics = "colour")
... Arguments passed on to discrete_scale
palette A palette function that when called with a single integer argument (the
number of levels in the scale) returns the values that they should take.
breaks One of:
• NULL for no breaks
• waiver() for the default breaks computed by the transformation object
scale_colour_grey 165
See Also
Other colour scales: scale_alpha, scale_colour_brewer, scale_colour_gradient, scale_colour_hue,
166 scale_colour_hue
p <- ggplot(mtcars, aes(mpg, wt)) + geom_point(aes(colour = factor(cyl)))
p + scale_colour_grey()
p + scale_colour_grey(end = 0)
# You may want to turn off the pale grey background with this scale
p + scale_colour_grey() + theme_bw()
This is the default colour scale for categorical variables. It maps each level to an evenly spaced hue
on the colour wheel. It does not generate colour-blind safe palettes.
scale_colour_hue(..., h = c(0, 360) + 15, c = 100, l = 65,
h.start = 0, direction = 1, na.value = "grey50",
aesthetics = "colour")
... Arguments passed on to discrete_scale
palette A palette function that when called with a single integer argument (the
number of levels in the scale) returns the values that they should take.
breaks One of:
• NULL for no breaks
• waiver() for the default breaks computed by the transformation object
• A character vector of breaks
• A function that takes the limits as input and returns breaks as output
limits A character vector that defines possible values of the scale and their
order.
drop Should unused factor levels be omitted from the scale? The default, TRUE,
uses the levels that appear in the data; FALSE uses all the levels in the factor.
na.translate Unlike continuous scales, discrete scales can easily show missing
values, and do so by default. If you want to remove missing values from a
discrete scale, specify na.translate = FALSE.
na.value If na.translate = TRUE, what aesthetic value should missing values
be displayed as? Does not apply to position scales where NA is always
placed at the far right.
scale_name The name of the scale
name The name of the scale. Used as the axis or legend title. If waiver(), the
default, the name of the scale is taken from the first mapping used for that
aesthetic. If NULL, the legend title will be omitted.
labels One of:
• NULL for no labels
• waiver() for the default labels computed by the transformation object
• A character vector giving labels (must be same length as breaks)
• A function that takes the breaks as input and returns labels as output
expand Vector of range expansion constants used to add some padding around
the data, to ensure that they are placed some distance away from the axes.
Use the convenience function expand_scale() to generate the values for
the expand argument. The defaults are to expand the scale by 5% on each
side for continuous variables, and by 0.6 units on each side for discrete
variables.
guide A function used to create a guide or its name. See guides() for more
info.
position The position of the axis. "left" or "right" for vertical scales, "top" or
"bottom" for horizontal scales
super The super class to use for the constructed scale
h range of hues to use, in [0, 360]
c chroma (intensity of colour), maximum value varies depending on combination
of hue and luminance.
l luminance (lightness), in [0, 100]
h.start hue to start at
direction direction to travel around the colour wheel, 1 = clockwise, -1 = counter-clockwise
na.value Colour to use for missing values
aesthetics Character string or vector of character strings listing the name(s) of the
aesthetic(s) that this scale works with. This can be useful, for example, to
apply colour settings to the colour and fill aesthetics at the same time, via
aesthetics = c("colour", "fill").
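A brief sketch of the h, c, l, direction and aesthetics arguments follows. The particular values are arbitrary and chosen only to make the effect visible.

```r
library(ggplot2)

p <- ggplot(mtcars, aes(mpg, wt)) +
  geom_point(aes(colour = factor(cyl)))

# Darker, less saturated colours: lower luminance and chroma
p_dark <- p + scale_colour_hue(l = 40, c = 30)

# Restrict the hue range and reverse the direction of travel
p_range <- p + scale_colour_hue(h = c(0, 90), direction = -1)

# Apply one hue scale to both colour and fill in a single call
p_both <- ggplot(mtcars, aes(mpg, wt, colour = factor(cyl), fill = factor(cyl))) +
  geom_point(shape = 21, alpha = 0.5, size = 2) +
  scale_colour_hue(aesthetics = c("colour", "fill"))
```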
See Also
Other colour scales: scale_alpha, scale_colour_brewer, scale_colour_gradient, scale_colour_grey,
168 scale_colour_viridis_d
# Vary opacity
# (only works with pdf, quartz and cairo devices)
d <- ggplot(dsamp, aes(carat, price, colour = clarity))
d + geom_point(alpha = 0.9)
d + geom_point(alpha = 0.5)
d + geom_point(alpha = 0.2)
Viridis colour scales from viridisLite
The viridis scales provide colour maps that are perceptually uniform in both colour and
black-and-white. They are also designed to be perceived by viewers with common forms of colour
blindness. See also
scale_colour_viridis_d(..., alpha = 1, begin = 0, end = 1,
direction = 1, option = "D", aesthetics = "colour")
... Other arguments passed on to discrete_scale() or continuous_scale() to
control name, limits, breaks, labels and so forth.
alpha The alpha transparency, a number in [0,1], see argument alpha in hsv.
begin The (corrected) hue in [0,1] at which the viridis colormap begins.
end The (corrected) hue in [0,1] at which the viridis colormap ends.
direction Sets the order of colors in the scale. If 1, the default, colors are ordered from
darkest to lightest. If -1, the order of colors is reversed.
option A character string indicating the colormap option to use. Five options are
available: "magma" (or "A"), "inferno" (or "B"), "plasma" (or "C"), "viridis" (or "D",
the default option) and "cividis" (or "E").
aesthetics Character string or vector of character strings listing the name(s) of the
aesthetic(s) that this scale works with. This can be useful, for example, to
apply colour settings to the colour and fill aesthetics at the same time, via
aesthetics = c("colour", "fill").
values if colours should not be evenly positioned along the gradient this vector gives
the position (between 0 and 1) for each colour in the colours vector. See
rescale() for a convenience function to map an arbitrary range to between
0 and 1.
space colour space in which to calculate gradient. Must be "Lab" - other values are
deprecated.
na.value Missing values will be replaced with this value.
guide A function used to create a guide or its name. See guides() for more info.
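As a sketch of the option, begin, end and direction arguments, the following uses the discrete variant on a factor and the continuous variant scale_fill_viridis_c() on a numeric fill. The option and range values are illustrative choices.

```r
library(ggplot2)

set.seed(1)  # assumption: only here to make the sample reproducible
dsamp <- diamonds[sample(nrow(diamonds), 1000), ]

# Discrete viridis scale with a different colour map, using part of its range
d <- ggplot(dsamp, aes(carat, price)) +
  geom_point(aes(colour = clarity)) +
  scale_colour_viridis_d(option = "plasma", begin = 0.1, end = 0.9)

# Continuous variant for a numeric fill, with the colour order reversed
v <- ggplot(faithfuld, aes(waiting, eruptions, fill = density)) +
  geom_tile() +
  scale_fill_viridis_c(option = "magma", direction = -1)
```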
See Also
Other colour scales: scale_alpha, scale_colour_brewer, scale_colour_gradient, scale_colour_grey,
170 scale_continuous
# viridis is the default colour/fill scale for ordered factors
dsamp <- diamonds[sample(nrow(diamonds), 1000), ]
ggplot(dsamp, aes(carat, price)) +
geom_point(aes(colour = clarity))
scale_x_continuous() and scale_y_continuous() are the default scales for continuous x and y
aesthetics. There are three variants that set the trans argument for commonly used transformations:
scale_*_log10(), scale_*_sqrt() and scale_*_reverse().
scale_x_continuous(name = waiver(), breaks = waiver(),
minor_breaks = waiver(), labels = waiver(), limits = NULL,
expand = waiver(), oob = censor, na.value = NA_real_,
trans = "identity", position = "bottom", sec.axis = waiver())
name The name of the scale. Used as the axis or legend title. If waiver(), the default,
the name of the scale is taken from the first mapping used for that aesthetic. If
NULL, the legend title will be omitted.
breaks One of:
• NULL for no breaks
• waiver() for the default breaks computed by the transformation object
• A numeric vector of positions
• A function that takes the limits as input and returns breaks as output
minor_breaks One of:
• NULL for no minor breaks
• waiver() for the default breaks (one minor break between each major
break)
• A numeric vector of positions
• A function that given the limits returns a vector of minor breaks.
labels One of:
• NULL for no labels
• waiver() for the default labels computed by the transformation object
• A character vector giving labels (must be same length as breaks)
• A function that takes the breaks as input and returns labels as output
limits A numeric vector of length two providing limits of the scale. Use NA to refer to
the existing minimum or maximum.
expand Vector of range expansion constants used to add some padding around the data,
to ensure that they are placed some distance away from the axes. Use the
convenience function expand_scale() to generate the values for the expand
argument. The defaults are to expand the scale by 5% on each side for continuous
variables, and by 0.6 units on each side for discrete variables.
oob Function that handles limits outside of the scale limits (out of bounds). The
default replaces out of bounds values with NA.
na.value Missing values will be replaced with this value.
trans Either the name of a transformation object, or the object itself. Built-in
transformations include "asn", "atanh", "boxcox", "exp", "identity", "log", "log10",
"log1p", "log2", "logit", "probability", "probit", "reciprocal", "reverse" and "sqrt".
A transformation object bundles together a transform, its inverse, and methods
for generating breaks and labels. Transformation objects are defined in the scales
package, and are called name_trans, e.g. scales::boxcox_trans(). You can
create your own transformation with scales::trans_new().
position The position of the axis. "left" or "right" for vertical scales, "top" or "bottom"
for horizontal scales
sec.axis specify a secondary axis
... Other arguments passed on to scale_(x|y)_continuous()
For simple manipulation of labels and limits, you may wish to use labs() and lims() instead.
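The breaks, labels, limits and position arguments, together with one of the transformation shorthands, might be used as in this sketch. The axis title and label strings are illustrative.

```r
library(ggplot2)

p1 <- ggplot(mpg, aes(displ, hwy)) +
  geom_point()

# Explicit breaks with matching labels; NA in limits keeps the data maximum
p_breaks <- p1 + scale_x_continuous(
  name = "Engine displacement (L)",
  breaks = c(2, 4, 6),
  labels = c("two", "four", "six"),
  limits = c(1, NA)
)

# Labels computed by a function that receives the breaks
p_fun <- p1 + scale_y_continuous(labels = function(b) paste0(b, " mpg"))

# Shorthand variant that sets the transformation, with the axis moved to the top
p_log <- p1 + scale_x_log10(position = "top")
```

With a transformed scale the layer data holds the transformed values, so layer_data(p_log)$x contains log10(displ) rather than displ.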
See Also
sec_axis() for how to specify secondary axes
Other position scales: scale_x_date, scale_x_discrete
p1 <- ggplot(mpg, aes(displ, hwy)) +
  geom_point()
# you can also use the short hand functions `xlim()` and `ylim()`
p1 + xlim(2, 6)
These are the default scales for the three date/time classes. These will usually be added automatically.
To override manually, use scale_*_date for dates (class Date), scale_*_datetime for datetimes
(class POSIXct), and scale_*_time for times (class hms).
scale_x_date(name = waiver(), breaks = waiver(),
date_breaks = waiver(), labels = waiver(), date_labels = waiver(),
minor_breaks = waiver(), date_minor_breaks = waiver(),
limits = NULL, expand = waiver(), position = "bottom",
sec.axis = waiver())
name The name of the scale. Used as the axis or legend title. If waiver(), the default,
the name of the scale is taken from the first mapping used for that aesthetic. If
NULL, the legend title will be omitted.
breaks One of:
• NULL for no breaks
• waiver() for the breaks specified by date_breaks
• A Date/POSIXct vector giving positions of breaks
• A function that takes the limits as input and returns breaks as output
date_breaks A string giving the distance between breaks like "2 weeks", or "10 years". If
both breaks and date_breaks are specified, date_breaks wins.
labels One of:
• NULL for no labels
• waiver() for the default labels computed by the transformation object
• A character vector giving labels (must be same length as breaks)
• A function that takes the breaks as input and returns labels as output
date_labels A string giving the formatting specification for the labels. Codes are defined
in strftime(). If both labels and date_labels are specified, date_labels
wins.
scale_date 175
See Also
sec_axis() for how to specify secondary axes
Other position scales: scale_x_continuous, scale_x_discrete
last_month <- Sys.Date() - 0:29
df <- data.frame(
  date = last_month,
  price = runif(30)
)
base <- ggplot(df, aes(date, price)) +
  geom_line()
# Set limits
base + scale_x_date(limits = c(Sys.Date() - 7, NA))
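The date_breaks, date_labels and date_minor_breaks arguments described earlier for scale_x_date() can be sketched in the same setting. A fixed reference date replaces Sys.Date() here purely so that the example is reproducible. The format string uses strftime() codes.

```r
library(ggplot2)

# Assumption: a fixed date instead of Sys.Date(), for reproducibility
last_month <- as.Date("2018-10-25") - 0:29
set.seed(1)
df <- data.frame(date = last_month, price = runif(30))

base <- ggplot(df, aes(date, price)) +
  geom_line()

# One major break per week, one minor break per day,
# labelled with abbreviated month name and day of month
weekly <- base + scale_x_date(
  date_breaks = "1 week",
  date_minor_breaks = "1 day",
  date_labels = "%b %d"
)
```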
Use this set of scales when your data has already been scaled, i.e. it already represents aesthetic
values that ggplot2 can handle directly. These scales will not produce a legend unless you also
supply the breaks, labels, and type of guide you want.
scale_colour_identity(..., guide = "none", aesthetics = "colour")
... Other arguments passed on to discrete_scale() or continuous_scale()
guide Guide to use for this scale. Defaults to "none".
aesthetics Character string or vector of character strings listing the name(s) of the
aesthetic(s) that this scale works with. This can be useful, for example, to
apply colour settings to the colour and fill aesthetics at the same time, via
aesthetics = c("colour", "fill").
The functions scale_colour_identity(), scale_fill_identity(), scale_size_identity(),
etc. work on the aesthetics specified in the scale name: colour, fill, size, etc. However,
the functions scale_colour_identity() and scale_fill_identity() also have an optional
aesthetics argument that can be used to define both colour and fill aesthetic mappings via a
single function call. The functions scale_discrete_identity() and scale_continuous_identity()
scale_linetype 177
are generic scales that can work with any aesthetic or set of aesthetics provided via the aesthetics
argument.
df <- data.frame(
  x = 1:4,
  y = 1:4,
  colour = c("red", "green", "blue", "yellow")
)
ggplot(df, aes(x, y)) + geom_tile(aes(fill = colour))
ggplot(df, aes(x, y)) +
  geom_tile(aes(fill = colour)) +
  scale_fill_identity()
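Because identity scales produce no legend by default, a legend has to be requested explicitly. The sketch below supplies guide, breaks and labels together, as the description above requires. The label strings are hypothetical.

```r
library(ggplot2)

df <- data.frame(
  x = 1:4,
  y = 1:4,
  colour = c("red", "green", "blue", "yellow"),
  stringsAsFactors = FALSE
)

# The values in `colour` are used directly as fill colours.
# A legend is drawn only because guide, breaks and labels are all given.
tiles <- ggplot(df, aes(x, y)) +
  geom_tile(aes(fill = colour)) +
  scale_fill_identity(
    guide = "legend",
    breaks = df$colour,
    labels = c("a", "b", "c", "d")  # hypothetical labels
  )
```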
Default line types based on a set supplied by Richard Pearson, University of Manchester.
Continuous values cannot be mapped to line types.
scale_linetype(..., na.value = "blank")
... Arguments passed on to discrete_scale
palette A palette function that when called with a single integer argument (the
number of levels in the scale) returns the values that they should take.
breaks One of:
• NULL for no breaks
• waiver() for the default breaks computed by the transformation object
• A character vector of breaks
• A function that takes the limits as input and returns breaks as output
limits A character vector that defines possible values of the scale and their
order.
drop Should unused factor levels be omitted from the scale? The default, TRUE,
uses the levels that appear in the data; FALSE uses all the levels in the factor.
na.translate Unlike continuous scales, discrete scales can easily show missing
values, and do so by default. If you want to remove missing values from a
discrete scale, specify na.translate = FALSE.
aesthetics The names of the aesthetics that this scale works with
scale_name The name of the scale
name The name of the scale. Used as the axis or legend title. If waiver(), the
default, the name of the scale is taken from the first mapping used for that
aesthetic. If NULL, the legend title will be omitted.
labels One of:
• NULL for no labels
• waiver() for the default labels computed by the transformation object
• A character vector giving labels (must be same length as breaks)
• A function that takes the breaks as input and returns labels as output
guide A function used to create a guide or its name. See guides() for more
info.
super The super class to use for the constructed scale
na.value The linetype to use for NA values.
base <- ggplot(economics_long, aes(date, value01))
base + geom_line(aes(group = variable))
base + geom_line(aes(linetype = variable))
scale_manual 179
These functions allow you to specify your own set of mappings from levels in the data to aesthetic
values.
scale_colour_manual(..., values, aesthetics = "colour")
scale_size_manual(..., values)
scale_shape_manual(..., values)
scale_linetype_manual(..., values)
scale_alpha_manual(..., values)
... Arguments passed on to discrete_scale
palette A palette function that when called with a single integer argument (the
number of levels in the scale) returns the values that they should take.
breaks One of:
• NULL for no breaks
• waiver() for the default breaks computed by the transformation object
The functions scale_colour_manual(), scale_fill_manual(), scale_size_manual(), etc. work
on the aesthetics specified in the scale name: colour, fill, size, etc. However, the functions
scale_colour_manual() and scale_fill_manual() also have an optional aesthetics argument
that can be used to define both colour and fill aesthetic mappings via a single function call (see
examples). The function scale_discrete_manual() is a generic scale that can work with any
aesthetic or set of aesthetics provided via the aesthetics argument.
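A short sketch of a manual scale and of the generic scale_discrete_manual() follows. The named vector of colours and the shape codes are illustrative. Naming the values ties each one to a specific level regardless of the order in which the levels appear.

```r
library(ggplot2)

# Illustrative named vector: one colour per level of factor(cyl)
cols <- c("4" = "blue", "6" = "darkgreen", "8" = "red")

p <- ggplot(mtcars, aes(mpg, wt)) +
  geom_point(aes(colour = factor(cyl))) +
  scale_colour_manual(values = cols)

# Generic variant: the aesthetic is named explicitly
p_shape <- ggplot(mtcars, aes(mpg, wt)) +
  geom_point(aes(shape = factor(cyl))) +
  scale_discrete_manual(
    aesthetics = "shape",
    values = c("4" = 1, "6" = 16, "8" = 17)
  )
```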
p <- ggplot(mtcars, aes(mpg, wt)) +
geom_point(aes(colour = factor(cyl)))
scale_shape 181
# You can set color and fill aesthetics at the same time
# (`cols` must already hold one colour per level of factor(cyl))
ggplot(
  mtcars,
  aes(mpg, wt, colour = factor(cyl), fill = factor(cyl))
) +
  geom_point(shape = 21, alpha = 0.5, size = 2) +
  scale_colour_manual(
    values = cols,
    aesthetics = c("colour", "fill")
  )
# As with other scales you can use breaks to control the appearance
# of the legend.
p + scale_colour_manual(values = cols)
p + scale_colour_manual(
  values = cols,
  breaks = c("4", "6", "8"),
  labels = c("four", "six", "eight")
)
scale_shape maps discrete variables to six easily discernible shapes. If you have more than six
levels, you will get a warning message, and the seventh and subsequent levels will not appear on
the plot. Use scale_shape_manual() to supply your own values. You cannot map a continuous
variable to shape.
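The six-level limit can be worked around as sketched below. mpg$class has seven levels, so shapes are supplied by hand. The shape codes 1 to 7 are an arbitrary choice. The second plot shows the solid argument on a variable with few levels.

```r
library(ggplot2)

# mpg$class has seven levels, one more than the default shape palette covers
n_levels <- length(unique(mpg$class))

p <- ggplot(mpg, aes(displ, hwy)) +
  geom_point(aes(shape = class)) +
  scale_shape_manual(values = seq_len(n_levels))  # arbitrary shape codes

# Hollow instead of solid default shapes
p_hollow <- ggplot(mtcars, aes(mpg, wt)) +
  geom_point(aes(shape = factor(cyl))) +
  scale_shape(solid = FALSE)
```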
scale_shape(..., solid = TRUE)
... Arguments passed on to discrete_scale
palette A palette function that when called with a single integer argument (the
number of levels in the scale) returns the values that they should take.
dsmall <- diamonds[sample(nrow(diamonds), 100), ]
scale_size scales area, scale_radius scales radius. The size aesthetic is most commonly used
for points and text, and humans perceive the area of points (not their radius), so this provides for
optimal perception. scale_size_area ensures that a value of 0 is mapped to a size of 0.
scale_radius(name = waiver(), breaks = waiver(), labels = waiver(),
limits = NULL, range = c(1, 6), trans = "identity",
guide = "legend")
scale_size_area(..., max_size = 6)
name The name of the scale. Used as the axis or legend title. If waiver(), the default,
the name of the scale is taken from the first mapping used for that aesthetic. If
NULL, the legend title will be omitted.
breaks One of:
• NULL for no breaks
• waiver() for the default breaks computed by the transformation object
• A numeric vector of positions
• A function that takes the limits as input and returns breaks as output
labels One of:
• NULL for no labels
• waiver() for the default labels computed by the transformation object
• A character vector giving labels (must be same length as breaks)
• A function that takes the breaks as input and returns labels as output
limits A numeric vector of length two providing limits of the scale. Use NA to refer to
the existing minimum or maximum.
184 scale_size
range a numeric vector of length 2 that specifies the minimum and maximum size of
the plotting symbol after transformation.
trans Either the name of a transformation object, or the object itself. Built-in
transformations include "asn", "atanh", "boxcox", "exp", "identity", "log", "log10",
"log1p", "log2", "logit", "probability", "probit", "reciprocal", "reverse" and "sqrt".
A transformation object bundles together a transform, its inverse, and methods
for generating breaks and labels. Transformation objects are defined in the scales
package, and are called name_trans, e.g. scales::boxcox_trans(). You can
create your own transformation with scales::trans_new().
guide A function used to create a guide or its name. See guides() for more info.
... Arguments passed on to continuous_scale
name The name of the scale. Used as the axis or legend title. If waiver(), the
default, the name of the scale is taken from the first mapping used for that
aesthetic. If NULL, the legend title will be omitted.
breaks One of:
• NULL for no breaks
• waiver() for the default breaks computed by the transformation object
• A numeric vector of positions
• A function that takes the limits as input and returns breaks as output
minor_breaks One of:
• NULL for no minor breaks
• waiver() for the default breaks (one minor break between each major
break)
• A numeric vector of positions
• A function that given the limits returns a vector of minor breaks.
labels One of:
• NULL for no labels
• waiver() for the default labels computed by the transformation object
• A character vector giving labels (must be same length as breaks)
• A function that takes the breaks as input and returns labels as output
limits A numeric vector of length two providing limits of the scale. Use NA to
refer to the existing minimum or maximum.
oob Function that handles limits outside of the scale limits (out of bounds). The
default replaces out of bounds values with NA.
na.value Missing values will be replaced with this value.
trans Either the name of a transformation object, or the object itself. Built-in
transformations include "asn", "atanh", "boxcox", "exp", "identity", "log",
"log10", "log1p", "log2", "logit", "probability", "probit", "reciprocal",
"reverse" and "sqrt".
A transformation object bundles together a transform, its inverse, and meth-
ods for generating breaks and labels. Transformation objects are defined in
the scales package, and are called name_trans, e.g. scales::boxcox_trans().
You can create your own transformation with scales::trans_new().
guide A function used to create a guide or its name. See guides() for more
info.
scale_x_discrete 185
position The position of the axis. "left" or "right" for vertical scales, "top" or
"bottom" for horizontal scales
super The super class to use for the constructed scale
expand Vector of range expansion constants used to add some padding around
the data, to ensure that they are placed some distance away from the axes.
Use the convenience function expand_scale() to generate the values for
the expand argument. The defaults are to expand the scale by 5% on each
side for continuous variables, and by 0.6 units on each side for discrete
variables.
max_size Size of largest points.
See Also
scale_size_area() if you want 0 values to be mapped to points with size 0.
p <- ggplot(mpg, aes(displ, hwy, size = hwy)) +
  geom_point()
p + scale_size("Highway mpg")
p + scale_size(range = c(0, 10))
# If you want to map size to radius (usually bad idea), use scale_radius
p + scale_radius()
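The difference between scale_size() and scale_size_area() at zero can be checked on a small made-up data frame. In this sketch the column n and the max_size of 10 are assumptions chosen to keep the arithmetic easy to follow.

```r
library(ggplot2)

# Made-up data that includes a zero
df <- data.frame(x = 1:4, y = 1, n = c(0, 1, 4, 9))
base <- ggplot(df, aes(x, y, size = n)) +
  geom_point()

# Default scale: the smallest value gets the lower end of `range`, not 0
size_default <- layer_data(base + scale_size())$size

# Area scale: 0 maps to size 0, and area grows in proportion to the value
size_area <- layer_data(base + scale_size_area(max_size = 10))$size
```

Comparing size_default with size_area shows why scale_size_area() is the better fit when zero in the data should mean "nothing drawn".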
You can use continuous positions even with a discrete position scale - this allows you (e.g.) to place
labels between bars in a bar chart. Continuous positions are numeric values starting at one for the
first level, and increasing by one for each level (i.e. the labels are placed at integer positions). This
is what allows jittering to work.
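A sketch of a continuous position on a discrete scale: the levels of cut sit at x = 1 to 5, so a label at x = 1.5 lands between the first two bars. The label text and y position are illustrative.

```r
library(ggplot2)

# Levels of `cut` occupy x = 1, 2, ..., 5; x = 1.5 lies between the first two
bars <- ggplot(diamonds, aes(cut)) +
  geom_bar() +
  annotate("text", x = 1.5, y = 20000, label = "between Fair and Good")

# Positions of the bars and of the annotation after scaling
bar_x <- as.numeric(layer_data(bars, 1)$x)
text_x <- as.numeric(layer_data(bars, 2)$x)
```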
scale_x_discrete(..., expand = waiver(), position = "bottom")
See Also
ggplot(diamonds, aes(cut)) + geom_bar()
d + scale_x_discrete("Cut")
d + scale_x_discrete("Cut", labels = c("Fair" = "F","Good" = "G",
"Very Good" = "VG","Perfect" = "P","Ideal" = "I"))
# you can also use the short hand functions xlim and ylim
d + xlim("Fair","Ideal", "Good")
d + ylim("I1", "IF")
This vector field was produced from the data described in Brillinger, D.R., Preisler, H.K., Ager,
A.A. and Kie, J.G. "An exploratory data analysis (EDA) of the paths of moving animals".
J. Statistical Planning and Inference 122 (2004), 43-63, using the methods of Brillinger, D.R.,
"Learning a potential function from a trajectory", Signal Processing Letters. December (2007).
A data frame with 1155 rows and 4 variables
188 sec_axis
This function is used in conjunction with a position scale to create a secondary axis, positioned
opposite of the primary axis. All secondary axes must be based on a one-to-one transformation of
the primary axes.
sec_axis(trans = NULL, name = waiver(), breaks = waiver(),
labels = waiver())
trans A transformation formula
name The name of the secondary axis
breaks One of:
• NULL for no breaks
• waiver() for the default breaks computed by the transformation object
• A numeric vector of positions
• A function that takes the limits as input and returns breaks as output
labels One of:
• NULL for no labels
• waiver() for the default labels computed by the transformation object
• A character vector giving labels (must be same length as breaks)
• A function that takes the breaks as input and returns labels as output
sec_axis is used to create the specifications for a secondary axis. Except for the trans argument
any of the arguments can be set to derive() which would result in the secondary axis inheriting
the settings from the primary axis.
dup_axis is provided as a shorthand for creating a secondary axis that is a duplication of the primary
axis, effectively mirroring the primary axis.
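A minimal sketch of both helpers, assuming a scatterplot of mtcars. The conversion factor 0.425 (an approximate miles-per-gallon to kilometres-per-litre factor) and the axis titles are illustrative. What matters is that the formula is a one-to-one transformation of the primary axis.

```r
library(ggplot2)

p <- ggplot(mtcars, aes(cyl, mpg)) +
  geom_point()

# Secondary y axis derived from the primary axis by a one-to-one formula
p_sec <- p + scale_y_continuous(
  "Miles per gallon",
  sec.axis = sec_axis(~ . * 0.425, name = "Kilometres per litre (approx.)")
)

# Mirror the primary axis on the opposite side
p_dup <- p + scale_y_continuous(sec.axis = dup_axis())
```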
stat 189
p <- ggplot(mtcars, aes(cyl, mpg)) +
  geom_point()
Most aesthetics are mapped from variables found in the data. Sometimes, however, you want to
map from variables computed by the stat. The most common example of this is the height of
bars in geom_histogram(): the height does not come from a variable in the underlying data, but is
instead mapped to the count computed by stat_bin(). The stat() function is a flag to ggplot2
that you want to use calculated aesthetics produced by the statistic.
190 stat_ecdf
x An aesthetic expression using variables calculated by the stat.
This replaces the older approach of surrounding the variable name with .. (for example, ..count..).
# Default histogram display
ggplot(mpg, aes(displ)) +
geom_histogram(aes(y = stat(count)))
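Other computed variables, and expressions built from them, can be used in the same way. The sketch below maps bar height to the density computed by stat_bin(), and then to each bin's share of the total count. The choice of bins = 30 is an assumption made to keep the binning explicit.

```r
library(ggplot2)

# Bar height from the computed density instead of the raw count
h_density <- ggplot(mpg, aes(displ)) +
  geom_histogram(aes(y = stat(density)), bins = 30)

# Computed variables can also appear inside expressions:
# here each bar shows its bin's proportion of all observations
h_prop <- ggplot(mpg, aes(displ)) +
  geom_histogram(aes(y = stat(count / sum(count))), bins = 30)

prop_total <- sum(layer_data(h_prop)$y)
```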
The empirical cumulative distribution function (ECDF) provides an alternative visualisation of
distribution. Compared to other visualisations that rely on density (like geom_histogram()), the
ECDF doesn’t require any tuning parameters and handles both continuous and categorical
variables. The downside is that it requires more training to accurately interpret, and the underlying
visual tasks are somewhat more challenging.
stat_ecdf(mapping = NULL, data = NULL, geom = "step",
position = "identity", ..., n = NULL, pad = TRUE, na.rm = FALSE,
show.legend = NA, inherit.aes = TRUE)
mapping Set of aesthetic mappings created by aes() or aes_(). If specified and inherit.aes = TRUE
(the default), it is combined with the default mapping at the top level of the plot.
You must supply mapping if there is no plot mapping.
data The data to be displayed in this layer. There are three options:
If NULL, the default, the data is inherited from the plot data as specified in the
call to ggplot().
A data.frame, or other object, will override the plot data. All objects will be
fortified to produce a data frame. See fortify() for which variables will be
created.
A function will be called with a single argument, the plot data. The return
value must be a data.frame, and will be used as the layer data.
geom The geometric object to use to display the data
position Position adjustment, either as a string, or the result of a call to a position
adjustment function.
... Other arguments passed on to layer(). These are often aesthetics, used to set
an aesthetic to a fixed value, like colour = "red" or size = 3. They may also
be parameters to the paired geom/stat.
n If NULL, do not interpolate. If not NULL, this is the number of points to interpolate
with.
pad If TRUE, pad the ecdf with additional points (-Inf, 0) and (Inf, 1).
na.rm If FALSE (the default), removes missing values with a warning. If TRUE, silently
removes missing values.
show.legend logical. Should this layer be included in the legends? NA, the default, includes if
any aesthetics are mapped. FALSE never includes, and TRUE always includes. It
can also be a named logical vector to finely select the aesthetics to display.
inherit.aes If FALSE, overrides the default aesthetics, rather than combining with them.
This is most useful for helper functions that define both data and aesthetics and
shouldn’t inherit behaviour from the default plot specification, e.g. borders().
Computed variables
x x in data
y cumulative density corresponding to x
df <- data.frame(
  x = c(rnorm(100, 0, 3), rnorm(100, 0, 10)),
  g = gl(2, 100)
)
ggplot(df, aes(x)) + stat_ecdf(geom = "step")
# Multiple ECDFs
ggplot(df, aes(x, colour = g)) + stat_ecdf()
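To see what the computed variable y contains, the layer data can be compared with base R's ecdf(). This is a minimal sketch: the seed and the use of layer_data() for inspection are choices made here, not part of the original examples. pad = FALSE is set so that every x value is finite.

```r
library(ggplot2)

set.seed(1)
df <- data.frame(x = c(rnorm(100, 0, 3), rnorm(100, 0, 10)))

# pad = FALSE drops the (-Inf, 0) and (Inf, 1) points so every x is finite
p <- ggplot(df, aes(x)) + stat_ecdf(pad = FALSE)
computed <- layer_data(p)

# Compare the computed y with base R's empirical CDF at the same x values
reference <- ecdf(df$x)(computed$x)
all.equal(computed$y, reference)
```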
192 stat_ellipse
The method for calculating the ellipses has been modified from car::ellipse (Fox and Weisberg, 2011).
stat_ellipse(mapping = NULL, data = NULL, geom = "path",
position = "identity", ..., type = "t", level = 0.95,
segments = 51, na.rm = FALSE, show.legend = NA,
inherit.aes = TRUE)
mapping Set of aesthetic mappings created by aes() or aes_(). If specified and inherit.aes = TRUE
(the default), it is combined with the default mapping at the top level of the plot.
You must supply mapping if there is no plot mapping.
data The data to be displayed in this layer. There are three options:
If NULL, the default, the data is inherited from the plot data as specified in the
call to ggplot().
A data.frame, or other object, will override the plot data. All objects will be
fortified to produce a data frame. See fortify() for which variables will be
created.
A function will be called with a single argument, the plot data. The return
value must be a data.frame, and will be used as the layer data.
geom The geometric object to use to display the data.
position Position adjustment, either as a string, or the result of a call to a position adjust-
ment function.
... Other arguments passed on to layer(). These are often aesthetics, used to set
an aesthetic to a fixed value, like colour = "red" or size = 3. They may also
be parameters to the paired geom/stat.
type The type of ellipse. The default "t" assumes a multivariate t-distribution, and
"norm" assumes a multivariate normal distribution. "euclid" draws a circle
with the radius equal to level, representing the euclidean distance from the
center. This ellipse probably won’t appear circular unless coord_fixed() is
applied.
level The confidence level at which to draw an ellipse (default is 0.95), or, if type="euclid",
the radius of the circle to be drawn.
segments The number of segments to be used in drawing the ellipse.
na.rm If FALSE, the default, missing values are removed with a warning. If TRUE,
missing values are silently removed.
show.legend logical. Should this layer be included in the legends? NA, the default, includes if
any aesthetics are mapped. FALSE never includes, and TRUE always includes. It
can also be a named logical vector to finely select the aesthetics to display.
inherit.aes If FALSE, overrides the default aesthetics, rather than combining with them.
This is most useful for helper functions that define both data and aesthetics and
shouldn’t inherit behaviour from the default plot specification, e.g. borders().
John Fox and Sanford Weisberg (2011). An R Companion to Applied Regression, Second
Edition. Thousand Oaks CA: Sage. URL:
ggplot(faithful, aes(waiting, eruptions)) +
  geom_point() +
  stat_ellipse()
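The type, level and segments arguments described above can be combined with a grouping aesthetic to draw one ellipse per group. The following is a sketch only; the split at eruptions > 3 and the values level = 0.9 and segments = 40 are arbitrary choices made for illustration.

```r
library(ggplot2)

# Split the eruptions into two groups and draw one ellipse per group.
# type = "norm" assumes a multivariate normal distribution.
p <- ggplot(faithful, aes(waiting, eruptions, colour = eruptions > 3)) +
  geom_point() +
  stat_ellipse(type = "norm", level = 0.9, segments = 40)

# Layer 2 is the ellipse layer; each group gets its own path
ellipses <- layer_data(p, 2)
table(ellipses$group)
```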
This stat makes it easy to superimpose a function on top of an existing plot. The function is called
with a grid of evenly spaced values along the x axis, and the results are drawn (by default) with a
line.
stat_function(mapping = NULL, data = NULL, geom = "path",
position = "identity", ..., fun, xlim = NULL, n = 101,
args = list(), na.rm = FALSE, show.legend = NA,
inherit.aes = TRUE)
mapping Set of aesthetic mappings created by aes() or aes_(). If specified and inherit.aes = TRUE
(the default), it is combined with the default mapping at the top level of the plot.
You must supply mapping if there is no plot mapping.
data The data to be displayed in this layer. There are three options:
If NULL, the default, the data is inherited from the plot data as specified in the
call to ggplot().
A data.frame, or other object, will override the plot data. All objects will be
fortified to produce a data frame. See fortify() for which variables will be
created.
A function will be called with a single argument, the plot data. The return
value must be a data.frame, and will be used as the layer data.
geom The geometric object to use to display the data.
position Position adjustment, either as a string, or the result of a call to a position adjust-
ment function.
... Other arguments passed on to layer(). These are often aesthetics, used to set
an aesthetic to a fixed value, like colour = "red" or size = 3. They may also
be parameters to the paired geom/stat.
fun Function to use. Must be vectorised.
xlim Optionally, restrict the range of the function to this range.
n Number of points to interpolate along.
args List of additional arguments to pass to fun.
na.rm If FALSE, the default, missing values are removed with a warning. If TRUE,
missing values are silently removed.
show.legend logical. Should this layer be included in the legends? NA, the default, includes if
any aesthetics are mapped. FALSE never includes, and TRUE always includes. It
can also be a named logical vector to finely select the aesthetics to display.
inherit.aes If FALSE, overrides the default aesthetics, rather than combining with them.
This is most useful for helper functions that define both data and aesthetics and
shouldn’t inherit behaviour from the default plot specification, e.g. borders().
stat_function() understands the following aesthetics (required aesthetics are in bold):
• group
• y
Learn more about setting these aesthetics in vignette("ggplot2-specs").
Computed variables
x x’s along a grid
y value of function evaluated at corresponding x
df <- data.frame(
  x = rnorm(100)
)
x <- df$x
base <- ggplot(df, aes(x)) + geom_density()
base + stat_function(fun = dnorm, colour = "red")
base + stat_function(fun = dnorm, colour = "red", args = list(mean = 3))
# To specify a different mean or sd, use the args parameter to supply new values
ggplot(data.frame(x = c(-5, 5)), aes(x)) +
stat_function(fun = dnorm, args = list(mean = 2, sd = .5))
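The fun, args, xlim and n arguments also work with user-defined functions, as long as the function is vectorised. In the sketch below, parabola is a hypothetical function written for this illustration, and n = 5 is chosen only to keep the computed grid small enough to read.

```r
library(ggplot2)

# Hypothetical vectorised function: a parabola with parameters a and b
parabola <- function(x, a, b) a * x^2 + b

p <- ggplot(data.frame(x = c(-2, 2)), aes(x)) +
  stat_function(fun = parabola, args = list(a = 2, b = 1),
                xlim = c(-2, 2), n = 5)

# The stat evaluates fun on n evenly spaced x values within xlim
curve_points <- layer_data(p)
curve_points[, c("x", "y")]
```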
The identity statistic leaves the data unchanged.
stat_identity(mapping = NULL, data = NULL, geom = "point",
position = "identity", ..., show.legend = NA, inherit.aes = TRUE)
mapping Set of aesthetic mappings created by aes() or aes_(). If specified and inherit.aes = TRUE
(the default), it is combined with the default mapping at the top level of the plot.
You must supply mapping if there is no plot mapping.
data The data to be displayed in this layer. There are three options:
If NULL, the default, the data is inherited from the plot data as specified in the
call to ggplot().
A data.frame, or other object, will override the plot data. All objects will be
fortified to produce a data frame. See fortify() for which variables will be
created.
A function will be called with a single argument, the plot data. The return
value must be a data.frame, and will be used as the layer data.
geom The geometric object to use to display the data.
position Position adjustment, either as a string, or the result of a call to a position adjust-
ment function.
... Other arguments passed on to layer(). These are often aesthetics, used to set
an aesthetic to a fixed value, like colour = "red" or size = 3. They may also
be parameters to the paired geom/stat.
show.legend logical. Should this layer be included in the legends? NA, the default, includes if
any aesthetics are mapped. FALSE never includes, and TRUE always includes. It
can also be a named logical vector to finely select the aesthetics to display.
inherit.aes If FALSE, overrides the default aesthetics, rather than combining with them.
This is most useful for helper functions that define both data and aesthetics and
shouldn’t inherit behaviour from the default plot specification, e.g. borders().
p <- ggplot(mtcars, aes(wt, mpg))
p + stat_identity()
stat_sf_coordinates() extracts the coordinates from ’sf’ objects and summarises them to one
pair of coordinates (x and y) per geometry. This is convenient when you draw an sf object as geoms
like text and labels (so geom_sf_text() and geom_sf_label() rely on this).
stat_sf_coordinates(mapping = aes(), data = NULL, geom = "point",
position = "identity", na.rm = FALSE, show.legend = NA,
inherit.aes = TRUE, fun.geometry = NULL, ...)
mapping Set of aesthetic mappings created by aes() or aes_(). If specified and inherit.aes = TRUE
(the default), it is combined with the default mapping at the top level of the plot.
You must supply mapping if there is no plot mapping.
data The data to be displayed in this layer. There are three options:
If NULL, the default, the data is inherited from the plot data as specified in the
call to ggplot().
A data.frame, or other object, will override the plot data. All objects will be
fortified to produce a data frame. See fortify() for which variables will be
created.
A function will be called with a single argument, the plot data. The return
value must be a data.frame, and will be used as the layer data.
geom The geometric object to use to display the data.
position Position adjustment, either as a string, or the result of a call to a position adjust-
ment function.
na.rm If FALSE, the default, missing values are removed with a warning. If TRUE,
missing values are silently removed.
show.legend logical. Should this layer be included in the legends? NA, the default, includes if
any aesthetics are mapped. FALSE never includes, and TRUE always includes. It
can also be a named logical vector to finely select the aesthetics to display.
inherit.aes If FALSE, overrides the default aesthetics, rather than combining with them.
This is most useful for helper functions that define both data and aesthetics and
shouldn’t inherit behaviour from the default plot specification, e.g. borders().
fun.geometry A function that takes an sfc object and returns an sfc_POINT with the same length
as the input. If NULL, function(x) sf::st_point_on_surface(sf::st_zm(x))
will be used. Note that the function may warn about the incorrectness of the
result if the data is not projected, but you can ignore this except when you really
care about the exact locations.
... Other arguments passed on to layer(). These are often aesthetics, used to set
an aesthetic to a fixed value, like colour = "red" or size = 3. They may also
be parameters to the paired geom/stat.
Coordinates of an sf object can be retrieved by sf::st_coordinates(). However, we cannot simply
use sf::st_coordinates() because, whereas text and labels require exactly one coordinate per
geometry, it returns multiple ones for a polygon or a line. Thus, these two steps are needed:
1. Choose one point per geometry by some function like sf::st_centroid() or sf::st_point_on_surface().
2. Retrieve coordinates from the points by sf::st_coordinates().
For the first step, you can use an arbitrary function via fun.geometry. By default,
function(x) sf::st_point_on_surface(sf::st_zm(x)) is used; sf::st_point_on_surface() seems more
appropriate than sf::st_centroid() since labels and text usually are intended to be put within
the polygon or the line. sf::st_zm() is needed to drop the Z and M dimensions beforehand;
otherwise sf::st_point_on_surface() may fail when the geometries have an M dimension.
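The two steps can be carried out by hand with sf, which may help when checking where labels will be placed. This is a sketch that assumes the sf package and its bundled nc.shp example file are available; it mirrors the default fun.geometry described above rather than reproducing the stat's internal code.

```r
if (requireNamespace("sf", quietly = TRUE)) {
  nc <- sf::st_read(system.file("shape/nc.shp", package = "sf"), quiet = TRUE)
  geom <- sf::st_geometry(nc)

  # Step 1: one point per geometry, after dropping any Z and M dimensions.
  # This may warn when the data is in unprojected (longitude/latitude) coordinates.
  points <- sf::st_point_on_surface(sf::st_zm(geom))

  # Step 2: extract the coordinates as a matrix with one row per geometry
  xy <- sf::st_coordinates(points)
  head(xy)
}
```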
Computed variables
if (requireNamespace("sf", quietly = TRUE)) {
nc <- sf::st_read(system.file("shape/nc.shp", package="sf"))
ggplot(nc) +
  stat_sf_coordinates()
ggplot(nc) +
  geom_errorbarh(
    aes(geometry = geometry,
        xmin = stat(x) - 0.1,
        xmax = stat(x) + 0.1,
        y = stat(y),
        height = 0.04),
    stat = "sf_coordinates"
  )
}
mapping Set of aesthetic mappings created by aes() or aes_(). If specified and inherit.aes = TRUE
(the default), it is combined with the default mapping at the top level of the plot.
You must supply mapping if there is no plot mapping.
data The data to be displayed in this layer. There are three options:
If NULL, the default, the data is inherited from the plot data as specified in the
call to ggplot().
A data.frame, or other object, will override the plot data. All objects will be
fortified to produce a data frame. See fortify() for which variables will be
created.
A function will be called with a single argument, the plot data. The return
value must be a data.frame, and will be used as the layer data.
geom The geometric object to use to display the data.
position Position adjustment, either as a string, or the result of a call to a position adjust-
ment function.
... Other arguments passed on to layer(). These are often aesthetics, used to set
an aesthetic to a fixed value, like colour = "red" or size = 3. They may also
be parameters to the paired geom/stat.
bins Numeric vector giving number of bins in both vertical and horizontal directions.
Set to 30 by default.
binwidth Numeric vector giving bin width in both vertical and horizontal directions.
Overrides bins if both set.
drop Drop if the output of fun is NA.
fun Function for summary.
fun.args A list of extra arguments to pass to fun.
na.rm If FALSE, the default, missing values are removed with a warning. If TRUE,
missing values are silently removed.
show.legend logical. Should this layer be included in the legends? NA, the default, includes if
any aesthetics are mapped. FALSE never includes, and TRUE always includes. It
can also be a named logical vector to finely select the aesthetics to display.
inherit.aes If FALSE, overrides the default aesthetics, rather than combining with them.
This is most useful for helper functions that define both data and aesthetics and
shouldn’t inherit behaviour from the default plot specification, e.g. borders().
• x: horizontal position
• y: vertical position
• z: value passed to the summary function
Computed variables
x,y Location
value Value of summary statistic.
See Also
stat_summary_hex() for hexagonal summarization. stat_bin2d() for the binning options.
d <- ggplot(diamonds, aes(carat, depth, z = price))
d + stat_summary_2d()
# Specifying function
d + stat_summary_2d(fun = function(x) sum(x^2))
d + stat_summary_2d(fun = var)
d + stat_summary_2d(fun = "quantile", fun.args = list(probs = 0.1))
if (requireNamespace("hexbin")) {
d + stat_summary_hex()
}
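The fun, fun.args and bins arguments can be combined, and the result appears in the computed variable value. A minimal sketch, in which the median summary and the 10 x 10 grid are arbitrary choices for illustration:

```r
library(ggplot2)

# Median price per bin on a coarse 10 x 10 grid; na.rm is forwarded to median()
p <- ggplot(diamonds, aes(carat, depth, z = price)) +
  stat_summary_2d(fun = median, fun.args = list(na.rm = TRUE), bins = 10)

# The summary statistic is returned in the computed variable `value`
binned <- layer_data(p)
summary(binned$value)
```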
stat_summary operates on unique x; stat_summary_bin operates on binned x. They are more
flexible versions of stat_bin(): instead of just counting, they can compute any aggregate.
stat_summary_bin(mapping = NULL, data = NULL, geom = "pointrange",
position = "identity", ..., fun.data = NULL, fun.y = NULL,
fun.ymax = NULL, fun.ymin = NULL, fun.args = list(), bins = 30,
binwidth = NULL, breaks = NULL, na.rm = FALSE, show.legend = NA,
inherit.aes = TRUE)
mapping Set of aesthetic mappings created by aes() or aes_(). If specified and inherit.aes = TRUE
(the default), it is combined with the default mapping at the top level of the plot.
You must supply mapping if there is no plot mapping.
data The data to be displayed in this layer. There are three options:
If NULL, the default, the data is inherited from the plot data as specified in the
call to ggplot().
A data.frame, or other object, will override the plot data. All objects will be
fortified to produce a data frame. See fortify() for which variables will be
created.
A function will be called with a single argument, the plot data. The return
value must be a data.frame, and will be used as the layer data.
geom Use to override the default connection between geom_histogram()/geom_freqpoly()
and stat_bin().
position Position adjustment, either as a string, or the result of a call to a position adjust-
ment function.
... Other arguments passed on to layer(). These are often aesthetics, used to set
an aesthetic to a fixed value, like colour = "red" or size = 3. They may also
be parameters to the paired geom/stat.
fun.data A function that is given the complete data and should return a data frame with
variables ymin, y, and ymax.
fun.ymin, fun.y, fun.ymax
Alternatively, supply three individual functions that are each passed a vector of
x’s and should return a single number.
fun.args Optional additional arguments passed on to the functions.
bins Number of bins. Overridden by binwidth. Defaults to 30.
binwidth The width of the bins. Can be specified as a numeric value, or a function that
calculates width from x. The default is to use bins bins that cover the range of
the data. You should always override this value, exploring multiple widths to
find the best to illustrate the stories in your data.
The bin width of a date variable is the number of days in each time; the bin
width of a time variable is the number of seconds.
breaks Alternatively, you can supply a numeric vector giving the bin boundaries. Over-
rides binwidth, bins, center, and boundary.
na.rm If FALSE, the default, missing values are removed with a warning. If TRUE,
missing values are silently removed.
show.legend logical. Should this layer be included in the legends? NA, the default, includes if
any aesthetics are mapped. FALSE never includes, and TRUE always includes. It
can also be a named logical vector to finely select the aesthetics to display.
inherit.aes If FALSE, overrides the default aesthetics, rather than combining with them.
This is most useful for helper functions that define both data and aesthetics and
shouldn’t inherit behaviour from the default plot specification, e.g. borders().
stat_summary() understands the following aesthetics (required aesthetics are in bold):
• x
• y
• group
Learn more about setting these aesthetics in vignette("ggplot2-specs").
Summary functions
You can either supply summary functions individually (fun.y, fun.ymax, fun.ymin), or as a single
function (fun.data):
fun.data Complete summary function. Should take numeric vector as input and return data frame
as output
fun.ymin ymin summary function (should take numeric vector and return single number)
fun.y y summary function (should take numeric vector and return single number)
fun.ymax ymax summary function (should take numeric vector and return single number)
A simple vector function is easiest to work with as you can return a single number, but is somewhat
less flexible. If your summary function computes multiple values at once (e.g. ymin and ymax), use
fun.data.
If no aggregation functions are supplied, will default to mean_se().
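A complete summary function of the kind described above might look like the following sketch. median_iqr is a hypothetical helper written for this illustration; it returns the median together with the interquartile range as ymin and ymax.

```r
library(ggplot2)

# Hypothetical summary function: median with the interquartile range.
# It takes a numeric vector and returns a one-row data frame.
median_iqr <- function(v) {
  q <- stats::quantile(v, c(0.25, 0.5, 0.75), names = FALSE)
  data.frame(ymin = q[1], y = q[2], ymax = q[3])
}

p <- ggplot(mtcars, aes(cyl, mpg)) +
  stat_summary(fun.data = median_iqr)

# One summary row per unique x value (4, 6 and 8 cylinders)
summaries <- layer_data(p)
summaries[order(summaries$x), c("x", "ymin", "y", "ymax")]
```

Returning all three values from one function keeps them consistent with each other, which is harder to guarantee with three separate functions.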
See Also
geom_errorbar(), geom_pointrange(), geom_linerange(), geom_crossbar() for geoms to
display summarised data
d <- ggplot(mtcars, aes(cyl, mpg)) + geom_point()
d + stat_summary(fun.data = "mean_cl_boot", colour = "red", size = 2)
# Don't use ylim to zoom into a summary plot - this throws the
# data away
p <- ggplot(mtcars, aes(cyl, mpg)) +
stat_summary(fun.y = "mean", geom = "point")
p + ylim(15, 30)
# Instead use coord_cartesian
p + coord_cartesian(ylim = c(15, 30))
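The difference between the two approaches can be checked numerically by comparing the computed means. This sketch uses mean_se(), the default summary, instead of the fun.y argument shown above; suppressWarnings() is added here only to silence the expected message about removed rows.

```r
library(ggplot2)

# mean_se() is the default summary; it is passed explicitly here for clarity
p <- ggplot(mtcars, aes(cyl, mpg)) +
  stat_summary(fun.data = mean_se)

# ylim() removes out-of-range observations before the means are computed
clipped <- suppressWarnings(layer_data(p + ylim(15, 30)))
clipped <- clipped[order(clipped$x), ]

# coord_cartesian() only zooms, so the means use every observation
zoomed <- layer_data(p + coord_cartesian(ylim = c(15, 30)))
zoomed <- zoomed[order(zoomed$x), ]

data.frame(cyl = zoomed$x, clipped = clipped$y, zoomed = zoomed$y)
```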
Remove duplicates
mapping Set of aesthetic mappings created by aes() or aes_(). If specified and inherit.aes = TRUE
(the default), it is combined with the default mapping at the top level of the plot.
You must supply mapping if there is no plot mapping.
data The data to be displayed in this layer. There are three options:
If NULL, the default, the data is inherited from the plot data as specified in the
call to ggplot().
A data.frame, or other object, will override the plot data. All objects will be
fortified to produce a data frame. See fortify() for which variables will be
created.
A function will be called with a single argument, the plot data. The return
value must be a data.frame, and will be used as the layer data.
geom The geometric object to use to display the data.
position Position adjustment, either as a string, or the result of a call to a position adjust-
ment function.
... Other arguments passed on to layer(). These are often aesthetics, used to set
an aesthetic to a fixed value, like colour = "red" or size = 3. They may also
be parameters to the paired geom/stat.
na.rm If FALSE, the default, missing values are removed with a warning. If TRUE,
missing values are silently removed.
show.legend logical. Should this layer be included in the legends? NA, the default, includes if
any aesthetics are mapped. FALSE never includes, and TRUE always includes. It
can also be a named logical vector to finely select the aesthetics to display.
inherit.aes If FALSE, overrides the default aesthetics, rather than combining with them.
This is most useful for helper functions that define both data and aesthetics and
shouldn’t inherit behaviour from the default plot specification, e.g. borders().
• group
ggplot(mtcars, aes(vs, am)) +
geom_point(alpha = 0.1)
ggplot(mtcars, aes(vs, am)) +
geom_point(alpha = 0.1, stat = "unique")
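The effect of the stat is easier to see in the layer data than in the plot itself. As a sketch, using layer_data() purely for inspection:

```r
library(ggplot2)

all_points <- layer_data(
  ggplot(mtcars, aes(vs, am)) + geom_point()
)
unique_points <- layer_data(
  ggplot(mtcars, aes(vs, am)) + geom_point(stat = "unique")
)

# Rows drawn by each layer: every observation versus distinct (vs, am) pairs
c(all = nrow(all_points), unique = nrow(unique_points))
```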
summarise_plot 205
These functions provide summarised information about built ggplot objects.
p A ggplot_built object.
There are three types of summary that can be obtained: A summary of the plot layout, a summary
of the plot coord, and a summary of plot layers.
Layout summary
The function summarise_layout() returns a table that provides information about the plot panel(s)
in the built plot. The table has the following columns:
Importantly, the values for xmin, xmax, ymin, ymax, xscale, and yscale are determined by the
variables that are mapped to x and y in the aes() call. So even if a coord changes how x and y are
shown in the final plot (as is the case for coord_flip() or coord_polar()), these changes have
no effect on the results returned by summarise_layout().
Coord summary
The function summarise_coord() returns information about the log base for coordinates that are
log-transformed in coord_trans(), and it also indicates whether the coord has flipped the x and y
axes.
Layer summary
The function summarise_layers() returns a table with a single column, mapping, which contains
information about aesthetic mapping for each layer.
p <- ggplot(mpg, aes(displ, hwy)) + geom_point() + facet_wrap(~class)
b <- ggplot_build(p)
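The three summary functions described above can then be applied to the built plot. A short sketch; the variable names layout, coord and layers are chosen here for illustration.

```r
library(ggplot2)

p <- ggplot(mpg, aes(displ, hwy)) + geom_point() + facet_wrap(~class)
b <- ggplot_build(p)

layout <- summarise_layout(b)   # one row per panel
coord  <- summarise_coord(b)    # log bases and whether x and y are flipped
layers <- summarise_layers(b)   # one row per layer, with a `mapping` column

nrow(layout)
names(layers)
```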
Themes are a powerful way to customize the non-data components of your plots: i.e. titles, labels,
fonts, background, gridlines, and legends. Themes can be used to give plots a consistent customized
look. Modify a single plot’s theme using theme(); see theme_update() if you want to modify the
active theme, to affect all subsequent plots. Theme elements are documented together according to
inheritance; read more about theme inheritance below.
theme(line, rect, text, title, aspect.ratio, axis.title, axis.title.x,
axis.title.x.top, axis.title.x.bottom, axis.title.y, axis.title.y.left,
axis.title.y.right, axis.text, axis.text.x, axis.text.x.top,
axis.text.x.bottom, axis.text.y, axis.text.y.left, axis.text.y.right,
axis.ticks, axis.ticks.x, axis.ticks.x.top, axis.ticks.x.bottom,
axis.ticks.y, axis.ticks.y.left, axis.ticks.y.right, axis.ticks.length,
axis.line, axis.line.x, axis.line.x.top, axis.line.x.bottom, axis.line.y,
axis.line.y.left, axis.line.y.right, legend.background, legend.margin,
legend.spacing, legend.spacing.x, legend.spacing.y, legend.key,
legend.key.size, legend.key.height, legend.key.width, legend.text,
legend.text.align, legend.title, legend.title.align, legend.position,
legend.direction, legend.justification, legend.box, legend.box.just,
legend.box.margin, legend.box.background, legend.box.spacing,
panel.background, panel.border, panel.spacing, panel.spacing.x,
line all line elements (element_line())
rect all rectangular elements (element_rect())
text all text elements (element_text())
title all title elements: plot, axes, legends (element_text(); inherits from text)
aspect.ratio aspect ratio of the panel
axis.title, axis.title.x, axis.title.y, axis.title.x.top, axis.title.x.bottom, axis.title.y.left, axis.title.y.right
labels of axes (element_text()). Specify all axes’ labels (axis.title), labels
by plane (using axis.title.x or axis.title.y), or individually for each
axis (using axis.title.x.bottom, axis.title.x.top, axis.title.y.left,
axis.title.y.right). axis.title.*.* inherits from axis.title.* which
inherits from axis.title, which in turn inherits from text
axis.text, axis.text.x, axis.text.y, axis.text.x.top, axis.text.x.bottom, axis.text.y.left, axis.text.y.right
tick labels along axes (element_text()). Specify all axis tick labels (axis.text),
tick labels by plane (using axis.text.x or axis.text.y), or individually for
each axis (using axis.text.x.bottom, axis.text.x.top, axis.text.y.left,
axis.text.y.right). axis.text.*.* inherits from axis.text.* which
inherits from axis.text, which in turn inherits from text
axis.ticks, axis.ticks.x, axis.ticks.x.top, axis.ticks.x.bottom, axis.ticks.y, axis.ticks.y.left, axis.ticks.y.right
tick marks along axes (element_line()). Specify all tick marks (axis.ticks),
ticks by plane (using axis.ticks.x or axis.ticks.y), or individually for each
axis (using axis.ticks.x.bottom, axis.ticks.x.top, axis.ticks.y.left,
axis.ticks.y.right). axis.ticks.*.* inherits from axis.ticks.* which
inherits from axis.ticks, which in turn inherits from line
axis.ticks.length
length of tick marks (unit)
axis.line, axis.line.x, axis.line.x.top, axis.line.x.bottom, axis.line.y, axis.line.y.left, axis.line.y.right
lines along axes (element_line()). Specify lines along all axes (axis.line),
lines for each plane (using axis.line.x or axis.line.y), or individually for
each axis (using axis.line.x.bottom, axis.line.x.top, axis.line.y.left,
axis.line.y.right). axis.line.*.* inherits from axis.line.* which
inherits from axis.line, which in turn inherits from line
legend.background
background of legend (element_rect(); inherits from rect)
legend.margin the margin around each legend (margin())
Theme inheritance
Theme elements inherit properties from other theme elements hierarchically. For example, axis.title.x.bottom
inherits from axis.title.x which inherits from axis.title, which in turn inherits from text.
All text elements inherit directly or indirectly from text; all lines inherit from line, and all
rectangular objects inherit from rect. This means that you can modify the appearance of multiple
elements by setting a single high-level component.
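Inheritance can be observed with calc_element(), which resolves an element by combining it with its parents. In this sketch only the root text element is changed, and the colour is then looked up on a descendant; theme_grey() is used as the starting theme purely for illustration.

```r
library(ggplot2)

# Change only the root `text` element on top of a complete theme
th <- theme_grey() + theme(text = element_text(colour = "red"))

# calc_element() resolves an element by walking up its inheritance chain
resolved <- calc_element("axis.title.x", th)
resolved$colour
```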
Learn more about setting these aesthetics in vignette("ggplot2-specs").
See Also
+.gg() and %+replace%, element_blank(), element_line(), element_rect(), and element_text()
for details of the specific theme elements.
p1 <- ggplot(mtcars, aes(wt, mpg)) +
geom_point() +
labs(title = "Fuel economy declines as weight increases")
# Plot ---------------------------------------------------------------------
p1 + theme(plot.title = element_text(size = rel(2)))
p1 + theme(plot.background = element_rect(fill = "green"))
# Panels --------------------------------------------------------------------
# Axes ----------------------------------------------------------------------
p1 + theme(axis.line = element_line(size = 3, colour = "grey80"))
p1 + theme(axis.text = element_text(colour = "blue"))
p1 + theme(axis.ticks = element_line(size = 2))
p1 + theme(axis.ticks.length = unit(.25, "cm"))
p1 + theme(axis.title.y = element_text(size = rel(1.5), angle = 90))
# Legend --------------------------------------------------------------------
p2 <- ggplot(mtcars, aes(wt, mpg)) +
  geom_point(aes(colour = factor(cyl), shape = factor(vs)))
# Position
p2 + theme(legend.position = "none")
p2 + theme(legend.justification = "top")
p2 + theme(legend.position = "bottom")
# Or place legends inside the plot using relative coordinates between 0 and 1
# legend.justification sets the corner that the position refers to
p2 + theme(
  legend.position = c(.95, .95),
  legend.justification = c("right", "top"),
  legend.box.just = "right",
  legend.margin = margin(6, 6, 6, 6)
)
# Strips --------------------------------------------------------------------
The current/active theme is automatically applied to every plot you draw. Use theme_get to get the
current theme, and theme_set to completely override it. theme_update and theme_replace are
shorthands for changing individual elements.
e1 %+replace% e2
theme_set, theme_update, and theme_replace invisibly return the previous theme so you can
easily save it, then later restore it.
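The save-and-restore pattern this enables can be sketched as follows. theme_bw() is an arbitrary choice of temporary theme, and the variable names are illustrative.

```r
library(ggplot2)

# theme_set() returns the theme that was active before the call
old <- theme_set(theme_bw())

# ... plots drawn here would use theme_bw() ...

# Restore the saved theme; this call in turn returns the theme it replaced
replaced <- theme_set(old)
restored <- theme_get()
```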
Adding on to a theme
See Also
p <- ggplot(mtcars, aes(mpg, wt)) +
  geom_point()
Information about the housing market in Texas provided by the TAMU real estate center, http:
A data frame with 8602 observations and 9 variables:
Just like aes(), vars() is a quoting function that takes inputs to be evaluated in the context of a
dataset. These inputs can be:
• variable names
• complex expressions
In both cases, the results (the vectors that the variable represents or the results of the expressions)
are used to form faceting groups.
... Variables or expressions automatically quoted. These are evaluated in the
context of the data to form faceting groups. Can be named (the names are passed to
a labeller).
See Also
aes(), facet_wrap(), facet_grid()
p <- ggplot(mtcars, aes(wt, disp)) + geom_point()
p + facet_wrap(vars(vs, am))
# You can also supply expressions to vars(). In this case it's often a
# Now let's unquote everything at the right place. Note that we also
# unquote `n` just in case the data frame has a column named
# `n`. The latter would have precedence over our local variable
# because the data is always masking the environment.
wrap_by(!!nm := cut_number(!!var, !!n))
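As a smaller illustration of naming an expression in vars(), the sketch below facets on a logical expression and gives it a name. The name heavy and the threshold of 3 are hypothetical choices; label_both is used so that the name is visible in the strip labels.

```r
library(ggplot2)

p <- ggplot(mtcars, aes(wt, disp)) + geom_point()

# A named expression: the name `heavy` is what gets passed to the labeller
facets <- vars(heavy = wt > 3)
names(facets)

# label_both shows the name alongside each value in the strip
faceted <- p + facet_wrap(facets, labeller = label_both)
```

Printing faceted draws the plot with one panel per value of the expression.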