library(ggplot2)
# Sample data
set.seed(007)
y <- round(rnorm(300), 1)
df <- data.frame(y = y,
                 group = sample(c("G1", "G2", "G3"), size = 300, replace = TRUE))
head(df)
##      y group
## 1  2.3    G3
## 2 -1.2    G1
## 3 -0.7    G2
## 4 -0.4    G2
## 5 -1.0    G2
## 6 -0.9    G3
Adding jittered points (a strip chart) to a box plot in ggplot2 is useful for seeing the underlying distribution of the data. To do so, add a geom_jitter layer on top of geom_boxplot.
# Box plot with jittered points
ggplot(df, aes(x = "", y = y)) +
  geom_boxplot() +
  geom_jitter()
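Because geom_jitter adds random noise, the points land in slightly different places every time the plot is drawn. A minimal sketch of one way to make the figure reproducible is to pass a position_jitter object with a fixed seed. The seed value 1 and the object name p_seed are arbitrary choices for illustration, and the sample data is re-created so the block runs on its own.

```r
library(ggplot2)

# Same sample data as above, re-created so this block is self-contained
set.seed(7)
df <- data.frame(y = round(rnorm(300), 1),
                 group = sample(c("G1", "G2", "G3"), size = 300, replace = TRUE))

# Fixing the jitter seed (1 is an arbitrary value) keeps the point
# positions identical across redraws
p_seed <- ggplot(df, aes(x = "", y = y)) +
  geom_boxplot() +
  geom_jitter(position = position_jitter(seed = 1))
p_seed
```

When position is supplied this way, width and height should be set inside position_jitter rather than in geom_jitter.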
It is good practice to hide the outliers of the box plot with outlier.shape = NA, because geom_jitter already draws every observation and the outliers would otherwise appear twice.
# Box plot without outlier markers
ggplot(df, aes(x = "", y = y)) +
  geom_boxplot(outlier.shape = NA) +
  geom_jitter()
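To see which observations would be drawn twice, the outliers computed by the box plot layer can be inspected. This is only a sketch: it relies on layer_data from ggplot2 returning the computed statistics of a layer, and the names p_box and outliers are hypothetical.

```r
library(ggplot2)

# Same sample data as above, re-created so this block is self-contained
set.seed(7)
df <- data.frame(y = round(rnorm(300), 1),
                 group = sample(c("G1", "G2", "G3"), size = 300, replace = TRUE))

# Box plot with the default outlier markers
p_box <- ggplot(df, aes(x = "", y = y)) +
  geom_boxplot()

# The first layer stores the outlying values in a list column
outliers <- layer_data(p_box, 1)$outliers[[1]]

# Number of points that geom_jitter would draw a second time
length(outliers)
```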
Flip the axes with coord_flip or by swapping the variables inside aes.
# Horizontal box plot
ggplot(df, aes(x = "", y = y)) +
  geom_boxplot(outlier.shape = NA) +
  geom_jitter() +
  coord_flip()
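The second option, swapping the variables inside aes, could look like the sketch below. It assumes ggplot2 3.3.0 or later, where geom_boxplot detects the orientation from the mapped variables; the name p_flip is hypothetical.

```r
library(ggplot2)

# Same sample data as above, re-created so this block is self-contained
set.seed(7)
df <- data.frame(y = round(rnorm(300), 1),
                 group = sample(c("G1", "G2", "G3"), size = 300, replace = TRUE))

# The continuous variable goes on x and the dummy category on y,
# so no coord_flip() is needed
p_flip <- ggplot(df, aes(x = y, y = "")) +
  geom_boxplot(outlier.shape = NA) +
  geom_jitter()
p_flip
```

Note that with this orientation the spread of the points across the box is controlled by the height argument of geom_jitter instead of width.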
The jittered points can be customized in several ways. You can change the amount of horizontal jitter with width, as well as the color, shape and size of the points, as shown in the examples below.
# Jitter width
ggplot(df, aes(x = "", y = y)) +
  geom_boxplot(outlier.shape = NA) +
  geom_jitter(width = 0.2)
# Color of the points
ggplot(df, aes(x = "", y = y)) +
  geom_boxplot(outlier.shape = NA) +
  geom_jitter(colour = 2)
# Shape and size of the points
ggplot(df, aes(x = "", y = y)) +
  geom_boxplot(outlier.shape = NA) +
  geom_jitter(shape = 3, size = 1.2)
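By default geom_jitter also adds a small amount of vertical noise, which moves the points slightly away from their true values. The following sketch sets height = 0 so that only the horizontal position is jittered, and combines it with the other arguments. The alpha value is an extra added here for illustration to reduce overplotting, and the name p_custom is hypothetical.

```r
library(ggplot2)

# Same sample data as above, re-created so this block is self-contained
set.seed(7)
df <- data.frame(y = round(rnorm(300), 1),
                 group = sample(c("G1", "G2", "G3"), size = 300, replace = TRUE))

# height = 0 leaves the y values untouched; width, colour, shape and size
# are combined in a single call, and alpha makes overlapping points visible
p_custom <- ggplot(df, aes(x = "", y = y)) +
  geom_boxplot(outlier.shape = NA) +
  geom_jitter(width = 0.2, height = 0,
              colour = 2, shape = 3, size = 1.2, alpha = 0.6)
p_custom
```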
If you have a categorical variable representing groups, you can create a box plot by group, add the points to each group and customize their color, size and shape.
# Box plot by group with jitter
ggplot(df, aes(x = group, y = y)) +
  geom_boxplot(outlier.shape = NA) +
  geom_jitter()
# Box plot by group with points in a single color
ggplot(df, aes(x = group, y = y)) +
  geom_boxplot(outlier.shape = NA) +
  geom_jitter(colour = 2)
# Box plot by group with boxes and points colored by group
ggplot(df, aes(x = group, y = y, colour = group)) +
  geom_boxplot(outlier.shape = NA) +
  geom_jitter()
# Box plot by group with color and point shape by group
ggplot(df, aes(x = group, y = y,
               colour = group,
               shape = group)) +
  geom_boxplot(outlier.shape = NA) +
  geom_jitter()
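Mapping colour inside ggplot() colors the boxes as well as the points. If only the points should vary by group, one possible variation is to move the mapping into geom_jitter, as in this sketch. The width of 0.2 and the name p_points are illustrative choices.

```r
library(ggplot2)

# Same sample data as above, re-created so this block is self-contained
set.seed(7)
df <- data.frame(y = round(rnorm(300), 1),
                 group = sample(c("G1", "G2", "G3"), size = 300, replace = TRUE))

# colour and shape are mapped only in the jitter layer, so the boxes
# keep their default appearance
p_points <- ggplot(df, aes(x = group, y = y)) +
  geom_boxplot(outlier.shape = NA) +
  geom_jitter(aes(colour = group, shape = group), width = 0.2)
p_points
```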