R Data Visualisation Example: Box plot with jittered points using ggplot2

Load the package

library(ggplot2)

Generate Sample Data for Plotting purposes

# Sample data
set.seed(7)
y <- round(rnorm(300), 1)

df <- data.frame(y = y,
                group = sample(c("G1", "G2", "G3"), size = 300, replace = TRUE))

head(df)
##      y group
## 1  2.3    G3
## 2 -1.2    G1
## 3 -0.7    G2
## 4 -0.4    G2
## 5 -1.0    G2
## 6 -0.9    G3

Box plot with jittered points with geom_jitter

Adding jittered points (a strip chart) on top of a box plot in ggplot2 shows the underlying distribution of the data, which the box alone hides. Use geom_jitter to add them.

# Box plot with jittered points
ggplot(df, aes(x = "", y = y)) + 
  geom_boxplot() +
  geom_jitter()

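It is good practice to hide the outliers drawn by the box plot with outlier.shape = NA. geom_jitter already plots every observation, so each outlier would otherwise appear twice: once at its exact position and once jittered.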

# Box plot without outlier markers, plus jittered points
ggplot(df, aes(x = "", y = y)) + 
  geom_boxplot(outlier.shape = NA) +
  geom_jitter() 
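
To see how many points would otherwise be drawn twice, the values beyond the whiskers can be counted by hand. The following is a minimal sketch, assuming the same sample data as above and the default whisker rule of geom_boxplot (1.5 times the interquartile range); the names q, iqr and outliers are illustrative.

```r
# Same y values as the sample data above
set.seed(7)
df <- data.frame(y = round(rnorm(300), 1))

# First and third quartiles and the interquartile range
q <- unname(quantile(df$y, c(0.25, 0.75)))
iqr <- q[2] - q[1]

# Values more than 1.5 * IQR away from the box are treated as outliers
is_outlier <- df$y < q[1] - 1.5 * iqr | df$y > q[2] + 1.5 * iqr
outliers <- df$y[is_outlier]

# Number of points that would be plotted twice without outlier.shape = NA
length(outliers)
```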

Flip the axes with coord_flip or by swapping the variables inside aes.

# Horizontal box plot with jittered points
ggplot(df, aes(x = "", y = y)) + 
  geom_boxplot(outlier.shape = NA) +
  geom_jitter() +
  coord_flip() 
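
The second option, swapping the variables inside aes, could look like the sketch below. It assumes ggplot2 3.3.0 or later, where geom_boxplot infers the orientation from the mapped variables; the object name p is illustrative.

```r
library(ggplot2)

set.seed(7)
df <- data.frame(y = round(rnorm(300), 1))

# Map the numeric variable to x instead of calling coord_flip()
p <- ggplot(df, aes(x = y, y = "")) +
  geom_boxplot(outlier.shape = NA) +
  geom_jitter()
p
```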

Customization of the jittered points within Box plot

The jittered points can be customized in several ways: the amount of horizontal jitter with width, and the color, shape and size of the points, as shown in the examples below.

Width

# Set the horizontal jitter to 0.2 on either side of the box centre
ggplot(df, aes(x = "", y = y)) + 
  geom_boxplot(outlier.shape = NA) +
  geom_jitter(width = 0.2) 
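
geom_jitter also has a height argument. When width or height is omitted, it defaults to 40% of the resolution of the data, so the points are shifted slightly in the vertical direction as well and no longer sit at their exact y values. The sketch below sets height = 0 to keep y exact and passes a seed to position_jitter so the horizontal offsets are the same on every draw. The seed value and the name p are illustrative; note that geom_jitter does not accept position together with width or height, so both are given inside position_jitter.

```r
library(ggplot2)

set.seed(7)
df <- data.frame(y = round(rnorm(300), 1))

# height = 0 leaves every point at its true y value;
# seed = 1 makes the horizontal offsets reproducible
p <- ggplot(df, aes(x = "", y = y)) +
  geom_boxplot(outlier.shape = NA) +
  geom_jitter(position = position_jitter(width = 0.2, height = 0, seed = 1))
p
```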

Color

# colour = 2 selects the second colour of the current palette()
ggplot(df, aes(x = "", y = y)) + 
  geom_boxplot(outlier.shape = NA) +
  geom_jitter(colour = 2) 
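
A numeric colour depends on the active palette, so a named colour or hex code is often easier to read. The sketch below uses a named colour and adds alpha to make overlapping points easier to tell apart; the specific colour and alpha values are illustrative choices, not part of the original example.

```r
library(ggplot2)

set.seed(7)
df <- data.frame(y = round(rnorm(300), 1))

# Named colour instead of a palette index; alpha = 0.5 makes points semi-transparent
p <- ggplot(df, aes(x = "", y = y)) +
  geom_boxplot(outlier.shape = NA) +
  geom_jitter(colour = "steelblue", alpha = 0.5)
p
```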

Shape and Size

# Plus-sign markers (shape = 3) drawn at size 1.2
ggplot(df, aes(x = "", y = y)) + 
  geom_boxplot(outlier.shape = NA) +
  geom_jitter(shape = 3, size = 1.2) 

Box plot by group with jitter

If you have a categorical variable representing groups, map it to x to create one box per group. The jittered points are added to each group, and their color, size and shape can be customized as before.

# Box plot by group with jitter
ggplot(df, aes(x = group, y = y)) + 
  geom_boxplot(outlier.shape = NA) +
  geom_jitter() 

Jitter by group with custom color

# Box plot by group with jittered points in a single custom colour
ggplot(df, aes(x = group, y = y)) + 
  geom_boxplot(outlier.shape = NA) +
  geom_jitter(colour = 2) 

Jitter by group with color by group

# Box plot by group, with boxes and points coloured by group
ggplot(df, aes(x = group, y = y, colour = group)) + 
  geom_boxplot(outlier.shape = NA) +
  geom_jitter() 
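
Because colour = group is set in ggplot(), both layers inherit it, so the boxes are coloured as well as the points. If only the points should be coloured, one option is to move the mapping into geom_jitter, as in this sketch (same sample data as above; the name p is illustrative).

```r
library(ggplot2)

set.seed(7)
y <- round(rnorm(300), 1)
df <- data.frame(y = y,
                 group = sample(c("G1", "G2", "G3"), size = 300, replace = TRUE))

# The colour mapping applies to the jitter layer only; boxes keep the default colour
p <- ggplot(df, aes(x = group, y = y)) +
  geom_boxplot(outlier.shape = NA) +
  geom_jitter(aes(colour = group))
p
```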

Jitter by group with shape by group

# Box plot by group, with colour and shape mapped to group
ggplot(df, aes(x = group, y = y,
               colour = group,
               shape = group)) + 
  geom_boxplot(outlier.shape = NA) +
  geom_jitter()
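
When colour and shape are mapped to group, ggplot2 picks the values automatically. To choose them yourself, scale_colour_manual and scale_shape_manual can be added. The following is a sketch with the same sample data; the specific colours and shape codes are arbitrary illustrative choices.

```r
library(ggplot2)

set.seed(7)
y <- round(rnorm(300), 1)
df <- data.frame(y = y,
                 group = sample(c("G1", "G2", "G3"), size = 300, replace = TRUE))

# Named vectors tie each group level to a specific colour and point shape
p <- ggplot(df, aes(x = group, y = y,
                    colour = group,
                    shape = group)) +
  geom_boxplot(outlier.shape = NA) +
  geom_jitter() +
  scale_colour_manual(values = c(G1 = "#1b9e77", G2 = "#d95f02", G3 = "#7570b3")) +
  scale_shape_manual(values = c(G1 = 16, G2 = 17, G3 = 15))
p
```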