R Data Visualisation Example: Box plot in ggplot2 with geom_boxplot

Import library

library(ggplot2)

Generate sample data for plotting

# Sample data
set.seed(007)
y <- round(rnorm(500), 1)

df <- data.frame(y = y,
                group = sample(c("G1", "G2", "G3"), size = 500, replace = TRUE))

head(df)
##      y group
## 1  2.3    G3
## 2 -1.2    G2
## 3 -0.7    G1
## 4 -0.4    G3
## 5 -1.0    G3
## 6 -0.9    G2

Basic box plot using ggplot2

# Basic box plot
ggplot(df, aes(y = y)) + 
  geom_boxplot()

# Box plot with x = "": the x axis becomes discrete with a single empty label,
# so the numeric x-axis values disappear and the box is drawn narrower.
ggplot(df, aes(x = "", y = y)) + 
  geom_boxplot()
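
The sample data also contains a group column that the basic examples do not use. As a minimal sketch, assuming you want one box per group, you can map group to x in the same way x = "" is set above. The object name p_grouped is only illustrative.

```r
library(ggplot2)

# Same sample data as above
set.seed(7)
df <- data.frame(y = round(rnorm(500), 1),
                 group = sample(c("G1", "G2", "G3"), size = 500, replace = TRUE))

# One box per level of group (p_grouped is an illustrative name)
p_grouped <- ggplot(df, aes(x = group, y = y)) +
  geom_boxplot()

p_grouped
```

Because group is a discrete variable, ggplot2 computes the box plot statistics separately for each level and places the boxes side by side on the x axis.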

Adding error bars (whisker end caps) with stat_boxplot

By default, geom_boxplot draws the whiskers as plain lines without horizontal end caps. You can add the caps by drawing an error bar layer with stat_boxplot, setting geom = "errorbar". The width argument controls the width of the caps.

# Box plot with error bars
ggplot(df, aes(y = y)) + 
  stat_boxplot(geom = "errorbar",
               width = 0.15) + 
  geom_boxplot() 

ggplot(df, aes(x = "", y = y)) + 
  stat_boxplot(geom = "errorbar",
               width = 0.15) + 
  geom_boxplot() 
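
To see where the whiskers and their caps end, it can help to inspect the values that stat_boxplot computes. The sketch below assumes the layer_data() helper from ggplot2 and the default whisker rule (most extreme observation within 1.5 times the interquartile range of the box edges). The names p and box_stats are illustrative.

```r
library(ggplot2)

# Same sample data as above
set.seed(7)
df <- data.frame(y = round(rnorm(500), 1),
                 group = sample(c("G1", "G2", "G3"), size = 500, replace = TRUE))

p <- ggplot(df, aes(x = "", y = y)) +
  stat_boxplot(geom = "errorbar", width = 0.15) +
  geom_boxplot()

# Statistics computed for the box layer (layer 2 of the plot)
box_stats <- layer_data(p, 2)
box_stats[, c("ymin", "lower", "middle", "upper", "ymax")]

# The whisker ends stay within 1.5 * IQR of the box edges
iqr <- box_stats$upper - box_stats$lower
box_stats$ymax <= box_stats$upper + 1.5 * iqr
box_stats$ymin >= box_stats$lower - 1.5 * iqr
```

Here ymin and ymax are the whisker ends, lower and upper are the box edges, and middle is the median. Observations beyond the whiskers are drawn as outlier points.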

Horizontal box plot

There are two main ways to draw a horizontal box plot in ggplot2: map the variable to x instead of y (supported since ggplot2 3.3.0), or keep the vertical mapping and add coord_flip, as in the examples below.

# Horizontal box plot - option 1
ggplot(df, aes(x = y)) + 
  stat_boxplot(geom = "errorbar",
               width = 0.15) + 
  geom_boxplot() 

# Horizontal box plot - option 2
ggplot(df, aes(y = y)) + 
  stat_boxplot(geom = "errorbar",
               width = 0.15) + 
  geom_boxplot() +
  coord_flip()
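
Both options also apply when there is one box per group. The following is a sketch under the assumption that you want the group column of the sample data on the vertical axis. The names p_h1 and p_h2 are illustrative.

```r
library(ggplot2)

# Same sample data as above
set.seed(7)
df <- data.frame(y = round(rnorm(500), 1),
                 group = sample(c("G1", "G2", "G3"), size = 500, replace = TRUE))

# Option 1: map the numeric variable to x and the groups to y
p_h1 <- ggplot(df, aes(x = y, y = group)) +
  stat_boxplot(geom = "errorbar", width = 0.15) +
  geom_boxplot()

# Option 2: keep the vertical mapping and flip the coordinates
p_h2 <- ggplot(df, aes(x = group, y = y)) +
  stat_boxplot(geom = "errorbar", width = 0.15) +
  geom_boxplot() +
  coord_flip()

p_h1
p_h2
```

With option 1 the axes keep their natural names in later calls such as labs() or scale_x_continuous(). With option 2 every axis-related setting still refers to the unflipped plot.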

Box plot customization

You can customize ggplot2 box plots through the arguments of stat_boxplot and geom_boxplot. The following examples change the colors and line types of the plots; the comments point out the relevant arguments.

Colors customization

# Box plot customization
# Vertical plot
ggplot(df, aes(y = y)) + 
  stat_boxplot(geom = "errorbar",
               width = 0.15,
               color = 1) +  # Error bar color
  geom_boxplot(fill = 2,           # Box color
               alpha = 0.5,        # Transparency
               color = 1,          # Border color
               outlier.colour = 2) # Outlier color 

# Horizontal plot
ggplot(df, aes(x = y)) + 
  stat_boxplot(geom = "errorbar",
               width = 0.15,
               color = 1) +  # Error bar color
  geom_boxplot(fill = 2,           # Box color
               alpha = 0.5,        # Transparency
               color = 1,          # Border color
               outlier.colour = 2) # Outlier color 
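
The examples above set a single fixed fill color. As a sketch of an alternative, assuming you want a different color per group, you can map fill to the group column and choose the colors with scale_fill_manual. The color names and the object name p_fill are arbitrary choices for illustration.

```r
library(ggplot2)

# Same sample data as above
set.seed(7)
df <- data.frame(y = round(rnorm(500), 1),
                 group = sample(c("G1", "G2", "G3"), size = 500, replace = TRUE))

# Fill color mapped to group; the colors below are arbitrary
p_fill <- ggplot(df, aes(x = group, y = y, fill = group)) +
  stat_boxplot(geom = "errorbar", width = 0.15) +
  geom_boxplot(alpha = 0.5,               # Transparency
               outlier.colour = "black") + # Outlier color
  scale_fill_manual(values = c(G1 = "tomato",
                               G2 = "steelblue",
                               G3 = "seagreen"))

p_fill
```

Arguments set inside aes() vary with the data and get a legend, while arguments set outside aes(), such as alpha here, apply the same fixed value to every box.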

Lines customization

# Box plot lines customization
ggplot(df, aes(y = y)) + 
  stat_boxplot(geom = "errorbar",
               width = 0.15,
               linetype = 2, # Line type (2 = dashed)
               lwd = 0.5) +  # Line width (ggplot2 3.4.0+ also accepts linewidth)
  geom_boxplot(linetype = 2, # Line type (2 = dashed)
               lwd = 0.5)    # Line width