R Data Visualisation Example: Histogram bins and binwidth using ggplot2

Load the library

library(ggplot2)

Generate sample data for plotting

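# Fix the random seed so the sample is reproducible
set.seed(007)

# Three groups of 200 normal draws with different means and spreads
x <- c(rnorm(200, mean = -2, sd = 1.5),
       rnorm(200, mean = 0, sd = 1),
       rnorm(200, mean = 2, sd = 1.5))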
group <- c(rep("A", 200), rep("B", 200), rep("C", 200))
df <- data.frame(x, group)

head(df)
##           x group
## 1  1.430871     A
## 2 -3.795158     A
## 3 -3.041439     A
## 4 -2.618439     A
## 5 -3.456010     A
## 6 -3.420920     A

Basic Histogram bins and binwidth using ggplot2

Default histogram

By default, geom_histogram computes its bins through stat_bin, which uses 30 bins regardless of the data. This is not always a good choice, so ggplot2 prints a message suggesting that you pick a value yourself.

# New data: 600 standard normal draws (this overwrites the x and df created above)
x <- rnorm(600)
df <- data.frame(x)

# Default histogram
ggplot(df, aes(x = x)) + 
  geom_histogram() 
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
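
The message above asks for an explicit binwidth. As a hedged illustration, one common data-driven starting point is the Freedman-Diaconis rule, which sets the width to 2 * IQR(x) / n^(1/3). This rule is not part of the original example and is not what ggplot2 applies by default; the names x_demo, df_demo, fd_width and p_fd are made up for this sketch.

```r
library(ggplot2)

# Hypothetical demo data, kept separate from the df used in this example
set.seed(1)
x_demo <- rnorm(600)
df_demo <- data.frame(x = x_demo)

# Freedman-Diaconis rule: width = 2 * IQR / n^(1/3)
fd_width <- 2 * IQR(x_demo) / length(x_demo)^(1/3)

# Pass the computed width explicitly, so stat_bin() does not fall back to 30 bins
p_fd <- ggplot(df_demo, aes(x = x)) +
  geom_histogram(colour = 4, fill = "white", binwidth = fd_width)

# print(p_fd) draws the plot
```

Treat the computed width as a starting value and adjust it by eye for your own data.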

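Using the bins argument

The bins argument of geom_histogram sets the number of bins directly. Fewer bins give a coarser histogram and more bins a finer one.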

# Histogram with 15 bins (coarser)
ggplot(df, aes(x = x)) + 
  geom_histogram(colour = 4, fill = "white", 
                 bins = 15) 

# Histogram with 45 bins (finer)
ggplot(df, aes(x = x)) + 
  geom_histogram(colour = 4, fill = "white", 
                 bins = 45) 
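
To see what the bins argument changes, the computed bins can be read back with layer_data(), which returns one row per bin for a layer. The following is a minimal sketch on freshly simulated data; df_demo, p_15, p_45 and the other object names are illustrative only.

```r
library(ggplot2)

set.seed(1)
df_demo <- data.frame(x = rnorm(600))  # hypothetical demo data

p_15 <- ggplot(df_demo, aes(x = x)) + geom_histogram(bins = 15)
p_45 <- ggplot(df_demo, aes(x = x)) + geom_histogram(bins = 45)

# layer_data() returns the data computed by stat_bin(), one row per bin
bins_15 <- layer_data(p_15)
bins_45 <- layer_data(p_45)

# Bin width is the distance between the left and right edge of a bin
width_15 <- bins_15$xmax[1] - bins_15$xmin[1]
width_45 <- bins_45$xmax[1] - bins_45$xmin[1]

cat("bins = 15:", nrow(bins_15), "rows, width", width_15, "\n")
cat("bins = 45:", nrow(bins_45), "rows, width", width_45, "\n")
```

More bins over the same range means narrower bins, which is why the second histogram shows more detail but also more noise.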

Using the binwidth argument

The other option is the binwidth argument of the geom_histogram function. This argument sets the width of each bin along the x-axis, in the units of the data. Note that binwidth overrides the bins argument when both are given.

# Histogram with a bin width of 0.5 (wider bins)
ggplot(df, aes(x = x)) + 
  geom_histogram(colour = 4, fill = "white", 
                 binwidth = 0.5) 

# Histogram with a bin width of 0.15 (narrower bins)
ggplot(df, aes(x = x)) + 
  geom_histogram(colour = 4, fill = "white", 
                 binwidth = 0.15)
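
The note that binwidth takes precedence over bins can be checked directly. The sketch below, using illustrative object names and demo data, builds one plot with only binwidth and one with both arguments, then compares the bins that stat_bin() computed for each.

```r
library(ggplot2)

set.seed(1)
df_demo <- data.frame(x = rnorm(600))  # hypothetical demo data

# Only binwidth is given
p_width <- ggplot(df_demo, aes(x = x)) +
  geom_histogram(binwidth = 0.5)

# Both are given; bins = 5 conflicts with binwidth = 0.5 on this data
p_both <- ggplot(df_demo, aes(x = x)) +
  geom_histogram(bins = 5, binwidth = 0.5)

d_width <- layer_data(p_width)
d_both <- layer_data(p_both)

# TRUE if both plots ended up with the same bin edges, i.e. bins was ignored
same_bins <- nrow(d_width) == nrow(d_both) &&
  isTRUE(all.equal(d_width$xmin, d_both$xmin))
print(same_bins)
```

If the two layers report the same bins, the bins value had no effect, so in practice it is clearer to supply only one of the two arguments.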