library(ggplot2)
# Sample data
set.seed(7)
x <- c(rnorm(200, mean = -2, sd = 1.5),
       rnorm(200, mean = 0, sd = 1),
       rnorm(200, mean = 2, sd = 1.5))
group <- c(rep("A", 200), rep("B", 200), rep("C", 200))
df <- data.frame(x, group)
head(df)
##           x group
## 1  1.430871     A
## 2 -3.795158     A
## 3 -3.041439     A
## 4 -2.618439     A
## 5 -3.456010     A
## 6 -3.420920     A
The default histograms in ggplot2 and base R differ: ggplot2 uses 30 bins by default, while the base R hist function uses the Sturges method to choose the number of bins. As a result, ggplot2 histograms often end up with more bins than the data warrants. You can set the bin width or the number of bins to a more suitable value.
# Default histogram ggplot2
ggplot(df, aes(x = x)) +
  geom_histogram()
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
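To override the 30-bin default, geom_histogram accepts either a bins or a binwidth argument. The following is a minimal sketch: the values 15 and 0.5 are arbitrary choices for illustration, not recommendations, and the sample data is regenerated so the block runs on its own. The names p_bins and p_width are introduced here only for the example.

```r
library(ggplot2)

# Same sample data as above, regenerated so this block is self-contained
set.seed(7)
x <- c(rnorm(200, mean = -2, sd = 1.5),
       rnorm(200, mean = 0, sd = 1),
       rnorm(200, mean = 2, sd = 1.5))
df <- data.frame(x)

# Option 1: fix the number of bins (15 is an arbitrary illustrative value)
p_bins <- ggplot(df, aes(x = x)) +
  geom_histogram(bins = 15)

# Option 2: fix the width of each bin in data units (0.5 is also arbitrary)
p_width <- ggplot(df, aes(x = x)) +
  geom_histogram(binwidth = 0.5)

p_bins
p_width
```

Setting either argument also silences the message about bins = 30. If both are supplied, binwidth takes precedence over bins.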
The default histogram produced by the hist function looks more appropriate here, because the number of bins is chosen with the Sturges method.
# Default histogram base R
hist(x)
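For reference, the Sturges rule suggests ceiling(log2(n) + 1) classes for n observations, and base R exposes it as nclass.Sturges in the grDevices package. The sketch below compares the manual formula with that helper; the vector x_demo and the variables k_manual and k_helper are hypothetical names used only in this example.

```r
# Illustrative data: 600 observations, the same size as the sample above
set.seed(7)
x_demo <- rnorm(600)

# Sturges rule computed by hand
n <- length(x_demo)
k_manual <- ceiling(log2(n) + 1)

# The same rule through the helper shipped with base R
k_helper <- grDevices::nclass.Sturges(x_demo)

k_manual
k_helper
```

Note that hist treats this count as a suggestion only: it passes the number to pretty, which rounds the breaks to convenient values, so the final number of bins can differ from the Sturges count.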
To create a ggplot2 histogram that uses the Sturges method, calculate the breaks as follows and pass them to the breaks argument of geom_histogram.
# Data
set.seed(3)
x <- rnorm(450)
df <- data.frame(x)
# Calculating the Sturges breaks (kept as a standalone vector:
# its length differs from nrow(df), so it is not stored as a column)
breaks <- pretty(range(x),
                 n = nclass.Sturges(x),
                 min.n = 1)
# Histogram with Sturges method
ggplot(df, aes(x = x)) +
  geom_histogram(color = 1, fill = "white",
                 breaks = breaks) +
  ggtitle("Sturges method")
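As a sanity check, the breaks computed this way can be compared with the ones hist picks by default. This is a sketch that assumes the data contains only finite values; it calls hist with plot = FALSE so that the breaks are returned without drawing anything. The names sturges_breaks and base_breaks are introduced only for this example.

```r
# Same data as in the Sturges example above
set.seed(3)
x <- rnorm(450)

# Breaks computed by hand, as passed to geom_histogram
sturges_breaks <- pretty(range(x),
                         n = nclass.Sturges(x),
                         min.n = 1)

# Breaks chosen by the base R default (breaks = "Sturges")
base_breaks <- hist(x, plot = FALSE)$breaks

# Compare the two vectors
all.equal(sturges_breaks, base_breaks)
```

If the two vectors agree, the ggplot2 histogram drawn with these breaks uses the same bin edges as the base R default.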