library(ggplot2)
# Sample data
set.seed(3)
x1 <- rnorm(500)
x2 <- rnorm(500, mean = 3)
x <- c(x1, x2)
group <- c(rep("G1", 500), rep("G2", 500))
df <- data.frame(x, group = group)
head(df)
##             x group
## 1 -0.96193342    G1
## 2 -0.29252572    G1
## 3  0.25878822    G1
## 4 -1.15213189    G1
## 5  0.19578283    G1
## 6  0.03012394    G1
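To overlay a kernel density estimate on a histogram in ggplot2, pass aes(y = ..density..) to geom_histogram so the bars are drawn on the density scale rather than as counts, then add geom_density, as in the example below. In ggplot2 3.4.0 and later the ..density.. notation is deprecated in favor of after_stat(density).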
# Histogram with kernel density
ggplot(df, aes(x = x)) +
  geom_histogram(aes(y = ..density..),
                 colour = 1, fill = "white", binwidth = 0.2) +
  geom_density()
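The reason for mapping y to the density can be checked outside ggplot2. The following is a minimal base R sketch, not part of the original example: on the density scale the bar areas sum to 1, which is the same scale as the kernel density estimate, while raw counts are far taller than the curve. The breaks used here are an assumption chosen to mimic binwidth = 0.2 and may not match the bin edges geom_histogram picks.

```r
# Recreate the sample data from above
set.seed(3)
x <- c(rnorm(500), rnorm(500, mean = 3))

# Bins of width 0.2 covering the data (illustrative breaks, an assumption;
# geom_histogram may place its bin edges differently)
binwidth <- 0.2
breaks <- seq(floor(min(x)), ceiling(max(x)), by = binwidth)
h <- hist(x, breaks = breaks, plot = FALSE)

# On the density scale, bar height times bar width sums to 1
bar_area <- sum(h$density * diff(h$breaks))

# Kernel density estimate, integrated with the trapezoid rule
kde <- density(x)
kde_area <- sum(diff(kde$x) * (head(kde$y, -1) + tail(kde$y, -1)) / 2)

# Both areas are close to 1, so bars and curve share a scale
bar_area
kde_area

# On the count scale the tallest bar dwarfs the peak of the curve
max(h$counts)
max(kde$y)
```

Without the density mapping, the curve would be drawn against counts and appear almost flat along the bottom of the plot.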
The color, line width and line type of the kernel density curve can be customized with the colour, lwd and linetype arguments.
# Kernel density curve with custom width, line type and color
ggplot(df, aes(x = x)) +
  geom_histogram(aes(y = ..density..),
                 colour = 1, fill = "white", binwidth = 0.2) +
  geom_density(lwd = 1.2,
               linetype = 2,
               colour = 2)
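The numeric codes above can be swapped for named values, which some readers find easier to follow. The sketch below is one possible variant and makes a few assumptions: it uses linewidth and after_stat(density), which need ggplot2 3.4.0 or later, and the colour "red" is an illustrative choice rather than an exact match for palette entry 2. The line type "dashed" does correspond to linetype = 2.

```r
library(ggplot2)

# Recreate the sample data from above
set.seed(3)
df <- data.frame(x = c(rnorm(500), rnorm(500, mean = 3)))

# Named colour and line type instead of numeric codes;
# linewidth and after_stat() assume ggplot2 >= 3.4.0
p <- ggplot(df, aes(x = x)) +
  geom_histogram(aes(y = after_stat(density)),
                 colour = "black", fill = "white", binwidth = 0.2) +
  geom_density(linewidth = 1.2,
               linetype = "dashed",
               colour = "red")
p
```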
You can also shade the area under the curve by passing a fill color to the fill argument of geom_density. Setting a transparency level (between 0 and 1) with the alpha argument is recommended so the histogram remains visible.
# Kernel density curve with a semi-transparent fill
ggplot(df, aes(x = x)) +
  geom_histogram(aes(y = ..density..),
                 colour = 1, fill = "white", binwidth = 0.2) +
  geom_density(lwd = 1, colour = 4,
               fill = 4, alpha = 0.25)
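The sample data also contains a group column that the examples above do not use. As a hedged sketch of how the same fill and alpha idea might extend to it, the code below maps fill to group so each group gets its own histogram and density curve. The choices of position = "identity" and the alpha values are assumptions made for illustration, and after_stat(density) assumes ggplot2 3.3.0 or later. Each group's density is computed separately, so each curve integrates to 1 on its own.

```r
library(ggplot2)

# Recreate the sample data from above
set.seed(3)
df <- data.frame(x = c(rnorm(500), rnorm(500, mean = 3)),
                 group = rep(c("G1", "G2"), each = 500))

# One histogram and one density curve per group; transparency keeps
# the overlapping bars and shaded areas readable
p <- ggplot(df, aes(x = x, fill = group)) +
  geom_histogram(aes(y = after_stat(density)),
                 position = "identity", alpha = 0.4,
                 colour = 1, binwidth = 0.2) +
  geom_density(alpha = 0.25)
p
```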