library(ggplot2)
library(ggridges)
# Sample data
df <- diamonds[1:100, c("color", "depth")]
The geom_density_ridges function from the ggridges package creates a ridgeline plot. Given a numerical variable (depth) and a categorical variable (color), it estimates and draws a density curve for each group. The geom_density_ridges2 variant computes the same densities but draws them as closed polygons, so the outline also runs along the baseline.
ggplot(df, aes(x = depth, y = color)) +
  geom_density_ridges() # function 1: open ridgelines
## Picking joint bandwidth of 0.678
ggplot(df, aes(x = depth, y = color)) +
  geom_density_ridges2() # function 2: closed polygons
## Picking joint bandwidth of 0.678
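The message printed after each plot reports the bandwidth that was selected automatically for the density estimates. As a minimal sketch, assuming you prefer to fix the smoothing yourself, the bandwidth argument can be set explicitly; the value 0.5 below is an arbitrary illustration, not a recommendation.

```r
library(ggplot2)
library(ggridges)

# Same sample data as in the rest of this tutorial
df <- diamonds[1:100, c("color", "depth")]

# bandwidth = 0.5 is an arbitrary illustrative value;
# smaller values give wigglier curves, larger values smoother ones
p <- ggplot(df, aes(x = depth, y = color)) +
  geom_density_ridges(bandwidth = 0.5)
p
```

With the bandwidth set by hand, all groups share that value and the automatic selection message should no longer appear.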
The rel_min_height argument cuts the trailing tails: any part of a ridgeline lower than this fraction of the highest point among all ridgelines is removed. You will need to fine-tune the value depending on your data.
ggplot(df, aes(x = depth, y = color)) +
  geom_density_ridges(rel_min_height = 0.005)
## Picking joint bandwidth of 0.678
ggplot(df, aes(x = depth, y = color)) +
  geom_density_ridges2(rel_min_height = 0.005)
## Picking joint bandwidth of 0.678
In addition, the scale argument controls the height of the ridgelines relative to the spacing between them. With scale = 1 the highest ridgeline just touches the baseline above it, so larger values make the ridgelines overlap.
ggplot(df, aes(x = depth, y = color)) +
  geom_density_ridges(scale = 3)
## Picking joint bandwidth of 0.678
ggplot(df, aes(x = depth, y = color)) +
  geom_density_ridges2(scale = 3)
## Picking joint bandwidth of 0.678
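For comparison with the overlapping ridgelines above, here is a small sketch with a scale below 1, which should keep every ridgeline inside its own band. The value 0.9 is an assumption chosen for illustration.

```r
library(ggplot2)
library(ggridges)

# Same sample data as in the rest of this tutorial
df <- diamonds[1:100, c("color", "depth")]

# scale = 0.9 (illustrative): ridgelines stay below the next baseline
p <- ggplot(df, aes(x = depth, y = color)) +
  geom_density_ridges(scale = 0.9)
p
```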
The stat argument selects the statistical transformation applied to the data. For example, "binline" draws a histogram for each group instead of a density estimate.
ggplot(df, aes(x = depth, y = color)) +
  geom_density_ridges(stat = "binline", bins = 20, draw_baseline = FALSE)
ggplot(df, aes(x = depth, y = color)) +
  geom_density_ridges2(stat = "binline", bins = 20, draw_baseline = FALSE)
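Another option is to let ggplot2's own density stat do the estimation. The sketch below assumes ggplot2 3.3.0 or later (for after_stat) and maps the computed density to the height aesthetic, which the geom needs when it is not using its default stat. Treat it as an illustration of the pattern rather than a recommended setup.

```r
library(ggplot2)
library(ggridges)

# Same sample data as in the rest of this tutorial
df <- diamonds[1:100, c("color", "depth")]

# stat = "density" estimates each group separately with ggplot2's
# stat_density; the result is passed to the geom via height
p <- ggplot(df, aes(x = depth, y = color, height = after_stat(density))) +
  geom_density_ridges(stat = "density")
p
```

Because each group is estimated on its own, no joint bandwidth is picked in this case.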
The default gray fill of the ridgelines can be changed with the fill argument of geom_density_ridges. You can also set the level of transparency with alpha.
ggplot(df, aes(x = depth, y = color)) +
  geom_density_ridges(fill = "lightblue", alpha = 0.5)
## Picking joint bandwidth of 0.678
ggplot(df, aes(x = depth, y = color)) +
  geom_density_ridges2(fill = "lightblue", alpha = 0.5)
## Picking joint bandwidth of 0.678
The color argument controls the color of the lines. As in other plots, you can also change the line type and the line width.
ggplot(df, aes(x = depth, y = color)) +
  geom_density_ridges2(fill = "white",
                       color = 4,
                       linetype = 1,
                       lwd = 0.5)
## Picking joint bandwidth of 0.678
You can also fill the densities according to the categorical variable by mapping it to fill inside aes. The color palette can then be changed with scale_fill_manual, for instance.
ggplot(df, aes(x = depth, y = color, fill = color)) +
  geom_density_ridges()
## Picking joint bandwidth of 0.678
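As a sketch of the scale_fill_manual option mentioned above, the following assigns one color to each level of the color variable (D to J). The hex values and the name pal are arbitrary choices made for this illustration.

```r
library(ggplot2)
library(ggridges)

# Same sample data as in the rest of this tutorial
df <- diamonds[1:100, c("color", "depth")]

# Hypothetical palette: one arbitrary hex color per level of diamonds$color
pal <- c(D = "#1B9E77", E = "#D95F02", F = "#7570B3", G = "#E7298A",
         H = "#66A61E", I = "#E6AB02", J = "#A6761D")

p <- ggplot(df, aes(x = depth, y = color, fill = color)) +
  geom_density_ridges() +
  scale_fill_manual(values = pal)
p
```

Naming the vector elements ties each color to a specific level, so the mapping does not depend on the order of the values.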
The scale_fill_cyclical and scale_color_cyclical functions repeat a short set of fill and border colors cyclically across the groups.
ggplot(df, aes(x = depth, y = color, fill = color, color = color)) +
  geom_density_ridges() +
  scale_fill_cyclical(name = "Cycle", guide = "legend",
                      values = c("#99E6FF", "#4CA6FF")) +
  scale_color_cyclical(name = "Cycle", guide = "legend",
                       values = c(1, 4))
## Picking joint bandwidth of 0.678
Similarly, with calc_ecdf = TRUE and after_stat(ecdf) you can fill the densities with a gradient that displays the tail probabilities.
ggplot(df, aes(x = depth, y = color,
               fill = 0.5 - abs(0.5 - after_stat(ecdf)))) +
  stat_density_ridges(geom = "density_ridges_gradient", calc_ecdf = TRUE) +
  scale_fill_gradient(low = "white", high = "#87CEFF",
                      name = "Tail prob.")
## Picking joint bandwidth of 0.678
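To see why the expression 0.5 - abs(0.5 - ecdf) in the fill mapping gives a tail probability, it can help to evaluate it on a few hand-picked cumulative probabilities. The vector ecdf_values below is made up for illustration and is not computed from the data.

```r
# Made-up cumulative probabilities, from the left tail to the right tail
ecdf_values <- c(0.05, 0.25, 0.50, 0.75, 0.95)

# Distance to the nearest tail: small near either end, 0.5 at the median
tail_prob <- 0.5 - abs(0.5 - ecdf_values)
print(tail_prob)
```

The result is symmetric around the median, which is why both tails of each density receive the same light color in the plot.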
The same approach can be used to highlight the tails of the distributions: mapping fill to the computed quantile colors the regions below the 5% quantile and above the 95% quantile.
ggplot(df, aes(x = depth, y = color, fill = after_stat(quantile))) +
  stat_density_ridges(quantile_lines = TRUE,
                      calc_ecdf = TRUE,
                      geom = "density_ridges_gradient",
                      quantiles = c(0.05, 0.95)) +
  scale_fill_manual(name = "Prob.", values = c("#E2FFF2", "white", "#B0E0E6"),
                    labels = c("(0%, 5%]", "(5%, 95%]", "(95%, 100%]"))
## Picking joint bandwidth of 0.678
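The quantiles argument also accepts a single integer, in which case the data are split into that many equal quantiles. As a minimal sketch, quantiles = 2 should draw one line per group at the median; no fill mapping is used here, so the default gray fill is kept.

```r
library(ggplot2)
library(ggridges)

# Same sample data as in the rest of this tutorial
df <- diamonds[1:100, c("color", "depth")]

# quantiles = 2 splits each group into two halves, so the single
# quantile line marks the median
p <- ggplot(df, aes(x = depth, y = color)) +
  stat_density_ridges(quantile_lines = TRUE, quantiles = 2)
p
```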