library(ggplot2)
# Sample data
set.seed(3)
x1 <- rnorm(500)
x2 <- rnorm(500, mean = 3)
x <- c(x1, x2)
group <- c(rep("G1", 500), rep("G2", 500))
df <- data.frame(x, group = group)
head(df)
##             x group
## 1 -0.96193342    G1
## 2 -0.29252572    G1
## 3  0.25878822    G1
## 4 -1.15213189    G1
## 5  0.19578283    G1
## 6  0.03012394    G1
To create a histogram by group in ggplot2, pass the numerical variable to x and the categorical variable to fill inside aes, then add geom_histogram as follows.
# Histogram by group: fill mapped to the group
ggplot(df, aes(x = x, fill = group)) +
  geom_histogram(binwidth = 0.15)
You can also map the categorical variable to the colour aesthetic, so the border lines of each histogram will have a different color.
# Histogram by group: border colour mapped to the group
ggplot(df, aes(x = x, colour = group)) +
  geom_histogram(binwidth = 0.15)
By default geom_histogram stacks the groups on top of each other (position = "stack"). Setting position = "identity" overlays them instead, which is the most common use case, but remember to set a level of transparency with alpha so both histograms remain visible.
# Histogram by group: overlaid, semi-transparent bars
ggplot(df, aes(x = x, fill = group, colour = group)) +
  geom_histogram(alpha = 0.5, position = "identity", binwidth = 0.15)
Another option is position = "dodge", which places the bars of each group side by side within each bin, so you can see both histograms.
# Histogram by group: bars placed side by side
ggplot(df, aes(x = x, fill = group, colour = group)) +
  geom_histogram(position = "dodge", binwidth = 0.15)
If you set fill inside aes but not colour, you can change the border color of all histograms, as well as the border width and line type, with geom_histogram arguments.
# Histogram by group: same border style for every group
ggplot(df, aes(x = x, fill = group)) +
  geom_histogram(colour = "blue",
                 lwd = 0.75,
                 linetype = 1,
                 position = "identity")
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
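The message above is printed because the previous call did not say how to bin the data. As a minimal sketch, either of the two calls below avoids it by setting binwidth or bins explicitly; the values 0.15 and 40 are illustrative choices, not recommendations, and the data is rebuilt so the block runs on its own.

```r
library(ggplot2)

# Rebuild the sample data used throughout this page
set.seed(3)
df <- data.frame(x = c(rnorm(500), rnorm(500, mean = 3)),
                 group = rep(c("G1", "G2"), each = 500))

# Option 1: fix the width of every bin (in units of x)
p_width <- ggplot(df, aes(x = x, fill = group)) +
  geom_histogram(colour = "blue", position = "identity", binwidth = 0.15)

# Option 2: fix the number of bins instead of their width
p_bins <- ggplot(df, aes(x = x, fill = group)) +
  geom_histogram(colour = "blue", position = "identity", bins = 40)

p_width
```

binwidth is expressed in the units of x, while bins only fixes how many intervals the range of x is split into, so the two are alternatives rather than complements.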
If you set colour but not fill, you can change the fill color of all histograms with the fill argument of geom_histogram.
# Histogram by group: same fill for every group
ggplot(df, aes(x = x, colour = group)) +
  geom_histogram(fill = "white", binwidth = 0.15,
                 position = "identity")
The border colors can be customized individually with scale_color_manual. If you want to use a palette you can use scale_color_brewer, for instance.
# Histogram by group: custom border colours
ggplot(df, aes(x = x, colour = group)) +
  geom_histogram(fill = "white",
                 position = "identity", binwidth = 0.15) +
  scale_color_manual(values = c("blue", "orange"))
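The palette alternative mentioned above could look like the following sketch. The palette name "Set1" is an arbitrary choice for illustration; any ColorBrewer palette name accepted by scale_color_brewer can be used instead.

```r
library(ggplot2)

# Rebuild the sample data used throughout this page
set.seed(3)
df <- data.frame(x = c(rnorm(500), rnorm(500, mean = 3)),
                 group = rep(c("G1", "G2"), each = 500))

# Border colours taken from a ColorBrewer palette ("Set1" is illustrative)
p_brewer <- ggplot(df, aes(x = x, colour = group)) +
  geom_histogram(fill = "white",
                 position = "identity", binwidth = 0.15) +
  scale_color_brewer(palette = "Set1")

p_brewer
```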
As with the border colors, the fill colors can be set with scale_fill_manual or any other fill scale function.
# Histogram by group: custom fill colours
ggplot(df, aes(x = x, fill = group)) +
  geom_histogram(color = 1, alpha = 0.75,
                 position = "identity", binwidth = 0.2) +
  scale_fill_manual(values = c("#8795E8", "#FE9AD5"))
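When values is an unnamed vector, the colors are assigned in the order of the group levels. A small sketch of a variant that names each color after its group, so the mapping does not depend on level order; the helper name group_cols is hypothetical and only used for this illustration.

```r
library(ggplot2)

# Rebuild the sample data used throughout this page
set.seed(3)
df <- data.frame(x = c(rnorm(500), rnorm(500, mean = 3)),
                 group = rep(c("G1", "G2"), each = 500))

# Named vector: each colour is matched to the group with the same name
# (group_cols is a hypothetical name chosen for this example)
group_cols <- c(G1 = "#8795E8", G2 = "#FE9AD5")

p_named <- ggplot(df, aes(x = x, fill = group)) +
  geom_histogram(color = 1, alpha = 0.75,
                 position = "identity", binwidth = 0.2) +
  scale_fill_manual(values = group_cols)

p_named
```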
The legend title defaults to the name of the categorical column of the data set. You can change it with the fill and/or colour arguments of the guides function. Because both fill and colour are passed to aes here, the same title must be set for both; otherwise two legends will be displayed.
# Histogram by group: custom legend title
ggplot(df, aes(x = x, fill = group, colour = group)) +
  geom_histogram(alpha = 0.5, position = "identity", binwidth = 0.2) +
  guides(fill = guide_legend(title = "Title"),
         colour = guide_legend(title = "Title"))
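A shorter way to get the same result is the labs function, which also accepts one title per aesthetic. This is a sketch of an alternative to the guides call, not a replacement for it; the title text "Title" is the same placeholder used above.

```r
library(ggplot2)

# Rebuild the sample data used throughout this page
set.seed(3)
df <- data.frame(x = c(rnorm(500), rnorm(500, mean = 3)),
                 group = rep(c("G1", "G2"), each = 500))

# Same title for both aesthetics, so the two legends are merged into one
p_title <- ggplot(df, aes(x = x, fill = group, colour = group)) +
  geom_histogram(alpha = 0.5, position = "identity", binwidth = 0.2) +
  labs(fill = "Title", colour = "Title")

p_title
```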
The legend displays the levels of the categorical variable by default, but you can change them with scale_color_discrete and/or scale_fill_discrete, depending on which aesthetics you mapped inside aes.
# Histogram by group: custom legend labels
ggplot(df, aes(x = x, fill = group, colour = group)) +
  geom_histogram(alpha = 0.5, position = "identity", binwidth = 0.2) +
  scale_color_discrete(labels = c("A", "B")) +
  scale_fill_discrete(labels = c("A", "B"))
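Unnamed labels are applied in the order of the group levels. The sketch below names each label after the level it replaces, which recent ggplot2 versions match by name; treat the name matching as an assumption to check against your installed version. The helper name new_labels is hypothetical.

```r
library(ggplot2)

# Rebuild the sample data used throughout this page
set.seed(3)
df <- data.frame(x = c(rnorm(500), rnorm(500, mean = 3)),
                 group = rep(c("G1", "G2"), each = 500))

# Named labels: the name is the original level, the value is the new label
# (new_labels is a hypothetical name chosen for this example)
new_labels <- c(G1 = "A", G2 = "B")

p_labels <- ggplot(df, aes(x = x, fill = group, colour = group)) +
  geom_histogram(alpha = 0.5, position = "identity", binwidth = 0.2) +
  scale_color_discrete(labels = new_labels) +
  scale_fill_discrete(labels = new_labels)

p_labels
```

Using the same labels on both scales keeps the fill and colour legends merged.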
The position of the legend defaults to the right, but can be changed with the legend.position component of the theme function as in the example below.
# Histogram by group: legend on the left
ggplot(df, aes(x = x, fill = group, colour = group)) +
  geom_histogram(alpha = 0.5, position = "identity", binwidth = 0.2) +
  theme(legend.position = "left")
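Besides "left" and the default "right", legend.position also accepts "top" and "bottom". A minimal sketch with the legend below the plot; the choice of "bottom" is only for illustration.

```r
library(ggplot2)

# Rebuild the sample data used throughout this page
set.seed(3)
df <- data.frame(x = c(rnorm(500), rnorm(500, mean = 3)),
                 group = rep(c("G1", "G2"), each = 500))

# Place the legend under the plotting area
p_bottom <- ggplot(df, aes(x = x, fill = group, colour = group)) +
  geom_histogram(alpha = 0.5, position = "identity", binwidth = 0.2) +
  theme(legend.position = "bottom")

p_bottom
```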
Setting legend.position = "none" removes the legend completely.
# Histogram by group: no legend
ggplot(df, aes(x = x, fill = group, colour = group)) +
  geom_histogram(alpha = 0.5, position = "identity", binwidth = 0.2) +
  theme(legend.position = "none")