How to Create a Violin Chart in R Using ggplot2
Violin plots display the distribution of a continuous variable across the levels of a categorical variable. They are similar to box plots, but instead of showing only the quartiles and outliers of a distribution, violin plots also show its shape, drawn as a mirrored kernel density estimate. In this article, we will explore how to create violin plots using the ggplot2 package in R.
Getting Started:
First, we need to install and load the ggplot2 package. ggplot2 implements the grammar of graphics on top of R's grid graphics system (it is not an extension of base R graphics) and allows for more flexible and customizable visualizations.
install.packages("ggplot2")
library(ggplot2)
Next, we need some data to work with. For this tutorial, we will use the famous Iris dataset, which ships with base R in the datasets package (it is not part of ggplot2).
data(iris)
The Iris dataset contains information on the length and width of the petals and sepals of three species of iris flowers: Setosa, Versicolor, and Virginica. We will use this data to create a violin plot showing the distribution of petal lengths for each species.
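Before plotting, it can help to confirm the structure of the data. The following is a minimal sketch using only base R functions; the variable name median_by_species is just an illustrative choice.

```r
# iris ships with base R (datasets package), so no extra package is needed
data(iris)

# Four numeric measurements plus the Species factor
str(iris)

# Number of observations per species
table(iris$Species)

# Median petal length per species: a quick numeric preview of what the violins will show
median_by_species <- aggregate(Petal.Length ~ Species, data = iris, FUN = median)
print(median_by_species)
```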
Creating a Simple Violin Plot:
To create a basic violin plot, we use the ggplot() function to specify the data and the aesthetic mapping, which determines how variables are mapped to graphical attributes such as color, shape, and size. Then, we use the geom_violin() function to create the actual plot.
ggplot(iris, aes(x = Species, y = Petal.Length)) +
  geom_violin()

In this plot, the x-axis represents the species of iris, while the y-axis represents the length of the petals. Each violin represents the distribution of petal lengths for a particular species.
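By default, geom_violin() trims each violin to the range of the observed data. As a rough sketch of how the shape can be tuned, the example below uses the trim and adjust arguments; the values chosen here (trim = FALSE, adjust = 0.5) are illustrative assumptions, not recommendations.

```r
library(ggplot2)

# trim = FALSE lets the density tails extend past the smallest and largest observations
# adjust < 1 narrows the smoothing bandwidth, so the outline follows the data more closely
p_untrimmed <- ggplot(iris, aes(x = Species, y = Petal.Length)) +
  geom_violin(trim = FALSE, adjust = 0.5)

print(p_untrimmed)
```

Untrimmed tails can extend beyond values that actually occur in the data, so they are best read as a smoothing artifact rather than as observed measurements.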
Customizing the Violin Plot:
While the basic violin plot is informative, we can customize it to make it clearer and more visually appealing. Here are some examples of customizations we can make:
Change the color and fill of the violins:
ggplot(iris, aes(x = Species, y = Petal.Length)) +
  geom_violin(fill = "#69b3a2", color = "#e9ecef")

In this plot, we changed the fill color of the violins to a teal color (#69b3a2) and the border color to a light gray color (#e9ecef).
Add a box plot to the violin plot:
ggplot(iris, aes(x = Species, y = Petal.Length)) +
  geom_violin(fill = "#69b3a2", color = "#e9ecef") +
  geom_boxplot(width = 0.2, fill = "#e9ecef", color = "#2c3e50")

In this plot, we added a box plot to the violin plot using the geom_boxplot() function. We also changed the width of the box plot to 0.2 and the fill color to light gray (#e9ecef).
Add a point for each observation:
ggplot(iris, aes(x = Species, y = Petal.Length)) +
  geom_violin(fill = "#69b3a2", color = "#e9ecef") +
  geom_boxplot(width = 0.2, fill = "#e9ecef", color = "#2c3e50") +
  geom_jitter(width = 0.2, height = 0.1, alpha = 0.5, color = "#2c3e50")

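In this plot, we added a point for each observation using the geom_jitter() function. The width and height arguments set how far the points are randomly displaced horizontally and vertically (they do not change the size of the points), while alpha sets the transparency and color sets the point color. Note that a nonzero height shifts the plotted petal lengths slightly; use height = 0 to keep the values exact.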
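One side effect of combining geom_boxplot() with geom_jitter() is that outliers are drawn twice: once by the box plot and once as jittered points. A possible variation, sketched below, hides the box plot's own outlier markers with outlier.shape = NA and spreads the points only horizontally. Both choices are assumptions about what the reader wants, not part of the original example.

```r
library(ggplot2)

p_points <- ggplot(iris, aes(x = Species, y = Petal.Length)) +
  geom_violin(fill = "#69b3a2", color = "#e9ecef") +
  # outlier.shape = NA suppresses the box plot's outlier markers
  geom_boxplot(width = 0.2, fill = "#e9ecef", color = "#2c3e50", outlier.shape = NA) +
  # height = 0 keeps every point at its measured petal length
  geom_jitter(width = 0.2, height = 0, alpha = 0.5, color = "#2c3e50")

print(p_points)
```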
Another Example of a Violin Chart:
Creating a Violin Plot of Tip Amount by Day:
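To create a violin plot of tip amount by day of the week, we start by loading the ggplot2 package and the tips dataset. The tips dataset is not part of ggplot2; it ships with the reshape2 package, which must be installed first with install.packages("reshape2"):
library(ggplot2)
data(tips, package = "reshape2")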
Then, we can create a basic violin plot using the ggplot() function and the geom_violin() function:
ggplot(tips, aes(x = day, y = tip)) +
  geom_violin()
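Because day is stored as a factor, the violins appear in the order of its levels, which may be alphabetical rather than chronological. The sketch below reorders the levels before plotting. It assumes the tips data comes from the reshape2 package and that the day labels are Thur, Fri, Sat, and Sun; check levels(tips$day) on your copy of the data.

```r
library(ggplot2)

# Assumption: tips is provided by the reshape2 package
data(tips, package = "reshape2")

# Put the days in calendar order; labels not listed here would become NA
tips$day <- factor(tips$day, levels = c("Thur", "Fri", "Sat", "Sun"))

p_days <- ggplot(tips, aes(x = day, y = tip)) +
  geom_violin()

print(p_days)
```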
Customizing the Violin Plot:
We can customize the violin plot by adding color and changing the axis labels:
ggplot(tips, aes(x = day, y = tip, fill = day)) +
  geom_violin(color = "#2c3e50") +
  scale_fill_brewer(palette = "Set2") +
  labs(x = "Day of the Week", y = "Tip Amount")
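Mapping fill to day produces a legend that repeats the information already shown on the x-axis. If that is unwanted, one option is to hide it, as in this sketch (which assumes tips is loaded from the reshape2 package):

```r
library(ggplot2)

# Assumption: tips is provided by the reshape2 package
data(tips, package = "reshape2")

p_no_legend <- ggplot(tips, aes(x = day, y = tip, fill = day)) +
  geom_violin(color = "#2c3e50") +
  scale_fill_brewer(palette = "Set2") +
  labs(x = "Day of the Week", y = "Tip Amount") +
  # The x-axis already labels each day, so the fill legend is redundant
  theme(legend.position = "none")

print(p_no_legend)
```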
We can further customize the plot by adding a box plot and jittered points for each observation:
ggplot(tips, aes(x = day, y = tip, fill = day)) +
  geom_violin(color = "#2c3e50") +
  geom_boxplot(width = 0.1, fill = "#e9ecef", color = "#2c3e50") +
  geom_jitter(width = 0.2, height = 0.1, alpha = 0.5, color = "#2c3e50") +
  scale_fill_brewer(palette = "Set2") +
  labs(x = "Day of the Week", y = "Tip Amount")

In this plot, we added a box plot and jittered points using the geom_boxplot() and geom_jitter() functions, respectively. For the points, width and height set the amount of random horizontal and vertical displacement, alpha sets the transparency, and color sets the point color.
In this article, we explored how to create violin plots using the ggplot2 package in R. We started with a simple violin plot and then customized it to make it more visually appealing and informative. Violin plots are a powerful visualization tool that can help us gain insights into the distribution of continuous variables across different levels of a categorical variable. With the ggplot2 package, we can easily create customized violin plots that meet our specific needs.