Data Analyst’s Recipe | How to create a scatter plot in R

A scatter plot is an effective way to visualize the relationship between two continuous variables. In this tutorial, we walk through the steps to create one in R using the ggplot2 package.
1. Loading the data
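First, we need to load the data for our scatter plot. For this tutorial, we will use the iris dataset, which is included in the datasets package in R. Because the datasets package is attached by default, the data(iris) call below is optional, but it makes clear where the data comes from.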
# Load the iris dataset
data(iris)
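Before plotting, it can help to confirm what the dataset contains. The following is a minimal sketch using base R functions only; it assumes nothing beyond the built-in iris data frame:

```r
# Inspect the structure of the built-in iris data frame
data(iris)
str(iris)    # 150 observations of 5 variables
head(iris)   # first six rows

# The two columns plotted in this tutorial are numeric; Species is a factor
is.numeric(iris$Sepal.Length)
is.numeric(iris$Sepal.Width)
levels(iris$Species)
```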
2. Creating a scatter plot using ggplot2
Next, we will create a scatter plot using the ggplot2 package. We will use the ggplot() function to create the basic plot object and then add layers to customize the plot.
# Load the ggplot2 package
library(ggplot2)
# Create a basic scatter plot
ggplot(data = iris, aes(x = Sepal.Length, y = Sepal.Width)) +
  geom_point()

In the above code, we first loaded the ggplot2 package using the library() function. Then we created the basic plot object with ggplot(), specifying the iris dataset as the data source, and used aes() to map Sepal.Length to the x-axis and Sepal.Width to the y-axis. Finally, we added a layer with geom_point(), which draws one point per observation.
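Because ggplot() returns a plot object, the same plot can also be stored in a variable and built up one step at a time. The sketch below illustrates this; the variable name p is only illustrative and is not part of the original recipe:

```r
library(ggplot2)

# ggplot() alone sets up the data and axes but draws no points
p <- ggplot(data = iris, aes(x = Sepal.Length, y = Sepal.Width))

# Adding geom_point() attaches the layer that draws the points
p <- p + geom_point()

# Printing the object renders the plot
print(p)
```

Storing the plot this way makes it easy to reuse the same base object while trying out different layers.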
3. Customizing the scatter plot
Now that we have created a basic scatter plot, we can customize it to make it more visually appealing and informative. Here are a few examples:
Changing the color of the points based on a third variable
# Create a scatter plot with points colored by Species
ggplot(data = iris, aes(x = Sepal.Length, y = Sepal.Width, color = Species)) +
  geom_point()

In the above code, we added the color argument to aes() so that the points are colored by the Species variable. Each species is drawn in a different color, and ggplot2 adds a legend automatically.
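It may help to contrast mapping a color to a variable with setting one fixed color. The following sketch assumes the same iris columns as above; the names p_mapped and p_fixed and the color "steelblue" are illustrative choices:

```r
library(ggplot2)

# Mapped: color inside aes() varies with the data and produces a legend
p_mapped <- ggplot(data = iris, aes(x = Sepal.Length, y = Sepal.Width, color = Species)) +
  geom_point()

# Fixed: color outside aes() applies one color to every point, with no legend
p_fixed <- ggplot(data = iris, aes(x = Sepal.Length, y = Sepal.Width)) +
  geom_point(color = "steelblue")

print(p_mapped)
print(p_fixed)
```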
Adding a title and axis labels
# Create a scatter plot with a title and axis labels
ggplot(data = iris, aes(x = Sepal.Length, y = Sepal.Width, color = Species)) +
  geom_point() +
  labs(title = "Sepal Length vs. Sepal Width", x = "Sepal Length", y = "Sepal Width")

In the above code, we added the labs() function to the plot to set the title and axis labels.
Changing the point shape
# Create a scatter plot with different point shapes based on Species
ggplot(data = iris, aes(x = Sepal.Length, y = Sepal.Width, shape = Species)) +
  geom_point() +
  labs(title = "Sepal Length vs. Sepal Width", x = "Sepal Length", y = "Sepal Width")

In the above code, we added the shape argument to aes() so that the point shapes are based on the Species variable. This creates a scatter plot where each species is represented by a different point shape.
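The color and shape mappings can also be combined, which can make the groups easier to tell apart. A minimal sketch, assuming the same iris columns; the name p_both and the point size are illustrative:

```r
library(ggplot2)

# Map both color and shape to Species; ggplot2 merges them into a single legend
p_both <- ggplot(data = iris,
                 aes(x = Sepal.Length, y = Sepal.Width,
                     color = Species, shape = Species)) +
  geom_point(size = 2) +
  labs(title = "Sepal Length vs. Sepal Width", x = "Sepal Length", y = "Sepal Width")

print(p_both)
```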
4. Another example
As a second example, we will use the mtcars dataset, which is also included in the datasets package in R.
# Load the mtcars dataset
data(mtcars)
# Create a scatter plot of mpg vs. wt
ggplot(data = mtcars, aes(x = wt, y = mpg)) +
  geom_point() +
  labs(title = "MPG vs. Weight", x = "Weight (1000 lbs)", y = "Miles Per Gallon")
In the above code, we created a scatter plot of mpg (miles per gallon) vs. wt (weight, recorded in units of 1000 lbs) using the mtcars dataset. We added a title and axis labels using the labs() function.
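To quantify the relationship the plot shows, one option is to compute the correlation and overlay a straight-line fit. This is a sketch rather than part of the original recipe: the use of cor() and geom_smooth() with a linear model is an added assumption about what a reader might try next, and the names r and p_trend are illustrative:

```r
library(ggplot2)

# Pearson correlation between weight and fuel efficiency
r <- cor(mtcars$wt, mtcars$mpg)
round(r, 2)   # about -0.87: heavier cars tend to have lower mpg

# Overlay a least-squares line on the scatter plot (no confidence band)
p_trend <- ggplot(data = mtcars, aes(x = wt, y = mpg)) +
  geom_point() +
  geom_smooth(method = "lm", formula = y ~ x, se = FALSE) +
  labs(title = "MPG vs. Weight", x = "Weight (1000 lbs)", y = "Miles Per Gallon")

print(p_trend)
```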
