Data Analytics – TOP R COLOR PALETTES TO KNOW FOR GREAT DATA VISUALIZATION

This article presents the top R color palettes for changing the default color of a graph generated using either the ggplot2 package or the R base plot functions.

You’ll learn how to use the top 6 predefined color palettes in R, available in different R packages:

  • Viridis color scales [viridis package].
  • Colorbrewer palettes [RColorBrewer package]
  • Grey color palettes [ggplot2 package]
  • Scientific journal color palettes [ggsci package]
  • Wes Anderson color palettes [wesanderson package]
  • R base color palettes: rainbow, heat.colors, cm.colors.

 

Note that the “rainbow” and “heat” color palettes are less perceptually uniform than the other color scales. The “viridis” scale stands out for its large perceptual range: it makes as much use of the available color space as possible while maintaining uniformity.

When comparing these color palettes as they might appear under various forms of colorblindness, the viridis palettes remain the most robust.
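
One rough way to check the uniformity claim is to look at how lightness changes along a palette. The sketch below uses only base R and assumes R 3.6 or later, where hcl.colors(n, "viridis") gives an approximation of the viridis scale (it is not the viridis package itself). The helper name lightness() is introduced here for illustration only.

```r
# Approximate perceptual lightness (CIE L*) of a vector of colors.
# lightness() is a hypothetical helper, not part of any package.
lightness <- function(cols) {
  rgb01 <- t(grDevices::col2rgb(cols)) / 255
  grDevices::convertColor(rgb01, from = "sRGB", to = "Lab")[, 1]
}

n <- 7
l_rainbow <- lightness(rainbow(n))
l_viridis <- lightness(hcl.colors(n, "viridis"))  # base R approximation of viridis

round(l_rainbow)
round(l_viridis)

# A sequential scale should get steadily lighter (or darker) from one end to the other
is.unsorted(l_viridis)  # FALSE: lightness increases steadily
is.unsorted(l_rainbow)  # TRUE: lightness goes up and down along the rainbow
```

A palette whose lightness rises and falls can suggest an ordering that is not in the data, which is one reason to prefer viridis-like scales for ordered values.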

Contents:

  • Demo dataset
  • Create a basic ggplot colored by groups
  • Viridis color palettes
  • RColorBrewer palettes
  • Grey color palettes
  • Scientific journal color palettes
  • Wes Anderson color palettes
  • R base color palettes
  • Conclusion

 

Demo dataset

We’ll use the R built-in iris demo dataset.

head(iris, 6)
##   Sepal.Length Sepal.Width Petal.Length Petal.Width Species
## 1          5.1         3.5          1.4         0.2  setosa
## 2          4.9         3.0          1.4         0.2  setosa
## 3          4.7         3.2          1.3         0.2  setosa
## 4          4.6         3.1          1.5         0.2  setosa
## 5          5.0         3.6          1.4         0.2  setosa
## 6          5.4         3.9          1.7         0.4  setosa

Create a basic ggplot colored by groups

You can change colors according to a grouping variable by:

  • Mapping the argument color to the variable of interest. This is applied to points, lines and text.
  • Mapping the argument fill to the variable of interest. This changes the fill color of areas, as in box plots, bar plots, histograms, density plots, etc.

 

In our example, we’ll map the options color and fill to the grouping variable Species, for scatter plot and box plot, respectively.

Change colors by group using the levels of the Species variable:

library("ggplot2")
# Box plot
bp <- ggplot(iris, aes(Species, Sepal.Length)) + 
  geom_boxplot(aes(fill = Species)) +
  theme_minimal() +
  theme(legend.position = "top")
bp

# Scatter plot
sp <- ggplot(iris, aes(Sepal.Length, Sepal.Width)) + 
  geom_point(aes(color = Species)) +
  theme_minimal()+
  theme(legend.position = "top")
sp

Viridis color palettes

The viridis R package (by Simon Garnier) provides color palettes to make beautiful plots that are: printer-friendly, perceptually uniform and easy to read by those with colorblindness.

Install and load the package as follows:

install.packages("viridis")  # Install
library("viridis")           # Load

The viridis package contains four sequential color scales: “Viridis” (the primary choice) and three alternatives with similar properties (“magma”, “plasma”, and “inferno”).

Key functions:

  • scale_color_viridis(): Change the color of points, lines and text
  • scale_fill_viridis(): Change the fill color of areas (box plot, bar plot, etc)
  • viridis(n), magma(n), inferno(n) and plasma(n): Generate color palettes for base plots, where n is the number of colors to return.

 

Note that the functions scale_color_viridis() and scale_fill_viridis() have an argument named option, a character string indicating the colormap to use. The four options covered here are “magma” (or “A”), “inferno” (or “B”), “plasma” (or “C”), and “viridis” (or “D”, the default). Recent versions of the package add further options, such as “cividis” (or “E”).

  1. Usage in ggplot2
library(ggplot2)
# Gradient color
ggplot(iris, aes(Sepal.Length, Sepal.Width))+
  geom_point(aes(color = Sepal.Length)) +
  scale_color_viridis(option = "D")+
  theme_minimal() +
  theme(legend.position = "bottom")

# Discrete color. use the argument discrete = TRUE
ggplot(iris, aes(Sepal.Length, Sepal.Width))+
  geom_point(aes(color = Species)) +
  geom_smooth(aes(color = Species, fill = Species), method = "lm") + 
  scale_color_viridis(discrete = TRUE, option = "D")+
  scale_fill_viridis(discrete = TRUE) +
  theme_minimal() +
  theme(legend.position = "bottom")

  2. Usage in base plot. Use the function viridis() to generate the number of colors you want:
barplot(1:10, col = viridis(10))
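
If you prefer not to load the viridis package, ggplot2 itself (version 3.0.0 and later) provides scale_color_viridis_d() and scale_fill_viridis_d() for discrete data, and scale_color_viridis_c() and scale_fill_viridis_c() for continuous data. A minimal sketch, assuming a recent ggplot2 (the object names p_discrete and p_continuous are ours):

```r
library(ggplot2)

# Discrete viridis colors, one per species, using ggplot2's built-in scale
p_discrete <- ggplot(iris, aes(Sepal.Length, Sepal.Width)) +
  geom_point(aes(color = Species)) +
  scale_color_viridis_d(option = "D") +
  theme_minimal()

# Continuous viridis gradient mapped to a numeric variable
p_continuous <- ggplot(iris, aes(Sepal.Length, Sepal.Width)) +
  geom_point(aes(color = Petal.Length)) +
  scale_color_viridis_c(option = "C") +
  theme_minimal()

p_discrete
p_continuous
```

These built-in scales take the same option values as the viridis package functions, so the two approaches are largely interchangeable.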

RColorBrewer palettes

The RColorBrewer package provides nice-looking color palettes. First install it as follows: install.packages("RColorBrewer").

To display all the color palettes in the package, type this:

library(RColorBrewer)
display.brewer.all()

The package contains 3 types of color palettes: sequential, diverging, and qualitative.

  1. Sequential palettes (first list of colors), which are suited to ordered data that progress from low to high (gradient). The palette names are: Blues, BuGn, BuPu, GnBu, Greens, Greys, Oranges, OrRd, PuBu, PuBuGn, PuRd, Purples, RdPu, Reds, YlGn, YlGnBu, YlOrBr, YlOrRd.
  2. Qualitative palettes (second list of colors), which are best suited to represent nominal or categorical data. They do not imply magnitude differences between groups. The palette names are: Accent, Dark2, Paired, Pastel1, Pastel2, Set1, Set2, Set3.
  3. Diverging palettes (third list of colors), which put equal emphasis on mid-range critical values and extremes at both ends of the data range. The palette names are: BrBG, PiYG, PRGn, PuOr, RdBu, RdGy, RdYlBu, RdYlGn, Spectral.
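
The same grouping can be inspected from R. The sketch below relies on brewer.pal.info, a data frame shipped with RColorBrewer that has one row per palette; the variable name qual_names is ours.

```r
library(RColorBrewer)

# brewer.pal.info: one row per palette, with its maximum number of colors,
# its category (div, qual or seq) and a colorblind-friendly flag
head(brewer.pal.info)

# Number of palettes in each category
table(brewer.pal.info$category)

# Names of the qualitative palettes only
qual_names <- rownames(brewer.pal.info)[brewer.pal.info$category == "qual"]
qual_names
```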

 

The RColorBrewer package also includes three important functions:

# 1. Return the hexadecimal color specification 
brewer.pal(n, name)

# 2. Display a single RColorBrewer palette 
# by specifying its name
display.brewer.pal(n, name)

# 3. Display all color palettes
display.brewer.all(n = NULL, type = "all", select = NULL,
                   colorblindFriendly = FALSE)

Description of the function arguments:

  • n: Number of different colors in the palette, minimum 3, maximum depending on palette.
  • name: A palette name from the lists above. For example name = "RdBu".
  • type: The type of palette to display. Allowed values are one of: “div”, “qual”, “seq”, or “all”.
  • select: A list of palette names to display.
  • colorblindFriendly: if TRUE, display only colorblind friendly palettes.
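
Because the maximum value of n depends on the palette, it can help to look it up before asking for colors. If you need more colors than a palette defines, one common workaround (not part of RColorBrewer itself) is to interpolate with the base R function colorRampPalette(). A minimal sketch, with max_n and dark2_12 as illustrative names:

```r
library(RColorBrewer)

# Maximum number of colors defined for the Dark2 palette
max_n <- brewer.pal.info["Dark2", "maxcolors"]
max_n

# Interpolate between the defined colors to obtain 12 colors.
# Interpolated colors of a qualitative palette can be harder to tell apart.
dark2_12 <- colorRampPalette(brewer.pal(max_n, "Dark2"))(12)
barplot(rep(1, 12), col = dark2_12)
```

Interpolation tends to work better for sequential and diverging palettes than for qualitative ones, where the in-between colors may look alike.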

 

To display only colorblind-friendly brewer palettes, use this R code:

display.brewer.all(colorblindFriendly = TRUE)

You can also view a single RColorBrewer palette by specifying its name as follows:

# View a single RColorBrewer palette by specifying its name
display.brewer.pal(n = 8, name = 'Dark2')

# Hexadecimal color specification 
brewer.pal(n = 8, name = "Dark2")
## [1] "#1B9E77" "#D95F02" "#7570B3" "#E7298A" "#66A61E" "#E6AB02" "#A6761D"
## [8] "#666666"

Usage in ggplot2. Two color scale functions are available in ggplot2 for using the colorbrewer palettes:

  • scale_fill_brewer() for box plot, bar plot, violin plot, dot plot, etc
  • scale_color_brewer() for lines and points
# Box plot
bp + scale_fill_brewer(palette = "Dark2")

# Scatter plot
sp + scale_color_brewer(palette = "Dark2")

Usage in base plots. The function brewer.pal() is used to generate a vector of colors.

# Barplot using RColorBrewer
barplot(c(2,5,7), col = brewer.pal(n = 3, name = "RdBu"))

Grey color palettes

Key functions:

  • scale_fill_grey() for box plot, bar plot, violin plot, dot plot, etc
  • scale_colour_grey() for points, lines, etc
# Box plot
bp + scale_fill_grey(start = 0.8, end = 0.2) 

# Scatter plot
sp + scale_color_grey(start = 0.8, end = 0.2) 
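
The arguments start and end are grey levels between 0 (black) and 1 (white). For base plots, the grDevices function gray.colors() accepts the same two arguments, so a comparable set of greys can be sketched as follows (the variable name greys is ours):

```r
# Three greys running from light (0.8) to dark (0.2), as in the ggplot2 example above
greys <- gray.colors(3, start = 0.8, end = 0.2)
greys

# Use them as fill colors in a base R box plot
boxplot(Sepal.Length ~ Species, data = iris, col = greys)
```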

Scientific journal color palettes

The R package ggsci contains a collection of high-quality color palettes inspired by colors used in scientific journals, data visualization libraries, and more.

The color palettes are provided as ggplot2 scale functions:

  • scale_color_npg() and scale_fill_npg(): Nature Publishing Group color palettes
  • scale_color_aaas() and scale_fill_aaas(): American Association for the Advancement of Science color palettes
  • scale_color_lancet() and scale_fill_lancet(): Lancet journal color palettes
  • scale_color_jco() and scale_fill_jco(): Journal of Clinical Oncology color palettes
  • scale_color_tron() and scale_fill_tron(): This palette is inspired by the colors used in Tron Legacy. It is suitable for displaying data when using a dark theme.

 

You can find more examples in the ggsci package vignettes.

Note that for base plots, you can use the corresponding palette generator for creating a list of colors. For example, you can use: pal_npg(), pal_aaas(), pal_lancet(), pal_jco(), and so on.

 

  1. Usage in ggplot2. We’ll use JCO and the Tron Legacy color palettes.
library("ggplot2")
library("ggsci")
# Change area fill color. JCO palette
ggplot(iris, aes(Species, Sepal.Length)) +
  geom_boxplot(aes(fill = Species)) +
  scale_fill_jco()+
  theme_classic() +
  theme(legend.position = "top")

# Change point color and the confidence band fill color. 
# Use tron palette on dark theme
ggplot(iris, aes(Sepal.Length, Sepal.Width)) +
  geom_point(aes(color = Species)) +
  geom_smooth(aes(color = Species, fill = Species)) + 
  scale_color_tron()+
  scale_fill_tron()+
  theme_dark() +
  theme(
    legend.position = "top",
    panel.background = element_rect(fill = "#2D2D2D"),
    legend.key = element_rect(fill = "#2D2D2D")
    )

  2. Usage in base plots
par(mar = c(1, 3.5, 1, 1))
barplot(1:10, col = pal_jco()(10))

Wes Anderson color palettes

Install the latest development version from GitHub (devtools::install_github("karthik/wesanderson")) or install from CRAN (install.packages("wesanderson")).

At the time of writing, the package contained 16 color palettes from Wes Anderson movies (newer releases may list more):

library(wesanderson)
names(wes_palettes)
##  [1] "BottleRocket1"  "BottleRocket2"  "Rushmore1"      "Royal1"        
##  [5] "Royal2"         "Zissou1"        "Darjeeling1"    "Darjeeling2"   
##  [9] "Chevalier1"     "FantasticFox1"  "Moonrise1"      "Moonrise2"     
## [13] "Moonrise3"      "Cavalcanti1"    "GrandBudapest1" "GrandBudapest2"

The key R function in the package for generating a vector of colors is:

wes_palette(name, n, type = c("discrete", "continuous"))
  • name: Name of desired palette
  • n: Number of colors desired. Unfortunately most palettes now only have 4 or 5 colors.
  • type: Either “continuous” or “discrete”. Use continuous if you want to automatically interpolate between colours.

 

If you need more colours than normally found in a palette, you can use a continuous palette to interpolate between existing colours.
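
As a small illustration of the difference between the two types, the sketch below asks for the five colors of Zissou1 as they are, then for 21 interpolated colors. It assumes the wesanderson package is installed; the variable names are ours.

```r
library(wesanderson)

# Zissou1 defines 5 colors; the discrete type returns them as they are
zissou_5 <- wes_palette("Zissou1", 5, type = "discrete")

# The continuous type interpolates between those colors to return more of them
zissou_21 <- wes_palette("Zissou1", 21, type = "continuous")

length(zissou_5)
length(zissou_21)
barplot(rep(1, 21), col = zissou_21, border = NA)
```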

Usage in ggplot2:

library(wesanderson)
# Discrete color
bp + scale_fill_manual(values = wes_palette("GrandBudapest1", n = 3))

# Gradient color
pal <- wes_palette("Zissou1", 100, type = "continuous")
ggplot(heatmap, aes(x = X2, y = X1, fill = value)) +
  geom_tile() + 
  scale_fill_gradientn(colours = pal) + 
  scale_x_discrete(expand = c(0, 0)) +
  scale_y_discrete(expand = c(0, 0)) + 
  coord_equal() 

Usage in base plots:

barplot(1:10, col = wes_palette("Zissou1", 10, type = "continuous"))

R base color palettes

There are 5 R base functions that can be used to generate a vector of n contiguous colors: rainbow(n), heat.colors(n), terrain.colors(n), topo.colors(n), and cm.colors(n).

Usage in R base plots:

# Use rainbow
barplot(1:5, col=rainbow(5))
# Use heat.colors
barplot(1:5, col=heat.colors(5))
# Use terrain.colors
barplot(1:5, col=terrain.colors(5))
# Use topo.colors
barplot(1:5, col=topo.colors(5))
# Use cm.colors
barplot(1:5, col=cm.colors(5))
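
The same base functions can also feed ggplot2 scales, since they simply return a vector of colors. A minimal sketch, assuming ggplot2 is installed (bp_base and sp_base are illustrative names):

```r
library(ggplot2)

# Discrete: one base R color per group, passed to a manual scale
bp_base <- ggplot(iris, aes(Species, Sepal.Length)) +
  geom_boxplot(aes(fill = Species)) +
  scale_fill_manual(values = topo.colors(3))

# Continuous: build a gradient from a base R palette
sp_base <- ggplot(iris, aes(Sepal.Length, Sepal.Width)) +
  geom_point(aes(color = Petal.Length)) +
  scale_color_gradientn(colours = terrain.colors(10))

bp_base
sp_base
```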

Conclusion

We presented the top R color palettes for customizing graphics generated by either the ggplot2 package or the R base functions. The main points are summarized as follows.

  • Create a basic ggplot. Map the color argument to a factor or grouping variable.
p <- ggplot(iris, aes(Sepal.Length, Sepal.Width))+
  geom_point(aes(color = Species))
p
  • Set the color palette manually using a custom color scale:
p + scale_color_manual(values = c("#00AFBB", "#E7B800", "#FC4E07"))
  • Use a colorblind-friendly palette:
cbp1 <- c("#999999", "#E69F00", "#56B4E9", "#009E73",
          "#F0E442", "#0072B2", "#D55E00", "#CC79A7")
p + scale_color_manual(values = cbp1)
  • Use RColorBrewer palettes:
p + scale_color_brewer(palette = "Dark2")
  • Use viridis color scales:
library(viridis)
p + scale_color_viridis(discrete = TRUE)
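
A custom palette such as the colorblind-friendly one can also be used in base plots. The sketch below is one possible approach: it indexes the palette by the integer codes of the grouping factor, so each species gets its own color.

```r
# Colorblind-friendly palette from the summary above
cbp1 <- c("#999999", "#E69F00", "#56B4E9", "#009E73",
          "#F0E442", "#0072B2", "#D55E00", "#CC79A7")

# Base plot: index the palette by the integer codes of the grouping factor
plot(iris$Sepal.Length, iris$Sepal.Width,
     col = cbp1[as.integer(iris$Species)], pch = 19,
     xlab = "Sepal.Length", ylab = "Sepal.Width")
legend("topright", legend = levels(iris$Species),
       col = cbp1[seq_along(levels(iris$Species))], pch = 19)
```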

