HOW TO EASILY MANIPULATE FILES AND DIRECTORIES IN R This article presents the fs R package, which provides a cross-platform, uniform interface to file system operations. fs functions are divided into four main categories: path_ for manipulating and constructing paths, file_ for files, dir_ for directories, and link_ for links. Contents: Prerequisites, Some Key R functions, Basic usage, Filter files, Read …
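As a rough illustration of the function prefixes named in the teaser above, the sketch below builds a path, creates a directory and two files, lists them with a filter, and cleans up. It assumes the fs package is installed; the directory name fs-demo and the file names are hypothetical, chosen only for this example. The link_ family is not shown.

```r
library(fs)

# path_*: build a platform-independent path inside the session temp directory
demo_dir <- path(tempdir(), "fs-demo")   # "fs-demo" is a hypothetical name

# dir_*: create the directory
dir_create(demo_dir)

# file_*: create two empty files (file names are illustrative)
file_create(path(demo_dir, c("notes.txt", "data.csv")))

# dir_ls() lists directory contents; glob keeps only the matching paths
txt_files <- dir_ls(demo_dir, glob = "*.txt")
print(path_file(txt_files))   # file names without the directory part

# path_ext() extracts the extension of each listed file
print(path_ext(dir_ls(demo_dir)))

# Remove the demo directory and everything in it
dir_delete(demo_dir)
```

Working inside tempdir() keeps the example from touching any real project files.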
GGHIGHLIGHT: EASY WAY TO HIGHLIGHT A GGPLOT IN R This article shows how to easily highlight a ggplot using the gghighlight package. Contents: Prerequisites, Line plot, Histogram, Scatter plot, Bar plot. Prerequisites: Load the required packages and set the default ggplot2 theme to theme_bw(). library(tidyverse); library(gghighlight); theme_set(theme_bw()) Line plot: Basic line plot. p <- ggplot( airquality, …
GGPLOT THEME BACKGROUND COLOR AND GRIDS This article shows how to change a ggplot theme background color and grid lines. The default ggplot2 theme has a grey background. You can quickly change this to a white background by using one of the theme functions, such as theme_bw(), theme_classic(), theme_minimal() or theme_light() (see the ggplot2 themes gallery). Another option is to modify directly …
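The two approaches mentioned in the teaser above can be sketched as follows. This is a minimal example that assumes ggplot2 is installed; the mtcars data set and the colors are chosen only for illustration. Because the teaser is cut off, the second option (overriding individual elements with theme()) is an assumption about what the article goes on to describe.

```r
library(ggplot2)

# Base plot with the default grey theme (mtcars is used only as sample data)
p <- ggplot(mtcars, aes(x = wt, y = mpg)) + geom_point()

# Option 1: switch to a complete theme with a white background
p_white <- p + theme_bw()

# Option 2 (assumed): keep the default theme but override single elements
p_custom <- p + theme(
  panel.background = element_rect(fill = "lightblue", colour = "grey50"),
  panel.grid.major = element_line(colour = "white"),
  panel.grid.minor = element_blank()   # hide the minor grid lines
)

print(p_white)
print(p_custom)
```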
GGPLOT AXIS LIMITS AND SCALES This article describes R functions for changing ggplot axis limits (or scales). We’ll describe how to specify the minimum and the maximum values of axes. Among the different functions available in ggplot2 for setting the axis range, coord_cartesian() is generally preferred because it zooms the plot without clipping the data. In …
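To make the difference between zooming and clipping concrete, here is a small sketch that assumes ggplot2 is installed. The mtcars data set and the range c(2, 4) are arbitrary choices for illustration. It compares coord_cartesian() with the xlim() scale shortcut by inspecting the data each plot keeps after it is built.

```r
library(ggplot2)

p <- ggplot(mtcars, aes(x = wt, y = mpg)) + geom_point()

# Zoom: every observation is kept; only the visible window changes
p_zoom <- p + coord_cartesian(xlim = c(2, 4))

# Scale limits: values outside the range are replaced by NA
# (the affected points are dropped with a warning when the plot is drawn)
p_clip <- p + xlim(2, 4)

# Count the x values that became NA in the built layer data
n_lost_zoom <- sum(is.na(layer_data(p_zoom)$x))
n_lost_clip <- sum(is.na(layer_data(p_clip)$x))

print(n_lost_zoom)  # no values lost when zooming
print(n_lost_clip)  # some values lost when setting scale limits
```

This matters most when a layer computes statistics, such as a smoothing line or a boxplot, because scale limits remove data before the statistic is computed, whereas zooming does not.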
CLUSTER ANALYSIS IN R SIMPLIFIED AND ENHANCED In R, standard clustering methods (partitioning and hierarchical clustering) can be computed using the R packages stats and cluster. However, the workflow generally requires multiple steps and multiple lines of R code. This article describes some easy-to-use wrapper functions, in the factoextra R package, for simplifying and improving cluster analysis in R. These …
(R Tutorials for Citizen Data Scientist) Statistics with R for Business Analysts – Nonlinear Least Square When modeling real-world data for regression analysis, the model equation is rarely a linear equation that gives a straight-line graph. Most of the time, the equation of the model …
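As a minimal sketch of nonlinear least squares in base R, the example below fits an exponential curve with nls() and compares it with a straight-line fit from lm(). The data are simulated and the model form y = a * exp(b * x) is an assumption made purely for illustration; neither is taken from the tutorial itself.

```r
set.seed(42)  # make the simulated data reproducible

# Simulated data (illustrative assumption): exponential growth plus noise
x <- seq(1, 10, length.out = 50)
y <- 2 * exp(0.3 * x) + rnorm(length(x), sd = 0.5)

# Nonlinear least squares: nls() needs a formula and starting values
fit_nls <- nls(y ~ a * exp(b * x), start = list(a = 1, b = 0.2))
print(coef(fit_nls))

# Straight-line fit for comparison
fit_lm <- lm(y ~ x)

# Residual sum of squares for each model (smaller means a closer fit)
rss_nls <- sum(resid(fit_nls)^2)
rss_lm  <- sum(resid(fit_lm)^2)
print(c(nls = rss_nls, lm = rss_lm))
```

Unlike lm(), nls() iterates from the supplied starting values, so poor starting values can prevent convergence.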
TensorFlow and Keras are two popular open-source tools used for machine learning and deep learning. They are often used together to build and train neural networks, which are a type of model that can be used for tasks such as image recognition, natural language processing, and more. One important technique used in training neural …
TensorFlow is an open-source software library developed by Google for machine learning. It is a powerful tool that can be used to build and train neural networks. Keras is a high-level library that runs on top of TensorFlow and is used to simplify the process of building and training neural networks. Together, TensorFlow and Keras …
Boosting is another ensemble learning method that is used to improve the performance of machine learning models. Like bagging, boosting combines the predictions of multiple models, but it does so in a different way. Instead of generating multiple subsets of the data and training a model on each subset, boosting trains a model on …
Random Forest is a type of ensemble learning algorithm that can be used for both classification and regression tasks. It works by building multiple decision trees and combining their predictions to make a final prediction. One of the advantages of Random Forest is that it can help to reduce overfitting, which is a common …