Tag Archives: Data Science in R

Reshaping Data with R

In predictive modeling, data often has to be reshaped before it is ready for analysis or model building. Transforming the data into a clear, simple, and usable form is an integral part of data science. The most common reshaping process is converting the …

Coping with Missing, Invalid and Duplicate Data in R

A vital component of data science is cleaning the data and getting it ready for predictive modeling. The most common problem related to data cleaning is coping with missing data, invalid records and duplicate values. In this guide, you will learn about techniques for …

Visualization of Text Data Using Word Cloud in R

Visualization plays an important role in exploratory data analysis and feature engineering. However, visualizing text data can be tricky because it is unstructured. A word cloud is an excellent way to visualize text data in the form of tags, or words, where the importance …

Machine Learning with Text Data Using R

The domain of analytics that addresses how computers understand text is called Natural Language Processing (NLP). NLP has many applications, such as sentiment analysis, chatbots, AI agents, social media analytics, and text classification. In this guide, you will learn how to build a supervised machine …

Hypothesis Testing – Interpreting Data with Statistical Models

Building predictive models, or carrying out data science research, depends on formulating a hypothesis and drawing conclusions using statistical tests. In this guide, you will learn how to perform these tests using the statistical programming language R. The most widely used inferential statistical techniques …

Time Series Forecasting Using R

In this guide, you will learn how to implement the following time series forecasting techniques using the statistical programming language R:

1. Naive Method
2. Simple Exponential Smoothing
3. Holt’s Trend Method
4. ARIMA
5. TBATS

We will begin by exploring the data.

Problem Statement: Unemployment is a …

Interpreting Data Using Statistical Models with R

Statistical models are useful not only in machine learning, but also in interpreting data and understanding the relationships between variables. In this guide, you will learn how to fit and analyze statistical models for quantitative (linear regression) and qualitative (logistic regression) target variables. …

Data Science in R: Interpreting Data Using Descriptive Statistics with R

Descriptive statistics are the foundation for summarizing data. They are divided into measures of central tendency and measures of dispersion. Measures of central tendency include the mean, median, and mode, while measures of dispersion (variability) include the standard deviation, variance, and interquartile range. …

Data Analytics – DISPLAY A BEAUTIFUL SUMMARY STATISTICS IN R USING SKIMR PACKAGE

This article describes how to quickly display summary statistics using the R package skimr. skimr handles different data types and returns a skim_df object, which can be included in a tidyverse pipeline or displayed nicely for the human reader. Key features of skimr: Provides a larger set of statistics than …