What Is R Programming?

The R programming language provides a suite of tools for data science tasks. With R, data analysts can streamline their workflow by producing reproducible analyses and reports on tabular data in a highly readable syntax. Trainees will learn to extract data from common file formats, define the appropriate transformations for their analysis, and produce informative visualizations using the ggplot2 package.

What Is Data Science?

Data science is a “concept to unify statistics, data analysis, machine learning and their related methods” in order to “understand and analyze actual phenomena” with data. It employs techniques and theories drawn from many fields within the context of mathematics, statistics, computer science, and information science.

Learning Objectives

Use the ggplot2 package to explore distributions and relationships in a dataset.

Use the dplyr and tidyr packages to manipulate data into a format amenable to analysis.

Understand common issues that arise when importing data from CSV and Excel files, and how to overcome them.
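
As a rough illustration of how the three objectives above fit together, the sketch below imports a small CSV file, reshapes and summarizes it with tidyr and dplyr, and plots it with ggplot2. The data, column names, and temporary file are invented for this example and are not course material. It also assumes a recent tidyr release (1.0.0 or later) so that pivot_longer() is available.

```r
library(dplyr)
library(tidyr)
library(ggplot2)

# Hypothetical data written to a temporary CSV so the example is self-contained.
csv_path <- tempfile(fileext = ".csv")
writeLines(c(
  "site,year_2019,year_2020",
  "A,10,12",
  "B,7,NA",
  "C,15,18"
), csv_path)

# Import: set stringsAsFactors explicitly, because its default
# changed between R 3.x and R 4.0.
raw <- read.csv(csv_path, stringsAsFactors = FALSE)

# Reshape to one row per site and year, then drop missing counts.
long <- raw %>%
  pivot_longer(cols = starts_with("year_"),
               names_to = "year",
               names_prefix = "year_",
               values_to = "count") %>%
  filter(!is.na(count)) %>%
  mutate(year = as.integer(year))

# Summarize: mean count per site.
site_means <- long %>%
  group_by(site) %>%
  summarise(mean_count = mean(count))

# Visualize: counts per year, with one bar per site.
p <- ggplot(long, aes(x = factor(year), y = count, fill = site)) +
  geom_col(position = "dodge") +
  labs(x = "Year", y = "Count", fill = "Site")
print(p)
```

Setting stringsAsFactors explicitly is one example of the import issues mentioned above: the same read.csv() call can otherwise return different column types depending on the installed R version.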

Other Requirements

Course Duration

This course runs for 3 days.

Technical Requirements

A laptop with the RStudio IDE and R (version 3.2 or above) installed is ideal. R’s system requirements are minimal; however, we recommend a system with 4 GB of RAM or more.

Trainees may also use RStudio Cloud, an online-hosted instance of R designed for data science training, provided a stable internet connection at the venue can be guaranteed.
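
For trainees who want to check their setup before the course, the short sketch below compares the installed R version with the minimum stated above and lists any of the packages named in this outline that are not yet installed. The variable names are illustrative only, and recent releases of these packages may require a newer R than 3.2.

```r
# Minimum version taken from the requirement above.
min_version <- "3.2.0"
installed_version <- as.character(getRversion())

if (getRversion() >= min_version) {
  message("R ", installed_version, " meets the minimum requirement.")
} else {
  message("R ", installed_version, " is older than ", min_version, "; please upgrade.")
}

# Packages named in this course outline that are not yet installed.
course_pkgs <- c("ggplot2", "dplyr", "tidyr")
missing_pkgs <- setdiff(course_pkgs, rownames(installed.packages()))

if (length(missing_pkgs) > 0) {
  message("Not installed: ", paste(missing_pkgs, collapse = ", "))
} else {
  message("All course packages are installed.")
}
```

Any missing packages can be installed from CRAN with install.packages(missing_pkgs).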


Familiarity with high school statistics is assumed. Prior programming experience will be helpful, but not essential.