The R programming language provides a suite of useful tools for performing data science tasks. R enables data analysts to supercharge their analysis workflow by producing reproducible analyses and reports on tabular data in a highly readable syntax. Trainees will learn to extract data from common file formats, define appropriate transformations for their analysis, and produce informative visualizations using the ggplot2 package.
Data science is a “concept to unify statistics, data analysis, machine learning and their related methods” in order to “understand and analyze actual phenomena” with data. It employs techniques and theories drawn from many fields within the context of mathematics, statistics, computer science, and information science.
Use the ggplot2 package to explore distributions and relationships in a dataset.
Use the dplyr and tidyr packages to manipulate data into a format amenable to analysis.
Understand common issues that arise when importing data from CSV and Excel files, and how to overcome them.
This course runs for 3 days.
A laptop with the RStudio IDE and R (version 3.2 or above) installed is ideal. R’s system requirements are minimal; however, we recommend a system with 4 GB of RAM or more.
Trainees may also use RStudio Cloud, an online-hosted instance of R designed for data science training, provided a stable internet connection at the venue can be guaranteed.
Familiarity with high school statistics is assumed. Prior programming experience will be helpful, but not essential.