This lesson is a general introduction to R, a programming language for statistical
computing, and will give you foundational skills for exploratory data analysis.
Our sessions will be very hands-on, with a strong emphasis on data visualisation and
manipulation using a collection of packages known as the tidyverse.
We will use a generic dataset from the Gapminder Foundation, but the skills we learn here apply to a wide range of datasets.
Along the way, we will learn not just R itself but also, importantly, the fundamental principles of exploratory data analysis. In that sense, R will be taught as a tool that gives us the freedom to ask a range of questions of our data in a reproducible manner. We will discuss topics such as how to critically evaluate the quality of our data, how data can be used to answer specific questions, how to explore sources of variation, what makes a good visualisation, and how to deal with missing data.
Prerequisites
These lessons assume no prior knowledge of the skills or tools covered.
You will need a computer with a working copy of R and RStudio. Please make sure to install everything before working through this lesson.
Follow the instructions on the “Setup” tab to install the software and download the necessary data.