3 R & RStudio

R and RStudio are separate downloads and installations. R is the underlying programming language, but using R alone can be daunting – we’d have to write our R files in a text editor and then run the scripts in the terminal. RStudio is a graphical integrated development environment (IDE) that makes using R much easier and more interactive. To function correctly, RStudio needs R and therefore both need to be installed on your computer.

R is an implementation of the S programming language combined with lexical scoping semantics inspired by Scheme. R is named partly after the first names of the first two R authors and partly as a play on the name of S. We’ll use RStudio to write our code, navigate the files on our computer, inspect the variables we are going to create, and visualize the plots we will generate.

3.1 Why learn R?

  1. There is a lot of built-in help for reproducibility!
    • R does not involve lots of pointing and clicking, and that’s a good thing! The learning curve is steeper than SAS or STATA, but the results of your work do not rely on remembering a succession of pointing and clicking – so if you want to redo your analysis because you collected more data, you don’t have to remember which button you clicked in which order to obtain your results; you just have to run your script again.
    • Working with scripts makes the steps you used in your analysis clear, and the code you write can be inspected by someone else who can give you feedback and spot mistakes.
    • Your R code integrates with other tools (RMarkdown!) to generate manuscripts from your code. If you collect more data, or fix a mistake in your dataset, the figures and the statistical tests in your manuscript are updated automatically.
  2. R has a wide adoption across domains & works with any kind of data
    • 10,000+ packages that can be installed to extend its stock stats capabilities!
    • For instance, R has packages for image analysis, GIS, time series, population genetics, and a lot more.
    • You can use it with any type of data – there’s even a QDA package in R!
  3. The skills you learn with R scale easily with the size of your dataset. Whether your dataset has hundreds or millions of lines, it won’t make much difference to you.

3.2 Getting Started

We are going to create a new project in RStudio and then write some R Markdown files! When a new project is created RStudio:

  • Creates a project file (with an .Rproj extension) within the project directory. This file contains various project options and you can also double-click it to start RStudio again.
  • Creates a hidden directory (named .Rproj.user) where project-specific temporary files (e.g. auto-saved source documents, window-state, etc.) are stored.
  • Loads the project into RStudio and displays its name in the Projects toolbar (which is located on the far right side of the main toolbar).

3.2.1 Setting up our project space

  1. Start RStudio.
  2. Under the File menu, click on New project. Choose Existing directory, and pick your git repository project folder from yesterday.
  3. Click on Create project.
Creating a new project in existing directory

You’ll notice immediately that RStudio is divided into 4 “Panes”: the editing window for your scripts and documents (top-left, in the default layout), your Environment/History/Git (top-right), your Files/Plots/Packages/Help/Viewer (bottom-right), and the R Console (bottom-left). The placement of these panes and their content can be customized (see menu, Tools -> Global Options -> Pane Layout).

One of the advantages of using RStudio is that all the information you need to write code is available in a single window. Additionally, with many shortcuts, autocompletion, and highlighting for the major file types you use while developing in R, RStudio will make typing easier and less error-prone.

Now we have a blank R project to work from! You might be wondering, “why do we need a project? Can’t I just have an RMarkdown file and call it a day?” Well, sure, but then you’d be missing a few key pieces that only come from using projects:

  • Integration with git! If you aren’t using an R project, you’ll have to go through the steps of adding, committing, and pushing on the terminal like we did yesterday.
  • Automatically sets you up on the relevant working directory
  • Keeps all the files associated with a project organized together

But, this is a reproducibility course! And R has one really great package that helps us with reproducibility, called packrat. packrat stores package dependencies inside a project, rather than relying on the personal R library that is shared across all R sessions. Why I like Packrat:

  • Isolated: Gives each project its own private package library.
  • Portable: Easily transport projects from one computer to another, even across different platforms. Packrat makes it easy to install the packages your project depends on.
  • Reproducible: Packrat records the exact package versions code depends on, and ensures those exact versions are the ones that get installed.

Here’s a good overview: r-bloggers.com/creating-reproducible-software-environments-with-packrat/. So let’s try it! On the R console (bottom-left pane in RStudio’s default layout):
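A minimal sketch of what to type, assuming packrat is not installed yet and that you are happy to install it from the CRAN cloud mirror. Run it from inside the project you just created:

```r
# Assumption: packrat is not installed yet, so install it from CRAN first
install.packages("packrat", repos = "https://cloud.r-project.org")

# Give the current project its own private package library
packrat::init()
```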


This might take a few minutes, because it is going to get all the source code of all the libraries we have currently instantiated in our project.

While packrat runs, let’s do other stuff! Remember yesterday when we talked about the best way to organize your file structure? Well, there are multiple R packages for this! These are the two most popular:

  1. rrtools: creates a basic project template to work from, called a ‘compendium’. It also allows for isolation of your computational environment using Docker, package versioning using MRAN, and continuous integration using Travis.
  2. ProjectTemplate: sets up an ideal directory structure for project management, though the file structure it creates is very large.
A good general outline for project structure.

These will be total overkill for our purposes, but are very useful when starting out a new project from scratch.


  1. In RStudio, create a doc folder, a data folder, and a results folder (if you didn’t already create them in the OpenRefine lesson).
  2. Make the doc folder your working directory.
  3. Raise your hand to show you’ve finished!

When packrat is done executing, you’ll also see a packrat folder. Leave that alone – it’ll have to be our one outstanding folder!

3.3 Literate Programming with RMarkdown

So let’s get into R now. We are going to write our R code using RMarkdown. RMarkdown is an extension of Markdown. It works sort of like an executable paper – it mixes documentation & code, and not just R! You can insert code snippets from other languages (SQL, bash, Python, and more!). This allows you to write documents which integrate results from your analysis. Incorporating results directly into your documents is an important step in reproducible research. Any changes that occur in either your data set or the analysis are automatically updated in your document the next time the document is created.

From Kieran Healy, http://plain-text.co: a good workflow for reproducible report writing.

The five most important aspects of RMarkdown:

  1. RMarkdown belongs to the field of literate programming which is about weaving text and source code into a single document.
  2. RMarkdown is one possible solution (an alternative is Jupyter Notebooks, though you can run R in Jupyter!) which allows you to combine text written in Markdown and source code written in R (and other languages).
  3. You can do the entire analysis pipeline in an RMarkdown document: Data (pre-)processing, analysis, outputs, visualisation.
  4. This document can be compiled into several output formats, such as PDF and HTML.
  5. Packages such as rticles include templates (e.g. ACM, Elsevier) which can be used to create submission-ready documents.

Five benefits of RMarkdown for your daily work:

  1. You can keep an eye on text (the paper) AND the source code. These computational steps are essential to ensure computational reproducibility.
  2. You can easily share R Markdown documents with colleagues, as supplemental material, or as the paper under review. Thanks to the package knitr, others can execute the document with a single click and receive, for example, HTML or PDF renderings.
  3. Figures get automatically updated if you change the underlying parameters in the code. The error-prone task of exporting figures and uploading the right figure version to another platform is thus not needed anymore.
  4. If you do not make any changes to the document after creating the output document, you can be sure that the paper was executable at least at the time of submission. Since Markdown is a text-based format, you can also use versioning with git.
  5. You can refer to the corresponding code lines in the methodology section making it unnecessary to use pseudo code, high-level textual descriptions, or just too many words to describe the analysis.

3.3.1 Creating a reproducible document in RMarkdown

Ok, you should be in your doc folder now. To create a new R Markdown file, go to File > New File > R Markdown.

Some notes about R. What are known as objects in R are known as variables in many other programming languages. I am going to call them objects. Depending on the context, object and variable can have drastically different meanings. For more information see: https://cran.r-project.org/doc/manuals/r-release/R-lang.html#Objects

Create code chunks and text

Let’s look at some of the basics in R! Right now, this is just narrative text written out with no special wrappers. Just text in a text box. But! I want to show you how R works. So, I need to insert an R chunk. In the source code pane, you might see an icon with a C, a plus sign, and the word Insert. If you click that, you can choose to insert a code chunk of various types. We want R right now.

This inserts a wrapper (a line of three backticks followed by {r}, and a closing line of three backticks) that tells R Markdown that it needs to run some R code now:

# blank code chunk

We can run code chunks by either clicking the > symbol in the top-right corner of the code chunk itself, or by using the --> Run menu near the Insert menu. There, you can choose to run the current code chunk, all the code chunks, or even selected lines within a chunk.

Ok, let’s get into R and look at how to make objects. We assign values with the assignment operator, <-. We use this for objects, for data frames – for everything that needs an assignment!

cool_num <- 100    # assigns the object a value and a name to call it by
(cool_num <- 100)  # but putting parentheses around the call prints the value of `cool_num`
## [1] 100
cool_num           # and so does typing the name of the object
## [1] 100


  1. Write some text in your RMarkdown file explaining what you’re about to do in the code chunk directly below it.
  2. Add an R code chunk, and assign an object a value with a name.
  3. Print out your object.
  4. Raise your hand to show you’ve finished!

If you look at the Environment pane, you’ll see our variable cool_num! It’s been loaded into memory. Now, we can do other stuff with it, like call it in conjunction with maths!

other_num <- 2.2 * cool_num # doing some arithmetic and assigning it to a new object
other_num                   # typing the name prints the new value
## [1] 220


  1. In a new code chunk, create a new object based on the first object. Don’t forget to add some text above it explaining what you’re about to do!
  2. Raise your hand to show you’ve finished!

One of the main reasons people use R is the bevy of functions. Functions are built into R and extended with R packages. Functions help automate more complicated sets of commands. We’ll only use predefined functions, but you can also define your own as your R proficiency grows!

A function usually gets one or more inputs called arguments. Functions often (but not always) return a value.

We’ll take a look at the function sqrt(). The input (the argument) must be a number, and the return value (in fact, the output) is the square root of that number. Executing a function (‘running it’) is called calling the function. Let’s find the square root of our variable other_num:

sqrt(other_num)
## [1] 14.8324

The return ‘value’ of a function doesn’t have to be a number or even a single item! It can be a whole dataset. We’ll see that later.
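As a small illustration of arguments and return values, here is a sketch of a user-defined function. The name `describe_num` and its `digits` argument are made up for this example and are not part of R or of this lesson’s data:

```r
# A hypothetical helper, written only for illustration
describe_num <- function(x, digits = 2) {
  # Return two named values at once: the rounded square root and the square
  c(root = round(sqrt(x), digits), square = x^2)
}

describe_num(220)               # uses the default for digits
describe_num(220, digits = 4)   # overrides the default
```

The first argument is required, while `digits` has a default, so it only needs to be supplied when you want something other than 2.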


  1. Type library(help = "base") in the console in R. If you need more information, use ?functionName substituting functionName for the actual name of the function.
  2. Find a function that gives us back a numerical or Boolean value.
  3. Add an R code chunk, and call the function from it using one of our previous objects.
  4. Raise your hand to show you’ve finished!

Looking at data types

The first main data type in R is a vector. A vector is composed of a series of values, which can be either numbers or characters.

vec1 <- c(4.6, 4.8, 5.2, 6.3, 6.8, 7.1, 7.2, 7.4, 7.5, 8.6)
vec2 <- c('UT', 'IA', 'MN', 'AL', 'WI', 'MN', 'OH', 'IA', 'NY', 'IA')
vec1   # typing the name prints the vector
##  [1] 4.6 4.8 5.2 6.3 6.8 7.1 7.2 7.4 7.5 8.6
vec2
##  [1] "UT" "IA" "MN" "AL" "WI" "MN" "OH" "IA" "NY" "IA"

You can take the values from vectors and put them into another wholly different vector! You’ll notice I use 1, 2, etc. instead of the actual numbers in my vector. That’s because I have to give the location of the data point inside the vector, not the actual data point itself. R is one of the few languages that start their index at 1 (most use 0).

more_vec <- vec1[c(1, 2, 3, 2, 1, 4)]
more_vec
## [1] 4.6 4.8 5.2 4.8 4.6 6.3

You can also subset vectors to find data within them, using the == operator, which tests for equality (are the two values the same?). Another useful approach is to use < and > to find values less than or greater than some threshold. Conditions can be combined with |, which means “or”.

vec1[vec1 < 2 | vec1 > 5]
## [1] 5.2 6.3 6.8 7.1 7.2 7.4 7.5 8.6
vec2[vec2 == "MN" | vec2 == "OH"]
## [1] "MN" "MN" "OH"
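To see why this works, it can help to look at the logical vector that a comparison produces before it is used as a subset. A minimal sketch, reusing the values of vec1 from above (the name `is_big` is only illustrative):

```r
vec1 <- c(4.6, 4.8, 5.2, 6.3, 6.8, 7.1, 7.2, 7.4, 7.5, 8.6)

# A comparison returns one TRUE/FALSE per element of the vector
is_big <- vec1 > 7
is_big

# Using that logical vector inside [ ] keeps only the TRUE positions
vec1[is_big]

# sum() counts the TRUE values, because TRUE is treated as 1
sum(is_big)
```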

R was made for data analysis, so it stands to reason that there is functionality to deal with the everyday messiness of data. This includes dealing with missing data too, which we see a lot in everyday life. Take this vector, cat_toys, which has one missing value. When I go to run some basic math, the result is NA:

cat_toys <- c(20, 14, 4, NA, 60)
mean(cat_toys)   # one missing value makes the whole result NA
## [1] NA
max(cat_toys)
## [1] NA
# you can even do math right on vectors!
expo_vec <- vec1*5
expo_vec
##  [1] 23.0 24.0 26.0 31.5 34.0 35.5 36.0 37.0 37.5 43.0

However, R has a way to deal with that! You can add the argument na.rm=TRUE to calculate the result while ignoring the missing values.

mean(cat_toys, na.rm = TRUE)
## [1] 24.5
max(cat_toys, na.rm = TRUE)
## [1] 60
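Before dropping missing values, it is often worth checking how many there are. A short sketch using is.na() from base R on the same cat_toys vector:

```r
cat_toys <- c(20, 14, 4, NA, 60)

# is.na() flags each missing value with TRUE
is.na(cat_toys)

# Count how many values are missing
sum(is.na(cat_toys))

# Keep only the values that are present
cat_toys[!is.na(cat_toys)]
```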


  1. Write a code chunk that creates this vector: heights <- c(63, 69, 60, 65, NA, 68, 61, 70, 61, 59, 64, 69, 63, 63, NA, 72, 65, 64, 70, 63, 65)
  2. Use R to figure out how many people in the set are taller than 67 inches.
  3. Use the function median() to calculate the median of the heights vector.
  4. Raise your hand to show you’ve finished!

However, the most popular data type in R is surely the data frame. This is what most folks use for tabular data, statistics, and plotting. Data frames are made of vectors.

A data frame can be created by hand, but most commonly they are generated by the functions read.csv() or read.table(); in other words, when importing spreadsheets from your hard drive or the web.

A data frame is essentially a table where the columns are vectors that all have the same length. Because columns are vectors, each column must contain a single type of data (e.g., characters, integers, factors). For example, here is a figure depicting a data frame comprising a numeric, a character, and a logical vector.

Data frame image; from data carpentry ecology lesson
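To make the idea concrete, here is a sketch of a data frame built by hand from a character, a numeric, and a logical vector. The `pets` data is made up for illustration and is not part of the lesson’s dataset:

```r
# Hypothetical toy data, made up for illustration
pets <- data.frame(
  name   = c("Miso", "Pepper", "Olive"),   # character vector
  weight = c(4.2, 5.1, 3.8),               # numeric vector
  indoor = c(TRUE, FALSE, TRUE),           # logical vector
  stringsAsFactors = FALSE                 # keep names as plain characters
)

str(pets)      # shows the type of each column
pets$weight    # a single column is just a vector
```

All three vectors have the same length, which is what lets them sit side by side as columns.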

Ok, let’s use our own data now!

Your datasets and files are here(). We have some external datasets we want to include in our RMarkdown file. What most people do is something like:

# source("/home/vicky/Downloads/2018-uutah-repro/anotherRscript.R")

This works fine. For me. On my computer. Not on yours. We want to work reproducibly, so we need to use something called relative paths. Hard-coded absolute paths are one of the most common errors in sharing, and they disrupt the entire execution process of our executable paper. So let’s try the following:

  install.packages("here", repos = "https://cloud.r-project.org")

Alright, here() starts relative to the folder structure on my laptop. This means, when using it on your machine, it starts relative to the folder structure on your own machine. This allows us to provide all directories relative to the top directory of the project folder, which is in this case /utah-project/. Now, our scripts will work on my machine and on yours. Ok, let’s put our cleaned data from OpenRefine into a data frame!
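One way to build such a project-relative path is sketched below. It assumes the here package is installed and falls back to a plain relative path if it is not; the names `rel_path` and `csv_path` are only illustrative:

```r
# Path of the data file relative to the top of the project
rel_path <- file.path("results", "2018-06-12_universityData.csv")

# Assumption: the here package is installed; otherwise fall back to the
# plain relative path, which only works from the project's top directory
if (requireNamespace("here", quietly = TRUE)) {
  csv_path <- here::here(rel_path)
} else {
  csv_path <- rel_path
}
csv_path

# Only read the file if it is really there
if (file.exists(csv_path)) {
  uni <- read.csv(csv_path)
}
```

Because the path is assembled from the project root, the same chunk should work no matter which subfolder the R Markdown file lives in.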

uni <- read.csv("results/2018-06-12_universityData.csv")

Now in our Environment pane, you should see two new rows – a merged row called Data and then our new data frame uni! Let’s click that and see what happens (hint: it’ll let us view our data frame in a very nice way!). If we want to see the first few rows of data, we can use the head() function, and if we want to see the first column, we call the data frame with the location of what we want – it goes dataFrame[row, col], and a single index such as uni[1] selects a whole column:

head(uni) # see first six rows
##                universities endowment country      lat        long
## 1         Paris Universitas        15  France 46.22764    2.213749
## 2         Paris Universitas        15  France 46.22764    2.213749
## 3 Lumière University Lyon 2       121  France 46.22764    2.213749
## 4     Confederation College   4700000  Canada 56.13037 -106.346771
## 5    Rocky Mountain College  16586100     USA 37.09024  -95.712891
## 6    Rocky Mountain College  16586100     USA 37.09024  -95.712891
##   established
## 1        2005
## 2        2005
## 3        1835
## 4        1967
## 5        1878
## 6        1878
head(uni[1])  # see the first column
##                universities
## 1         Paris Universitas
## 2         Paris Universitas
## 3 Lumière University Lyon 2
## 4     Confederation College
## 5    Rocky Mountain College
## 6    Rocky Mountain College
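The different ways of pointing at rows and columns can be sketched on a small data frame. The `toy` data below is made up for illustration, so the same patterns can be tried without the university file:

```r
# Hypothetical toy data frame, made up for illustration
toy <- data.frame(
  city  = c("Oslo", "Lima", "Pune", "Baku"),
  pop_m = c(0.7, 10.1, 3.1, 2.3),
  stringsAsFactors = FALSE
)

toy[1, 1]        # row 1, column 1: a single cell
toy[1:3, 2]      # rows 1 to 3 of column 2
toy[2, ]         # all columns of row 2
toy[, "city"]    # all rows of the column named city
toy$pop_m        # the same idea, using $ and the column name
```

Leaving the row or column position empty means “everything” along that dimension.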


  1. Load in our cleaned data using read.csv()
  2. Find the first cell in the 1st column using R.
  3. Find the first three cells in the 4th column using R.
  4. Raise your hand to show you’ve finished!

Useful data frame functions


  • dim(uni) - returns a vector with the number of rows in the first element, and the number of columns as the second element (the dimensions of the object)
  • nrow(uni) - returns the number of rows
  • ncol(uni) - returns the number of columns
  • head(uni) - shows the first 6 rows
  • tail(uni) - shows the last 6 rows
  • names(uni) - returns the column names (synonym of colnames() for data.frame objects)
  • summary(uni) - summary statistics for each column
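These functions work on any data frame. A quick sketch using iris, a small dataset that ships with R and is used here only so the calls can be run without the university file:

```r
dim(iris)                    # number of rows, then number of columns
nrow(iris)
ncol(iris)
names(iris)                  # the column names
head(iris, n = 3)            # n changes how many rows are shown
summary(iris$Sepal.Length)   # summary statistics for one column
```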

We can also subset our data frames by column, like this chunk which takes all the names of universities and puts them into another data frame:

unis <- uni["universities"] # can also do `uni[1]`, I passed it name of column
head(unis)                  # look at the first six rows of the result
##                universities
## 1         Paris Universitas
## 2         Paris Universitas
## 3 Lumière University Lyon 2
## 4     Confederation College
## 5    Rocky Mountain College
## 6    Rocky Mountain College


  1. Subset uni to create a new dataframe called uni_money that contains the university name and the endowment.
  2. Compute some summary statistics about the levels of endowments per university.
  3. Raise your hand to show you’ve finished!

uni_money <- subset(uni, select = c("universities", "endowment")) # make a new subset
head(uni_money) # make sure it looks right
##                universities endowment
## 1         Paris Universitas        15
## 2         Paris Universitas        15
## 3 Lumière University Lyon 2       121
## 4     Confederation College   4700000
## 5    Rocky Mountain College  16586100
## 6    Rocky Mountain College  16586100
# make the endowment column a number
uni_money$endowment <- as.numeric(as.character(uni_money$endowment))
## Warning: NAs introduced by coercion
# deduplicate the data and sum the rows so we don't lose data!
library(plyr)  # provides ddply() and the .() helper
dedupe_unimoney <- ddply(uni_money, .(universities), function(x) {
  data.frame(universities = x$universities[1], endowment = sum(x$endowment))
})

# get the first 6 rows of our newly deduped and summed rows
head_dedupedMoney <- head(dedupe_unimoney)
head_dedupedMoney
##          universities  endowment
## 1   Aarhus University 4.5864e+10
## 2   Acadia University 5.1840e+10
## 3  Adelphi University 8.6000e+07
## 4 Agnes Scott College 2.3060e+08
## 5        AIIMS Bhopal         NA
## 6       AIIMS Jodhpur         NA
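The same deduplicate-and-sum idea can also be sketched in base R with aggregate(), which avoids the extra package. The `toy_money` data is made up for illustration:

```r
# Hypothetical toy data with a duplicated university, for illustration only
toy_money <- data.frame(
  universities = c("A College", "A College", "B University"),
  endowment    = c(100, 100, 250),
  stringsAsFactors = FALSE
)

# One row per university, with the endowments of duplicate rows summed
toy_summed <- aggregate(endowment ~ universities, data = toy_money, FUN = sum)
toy_summed
```

One difference to be aware of: with the formula interface, aggregate() drops rows with missing values by default, so a university whose endowment is NA would disappear from the result rather than show up as NA.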

Now that we have some clean, relevant information to use, let’s plot it! I mainly use R for plotting purposes myself. I think the range and options are vastly superior to other data viz packages out there. I can even get plots with DPIs enough for print!

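For now, let’s do a basic bar chart of endowment per university:

library(ggplot2)
# geom_histogram(stat = "identity") draws one bar per row; geom_col() is the
# idiomatic equivalent and avoids the first warning below
p <- ggplot(head_dedupedMoney, aes(x = universities, y = endowment)) + geom_histogram(stat = "identity")
p  # print the plot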
## Warning: Ignoring unknown parameters: binwidth, bins, pad
## Warning: Removed 2 rows containing missing values (position_stack).


  1. Pick another relevant plot using ggplot’s functions and plot the data a different way.
  2. BONUS: add some color!
  3. Raise your hand to show you’ve finished!

We can also do some mapping in R and make it interactive with leaflet! First we need to make another subset, with just the relevant information – the universities and their geographic locations.

uni_locs <- subset(uni, select = c("universities", "lat", "long")) # make a new subset for the map
uni_plot <- unique(uni_locs) # deduplicate the data!

head(uni_plot) # make sure it looks right
##                 universities      lat        long
## 1          Paris Universitas 46.22764    2.213749
## 3  Lumière University Lyon 2 46.22764    2.213749
## 4      Confederation College 56.13037 -106.346771
## 5     Rocky Mountain College 37.09024  -95.712891
## 7     Idaho State University 37.09024  -95.712891
## 19       University of Milan       NA          NA

Ok, let’s map!

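library(leaflet)
# start a map from our data; addTiles() (assumed here) adds the default base map
m <- leaflet(uni_plot) %>% addTiles()
m %>% setView(-72.690940, 41.651426, zoom = 8)
m %>% addMarkers(~long, ~lat, popup = ~as.character(universities), label = ~as.character(universities))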
## Warning in validateCoords(lng, lat, funcName): Data contains 37 rows with
## either missing or invalid lat/lon values and will be ignored
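The warning comes from rows without coordinates. If you would rather drop those rows yourself before mapping, one option is complete.cases() from base R. A sketch on made-up data (`toy_locs` is hypothetical):

```r
# Hypothetical toy data, for illustration only
toy_locs <- data.frame(
  universities = c("A College", "B University", "C Institute"),
  lat  = c(46.2, NA, 37.1),
  long = c(2.2, NA, -95.7),
  stringsAsFactors = FALSE
)

# complete.cases() is TRUE for rows with no missing values in those columns
has_coords <- complete.cases(toy_locs[, c("lat", "long")])
toy_mappable <- toy_locs[has_coords, ]
toy_mappable
```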

3.3.2 Knit the document and get your final file!

Ok, we have some:

  • data cleaning steps (subsetting)
  • data analysis steps (summary statistics)
  • narrative (our text throughout)
  • plots (our map)

This way of publishing research results allows others to reuse your code and give you appropriate credit. For example, others might be interested in the same way of illustrating their data on the map, but for another region or another university dataset!

To knit, click the Knit button at the top of the editor pane. You can choose between HTML, PDF, and other output formats. Note: not every template supports HTML, so choose wisely!

Knitting a document

A browser should open up, and you’ll be able to see a complete record of your process!
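Knitting can also be started from the R console. A minimal sketch, assuming the rmarkdown package is installed; the file name `my-analysis.Rmd` is hypothetical, so substitute your own:

```r
rmd_file <- "my-analysis.Rmd"   # hypothetical file name; use your own

# Assumption: the rmarkdown package is installed and the file exists
if (requireNamespace("rmarkdown", quietly = TRUE) && file.exists(rmd_file)) {
  rmarkdown::render(rmd_file, output_format = "html_document")
}
```

Rendering from code is handy when the document needs to be rebuilt by a script rather than by a click.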


Would you rather have one giant R Markdown file for your whole process, or a separate one for each step? Why/why not?

3.4 BONUS: Git & RStudio

  1. Open RStudio
  2. Click Tools -> Global Options -> Git/SVN
  3. You should be able to see that git has a program associated with it. If Git executable shows ‘(none)’, click Browse and select the git executable installed on your system.
    • On a Mac, this will likely be one of the following: /usr/bin/git, /usr/local/bin/git, or /usr/local/git/bin/git
    • On Windows, git.exe will likely be somewhere in Program Files or Program Files (x86).
  4. Click OK
  5. Restart RStudio
Configuring git in RStudio

3.4.1 Adding Git to a project

Version control in RStudio can only be done on the project level. To use git with RStudio, you need to either add git to an existing project or start a new project with git enabled from the start. To add git to a new project in RStudio, all you need to do is check a box!

  1. Open RStudio
  2. Click File -> New Project -> New Directory -> Empty Project
    • Check Create a git repository for this project
Adding git to a new RStudio project

To add git to an existing project in RStudio:

  1. Open your project in RStudio (click File -> Open Project)
  2. Click Tools -> Project Options
  3. In Project Options, click the Git/SVN tab.
  4. Change the “Version Control System” from “None” to “Git”
  5. [Optional] Add a link to the remote repository.
Adding git to an existing RStudio project

3.4.2 Working with Git in your project

So, just by virtue of doing your normal work within a git repository, you are in the working directory state. Say you want to tell git about some changes you’ve made to your files. You need to add them! Here’s what it looks like in RStudio when you have files that are untracked:

Unstaged files in RStudio

You can see here that there is a bright yellow ? next to the files that are untracked, and a green A next to the files that have been added. To add those two untracked images, just double click the question mark or check the check box! Then we’ll have it all staged.

Adding files in RStudio

Git also lets you choose which parts of the files you want to commit. Say you’re working on some analysis notebooks. One is done, but the other is unfinished. You’d like to make a commit and go home (5 o’clock, finally!) but wouldn’t like to commit the parts of the second notebook, which is not done yet. You stage the parts you know belong to the first notebook, and commit.

All that to say – you commit your changes after you finish adding everything you want to for the moment. When you commit a file, you are telling git that this is the new version of a file. To make a commit in RStudio you must:

  • Click the Git tab
  • Check Staged next to the files you’ve added
  • Click Commit
  • Type a message in Commit message
  • Click Commit
Commit message in RStudio

You can see what has changed in a given file since its last commit in this window as well:

Full commit screen in RStudio

Git has recorded a complete history of your work. To see all the changes for the project, you just go to the Git tab and click the History button!

Git history in RStudio

Sometimes we make a change that doesn’t work so well in the end. In the event of errors or inconsistencies in your work, you can browse through your history, find the change that’s to blame, and restore your previous good work. It might be a straight-up error, or you decide that what you wrote isn’t the best way to do something. In this case, we’ll need to revert your change!

  • Go to the Git tab in RStudio
  • Click Diff
  • Select a file or a lot of files, view the differences, etc.
  • Click Revert
Revert window in RStudio

  • Confirm you actually want to revert your change
Confirm your revert in RStudio

The erroneous change has been undone and the previous version restored!

After a revert in RStudio

3.4.3 Pushing to a git hosting platform!

Now that we know some git, we can use git repository hosting platforms for collaboration and open science! One of the very best is GitLab.

We can add any GitLab or GitHub repository easily when we start a new RStudio project.

  1. Go to File > New Project > Version Control
  2. Choose Git from the dropdown menu
  3. In the “repository URL” paste the URL of your new GitLab/Hub repository. It will be something like this https://gitlab.com/VickySteeves/hello-world.git.
Adding git to a new RStudio project

This means that we can sync our local changes to a repository hosted on GitLab/Hub! Now, we have to PUSH all our locally created content to the origin remote. This adds 1 more step to what you already know how to do:

  1. Work on your files
  2. Add your files so git knows you want to track their changes
  3. Commit any changes you want to make the new version
  4. Send these changes to the repository hosted on GitLab by simply clicking the push button on the Git panel.
Pushing to GitLab from RStudio

Go refresh your browser to see your changes!

3.5 EXTRA BONUS: GitLab CI & R

Rendering articles with GitLab CI

Continuous integration is a process that happens every time you push a commit to a repository. It executes a number of steps on the GitLab server for you, and checks to see if everything still works as expected. If not, it reports the errors.

We can use GitLab CI for code testing and reproducible builds of analysis. Based on pipelines we can add jobs running arbitrary R scripts in a Docker container after each push to the GitLab repository.

A Docker container is a virtual environment defined by a text recipe, the Dockerfile, and based on an image. Luckily, for our case, we can just use an existing image from the Rocker project.

Create a GitLab CI configuration file, .gitlab-ci.yml, and add the following content:

article:
  stage: deploy
  image: rocker/verse:3.4.4
  script:
    - Rscript -e "remove.packages('rticles')" -e "devtools::install_github('nuest/rticles', ref = 'copernicus')"
    - Rscript -e "rmarkdown::render('MyArticle/MyArticle.Rmd')"
  artifacts:
    paths:
      - MyArticle
    when: always

The file configures a build job named article and a list of script steps executed within a specific Docker image as part of the deploy stage. The configuration above pins a versioned Rocker image, rocker/verse:3.4.4. If your article does geospatial analyses, a versioned rocker/geospatial image is a good alternative because it has the important libraries already installed.

Learn more about rocker/geospatial: https://doi.org/10.5281/zenodo.1216750

The directory of the article is configured as the job’s artifacts. After you have pushed this file to the repository and the job has completed, you can download the PDF and be sure your analysis and article rendering do not only work on your own computer.


You have learned how to!