Why GitLab rules (with a special look at reproducibility)


Vicky Steeves

@VickySteeves | @VickySteeves@Octodon.Social

csv,conf,v4 | May 9, 2019

GitLab is a git-based hosting service (they store your git repos!)

They have an open-core model: the community edition is FLOSS, the enterprise edition is not (like RStudio)

It differs from similar services in many ways, which you can find on their site (about.gitlab.com/comparison/), but I want to go over a few key reasons that make it ideal for building (a subset of) work reproducibly!

A quick note to say I wrote this out in a blog post: about.gitlab.com/2017/08/25/gitlab-and-reproducibility && co-taught it at EGU: vickysteeves.gitlab.io/repro-papers

1. Integrated CI

GitLab has integrated CI in each repo -- in fact, hosting, LFS, and CI are free for public & private repos!!

You can use their Container Registry, a secure, private Docker registry, together with GitLab CI to run each job in a separate, isolated container based on the image named in .gitlab-ci.yml. If you don’t feel like using the GitLab registry, you can also use images from DockerHub or a custom Docker image you’re already using locally.
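As a rough sketch of those two options, a job's image line can point either at a public DockerHub image or at an image in the project's own GitLab registry. The job names and the analysis:latest image name are hypothetical, added only for illustration; $CI_REGISTRY_IMAGE is GitLab's predefined variable for the project's registry path.

```yaml
# Option 1: a public image from DockerHub
# (the same image used in the example config later in this talk)
from-dockerhub:
  image: rocker/verse:3.4.4
  script:
    - Rscript -e "sessionInfo()"

# Option 2: an image stored in this project's GitLab Container Registry.
# "analysis:latest" is a hypothetical image name; it is assumed to have
# been pushed already and to contain R.
from-gitlab-registry:
  image: $CI_REGISTRY_IMAGE/analysis:latest
  script:
    - Rscript -e "sessionInfo()"
```

Either way, each job starts from a fresh container, so nothing leaks from one job's environment into the next.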

Why integrated CI matters

This makes it easier to have a simple and reproducible build environment that can also run on your local workstation, regardless of its operating system!

It also means that when someone else forks your repo, the CI configuration comes along with it, so pipelines run in their fork and give them results without having to go to a third-party tool (e.g. Travis, Circle).

Example CI config

FYI, GitLab knows to syntax-check .gitlab-ci.yml!


article:
  stage: deploy
  image: rocker/verse:3.4.4
  script:
    - Rscript -e "remove.packages('rticles')" -e "devtools::install_github('nuest/rticles', ref = 'copernicus')"
    - Rscript -e "rmarkdown::render('MyArticle/MyArticle.Rmd')"
  artifacts:
    paths:
      - MyArticle
    when: always

The file configures a build job named article, with a list of script steps executed within a specific Docker image as part of the deploy stage. In this case we use a versioned Rocker image (rocker/verse:3.4.4 in the config above), so the R environment is pinned; if the article needs geospatial analyses, rocker/geospatial has the important libraries already installed.

The article’s directory, which in this example holds a computational article for a journal, is configured as the job’s artifacts. Because of when: always, the artifacts are kept even if the job fails.
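To make the stage wording concrete, here is a minimal sketch of how the article job could sit in a two-stage pipeline. The session-info job and the expire_in value are assumptions added for illustration and are not part of the original config.

```yaml
# Stages run in this order; jobs in a later stage wait for earlier ones
stages:
  - test
  - deploy

# Hypothetical extra job: print the R session details to the job log,
# so the environment that built the article is recorded
session-info:
  stage: test
  image: rocker/verse:3.4.4
  script:
    - Rscript -e "sessionInfo()"

# The article job from the example config, with an assumed retention period
article:
  stage: deploy
  image: rocker/verse:3.4.4
  script:
    - Rscript -e "remove.packages('rticles')" -e "devtools::install_github('nuest/rticles', ref = 'copernicus')"
    - Rscript -e "rmarkdown::render('MyArticle/MyArticle.Rmd')"
  artifacts:
    paths:
      - MyArticle
    when: always
    expire_in: 1 week   # assumption: keep artifacts for one week
```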

More GitLab CI Goodness


  • Expose artifacts built from CI, like ReproZip bundles (fully reproducible package w/ super metadata) -- could also auto-push these artifacts to other systems via API, like the OSF or Zenodo
  • Build of a project can trigger a build of another, and GitLab shows the full picture -- AKA, multi-project pipelines! Chain those data processing + analyzing + visualizing workflows
  • GitLab can test your code on your own server, or deploy to your server
  • Autoscaling CI Runners -- You can automatically spin up and down VMs to make sure your builds get processed immediately.
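Two of the points above, running on your own server and multi-project pipelines, might look roughly like this in .gitlab-ci.yml. This is a sketch under assumptions: the my-server tag, the process.R script, and the my-group/visualization project path are all hypothetical, and whether the trigger keyword is available depends on your GitLab version and tier.

```yaml
stages:
  - build
  - deploy

# Run a job on your own server: "my-server" is a hypothetical tag that
# a self-hosted runner would have to be registered with. The server is
# assumed to have R installed.
process-data:
  stage: build
  tags:
    - my-server
  script:
    - Rscript process.R   # hypothetical processing script

# Multi-project pipeline: once the build stage succeeds, start the
# pipeline of another project (hypothetical path and branch).
trigger-visualization:
  stage: deploy
  trigger:
    project: my-group/visualization
    branch: master
```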

2. Auto Sync with Remotes


If your collaborators are on another platform, or you want to simply be more discoverable, you can sync between remotes automatically, either pushing or pulling!

I use this to auto-sync repos between GitLab (where I work) and GitHub (where people tend to discover my work)

3. It's Open Source

3a. Put it on your own server!

3b. Modify it the way you want on your server.

Related -- they come out with a stable version 1x/month, so new features are added super quick!

Other nice features for managing research/work:

  • 4 levels of user permissions with optional expiration dates
  • Work-in-Progress Protection - add 'WIP' to the title of a merge request to prevent anyone from merging it
  • Due dates for individual issues
  • Move issues between projects
  • Ease of migration from other providers (#movingToGitLab)
  • Issue boards, where each list on a board is based on a label that exists in your issue tracker
  • When a user is mentioned in or assigned to a merge request, it shows up in that user's Todos, which makes it easier to track

GitLab is great & I love talking about it. Look at the beautiful features about.gitlab.com/features


Feel free to hit me up about it --

Tweet me: @VickySteeves

Toot me: @VickySteeves@Octodon.Social

Get this Presentation:
vickysteeves.gitlab.io/csvconf4-lightningtalk