Setup
Prerequisites
- Request an account on the Tufts HPC Cluster
- Connect to the VPN if off campus
Navigate To The Cluster
Once you have an account and are connected to the VPN/Tufts Network, navigate to the OnDemand Website and log in with your tufts credentials. Once you are logged in you'll notice a few navigation options:
Click on Interactive Apps > RStudio Pax
and you will see a form to fill out to request compute resources to use RStudio on the Tufts HPC cluster. We will fill out the form with the following entries:
Number of hours
:3
Number of cores
:1
Amount of memory
:8GB
R version
:4.0.0
Reservation for class, training, workshop
:Bioinformatics Workshop
---> NOTE: This reservation closed on Nov 9, 2022, use Default if running through the materials after that date.Load Supporting Modules
:boost/1.63.0-python3 java/1.8.0_60 gsl/2.6
Click Lauch
and wait until your session is ready. Click Connect To RStudio Server
, and you will notice a new window will pop up with RStudio.
Project Setup
We are going to create a new project to begin:
- Go to
File
>New Project
New Directory
New Project
- Create a name for your project (e.g.
intro-to-16S
) Create Project
File Organization
In our project we will need some folders to contain our scripts, data and results:
- Click the New Folder icon
- Create a folder called data and click ok
- Following the same process, create a scripts folder and a results folder
Data & Scripts
Today we will be working with data from Rosshart et al. (2107) where wild-type and laboratory strain mouse microbiomes were assessed. To copy over this data we will enter the following command into the console:
file.copy(from="/cluster/tufts/bio/tools/training/microbiome16S/raw_fastq/",to="./data/", recursive = TRUE)
file.copy(from="/cluster/tufts/bio/tools/training/microbiome16S/meta/metaData.txt",to="./data/", recursive = TRUE)
file.copy(from="/cluster/tufts/bio/tools/training/microbiome16S/silva/silva_nr99_v138.1_train_set.fa.gz",to="./data/")
file.copy(from="/cluster/tufts/bio/tools/training/microbiome16S/scripts/dada2pipeline.Rmd",to="./scripts/")
Now that we have our data and scripts copied, let's navigate to our scripts folder and open up "dada2pipeline.Rmd".
Libraries
To run a code chunk in this R markdown file, click the play button at the top right hand side of the code chunk. We will practice by running the code chunk that loads the R libraries we will need for this workshop:
# load our libraries
.libPaths(c('/cluster/tufts/hpc/tools/R/4.0.0',.libPaths()))
library(dada2)
library(phyloseq)
library(ggplot2)
library(DESeq2)
library(tidyverse)
library(phangorn)
library(msa)