Thursday, March 31, 2022

BD Rhapsody analysis pipeline on AWS

Objective: Implement the BD Rhapsody analysis pipeline (WTA) on an AWS instance (without using Seven Bridges)

1. Create an EC2 instance (r5.12xlarge)

2. Mount EFS storage
> sudo yum install -y amazon-efs-utils
> sudo yum install -y nfs-utils
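> sudo mkdir /mnt/efs
> sudo mount -t efs -o tls File_system_ID:/ /mnt/efs
> sudo chgrp -R ec2-user /mnt/efs
> sudo chmod -R g+w /mnt/efs
where File_system_ID is the file system ID (format "fs-XXXXXXXX"). Group ownership and permissions are set after mounting so that they apply to the file system itself rather than to the empty mount point.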
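Before and after mounting, two quick checks can catch a mistyped ID or a failed mount. The following is a minimal sketch, not part of the original procedure: the helper names is_efs_id and efs_is_mounted are hypothetical, the ID pattern ("fs-" followed by 8 to 17 lowercase hexadecimal characters) is an assumption, and the mount check relies on /proc/mounts (Linux only).

```shell
# Hypothetical helper: succeed if $1 looks like an EFS file system ID.
# Assumption: "fs-" followed by 8 to 17 lowercase hex characters.
is_efs_id() {
    printf '%s\n' "$1" | grep -Eq '^fs-[0-9a-f]{8,17}$'
}

# Hypothetical helper: succeed if $1 is listed as a mount point in /proc/mounts.
efs_is_mounted() {
    awk -v dir="$1" '$2 == dir { found = 1 } END { exit !found }' /proc/mounts
}

# Usage (commented out so the block has no side effects; fs-0123abcd is a placeholder):
# is_efs_id "fs-0123abcd" && sudo mount -t efs -o tls fs-0123abcd:/ /mnt/efs
# efs_is_mounted /mnt/efs && echo "EFS mounted"
```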

3. Install docker
> sudo yum install -y docker
> sudo service docker start
> sudo usermod -a -G docker ec2-user
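The new group membership only takes effect in a new login session, so log out and back in before running docker without sudo. A small sketch for checking this follows; the helper name has_group is hypothetical.

```shell
# Hypothetical helper: succeed if group $2 is in the space-separated list $1.
has_group() {
    case " $1 " in
        *" $2 "*) return 0 ;;
        *) return 1 ;;
    esac
}

# Usage: "id -nG" prints the groups of the current session.
# has_group "$(id -nG)" docker || echo "log out and back in first" >&2
```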

4. Install pip for Python 2
> sudo yum install -y python2-pip.noarch

5. Install CWL-runner
> pip2 install cwlref-runner

6. Download the CWL and YML files
> curl -O https://bitbucket.org/CRSwDev/cwl/raw/2a9b10d03b02fd4b65b92f78cfe81b80253eff47/v1.10/rhapsody_wta_1.10.cwl
> curl -O https://bitbucket.org/CRSwDev/cwl/raw/2a9b10d03b02fd4b65b92f78cfe81b80253eff47/v1.10/template_wta_1.10.yml

7. Download the reference files
> curl -O https://bd-rhapsody-public.s3.amazonaws.com/Rhapsody-WTA/GRCh38-PhiX-gencodev29/GRCh38-PhiX-gencodev29-20181205.tar.gz
> curl -O https://bd-rhapsody-public.s3.amazonaws.com/Rhapsody-WTA/GRCh38-PhiX-gencodev29/gencodev29-20181205.gtf
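The reference archive is large, and without the -f option curl saves an HTTP error page under the expected file name instead of failing. A minimal sketch for checking the archive before use follows; the helper name check_tarball is hypothetical.

```shell
# Hypothetical helper: succeed if $1 is a non-empty, readable gzip-compressed tar archive.
check_tarball() {
    [ -s "$1" ] && tar -tzf "$1" > /dev/null 2>&1
}

# Usage:
# check_tarball GRCh38-PhiX-gencodev29-20181205.tar.gz \
#     || echo "download incomplete or corrupt" >&2
```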

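7b. Alternatively, create a custom reference file (example: rhesus macaque, Mmul_10). The commands after docker run are entered inside the container.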
> docker run -v /mnt/efs:/mnt -t -i bdgenomics/rhapsody bash
> mkdir ggOverhang100
> STAR --runMode genomeGenerate \
       --runThreadN 8 \
       --genomeDir ggOverhang100 \
       --genomeFastaFiles /mnt/genome/Mmul_10/Sequence/genome.fa \
       --sjdbGTFfile /mnt/genome/Mmul_10/Annotation/genes.gtf \
       --sjdbOverhang 100
> tar -czvf ggOverhang100.tar.gz ggOverhang100/
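The value 100 passed to --sjdbOverhang follows the STAR manual, which gives read length minus 1 as the ideal value and notes that the default of 100 works in most cases. A small sketch for deriving the value from a read length follows; the helper name sjdb_overhang is hypothetical, and the read length of the actual data is not stated in these notes.

```shell
# Hypothetical helper: print the --sjdbOverhang value (read length - 1).
sjdb_overhang() {
    if [ "$1" -gt 1 ] 2>/dev/null; then
        echo $(( $1 - 1 ))
    else
        echo "read length must be an integer greater than 1" >&2
        return 1
    fi
}

# Usage (101 is an example read length, not taken from the notes):
# STAR --runMode genomeGenerate --sjdbOverhang "$(sjdb_overhang 101)" ...
```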

8. Edit YML file
...

9. Launch CWL-runner
> cwl-runner --outdir /mnt/efs/rhapsody_test/output_test rhapsody_wta_1.10.cwl template_wta_1.10.yml
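Since the run takes a long time, it can be worth checking that the inputs exist before launching, and starting the run so that it survives a dropped SSH session. A minimal sketch follows; the helper name require_files and the log file name are hypothetical.

```shell
# Hypothetical helper: succeed only if every argument is an existing, non-empty file.
require_files() {
    rf_missing=0
    for rf_file in "$@"; do
        if [ ! -s "$rf_file" ]; then
            echo "missing or empty: $rf_file" >&2
            rf_missing=1
        fi
    done
    return "$rf_missing"
}

# Usage (commented out; cwl-runner.log is a placeholder name):
# require_files rhapsody_wta_1.10.cwl template_wta_1.10.yml || exit 1
# nohup cwl-runner --outdir /mnt/efs/rhapsody_test/output_test \
#     rhapsody_wta_1.10.cwl template_wta_1.10.yml > cwl-runner.log 2>&1 &
```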

---
Supplementary step: upload file to Seven Bridges
> sb projects list
> sb upload start ggOverhang100.tar.gz --destination projectName --name rhesus-ref


Transfer directory from EFS to S3 Glacier

1. Create an S3 bucket
> aws s3 mb s3://rv398-20220712

2. Copy EFS files to the S3 bucket
> aws s3 cp /mnt/efs/Joana3/Data s3://rv398-...
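Copying a directory with aws s3 cp requires --recursive, and one way to place the objects in Glacier is to set the storage class at upload time with --storage-class GLACIER. The sketch below only prints such a command so it can be reviewed first. The helper name glacier_copy_cmd and the bucket name example-bucket are hypothetical, and uploading directly to the GLACIER class (rather than using a lifecycle rule) is an assumption, not necessarily what was done here.

```shell
# Hypothetical helper: print (not run) an "aws s3 cp" command that copies
# directory $1 to S3 URI $2 using the GLACIER storage class.
# Note: arguments are printed unquoted, so paths with spaces need manual quoting.
glacier_copy_cmd() {
    printf 'aws s3 cp %s %s --recursive --storage-class GLACIER\n' "$1" "$2"
}

# Usage with a placeholder bucket name (an assumption, not the real destination):
# glacier_copy_cmd /mnt/efs/Joana3/Data s3://example-bucket/Data
```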