Friday, November 26, 2021

Email notifications for AWS Batch

Goal: Get an email notification when an AWS Batch job (single or array) completes. There are two problems: (1) only the user who launched the job should be notified, and (2) for an array job there should be a single notification once all the child jobs have completed.

Solution: For the array-job problem, create a rule that matches only the parent job. To notify only the launching user, create a separate topic for each launched job.

1. Create the JSON input for an AWS Batch array job (called here sleep.json)
> cat sleep.json
{
    "jobName": "sleep-job-1",
    "jobQueue": "job-queue-1",
    "jobDefinition": "job-def-1",
    "arrayProperties": {
        "size": 3
    },
    "containerOverrides": {
        "command": [
            "sleep",
            "30"
        ]
    },
    "timeout": {
        "attemptDurationSeconds": 7200
    }
}

2. Create a bash script that submits the AWS Batch job and captures the jobId (called here sleep.sh)
> cat sleep.sh
#!/bin/bash
cmd="aws batch submit-job"
cmd="$cmd --cli-input-json file://sleep.json"


# submit the job and pull the jobId out of the JSON response
jobid=$(eval $cmd | \
        grep jobId | \
        sed -r 's|.+jobId\": \"(.+)\"$|\1|g')
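The grep/sed pipeline above parses the CLI's JSON response by hand. The snippet below exercises that same pipeline against a canned submit-job response (the jobId and ARN are made up), so the parsing can be checked without touching AWS; the commented `--query` form is a sturdier alternative built into the AWS CLI.

```shell
# Canned submit-job output with a made-up jobId (no AWS call involved).
sample_output='{
    "jobArn": "arn:aws:batch:us-east-1:123456789012:job/876da822",
    "jobName": "sleep-job-1",
    "jobId": "876da822-4198-45f2-a252-6cea32512ea8"
}'

# Same extraction as in sleep.sh, applied to the canned output.
jobid=$(printf '%s\n' "$sample_output" | \
        grep jobId | \
        sed -r 's|.+jobId\": \"(.+)\"$|\1|g')
echo "$jobid"

# Sturdier alternative: let the CLI extract the field itself.
#   jobid=$(aws batch submit-job --cli-input-json file://sleep.json \
#           --query jobId --output text)
```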

3. Create an event rule (i.e. pattern to be matched) that is specific to the AWS array job
> cat rule.json
{
"Name": "rule-1",
"EventPattern": "{\"source\":[\"aws.batch\"],
\"detail-type\":[\"Batch Job State Change\"],
\"detail\":{\"jobId\":[\"JOBID\"],
\"status\":[\"FAILED\",\"SUCCEEDED\"]}}",
"State": "ENABLED",
"Description": "rule for specific jobId",
"EventBusName": "default"
}
> tail sleep.sh
# replace the JOBID placeholder in rule.json with the actual jobId
sed -ri "s|\"jobId\\\\\":[^,]+|\"jobId\\\\\":[\\\\\"${jobid}\\\\\"]|g" \
rule.json
# register rule
aws events put-rule \
    --cli-input-json file://rule.json
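The sed call in sleep.sh injects the real jobId into the escaped EventPattern string, which is easy to break. The sketch below runs the exact same substitution against a scratch copy of rule.json with a made-up jobid, so the escaping can be verified locally before put-rule is ever called.

```shell
# Made-up jobid; the real one comes from submit-job in step 2.
jobid="11111111-2222-3333-4444-555555555555"

# Scratch copy of rule.json (EventPattern on one line for brevity).
cat > /tmp/rule-demo.json <<'EOF'
{
"Name": "rule-1",
"EventPattern": "{\"source\":[\"aws.batch\"],\"detail-type\":[\"Batch Job State Change\"],\"detail\":{\"jobId\":[\"JOBID\"],\"status\":[\"FAILED\",\"SUCCEEDED\"]}}",
"State": "ENABLED",
"Description": "rule for specific jobId",
"EventBusName": "default"
}
EOF

# Same substitution as in sleep.sh: swap the JOBID placeholder for $jobid.
sed -ri "s|\"jobId\\\\\":[^,]+|\"jobId\\\\\":[\\\\\"${jobid}\\\\\"]|g" \
    /tmp/rule-demo.json
grep -c "$jobid" /tmp/rule-demo.json
```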
4. Create a topic (i.e. communication channel) for that rule
> tail sleep.sh
# register topic
cmd="aws sns create-topic"
cmd="$cmd --name job-${jobid}"

topicArn=$(eval $cmd | \
grep TopicArn | \
sed -r 's|.+TopicArn\": \"(.+)\"$|\1|g')

5. Add an access policy to the topic so notifications are allowed to be published to it
> tail sleep.sh
attributeValue="{\"Version\":\"2012-10-17\",
                 \"Id\":\"__default_policy_ID\",
                 \"Statement\":[{
                    \"Sid\":\"__default_statement_ID\",
                    \"Effect\":\"Allow\",
                    \"Principal\":{\"AWS\":\"*\"},
                    \"Action\":[
                      \"SNS:GetTopicAttributes\",
                      \"SNS:SetTopicAttributes\",
                      \"SNS:AddPermission\",
                      \"SNS:RemovePermission\",
                      \"SNS:DeleteTopic\",
                      \"SNS:Subscribe\",
                      \"SNS:ListSubscriptionsByTopic\",
                      \"SNS:Publish\",
                      \"SNS:Receive\"],
               \"Resource\":\"${topicArn}\"},
              {\"Sid\":\"AWSEvents_rule-1_1\",
               \"Effect\":\"Allow\",
               \"Principal\":{\"Service\":\"events.amazonaws.com\"},
               \"Action\":\"sns:Publish\",
               \"Resource\":\"${topicArn}\"}]}"

aws sns set-topic-attributes \
   --topic-arn "$topicArn" \
   --attribute-name "Policy" \
   --attribute-value "$attributeValue"
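The hand-escaped policy string is the most fragile part of this setup; a typo surfaces only as an opaque error from set-topic-attributes. One cheap safeguard is to run the string through a JSON parser first. A sketch of that check, shown here for the EventBridge statement alone and with a made-up topicArn:

```shell
# Made-up ARN standing in for the one captured in step 4.
topicArn="arn:aws:sns:us-east-1:123456789012:job-demo"

# The statement that lets EventBridge publish to the topic.
attributeValue="{\"Version\":\"2012-10-17\",
  \"Statement\":[
    {\"Sid\":\"AWSEvents_rule-1_1\",
     \"Effect\":\"Allow\",
     \"Principal\":{\"Service\":\"events.amazonaws.com\"},
     \"Action\":\"sns:Publish\",
     \"Resource\":\"${topicArn}\"}]}"

# Fail fast on malformed JSON before any AWS call is made.
printf '%s' "$attributeValue" | python3 -m json.tool > /dev/null \
  && echo "policy JSON is well-formed"
```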

6. Create a target (i.e. the resource to be invoked) for the rule
> tail sleep.sh
# add target to rule; the rule name must match the Name field in rule.json
ruleName="rule-1"
aws events put-targets \
    --rule "$ruleName" \
    --targets "Id"=1,"Arn"="$topicArn"

PS: multiple rules can be linked to the same topic by adding the topic as a target of each rule

7. Create a subscription (i.e. the mode of notification)

> tail sleep.sh
# create subscription
aws sns subscribe \
 --topic-arn "$topicArn" \
 --protocol "email" \
 --notification-endpoint "YourEmailAddress"
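Since every job now creates its own rule, topic, and subscription, the account accumulates resources over time. A cleanup pass after the notification arrives keeps things tidy; this is a sketch with example names (run=echo just prints the commands instead of executing them):

```shell
run=echo                  # drop this line to actually execute the calls
ruleName="rule-1"         # must match the Name field in rule.json
topicArn="arn:aws:sns:us-east-1:123456789012:job-demo"  # example ARN

# Detach the topic from the rule, then delete the rule and the topic
# (deleting the topic also removes its subscriptions).
$run aws events remove-targets --rule "$ruleName" --ids 1
$run aws events delete-rule --name "$ruleName"
$run aws sns delete-topic --topic-arn "$topicArn"
```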

Friday, November 19, 2021

Globus transfer to AWS EFS

Topic: I needed to download files shared via Globus (globus.org) to an AWS EFS drive. Globus offers an option to share a directory to an AWS S3 bucket, but not directly to EFS; moreover, that option requires a paid Globus subscription (it does not work with a personal endpoint). The solution below (installing the Globus CLI and downloading the files from an EC2 instance with the EFS mounted) works even with a free Globus personal endpoint. If the Globus CLI is already installed on your EC2 frontend, skip to step 6.

1. Create EC2 instance
I suggest an EC2 instance with high network bandwidth for a faster download (e.g. c5n, with up to 25 Gbps)
> aws ec2 run-instances \
--image-id ami-0beaa649c482330f7 \
--count 1 \
--instance-type c5n.2xlarge \
--key-name sfourat \
--security-group-ids sg-0d0e3364014a7fc7e sg-0173b74c97e48493e \
--subnet-id subnet-04f3da868a634843d \
--profile "tki-aws-account-310-rhedcloud/RHEDcloudAdministratorRole"

2. Mount EFS
> sudo yum -y update
> sudo yum -y install amazon-efs-utils
> sudo yum -y install nfs-utils
> sudo mkdir -p /mnt/efs
> sudo mount -t efs -o tls fs-57e8702f:/ /mnt/efs
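Before launching large transfers it is worth confirming that the mount actually succeeded (a failed mount silently writes to the instance's local disk instead). A small helper, assuming a Linux host with util-linux's mountpoint command available:

```shell
# Report whether a directory is really a mount point.
check_mount() {
    if mountpoint -q "$1"; then
        echo "mounted: $1"
    else
        echo "NOT mounted: $1"
    fi
}

check_mount /mnt/efs
```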

3. Install Globus on your AWS EC2 frontend
> pip3 install globus-cli

4. Connect to Globus
> globus login --no-local-server

5. Authenticate in a local browser and get the Native App Authorization Code

> Please authenticate with Globus here:

> ------------------------------------

> https://auth.globus.org/v2/oauth2/authorize?client_id=...

6. Install Globus personal server

> wget https://downloads.globus.org/globus-connect-personal/v3/linux/stable/globusconnectpersonal-latest.tgz

> tar -xzvf globusconnectpersonal-latest.tgz 


7. Create an endpoint

> ./globusconnectpersonal -setup

Globus Connect Personal needs you to log in to continue the setup process.


We will display a login URL. Copy it into any browser and log in to get a

single-use code. Return to this command with the code to continue setup.


Login here:

-----

https://auth.globus.org/v2/oauth2/authorize?...


Input a value for the Endpoint Name: aws

registered new endpoint, id: ...


8. Start the endpoint

> ./globusconnectpersonal -start -restrict-paths rw/mnt/efs &


9. Print all endpoints owned by the current user

> globus endpoint search --filter-scope my-endpoints


10. Directory Listing

> globus ls 'endpointUUID:/'


11. Start transfer

> globus transfer --recursive 'shared-endpoint:/' 'myendpoint:/'
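globus transfer returns immediately with a task id rather than waiting for the copy to finish; progress is tracked through the task commands. A sketch (run=echo previews the commands; drop it on the machine where globus-cli is installed, and TASK_ID stands for whatever id the transfer command printed):

```shell
run=echo     # drop this line to actually execute the calls

# List all tasks for the logged-in user, then inspect one of them.
$run globus task list
$run globus task show TASK_ID
```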

Transfer directory from EFS to S3 Glacier

1. Create an S3 bucket
> aws s3 mb s3://rv398-20220712

2. Copy EFS files to the S3 bucket
> aws s3 cp /mnt/efs/Joana3/Data s3://rv398-...
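Note that aws s3 cp uploads to the STANDARD storage class by default; to land the copy in Glacier as the heading intends, the storage class has to be requested explicitly. A sketch using the bucket name from step 1 (run=echo previews the command; --recursive is needed because the source is a directory):

```shell
run=echo     # drop this line to actually execute the call

$run aws s3 cp /mnt/efs/Joana3/Data s3://rv398-20220712/Data \
    --recursive --storage-class GLACIER
```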