This is a bare bones tutorial of how to install anaconda and associated packages on an Amazon AMI EC2 instance. It assumes that you have followed the AWS setup tutorial to create an AWS account and are familiar with the steps to create an EC2 instance.
Online through AWS console, create an instance (https://us-west-2.console.aws.amazon.com/ec2).
Recommendations for genomics usage:
In the Terminal, log-in into your instance and perform basic setup.
More details are available in 1.AWS_setup_tutorial.pdf
.
This directory setup is slightly different than in the setup
tutorial. We mount the EBS storage to project/
and then
fuse a subdirectory project/data/
to our S3 bucket holding
all our data. This is because a fused directory becomes
read-only. Thus, if we fused to the main directory as in the
setup tutorial, we would not be able to write any of our results to the
EBS storage where we have the extra space.
## Updates if available
sudo yum upgrade -y
sudo yum update -y
## Install AWS command line client if not using Amazon OS
sudo yum install awscli -y
## Configure your account
aws configure
## FILL IN WITH YOUR KEYS ###
## Setup fuse
sudo amazon-linux-extras install -y epel
sudo yum install -y s3fs-fuse
### Fuse key
### FILL IN WITH YOUR KEYS ###
echo UserKey:SecretKey > ~/.passwd-s3fs
chmod 600 ~/.passwd-s3fs
## Setup EBS volumes
lsblk
sudo mkfs -t ext4 /dev/nvme1n1
sudo mkdir -p ~/project
sudo mount /dev/nvme1n1 ~/project/
### Change permissions
sudo chmod 777 -R ~/project/
## Mount S3 data
mkdir ~/project/data
sudo chmod 777 -R ~/project/data
s3fs kadm-data ~/project/data -o passwd_file=~/.passwd-s3fs \
-o default_acl=public-read -o uid=1000 -o gid=1000 -o umask=0007
Anaconda (often called conda) is a program management system for bioinformatic tools. For novice users, I recommend installing as many tools as possible through this system. As you gain experience, you may find that you move away from conda to get newer or more frequently up-dated tools.
Check if python 3 is installed.
python --version
If it is not, install it.
sudo yum install python3 -y
#Make directory for programs
sudo mkdir -p ~/apps/anaconda
sudo chmod 777 -R ~/apps
cd ~/apps/anaconda
# Change to correct URL if not using Linux 64-bit
# and to update to latest version if needed
sudo curl -O https://repo.anaconda.com/archive/Anaconda3-2021.05-Linux-x86_64.sh
sudo bash Anaconda3-2021.05-Linux-x86_64.sh -b -p /home/ec2-user/apps/anaconda -u
# Set PATH and initialize
eval "$(/home/ec2-user/apps/anaconda/bin/conda shell.bash hook)"
conda init
sudo chmod 777 -R ~/apps
You may need to exit and re-login for changes to take effect.
## Configure channel priority
conda config --add channels bioconda
conda config --add channels conda-forge
conda config --set channel_priority false
conda config --set allow_conda_downgrades true
## Install
## Example programs for RNA-seq data cleaning
conda install -c conda-forge -y fastqc
conda install -y adapterremoval bedtools
conda install -c bioconda/label/cf201901 -y picard star subread
conda install -y "samtools>=1.10"
#Check installs
conda list