7. Other manuals (to be updated)

7.1. RSAT installation manual

NB: OLD, to be updated / hopefully replaced with apt-get

Unless stated otherwise, the following commands will be executed as root.

!!! Experienced users only !!!

7.1.1. First steps

Set the locales manually if needed.

export LANGUAGE=en_US.UTF-8
export LANG=en_US.UTF-8
export LC_ALL=en_US.UTF-8
locale-gen en_US.UTF-8
sudo dpkg-reconfigure locales

Check date & time, and adjust your timezone if needed.

date
dpkg-reconfigure tzdata

Create RSAT directory. It must be readable by all users (in particular by the apache user).

mkdir -p /packages
cd /packages

Check the installation device. The result is expected to be sda1 or vda1.

DEVICE=`df -h / | grep '/dev'| awk '{print $1}' | perl -pe 's/\/dev\///'`
echo ${DEVICE}
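
The same check can be written without perl. The following is a minimal sketch, assuming a POSIX shell and a df that supports the -P option; device_name is a hypothetical helper, not part of RSAT.

```shell
# Sketch: derive the root device name without perl.
# device_name is a hypothetical helper, not part of RSAT.
device_name() {
    # Remove a leading "/dev/" if present; other values pass through unchanged.
    printf '%s\n' "${1#/dev/}"
}

# df -P prints one line per file system in the portable POSIX format;
# the second line describes the file system mounted on /.
ROOT_SOURCE=$(df -P / | awk 'NR==2 {print $1}')
DEVICE=$(device_name "${ROOT_SOURCE}")
echo "${DEVICE}"
```

Inside a container the root file system may not be a /dev device at all, in which case the value is printed unchanged.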

7.1.2. Packages installation

Update the apt-get package lists, then upgrade the installed packages.

apt-get update
apt-get --quiet --assume-yes upgrade

Packages to be installed.

PACKAGES="
ssh
git
cvs
wget
zip
unzip
finger
screen
make
g++
apache2
php5
libapache2-mod-php5
libgd-tools
libgd-gd2-perl
ghostscript
gnuplot
graphviz
mysql-client
default-jre
python
python-pip
python-setuptools
python-numpy
python-scipy
python-matplotlib
python-suds
python3
python3-pip
python3-setuptools
python3-numpy
python3-scipy
python3-matplotlib
r-base
emacs
x11-apps
firefox
eog
ntp
curl
libcurl4-openssl-dev
"

Perl modules.

PACKAGES_PERL="perl-doc
pmtools
libyaml-perl
libemail-simple-perl
libemail-sender-perl
libemail-simple-creator-perl
libpostscript-simple-perl
libstatistics-distributions-perl
libio-all-perl
libobject-insideout-perl
libsoap-lite-perl
libsoap-wsdl-perl
libxml-perl
libxml-simple-perl
libxml-compile-cache-perl
libdbi-perl
liblockfile-simple-perl
libgd-perl
libdbd-mysql-perl
libjson-perl
libbio-perl-perl
libdigest-md5-file-perl
libnet-address-ip-local-perl
"

Install the apt-get libraries.

echo "Packages to be installed with apt-get --quiet --assume-yes"
echo "${PACKAGES}"
echo "Perl module packages to be installed with apt-get --quiet --assume-yes"
echo "${PACKAGES_PERL}"
for LIB in ${PACKAGES} ${PACKAGES_PERL}; \
do \
   echo "`date '+%Y/%m/%d %H:%M:%S'`  installing apt-get library ${LIB}" ; \
   sudo apt-get install --quiet --assume-yes ${LIB} ; \
done
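
A package list assembled by hand can contain the same entry more than once, which makes the loop call apt-get for it repeatedly. The sketch below removes repeated entries while keeping their order; unique_packages is a hypothetical helper name, and the example list is only an illustration.

```shell
# Sketch: remove repeated entries from a package list, keeping first-seen order.
# unique_packages is a hypothetical helper name.
unique_packages() {
    # $* is deliberately unquoted so that spaces and newlines split the list.
    for pkg in $*; do
        printf '%s\n' "${pkg}"
    done | awk '!seen[$0]++'
}

# Illustrative list with one repeated entry.
EXAMPLE_LIST="
libdbi-perl
libobject-insideout-perl
libjson-perl
libobject-insideout-perl
"
unique_packages "${EXAMPLE_LIST}"
```

The output can be fed to the installation loop in place of the raw list, for example with `for LIB in $(unique_packages "${PACKAGES} ${PACKAGES_PERL}")`.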

Package to be installed in interactive mode.

apt-get install --quiet --assume-yes console-data
  • Options:
    • Select keymap from arch list
    • <Don’t touch keymap> (default)
    • Keep kernel keymap
    • Select keymap from full list

Specific treatment for some Python libraries.

sudo apt-get --quiet --assume-yes build-dep python-numpy python-scipy

To free space, remove apt-get packages that are no longer required. /?\

apt-get --quiet --assume-yes  autoremove
apt-get --quiet --assume-yes  clean

7.1.3. Python libraries installation

pip install soappy
pip install fisher
pip install httplib2

7.1.4. Apache Web server configuration

/!\ Manual interventions needed here.

Activate CGI module.

nano /etc/apache2/sites-available/000-default.conf

Uncomment the following line: Include conf-available/serve-cgi-bin.conf.

To avoid a puzzling warning when Apache starts, set ServerName globally.

nano /etc/apache2/apache2.conf

Add the following line at the end of the file: ServerName localhost.
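
For a scripted installation, the line can be appended without opening an editor. This is a minimal sketch, assuming grep and a writable file; ensure_line is a hypothetical helper, and the demonstration works on a scratch file rather than on /etc/apache2/apache2.conf.

```shell
# Sketch: append a line to a configuration file only if it is not already there.
# ensure_line is a hypothetical helper; -x matches whole lines, -F disables regex.
ensure_line() {
    file=$1
    line=$2
    grep -qxF -- "${line}" "${file}" || printf '%s\n' "${line}" >> "${file}"
}

# Demonstration on a scratch file.
DEMO_CONF=$(mktemp)
ensure_line "${DEMO_CONF}" "ServerName localhost"
ensure_line "${DEMO_CONF}" "ServerName localhost"   # second call changes nothing
cat "${DEMO_CONF}"
```

Because the helper checks before appending, running the installation notes twice does not duplicate the directive.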

Enable the CGI script handler.

nano /etc/apache2/mods-available/mime.conf

Uncomment the line AddHandler cgi-script .cgi.

Optional: associate the text/plain MIME type with the extensions of some classical bioinformatics files, by adding the following lines.

AddType text/plain .fasta
AddType text/plain .bed

Adapt the PHP parameters.

nano /etc/php5/apache2/php.ini

Modify the following parameters: post_max_size = 100M and upload_max_filesize=100M.
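
The two parameters can also be set non-interactively. The sketch below assumes GNU sed (for the -i and -E options) and only rewrites lines where the key is already active, not commented out; set_ini_value is a hypothetical helper, and the demonstration uses a scratch file instead of /etc/php5/apache2/php.ini.

```shell
# Sketch: set "key = value" on an existing, uncommented line of an ini file.
# set_ini_value is a hypothetical helper; GNU sed is assumed.
set_ini_value() {
    file=$1
    key=$2
    value=$3
    sed -i -E "s|^[[:space:]]*${key}[[:space:]]*=.*|${key} = ${value}|" "${file}"
}

# Demonstration on a scratch file with illustrative starting values.
DEMO_PHP_INI=$(mktemp)
printf 'post_max_size = 8M\nupload_max_filesize = 2M\n' > "${DEMO_PHP_INI}"
set_ini_value "${DEMO_PHP_INI}" post_max_size 100M
set_ini_value "${DEMO_PHP_INI}" upload_max_filesize 100M
cat "${DEMO_PHP_INI}"
```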

Activate cgi scripts. Found here.

chmod 755 /usr/lib/cgi-bin
chown root:root /usr/lib/cgi-bin
a2enmod cgi
service apache2 restart

You can check whether the Apache server was successfully configured and started by opening a web connection to http://{IP}.

7.1.5. RSAT distribution

/!\ Note: The git distribution requires an account on the ENS git server, which is currently only possible for the RSAT development team. In the near future, we may also use git for the end-user distribution. Users who do not have an account on the RSAT git server can download the code as a tar archive from the Web site.

Create RSAT directory.

mkdir -p /packages/rsat
cd /packages
export RSAT=/packages/rsat

Git repository cloning.

git clone git@depot.biologie.ens.fr:rsat
git config --global user.email claire.rioualen@inserm.fr
git config --global user.name "reg-genomics VM user"

** OR **

Archive download.

export RSAT_DISTRIB=rsat_2016-11-06.tar.gz
export RSAT_DISTRIB_URL=http://pedagogix-tagc.univ-mrs.fr/download_rsat/${RSAT_DISTRIB}
sudo wget ${RSAT_DISTRIB_URL}
sudo tar -xpzf ${RSAT_DISTRIB}
sudo rm -f ${RSAT_DISTRIB}
cd ~; ln -fs /packages/rsat rsat

7.1.6. RSAT configuration

Run the configuration script, to specify the environment variables.

cd $RSAT
sudo perl perl-scripts/configure_rsat.pl

Which options to specify?

Load the (updated) RSAT environment variables.

source RSAT_config.bashrc

Check that the RSAT environment variable has been properly configured.

echo ${RSAT}

Initialise RSAT folders

make -f makefiles/init_rsat.mk init

7.1.7. Perl modules for RSAT

cpan
cpan> install YAML
cpan> install CPAN
cpan> reload cpan
cpan> quit

Get the list of Perl modules to be installed.

make -f makefiles/install_rsat.mk  perl_modules_list
make -f makefiles/install_rsat.mk perl_modules_check
more check_perl_modules_eval.txt
grep Fail  check_perl_modules_eval.txt
grep -v '^OK'  check_perl_modules_eval.txt | grep -v '^;'
MISSING_PERL_MODULES=`grep -v '^OK'  check_perl_modules_eval.txt | grep -v '^;' | cut -f 2 | xargs`
echo "Missing Perl modules:     ${MISSING_PERL_MODULES}"
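
The grep/cut pipeline above can be condensed into a single awk pass. This sketch assumes, as the use of `cut -f 2` suggests, that the report is tab-separated with the status in the first column and the module name in the second; missing_perl_modules is a hypothetical helper and the sample report is invented for the demonstration.

```shell
# Sketch: list the modules whose status is not OK, on one line.
# missing_perl_modules is a hypothetical helper; the report layout
# (status <TAB> module, comment lines starting with ";") is an assumption.
missing_perl_modules() {
    # Unlike cut, this skips lines that have no second column.
    awk -F'\t' '!/^OK/ && !/^;/ && NF >= 2 {print $2}' "$1" | xargs
}

# Demonstration on an invented report.
DEMO_REPORT=$(mktemp)
printf '; status\tmodule\nOK\tYAML\nFail\tObject::InsideOut\nFail\tXML::LibXML\n' > "${DEMO_REPORT}"
missing_perl_modules "${DEMO_REPORT}"
```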

Install the missing Perl modules.

make -f makefiles/install_rsat.mk perl_modules_install PERL_MODULES="${MISSING_PERL_MODULES}"

Check once more if all required Perl modules have been correctly installed.

make -f makefiles/install_rsat.mk perl_modules_check
more check_perl_modules_eval.txt

Note: Object::InsideOut always displays “Fail”, whereas it is OK during installation.

7.1.8. Configure RSAT web server

cd ${RSAT}
sudo rsync -ruptvl RSAT_config.conf /etc/apache2/sites-enabled/rsat.conf
apache2ctl restart

RSAT Web server URL

echo $RSAT_WWW

If the value is “auto”, get the URL as follows:

export IP=`ifconfig eth0 | awk '/inet /{print $2}' | cut -f2 -d':'`
echo ${IP}
export RSAT_WWW=http://${IP}/rsat/
echo $RSAT_WWW
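
The ifconfig parsing depends on the interface being named eth0 and on the output format of the installed ifconfig version. As a hedged alternative, the sketch below uses `hostname -I`, which is available on Debian and Ubuntu, and takes the first address it reports; rsat_url_from_ip is a hypothetical helper.

```shell
# Sketch: build the RSAT URL from the first address reported by hostname -I.
# rsat_url_from_ip is a hypothetical helper.
rsat_url_from_ip() {
    printf 'http://%s/rsat/\n' "$1"
}

# hostname -I lists the addresses of the host; keep the first one.
# The value stays empty on systems where the option is not supported.
IP=$(hostname -I 2>/dev/null | awk '{print $1}')
if [ -n "${IP}" ]; then
    RSAT_WWW=$(rsat_url_from_ip "${IP}")
    export RSAT_WWW
fi
echo "${RSAT_WWW:-auto}"
```

On a machine with several network interfaces the first address is not necessarily the public one, so the result should be checked before use.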

7.1.9. Other

Compile the RSAT programs written in C.

make -f makefiles/init_rsat.mk compile_all
export INSTALL_ROOT_DIR=/packages/

Install some third-party programs required by some RSAT scripts.

make -f makefiles/install_software.mk install_ext_apps

Mkvtree licence / Vmatch

Get a licence here

Alternatively, you can copy the licence file from another RSAT machine.

rsync -ruptvl /packages/rsat/bin/vmatch.lic root@<IP>:/packages/rsat/bin/

7.1.10. Data management

export RSAT_DATA_DIR=/root/mydisk/rsat_data
cd ${RSAT}/public_html
mv data/* ${RSAT_DATA_DIR}/
mv data/.htaccess ${RSAT_DATA_DIR}/
rmdir data
ln -s ${RSAT_DATA_DIR} data
cd $RSAT
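
The relocation above assumes that the target directory already exists and moves the hidden .htaccess file in a separate command. The following sketch wraps the same steps in one function that creates the target, moves hidden entries as well, and can be run again safely; relocate_dir is a hypothetical helper, find is assumed to support -mindepth and -maxdepth (GNU or BSD find), and the demonstration runs in a scratch directory.

```shell
# Sketch: move the content of a directory elsewhere and leave a symbolic link.
# relocate_dir is a hypothetical helper.
relocate_dir() {
    src=$1
    dest=$2
    # Nothing to do if the directory was already replaced by a link.
    if [ -L "${src}" ]; then
        return 0
    fi
    mkdir -p "${dest}"
    # find also lists hidden entries and does nothing when the directory is empty.
    find "${src}" -mindepth 1 -maxdepth 1 -exec mv {} "${dest}/" \;
    rmdir "${src}"
    ln -s "${dest}" "${src}"
}

# Demonstration in a scratch directory, not in ${RSAT}/public_html.
DEMO_ROOT=$(mktemp -d)
mkdir "${DEMO_ROOT}/data"
touch "${DEMO_ROOT}/data/genomes.tab" "${DEMO_ROOT}/data/.htaccess"
relocate_dir "${DEMO_ROOT}/data" "${DEMO_ROOT}/rsat_data"
ls -la "${DEMO_ROOT}/data/"
```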

Install model organisms, required for some of the Web tools.

download-organism -v 1 -org Saccharomyces_cerevisiae -org Escherichia_coli_K_12_substr__MG1655_uid57779
download-organism -v 1 -org Drosophila_melanogaster

Get the list of organisms supported on your computer.

supported-organisms

7.1.11. Install selected R libraries

Packages required for some RSAT scripts.

cd $RSAT; make -f makefiles/install_rsat.mk install_r_packages
cd $RSAT; make -f makefiles/install_rsat.mk update ## install R packages + compile the C programs

NB: the second command applies only if RSAT was installed from the git repository.

7.1.12. Testing RSAT & external programs

Test a simple Perl script that does not require organisms to be installed. (OK)

which random-seq
random-seq -l 100

Test a simple Python script that does not require organisms to be installed. (OK)

random-motif -l 10 -c 0.90

Test vmatch

random-seq -l 100 | purge-sequence

seqlogo

which seqlogo
seqlogo

weblogo 3

which weblogo
weblogo --help

ghostscript

which gs
gs --version
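
The individual `which` checks above can be grouped into one pass over the program names. This is a minimal sketch using the POSIX `command -v` builtin; check_programs is a hypothetical helper, and the list of names is taken from the tests in this section.

```shell
# Sketch: report which of the listed programs are found in the PATH.
# check_programs is a hypothetical helper; it only tests for presence,
# it does not run the programs.
check_programs() {
    for prog in "$@"; do
        if command -v "${prog}" >/dev/null 2>&1; then
            printf 'OK\t%s\n' "${prog}"
        else
            printf 'MISSING\t%s\n' "${prog}"
        fi
    done
}

check_programs random-seq random-motif purge-sequence seqlogo weblogo gs
```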

Check that the model genomes have been correctly installed

# Retrieve all the start codons and count oligonucleotide frequencies (most should be ATG).
retrieve-seq -org Saccharomyces_cerevisiae -all -from 0 -to +2 | oligo-analysis -l 3 -1str -return occ,freq -sort

7.1.13. Configure the SOAP/WSDL Web services

Check the URL of the web services (RSAT_WS). By default, the server addresses the WS requests to itself (http://localhost/rsat) because web services are used for the multi-tier architecture of some Web tools (retrieve-ensembl-seq, NeAT).

cd $RSAT
#echo $RSAT_WS

Get the current IP address

export IP=`/sbin/ifconfig eth0 | awk '/inet /{print $2}' | cut -f2 -d':'`
echo ${IP}
export  RSAT_WS=http://${IP}/rsat/

Initialize the Web services stub

make -f makefiles/init_rsat.mk ws_init RSAT_WS=${RSAT_WS}

After this, regenerate the web services stub with the following command.

make -f makefiles/init_rsat.mk ws_stub RSAT_WS=${RSAT_WS}

Test the local web services OK

make -f makefiles/init_rsat.mk ws_stub_test

Test RSAT Web services (local and remote) without using the SOAP/WSDL stub (direct parsing of the remote WSDL file)

make -f makefiles/init_rsat.mk ws_nostub_test

Test the program supported-organisms-server, which relies on Web services without stub

supported-organisms-server -url ${RSAT_WS} | wc
supported-organisms-server -url http://localhost/rsat/ | wc
supported-organisms-server -url http://rsat-tagc.univ-mrs.fr/ | wc

Tests on the Web site

Run the demo of the following tools (to redo)

  • retrieve-seq to check the access to local genomes (at least Saccharomyces cerevisiae)
  • feature-map to check the GD library
  • retrieve-ensembl-seq to check the interface to Ensembl
  • fetch-sequences to check the interface to UCSC
  • some NeAT tools (they rely on web services)
  • peak-motifs because it mobilises half of the RSAT tools -> a good control for the overall installation.
  • footprint-discovery to check the tools depending on homology tables (blast tables).

7.1.14. Install the cluster management system (torque, qsub, ...)

Check the number of cores (processors)

grep ^processor /proc/cpuinfo

Check RAM

grep MemTotal /proc/meminfo
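
The two checks can be summarised in a single line. The sketch below assumes a Linux /proc file system; count_processors and mem_total_gib are hypothetical helpers that accept an optional file argument so they can be tried on a saved copy of the files.

```shell
# Sketch: summarise the number of processors and the total memory.
# Both helpers are hypothetical; they default to the Linux /proc files.
count_processors() {
    grep -c '^processor' "${1:-/proc/cpuinfo}"
}

mem_total_gib() {
    # MemTotal is reported in kB; 1048576 kB = 1 GiB.
    awk '/^MemTotal:/ {printf "%.1f\n", $2 / 1048576}' "${1:-/proc/meminfo}"
}

if [ -r /proc/cpuinfo ] && [ -r /proc/meminfo ]; then
    echo "cores: $(count_processors)  RAM (GiB): $(mem_total_gib)"
fi
```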

Install Sun Grid Engine (SGE) job scheduler

Beware: before installing the grid engine, the file /etc/hosts must be modified manually.

nano /etc/hosts

Initial config (problematic)

127.0.0.1       localhost       rsat-vm-2015-02
127.0.1.1      rsat-vm-2015-02

Config to obtain:

127.0.0.1       localhost       rsat-vm-2015-02
127.0.1.1      rsat-vm-2015-02

/?\

apt-get install --quiet --assume-yes gridengine-client
apt-get install --quiet --assume-yes gridengine-exec
apt-get install --quiet --assume-yes gridengine-master
apt-get install --quiet --assume-yes gridengine-qmon
qconf -aq default  ## add a new queue called "default"
qconf -mq default  ## modify the queue "default"
qconf -as localhost ## add localhost to the list of submit hosts

Set the following values: hostlist              localhost

Take all default parameters BUT for the SGE master parameter, type localhost (it must be the hostname)

Test that jobs can be sent to the job scheduler.

7.1.15. OPTIONAL

Install some software tools for NGS analysis.

cd ${RSAT}
make -f makefiles/install_software.mk install_meme

Ganglia: tool to monitor a cluster (or single machine)

Link.

sudo apt-get install -y ganglia-monitor rrdtool gmetad ganglia-webfrontend
sudo cp /etc/ganglia-webfrontend/apache.conf /etc/apache2/sites-enabled/ganglia.conf
sudo apachectl restart

7.2. Galaxy server setup

7.2.1. Downloading Galaxy code

We followed the instructions from the Galaxy Web site:

## Get a git clone of Galaxy
git clone https://github.com/galaxyproject/galaxy/
cd galaxy ## Go to the galaxy directory

7.2.3. Configure the Galaxy server (and get python modules if required)

We first edit the config file to choose a specific port for Galaxy

cp config/galaxy.ini.sample config/galaxy.ini

We then edit this file by setting the port to 8082, because our 8080 is already used for other purposes.

We performed the following modifications.

admin_users=admin1@address.fr,admin2@univbazar.fr,admin3@gmail.com
port = 8082 # The port on which to listen.
host = 0.0.0.0 ## To enable access over the network
allow_user_deletion = True
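
After editing, it can help to confirm which values are actually active, since a setting left commented out in the sample file has no effect. The sketch below prints the uncommented lines for a list of keys; show_ini_settings is a hypothetical helper, and the demonstration uses an invented scratch file rather than config/galaxy.ini.

```shell
# Sketch: print the active (uncommented) setting for each requested key.
# show_ini_settings is a hypothetical helper.
show_ini_settings() {
    file=$1
    shift
    for key in "$@"; do
        grep -E "^${key}[[:space:]]*=" "${file}" \
            || echo "${key}: not set (commented out or absent)"
    done
}

# Demonstration on an invented scratch file.
DEMO_GALAXY_INI=$(mktemp)
printf '#port = 8080\nport = 8082\nhost = 0.0.0.0\n' > "${DEMO_GALAXY_INI}"
show_ini_settings "${DEMO_GALAXY_INI}" port host admin_users allow_user_deletion
```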

7.2.4. Configuring the Apache server on RSAT

Activate the Apache module rewrite.load

ln -s /etc/apache2/mods-available/rewrite.load  /etc/apache2/mods-enabled/rewrite.load

Create a file /etc/apache2/sites-enabled/galaxy.conf with the following content

<VirtualHost *:80>
ServerAdmin webmaster@localhost
ServerSignature Off

# Config for Galaxy at http://mydomain.com/galaxy
RewriteEngine on
RewriteRule ^/galaxy$ /galaxy/ [R]
RewriteRule ^/galaxy/static/style/(.*) /home/galaxy/galaxy/static/june_2007_style/blue/$1 [L]
RewriteRule ^/galaxy/static/scripts/(.*) /home/galaxy/galaxy/static/scripts/packed/$1 [L]
RewriteRule ^/galaxy/static/(.*) /home/galaxy/galaxy/static/$1 [L]
RewriteRule ^/galaxy/favicon.ico /home/galaxy/galaxy/static/favicon.ico [L]
RewriteRule ^/galaxy/robots.txt /home/galaxy/galaxy/static/robots.txt [L]
RewriteRule ^/galaxy(.*) http://localhost:8082$1 [P]
#RewriteRule ^/galaxy(.*) http://192.168.1.6:8082$1 [P]
</VirtualHost>
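
The last RewriteRule uses the [P] flag, which hands the request over to mod_proxy, so the proxy and proxy_http modules need to be enabled in addition to rewrite. The sketch below lists the corresponding a2enmod commands; enable_galaxy_modules and the DRY_RUN variable are hypothetical, and by default the commands are only printed, because running them requires root privileges.

```shell
# Sketch: enable the Apache modules that the VirtualHost relies on.
# rewrite serves the RewriteRule directives; proxy and proxy_http serve the [P] flag.
# enable_galaxy_modules and DRY_RUN are hypothetical names.
GALAXY_APACHE_MODULES="rewrite proxy proxy_http"

enable_galaxy_modules() {
    for mod in ${GALAXY_APACHE_MODULES}; do
        if [ "${DRY_RUN:-1}" = "1" ]; then
            # Default: only show what would be run.
            echo "a2enmod ${mod}"
        else
            a2enmod "${mod}"
        fi
    done
}

enable_galaxy_modules
```

Setting DRY_RUN=0 as root runs the commands for real; Apache must then be restarted for the modules to be loaded.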

Restart the Apache server.

sudo service apache2 restart

7.2.5. Starting the galaxy server

sh run.sh

On our internal network, the server becomes available at the address:

http://192.168.1.6:8082

7.2.6. Registering

  • Open a connection to the Galaxy server.
  • In the Galaxy menu, run the command User -> Register. Enter one of the email addresses declared in admin_users.

7.2.7. Install Galaxy modules