Submitting a job to run on another server and retrieving the results

Imagine you have two servers called darwin and linnaeus. darwin has loads of RAM, which makes it a great server for de novo assembly, while linnaeus has loads of nodes, which makes it a great server for splitting up work and running lots of jobs in parallel. To make good use of both, it makes sense to do part of the processing on one server and then automatically send jobs to be processed on the other.

So this is how you do that. On linnaeus you run:

ssh-keygen -t rsa
cat ~/.ssh/id_rsa.pub

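Copy the public key and, on darwin, paste it into .ssh/authorized_keys2 in your home directory (.ssh/authorized_keys is the more usual file name on current OpenSSH installations), then make sure only you can read and write the file:

vi .ssh/authorized_keys2
chmod 600 .ssh/authorized_keys2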

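The reverse also needs to be done: generate a key on darwin and add it to linnaeus in the same way, because the script running on linnaeus has to copy its results back to darwin.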
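
Before going further it can be worth checking that key-based login really works in both directions. The following is only a minimal sketch, assuming OpenSSH on both servers; the function name check_passwordless_login is made up for illustration. BatchMode=yes tells ssh to fail instead of asking for a password, so a missing or misplaced key shows up as an error rather than a prompt.

```shell
# Hypothetical helper: succeeds only if ssh can log in to the host without prompting.
check_passwordless_login() {
    host="$1"
    # BatchMode=yes disables password prompts; ConnectTimeout avoids hanging
    # for a long time on a host that does not answer.
    if ssh -o BatchMode=yes -o ConnectTimeout=10 "$host" true; then
        echo "passwordless login to $host works"
    else
        echo "passwordless login to $host FAILED" >&2
        return 1
    fi
}

# Example, run on linnaeus (and the other way round on darwin):
# check_passwordless_login darwin.server.ac.uk
```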

Now, to test it out, create the shell script that will be executed on linnaeus, e.g. linnaeusshell. The file itself lives on darwin, because it is sent over ssh when the job is submitted:

gunzip -f /DataDisk/Joseph/*.gz
scp /DataDisk/Joseph/*embl darwin.server.ac.uk:.
echo "Done" > job.log
scp job.log darwin.server.ac.uk:.

This small script uncompresses the file, copies the uncompressed file back to darwin and, once it has finished, sends a job.log containing “Done” to darwin as well.
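
One thing to be aware of is that the script above writes “Done” whether or not the earlier commands worked. A possible refinement, shown here only as a sketch, is to wrap each step in a small helper that records success or failure in the log. The name run_and_log is an assumption for illustration and is not part of the original script.

```shell
# Hypothetical helper: run a command and append its outcome to a log file.
# Usage: run_and_log LOGFILE COMMAND [ARGS...]
run_and_log() {
    logfile="$1"
    shift
    if "$@"; then
        echo "Done: $*" >> "$logfile"
    else
        # In the else branch, $? still holds the exit status of the failed command.
        status=$?
        echo "Failed (exit $status): $*" >> "$logfile"
        return "$status"
    fi
}

# How it could be used inside linnaeusshell:
# run_and_log job.log gunzip -f /DataDisk/Joseph/*.gz
# run_and_log job.log scp /DataDisk/Joseph/*embl darwin.server.ac.uk:.
```

With this in place, job.log tells you which step failed instead of always saying “Done”.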

Now create a second shell script on darwin, e.g. darwinshell:

scp Pf3D7_01.embl.gz linnaeus.server.ac.uk:/DataDisk/Joseph/.
ssh linnaeus.server.ac.uk 'bash -s' < linnaeusshell

And finally, execute darwinshell on darwin:

sh darwinshell

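This transfers the compressed file Pf3D7_01.embl.gz over to linnaeus, where it is uncompressed and transferred back to darwin. The 'bash -s' part makes bash on linnaeus read the script from standard input, so linnaeusshell is fed over the ssh connection and never has to be copied to linnaeus first.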
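
Once darwinshell has returned, you may want to confirm on darwin that the results actually arrived. The following is a minimal sketch; the function name check_results is hypothetical, and it assumes that job.log and the uncompressed file landed in your home directory on darwin, which is where the scp commands in linnaeusshell send them.

```shell
# Hypothetical helper: check that the log says "Done" and the result file is not empty.
# Usage: check_results LOGFILE DATAFILE
check_results() {
    logfile="$1"
    datafile="$2"
    if ! grep -q "Done" "$logfile" 2>/dev/null; then
        echo "no Done marker in $logfile" >&2
        return 1
    fi
    # -s is true only if the file exists and has a size greater than zero.
    if [ ! -s "$datafile" ]; then
        echo "$datafile is missing or empty" >&2
        return 1
    fi
    echo "job finished, results in $datafile"
}

# Example, run on darwin after darwinshell:
# check_results "$HOME/job.log" "$HOME/Pf3D7_01.embl"
```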

Big thanks to Sreenu who helped me a lot to sort this out.
