Configuring SPADE to use scp for file transfers

This article explains how to configure a SPADE deployment so that data transfers are done using scp.

Pre-requisites

It is assumed that the nest-spade-war project has been installed and is running as outlined here, and that the log output of the JBoss server can be seen in a second terminal.

It is also helpful if you have at least read the "Local Warehouse" scenario in order to familiarize yourself with the concepts discussed there as they will be re-used here. Moreover, as this scenario builds upon the "Loopback Transfers" scenario to provide data, you should work through that one first.

Also, as in that scenario, the following environment variables need to be set. They are shown here being set to their standard values.

export WILDFLY_HOME=${HOME}/server/wildfly-9.0.2.Final
export SPADE_VERSION=3.0.1
export SPADE_HOME=${HOME}/nest-spade-war-${SPADE_VERSION}
export SPADE_WAR=${SPADE_HOME}/target/spade-${SPADE_VERSION}.war
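
Since every later command depends on these variables, it can help to confirm that they are all set before going further. The following is only a minimal sketch; the function name check_spade_env is hypothetical and is not part of SPADE.

```shell
# Hypothetical helper (not part of SPADE): report which of the variables
# used in this article are unset or empty. Returns 0 when all are set.
check_spade_env() {
    missing=0
    for name in WILDFLY_HOME SPADE_VERSION SPADE_HOME SPADE_WAR; do
        # Indirect lookup of the variable whose name is held in $name.
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            echo "missing: $name"
            missing=1
        fi
    done
    return $missing
}
```

Running check_spade_env after the exports above should print nothing and return 0. Any "missing:" line names a variable that still needs to be set.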

Preparing for scp Transfers Between Nodes

SPADE can use any mechanism to transfer a file, provided there is an implementation of the gov.lbl.nest.spade.services.FileTransfer interface for that mechanism. So far the example scenarios have only transferred files to the localhost, using the gov.lbl.nest.spade.services.impl.LocalhostTransfer implementation. However, in order to transfer files between different hosts you'll need to configure your SPADE deployment to use a different class. The easiest choice for transfers is scp, as it is pre-installed on most hosts, but it does require a bit of preparation to make it execute cleanly.

To begin with, your host needs an ssh key in order to transfer files using scp without requiring a password. You will also want to make sure that this key cannot be used for any other purpose on the target host. Therefore you will want to install the ~/bin/spade_scp_wrapper.py file. The following commands do that for your current host.

mkdir -p ~/bin
cp ${SPADE_HOME}/src/main/python/spade_scp_wrapper.py ~/bin/
chmod +x ~/bin/spade_scp_wrapper.py

With this command in place, you can now generate the key you will use for all scp transfers. The following commands do just that. (They assume that hostname returns a reasonable value; if not, you'll need to set a suitable value by hand.)

mkdir -p ~/.ssh
chmod 700 ~/.ssh
cd ~/.ssh
ssh-keygen -t dsa -N '' -C "SCP Transfers from `hostname`" -f `hostname`_scptransfer
cat >> config << EOF

Host `hostname`
    HostName localhost
    User `whoami`
    IdentityFile ~/.ssh/`hostname`_scptransfer
EOF
chmod 644 config
cd -
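
Before installing the key, it may be worth confirming that both halves of the key pair were actually written. The following is a minimal sketch; check_key_pair is a hypothetical helper and only tests that the files exist, not that they are valid keys.

```shell
# Hypothetical helper: confirm that a private key file and its matching
# .pub file both exist. $1 is the path of the private key.
check_key_pair() {
    key=$1
    [ -f "$key" ] || { echo "no private key: $key"; return 1; }
    [ -f "$key.pub" ] || { echo "no public key: $key.pub"; return 1; }
    echo "key pair present: $key"
}
```

With the file names used above, this would be called as check_key_pair ~/.ssh/`hostname`_scptransfer. Running ssh-keygen -l -f on the .pub file then prints the fingerprint of the key.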

Note that this key is generated without a passphrase. For a production deployment you should review this choice, taking into account the options set in the authorized_keys file to limit the key to executing ~/bin/spade_scp_wrapper.py, and decide if you need something more secure.

You now need to install the public portion of the key into the target ~/.ssh/authorized_keys file and also set up the matching ~/.ssh/known_hosts file. The following commands will do that, creating either file if necessary. (You will have to type in "yes" in response to the "Are you sure you want to continue connecting (yes/no)?" prompt.)

cd ~/.ssh
read PUBLIC_KEY < `hostname`_scptransfer.pub
echo from=\"localhost\",command=\"$HOME/bin/spade_scp_wrapper.py -d $HOME/spade/receiving/loopback\",\
no-agent-forwarding,no-port-forwarding,no-pty,no-user-rc,no-X11-forwarding $PUBLIC_KEY >> authorized_keys
chmod 644 authorized_keys
unset PUBLIC_KEY
scp `hostname`_scptransfer.pub `hostname`:spade/receiving/loopback/temp

NOTE: The hostname command does not always return the FQDN of your host. If it does not, scp may not work correctly, and you will need to edit the ~/.ssh/authorized_keys file by hand to put the FQDN in there.
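
If the entry does have to be rebuilt by hand, a small function that prints it for a given host pattern may make the format easier to see. The following is a minimal sketch: the options are the ones used in the commands above, while the function name make_scp_entry and its three parameters are hypothetical.

```shell
# Hypothetical helper: print one authorized_keys entry that restricts a
# key to the SPADE scp wrapper.
#   $1 = host pattern for the from= option
#   $2 = receiving directory passed to the wrapper with -d
#   $3 = full text of the public key (contents of the .pub file)
make_scp_entry() {
    printf 'from="%s",command="%s/bin/spade_scp_wrapper.py -d %s",' \
        "$1" "$HOME" "$2"
    printf 'no-agent-forwarding,no-port-forwarding,no-pty,no-user-rc,no-X11-forwarding %s\n' "$3"
}
```

The output is a single line that can be reviewed and then appended to ~/.ssh/authorized_keys, for example with a host pattern of your choosing in place of localhost.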


Finally, clean up the temporary file created by the preceding scp command with the following commands.

cd
rm ~/spade/receiving/loopback/temp

Your host is now set up to transfer files to itself using scp, and you can proceed by configuring SPADE to use this mechanism.

Configuration

In order for SPADE to transfer files using scp, it needs to be configured to use the gov.lbl.nest.spade.services.impl.SCPTransfer class rather than the gov.lbl.nest.spade.services.impl.LocalhostTransfer class it has been using. The following commands make the necessary changes to the spade.xml file.

sed -i.bck.1 -e 's/LocalhostTransfer/SCPTransfer/g' ~/spade/spade.xml
export PATCH_HOST=s/localhost:\\~\\//`hostname`:/g
sed -i.bck.2 -e $PATCH_HOST ~/spade/spade.xml
unset PATCH_HOST

The note above about the FQDN also applies here, to the resulting spade.xml file.
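
To preview what the two substitutions do, or to apply them with an explicit FQDN, the same edits can be run as a filter on sample text. The following is a minimal sketch: patch_for_scp is a hypothetical helper, and the sample line in the usage note is illustrative only, not the real layout of spade.xml.

```shell
# Hypothetical helper: apply the two edits described above to text read
# from standard input. $1 is the host name (ideally the FQDN) to use.
patch_for_scp() {
    # First expression swaps the transfer class; the second rewrites
    # "localhost:~/" targets to "<host>:", which scp resolves relative
    # to the remote home directory.
    sed -e 's/LocalhostTransfer/SCPTransfer/g' -e "s|localhost:~/|$1:|g"
}
```

For example, echo 'localhost:~/spade/receiving/loopback' | patch_for_scp node.example.org prints node.example.org:spade/receiving/loopback, which has the same form as the scp target used earlier.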

Execution

Execution of this scenario is the same as for the scenario upon which it is built, namely the "Loopback Transfers" scenario. The following commands show how to run the whole scenario.

${WILDFLY_HOME}/bin/jboss-cli.sh --connect --command="deploy \
    --name=spade.war ${SPADE_WAR}"
mkdir -p ~/spade/dropbox/scenario/loopback
cat > ~/spade/dropbox/scenario/loopback/mock.3.data << EOF
put some junk in here
EOF
touch ~/spade/dropbox/scenario/loopback/mock.3.sem
${SPADE_HOME}/src/main/python/spade-cli local_scan
# Wait for the "finisher" for this file to stop, e.g. 5 seconds
sleep 5
${SPADE_HOME}/src/main/python/spade-cli inbound_scan
# Wait for the "finisher" for this file to stop, e.g. 5 seconds
sleep 5
${SPADE_HOME}/src/main/python/spade-cli send_confirmations
find ~/spade/warehouse -name "mock.3.*"
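
The fixed sleep 5 calls above assume that each finisher completes within five seconds. On a slower host, polling for the expected file may be more robust. The following is a minimal sketch, assuming that the appearance of a matching file under a directory is an adequate signal; wait_for_file is a hypothetical helper and is not part of spade-cli.

```shell
# Hypothetical helper: poll once per second until a file matching a name
# pattern exists under a directory, or until the time limit runs out.
#   $1 = directory to search, $2 = name pattern, $3 = limit in seconds
wait_for_file() {
    dir=$1
    pattern=$2
    remaining=$3
    while [ "$remaining" -gt 0 ]; do
        if [ -n "$(find "$dir" -name "$pattern" 2>/dev/null | head -n 1)" ]; then
            return 0
        fi
        sleep 1
        remaining=$((remaining - 1))
    done
    return 1
}
```

For instance, wait_for_file ~/spade/warehouse "mock.3.*" 30 after the send_confirmations step returns 0 as soon as the file shows up in the warehouse, and 1 if it has not appeared after about 30 seconds.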

Cleanup

Having successfully completed this scenario you should now undeploy the application using the following command.

${WILDFLY_HOME}/bin/jboss-cli.sh --connect --command="undeploy spade.war"