Configuring SPADE to use scp for file transfers

This article explains how to configure a SPADE deployment so that data transfers are done using scp.

Pre-requisites

It is assumed that the nest-spade-war project has been installed and is running as outlined here, and that the log output of the JBoss server can be seen in a second terminal.

It is also helpful if you have at least read the Local Warehouse scenario in order to familiarize yourself with the concepts discussed there as they will be re-used here. Moreover, as this scenario builds upon the Loopback Transfers scenario to provide data, you should work through that one first.

Also, as in that scenario, the following environment variables need to be set. They are shown here being set to their standard values.

export JBOSS_HOME=${HOME}/server/jboss-as-7.1.1.Final
export SPADE_VERSION=2.2.0
export SPADE_WAR_HOME=${HOME}/nest-spade-war-${SPADE_VERSION}
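
Before going further it can help to confirm that all three variables are actually visible to the processes you launch. The following is a minimal Python sketch and not part of SPADE; the function name missing_variables is hypothetical, and the variable names are taken from the export commands above.

```python
import os

# Names taken from the export commands above.
REQUIRED = ("JBOSS_HOME", "SPADE_VERSION", "SPADE_WAR_HOME")


def missing_variables(env=None):
    """Return the required variable names that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED if not env.get(name)]


if __name__ == "__main__":
    missing = missing_variables()
    if missing:
        print("not set: " + ", ".join(missing))
    else:
        print("all required variables are set")
```

Run it from the same shell in which you executed the export commands, since variables set in one terminal are not seen by another.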

Preparing for scp Transfers Between Nodes

SPADE can use any mechanism to transfer a file, provided there is an implementation of the gov.lbl.nest.spade.services.FileTransfer interface for that mechanism. So far the example scenarios have only transferred files to localhost, using the gov.lbl.nest.spade.services.impl.LocalhostTransfer implementation. To transfer files between different hosts, however, you need to configure your SPADE deployment to use a different class. The easiest choice is scp, as it is pre-installed on most hosts, but it does require a bit of preparation to make it execute cleanly.

To begin with, your host needs an ssh key so that scp can transfer files without prompting for a password. You will also want to make sure that this key cannot be used for any other purpose on the target host. That is the job of the ~/bin/spade_scp_wrapper.py script: it rejects any command other than an scp delivery and can restrict deliveries to a given directory. The following commands install that script on your current host.

mkdir -p ~/bin
cat > ~/bin/spade_scp_wrapper.py << 'EOF'
#!/usr/bin/env python

scp_server = '/usr/bin/scp'

if __name__ == '__main__' :

    from optparse import OptionParser, OptionGroup, IndentedHelpFormatter
    parser = OptionParser(usage='scp-wrapper [-d dir]',
                          version='%prog 1.0')
    parser.add_option('-d',
                      '--directory',
                      dest = 'DIRECTORY',
                      help = 'limit actions to this directory or below',
                      default = None)
    # Replace optparse's default error handler, which would print usage and exit.
    def suppress_error(self, message):
        pass
    import new
    parser.error = new.instancemethod(suppress_error,
                                      parser,
                                      parser.__class__)
    import sys
    try:
        (options, args) = parser.parse_args()
    except:
        sys.stderr.write('account restricted: misconfigured destination\n')
        sys.exit(1)
   
    import os
    try:
        command = os.environ['SSH_ORIGINAL_COMMAND']
    except KeyError:
        sys.stderr.write('account restricted: only non-interactive access allowed\n')
        sys.exit(-1)

    tokens = command.split()
    if 'scp' != tokens[0]:
        sys.stderr.write('account restricted: only scp is allowed\n')
        sys.exit(-2)

    scpParser = OptionParser(usage='scp [-pv] -t file',
                             version='%prog')
    scpParser.disable_interspersed_args()
    scpParser.add_option('-p',
                         dest='PRESERVE',
                         action='store_true',
                         default = False)
    scpParser.add_option('-t',
                         dest='TRANSFER',
                         action='store_true',
                         default = False)
    scpParser.add_option('-v',
                         dest='VERBOSE',
                         action='store_true',
                         default = False)
    scpParser.error = new.instancemethod(suppress_error,
                                         scpParser,
                                         scpParser.__class__)
    try:
        (scpOptions, scpArgs) = scpParser.parse_args(tokens[1:])
    except:
        sys.stderr.write('account restricted: only limited delivery is allowed\n')
        sys.exit(-3)

    if not scpOptions.TRANSFER:
        sys.stderr.write('account restricted: only delivery is allowed\n')
        sys.exit(-3)

    if None != options.DIRECTORY:
        for arg in scpArgs:
            if not os.path.abspath(arg).startswith(options.DIRECTORY):
                sys.stderr.write('account restricted: only SPADE delivery is allowed\n')
                sys.exit(-4)

    tokens[0] = scp_server
    os.execve(scp_server, tokens, {})
EOF
chmod +x ~/bin/spade_scp_wrapper.py
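
Two properties of the wrapper above are worth keeping in mind. It targets Python 2, since it relies on the new module, which was removed in Python 3. Its directory test is also a plain string-prefix comparison, so a sibling directory whose name merely starts with the allowed path (for example one ending in loopback-other) would pass as well. The following is a minimal Python 3 sketch of a stricter containment test, offered as an illustration rather than a replacement for the wrapper; the function name is_within is hypothetical, and symbolic links are not resolved.

```python
import os


def is_within(path, directory, cwd=None):
    """Return True if path resolves to directory itself or something below it.

    Relative paths are resolved against cwd (default: the current directory),
    which mirrors how os.path.abspath treats the scp target in the wrapper.
    """
    base = os.path.abspath(os.getcwd() if cwd is None else cwd)
    target = os.path.normpath(os.path.join(base, path))
    allowed = os.path.normpath(os.path.join(base, directory))
    # commonpath compares whole path components, so '/a/bc' is not inside '/a/b'.
    return os.path.commonpath([target, allowed]) == allowed
```

With this check, is_within("/home/u/spade/receiving/loopback-other/x", "/home/u/spade/receiving/loopback") is False, whereas the same pair passes a startswith comparison. The /home/u paths are placeholders.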

With this script in place, you can now generate the key you will use for all scp transfers. The following commands do just that.

mkdir -p ~/.ssh
chmod 700 ~/.ssh
cd ~/.ssh
ssh-keygen -t rsa -N '' -C "SCP Transfers from `hostname`" -f `hostname`_scptransfer
cat >> config << EOF

Host `hostname`
    User `whoami`
    IdentityFile ~/.ssh/`hostname`_scptransfer
EOF
chmod 644 config
cd -

Note that this key is generated without a passphrase. For a production deployment you should review this choice, taking into account the options set in the authorized_keys file that limit the key to executing ~/bin/spade_scp_wrapper.py, and decide whether you need something more secure.

You now need to install the public portion of the key into the target ~/.ssh/authorized_keys file and also set up the matching ~/.ssh/known_hosts file. The following commands will do that, creating either file if necessary. (You will have to type in "yes" in response to the "Are you sure you want to continue connecting (yes/no)?" prompt.)

cd ~/.ssh
read PUBLIC_KEY < `hostname`_scptransfer.pub
echo from=\"`hostname`\",command=\"$HOME/bin/spade_scp_wrapper.py -d $HOME/spade/receiving/loopback\",\
no-agent-forwarding,no-port-forwarding,no-pty,no-user-rc,no-X11-forwarding $PUBLIC_KEY >> authorized_keys
chmod 644 authorized_keys
unset PUBLIC_KEY
scp -i `hostname`_scptransfer `hostname`_scptransfer.pub `hostname`:spade/receiving/loopback/temp

Note that if the hostname command does not return the FQDN of your host, the scp may not work correctly. In that case you will need to edit the ~/.ssh/authorized_keys file by hand to put the FQDN in there.
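
To make the structure of the authorized_keys line easier to see, and to show one way of obtaining a fully qualified name, here is a minimal Python sketch that assembles the same entry as the echo command above. The function name authorized_keys_entry is hypothetical. The value returned by socket.getfqdn() is only a best guess that depends on your resolver configuration, so treat it as an assumption to be checked.

```python
import socket

# Restrictions copied from the echo command above.
RESTRICTIONS = (
    "no-agent-forwarding",
    "no-port-forwarding",
    "no-pty",
    "no-user-rc",
    "no-X11-forwarding",
)


def authorized_keys_entry(public_key, home, host=None):
    """Build the restricted authorized_keys line for the scp transfer key."""
    # Fall back to the resolver's idea of the fully qualified name.
    host = socket.getfqdn() if host is None else host
    command = "%s/bin/spade_scp_wrapper.py -d %s/spade/receiving/loopback" % (home, home)
    options = ['from="%s"' % host, 'command="%s"' % command]
    options.extend(RESTRICTIONS)
    # Options are comma-separated; a single space precedes the key itself.
    return ",".join(options) + " " + public_key.strip()
```

Printing the result for the contents of your .pub file and comparing it with the line in ~/.ssh/authorized_keys is a quick way to spot a host name that is not fully qualified.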

Finally, remove the temporary file created by the preceding scp command.

rm ~/spade/receiving/loopback/temp

Your host is now set up to transfer files to itself using scp, and you can proceed by configuring SPADE to use this mechanism.

Configuration

In order for SPADE to transfer files using scp, it needs to be configured to use the gov.lbl.nest.spade.services.impl.SCPTransfer class rather than the gov.lbl.nest.spade.services.impl.LocalhostTransfer class it has been using, and the target node needs to be changed to the host's name rather than "localhost". The following commands make the necessary changes to the spade.xml file.

sed -i.bck.1 -e 's/LocalhostTransfer/SCPTransfer/g' ~/spade/spade.xml
export PATCH_HOST=s/localhost:\\~\\//`hostname`:/g
sed -i.bck.2 -e $PATCH_HOST ~/spade/spade.xml
unset PATCH_HOST

The note above about the FQDN also applies here to the resulting spade.xml file.
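
The effect of the two sed substitutions can be illustrated on a plain string. The following Python sketch applies the same replacements; the function name patch_spade_config is hypothetical, and the sample text in the usage note is made up for illustration and does not reflect the actual layout of spade.xml.

```python
def patch_spade_config(text, host):
    """Apply the same two substitutions as the sed commands, to a string."""
    # First sed: switch the transfer implementation class.
    text = text.replace("LocalhostTransfer", "SCPTransfer")
    # Second sed: 'localhost:~/' becomes '<host>:', leaving a path that scp
    # interprets relative to the remote account's home directory.
    return text.replace("localhost:~/", host + ":")
```

For instance, given the illustrative input "LocalhostTransfer localhost:~/spade/receiving/loopback" and the placeholder host node.example.org, the function returns "SCPTransfer node.example.org:spade/receiving/loopback".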

Execution

Execution of this scenario is the same as the scenario upon which it is built, namely the Loopback Transfers scenario. The following commands show how to run the whole scenario.

${JBOSS_HOME}/bin/jboss-cli.sh --connect --command="deploy \
    --name=spade.war ${SPADE_WAR_HOME}/target/spade-${SPADE_VERSION}.war"
mkdir -p ~/spade/dropbox/mock
cat > ~/spade/dropbox/mock/mock.3.data << EOF
put some junk in here
EOF
touch ~/spade/dropbox/mock/mock.3.lem
${SPADE_WAR_HOME}/src/main/bash/localScan
# Wait for the "finisher" for this file to stop, e.g. 5 seconds
sleep 5
${SPADE_WAR_HOME}/src/main/bash/inboundScan
# Wait for the "finisher" for this file to stop, e.g. 5 seconds
sleep 5
${SPADE_WAR_HOME}/src/main/bash/sendConfirmations
find ~/spade/warehouse -name "mock.3.*"
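
The five-second sleeps above are only an estimate of how long processing takes. As an alternative to the final find, the following Python sketch polls the warehouse until a matching file appears or a timeout passes. The function names and the default timeout are assumptions chosen for illustration, and the sketch returns as soon as the first match is seen, so it does not guarantee that every expected file has arrived.

```python
import fnmatch
import os
import time


def find_files(root, pattern):
    """Return sorted paths under root whose file name matches pattern."""
    matches = []
    for directory, _subdirs, names in os.walk(root):
        for name in fnmatch.filter(names, pattern):
            matches.append(os.path.join(directory, name))
    return sorted(matches)


def wait_for_files(root, pattern, timeout=30.0, interval=1.0):
    """Poll until at least one match appears or timeout seconds pass."""
    deadline = time.monotonic() + timeout
    while True:
        matches = find_files(root, pattern)
        if matches or time.monotonic() >= deadline:
            return matches
        time.sleep(interval)
```

Calling wait_for_files(os.path.expanduser("~/spade/warehouse"), "mock.3.*") returns the same paths that the find command would list, or an empty list if nothing arrives in time.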

Cleanup

Having successfully completed this scenario you should now undeploy the application using the following command.

${JBOSS_HOME}/bin/jboss-cli.sh --connect --command="undeploy spade.war"