Configuring SPADE to use customized metadata

This article explains how to configure a SPADE deployment so that it uses a customized Metadata class rather than the default, which contains only the date and time at which the semaphore file was originally created.


NOTE: As of JEE 7 this article is no longer valid. We are working to provide a suitable update.


Pre-requisites

It is assumed that the SPADE project has been installed and is running as outlined here, and that the log output of the JBoss server can be seen in a second terminal.

It is also helpful if you have at least read the "Local Warehouse" scenario in order to familiarize yourself with the concepts discussed there as they will be re-used here. Moreover, as this scenario builds upon the "Non-dropbox Data" scenario to provide data, you should work through that one first.

Also, as in that scenario, the following environment variables need to be set. They are shown here being set to their standard values.

export WILDFLY_HOME=${HOME}/server/wildfly-9.0.2.Final
export SPADE_VERSION=3.0.1
export SPADE_HOME=${HOME}/nest-spade-war-${SPADE_VERSION}
export SPADE_WAR=${SPADE_HOME}/target/spade-${SPADE_VERSION}.war
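
Because every later command depends on these variables, it can help to confirm that they are all set before continuing. The following is a minimal sketch; check_spade_env is a hypothetical helper name introduced here for illustration and is not part of SPADE.

```shell
# check_spade_env: report which of the four variables used in this
# scenario are unset or empty. Hypothetical helper, not part of SPADE.
check_spade_env() {
    missing=0
    for name in WILDFLY_HOME SPADE_VERSION SPADE_HOME SPADE_WAR; do
        # Indirect lookup of the variable whose name is held in $name.
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            echo "$name is not set" >&2
            missing=1
        fi
    done
    # Exit status 0 when all four are set, 1 otherwise.
    return $missing
}
```

Running "check_spade_env && echo ready" after the exports above prints "ready" only when all four variables have a value.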

Configuration

SPADE can use any class to carry a file's metadata, provided that class implements the gov.lbl.nest.spade.registry.Metadata tag interface. However, you also need to provide an accompanying gov.lbl.nest.spade.services.MetadataManager implementation so that SPADE is able to use the class. Details about the MetadataManager are provided elsewhere; for now it is enough to know that such a class exists.

The standard distribution of SPADE contains an alternate Metadata implementation, along with its accompanying MetadataManager implementation. In order to use this alternate version you need to edit the src/main/webapp/WEB-INF/beans.xml file in your copy of the nest-spade-war project and add the following to the <beans> element in that file.

    <alternatives>
        <class>gov.lbl.nest.spade.metadata.Metadata2ManagerImpl</class>
    </alternatives>
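
Before rebuilding, a quick text search can confirm that the entry was saved in the file. This is only a sketch: has_alternative is a hypothetical helper name, and it performs a plain string match rather than an XML parse, so it does not prove that the entry sits inside the <alternatives> element.

```shell
# has_alternative FILE CLASS: succeed if FILE contains a <class> entry
# naming CLASS. Hypothetical helper; a literal text match only.
has_alternative() {
    grep -qF "<class>$2</class>" "$1"
}
```

For example, has_alternative "${SPADE_HOME}/src/main/webapp/WEB-INF/beans.xml" gov.lbl.nest.spade.metadata.Metadata2ManagerImpl returns a zero exit status once the edit is in place.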

You then need to rebuild the deployment WAR file using the following command.

cd ${SPADE_HOME}
mvn -DuseExampleDS clean package
cd -

Execution

Execution of this scenario is the same as for the "Non-dropbox Data" scenario upon which it is built. The following commands show how to run the whole scenario.

${WILDFLY_HOME}/bin/jboss-cli.sh --connect --command="deploy \
    --name=spade.war ${SPADE_WAR}"
mkdir -p ~/spade/other/directory/scenario/external
cat > ~/spade/other/directory/scenario/external/mock.9.data << EOF
put some junk in here
EOF
mkdir -p ~/spade/dropbox/scenario/external
echo "../../../other/directory/scenario/external/mock.9.data" > ~/spade/dropbox/scenario/external/mock.9.sem
${SPADE_HOME}/src/main/python/spade-cli local_scan
# Wait for the "finisher" for this file to stop, e.g. 5 seconds
sleep 5
${SPADE_HOME}/src/main/python/spade-cli inbound_scan
# Wait for the "finisher" for this file to stop, e.g. 5 seconds
sleep 5
${SPADE_HOME}/src/main/python/spade-cli send_confirmations
find ~/spade/warehouse -name "mock.9.*"
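
The fixed five-second waits above are a guess at how long processing takes. As an alternative for the final check, the sketch below polls until a matching file appears. It assumes that the appearance of mock.9.* under ~/spade/warehouse is an adequate sign that processing has finished; wait_for_file is a hypothetical helper and is not part of spade-cli.

```shell
# wait_for_file DIR PATTERN [TIMEOUT_SECONDS]
# Poll about once per second until find(1) reports a match for PATTERN
# under DIR, or give up after TIMEOUT_SECONDS (default 30).
# Hypothetical helper, not part of spade-cli.
wait_for_file() {
    dir=$1
    pattern=$2
    timeout=${3:-30}
    elapsed=0
    while :; do
        # Any output from find means at least one match exists.
        if [ -n "$(find "$dir" -name "$pattern" 2>/dev/null)" ]; then
            return 0
        fi
        if [ "$elapsed" -ge "$timeout" ]; then
            return 1
        fi
        sleep 1
        elapsed=$((elapsed + 1))
    done
}
```

It could be used as wait_for_file ~/spade/warehouse "mock.9.*" 60 in place of the last find command, with a non-zero exit status indicating that nothing arrived within the timeout.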

You can inspect the new metadata using the following command. You will see that it now contains a <sourcefile> element, which holds the path where the shipping SPADE instance found the data, and a <sourcehost> element, which holds the node on which the shipping SPADE instance was executing.

xmllint --format $(find ~/spade/warehouse -name "mock.9.meta.xml")
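
To print just the two new values rather than the whole document, the --xpath option of xmllint can be used. This is a sketch under assumptions: the layout of the metadata file is not shown in this article, so the query matches on local-name() to work whether or not the document declares a namespace, and show_source_fields is a hypothetical helper name.

```shell
# show_source_fields FILE: print the text of the <sourcefile> and
# <sourcehost> elements of FILE, one per line. Hypothetical helper.
show_source_fields() {
    for element in sourcefile sourcehost; do
        # string(...) yields the text of the first matching element,
        # or an empty string when the element is absent.
        value=$(xmllint --xpath "string(//*[local-name()='$element'])" "$1")
        printf '%s: %s\n' "$element" "$value"
    done
}
```

For example: show_source_fields "$(find ~/spade/warehouse -name "mock.9.meta.xml")".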

Cleanup

Having successfully completed this scenario, you should now undeploy the application using the following command.

${WILDFLY_HOME}/bin/jboss-cli.sh --connect --command="undeploy spade.war"