This article explains how to configure a SPADE deployment so that it can run an analysis on any files it places in the local warehouse.
Whether data is analyzed is determined by its registration. The following commands create a file containing a suitable registration.
mkdir -p ${HOME}/spade.zero/spade/registrations/local
cat > ${HOME}/spade.zero/spade/registrations/local/analyze_data.3.xml << EOF
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<registration>
    <local_id>analyze.data.3</local_id>
    <drop_box>
        <location>
            <directory>data/spade/dropbox</directory>
        </location>
        <pattern>.*.ayz</pattern>
        <mapping>gov.lbl.nest.spade.tour.AnalysisLocator</mapping>
    </drop_box>
    <analyze>true</analyze>
</registration>
EOF
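To see which drop box file names the pattern element in this registration would select, here is a minimal sketch. It assumes that SPADE treats the pattern as a regular expression that must match the whole file name, which this article does not state. The helper matches_pattern is hypothetical and not part of SPADE.

```shell
# Hypothetical helper (not part of SPADE): succeed when the whole file
# name matches the given extended regular expression.
# Assumption: SPADE applies the registration pattern to the full name.
matches_pattern() {
    local pattern="$1" name="$2"
    # -E: extended regex, -x: the match must span the entire line, -q: quiet
    printf '%s\n' "${name}" | grep -Eqx -- "${pattern}"
}

# Try the pattern from the registration against the two files used later on.
for name in tour.3.ayz tour.3.data; do
    if matches_pattern '.*.ayz' "${name}"; then
        echo "${name}: matches"
    else
        echo "${name}: does not match"
    fi
done
```

Under this assumption only tour.3.ayz matches. Because the dot before ayz is not escaped, it stands for any single character, so the pattern is slightly looser than a literal ".ayz" suffix.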
Analysis of data is done by the analysis task. Therefore, in order for SPADE to analyze data as it is put in the warehouse, you need to add a suitable element to the spade.xml file. In this case, the following element can be added to the assembly element immediately after its name element.
<activity>
    <name>analyzer</name>
    <init-param>
        <param-name>command</param-name>
        <param-value><![CDATA[/opt/jboss/data/analysis/scripts/simple.sh {0} {2}]]></param-value>
    </init-param>
    <init-param>
        <param-name>output</param-name>
        <param-value>/opt/jboss/data/external/spade.zero/log</param-value>
    </init-param>
</activity>
You are now ready to redeploy SPADE:
docker exec -it tour_spade \
    cp wars/spade-${SPADE_VERSION}.war \
    /opt/wildfly/standalone/deployments/spade.war
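WildFly's deployment scanner records the outcome of a deployment in marker files placed next to the archive, such as spade.war.deployed or spade.war.failed. The following sketch checks for those markers. The helper deployment_status is hypothetical, and running it inside the tour_spade container (for example through docker exec) is an assumption, not something taken from the SPADE documentation.

```shell
# Hypothetical helper: report the state of a deployment by looking for
# the marker files that WildFly's deployment scanner writes.
deployment_status() {
    local dir="$1" archive="$2"
    if [ -e "${dir}/${archive}.deployed" ]; then
        echo "deployed"
    elif [ -e "${dir}/${archive}.failed" ]; then
        echo "failed"
    elif [ -e "${dir}/${archive}.isdeploying" ]; then
        echo "deploying"
    else
        echo "unknown"
    fi
}

# Example call, using the deployments directory from the copy command.
# Outside the container this directory is absent, so it prints "unknown".
deployment_status /opt/wildfly/standalone/deployments spade.war
```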
As you can see from the activity declaration above, SPADE expects to write the log files to the specified directory. Therefore, before continuing, you should create the appropriate directory with the following command.
mkdir -p ${HOME}/spade/shared/spade.zero/log
You can now send a data file to the local warehouse and have it analyzed with the following:
cat > ${HOME}/spade.zero/dropbox/tour.3.data << EOF
This data file should be analyzed after it is placed in the local warehouse
EOF
touch ${HOME}/spade.zero/dropbox/tour.3.ayz
docker exec -it tour_spade bash -l -c "spade-cli local_scan"
The following command will show the files generated by the analysis.
find ${HOME}/spade/shared/spade.zero/log/ -name "*tour.3.*"
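If the analysis has not finished when the scan command returns, the find command may print nothing at first. The following is a minimal sketch of a polling loop that waits for matching files to appear. The helper wait_for_files and its default timeout are hypothetical and not part of SPADE.

```shell
# Hypothetical helper: poll a directory tree until at least one file
# matching the given name pattern exists, or the timeout (in seconds,
# default 30) runs out. Returns 0 on success and 1 on timeout.
wait_for_files() {
    local dir="$1" pattern="$2" timeout="${3:-30}"
    local waited=0
    while [ "${waited}" -le "${timeout}" ]; do
        # A non-empty result means at least one matching file was found.
        if [ -n "$(find "${dir}" -name "${pattern}" 2>/dev/null | head -n 1)" ]; then
            return 0
        fi
        sleep 1
        waited=$((waited + 1))
    done
    return 1
}

# Usage (not executed here):
#   wait_for_files "${HOME}/spade/shared/spade.zero/log" "*tour.3.*" 60 \
#       && find "${HOME}/spade/shared/spade.zero/log/" -name "*tour.3.*"
```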
There are two pairs of files. The first pair, in the date-stamped directory tree, contains the output of simple.sh, the script run by SPADE. The second pair, under the analysis directory, contains the redirected output of analysis.sh, the script run by simple.sh.
In this case the analysis simply outputs the bundle name and its file location.
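The contents of simple.sh and analysis.sh are not shown in this article. The following is only a sketch of how such a pair of scripts could be organized, written as two shell functions so that it is self-contained. It assumes that the two arguments passed by SPADE are the bundle name and the file location, and that each pair of files consists of a .log file and a .err file. The function names and the file layout are hypothetical.

```shell
# Hypothetical stand-in for analysis.sh: print the bundle name and the
# location of its file (assumed meaning of the two arguments).
run_analysis() {
    local bundle="$1" location="$2"
    echo "bundle: ${bundle}"
    echo "location: ${location}"
}

# Hypothetical stand-in for simple.sh: run the analysis with its output
# redirected under <log_root>/analysis, and print progress messages on
# standard output, which SPADE would capture in its own log files.
simple_wrapper() {
    local bundle="$1" location="$2" log_root="$3"
    local analysis_dir="${log_root}/analysis"
    mkdir -p "${analysis_dir}"
    echo "starting analysis of ${bundle}"
    run_analysis "${bundle}" "${location}" \
        > "${analysis_dir}/${bundle}.log" \
        2> "${analysis_dir}/${bundle}.err"
    echo "finished analysis of ${bundle}"
}

# Usage (not executed here; the arguments are illustrative only):
#   simple_wrapper tour.3 /path/to/tour.3.data /tmp/spade-log-demo
```

Keeping the analysis output in its own directory separates it from the wrapper's output, which matches the two pairs of files described above.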
more $(find ${HOME}/spade/shared/spade.zero/log/ -name "*tour.3.log")
This is just one example of how to organize the analysis. The only constraint is that the script specified in the spade.xml file will write its output in a date-based hierarchy.