Medical Bioinformatics and e-Bioscience
The Generic Application Service Wrapper (GASW) is part of MOTEUR, a workflow management system. GASW automatically wraps non-interactive command-line Linux applications (programs) so that they can run as jobs on the grid. Our systems use the gLite middleware, so GASW generates a job description language file (name.jdl) with details about the job and a script (name.sh) that invokes the application from a grid job. The .jdl file references the .sh file. The script downloads the necessary files to the grid node, passes these local files as command-line parameters to the application, and uploads the results generated by the application to grid storage. The application's command-line parameters (among other things) are described in an XML file, called the GASW descriptor, which GASW interprets to generate the script.
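The order of the steps performed by the generated script (download, run, upload) can be sketched as follows. This is only an illustration in Python, not the script that GASW actually generates; the function name `plan_wrapper_steps`, its parameters and the local working directory are hypothetical, and the sketch only plans the steps without contacting the grid.

```python
import posixpath
from typing import Dict, List, Tuple


def plan_wrapper_steps(executable: str,
                       input_lfns: List[str],
                       output_lfns: List[str],
                       workdir: str = ".") -> Dict[str, list]:
    """Describe the download, run and upload phases as plain data.

    Hypothetical helper: it only plans the steps, it does not touch the grid.
    """
    # 1. Each grid input file gets a local copy named after its base name.
    downloads: List[Tuple[str, str]] = [
        (lfn, posixpath.join(workdir, posixpath.basename(lfn)))
        for lfn in input_lfns
    ]
    # 2. The local copies become the command-line parameters.
    command = [executable] + [local for _, local in downloads]
    # 3. Results written locally are copied back to grid storage.
    uploads: List[Tuple[str, str]] = [
        (posixpath.join(workdir, posixpath.basename(lfn)), lfn)
        for lfn in output_lfns
    ]
    return {"downloads": downloads, "command": command, "uploads": uploads}
```

Returning the plan as data keeps the sketch independent of any particular grid copy command, which is left out here on purpose.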
The GASW descriptor's temporary reference manual explains more details, in particular the format of the GASW descriptor.
GASW can also be used in a stand-alone fashion (without MOTEUR) with SaGasw.
Automatic generation of GASW XML, Scufl and Gwendia files
To generate the GASW XML, Scufl and Gwendia files, use this Perl script (provided as is). You should be familiar with the GASW XML format (see GasWTemplate.xml).
Download and extract create_GASW_SCUFL_GWENDIA20110409.zip, which contains 4 files:
1. create_GASW_SCUFL_GWENDIA.pl, the Perl script
2. GasWTemplateFULL.xml, the GASW template file read by the script
3. ScuflTemplate.scufl, the Scufl template file
4. GwendiaTemplate.gwendia, the Gwendia template file
Then:
- chmod a+x create_GASW_SCUFL_GWENDIA.pl
- start the script: ./create_GASW_SCUFL_GWENDIA.pl
All files ("name".xml, WF_"name".scufl and "name".gwendia) will be created in the directory xmlfiles.
Scufl: The iteration strategy must be added to the Scufl file manually! The default iteration strategy is cross-product.
Gwendia: The default depth for all inputs and outputs in the Gwendia file is set to 0. The default Sourcetype for parameter inputs is set to "string" and the iteration strategy is "cross".
- Contact AngelaLuijf for questions.
Tips
Gasw service (only for Scufl)
If the workflow contains more than 6 inputs (parameters and/or files), the standard gasw_service (http://egee1.unice.fr/wsdl/gasw_service.wsdl or http://amc-app1.amc.sara.nl/luyf/MoteurXML/wsdl/gasw_service.wsdl) has to be replaced in the Scufl file by another service description.
E.g.:
http://egee1.unice.fr/wsdl/gasw_service_10_6.wsdl
http://egee1.unice.fr/wsdl/gasw_service_31_1.wsdl
Or:
http://amc-app1.amc.sara.nl/luyf/MoteurXML/wsdl/gasw_service_10_6.wsdl
http://amc-app1.amc.sara.nl/luyf/MoteurXML/wsdl/gasw_service_31_1.wsdl
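The replacement in the Scufl file can be done by hand in an editor, or with a small script such as the sketch below. It assumes that the standard WSDL URL appears literally in the Scufl file; the function name `replace_gasw_service` is hypothetical, and the sketch does not decide which of the alternative WSDL files fits a given workflow.

```python
# The two standard service descriptions named above.
STANDARD_WSDLS = (
    "http://egee1.unice.fr/wsdl/gasw_service.wsdl",
    "http://amc-app1.amc.sara.nl/luyf/MoteurXML/wsdl/gasw_service.wsdl",
)


def replace_gasw_service(scufl_text: str, new_wsdl: str) -> str:
    """Return the Scufl text with every standard gasw_service URL replaced.

    Plain text substitution (assumption: the URL is stored verbatim in the
    file); the XML structure of the Scufl file is not parsed.
    """
    for standard in STANDARD_WSDLS:
        scufl_text = scufl_text.replace(standard, new_wsdl)
    return scufl_text
```

A typical use would be to read the generated WF_"name".scufl file, pass its contents through this function with the chosen URL, and write the result back.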
Job Requirements
Job requirements (e.g. installed software, queue type and other characteristics of the worker node where the program can be executed) can be indicated in the GASW descriptor. Multiple requirements must be given on separate lines; the && operator cannot be used because of a parsing bug. The correct use of single and double quotes is important.
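As an illustration of this rule, the sketch below splits a combined expression on && and writes one requirement element per line, with single quotes around attribute values that contain double quotes. The helper names `split_requirements` and `requirement_lines` are hypothetical, the split is naive (it would also break on an && inside a quoted string), and it is an assumption that the GASW parser accepts exactly this output.

```python
from typing import List
from xml.sax.saxutils import quoteattr


def split_requirements(expression: str) -> List[str]:
    """Split a combined expression on && (naive: ignores quoting)."""
    return [part.strip() for part in expression.split("&&") if part.strip()]


def requirement_lines(requirements: List[str]) -> List[str]:
    """Build one <requirement .../> element per requirement.

    quoteattr wraps the value in single quotes when it contains double
    quotes (and no single quotes), which matches the quoting used in
    GASW descriptors.
    """
    return ["<requirement value=%s/>" % quoteattr(req) for req in requirements]
```

Each returned string is meant to be placed on its own line inside the executable element of the descriptor.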
An example:
<description>
<executable name="coregDataset1to2_2.sh">
<access type="LFN"/>
<value value="/grid/vlemed/matthan/bin/coregDataset1to2_2.sh"/>
<requirement value='RegExp("gina.sara.nl", other.GlueCEUniqueId)'/>
<requirement value='(other.GlueHostArchitecturePlatformType == "x86_64")'/>
<input name="data1" option="no1">
<access type="LFN"> </access>
</input>
...