Medical Bioinformatics and e-Bioscience

GASW

The Generic Application Service Wrapper (GASW) is part of MOTEUR, a workflow management system. GASW automatically wraps non-interactive command-line Linux applications (programs) so that they can run as jobs on the grid. Our systems use the gLite middleware, so GASW generates a Job Description Language file (name.jdl) with details about the job and a script (name.sh) that invokes the application from a grid job. The .jdl file references the .sh file. The script downloads the necessary files to the grid node, passes these local files as command-line parameters to the application, and uploads the results generated by the application to grid storage. The application's command-line parameters (among other things) are described in an XML file, called the GASW descriptor, which GASW interprets to generate the script.
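
The three stages performed by the generated script (download, run, upload) can be illustrated with a minimal Python sketch. This is not the script GASW generates, which is a shell script that uses grid tools. The function name run_wrapped is hypothetical, and a plain local directory stands in for grid storage (an assumption made only so that the sketch is runnable).

```python
import shutil
import subprocess
import tempfile
from pathlib import Path


def run_wrapped(command, input_names, output_names, storage_dir):
    """Stage inputs in, run the command on local copies, stage outputs out.

    Assumption: storage_dir is an ordinary local directory that stands in
    for grid storage; real jobs would use grid transfer tools instead.
    """
    storage = Path(storage_dir)
    with tempfile.TemporaryDirectory() as workdir:
        work = Path(workdir)
        # 1. Download: copy every input file to the node-local work directory.
        for name in input_names:
            shutil.copy(storage / name, work / name)
        # 2. Run: pass the local file names as command-line parameters.
        args = list(command) + list(input_names) + list(output_names)
        subprocess.run(args, cwd=work, check=True)
        # 3. Upload: copy every result file back to storage.
        for name in output_names:
            shutil.copy(work / name, storage / name)
```

The command is expected to read the input names and write the output names it receives as arguments, relative to its working directory.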

The GASW descriptor's temporary reference manual explains more details, in particular the format of the GASW descriptor. GASW can also be used in a stand-alone fashion (without MOTEUR) with SaGasw.

Automatic generation of GASW XML, scufl and gwendia files

Use this Perl script (provided as is) to generate the GASW XML, scufl and gwendia files. You should be familiar with the GASW XML format (see GasWTemplate.xml).

Download and extract create_GASW_SCUFL_GWENDIA20110409.zip, which contains 4 files:
1. create_GASW_SCUFL_GWENDIA.pl, the Perl script
2. GasWTemplateFULL.xml, the GASW template file read by the script
3. ScuflTemplate.scufl, the scufl template file
4. GwendiaTemplate.gwendia, the gwendia template file

Then:
- Make the script executable: chmod a+x create_GASW_SCUFL_GWENDIA.pl
- Start the script: ./create_GASW_SCUFL_GWENDIA.pl

All files ("name".xml, WF_"name".scufl and "name".gwendia) are created in the directory xmlfiles.
Scufl: The iteration strategy should be added to the scufl file manually! The default iteration strategy is cross-product.
Gwendia: The default depth for all inputs and outputs in the gwendia file is 0. The default Sourcetype for parameter inputs is "string" and the iteration strategy is "cross".

Contact AngelaLuijf with questions.

Tips

Gasw service (only for Scufl)

If the workflow contains more than 6 inputs (parameters and/or files), the standard gasw_service WSDL (http://egee1.unice.fr/wsdl/gasw_service.wsdl or http://amc-app1.amc.sara.nl/luyf/MoteurXML/wsdl/gasw_service.wsdl) has to be replaced in the scufl file.

E.g.:

http://egee1.unice.fr/wsdl/gasw_service_10_6.wsdl

http://egee1.unice.fr/wsdl/gasw_service_31_1.wsdl

Or:

http://amc-app1.amc.sara.nl/luyf/MoteurXML/wsdl/gasw_service_10_6.wsdl

http://amc-app1.amc.sara.nl/luyf/MoteurXML/wsdl/gasw_service_31_1.wsdl
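
The replacement can be done by hand in a text editor. As a minimal sketch, the following Python helper performs the same substitution on the text of a scufl file. The function name replace_gasw_service and the idea of passing the suffix (such as "10_6") as a parameter are assumptions for illustration; which suffix a given workflow needs is not determined by this code.

```python
# The two standard service URLs named in this section.
STANDARD_WSDLS = (
    "http://egee1.unice.fr/wsdl/gasw_service.wsdl",
    "http://amc-app1.amc.sara.nl/luyf/MoteurXML/wsdl/gasw_service.wsdl",
)


def replace_gasw_service(scufl_text, suffix):
    """Return scufl_text with each standard gasw_service URL replaced by
    the variant on the same host, e.g. gasw_service_10_6.wsdl for "10_6".
    """
    for url in STANDARD_WSDLS:
        base = url[: -len(".wsdl")]  # strip the extension to append the suffix
        scufl_text = scufl_text.replace(url, f"{base}_{suffix}.wsdl")
    return scufl_text
```

A possible use is to read the scufl file, call replace_gasw_service(text, "10_6"), and write the result back. URLs that already carry a suffix are left unchanged.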

Job Requirements

Job requirements (e.g. installed software, queue type and other characteristics of the worker node where the program can be executed) can be specified in the GASW descriptor. Multiple requirements must be given on separate lines; the && operator cannot be used because of a parsing bug. The correct use of single and double quotes is important.
An example:

<description>
<executable name="coregDataset1to2_2.sh">
<access type="LFN"/>
<value value="/grid/vlemed/matthan/bin/coregDataset1to2_2.sh"/>

<requirement value='RegExp("gina.sara.nl", other.GlueCEUniqueId)'/>
<requirement value='(other.GlueHostArchitecturePlatformType == "x86_64")' />
<input name="data1" option="no1">
<access type="LFN"> </access>
</input>
...
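
The quoting convention in the example (single quotes around the attribute value, double quotes inside the expression) and the rule of one requirement per line can be captured in a small helper. This is a sketch only: the function name requirement_lines is hypothetical, and the checks encode just the two constraints stated in this section, not the full behaviour of the GASW parser.

```python
def requirement_lines(expressions):
    """Return one <requirement .../> line per expression.

    Each value is wrapped in single quotes, so the expression itself may
    contain double quotes but no single quotes.
    """
    lines = []
    for expr in expressions:
        if "&&" in expr:
            # Combined expressions must be split into separate requirements.
            raise ValueError("use separate requirements instead of &&: " + expr)
        if "'" in expr:
            raise ValueError("single quote not allowed inside value: " + expr)
        lines.append(f"<requirement value='{expr}'/>")
    return lines
```

Calling the helper with the two expressions from the example yields two lines that can be pasted into the descriptor.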

Topic revision: r31 - 2011-04-28 - AngelaLuijf