Medical Bioinformatics and e-Bioscience
From executable to workflow
Introduction
There are several ways to turn a Unix command-line executable (bash, perl, java, ...) into a workflow that can be executed on the
Dutch Grid. Not all steps described on this page are required, but together they cover all the procedures that you can follow.
If you are not familiar with the e-bioscience environment, first read:
HowTo
All files can be viewed or downloaded by clicking on the name.
Execution Environment
To test the executable, log in to a user interface system (e.g. ui.grid.sara.nl) so that the environment matches the one found on the worker node when the job runs.
If you have an X window server running, close it, because no X server runs on a worker node.
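As a rough alternative to closing the X server, a test shell can be made to behave as if no X server exists by unsetting DISPLAY. This is an assumption for local testing, not a step from the original procedure, and it only affects the current shell session:

```shell
# Simulate a headless worker node for this shell session only:
# without DISPLAY, X clients cannot reach a running X server.
unset DISPLAY

if [ -z "${DISPLAY:-}" ]; then
    headless="yes"
else
    headless="no"
fi
echo "headless: $headless"
```

Run the executable from this same shell so that it inherits the environment without DISPLAY.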
Create executable and define in- and output
The executable
RandomR.sh starts the R script Random.R.
The R script:
- generates x random values
- plots these values in a png file
- writes these values to a text file.
The parameters for the executable are:
- R version
- number of values
- name of textfile
- name of plotfile
The files graphics.tar.gz and r-graphics.tar.gz are needed for printing graphics with R on the Grid.
See:
GraphicsONGridWithR
Command line: RandomR.sh -p1 r/2.9.2 -p2 100 resultR.txt resultR.png
Note: -p1 and -p2 are parameter options used in the Gasw xml file. They are required to run saGasw!
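The command line above can be read as two option/value pairs followed by two positional file names. The sketch below shows one way a wrapper script might parse that layout. It is a hypothetical illustration, not the actual content of RandomR.sh; the function name parse_args and the variable names are assumptions:

```shell
# Hypothetical argument parsing for a command line of the form:
#   RandomR.sh -p1 <R version> -p2 <number of values> <textfile> <plotfile>
parse_args() {
    r_version=""
    n_values=""
    while [ $# -gt 0 ]; do
        case "$1" in
            -p1) [ $# -ge 2 ] || return 1; r_version="$2"; shift 2 ;;
            -p2) [ $# -ge 2 ] || return 1; n_values="$2"; shift 2 ;;
            *)   break ;;   # first non-option argument: stop option parsing
        esac
    done
    # Exactly two positional arguments must remain: text file and plot file.
    [ $# -eq 2 ] || return 1
    text_file="$1"
    plot_file="$2"
}

if parse_args -p1 r/2.9.2 -p2 100 resultR.txt resultR.png; then
    echo "R version : $r_version"
    echo "values    : $n_values"
    echo "text file : $text_file"
    echo "plot file : $plot_file"
else
    echo "usage: RandomR.sh -p1 <R version> -p2 <number of values> <textfile> <plotfile>" >&2
fi
```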
Automatic generation of Gasw XML, scufl and gwendia file
To generate the Gasw xml, scufl and gwendia files, use the perl script below (provided as is). You should be familiar with the format of Gasw xml (see
GasWTemplate.xml).
Download and extract create_GASW_SCUFL_GWENDIA20110409.zip, which contains 4 files:
1. create_GASW_SCUFL_GWENDIA.pl, the perl script
2. GasWTemplateFULL.xml, the gasw template file read by the script
3. ScuflTemplate.scufl, the scufl template file
4. GwendiaTemplate.gwendia, the gwendia template file
Then:
- chmod a+x create_GASW_SCUFL_GWENDIA.pl
- start the script: ./create_GASW_SCUFL_GWENDIA.pl
All files ("name".xml, WF_"name".scufl and "name".gwendia) will be created in the directory xmlfiles.
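After the script has run, it can be useful to check that the three generated files are present. The sketch below derives the expected paths from the naming pattern given above and reports which ones exist. The function name expected_outputs is an assumption, and RandomR is used as the example name:

```shell
# Print the three file paths expected for a given workflow name,
# following the pattern: "name".xml, WF_"name".scufl, "name".gwendia
expected_outputs() {
    name="$1"
    printf '%s\n' \
        "xmlfiles/$name.xml" \
        "xmlfiles/WF_$name.scufl" \
        "xmlfiles/$name.gwendia"
}

# Report which of the expected files are present in the current directory tree.
for f in $(expected_outputs RandomR); do
    if [ -f "$f" ]; then
        echo "found:   $f"
    else
        echo "missing: $f"
    fi
done
```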
Scufl: the iteration strategy must be added to the scufl file manually! The default iteration strategy is cross-product.
Gwendia: The default depth for all inputs and outputs in the gwendia file is set to 0. The default Sourcetype for parameter input is set to "string" and the iteration strategy is "cross".
- Contact AngelaLuijf for questions.
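To illustrate what the cross-product iteration strategy mentioned above means in practice: every value of one input is combined with every value of the other, so the number of invocations is the product of the list lengths. The nested loop below is only a conceptual sketch, not how the workflow engine implements it; the extra values 200 and 300 are made up for the example:

```shell
# Two input lists; with a cross-product strategy every combination is run.
versions="r/2.9.2"
counts="100 200 300"   # 200 and 300 are illustrative values only

n=0
for v in $versions; do
    for c in $counts; do
        # Print the invocation instead of running it.
        echo "RandomR.sh -p1 $v -p2 $c resultR.txt resultR.png"
        n=$((n + 1))
    done
done
echo "$n invocations"
```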
Example of running create_Gasw_AND_scufl.pl
Copy files to LFC*
Copy the following files to the directories corresponding to the Gasw xml file: RandomR.sh, Random.R, RandomR.xml, WF_RandomR.scufl, WF_RandomR.gwendia, graphics.tar.gz and r-graphics.tar.gz.
- Using the VBrowser
- Using the command line on the cluster, e.g.
- lcg-cr -v --vo vlemed -l lfn:/grid/vlemed/yourdir/RandomR/RandomR.sh "file://$PWD/RandomR.sh"
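The same lcg-cr call can be repeated for every file in a loop. The sketch below assumes that all files go to a single LFC directory (the Gasw xml file may point to different directories, in which case the paths need adjusting) and defaults to a dry run that only prints the commands. The DRY_RUN variable is an assumption introduced for this example:

```shell
# Target LFC directory, taken from the example above; adjust "yourdir".
lfc_dir="lfn:/grid/vlemed/yourdir/RandomR"
files="RandomR.sh Random.R RandomR.xml WF_RandomR.scufl WF_RandomR.gwendia graphics.tar.gz r-graphics.tar.gz"

# DRY_RUN=1 (default) only prints the commands; set DRY_RUN=0 to upload.
DRY_RUN="${DRY_RUN:-1}"

count=0
for f in $files; do
    if [ "$DRY_RUN" = "1" ]; then
        echo lcg-cr -v --vo vlemed -l "$lfc_dir/$f" "file://$PWD/$f"
    else
        lcg-cr -v --vo vlemed -l "$lfc_dir/$f" "file://$PWD/$f"
    fi
    count=$((count + 1))
done
echo "$count files processed"
```

Review the printed commands first, then rerun with DRY_RUN=0 on a system where lcg-cr is available.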
Create Gwendia workflow using Gasw as processor type*
Start the Moteur2 IDE; for installation see:
Moteur2 UserGuide
- Create a new workflow, e.g. RandomR
- Right click, add a new processor, and change its name to RandomR
- Right click on the processor and select configure
- IMPORTANT: the names of the input and output ports must be the same as in the Gasw xml
- Define the port types: string for text, URI for file
- Select the tab Processor, select Processor type Gasw and enter the location of the Gasw xml file (http or lfn)
- Add 2 inputs, define their ports as string, right click on each input, select data link from, and connect it to the processor
- Add 2 outputs, define their ports as URI, right click on the processor and connect data links to the corresponding outputs
- Save the workflow, e.g. RandomR.gwendia
Run workflow in Gwendia format*
Start the VBrowser, right click on the gwendia file, select open with Moteur2, enter the parameter values and click run.
SCUFL
Create workflows using Taverna; see: Manual
Test script running as job on Grid using saGasw*
Skip (sagasw.sh -i input.dat -n output.dat -o GridRandomR -d RandomR.xml). See: saGasw
Run workflow in scufl format
Start the VBrowser, click on the scufl file, enter the parameter values and click run.
Workflow done: