Medical Bioinformatics and e-Bioscience

From executable to workflow

Introduction

There are several ways to turn a Unix command-line executable (bash, Perl, Java, ...) into a workflow that can be executed on the Dutch Grid. Not all steps described on this page are necessary, but together they cover the procedures you can follow.

If you have no knowledge of the e-bioscience environment you can first read: HowTo

All files can be viewed or downloaded by clicking on the name.

Execution Environment

To test the executable, log in to a user interface system (e.g. ui.grid.sara.nl) so that your environment matches the one the job will find on the worker node.

If you have an X Window server running, close it, because no X server runs on a worker node.
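A quick check along these lines can confirm that the test session has no X display attached before the executable is run. This is only a sketch: the function name x_display_status is hypothetical, and it assumes that an X server in use would be reached through the DISPLAY environment variable. Unsetting DISPLAY hides the server from the session; it does not close the server itself.

```shell
#!/bin/bash
# Report whether this shell session points at an X display.
# Assumption: an X server in use is reachable through $DISPLAY.
x_display_status() {
    if [ -n "${DISPLAY:-}" ]; then
        echo "DISPLAY is set to '${DISPLAY}'"
        return 1
    fi
    echo "no X display"
    return 0
}

# Hide any X display for the rest of this session, then verify.
unset DISPLAY
x_display_status
```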

Create executable and define in- and output

The executable RandomR.sh starts the R script Random.R

The R script:

  • generates a given number of random values
  • plots these values in a png file
  • writes these values to a text file.

The parameters for the executable are:
  • R version
  • number of values
  • name of the text file
  • name of the plot file
The files graphics.tar.gz and r-graphics.tar.gz are needed for printing graphics with R on the Grid.

See: GraphicsONGridWithR

Command line: RandomR.sh -p1 r/2.9.2 -p2 100 resultR.txt resultR.png

Note: -p1 and -p2 are the parameter options used in the Gasw xml file. They are required to run saGasw!
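To make the parameter layout concrete, here is a minimal sketch of how a wrapper such as RandomR.sh might read its arguments. The actual contents of RandomR.sh are not shown on this page, so the function name parse_randomr_args and the variable names are hypothetical; only the option names -p1 and -p2 and the argument order are taken from the command line above.

```shell
#!/bin/bash
# Hypothetical parser for:
#   -p1 <R version> -p2 <number of values> <textfile> <plotfile>
parse_randomr_args() {
    local r_version="" nr_values=""
    while [ $# -gt 0 ]; do
        case "$1" in
            -p1|-p2)
                # Each option needs a value after it.
                [ $# -ge 2 ] || { echo "missing value for $1" >&2; return 1; }
                if [ "$1" = "-p1" ]; then r_version="$2"; else nr_values="$2"; fi
                shift 2 ;;
            *)
                break ;;   # first non-option argument: the file names follow
        esac
    done
    # Both options and exactly two file names must be present.
    if [ -z "$r_version" ] || [ -z "$nr_values" ] || [ $# -ne 2 ]; then
        echo "usage: -p1 <R version> -p2 <number of values> <textfile> <plotfile>" >&2
        return 1
    fi
    echo "r_version=$r_version nr_values=$nr_values textfile=$1 plotfile=$2"
}

parse_randomr_args -p1 r/2.9.2 -p2 100 resultR.txt resultR.png
```

A real wrapper would pass these values on to the R script instead of echoing them.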

Automatic generation of Gasw XML, scufl and gwendia file

To generate the Gasw xml, scufl and gwendia files, use this Perl script (provided as is). You should be familiar with the Gasw xml format (see GasWTemplate.xml).

Download and extract create_GASW_SCUFL_GWENDIA20110409.zip which contains 4 files:
1. create_GASW_SCUFL_GWENDIA.pl, the perl script
2. GasWTemplateFULL.xml, the gasw template file read by the script
3. ScuflTemplate.scufl, the scufl template file.
4. GwendiaTemplate.gwendia, the gwendia template file.

Then:
-chmod a+x create_GASW_SCUFL_GWENDIA.pl
-start the script: ./create_GASW_SCUFL_GWENDIA.pl

All files ("name".xml, WF_"name".scufl and "name".gwendia) will be created in the directory xmlfiles.
Scufl: The iteration strategy must be added to the scufl file manually! The default iteration strategy is cross-product.
Gwendia: The default depth for all inputs and outputs in the gwendia file is set to 0. The default source type for parameter input is set to "string" and the iteration strategy is "cross".
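After the script has run, a small check like the one below can confirm that the three files are present. It is a sketch only: the function name check_outputs is hypothetical, and the file names follow the pattern stated above ("name".xml, WF_"name".scufl and "name".gwendia in xmlfiles), so adjust them if the script names its output differently.

```shell
#!/bin/bash
# Print every expected generator output that is missing.
# Usage: check_outputs <name> [directory]   (directory defaults to xmlfiles)
check_outputs() {
    local name="$1" dir="${2:-xmlfiles}" missing=0 f
    for f in "$dir/$name.xml" "$dir/WF_$name.scufl" "$dir/$name.gwendia"; do
        if [ ! -f "$f" ]; then
            echo "missing: $f"
            missing=1
        fi
    done
    # Exit status 0 only when all three files exist.
    return "$missing"
}

check_outputs RandomR || echo "run create_GASW_SCUFL_GWENDIA.pl first"
```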

-contact AngelaLuijf for questions

Example of running create_Gasw_AND_scufl.pl

Copy files to LFC*

Copy the following files to the directories referenced in the Gasw xml file: RandomR.sh, Random.R, RandomR.xml, WF_RandomR.scufl, WF_RandomR.gwendia, graphics.tar.gz and r-graphics.tar.gz

  • Using the VBrowser
  • Using the command line on the cluster, e.g.
    • lcg-cr -v --vo vlemed -l lfn:/grid/vlemed/yourdir/RandomR/RandomR.sh "file://$PWD/RandomR.sh"
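The same lcg-cr call can be repeated for every file in the list. The sketch below only prints the commands so they can be reviewed first. It assumes that all files sit in the current directory and belong in a single LFC directory, which may not match the locations in your Gasw xml file; "yourdir" is the placeholder from the example above, and the names LFC_DIR, FILES and upload_cmd are hypothetical.

```shell
#!/bin/bash
# Build one lcg-cr registration command per file, following the example above.
# Assumption: every file goes to the same LFC directory.
LFC_DIR="lfn:/grid/vlemed/yourdir/RandomR"
FILES="RandomR.sh Random.R RandomR.xml WF_RandomR.scufl WF_RandomR.gwendia graphics.tar.gz r-graphics.tar.gz"

# Print the lcg-cr command for a single file name.
upload_cmd() {
    printf 'lcg-cr -v --vo vlemed -l %s/%s "file://%s/%s"\n' "$LFC_DIR" "$1" "$PWD" "$1"
}

for f in $FILES; do
    upload_cmd "$f"    # prints the command only, nothing is uploaded
done
```

Once the printed commands look right, they can be run on a user interface system that has the grid tools installed.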

Create Gwendia workflow using Gasw as processor type*

Start the Moteur2 IDE; for installation see: Moteur2 UserGuide

  • Create a new workflow, e.g. RandomR
  • Right click, add a new processor and change its name to RandomR
  • Right click on the processor and select configure
  • IMPORTANT: the names of the input and output ports must be the same as in the Gasw xml
  • Define the port types: string for text, URI for file
  • Select the tab Processor, select processor type Gasw and enter the location of the Gasw xml file (http or lfn)
  • Add 2 inputs and define their ports as string; right click on each input, select "data link from" and connect it to the processor
  • Add 2 outputs and define their ports as URI; right click on the processor and connect data links to the corresponding outputs
  • Save the workflow, e.g. RandomR.gwendia

Run workflow in Gwendia format*

Start the VBrowser, right click on the gwendia file, select open with Moteur2, enter the parameter values and click run.

SCUFL

Create workflows using Taverna -> Manual

Test script running as job on Grid using saGasw*

This step can be skipped. The command is:

sagasw.sh -i input.dat -n output.dat -o GridRandomR -d RandomR.xml

See: saGasw

Run workflow in scufl format

Start the VBrowser, click on the scufl file, enter the parameter values and click run.


Topic revision: r17 - 2011-04-28 - AngelaLuijf
 