Using Cromwell with MMCloud
Summary
Workflows are computational pipelines, such as those found in bioinformatics, where there are a series of — sometimes interconnected — steps, each of which may involve different software and dependencies. Cromwell is an execution engine that allows users to run workflows written in the Workflow Description Language (WDL, pronounced "widdle"). WDL is a domain-specific language (DSL), that is, a language with features customized for particular applications, in this case, genomic analyses.
Cromwell is distributed as a Java ARchive (JAR) file, so running a workflow defined in a .wdl file requires a Java runtime engine to execute the Cromwell Java package. In a manner similar to Nextflow, the wdl file describes the steps in the workflow and how each step must be executed. The execution environment for each step in the analysis is described in Cromwell terminology as a "backend." The analogous concept in Nextflow is "executor."
By including a MemVerge-provided configuration file with the Java runtime, Cromwell can use Memory Machine Cloud (MMCloud) as a "backend." From an MMCloud point of view, the execution step assigned to it is an independent job that it runs like any other batch job, which means that all the MMCloud features, such as SpotSurfer and WaveRider, are available. The benefits to the Cromwell user include cost savings and cloud resource rightsizing.
This document describes how to use Cromwell with MMCloud so that Cromwell can schedule one or more (or all) of the tasks in a workflow to run on MMCloud. Examples are used to demonstrate the principles; you can adapt and modify as needed to fit your workflow.
Configuration
The Cromwell Host is the host where you install Java and load the Cromwell JAR file. To use MMCloud as a backend, you must include a MemVerge-provided configuration file when you run the Cromwell JAR file. The configuration file contains the logic that translates the Cromwell task commands into a job file that is submitted (using the float CLI) to the MMCloud OpCenter. The OpCenter instantiates a Worker Node (a container running in its own virtual machine) for each task in the workflow that uses MMCloud as a backend.
The Cromwell configuration file describes the environment for each backend. To use MMCloud as a backend, the configuration file must contain definitions for:
- IP address of the OpCenter
- Login credentials for the OpCenter
- Default number of vCPUs for the Worker Node (value can be left blank)
- Default memory capacity for the Worker Node (value can be left blank)
- Default container image (value can be left blank)
If a value is left blank, it must be provided in the runtime section of the wdl file.
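For example, if the default container image is left blank in the configuration file, each task that targets MMCloud must name one in its runtime section. The attribute names shown here (f_docker, f_cpu, f_memory) are the ones defined in the MemVerge configuration file described later in this document; the image name is only an illustration:
runtime {
    f_docker: "ubuntu"   # container image; required here because the configuration default is blank
    f_cpu: "2"           # number of vCPUs for the Worker Node
    f_memory: "4"        # memory capacity (in GB) for the Worker Node
}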
Operation
The Cromwell job file (a file with extension .wdl) describes the workflow and specifies the backend for each task (the default backend is the local host). When the user submits a job using the java -jar cromwell.jar run command, any task with the backend defined as "float" is scheduled on the OpCenter. Combining information from the configuration file and the job file produces a float submit command that is sent to the OpCenter. This procedure is repeated for every task in the workflow that has "float" as the backend.
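For example, a task that requests 2 vCPUs, 4 GB of memory, and the cactus container image (the values used in the example wdl file later in this document) results in a command of roughly this form — a sketch, not literal OpCenter output:
float submit -i cactus -j /path/to/execution/float-script.sh --cpu 2 --mem 4 --name addList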
Requirements
To use Cromwell with MMCloud, you need the following:
- MMCloud Carmel 2.0 release or later
- OpCenter instance with valid license
- Cromwell Host (can be your local computer or a Linux virtual machine running in the same VPC as the OpCenter)
- Cromwell Host with the following:
- Java 11
- Cromwell jar file
- MemVerge's Cromwell configuration file
- Job file in wdl format
- Input file(s) in json format
- Options file in json format (optional)
- MMCloud CLI binary. You can download it from the OpCenter.
Prepare the Cromwell Host
The Cromwell Host can be any computer that has access to the MMCloud OpCenter. For complicated workflows, it is likely that the wdl file references objects in S3 buckets. For this reason, and to comply with the same security policies that apply to the OpCenter, the instructions described here assume that the Cromwell Host is a Linux virtual machine running in the same VPC as the OpCenter. You can view instructions on how to create an AWS EC2 instance here. If the Cromwell Host is in a different VPC subnet, check that the Cromwell Host can reach the OpCenter. Ensure that any firewall rules allow access to ports 22 (the port used by ssh), 80, 443, and 8000.
- Check the version of java installed on the Cromwell Host by entering:
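  java -version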
- If needed, install Java 11. Commercial users of Oracle Java need a subscription. Alternatively, you can install OpenJDK under an open-source license by entering (on a Red Hat-based Linux system):
sudo dnf install java-11-openjdk
- Create a directory called cromwell and cd to it.
- Download the Cromwell jar file (85 is a recent version) from here and place it in the cromwell directory.
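For example (the download URL shown follows the Cromwell GitHub release naming convention; substitute the version you want):
  mkdir cromwell && cd cromwell
  curl -LO https://github.com/broadinstitute/cromwell/releases/download/85/cromwell-85.jar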
- Check the Cromwell jar file by entering:
$ java -jar cromwell-85.jar --help
cromwell 85
Usage: java -jar /path/to/cromwell.jar [server|run|submit] [options] <args>...

  --help                   Cromwell - Workflow Execution Engine
  --version

Command: server
Starts a web server on port 8000. See the web server documentation for more details about the API endpoints.

Command: run [options] workflow-source
Run the workflow and print out the outputs in JSON format.
  workflow-source          Workflow source file or workflow url.
  --workflow-root <value>  Workflow root.
  -i, --inputs <value>     Workflow inputs file.
  -o, --options <value>    Workflow options file.
  -t, --type <value>       Workflow type.
  -v, --type-version <value>
                           Workflow type version.
  -l, --labels <value>     Workflow labels file.
  -p, --imports <value>    A zip file to search for workflow imports.
  -m, --metadata-output <value>
                           An optional JSON file path to output metadata.

Command: submit [options] workflow-source
Submit the workflow to a Cromwell server.
  workflow-source          Workflow source file or workflow url.
  --workflow-root <value>  Workflow root.
  -i, --inputs <value>     Workflow inputs file.
  -o, --options <value>    Workflow options file.
  -t, --type <value>       Workflow type.
  -v, --type-version <value>
                           Workflow type version.
  -l, --labels <value>     Workflow labels file.
  -p, --imports <value>    A zip file to search for workflow imports.
  -h, --host <value>       Cromwell server URL.
- Download the OpCenter CLI binary for Linux hosts from the following URL:
  https://<op_center_ip_address>/float
  Replace <op_center_ip_address> with the public (if you are outside the VPC) or private (if you are inside the VPC) IP address of the OpCenter. If you download the CLI binary (called float) to your local machine, move the file to the Cromwell Host.
- Make the CLI binary file executable and add the path to the CLI binary file to your PATH variable.
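For example, assuming the binary is downloaded directly to the current directory on the Cromwell Host (the --no-check-certificate flag is needed only if the OpCenter uses a self-signed certificate):
  wget --no-check-certificate https://<op_center_ip_address>/float
  chmod +x float
  export PATH=$PATH:$(pwd)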
- Open a file called cromwell-float.conf and insert the following contents.

include required(classpath("application"))

# This is an example of how you can use Cromwell to interact with float.

backend {
  default = float

  providers {
    float {
      actor-factory = "cromwell.backend.impl.sfs.config.ConfigBackendLifecycleActorFactory"
      config {
        runtime-attributes="""
        String f_cpu = "MINVCPUS"
        String f_memory = "MINMEMORY"
        String f_docker = ""
        String f_extra = ""
        """

        # If an 'exit-code-timeout-seconds' value is specified:
        # - check-alive will be run at this interval for every job
        # - if a job is found to be not alive, and no RC file appears after this interval
        # - Then it will be marked as Failed.
        # Warning: If set, Cromwell will run 'check-alive' for every job at this interval
        exit-code-timeout-seconds = 30

        submit = """
mkdir -p ${cwd}/execution
echo "set -e" > ${cwd}/execution/float-script.sh
echo "cd ${cwd}/execution" >> ${cwd}/execution/float-script.sh
tail -n +22 ${script} > ${cwd}/execution/no-header.sh
head -n $(($(wc -l < ${cwd}/execution/no-header.sh) - 14)) ${cwd}/execution/no-header.sh >> ${cwd}/execution/float-script.sh
float submit -i ${f_docker} -j ${cwd}/execution/float-script.sh --cpu ${f_cpu} --mem ${f_memory} ${f_extra} > ${cwd}/execution/sbatch.out 2>&1
cat ${cwd}/execution/sbatch.out | sed -n 's/id: \(.*\)/\1/p' > ${cwd}/execution/job_id.txt
echo "receive float job id: "
cat ${cwd}/execution/job_id.txt
JOB_SCRIPT_DIR=float-jobs/$(cat ${cwd}/execution/job_id.txt)
mkdir -p $JOB_SCRIPT_DIR
cd $JOB_SCRIPT_DIR
# create the check alive script
cat <<EOF > float-check-alive.sh
SCRIPT_DIR=$(pwd)
cd ${cwd}/execution
float show -j \$1 --runningOnly > job-status.yaml
if [[ -s job-status.yaml ]]; then
    cat job-status.yaml
else
    float show -j \$1 | grep rc: | tr -cd '[:digit:]' > rc
    if [ ! -s rc ]; then
        # If the rc file is empty, write the default value (e.g., 0)
        echo "127" > rc
    fi
    float log cat -j \$1 stdout.autosave > stdout
    float log cat -j \$1 stderr.autosave > stderr
fi
cd $SCRIPT_DIR
EOF
# create the kill script
cat <<EOF > float-kill.sh
SCRIPT_DIR=$(pwd)
cd ${cwd}/execution
float scancel -f -j \$1
cd $SCRIPT_DIR
EOF
cat ${cwd}/execution/sbatch.out
        """

        kill = """
        source float-jobs/${job_id}/float-kill.sh ${job_id}
        """

        check-alive = """
        source float-jobs/${job_id}/float-check-alive.sh ${job_id}
        """

        job-id-regex = "id: (\\w+)\\n"
      }
    }
  }
}
- Replace the following (keep the quotation marks).
  MINVCPUS: minimum number of vCPUs to use as the default for a Worker Node.
  MINMEMORY: minimum memory capacity (in GB) to use as the default for a Worker Node.
- The string following f_extra is combined with the string following f_extra in the wdl file and appended to the float submit command sent to the OpCenter. The f_extra string in the wdl file overrides the f_extra string in the configuration file if they are in conflict.
- The string following f_docker is the name of a default docker image used in case the task in the wdl file does not specify a docker image. You can leave this blank.
- Use float login to log in to your OpCenter.
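For example (substitute the OpCenter address and your own credentials; admin is shown only as a placeholder username):
  float login -a <op_center_ip_address> -u admin -p <password>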
Run a Simple Cromwell Workflow
The default backend for Cromwell is the local host. Run a simple "hello world" workflow on the local host by completing the following steps.
- Create a file called "helloworld.wdl" and insert:
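A minimal workflow consistent with the output shown below (a workflow named myWorkflow that calls a single task named myTask and prints "hello world") is sketched here:
workflow myWorkflow {
    call myTask
}

task myTask {
    command {
        echo "hello world"
    }
    output {
        String out = read_string(stdout())
    }
}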
- Run the "hello world" job by entering:
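For example, using the jar downloaded earlier:
  java -jar cromwell-85.jar run helloworld.wdl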
The output from Cromwell is verbose. To confirm that the job ran successfully and to view the output, look for the following section:
[2023-05-08 21:47:01,69] [info] BackgroundConfigAsyncJobExecutionActor [cb7c6fd7myWorkflow.myTask:NA:1]: Status change from - to Done
[2023-05-08 21:47:03,27] [info] WorkflowExecutionActor-cb7c6fd7-9020-41cf-841f-c95f43ce86da [cb7c6fd7]: Workflow myWorkflow complete. Final Outputs:
{
  "myWorkflow.myTask.out": "hello world"
}
[2023-05-08 21:47:06,67] [info] WorkflowManagerActor: Workflow actor for cb7c6fd7-9020-41cf-841f-c95f43ce86da completed with status 'Succeeded'. The workflow will be removed from the workflow store.
[2023-05-08 21:47:09,04] [info] SingleWorkflowRunnerActor workflow finished with status 'Succeeded'.
{
  "outputs": {
    "myWorkflow.myTask.out": "hello world"
  },
  "id": "cb7c6fd7-9020-41cf-841f-c95f43ce86da"
}
Run a Workflow using OpCenter as a Backend
A Cromwell workflow file can call many tasks. If the tasks are independent, Cromwell can scatter the tasks into parallel "shards." Once the shards are complete, the results can be gathered into a single output. The following example demonstrates different aspects of Cromwell:
- Task executed using the Local backend
- Task executed using OpCenter as the backend
- Parallel sharding
- JSON files to supply input data and parameters
To run this example, complete the following steps:
- Create a file called 3step.wdl and insert the following content.
workflow scatterGather {
    Array[String] names
    call intro
    scatter (name in names) {
        call addList { input: name=name }
    }
    call compileList { input: items=addList.out }
}

task intro {
    command {
        echo "Starting to compile the grocery list"
    }
    output {
        String out = read_string(stdout())
    }
    runtime {
        backend: "Local"
    }
}

task addList {
    String name
    command {
        printf "[cromwell-addList] Add ${name} to list\n"
        sleep 30
    }
    output {
        String out = read_string(stdout())
    }
    runtime {
        f_docker: "cactus"
        f_cpu: "2"
        f_memory: "4"
        f_extra: "--name addList"
    }
}

task compileList {
    Array[String] items
    command {
        printf "[cromwell-compileList] These items are on the grocery list:\n" > my_file.txt
        sleep 1
        echo ${sep=". " items} >> my_file.txt
        cat my_file.txt
        sleep 30
    }
    output {
        String out = read_string(stdout())
    }
    runtime {
        f_docker: "cactus"
        f_extra: "--name compileList"
    }
}
- Create an input file called 3step.json and insert the following content.
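The inputs file supplies the names array consumed by the scatter block; the values below match the grocery items that appear in the example output later in this section:
{
    "scatterGather.names": ["Apples", "Bananas", "Milk", "Bread"]
}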
- Create an options file called options.json and insert the following content.
- Run the workflow by entering the following command.
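For example, a command of the following form loads the MemVerge configuration file (so that tasks with the float backend reach the OpCenter) and passes the inputs and options files:
  java -Dconfig.file=cromwell-float.conf -jar cromwell-85.jar run 3step.wdl -i 3step.json -o options.json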
The task called intro runs locally. The task called addList is scattered into four parallel shards that run on OpCenter. Finally, the outputs from the four shards are gathered by the task called compileList, which runs on OpCenter.
The Cromwell output is voluminous. Check for key events during the run.
- Tasks assigned to backends.
- Parallel shards created.
[2023-05-09 20:45:22,46] [info] WorkflowExecutionActor-0b810f55-0d90-4c00-bf41-066e486e45b6 [0b810f55]: Starting scatterGather.addList (4 shards)
[2023-05-09 20:45:26,26] [info] Assigned new job execution tokens to the following groups: 0b810f55: 5
[2023-05-09 20:45:26,52] [info] BackgroundConfigAsyncJobExecutionActor [0b810f55scatterGather.intro:NA:1]: echo "Starting to compile the grocery list"
[2023-05-09 20:45:26,53] [info] DispatchedConfigAsyncJobExecutionActor [0b810f55scatterGather.addList:2:1]: printf "[cromwell-addList] Add Milk to list\n"
sleep 30
[2023-05-09 20:45:26,53] [info] DispatchedConfigAsyncJobExecutionActor [0b810f55scatterGather.addList:3:1]: printf "[cromwell-addList] Add Bread to list\n"
sleep 30
[2023-05-09 20:45:26,54] [info] DispatchedConfigAsyncJobExecutionActor [0b810f55scatterGather.addList:0:1]: printf "[cromwell-addList] Add Apples to list\n"
sleep 30
[2023-05-09 20:45:26,54] [info] DispatchedConfigAsyncJobExecutionActor [0b810f55scatterGather.addList:1:1]: printf "[cromwell-addList] Add Bananas to list\n"
- Results gathered.
- Job concluded successfully.
[2023-05-09 20:54:38,68] [info] SingleWorkflowRunnerActor workflow finished with status 'Succeeded'.
{
  "outputs": {
    "scatterGather.compileList.out": "[cromwell-compileList] These items are on the grocery list:\n[cromwell-addList] Add Apples to list. [cromwell-addList] Add Bananas to list. [cromwell-addList] Add Milk to list. [cromwell-addList] Add Bread to list",
    "scatterGather.intro.out": "Starting to compile the grocery list",
    "scatterGather.addList.out": ["[cromwell-addList] Add Apples to list", "[cromwell-addList] Add Bananas to list", "[cromwell-addList] Add Milk to list", "[cromwell-addList] Add Bread to list"]
  },
  "id": "0b810f55-0d90-4c00-bf41-066e486e45b6"
}
OpCenter shows the five tasks (four "scatter" tasks and one "gather" task).
$ float squeue -A -f name=List -d
+-----------------------+-------------+-------+-----------+----------------------+----------+------------+
| ID | NAME | USER | STATUS | SUBMIT TIME | DURATION | COST |
+-----------------------+-------------+-------+-----------+----------------------+----------+------------+
| fihe9H73Mw7stcdCN1JPD | compileList | admin | Completed | 2023-05-09T20:51:17Z | 2m37s | 0.0089 USD |
| JO2HXLD0WJLYVPW2npISI | addList | admin | Completed | 2023-05-09T20:45:29Z | 4m3s | 0.0025 USD |
| N8JDsSywHuKa3c7s8FZrl | addList | admin | Completed | 2023-05-09T20:45:29Z | 3m59s | 0.0024 USD |
| 4t9qV52z8B7423YtfLbve | addList | admin | Completed | 2023-05-09T20:45:29Z | 3m59s | 0.0024 USD |
| HsBFzGvPvhfekjFPSJ97A | addList | admin | Completed | 2023-05-09T20:45:29Z | 4m5s | 0.0025 USD |
Run Cromwell in Server Mode
When used in run mode, Cromwell launches a single workflow from the command line. Run mode is typically used for development. In server mode, Cromwell starts a web server that supports a feature-rich REST API. By default, the web server listens on all interfaces (0.0.0.0) and port 8000 (these can be changed in the configuration file).
To start a Cromwell server on a local Cromwell Host, perform the following steps.
- On the Cromwell Host, enter the following command.
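For example, a command of the following form starts the server with the MemVerge configuration file loaded so that float is available as a backend:
  java -Dconfig.file=cromwell-float.conf -jar cromwell-85.jar server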
- Check that the Cromwell Host's inbound filtering rules allow access to port 8000.
- Open a browser and go to http://<cromwell_host_ip>:8000, where <cromwell_host_ip> is the public (private) IP address of the Cromwell Host if you are outside (inside) the VPC.
To submit a job using the Cromwell web interface, complete the following steps.
- Expand the Workflows section and then click Submit a workflow for execution.
- Click Try it out (on the right-hand side).
- Browse your local computer for a workflowSource wdl file.
- Browse your local computer for a workflowInputs json file.
- Browse your local computer for a workflowOptions json file.
- Click Execute (at the bottom of the section). If the job is accepted, the server returns a code 201.
- Copy the workflow ID.
- Click Get the outputs of a workflow.
- Click Try it out and then paste the workflow ID into the id box.
- Click Execute.
Troubleshooting
As the Cromwell workflow runs, detailed log messages are written to the terminal.