WDL Application Management (Graph Edit)

Last updated: 2024-10-22 16:05:05

    Features of the Graphical Editing Mode

    The platform supports editing WDL workflows in the Graphical Editing Mode. You can easily create and edit workflows and define dependencies between tasks through an intuitive graphical interface and drag-and-drop operations. The Graphical Editing Mode has the following features:
    Low-Code: You can directly define and edit the input, output, parameters, and other attributes of the task in the component content editor without writing complex workflow code.
    Fine-Grained: Editing granularity goes down to the task level, and components such as scatter and if/else can be combined freely and flexibly.
    Previewable: Use the DAG overview diagram to preview the structure of the workflow and the dependencies between tasks in real time, making it easier to adjust and optimize.
    High Reusability: The input and output defined in the task support reuse. This is convenient and flexible.
    Convenient and Efficient: It makes the use of platform public data and public images easier.
    Migratable: It supports one-click export of workflow code files (WDL), and the workflow can be migrated.
    The graph editor consists of three parts: the Component area, the Drag-and-Drop Canvas, and the Component Content Editor.
    The layout is as follows:
    
    
    

    Creating an Application (Graphical Editing Mode)

    Enter the work project and click Add in the upper right corner. A new application pop-up will appear. Select WDL as the application type.
    
    
    
    Select the Create a blank application mode, enter the required information (application name, a brief description, etc.), and select Graphical Editing Mode. After confirmation, you can enter the application in Graphical Editing Mode.
    
    
    

    Editing Application (Graphical Editing Mode)

    In the Graphical Editing Mode, the steps of application editing are as follows:
    1. New Components
    In the Graphical Editing Mode, five types of components can be added: task, scatter, if/else, workflow-input, and workflow-output. These components cover the common WDL element types.
    2. Editing Component
    After selecting a component, you can define the component's command line, input, output, etc., in the component content editor.
    3. Creating Associations Between Components
    Create associations between components by drawing lines between them on the canvas or by reusing the outputs of one component as the inputs of another.
    4. Run the application.
    In the Graphical Editing Mode, only a single workflow application can be created. When an application is created, the platform automatically generates an empty workflow by default to organize the various components. The application must meet the following conditions to be executable: it must have at least one task component, and the run settings of the task component must specify an available Docker image address.

    New Components

    An application needs at least one task component to be executable. The component categories and corresponding editable contents are shown in the following table:
    Module | WDL Corresponding Module Name | Editable Content
    Task (required) | Task | Name (required), Input, Command Line (required), Running Settings, Output
    Scatter | Scatter | Name, Traversal Conditions (required)
    If/else | If/else | Name, Conditions (required)
    Workflow-input | Workflow-input | Name, Input
    Workflow-output | Workflow-output | Name, Output

    Editing Component

    In the component content editor, you can define the component's name, input and output information, etc. Select any component to display the component's corresponding content editor. The content editor supports defining the component's input and output variables, command line, run settings (runtime module in WDL), etc.
    
    
    

    Task

    The task component (task) contains four modules: input, run settings, command line, and output, among which the command line is a required module.
    Input Module
    Currently, the input module supports the following variable types:
    String
    Integer (Int)
    Float
    File
    Boolean
    Array (Array[])
    Two-dimensional array (Array[Array[]])
    Array of key-value pairs (Pair[])
    In addition to defining the type and name of each input parameter, the editor also supports setting input parameter attributes. When the Required checkbox is selected, the parameter becomes a required parameter, and you must fill in its value on the running parameter setting page when you run the application. When the output parameters of other components are reused as input parameters of a task component, the association between the two components is established automatically and displayed as a connecting line on the canvas.
    
    
    
    Output Module
    The variable types supported by the output module are the same as those of the input module. The output parameters of a task component can be reused by other components as their input parameters. When they are reused in this way, the association between the two components is established automatically and displayed as a connecting line on the canvas.
    
    
    
    Command Line
    The command line module is a required part of the task component. It defines the commands to be executed in the task. In this module, you can write commands in languages such as bash, Python, and R, for example to run a bioinformatics tool or execute a bash script. Taking the fastp tool as an example, if data quality control is required, you can enter the following command in the command line module:
    command {
      fastp -i input.fastq -o output.fastq
    }
    In this example, the fastp tool is used for data quality control. The -i parameter specifies the input file, and the -o parameter specifies the output file. These parameters can be modified according to actual needs.
    In Graphical Editing Mode, you can complete command line-related configurations in the command line module of the task component content editor.
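    The four modules of a task component correspond to the four sections of a WDL task. The following is a minimal sketch of what such a task might look like in WDL 1.0, not the exact code the platform exports. The task name, the variable names, and the Docker image address are hypothetical placeholders; replace the image with one that is available to your project.

```wdl
version 1.0

# Hypothetical task: all names and the image address below are illustrative.
task fastp_qc {
  input {
    File raw_fastq                  # File-type input (Input module)
    String sample_name = "sample"   # input with a default value
  }

  # Command Line module: ~{...} inserts the value of an input variable.
  command <<<
    fastp -i ~{raw_fastq} -o ~{sample_name}.clean.fastq
  >>>

  # Running Settings module: the Docker image the task runs in.
  runtime {
    docker: "example.registry/fastp:latest"   # placeholder image address
  }

  # Output module: files produced by the command.
  output {
    File clean_fastq = "~{sample_name}.clean.fastq"
  }
}
```

    Referencing the input variables in the command, instead of hard-coding file names, is what allows the same task to be reused with different inputs.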
    
    
    

    Scatter

    Scatter is a key concept in WDL. It implements parallel processing to improve the execution efficiency of workflows. In the Graphical Editing Mode, you can build a scatter by editing a task group: set the traversal conditions of the scatter and drag the tasks to be triggered into the scatter module. To batch-process multiple inputs, set the traversal conditions to the data set, define the operation to be performed as a task, and drag the task into the scatter. When the scatter is executed, it automatically runs the task once for each item in the data set, which makes it easier to process multiple data items in parallel and reduces repeated operations.
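    As an illustration of the traversal described above, the following WDL 1.0 sketch runs one task per file in an array. The task, workflow, and variable names are hypothetical, and the ubuntu image is only an assumed example of an available Docker image.

```wdl
version 1.0

# Hypothetical task: counts the lines of one file.
task count_lines {
  input {
    File in_file
  }
  command <<<
    wc -l < ~{in_file} > count.txt
  >>>
  runtime {
    docker: "ubuntu:22.04"   # assumed image; use one available to your project
  }
  output {
    Int n = read_int("count.txt")
  }
}

workflow scatter_demo {
  input {
    Array[File] samples   # the data set used as the traversal condition
  }

  # The task inside the scatter runs once per element of samples.
  scatter (sample in samples) {
    call count_lines { input: in_file = sample }
  }

  output {
    # Outside the scatter, each task output becomes an array.
    Array[Int] counts = count_lines.n
  }
}
```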
    
    
    

    If/else

    When writing a workflow, you may want certain steps to run only if certain conditions are met. This may indicate switching between two execution paths (e.g., running a tool in mode A or mode B) or skipping a step entirely (e.g., running a tool or not running it). In such cases, WDL supports the use of conditional statements if. The following is an example of a conditional statement in WDL:
    if ((shouldICallStepB == 0) && (m_value == "123")) {
      call stepB { input: in = stepA.out }
    }
    In the Graphical Editing Mode, you can set conditional statements by adding if/else in the canvas. The if/else consists of conditions and trigger tasks. It has the following features:
    1. It supports setting compound conditions.
    2. It supports nesting tasks in if/else. Drag the tasks to be triggered into the if/else to complete the nesting.
    3. It supports adding functions/binary logic relationships within conditions.
    The above WDL conditional statement is set in the Graphical Editing Mode as shown below:
    
    
    
    Note:
    If/else variables can only be selected from existing variables, including: 1. variables defined in workflow-input; 2. output variables defined in any task.
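    WDL itself provides an if block but no else keyword, so switching between two execution paths is usually written as two complementary conditions. The following WDL 1.0 sketch shows one possible way to do this; it is not necessarily the code the platform generates for an if/else component. All task, workflow, and variable names are hypothetical, and the ubuntu image is an assumed example.

```wdl
version 1.0

# Hypothetical task that reports which mode it ran in.
task run_tool {
  input {
    String mode
  }
  command <<<
    echo "running in mode ~{mode}"
  >>>
  runtime {
    docker: "ubuntu:22.04"   # assumed image; use one available to your project
  }
  output {
    String log = read_string(stdout())
  }
}

workflow branch_demo {
  input {
    Boolean use_mode_a   # condition variable defined in workflow-input
  }

  # Path A runs only when the condition is true.
  if (use_mode_a) {
    call run_tool as tool_a { input: mode = "A" }
  }

  # Path B plays the role of the else branch.
  if (!use_mode_a) {
    call run_tool as tool_b { input: mode = "B" }
  }

  output {
    # Outputs of calls inside an if block are optional outside it,
    # so select_first picks whichever branch actually ran.
    String log = select_first([tool_a.log, tool_b.log])
  }
}
```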

    Workflow-input (Input)

    In WDL, the input module defines the input parameters of a workflow. In the Graphical Editing Mode, the workflow-input component (Input) lets you define these parameters at the beginning of the workflow. They can then be used anywhere in the workflow, including in tasks and scatters. Workflow-input parameters make workflows more flexible and reusable, because their values can be changed each time the workflow is run. To edit the workflow-input component, add workflow-input parameters to it. These parameters can be of various types, such as files, integers, and strings. The following is an example of workflow-input in WDL:
    input {
      File bam_input
      Int mem_gb
    }
    Note:
    In this example, two workflow-input parameters are defined: bam_input and mem_gb. bam_input is a file type parameter used to specify the input BAM file, while mem_gb is an integer type parameter used to specify the memory size to be used.
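    To show how such parameters can be consumed, the following WDL 1.0 sketch passes bam_input and mem_gb from the workflow input to a task. The task and workflow names, the output file name, and the Docker image address are hypothetical, and the memory runtime key assumes a Cromwell-style backend.

```wdl
version 1.0

# Hypothetical task that indexes a BAM file.
task index_bam {
  input {
    File bam
    Int mem_gb
  }
  command <<<
    samtools index ~{bam} indexed.bai
  >>>
  runtime {
    docker: "example.registry/samtools:latest"   # placeholder image address
    memory: "~{mem_gb} GB"                       # memory taken from the input
  }
  output {
    File bai = "indexed.bai"
  }
}

workflow input_demo {
  # Workflow-input parameters; values are supplied when the workflow is run.
  input {
    File bam_input
    Int mem_gb
  }

  call index_bam { input: bam = bam_input, mem_gb = mem_gb }

  output {
    File bai = index_bam.bai
  }
}
```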
    As shown in the figure below, in the Graphical Editing Mode, you can add a workflow-input component to the canvas and define the input parameter type and name in the component content editor.
    
    
    

    Workflow-output (Output)

    Workflow-output (Output) defines the output parameters of a workflow. It lets you declare global parameters at the end of the workflow, such as file paths and result files, so that the results of the workflow are collected in one place.
    Here is an example of workflow-output in WDL:
    output {
      File clean_fastq = clean_fastq_output
      File inbd = inbd_output
    }
    As shown in the figure below, in the Graphical Editing Mode, you can add a workflow-output component to the canvas and define the types and names of its parameters in the component content editor.
    
    
    
    Note:
    Only existing variables can be selected for workflow-output variable assignment, including: 1. variables defined in workflow-input; 2. output variables defined in any task.

    Creating Associations Between Components

    In the Graphical Editing Mode, you can create associations between components in the following two ways:
    1. On the canvas, draw lines to connect parameters between components and establish logical dependencies between them. For example, connect the output parameter of one task to the input parameter of another task to assign its value.
    2. In the task component content editor, reuse the output parameters of other components by assigning them to the input parameters.
    
    
    
    Note:
    1. Only unassigned parameters can be assigned values through lines.
    2. The components that support association include tasks, scatter, if/else, and workflow-output.
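    In WDL terms, an association is an input assignment that references the output of another call. The following WDL 1.0 sketch shows two tasks linked in this way; all names are hypothetical, and the ubuntu image is an assumed example of an available Docker image.

```wdl
version 1.0

# Hypothetical upstream task: writes a greeting to a file.
task make_greeting {
  input {
    String name
  }
  command <<<
    echo "hello ~{name}" > greeting.txt
  >>>
  runtime {
    docker: "ubuntu:22.04"   # assumed image; use one available to your project
  }
  output {
    File greeting = "greeting.txt"
  }
}

# Hypothetical downstream task: converts the text to upper case.
task shout {
  input {
    File text
  }
  command <<<
    tr '[:lower:]' '[:upper:]' < ~{text} > shout.txt
  >>>
  runtime {
    docker: "ubuntu:22.04"
  }
  output {
    File result = "shout.txt"
  }
}

workflow link_demo {
  input {
    String name
  }

  call make_greeting { input: name = name }

  # This assignment corresponds to a connecting line on the canvas:
  # the output of make_greeting becomes the input of shout.
  call shout { input: text = make_greeting.greeting }

  output {
    File final_result = shout.result
  }
}
```

    Because shout depends on an output of make_greeting, the workflow engine runs the two tasks in that order without any explicit ordering statement.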

    Save and Run the Application

    After updating the canvas content, save your changes promptly to avoid losing them. Click Save in the upper right corner of the edit page to save the updated content. Each time you save an update, a historical record is generated on the timeline.
    
    
    
    In the Graphical Editing Mode, only single workflow applications are supported. When an application is created, the platform automatically generates an empty workflow by default to organize the various components. The application must meet the following conditions to run: it must have at least one task component, and the run settings of the task component must specify an available Docker image address.
    Click Run in the upper right corner of the edit page to trigger the verification and running process.
    
    
    
    After the verification passes, the running settings page opens, where you complete the task settings and the parameter settings.
    
    
    
    1. Set up the task
    Task names cannot be the same within a work project.
    
    
    
    Running Options
    Use Call-Caching Feature: When the Call-Caching feature is used, the same task does not need to be executed repeatedly, and the previous results will be automatically used.
    Use Relative Directory Output: If this option is selected, all job output files are archived in the same directory. Only the directory levels inside the task running directory are retained, so you must ensure that the output files of each job have different names. If it is not selected, the results of every task run are retained in the specified directory, each under its own path.
    Example:
    Assume that the output directory of a workflow task is set to cos://bucket-12345678/pipeline/output and that the output file of a job is sample.vcf:
    If Use Relative Directory Output is selected, the final output file is: cos://bucket-12345678/pipeline/output/sample. If different jobs produce output files with the same name, a file conflict occurs in the task output directory, and the conflicting files are overwritten.
    If Use Relative Directory Output is not selected, the final output file is: cos://bucket-12345678/pipeline/output/wgs/ade68a6d876e8d-8a98d7e9-ad989a8d/call-gatk/execution/sample.vcf. Different jobs write to different directories, so files with the same name do not conflict.
    The platform supports two run failure modes (a Cromwell feature), NoNewCalls and ContinueWhilePossible. They determine what happens when a job fails during workflow execution.
    NoNewCalls (default): Once a job fails, Cromwell does not start any new jobs. Cromwell still monitors the jobs that are already running until they are completed (successfully or not).
    ContinueWhilePossible: Cromwell tries to run as many jobs as possible until no more can be started. When all running jobs are completed, the workflow fails.
    For more information about the running failure mode, see Cromwell official documentation.
    Task Output Directory (optional): When you specify a task output directory, the job output files will be archived in that directory.
    Cache retention period: The cache retention period is divided into 4 levels: 24 hours, 3 days, 7 days, and permanent.
    Note:
    The cache is deleted once the set retention period expires. This may affect the Call-Caching feature.
    2. Set running parameters
    Fill in the running parameters according to the parameters parsed from the WDL.
    Each running parameter must have a valid value of the type declared in the WDL.
    Parameters that have default values can keep their default values.
    Variable names with an asterisk in the upper left corner are required inputs.
    You can save the running parameters as a template for later use, or download the input parameters as a JSON file.
    The platform provides reference templates for public applications. When a public application is imported into a work project, the parameter template that matches the region of the work project is used. You can select the corresponding template on the running parameter setting page.
    
    
    
    3. Submit a single task
    To submit a single task, you do not need to upload a table. Set the running parameters manually and submit.
    4. Batch execution
    If the input files are different but the overall process is the same, you can also submit run groups by uploading a table. For details, see Visual Batch Task Submission and Management.

    Version Management

    The version management feature covers two major features: publishing and the historical timeline. Together they help you track the progress and quality of the application and manage the development of workflows.
    Publish feature: Through the publish feature, you can easily publish an official version. You can control the viewing and usage permissions of different members to better manage the versions of the application.
    Historical Timeline: Through the timeline, you can view the historical and version records and trace the application development process. This helps you discover and solve problems promptly and improve application quality.

    Publishing Application Version

    After a development phase of the workflow is completed, you can publish an official version. Before publishing, confirm that the application has no syntax errors and that the main file has been set. Then click Release in the upper right corner to enter the publishing process.
    
    
    
    Clicking Release triggers the save and verification process. After the verification passes, the publishing process begins and a publish version pop-up appears. Fill in the version name and description, select the parameter template, and confirm to complete the publishing.
    
    
    
    After publishing the version, you can switch to this version in the historical version timeline. Viewing and running this historical version is supported.
    
    
    
    Note:
    Historical versions can only be viewed and run but cannot be edited, deleted, or published within the application.

    Viewing Historical and Version Records

    View Timeline
    The timeline is a visual display of the complete historical records of the application, including version records and historical records. The timeline view is arranged in chronological order and shows all updates to the application workflow files. You can use the timeline to trace the key time nodes of the development process and compare different versions and historical records, making it easier to manage and control the development process.
    View Historical Records
    In the timeline view, each time an application file is updated and saved, a historical record is generated. Click any historical record to switch the content of the resource manager interface to the workflow canvas corresponding to the historical record.
    
    
    
    Note:
    The historical records are for viewing only and cannot be edited, run, published, or deleted.
    View Historical Versions
    In the timeline view, officially published versions are also arranged in chronological order as historical versions. Click any historical version, and the resource manager interface content will switch to the workflow canvas corresponding to that version.
    
    
    

    Other Features

    Viewing DAG Overview

    You can view the application overview graph created in the Graphical Editing Mode. As shown in the figure below, click the DAG (directed acyclic graph) button in the upper right corner of the application editing page to display the application DAG overview graph.
    
    
    

    Exporting WDL File

    The Tencent Healthcare Omics Platform supports one-click export of workflow code files (WDL), so workflows can be migrated. Applications developed in the Graphical Editing Mode can be exported as WDL files. Click the expansion button in the upper right corner of the edit page to display Export WDL, then click it to export the corresponding WDL file automatically.
    
    
    
    