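AGS can quickly run the full whole genome sequencing (WGS) pipeline, including read mapping, sorting, duplicate removal, and variant calling. This topic describes how to manage WGS workflows from the AGS command line.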

Start a WGS workflow

Usage:

ags remote run wgs
    --region cn-shenzhen # OSS region, e.g. cn-shenzhen or cn-beijing
    --fastq1 MGISEQ/MGISEQ2000_PCR-free_NA12878_1_V100003043_L01_1.fq.gz # first file of the FASTQ pair, given as fastq-path/filename
    --fastq2 MGISEQ/MGISEQ2000_PCR-free_NA12878_1_V100003043_L01_2.fq.gz # second file of the FASTQ pair
    --bucket my-test-shenzhen # OSS bucket name
    --output-bam bam/MGISEQ_NA12878_hs37d5.bam # BAM output path in the bucket; empty by default, in which case no BAM is written
    --output-vcf vcf/MGISEQ_NA12878_hs37d5_5.vcf # VCF output filename
    --service "g" # SLA: [n:normal|s:silver|g:gold|p:platinum]
    --reference [hs37d5|hg19|<reference path on OSS>]

e.g.
ags remote run wgs \
--region cn-shenzhen \
--fastq1 MGISEQ/MGISEQ2000_PCR-free_NA12878_1_V100003043_L01_1.fq.gz \
--fastq2 MGISEQ/MGISEQ2000_PCR-free_NA12878_1_V100003043_L01_2.fq.gz \
--bucket my-test-shenzhen \
--output-vcf vcf/MGISEQ_NA12878_hs37d5_5.vcf \
--output-bam bam/MGISEQ_NA12878_hs37d5_5.bam \
--service "s" \
--reference hs37d5
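
The command above can be wrapped in a small shell function that checks the SLA code before anything is submitted. This is only a sketch: `run_wgs` and the `DRY_RUN`, `REGION`, `BUCKET`, and `REFERENCE` variables are hypothetical names that are not part of AGS, and the function assumes the FASTQ pair is named `<prefix>_1.fq.gz` and `<prefix>_2.fq.gz`.

```shell
# Sketch only: run_wgs and the DRY_RUN, REGION, BUCKET and REFERENCE
# variables are hypothetical helpers, not part of AGS.
run_wgs() {
  if [ "$#" -lt 2 ]; then
    echo "usage: run_wgs <fastq-prefix> <output-vcf> [service]" >&2
    return 2
  fi
  prefix="$1"        # assumes files named <prefix>_1.fq.gz and <prefix>_2.fq.gz
  vcf="$2"           # output VCF path inside the bucket
  service="${3:-n}"  # SLA code, defaults to normal
  # Reject anything outside the documented SLA codes before submitting.
  case "$service" in
    n|s|g|p) ;;
    *) echo "invalid service level: $service (expected n, s, g or p)" >&2
       return 1 ;;
  esac
  # Build the argument list in "$@" so each value stays a single argument.
  set -- ags remote run wgs \
    --region "${REGION:-cn-shenzhen}" \
    --fastq1 "${prefix}_1.fq.gz" \
    --fastq2 "${prefix}_2.fq.gz" \
    --bucket "${BUCKET:-my-test-shenzhen}" \
    --output-vcf "$vcf" \
    --service "$service" \
    --reference "${REFERENCE:-hs37d5}"
  if [ "${DRY_RUN:-0}" = "1" ]; then
    printf '%s\n' "$*"   # print the command instead of submitting it
  else
    "$@"
  fi
}
```

With `DRY_RUN=1` set, `run_wgs MGISEQ/MGISEQ2000_PCR-free_NA12878_1_V100003043_L01 vcf/MGISEQ_NA12878_hs37d5_5.vcf s` prints the assembled command so it can be reviewed before a real submission.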

Start a mapping workflow

Use --fastq1 and --fastq2 to specify the FASTQ files, and --output-bam to specify the output path of the BAM file.

Usage:

ags remote run mapping
    --region cn-shenzhen # OSS region, e.g. cn-shenzhen or cn-beijing
    --fastq1 MGISEQ/MGISEQ2000_PCR-free_NA12878_1_V100003043_L01_1.fq.gz # first file of the FASTQ pair, given as fastq-path/filename
    --fastq2 MGISEQ/MGISEQ2000_PCR-free_NA12878_1_V100003043_L01_2.fq.gz # second file of the FASTQ pair
    --bucket my-test-shenzhen # OSS bucket name
    --output-bam bam/MGISEQ_NA12878_hs37d5.bam # BAM output filename
    --service "g" # SLA: [n:normal|s:silver|g:gold|p:platinum]
    --markdup [true|false] # mark duplicates; default: true
    --reference [hs37d5|hg19|<reference path on OSS>]

e.g.

ags remote run mapping \
--region cn-shenzhen \
--fastq1 MGISEQ/MGISEQ2000_PCR-free_NA12878_1_V100003043_L01_1.fq.gz \
--fastq2 MGISEQ/MGISEQ2000_PCR-free_NA12878_1_V100003043_L01_2.fq.gz \
--bucket my-test-shenzhen \
--output-bam bam/MGISEQ_NA12878_hs37d5.bam \
--service "g" \
--markdup "true" \
--reference hs37d5
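
When several samples need mapping, the second FASTQ path and the BAM name can be derived from the first FASTQ path. The sketch below only prints the commands it would run. `mate_of` and `mapping_commands` are hypothetical helpers, and the `_1.fq.gz` / `_2.fq.gz` suffix convention, the `bam/` output prefix, and the fixed `--service n` are assumptions taken from the example above rather than AGS requirements.

```shell
# Sketch only: mate_of and mapping_commands are hypothetical helpers.

# Derive the second FASTQ of a pair by swapping the _1/_2 suffix.
mate_of() {
  case "$1" in
    *_1.fq.gz) printf '%s_2.fq.gz\n' "${1%_1.fq.gz}" ;;
    *) echo "unexpected FASTQ name: $1" >&2; return 1 ;;
  esac
}

# Read one first-of-pair FASTQ path per line from stdin and print the
# matching mapping command for each (nothing is submitted here).
mapping_commands() {
  while IFS= read -r fq1; do
    [ -n "$fq1" ] || continue          # skip blank lines
    fq2="$(mate_of "$fq1")" || return 1
    base="$(basename "${fq1%_1.fq.gz}")"
    printf 'ags remote run mapping --region %s --fastq1 %s --fastq2 %s --bucket %s --output-bam bam/%s.bam --service n --markdup true --reference hs37d5\n' \
      "${REGION:-cn-shenzhen}" "$fq1" "$fq2" "${BUCKET:-my-test-shenzhen}" "$base"
  done
}
```

For example, `mapping_commands < fastq_list.txt` prints one command per listed sample; piping the result to `sh` would submit them, which is worth doing only after checking the printed lines.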
			

List remote workflows

Usage:
ags remote list

e.g.
ags remote list
+---------------+-------------------------------+
|   JOB NAME    |          CREATE TIME          |
+---------------+-------------------------------+
| wgs-gpu-ckw96 | 2020-01-07 19:08:32 +0000 UTC |
| wgs-gpu-djzws | 2020-01-07 18:31:22 +0000 UTC |
| wgs-gpu-pd659 | 2020-01-03 20:34:09 +0000 UTC |
+---------------+-------------------------------+
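
For scripting, the job names can be pulled out of this table with awk. The sketch below assumes the output keeps the layout shown above (rows delimited by `|`, a header row containing `JOB NAME`); `job_names` is a hypothetical helper, not an AGS command.

```shell
# Sketch only: job_names is a hypothetical helper that reads an
# `ags remote list`-style table on stdin and prints one job name per line.
job_names() {
  awk -F'|' '
    # Border lines contain no "|" and the header row holds "JOB NAME";
    # every other row with at least two columns is a job entry.
    NF >= 3 && $2 !~ /JOB NAME/ {
      gsub(/^[ \t]+|[ \t]+$/, "", $2)   # trim padding around the name
      print $2
    }'
}
```

Typical use would be `ags remote list | job_names`, for instance to feed the names into a loop.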

Get workflow details

Usage:
ags remote get <workflow id> [--show]
--show   also display the input parameters of the workflow

e.g.
ags remote get wgs-gpu-zls6r 
+---------------+------------------+-----------+-------------------------------+----------+-------------------------------+
|   JOB NAME    |  JOB NAMESPACE   |  STATUS   |          CREATE TIME          | DURATION |          FINISH TIME          |
+---------------+------------------+-----------+-------------------------------+----------+-------------------------------+
| wgs-gpu-zls6r | XXXXXXXXXXXXXXXX | Succeeded | 2020-01-06 21:25:48 +0800 CST | 29m26s   | 2020-01-06 21:55:14 +0800 CST |
+---------------+------------------+-----------+-------------------------------+----------+-------------------------------+



ags remote get wgs-gpu-zls6r --show
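
A script can poll `ags remote get` until the workflow finishes by reading the STATUS column of the table shown above. This is a sketch under assumptions: `job_status` and `wait_for_job` are hypothetical helpers, the table layout is taken from the sample output, and only `Succeeded` is confirmed by that sample; the `Failed` and `Error` status names are guesses.

```shell
# Sketch only: job_status and wait_for_job are hypothetical helpers.

# Print the STATUS column of the first data row of an
# `ags remote get`-style table read from stdin.
job_status() {
  awk -F'|' '
    NF >= 7 && $2 !~ /JOB NAME/ && !found {
      gsub(/^[ \t]+|[ \t]+$/, "", $4)   # STATUS is the third column
      print $4
      found = 1
    }'
}

# Poll until the job reaches a final state.
# Returns 0 on Succeeded, 1 on Failed/Error (assumed names),
# 2 if no status could be read.
wait_for_job() {
  job="$1"
  interval="${2:-30}"   # seconds between polls
  while :; do
    status="$(ags remote get "$job" | job_status)"
    case "$status" in
      Succeeded) return 0 ;;
      Failed|Error) echo "$job ended with status $status" >&2; return 1 ;;
      "") echo "could not read status of $job" >&2; return 2 ;;
    esac
    sleep "$interval"
  done
}
```

For example, `wait_for_job wgs-gpu-zls6r 60 && echo done` checks once a minute and prints `done` once the workflow has succeeded.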

Cancel a running workflow

Usage:

ags remote cancel  <workflow id>

e.g.

ags remote cancel wgs-gpu-zls6r
INFO[0000] Successed to cancel wgs-gpu-zls6r

Delete a finished workflow

Workflows that have succeeded or failed can be deleted; running workflows cannot.

Usage: 

ags remote remove <workflow id>

e.g.

ags remote remove wgs-gpu-zls6r
INFO[0000] Successed to remove wgs-gpu-zls6r
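
Because running workflows cannot be deleted, a cleanup script can check the status first and remove only finished jobs. The sketch below is one way to do that: `remove_if_finished` is a hypothetical helper, the STATUS column position is taken from the `ags remote get` sample output, and `Failed` is an assumed status name (only `Succeeded` appears in that sample).

```shell
# Sketch only: remove_if_finished is a hypothetical helper.
# Removes each given workflow only when its status is final.
remove_if_finished() {
  for job in "$@"; do
    # Read the STATUS column (third column) of the first data row.
    status="$(ags remote get "$job" | awk -F'|' '
      NF >= 7 && $2 !~ /JOB NAME/ && !found {
        gsub(/[ \t]/, "", $4); print $4; found = 1
      }')"
    case "$status" in
      Succeeded|Failed) ags remote remove "$job" ;;
      *) echo "skipping $job (status: ${status:-unknown})" >&2 ;;
    esac
  done
}
```

Calling `remove_if_finished wgs-gpu-zls6r wgs-gpu-ckw96` would remove whichever of the two has finished and report the other on stderr.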