MINIMAP2
map long-reads to ref and sort using samtools.
URL: https://github.com/lh3/minimap2
Example
This wrapper can be used in the following way:
# test minimap2.
rule map_ont:
input:
ref_fa = "ref.fa",
fqs = ["test.fq.gz"]
output:
out = "test.b37.bam"
params:
extras = "-x map-ont --MD",
rg = 'ID:test\tSM:test',
wrapper:
"No_Tags/bio/minimap2"
Note that input, output and log file paths can be chosen freely.
When running with
snakemake --use-conda
the software dependencies will be automatically deployed into an isolated environment before execution.
Input/Output
Input:
ref_fa
: ref fasta file or mmi.fqs_fa
: input fastqs or fa file, list or string.
Output:
out
: sam/bam/paf. using ext .sam, .bam or .paf
Params
extras
: extra arguments to minimap2. (optional)rg
: read group in format ‘@RGtID:test’. (optional)sort_type
: enable sorting if not paf output. can be none, queryname, coordinate(default).(optional)sort_extras
: extra arguments to samtools sort. (optioanl)
Code
# minimap2 wrapper, modified from snakemake wrapper
__author__ = "yangqun"
import sys
import os
from snakemake.shell import shell
assert 'out' in snakemake.output.keys()
out = snakemake.output.get('out')
out_name, out_ext = os.path.splitext(out)
out_ext = out_ext[1:].upper()
extras = snakemake.params.get('extras', '')
rg = snakemake.params.get('rg', '')
rg = "" if not rg else f'-R {rg}'
sort_type = snakemake.params.get('sort_type', 'coordinate')
sort_extras = snakemake.params.get('sort_extras', '')
pipe_cmd = ""
index_cmd = ""
if out_ext not in ['SAM', 'BAM', 'PAF']:
sys.exit("out file extension must be .sam, .bam, .paf")
if sort_type not in ['none', 'coordinate', 'queryname']:
sys.exit('samtools sort must be none, coordinate, queryname. got {}'.format(sort_type))
if out_ext != "PAF":
# sam/bam/cram
extras += " -a"
if sort_type == "none":
# 不排序
out_cmd = "| samtools view -h --output-fmt {} -o {}".format(out_ext, out)
elif sort_type == "queryname":
out_cmd = "| samtools sort -n --output-fmt {} -o {}".format(out_ext, out)
else:
out_cmd = "| samtools sort --output-fmt {} -o {}".format(out_ext, out)
else:
out_cmd = "-o {}".format(out)
if out_ext in ['BAM'] and sort_type == 'coordinate':
index_cmd = "&& samtools index {}".format(out)
shell(
"minimap2 -t {snakemake.threads} {extras} {rg} {snakemake.input.ref_fa} {snakemake.input.fqs_fa} "
"{out_cmd} {index_cmd}"
)