FASTCAT
fastcat concatenates .fastq(.gz) files while creating a summary of the sequences. It can also demultiplex reads according to Guppy/MinKNOW .fastq record headers.
URL: https://github.com/epi2me-labs/fastcat
Example
This wrapper can be used in the following way:
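A minimal sketch of a rule that uses this wrapper, based only on the input, output and params names documented on this page. The rule name, all file paths, the {sample} wildcard and the wrapper location are placeholders chosen for illustration, not values taken from the wrapper repository.

```snakemake
# Hypothetical rule: every path below is a placeholder.
rule fastcat:
    input:
        # A directory of FASTQ files; the wrapper then adds -x itself.
        fq_or_path="reads/{sample}",
    output:
        # A name ending in gz makes the wrapper pipe the reads through gzip.
        out_fq="results/{sample}.fastq.gz",
        file_summary="results/{sample}.file_summary.tsv",
        read_summary="results/{sample}.read_summary.tsv",
    params:
        extras="",
        fastcat="fastcat",
    wrapper:
        # Assumed local checkout of the wrapper; adjust to where it actually lives.
        "file://path/to/wrappers/fastcat"
```

All three outputs are optional, so any of them can be dropped from the rule; with no out_fq the concatenated reads are discarded and only the summaries are written.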
Note that input, output and log file paths can be chosen freely.
When running with
snakemake --use-conda
the software dependencies will be automatically deployed into an isolated environment before execution.
Input/Output
Input:
fq_or_path
: FASTQ file(s) or a directory. If a directory is given, -x is added automatically.
Output:
out_fq
: output FASTQ file. If the name ends with gz, the output is gzip-compressed automatically. (optional)
file_summary
: file summary output. (optional)
read_summary
: read summary output. (optional)
Params
extras
: extra arguments passed to the program.
fastcat
: path to the fastcat executable. Defaults to fastcat.
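The rules above (a directory input adds -x, a gz output name adds compression, the summaries are optional) can be sketched as a standalone function that only builds the command string. This is an illustration, not the wrapper itself: the function name build_fastcat_command is hypothetical, and nothing is executed.

```python
import os


def build_fastcat_command(data, out_fq=None, file_summary=None,
                          read_summary=None, extras="", fastcat="fastcat"):
    """Return the shell command string for the given inputs and outputs.

    Hypothetical helper mirroring the documented wrapper behaviour.
    """
    if not data:
        raise ValueError("Please give fastq files or path.")
    # Accept either a single path or a list of paths.
    paths = [data] if isinstance(data, str) else list(data)

    args = [fastcat]
    if file_summary:
        args += ["-f", file_summary]
    if read_summary:
        args += ["-r", read_summary]
    if extras:
        args.append(extras)
    # A directory input needs -x.
    if any(os.path.isdir(p) for p in paths):
        args.append("-x")
    args += paths

    # fastcat writes reads to stdout, so the output is a redirect or a pipe.
    if not out_fq:
        redirect = "> /dev/null"
    elif out_fq.endswith("gz"):
        redirect = "| gzip > " + out_fq
    else:
        redirect = "> " + out_fq
    return " ".join(args) + " " + redirect


print(build_fastcat_command(["a.fastq", "b.fastq"],
                            out_fq="all.fastq.gz",
                            read_summary="reads.tsv"))
```

Keeping command construction separate from execution makes the branching easy to check without having fastcat installed.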
Code
# fastcat wrapper.
__author__ = "yangqun"

import os
import sys

from snakemake.shell import shell

data = snakemake.input.get("fq_or_path")
out_fq = snakemake.output.get("out_fq")
file_summary = snakemake.output.get("file_summary")
read_summary = snakemake.output.get("read_summary")
extras = snakemake.params.get("extras", "")
fastcat = snakemake.params.get("fastcat", "fastcat")

# Input data: one or more FASTQ files, or a directory.
if not data:
    sys.exit("Please give fastq files or path.")
# A single input arrives as a string, several inputs as a list;
# os.path.isdir() only accepts a single path, so check each one.
paths = [data] if isinstance(data, str) else list(data)
if any(os.path.isdir(p) for p in paths):
    extras += " -x"

# Output data: the concatenated reads come from fastcat's stdout.
if out_fq:
    if out_fq.endswith("gz"):
        # Compressed FASTQ.
        out_cmd = " | gzip >{}".format(out_fq)
    else:
        # Plain FASTQ.
        out_cmd = " >{}".format(out_fq)
else:
    # Do not output FASTQ.
    out_cmd = " >/dev/null"

# Summary outputs.
cmds = ""
if file_summary:
    cmds += " -f " + file_summary
if read_summary:
    cmds += " -r " + read_summary

# Run.
shell("{fastcat} {cmds} {extras} {data} {out_cmd}")