New tool addition: KneadData #7498
tools/kneaddata/.shed.yml
Outdated
| categories: | ||
| - Sequence Analysis | ||
| - Metagenomics | ||
| - Quality Control |
| - Quality Control |
tools/kneaddata/kneaddata.xml
Outdated
| @@ -0,0 +1,272 @@ | |||
| <tool id="kneaddata" name="KneadData" version="0.12.1+galaxy0" python_template_version="3.5" profile="21.05"> | |||
| <tool id="kneaddata" name="KneadData" version="0.12.1+galaxy0" python_template_version="3.5" profile="21.05"> | |
| <tool id="kneaddata" name="KneadData" version="@TOOL_VERSION@+galaxy@VERSION_SUFFIX@" profile="@PROFILE@"> |
Please introduce the above tokens in the macros.xml file
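A minimal sketch of how those tokens could look in macros.xml. The token names come from the suggestion above; the values are placeholders copied from the diff and would need to match the KneadData release actually being wrapped.

```xml
<macros>
    <!-- Placeholder values: adjust to the KneadData release being wrapped -->
    <token name="@TOOL_VERSION@">0.12.1</token>
    <token name="@VERSION_SUFFIX@">0</token>
    <token name="@PROFILE@">21.05</token>
</macros>
```

The tool XML then needs an `<import>macros.xml</import>` inside its own `<macros>` block so the tokens are resolved.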
tools/kneaddata/kneaddata.xml
Outdated
| <requirements> | ||
| <requirement type="package" version="0.12.3">kneaddata</requirement> | ||
| <requirement type="package" version="0.40">trimmomatic</requirement> | ||
| <requirement type="package" version="2.5.4">bowtie2</requirement> | ||
| <requirement type="package" version="4.09.1">trf</requirement> | ||
| <requirement type="package" version="0.12.1">fastqc</requirement> | ||
| </requirements> |
This can go in macros.xml. Are these dependencies already pulled in by the KneadData package, or do they need to be listed explicitly?
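For illustration, a requirements macro in macros.xml might look like the sketch below. It assumes the kneaddata package already brings in its own dependencies, which should be checked before the explicit pins are dropped.

```xml
<macros>
    <xml name="requirements">
        <requirements>
            <requirement type="package" version="@TOOL_VERSION@">kneaddata</requirement>
            <!-- Assumption: add explicit pins here only for dependencies
                 that the kneaddata package does not already pull in -->
        </requirements>
    </xml>
</macros>
```

The tool XML would then reference it with `<expand macro="requirements" />`.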
tools/kneaddata/kneaddata.xml
Outdated
| <option value="s">Single read</option> | ||
| <option value="p">Paired reads</option> |
| <option value="s">Single read</option> | |
| <option value="p">Paired reads</option> | |
| <option value="single">Single read</option> | |
| <option value="paired">Paired reads</option> |
tools/kneaddata/kneaddata.xml
Outdated
| </param> | ||
| <when value="s"> | ||
| <param name="single_read" type="data" format="fastq" label="Single Read"/> |
| <param name="single_read" type="data" format="fastq" label="Single Read"/> | |
| <param name="single_read" type="data" format="fastqsanger" label="Single Read"/> |
Does it also support fastq.gz?
tools/kneaddata/kneaddata.xml
Outdated
| <when value="s"> | ||
| <param name="single_read" type="data" format="fastq" label="Single Read"/> | ||
| </when> | ||
| <when value="p"> |
| <when value="p"> | |
| <when value="paired"> |
tools/kneaddata/kneaddata.xml
Outdated
| <param name="number_threads" type="integer" value="1" label="Number of threads"/> | ||
| <param name="number_processes" type="integer" value="1" label="Number of processes"/> |
| <param name="number_threads" type="integer" value="1" label="Number of threads"/> | |
| <param name="number_processes" type="integer" value="1" label="Number of processes"/> |
These two parameters are not required.
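One common Galaxy convention, sketched here as a fragment rather than the full command, is to take the thread count from the `GALAXY_SLOTS` environment variable instead of exposing it as a user parameter. The `--threads` flag is taken from the KneadData usage text; the rest of the command line is omitted.

```xml
<command detect_errors="exit_code"><![CDATA[
kneaddata
    ## Galaxy sets GALAXY_SLOTS at run time; fall back to 1 if it is unset
    --threads "\${GALAXY_SLOTS:-1}"
]]></command>
```

This leaves resource allocation to the Galaxy administrator's job configuration.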
tools/kneaddata/kneaddata.xml
Outdated
| <section name="trimmomatic" title="Trimmomatic arguments" > | ||
| <param name="max_memory" type="text" value="500m" label="Maximum memory for Trimmomatic"/> |
| <param name="max_memory" type="text" value="500m" label="Maximum memory for Trimmomatic"/> |
tools/kneaddata/kneaddata.xml
Outdated
| <conditional name="trimmomatic_options"> | ||
| <param name="select_option" type="select" label="Trimmomatic settings"> | ||
| <option value="d">Default settings</option> |
Better names for option values?
tools/kneaddata/.shed.yml
Outdated
| quality filtering, and removal of host contamination using | ||
| Bowtie2/TRIMMOMATIC/TRF. | ||
| homepage_url: https://github.com/biobakery/kneaddata | ||
| remote_repository_url: https://github.com/galaxyproject/tools-iuc/tree/master/tools/kneaddata |
| remote_repository_url: https://github.com/galaxyproject/tools-iuc/tree/master/tools/kneaddata | |
| remote_repository_url: https://github.com/galaxyproject/tools-iuc/tree/main/tools/kneaddata |
tools/kneaddata/.shed.yml
Outdated
| owner: iuc | ||
| type: unrestricted | ||
| description: Quality control and contaminant removal for metagenomic data | ||
| long_description: > |
| long_description: > | |
| long_description: | |
tools/kneaddata/.shed.yml
Outdated
| remote_repository_url: https://github.com/galaxyproject/tools-iuc/tree/master/tools/kneaddata | ||
| categories: | ||
| - Metagenomics | ||
| - Statistics |
| - Statistics | |
| - Sequence Analysis |
| <requirement type="package" version="0.12.1">fastqc</requirement> | ||
| </requirements> | ||
| <command detect_errors="exit_code"><![CDATA[ | ||
| kneaddata |
| kneaddata | |
| mkdir -p results/ | |
| kneaddata |
tools/kneaddata/kneaddata.xml
Outdated
| -i1 "$read_type.forward_read" | ||
| -i2 "$read_type.backward_read" | ||
| #end if | ||
| -o "output_dir" |
| -o "output_dir" | |
| -o results/ |
tools/kneaddata/.shed.yml
Outdated
| metagenomic and metatranscriptomic sequencing data, especially | ||
| data from microbiome experiments. It performs adapter trimming, | ||
| quality filtering, and removal of host contamination using | ||
| Bowtie2/TRIMMOMATIC/TRF. |
| Bowtie2/TRIMMOMATIC/TRF. | |
| Bowtie2, TRIMMOMATIC and TRF. |
tools/kneaddata/kneaddata.xml
Outdated
| #if "$output_prefix" | ||
| --output-prefix "$output_prefix" | ||
| #end if |
I would be in favor of removing this parameter
tools/kneaddata/kneaddata.xml
Outdated
| --threads "$number_threads" | ||
| --processes "$number_processes" |
| --threads "$number_threads" | |
| --processes "$number_processes" |
tools/kneaddata/kneaddata.xml
Outdated
| <param name="number_threads" value="1"/> | ||
| <param name="number_processes" value="1"/> |
| <param name="number_threads" value="1"/> | |
| <param name="number_processes" value="1"/> |
tools/kneaddata/kneaddata.xml
Outdated
| </section> | ||
| <section name="read_type"> | ||
| <param name="select_read_type" value="s"/> | ||
| <param name="single_read" value="28C.single.fastq"/> |
| <param name="single_read" value="28C.single.fastq"/> | |
| <param name="single_read" value="test_single.fastq"/> |
tools/kneaddata/kneaddata.xml
Outdated
| <param name="forward_read" value="28C.R1.fastq"/> | ||
| <param name="backward_read" value="28C.R2.fastq"/> | ||
| </section> | ||
| <output name="paired_forward" file="paired_forward.fastq"/> |
| <output name="paired_forward" file="paired_forward.fastq"/> | |
| <output name="paired_forward" file="test_paired_1.fastq"/> |
tools/kneaddata/kneaddata.xml
Outdated
| <param name="backward_read" value="28C.R2.fastq"/> | ||
| </section> | ||
| <output name="paired_forward" file="paired_forward.fastq"/> | ||
| <output name="paired_backward" file="paired_backward.fastq"/> |
| <output name="paired_backward" file="paired_backward.fastq"/> | |
| <output name="paired_backward" file="test_paired_2.fastq"/> |
tools/kneaddata/kneaddata.xml
Outdated
| usage: kneaddata [-h] [--version] [-v] [-i1 INPUT1] [-i2 INPUT2] | ||
| [-un UNPAIRED] -o OUTPUT_DIR | ||
| [-db REFERENCE_DB] [--bypass-trim] [--run-trim-repetitive] | ||
| [--output-prefix OUTPUT_PREFIX] [-t <1>] [-p <1>] | ||
| [-q {phred33,phred64}] [--run-bmtagger] | ||
| [--run-fastqc-start] [--run-fastqc-end] [--store-temp-output] | ||
| [--cat-final-output] | ||
| [--log-level {DEBUG,INFO,WARNING,ERROR,CRITICAL}] [--log LOG] | ||
| [--trimmomatic TRIMMOMATIC_PATH] [--max-memory MAX_MEMORY] | ||
| [--trimmomatic-options TRIMMOMATIC_OPTIONS] | ||
| [--bowtie2 BOWTIE2_PATH] [--bowtie2-options BOWTIE2_OPTIONS] | ||
| [--bmtagger BMTAGGER_PATH] [--trf TRF_PATH] [--match MATCH] | ||
| [--mismatch MISMATCH] [--delta DELTA] [--pm PM] [--pi PI] | ||
| [--minscore MINSCORE] [--maxperiod MAXPERIOD] | ||
| [--fastqc FASTQC_PATH] | ||
| KneadData | ||
| options: | ||
| -h, --help show this help message and exit | ||
| -v, --verbose additional output is printed | ||
| --version show program's version number and exit | ||
| -i INPUT, --input INPUT input FASTQ file (add a second argument instance to run with paired input files) | ||
| -o OUTPUT_DIR, --output OUTPUT_DIR directory to write output files | ||
| --db REFERENCE_DB, --reference-db REFERENCE_DB location of reference database | ||
| --run-trim-repetitive Option to trim repetitive/overrepresented sequences generated by FASTQC reports | ||
| --bypass-trim bypass the trim step | ||
| --output-prefix OUTPUT_PREFIX prefix for all output files [ DEFAULT : $SAMPLE_kneaddata ] | ||
| -t <1>, --threads <1> number of threads [ Default : 1 ] | ||
| -p <1>, --processes <1> number of processes [ Default : 1 ] | ||
| -q <quality>, --quality-scores <quality> quality scores [phred33|phred64] [DEFAULT: phred33] | ||
| --run-bmtagger run BMTagger instead of Bowtie2 to identify contaminant reads | ||
| --bypass-trf option to bypass the removal of tandem repeats | ||
| --run-fastqc-start run fastqc at the beginning of the workflow | ||
| --run-fastqc-end run fastqc at the end of the workflow | ||
| --store-temp-output store temp output files [ DEFAULT : temp output files are removed ] | ||
| --cat-final-output concatenate all final output files [ DEFAULT : final output is not concatenated ] | ||
| --log-level <DEBUG|INFO|WARNING|ERROR|CRITICAL> level of log messages [DEFAULT: DEBUG] | ||
| --log LOG log file [ DEFAULT : $OUTPUT_DIR/$SAMPLE_kneaddata.log ] | ||
| --trimmomatic TRIMMOMATIC_PATH path to trimmomatic [ DEFAULT : $PATH ] | ||
| --max-memory MAX_MEMORY max amount of memory [ DEFAULT : 500m ] | ||
| --trimmomatic-options TRIMMOMATIC_OPTIONS options for trimmomatic [ DEFAULT : SLIDINGWINDOW:4:20 MINLEN:50 ] | ||
| MINLEN is set to 50 percent of total input read length. The user can alternatively specify a length (in bases) for MINLEN. | ||
| --sequencer-source options for sequencer-source [ DEFAULT: NexteraPE] Available sequencers: ["NexteraPE","TruSeq2","TruSeq3"] | ||
| --bowtie2 BOWTIE2_PATH path to bowtie2 [ DEFAULT : $PATH ] | ||
| --bowtie2-options BOWTIE2_OPTIONS options for bowtie2 [ DEFAULT : --very-sensitive ] | ||
| --bmtagger BMTAGGER_PATH path to BMTagger [ DEFAULT : $PATH ] | ||
| --bypass-trf bypass the TRF step | ||
| --trf TRF_PATH path to TRF [ DEFAULT : $PATH ] | ||
| --mismatch MISMATCH mismatching penalty [ DEFAULT : 7 ] | ||
| --delta DELTA indel penalty [ DEFAULT : 7 ] | ||
| --pm PM match probability [ DEFAULT : 80 ] | ||
| --pi PI indel probability [ DEFAULT : 10 ] | ||
| --minscore MINSCORE minimum alignment score to report [ DEFAULT : 50 ] | ||
| --maxperiod MAXPERIOD maximum period size to report [ DEFAULT : 500 ] | ||
| --fastqc FASTQC_PATH path to fastqc [ DEFAULT : $PATH ] | ||
A better help section would explain what KneadData is, what it does, which inputs it requires, and which outputs it produces. An explanation of a few important parameters would also help.
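As a rough sketch of the structure, using only the description from .shed.yml and the defaults listed in the usage text (the section headings and exact wording are suggestions, not a required layout):

```xml
<help><![CDATA[
**What it does**

KneadData performs quality control on metagenomic and metatranscriptomic
sequencing data. It performs adapter trimming, quality filtering, and
removal of host contamination using Bowtie2, Trimmomatic and TRF.

**Inputs**

- Single-end or paired-end reads in FASTQ format
- A reference database of sequences to remove

**Outputs**

- Cleaned FASTQ reads
- A log file

**Key parameters**

- *Trimmomatic options*: default is SLIDINGWINDOW:4:20 MINLEN:50
- *Bowtie2 options*: default is --very-sensitive
]]></help>
```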
Fixed help section; added bmtagger; added all missing arguments; added additional outputs.
Updated long description formatting and corrected repository URL.
- Updated tool XML with all arguments
- Updated macros.xml
| </macros> | ||
| <expand macro="requirements" /> | ||
| <command detect_errors="exit_code"><![CDATA[ | ||
| mkdir -p results/ ; |
| mkdir -p results/ ; | |
| mkdir -p results/ |
| <expand macro="requirements" /> | ||
| <command detect_errors="exit_code"><![CDATA[ | ||
| mkdir -p results/ ; | ||
| $cat_final_output | ||
| ]]></command> | ||
| <inputs> | ||
| <conditional name="read_type"> | ||
| <param name="select_read_type" type="select" label="Read type" help="Specify whether your sequencing data is single-end (one file) or paired-end (two files)."> |
| <param name="select_read_type" type="select" label="Read type" help="Specify whether your sequencing data is single-end (one file) or paired-end (two files)."> | |
| <param name="select_read_type" type="select" label="Read type" help="Specify your sequencing data type."> |
| </param> | ||
| <when value="single"> | ||
| <param name="single_read" type="data" format="fastqsanger" label="Single Read"/> |
Can the tool also accommodate fastqsanger.gz?
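If it can, one possible approach is sketched below. It assumes KneadData recognises compressed input from a `.gz` file extension, which should be verified; the symlink names are hypothetical and the kneaddata call itself is omitted.

```xml
<!-- inputs: accept both plain and gzipped FASTQ -->
<param name="single_read" type="data" format="fastqsanger,fastqsanger.gz" label="Single Read"/>

<!-- command: give the dataset a file name with the matching extension -->
<command detect_errors="exit_code"><![CDATA[
#if $read_type.single_read.ext.endswith(".gz")
    #set $single_input = "input_single.fastq.gz"
#else
    #set $single_input = "input_single.fastq"
#end if
ln -s '$read_type.single_read' '$single_input' &&
mkdir -p results/
]]></command>
```

The symlink is there because Galaxy dataset files on disk do not carry the datatype extension, so tools that sniff the extension need a properly named link.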
| <when value="single"> | ||
| <param name="single_read" type="data" format="fastqsanger" label="Single Read"/> | ||
| </when> | ||
| <when value="paired"> |
You can implement paired_collection input instead of paired input
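A minimal sketch of what that could look like. The parameter name `paired_input` is hypothetical; the `-i1` and `-i2` flags are the ones already used in the command section.

```xml
<!-- inputs: one paired collection instead of two separate datasets -->
<when value="paired">
    <param name="paired_input" type="data_collection" collection_type="paired" format="fastqsanger" label="Paired reads"/>
</when>

<!-- command: forward and reverse elements of the collection -->
<command detect_errors="exit_code"><![CDATA[
kneaddata
    -i1 '$read_type.paired_input.forward'
    -i2 '$read_type.paired_input.reverse'
    -o results/
]]></command>
```

With a paired collection, users cannot accidentally mix up forward and reverse files, and the tool can be mapped over a list of pairs.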
| <conditional name="reference_genome"> | ||
| <param name="source" type="select" label="Select reference database source" help="Select a built-in genome index or upload your own FASTA file"> | ||
| <option value="indexed">Use built-in genome index</option> | ||
| <option value="history">Upload custom FASTA file</option> |
This is not implemented in the command section. If you like, you can also remove it.
| <conditional name="reference_genome"> | ||
| <param name="source" type="select" label="Select reference database source" help="Select a built-in genome index or upload your own FASTA file"> | ||
| <option value="indexed">Use built-in genome index</option> | ||
| <option value="history">Upload custom FASTA file</option> |
Same here
| <param name="bmtagger_db" type="select" label="Select reference genome database" | ||
| help="Select the organism genome to filter out. This genome's reads will be removed as host contamination."> | ||
| <options from_data_table="bmtagger_indexes"> | ||
| <filter type="sort_by" column="3" /> |
Please check with the bmtagger data tables from the bmtagger tool
| <param name="store_temp" type="boolean" truevalue="--store-temp-output" falsevalue="" label="Keep temporary files" | ||
| help="Save temporary/intermediate files generated during processing. This includes alignment files and trimmed intermediate reads. This will | ||
| increase the output file size."/> |
This can be skipped
FOR CONTRIBUTOR: