April 2017 Tool Shed contributions

Tools contributed to the Galaxy Project Tool Shed in April 2017.

New Tools

unrestricted

From yating-l:
- regtools_junctions_extract: Wrapper for the regtools junctions extract program. Reports splice junctions in an RNA-Seq BAM file.
- ucsc_blat: Standalone blat sequence search command line tool.
- jbrowsearchivecreator: A tool to create a JBrowse track hub.
- gbtofasta: Convert GenBank records to FASTA and create a table with coding region information for each mRNA record.
- ucsc_pslpostarget: Flip PSL strands so the target is positive and implicit.
- bamtobigwig: Convert a BAM file to BigWig.
- ucsc_pslcheck: Validate PSL files.
- snap: SNAP is a general purpose gene finding program suitable for both eukaryotic and prokaryotic genomes. SNAP is an acronym for Semi-HMM-based Nucleic Acid Parser.
- ucsc_pslcdnafilter: Filter cDNA alignments in PSL format.
From jjv_bioinformaticians:
- phylogeny_maximum_parsimony_informative_sites: Phylogeny Maximum Parsimony (with informative sites). This tool performs a ClustalW alignment, then preprocesses the data (if the user specified an informative site), and finally runs maximum parsimony on that alignment.
From rnateam:
- rcas: RNA Centric Annotation System that provides intuitive reports and publication ready graphics. RCAS takes input peak intervals in BED format from CLIP-seq data and automatically generates distributions of annotation features, detected motifs, GO-term enrichment, pathway enrichment and genomic coverage.
From iuc:
- stacks_stats: Stacks: statistics (from the Stacks tool suite). Stacks is a software pipeline for building loci from short-read sequences, such as those generated on the Illumina platform. Stacks was developed to work with restriction enzyme-based data, such as RAD-seq, for the purpose of building genetic maps and conducting population genomics and phylogeography. http://catchenlab.life.illinois.edu/stacks/.
- column_remove_by_header: Remove columns by header. Removes or keeps columns based upon user provided column headings.
- bcftools_plugin_color_chrs: Wrapper for bcftools application bcftools color-chrs. BCFtools are meant as a faster replacement for most of the perl VCFtools commands.
- bcftools_mpileup: Wrapper for bcftools application bcftools mpileup.
- bcftools_plugin_frameshifts: Wrapper for bcftools application bcftools frameshifts.
- bcftools_csq: Wrapper for bcftools application bcftools csq.
- obi_uniq: obiuniq (from the Obitools suite). The OBITools package is a set of programs specifically designed for analyzing NGS data in a DNA metabarcoding context, taking into account taxonomic information. It is distributed as an open source software. http://metabarcoding.org/obitools.
- obi_grep: obigrep (from the Obitools suite).
- obi_ngsfilter: NGSfilter (from the Obitools suite).
- obi_tab: obitab (from the Obitools suite).
- obi_annotate: obiannotate (from the Obitools suite).
- obi_convert: obiconvert (from the Obitools suite).
- obi_sort: obisort (from the Obitools suite).
- obi_clean: obiclean (from the Obitools suite).
- obi_illumina_pairend: Illuminapairedend - Assembling paired-end reads (from the Obitools suite).
- obi_stat: obistat (from the Obitools suite).
- gffcompare: Galaxy wrappers for Geo Pertea’s GffCompare package. GffCompare is a modified version of the cufflinks suite’s CuffCompare. https://github.com/gpertea/gffcompare/blob/v0.9.8/README.md http://cole-trapnell-lab.github.io/cufflinks/cuffcompare/.
- gff3_rebase: Rebase a GFF against a parent GFF (e.g. an original genome). Often the genomic data processing/analysis process requires a workflow like the following: (1) select some features from a genome; (2) export the sequences associated with those regions; (3) analyse those exports with some tool like BLAST. For display, especially in software like JBrowse, it is convenient to know where in the original genome the analysis results would fall. E.g. if a transmembrane domain is detected at bases 10-20 of an analysed protein, where should this be displayed relative to the parent genome? This tool helps fill that gap, by rebasing some analysis results against the parent features which were originally analysed.
- column_order_header_sort: Sort Column Order by heading. Reorders a file’s columns by sorted value of header fields.
- metagenomeseq_normalization: metagenomeSeq Normalization. metagenomeSeq is designed to determine features (be it Operational Taxonomic Unit (OTU), species, etc.) that are differentially abundant between two or more groups of multiple samples. metagenomeSeq implements both our novel normalization and statistical model accounting for under-sampling of microbial communities and may be applicable to other datatypes. The package includes useful visualization tools. metagenomeSeq has been available through Bioconductor since release 2.12.
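
The coordinate question raised in the gff3_rebase entry above (where does a hit at positions 10-20 of an exported feature fall on the parent genome?) comes down to simple interval arithmetic. The following is a minimal Python sketch of that arithmetic only, not the tool's actual implementation. The names `Interval` and `rebase` are hypothetical, and the sketch assumes 1-based inclusive GFF-style coordinates, a single contiguous parent feature (no introns), and child positions counted from the parent's 5' end on its own strand.

```python
from typing import NamedTuple


class Interval(NamedTuple):
    """A genomic interval with 1-based inclusive coordinates (GFF style)."""
    start: int
    end: int
    strand: str  # "+" or "-"


def rebase(parent: Interval, child_start: int, child_end: int,
           protein: bool = False) -> Interval:
    """Map a child interval, given relative to `parent`, onto the genome.

    Hypothetical helper for illustration only. If `protein` is True, the
    child coordinates are amino-acid positions and are first expanded to
    nucleotide positions (residue i covers bases 3*i-2 through 3*i).
    """
    if protein:
        child_start = 3 * child_start - 2
        child_end = 3 * child_end

    if parent.strand == "-":
        # On the reverse strand the parent's 5' end is its genomic `end`,
        # so child offsets count backwards from there.
        new_start = parent.end - child_end + 1
        new_end = parent.end - child_start + 1
    else:
        new_start = parent.start + child_start - 1
        new_end = parent.start + child_end - 1

    return Interval(new_start, new_end, parent.strand)


if __name__ == "__main__":
    gene = Interval(1000, 2000, "+")
    print(rebase(gene, 10, 20))
```

Under these assumptions, a hit at bases 10-20 of a forward-strand feature spanning 1000-2000 lands at 1009-1019 on the parent genome. Spliced features would need per-exon handling, which this sketch deliberately leaves out.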
From nml:
- kaptive: Kaptive reports information about capsular (K) loci found in genome assemblies.
From lecorguille:
- xcms_merge: [Metabolomics][W4M][LC-MS] XCMS R Package - Preprocessing - Merge individual xcmsSet outputs. Part of the W4M project: http://workflow4metabolomics.org XCMS: http://www.bioconductor.org/packages/release/bioc/html/xcms.html Filtration and Peak Identification using xcmsSet function from xcms R package to preprocess LC/MS data for relative quantification and statistical analysis.
From fgiacomoni:
- lipidmaps_textsearch: Init repository with last lipidmaps_textsearch master version. [W4M][LC-MS] LIPID MAPS Structure Database (LMSD) - Annotation - Returns annotation results from LIPID MAPS Structure Database and its Text/Ontology-based search engine. Part of the W4M project: http://workflow4metabolomics.org / LMSD: http://www.lipidmaps.org. The wrapper script uses the LipidMaps Text/Ontology-based search resource to annotate a list of m/z. The process returns output files (tabular and HTML formats) with links to lipid records.
- massbank_ws_searchspectrum: Init repository with last massbank_ws_searchspectrum master version. [W4M][LC-MS] MassBank spectrum searches - Annotation - Search by pseudo-spectra on a High Quality Mass Spectral Database. Part of the W4M project: http://workflow4metabolomics.org / MassBank: http://www.massbank.jp. The wrapper script uses the MassBank ‘Web service API’ resource to annotate a list of pseudo-spectra. The process returns output files (CSV and HTML formats) with links to MassBank records.
From theo.collard:
- ballgown_wrapper: Ballgown is an R package designed to facilitate flexible differential expression analysis of RNA-seq data. The Ballgown package provides functions to organize, visualize, and analyze the expression measurements for your transcriptome assembly.
From chaimae_eljaouhari:
- basicplot: Graphics. Takes a tabular file of numerical data as input and produces pairwise plots of the data in log-log scale.
From jasper:
- pathoscope_map: Species identification and strain attribution with unassembled sequencing data. PathoScope takes next-generation sequencing reads from a mixture sample and predicts which genomes are present. We use a Bayesian framework combined with an initial reference-based alignment to assign reads to the correct genome of origin.
- cluster_picker: Cluster identification strategies differ between studies and as a consequence cluster definitions vary. The Cluster Picker identifies clusters in newick-formatted trees containing thousands of sequences within a few minutes. Cut-offs for within cluster genetic distance and bootstrap support are selected by the user. Because many groups then look at the epidemiology of these clusters, the Cluster Matcher automatically links Cluster Picker output to spreadsheets of epidemiological data.
- pathoscope_id: Species identification and strain attribution with unassembled sequencing data. PathoScope takes next-generation sequencing reads from a mixture sample and predicts which genomes are present. We use a Bayesian framework combined with an initial reference-based alignment to assign reads to the correct genome of origin.

Select Updates

From devteam:
- ncbi_blast_plus: v0.2.00, for NCBI BLAST+ 2.5.0 via bioconda or tool_dependencies.xml.
From peterjc:
- blast_rbh: v0.1.11 using BLAST+ 2.5.0 and Biopython 1.67.