March 2019 Tool Shed contributions

Tools contributed to the Galaxy Project ToolShed in March 2019.


New Tools

From galaxyp:
- dia_umpire: DIA-Umpire analysis for data independent acquisition (DIA) mass spectrometry-based proteomics. DIA-Umpire is an open source Java program for computational analysis of data independent acquisition (DIA) mass spectrometry-based proteomics data. It enables untargeted peptide and protein identification and quantitation using DIA data, and also incorporates targeted extraction to reduce the number of cases of missing quantitation. http://diaumpire.sourceforge.net/.
From imgteam:
- imagecoordinates_flipaxis: Flip coordinate axes. Makes x the horizontal axis (left to right) and y the vertical axis (bottom to top), like in a coordinate system.
From iuc:
- scanpy_plot: Wrapper for the scanpy tool suite: Plot with scanpy. Scanpy is a scalable toolkit for analyzing single-cell gene expression data. It includes preprocessing, visualization, clustering, pseudotime and trajectory inference and differential expression testing.
- scanpy_normalize: Wrapper for the scanpy tool suite: Normalize with scanpy.
- scanpy_inspect: Wrapper for the scanpy tool suite: Inspect with scanpy.
- scanpy_cluster_reduce_dimension: Wrapper for the scanpy tool suite: Cluster and reduce dimension with scanpy.
- kraken2: Kraken2 for taxonomic designation. Kraken is a system for assigning taxonomic labels to short DNA sequences, usually obtained through metagenomic studies. Previous attempts by other bioinformatics software to accomplish this task have often used sequence alignment or machine learning techniques that were quite slow, leading to the development of less sensitive but much faster abundance estimation programs. Kraken aims to achieve high sensitivity and high speed by utilizing exact alignments of k-mers and a novel classification algorithm.
- scanpy_filter: Wrapper for the scanpy tool suite: Filter with scanpy.
- scanpy_remove_confounders: Wrapper for the scanpy tool suite: Remove confounders with scanpy.
- berokka: Berokka is used to trim, circularise, orient & filter long read bacterial genome assemblies. There is already a good piece of software to trim/circularise and orient genome assemblies called Circlator. Please try that first! You should only try Berokka if… 1) You only have the contig files and do not have the corrected reads anymore. 2) Your contigs are simple cases with clear overhang and could be done manually with BLAST. 3) Circlator fails on your data even after troubleshooting. Berocca is a brand of effervescent drink and vitamin tablets containing vitamin B and C. It is a popular cure for a hangover. A key role of the berokka tool is to remove the “overhang” that occurs at the ends of long-read assemblies of circular genomes.
From chemteam:
- gmx_merge_topology_files: Wrapper for the gromacs package: Merge GROMACS topologies. GROMACS is a versatile package to perform molecular dynamics, i.e. simulate the Newtonian equations of motion for systems with hundreds to millions of particles. It is primarily designed for biochemical molecules like proteins, lipids and nucleic acids that have a lot of complicated bonded interactions, but since GROMACS is extremely fast at calculating the nonbonded interactions (that usually dominate simulations) many groups are also using it for research on non-biological systems, e.g. polymers. GROMACS supports all the usual algorithms you expect from a modern molecular dynamics implementation (check the online reference or manual for details).
- bio3d_dccm: Wrapper for the Bio3D package: DCCM analysis. Bio3D is an R package containing utilities for the analysis of protein structure, sequence and trajectory data.
From proteore:
- proteore_data_manager: Data manager that downloads and sets up the files needed by “Number of MS/MS observations in a tissue (from Peptide Atlas)” and “Get expression data by tissue finds tissue” from the proteore package.
- proteore_build_protein_interaction_maps: Build protein interaction maps.
From c-magno:
- bacterial_genome_assembly: Tools for bacterial genome assembly.
From ecology:
- vigiechiro_idvalid: Tadarida identifications integration tool from the vigiechiro suite. Vigie Chiro is a French citizen science program that studies bats. This tool suite analyses data from the vigiechiro.herokuapp.com web portal and creates advanced restitutions (summary reports). Two of the tools clean and synthesize data according to the sampling session. The other two produce advanced restitutions based on the protocol used.
- vigiechiro_bilanenrichirp: ‘Routier’ or ‘Pedestre’ protocols advanced restitution tool from the vigiechiro suite.
- vigiechiro_idcorrect_2ndlayer: Tadarida data clean tool from the vigiechiro suite.
- vigiechiro_bilanenrichipf: ‘Point fixe’ protocol advanced restitution tool from the vigiechiro suite.
From bixuanjiang:
- wgs_gatk_workflow: WGS analysis workflow based mainly on GATK.
From smarthey:
- paqmir_mirdeep2_quantifier: Fast quantitation of reads mapping to known miRBase precursors. This tool is part of the workflow PAQmiR for the Prediction Annotation and Quantification of miRNA with miRDeep2. The module maps the deep sequencing reads to predefined miRNA precursors and thereby determines the expression of the corresponding miRNAs. First, the predefined mature miRNA sequences are mapped to the predefined precursors. Optionally, predefined star sequences can be mapped to the precursors too. This determines the positions of the mature and star sequences in the precursors. Second, the deep sequencing reads are mapped to the precursors. The number of reads falling into an interval 2nt upstream and 5nt downstream of the mature/star sequence is determined. This wrapper was forked from the rbc_mirdeep2_quantifier wrapper of the RNA-Bioinformatics network (https://www.denbi.de/network/rna-bioinformatics-center-rbc, https://github.com/bgruening). Modifications are the product of Valentin Marcon & Sylvain Marthey (Thanks to INRA Migale, IFB resources & INRA GABI).
- paqmir_postprocess_quantifier: Filters results to quantify, annotate, and eliminate redundancy in miRNAs. This tool is part of the workflow PAQmiR for the Prediction Annotation and Quantification of miRNA with miRDeep2. This module assigns mature miRNAs to a set of precursors and reports the quantification of the two best matures (3p and 5p, predicted from their position on the precursor) observed for each precursor. The module uses output files provided by the quantifier.pl module from miRDeep2 and assigns the matures to the precursors in the following order of priority: 1) Matures known in the species studied (generally all matures from miRBase known for the species are used). 2) Matures known in another species (all matures from miRBase, or only those corresponding to a subset of closely related species, are used). When several matures are detected, the one with the highest count is chosen. 3) Unknown matures (generally matures predicted by the miRDeep2.pl module are used). Wrappers are the product of Valentin Marcon, Kevin Normand & Sylvain Marthey (Thanks to INRA Migale, IFB resources & INRA GABI).
- paqmir_cut_fasta_identifiers: Modifies the sequence headers in a FASTA file by removing all annotations located after the first space. This tool is part of the workflow PAQmiR for the Prediction Annotation and Quantification of miRNA with miRDeep2. Example 1: original id: >cel-mir-1 MI0000003 Caenorhabditis elegans; modified id: >cel-mir-1. Example 2: original id: >Chr4 length= 120829699; modified id: >Chr4.
- paqmir_postprocess_mirdeep2: Filter and use the results of miRDeep2 to create new reference datasets for quantification and annotation steps. This tool is part of the workflow PAQmiR for the Prediction Annotation and Quantification of miRNA with miRDeep2. This module uses miRDeep2’s prediction results to create new reference datasets (hairpin and mature) which can be used for the quantification and annotation of miRNAs. The two reference datasets are created by combining the prediction results and the reference miRNA/hairpin provided. This module makes it possible to implement the three strategies described in this publication: Annotation of the goat genome using next generation sequencing of microRNA expressed by the lactating mammary gland: comparison of three approaches. Mobuchon et al. BMC Genomics. 2015. Wrappers are the product of Valentin Marcon, Kevin Normand & Sylvain Marthey (Thanks to INRA Migale, IFB resources & INRA GABI).
- paqmir_create_mirna_references: Prepares the files containing hairpin and mature miRNAs that are required to use the miRDeep2 software. This tool is part of the workflow PAQmiR for the Prediction Annotation and Quantification of miRNA with miRDeep2. This tool uses the “species code” provided as a parameter to identify the sequences belonging to the reference species in the mature and hairpin files passed as parameters.
- paqmir_mirdeep2_mapper: Process and map reads to a reference genome. This tool is part of the workflow PAQmiR for the Prediction Annotation and Quantification of miRNA with miRDeep2. The MiRDeep2 Mapper module is designed as a tool to process deep sequencing reads and/or map them to the reference genome. The module works in sequence space, and can process or map data that is in sequence FASTA format. A number of the functions of the mapper module are implemented specifically with Solexa/Illumina data in mind. This wrapper was forked from the rbc_mirdeep2_mapper wrapper of the RNA-Bioinformatics network (https://www.denbi.de/network/rna-bioinformatics-center-rbc, https://github.com/bgruening). Modifications are the product of Valentin Marcon & Sylvain Marthey (Thanks to INRA Migale, IFB resources & INRA GABI).
- paqmir_mirdeep2: Identification of novel and known miRNAs. This tool is part of the workflow PAQmiR for the Prediction Annotation and Quantification of miRNA with miRDeep2. MiRDeep2 is a software package for identification of novel and known miRNAs in deep sequencing data. Furthermore, it can be used for miRNA expression profiling across samples. This wrapper was forked from the rbc_mirdeep2 wrapper of the RNA-Bioinformatics network (https://www.denbi.de/network/rna-bioinformatics-center-rbc, https://github.com/bgruening). Modifications are the product of Valentin Marcon & Sylvain Marthey (Thanks to INRA Migale, IFB resources & INRA GABI).
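
The header trimming described for paqmir_cut_fasta_identifiers can be pictured with a short Python sketch. This is only an illustration of the stated rule (drop everything after the first space in each header line), not the wrapper's actual code. The function name and the placeholder sequences are assumptions made for the example.

```python
def cut_fasta_identifiers(lines):
    """Yield FASTA lines with each header truncated at the first space.

    Hypothetical helper for illustration; not taken from the PAQmiR wrapper.
    """
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith(">"):
            # Keep only the identifier that precedes the first space.
            line = line.split(" ", 1)[0]
        yield line


# Headers are the two examples from the tool description;
# the sequence lines are placeholders, not real data.
records = [
    ">cel-mir-1 MI0000003 Caenorhabditis elegans",
    "ACGU",
    ">Chr4 length= 120829699",
    "ACGT",
]
trimmed = list(cut_fasta_identifiers(records))
print("\n".join(trimmed))
```

Sequence lines pass through unchanged, so the sketch can be applied to a whole file line by line.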
From dfornika:
- kraken2: Kraken2 for taxonomic designation. Kraken2 is a system for assigning taxonomic labels to short DNA sequences. Kraken 2 uses exact k-mer matches to achieve high accuracy and fast classification speeds. This classifier matches each k-mer within a query sequence to the lowest common ancestor (LCA) of all genomes containing the given k-mer. The k-mer assignments inform the classification algorithm.
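
The k-mer to lowest common ancestor (LCA) mapping mentioned in the description above can be illustrated with a toy Python sketch. It is not Kraken2's implementation: the taxonomy, the genome strings, and all function names are invented for the example, and the final classification step that consumes the per-taxon hits is deliberately left out.

```python
from collections import Counter

# Hypothetical toy taxonomy (child -> parent); not real Kraken2 data.
PARENT = {
    "root": None,
    "Bacteria": "root",
    "Escherichia": "Bacteria",
    "Salmonella": "Bacteria",
    "E. coli": "Escherichia",
    "S. enterica": "Salmonella",
}


def lineage(taxon):
    """Return the path from a taxon up to the root."""
    path = []
    while taxon is not None:
        path.append(taxon)
        taxon = PARENT[taxon]
    return path


def lca(a, b):
    """Lowest common ancestor of two taxa in the toy taxonomy."""
    ancestors = set(lineage(a))
    for taxon in lineage(b):
        if taxon in ancestors:
            return taxon
    raise ValueError("taxa share no ancestor")


def kmers(seq, k):
    """All overlapping substrings of length k."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def build_index(genomes, k):
    """Map each k-mer to the LCA of all genomes that contain it."""
    index = {}
    for taxon, seq in genomes.items():
        for kmer in set(kmers(seq, k)):
            index[kmer] = lca(index[kmer], taxon) if kmer in index else taxon
    return index


def kmer_hits(read, index, k):
    """Count, per taxon, the exact k-mer matches found in a read."""
    return Counter(index[km] for km in kmers(read, k) if km in index)


# Made-up sequences, chosen so that two 4-mers are shared by both genomes.
genomes = {"E. coli": "ACGTACGTTT", "S. enterica": "ACGTAGGCCA"}
index = build_index(genomes, k=4)
hits = kmer_hits("ACGTAGG", index, k=4)
print(hits)
```

K-mers shared by both toy genomes are assigned to their common ancestor, while k-mers unique to one genome stay at that genome's taxon. These per-taxon hit counts are the kind of evidence a classification step would then weigh.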
From nml:
- biohansel_bionumeric_converter: Convert BioHansel output data to a Bionumerics friendly form.
From rsajulga:
- group_humann2_uniref_abundances_to_go: Group abundances of UniRef50 gene families obtained with HUMAnN2 to Gene Ontology (GO) slim terms with relative abundances.
From jjohnson:
- mixcr: MiXCR processes immunome sequences to quantitated clonotypes. MiXCR processes big immunome data from raw sequences to quantitated clonotypes. MiXCR efficiently handles paired-end and single-end reads, considers sequence quality, corrects PCR errors and identifies germline hypermutations. https://milaboratory.com/software/mixcr/.