November 2018 Tool Shed contributions

Tools contributed to the Galaxy Project ToolShed in November 2018.

New Tools

  • From jetbrains:
    • span: Initial version of SPAN for ToolShed. SPAN - Semisupervised Peak Analyzer. SPAN is a tool for analyzing ChIP-seq data.
  • From matnguyen:
    • pathogist: Initial Upload. Calibrated multi-criterion genomic analysis for public health microbiology. TBA.
  • From jaredgk:
    • ppp_seqgen: Wrapper for seq-gen simulation software.
  • From gga:
  • From md-anderson-bioinformatics:
    • matrix_manipulation: Matrix manipulation tools. This tool is a suite of matrix manipulation tools that let you modify a matrix in many ways. The tool features: 1) Matrix multiplication or correlation of one or two matrices. 2) Matrix Transformations – Log, Ln, mean center, median, plus z score normalizations, offsets, and scaling. 3) Data Filters – upper and lower limits, NAN limits and percent, variance count and percent. 4) Matrix Validation – missing or invalid data. It requires both column headers (row one) and row headers (column one). You can put any data in the headers if you are not preparing for later clustering. It is useful for modifying a matrix used as input to the NG-CHM heat map tools for clustering and visualization of heat maps, which are also in the Galaxy Tool Shed.
  • From jowong:
  • From peterjc:
    • make_nr: v0.0.1. Make a FASTA file non-redundant. Python script intended to be run prior to calling the NCBI BLAST+ command line tool makeblastdb or in other settings where you want to collapse duplicated sequences in a FASTA file to a single representative.
  • From davidvanzessen:
    • demultiplex_emc: Demultiplexing sequencing data based on a barcode. Simple tool that splits a fasta/fastq file into separate fastq/fasta files based on a mapping file.
    • fetch_vep_cache_data: fetch VEP data with INSTALL.pl.
  • From greg:
    • validate_temperature_data: Validates either a 30-year normals temperature dataset or a daily actuals temperature dataset, which are used as input to the insect phenology model tool.
  • From bgruening:
    • plotly_regression_performance_plots: Performance plots for regression problems. The tool creates three plots to measure the performance of a machine learning regression model. The plots are: true vs predicted curves, a scatter plot of true vs predicted values, and a residual plot.
  • From trinity_ctat:
    • ctat_mutations: Mutation detection using GATK4 best practices and the latest RNA editing filter resources. Works with both Hg38 and Hg19. Mutation detection in RNA-Seq highlights the GATK Best Practices in RNA-Seq variant calling, several sources of variant annotation, and filtering based on CRAVAT.
  • From iuc:
    • raceid_filtnormconf: Wrapper for the RaceID pipeline tool: Filtering, Normalisation, and Confounder Removal using RaceID. RaceID is a method for cell type identification from single-cell RNA-seq data by unsupervised learning. An initial clustering is followed by an outlier identification based on a background model of combined technical and biological variability in single-cell RNA-seq data obtained by quantification with unique molecular identifiers. StemID permits subsequent inference of a lineage tree based on clusters, i.e. cell types, identified by RaceID.
    • raceid_trajectory: Wrapper for the RaceID pipeline tool: Lineage computation using StemID. RaceID is a method for cell type identification from single-cell RNA-seq data by unsupervised learning. An initial clustering is followed by an outlier identification based on a background model of combined technical and biological variability in single-cell RNA-seq data obtained by quantification with unique molecular identifiers. StemID permits subsequent inference of a lineage tree based on clusters, i.e. cell types, identified by RaceID.
    • megahit_contig2fastg: A subprogram within the MEGAHIT toolkit for converting contigs to assembly graphs (fastg). Contig2fastg is a subprogram within the MEGAHIT toolkit. It converts MEGAHIT's contigs (.fa) to assembly graphs (.fastg) that can be utilized for protein/peptide identification via graph2pro and can also be visualized via Bandage. MEGAHIT is a single-node assembler for large and complex metagenomics NGS reads, such as soil. It makes use of a succinct de Bruijn graph (SdBG) to achieve low-memory assembly.
    • raceid_clustering: Wrapper for the RaceID pipeline tool: Clustering using RaceID. RaceID is a method for cell type identification from single-cell RNA-seq data by unsupervised learning. An initial clustering is followed by an outlier identification based on a background model of combined technical and biological variability in single-cell RNA-seq data obtained by quantification with unique molecular identifiers. StemID permits subsequent inference of a lineage tree based on clusters, i.e. cell types, identified by RaceID.
    • deg_annotate: Annotate DESeq2/DEXSeq output tables. This tool appends gene symbols, biotypes, positions, etc. to the output table of DESeq2 or DEXSeq. The information you want to add is configurable. This information should be present in the input GTF/GFF file as attributes of the feature you choose.
    • raceid_inspecttrajectory: Wrapper for the RaceID pipeline tool: Lineage Branch Analysis using StemID. RaceID is a method for cell type identification from single-cell RNA-seq data by unsupervised learning. An initial clustering is followed by an outlier identification based on a background model of combined technical and biological variability in single-cell RNA-seq data obtained by quantification with unique molecular identifiers. StemID permits subsequent inference of a lineage tree based on clusters, i.e. cell types, identified by RaceID.
    • fasta_stats: Display summary statistics for a fasta file.
    • raceid_inspectclusters: Wrapper for the RaceID pipeline tool: Cluster Inspection using RaceID. RaceID is a method for cell type identification from single-cell RNA-seq data by unsupervised learning. An initial clustering is followed by an outlier identification based on a background model of combined technical and biological variability in single-cell RNA-seq data obtained by quantification with unique molecular identifiers. StemID permits subsequent inference of a lineage tree based on clusters, i.e. cell types, identified by RaceID.
  • From bimib:
    • marea: Port of the MaREA pipeline to Galaxy. The Metabolic Reaction Enrichment Analysis (MaREA) pipeline extracts metabolic information from gene expression profiles, as reported in 'Integration of transcriptomic data and metabolic networks in cancer samples reveals highly significant prognostic power', Graudenzi et al. (https://doi.org/10.1016/j.jbi.2018.09.010).
  • From estrain: