October-November 2015 Tool Shed Contributions

Featured Updates

  • From peterjc:

  • From saskia-hiltemann:

    • annovar: Added databases 1000g2015aug, SPIDEX, avsnp138, avsnp142, exac03
    • annovar: Added GoNL download link
    • annovar: Uploaded
    • ireport: Fixed auto resizing plus various other minor bugs
  • From iracooke:

  • From crs4:

    • prokka: Support Prokka 1.11. Upgrade dependencies to package_barrnap_0_7, package_blast_plus_2_2_31, package_hmmer_3_1b2, package_tbl2asn_24_3.

Tools

  • From tiagoantao:

    • raxml: RAxML, a maximum-likelihood-based phylogenetic inference tool
  • From brigidar:

    • snp_to_bed: Converts a SNP table with locus_tag and position into a BED-formatted file. BED requires an interval rather than a single location.
    • select_product: Finds all CDS whose product description matches the keywords provided in the comma-separated argument -k, and adds a given number of bases on each side of the transcript for the exclusion set. Example key: transposases
    • drop_column: Drops a column. Provide the number of the column to the left (number1) and right (number2) of the column to remove.
    • vcf_to_snp: Extracts SNPs from a VCF. Transforms a VCF into a SNP tab file. Can be used to BLAST against a clade database.
    • sed_fasta: Removes leading characters in FASTA files downloaded from NCBI to match the GenBank locus tag. Removes the gi to gb characters and the final pipe.
  • From saskia-hiltemann:

    • testrepo: for testing tools only until cross-toolshed dependencies work properly
    • xy_plot_multiformat: devteam's xy_plot tool with support for multiple output formats. In addition to PDF output, it can also generate PNG, JPEG, BMP, and TIFF output.
    • file_manipulation: Collection of File Manipulation Tools
    • krona_text: Krona visualisation (generic)
  • From iuc:

    • ngsutils_bam_filter: Wrapper for ngsutils tool: BAM filter
    • khmer_normalize_by_median: Wrapper for khmer tool: Normalize By Median. khmer is a library and suite of command line tools for working with DNA sequence. It is primarily aimed at short-read sequencing data such as that produced by the Illumina platform. khmer takes a k-mer-centric approach to sequence analysis, hence the name. The official repository is at https://github.com/ged-lab/khmer and you can read the docs online here: http://khmer.readthedocs.org/
    • sickle: Sickle is a tool that uses sliding windows along with quality and length thresholds to determine when quality is sufficiently low to trim the 3'-end of reads and when it is sufficiently high to trim the 5'-end of reads. https://github.com/najoshi/sickle
    • datamash_transpose: Transpose tool from the datamash package. GNU Datamash is a command-line program which performs basic numeric, textual and statistical operations on input textual data files.
    • transtermhp: Finds rho-independent transcription terminators in bacterial genomes. Each terminator found by the program is assigned a confidence value that estimates its probability of being a true terminator. http://transterm.cbcb.umd.edu/
    • suite_ngsutils: A suite of Galaxy tools for the Python ngsutils package. NGSUtils is a suite of software tools for working with next-generation sequencing datasets.
    • bp_genbank2gff3: Converts GenBank format files to GFF3
    • khmer_abundance_distribution: Wrapper for khmer tool: Abundance Distribution. khmer is a library and suite of command line tools for working with DNA sequence. It is primarily aimed at short-read sequencing data such as that produced by the Illumina platform. khmer takes a k-mer-centric approach to sequence analysis, hence the name.
    • khmer_filter_below_abundance_cutoff: Wrapper for khmer tool: Filter k-mers. khmer is a library and suite of command line tools for working with DNA sequence.
    • dexseq: Inference of differential exon usage in RNA-Seq. Estimate variance-mean dependence in count data from high-throughput sequencing assays and test for differential expression based on a model using the negative binomial distribution. http://www.bioconductor.org/packages/release/bioc/html/DEXSeq.html
    • khmer_extract_partitions: Wrapper for khmer tool: Extract partitions. khmer is a library and suite of command line tools for working with DNA sequence.
    • trinity: Trinity assembles transcript sequences from Illumina RNA-Seq data. Trinity represents a method for the efficient and robust de novo reconstruction of transcriptomes from RNA-seq data. https://github.com/trinityrnaseq/trinityrnaseq
    • data_manager_hisat2_index_builder: HISAT is a fast and sensitive spliced alignment program. As part of HISAT, we have developed a new indexing scheme based on the Burrows-Wheeler transform (BWT) and the FM index, called hierarchical indexing, that employs two types of indexes: (1) one global FM index representing the whole genome, and (2) many separate local FM indexes for small regions collectively covering the genome. http://ccb.jhu.edu/software/hisat/index.shtml
    • transdecoder: TransDecoder finds coding regions within transcripts. TransDecoder identifies candidate coding regions within transcript sequences, such as those generated by de novo RNA-Seq transcript assembly using Trinity, or constructed based on RNA-Seq alignments to the genome using Tophat and Cufflinks. https://transdecoder.github.io/
    • khmer_filter_abundance: Wrapper for khmer tool: Filter k-mer. khmer is a library and suite of command line tools for working with DNA sequence.
    • khmer_count_median: Count Median. khmer is a library and suite of command line tools for working with DNA sequence.
    • suite_datamash: GNU Datamash is a command-line program which performs basic numeric, textual and statistical operations on input textual data files. Home page: http://www.gnu.org/software/datamash. These tool wrappers were originally written by Assaf Gordon.
    • khmer_partition: Wrapper for khmer tool: Sequence partition all-in-one. khmer is a library and suite of command line tools for working with DNA sequence.
    • hisat2: HISAT2 is a fast and sensitive alignment program for mapping next-generation sequencing reads (both DNA and RNA) against the general human population (as well as against a single reference genome).
    • datamash_reverse: Reverse tool from the datamash package. GNU Datamash is a command-line program which performs basic numeric, textual and statistical operations on input textual data files.
    • datamash_ops: Datamash tool from the datamash package. GNU Datamash is a command-line program which performs basic numeric, textual and statistical operations on input textual data files.
    • khmer_abundance_distribution_single: Wrapper for khmer tool: Abundance Distribution (all-in-one). khmer is a library and suite of command line tools for working with DNA sequence.
  • From bgruening:

  • From nick:

    • duplex: A pipeline for processing duplex sequencing data.
  • From bornea:

  • From mvdbeek:

  • From yhoogstrate:

    • show_metadata: Debug utility, shows all metadata of a history item
    • samtools_parallel_mpileup: samtools - optimized for parallel mpileup generation
    • varscan_mpileup2indel_from_bam: VarScan2 mpileup2indel - optimized for direct BAM/SAM input and parallel mpileup generation. VarScan is a variant caller for high-throughput sequencing data.
    • varscan_mpileup2snp_from_bam: VarScan2 mpileup2snp - optimized for direct BAM/SAM input and parallel mpileup generation. VarScan is a variant caller for high-throughput sequencing data.
  • From devteam:

    • data_manager_fetch_ncbi_taxonomy: Contains a data manager that defines and populates the ncbi_taxonomy tool data table. Downloads the NCBI taxonomy files and adds their paths to a data table.
    • data_manager_rsync_g2: rsync data manager. This tool connects to the Galaxy Project's rsync reference data repository to download data and populate tool data tables.
    • velvet: de novo genomic assembler specially designed for short read sequencing technologies
  • From drosofff:

    • msp_sr_readmap_and_size_histograms: Generates readmaps and size histograms from small RNA bowtie alignments. This tool belongs to the mississippi tool suite.
    • msp_sr_signature: Computes the tendency of small RNAs to overlap with each other. For detailed information, see C. Antoniewski, “Computing siRNA and piRNA Overlap Signatures,” Methods Mol. Biol., vol. 1173, no. 12, pp. 135–146, 2014.
    • msp_sr_size_histograms: Generates size histograms from small RNA bowtie alignments.
  • From urgi-team:

    • vcfgandalftools: VCF Gandalf tools. Tools developed for the Gandalf project.
    • freebayes4workflow: This tool is a fork of Freebayes revision 22 (99684adf84de) that allows the output file to be renamed in workflows.
    • mdust: fast and symmetric DUST implementation to mask low-complexity DNA sequences
    • gandalfworkflow: Workflows developed for the Gandalf project
    • mapqfilter: mapQfilter filters reads on quality and removes both members of the pair. Uses samtools and Picard tools.
  • From stheil:

    • readpercontig_blat: Get read number, coverage or RPKM from a blat alignment of reads on contigs. Two simple tools: readPerContig computes either the number of reads, the coverage, or the RPKM by mapping reads on contigs/scaffolds using blat. getSingleton extracts reads that do not match any contig/scaffold.
    • taxonomy_sqlite: Downloads the NCBI taxonomy from FTP and loads it into a SQLite database. Downloads 4 files: gi_taxid_prot.dmp, gi_taxid_nucl.dmp, nodes.dmp and names.dmp from the NCBI Taxonomy FTP site. A Perl script called loadTaxonomy.pl creates the DB structure and loads the downloaded data into it. This DB is intended for use by the taxonomy_from_blast tools repository.
  • From nml:

  • From gbcs-embl-heidelberg:

    • je_markdupes: Initial upload. Wrapper for Je tool: Je-MarkDuplicates
    • je_demultiplex_illu: Initial upload. Wrapper for Je tool: Je-Demultiplex-Illu
    • je_clip: Initial upload. Wrapper for Je tool: Je-Clip
    • je_demultiplex: Initial upload. Wrapper for Je tool: Je-Demultiplex
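
The snp_to_bed entry in the list above notes that BED requires an interval rather than a single location. As a minimal sketch of what that means (not the tool's actual code), a 1-based SNP position can be written as a 0-based, half-open BED interval of length one. The function name and the use of locus_tag as the first BED column are assumptions made for illustration only.

```python
def snp_to_bed_line(locus_tag, position):
    """Turn one 1-based SNP position into a 0-based, half-open BED record.

    Hypothetical helper for illustration; the real snp_to_bed tool may use
    a different column layout.
    """
    if position < 1:
        raise ValueError("SNP positions are 1-based and must be >= 1")
    start = position - 1  # BED start is 0-based
    end = position        # BED end is exclusive, so the interval covers one base
    return f"{locus_tag}\t{start}\t{end}"


if __name__ == "__main__":
    # A SNP at position 100 becomes the interval [99, 100).
    print(snp_to_bed_line("ABC_0001", 100))
```

The locus tag ABC_0001 is a made-up example value.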

Dependency Definitions