March 2018 Tool Shed contributions
Tools contributed to the Galaxy Project ToolShed in March 2018.
- metagene_annotator: MetaGeneAnnotator gene-finding program for prokaryote and phage. MetaGeneAnnotator (MGA) precisely predicts all kinds of prokaryotic genes from a single or a set of anonymous genomic sequences having a variety of lengths. The MGA integrates statistical models of prophage genes, in addition to those of bacterial and archaeal genes, and also uses a self-training model from input sequences for predictions. As a result, the MGA sensitively detects not only typical genes but also atypical genes, such as horizontally transferred and prophage genes in a prokaryotic genome. MetaGeneAnnotator results are used by sixgill to generate metapeptides. http://metagene.cb.k.u-tokyo.ac.jp/ doi: 10.1093/dnares/dsn027.
- hicexplorer_hiccomparematrices: Wrapper for HiCExplorer: hicCompareMatrices. Sequencing techniques that probe the 3D organization of the genome generate large amounts of data whose processing, analysis and visualization are challenging. HiCExplorer is a set of tools for the analysis and visualization of chromosome conformation data. It facilitates the creation of contact matrices, correction of contacts, TAD detection, merging, reordering of chromosomes, conversion from different formats and detection of long-range contacts. Moreover, it allows the visualization of multiple contact matrices along with other types of data like genes, compartments, ChIP-seq coverage tracks (and in general any type of genomic scores) and long-range contacts. doi: 10.5281/zenodo.159780. Repository-Maintainer: Björn Grüning. https://github.com/maxplanck-ie/HiCExplorer.
- hicexplorer_hicaggregatecontacts: Wrapper for HiCExplorer: hicAggregateContacts.
- combine_json: JSON collection tool that takes multiple JSON data arrays and combines them into a single JSON array.
- trinity: De novo reconstruction of transcriptomes from RNA-seq data. Trinity provides a method for efficient and robust de novo reconstruction of transcriptomes from RNA-seq data. https://github.com/trinityrnaseq/trinityrnaseq.
- concatenate_multiple_datasets: Concatenate multiple datasets tail-to-head, including collection datasets. Can reduce dataset collections to a single output file, optionally with the names of the datasets as section headers.
- ngsap_vc: Variant calling tools and workflow. Contains variant calling tools and workflows.
- irprofiler: IRProfiler Toolbox. Toolbox for immunogenetic repertoire profiling of high-throughput sequencing data.
- motif_tools: Version 1.0.1. Simple motif finding utilities. Set of simple motif finding utilities by Ian Donaldson: get all matches to a given IUPAC pattern in GFF format; count the matches to a given IUPAC pattern; get a non-redundant count of sequences; identify clusters of two and three TFBS.
- proteore_tissue_specific_expression_data: ProteoRE Retrieve tissue-specific expression data. A tool for retrieving tissue-specific expression data from HPA (Human Protein Atlas) - no input required.
- proteore_clusterprofiler: ProteoRE clusterProfiler. A tool performing GO terms classification and enrichment analysis using R package clusterProfiler.
- kodoja: kodoja_search.py v0.0.3 wrapper. https://github.com/abaizan/kodoja_galaxy/commit/55004d41a9c0750b2543f394594ee58cc4426609. Kodoja takes the raw data (either fasta or fastq) and uses Kraken, a k-mer-based tool, and Kaiju, which uses the Burrows–Wheeler transform, to detect viral sequences in RNA-seq or sRNA-seq data.
- epicseg: EpiCSeg is a tool for conducting chromatin segmentation. EpiCSeg is a tool for performing chromatin segmentation based on a hidden Markov model approach. A detailed description of the method is available at http://www.genomebiology.com/2015/16/1/151.
- ectyper_v1: ectyper_v1.
- charts: Enables advanced visualization options in Galaxy Charts. Galaxy Charts is a visualization tool available as a plugin for Galaxy. Certain chart types, e.g. histograms, pre-process data before rendering the visualization. The pre-processing is done by this tool. Notably, many chart types, e.g. bar diagrams and scatterplots, do not require data pre-processing.
- vcfdistance: Calculate distance to the nearest variant. Adds a value to each VCF record indicating the distance to the nearest variant in the file. The dataset used as input to this tool must be coordinate sorted. This can be achieved by using either the VCFsort utility or Galaxy's general purpose sort tool (in this case sort on the first and the second column in ascending order).
- raceid_main: Wrapper for the RaceID pipeline tool: RaceID. RaceID is an algorithm for the identification of rare and abundant cell types from single-cell transcriptome data. The method is based on transcript counts obtained with unique molecular identifiers.
- raceid_diffgene: Wrapper for the RaceID pipeline tool: RaceID differential gene expression analysis. RaceID is an algorithm for the identification of rare and abundant cell types from single-cell transcriptome data. The method is based on transcript counts obtained with unique molecular identifiers.
- fastp: Fast all-in-one preprocessing for FASTQ files. A tool designed to provide fast all-in-one preprocessing for FASTQ files. This tool is developed in C++ with multithreading support for high performance.
- feelnc: Galaxy wrapper for FEELnc. FEELnc (FlExible Extraction of LncRNAs) pipeline is used in order to annotate long non-coding RNAs (lncRNAs) based on reconstructed transcripts from RNA-seq data (either with or without a reference genome). It is an alignment-free program that accurately annotates lncRNAs based on a Random Forest model trained with general features such as multi k-mer frequencies and relaxed open reading frames.
- mothur_merge_count: Wrapper for mothur application: Merge.count. The mothur project seeks to offer a single piece of open-source, expandable software to fill the bioinformatics needs of the microbial ecology community. We have incorporated the functionality of dotur, sons, treeclimber, s-libshuff, unifrac, and much more. In addition to improving the flexibility of these algorithms, we have added a number of other features including calculators and visualization tools.
- mothur_biom_info: Wrapper for mothur application: Biom.info. The mothur project seeks to offer a single piece of open-source, expandable software to fill the bioinformatics needs of the microbial ecology community.
- mothur_taxonomy_to_krona: Wrapper for mothur application: Taxonomy-to-Krona. The mothur project seeks to offer a single piece of open-source, expandable software to fill the bioinformatics needs of the microbial ecology community.
- mothur_rename_seqs: Wrapper for mothur application: Rename.seqs. The mothur project seeks to offer a single piece of open-source, expandable software to fill the bioinformatics needs of the microbial ecology community.
- mothur_chimera_vsearch: Wrapper for mothur application: Chimera.vsearch. The mothur project seeks to offer a single piece of open-source, expandable software to fill the bioinformatics needs of the microbial ecology community.
- samtools_fastx: Extract reads. Extract reads from a SAM or BAM file, optionally filtering by flags in the input file.
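
The vcfdistance entry in the list describes adding, to each record of a coordinate-sorted file, the distance to the nearest variant. A minimal Python sketch of that idea follows. It is an illustration only, not the tool's actual implementation: the function name `nearest_variant_distances`, the `(chromosome, position)` tuple input, the restriction to neighbours on the same chromosome, and the use of `None` for a variant with no neighbour are all assumptions made for the example.

```python
from typing import List, Optional, Tuple


def nearest_variant_distances(
    records: List[Tuple[str, int]],
) -> List[Optional[int]]:
    """Return, for each (chrom, pos) record, the distance to the nearest
    other record on the same chromosome.

    Assumes the input is coordinate sorted (by chromosome, then position),
    so only the immediate neighbours need to be inspected.
    """
    distances: List[Optional[int]] = []
    for i, (chrom, pos) in enumerate(records):
        candidates = []
        # Previous record, if it lies on the same chromosome.
        if i > 0 and records[i - 1][0] == chrom:
            candidates.append(pos - records[i - 1][1])
        # Next record, if it lies on the same chromosome.
        if i + 1 < len(records) and records[i + 1][0] == chrom:
            candidates.append(records[i + 1][1] - pos)
        # None marks a variant that is alone on its chromosome (hypothetical choice).
        distances.append(min(candidates) if candidates else None)
    return distances


variants = [("chr1", 100), ("chr1", 150), ("chr1", 400), ("chr2", 50)]
print(nearest_variant_distances(variants))  # [50, 50, 250, None]
```

Because the input is sorted, each record is compared only with its two neighbours, which is why the sorting requirement in the tool description matters: on unsorted input this neighbour-only comparison would give wrong distances.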