October 2014 Galaxy Update
Welcome to the October 2014 Galaxy Update, a summary of what is going on in the Galaxy community. Galaxy Updates complement the Galaxy Development News Briefs which accompany new Galaxy releases and focus on Galaxy code updates.
The Galaxy Project is preparing for our next grant cycle and we are seeking your feedback and comments on all things Galaxy. There are two questionnaires, each with a different focus, based on how you interact with Galaxy:
Please take a few minutes and fill out whichever surveys apply to you. The questionnaires are structured so you can skip topics that don't apply to you, and every question is optional.
And, to thank you for your time and effort, the Galaxy Project will increase your storage quota on usegalaxy.org by 50GB, a 20% increase.
Let your voice be heard!
A change to the #galaxyproject IRC channel was proposed, discussed, and approved on Galaxy Biostar. Starting sometime in October, posts to this channel will be made available in a searchable archive on the web.
Thanks to everyone who participated in the decision, and those at GCC2014 who suggested this.
There are upcoming events in Switzerland, Germany, Australia, Norway, France, Italy, and the United States. See the Galaxy Events Google Calendar for details on other events of interest to the community.
Executing SADI services in Galaxy, by Aranguren, et al. Journal of Biomedical Semantics, Vol. 5, No. 1. (2014), 42, doi:10.1186/2041-1480-5-42
A Survey of Cloud-Based Service Computing Solutions for Mammalian Genomics, by Church & Goscinski, IEEE Transactions on Services Computing, DOI: 10.1109/TSC.2014.2353645
An automated infrastructure to support high-throughput bioinformatics, by Cuccuru, et al. High Performance Computing & Simulation (HPCS), 2014 International Conference on (July 2014), pp. 600-607, doi:10.1109/hpcsim.2014.6903742
Experiences building Globus Genomics: a next-generation sequencing analysis service using Galaxy, Globus, and Amazon Web Services, by Madduri, et al. Concurrency and Computation: Practice and Experience, Special issue on XSEDE13, Volume 26, Issue 13, pages 2266–2279, 10 September 2014
MIRPIPE – quantification of microRNAs in niche model organisms, by Kuenne, et al. Bioinformatics (2014) doi: 10.1093/bioinformatics/btu573
ballaxy: web services for structural bioinformatics, by Hildebrandt, et al. Bioinformatics (2014) doi: 10.1093/bioinformatics/btu574
The new papers were tagged in many different areas:
The Galaxy is expanding! Please help it grow.
- CDD Ingénieur NGS - Institut Curie, Paris, France
- Emploi CDD Ingénieur Bioinformatique - ChIP-seq, Marseille, France
- Research Specialist, Michigan State University, United States
- Bioinformatics and Computational Biology, US Army Engineer Research and Development Center’s Environmental Laboratory, Vicksburg, MS, United States
- Computational Science Developer I, Cold Spring Harbor Laboratory (CSHL), New York, United States
- Statistical Genomics Postdoc opening in the Makova lab at Penn State
- The Galaxy Project is hiring software engineers and post-docs
Two new public Galaxy servers were added to the published list in September:
GalaxEast aims at providing a large range of bioinformatics tools for the analysis of various types of Omics data. It supports reproducible computational research by providing an environment for performing and recording bioinformatics analyses.
The GalaxEast project has the following main objectives:
- Provide the academic scientific community with an open and powerful Galaxy instance with a guaranteed availability. The platform offers access to cutting-edge and up-to-date tools for Omics data analysis with help and support.
- Propose innovative developments and new helpful tools packaged for Galaxy (available in the GalaxEast toolshed)
- Promote the packaging of new developments for Galaxy (through wrappers and/or toolshed packages).
See GalaxEast: an open and powerful Galaxy instance for integrative Omics data analysis, poster presented at ECCB'14 by Stephanie Le Gras, et al. for more.
MIRPIPE focuses on quantification of microRNA based on smallRNA sequencing reads. From the home page: In opposition to present algorithms that generally rely on genomic data to identify miRNAs, MIRPIPE focuses on niche model organisms that lack such information. Among the MIRPIPE features are automatic trimming and adapter removal of raw RNA-Seq reads originating from various sequencing instruments, clustering of isomiRs, and quantification of detected miRNAs by homology search versus public or user uploaded reference databases.
See "MIRPIPE – quantification of microRNAs in niche model organisms," C. Kuenne, et al. for more. Email support and a MIRPIPE Manual are provided. MIRPIPE is sponsored by the Max Planck Institute for Heart and Lung Research.
The deployment details for the GalaxEast public server were posted in September. The Overview of Galaxy on Bio-Linux 8 page, by Tracey Timms-Wilson of the NERC Environmental 'Omics Synthesis Centre, was also added to the Community Log Board.
Look for a new Galaxy distribution in October.
Here are new contributions for the past two months.
In no particular order:
- sift_web: PROVEAN and SIFT predictions for a list of human genome variants.
- jemultiplexer: debarcoding/demultiplexing tool for FASTQ files accommodating all complex multiplexing protocols (iCLIP, molecule barcoding, ...).
- tcoffee: T-Coffee multiple alignment suite.
- kggseq_variant_selection: Variant selection with KGGSeq
- structurefold: StructureFold predicts RNA secondary structures from high throughput RNA structure profiling data
- sirna_plant: plant siRNA analysis toolkits: siRNA prediction, siRNA annotation, and siRNA quantification
- dc_genotyper: genotyper aimed at finding SNPs in high-ploidy (or pooled) samples sequenced at very high depth in a targeted region.
- fasta_merge_files_and_filter_unique_sequences: Merge FASTA files, keeping only unique sequences
- filter_by_fasta_ids: Extract sequences from a FASTA file based on a list of IDs
- myrimatch: protein identification via database search using Bumbershoot MyriMatch
- ltq_iquant_cli: iQuant performs tag based isobaric quantification
- idpqonvert: Bumbershoot idpQonvert, a part of Bumbershoot IDPicker.
- directag_and_tagrecon: protein identification via Directag and TagRecon.
- suite_vcflib_tools_3_0: 23 tools for manipulation of VCF datasets
- opal2_4_1: Opal Package - GVL
- package_vcflib_8a5602bf07: Compiled vcflib binaries for x86_64
- package_igvtools_2_3_32: igvtools binaries, to be used as dependency in other tools.
- package_rseqc_2_4: downloads and compiles version 2.4 of RSeQC.
- toolfactory: Citations added (thanks John!) and a few more output formats for Alistair Chilcott
- "Why the three biggest positive contributions to reproducible research are the iPython Notebook, knitr, and Galaxy," on the Simply Statistics blog
- Updated wiki page about dynamically discovering output datasets at runtime.
- The Ansible playbook used to update usegalaxy.org is available in GitHub.
- New GVL Galaxy Release: Metagenomics Tutorial tools, MACS2, BLAST, MEME, hg38, rn6, and Trinity.
- Galaxy Community UK launches a Twitter channel: @GalaxyUKFriends
- BOSC 2015 will be in Dublin with ISMB/ECCB 2015. We should have voted more often!
- Supporting Enhanced Reproducibility for Platforms like Galaxy, discussion on GitHub.