April 2014 Galaxy Update
- New Papers
- Who's Hiring
- Galaxy Distributions
- Galaxy Community Hubs
- Other News
Welcome to the April 2014 Galaxy Update, a monthly summary of what is going on in the Galaxy community. Galaxy Updates complement the Galaxy Development News Briefs which accompany new Galaxy releases and focus on Galaxy code updates.
63 papers (a new monthly record) were added to the Galaxy CiteULike Group in March. Some papers that may be particularly interesting to the Galaxy community:
- "Wrangling Galaxy's Reference Data," by Daniel Blankenberg, James E. Johnson, James Taylor, Anton Nekrutenko; Bioinformatics (28 February 2014), doi:10.1093/bioinformatics/btu119
- "Controlling for contamination in re-sequencing studies with a reproducible web-based phylogenetic approach," Benjamin Dickins, Boris Rebolledo-Jaramillo, Marcia Shu-Wei S. Su, Ian M. Paul, Daniel Blankenberg, Nicholas Stoler, Kateryna D. Makova, Anton Nekrutenko, BioTechniques, Vol. 56, No. 3. (2014)
- "Orione, a web-based framework for NGS analysis in microbiology," Gianmauro Cuccuru, Massimiliano Orsini, Andrea Pinna, Andrea Sbardellati, Nicola Soranzo, Antonella Travaglione, Paolo Uva, Gianluigi Zanetti, Giorgio Fotia, Bioinformatics (Oxford, England) (10 March 2014), doi:10.1093/bioinformatics/btu135
- "Galaxy as a Platform for Identifying Candidate Pathogen Effectors," Peter J. Cock, Leighton Pritchard, In Plant-Pathogen Interactions, Vol. 1127 (2014), pp. 3-15, doi:10.1007/978-1-62703-986-4_1
- "GigaDB: promoting data dissemination and reproducibility," Tam P. Sneddon, Xiao S. Zhe, Scott C. Edmunds, Peter Li, Laurie Goodman, Christopher I. Hunter, Database, Vol. 2014 (01 January 2014), bau018, doi:10.1093/database/bau018
- "Prediction of Gene Activity in Early B Cell Development Based on an Integrative Multi-Omics Analysis," Mohammad Heydarian, Teresa Romeo Luperchio, Jevon Cutler, Christopher J. Mitchell, Min-Sik Kim, Akhilesh Pandey, Barbara Sollner-Webb, Karen Reddy, Journal of Proteomics & Bioinformatics, Vol. 07, No. 02. (2014), doi:10.4172/jpb.1000302
GCC2014: June 30 - July 2, Baltimore
The 2014 Galaxy Community Conference (GCC2014) will be held June 30 through July 2, at the Homewood Campus of Johns Hopkins University, in Baltimore, Maryland, United States.
Oral Presentation Abstract Submission Closes April 4
Abstract submission for oral presentations closes April 4, which is this Friday. Poster submission closes April 25. Poster authors will be notified of acceptance status within two weeks of submission, while presentation authors will be notified no later than May 2. Please consider presenting your work. If you are dealing with big biological data, then this meeting wants to hear about it.
Accepted talks and selected posters from GCC2014 are also eligible for consideration to appear in the GigaScience "Galaxy: Data Intensive and Reproducible Research" series.
Registration is Open
Early registration is now open. Early registration saves more than 70% on registration costs, and Training Day registration is an additional 55% off if you register for both at the same time. This is by far the most affordable option, with early registration fees starting at less than $50 per day. When you register you can also reserve lodging at Charles Commons, a very affordable housing option in the same building as the conference.
Training Day is an opportunity to learn about all things Galaxy, including using Galaxy, deploying and managing Galaxy, extending Galaxy, and Galaxy internals. There are 5 parallel tracks, each with 3 sessions, and each session is two and a half hours long. That's 15 sessions and over 37 hours of workshop material.
There are still Silver and Bronze sponsorships available. Please contact the Organizers if your organization would like to help sponsor this event.
In 2014 we are also adding non-sponsor exhibit spaces in addition to the sponsor exhibits. This will significantly increase the size of the exhibit floor. Please contact the Organizers if your organization would like to have an exhibit space at GCC2014.
GlobusWorld 2014
GlobusWorld is this year’s biggest gathering of all things Globus. GlobusWorld 2014 features a Using Globus Genomics to Accelerate Analysis Tutorial, and a full half day on Globus Genomics in the main meeting, including a keynote by Nancy Cox and these accepted talks:
- Globus Genomics: Enabling high-throughput cloud-based analysis and management of NGS data for Translational Genomics research at Georgetown, by Yuriy Gusev
- Improving next-generation sequencing variants identification in cancer genes using Globus Genomics, by Toshio Yoshimatsu
- Globus Genomics: A Medical Center's Bioinformatics Core Perspective, by Anoop Mayampurath
- Building a Low-budget Public Resource for Large-scale Proteomic Analyses, by Rama Raghavan
Globus Genomics is a Globus- and Galaxy-based platform for genomic analysis. GlobusWorld is being held April 15-17 in Chicago, and GCC2014 is a Silver Sponsor of GlobusWorld.
UC Davis 2014 Bioinformatics Workshop
Registration is now open for the Using Galaxy for Analysis of High Throughput Sequence Data Workshop being held at UC Davis, June 16-20, 2014 from 9-5 each day. The workshop will cover modern high throughput sequencing technologies, applications, and ancillary topics, including:
- Illumina HiSeq / MiSeq, and PacBio RS technologies
- Read Quality Assessment & Improvement
- Genome assembly
- SNP and indel discovery
- RNA-Seq differential expression analysis
- Experimental design
- Hardware and software considerations
- Cloud Computing
The workshop will include a rich collection of lectures and hands-on sessions, covering both theory and tools. We will cover the basics of several high throughput sequencing technologies, but will focus on Illumina and PacBio data for hands-on exercises. Participants will explore software and protocols, create and modify workflows, and diagnose and treat problematic data. Workshop exercises will be performed using the popular Galaxy platform (http://usegalaxy.org) on the Amazon Cloud, which allows for powerful web-based data analyses. There are no prerequisites other than basic familiarity with genomic concepts.
A similar workshop, using command line interfaces to perform the analysis, is being offered September 15-19, 2014.
The Galaxy is expanding! Please help it grow.
- Statistical Genomics Postdoc opening in the Makova lab at Penn State
- The Galaxy Project is hiring software engineers and post-docs
Got a Galaxy-related opening? Send it to email@example.com and we'll put it in the Galaxy News feed and include it in next month's update.
New Public Servers
Three public Galaxy servers were added to the published list in March:
- Link: Biomina Galaxy
- Domain/Purpose: A general purpose Galaxy instance that includes most "standard" tools for DNA/RNA sequencing, plus extra tools for panel resequencing, variant annotation and some tools for Illumina SNParray analysis.
- Includes a number of workflows, including workflow from "A SWI/SNF-related autism syndrome caused by de novo mutations in ADNP," by Helsmoortel, et al., Nature Genetics (2014) doi:10.1038/ng.2899
- User Support: [Email support](mailto:geert DOT vandeweyer AT uantwerpen DOT be)
- Registered users: 50 GB. Can be increased up to 3 TB in collaborative projects.
- There is NO backup of data inside this Galaxy server.
- Collaboration partner jobs have higher priority on the system.
Image Analysis and Processing Toolkit
- Domain/Purpose: Phylogenetics
- Comments: "This server aims to demonstrate Osiris, a set of phylogenetics tools for the Galaxy Bioinformatics platform. Because it is only a demo, some computationally intensive tools are disabled. Other tools will be slow because this is a public, shared resource."
- Sponsor(s): Oakley Lab at UC Santa Barbara
The most recent release of Galaxy was February 10, 2014.
The most recent version of CloudMan was released in January 2014.
Galaxy Community Hubs
The Community Log Board and Deployment Catalog Galaxy community hubs were launched in December. If you have a deployment or experience you want to share, please publish it.
There was one new Community Log Board entry in March:
- Basic Galaxy Puppet Module (work by Olivier Inizan, Mikael Loaec of INRA-URGI)
New Repositories in the Galaxy Project ToolShed
- regex_find_replace: Use python regular expressions to find and replace text
- samtools_phase: Call and phase heterozygous SNPs
- sample_seqs: Sub-sample sequence files (e.g. to reduce coverage)
- transpose: Transposes tabular-delimited data
- proteomics_rnaseq_reduced_db_workflow: Filter Proteomics Search DB by RNA-seq transcript expression analysis
- proteomics_rnaseq_sap_db_workflow: Create Proteomics Search DB from RNA-seq Single amino acid Polymorphism detection
- proteomics_novel_peptide_filter_workflow: filter a Proteomics Search DB for novel peptides
- proteomics_rnaseq_splice_db_workflow: create Proteomics Search DB from RNA-seq novel splice detection
- rsem_datatypes: Custom galaxy datatypes definitions for use with RSEM
- varscan_wrapper: Fork of the fcaramia package, correcting errors and adding options
- align_back_trans: Thread nucleotides onto a protein alignment (back-translation)
- dna_visualizer: convert DNA sequence into a PNG image by representing each base with one colored pixel
- bwa_mem: a software package for mapping low-divergent sequences against a large reference genome
- samtool_filter2: Filter BAM/SAM on FLAG,MAPQ,RG,LB or by region & produce a BAM/SAM on demand
- Galaxy reached a milestone of 100 contributors to our codebase! Thank you all!
- Poster: "ChemicalToolBoX and its application on the study of the drug like and purchasable space," by Lucas, et al., Journal of Cheminformatics
- New tools available in Galaxy @ URGI: SnpEff, Mapsembler2, BLAST+, Blast2GO, Peak predictor, ...