July 2013 Galaxy Update
Welcome to the July 2013 Galaxy Update, a monthly summary of what is going on in the Galaxy community. Galaxy Updates complement the Galaxy Development News Briefs which accompany new Galaxy releases and focus on Galaxy code updates.
Four new servers joined the list of over 30 publicly accessible Galaxy servers in June.
P-Galaxy is part of the DDBJ Read Annotation Pipeline and is described in "DDBJ Read Annotation Pipeline: A Cloud Computing-Based Pipeline for High-Throughput Analysis of Next-Generation Sequencing Data" by Nagasaki, et al. in DNA Research. Support is available from the P-Galaxy Team, and P-Galaxy is developed by the Genome Informatics Laboratory and the DNA Data Bank of Japan (DDBJ) of the National Institute of Genetics (NIG).
In Silico Galaxy implements Encore, as described in "Encore: Genetic Association Interaction Network Centrality Pipeline and Application to SLE Exome Data," by Davis et al.
LiSIs is a platform for virtual screening. The LiSIs platform has been developed in the context of the cross-disciplinary GRANATUM project aiming to bridge the gap between biomedical researchers by ensuring their seamless access to the globally available information needed to perform complex experiments and to conduct studies on large-scale datasets. LiSIs is developed and maintained by the e-Health Laboratory at the Department of Computer Science, University of Cyprus. LiSIs is an EU-funded project under FP7-(ICT-2009.5.3) and has its own LiSIs Support Google Group for questions.
Orione is a Galaxy-based web server for microbiology. Orione includes all post-mapping or assembling steps, from scaffolding to complete annotation pipelines. From the GCC2013 poster abstract "Engaging Galaxy in Microbiology": We started on selecting the relevant software in the microbiology area, developing then all the necessary tools to integrate them into the Galaxy ecosystem. In addition to that, we made available several specialized workflows covering major applications such as bacterial resequencing, de novo assembly, scaffolding, bacterial RNA-seq, gene annotation and metagenomics. Orione provides additional capabilities to perform integrative, reproducible and transparent bioinformatic data analysis in microbiology thus expanding the constellation of specialized Galaxy based web servers as Nebula, Cistrome and several others.
This installation of Galaxy has been configured such that anonymous users can operate in a limited way. If you need to store data on this website and/or use advanced Galaxy features such as sharing and workflows, please send us an email with a short request.
A record 53 new papers were added to the Galaxy CiteULike Group in June. These papers may be particularly interesting to the Galaxy community:
- "Web-based visual analysis for high-throughput genomics" by Jeremy Goecks, Carl Eberhard, Tomithy Too, Anton Nekrutenko, James Taylor, BMC Genomics, Vol. 14, No. 1. (2013), 397
- "Detection of non-coding RNA in Bacteria and Archaea using the DETR’PROK Galaxy pipeline" by Claire Toffano-Nioche, Yufei Luo, Claire Kuchly, Claire Wallon, Delphine Steinbach, Matthias Zytnicki, Annick Jacq, Daniel Gautheret, Methods (June 2013)
- "Encore: Genetic Association Interaction Network Centrality Pipeline and Application to SLE Exome Data" by Nicholas A. Davis, Caleb A. Lareau, Bill C. White, et al. Genetic Epidemiology (June 2013)
- "A review of Bioinformatics training applied to research in Molecular Medicine, Agriculture and Biodiversity in Costa Rica and Central America" by Allan Orozco, Jessica Morera, Sergio Jiménez, Ricardo Boza, Briefings in Bioinformatics (30 May 2013)
- "Streaming Support for Data Intensive Cloud-Based Sequence Analysis" by Shadi A. Issa, Romeo Kienzler, Mohamed El-Kalioby, et al., BioMed Research International, Vol. 2013 (2013), pp. 1-16
- "Phase-defined complete sequencing of the HLA genes by next-generation sequencing" by Kazuyoshi Hosomichi, Timothy Jinam, Shigeki Mitsunaga, Hirofumi Nakaoka, Ituro Inoue, BMC Genomics, Vol. 14, No. 1. (2013), 355
- "Biology: The big challenges of big data" by Vivien Marx, Nature, Vol. 498, No. 7453. (12 June 2013), pp. 255-260
- "Software and supporting material for “SOAPdenovo2: An empirically improved memory-efficient short read de novo assembly”" by Luo, et al. in GigaDB
These tags were added in 2013 and very few papers before 2013 have been back-curated with them. All the tags are explained on the publication library page (/src/publication-library/index.md).
The Galaxy is expanding! Please help it grow.
- The Galaxy Project is hiring software engineers and post-docs at both Emory and Penn State.
- Sr Bioinformatics Specialist, Tufts University, Boston MA.
- Senior Developer, Stem Cell Bioinformatics Core, Sage Bionetworks, Seattle, WA, United States
- Bioinformatics Support Group Leader @ LSU
ISMB / ECCB 2013, along with BOSC (http://www.open-bio.org/wiki/BOSC_2013), MS SIG and many other events, will be held in Berlin July 19-23. As of this writing there are 10 talks, 9 posters, and 2 workshops related to Galaxy at the events.
GCC2013 is an opportunity to participate in two full days of presentations, discussions, poster sessions, lightning talks and Birds of a Feather sessions, all about high-throughput biology and the tools that support it. The conference also includes a Training Day for the second year in a row, this year with more in-depth topic coverage, more concurrent sessions, and more topics. The conference ends with a closing dinner (sponsored by Ion Torrent) at the historic venue Sporten high above Oslo. (And for the two evenings before that, we have reserved a pub...)
Past Galaxy Community Conferences have been the event for networking in the Galaxy: There is no better place to meet and learn from others doing high-throughput biology. GCC2013 extends this tradition by including Birds of a Feather (BoF) meetups at the event. Birds of a Feather meetups are informal gatherings where participants group together based on common interests. If you have something in the list at right you want to meet about, or you have a whole new topic, then please join or start a GCC2013 BoF.
Directly following GCC2013, there is a unique opportunity to attend a two-day Software Carpentry Boot Camp at the University of Oslo (in a building close to where the GCC is held). Software Carpentry Boot Camps aim to help scientists and engineers become more productive by teaching them basic computing skills like program design, version control, testing, and task automation. In this two-day boot camp, short tutorials will alternate with hands-on practical exercises.
The course is aimed at postgraduate students and other scientists who are familiar with basic programming concepts (like loops, conditionals, arrays, and functions) but need help to translate this knowledge into practical tools to help them work more productively.
Content: The syllabus for this boot camp will include:
- using the shell to do more in less time
- using version control to manage and share information
- basic Python programming
- how (and how much) to test programs
Visit the Boot Camp Page for more information and registration.
See the Galaxy Events Google Calendar for details on these and other events.
# Galaxy Distributions
- Visualization tool updates to Scatterplot and Trackster.
- New features and fixes added to the Tool Shed and related components.
- Python 2.5 Support officially ended.
- Plus newly merged Pull Requests and links to tickets covering key Bug Fixes.
new:
$ hg clone https://bitbucket.org/galaxy/galaxy-dist#stable
upgrade:
$ hg pull
$ hg update release_2013.06.03
A security vulnerability was recently discovered by Björn Grüning with Galaxy's "user impersonation" feature that can expose an administrator's active history to users whom they impersonate. Only Galaxy instances with
allow_user_impersonation = True
set in their configurations are affected, and only if an administrator makes use of the impersonation feature. By default, user impersonation is disabled.
A fix (id: 9d42f1e32efb) has been provided in the stable branch of Galaxy. To apply the fix, ensure you are on the stable branch and upgrade to the latest changeset:
% hg branch
stable
% hg pull -u
For Galaxy installations on relatively old versions that administrators are not yet ready to upgrade, there are three workarounds. First, the patch can be downloaded and applied manually:
% wget -O security.patch https://bitbucket.org/galaxy/galaxy-central/commits/9d42f1e32efb654fda1e011dc66a4aa5888717f1/raw/
and then either:
% hg patch security.patch
or:
% patch -p1 < security.patch
Second, the impersonation feature can be disabled by setting the following option in Galaxy's configuration file:
allow_user_impersonation = False
In all of the above cases, the Galaxy server process(es) must be restarted for the change to take effect.
Third, the feature can be left enabled and unpatched, and the vulnerability can be worked around by educating administrators who use the feature. As long as a new history is created by the administrator prior to switching to the impersonated user, no data will be exposed to the impersonated user.
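As a rough way to check whether a given instance is exposed, the configuration test described above can be sketched in Python. This is only an illustration: it assumes an INI-style Galaxy configuration with the option in an `[app:main]` section, and the function name `impersonation_enabled` is hypothetical, not part of Galaxy.

```python
import configparser


def impersonation_enabled(ini_text, section="app:main"):
    """Return True if allow_user_impersonation is switched on in ini_text.

    Assumption: the option lives in an INI section named "app:main".
    Commented-out lines (starting with '#' or ';') are ignored by the parser,
    so a default, commented-out setting counts as disabled.
    """
    # interpolation=None keeps literal '%' characters in other options from
    # raising errors; strict=False tolerates duplicated options.
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.read_string(ini_text)
    if not parser.has_section(section):
        return False
    return parser.getboolean(section, "allow_user_impersonation", fallback=False)


if __name__ == "__main__":
    sample = "[app:main]\nallow_user_impersonation = True\n"
    if impersonation_enabled(sample):
        print("Impersonation is enabled: patch, disable, or brief your admins.")
    else:
        print("Impersonation is disabled: this instance is not affected.")
```

To check a real instance, read the contents of the instance's configuration file and pass the text to the function; the file's name and location depend on the installation.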
- MMuFLR: Missense Mutation and Frameshift Location Reporter Workflows.
- ensembl_vep: Annotation tool using Ensembl Variant Effect Predictor for GVL
- eqtl_tools: eQTL Analysis using PANAMA
- fasta_extract: Extract a single sequence or set of sequences from a multifasta file
- fasta_stats: Produce basic statistics on a fasta/multifasta file
- repeat_masker: RepeatMasker
- interproscan: Interproscan functional predictions of ORFs
- fastq_validator: Validate your FastQ Files.
- peptide_to_gff: Outputs GFF3 with the peptide position in a reference genome
- cmpfastq: Compare two fastq files
- samifier: integrate genomic & transcriptomic NGS data with proteomic MS data.
- somatic_sniper: identify single nucleotide positions that differ between tumor and normal
- Tin-Lap Lee's slides from his Bio-IT 13 talk: Next-Gen Sequencing Analysis by GigaGalaxy
- Now in MyExperiment: Detrprok_wf: Detect 3 types of ncRNA. Uses s_mart & detrprok_scripts from toolshed
- Building a CloudMan instance from scratch
- Tutorial: uploading NGS data to Amazon S3 for use in Galaxy
- Video: ChIP-seq analysis using Galaxy by Ian Donaldson of U Manchester
- At the University of Florida? UF Health is offering 2 Galaxy Workshops, Basics on June 26 and NGS on July 2
- pycrac: User-friendly Python tools for the analyses of CLIP/CRAC datasets
- In Sweden? The Swedish National Infrastructure for Computing (SNIC) is now supporting Galaxy (see last page).
- Follow Saket Choudhary's Google Summer of Code Galaxy project work on Blogspot