- Visualising Proteomics Data in Galaxy
- Building a scalable Galaxy cluster for biomedical research in The Netherlands
- Mississippi: a galaxy server centered on small RNA analysis
- A Galaxy-Based framework for online streaming data analytics in Heart Rate Variability Analysis
- Ebiogenouest régional initiative : a use case for the structuration of the biologists community
- Intergalactic travel: Sending usegalaxy.org through the wormhole
- Plan for Galaxy based Chip-exo Analysis platform
- BeeGFS: Accelerating the access to BLAST and Galaxy Indices
- Less talking, more doing: Crowd-sourcing the integration of Galaxy with a high-performance computing cluster
- Running and maintaining a reliable production Galaxy server
- Private BLAST: Using Galaxy
- Galaxy: Farm to Federation
- Galaxy Docker Containers: Docker, Docker, Docker
- Galaxy Search API
If you wish to give a lightning talk, please send a title and short abstract to [GCC2014 Scientific Committee](mailto:gcc2014-sci AT groups DOT galaxyproject DOT org), either
- any time before the start of Session 2 (to be considered for Tuesday or Wednesday slots), or
- before Session 6 (to be considered for Wednesday only).
The slides for all lightning talks will be made available on this page, and the talks may be videotaped and posted here as well.
Accepted Talks, Session 4, Tuesday, July 1
These talks have been accepted for the first lightning talks storm on Tuesday.
Visualising Proteomics Data in Galaxy
1 La Trobe University
Building a scalable Galaxy cluster for biomedical research in The Netherlands
David van Enckevort1, Anthony Potappel2, Niek Bosch3, Jeroen Beliën4, Rita Azevedo5, Rob Hooft5, Sander Ruiter2, Sanne Abeln6, Irene Nooren3, Jan-Willem Boiten7
1 University Medical Center Groningen, University of Groningen, Groningen, The Netherlands
2 Vancis, Amsterdam
3 SURFsara, Amsterdam, The Netherlands
4 VU university medical center, Amsterdam, The Netherlands
5 Netherlands eScience Center, Amsterdam, The Netherlands
6 VU university, Amsterdam, The Netherlands
7 Center for Translational Molecular Medicine, Eindhoven, The Netherlands
For the national translational IT project CTMM/TraIT, Galaxy has been selected as one of the tools in the experimental domain. The TraIT partners (among others NBIC and SURFsara) have developed a vision of how to make Galaxy available to the research community in The Netherlands. The scalable Galaxy cluster on the SURFsara HPC Cloud will be transferred to Vancis to provide a sustainable production-level Galaxy cluster. In the design of this environment, Vancis has made use of the knowledge and experience of NBIC and SURFsara in hosting the public NBIC instance on the SURFsara HPC Cloud.
To assess the minimal requirements for the infrastructure, we used metrics collected while running the NBIC Galaxy on the HPC Cloud. Next, we drafted a set of use cases the infrastructure should be able to fulfil, such as the ability to run Omics pipelines and the ability to scale to handle peak demand. We identified I/O performance as a major bottleneck, since many Galaxy tools are I/O intensive while Galaxy has a shared data design. Memory was also recognized as a critical factor, since typical datasets are on the order of tens of gigabytes. We also built upon the experience of SURFsara in operating the HPC Cloud and other HPC systems. To accommodate a full set of development, testing, acceptance and production environments, as well as private installations, the infrastructure should support multiple Galaxy clusters. The chosen architecture will use a Linux High Availability environment with OpenStack, which will run on two large-size blades. Storage is split into multiple tiers with different characteristics to support both high I/O workloads and reliable large-scale storage. The chosen setup is horizontally scalable in a cost-efficient manner.
From May to September 2014 we will pilot the new architecture within the TraIT project. For this pilot we have selected a few TraIT NGS tools and pipelines to stress test the system under different workload scenarios. Furthermore we have established a process to ensure the quality of the tools required for a stable production environment.
Mississippi: a galaxy server centered on small RNA analysis
Marius van den Beek1, Christophe Antoniewski1
1Drosophile.org, CNRS and University Pierre-et-Marie-Curie, Paris
Non-coding small RNAs (miRNA, siRNA, piRNA, …) are involved in the regulation of genes and transposable elements as well as in the defense against viral infections. Their discovery and functional characterization rely heavily on high-throughput RNA sequencing. The ~20-30 nt length of small RNAs raises specific challenges for meaningful read mapping and analysis, so standard RNAseq analysis methods have to be adapted. We provide an integrated set of Galaxy tools that should streamline the most frequent small RNA analysis needs. This includes a modified bowtie wrapper and workflows that allow users to quickly and reproducibly interrogate various aspects of small RNA biology. We provide tools for the discovery and differential expression analysis of miRNAs and a way to visualize miRNA precursors genome-wide that complements Trackster. Furthermore, we provide tools to detect the “ping-pong” biogenesis signature of piRNAs, to detect piRNA-producing loci in the genome, and to study and visualize the impact of piRNAs and siRNAs on transposable elements.
A Galaxy-Based framework for online streaming data analytics in Heart Rate Variability Analysis
C Zarbo1, A Bizzego1,2,3, M Mina1, G Esposito2,4, C Furlanello1
1 Predictive Models for Biomedicine & Environment - Fondazione Bruno Kessler, Trento, Italy
2 University of Trento, Italy
3 SKIL Telecom Italia, Trento, Italy
4 RIKEN BSI, Wako-Shi, Japan
The emerging applications in physiological data processing, encouraged by the availability of wearable sensors for continuous self-monitoring and quantified self, require new platforms for time series analysis supporting real-time processing and fast prototyping capabilities. We recently proposed Physiolyze, a Galaxy-based web framework to support complex workflows for Heart Rate Variability (HRV) analysis. Here we extend Physiolyze by introducing scalable online processing capabilities.
The enhanced version still relies on Galaxy as the core platform to design and manage the pipelines. To analyze the streams incrementally, a set of Python routines based on the BioBlend library works as middleware that triggers the pipelines as new data become available. A web interface based on the Django Python framework allows the user to control the execution of the pipelines, running them on new data streams.
We tested our system on the task of predicting infant behavioral state from HRV patterns. We simulated a real-time scenario of 100 asynchronous data streams from data for 24 infants previously collected with a Light WP Holter ECG recorder (GE Healthcare). The system incrementally extracts 37 HRV indicators from each data stream and predicts the infant state (e.g. wake, sleep, cry) with a Random Forest regression model. The pipeline is modular and fully managed as a Galaxy workflow.
Our system can easily be adapted to other online streaming analytics applications, such as the parallelized analysis of multiple data streams acquired from physiological sensors and wearable devices.
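The middleware idea in this abstract (Python routines that trigger Galaxy pipelines as new data become available) can be illustrated with a minimal sketch. This is not the Physiolyze code: `StreamDispatcher`, `make_galaxy_trigger`, and all parameter names are hypothetical, and the assumption that the workflow takes a single input dataset at step index "0" is ours. The BioBlend calls used (`GalaxyInstance`, `tools.paste_content`, `workflows.invoke_workflow`) exist in the library, but how the original system uses them is not stated in the abstract.

```python
from typing import Callable, Dict, List


class StreamDispatcher:
    """Remember how much of each stream was already processed and
    call a trigger only for the samples that arrived since the last poll."""

    def __init__(self, trigger: Callable[[str, List[float]], None]) -> None:
        self._trigger = trigger
        self._offsets: Dict[str, int] = {}  # stream id -> samples already handled

    def poll(self, streams: Dict[str, List[float]]) -> int:
        """Check every stream once; return how many streams had new data."""
        triggered = 0
        for stream_id, samples in streams.items():
            start = self._offsets.get(stream_id, 0)
            if len(samples) > start:
                self._trigger(stream_id, samples[start:])
                self._offsets[stream_id] = len(samples)
                triggered += 1
        return triggered


def make_galaxy_trigger(url: str, api_key: str, workflow_id: str,
                        history_id: str, input_step: str = "0"):
    """Hypothetical glue: upload the new chunk, then invoke a Galaxy workflow.

    Assumes a workflow with one dataset input at step index `input_step`.
    """
    from bioblend.galaxy import GalaxyInstance  # imported lazily; needs bioblend

    gi = GalaxyInstance(url=url, key=api_key)

    def trigger(stream_id: str, new_samples: List[float]) -> None:
        content = "\n".join(str(s) for s in new_samples)
        upload = gi.tools.paste_content(
            content, history_id, file_name=f"{stream_id}.txt")
        dataset_id = upload["outputs"][0]["id"]
        gi.workflows.invoke_workflow(
            workflow_id,
            inputs={input_step: {"id": dataset_id, "src": "hda"}},
            history_id=history_id,
        )

    return trigger
```

Keeping the dispatcher independent of Galaxy means the incremental logic can be exercised with any callable, while `make_galaxy_trigger` would be passed in only when a Galaxy server, API key, workflow, and history are available.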
Ebiogenouest régional initiative : a use case for the structuration of the biologists community
Yvan Le Bras1 and Olivier Collin1
1 CNRS UMR 6074 IRISA-INRIA, Rennes, France
Two years after the beginning of a western France e-Science project, we propose here to highlight some results and show prospects.
Intergalactic travel: Sending usegalaxy.org through the wormhole
Nate Coraor1, Dannon Baker2 and John Chilton1
1 Galaxy Team, Penn State University, University Park, Pennsylvania
2 Galaxy Team, Johns Hopkins University, Baltimore, Maryland
Due to resource constraints, the main public Galaxy server run by the Galaxy Team, usegalaxy.org, moved from Penn State to the Texas Advanced Computing Center, with backups at the Pittsburgh Supercomputing Center. In addition to these resources, Galaxy has been awarded an XSEDE grant of over 400,000 SUs, which we will use to further extend usegalaxy.org's computing capacity.
This talk provides an overview of the work that was done to move the site, what challenges we faced, and some of the work that is going on right now and in the near future.
Accepted Talks, Session 8, Wednesday, July 2
These talks have been accepted for the second lightning talks storm on Wednesday.
Plan for Galaxy based Chip-exo Analysis platform
1 Center for Eukaryotic Gene Regulation, The Pennsylvania State University
BeeGFS: Accelerating the access to BLAST and Galaxy Indices
Franz-Josef Pfreundt1, Björn Grüning2
1 Fraunhofer ITWM
2 Bioinformatics Uni Freiburg
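Less talking, more doing: Crowd-sourcing the integration of Galaxy with a high-performance computing cluster
1 Michigan State University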
Running and maintaining a reliable production Galaxy server
1 New Zealand Genomics Ltd
Private BLAST: Using Galaxy
Presented by Gilda Le Corguillé1
Galaxy: Farm to Federation
Kyle Ellrott1, Dannon Baker2
1 UC Santa Cruz
2 Johns Hopkins University
Galaxy Docker Containers: Docker, Docker, Docker
1 Bioinformatics Uni Freiburg
This talk was entirely a live demo.
Galaxy Search API
1 UC Santa Cruz
Lightning talks are your opportunity to give an impassioned and enthralling talk about something that you care about - but you only have 300 seconds. Make every one count, because your audience may include people suffering from limited attention spans this late in the proceedings.
- Lightning talks are 5 minutes followed by 2 minutes for questions.
- At 5 minutes in, thunder will be played.
- At 6 minutes in we will take over the presentation laptop and start switching to the next set of slides.
- At 7 minutes the next talk will start, no matter what.
- Your slides (as PDF or PowerPoint) should be on the presentation computer before the session starts (talk to Dave Clements) to minimize the risk of BYOD.
- You can BYOD (your own computer or whatever) but you are advised not to.
- If you do BYOD, we will start swapping out your device at 2 minutes left, rather than 1.
- Connection and fiddling time beyond the first minute comes out of your 5 minutes and is painful, for everyone.
From Ross Lazarus, the (possibly former) Benevolent Lightning Session Moderator for Life
- Good lightning talks are well rehearsed and very, very focussed.
- Plan on talking to 5 or 6 slides.
- Don't try to cram a 30 minute talk into 5 minutes. It won't fit.
- 5 minutes is not long enough to explain anything in detail. Just give the big picture.
- Practice your talk at least 3 times to make sure it works and fits in 5 minutes.
- If you have more than 5 or 6 slides, you are probably screwed before you start and stand a high risk of being cut off in mid-flight unless you have rehearsed a few times with a timer to be sure you can fit everything in.
- You are advised not to read your acknowledgements out loud. It's a lightning talk for heaven's sake.
Submit a Proposal
Submission is closed.
Proposals will be solicited during the meeting. If you wish to give a lightning talk, please send your proposal to email@example.com before the start of Session 2 (to present on Tuesday) or the start of Session 6 (to present on Wednesday). The slides for all lightning talks will be made available on the conference web site, and the talks may be videotaped and also posted there.
A proposal consists of a title and a short description of the topic.