Galaxy End-Users

Thanks to all who came out and contributed to the "end user" BoF discussion this evening. It was a fruitful discussion that we begrudgingly ended early to allow the Charles Commons staff to clean and set up the room for tomorrow. Let's continue this discussion tomorrow during lunch in room 304 (7/1/2014 @ 12:15 PM). We can call it the "end user BoF 2.0"!

This page describes the Galaxy End-Users Birds of a Feather meetup being held at GCC2014.

When: Monday, June 30, 6:15pm and Tuesday, July 1, 12:15pm

Where: Multipurpose Room (Monday) and East Room 304 (Tuesday)

Contact: [Mo Heydarian](mailto:mheydar1 AT jhmi DOT edu)

Description

This Birds-of-a-Feather session will serve as a forum for end-users of the Galaxy environment to share experiences and lessons learned, as well as address and discuss issues that hinder progress from the end-user perspective.

Audience

End-users of Galaxy who would like to share experiences (or listen to those of others) and developers interested in the perspective of the end-user should attend this BoF.

When and Where

When | Where
Monday, June 30, 6:15pm | Multipurpose Room
Tuesday, July 1, 12:15pm | East Room 304

Summary

User experiences

  • End users want to interface with a bioinformatician who can explain tools/workflows in biologist-friendly terms.
  • When jobs don't run instantly, the user panics and restarts the job (and possibly several iterations of that job). This can place a heavy burden on the server being used. Telling the user that the job will eventually run might mitigate the panic.
  • Some end users do their heavy computational work in Galaxy and their visualizations in other programs, because of the inability to customize charts/plots (as is possible with R or even Excel), the inability to capture high-resolution visualizations (Trackster only allows screenshots), etc.

Getting feedback from users

  • Users are more inclined to provide feedback when a service contract is in place.
  • Users are hesitant to report issues because we are often not sure whether the issue is a user error or a problem in Galaxy. We “bang our heads against a wall” trying to figure out the issue, because we don't want to look incompetent and don't want to blame a tool (or developer) for a user error.
  • The core Galaxy team is working towards an “opt-in” means of collecting usage information (tools used, job time, compute power used). This information would go to the system administrators.
  • Ravi Madduri has 10–15 stories from end users that are nice examples of users running into issues and how these issues were collaboratively resolved with system administrator/developer guidance. These types of stories would be nice to host on the Galaxy wiki.
  • J. Kissinger's group has conducted a systematic analysis of how end users interact with data (contact jkissing@uga.edu for more info).
  • Suggestions to promote feedback from users:
    • Standardized feedback forms that take minimal time to complete, similar to a bug report that would go to the system administrator.
    • Showing examples of correct input/output for tools could help users determine whether the issue is a user issue or a Galaxy issue.

Data management

  • "Naming files is the worst problem since labeling Eppendorf tubes" … Using batch submission mitigates this by naming files based on the input file name.
  • The core Galaxy team is working on a solution for naming/managing data files.
  • The University of Georgia has a mandatory one-credit course on “data management” for all undergraduate students.
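
The batch-naming idea in the first bullet above can be sketched outside of Galaxy as follows. This is a hypothetical illustration in Python, not Galaxy's actual naming logic: the helper name `derive_output_name`, the `step` label, and the example file names are all assumptions made for the sketch.

```python
from pathlib import Path


def derive_output_name(input_name: str, step: str, extension: str) -> str:
    """Build an output file name that carries the input file's base name.

    Hypothetical helper for illustration only; not part of Galaxy.
    """
    # Path.stem drops only the final suffix, e.g. "sample_A.fastq" -> "sample_A".
    base = Path(input_name).stem
    return f"{base}.{step}.{extension}"


# Name the outputs of a batch after the inputs themselves, so each result
# can be traced back to its source file without manual labeling.
inputs = ["sample_A.fastq", "sample_B.fastq"]
outputs = [derive_output_name(name, "aligned", "bam") for name in inputs]
print(outputs)
```

Keeping the input's base name in every derived file is what removes the manual labeling step; the step label and extension only record what was done to the file.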

Providing workflows/pipelines to the end user

  • To allow the end user to focus on the science (their primary goal), Globus Genomics works to make “turnkey” solutions for biologists. This was in part inspired by Karen Reddy’s talk at GCC2012 (“why all the keys?!”). To this end, there are ~20 standard workflows automatically loaded on a Globus instance.
  • With the cost of computation and storage rapidly decreasing, it is possible to keep old versions of tools (images) for years. Sometimes it is cheaper and less time-consuming to re-analyze data with current tool versions than to debug and optimize old tools/workflows.
  • Running NGS tools becomes trivial at some point; pipelines are needed for the back-end analysis (on which the end user spends most of their time).
  • UGA doesn't spend time on any tool unless there is demand from the user community.

Using Galaxy with CloudMan on Amazon Web Services

  • Scalable, semi-temporary use
  • Can implement tools from Tool Shed (more below)

Tool Shed

  • Some users have issues implementing tools from the Tool Shed.
  • There was a consensus that tools written by core Galaxy team members can be trusted to work “out of the box”.
  • There is a commission to vet tools (the Intergalactic Utilities Commission, IUC), but it is very time-consuming to vet the thousands of tools in the Tool Shed. Rating tools could speed up the vetting process, whether by the IUC or by the community.
  • Indicating which other tools can use the output of a particular tool could be informative to end users. Keywords would be informative too.

Participants

  • Aaron Gardner
  • Amy Hsu
  • Anushka Brownley
  • Ben Busby
  • Brian Whalen
  • Carl Eberhard
  • David Hoover
  • Dawei Lin
  • Edwin Smith
  • Efe Sezgin
  • Frederick Tan
  • Jason Gray
  • Jason Park
  • Jessie Kissinger
  • Jim Johnson
  • Jonas Paulsen
  • Karen Reddy
  • Malcolm Cook
  • Maria Doyle
  • Marie Jacques Seignon
  • Mark Rose
  • Martin Cech
  • Michael R. Crusoe
  • Mo Heydarian
  • Morten Johansen
  • Oleksandr Moskalenko
  • Peter Cock
  • Peter van Heusden
  • Philipp Ross
  • Pratik Jagtap
  • Ravi Alla
  • Ravi Madduri
  • Ravi Sanka
  • Richard Everson
  • Sukhinder Sandhu

Questions?

Send them to [Mo Heydarian](mailto:mheydar1 AT jhmi DOT edu).