Bowtie and Lastz Migration to Tool Shed
Migration scripts for Bowtie and Lastz will run the first time Galaxy is launched after updating to this release, and will automatically install replacement tool wrappers from the Tool Shed. The primary executables for Bowtie and Lastz, plus target reference genomes, should still be installed as described in the Galaxy wiki; start in the Tool Dependencies section.
Bowtie is an ultrafast, memory-efficient short read aligner. It aligns short DNA sequences (reads) to the human genome at a rate of over 25 million 35-bp reads per hour. Bowtie indexes the genome with a Burrows-Wheeler index to keep its memory footprint small: typically about 2.2 GB for the human genome (2.9 GB for paired-end).
Langmead B, Trapnell C, Pop M, Salzberg SL. Ultrafast and memory-efficient alignment of short DNA sequences to the human genome. Genome Biol 10:R25.
LASTZ is a pairwise aligner for DNA sequences. Originally designed to handle sequences the size of human chromosomes and from different species, it is also useful for sequences produced by NGS technologies such as Roche 454.
Harris, R.S. (2007) Improved pairwise alignment of genomic DNA. Ph.D. Thesis, The Pennsylvania State University.
## New Galaxy CloudMan Release
CloudMan offers an easy way to get a personal and completely functional instance of Galaxy in the cloud in just a few minutes, without any manual configuration.
This release brings a large number of updates and new features, the most prominent being:
- Support for Eucalyptus cloud middleware; thanks to Alex Richter. CloudMan can now also run on the HPcloud in basic mode (note that no public image is currently available on the HPcloud, so you would need to build one yourself).
- Added a new file system management interface on the CloudMan Admin page, allowing control and providing insight into each available file system
- Added quite a few new user data options. See the UserData page for details; thanks to John Chilton.
- Galaxy can now be run in multi-process mode; thanks to John Chilton.
- Added Galaxy Reports app as a CloudMan service; thanks to John Chilton.
- Introduced a new format for cluster configuration persistence, allowing more flexibility in how services are maintained
- Added a new file system service for an instance's transient storage, allowing it to be used across the cluster over NFS. The file system is available at `/mnt/transient_nfs`; just know that any data stored there will not be preserved after a cluster is terminated.
- Support for Ubuntu 12.10
- Worker instances are now also SGE submit hosts
This update comes as a result of 175 code changesets; for a complete list of changes, see the commit messages.
Any new cluster will automatically start using this version of CloudMan. Existing clusters will be given an option to do an automatic update once the main interface page is refreshed.
# Tool Shed
## Improvements in the display of repository dependencies and contents in the tool shed
The various types of contents of a tool shed repository (valid tools, invalid tools, datatypes, workflows), as well as the dependencies that are defined for the repository, are now displayed in clickable containers that can be opened or closed. For example, here is the view of the emboss_5 repository that I'm hosting on my local Galaxy tool shed.
Notice the "Repository dependencies" container? This is currently in development, and will be available in the tool shed shortly. This container displays the list of all repositories in the tool shed upon which this repository depends.
Opening each of the above containers (by clicking on the links) displays the contents of each.
## Functional test framework for the tool shed
## Miscellaneous tool shed enhancements and fixes
- You can now configure the directory location for the tool shed's `hgweb.config` file using a setting in your `community_wsgi.ini` file. Configuring this location is highly recommended, but if you choose not to, a new `hgweb.config` file will automatically be created in the default location (the Galaxy root directory). Backups of the `hgweb.config` file are made (in the same directory in which it is located) any time a new repository is added to your tool shed, so configuring it to be located in its own directory has benefits. You can also change the configured location over time: simply move the `hgweb.config` file to the new location before starting your tool shed server, and everything should work as expected.
- #2 Implement a new `HgWebConfigManager` to manage the tool shed's `hgweb.config` file. This will greatly diminish file I/O for the tool shed.
- #3 When defining dependencies for tools contained in a repository, allow for environment variables that contain neither `INSTALL_DIR`; thanks to James Johnson. Allowing these values to be set in a single location rather than hard-coded into each config file is the best approach. Here's an example:
- #5 Don't allow reviewing empty repositories in the tool shed.
- #6 Provide a warning message when files are uploaded to a tool shed repository and a `tool_dependencies.xml` file has been provided, but `tool_dependencies` metadata has not been generated.
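To illustrate why routing all access to `hgweb.config` through a single manager object can reduce file I/O, here is a minimal sketch. It is not the tool shed's `HgWebConfigManager`: the class name `CachedHgWebConfig`, its methods, and the example entries are hypothetical. It assumes only that `hgweb.config` is an INI-style file with a `[paths]` section, as in Mercurial's hgweb configuration.

```python
import configparser
import os
import tempfile


class CachedHgWebConfig:
    """Hypothetical stand-in for a manager of the tool shed's hgweb.config."""

    def __init__(self, config_path):
        self.config_path = config_path
        self._parser = None   # parsed contents, cached after the first read
        self.disk_reads = 0   # how many times the file was actually parsed

    def _load(self):
        # Parse the file on first use only; later calls reuse the cache.
        if self._parser is None:
            parser = configparser.ConfigParser(interpolation=None)
            parser.optionxform = str  # preserve case in repository paths
            parser.read(self.config_path)
            if not parser.has_section("paths"):
                parser.add_section("paths")
            self._parser = parser
            self.disk_reads += 1
        return self._parser

    def get_entry(self, virtual_path):
        """Return the repository location for a virtual path, or None."""
        return self._load().get("paths", virtual_path, fallback=None)

    def add_entry(self, virtual_path, repo_dir):
        """Add an entry and write the file once; the cache stays current."""
        parser = self._load()
        parser.set("paths", virtual_path, repo_dir)
        with open(self.config_path, "w") as handle:
            parser.write(handle)


# Usage with a throwaway file and illustrative entries.
config_path = os.path.join(tempfile.mkdtemp(), "hgweb.config")
with open(config_path, "w") as handle:
    handle.write("[paths]\nrepos/test/emboss_5 = database/community_files/000/repo_1\n")

manager = CachedHgWebConfig(config_path)
manager.add_entry("repos/test/bowtie", "database/community_files/000/repo_2")
print(manager.get_entry("repos/test/emboss_5"))
print(manager.get_entry("repos/test/bowtie"))
print(manager.disk_reads)
```

In this sketch every lookup after the first is served from memory, and the file is rewritten only when an entry is added. A real manager would also need to handle concurrent writers and external edits to the file, which are omitted here.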
# User Interface (UI)
- Introduction of the dataset "Paused" state and basic "Resume-Paused" functionality for a history.
- Adjustments and fixes to history panel layout.
- Added back in "display" and "edit" attribute buttons to datasets in the error state.
- Scatterplot visualization tool: updated layout of features.
- Updated History pull-down menu. Options affect all datasets in the current history:
  - Resume Paused Jobs - a single-click resume of all paused datasets
  - Collapse Expanded Datasets - a single-click to collapse all expanded datasets
  - Show/Hide Deleted Datasets - a single-click toggle to show or hide all deleted datasets
  - Show/Hide Hidden Datasets - a single-click toggle to show or hide all hidden datasets
  - Unhide Hidden Datasets - a single-click to change state of hidden datasets to that of regular datasets
# Job Runner
- The query for determining which jobs are ready to run has been significantly optimized. Heavily loaded multiprocess Galaxy installations should see increased performance in job dispatch and finish times.
- Jobs and their outputs are no longer set to an error state when their inputs fail to complete successfully. Instead, they are moved to a "paused" state. In the distribution release following this, it will be possible to rerun the failed jobs and continue paused jobs from the point of failure.
- The `SGE` runner has been deprecated for a long time and has finally been completely removed. The `DRMAA` runner should be used instead.
- The `check_galaxy` Nagios script has been updated to be compatible with the new client-side histories.
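The change in job handling described in this list, where jobs whose inputs fail are paused rather than set to an error state, can be pictured with a small model. This is a simplified sketch and not Galaxy's job-handling code: the `State` enum, the `next_state` function, and the three-step pipeline are all hypothetical, and it assumes that a paused input also pauses the jobs that depend on it.

```python
from enum import Enum


class State(Enum):
    NEW = "new"        # waiting on inputs
    QUEUED = "queued"  # inputs are ready, job can be dispatched
    OK = "ok"
    ERROR = "error"
    PAUSED = "paused"


def next_state(input_states):
    """Decide what happens to a waiting job, given its input states."""
    if any(s in (State.ERROR, State.PAUSED) for s in input_states):
        # Before this release the dependent job would have been set to ERROR.
        return State.PAUSED
    if all(s is State.OK for s in input_states):
        return State.QUEUED
    return State.NEW  # some inputs are still pending


# Hypothetical pipeline: each job lists the jobs whose outputs it consumes.
pipeline = {"align": [], "filter": ["align"], "plot": ["filter"]}

# Suppose the first step failed.
states = {"align": State.ERROR}
for job in ("filter", "plot"):
    states[job] = next_state([states[dep] for dep in pipeline[job]])

for job, state in states.items():
    print(job, state.value)
```

Because the downstream jobs are paused instead of failed, they keep enough state to be resumed once the failed step has been rerun successfully.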
# Miscellaneous Galaxy fixes and enhancements
- Add the ability to view the current data tables registry. This new feature is available from the Galaxy Administration menu within the "Server" section, and is labeled "View data tables registry".
- Since tool migration scripts can be executed any number of times, make sure that no repositories are installed if no tools associated with the migration are defined in the `tool_conf.xml` (or equivalent) file. This fix applies only to the recently introduced Galaxy administration UI feature that displays the list of migration stages currently available in the local Galaxy instance. The migration process at Galaxy server startup has always worked this way, so no changes were needed in that scenario.
- Maintain entries for Galaxy's `ToolDataTableManager` that are acquired from installed tool shed repositories in a new config file named `shed_tool_data_table_conf.xml`. This will ensure that manual edits to the original `tool_data_table_conf.xml` file (which has existed for some time) will not be altered or lost when Galaxy's tool shed repository installation process automatically adds new entries.
- Fix for `ToolDataTable`: new entries that should have been persisted to the `shed_tool_data_table_conf.xml` file were not being handled correctly.
- Attempt to make sure `.sample` files included in an installed tool shed repository are copied to the `~/tool-data` directory only if they are sample data index files.
- Add error messages for a `DataToolParameter` when the provided value is no longer valid due to being deleted or being in an error state.
- Rework "Re-run" functionality to validate and display errors between the original job and currently set states (e.g. the previously used dataset has been deleted).
- To help with reproducibility, when extracting a workflow from a history, provide a warning message if the tool version for a job does not match the tool version of the currently loaded tool.
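The migration-script check described earlier in this list (install nothing when `tool_conf.xml` defines none of the migrated tools) can be sketched as below. This is an illustration under stated assumptions, not the code Galaxy runs: the function name `migrated_tools_defined`, the set of migrated file names, and the sample `tool_conf.xml` contents are hypothetical, and tools are assumed to be declared as `<tool file="..."/>` elements.

```python
import xml.etree.ElementTree as ET

# Illustrative tool_conf.xml contents; a real file will differ.
TOOL_CONF = """<toolbox>
  <section id="ngs_mapping" name="NGS: Mapping">
    <tool file="sr_mapping/bowtie_wrapper.xml" />
    <tool file="sr_mapping/bwa_wrapper.xml" />
  </section>
</toolbox>
"""

# Assumed names of the tool config files covered by the migration.
MIGRATED_TOOL_FILES = {"bowtie_wrapper.xml", "lastz_wrapper.xml"}


def migrated_tools_defined(tool_conf_xml, migrated_files):
    """Return the migrated tool files that the given tool_conf.xml defines."""
    root = ET.fromstring(tool_conf_xml)
    defined = set()
    for tool in root.iter("tool"):
        # Compare on the file name only, ignoring the directory part.
        file_name = tool.get("file", "").rsplit("/", 1)[-1]
        if file_name in migrated_files:
            defined.add(file_name)
    return defined


defined = migrated_tools_defined(TOOL_CONF, MIGRATED_TOOL_FILES)
if defined:
    print("Install replacement repositories for:", sorted(defined))
else:
    print("No migrated tools defined; nothing to install.")
```

The check only reads the configuration, so running it repeatedly gives the same answer, which is the property a re-runnable migration script needs.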
# Security Fixes
All Galaxy instance maintainers are strongly encouraged to run the latest release.
- Grid filters are now sanitized correctly.
# Bug Fixes
- Ensure that slugs cannot be duplicated for active, importable items.
- Fix paging in embedded grids.
- When getting job parameters for extracting a workflow from a history, set `ignore_errors` to `True`. This prevents a traceback when, for example, a tool was updated and had a text value changed to an integer.
- Fix for rendering workflow tooltips when tool help is nonexistent in the wrapper.
- Training Day Topic Nominations for GCC2013 will open in December. Start thinking of ideas now!
- Slides and Screencast from November GalaxyAdmins Meetup are online. The next GalaxyAdmins Meetup will be on January 16 and feature John Chilton discussing "Deploying Galaxy on OpenStack with CloudBioLinux & CloudMan"
- A short "Getting started with JGalaxy" document (with screenshots), by John Chilton
- Batch Workflow starting using the Galaxy API : Practical Example by Geert Vandeweyer
# About Galaxy