Hello, Galaxy Community!
Greetings, Galaxy enthusiasts! We're thrilled to bring you the latest edition of the Galaxy Newsletter, packed with exciting updates and insights from the Galaxy community. In this issue, we'll delve into the highlights of the upcoming GCC2024, explore the Galaxy Mentoring Network, and unveil the newest features of the Galaxy 23.2 Release.
Our Special Interest Group Spotlight shines on Single Cell, showcasing the collaborative efforts and innovative tools driving advancements in this field. We'll also celebrate Galaxy's impact in three scientific papers in our Galaxy Success Stories segment.
In other news, we bid adieu to Twitter as Galaxy departs the platform, and we'll recap some major events and engagements that have kept Galaxy at the forefront of bioinformatics. Plus, we'll give you a sneak peek at upcoming events that you won't want to miss. And don't forget to check out our new Hub landing design.
Join us as we journey through the latest news and developments in the Galaxy universe!
We're thrilled to announce that plans for the 2024 Galaxy Community Conference (GCC2024) are well underway, and we can't wait to welcome you in Brno, Czech Republic, from June 24th to 29th, 2024.
GCC2024 promises to be an exceptional gathering of researchers, developers, practitioners, and enthusiasts from around the globe, all united by our passion for Galaxy and the advancement of open science. Whether you're a seasoned Galaxy user or just getting started, GCC2024 offers something for everyone.
General registration is now open for GCC2024. Join us for this exciting event to connect with fellow community members and contribute to the vibrant Galaxy community. Register now to reserve your place and join us for what promises to be an unforgettable event.
A friendly reminder that the deadline for submitting abstracts for talks is April 15, 2024. Share your research, insights, and experiences with the community by presenting a talk at GCC2024; submit your abstract here!
Stay in the loop with all the latest news and announcements about GCC2024 by joining our dedicated mailing list. Be the first to know about keynote speakers, workshop and training schedules, social events, and more. Don't miss out on important updates – sign up today and ensure you're well-prepared for an enriching experience at GCC2024!
We look forward to welcoming you to Brno for GCC2024 and celebrating the vibrant Galaxy Community together!
Are you passionate about open science and eager to connect with like-minded individuals? The Galaxy Mentoring Network (GMN) is calling for applications from both mentors and mentees interested in joining our vibrant community.
The GMN facilitates connections between experienced members of the global Galaxy community and those seeking guidance in STEM fields. Our mission is to support the integration of new members, whether they're users, developers, students, or professionals, by fostering meaningful dialogue, sharing experiences, and setting achievable goals over a two-month mentorship period.
Who Should Apply:
Benefits of Participation:
Visit our website to apply for the GMN. Applications are open to all backgrounds and experience levels. Please note that we review applications every three months, and we appreciate your patience as we match mentors and mentees effectively.
Contact Information: For inquiries or more information, please email galaxy.mentorship@gmail.com.
Join us in shaping the future of Galaxy mentorship and fostering collaboration within the scientific community. Together, we can make a difference in the lives of aspiring open science enthusiasts!
We're excited to announce the release of Galaxy 23.2, packed with new features and enhancements designed to streamline your Galaxy experience. Here's a glimpse into what Galaxy 23.2 has to offer:
Upgrade to Galaxy 23.2 today and experience these exciting new features firsthand. For more information and detailed instructions, check out our user release notes.
The Galaxy Single Cell Community has had a stellar year, advancing tools and workflows while fostering global collaboration. Expanding beyond scRNA-seq, the community now offers tools for single-cell ATAC-seq and CITE-Seq analysis. Updates include a revamped UI for RNA STARSolo, new tools for scATAC-seq data preprocessing, and enhanced functionalities for Seurat. Community building efforts have led to the unification of single-cell subdomains and the launch of the Single Cell Community of Practice hub. Collaboration has also extended internationally, with projects spanning three continents. Training initiatives have been revamped, including the introduction of the Case Study Reloaded tutorial for parallel trajectory analysis. The community remains committed to efficiency, consolidating instances and preventing duplication to streamline workflows.
Join the Galaxy Single Cell Community for another exciting year as they continue to push boundaries and explore the cosmos of single-cell analysis! For more information on the Single Cell Special Interest Group, please contact Wendi Bacon.
Scalable, accessible and reproducible reference genome assembly and evaluation in Galaxy (Larivière et al., Nature Biotechnology, 2024)
The Earth BioGenome Project aims to sequence genomes of all ~1.8 million eukaryotic species over the next decade, requiring a significant increase in genome production pace. To achieve this, researchers combined the expertise of the Vertebrate Genomes Project (VGP) and the European Reference Genome Atlas (ERGA) to develop an automated assembly pipeline within the Galaxy ecosystem. This pipeline uses PacBio HiFi reads and long-distance information from Hi-C maps and/or optical maps to generate nearly complete assemblies, with extensive quality control functions. Validation using vertebrate datasets has shown improvements in haplotype resolution, particularly for large, repeat-rich genomes. Future work will focus on improving efficiency, incorporating ultra-long-read data, and automating the curation process. The workflows and instructions are available on the Galaxy website, and new genome assemblies are being submitted to the NCBI under the Vertebrate Genome Project BioProject.
Anthropogenic contamination sources drive differences in antimicrobial-resistant Escherichia coli in three urban lakes (Wight et al., Applied and Environmental Microbiology, 2024)
Antimicrobial resistance (AMR) is a growing concern, even in environmental reservoirs, highlighting the need for thorough monitoring. A study isolated antimicrobial-resistant Escherichia coli from urban waterbodies over 15 months, examining their susceptibilities, population structures, and genetic resistance determinants. Despite close proximity, E. coli populations in each site showed distinct antimicrobial resistance patterns, with many belonging to globally concerning sequence types. Widespread resistance was observed to key antimicrobials like amoxicillin, cefotaxime, and ciprofloxacin, but susceptibility to last-line drugs was maintained. Acquired resistance genes and chromosomal mutations were identified as key mechanisms. Whole-genome analysis revealed a diverse population with various resistance and virulence genes, including those on the chromosome (e.g., blaCTX-M). Environmental persistence, likely facilitated by wild birds, and transfer of genetic elements were identified as significant contributors to resistance patterns. The researchers used the European Galaxy server (usegalaxy.eu) for bioinformatic analyses, including extracting and aligning single nucleotide polymorphisms from core genes.
Genomic variation in pepper vein yellows viruses in Australia, including a new putative variant, PeVYV-10 (Filardo et al., Archives of Virology, 2024)
Following its initial identification in Australia in 2016, pepper vein yellows virus (PeVYV) has been sporadically found in various hosts and locations. Genomic comparisons of 14 PeVYV-like isolates suggested the presence of potential new variants. High-throughput sequencing of six PeVYV-positive plants revealed eight PeVYV-like sequences, including isolates closely related to known PeVYV variants and others showing significant genetic divergence, tentatively named PeVYV-10. This study also marks the first report of a PeVYV-like virus infecting garlic. The researchers utilized Galaxy Australia for quality trimming, adaptor and primer sequence removal, and de novo assembly of paired reads, followed by further analysis of contigs and BLAST results in Geneious 10.
We would like to direct your attention to our running list of publications that cite Galaxy on Zotero and Google Scholar! Galaxy tries to keep up with all publications from our users, but if you have a paper you would like to see highlighted either in a Galaxy Newsletter or on social media, please use this form to let us know! We would love to see all the fantastic work you have been doing with Galaxy, as this not only helps us know what our users are accomplishing but also helps guide us in developing new features!
Galaxy will be leaving Twitter/X effective March 31st, 2024. We believe this decision is necessary as the platform no longer supports our commitment to fostering constructive scientific discourse. Recent changes have made it challenging to maintain the level of discussion we strive for.
We value the support and engagement from our community on Twitter/X and assure you this decision was made after careful consideration. Although we bid farewell to Twitter/X, you can still stay connected with us through our other communication channels, such as BlueSky, Mastodon, and LinkedIn. We will continue to cross-post all social media contributions there. Additionally, you can stay updated by visiting the Galaxy Hub, and we encourage you to stay connected with the Galaxy Training Network, which is active on BlueSky and Mastodon.
Thank you for your ongoing support. We look forward to continuing our journey with you and maintaining the strong sense of community that defines Galaxy.
Exciting news! Our latest meeting report unpacks the highlights from the recent International Plant & Animal Genome Conference (PAG31). Get ready to delve into engaging workshops, thought-provoking discussions on justice, equity, diversity, and inclusion (JEDI+), and the impactful contributions of our community members.
Explore the full report here, and join us in celebrating Galaxy's ongoing journey in shaping the future of genomics.
In February, Galaxy had a major presence at AGBT in Orlando, Florida. This included being featured during a keynote presentation by Michael Schatz on BioDIGS: BioDiversity and Informatics for Genomics Scholars. This is an exciting new initiative to perform a distributed soil metagenomics project throughout the United States, working hand-in-hand with dozens of institutions, especially community colleges, historically Black colleges and universities, tribal colleges and universities, and Hispanic-serving institutions. Galaxy was featured as a supporting platform for the project, especially to empower students and researchers who had limited prior experience or who lacked large-scale computing infrastructure for the research. Stay tuned for more updates on this project later this year!
| DATE | EVENT | VENUE or LOCATION |
| ------------- | ------------- | ------------- |
| 21 March 2024 | Small Scale Galaxy Admins Meeting | Online, Global |
| 7–11 May 2024 | CSHL Biology of Genomes | Cold Spring Harbor, NY |
| 17–21 June 2024 and 22–26 July 2024 | Workshop on High-Throughput Data Analysis with Galaxy: June, July | University of Freiburg, Germany |
| 24–29 June 2024 | 2024 Galaxy Community Conference | Brno, Czech Republic |
| 13–16 November 2024 | CSHL Biological Data Science | Cold Spring Harbor, NY |
Thank you for being a part of Galaxy!
Get more timely info by following us on Mastodon, Bluesky, and LinkedIn!
Participants worked on various aspects of imaging analysis, including data management, visualizations, migrating existing workflows into Galaxy and creating novel Galaxy tools, among others. The hackathon took place in a hybrid format, allowing for remote participation.
Napari and CellProfiler have now been integrated into Galaxy as interactive tools. This means that you can now use both applications just as you would on your local workstation, but with the computational capabilities of high-performance or cloud computing resources in the backend.
How do they work?
To use these new interactive tools, you first need to create an account in Galaxy and log in with your credentials. Then, in the Tools panel on the left side, you can find either the “Run Napari” interactive tool or the “Run CellProfiler” interactive tool. Once you click on the one of your choice, you’ll find a Run Tool button in the central panel of Galaxy to launch the CellProfiler or Napari instance. When the graphical user interface is ready, an 'Open' link will be displayed at the top of the Galaxy central panel.
Cellpose is now available as a Galaxy tool. Cellpose is a deep learning-based segmentation tool widely used in the image analysis field. The integration of Cellpose into Galaxy harnesses HPC resources, enabling users to perform segmentation tasks on a wide range of images with enhanced computing power.
How does it work?
You can go to the tool panel and type “cellpose” in the search box. You can upload input images to Galaxy history using the ‘Upload Data’ button. To run the Cellpose tool, select input images, pre-trained model type and channel to segment via its user interface. The output will be available in your Galaxy history upon completion of the tool’s execution.
This project aimed to curate a list of advanced AI models from the BioImage Model Zoo, make them available via Galaxy as remote files and develop a general-purpose tool in Galaxy to make inferences on test images.
How do they work?
A curated list of AI models has been made available in Galaxy, accessible in Galaxy's file uploader at "ML models/bioimaging-models".
Advanced Galaxy users can access the models in a Jupyter Notebook using the "get" method. Once the model is accessible in a notebook, it can make inferences on a test image. One example of such a notebook is bioimage-boundary-model-galaxy-IT, which takes one of the curated models and analyses a test image.
A Galaxy tool is also being developed to take one of the curated models and test images directly for analysis. The project currently supports PyTorch-based AI models. Moving forward, support for TensorFlow models will be added. A pull request of the initial work toward building the general-purpose tool has been added to showcase the project's progress and collect feedback from the imaging community to improve it.
It is now possible to visualize TIFF and OME-TIFF files in Galaxy. Previously, manual conversion to other formats such as PNG or JPEG was required before visualization for certain web browsers. The conversion step can now be omitted within all web browsers.
How does it work?
You can now find a new visualization “Tiff Viewer” that will display basic regular TIFF images (PR).
Once you upload a TIFF file to your history, you can click the visualization icon in your expanded dataset and select this new visualization.
Alternatively, you can display the TIFF file using the Avivator external application, which provides more advanced functionality such as zooming (PR).
Work is ongoing towards the visualization of zarr-based images. For that, Galaxy first needs a proper zarr data type definition, including the required metadata. Once this is possible and the Vizarr npm package is updated, Vizarr will be integrated as a Galaxy visualization plugin.
For the first time, we have implemented capabilities for the automatic testing of Galaxy tools and workflows specifically tailored for the testing of image-oriented tools. The new capabilities permit automatic testing using (i) statements about the image metadata, (ii) the image content, as well as (iii) direct comparison to ground truth images, and will contribute to the overall quality of tools and workflows in our community. Existing tools were improved, including major bug fixes.
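To make the three kinds of checks described above more concrete, here is a minimal sketch in plain Python. It is only an illustration of the idea and not the syntax of Galaxy's tool test framework: the function names, the list-of-lists image representation, and the tolerance parameters are all assumptions made for this example.

```python
# Illustrative only: these helpers are hypothetical and do not mirror
# Galaxy's actual test assertions. Images are plain lists of pixel rows.

def check_metadata(image, width, height):
    """(i) Statement about image metadata: expected width and height."""
    return len(image) == height and all(len(row) == width for row in image)


def check_mean_intensity(image, expected, eps):
    """(ii) Statement about image content: mean intensity within a tolerance."""
    pixels = [p for row in image for p in row]
    return abs(sum(pixels) / len(pixels) - expected) <= eps


def check_against_ground_truth(image, truth, max_abs_diff=0):
    """(iii) Direct comparison to a ground truth image, pixel by pixel."""
    if len(image) != len(truth):
        return False
    for row, truth_row in zip(image, truth):
        if len(row) != len(truth_row):
            return False
        if any(abs(a - b) > max_abs_diff for a, b in zip(row, truth_row)):
            return False
    return True


# A tiny tool output and its ground truth, differing by 1 in one pixel.
OUTPUT = [[0, 0, 255], [0, 255, 255]]
TRUTH = [[0, 0, 255], [0, 255, 254]]

if __name__ == "__main__":
    print(check_metadata(OUTPUT, width=3, height=2))
    print(check_mean_intensity(OUTPUT, expected=127.5, eps=0.5))
    print(check_against_ground_truth(OUTPUT, TRUTH, max_abs_diff=1))
```

Allowing a small per-pixel tolerance in the ground truth comparison is a design choice for this sketch; it keeps a test from failing on harmless rounding differences between platforms.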
The segmentation of cell nuclei in microscopy images is a central task in many biological studies. When using fluorescent markers, cell nuclei often appear as bright image regions compared to the image background, which permits effective segmentation of the nuclei using automatic intensity thresholding methods (e.g., Otsu 1979).
How does it work?
A workflow based on Otsu thresholding for segmentation and counting of nuclei using fluorescence microscopy images has been created, submitted to IWC, and is now available on WorkflowHub.
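For readers curious about what such a workflow does under the hood, the sketch below shows the general idea in dependency-free Python: pick a threshold with Otsu's method, then count connected bright regions. It is not the code of the Galaxy workflow; the function names, the 8-bit toy image, and the choice of 4-connectivity are assumptions for illustration.

```python
from collections import deque


def otsu_threshold(pixels, levels=256):
    """Return the threshold that maximizes between-class variance (Otsu 1979).

    Pixels with value <= threshold are treated as background.
    """
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(i * h for i, h in enumerate(hist))

    best_t, best_var = 0, -1.0
    w_bg, sum_bg = 0, 0
    for t in range(levels):
        w_bg += hist[t]
        sum_bg += t * hist[t]
        if w_bg == 0:
            continue  # no background pixels yet
        w_fg = total - w_bg
        if w_fg == 0:
            break  # no foreground pixels left
        mean_bg = sum_bg / w_bg
        mean_fg = (sum_all - sum_bg) / w_fg
        var_between = w_bg * w_fg * (mean_bg - mean_fg) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t


def count_nuclei(image, threshold):
    """Count 4-connected regions of pixels brighter than the threshold."""
    rows, cols = len(image), len(image[0])
    seen = [[False] * cols for _ in range(rows)]
    count = 0
    for r in range(rows):
        for c in range(cols):
            if image[r][c] <= threshold or seen[r][c]:
                continue
            count += 1  # found a new region; flood-fill it
            seen[r][c] = True
            queue = deque([(r, c)])
            while queue:
                y, x = queue.popleft()
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and not seen[ny][nx] and image[ny][nx] > threshold):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
    return count


# Toy fluorescence image: dark background with two bright "nuclei".
EXAMPLE_IMAGE = [
    [10, 12, 11, 10, 10, 12],
    [11, 200, 210, 10, 11, 10],
    [10, 205, 198, 12, 190, 10],
    [12, 10, 11, 10, 195, 11],
    [10, 11, 10, 12, 10, 10],
]

if __name__ == "__main__":
    flat = [p for row in EXAMPLE_IMAGE for p in row]
    t = otsu_threshold(flat)
    print("threshold:", t, "nuclei:", count_nuclei(EXAMPLE_IMAGE, t))
```

Real microscopy images usually need extra steps, such as smoothing or splitting touching nuclei, which this sketch leaves out.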
The main goal of the project was to investigate the translation of KNIME workflows for image analysis into Galaxy workflows. The main tested workflow was a high content screening analysis workflow obtained from the KNIME Hub. Furthermore, we worked on a workflow for the detection of subcellular structures described in a recent publication. Translation focused on the first main steps of pre-processing, segmentation and feature extraction.
How does it work?
As the main outcome, it was confirmed that the main nodes for image processing are present on the Galaxy EU server and several steps of the workflow can be translated from KNIME to Galaxy with similar outcomes. We discovered some tools that are present in KNIME and can still be integrated into Galaxy. In particular, processes on binary images that are also present in ImageJ (i.e. outline, fill holes, erode, dilate, voronoi-otsu-labeling, subtract background) can be added or implemented. An overview of the results plus the workflow converted to Galaxy was published in Zenodo.
A comprehensive list of tools was prepared and given for implementation (see issue). Furthermore, a small workflow for testing different threshold values and checking outcomes will soon be developed to speed up workflow development. As a future task, a translation table from KNIME to Galaxy can be established to easily guide the user through the workflow conversion. Since most of the KNIME tools are ImageJ-based, a tool for running ImageJ macros in Galaxy might also be developed to efficiently operate a complete translation between the two platforms.
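Two of the binary-image operations named above, erode and dilate, are simple enough to sketch in a few lines. The version below is a plain-Python illustration with a 3x3 neighbourhood, not the ImageJ, KNIME or Galaxy implementation; treating pixels outside the image as background is an assumption of this sketch, and other tools may handle borders differently.

```python
# Illustrative 3x3 binary morphology on lists of 0/1 rows.
# Border handling (outside = background) is an assumption of this sketch.

def _window(mask, r, c):
    """Values of the 3x3 neighbourhood around (r, c), padded with 0."""
    rows, cols = len(mask), len(mask[0])
    return [
        mask[y][x] if 0 <= y < rows and 0 <= x < cols else 0
        for y in range(r - 1, r + 2)
        for x in range(c - 1, c + 2)
    ]


def _apply(mask, reducer):
    """Build a new mask by reducing each pixel's neighbourhood to 0 or 1."""
    return [
        [int(reducer(_window(mask, r, c))) for c in range(len(mask[0]))]
        for r in range(len(mask))
    ]


def dilate(mask):
    """A pixel becomes foreground if any neighbour is foreground."""
    return _apply(mask, any)


def erode(mask):
    """A pixel stays foreground only if all neighbours are foreground."""
    return _apply(mask, all)


# A 3x3 foreground square centred in a 5x5 image.
SQUARE = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
]

if __name__ == "__main__":
    for row in erode(SQUARE):
        print(row)
    for row in dilate(SQUARE):
        print(row)
```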
In this project, we aim to translate a small Nextflow image analysis workflow into a Galaxy workflow. The workflow quantifies the closing of a wound area over time in brightfield microscopy images of a cell monolayer. The main step of the workflow measures the fraction of the lesion that is free of cells for each time point. The resulting output is a CSV table of cell-free percentages as a function of time.
How does it work?
We have translated the main step of this workflow into a Galaxy tool, which can be accessed via the tool panel on the left, or through this link. The workflow is available in the WorkflowHub as well.
The input of the main step in the workflow is a collection of TIFF images. As a next step, we aim to build a data download tool to automatically acquire input images from cloud storage such as S3.
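The core measurement described above, the cell-free percentage per time point written to a CSV table, can be sketched as follows. This is a simplified illustration rather than the Galaxy tool itself: it assumes each frame has already been segmented into a boolean mask where True marks a cell-free pixel, and the function and column names are made up for this example.

```python
import csv
import io


def cell_free_percentage(mask):
    """Percentage of pixels marked cell-free (True) in a 2D boolean mask."""
    total = sum(len(row) for row in mask)
    free = sum(1 for row in mask for px in row if px)
    return 100.0 * free / total


def write_closure_table(masks_by_time, out):
    """Write one CSV row per time point: time_point, cell_free_percent."""
    writer = csv.writer(out)
    writer.writerow(["time_point", "cell_free_percent"])
    for time_point, mask in sorted(masks_by_time.items()):
        writer.writerow([time_point, f"{cell_free_percentage(mask):.1f}"])


def _frame(free_rows, size=4):
    """Hypothetical helper: a size x size mask whose first rows are cell-free."""
    return [[r < free_rows] * size for r in range(size)]


# Toy time series in which the wound closes: 50% -> 25% -> 0% cell-free.
EXAMPLE_MASKS = {0: _frame(2), 1: _frame(1), 2: _frame(0)}

if __name__ == "__main__":
    buffer = io.StringIO()
    write_closure_table(EXAMPLE_MASKS, buffer)
    print(buffer.getvalue())
```

How the actual tool segments the brightfield images into cell and cell-free regions is not covered here; that step would replace the hand-made masks.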
After this successful week with plenty of outcomes, we are looking forward to similar gatherings in which we can continue the collaboration!
Further updates will be discussed at the FAIR Image Data Workflows Expert Group, which meets every 3rd Wednesday of the month at 4 pm CE(S)T. Join us!
Do you submit genome assembly data to GenBank? If so, try out NCBI’s Foreign Contamination Screen (FCS) tool, a quality assurance process that you can run yourself. We will screen all prokaryotic and eukaryotic genome submissions to GenBank with this tool, but we encourage you to screen your data before submitting to save time. FCS offers sensitive contaminant detection to increase the quality of your genome submissions to GenBank. As part of our ongoing effort to improve your experience, we recently made several enhancements.
See our (NCBI) release notes for additional details.
FCS is available on GitHub. For more information about FCS and for step-by-step instructions on how to use it, check out our help documentation.
Read all about FCS in our recent publication in Genome Biology.
FCS is part of the NIH Comparative Genomics Resource (CGR). CGR facilitates reliable comparative genomics analyses for all eukaryotic organisms through an NCBI Toolkit and community collaboration.
Follow us on social @NCBI and join our mailing list to keep up to date with FCS and other CGR news.
We want to hear from you! Try it out and let us know what you think. We are making ongoing improvements based on your feedback. If you have questions or would like to provide feedback, please reach out to us at info@ncbi.nlm.nih.gov.
CellProfiler is open-source software for measuring and analyzing cell images.
Two CellProfiler tutorials for fully-automated workflows are already available to perform segmentation and tracking. You can (1) assemble your workflows using individual modules of CellProfiler available as Galaxy tools or (2) craft your workflows using CellProfiler and upload your cppipe file to be run in Galaxy. Up to you!
The novelty is that, besides the two options mentioned above, you can use CellProfiler in your browser interactively, i.e. you can plug and play the modules in the CellProfiler interface in the browser. Once you’re ready, you run it as you would on your local workstation but with the power of an HPC or cloud environment on the backend.
To use this new interactive tool, first create an account in Galaxy and log in with your credentials. Then, you can find “Run CellProfiler interactive tool” in the Tools panel or go to this link. In the central panel of Galaxy, you’ll find a Run Tool button to launch the CellProfiler instance. When the graphical user interface of CellProfiler is ready, an 'Open' link will be displayed at the top of the Galaxy central panel (see screenshot below).
And with that, you’re ready to go. Enjoy performing your favourite image analysis in Galaxy!
This work is supported by the NFDI4BIOIMAGE project.
EuroScienceGateway (ESG) has developed a project-specific onboarding guide for WorkflowHub [Soiland-Reyes 2024]. The guide gives an overview of the structure used in WorkflowHub and pointers to general WorkflowHub onboarding.
The guide is managed as a living document in Google Docs, and registered as a Standard Operating Procedure in WorkflowHub with versioned snapshots.
Guide content includes:
As part of the EuroScienceGateway project's work on maturing the WorkflowHub EOSC service, similar project-specific guides have now been developed for the Biodiversity Genomics Europe (BGE), Beyond COVID-19 (BY-COVID) and BioDiversity Digital Twin (BioDT) projects. As these projects are larger and combine existing collaboration networks and e-Infrastructures, their organisation in WorkflowHub can be more complex than that of ESG; their respective onboarding guides help with this.
Within the WorkflowHub, a EuroScienceGateway team groups the contributors, organisations and workflows developed by the ESG project. Each contributor can register for a separate account, and their workflows can be given shared attribution.
Figure 1: The WorkflowHub team https://workflowhub.eu/projects/166 shows registered people, organisations, standard operating procedures, workflows and collections.
Workflows are registered as Workflow RO-Crates, capturing the workflow definition and its metadata as a FAIR Digital Object. RO-Crate is a general-purpose FAIR packaging mechanism for data, metadata and software.
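As a rough illustration of what such a crate looks like on the inside, the sketch below builds a heavily simplified `ro-crate-metadata.json` in Python. It follows the general shape of RO-Crate 1.1 (a metadata descriptor, a root dataset and a main workflow entity), but a real Workflow RO-Crate carries more metadata, such as licence and workflow language, so please treat this as an assumption-laden sketch and consult the specification for the full requirements. The file name `example-workflow.ga` and the workflow name are hypothetical.

```python
import json


def minimal_workflow_crate(workflow_file, name):
    """Build a simplified RO-Crate metadata document for a single workflow.

    Simplified for illustration: real Workflow RO-Crates include further
    properties that are omitted here.
    """
    return {
        "@context": "https://w3id.org/ro/crate/1.1/context",
        "@graph": [
            {   # Metadata descriptor: describes the crate's own metadata file
                "@id": "ro-crate-metadata.json",
                "@type": "CreativeWork",
                "conformsTo": {"@id": "https://w3id.org/ro/crate/1.1"},
                "about": {"@id": "./"},
            },
            {   # Root dataset: the crate itself, pointing at its main workflow
                "@id": "./",
                "@type": "Dataset",
                "name": name,
                "hasPart": [{"@id": workflow_file}],
                "mainEntity": {"@id": workflow_file},
            },
            {   # The workflow definition file
                "@id": workflow_file,
                "@type": ["File", "SoftwareSourceCode", "ComputationalWorkflow"],
                "name": name,
            },
        ],
    }


if __name__ == "__main__":
    # Hypothetical file and workflow names, for illustration only.
    crate = minimal_workflow_crate("example-workflow.ga", "Example workflow")
    print(json.dumps(crate, indent=2))
```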
Figure 2: Workflow https://workflowhub.eu/workflows/749?version=1 with options for Download RO-Crate and Run on usegalaxy.eu
Galaxy workflows registered this way can be launched from WorkflowHub (Figure 2) directly onto the usegalaxy.eu instance. This feature is also used by the GTN as part of a snippet (Figure 3) that enables such launching of workflows from a particular tutorial.
Figure 3: Galaxy Training Network guide for embedding WorkflowHub execution snippets in tutorials.
In addition, an ESG Workflow Collection has been created. Its purpose is to also aggregate pre-existing and third-party workflows which have been helped or further developed by ESG, such as those in the Galaxy Training Network (GTN) and Intergalactic Workflow Commission (IWC), both of which are community-led initiatives with participants outside ESG.
As of 2024-02-28, the WorkflowHub ESG collection contains the workflows:
These workflows span the ESG use cases, including astronomy, biodiversity, earth science and genomics. Further workflows will be registered during the second phase of the project.
Additional Galaxy Training Network (GTN) workflows are being considered for WorkflowHub registration. However, as some of these workflows are building blocks meant to be completed according to a particular tutorial, they will be better suited for a separate collection, as they may not be directly suitable for scientific use.
In contrast, the Intergalactic Workflow Commission (IWC) has developed mature, production-grade workflows for Galaxy. The separate IWC team in WorkflowHub has registered 47 workflows as of 2024-02-28. These are automatically registered by the WorkflowHub Bot, which scans the IWC GitHub repositories and registers the workflows according to their RO-Crate metadata. As many of these also have defined tests, WorkflowHub is able to show their test status via the LifeMonitor service, picked up from the test definition in their RO-Crate (Figure 4).
Figure 4: Workflow https://workflowhub.eu/workflows/615?version=2 indicates Tests Passing and links to the LifeMonitor test results.
Stian Soiland-Reyes, Björn Grüning, Paul De Geest (2024):
EuroScienceGateway MS3: Initial EuroScienceGateway workflows registered.
Zenodo (Milestone)
https://doi.org/10.5281/zenodo.1072892
To be able to bring expertise from both the scientific and the technical side of Galaxy, Polina Polunina, a Galaxy Europe researcher, and Mira Kuntz, Galaxy Europe Admin, teamed up and submitted an abstract.
Fortunately, our abstract “Science without secrets – how Galaxy democratizes data analysis” was accepted and we traveled to Brussels 🎉
After tasting the best french fries in the whole city and listening to many interesting talks on Saturday, e.g. the latest updates on containerd, we got ready to give our talk in the Open Research Devroom.
The focus of the talk was to illustrate how the Galaxy platform addresses challenges in scientific research, emphasizing collaboration, reproducibility, scalability, and compliance, while also providing technical insights to ensure understanding among audiences with a technical background. We were happy to tell so many people about Galaxy and show them ways to optimize their research and be part of a huge community.
The talk was recorded and you can watch it below or here and check out our slides.
Acknowledgements: We would like to thank the FOSDEM organizers for their commitment to make this amazing conference possible every year and give us the opportunity to present Galaxy, but also the whole Galaxy team for supporting us! ❤️
Our Horizon Europe funded EOSC project EuroScienceGateway participated in the EOSC Winter School 2024 in Thessaloniki, Greece, from 29 January to 1 February. The desire for more collaboration among the existing EOSC projects was the motivation for a joint meeting of EOSC project members to discuss different topics: the state of the art, goals, and future achievements towards the EOSC SRIAs and macro roadmap. Intense discussions over 3 days gave an in-depth technical understanding of specific opportunity areas and of integrating the deliverables of the EOSC-A Task Forces into these projects.
After a warm welcome from the European Commission and EOSC and their political statements on the importance of EOSC, 1.5 days were used for intense networking and discussions on common goals, strategies, and collaborations. Almost all Opportunity Areas (OAs) were covered by ESG members. More information about the EOSC Opportunity Areas can be found here:
The EuroScienceGateway project with the underlying Galaxy infrastructure and Galaxy Training platform (GTN) was represented in almost all OAs and has successfully shown that the Galaxy project and the European Galaxy Server are already widely established as technical core infrastructures in many EOSC projects. Besides the Galaxy infrastructure for data analysis and as VRE, the Galaxy Training Network (GTN) was presented as a very mature training platform with more than 380 different tutorials.
One of the key conclusions of the nearly 120 participants from the EOSC Association (EOSC-A) and the 21 EOSC-related EU projects is that the Winter School resulted in such essential collaborations and outcomes, inspiring an appetite for “more hands-on, collaborative work”, that it should be repeated on an annual basis - and yes, we absolutely agree.
To put this “hands-on, collaborative work” into action directly after the EOSC Winter School, a small informal hackathon was organized. During this time multiple EOSC projects (AquaInfra, EuroScienceGateway, Fair-Ease, Raise, ...) worked together on Galaxy.
Napari is an open-source image viewer for analyzing large multi-dimensional images. It is built on top of Python, Qt, VisPy and the scientific Python stack.
Napari provides a dynamic platform for visualizing and interacting with 2D, 3D, and multi-dimensional arrays on a canvas. It allows users to overlay derived data such as points, polygons, segmentations, and more. It also facilitates the annotation and editing of derived datasets using standard data structures like NumPy or Zarr arrays. Napari seamlessly integrates exploration, computation, and annotation in the field of imaging data analysis.
To use Napari in Galaxy, first you need to create an account in Galaxy and log in with your credentials. You can access Napari from here and specify images from your history that you want to visualize using Napari, then press the Run Tool button to launch a Napari instance. When the graphical user interface of Napari is ready, an 'Open' link will be displayed at the top of the Galaxy central panel (see screenshot below).
We are still working on integrating more plugins into the Napari interactive tool in Galaxy.
This work is supported by the NFDI4BIOIMAGE project.
The Conference Molecular Biology of Plants (MBP) of the Section Plant Physiology and Molecular Biology of the DBG will take place in Hennef, North Rhine-Westphalia, Germany, from 4th to 7th March 2024. It will be organized by Prof. Dr. Christopher Grefen (Bochum), Prof. Dr. Ute Höcker (Cologne), and Prof. Dr. Andreas Meyer (Bonn). If interested, please visit the full program page.
Deepti Varshney and Saskia Hiltemann will jointly present a poster entitled "MAdLand Resources & Tools".
MAdLand is a DFG-funded research consortium exploring the molecular mechanism behind the transition from water to land, from alga to land plant. All work presented in that poster is part of a collaboration between MAdLand, NFDI4Plants (DATAplant), Galaxy & de.NBI. The poster presents MAdLand's cutting-edge resources useful for plant research: MAdLand DB (GenomeZoo), TAPscan v4, and DataPLANT ARCs. Below is an abstract for the Poster Presentation.
MAdLandDB represents a comprehensive protein database accessible through the Galaxy web-based platform. With a particular focus on non-seed plants and streptophyte algae, it delivers non-redundant, reliable genome sequences utilizing BLAST and Diamond search functionalities for comparative and evolutionary questions in plant biology. The database includes reference/outgroup genomes of various lineages, including, e.g., fungi, animals, phylo-diverse algae, bacteria, and archaea. It is actively developed and maintained in the MAdLand context. The intuitive Galaxy interface ensures effortless data access and retrieval and remains consistently updated with the latest genomic insights.
TAPscan v4 is an advanced tool for genome-wide annotation of plant transcription-associated proteins (TAPs). TAPs (TFs and TRs) are key to understanding the development and evolution of plant form and function. Access to reliable, up-to-date classifications of TAPs enables comparative analyses that expand our knowledge of plant transcriptional regulation. Moreover, TAPscan is currently being integrated into the Galaxy framework, so users may employ it for their own datasets.
Finally, MAdLand is committed to contributing to the DataPLANT Research Data Management (RDM) platform, which supports the creation of ARCs (Annotated Research Contexts) to encapsulate all research data (raw and metadata) in a FAIR, open and standardized manner. ARCs are an RO-crate implementation and provide an easy-to-use way to re-analyze data generated by other labs, e.g. in Galaxy. Conversely, analyses performed in Galaxy can also be exported directly as RO-crates.
Acknowledgements: The Presenters would like to thank the organizers of MBP2024 for providing the opportunity to present a poster at the conference. The presenters would also like to thank the MAdLand/Rensing lab, Galaxy, NFDI4PLANT and de.NBI.
]]>Galaxy is a widely used open-source platform that empowers scientists globally and enhances research accessibility, reproducibility, and transparency. Galaxy Training Network (GTN) employs a collaborative, open, and FAIR approach to scientific training materials. With over 300 tutorials authored and reviewed by a global community, the GTN serves researchers, educators, and scientific tool developers.
MAdLandDB offers a comprehensive protein database accessible via Galaxy, focusing on non-seed plants and streptophyte algae. It provides reliable genome sequences for comparative plant biology research.
TAPscan v4 is an advanced tool for genome-wide annotation of plant transcription-associated proteins (TAPs). TAPs (TFs and TRs) are key to understanding the development and evolution of plant form and function. Access to reliable, up-to-date classifications of TAPs enables comparative analyses that expand our knowledge of plant transcriptional regulation.
DataPLANT contributes to the creation of ARCs, which store large amounts of annotated research data, as a means to make data and metadata available to and reusable by the community.
The talk was well-received and sparked a brief discussion about the utilization of MAdLandDB, Galaxy, and ARC.
Acknowledgements: The speakers would like to thank the organizers of DOMPS for providing the opportunity to present a talk at the Seminar. The speakers would also like to thank the MAdLand/Rensing lab, Galaxy, NFDI4PLANT and de.NBI.
]]>January 12th–17th, 2024; San Diego, CA, USA
The 31st annual Plant & Animal Genome Conference (PAG31) served as a valuable forum for discussions on recent developments and future plans in plant and animal genome projects. With 7 Plenary Talks, over 200 Workshops, and 100 Exhibitors, the conference offered a comprehensive platform for technical presentations, poster sessions, and workshops. It provided an excellent opportunity for the exchange of ideas and applications in this globally significant project. Notably, Galaxy participated with enthusiasm in PAG31, hosting its own workshop and being featured in several others. This active involvement underscored Galaxy's commitment to contributing meaningfully to the international discourse on plant and animal genomics (and beyond!).
Galaxy showcased its commitment to advancing life science research with a workshop titled "Galaxy for NGS Data Analysis: A Hands-on Workshop" at PAG31. Galaxy is a widely used open-source platform that empowers scientists globally and enhances research accessibility, reproducibility, and transparency. The workshop began with an introduction by Michael Schatz to the Galaxy platform, featuring insights into recent work supporting microbiome analysis for the Genomic Data Science Community Network. Attendees engaged in a tutorial on quality control, classification, and analysis of metagenomes from short and long-read sequencing data.
A spotlight on the Galaxy Training Network (GTN) by Saskia Hiltemann emphasized its collaborative, open, and FAIR approach to scientific training materials. With over 300 tutorials authored and reviewed by a global community, the GTN serves researchers, educators, and scientific tool developers. The Introduction to Galaxy portion of the workshop by Natalie Whitaker covered key topics, including using the Galaxy interface, organizing analyses, running tools, and managing data. Participants prepared for in-depth analyses, delving into metagenomics data to identify yeast species and visualize microbiome communities.
The workshop further explored microbiome analyses, leveraging Galaxy's tools, workflows, and existing trainings led by Alex Ostrovsky. Participants had hands-on experience running their own microbiome analyses on reference data, gaining a deeper understanding of Galaxy's capabilities. The concluding segment by Tyler Collins centered on the assembly and annotation of microbial genomes, advocating a hybrid sequencing strategy for improved accuracy. The workshop emphasized the integration of long and short-read sequencing methods and highlighted the significance of predicting protein structures using ColabFold, bridging genomic sequencing data with functional proteomics analysis. Galaxy's dedication to open-source, FAIR data access, and comprehensive analyses shone through, offering participants a valuable learning experience at PAG31.
Galaxy was honored to feature the research endeavors of two promising undergraduate students from Spelman College, Nia Davis and Katherine Ulbricht, during PAG31's "Galaxy for NGS Data Analysis: A Hands-on Workshop." These students, affiliated with the BioDIGS (BioDiversity and Informatics for Genomics Scholars) Project, presented their insightful investigations into plant microbiome interactions.
Katherine Ulbricht shared her expertise in a talk titled "Using Galaxy to Curate Antibiotic BGCs from Bacterial Isolates and Assess Their Abundance in Soil and Roots," while Nia Davis delved into "Analyzing the Pathogenic and Antibiotic Bacteria in Cucumber and Wheat Plants." Their engaging presentations provided a glimpse into the promising research conducted by emerging talents in the field.
Galaxy extends its appreciation to Katherine and Nia for joining PAG31 and showcasing their work with BioDIGS and Galaxy. This collaboration underscores Galaxy's commitment to providing a platform for diverse perspectives and fostering inclusivity in STEM, recognizing the potential of young researchers like Katherine and Nia to contribute to the future of genomics.
The Vertebrate Genomes Project (VGP) took the spotlight at PAG31, offering a comprehensive series of talks that delved into the project's ambitious goal of generating phased, error-free, chromosome-level, near-complete, and annotated reference genome assemblies for all ~70,000 extant vertebrate species. Aligned with the Earth BioGenome Project (EBP), which aims to produce reference-quality genomes for all 1.8 million named species on Earth, the VGP emphasizes the crucial role these genomes will play as a valuable resource for the broader scientific community.
Galaxy played a central and instrumental role in the success of the VGP. The platform proved to be a linchpin in the VGP's mission, facilitating seamless data integration, analysis, and collaboration among researchers. The discussions during these talks underscored the importance of Galaxy in overcoming challenges related to genome-wide alignments, phylogenetic tree inference, universal gene nomenclature, and comparative genomics of specialized traits within the VGP's Phase 1 scientific studies in which a representative species from every (or nearly every) vertebrate order will be assembled and released.
The workshop not only addressed the scientific intricacies of the VGP but also focused on ongoing advancements in sequencing technology, assembly techniques, and annotation methodologies. This collaborative effort aimed to organize and discuss the work involved in each project, with the anticipated outcome benefiting all high-quality reference genome projects affiliated with the VGP and EBP.
In a significant development following PAG31, the VGP recently published a groundbreaking paper in Nature Biotechnology. The paper delves into scalable, accessible, and reproducible reference genome assembly and evaluation, with Galaxy emerging as a crucial component in the methodology. This publication marks a milestone for the VGP, solidifying its commitment to advancing genomics research and underscoring Galaxy's pivotal role in enabling cutting-edge genomic studies on a global scale.
Galaxy played a prominent role in the workshop session titled "Teaching Genetics, Genomics, Biotechnology, and Bioinformatics," contributing two insightful talks by Anton Nekrutenko and Saskia Hiltemann.
Saskia Hiltemann's presentation focused on the multifaceted capabilities of Galaxy as a mature, browser-accessible workbench for scientific computing. Galaxy empowers scientists to effortlessly share, analyze, and visualize their data, requiring minimal technical expertise. With over 8,000 integrated analysis software packages and support from various national infrastructure providers, Galaxy has fostered a thriving global community that actively contributes to the project. Saskia highlighted the Galaxy Training Network (GTN), an open and collaborative initiative delivering comprehensive training materials across various scientific topics. The GTN, boasting over 375 tutorials created by more than 300 contributors, facilitates self-study and offers valuable resources for educators in both classroom and virtual settings. The talk provided insights into leveraging Galaxy, the GTN, and other training resources for effective data science teaching.
Anton Nekrutenko's presentation delved into advancements in genome sequencing and assembly, addressing the challenges of reproducibility and scalability. He introduced the latest Vertebrate Genomes Project assembly pipeline, showcasing its versatility and ability to deliver high-quality reference genomes across a diverse range of vertebrate species. The pipeline, accessible through Galaxy, incorporates PacBio HiFi long-reads and Hi-C-based haplotype phasing in a novel graph-based paradigm. Anton emphasized the importance of standardized quality control to troubleshoot assembly issues and assess biological complexities automatically. By making the pipeline freely accessible through Galaxy, researchers without local computational resources can engage in the training and assembly process, democratizing access and enhancing reproducibility. The presentation demonstrated the flexibility and reliability of the pipeline through the successful assembly of reference genomes for 51 vertebrate species spanning major taxonomic groups, including fish, amphibians, reptiles, birds, and mammals.
Galaxy's active involvement in this workshop session underscores its pivotal role in advancing education, research, and the democratization of scientific tools and knowledge.
To facilitate the effort towards fostering justice, equity, diversity, and inclusion in genomics research, PAG31 dedicated a session to Justice, Equity, Diversity, Inclusion + (JEDI+). This session marked a significant expansion of the discourse, featuring talks by experts in both genomics and JEDI+, followed by a panel discussion and interactive activities. The workshop aimed to systematically address global inequalities in the genomics community, considering JEDI+ as the guiding principle throughout the genomic data lifecycle.
During the session, Carolyn Hogg delivered a noteworthy talk titled "Implementation: Genomic Resources," where she thoughtfully highlighted Galaxy. This shout-out emphasized the importance of integrating tools like Galaxy in the pursuit of justice, equity, diversity, and inclusion within genomics. The session served as a crucial step towards acknowledging and addressing systemic injustices in genomics, encouraging and establishing an environment where the field can realize its full scientific potential while embracing social responsibility and inclusivity.
VGP researcher Giulio Formenti also gave a presentation on “Genome Assembly and Curation”, in which he highlighted the ongoing challenges of large-scale genome analysis and how the VGP adopted Galaxy to support their production assemblies. He highlighted Galaxy’s core principles of accessibility, reproducibility, and transparency as keys to the success of the project. In a separate exchange, VGP researcher Adam Phillippy commented on Galaxy: “Thanks for building the engine that will take VGP across the finish line!”
Katherine Ulbricht, Spelman College
“I had an amazing time at PAG31! The Galaxy Team was extremely accommodating and welcoming; they made the conference feel like a natural place to be. The conference itself was also stellar, and it was honestly stunning to be casually walking amongst and conversing with pillars of the genomics community. The idea of presenting at the workshop was daunting at first, but with ample support that seemed to be freely given, I feel confident in stating that even though I stood in front of the most impressive audience I had ever faced, I felt calm, collected, and prepared. This was an absolutely brilliant experience, and I am unbelievably lucky to have been part of it!”
Natalie Whitaker-Allen, Johns Hopkins University
“I had the privilege of attending the Plant & Animal Genome Conference for the very first time, and my experience was truly transformative. The conference provided an invaluable platform for knowledge exchange, collaborative engagement, and following the latest developments in genomics. Galaxy's workshop, which I had the privilege of participating in and hosting, provided a unique platform for participants to delve into the intricacies of NGS data analysis. Witnessing the active engagement and enthusiasm of attendees as they navigated the Galaxy platform was truly gratifying. Galaxy's commitment to fostering inclusivity and empowering underrepresented voices in STEM resonated strongly, and being a part of this transformative environment was both inspiring and fulfilling. I am eager to contribute to future conferences and continue collaborating with Galaxy to advance genomics knowledge and accessibility.”
Galaxy eagerly anticipates its participation in next year’s Plant & Animal Genome Conference (PAG32), building upon the momentum and enriching experiences gained at PAG31. As a steadfast contributor to the genomics community, Galaxy looks forward to showcasing its latest advancements, facilitating engaging workshops, and continuing its commitment to fostering inclusivity and collaboration in the field. The enthusiasm to share insights, exchange ideas, and contribute to the ever-evolving landscape of genomics research accentuates Galaxy's dedication to staying at the forefront of innovation. The forthcoming conference provides an exciting opportunity for Galaxy to strengthen existing partnerships, forge new connections, and continue to play a pivotal role in shaping the future of genomics!
]]>The Galaxy Committers team is pleased to announce the Galaxy 23.2 release!
A few release highlights are:
Workflow comments are a brand new feature of the Galaxy Workflow Editor. They add a suite of tools to help you visually explain and structure your Workflows. Comments are saved on your workflow, so they can be shared with other workflow contributors, help guide workflow users, or just help you keep track of your work and sort your thoughts, while developing a workflow. They can also help you with teaching, live demos, and providing feedback on a workflow, all directly inside the workflow editor!
Workflow comments tools are made available as part of the Editor Toolbar - a new UI element in the workflow editor. They include:
The Published Workflow Sharing page UI has been overhauled to make the page more informative, visually appealing, and easier to navigate. Some of the key improvements include adding a new read-only mode to the workflow editor, embedding an interactive view of the workflow editor, adding more information about the workflow, adding a run button, etc. The page is now responsive, which makes it usable on smaller screens.
Workflow Publishing now has a configurable iframe embed:
In response to user feedback and suggestions, a lot of work went into improving the page and workflow report editor options. Workflow image and license information have been added to Galaxy markdown capabilities. Workflow version is now included in the report; time is displayed in UTC time format. A variety of minor issues including invocation report and the job metrics markdown component have been fixed. To address requests to add citation information for the Galaxy instance on which the workflow has been executed, several new directives have been added to Galaxy Markdown: Galaxy URL, organization URL, citation URL, support URL, help URL, terms URL, and resources URL - alongside new configuration options to support them.
InvenioRDM has been integrated into Galaxy. Users can now import files directly from InvenioRDM repositories into Galaxy, and publish records containing artifacts (Histories, datasets, etc.) from Galaxy to InvenioRDM.
To import from InvenioRDM, you simply select the InvenioRDM repository as a remote file source in the Upload tool and then select the records or individual files you want to import into your history, just like you would do with any other remote file source in Galaxy:
To export your data to InvenioRDM, you need to set up your InvenioRDM API personal token. You can create a new personal token in your InvenioRDM account settings if you don't have one. In Galaxy, you need to add that personal token to your account (go to User > Preferences > Manage Information).
Once you have set your personal token, you can publish your histories by selecting the InvenioRDM repository as a remote file source in the Export History to File menu option. You will be able to select an existing draft record or create a new one, and the files will be uploaded to it. Once exported, you will be able to import the history snapshot from the published record at any time.
When you are completely happy with your history, you can archive it, taking it out of your active histories and freeing up some disk quota. You can do so by selecting the Archive History menu option. The process will be similar to exporting a history; however, your history will be purged from Galaxy once the process is completed. You will be able to import the history as a copy from the published record at any time.
You can also publish individual datasets to InvenioRDM by selecting the InvenioRDM repository as a remote file source using the Export Datasets tool. Since the tool interface is a bit limited, you will need to create your draft record manually in advance, and then you can select it when exporting the datasets.
The 23.2 release includes many other improvements to user experience. Following are a few examples:
Please see the full release notes for all the details!
Thanks for using Galaxy!
]]>The Albert-Ludwigs-University Freiburg and the Galaxy Team are happy to announce their participation in the "Open Science Clusters' Action for Research & Society" (OSCARS). With EuroScienceGateway now running in its second year, OSCARS is an additional project from the EOSC space to be involved in. Similar to EuroScienceGateway, this project offers the opportunity to foster connections to a variety of communities beyond life sciences, underlining that Galaxy is principally domain-agnostic. OSCARS is financed by the European Commission through the European Research Executive Agency's HORIZON-INFRA-2023-EOSC-01-01 funding line, under grant agreement number 101129751.
The project's central mission is to bring together world-class European Research Infrastructures (RIs) in the ESFRI roadmap and beyond to foster the uptake of Open Science in Europe.
Over the next four years, 15 such RIs are partners in OSCARS, representing the five Science Clusters (Humanities and Social Sciences, Life Sciences, Environmental Sciences, Photon and Neutron Science, Astronomy, Nuclear and Particle Physics). Their common aim is to make open data easily accessible to users and the public, by providing FAIR data management policies and practices for enabling Open Science. OSCARS will
The project officially started on 1st January 2024, and we are looking forward to the kick-off meeting from 13th to 15th March 2024; not least, we are happy to join forces with well-known as well as new partners from all over Europe.
As with EuroScienceGateway, we will keep you informed about developments via blog posts and social media (Mastodon, LinkedIn), but we also recommend following OSCARS itself on X and LinkedIn.
]]>Recently, 2 more JupyterLab tools and their sets of notebooks were added to Galaxy, more specifically to earth-system.usegalaxy.eu. What do you mean you never heard of this subdomain?? Then I suggest you go check this blog post: "A brand new subdomain of Galaxy Europe: earth-system.usegalaxy.eu". Anyway, about the tools: there is one on the Copernicus Data Space Ecosystem and another on the HoloViz ecosystem.
Access a wide range of Earth observation data from the Copernicus Sentinel missions and more. The Copernicus Data Space Ecosystem provides notebooks for easy discovery, visualization, and download.
The Copernicus Data Space Ecosystem provides access to Earth observation data from the Copernicus Sentinel satellites.
- Sentinel-1
- Sentinel-2
- Sentinel-3
- Sentinel-5P
More data are available including high-resolution satellite imagery from various providers and data offerings from different Copernicus services, ensuring a comprehensive and diverse dataset for your needs.
This JupyterLab tool allows you to dive into data exploration, visualization, and analysis without the hassle of installing dependencies or downloading large data sets. To use this tool, you need to use the dedicated tool form.
Whether you are a seasoned data scientist, a researcher, or a student just starting your journey into earth observation, the JupyterLab service offers a user-friendly and efficient way to harness the power of data analysis. With its seamless integration with Copernicus Data Space Ecosystem services, you can unlock valuable insights and unleash the potential of earth observation data in your work.
With Python, initial exploration is typically in a Jupyter notebook, using tools like Matplotlib and Bokeh to develop static or interactive plots. These tools support a simple syntax for making certain kinds of plots, but showing more complex relationships in data can quickly turn into a major software development exercise, making it difficult to achieve understanding during exploration. Various toolkits like Bokeh, Dash, and ipywidgets allow building apps to control and explore these visualizations interactively rather than recoding each time, but again outside of a small range of simple functions building the app itself becomes a major software development exercise. Plotting libraries also have limitations on how much data they can handle, especially if they require that all of the data be present in the web browser’s limited memory space. It is thus difficult to find tools that support anything close to the entire range of cases where data needs to be visualized.
To address all the above issues, a set of open-source Python packages was developed to streamline the entire process of working with small and large datasets. The HoloViz ecosystem includes a set of special-purpose tools designed to fill in the gaps and solve the whole problem of visualization:
- Panel: Assembling objects from many different libraries into a layout or app, whether in a Jupyter notebook or in a standalone servable dashboard
- hvPlot: Quickly return interactive HoloViews, GeoViews, or Panel objects from Pandas, Xarray, or other data structures
- HoloViews: Declarative objects for instantly visualizable data, building Bokeh plots from convenient high-level specifications
- GeoViews: Visualizable geographic data that can be mixed and matched with HoloViews objects
Everything is accessible through the dedicated tool form.
And you know what? You can even chain these two brand-new tools! How, you ask? Keep reading for the answer.
This time, we again use the Copernicus JupyterLab, now to retrieve NDVI time-series data with OpenEO. Once you have your time series, you can easily visualise and analyse them through some graphs with HoloViz. For an easy way to see the evolution of vegetation over time, check out this tutorial: From NDVI data with OpenEO to time series visualisation with Holoviews.
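For readers new to NDVI, here is a minimal sketch of the index itself, independent of OpenEO and HoloViz. It uses the standard definition NDVI = (NIR - Red) / (NIR + Red); the function name and the reflectance samples are hypothetical and only illustrate how a time series of values arises.

```python
def ndvi(nir: float, red: float) -> float:
    """Normalised Difference Vegetation Index for one pixel or one averaged sample."""
    total = nir + red
    if total == 0:
        # Undefined when both bands are zero (e.g. masked pixels).
        return float("nan")
    return (nir - red) / total


# Hypothetical (date, NIR, red) reflectance samples, not real Sentinel measurements.
samples = [
    ("2022-03-01", 0.30, 0.10),
    ("2022-06-01", 0.50, 0.05),
    ("2022-09-01", 0.40, 0.08),
]

# One NDVI value per date: this is the kind of series the tutorial then plots.
series = [(day, round(ndvi(nir, red), 3)) for day, nir, red in samples]
print(series)
```

Values close to 1 indicate dense green vegetation, while values near 0 or below indicate bare soil, water, or clouds.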
Following the domain chosen by the Fair-Ease project (if you still don't know what that is, I recommend checking out this blog post), the tutorial was written to study some atmospheric variables over the volcano La Soufrière. In April 2021, the volcano "La Soufrière" in Guadeloupe erupted. The tutorial Sentinel 5P data visualisation will walk you from the discovery of Sentinel 5P Copernicus data with OpenEO to the visualisation of the sulfur dioxide and aerosol spread around the Antilles area from 1st April to 30th May 2021. For the visualisation part of those atmospheric variables, you can use the great [Panoply tool](https://earth-system.usegalaxy.eu/root?tool_id=interactive_tool_panoply).
If you are interested in these tools and/or tutorials, don't hesitate to test them and give some feedback!
We are still working on Earth System tools, workflows, and tutorials. This specific study is in cooperation with the Fair-Ease and EuroScienceGateway projects (to know more about the projects, see the blog).
]]>After careful consideration, Galaxy has decided to leave Twitter/X, effective March 31st, 2024. This decision comes as we believe the platform no longer aligns with Galaxy's strong commitment to successfully and respectfully facilitating scientific discourse.
At Galaxy, we value open and constructive discussions that contribute to the advancement of scientific knowledge. Unfortunately, recent changes on Twitter/X have made it challenging for us to maintain the level of discourse we strive for.
We appreciate the support and engagement we have received from our community on Twitter/X, and we want to assure you that this decision was not made lightly. We remain dedicated to providing a platform that fosters meaningful conversations and supports the scientific community.
While we bid farewell to Twitter/X, we invite you to stay connected with Galaxy through our other communication channels, where we will be cross-posting all our social media contributions. You can find us on BlueSky, Mastodon, and LinkedIn, and of course, you can stay up to date by visiting the Galaxy Hub. Additionally, we encourage you to stay connected with the Galaxy Training Network, who are active on BlueSky and Mastodon.
We appreciate your ongoing support and look forward to continuing our journey with you and maintaining the strong sense of community that defines Galaxy.
Thank you for using Galaxy!
]]>MAdLandDB is a protein database comprising a comprehensive collection of fully sequenced plant and algal genomes, with a particular emphasis on non-seed plants and streptophyte algae. This database contains over 21 million sequences, representing a diverse group of more than 600 species. Its scope extends to various organisms, including fungi, animals, the SAR group, bacteria, and archaea, fostering a platform for comprehensive comparative analysis. The database's species abbreviation system, using a 5-letter code, simplifies species identification. For instance, CHABR stands for Chara braunii. The database provides non-redundant, reliable protein sequences that can be searched with BLASTp and Diamond through the Galaxy interface to address comparative and evolutionary questions in plant biology.
Non-redundancy: One of the database's primary strengths lies in its non-redundant nature. Sequence data is carefully curated, ensuring that you access only the most reliable genomic information.
Genome Project Origins: MAdLandDB takes pride in sourcing its sequences predominantly from genome projects. This approach enhances the reliability of the data and provides a solid foundation for various comparative analyses.
Comprehensive Information: The 5-letter code in the sequence header, together with the gene ID, conveys the essential information in a compact form and also indicates the source of gene encoding (plastome or transcriptome-based, etc.). The source/metadata for all genomes is available via the published papers.
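To illustrate the 5-letter species code, here is a minimal sketch. It assumes the code is built from the first three letters of the genus and the first two letters of the species epithet, which is consistent with the CHABR example but is only inferred from it; the actual MAdLandDB assignment rules may differ, and the function name `species_code` is hypothetical.

```python
def species_code(binomial: str) -> str:
    """Build a 5-letter code from a binomial name.

    Assumed rule (inferred from CHABR = Chara braunii, not confirmed):
    first 3 letters of the genus + first 2 letters of the species epithet.
    """
    parts = binomial.split()
    if len(parts) < 2:
        raise ValueError(f"expected 'Genus species', got {binomial!r}")
    genus, epithet = parts[0], parts[1]
    return (genus[:3] + epithet[:2]).upper()


# A small lookup table from code back to species name.
# Note: a purely rule-based scheme can produce collisions between species,
# so a real database would need curated, unique assignments.
species = ["Chara braunii", "Physcomitrium patens", "Marchantia polymorpha"]
code_to_species = {species_code(name): name for name in species}
print(code_to_species)
```

A lookup table like this is handy when post-processing BLASTp or Diamond hits, since the code at the start of each hit identifier can be translated back into a full species name.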
There have been several releases since MAdLandDB was hosted on Galaxy.
| Date | Release |
| ------------- | ------------- |
| 22 December 2022 | 1st release |
| 03 February 2023 | 2nd release |
| 23 August 2023 | 3rd release |
| 18 January 2024 | 4th release |
Let this tutorial be your starting point, providing you with a straightforward path to follow.
Want to know more about MAdLand? Please visit MAdLand.
The database is being maintained in the framework of MAdLand (http://madland.science, DFG priority program 2237). The Rensing lab is grateful for funding from the DFG (RE 1697/15–1, 20–1).
]]>The beginning of a new year always marks a welcome occasion to look back.
In the year 2023, especially the second half, there was a lot of momentum in the Image community in Galaxy. Without diving too much into details, the contemplative mood of the end of the year made me want to capture visually what had been going on.
For this occasion, I dedicated some spare time over the holidays to building a web-based tool which automatically creates graphical retrospectives of contributions to Galaxy and its various communities. The emphasis of the tool was clearly put on the contributions as an ongoing process, meaning that it would have been insufficient to just analyze the latest state of the repositories, the code, or the tools harbored there. It seemed necessary to build a tool which crawls through the whole Git history of each possibly relevant GitHub repository.
In fact, this is the most striking difference between the newly built tool and the Galaxy Tool Metadata Extractor, which emerged at the BioHackathon Europe 2023. The contributions are extracted from the Git histories and cached more compactly. In addition, metadata concerning different Galaxy tools and communities is extracted for each contribution and stored along with the cached contributions. The metadata is extracted in a way largely adopted from the Galaxy Tool Metadata Extractor. Thanks to the incrementally updated cache, the graphical retrospectives can be created and updated very quickly, and automatic updates are scheduled weekly.
In order to attribute contributions to specific Galaxy communities, a mapping from the set of Galaxy tools to the set of the Galaxy communities is required. To not reinvent the wheel, the newly built tool directly exploits the mappings already specified in the Galaxy Tool Metadata Extractor. Both mappings and communities can be configured in the communities.yml file. Feel free to add yours!
The tool generates visualizations of the following quantities and relations:
Community contribution graph. This is a visualization of the contributions made within a community during the last 365 days, where contributions are attributed to the community according to the specifications in the communities.yml file.
Personal contribution timeline. This is essentially a bipartite graph, where every week of the last year is connected with the repositories to which contributions were made in that week. Personal contribution timelines are created for everyone who contributed to any known community during the last year.
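The bipartite structure behind such a timeline can be sketched as follows. This is not the tool's actual code: the function name, the input shape (a list of date and repository pairs), and the repository names are assumptions for illustration.

```python
from collections import defaultdict
from datetime import date


def weekly_timeline(contributions):
    """Map each ISO (year, week) to the set of repositories touched in that week.

    Each (week, repository) pair is one edge of the bipartite graph.
    """
    edges = defaultdict(set)
    for day, repo in contributions:
        iso = day.isocalendar()  # (ISO year, ISO week, ISO weekday)
        edges[(iso[0], iso[1])].add(repo)
    return dict(edges)


# Hypothetical contributions of one person: (commit date, repository name).
contributions = [
    (date(2023, 11, 6), "repo-a"),
    (date(2023, 11, 8), "repo-b"),
    (date(2023, 11, 8), "repo-a"),
    (date(2023, 12, 20), "repo-b"),
]

timeline = weekly_timeline(contributions)
for week, repos in sorted(timeline.items()):
    print(week, sorted(repos))
```

Using sets means that several commits to the same repository within one week collapse into a single edge, which keeps the drawn graph readable.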
Distribution of repositories. This pie chart illustrates how the tools of a community are scattered across the different repositories.
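The personal contribution timeline described above can be thought of as a set of week-to-repository edges. The following sketch shows one way to compute them; the function name and the input format (pairs of date and repository name) are assumptions made for illustration, not the tool's actual data model.

```python
from collections import defaultdict
from datetime import date
from typing import Dict, Iterable, Set, Tuple


def weekly_timeline(contributions: Iterable[Tuple[date, str]]) -> Dict[Tuple[int, int], Set[str]]:
    """Map each (ISO year, ISO week) to the set of repositories touched in that week."""
    edges: Dict[Tuple[int, int], Set[str]] = defaultdict(set)
    for day, repo in contributions:
        iso_year, iso_week, _ = day.isocalendar()
        edges[(iso_year, iso_week)].add(repo)
    return dict(edges)
```

Each key is one node on the "week" side of the bipartite graph, and each repository in the associated set is one edge to the "repository" side.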
Repositories to be scanned can be configured via the repositories.yml file.
In addition to the visualizations described above, the tool also generates lists of (i) the all-time most frequent contributors, (ii) the last-year most frequent contributors, and (iii) newcomers (those who contributed for the first time during the last year to a community).
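The three lists can be derived from the same contribution records. The sketch below assumes that each record is a pair of author and date and that "last year" means the 365 days before a given reference date; both are assumptions for illustration, and the tool itself may count differently.

```python
from collections import Counter
from datetime import date, timedelta
from typing import Dict, Iterable, List, Set, Tuple


def contributor_lists(
    contributions: Iterable[Tuple[str, date]], today: date, top: int = 10
) -> Tuple[List[str], List[str], Set[str]]:
    """Return (all-time top, last-year top, newcomers) for one community."""
    records = list(contributions)
    cutoff = today - timedelta(days=365)
    all_time = Counter(author for author, _ in records)
    last_year = Counter(author for author, day in records if day >= cutoff)
    # A newcomer is someone whose earliest contribution falls within the last year.
    first_seen: Dict[str, date] = {}
    for author, day in records:
        first_seen[author] = min(day, first_seen.get(author, day))
    newcomers = {author for author, day in first_seen.items() if day >= cutoff}
    return (
        [author for author, _ in all_time.most_common(top)],
        [author for author, _ in last_year.most_common(top)],
        newcomers,
    )
```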
The tool is available on: https://kostrykin.github.io/galaxy-community-activities/report/
🚀Embarking on a cosmic journey, the Galaxy Single-cell Community has clustered together to unveil a constellation of tools, making strides in RNA-stellar discoveries and creating out-of-this-world workflows. With a commitment to battling work duplication across the multiverse, this community is boldly charting a course for global domination, proving that when it comes to bioinformatics, the Galaxy is the limit!✨
The past year has been a spectacular journey in advancing tools and workflows within the Galaxy Single-cell Community. Breaking free from the confines of scRNA-seq, we've expanded our horizons to include tools for analyzing data from other modalities such as single-cell ATAC-seq and CITE-Seq. Dive into our updated scRNA-seq analysis tools, now featuring more downstream analysis options for trajectory analysis. We've even thrown in advanced analysis tools for estimating cell type proportions from bulk RNA-seq data!
RNA STARSolo just got a makeover to version 2.7.10b, boasting a revamped UI, Velocyto-like UMI counting, and nifty alternatives for quantifying single nuclei data. Analyzing single-cell ATAC-seq data on Galaxy is now a breeze with the introduction of the EpiScanpy tool suite, built on the renowned Scanpy toolkit. Plus, we've added some powerful tools from the Sinto toolkit for preprocessing scATAC-seq data on Galaxy. And that's not all! The MuSiC tool suite in Galaxy now unveils deconvolution tools for discovering cell type composition in bulk RNA-seq using cell types within scRNA-seq. Multi-faceted comparisons are now at your fingertips with the MuSiC Compare tool.
Seurat fans, rejoice! Updates have rolled in, offering additional functionalities and optional CITE-seq capabilities at runtime.
Our efforts in community building have seen the unification of single-cell subdomains into https://singlecell.usegalaxy.eu/, the creation of the Single Cell Community of Practice hub page (https://galaxyproject.org/projects/singlecell/), and a user-focused Matrix channel. Testing brilliant widgets, automated updates, and the latest addition of automated help request updates onto Matrix channels have made community collaboration a breeze.
Down under, the Australian BioCommons launched their Single Cell and Spatial Omics Analysis Infrastructure Roadmap for Australia, complementing our endeavors to unite across time zones. Collaboration, including international tool adventures across three continents, proves that teamwork does, in fact, make the dream work.
Get ready for a thrilling journey through the Single Cell Training's latest updates! The Case Study Reloaded tutorial, fueled by three remote-working undergrads, introduces parallel trajectory analysis in R, Python, and the Galaxy GUI.
Sustainability shines as we've revamped numerous tutorials with hands-on and video re-recordings. New tools bring new tutorials, such as Pre-processing of 10X Single-Cell ATAC-seq Datasets and Comparing inferred cell compositions using MuSiC deconvolution.
Tackling user hurdles, our data ingest and conversion tutorials in the Changing data formats & preparing objects section await exploration.
Dive into the unique Tips, tricks & other hints section, offering analysis options like parameter iteration for swift optimization or Removing the effects of the cell cycle for precise clustering analysis.
And don't miss our inaugural Learning Pathway, guiding Galaxy novices to becoming single-cell analysis experts!
Our mission: No more work duplication! We've consolidated single-cell instances and are diligently preventing tool and infrastructure duplication, creating smoother pathways across servers. We value efficiency here at single-cell, so we'd like to thank ChatGPT (free version) for adding in Galaxy puns to this post, as per our request.
What's next on our list? World domination or global collaboration? You decide. Stay tuned for another exciting year with the Galaxy Single-cell Community!
]]>First of all, we updated the components of the Open Infrastructure to the latest available software revisions. The Open Infrastructure is a set of tools for providing ready-to-go Pulsar nodes and Galaxy servers that can be deployed on partners' Cloud infrastructures.
Currently, each Pulsar endpoint is created using a virtual machine image, named Virtual Galaxy Compute Nodes, which provides everything needed. Containerized tools and Reference Data are shared through read-only repositories based on the CERN-VM FileSystem. Finally, the deployment and the configuration of the Pulsar endpoint are managed using Terraform and Ansible. HTCondor is used as the default Resource Manager for job submission.
The Open Infrastructure update includes:
Most Pulsar endpoints have been already updated to the new configuration:
Moreover, the updated Open Infrastructure makes it possible to deploy full-fledged usegalaxy.eu replica servers. This makes it easy to instantiate new usegalaxy services and also provides a robust framework for maintaining and updating running instances. We also started documenting the whole procedure, which has been successfully used to deploy a prototype of the Italian UseGalaxy server (temporary repository).
Currently, 7 UseGalaxy endpoints are available: usegalaxy.eu (ALU-FR), https://usegalaxy.be (VIB), usegalaxy.fr (CNRS), usegalaxy.es (BSC-CNS), usegalaxy.no (ELIXIR-NO), usegalaxy.cz (CESNET) and usegalaxy.it (CNR). In this case the activities focused on the management and maintenance of the instances:
Finally, to enable other workflow engines to access and leverage the Pulsar Network, we are going to implement support for the GA4GH Task Execution Service (TES) as a proof of concept, allowing other services to submit jobs to Pulsar via TES. The TES API is an effort to define a standardized schema and API for describing batch execution tasks. A task defines a set of input files, a set of containers and commands to run, a set of output files, and some logging and metadata.
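To make the structure of a task more concrete, here is a minimal sketch that builds a TES-style task document. The field names (`name`, `inputs`, `outputs`, `executors`, `url`, `path`, `image`, `command`) follow the GA4GH TES schema as we understand it and should be checked against the version of the specification in use. The helper function, the URLs, the container image, and the command are placeholders, and the sketch says nothing about which fields a particular TES server requires.

```python
import json
from typing import Dict, List, Tuple


def make_tes_task(
    name: str,
    image: str,
    command: List[str],
    inputs: List[Tuple[str, str]],
    outputs: List[Tuple[str, str]],
) -> Dict[str, object]:
    """Build a TES-style task document as a plain dictionary (hypothetical helper)."""
    return {
        "name": name,
        # Each input is staged from a URL to a path inside the container.
        "inputs": [{"url": url, "path": path} for url, path in inputs],
        # Each output is copied from a path inside the container to a URL.
        "outputs": [{"url": url, "path": path} for url, path in outputs],
        # One executor: a container image plus the command to run in it.
        "executors": [{"image": image, "command": command}],
    }


task = make_tes_task(
    name="line-count",
    image="alpine",
    command=["sh", "-c", "wc -l /data/in.txt > /data/out.txt"],
    inputs=[("https://example.org/in.txt", "/data/in.txt")],
    outputs=[("https://example.org/out.txt", "/data/out.txt")],
)
print(json.dumps(task, indent=2))
```

A client would send such a JSON document to the task-creation endpoint of a TES server; the exact endpoint path depends on the server and the specification version.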
For this purpose, the TESP (TES for Pulsar) microservice, decoupled from Pulsar, has been developed: it implements the TES standard, distributes TES tasks using the Pulsar REST API, and supports Docker containers for tool provisioning. The current version provides:
During 2024, TESP will be tested with WfExS, a high-level workflow execution service backend that can manage workflows across different domains and was developed within EOSC-Life as part of Demonstrator 7.
The WP3 work will, of course, continue during 2024, with the release of the documentation for both Pulsar and (Use)Galaxy endpoint deployment. We also plan to complete the deployment of the Pulsar endpoints and connect more UseGalaxy.* servers to the Pulsar network. Finally, we also aim to test the Open Infrastructure with commercial Cloud providers, to further extend the possibilities for users to use Galaxy as a frontend for their very own resources.
Hello, Galaxy Community!
Firstly, the Galaxy team would like to extend our warmest wishes to you and all your loved ones this holiday season! With the New Year rapidly approaching, we wanted to thank you all for being a part of the Galaxy Community in 2023; we couldn’t do it without you, and we are excited for what is to come!
As a part of the final newsletter for 2023, we are including an end-of-year wrap-up to highlight some of Galaxy's successes and updates over the past year. We also are including a preview of what is to come in 2024, and we hope you are as excited as we are!
To celebrate Galaxy’s 18th year, we have put together an infographic highlighting the top Galaxy events, features, and scientific drivers of 2023!
To view the infographic, please view this PDF.
We are excited to share just a few of Galaxy's scientific success stories in 2023. This year, we are thrilled to have been a part of over 1,400 papers, which brings our total to over 16,000! Check out our “Citations per year” graph from Google Scholar. The success of our users helps drive Galaxy to adapt and improve its features, and we cannot wait to see the continued success of our users over the next year!
See below for a brief highlight on each story, and view the full paper for each article to learn more! While there are many more success stories from 2023, each of these articles was chosen by our users and developers who wished to highlight specific features or advances in Galaxy.
As mass spectrometry (MS) technology progresses, there is a growing demand for bioinformatics tools that can effectively analyze complex MS-based proteomic data. The Galaxy bioinformatics ecosystem is designed to meet these analysis needs by providing a diverse range of open-source tools tailored for MS-based proteomics applications. This ecosystem operates on a flexible, scalable, and easily accessible computing platform. Galaxy's distinctive feature, provenance tracking, enables users to save and share comprehensive analysis histories and workflows. This capability extends to various MS-based proteomics techniques, including shotgun proteomics, data-independent acquisition MS-based quantitation, multi-omics, MS imaging, as well as results visualization and interpretation. Researchers have successfully employed existing workflows within the Galaxy ecosystem in diverse fields such as COVID-19 pandemic research, proteogenomics, metaproteomics, MS imaging, and clinical studies involving MS-based proteomics with patient-derived samples. Additionally, Galaxy provides access to training resources, fostering awareness and facilitating the widespread adoption of these tools within the research community.
In this study, the researchers aim to better understand the mechanism governing the Hox gene timer, in which the activation of Hox genes occurs in a temporal sequence based on their positions within gene clusters. To address this, stembryos derived from mouse embryonic stem cells were obtained, and various next-generation sequencing (NGS) analyses and figure-generation techniques were applied. These techniques included RNA-seq, single-cell RNA-seq, ChIP and ChIP-M, CHi-C, and HiChIP. Galaxy was utilized for all NGS analyses, excluding single-cell RNA-seq. The findings indicated that the precision and pace of the Hox gene timer are regulated by the presence of evolutionarily conserved and regularly spaced intergenic CTCF sites.
Bioinformatics has assumed a pivotal role in the realm of research studies across the natural sciences due to the continuous generation of scientific datasets resulting from numerous technological advancements. Despite the significant influx of knowledge and scientific progress facilitated by the proliferation of datasets, there exists a substantial gap in basic computational skills and data analysis. To bridge this skill gap and empower researchers, the Galaxy Training Network introduced the Galaxy Training Platform. This platform, a community-driven framework accessible to all, compiles FAIR (Findable, Accessible, Interoperable, and Reusable) training materials tailored for data analyses within the Galaxy environment. Since its launch, the Galaxy Training Platform has surpassed expectations, emerging as a reliable repository featuring hundreds of tutorials from numerous contributors worldwide. Initially rooted in the natural sciences, the platform has expanded its scope to encompass diverse subjects such as climatology, cheminformatics, and machine learning. Furthermore, the Galaxy Training Platform has transcended its original purpose of supporting researchers, now serving as a valuable resource for educators. This paper highlights the recent advancements in the Galaxy Training Platform and explores how the Galaxy Training Network has evolved to facilitate the integration of its materials into classroom settings.
MetaboLights serves as a global database for metabolomics studies, encompassing both the raw experimental data and associated metadata. It stands as the recommended repository for metabolomics by several prominent journals and the European Infrastructure for Life Science Information (ELIXIR). This paper delves into the notable progress made by MetaboLights in recent years, emphasizing the introduction of MetaboLights Labs, a new instance within Galaxy. The objectives of this Galaxy instance are threefold: 1) to streamline the reuse of MetaboLights data through high-quality analysis tools, 2) to empower users to analyze their own data, and 3) to foster collaboration with researchers for the contribution of community tools and workflows. Galaxy is enthusiastic about being a part of MetaboLights' recent advancements in FAIR data science and is looking forward to seeing what they do next!
Instructors often face significant hurdles when planning and organizing training courses, such as a lack of needed technical resources and expertise, complications with queue contention and managing resources, and adapting to the rise of virtual and hybrid teaching. To help make training more accessible and adaptable, a collaboration between Galaxy Europe, the Gallantries, and the Galaxy community has developed Training Infrastructure-as-a-Service (TIaaS) with the goal of delivering user-friendly training infrastructure to the global training community. TIaaS is designed to offer dedicated training resources specifically for Galaxy-based courses and events. TIaaS represents a substantial enhancement for instructors, learners, and infrastructure administrators alike. The instructor dashboard facilitates the feasibility and simplicity of remote events, and students benefit from a seamless learning experience, as all training occurs on Galaxy, a platform they can continue to utilize post-event. The success of TIaaS has been substantial, as over the past 60 months, 504 training events, engaging more than 24,000 learners, have utilized this infrastructure for Galaxy training.
Finally, we would like to direct your attention to our running list of publications that cite Galaxy on Zotero and Google Scholar! Galaxy tries to keep up with all publications from our users, but if you have a paper you would like to see highlighted either in a Galaxy Newsletter or on social media, please use this form to let us know! We would love to see all the amazing work you have been doing with Galaxy, as this not only helps us know what our users are accomplishing but also helps guide us in developing new features!
There is much to look forward to from Galaxy in 2024! Most notably, the 2024 Galaxy Community Conference (GCC2024) is scheduled for June 24–29th in Brno, Czech Republic. While we are rapidly planning for this year's fantastic keynote speakers, workshops, and networking opportunities, we encourage you to sign up to receive GCC2024-specific announcements here!
Additionally, Galaxy is ecstatic to be once again participating at the International Plant & Animal Genome (PAG 31) conference set to be held from January 12–17th in San Diego, CA, USA! Galaxy will be hosting a workshop titled “Galaxy for NGS Data Analysis: A Hands-on Workshop”, where we will be sharing a brief introduction to Galaxy and working with participants to walk through an advanced microbiome tutorial. Additionally, this year we are thrilled to also include two users, Katherine Ulbricht and Nia Davis from Spelman College, in our workshop, who will share their personal Galaxy user success stories!
Finally, Galaxy has a lineup of some incredible new features to be introduced to the interface in 2024. From a new Galaxy Help Tool to user-interface simplifications, users can expect Galaxy to continue to grow and advance throughout 2024! See the Galaxy Roadmap for a detailed listing of features planned for the next 6 to 12 months.
| DATE | EVENT | VENUE or LOCATION |
| ------------- | ------------- | ------------- |
| 12–17 January 2024 | International Plant and Animal Genome Conference | San Diego, CA, USA |
| 4–8 March 2024 | Workshop on High-Throughput Data Analysis with Galaxy | University of Freiburg, Germany |
| 7–11 May 2024 | CSHL Biology of Genomes | Cold Spring Harbor, NY |
| 24–29 June 2024 | 2024 Galaxy Community Conference | Brno, Czech Republic |
“The most important job of senior faculty is to mentor junior faculty and students.” - JXTX
In memory of James P. Taylor, one of the original founders of the Galaxy Project, the JXTX Foundation was created to enable and support ongoing mentoring of young and diverse faculty and students to the best Open Science computational biology.
Congratulations to this year’s JXTX + CSHL 2023 Genome Informatics scholarship awardees! We cannot wait to see what they bring to this year's conference.
Check out our awardees below:
The JXTX Foundation is now proudly a 501(c)(3)! Please consider donating online through the foundation’s website! Your contribution will support the foundation's efforts by providing graduate student scholarships, academic mentorship, and sponsoring student outreach.
To conclude, Galaxy is going to be increasing communications on the ‘galaxy-dev’ mailing list! Anyone running a local installation, wrapping tools, and otherwise expanding the Galaxy ecosystem is encouraged to join the galaxy-dev mailing list. Typical discussions regard features, bugs, and new ideas. Announcements from the Galaxy Team are also sent to this list. Subscription is open to all!
Thank you for being a part of the Galaxy Community, and happy holidays!
Get more timely info by following us on Mastodon, Twitter, Bluesky, and LinkedIn!
]]>