- Quick help: Manipulating NGS data with Galaxy: Getting Data In
- Full tutorial: Uploading data into Galaxy
- Dataset Collections, including creation during Upload: Processing many samples at once with collections
- FTP/FTPS tutorial: FTP Upload
Data is loaded with the tools in the Get Data tool group. Some of these tools connect to specific data provider sites, which send data back to your Galaxy history. To load your own local data, or data from another source, use the tool Get Data → Upload File (also accessible from the top of the left tool panel, as seen in the graphics below). Want to practice import/export functions with small sample data? Import the Upload sample data history here.
- Each file loaded creates one dataset in the history.
- The maximum size limit is 50G (uncompressed).
- Most individual file compression formats are supported, but multi-file archives are not.
- Load by "browsing" for a local file. This works only for small datasets (< 2G, and often best for smaller files). If you have problems with this method, try FTP.
- Load using an HTTP URL or FTP URL.
- Load a few lines of plain text.
- Load using FTP, either from the command line or with a desktop client.
- Search for your data directly in the tool and use the Galaxy links
- There are a few links, so which data do I load?
- Be sure to check your sequence data for correct quality score formats and the metadata "datatype" assignment.
- Load the data using command-line FTP or a desktop client. More help...
- Note that the FTP server name is specific to the Galaxy you are working on. This is by default the URL of the server.
- For the public Main Galaxy instance at http://usegalaxy.org the FTP server name to use is usegalaxy.org.
- For a default local Galaxy (with FTP enabled, see next), the FTP server name to use is localhost:8080. If the server URL was modified, use that custom URL.
- If on another server, the FTP server name will appear in the Upload tool pop-up window (see graphics below). When using a local Galaxy server, be certain to configure your instance for FTP first.
- Log in with the email and password of your account on that same instance, so that the data is saved to your account.
- Once the data is loaded (confirm this in your FTP client), use the Upload tool to move the data into a History.
FTPS was enabled for all transfers to http://usegalaxy.org on July 19, 2017. If you have trouble connecting for the first time after this date, check your FTP client: it is now required to verify the server certificate.
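The FTP steps above can be sketched in Python with the standard `ftplib` module. This is a minimal illustration, not the method Galaxy prescribes: it assumes the server accepts explicit FTPS on the default FTP port and that files are stored in the top level of the account's FTP directory. The function name `upload_via_ftps` and its `ftp_factory` parameter are hypothetical, introduced here only for illustration.

```python
import os
import ssl
from ftplib import FTP_TLS


def upload_via_ftps(host, email, password, local_path, ftp_factory=FTP_TLS):
    """Send one local file to a Galaxy FTP area over FTPS.

    Returns the file names the server lists after the transfer, so the
    caller can confirm the load before opening the Upload tool.
    """
    # The default SSL context verifies the server certificate.
    context = ssl.create_default_context()
    ftps = ftp_factory(context=context)
    ftps.connect(host)           # assumption: default FTP control port
    ftps.login(email, password)  # same credentials as the Galaxy account
    ftps.prot_p()                # protect the data channel as well
    remote_name = os.path.basename(local_path)
    with open(local_path, "rb") as handle:
        ftps.storbinary(f"STOR {remote_name}", handle)
    listing = ftps.nlst()        # confirm that the file arrived
    ftps.quit()
    return listing
```

For the public Main instance, a call might look like `upload_via_ftps("usegalaxy.org", "you@example.org", "your-password", "reads.fastq.gz")`, with placeholder credentials. The transferred file should then appear in the Upload tool, where it can be added to a History.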
If you DO NOT see any files, load data using FTP first, then come back to the Upload tool.
- Data quota is at limit, so no new data can be loaded. Disk usage and quotas are reported at User → Preferences when logged in.
- Password-protected data requires a special URL format; ask the data source for it. Otherwise, double check that the data is publicly accessible.
- Use FTP or FTPS, not SFTP. Check with local admin if not sure.
- No HTML content. The loading error generated may state this. Remove HTML fields from your dataset before loading into Galaxy or omit HTML fields from the query if importing from a data source (such as Biomart).
- Compression types .gz/.gzip, .bz/.bzip, .bz2/.bzip2, and single-file .zip are supported.
- Only the first file in any compressed archive will load as a dataset.
- Data must be < 50G (uncompressed) to be successfully uploaded and added as a dataset to a history, from any source.
- Is the problem the dataset format or the assigned datatype? Can this be corrected by editing the datatype or converting formats? See Learn/Managing Datasets for help or watch the screencast above for a how-to example.
- Problems in the first step working with your loaded data? It may not have uploaded completely. If you used an FTP client, the transfer message will indicate if a load was successful or not and can often restart interrupted loads. This makes FTP a great choice for slower connections, even when loading small files.
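The compression and size points in the list above can be checked locally before a file is sent. The following is a rough sketch using only the Python standard library; the helper name `check_before_upload` is hypothetical, and reading "50G" as 50 GiB is an assumption about the unit. It reports the compression type, how many files an archive holds, and the uncompressed size of the file that would become the dataset.

```python
import bz2
import gzip
import os
import zipfile

MAX_BYTES = 50 * 1024**3  # "50G" read as 50 GiB; the exact unit is an assumption


def _stream_size(handle, chunk=1024 * 1024):
    """Count decompressed bytes without holding the data in memory."""
    total = 0
    while True:
        block = handle.read(chunk)
        if not block:
            return total
        total += len(block)


def check_before_upload(path, limit=MAX_BYTES):
    """Report compression kind, member count, and uncompressed size."""
    with open(path, "rb") as raw:
        magic = raw.read(3)
    members = 1
    if magic[:2] == b"\x1f\x8b":          # gzip signature
        kind = "gzip"
        with gzip.open(path, "rb") as handle:
            size = _stream_size(handle)   # slow for very large files
    elif magic == b"BZh":                 # bzip2 signature
        kind = "bzip2"
        with bz2.open(path, "rb") as handle:
            size = _stream_size(handle)
    elif zipfile.is_zipfile(path):
        kind = "zip"
        with zipfile.ZipFile(path) as archive:
            files = [i for i in archive.infolist() if not i.is_dir()]
        members = len(files)
        # Only the first file in an archive loads as a dataset.
        size = files[0].file_size if files else 0
    else:
        kind = "uncompressed"
        size = os.path.getsize(path)
    return {
        "kind": kind,
        "members": members,
        "uncompressed_bytes": size,
        "within_limit": size < limit,
    }
```

A `members` value above 1 signals a multi-file archive, whose remaining files would be dropped on upload. A `within_limit` value of `False` means the data would need to be split or reduced first.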
Create your GenomeSpace OpenID token on the Galaxy server you are working on. Not all public Galaxy servers have this option enabled, but it is available at Galaxy Main https://usegalaxy.org.
Note: GS OpenID tokens can become stale over time. If your account is not connecting properly when using these tools in Galaxy, resetting the token is the first thing to try when troubleshooting.
- Open a browser window at http://www.genomespace.org/, log into your account, and leave that window open.
- Log into your Galaxy account and go to User → Preferences → Manage OpenIDs.
- Delete the existing GS OpenID and create a new one, or just create a new one if you don't have one already.