- Fastq Manipulation and Quality Control
- How to format fastq data for tools that require .fastqsanger format?
- Understanding compressed fastq data (fastq.gz)
- Common datatypes explained
- Input datatype misassignment and errors
- A tabular datatype is any datatype that is human-readable and uses tabs to separate data columns.
- Note: tabular data is different from comma-separated data (.csv).
- Common tabular datatypes are .bed, .gtf, .interval, or .txt.
- For data that is already in tabular format, the datatype metadata attribute can often be reassigned directly.
- Click the dataset's pencil icon to reach the Edit Attributes form. In the center panel, use the tabs to navigate: change the datatype (3rd tab) and save, then label the columns (1st tab) and save. The metadata will then be assigned, and the dataset can be used.
- If the required input is a BED or Interval datatype, converting the data (.tab → .bed, .tab → .interval) may be possible with a combination of Text Manipulation tools that produce a dataset matching the BED or Interval datatype specification.
- Some tools require that BED format is followed, even if the datatype Interval (with less strict column ordering) is accepted on the tool form.
- These tools will fail if run with malformed BED datasets or non-specific column assignments.
- Solution: Reorganize the data into BED format and rerun the tool. More error troubleshooting help here.
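
Outside of the Text Manipulation tools, the same reorganization can be sketched in a few lines of Python. This is a minimal illustration, not the method any particular tool uses: the function name `tabular_to_bed` and its column-index parameters are hypothetical, and it assumes the input already uses 0-based, half-open coordinates as BED expects. It only moves the chromosome, start, and end columns to the front and checks that the coordinates are sane; it does not verify that the remaining columns follow the optional BED fields (name, score, strand, and so on).

```python
def tabular_to_bed(text, chrom_col, start_col, end_col):
    """Reorder tab-separated rows so chrom, start, end come first.

    Column indexes are 0-based (hypothetical parameters for this sketch).
    All other columns are kept, in their original order, after the first three.
    """
    key_cols = (chrom_col, start_col, end_col)
    bed_lines = []
    for line in text.splitlines():
        # Skip blank lines and comment/header lines starting with '#'.
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.split("\t")
        # int() raises ValueError if a coordinate is not a whole number.
        start, end = int(fields[start_col]), int(fields[end_col])
        if start < 0 or end < start:
            raise ValueError(f"invalid interval in line: {line!r}")
        rest = [v for i, v in enumerate(fields) if i not in key_cols]
        bed_lines.append("\t".join([fields[chrom_col], str(start), str(end)] + rest))
    return "".join(l + "\n" for l in bed_lines)


# Example: a table with a gene name first, then chrom/start/end.
example = "#name\tchrom\tstart\tend\ngeneA\tchr1\t100\t200\n"
print(tabular_to_bed(example, chrom_col=1, start_col=2, end_col=3), end="")
```

If the source table uses 1-based coordinates (as GTF does), the start value would also need to be decreased by one before the result is a valid BED interval; that adjustment is deliberately left out of this sketch.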