Assignment 6 - Transcriptome analysis [35 marks]

Due Friday 01 November 2024

Submit your answers to all questions to myUni as a single .zip file containing four bash scripts, a 5_final_assembly folder with the files required in Step 5, and a text file including your answers to the theoretical questions. [1 mark]

The .zip filename must start with your student number, and your bash scripts must run without errors. Meaningful comments are strongly advised. [1 mark]

For all scripts, please use the directory ~/Assignment6 as the parent directory for all downloads and analysis. You will be expected to hard code this into your scripts. Use an organised folder structure to store files generated in this assignment. [1 mark]

Assignment6/
├── data
├── DB
├── results
│   ├── 1_QC
│   ├── 2_clean_data
│   ├── 3_denovo_assembly
│   ├── 4_genome_guided_assembly
│   └── 5_final_assembly
└── scripts
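
As an illustration only, the folder layout above could be created with a few lines of bash. This is a minimal sketch, not a required or model answer: the variable name PARENT is a hypothetical choice, and the only assumption is that the parent directory is ~/Assignment6 as stated above.

```shell
#!/bin/bash
# Sketch: create the Assignment6 folder layout described above.
# PARENT is an illustrative variable name, not part of the assignment text.
set -eu

PARENT="${HOME}/Assignment6"

# mkdir -p creates missing parent directories and does not fail
# if a directory already exists, so the script can be re-run safely.
mkdir -p \
    "${PARENT}/data" \
    "${PARENT}/DB" \
    "${PARENT}/results/1_QC" \
    "${PARENT}/results/2_clean_data" \
    "${PARENT}/results/3_denovo_assembly" \
    "${PARENT}/results/4_genome_guided_assembly" \
    "${PARENT}/results/5_final_assembly" \
    "${PARENT}/scripts"
```

Because mkdir -p is idempotent, running the script a second time leaves existing files and folders untouched.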

Practical questions [28 marks in total]

Step 1. Write the first script to:

Step 2. Write the second script to copy the sequencing data to your data directory and carry out QC. The data are in ~/data/Transcriptomics_data/Assignment/. [1 mark] Include the following steps:

Step 3. Write the third script to:

Step 4. Write the fourth script to:

Step 5. Organise the final results:

Step 6. Identify one assembled transcript that shows an alternative splicing event supported by read evidence. Take a screenshot from IGV and save it in the 5_final_assembly subdirectory. [3 marks]

Theoretical questions [4 marks in total]

  1. Can we get all genes assembled from one transcriptome sequenced from a sample collected from one particular tissue at one particular developmental stage? [1 mark]
  2. Give the reason(s) supporting your answer to question 1. [3 marks]

Resources (also in assignment 4)