Protocol: Basic Bulk RNA-Seq Workflow

This is the basic workflow we use to obtain bulk RNA-seq gene expression data in this project:

  1. Place sterilized plant seeds on MS medium (including 1% sucrose) solidified with 0.6% agarose and incubate under continuous light at 22 °C. [Note: Seeds from different species may require specialized sterilization and germination treatments; see the Seed Sterilization and Germination Protocol for details.]
  2. Allow seeds to germinate and roots to grow for 3-5 days, then collect root sections from the three developmental zones (meristematic, elongation, and differentiation) and store them at -80 °C.
  3. Extract RNA using QIAGEN RNeasy Plant Mini Kit.
  4. Construct cDNA library and obtain sequence reads (conducted by University of Michigan Sequencing Core – Illumina HiSeq System).
  5. Assess read quality using FastQC.
  6. Trim the first 15 bp of each read (because base quality is low at the beginning of the reads).
  7. Map reads to the reference genome using TopHat2 with default settings apart from --segment-length 17 (half the read length, as advised by the program).
  8. Quantify relative expression (in FPKM) using Cufflinks2 with multi-read correction (-u) and the reference annotation supplied (-G).
  9. Sort the .bam files and count the number of reads mapped to a single genomic location using HTSeq (htseq-count2.7 -m intersection-strict -s no -f bam).
  10. Use edgeR to perform the differential expression analysis on the HTSeq counts (TMM or upper-quartile normalization plus tagwise dispersion).
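
As a rough illustration of steps 6 and 8, the Python sketch below shows what removing the first 15 bp does to a single FASTQ record and how an FPKM value is defined arithmetically. It is not the software used in this workflow: the trimming tool is not named above, and Cufflinks2 estimates FPKM with a statistical model (including multi-read correction) rather than this simple ratio. The function names trim_read_start and fpkm, and all example values, are hypothetical.

```python
from typing import Tuple

# A FASTQ record reduced to (header, sequence, quality string).
FastqRecord = Tuple[str, str, str]

TRIM_LENGTH = 15  # bases removed from the start of each read, as in step 6


def trim_read_start(record: FastqRecord, n: int = TRIM_LENGTH) -> FastqRecord:
    """Drop the first n bases and their matching quality scores from one read."""
    header, sequence, quality = record
    if len(sequence) != len(quality):
        raise ValueError("sequence and quality strings differ in length")
    return header, sequence[n:], quality[n:]


def fpkm(fragment_count: float,
         transcript_length_bp: int,
         total_mapped_fragments: int) -> float:
    """Naive FPKM: fragments per kilobase of transcript per million mapped fragments."""
    if transcript_length_bp <= 0 or total_mapped_fragments <= 0:
        raise ValueError("transcript length and library size must be positive")
    # count / (length / 1e3) / (library size / 1e6) simplifies to the line below.
    return fragment_count * 1e9 / (transcript_length_bp * total_mapped_fragments)


# Hypothetical example values, not data from this project.
read = ("@read1", "ACGT" * 5, "I" * 15 + "F" * 5)  # a 20 bp toy read
trimmed = trim_read_start(read)
print(trimmed)  # ('@read1', 'TACGT', 'FFFFF')

# 500 fragments on a 2,000 bp transcript in a library of 10 million mapped fragments.
print(fpkm(500, 2000, 10_000_000))  # 25.0
```

Note that trimming 15 bp shortens every read by the same amount, which is why the TopHat2 segment length in step 7 is chosen relative to the trimmed read length.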