Linux commands for running TopHat and Cufflinks

Below are the commands we used in the RNA-Seq class on 5/24. To see the other options available for a program, type its name at the command line with no arguments.

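1. Use bowtie2-build to index the reference genome. The second argument is the index basename, which TopHat takes as its reference argument in step 2.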

bowtie2-build TAIR10_chr4.700k.fa TAIR10_chr4.700k.fa

2. Run TopHat to map the reads. We ran this twice, once for each dataset. Change the output directory (-o) and the reads file before the second run so the first results are not overwritten.

tophat -o SRR815_th TAIR10_chr4.700k.fa SRR314815_accepted_hits.ch4_part.fq
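
Because step 2 is run once per dataset, a small loop can help keep the output directories distinct. The following is a minimal sketch that only prints the commands instead of running them. The `label:reads_file` pairing, the function name `tophat_cmds`, and the second reads file name are assumptions made for illustration, not names from the class.

```shell
# Reference index basename from step 1.
ref=TAIR10_chr4.700k.fa

# Print one TopHat command per "label:reads_file" pair, each with its own -o directory.
# This is a dry run: nothing is executed.
tophat_cmds() {
  for pair in "$@"; do
    label=${pair%%:*}   # text before the first colon
    reads=${pair#*:}    # text after the first colon
    echo "tophat -o ${label}_th $ref $reads"
  done
}

# second_dataset.fq is a hypothetical placeholder for the second reads file.
tophat_cmds SRR815:SRR314815_accepted_hits.ch4_part.fq SRR818:second_dataset.fq
```

Once the printed commands look right, they can be pasted into the terminal or piped to `sh`.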

3. Convert the .bam output to .sam to inspect the SAM format (-h keeps the header, -o names the output file)

samtools view -h -o accepted_hits.sam accepted_hits.bam
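
To get a feel for the SAM format, it can help to pull out a few of the mandatory columns. In SAM, header lines start with `@`, and the first six tab-separated fields of each alignment are QNAME, FLAG, RNAME, POS, MAPQ, and CIGAR. The sketch below uses a made-up record instead of real TopHat output; the helper name `sam_summary` and the read shown are purely illustrative.

```shell
# Print read name, reference name, position, and CIGAR for each alignment,
# skipping header lines (which begin with "@").
sam_summary() {
  awk -F '\t' '!/^@/ { print $1, $3, $4, $6 }' "$@"
}

# Hypothetical input: one header line and one alignment record (not real data).
printf '@SQ\tSN:Chr4\tLN:700000\nread1\t0\tChr4\t1500\t50\t4M\t*\t0\t0\tACGT\tIIII\n' | sam_summary
```

On the real file, `sam_summary accepted_hits.sam | head` would show the same four columns for the first few alignments.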

4. Index the reference for loading into Tablet

samtools faidx TAIR10_chr4.700k.fa

5. Index the .bam output for loading into Tablet

samtools index accepted_hits.bam

6. Run cufflinks to assemble mapped reads into transcripts. Run it once per TopHat output directory, changing the output directory (-o) each time.

cufflinks -o SRR818_cl SRR818_th/accepted_hits.bam

7. Create a file listing the paths of the transcripts.gtf files, for use as input to cuffmerge

find . -name transcripts.gtf > gtf_files.txt
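
The search in step 7 relies on each Cufflinks output directory containing a file named transcripts.gtf. As a rough illustration, the sketch below builds a throwaway directory tree and runs the same kind of search on it. The directory names `sampleA_cl` and `sampleB_cl` and the `list_gtfs` helper are hypothetical stand-ins, not names from the class.

```shell
# List every transcripts.gtf under a directory (default: current directory), sorted.
list_gtfs() {
  find "${1:-.}" -name transcripts.gtf | sort
}

# Build a scratch tree that mimics two Cufflinks output directories.
demo=$(mktemp -d)
mkdir "$demo/sampleA_cl" "$demo/sampleB_cl"
touch "$demo/sampleA_cl/transcripts.gtf" "$demo/sampleB_cl/transcripts.gtf"

# Same search as step 7, writing one path per line.
list_gtfs "$demo" > "$demo/gtf_files.txt"
cat "$demo/gtf_files.txt"
```

cuffmerge reads one GTF path per line from this file, so a quick `cat gtf_files.txt` before step 8 is a cheap sanity check.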

8. Run cuffmerge to merge the transcripts.gtf files

cuffmerge -o cuffmerge_out -s TAIR10_chr4.700k.fa gtf_files.txt

9. Run cuffdiff to detect differential expression

cuffdiff -o cuffdiff_out -b TAIR10_chr4.700k.fa -L SRR815,SRR818 cuffmerge_out/merged.gtf SRR815_th/accepted_hits.bam SRR818_th/accepted_hits.bam
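
cuffdiff takes several input paths, so it can be worth confirming that they all exist before launching it. One way to do that, sketched here under the assumption that a simple existence check is enough, is to list any missing files first. The `missing_inputs` helper is hypothetical and not part of Cufflinks.

```shell
# Print each argument that is not an existing regular file.
# Returns 1 if any file is missing, 0 otherwise.
missing_inputs() {
  rc=0
  for f in "$@"; do
    if [ ! -f "$f" ]; then
      echo "missing: $f"
      rc=1
    fi
  done
  return $rc
}

# Check inputs named in step 9; add the second BAM path for your own run.
if missing_inputs TAIR10_chr4.700k.fa cuffmerge_out/merged.gtf \
     SRR815_th/accepted_hits.bam; then
  echo "all inputs found"
fi
```

Any path reported as missing points to a step that has not been run yet or to a directory name that differs from the one used earlier.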
