Variant Calling

Germline Variant Calling

Somatic Variant Calling

RNA-seq variant analysis

RNA fusion detection. Another important application of RNA-seq is the detection of fusion genes: abnormal genes produced when two separate genes are joined, either by chromosomal translocation or by trans-splicing events. Fusion genes are important for investigating the causes and development of various cancer types. According to a recent publication (PubMed: 28680106), FusionCatcher yielded the most sensitive and precise predictions.

This example finds fusion genes in the BT474 cell line using publicly available RNA-seq data from the SRA (Sequence Read Archive).

mkdir Fusioncatcher_Test
cd Fusioncatcher_Test

Submit the following batch job to the HTC cluster.

#!/bin/bash
#SBATCH --cpus-per-task=4 # Request 4 CPU cores for the task; this matches -p 4 below.
#SBATCH -J Fusioncatcher_human_sample
#SBATCH -o Fusioncatcher.out
#SBATCH -t 24:00:00
module load fusioncatcher/0.99.7b
fusioncatcher -p 4 -i ./Fusioncatcher_samples/ -o ./results/
# fusioncatcher -d /ihome/sam/apps/fusioncatcher/fusioncatcher/data/ensembl_v86/ -i ./Fusioncatcher_Test/ -o ./results/
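
Before submitting, it can help to confirm that the input directory actually contains sequencing reads, since a job pointed at an empty directory will only fail after waiting in the queue. The following is a minimal sketch and not part of FusionCatcher: the function name count_fastq, the script name fusioncatcher.sbatch, and the list of file extensions checked are all assumptions made for illustration.

```shell
#!/bin/bash
# Hypothetical helper: count FASTQ-like files directly inside a directory.
# The extension list is an assumption; adjust it to match your data.
count_fastq() {
    local dir="$1"
    find "$dir" -maxdepth 1 -type f \
        \( -name '*.fastq' -o -name '*.fq' \
           -o -name '*.fastq.gz' -o -name '*.fq.gz' \) \
        | wc -l | tr -d ' '
}

# Usage (fusioncatcher.sbatch is an assumed name for the batch script above):
#   if [ "$(count_fastq ./Fusioncatcher_samples)" -gt 0 ]; then
#       sbatch fusioncatcher.sbatch
#   else
#       echo "No FASTQ files found in ./Fusioncatcher_samples" >&2
#   fi
```

The check only looks at the top level of the directory, which matches the flat layout used in this example.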

The options are as follows:

  • '-p PROCESSES' or '--threads=PROCESSES': The number of processes/threads to be used for running SORT, Bowtie, BLAT, STAR, BOWTIE2 and other tools/programs. If this parameter is not specified, 1 core is used.
  • '--config=CONFIGURATION_FILENAME': The configuration file. The default configuration file is at /ihome/sam/apps/fusioncatcher/fusioncatcher/etc/configuration.cfg
  • '-i INPUT_FILENAME': The input file(s) or directory.
  • '-o OUTPUT_DIRECTORY': The output directory where all the output files containing information about the candidate fusion genes found are written.
  • '-d DATA_DIRECTORY': The data directory where all the annotation files from the Ensembl database are placed. This directory should be built using 'fusioncatcher-build'. The default directory is /ihome/sam/apps/fusioncatcher/fusioncatcher/data/ensembl_v86/
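
As an illustration of how these options combine, the sketch below assembles the command line and prints it instead of running it, so it can be inspected before a job is submitted. The function fusioncatcher_cmd is a hypothetical helper, not part of FusionCatcher. It assumes paths without spaces and takes the thread count from SLURM_CPUS_PER_TASK, which Slurm sets inside a job when --cpus-per-task is requested.

```shell
#!/bin/bash
# Hypothetical dry-run helper (not part of FusionCatcher): print the command
# that would be run. Assumes paths contain no spaces.
fusioncatcher_cmd() {
    local input="$1" output="$2" data="${3:-}"
    # Use the core count granted by Slurm; fall back to 1, the documented
    # default for -p, when the variable is not set (e.g. outside a job).
    local threads="${SLURM_CPUS_PER_TASK:-1}"
    local cmd="fusioncatcher -p $threads -i $input -o $output"
    # Pass -d only when a data directory is given; otherwise the default
    # data directory is used.
    if [ -n "$data" ]; then
        cmd="$cmd -d $data"
    fi
    printf '%s\n' "$cmd"
}

# Print the command for the example above without executing it.
fusioncatcher_cmd ./Fusioncatcher_samples/ ./results/
```

Tying -p to SLURM_CPUS_PER_TASK keeps the thread count in step with the cores requested in the batch script, so the two values cannot drift apart when one is edited.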