# CG Cancer Genomics Data Analysis Exercise
First, index the human reference genome with BWA:

    bwa index GCA_000001405.28_GRCh38.p13_genomic.fna.gz

Align the reads and create SAM files:

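    bwa mem GCA_000001405.28_GRCh38.p13_genomic.fna.gz tu.r1.fq.gz tu.r2.fq.gz > tumor.sam
    bwa mem GCA_000001405.28_GRCh38.p13_genomic.fna.gz wt.r1.fq.gz wt.r2.fq.gz > wt.sam

The normal-sample FASTQ names `wt.r1.fq.gz` and `wt.r2.fq.gz` are assumed by analogy with the tumor files; substitute the actual names of your wild-type reads.

Convert SAM to BAM:
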
    samtools view -O BAM -o tumor.bam tumor.sam
    samtools view -O BAM -o wt.bam wt.sam

Sort the BAM files:

    samtools sort -T temp -O bam -o tumor.sorted.bam tumor.bam
    samtools sort -T temp -O bam -o wt.sorted.bam wt.bam

Index the sorted BAM files:

    samtools index tumor.sorted.bam
    samtools index wt.sorted.bam

Remove PCR duplicates:

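    samtools rmdup -r -S tumor.sorted.subset.bam tumor.deduplicated.bam
    samtools rmdup -r -S wt.sorted.subset.bam wt.deduplicated.bam

The inputs here are `*.sorted.subset.bam` files. If you have not created a subset, use the `*.sorted.bam` files from the sorting step instead.

Compute the depth at each locus from each BAM file:
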
    samtools depth tumor.deduplicated.bam > tumor.deduplicated.coverage
    samtools depth wt.deduplicated.bam > wt.deduplicated.coverage

Extract just chromosome X (CM000685):

    grep "CM000685" tumor.deduplicated.coverage > tumor.chrx.coverage
    grep "CM000685" wt.deduplicated.coverage > wt.chrx.coverage

Subset to the region of interest (positions 20000000 to 40000000):

    sed -n '/20000000/,/40000000/p' tumor.chrx.coverage > tumor.extract.new
    sed -n '/20000000/,/40000000/p' wt.chrx.coverage > wt.extract.new

Keep the last two columns (position and depth):

    sed 's/CM000685.2//' wt.extract.new > wt.extract
    sed 's/CM000685.2//' tumor.extract.new > tumor.extract

Run the plotting script:

    python3 rd_plot.py
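
The `sed` range in the subsetting step selects lines by text pattern rather than by coordinate, so it depends on positions 20000000 and 40000000 both being present in the coverage file, and `samtools depth` omits zero-coverage positions by default. As a hedged alternative, the sketch below filters numerically with `awk`. The function name `extract_region` is hypothetical and not part of the original exercise, and it assumes the three-column, tab-separated output of `samtools depth`. It writes position and depth without the leading tab that the `sed` substitution leaves behind, so whether `rd_plot.py` reads this output unchanged is an assumption.

```shell
# Hypothetical helper (not part of the original exercise): print position and
# depth for one contig within [start, end], comparing positions numerically.
# Assumed input format (samtools depth): contig <TAB> position <TAB> depth
# Arguments: $1 coverage file, $2 contig name, $3 start, $4 end (inclusive)
extract_region() {
    awk -F'\t' -v OFS='\t' -v contig="$2" -v start="$3" -v end="$4" '
        # Keep rows on the requested contig whose position lies in the range.
        $1 == contig && ($2 + 0) >= (start + 0) && ($2 + 0) <= (end + 0) {
            print $2, $3
        }
    ' "$1"
}

# Example usage with the file names from the steps above:
# extract_region tumor.deduplicated.coverage CM000685.2 20000000 40000000 > tumor.extract
# extract_region wt.deduplicated.coverage    CM000685.2 20000000 40000000 > wt.extract
```

Because the comparison is numeric, a position such as 120000000 is excluded even though it contains the text `20000000`, and the range still works when the exact boundary positions have no coverage.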