- Preface
- Setup
- Main pipeline
- Helpers
- Download data
- QC of sequencing reads (1)
- Clean sequencing reads
- QC of sequencing reads (2)
- Download and index reference genome
- Align and quantify microRNA reads
- Align and quantify mRNA reads
- Data preparation
- Identify cycling genes
- Pairwise association between microRNA and mRNAs
- Networks of mRNA and microRNAs
- Pathway enrichment analysis
- Done!
Data analysis for “Site- and cell-type-specific miRNA and mRNA genes and networks across the cortex, striatum, and hypothalamus”.
Important:
- Consider reading the `README.html` file, which has a floating table of contents.
- This project assumes you are using resources from The Centre for Advanced Computing (CAC).
  - The CAC uses SLURM to allocate jobs.
  - It is highly recommended that you use a cloud computing system. You may need to edit scripts to load dependencies in a manner compatible with your system.
- Ensure all scripts and data are stored in an R project folder.
- Script names are numbered so the order of execution is more obvious.
- Set the R current working directory to the project working directory. Most scripts assume that the project directory is the current working directory.
- Caution! Some scripts use absolute paths (especially bash scripts).
  - Run the following commands in the terminal to replace the `absolutePath` placeholder found in scripts with your absolute path to the project directory.

        find . -type f -name "*.sh" -exec sed -i'' -e 's#absolutePath#/my/custom/path#g' {} +
        find . -type f -name "*.R" -exec sed -i'' -e 's#absolutePath#/my/custom/path#g' {} +
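Before rewriting files in place, it can help to preview which scripts still contain the placeholder. The following is a minimal sketch and not part of the project scripts; the function name `list_placeholder_files` and the demo directory are hypothetical.

```shell
# Hypothetical helper: list .sh and .R files under a directory that still
# contain the absolutePath placeholder.
list_placeholder_files() {
  find "${1:-.}" -type f \( -name "*.sh" -o -name "*.R" \) \
    -exec grep -l 'absolutePath' {} + | sort
}

# Demo on a throwaway directory so nothing in the project is touched.
demo_dir="$(mktemp -d)"
printf 'cd absolutePath/0_data\n' > "$demo_dir/download.sh"
printf 'setwd("/already/edited")\n' > "$demo_dir/done.R"
list_placeholder_files "$demo_dir"   # lists only download.sh
```

Running the helper again after the `sed` commands should print nothing if every placeholder was replaced.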
Primary session info:
- R version 4.4.0 (2024-04-24)
- Platform: x86_64-redhat-linux-gnu (64-bit)
- Running under: CentOS Linux 7 (Core)
- Matrix products: default
- BLAS/LAPACK: /usr/lib64/R/lib/libRblas.so
Packages:
| Package | Version |
|---|---|
| arrayQualityMetrics | 3.60.0 |
| Biobase | 2.64.0 |
| biomaRt | 2.60.0 |
| cividis | 0.2.0 |
| colorspace | 2.1-0 |
| ComplexHeatmap | 2.20.0 |
| ComplexUpset | 1.3.3 |
| cowplot | 1.1.3 |
| DESeq2 | 1.44.0 |
| devtools | 2.4.5 |
| dplyr | 1.1.4 |
| DT | 0.33 |
| dynOmics | 1.0 |
| edgeR | 4.2.0 |
| GEOquery | 2.72.0 |
| ggplot2 | 3.5.1 |
| ggpubr | 0.6.0 |
| gprofiler2 | 0.2.3 |
| Hmisc | 5.1-2 |
| htmltools | 0.5.8.1 |
| IsoformSwitchAnalyzeR | 1.18.0 |
| knitr | 1.46 |
| limma | 3.60.0 |
| lmms | 1.3.3 |
| MetaCycle | 1.2.0 |
| miRBaseConverter | 1.11.1 |
| multiMiR | 1.26.0 |
| optparse | 1.7.5 |
| patchwork | 1.2.0 |
| pheatmap | 1.0.12 |
| purrr | 1.0.2 |
| rain | 1.38.0 |
| RColorBrewer | 1.1-3 |
| readxl | 1.4.3 |
| renv | 1.0.7 |
| rmarkdown | 2.26 |
| rsconnect | 1.3.1 |
| rtracklayer | 1.64.0 |
| scales | 1.3.0 |
| shiny | 1.8.1.1 |
| shinythemes | 1.2.0 |
| stringr | 1.5.1 |
| tibble | 3.2.1 |
| tidyr | 1.3.1 |
| tidyverse | 2.0.0 |
| UpSetR | 1.4.0 |
| VennDiagram | 1.7.3 |
| WGCNA | 1.72-5 |
Notice the 0_helpers folder. This directory contains many R functions
that minimize repetition of code and are generally helpful.
- Navigate to the `0_data` folder.
  - R current working directory remains the project working directory
  - Terminal working directory becomes `./0_data` by running `cd ./0_data` in the command line
- Make the following folders: `seqreads` and `series`.
- Manually download SRA Accession Lists and SRA metadata / run tables from…
  - mRNA: BioProject PRJNA636378
  - microRNA: BioProject PRJNA636377
- Run bash scripts beginning with `download`.
  - Note that these files use absolute paths
  - Dependencies: StdEnv/2020 gcc/9.3.0 sra-toolkit/2.10.8
  - For every accession ID in a dataset, `prefetch`, `fastq-dump`, and `gzip` the relevant data
  - Use `.out` logs to monitor download progress
  - Scripts aren’t written to parallelize downloads of files, but I recommend parallelizing them for future users!
- Prepare metadata/coldata. Note: `coldata` will be used interchangeably with metadata.
  - Download GEO metadata.
    - Run `./1_readSeriesMatrix.R`, which downloads GEO series matrixes in txt format and converts them to csv.
    - If you prefer not to use the `GEOquery` package, it’s likely possible to use the txt files directly.
  - Run `2_makeColdata.R` to make coldata files by merging the `SRA run tables` and `series matrixes`. This script also makes new `ctTime` and `ztTime` columns. Finally, the script removes columns that aren’t directly needed for the project.
  - Run `./3_timeDesign.R` to inspect the number of samples per timepoint, tissue, and sequence type.
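The per-accession download step described above (`prefetch`, `fastq-dump`, `gzip`) could look roughly like the sketch below. This is not the project's `download` script: the function names, the `DRY_RUN` switch, the placeholder accession IDs, and the assumption of paired-end output (`_1`/`_2` files from `--split-files`) are all illustrative.

```shell
# run: print the command when DRY_RUN=1, otherwise execute it.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then echo "$*"; else "$@"; fi
}

# Hypothetical function: acc_list holds one SRA run accession per line.
# Assumes paired-end runs, so fastq-dump --split-files writes _1 and _2 files.
download_accessions() {
  local acc_list="$1" out_dir="$2" acc
  while IFS= read -r acc || [ -n "$acc" ]; do
    [ -z "$acc" ] && continue                 # skip blank lines
    run prefetch "$acc"
    run fastq-dump --split-files --outdir "$out_dir" "$acc"
    run gzip "$out_dir/${acc}_1.fastq" "$out_dir/${acc}_2.fastq"
  done < "$acc_list"
}

# Dry run with made-up accession IDs: only prints the commands.
DRY_RUN=1
acc_file="$(mktemp)"
printf 'SRR0000001\nSRR0000002\n' > "$acc_file"
download_accessions "$acc_file" ./seqreads
```

Printing the commands first is a cheap way to check paths before committing to a long download job.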
- Navigate to the `1_qcSeqReads/1_qcB4Trim` folder.
  - R current working directory remains the project working directory
  - Terminal working directory becomes `./1_qcSeqReads/1_qcB4Trim` by running `cd ./1_qcSeqReads/1_qcB4Trim` in the command line
- Run `1_writeFastqcScripts.R` to generate individual FastQC scripts.
  - Rather than running quality control on every sample in a loop, run multiple scripts at once.
- Execute fastqc scripts. Do not execute all scripts at once! I recommend running 10 at a time. Use `2_checkSuccess.R` and `jobsToRun.sh` to ensure all jobs have been run!

      # 1 cpu, max 10 gigabytes of memory
      module load StdEnv/2020
      module load nixpkgs/16.09
      module load fastqc/0.11.9
      fastqc -f fastq -o $OUTDIR $INDATAPATH

- Run `3_writeMultiqcScripts.R` to generate a multiQC script for each tissue.
- Execute multiqc scripts.

      # 1 cpu, max 1 GB of memory
      module load StdEnv/2020 python/3.9.6
      #pip install --user multiqc
      #pip install --user --upgrade multiqc
      # Begin MultiQC
      multiqc \
        --outdir $OUTDIR \
        --filename $FILENAME \
        --force \
        --interactive \
        --cl_config "fastqc_config: { fastqc_theoretical_gc: mm10_txome }" \
        $FQPATHS
- Clean mRNA reads with Trimmomatic only.
  - Navigate to `./2_trimMRNA`.
  - Run `1_writeIndivScripts.R`.
  - Run individual scripts. Use `2_checkSuccess.R` and `jobsToRun.sh` to ensure all jobs have been run.

        # 5 cpu, max 5 GB memory
        # Dependencies
        module load nixpkgs/16.09 trimmomatic/0.36 # trimmomatic
        # Begin Trimmomatic
        java -jar /cvmfs/soft.computecanada.ca/easybuild/software/2017/Core/trimmomatic/0.36/trimmomatic-0.36.jar PE \
          -threads 10 \
          $FWDPATH $REVPATH \
          ${OUTPATH}/${ID}_1.pair.trim.fastq.gz ${OUTPATH}/${ID}_1.unpair.trim.fastq.gz \
          ${OUTPATH}/${ID}_2.pair.trim.fastq.gz ${OUTPATH}/${ID}_2.unpair.trim.fastq.gz \
          SLIDINGWINDOW:4:20 MINLEN:36

    - `SLIDINGWINDOW:4:20` = Scan the read with a sliding window of 4 bps and cut the read once the average phred quality score within the window falls below 20
    - `MINLEN:36` = Drop a read if it’s below 36 bps long
- Clean microRNA reads with CutAdapt and Trimmomatic.
  - Navigate to `./2_trimMiRNA`.
  - Run `1_writeIndivScripts.R`.
  - Run individual scripts. Use `2_checkSuccess.R` and `jobsToRun.sh` to ensure all jobs have been run.

        # 5 cpu, max 5 GB memory
        # Begin CutAdapt
        cutadapt --cores 10 \
          --adapter TGGAATTCTCGGGTGCCAAGG \
          --error-rate 0.25 \
          --no-indels \
          --minimum-length 15 \
          --overlap 6 \
          --times 1 \
          --match-read-wildcards \
          --untrimmed-output ${OUTPATH}/cutAdapt/${ID}.NO3AD.fastq.gz \
          --too-short-output ${OUTPATH}/cutAdapt/${ID}.short.fastq.gz \
          --output ${OUTPATH}/cutAdapt/${ID}.cutClean.fastq.gz \
          $FWDPATHs
        #
        # Begin Trimmomatic
        java -jar /cvmfs/soft.computecanada.ca/easybuild/software/2017/Core/trimmomatic/0.36/trimmomatic-0.36.jar SE \
          -threads 10 \
          ${OUTPATH}/cutAdapt/${ID}.cutClean.fastq.gz \
          ${OUTPATH}/${ID}.trim.fastq.gz \
          SLIDINGWINDOW:4:20 MINLEN:15

    - CutAdapt parameters from the ENCODE project’s pipeline
      - `-a` = 3 prime adapter sequence, from Illumina website
      - `-e` = maximum allowed error rate when finding adapters
      - `--no-indels` = no indels when matching adapters
      - `-m` = minimum processed read length
      - `-O` = minimum overlap between adapter and read sequence, ignored for anchored adapters
- Repeat steps from QC of sequencing reads (1). For mRNA, navigate to `./1_qcSeqReads/2_qcAftTrim`. For miRNA, navigate to `./1_qcSeqReads/2_qcAftCutTrim`.
- Navigate to the `./0_resources/gencode` folder.
- Make the following folders: `indexHisat2` and `indexStar`.
- Run the `1_downloadGencode.sh` script to get the primary fasta and gtf files for GRCm39.
- Run the `2_subsetMiRNA.sh` script to extract features whose `transcript_type` is “miRNA” from the gtf file.
- Run the `3_indexHisat2` script to prepare the reference genome for Hisat2 alignment. This script uses `hisat2_extract_splice_sites.py` and `hisat2_extract_exons.py` to improve Hisat2’s handling of splice sites.
- Run the `3_indexStar.sh` script to prepare the reference genome for STAR alignment.
  - `--sjdbOverhang 50` = the maximum read length minus 1; the maximum read length here is 51.
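The `--sjdbOverhang` value above follows from the maximum read length (51 - 1 = 50). If the read length of a dataset is not known in advance, it can be measured from a FASTQ file. The helper below is a hypothetical sketch, not part of the project scripts, and the demo FASTQ is made up.

```shell
# Hypothetical helper: print (maximum read length - 1) for a gzipped FASTQ.
# Sequence lines are every 4th line, starting at line 2.
sjdb_overhang() {
  gzip -dc < "$1" | awk '
    NR % 4 == 2 { if (length($0) > max) max = length($0) }
    END { print max - 1 }'
}

# Demo with a tiny made-up FASTQ: two reads of 5 and 7 bases.
fq="$(mktemp)"
printf '@r1\nACGTA\n+\nIIIII\n@r2\nACGTACG\n+\nIIIIIII\n' | gzip -c > "$fq"
sjdb_overhang "$fq"   # prints 6
```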
- Navigate to the `3_alignStar` folder.
- Generate an individual script for each sample with `1_writeStarScripts.R`.
- Run scripts. Use `2_checkSuccess.R` and `jobsToRun.sh` to ensure all jobs have been run.

      # 5 cpus, 40 GB max memory. Each script takes ~12 minutes.
      module load StdEnv/2020 gcc/9.3.0 star/2.7.9a samtools/1.13
      # align and quantify
      # No line-continuation backslashes inside the single quotes:
      # they would be passed to STAR as literal arguments.
      PARAMS='--runThreadN 10 --alignEndsType EndToEnd
        --outFilterMismatchNmax 1 --outFilterMultimapScoreRange 0
        --quantMode TranscriptomeSAM GeneCounts --outReadsUnmapped Fastx
        --outSAMtype BAM SortedByCoordinate --outFilterMultimapNmax 10
        --outSAMunmapped Within --outFilterScoreMinOverLread 0
        --outFilterMatchNminOverLread 0 --outFilterMatchNmin 16
        --alignSJDBoverhangMin 1000 --alignIntronMax 1
        --outWigType wiggle --outWigStrand Stranded --outWigNorm RPM'
      STAR --genomeDir $INDEX \
        --sjdbGTFfile $MIRNA_GTF \
        --readFilesCommand gunzip -c \
        --readFilesIn $SINGLE_END_1 \
        --outFileNamePrefix ${OUT_PATH}. \
        $PARAMS
      samtools index -@ 20 ${OUT_PATH}.Aligned.sortedByCoord.out.bam

  - STAR parameters from the ENCODE project’s pipeline
    - `--alignEndsType EndToEnd` = force end-to-end read alignment (no soft-clipping)
    - `--outFilterMismatchNmax 1` = maximum number of mismatches in alignment
    - `--outFilterMultimapScoreRange 0` = if a read maps to multiple regions, only alignments with a score matching the best alignment will be output
    - `--quantMode TranscriptomeSAM GeneCounts` = output sam/bam file with transcript alignments & output matrix with number of reads aligning to each “gene”.
    - `--outReadsUnmapped Fastx` = output unmapped or partially mapped reads
    - `--outSAMtype BAM SortedByCoordinate` = output a bam file that is sorted by coordinate
    - `--outFilterMultimapNmax 10` = the default, max # loci a read can align to
    - `--outSAMunmapped Within` = output unmapped reads within the main SAM file (i.e. Aligned.out.sam)
    - `--outFilterScoreMinOverLread 0` = alignment will be output only if its score is higher than or equal to this value, normalized to read length
    - `--outFilterMatchNminOverLread 0` = alignment will be output only if the number of matched bases is higher than or equal to this value, normalized to read length
    - `--outFilterMatchNmin 16` = same as outFilterMatchNminOverLread, but not normalized. In other words, an alignment must have at least 16 matched bases.
    - `--alignSJDBoverhangMin 1000` = minimum overhang (i.e. block size) for annotated spliced alignments
    - `--alignIntronMax 1` = maximum intron length
    - … remaining parameters are for generating wiggle files, which can be used to visualize results with the UCSC genome browser or Integrative Genomics Viewer.
- Run the `3_getRates.R` script to get an overview of STAR’s alignment rates.
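For a quick look at a single sample without running the R script, the unique mapping rate can be pulled from STAR's `Log.final.out` summary. This is a sketch under assumptions, not what `3_getRates.R` does: the function name is hypothetical and the demo log lines are made up to mimic the `label | value` layout of that file.

```shell
# Hypothetical helper: print the "Uniquely mapped reads %" value from a
# STAR Log.final.out file, without the percent sign.
star_unique_rate() {
  awk -F'|' '/Uniquely mapped reads %/ {
    gsub(/[ \t%]/, "", $2)   # strip padding and the trailing %
    print $2
  }' "$1"
}

# Demo on a made-up two-line excerpt.
log="$(mktemp)"
printf '          Number of input reads |\t1000\n' > "$log"
printf '        Uniquely mapped reads %% |\t89.50%%\n' >> "$log"
star_unique_rate "$log"   # prints 89.50
```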
Alignment
- Navigate to `./3_alignHisat2`.
- Generate an alignment script for each sample with `1_writeHisat2Scripts.R`.
- Run scripts. Use `2_checkSuccess.R` and `jobsToRun.sh` to ensure all jobs have been run.

      # 5 cpu, 15 GB max memory, each script takes ~1-2 hours
      module load StdEnv/2020 samtools/1.10 hisat2/2.2.1
      echo Alignment started at $(date +'%T')
      hisat2 -p 7 -x $INDEX -1 $PAIRED_END_1 -2 $PAIRED_END_2 \
        --dta --sensitive --no-discordant --no-mixed \
        --summary-file $SUMMARY_PATH --time --verbose \
        -S ${ALIGN_PATH}.sam
      echo Samtools processing started at $(date +'%T')
      samtools view -b -@ 7 ${ALIGN_PATH}.sam > ${ALIGN_PATH}.bam
      rm ${ALIGN_PATH}.sam
      echo collate started at $(date +'%T')
      samtools collate -@ 7 -o ${ALIGN_PATH}.col.bam ${ALIGN_PATH}.bam ${ALIGN_PATH}_tmpcol
      rm ${ALIGN_PATH}.bam
      echo fixmate started at $(date +'%T')
      samtools fixmate -m -@ 7 ${ALIGN_PATH}.col.bam ${ALIGN_PATH}.fix.bam
      rm ${ALIGN_PATH}.col.bam
      echo sort started at $(date +'%T')
      samtools sort -@ 7 -T ${ALIGN_PATH}_sort -o ${ALIGN_PATH}.sort.bam ${ALIGN_PATH}.fix.bam
      rm ${ALIGN_PATH}.fix.bam
      echo markdup started at $(date +'%T')
      samtools markdup -@ 7 -T ${ALIGN_PATH}_tmpmrk -s ${ALIGN_PATH}.sort.bam ${ALIGN_PATH}.sort.mrkdup.bam
      rm ${ALIGN_PATH}.sort.bam
      echo index started at $(date +'%T')
      samtools index -b -@ 7 ${ALIGN_PATH}.sort.mrkdup.bam ${ALIGN_PATH}.sort.mrkdup.bam.bai

  - Hisat2 parameters adapted from the Beijing Genomics Institute’s arguments (example dataset 1, example dataset 2)
    - `--dta` = reported alignments tailored for tools like StringTie. Requires longer anchor lengths for novel splice sites
    - `--sensitive` = same as `--bowtie2-dp 1 -k 30 --score-min L,0,-0.5`
      - `--bowtie2-dp 1` = use Bowtie2’s conditional dynamic programming.
      - `-k 30` = search for at most 30 distinct primary alignments for each read. Default = 5 (linear index) or 10 (graph index).
      - `--score-min` = minimum score function for an alignment to be valid: `f(x) = 0 + -0.5 * x` where x = read length. Default = L,0,-0.2.
    - `--no-discordant` = don’t look for discordant alignments (both mates align uniquely, but not in a way that satisfies the paired-end constraints)
    - `--no-mixed` = don’t try to find alignments for individual mates after hisat fails to identify concordant/discordant alignments
  - Using samtools to 1) convert SAM to BAM, 2) mark duplicates and sort the BAM file, & 3) index the BAM file. For marking duplicates and sorting by coordinates, use the example workflow from the samtools-markdup manual; author = Andrew Whitwham from the Sanger Institute
- Run the `3_getRates.R` script to get an overview of Hisat2’s alignment rates.
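To make the `--score-min` function from the Hisat2 parameter notes concrete, the sketch below evaluates the linear form `f(x) = A + B * x` for one read length. The helper name `min_score` and the 100-base read length are illustrative assumptions; only the `L,0,-0.5` and `L,0,-0.2` settings come from the notes above.

```shell
# Hypothetical helper: evaluate the linear score function L,A,B at a read length.
# Usage: min_score A B READ_LENGTH
min_score() {
  awk -v a="$1" -v b="$2" -v len="$3" 'BEGIN { print a + b * len }'
}

min_score 0 -0.5 100   # --sensitive (L,0,-0.5): prints -50
min_score 0 -0.2 100   # default (L,0,-0.2): prints -20
```

A lower (more negative) minimum score means alignments with more mismatches or gaps are still considered valid, which is why the `--sensitive` setting is more permissive than the default.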
Quantification
- Navigate to `./4_stringtie`.
- Run `1_writePass1IndivScripts.R` to write individual scripts for pass 1. Execute scripts in the `pass1IndivScripts` directory. Use the `2_checkSuccess.R` and `jobsToRun.sh` scripts to monitor progress.

      # 1 cpu, 5 GB memory
      # REF_GTF is the full GTF file from Gencode
      module load StdEnv/2020 stringtie/2.1.5
      stringtie $INPUT -p 5 -G $REF_GTF -o $OUT_GTF

- Run `3_writeGtfLists.R` and `4.0_writeMergeScripts.R` to prepare the merging of individual GTFs from pass 1. Tissues are kept separate!
- Run `*.sh` files in the `4_merge` folder to execute the merging of GTF files.

      # 5 cpu, 3 GB memory
      module load StdEnv/2020 stringtie/2.1.5
      stringtie --merge -p 20 -o $OUTPUT -G $REF_GTF $GTFS_LIST

- Evaluate StringTie performance with `5.1_writeGffCompareScripts.R` and `*.sh` scripts in the `5_gffCompare` folder.
- Run `1_writePass2IndivScripts.R` to write individual scripts for pass 2. Execute scripts in the `pass2IndivScripts` directory. Use the `2_checkSuccess.R` and `jobsToRun.sh` scripts to monitor progress.

      # 1 cpu, 5 GB memory
      # REF_GTF is the merged gtf that corresponds to this sample's tissue
      module load StdEnv/2020 stringtie/2.1.5
      stringtie $INPUT -b $BALL -e -p 5 -G $REF_GTF -o $OUT_GTF

- To generate gene count matrixes, switch your R version to 4.2.1 and run `7_isoformAnalyzeR.R`.
  - This script uses an absolute path! Edit the script to use your project directory.
- Navigate to `5_dataPrep`.
- Clean miRNA count matrixes
  - Navigate to the `miRNA` folder.
  - Run `0_id2name.R` to get a dataframe with ensembl ID to gene name/symbol conversion information.
  - Run `0_makeCountMats.R` to merge gene count matrixes from STAR into single dataframes, one per tissue.
  - Run `1_outlierRemoval.R` to …
    1. Perform outlier detection with arrayQualityMetrics. A sample is considered an outlier if
       - it is marked as an outlier before and after normalization by the same outlier detection metrics, and/or,
       - it is marked as an outlier by multiple outlier detection metrics after normalization
    2. Normalize counts with the weighted trimmed mean of M-values method
  - Run `2_corrReps.R` to get the Spearman correlation between samples, before and after outlier removal.
  - Run `2_filtering.R` to perform non-specific filtering to remove lowly expressed features (mean CPM < 1).
- Repeat steps from Clean miRNA count matrixes, except replace “miRNA” with “mRNA” in the folder name.
  - Caveats:
    - No need to merge count matrixes for each tissue.
    - Samples SRR11902345 and SRR11902411 were manually removed (hard coding) from the hypothalamus dataframes upon inspection of PCA.
- Inspect the number of samples per tissue and timepoint after sample removal with `timeDesign.R`.
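The non-specific filtering rule mentioned above (drop features with mean CPM < 1) can be illustrated on a small table. The sketch below is not `2_filtering.R`: it computes plain counts-per-million from raw column sums, whereas the project scripts may use normalized library sizes. The function name, file layout (tab-separated, header row, gene ID in column 1), and demo counts are all assumptions.

```shell
# Hypothetical helper: keep rows whose mean CPM across samples is >= a cutoff.
# Usage: filter_mean_cpm COUNTS.tsv [MIN_MEAN_CPM]
filter_mean_cpm() {
  awk -F'\t' -v min="${2:-1}" '
    # First pass: sum each sample column to get its library size.
    NR == FNR { if (FNR > 1) for (i = 2; i <= NF; i++) lib[i] += $i; next }
    # Second pass: keep the header, then test each feature.
    FNR == 1  { print; next }
    {
      sum = 0
      for (i = 2; i <= NF; i++) if (lib[i] > 0) sum += $i / lib[i] * 1e6
      if (sum / (NF - 1) >= min) print
    }' "$1" "$1"
}

# Demo with made-up counts; each library sums to 2,000,000 reads.
counts="$(mktemp)"
printf 'gene\ts1\ts2\n' > "$counts"
printf 'geneA\t1200000\t1000000\n' >> "$counts"
printf 'geneB\t799999\t999999\n' >> "$counts"
printf 'geneC\t1\t1\n' >> "$counts"       # CPM 0.5 in both samples
filter_mean_cpm "$counts" 1               # prints the header, geneA, geneB
```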
- Navigate to `./6_rhythmicity`.
- Identify cycling mRNAs
  - Navigate to `1_mRNA24h`.
  - Run the `1_metacycle` script to identify 24-hour period cycling features. Cycling genes have a combined, Benjamini-Hochberg corrected p-value that is less than 0.05.

        meta2d(
          infile = inPath, outdir = outPath, filestyle = "csv",
          minper = 24, maxper = 24, timepoints = "line1",
          outputFile = TRUE, combinePvalue = "fisher",
          cycMethod = c("JTK", "LS"), nCores = 1
        )

  - Summarize results with `2_summarizeRes.R`.
  - To inspect/visualize results, run `3_make*.R` scripts.
  - Repeat steps 1-4 with RAIN in the `1_mRNAHarm` directory. Cycling genes have a period of 4 to 30 hours and an adjusted p-value < 0.05.

        rain(
          x = df,
          deltat = 3,                  # sampling interval
          period = 17,                 # period to search for
          period.delta = 13,           # period +/- delta to consider
          peak.border = c(0.3, 0.7),   # default
          measure.sequence = sequenceLists[[set]],
          method = "independent",
          # Multiple-comparison options are bonferroni, benjamini-hochberg (BH), or adaptive BH (ABH).
          adjp.method = "ABH",
          verbose = TRUE
        )

  - Repeat steps 1-4 with ARSER from MetaCycle in the `1_mRNAArser` directory. Cycling genes have a period of 4 to 30 hours and an adjusted p-value < 0.05.

        meta2d(
          infile = inPath, outdir = outPath, filestyle = "csv",
          minper = 4, maxper = 30, timepoints = "line1",
          outputFile = TRUE, combinePvalue = "fisher",
          cycMethod = "ARS", nCores = 1
        )

    - Run `1.5_getDomCycle.R` to select the cycle with the largest amplitude when one gene has multiple cycles. Run this script before running `2*.R` and `3*.R` scripts.
- Identify cycling miRNAs
  - Repeat steps from Identify cycling mRNAs, except replace “mRNA” with “miRNA” in folder and filenames.
  - JTK cycle will not run for the Corpus Striatum because there are no samples from time 15.
- Compare MetaCycle vs RAIN vs ARSER
  - Navigate to the `2_24hVsHarm` directory to compare MetaCycle results to RAIN results.
    - Run `1_compare24VsHarm.R` to quantify the similarities and differences across results.
    - Run `2*.R` scripts to visualize the similarities and differences across results.
  - Navigate to the `2_ArserVsHarm` directory to compare ARSER results to RAIN results.
    - Run `1_compareAvsR.R` to quantify the similarities and differences across results.
    - Run `2*.R` to visualize the similarities and differences across results.
- Compare rhythmic features across tissues
  - Navigate to the `2_compareTissues24h` folder.
  - Run `1_differenceTissues` to find genes that are unique to each tissue.
  - Run `2_sharedTissues` to find genes that are shared across multiple tissues.
  - Run `3_sharedDiffParams.R` to see how shared genes have different or similar cycling patterns across tissues.
  - Run `4_plotCompareTissues.R` to visualize shared and different genes across tissues (Venn diagrams and UpSet plots).
  - Repeat for results from RAIN by navigating to the `2_compareTissuesHarm` folder and running scripts with the aforementioned names.
- Compare cycling genes to genes previously associated with chronotype.
  - Navigate to the `2_compareGwasChrono24h` folder.
  - Run the `1_chronotypeGWAS.R` script.
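The tissue comparison described above comes down to set operations on lists of cycling gene IDs. The sketch below shows the idea with plain text files holding one ID per line; it is not the project's R code, and the function names, file names, and gene IDs are made up for illustration.

```shell
# shared_genes A B: IDs present in both files.
# unique_genes A B: IDs present in A but not in B.
shared_genes() { grep -Fxf "$2" "$1" | sort -u; }
unique_genes() { grep -Fxvf "$2" "$1" | sort -u; }

# Demo with made-up gene lists for two tissues.
demo="$(mktemp -d)"
printf 'geneA\ngeneB\ngeneC\n' > "$demo/cortex.txt"
printf 'geneB\ngeneC\ngeneD\n' > "$demo/striatum.txt"
shared_genes "$demo/cortex.txt" "$demo/striatum.txt"   # geneB and geneC
unique_genes "$demo/cortex.txt" "$demo/striatum.txt"   # geneA
```

`grep -Fx` matches whole lines as fixed strings, so an ID is never matched as a substring of a longer ID.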
- Prepare an ensembl ID to mature mirbase ID conversion table with `7_0_mirPrep/miRNAConvertTable24h.R`.
- Identify experimentally validated and predicted targets with MultiMiR.
  - Navigate to `7.1_multiMiR`.
  - Run `1_querymultiMiR24h.R` to query the package.
  - Run `2_processmultiMiR24h.` to explore and format the results.
- Navigate to `./7.2_dynOmics`.
- Run the `1_runDynOmics.sh` script to identify cycling mRNA-microRNA pairs in each tissue.
  - Note that the Corpus Striatum is skipped because there were no cycling microRNAs.
- Run `2_examineAssocs.R` and `2_subsetMultiCycl.R` to inspect/summarize the results.
- Run `3_compareTissueAssocs.R` to compare results across tissues.
- Run `3_plotAssocs.R` and `3_compareTissueAssocsUpset.R` to plot the results.
- Navigate to `./8_wgcna`.
- Use `1_runWgcna.sh` to run `1_makeNetworkMRNA.R` and `1_makeNetworkMiRNA.R` remotely.
  - Input is the log2 TMM counts of features that remain after non-specific filtering.
  - Determine the optimal soft threshold for a signed network. We use a scale-free topology fit threshold of 0.85 for plotting.
  - Calculate the similarity between features (`similarity = (correlation + 1) / 2`). What is the relationship between gene A and gene B?
  - Calculate the adjacency (`similarity^soft threshold`). Adds weighting to connectivity between genes.
  - Calculate the signed TOM matrix. What is the relationship between the neighbors of gene A and gene B?
  - Cluster genes using the dissimilarity TOM matrix (`1 - TOM`).
  - Use dynamic tree cutting to identify modules. Relevant parameters: `deepSplit = 4, pamRespectsDendro = FALSE, minClusterSize = 30`
  - Calculate eigengenes.
  - Merge modules with high correlations between eigengenes. The maximum dissimilarity that qualifies for merging is 0.3 (`cutHeight = 0.3`).
  - Calculate module membership. Essentially, the Pearson correlation between module eigengenes and the feature’s expression.
  - Get hub genes. In this case, the hub gene is the gene with the highest connectivity in a module.
  - Annotate results with what we already know about the genes, i.e. make the `geneInfo` dataframes.
- Run the `2.a_cyclEnrich*.R` scripts to run a hypergeometric test between cycling genes in a tissue and modules in a tissue. These scripts also plot the results as bubble plots.
- Run the `2.b_cyclCompositionPlotting.R` script to plot the composition of period categories and cycling genes in cycling modules.
- Run the `3.a_cellTypeMRNA.R` script to approximate cell-types for mRNA cycling modules.
- Run the `3.b_mRNAmiRNAassoc.R` script to find the correlation between mRNA-miRNA eigengenes.
- Run the `4.a_cytoscape.R` script to prepare files necessary for plotting modules as a graph-network w/ cytoscape.
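The similarity and adjacency steps of the network construction above can be checked by hand for a single pair of features. The sketch below applies `similarity = (correlation + 1) / 2` and then raises it to a soft-threshold power. The function name and the power of 12 are illustrative assumptions; the power actually chosen by the project scripts is not stated here.

```shell
# Hypothetical helper: signed-network adjacency for one correlation value.
# Usage: signed_adjacency CORRELATION SOFT_POWER
signed_adjacency() {
  awk -v r="$1" -v beta="$2" 'BEGIN {
    s = (r + 1) / 2            # signed similarity, maps [-1, 1] to [0, 1]
    printf "%.4f\n", s ^ beta  # adjacency = similarity^soft threshold
  }'
}

signed_adjacency 0.8 12    # similarity 0.9 -> prints 0.2824
signed_adjacency -0.8 12   # similarity 0.1 -> prints 0.0000
```

The two calls show why the signed transform matters: a strong negative correlation ends up with an adjacency near zero instead of being treated like a strong positive one.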
- Navigate to the `./9_enrichment` directory.
- To perform pathway analysis of MetaCycle and RAIN results, navigate to the `rhythmicity` directory. Run `1_runGprof.sh` to query g:Profiler and plot results. Run `2_compareTissues.R` to compare tissues and plot results.
- To analyze DynOmics and WGCNA results, repeat the above steps, except with the `dynOmics` and `wgcna` folders.
- Navigate to the `compareGwasChrono` directory and run `1_gprofGwas.R` to perform pathway analysis on genes that are both cycling and previously linked to human chronotype susceptibility.