
Preface

Data analysis for “Site- and cell-type-specific miRNA and mRNA genes and networks across the cortex, striatum, and hypothalamus”.

Setup

Important:

Consider reading the README.html file which has a floating table of contents.
This project assumes you are using resources from the Centre for Advanced Computing (CAC).
- The CAC uses SLURM to allocate jobs.
- It is highly recommended that you use a cloud computing system. You may need to edit scripts to load dependencies in a manner compatible with your system.
Ensure all scripts and data are stored in an R project folder.
Script names are numbered so the order of execution is more obvious.
Set the R current working directory to the project working directory. Most scripts assume that the project directory is the current working directory.
Caution! Some scripts use absolute paths (especially bash scripts).
- Run the following commands in the terminal to replace the absolutePath placeholder found in scripts with the absolute path to your project directory (substitute your own path for /my/custom/path).
```
find . -type f -name "*.sh" -exec sed -i'' -e 's#absolutePath#/my/custom/path#g' {} +
find . -type f -name "*.R" -exec sed -i'' -e 's#absolutePath#/my/custom/path#g' {} +
```
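After running the replacements, it can help to confirm that no script still contains the placeholder. The following is a minimal sketch, assuming the placeholder string is exactly `absolutePath` and that only `.sh` and `.R` files matter. The function name `list_unreplaced` is hypothetical and not part of the project.

```bash
# List .sh and .R files under a directory that still contain the placeholder.
# list_unreplaced is a hypothetical helper, not part of the project.
list_unreplaced() {
  local dir="${1:-.}"
  # grep -l prints only the names of matching files; sort gives stable output.
  # "|| true" keeps the exit status at 0 when nothing matches.
  find "$dir" -type f \( -name "*.sh" -o -name "*.R" \) \
    -exec grep -l "absolutePath" {} + | sort || true
}
```

Calling `list_unreplaced .` from the project directory should print nothing once every placeholder has been replaced.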

Primary session info:

R version 4.4.0 (2024-04-24)
Platform: x86_64-redhat-linux-gnu (64-bit)
Running under: CentOS Linux 7 (Core)
Matrix products: default
BLAS/LAPACK: /usr/lib64/R/lib/libRblas.so

Packages:

| Package | Version |
| --- | --- |
| arrayQualityMetrics | 3.60.0 |
| Biobase | 2.64.0 |
| biomaRt | 2.60.0 |
| cividis | 0.2.0 |
| colorspace | 2.1-0 |
| ComplexHeatmap | 2.20.0 |
| ComplexUpset | 1.3.3 |
| cowplot | 1.1.3 |
| DESeq2 | 1.44.0 |
| devtools | 2.4.5 |
| dplyr | 1.1.4 |
| DT | 0.33 |
| dynOmics | 1.0 |
| edgeR | 4.2.0 |
| GEOquery | 2.72.0 |
| ggplot2 | 3.5.1 |
| ggpubr | 0.6.0 |
| gprofiler2 | 0.2.3 |
| Hmisc | 5.1-2 |
| htmltools | 0.5.8.1 |
| IsoformSwitchAnalyzeR | 1.18.0 |
| knitr | 1.46 |
| limma | 3.60.0 |
| lmms | 1.3.3 |
| MetaCycle | 1.2.0 |
| miRBaseConverter | 1.11.1 |
| multiMiR | 1.26.0 |
| optparse | 1.7.5 |
| patchwork | 1.2.0 |
| pheatmap | 1.0.12 |
| purrr | 1.0.2 |
| rain | 1.38.0 |
| RColorBrewer | 1.1-3 |
| readxl | 1.4.3 |
| renv | 1.0.7 |
| rmarkdown | 2.26 |
| rsconnect | 1.3.1 |
| rtracklayer | 1.64.0 |
| scales | 1.3.0 |
| shiny | 1.8.1.1 |
| shinythemes | 1.2.0 |
| stringr | 1.5.1 |
| tibble | 3.2.1 |
| tidyr | 1.3.1 |
| tidyverse | 2.0.0 |
| UpSetR | 1.4.0 |
| VennDiagram | 1.7.3 |
| WGCNA | 1.72-5 |

Main pipeline

Helpers

Notice the 0_helpers folder. This directory contains many R functions that minimize repetition of code and are generally helpful.

Download data

Navigate to the 0_data folder.
- R current working directory remains the project working directory
- Terminal working directory becomes ./0_data by running cd ./0_data in the command line
Make the following folders: seqreads and series.
Manually download SRA Accession Lists and SRA metadata / run tables from…
1. mRNA: BioProject PRJNA636378
2. microRNA: BioProject PRJNA636377
Run bash scripts beginning with download.
1. Note that these files use absolute paths
2. Dependencies: StdEnv/2020 gcc/9.3.0 sra-toolkit/2.10.8
3. For every accession ID in a dataset, prefetch, fastq-dump, and gzip the relevant data
4. Use .out logs to monitor download progress
5. The scripts do not parallelize file downloads; parallelizing them (for example, with multiple threads) is recommended for future users.
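The per-accession loop described in step 3 can be sketched as follows. This is a hedged illustration, not the project's download scripts: the function name `download_accessions`, the `DRY_RUN` switch, and the choice of `--split-files` are assumptions, and the accession list is assumed to be a plain text file with one run ID per line.

```bash
# Sketch of a per-accession download loop (hypothetical helper, not the project's script).
# Usage: download_accessions <accession_list.txt> <output_dir>
# Set DRY_RUN=1 to print the commands instead of running them.
download_accessions() {
  local acc_list="$1" out_dir="$2" id
  while IFS= read -r id || [ -n "$id" ]; do
    [ -z "$id" ] && continue    # skip blank lines
    if [ "${DRY_RUN:-0}" = "1" ]; then
      echo "prefetch $id"
      echo "fastq-dump --split-files --outdir $out_dir $id"
      echo "gzip $out_dir/${id}*.fastq"
    else
      prefetch "$id"                                        # fetch the .sra file
      fastq-dump --split-files --outdir "$out_dir" "$id"    # convert to FASTQ
      gzip "$out_dir/${id}"*.fastq                          # compress the FASTQ files
    fi
  done < "$acc_list"
}
```

Running with `DRY_RUN=1` first is a cheap way to check the accession list before starting a long download. With `--split-files`, paired-end runs produce `_1` and `_2` files, while single-end runs produce a single `_1` file.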
Prepare metadata/coldata. Note: coldata will be used interchangeably with metadata.
1. Download GEO metadata.
  1. Run ./1_readSeriesMatrix.R, which downloads the GEO series matrices in txt format and converts them to csv.
  - If you prefer not to use the GEOquery package, it is likely possible to use the txt files directly.
2. Run 2_makeColdata.R to make coldata files by merging the SRA run tables and series matrices. This script also creates new ctTime and ztTime columns and removes columns that are not directly needed for the project.
Run ./3_timeDesign.R to inspect the number of samples per timepoint, tissue, and sequence type.

QC of sequencing reads (1)

Navigate to the 1_qcSeqReads/1_qcB4Trim folder.
- R current working directory remains the project working directory
- Terminal working directory becomes ./1_qcSeqReads/1_qcB4Trim by running cd ./1_qcSeqReads/1_qcB4Trim in the command line
Run 1_writeFastqcScripts.R to generate individual FastQC scripts.
- Rather than running quality control on every sample in a loop, run multiple scripts at once.
Execute fastqc scripts. Do not execute all scripts at once! I recommend running 10 at a time. Use 2_checkSuccess.R and jobsToRun.sh to ensure all jobs have been run!
```
# 1 cpu, max 10 gigabytes of memory
module load StdEnv/2020
module load nixpkgs/16.09
module load fastqc/0.11.9

fastqc -f fastq -o $OUTDIR $INDATAPATH
```
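One way to follow the ten-at-a-time recommendation above is to submit the generated scripts in fixed-size batches. The following is a minimal sketch, assuming the scripts are submitted with SLURM's `sbatch`. The function name `submit_batch`, its arguments, and the `DRY_RUN` switch are hypothetical; it does not replace 2_checkSuccess.R and jobsToRun.sh for tracking which jobs finished.

```bash
# Submit one batch of job scripts from a directory (hypothetical helper).
# Usage: submit_batch <script_dir> <batch_number> [batch_size]
# Batch 1 covers scripts 1-10, batch 2 covers scripts 11-20, and so on.
# Set DRY_RUN=1 to print the sbatch commands instead of submitting.
submit_batch() {
  local dir="$1" batch="$2" size="${3:-10}"
  local first=$(( (batch - 1) * size + 1 ))
  local last=$(( batch * size ))
  # Sort the script names so batches are reproducible, then keep lines first..last.
  find "$dir" -maxdepth 1 -type f -name "*.sh" | sort | sed -n "${first},${last}p" |
    while IFS= read -r script; do
      if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "sbatch $script"
      else
        sbatch "$script"
      fi
    done
}
```

For example, `DRY_RUN=1 submit_batch ./indivScripts 2` would list the eleventh to twentieth scripts without submitting them (the directory name here is a placeholder).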
Run 3_writeMultiqcScripts.R to generate a MultiQC script for each tissue.

Execute multiqc scripts.

```
# 1 cpu, max 1 GB of memory
module load StdEnv/2020 python/3.9.6
#pip install --user multiqc
#pip install --user --upgrade multiqc

# Begin MultiQC
multiqc \
  --outdir $OUTDIR \
  --filename $FILENAME \
  --force \
  --interactive \
  --cl_config "fastqc_config: { fastqc_theoretical_gc: mm10_txome }" \
  $FQPATHS
```

Clean sequencing reads

Clean mRNA reads with Trimmomatic only.

Navigate to ./2_trimMRNA.
Run 1_writeIndivScripts.R.

Run individual scripts. Use 2_checkSuccess.R and jobsToRun.sh to ensure all jobs have been run.

```
# 5 cpu, max 5 GB memory
# Dependencies
module load nixpkgs/16.09 trimmomatic/0.36 # trimmomatic

# Begin Trimmomatic
java -jar /cvmfs/soft.computecanada.ca/easybuild/software/2017/Core/trimmomatic/0.36/trimmomatic-0.36.jar PE \
-threads 10 \
$FWDPATH $REVPATH \
${OUTPATH}/${ID}_1.pair.trim.fastq.gz ${OUTPATH}/${ID}_1.unpair.trim.fastq.gz \
${OUTPATH}/${ID}_2.pair.trim.fastq.gz ${OUTPATH}/${ID}_2.unpair.trim.fastq.gz \
SLIDINGWINDOW:4:20 MINLEN:36
```

- SLIDINGWINDOW:4:20 = scan the read with a 4-base sliding window and cut the read once the average Phred quality score within the window drops below 20
- MINLEN:36 = drop a read if it is shorter than 36 bases after trimming

Clean microRNA reads with CutAdapt and Trimmomatic.

Navigate to ./2_trimMiRNA.
Run 1_writeIndivScripts.R.

Run individual scripts. Use 2_checkSuccess.R and jobsToRun.sh to ensure all jobs have been run.

```
# 5 cpu, max 5 GB memory
# Begin CutAdapt
cutadapt --cores 10 \
  --adapter TGGAATTCTCGGGTGCCAAGG \
  --error-rate 0.25 \
  --no-indels \
  --minimum-length 15 \
  --overlap 6 \
  --times 1 \
  --match-read-wildcards \
  --untrimmed-output ${OUTPATH}/cutAdapt/${ID}.NO3AD.fastq.gz \
  --too-short-output ${OUTPATH}/cutAdapt/${ID}.short.fastq.gz \
  --output ${OUTPATH}/cutAdapt/${ID}.cutClean.fastq.gz \
  $FWDPATH
#
# Begin Trimmomatic
java -jar /cvmfs/soft.computecanada.ca/easybuild/software/2017/Core/trimmomatic/0.36/trimmomatic-0.36.jar SE \
  -threads 10 \
  ${OUTPATH}/cutAdapt/${ID}.cutClean.fastq.gz \
  ${OUTPATH}/${ID}.trim.fastq.gz \
  SLIDINGWINDOW:4:20 MINLEN:15
```

- CutAdapt parameters are taken from the ENCODE project’s pipeline
- -a (--adapter) = 3 prime adapter sequence, from the Illumina website
- -e (--error-rate) = maximum allowed error rate when finding adapters
- --no-indels = no indels when matching adapters
- -m (--minimum-length) = minimum processed read length
- -O (--overlap) = minimum overlap between adapter and read sequence, ignored for anchored adapters

QC of sequencing reads (2)

Repeat steps from QC of sequencing reads (1). For mRNA, navigate to ./1_qcSeqReads/2_qcAftTrim. For miRNA, navigate to ./1_qcSeqReads/2_qcAftCutTrim.

Download and index reference genome

Navigate to the ./0_resources/gencode folder.
Make the following folders: indexHisat2 and indexStar.
Run the 1_downloadGencode.sh script to get the primary fasta and gtf files for GRCm39.
Run the 2_subsetMiRNA.sh script to extract features whose transcript_type is “miRNA” from the gtf file.
Run the 3_indexHisat2 script to prepare the reference genome for Hisat2 alignment. This script uses hisat2_extract_splice_sites.py and hisat2_extract_exons.py to improve Hisat2’s handling of splice sites.
Run the 3_indexStar.sh script to prepare the reference genome for STAR alignment. --sjdbOverhang 50 is passed during indexing because the maximum read length is 51 and STAR recommends a value of maximum read length minus 1.
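The `--sjdbOverhang` value follows STAR's guideline of maximum read length minus 1. Below is a minimal sketch of how that number could be derived from a gzipped FASTQ file. The function name `sjdb_overhang` is hypothetical, and in practice the maximum should be taken over all samples rather than a single file.

```bash
# Print (maximum read length - 1) for a gzipped FASTQ file (hypothetical helper).
# In a 4-line FASTQ record, the sequence is the 2nd line (line number mod 4 == 2).
sjdb_overhang() {
  gunzip -c "$1" |
    awk 'NR % 4 == 2 { if (length($0) > max) max = length($0) }
         END { print max - 1 }'
}
```

A file whose longest read is 51 bases would give 50, matching the value used for indexing here.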

Align and quantify microRNA reads

Navigate to the 3_alignStar folder.
Generate an individual script for each sample with 1_writeStarScripts.R.
Run scripts. Use 2_checkSuccess.R and jobsToRun.sh to ensure all jobs have been run.
```
# 5 cpus, 40 GB max memory. Each script takes ~12 minutes.
module load StdEnv/2020 gcc/9.3.0 star/2.7.9a samtools/1.13
# align and quantify
# Parameters are stored in a bash array so that each flag is passed as its own argument.
# (Inside single quotes, a trailing backslash is kept literally and would be passed to STAR.)
PARAMS=(
  --runThreadN 10 --alignEndsType EndToEnd
  --outFilterMismatchNmax 1 --outFilterMultimapScoreRange 0
  --quantMode TranscriptomeSAM GeneCounts --outReadsUnmapped Fastx
  --outSAMtype BAM SortedByCoordinate --outFilterMultimapNmax 10
  --outSAMunmapped Within --outFilterScoreMinOverLread 0
  --outFilterMatchNminOverLread 0 --outFilterMatchNmin 16
  --alignSJDBoverhangMin 1000 --alignIntronMax 1
  --outWigType wiggle --outWigStrand Stranded --outWigNorm RPM
)
STAR --genomeDir $INDEX \
  --sjdbGTFfile $MIRNA_GTF \
  --readFilesCommand gunzip -c \
  --readFilesIn $SINGLE_END_1 \
  --outFileNamePrefix ${OUT_PATH}. \
  "${PARAMS[@]}"
samtools index -@ 20 ${OUT_PATH}.Aligned.sortedByCoord.out.bam
```
- STAR parameters are taken from the ENCODE project’s pipeline
- --alignEndsType EndToEnd = force end-to-end read alignment (no soft-clipping)
- --outFilterMismatchNmax 1 = maximum number of mismatches in an alignment
- --outFilterMultimapScoreRange 0 = if a read maps to multiple regions, only alignments with a score matching the best alignment will be output
- --quantMode TranscriptomeSAM GeneCounts = output a sam/bam file with transcript alignments and a matrix with the number of reads aligning to each “gene”
- --outReadsUnmapped Fastx = output unmapped or partially mapped reads to separate fasta/fastq files
- --outSAMtype BAM SortedByCoordinate = output a bam file that is sorted by coordinate
- --outFilterMultimapNmax 10 = the default, maximum number of loci a read can align to
- --outSAMunmapped Within = output unmapped reads within the main alignment file
- --outFilterScoreMinOverLread 0 = alignment will be output only if its score is higher than or equal to this value, normalized to read length
- --outFilterMatchNminOverLread 0 = alignment will be output only if the number of matched bases is higher than or equal to this value, normalized to read length
- --outFilterMatchNmin 16 = same as outFilterMatchNminOverLread, but not normalized. In other words, an alignment must have at least 16 matched bases.
- --alignSJDBoverhangMin 1000 = minimum overhang (i.e. block size) for annotated spliced alignments
- --alignIntronMax 1 = maximum intron length
- … remaining parameters are for generating wiggle files, which can be used to visualize results with the UCSC genome browser or Integrative Genomics Viewer.
Run the 3_getRates.R script to get an overview of STAR’s alignment rates.

Align and quantify mRNA reads

Alignment

Navigate to ./3_alignHisat2
Generate an alignment script for each sample with 1_writeHisat2Scripts.R.

Run scripts. Use 2_checkSuccess.R and jobsToRun.sh to ensure all jobs have been run.

```
# 5 cpu, 15 GB max memory, each script takes ~1-2 hours
module load StdEnv/2020 samtools/1.10 hisat2/2.2.1
echo Alignment started at $(date +'%T')
hisat2 -p 7 -x $INDEX -1 $PAIRED_END_1 -2 $PAIRED_END_2 \
  --dta --sensitive --no-discordant --no-mixed \
  --summary-file $SUMMARY_PATH --time --verbose \
  -S ${ALIGN_PATH}.sam

echo Samtools processing started at $(date +'%T')
samtools view -b -@ 7 ${ALIGN_PATH}.sam > ${ALIGN_PATH}.bam
rm ${ALIGN_PATH}.sam
echo collate started at $(date +'%T')
samtools collate -@ 7 -o ${ALIGN_PATH}.col.bam ${ALIGN_PATH}.bam ${ALIGN_PATH}_tmpcol
rm ${ALIGN_PATH}.bam
echo fixmate started at $(date +'%T')
samtools fixmate -m -@ 7 ${ALIGN_PATH}.col.bam ${ALIGN_PATH}.fix.bam
rm ${ALIGN_PATH}.col.bam
echo sort started at $(date +'%T')
samtools sort -@ 7 -T ${ALIGN_PATH}_sort -o ${ALIGN_PATH}.sort.bam ${ALIGN_PATH}.fix.bam
rm ${ALIGN_PATH}.fix.bam
echo markdup started at $(date +'%T')
samtools markdup -@ 7 -T ${ALIGN_PATH}_tmpmrk -s ${ALIGN_PATH}.sort.bam ${ALIGN_PATH}.sort.mrkdup.bam
rm ${ALIGN_PATH}.sort.bam
echo index started at $(date +'%T')
samtools index -b -@ 7 ${ALIGN_PATH}.sort.mrkdup.bam ${ALIGN_PATH}.sort.mrkdup.bam.bai
```

Hisat2 parameters are adapted from the Beijing Genomics Institute’s arguments (example dataset 1, example dataset 2).
- --dta = report alignments tailored for transcript assemblers like StringTie; requires longer anchor lengths for novel splice sites
- --sensitive = same as --bowtie2-dp 1 -k 30 --score-min L,0,-0.5
  - --bowtie2-dp 1 = use Bowtie2’s conditional dynamic programming.
  - -k 30 = search for at most 30 distinct primary alignments for each read. Default = 5 (linear index) or 10 (graph index).
  - --score-min L,0,-0.5 = minimum score function for an alignment to be valid: f(x) = 0 + -0.5 * x, where x = read length. Default = L,0,-0.2.
- --no-discordant = don’t report discordant alignments (both mates align uniquely, but the pair violates the paired-end constraints)
- --no-mixed = don’t try to find alignments for individual mates after hisat2 fails to identify concordant/discordant alignments
Samtools is used to 1) convert SAM to BAM, 2) sort the BAM file and mark duplicates, and 3) index the BAM file. Marking duplicates and sorting by coordinate follow the example workflow from the samtools-markdup manual (author: Andrew Whitwham, Sanger Institute).

Run the 3_getRates.R script to get an overview of Hisat2’s alignment rates.

Quantification

Navigate to ./4_stringtie
Run 1_writePass1IndivScripts.R to write individual scripts for pass 1. Execute scripts in the pass1IndivScripts directory. Use the 2_checkSuccess.R and jobsToRun.sh scripts to monitor progress.
```
# 1 cpu, 5 GB memory
# REF_GTF is the full GTF file from Gencode
module load StdEnv/2020 stringtie/2.1.5
stringtie $INPUT -p 5 -G $REF_GTF -o $OUT_GTF
```
Run 3_writeGtfLists.R and 4.0_writeMergeScripts.R to prepare the merging of individual GTFs from pass 1. Tissues are kept separate!

Run the *.sh files in the 4_merge folder to merge the GTF files.

```
# 5 cpu, 3 GB memory
module load StdEnv/2020 stringtie/2.1.5
stringtie --merge -p 20 -o $OUTPUT -G $REF_GTF $GTFS_LIST
```

Evaluate StringTie performance with 5.1_writeGffCompareScripts.R and *.sh scripts in the 5_gffCompare folder.

Run 1_writePass2IndivScripts.R to write individual scripts for pass 2. Execute scripts in the pass2IndivScripts directory. Use the 2_checkSuccess.R and jobsToRun.sh scripts to monitor progress.

```
# 1 cpu, 5 GB memory
# REF_GTF is the merged gtf that corresponds to this sample's tissue
module load StdEnv/2020 stringtie/2.1.5
stringtie $INPUT -b $BALL -e -p 5 -G $REF_GTF -o $OUT_GTF
```

To generate gene count matrices, switch your R version to 4.2.1 and run 7_isoformAnalyzeR.R.
- This script uses an absolute path! Edit the script to use your project directory.

Data preparation

Navigate to 5_dataPrep
Clean miRNA count matrices
Navigate to the miRNA folder.
Run 0_id2name.R to get a dataframe with Ensembl ID to gene name/symbol conversion information.
Run 0_makeCountMats.R to merge gene count matrices from STAR into single dataframes, one per tissue.
Run 1_outlierRemoval.R to …
1. Perform outlier detection with arrayQualityMetrics. A sample is considered an outlier if
  - it is marked as an outlier before and after normalization by the same outlier detection metrics, and/or,
  - it is marked as an outlier by multiple outlier detection metrics after normalization
2. Normalize counts with the weighted trimmed mean of M-values (TMM) method
Run 2_corrReps.R to get the Spearman correlation between samples, before and after outlier removal.
Run 2_filtering.R to perform non-specific filtering to remove lowly expressed features (mean CPM < 1).
Repeat steps from Clean miRNA count matrices, except replace “miRNA” with “mRNA” in the folder name.

Caveats:
1. No need to merge count matrices for each tissue.
2. Samples SRR11902345 and SRR11902411 were manually removed (hard-coded) from the hypothalamus dataframes upon inspection of the PCA.

Inspect the number of samples per tissue and timepoint after sample removal with timeDesign.R
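
The non-specific filtering performed by 2_filtering.R (dropping features whose mean CPM is below 1) can be illustrated outside of R. The following is a minimal sketch, assuming a tab-separated count matrix with a header row, feature IDs in the first column, and one column per sample. The function name `filter_mean_cpm` is hypothetical, and it uses raw column sums as library sizes, which may differ from how the project's R scripts compute CPM.

```bash
# Keep features whose mean counts-per-million (CPM) across samples is >= a threshold.
# Usage: filter_mean_cpm <counts.tsv> [min_mean_cpm]
filter_mean_cpm() {
  local counts="$1" min_cpm="${2:-1}"
  # The file is read twice: pass 1 sums each sample column (library size),
  # pass 2 converts counts to CPM and prints rows that pass the threshold.
  awk -F '\t' -v min="$min_cpm" '
    FNR == 1 { if (NR != FNR) print; next }          # header: print on pass 2 only
    NR == FNR { for (i = 2; i <= NF; i++) lib[i] += $i; next }
    {
      total = 0
      for (i = 2; i <= NF; i++) if (lib[i] > 0) total += $i / lib[i] * 1e6
      if (total / (NF - 1) >= min) print
    }
  ' "$counts" "$counts"
}
```

The header is passed through unchanged, so the output can be read back into R as a filtered count matrix.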

Identify cycling genes

Navigate to ./6_rhythmicity

Identify cycling mRNAs

Navigate to 1_mRNA24h

Run the 1_metacycle script to identify 24-hour period cycling features. Cycling genes have a combined, Benjamini-Hochberg corrected p-value that is less than 0.05.

```
meta2d(
    infile = inPath,
    outdir = outPath,
    filestyle = "csv",
    minper = 24,
    maxper = 24,
    timepoints = "line1",
    outputFile = TRUE,
    combinePvalue = "fisher",
    cycMethod = c("JTK", "LS"),
    nCores = 1
)
```

Summarize results with 2_summarizeRes.R
To inspect/visualize results, run 3_make*.R scripts.

Repeat the steps above with RAIN in the 1_mRNAHarm directory. Cycling genes have a period of 4 to 30 hours and an adjusted p-value < 0.05.

```
rain(
    x = df,
    deltat = 3, # sampling interval
    period = 17, # period to search for
    period.delta = 13, # period +/- delta to consider (17 +/- 13 = 4 to 30 hours)
    peak.border = c(0.3, 0.7), # default
    measure.sequence = sequenceLists[[set]],
    method = "independent",
    # MC options are bonferroni, benjamini-hochberg (BH), or adaptive BH (ABH).
    adjp.method = "ABH",
    verbose = TRUE
)
```

Repeat the steps above with ARSER from MetaCycle in the 1_mRNAArser directory. Cycling genes have a period of 4 to 30 hours and an adjusted p-value < 0.05.
```
meta2d(
    infile = inPath,
    outdir = outPath,
    filestyle = "csv",
    minper = 4,
    maxper = 30,
    timepoints = "line1",
    outputFile = TRUE,
    combinePvalue = "fisher",
    cycMethod = "ARS",
    nCores = 1
)
```
- Run 1.5_getDomCycle.R to select the cycle with the largest amplitude when one gene has multiple cycles. Run this script before running 2*.R and 3*.R scripts.

Identify cycling miRNAs
1. Repeat steps from Identify cycling mRNAs, except replace “mRNA” with “miRNA” in folder and filenames.
  - JTK cycle will not run for the Corpus Striatum because there are no samples from time 15.
Compare MetaCycle vs RAIN vs ARSER
1. Navigate to the 2_24hVsHarm directory to compare MetaCycle results to RAIN results.
  1. Run 1_compare24VsHarm.R to quantify the similarities and differences across results.
  2. Run 2*.R scripts to visualize the similarities and differences across results.
2. Navigate to the 2_ArserVsHarm directory to compare ARSER results to RAIN results.
  1. Run 1_compareAvsR.R to quantify the similarities and differences across results.
  2. Run 2*.R to visualize the similarities and differences across results.
Compare rhythmic features across tissues
1. Navigate to the 2_compareTissues24h folder.
2. Run 1_differenceTissues to find genes that are unique to each tissue.
3. Run 2_sharedTissues to find genes that are shared across multiple tissues.
4. Run 3_sharedDiffParams.R to see how shared genes have different or similar cycling patterns across tissues.
5. Run 4_plotCompareTissues.R to visualize shared and different genes across tissues (Venn diagrams and UpSet plots).
6. Repeat for results from RAIN by navigating to the 2_compareTissuesHarm folder and running the scripts with the same names.
Compare cycling genes to genes previously associated with chronotype.
1. Navigate to the 2_compareGwasChrono24h folder.
2. Run the 1_chronotypeGWAS.R script.

Pairwise association between microRNA and mRNAs

Find previously observed targets of cycling microRNAs

Prepare an Ensembl ID to mature miRBase ID conversion table with 7.0_mirPrep/miRNAConvertTable24h.R
Identify experimentally validated and predicted targets with multiMiR.

Navigate to 7.1_multiMiR.
Run 1_querymultiMiR24h.R to query the package.
Run 2_processmultiMiR24h. to explore and format the results.

DynOmics

Navigate to ./7.2_dynOmics
Run the 1_runDynOmics.sh script to identify cycling mRNA-microRNA pairs in each tissue.
- Note that the Corpus Striatum is skipped because there were no cycling microRNAs.
Run 2_examineAssocs.R and 2_subsetMultiCycl.R to inspect/summarize the results.
Run 3_compareTissueAssocs.R to compare results across tissues.
Run 3_plotAssocs.R and 3_compareTissueAssocsUpset.R to plot the results.

Networks of mRNA and microRNAs

Create co-expression networks

Navigate to ./8_wgcna
Use 1_runWgcna.sh to run 1_makeNetworkMRNA.R and 1_makeNetworkMiRNA.R remotely.
1. Input is the log2 TMM counts of features that remain after non-specific filtering.
2. Determine the optimal soft threshold for a signed network. We use a scale-free topology fit threshold of 0.85 for plotting.
3. Calculate the similarity between features (similarity = (correlation + 1) / 2). What is the relationship between gene A and gene B?
4. Calculate the adjacency (similarity ^ soft threshold). This adds weighting to the connectivity between genes.
5. Calculate the signed topological overlap matrix (TOM). What is the relationship between the neighbors of gene A and gene B?
6. Cluster genes using the TOM dissimilarity matrix (1 - TOM).
7. Use dynamic tree cutting to identify modules. Relevant parameters: deepSplit = 4, pamRespectsDendro = FALSE, minClusterSize = 30.
8. Calculate eigengenes.
9. Merge modules with high correlations between eigengenes. The maximum dissimilarity that qualifies for merging is 0.3 (cutHeight = 0.3).
10. Calculate module membership: essentially, the Pearson correlation between a module eigengene and a feature’s expression.
11. Get hub genes. In this case, the hub gene is the gene with the highest connectivity in a module.
12. Annotate results with what we already know about the genes, i.e. make the geneInfo dataframes.
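
Steps 3, 4, and 6 can be written compactly. The notation is an assumption introduced here for illustration (the scripts remain the reference): $x_i$ is the expression profile of feature $i$, $\beta$ is the soft threshold chosen in step 2, and $\mathrm{TOM}_{ij}$ is the topological overlap from step 5.

```latex
s_{ij} = \frac{1 + \operatorname{cor}(x_i, x_j)}{2}, \qquad
a_{ij} = s_{ij}^{\beta}, \qquad
d_{ij} = 1 - \mathrm{TOM}_{ij}
```

Because $s_{ij}$ lies between 0 and 1, raising it to a power $\beta > 1$ shrinks weak similarities far more than strong ones. With an arbitrary illustrative value of $\beta = 12$ (not necessarily the value used in this project), $s = 0.9$ gives $a \approx 0.28$ while $s = 0.5$ gives $a \approx 0.00024$.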

Investigate modules

Run the 2.a_cyclEnrich*.R scripts to run a hypergeometric test between cycling genes in a tissue and modules in a tissue. These scripts also plot the results as bubble plots.
Run the 2.b_cyclCompositionPlotting.R script to plot the composition of period categories and cycling genes in cycling modules.
Run the 3.a_cellTypeMRNA.R script to approximate cell-types for mRNA cycling modules.
Run the 3.b_mRNAmiRNAassoc.R script to find the correlation between mRNA-miRNA eigengenes.
Run the 4.a_cytoscape.R script to prepare the files needed to plot modules as a network graph with Cytoscape.
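
The enrichment test in the 2.a_cyclEnrich*.R scripts is described as hypergeometric. Under the usual over-representation setup (an assumption; the scripts define the exact background and tail), let $N$ be the number of features in the tissue's background, $K$ the number of cycling features, $n$ the size of a module, and $k$ the number of cycling features inside that module. The one-sided p-value is then:

```latex
P(X \ge k) = \sum_{i=k}^{\min(n, K)} \frac{\binom{K}{i}\binom{N-K}{n-i}}{\binom{N}{n}}
```

In R, this quantity corresponds to `phyper(k - 1, K, N - K, n, lower.tail = FALSE)`; whether the scripts call `phyper` in exactly this form is not confirmed here.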

Pathway enrichment analysis

Navigate to the ./9_enrichment directory.
To perform pathway analysis of MetaCycle and RAIN results, navigate to the rhythmicity directory. Run 1_runGprof.sh to query g:Profiler and plot results. Run 2_compareTissues.R to compare tissues and plot results.
To analyze DynOmics and WGCNA results, repeat the above steps, except with the dynOmics and wgcna folders.
Navigate to the compareGwasChrono directory and run 1_gprofGwas.R to perform pathway analysis on genes that are both cycling and previously linked to human chronotype susceptibility.

amzacharias/chronoCNS
