A Shiny app for dual and bulk RNA‑sequencing analysis
inDAGO supports both dual and bulk RNA-seq workflows within a single, user-friendly Shiny interface.
For dual RNA-seq, users can choose between two alignment strategies:
- Sequential mapping — reads are mapped separately to each reference genome
- Combined mapping — reads are aligned once to a merged reference genome
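To make the combined strategy concrete, the following is a minimal base R sketch of how a merged reference could be assembled from two FASTA files. It is not taken from inDAGO's source: the function name `merge_references` and the organism tags `orgA_` and `orgB_` are hypothetical, chosen only so that each sequence name records which genome it came from.

```r
# Sketch: build a merged reference by concatenating two FASTA files.
# The organism tags ("orgA_", "orgB_") are a hypothetical naming convention,
# not something inDAGO is documented to use.
merge_references <- function(fasta_a, fasta_b, out_fasta,
                             tag_a = "orgA_", tag_b = "orgB_") {
  tag_headers <- function(path, tag) {
    lines <- readLines(path)
    is_header <- startsWith(lines, ">")
    # Insert the tag right after ">" so every sequence name records its origin
    lines[is_header] <- paste0(">", tag, substring(lines[is_header], 2))
    lines
  }
  merged <- c(tag_headers(fasta_a, tag_a), tag_headers(fasta_b, tag_b))
  writeLines(merged, out_fasta)
  invisible(out_fasta)
}

# Toy usage with temporary files (both genomes contain a sequence named "chr1")
fa_a <- tempfile(fileext = ".fa")
writeLines(c(">chr1", "ACGT"), fa_a)
fa_b <- tempfile(fileext = ".fa")
writeLines(c(">chr1", "GGCC"), fa_b)

merged_fa <- merge_references(fa_a, fa_b, tempfile(fileext = ".fa"))
merged_lines <- readLines(merged_fa)
```

Tagging the headers also avoids name collisions when both genomes use the same sequence names, as the toy files do here.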
📊 See dual RNA-seq workflow ▸
Figure: Overview of the inDAGO dual RNA-seq workflow.
The workflow supports both sequential and combined mapping approaches and consists of seven steps. Steps 1, 2, 5, 6, and 7 are common to both approaches, whereas Steps 3 and 4 differ.
Step 1: Quality control of raw mixed reads (organism A + organism B, FASTQ format) using the Biostrings and ShortRead packages; visualizations are produced with ggplot2 and custom R scripts.
Step 2: Filtering of raw mixed reads using Biostrings and ShortRead.
Step 3: Genome indexing of reference sequences (FASTA) performed with Rsubread. In the sequential approach, each organism is indexed separately; in the combined approach, a concatenated genome is indexed once.
Step 4: Alignment of filtered reads, manipulation of SAM/BAM files, and in silico discrimination of mixed transcripts using Rsubread, Rsamtools, and base R functions. The sequential approach performs two mappings (one per organism), while the combined approach performs a single mapping followed by computational read separation.
Step 5: Assignment and summarization of mapped reads for each organism using Rsubread.
Step 6: Exploration of summarized counts through statistical and graphical analysis using ggplot2, pheatmap, Hmisc, and RNAseQC.
Step 7: Identification of differentially expressed genes (DEGs) with edgeR and HTSFilter.
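The computational read separation in Step 4 of the combined approach can be illustrated with a small base R sketch. It assumes, purely for illustration, that sequence names in the merged reference carry a hypothetical organism prefix (`orgA_` or `orgB_`) and that the alignment fields have already been loaded into a data frame. How inDAGO implements this step internally is not described here, so treat this as one possible approach rather than the actual method.

```r
# Toy alignment table standing in for fields one would read from a BAM file.
# Column names and the "orgA_"/"orgB_" prefixes are illustrative assumptions.
alignments <- data.frame(
  read_id = c("r1", "r2", "r3", "r4"),
  rname   = c("orgA_chr1", "orgB_chr1", "orgA_chr2", NA),  # NA = unmapped read
  stringsAsFactors = FALSE
)

# Assign each mapped read to an organism based on the reference-name prefix
assign_organism <- function(rname, tags = c(A = "orgA_", B = "orgB_")) {
  organism <- rep(NA_character_, length(rname))
  for (org in names(tags)) {
    hit <- !is.na(rname) & startsWith(rname, tags[[org]])
    organism[hit] <- org
  }
  organism
}

alignments$organism <- assign_organism(alignments$rname)

# Split read identifiers by organism; unmapped reads (NA) are dropped by split()
reads_by_organism <- split(alignments$read_id, alignments$organism)
```

In a real analysis the per-organism read sets would then be passed on to the summarization step separately.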
📊 See bulk RNA-seq workflow ▸
Figure: Overview of the inDAGO bulk RNA-seq workflow.
The bulk RNA-seq workflow follows seven key steps, mirroring the dual workflow but focused on a single organism.
Step 1: Quality control of raw reads.
Step 2: Filtering of low-quality sequences.
Step 3: Genome indexing of the reference genome (FASTA).
Step 4: Alignment of reads to the reference.
Step 5: Summarization of mapped reads by biological unit (e.g., gene).
Step 6: Statistical exploration and visualization of read counts.
Step 7: Identification of differentially expressed genes (DEGs).
The bulk RNA-seq workflow uses the same core set of R packages as the dual pipeline, ensuring consistency and reproducibility across analyses.
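For readers who want to see what Steps 3 to 5 look like outside the interface, here is a hedged sketch that calls Rsubread directly. The wrapper name `run_bulk_mapping`, the output file names, and the choice of single-end reads with default parameters are assumptions made for illustration. The settings inDAGO applies through its interface may differ.

```r
# Sketch of Steps 3-5 for a single organism using Rsubread directly.
# All paths are placeholders supplied by the caller; single-end reads assumed.
run_bulk_mapping <- function(genome_fasta, fastq, gtf, out_dir, threads = 2) {
  if (!requireNamespace("Rsubread", quietly = TRUE)) {
    stop("Rsubread is required for this sketch")
  }
  index_base <- file.path(out_dir, "genome_index")
  bam_file   <- file.path(out_dir, "aligned.bam")

  # Step 3: index the reference genome (FASTA)
  Rsubread::buildindex(basename = index_base, reference = genome_fasta)

  # Step 4: align reads to the indexed reference
  Rsubread::align(index = index_base, readfile1 = fastq,
                  output_file = bam_file, nthreads = threads)

  # Step 5: summarize mapped reads per gene using a GTF annotation
  Rsubread::featureCounts(files = bam_file, annot.ext = gtf,
                          isGTFAnnotationFile = TRUE, nthreads = threads)
}
```

The function returns the `featureCounts` result, whose `counts` element holds the gene-by-sample count matrix used in the later steps.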
The interface walks you step‑by‑step through the entire analysis, from raw reads to publication‑ready plots, and lets you:
- Download intermediate results at each step
- Export high‑quality figures directly for your manuscript
Thanks to optimized, parallelized code, inDAGO runs efficiently on a standard laptop (16 GB RAM), so you don’t need access to a high‑performance cluster.
- Quality Control
Generates quality control metrics and graphical plots.
📊 See plots ▸
Figure: Quality Control Module Outputs. This figure presents key quality control plots generated by inDAGO: (A) average base quality line plot; (B) sequence length distribution; (C) GC content distribution across reads; (D) base quality boxplot showing average and variation per base position; (E) base composition line plot; and (F) base composition area chart across the dataset. Together, these visualizations provide a comprehensive assessment of the sequencing quality and the overall characteristics of the raw read data.
- Sequence Pre‑processing
Read trimming, low‑quality filtering, and adapter removal
- Genome indexing
Index genome or genomes according to the selected approach (bulk or dual RNA‑seq)
- Reference‑based Alignment
Align reads according to the selected approach (bulk or dual RNA‑seq)
- Read Count Summarization
Generate gene or transcript level count matrices
- Exploratory Data Analysis
PCA, MDS, heatmaps, and more.
📊 See plots ▸
Figure: Exploratory Data Analysis Module Outputs. This figure presents key exploratory data analysis plots generated by inDAGO: (A) Principal Component Analysis (PCA) plot; (B) Multi-Dimensional Scaling (MDS) plot; (C) gene expression boxplot; (D) library size bar plot; (E) gene expression heatmap; (F) correlation heatmap; and (G) saturation plot. Together, these visualizations provide a comprehensive overview of the exploratory data analysis results and the underlying characteristics of the count data.
- Differentially Expressed Genes (DEGs) analysis
Identify differentially expressed genes/transcripts across comparisons
📊 See plots ▸
Figure: Differentially Expressed Genes (DEGs) Module Outputs. This figure presents key DEG analysis plots generated by inDAGO: (A) volcano plot; and (B) UpSet plot. Together, these visualizations provide a comprehensive overview of the differential expression analysis results and highlight key transcriptional changes between conditions.
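As a rough illustration of what sits behind a volcano plot, the base R sketch below labels genes as up-regulated, down-regulated, or not significant from a results table. The toy values, the column names `logFC` and `FDR`, and the cut-offs (absolute log fold change of 1, FDR of 0.05) are assumptions for the example. They are not inDAGO defaults.

```r
# Toy results table; in practice these columns would come from a DEG test
deg_table <- data.frame(
  gene  = c("g1", "g2", "g3", "g4"),
  logFC = c(2.5, -1.8, 0.3, 1.2),
  FDR   = c(0.001, 0.02, 0.60, 0.20),
  stringsAsFactors = FALSE
)

# Hypothetical thresholds chosen for illustration only
lfc_cut <- 1
fdr_cut <- 0.05

# Classify each gene: significant and strongly changed, or not significant (NS)
deg_table$status <- "NS"
deg_table$status[deg_table$FDR < fdr_cut & deg_table$logFC >=  lfc_cut] <- "Up"
deg_table$status[deg_table$FDR < fdr_cut & deg_table$logFC <= -lfc_cut] <- "Down"

# Volcano plot coordinates: effect size (x) against significance (y)
deg_table$neg_log10_fdr <- -log10(deg_table$FDR)
```

Plotting `logFC` against `neg_log10_fdr` and coloring points by `status` gives the familiar volcano layout.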
💻 INSTALLATION GUIDE: R AND RSTUDIO ▸
Official site: CRAN R Project
| OS | Command or Link |
|---|---|
| Windows | Download R for Windows and run the .exe installer. |
| macOS | Download R for macOS and run the .pkg installer. |
Official site: Posit RStudio Desktop
| OS | Command or Link |
|---|---|
| Windows | Download the .exe installer and run it. |
| macOS | Download the .dmg installer and drag RStudio into Applications. |
R --version
Rscript -e 'cat(R.version.string, "\n")'
💻 INSTALLATION GUIDE: INDAGO ▸
# Install Bioconductor dependencies if you don't have them yet
if (!requireNamespace("BiocManager", quietly = TRUE))
  install.packages("BiocManager")
bioc_pac <- c(
  "XVector",
  "ShortRead",
  "S4Vectors",
  "rtracklayer",
  "Rsubread",
  "Rsamtools",
  "limma",
  "HTSFilter",
  "edgeR",
  "Biostrings",
  "BiocGenerics"
)
for (pac in bioc_pac) {
  if (!requireNamespace(pac, quietly = TRUE))
    BiocManager::install(pac)
}
# Install devtools if you don't have it yet
if (!requireNamespace("devtools", quietly = TRUE))
  install.packages("devtools")
# Install inDAGO
devtools::install_github("inDAGOverse/inDAGO")
Install inDAGO from CRAN (https://cran.r-project.org/web/packages/inDAGO/index.html)
# Install inDAGO
install.packages("inDAGO")
🚀 HOW TO LOAD AND LAUNCH THE APP ▸
# Load and launch the app
library(inDAGO)
inDAGO::inDAGO()
⚙️ TIPS FOR A SEAMLESS EXECUTION ▸
To avoid interruptions during long-running steps such as reference‑based alignment:
💤 Disable sleep mode to keep your system active.
💡 Reduce screen brightness to save power.
These simple precautions can help avoid incomplete runs and unnecessary power consumption.
👥 AUTHORS & ACKNOWLEDGEMENTS ▸
- Authors / Creators
  - Carmine Fruggiero
  - Gaetano Aufiero
- Designated maintainer for CRAN Repository
  - Carmine Fruggiero
- Project Supervisor
  - Nunzio D'Agostino
If you find this code useful in your research, please cite:
Aufiero G, Fruggiero C and D’Agostino N (2025) inDAGO: a user-friendly interface for seamless dual and bulk RNA-Seq analysis. Front. Bioinform. 5:1696823. doi: 10.3389/fbinf.2025.1696823