- Research
- Open access
Transposable element dynamics in glioblastoma stem cells: insights from locus-specific quantification
Mobile DNA volume 16, Article number: 33 (2025)
Abstract
Background
Glioblastoma, the most common primary malignant brain tumor, has a median survival of less than two years. This is due in part to a subpopulation of cells called glioblastoma stem cells (GSCs), which drive tumor recurrence. Transposable elements (TEs) are expressed at higher levels in cancer stem cells, enhancing the oncogenic potential and plasticity of cells through changes in gene expression, fusion transcript generation, and genomic rearrangement.
Results
Leveraging a large previously published dataset, we investigated the expression of TEs in bulk RNA sequencing data from 42 GSCs to identify subpopulations defined by their TE expression profile. Using telescope, a locus-specific approach to quantifying TE expression, we identified 858 TE loci that were expressed and defined two groups of GSCs using a consensus clustering approach. These TE-driven clusters displayed significant differences in both transcription factor (TF) and gene expression, with one group significantly enriched for a mesenchymal signature based on Gene Set Enrichment Analysis. Next, we extracted the locations and sequences of the TE regulatory domains and elucidated TF binding motifs within the TE sequences. This showed that the SOX11 consensus motif was enriched in the 5’ untranslated region of differentially expressed long interspersed nuclear elements (LINE). SOX11, a known inducer of LINE expression, was significantly under-expressed in the mesenchymal GSC cluster, which correlated with the concurrent decreased expression of LINE transcripts. These loci also overlapped with the enhancer elements of genes that were significantly downregulated, suggesting a potential link between TF binding to TE regulatory regions and gene expression.
Conclusions
Although further mechanistic studies are required, the identified link between TE location, TE and TF expression, and corresponding gene expression suggests that TEs may play a regulatory role in GSC transcription regulation. The current findings highlight the need for further investigation into the role of TEs in defining the gene regulatory and expression landscapes of GSCs. Future studies in this area could have therapeutic implications, given that glioblastoma recurrence may be driven by these cells.
Background
Glioblastoma is the most common primary malignant tumor of the brain, with an incidence of 3.26 cases per 100,000 [1]. Despite an aggressive treatment regimen including maximal safe resection followed by concurrent radiation and temozolomide, survival remains exceedingly low, with a 5-year survival rate of 6.9% [1, 2]. Although new immunotherapy treatment approaches have shown promise, patient survival has not increased substantially, and in some CAR-T cell therapy trials, responses have been transient and complicated by side effects [3,4,5]. In general, glioblastoma can be grouped into three different clinically relevant subtypes: mesenchymal, proneural, and classical [6,7,8]. The subtypes have unique molecular signatures resulting in varying responses to therapeutic approaches and patient outcome trajectories [7]. Mesenchymal tumors are associated with a particularly poor prognosis [7,8,9].
One underlying mechanism for glioblastoma treatment evasion is a subpopulation of cells referred to as glioblastoma stem cells (GSCs) [10, 11]. These radiation- and chemotherapy-resistant cells are capable of self-renewal and tumor initiation, and their plasticity is responsible for tumor cell heterogeneity [12,13,14]. They are also capable of invasion, often “seeding” in distant sites within the brain, avoiding radiographic detection and resection, and eventually leading to recurrence [15,16,17]. Moreover, GSCs are characterized by a high level of chromosomal instability, leading to drastic changes in gene expression and high genetic and transcriptional heterogeneity within tumors [18]. This allows different regions of the tumor to adapt to the local microenvironment, whether it is the hypoxic core or the vascularized edge of the tumor [19]. Thus, there is an urgent need to develop a deeper understanding of GSCs to find therapies that can prevent recurrences.
The role of transposable elements (TEs) in cancer stem cells and glioblastoma has gained significant recognition. TEs are genomic entities that can self-mobilize and reinsert themselves in new positions [20]. There are several subtypes of TEs, which are classified into two main families: DNA transposons and retrotransposons. The latter are classified as long terminal repeats (LTR) and non-LTRs [21,22,23]. Within the LTR family, the largest sub-family is referred to as endogenous retroviruses (ERVs), which comprise 8% of the human genome [23]. The non-LTR family includes long interspersed nuclear elements (LINEs) and short interspersed nuclear elements, which each make up about 17% of the human genome [23]. Both LINEs and ERVs contain regulatory regions that can influence gene expression proximally and distally and are therefore of particular interest [24, 25]. LINEs have a 5’ untranslated region (UTR) that contains binding motifs for YY1, SP1, SP3 and RUNX3, amongst other transcription factors (TFs). ERVs, on the other hand, are flanked by long terminal repeat regions which can bind TFs such as FOXA1, FOXA2, SOX2, OCT4 and NANOG [26]. Overall, ERVs and LINEs are increasingly understood as critical regulators of stem cell plasticity, and recent studies have identified insertions as critical components of the transcriptional circuitry that regulates pluripotency in stem cells via direct interactions with TFs [27,28,29].
Altogether, these findings have led to the exploration of the role of LINEs and ERVs in cancer, and in cancer stem cells particularly, with the hope of identifying novel therapeutic targets and diagnostic markers. In the present study, we focus on the relationship between ERVs/LINEs and TFs as part of the complex gene expression regulatory network of GSCs. First, we demonstrated that ERV/LINEs can be used to cluster GSCs. Using differential expression analysis and gene set enrichment analysis (GSEA), we identified that the clusters of cells defined by ERV and LINE expression enrich for different glioblastoma subtypes, suggesting a connection between ERV/LINE expression and the mesenchymal gene signature. Furthermore, we found that ERVs and LINEs, as well as their regulatory regions, overlap with genes and enhancer elements of genes that are drivers of cluster identity. Taken together, our results support a potential mechanism whereby ERVs/LINEs regulate GSC identity by altering the activity of TFs on critical genes, and they underscore the need to further study these relationships to identify their therapeutic and prognostic value.
Methods
RNAseq Data and Alignment
This paper utilized an already published dataset based on bulk RNA sequencing of 42 patient-derived GSCs isolated from different tumors [30]. For gene expression data, standardized, pre-processed gene counts were downloaded from GEO for all samples. To determine transposable element expression, .fastq files from Bioproject GSE119834 were downloaded using the sratoolkit (version 3.0.0-u4jvgps). Files were aligned using bowtie2 (version bowtie2/2.5.3-qgscc2u) with the following options: --very-sensitive-local --score-min L,0,1.6 -k 100 --fr [31]. The resulting .bam files were then processed using telescope (version 1.0.3 + 49.g2832514) to determine locus-specific transposable element counts (telescope assign bam telescopetrasctips.gtf --stranded_mode "RF") [32]. Telescope has a set of subfamilies that it groups LINEs and ERVs into. LINE subfamilies are based on L1Base groupings and are defined as L1FLI (LINE1 Full Length Intact; elements that are capable of independent reinsertion), L1FLnI (LINE1 Full Length Not Intact; elements that, due to an accumulation of mutations, are not capable of independent reinsertion), and L1ORF1 (LINE1 elements with a disrupted ORF1 but functional ORF2) [32, 33]. ERVs are further identified as either ERVL, ERVK or ERV1 [32].
To ensure that the analysis was performed on transposable elements that are expressed at a significant level, the count table was filtered to retain only loci expressed at more than 0.55 counts per million (CPM) in at least a third of all samples. This cutoff ensured that TEs had at least 10 reads aligned to the corresponding loci in a third of the samples. To ensure robust clustering, diceR (version 3.0.0) was used [34]. The following command was utilized to screen a wide array of clustering approaches: consensus_cluster(PCA_data, nk = 2:4, p.item = 0.8, reps = 100, algorithms = c("som","sc","diana","km")). The best performing clustering approach, which uses divisive analysis hierarchical clustering, was selected based on the proportion of ambiguous clustering metric, the consensus cumulative distribution, and the delta plots.
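The expression filter described above can be sketched in a few lines. This is a minimal Python illustration rather than the authors' R code; the function names, the dictionary layout, and the toy counts are all hypothetical, and only the 0.55 CPM threshold and the one-third-of-samples rule come from the text.

```python
import math


def cpm(counts):
    """Convert one sample's raw counts (dict: locus -> reads) to counts per million."""
    total = sum(counts.values())
    return {locus: 1e6 * n / total for locus, n in counts.items()}


def filter_loci(samples, min_cpm=0.55):
    """Return loci whose CPM exceeds min_cpm in at least a third of the samples."""
    cpms = [cpm(sample) for sample in samples]
    needed = math.ceil(len(samples) / 3)  # e.g. 14 of 42 samples
    kept = []
    for locus in samples[0]:
        n_pass = sum(1 for c in cpms if c[locus] > min_cpm)
        if n_pass >= needed:
            kept.append(locus)
    return kept


# Toy data (hypothetical): three samples with one million reads each.
samples = [
    {"TE_a": 10, "TE_b": 0, "TE_c": 999_990},
    {"TE_a": 0, "TE_b": 0, "TE_c": 1_000_000},
    {"TE_a": 0, "TE_b": 0, "TE_c": 1_000_000},
]
kept_loci = filter_loci(samples)
print(kept_loci)
```

Because CPM scales with library size, a fixed CPM threshold corresponds to a different raw read count in each sample; a threshold of 0.55 CPM equals 10 reads only for a library of roughly 18 million reads.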
Differential Expression and Gene Enrichment Analysis
Differential expression was determined using DESeq2 (version 1.44.0) [35]. The lfcShrink function with the apeglm method was used to reduce noise in the fold change estimates [36]. Genes and TEs were determined to be differentially expressed if their absolute log2 fold change was greater than 1. All heatmaps and volcano plots were generated using pheatmap (1.0.12) and EnhancedVolcano (1.22.0), respectively. Enrichment analysis was performed using both enrichr (version 3.4) and GSEA (version 4.3.2) [37,38,39,40]. For GSEA, the full rlog normalized gene count tables were used. Hallmark cancer gene sets were analyzed, along with gene sets that define glioblastoma subtypes [7]. To determine any over- or under-representation of chromosomes and TE families in the identified dataset, we randomly selected 858 TE loci (the number that passed the CPM cutoff in our sample) from the full background telescope dataset. This was repeated 100,000 times. For each chromosome/family, we built the null distribution of counts, computed the observed deviation from its mean, and obtained a two‐sided empirical p‐value as the fraction of permutations with at least as large an absolute deviation. Finally, we controlled the false discovery rate at 5% using the Benjamini–Hochberg procedure.
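The resampling test for chromosome or family over-representation can be illustrated as follows. This is a sketch in stdlib Python under stated assumptions, not the authors' implementation: the function names and the toy background are hypothetical, and the number of permutations is reduced from 100,000 to keep the example fast.

```python
import random
from collections import Counter


def empirical_p_values(background, observed, n_draw, n_perm=2000, seed=0):
    """Two-sided empirical p-value per category from random draws of loci.

    background: list with one category label (e.g. chromosome) per annotated locus.
    observed: dict category -> count among the expressed loci.
    """
    rng = random.Random(seed)
    categories = sorted(set(background))
    null = {c: [] for c in categories}
    for _ in range(n_perm):
        draw = Counter(rng.sample(background, n_draw))
        for c in categories:
            null[c].append(draw[c])
    p_values = {}
    for c in categories:
        mean = sum(null[c]) / n_perm
        obs_dev = abs(observed.get(c, 0) - mean)
        # Fraction of permutations deviating from the null mean at least as much.
        extreme = sum(1 for x in null[c] if abs(x - mean) >= obs_dev)
        p_values[c] = extreme / n_perm
    return p_values


def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values (dict in, dict out)."""
    items = sorted(p_values.items(), key=lambda kv: kv[1])
    m = len(items)
    adjusted = {}
    running_min = 1.0
    for rank in range(m, 0, -1):  # walk from the largest p-value down
        key, p = items[rank - 1]
        running_min = min(running_min, p * m / rank)
        adjusted[key] = running_min
    return adjusted


# Toy background (hypothetical): 700 annotated loci on three chromosomes.
background = ["chr1"] * 500 + ["chr7"] * 100 + ["chr19"] * 100
observed = {"chr1": 60, "chr7": 30, "chr19": 10}
raw = empirical_p_values(background, observed, n_draw=100)
adjusted = benjamini_hochberg(raw)
significant = [c for c, q in adjusted.items() if q < 0.05]
print(significant)
```

With a finite number of permutations the smallest non-zero p-value is 1 / n_perm, which is one reason to use a large number of draws when many categories are tested.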
Transcription Factor Activity Analysis
To infer transcription factor activity levels, a univariate linear modeling approach was used [41, 42]. First, a gene level statistic was calculated that includes the direction, magnitude and significance of the change. This was calculated as follows: stat = -log10(padj) * logFC. These values were then utilized to infer TF activity changes between the two clusters by running the following command: decoupleR::run_ulm(mat = JR_deg[, 'stat', drop = FALSE], net = TF_Gene_net, .source = 'source', .target = 'target', .mor = 'mor', minsize = 2) [42]. Changes in activity were considered significant if the p-value was below 0.05.
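To make the idea behind the univariate linear model concrete, the sketch below computes the gene-level statistic and then, for a single TF, the t-value of the slope obtained by regressing the gene statistics on the TF's interaction weights (zero for non-targets). It is a plain Python re-implementation of the general idea for illustration only, not decoupleR's code; the gene names, weights, and values are hypothetical.

```python
import math


def gene_stat(padj, log_fc):
    """Signed statistic combining significance and effect size, as in the text."""
    return -math.log10(padj) * log_fc


def ulm_t_value(stats, weights):
    """t-statistic of the slope when regressing gene statistics on TF weights.

    stats: dict gene -> statistic for every tested gene.
    weights: dict target gene -> mode of regulation (+1 activates, -1 represses);
             genes missing from `weights` are given a weight of 0.
    """
    genes = sorted(stats)
    x = [weights.get(g, 0.0) for g in genes]
    y = [stats[g] for g in genes]
    n = len(genes)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    rss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    std_err = math.sqrt(rss / (n - 2) / sxx)
    return slope / std_err


# Toy example (hypothetical genes): two activated targets go up, one repressed
# target goes down, and three non-targets barely change.
stats = {"g1": 4.0, "g2": 3.0, "g3": -2.5, "g4": 0.2, "g5": -0.1, "g6": 0.3}
tf_weights = {"g1": 1.0, "g2": 1.0, "g3": -1.0}
activity = ulm_t_value(stats, tf_weights)
print(activity > 0)
```

A positive t-value indicates that the TF's activated targets tend to increase and its repressed targets tend to decrease, which is read as higher inferred activity.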
Extraction of TE and Gene Regulatory Regions
Upon identification of differentially expressed loci, the genomic locations of the 5’ UTRs of LINE elements and the upstream LTRs of ERV elements were extracted. For ERVs, LTR element locations were extracted from the telescope gtf file. For LINE elements, the 5’ UTR had to be manually annotated. This was done by downloading all LINE insertions from L1Base [33]. The UTR was defined as ending at the start of the ORF1p gene region. The start was defined as 900 base pairs (bp) upstream (or downstream if on the negative strand) of this location, minus the query start value, which indicates any 5’ truncation of the specific locus as compared to the reference LINE element. The bedtools getfasta command was used to extract the sequences of these sites, which were analyzed using FIMO from the MEME Suite to identify conserved binding motifs of transcription factors [43, 44]. Motifs were considered conserved if the false discovery rate (FDR) was < 0.1.
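One way to read the 5’ UTR coordinate rule is sketched below. This is an interpretation, not the authors' code: it assumes the query start value is the number of bases missing from the 5’ end of the element (a 1-based offset would shift the result by one base), and the function and argument names are hypothetical.

```python
UTR_LENGTH = 900  # length of the reference 5' UTR window used in the text, in bp


def utr_interval(orf1_start, strand, query_start):
    """Return (start, end) of the 5' UTR in genomic coordinates, with start < end.

    orf1_start: genomic position where ORF1 begins (the 3' boundary of the UTR).
    query_start: assumed number of bases truncated from the element's 5' end.
    Returns None when the truncation removes the whole UTR.
    """
    length = UTR_LENGTH - query_start
    if length <= 0:
        return None
    if strand == "+":
        # Upstream means lower coordinates on the positive strand.
        return (orf1_start - length, orf1_start)
    if strand == "-":
        # Upstream means higher coordinates on the negative strand.
        return (orf1_start, orf1_start + length)
    raise ValueError("strand must be '+' or '-'")


full_length = utr_interval(10_000, "+", 0)
truncated = utr_interval(10_000, "-", 100)
print(full_length, truncated)
```

Intervals produced this way could then be written to a BED file and passed to bedtools getfasta with strand awareness, so that negative-strand sequences are reverse-complemented before motif scanning.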
Using bedtools intersect (bt.intersect(a1, b1, wo = T)), the same areas were interrogated for any overlaps with gene enhancer elements. These elements were obtained from the GeneHancer database [45]. To ensure robust results, we filtered the data throughout the integration process. Firstly, the enhancer elements were required to be annotated as “elite”, or to have a combined GeneHancer score above 12. This ensured that there was robust evidence supporting their function. Secondly, gene-enhancer connections were only considered if they were also annotated as “elite”.
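The two filtering steps and the overlap test can be summarized in a short sketch. It is written in plain Python with hypothetical record fields and identifiers; the actual GeneHancer export format and the bedtools call used by the authors are not reproduced here, and only the "elite or score above 12" and "elite gene link" rules come from the text.

```python
def overlap_bp(a, b):
    """Overlapping base pairs of two (chrom, start, end) half-open intervals."""
    if a[0] != b[0]:
        return 0
    return max(0, min(a[2], b[2]) - max(a[1], b[1]))


def robust_enhancer(enh, min_score=12.0):
    """Keep enhancers flagged elite or with a combined score above the threshold."""
    return enh["elite"] or enh["score"] > min_score


def te_enhancer_targets(te_regions, enhancers):
    """Return (TE, enhancer id, gene) for robust overlaps with elite gene links."""
    hits = []
    for te_name, te_interval in te_regions.items():
        for enh in enhancers:
            if not robust_enhancer(enh):
                continue
            if overlap_bp(te_interval, enh["interval"]) == 0:
                continue
            for gene, link_is_elite in enh["genes"].items():
                if link_is_elite:
                    hits.append((te_name, enh["id"], gene))
    return hits


# Toy records (hypothetical identifiers and coordinates).
te_regions = {"TE_1": ("chr5", 1000, 1900), "TE_2": ("chr20", 500, 1400)}
enhancers = [
    {"id": "enh_A", "interval": ("chr5", 1500, 3000), "elite": True, "score": 3.0,
     "genes": {"GENE_X": True, "GENE_Y": False}},
    {"id": "enh_B", "interval": ("chr20", 1000, 2000), "elite": False, "score": 8.0,
     "genes": {"GENE_Z": True}},
    {"id": "enh_C", "interval": ("chr20", 1300, 1800), "elite": False, "score": 15.0,
     "genes": {"GENE_W": True}},
]
hits = te_enhancer_targets(te_regions, enhancers)
print(hits)
```

The nested loop is quadratic in the number of intervals, which is fine for a few hundred regulatory regions; for genome-scale inputs a sorted sweep or an interval tree, as used by bedtools, would be the appropriate choice.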
The full R markdown files are available from the authors upon request.
Results
Clustering GSCs by Transposable Element Expression Identifies Unique Clusters
To evaluate which families of LINEs and ERVs are expressed in GSCs, we evaluated the expression of LINEs and ERVs in 42 patient-derived GSCs by using an established, locus-specific alignment approach on bulk-RNA-sequencing data (Fig. 1a). Due to the alignment approach’s structure, we were limited to identifying the expression of LINE and ERV elements. This analysis showed that GSCs express TEs at 858 loci, 268 (31%) of which are ERVs and 589 (69%) of which are LINEs (Fig. 1b). Sub-families ERV1, ERVL, and L1FLnI were particularly well represented. Although expressed TEs were dispersed throughout the genome, we observed a significantly higher concentration of expressed TEs on chromosome 7 than would be expected by chance (68 loci observed, 43.7 expected) (Supplemental Fig. 1a). Higher TE expression arising from chromosome 7 has been previously noted and may be related to genomic copy number variants that are commonly described in glioblastoma [46]. Chromosome 19 also had significantly elevated levels of TE expression.
(A) A schematic demonstrating the data processing approach used in this study. (B) Distribution of expressed transposable elements (TEs) across families. Blue corresponds to Long Terminal Repeat TEs (grouped by ERV sub-families) while red bars correspond to groupings of Long Interspersed Nuclear Elements (separated by LINE sub-groupings). (C) PCA plot of the top 10% most variably expressed TEs across all GSCs. Colors correspond to the clusters defined by Divisive Analysis clustering using Euclidean distance
To further identify whether patient-derived GSCs can be clustered into distinct subpopulations based on LINE and ERV expression, we performed consensus clustering approaches on the top 10% most variably expressed LINEs and ERVs (Supplemental Fig. 1b). These variably expressed LINEs and ERVs represented the L1FLnI (67 loci), ERV1 (9 loci), ERVK (6 loci) and ERVL (3 loci) sub-families. After testing various clustering algorithms, the most robust clustering approach used Divisive Analysis hierarchical clustering to identify two distinct and robust groups of GSCs (Fig. 1c and Supplemental Fig. 2a). The two GSC groups identified in terms of TE clustering are less clearly differentiated when evaluating gene expression. Using the same clustering approach to cluster cells according to the top 10% most variable genes (rather than LINEs and ERVs), we observed different albeit slightly overlapping clusters (Supplemental Fig. 2b). The differences in cluster separation between TE-based and gene-based clustering suggest that grouping cells by LINE and ERV expression may identify new cell groupings that would not have been as readily identified by gene expression alone.
Clusters Defined by Transposable Elements Display Unique Phenotypes
To investigate whether these two LINE/ERV-defined GSC clusters demonstrate different transcriptomic profiles, we performed differential expression analysis of both TEs and genes. We first explored the differences in gene expression between the two clusters and performed GSEA on the upregulated genes (Fig. 2a). Genes related to epithelial-to-mesenchymal transition (EMT) and NF-kappa B signaling pathways, both of which are highly active in glioblastoma, were significantly enriched in cluster 2 (Fig. 2b) [47, 48].
(A) A volcano plot showing the changes in gene expression between the two TE-defined clusters shown in Fig. 1. (B) Upregulated genes in cluster 2 demonstrated a significant enrichment for genes related to epithelial mesenchymal transition, as well as other pathways often upregulated in more aggressive tumors like hypoxia, TNF-alpha signaling via NF-kB, and IL-2 signaling. (C) GSCs in cluster 2 showed significant enrichment for the mesenchymal signature and the corresponding heatmap for the leading-edge genes, while (D) GSCs in cluster 1 were enriched for the proneural signature and the corresponding heatmap for the leading-edge genes. This demonstrated that TE expression can be used to identify unique subpopulations of GSCs
To investigate whether LINE/ERV expression in GSCs is correlated with glioblastoma subtype, we performed GSEA using well-established gene signatures for the three glioblastoma subtypes (mesenchymal, proneural, and classical). This analysis showed that cluster 2 had a significant enrichment of genes related to mesenchymal tumors (Normalized Enrichment Score (NES): 1.59; FWER p-value: 0.038) while cluster 1 was defined by a more proneural lineage (NES: 1.65; FWER p value: 0.022) (Fig. 2c,d). Neither cluster showed significant enrichment for the classical signature. These findings are further supported by differentially expressed genes identified as signatures of various glioblastoma cellular states (Supplemental Fig. 3) [49]. In particular, the known glioblastoma mesenchymal signature (FN1, SERPINE1, and GFPT2) was significantly upregulated in Cluster 2 (log2FC 5.52, 2.53, 1.93 respectively). Combined, these results show that clusters defined by LINE/ERV expression are associated with distinct GSC subtypes and transcriptomic profiles.
To further evaluate which LINEs/ERVs most strongly contribute to the distinct transcriptomic profiles of these two clusters, we investigated differences in LINE/ERV expression. Of the 858 distinct TE loci that were expressed in these samples, 77 (9% of all expressed LINE/ERVs) were differentially expressed (shrunken log2 fold change (Log2FC) > 1 and adjusted p-value < 0.05), and cluster 2 showed significantly lower expression of LINEs/ERVs (Fig. 3a). Most of the TEs which were different between the two groups are in the L1FLnI family, with 57 loci differentially expressed. To determine the connection of LINE and ERV expression differences to the transcriptomic signatures of the two clusters, we asked whether any of these differentially expressed loci overlap with transcripts that define the mesenchymal signature of cluster 2 or the proneural signature of cluster 1. TEs have been shown to provide regulatory functions for nearby and overlapping genes, influencing the expression of these genes [29, 50,51,52,53]. Of the 77 significantly differentially expressed loci, 69 overlapped with genes. Of these, we found two L1FLnI elements which were upregulated in cluster 2 (L1FLnI_4q26oa and L1FLnI_2q33.1t) and which overlap with the upregulated mesenchymal genes SEC24D (Fig. 3b) and CASP8. Likewise, three downregulated L1FLnI loci overlap with the proneural signature genes NCAM1, MAP2 and DNM3, all of which are also downregulated (Supplemental Fig. 4a). Other notable genes that overlapped with differentially expressed LINEs/ERVs include PTN (−1.61 logFC, overlaps with HARLEQUIN_7q33b which has −1.35 logFC), SOX5 (−1.06 logFC, overlaps with L1FLnI_12p12.1y which has −0.96 logFC), and SOX6 (−1.11 logFC, overlaps with L1FLnI_11p15.2v which has −2.37 logFC) (Fig. 3b and Supplemental Fig. 4a). As these are intronic transposable elements, there is the possibility that this relationship arises from co-expression or transcriptional noise.
However, regulation of genes by intronic TEs has been described for many genes, including PTN [53]. Nonetheless, clustering using the expression of LINEs and ERVs allowed us to define two novel sub-groups of GSCs that had different gene signatures associated with distinct glioblastoma tumor subtypes. A specific subset of differentially expressed LINEs and ERVs also overlaps with the loci of genes that are critical to these mesenchymal or proneural subtype signatures, suggesting a potential role for these LINEs/ERVs in the development and/or maintenance of mesenchymal and proneural phenotypes in GSCs. To address the limitations of overlapping transposable elements, we next investigated whether transposable elements may play a role in the regulatory networks of GSCs.
(A) A volcano plot showing the significantly over-expressed and under-expressed TEs between the two clusters (threshold log2 fold change > 1). (B) Screen captures of Integrative Genomics Viewer demonstrating the overlap between TEs and selected genes. Two representative examples are provided. Many genes related to both the increased mesenchymal (SEC24D) and diminished proneural signature (SOX5) of cluster 2 were found to overlap with differentially expressed TEs at common loci
Potential Role of LINEs and ERVs in Transcription Factor Regulatory Networks
To investigate whether LINE and ERV elements might contribute to the changes in transcription in these cells through potential direct interactions with TFs, we investigated differences in the expression of TFs between our two TE-defined clusters. This analysis showed that mesenchymal phenotype associated TFs (e.g., VDR, STAT6, RUNX2, BNC2, ELF4) were upregulated in cluster 2, while TFs defining the proneural phenotype (e.g., MYT1, ASCL1, OLIG2, SOX11, ZNF711) were downregulated (Fig. 4a). Multiple TFs which define the classical subtype were either up- or downregulated between clusters, consistent with neither group being closely associated with this phenotype in terms of gene expression.
(A) A volcano plot showing transcription factors (TFs) that were differentially expressed between the two clusters defined in Fig. 1, colored according to their known glioblastoma subtype association. As expected, TFs known to be related to a mesenchymal signature tended to be upregulated in cluster 2 while the known proneural TFs tended to be downregulated. (B) 17 TFs that have been shown to interact with TEs and regulate either their expression, or that of nearby genes, had significant changes in their expression, activity score or both. (C) There is a statistically significant positive association between the log2 fold change in expression of transposable elements that have a conserved TF binding motif in a regulatory region overlapping a gene enhancer element and the change in expression of the target genes of these enhancer elements (Estimated Coefficient: 0.6023, p-value: 0.018). While this arises from a small number of genes, this suggests that TEs may contribute to the regulation of gene expression in GSCs. (D) The enhancer region regulating P4HA2 contains the 5’ UTR region of L1FLnI_5q31.1l and its conserved SP3 binding motif. (E) The enhancer region regulating PYGB, a glycogen phosphorylase that has shown therapeutic promise in GBM, contains the 5’ UTR region of L1FLnI_20p11.21u and its conserved VDR and SP3 binding motifs. PYGB is significantly upregulated in cluster 2, which correlates with the increased activity of both VDR and SP3, suggesting the LINE element’s regulatory region contributes to the regulation of this gene
In addition to TF expression differences, we used gene expression data to identify differences in the predicted activity of these TFs, based on known TF-gene interactions [42]. Associations between the expression of individual TFs and anticipated target genes were assessed using a univariate linear model [41, 42]. In short, this approach infers regulatory activity of TFs by generating a linear model for each TF and its known target genes. If the slope of this line is positive, the TF is predicted to be more active. This TF activity prediction demonstrated that some of the upregulated mesenchymal related TFs (RUNX2, VDR, ELF4, and STAT6) also had significantly more positive scores in cluster 2. Overall, 14 TFs were associated with one of the three glioblastoma signatures and had a significant change in their activity. Interestingly, only MEIS1, a TF associated with the classical signature has a decrease in activity between the two clusters. Six of the top 8 increases in activity scores were mesenchymal-related TFs, supporting the mesenchymal signature we identified. There were 5 proneural TFs and 3 classical TFs with significant changes (Supplemental Fig. 5a).
To explore potential regulatory interactions between TEs and TFs in our two TE-defined clusters, we first curated a list of TFs that have been shown to bind to TEs and regulate either the expression of that TE itself, or that of nearby genes [24, 26]. Of these 52 TFs, 17 were significantly differentially expressed or had significantly different levels of activity in cluster 2 (Fig. 4b). One particularly notable TF is SOX11. As a positive regulator of LINE expression during neurogenesis, its significant downregulation (−1.38 log2FC) parallels the decrease in LINE element expression and predominantly mesenchymal lineage observed in this cluster of cells [54].
To further investigate the potential functional role of TEs on TF activity, we investigated TF binding to ERV/LINEs located within enhancer regions and the genes associated with these enhancer regions. We first extracted the regulatory elements of expressed ERVs/LINEs. This included the long terminal repeat (LTR) elements of ERVs and the 5’ UTR of LINEs. Of 716 regulatory regions, 101 overlapped with a gene enhancer element. Interestingly, we observed a positive association between the change in ERV/LINE expression and that of genes targeted by the enhancer element overlapping with ERVs/LINEs, an association which was observed to be independent of distance between the ERV/LINE and the enhancer-associated gene (Estimated Coefficient: 0.3624, p-value: 0.0067) (Supplemental Fig. 5b). This suggests that transposable elements and their regulatory elements contribute to changes in gene regulation. Taking this a step further, we extracted the sequences of these TE regulatory regions and probed for conserved TF binding motifs of the 17 differentially activated TE-binding TFs. Focusing on the overlaps where a conserved TF binding site was identified demonstrated an increase in the correlation (Estimated Coefficient: 0.6023, p-value: 0.018) (Fig. 4c). This stronger correlation suggests that enhancers containing a TE with a conserved TF binding motif are more likely to demonstrate altered regulation of their target genes. Genes regulated by these enhancer regions included P4HA2 (1.86 log2FC) and COL6A1 (2.71 log2FC), both of which are associated with the mesenchymal signature of glioblastoma and EMT (Fig. 4d) [55,56,57]. The regulatory domains of the TEs within these enhancer regions both had conserved domains for SP3, which was significantly more active in cluster 2.
We found that 9 of the differentially expressed TE loci overlap with an enhancer element, and these enhancers in turn regulate 12 genes. Five (42%) of these genes were themselves significantly differentially expressed, a much higher proportion of differentially expressed genes than that for linked genes in general (9.6%). Using a Fisher’s Exact Test to assess enrichment, we found that this corresponds to an odds ratio of 8.15 (p-value: 0.0026). Genes that were differentially expressed included PYGB (log2FC 1.20), B3GALT1 (log2FC −1.63), and NCEH1 (log2FC 1.49). PYGB was particularly interesting, as it is stimulated by hypoxia and PYGB inhibition inhibits cell growth in glioblastoma [58,59,60,61]. The TE regulatory regions that are found within the enhancer regions of PYGB and NCEH1 have conserved VDR and SP3 binding motifs, and both TFs demonstrate increased activity in cluster 2 (Fig. 4e). This underscores the impact that TEs may have on gene regulation in GSCs. Their regulatory regions provide a significant number of TF binding sites in these cells that may alter the activity of the enhancer elements they overlap with. The genes impacted by these changes have been implicated in EMT, hypoxia and, more generally, the mesenchymal signature of glioblastoma. Taken together, these results demonstrate that ERV/LINE expression is closely related to TF-related regulatory networks in GSCs.
Discussion
There is a pressing need to understand the regulatory networks that govern the identity and behavior of GSCs to develop more effective therapeutic approaches for glioblastoma. One understudied component of the gene regulatory environment in GSCs is how LINE/ERVs and their interactions with TFs may contribute to tumor phenotypes. Recent studies have demonstrated that TFs can directly regulate LINE/ERV expression [24], that LINE/ERVs provide novel TF binding sites [53, 62], and that TF-TE interactions can drive the activity of EMT master regulators [63]. ERVs within the ERVK subfamily are known to be particularly active during early development in pluripotent cells and possess long range regulatory capabilities in these cells. HERVH noncoding RNA acts as a scaffold for gene expression regulation in embryonic stem cells, and MER41 (also an ERV) interacts with the TF STAT1 to provide interferon-inducible binding sites [28, 64, 65]. In glioblastoma generally, LINEs and ERVs have been shown to play significant roles in pathogenesis, tumor heterogeneity, genomic instability, influencing the tumor microenvironment (TME), and facilitating immune invasion [66,67,68,69]. To our knowledge, there are no studies elucidating LINE/ERV expression patterns specifically in GSCs, nor any showing how they may contribute to gene regulatory networks and GSC identity. Gleaning this information may unveil new therapeutic approaches to this disease given the central role of GSCs.
In the present study, we utilized the most expansive GSC bulk RNA sequencing dataset available to perform locus specific transposable element alignment and quantification. We identified two main clusters of GSCs with unique TE expression profiles, which also had significantly different underlying gene expression characteristics. We found that samples in cluster 2 had a significant enrichment in genes related to a mesenchymal signature. To our knowledge, this is the first demonstration of this potential link between GSC TE expression patterns and either gene signatures or glioblastoma subtype. The differences in both gene signature and TF expression between distinct TE-defined clusters suggest that TEs may contribute to the regulatory landscape of these cells.
Transcription regulatory functions of TEs, including mechanisms which are TF-mediated, have been described in the literature. We found that multiple regulatory domains of TEs directly overlap with enhancer regions in the genome and may provide additional binding sites for TFs to bind and interact with these enhancer sites. This relationship is particularly important in a disease setting like glioblastoma. In healthy cells, TE expression is regulated and diminished by epigenetic mechanisms such as H3K9me3 and chromatin remodeling. However, in glioblastoma there is a general disruption of this epigenetic control of gene transcription that also results in an increase in TE expression. These changes will also increase the accessibility of TFs to the regulatory regions of TEs. In addition to the link between TE expression patterns and GSC subtypes, we identified a set of enhancer regions that overlap with the regulatory regions of TEs. The expression of genes that were regulated by these enhancer regions corresponded with changes in the expression of overlapping TEs, suggesting a previously unrecognized regulatory role for TEs in GSCs. Genes impacted by this potential regulatory relationship were involved with pathways that defined the two clusters we identified, such as EMT and hypoxia. This paper represents a first step in understanding how TEs impact the regulatory landscape of these particularly dynamic cells.
One major limitation of this study is that it was restricted to a computational analysis of these dynamic relationships. This constrained us in two ways. First, Telescope limited us to identifying only LINE and ERV elements, although other TEs contribute to the transcriptional regulation of genomic elements. Second, to probe the relationship between TE expression and GSC identity more definitively, studies targeting and perturbing specific TE-gene associations will be necessary. By modulating the accessibility and/or activity of specific TE loci, it would be possible to infer whether the TEs act independently as regulatory domains or whether changes in their expression are driven by changes in proximal gene expression. Approaches such as CRISPR-based modulation of specific TE loci have recently been used to study such interactions in embryonic stem cells and provide powerful insights into how TEs regulate gene expression and vice versa [70, 71]. Other approaches, such as the recently described CARGO-BioID, would provide information on how TFs interact at these loci and dictate gene expression differences [72]. Another limitation is that we were unable to probe the relationship between the expression of these transposable elements and patient outcomes and/or treatment responses. Integrating patient outcomes with these data would allow us to determine whether transposable elements provide prognostic value in this disease.
Due to their heterogeneity and plasticity, GSCs display significant diversity in identity and behavior even within a single tumor [49]. Beyond the bulk RNA sequencing data used for the present study, single-cell approaches would allow a more granular analysis of which GSCs are most affected by TE regulatory activity and would show more conclusively whether a mesenchymal signature is correlated with the changes in TE expression that we found. However, accurate identification of TE expression within single cells remains challenging, and SoloTE represents the only published approach that provides locus-specific results [73, 74].
Conclusions
In summary, we identified a link between TE expression and TFs that regulate GSC phenotypic identity. Although we could not experimentally verify these functional connections or establish a directionality for these relationships, we believe our analyses underscore the need to further characterize the role that TEs play in the development and progression of glioblastoma. Future work is needed to fully characterize the impact these genomic elements have on gene regulatory networks and cell identities.
Data availability
No datasets were generated or analysed during the current study.
References
Ostrom QT, Price M, Neff C, Cioffi G, Waite KA, Kruchko C. CBTRUS Statistical Report: Primary Brain and Other Central Nervous System Tumors Diagnosed in the United States in 2015–2019. Neuro Oncol. 2022;24(Supplement_5):v1–95. https://academic.oup.com/neuro-oncology/article/24/Supplement_5/v1/6742201.
Alexander BM, Cloughesy TF. Adult glioblastoma. J Clin Oncol. 2017 Jul 20;35(21):2402–9.
Choi BD, Gerstner ER, Frigault MJ, Leick MB, Mount CW, Balaj L. Intraventricular CARv3-TEAM-E T Cells in Recurrent Glioblastoma. N Engl J Med. 2024;390(14):1290–8. https://www.nejm.org/doi/full/10.1056/NEJMoa2314390
Bagley SJ, Logun M, Fraietta JA, Wang X, Desai AS, Bagley LJ, et al. Intrathecal bivalent CAR T cells targeting EGFR and IL13Rα2 in recurrent glioblastoma: phase 1 trial interim results. Nat Med. 2024;30(5):1320–9. https://www.nature.com/articles/s41591-024-02893-z.
Stupp R, Taillibert S, Kanner AA, Kesari S, Steinberg DM, Toms SA. Maintenance Therapy With Tumor-Treating Fields Plus Temozolomide vs Temozolomide Alone for Glioblastoma: A Randomized Clinical Trial. JAMA. 2015;314(23):2535–43. https://jamanetwork.com/journals/jama/fullarticle/2475463.
Tang Y, Qazi MA, Brown KR, Mikolajewicz N, Moffat J, Singh SK. Identification of five important genes to predict glioblastoma subtypes. Neurooncol Adv. 2021;3(1):vdab144. https://pmc.ncbi.nlm.nih.gov/articles/PMC8577514/.
Verhaak RGW, Hoadley KA, Purdom E, Wang V, Qi Y, Wilkerson MD. Integrated genomic analysis identifies clinically relevant subtypes of glioblastoma characterized by abnormalities in PDGFRA, IDH1, EGFR, and NF1. Cancer Cell. 2010;17(1):98–110. https://pubmed.ncbi.nlm.nih.gov/20129251/.
Wang Q, Hu B, Hu X, Kim H, Squatrito M, Scarpace L. Tumor evolution of glioma intrinsic gene expression subtype associates with immunological changes in the microenvironment. Cancer Cell. 2017;32(1):42. https://pmc.ncbi.nlm.nih.gov/articles/PMC5599156/.
Kim Y, Varn FS, Park SH, Yoon BW, Park HR, Lee C. Perspective of mesenchymal transformation in glioblastoma. Acta Neuropathol Commun. 2021;9(1):1–20. https://actaneurocomms.biomedcentral.com/articles/10.1186/s40478-021-01151-4
Chen J, Li Y, Yu TS, McKay RM, Burns DK, Kernie SG. A restricted cell population propagates glioblastoma growth after chemotherapy. Nature. 2012;488(7412):522–6. https://www.nature.com/articles/nature11287.
Bao S, Wu Q, McLendon RE, Hao Y, Shi Q, Hjelmeland AB. Glioma stem cells promote radioresistance by preferential activation of the DNA damage response. Nature. 2006;444(7120):756–60. https://www.nature.com/articles/nature05236.
Lathia JD, Mack SC, Mulkearns-Hubert EE, Valentim CLL, Rich JN. Cancer stem cells in glioblastoma. Genes Dev. 2015;29(11):1112-1123. PMCID: PMC4495393.
Alves ALV, Gomes INF, Carloni AC, Rosa MN, da Silva LS, Evangelista AF. Role of glioblastoma stem cells in cancer therapeutic resistance: a perspective on antineoplastic agents from natural sources and chemical derivatives. Stem Cell Res Ther. 2021;12(1):1–22. https://stemcellres.biomedcentral.com/articles/10.1186/s13287-021-02231-x.
Liebelt BD, Shingu T, Zhou X, Ren J, Shin SA, Hu J. Glioma Stem Cells: Signaling, Microenvironment, and Therapy. Stem Cells Int. 2016;2016:3971487. PMCID: PMC4736567.
Cheng L, Wu Q, Guryanova OA, Huang Z, Huang Q, Rich JN. Elevated Invasive Potential of Glioblastoma Stem Cells. Biochem Biophys Res Commun. 2011;406(4):643. https://pmc.ncbi.nlm.nih.gov/articles/PMC3065536/.
Kim Y, Kim E, Wu Q, Guryanova O, Hitomi M, Lathia JD. Platelet-derived growth factor receptors differentially inform intertumoral and intratumoral heterogeneity. Genes Dev. 2012;26(11):1247. https://pmc.ncbi.nlm.nih.gov/articles/PMC3371412/.
Tamura R, Miyoshi H, Sampetrean O, Shinozaki M, Morimoto Y, Iwasawa C. Visualization of spatiotemporal dynamics of human glioma stem cell invasion. Mol Brain. 2019;12(1):1–10. https://molecularbrain.biomedcentral.com/articles/10.1186/s13041-019-0462-3.
Zhao Y, Carter R, Natarajan S, Varn FS, Compton DA, Gawad C. Single-cell RNA sequencing reveals the impact of chromosomal instability on glioblastoma cancer stem cells. BMC Med Genomics. 2019;12(1). https://pmc.ncbi.nlm.nih.gov/articles/PMC6545015/.
Ravi VM, Will P, Kueckelhaus J, Sun N, Joseph K, Salié H, et al. Spatially resolved multi-omics deciphers bidirectional tumor-host interdependence in glioblastoma. Cancer Cell. 2022 Jun 13;40(6):639–655.e13.
Branco MR, Soler A de M, editors. Transposable Elements: Methods and Protocols. 1st ed. Humana New York, NY; 2023. http://www.springer.com/series/7651.
Burns KH. Transposable elements in cancer. Nat Rev Cancer. 2017;17(7):415–24. https://www.nature.com/articles/nrc.2017.35.
Zhang Q, Pan J, Cong Y, Mao J. Transcriptional Regulation of Endogenous Retroviruses and Their Misregulation in Human Diseases. Int J Mol Sci. 2022;23(17). https://pmc.ncbi.nlm.nih.gov/articles/PMC9456331/.
Cordaux R, Batzer MA. The impact of retrotransposons on human genome evolution. Nat Rev Genet. 2009;10(10):691–703. https://www.nature.com/articles/nrg2640.
Hermant C, Torres-Padilla ME. TFs for TEs: the transcription factor repertoire of mammalian transposable elements. Genes Dev. 2021;35(1–2):22. https://pmc.ncbi.nlm.nih.gov/articles/PMC7778262/.
Chuong EB, Elde NC, Feschotte C. Regulatory activities of transposable elements: from conflicts to benefits. Nat Rev Genet. 2017;18(2):71. https://pmc.ncbi.nlm.nih.gov/articles/PMC5498291/.
Ito J, Sugimoto R, Nakaoka H, Yamada S, Kimura T, Hayano T. Systematic identification and characterization of regulatory elements derived from human endogenous retroviruses. PLoS Genet. 2017;13(7):e1006883. https://pmc.ncbi.nlm.nih.gov/articles/PMC5529029/.
Wang J, Xie G, Singh M, Ghanbarian AT, Raskó T, Szvetnik A. Primate-specific endogenous retrovirus-driven transcription defines naive-like stem cells. Nature. 2014;516(7531):405–9. https://www.nature.com/articles/nature13804.
Lu X, Sachs F, Ramsay LA, Jacques PÉ, Göke J, Bourque G. The retrovirus HERVH is a long noncoding RNA required for human embryonic stem cell identity. Nat Struct Mol Biol. 2014;21(4):423–5 https://www.nature.com/articles/nsmb.2799.
Kunarso G, Chia NY, Jeyakani J, Hwang C, Lu X, Chan YS. Transposable elements have rewired the core regulatory network of human embryonic stem cells. Nat Genet. 2010;42(7):631–4 https://pubmed.ncbi.nlm.nih.gov/20526341/.
Mack SC, Singh I, Wang X, Hirsch R, Wu Q, Villagomez R, et al. Chromatin landscapes reveal developmentally encoded transcriptional states that define human glioblastoma. J Exp Med. 2019;216(5):1071–90. https://doi.org/10.1084/jem.20190196.
Langmead B, Salzberg SL. Fast gapped-read alignment with Bowtie 2. Nature Methods. 2012;9(4):357–9 https://www.nature.com/articles/nmeth.1923.
Bendall ML, De Mulder M, Iñiguez LP, Lecanda-Sánchez A, Pérez-Losada M, Ostrowski MA. Telescope: Characterization of the retrotranscriptome by accurate estimation of transposable element expression. PLoS Comput Biol. 2019;15(9):e1006453 https://pmc.ncbi.nlm.nih.gov/articles/PMC6786656/.
Penzkofer T, Jäger M, Figlerowicz M, Badge R, Mundlos S, Robinson PN. L1Base 2: more retrotransposition-active LINE-1s, more mammalian genomes. Nucleic Acids Res. 2017;45(D1):D68–73 https://doi.org/10.1093/nar/gkw925.
Chiu DS, Talhouk A. DiceR: An R package for class discovery using an ensemble driven approach. BMC Bioinformatics. 2018;19(1):1–4 https://bmcbioinformatics.biomedcentral.com/articles/10.1186/s12859-017-1996-y.
Love MI, Huber W, Anders S. Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biol. 2014;15(12):1–21 https://genomebiology.biomedcentral.com/articles/10.1186/s13059-014-0550-8.
Zhu A, Ibrahim JG, Love MI. Heavy-tailed prior distributions for sequence count data: removing the noise and preserving large differences. Bioinformatics. 2019;35(12):2084–92. https://doi.org/10.1093/bioinformatics/bty895.
Xie Z, Bailey A, Kuleshov MV, Clarke DJB, Evangelista JE, Jenkins SL. Gene Set Knowledge Discovery with Enrichr. Curr Protoc. 2021;1(3):e90 https://onlinelibrary.wiley.com/doi/full/10.1002/cpz1.90.
Chen EY, Tan CM, Kou Y, Duan Q, Wang Z, Meirelles GV. Enrichr: interactive and collaborative HTML5 gene list enrichment analysis tool. BMC Bioinformatics. 2013;14. https://pubmed.ncbi.nlm.nih.gov/23586463/.
Subramanian A, Tamayo P, Mootha VK, Mukherjee S, Ebert BL, Gillette MA. Gene set enrichment analysis: A knowledge-based approach for interpreting genome-wide expression profiles. Proc Natl Acad Sci U S A. 2005;102(43):15545–50 https://www.pnas.org/doi/abs/10.1073/pnas.0506580102.
Mootha VK, Lindgren CM, Eriksson KF, Subramanian A, Sihag S, Lehar J. PGC-1α-responsive genes involved in oxidative phosphorylation are coordinately downregulated in human diabetes. Nat Gen. 2003;34(3):267–73 https://www.nature.com/articles/ng1180.
Teschendorff AE, Wang N. Improved detection of tumor suppressor events in single-cell RNA-Seq data. npj Genom Med. 2020;5(1):1–14 https://www.nature.com/articles/s41525-020-00151-y.
Badia-I-Mompel P, Vélez Santiago J, Braunger J, Geiss C, Dimitrov D, Müller-Dott S, et al. decoupleR: ensemble of computational methods to infer biological activities from omics data. Bioinformatics Adv. 2022. https://doi.org/10.1093/bioadv/vbac016.
Quinlan AR, Hall IM. BEDtools: a flexible suite of utilities for comparing genomic features. Bioinformatics. 2010;26(6):841–2. https://doi.org/10.1093/bioinformatics/btq033.
Grant CE, Bailey TL, Noble WS. FIMO: scanning for occurrences of a given motif. Bioinformatics. 2011;27(7):1017–8. https://doi.org/10.1093/bioinformatics/btr064.
Fishilevich S, Nudel R, Rappaport N, Hadar R, Plaschkes I, Stein TI. GeneHancer: genome-wide integration of enhancers and target genes in GeneCards. Database (Oxford). 2017;2017. https://pubmed.ncbi.nlm.nih.gov/28605766/.
Bonté PE, Arribas YA, Merlotti A, Carrascal M, Zhang JV, Zueva E, et al. Single-cell RNA-seq-based proteogenomics identifies glioblastoma-specific transposable elements encoding HLA-I-presented peptides. Cell Rep. 2022;39(10). https://pubmed.ncbi.nlm.nih.gov/35675780/.
Soubannier V, Stifani S. NF-κB Signalling in Glioblastoma. Biomedicines. 2017;5(2):29 https://pmc.ncbi.nlm.nih.gov/articles/PMC5489815/.
Iwadate Y. Epithelial-mesenchymal transition in glioblastoma progression. Oncol Lett. 2016;11(3):1615 https://pmc.ncbi.nlm.nih.gov/articles/PMC4774466/.
Neftel C, Laffy J, Filbin MG, Hara T, Shore ME, Rahme GJ. An integrative model of cellular states, plasticity and genetics for glioblastoma. Cell. 2019;178(4):835 https://pmc.ncbi.nlm.nih.gov/articles/PMC6703186/.
Ito J, Kimura I, Soper A, Coudray A, Koyanagi Y, Nakaoka H. Endogenous retroviruses drive KRAB zinc-finger protein family expression for tumor suppression. Sci Adv. 2020;6(43). https://www.science.org/doi/10.1126/sciadv.abc3020.
Chen M, Huang X, Wang C, Wang S, Jia L, Li L. Endogenous retroviral solo-LTRs in human genome. Front Genet. 2024;15:1358078 http://repeatmasker.org.
Elbarbary RA, Lucas BA, Maquat LE. Retrotransposons as regulators of gene expression. Science. 2016;351(6274):aac7247. https://pmc.ncbi.nlm.nih.gov/articles/PMC4788378/
Cohen CJ, Lock WM, Mager DL. Endogenous retroviral LTRs as promoters for human genes: A critical assessment. Gene. 2009;448(2):105–14.
Orqueda AJ, Gatti CR, Ogara MF, Falzone TL. SOX-11 regulates LINE-1 retrotransposon activity during neuronal differentiation. FEBS Lett. 2018;592(22):3708–19 https://onlinelibrary.wiley.com/doi/full/10.1002/1873-3468.13260.
Cha J, Ding EA, Carvalho EM, Fowler A, Aghi MK, Kumar S. Glioma Cells Secrete Collagen VI to Facilitate Invasion. bioRxiv. 2023;2023.12.12.571198. https://www.biorxiv.org/content/10.1101/2023.12.12.571198v1.
Huang M Sen, Fu LH, Yan HC, Cheng LY, Ru HM, Mo S. Proteomics and liquid biopsy characterization of human EMT-related metastasis in colorectal cancer. Front Oncol. 2022;12:790096.
Lin J, Jiang L, Wang X, Wei W, Song C, Cui Y. P4HA2 Promotes Epithelial-to-Mesenchymal Transition and Glioma Malignancy through the Collagen-Dependent PI3K/AKT Pathway. J Oncol. 2021;2021. https://pubmed.ncbi.nlm.nih.gov/34434233/.
Ferraro G, Mozzicafreddo M, Ettari R, Corsi L, Monti MC. A Proteomic Platform Unveils the Brain Glycogen Phosphorylase as a Potential Therapeutic Target for Glioblastoma Multiforme. Int J Mol Sci. 2022;23(15):8200 https://pmc.ncbi.nlm.nih.gov/articles/PMC9331883/.
Pan Y, Zhou Y, Shen Y, Xu L, Liu H, Zhang N. Hypoxia Stimulates PYGB Enzymatic Activity to Promote Glycogen Metabolism and Cholangiocarcinoma Progression. Cancer Res. 2024;84(22). https://pubmed.ncbi.nlm.nih.gov/39163511/.
Wang G, Ni X, Wang J, Dai M. METTL3-mediated m6A methylation of PYGB facilitates pancreatic ductal adenocarcinoma progression through the activation of NF-κB signaling. Pathol Res Pract. 2023;1(248):154645.
Yang C, Wang H, Shao M, Chu F, He Y, Chen X. Brain-Type Glycogen Phosphorylase (PYGB) in the Pathologies of Diseases: A Systematic Review. Cells. 2024;13(3):289 https://pmc.ncbi.nlm.nih.gov/articles/PMC10854662/.
Chuong EB, Elde NC, Feschotte C. Regulatory evolution of innate immunity through co-option of endogenous retroviruses. Sci. 2016;351(6277):1083 https://pmc.ncbi.nlm.nih.gov/articles/PMC4887275/.
Eskier D, Yetkin S, Arslan N, Karakülah G, Alotaibi H. Exploring Regulatory Roles of Transposable Elements in EMT and MET through Data-Driven Analysis: Insights from regulaTER. J Mol Biol. 2025;437(2):168887.
Fuentes DR, Swigut T, Wysocka J. Systematic perturbation of retroviral LTRs reveals widespread long-range effects on human gene regulation. Elife. 2018;7:e35989 https://pmc.ncbi.nlm.nih.gov/articles/PMC6158008/.
Chuong EB, Elde NC, Feschotte C. Regulatory evolution of innate immunity through co-option of endogenous retroviruses. Science. 2016;351(6277):1083–7 https://www.science.org/doi/10.1126/science.aad5497.
Campbell IM, Gambin T, Dittwald P, Beck CR, Shuvarikov A, Hixson P, et al. Human endogenous retroviral elements promote genome instability via non-allelic homologous recombination. BMC Biol. 2014;12(1):74 https://pmc.ncbi.nlm.nih.gov/articles/PMC4195946/.
Nguyen THM, Carreira PE, Sanchez-Luque FJ, Schauer SN, Fagg AC, Richardson SR, et al. L1 Retrotransposon Heterogeneity in Ovarian Tumor Cell Evolution. Cell Rep. 2018;23(13):3730–40.
Zhou X, Singh M, Santos GS, Guerlavais V, Carvajal LA, Aivado M. Pharmacologic Activation of p53 Triggers Viral Mimicry Response Thereby Abolishing Tumor Immune Evasion and Promoting Antitumor Immunity. Cancer Discov. 2021;11(12):3090–105 https://pubmed.ncbi.nlm.nih.gov/34230007/.
Petri R, Brattås PL, Sharma Y, Jonsson ME, Pircs K, Bengzon J. LINE-2 transposable elements are a source of functional human microRNAs and target sites. PLoS Genet. 2019;15(3). https://pubmed.ncbi.nlm.nih.gov/30865625/.
Pal D, Patel M, Boulet F, Sundarraj J, Grant OA, Branco MR. H4K16ac activates the transcription of transposable elements and contributes to their cis-regulatory function. Nat Struct Mol Biol. 2023;30(7):935–47 https://www.nature.com/articles/s41594-023-01016-5.
Sakashita A, Ariura M, Namekawa SH. CRISPR-Mediated Activation of Transposable Elements in Embryonic Stem Cells. Methods Mol Biol. 2022;2509:171–94 https://link.springer.com/protocol/10.1007/978-1-0716-2380-0_11.
Sun T, Xu Y, Xiang Y, Ou J, Soderblom EJ, Diao Y. Crosstalk between RNA m6A and DNA methylation regulates transposable element chromatin activation and cell fate in human pluripotent stem cells. Nat Gen. 2023;55(8):1324–35 https://www.nature.com/articles/s41588-023-01452-5.
Rodríguez-Quiroz R, Valdebenito-Maturana B. SoloTE for improved analysis of transposable elements in single-cell RNA-Seq data using locus-specific expression. Commun Biol. 2022;5(1). https://pmc.ncbi.nlm.nih.gov/articles/PMC9537157/.
He J, Babarinde IA, Sun L, Xu S, Chen R, Shi J. Identifying transposable element expression dynamics and heterogeneity during development at the single-cell level with a processing pipeline scTE. Nat Commun. 2021;12(1):1–14 https://www.nature.com/articles/s41467-021-21808-x.
Clinical Trial Number
Not applicable.
Funding
This study was supported by Warren Alpert Foundation Grant #17775 to N.T., by private philanthropic donations to the Laboratory of Cancer Epigenetics and Plasticity, and by internal support from Brown University to N.T.
Author information
Contributions
M.P., Y.S., O.L.: Experimentation, Data Analysis, Figures, Manuscript Writing and Editing. N.T.: Conceptualization, Supervision, Editing of Manuscript, Funding.
Ethics declarations
Ethics approval and consent to participate
No new human data was generated in this study and no additional ethical approval was required.
Consent for publication
Not applicable.
Competing Interests
The authors declare no competing interests.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Supplementary Information
13100_2025_370_MOESM1_ESM.pdf
Supplementary Material 1. Supplemental Fig. 1: (A) Distribution of expressed transposable elements (TEs) across chromosomes. Red columns show the expected distribution of expression as determined by 10,000 random simulations of expression. Blue corresponds to the observed expression across all samples in the dataset. The table provides the exact values, with significantly over or under represented chromosomes bolded (p-value < 0.01) (B) A heatmap showing the expression of the most variably expressed TEs across all cell lines. These TEs were used to determine the clustering of the glioblastoma stem cell samples using a consensus clustering approach. There is significant variability in the expression of these TEs between the two clusters, as well as within the individual clusters.
Supplemental Fig. 2: (A) Determination of ideal cluster number was done using ConsensusClusterPlus. The cumulative distribution function showed that K = 2 has the flattest curve, indicating stable clustering. This clustering was also supported by the cluster consensus matrix heatmap showing high intra-cluster consensus. (B) PCA plot of GSC gene expression, with clusters determined by the most variably expressed genes. The same consensus clustering approach was used to determine which algorithm best clusters the cells by their most variably expressed genes. The labels indicate the original TE cluster to which each sample belongs. Although partially overlapping, the groupings of cells are different, suggesting that clustering cells by TE expression can unveil novel subpopulations.
Supplemental Fig. 3: Significantly differentially expressed genes that were identified to be drivers of glioblastoma stem cell states by Neftel et al. (49). Confirming our other results, cluster 2 has increased expression levels of genes related to the mesenchymal signature and decreased expression of genes related to astrocyte, oligodendrocyte precursor and neural precursor signatures. The changes in expression are between the two clusters defined in the first section by their expression of variable TEs.
Supplemental Fig. 4: More representative screen captures from the Integrative Genomics Viewer demonstrating TE overlap with (A) genes related to epithelial to mesenchymal transition and (B) enhancer elements. GH01J247110 regulates the expression of ZNF669 and ZNF124.
Supplemental Fig. 5: (A) TFs related to glioblastoma signatures that had significant changes in their activity score between the two clusters (p-value < 0.05). These are the two clusters that were defined by the expression of variable TEs. (B) The log2 fold change in the expression of transposable elements with regulatory regions located in the enhancer region of genes is correlated with the change in the expression of target genes. This relationship is independent of the distance between the TE and the gene (Estimated Coefficient: 0.3624, p-value: 0.0067). The correlation coefficient increases when focusing solely on TEs with conserved transcription factor binding motifs (see Fig. 4C).
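As a hedged illustration of the kind of relationship summarized in Supplemental Fig. 5B, the sketch below fits an ordinary least-squares line relating TE log2 fold change to target-gene log2 fold change. It is not the model used in this study, which also considered TE-gene distance, and it does not reproduce the reported coefficient. The function name and all values are hypothetical.

```python
def ols_fit(x, y):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of squared deviations of x, and of cross-products of x and y.
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical log2 fold changes for TE loci and their putative target genes.
te_log2fc = [-2.0, -1.0, 0.0, 1.0, 2.0]
gene_log2fc = [-1.0, -0.2, 0.1, 0.3, 0.8]

slope, intercept = ols_fit(te_log2fc, gene_log2fc)
print(f"slope={slope:.3f}, intercept={intercept:.3f}")
```

A positive slope in such a fit would indicate that TE and target-gene expression change in the same direction between clusters; assessing significance would additionally require a test statistic and p-value, which this sketch omits.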
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
Pizzagalli, M.D., Suita, Y., Leary, O.P. et al. Transposable element dynamics in glioblastoma stem cells: insights from locus-specific quantification. Mobile DNA 16, 33 (2025). https://doi.org/10.1186/s13100-025-00370-z
DOI: https://doi.org/10.1186/s13100-025-00370-z