Abstract
Lungs are essential respiratory organs in terrestrial vertebrates, present in most bony fishes but absent in cartilaginous fishes, making them an ideal model for studying organ evolution. Here we analysed single-cell RNA sequencing data from adult and developing lungs across vertebrate species, revealing significant similarities in cell composition, developmental trajectories and gene expression patterns. Surprisingly, a large proportion of lung-related genes, coexpression patterns and many lung enhancers are present in cartilaginous fishes despite their lack of lungs, suggesting that a substantial genetic foundation for lung development existed in the last common ancestor of jawed vertebrates. In addition, the 1,040 enhancers that emerged since the last common ancestor of bony fishes probably contain lung-specific elements that led to the development of lungs. We further identified alveolar type 1 cells as a mammal-specific alveolar cell type, along with several mammal-specific genes, including ager and sfta2, that are highly expressed in lungs. Functional validation showed that deletion of sfta2 in mice leads to severe respiratory defects, highlighting its critical role in mammalian lung features. Our study provides comprehensive insights into the evolution of vertebrate lungs, demonstrating how both regulatory network modifications and the emergence of new genes have shaped lung development and specialization across species.
Data availability
All sequencing data and genome assemblies have been deposited in the NCBI database (PRJNA1026724).
Code availability
The code is available via GitHub at https://github.com/YeLi0909/vertebrate-lung (ref. 116).
References
Clack, J. A. Devonian climate change, breathing, and the origin of the tetrapod stem group. Integr. Comp. Biol. 47, 510–523 (2007).
Cupello, C. et al. Lung evolution in vertebrates and the water-to-land transition. eLife 11, e77156 (2022).
Sallan, L. C. & Coates, M. I. End-Devonian extinction and a bottleneck in the early evolution of modern jawed vertebrates. Proc. Natl Acad. Sci. USA 107, 10131–10135 (2010).
Bond, D. P. G. & Grasby, S. E. On the causes of mass extinctions. Palaeogeogr. Palaeoclimatol. Palaeoecol. 478, 3–29 (2017).
Perry, S. F., Wilson, R. J. A., Straus, C., Harris, M. B. & Remmers, J. E. Which came first, the lung or the breath? Comp. Biochem. Physiol. 129, 37–47 (2001).
Longo, S., Riccio, M. & McCune, A. Homology of lungs and gas bladders: Insights from arterial vasculature. J. Morphol. 274, 687–703 (2013).
Cass, A. N., Servetnick, M. & McCune, A. Expression of a lung developmental cassette in the adult and developing zebrafish swimbladder. Evol. Dev. 15, 119–132 (2013).
Zheng, W. et al. Comparative transcriptome analyses indicate molecular homology of zebrafish swimbladder and mammalian lung. PLoS ONE 6, e24019 (2011).
Daniels, C. B. et al. The origin and evolution of the surfactant system in fish: insights into the evolution of lungs and swim bladders. Physiol. Biochem. Zool. 77, 732–749 (2004).
Wang, X. et al. Archaeorhynchus preserving significant soft tissue including probable fossilized lungs. Proc. Natl Acad. Sci. USA 115, 11555–11560 (2018).
Trinajstic, K. et al. Exceptional preservation of organs in Devonian placoderms from the Gogo lagerstätte. Science 377, 1311–1314 (2022).
Janvier, P., Desbiens, S. & Willett, J. A. New evidence for the controversial ‘lungs’ of the Late Devonian antiarch Bothriolepis canadensis (Whiteaves, 1880) (Placodermi: Antiarcha). J. Vertebr. Paleontol. 27, 709–710 (2007).
Hara, Y. et al. Shark genomes provide insights into elasmobranch evolution and the origin of vertebrates. Nat. Ecol. Evol. 2, 1761–1771 (2018).
Hsia, C. C. W., Schmitz, A., Lambertz, M., Perry, S. F. & Maina, J. N. Evolution of air breathing: oxygen homeostasis and the transitions from water to land and sky. Compr. Physiol. 3, 849–915 (2013).
Lambertz, M., Grommes, K., Kohlsdorf, T. & Perry, S. Lungs of the first amniotes: why simple if they can be complex? Biol. Lett. 11, 20140848 (2015).
Rankin, S. A. et al. A molecular atlas of Xenopus respiratory system development. Dev. Dyn. 244, 69–85 (2015).
Zaccone, G., Mauceri, A., Maisano, M. & Fasulo, S. Innervation of lung and heart in the ray-finned fish, bichirs. Acta Histochem. 111, 217–229 (2009).
Maina, J. N. The morphology of the lung of the African lungfish, Protopterus aethiopicus: a scanning electron-microscopic study. Cell Tissue Res. 250, 191–196 (1987).
Wallau, B. R., Schmitz, A. & Perry, S. F. Lung morphology in rodents (Mammalia, Rodentia) and its implications for systematics. J. Morphol. 246, 228–248 (2000).
Raredon, M. S. B. et al. Single-cell connectomic analysis of adult mammalian lungs. Sci. Adv. 5, eaaw3851 (2019).
Dai, M. et al. Dissection of key factors correlating with H5N1 avian influenza virus driven inflammatory lung injury of chicken identified by single-cell analysis. PLoS Pathog. 19, e1011685 (2023).
Tolomeo, M., Cavalli, A. & Cascio, A. STAT1 and its crucial role in the control of viral infections. Int. J. Mol. Sci. 23, 4095 (2022).
Lelièvre, E. The Ets family contains transcriptional activators and repressors involved in angiogenesis. Int. J. Biochem. Cell B 33, 391–407 (2001).
Fahmy, R. G., Dass, C. R., Sun, L.-Q., Chesterman, C. N. & Khachigian, L. M. Transcription factor Egr-1 supports FGF-dependent angiogenesis during neovascularization and tumor growth. Nat. Med. 9, 1026–1032 (2003).
Hale, A. T. et al. Endothelial Krüppel-like factor 4 regulates angiogenesis and the Notch signaling pathway. J. Biol. Chem. 289, 12016–12028 (2014).
Feng, H., Zhang, Y.-B., Gui, J.-F., Lemon, S. M. & Yamane, D. Interferon regulatory factor 1 (IRF1) and anti-pathogen innate immune responses. PLoS Pathog. 17, e1009220 (2021).
Persyn, E. et al. IRF2 is required for development and functional maturation of human NK cells. Front. Immunol. 13, 1038821 (2022).
Emms, D. M. & Kelly, S. OrthoFinder: phylogenetic orthology inference for comparative genomics. Genome Biol. 20, 238 (2019).
Tarashansky, A. J. et al. Mapping single-cell atlases throughout Metazoa unravels cell type evolution. eLife 10, e66747 (2021).
Beers, M. F. & Mulugeta, S. The biology of the ABCA3 lipid transporter in lung health and disease. Cell Tissue Res. 367, 481–493 (2017).
Chroneos, Z. C., Sever-Chroneos, Z. & Shepherd, V. L. Pulmonary surfactant: an immunological perspective. Cell. Physiol. Biochem. 25, 13–26 (2010).
Becker, M.-B., Zülch, A., Bosse, A. & Gruss, P. Irx1 and Irx2 expression in early lung development. Mech. Dev. 106, 155–158 (2001).
van Tuyl, M. et al. Iroquois genes influence proximo-distal morphogenesis during rat lung development. Am. J. Physiol. Lung Cell. Mol. Physiol. 290, L777–L789 (2006).
Angenendt, L. et al. The neuropeptide receptor calcitonin receptor-like (CALCRL) is a potential therapeutic target in acute myeloid leukemia. Leukemia 33, 2830–2841 (2019).
Kumar, S., Stecher, G., Suleski, M. & Hedges, S. B. TimeTree: a resource for timelines, timetrees, and divergence times. Mol. Biol. Evol. 34, 1812–1819 (2017).
Negretti, N. M. et al. A single-cell atlas of mouse lung development. Development 148, dev199512 (2021).
Street, K. et al. Slingshot: cell lineage and pseudotime inference for single-cell transcriptomics. BMC Genomics 19, 477 (2018).
Minoo, P. Transcriptional regulation of lung development: emergence of specificity. Respir. Res. 1, 109–115 (2000).
Attarian, S. J. et al. Mutations in the thyroid transcription factor gene NKX2-1 result in decreased expression of SFTPB and SFTPC. Pediatr. Res. 84, 419–425 (2018).
Ikonomou, L. et al. The in vivo genetic program of murine primordial lung epithelial progenitors. Nat. Commun. 11, 635 (2020).
Belgacemi, R. et al. Hedgehog signaling pathway orchestrates human lung branching morphogenesis. Int. J. Mol. Sci. 23, 5265 (2022).
Zhang, Z. et al. Transcription factor Etv5 is essential for the maintenance of alveolar type II cells. Proc. Natl Acad. Sci. USA 114, 3903–3908 (2017).
Whitsett, J. A., Kalin, T. V., Xu, Y. & Kalinichenko, V. V. Building and regenerating the lung cell by cell. Physiol. Rev. 99, 513–554 (2019).
Domyan, E. T. et al. Signaling through BMP receptors promotes respiratory identity in the foregut via repression of Sox2. Development 138, 971–981 (2011).
Abdelwahab, E. M. M. et al. Wnt signaling regulates trans-differentiation of stem cell like type 2 alveolar epithelial cells to type 1 epithelial cells. Respir. Res. 20, 204 (2019).
Saito, A., Horie, M. & Nagase, T. TGF-β signaling in lung health and disease. Int. J. Mol. Sci. 19, 2460 (2018).
Brown, R. et al. Cathepsin S: investigating an old player in lung disease pathogenesis, comorbidities, and potential therapeutics. Respir. Res. 21, 111 (2020).
Anas, A., van der Poll, T. & de Vos, A. F. Role of CD14 in lung inflammation and infection. Crit. Care 14, 209 (2010).
Elias-Oliveira, J. et al. CD14 signaling mediates lung immunopathology and mice mortality induced by Achromobacter xylosoxidans. Inflamm. Res. 71, 1535–1546 (2022).
Singh, P. P. & Isambert, H. OHNOLOGS v2: a comprehensive resource for the genes retained from whole genome duplication in vertebrates. Nucleic Acids Res. 48, D724–D730 (2020).
Arora, R., Metzger, R. J. & Papaioannou, V. E. Multiple roles and interactions of Tbx4 and Tbx5 in development of the respiratory system. PLoS Genet. 8, e1002866 (2012).
Rodriguez-Esteban, C. et al. The T-box genes Tbx4 and Tbx5 regulate limb outgrowth and identity. Nature 398, 814–818 (1999).
Arendt, D. et al. The origin and evolution of cell types. Nat. Rev. Genet. 17, 744–757 (2016).
He, Y. et al. Spatiotemporal DNA methylome dynamics of the developing mouse fetus. Nature 583, 752–759 (2020).
Steele-Perkins, G. et al. The transcription factor gene Nfib is essential for both lung maturation and brain development. Mol. Cell. Biol. 25, 685–698 (2005).
Volpe, M. V. et al. Expression of Hoxb-5 during human lung development and in congenital lung malformations. Birth Defects Res. A 67, 550–556 (2003).
Gao, K.-Q. & Shubin, N. H. Late Jurassic salamandroid from western Liaoning, China. Proc. Natl Acad. Sci. USA 109, 5767–5772 (2012).
Pyron, R. A. et al. The draft genome sequences of 50 salamander species (Caudata, Amphibia). Biodivers. Genomes 2024, https://doi.org/10.56179/001c.116891 (2024).
Wang, K. et al. African lungfish genome sheds light on the vertebrate water-to-land transition. Cell 184, 1362–1376 (2021).
Liem, K. F. Form and function of lungs: the evolution of air breathing mechanisms. Am. Zool. 28, 739–759 (1988).
Maniatis, N. A., Chernaya, O., Shinin, V. & Minshall, R. D. Caveolins and lung function. Adv. Exp. Med. Biol. 729, 157–179 (2012).
Nguyen, N. M. et al. Lung development in laminin gamma2 deficiency: abnormal tracheal hemidesmosomes with normal branching morphogenesis and epithelial differentiation. Respir. Res. 7, 28 (2006).
Mittal, R. A. et al. SFTA2—a novel secretory peptide highly expressed in the lung—is modulated by lipopolysaccharide but not hyperoxia. PLoS ONE 7, e40011 (2012).
Wu, B. et al. Single-cell analysis of the amphioxus hepatic caecum and vertebrate liver reveals genetic mechanisms of vertebrate liver evolution. Nat. Ecol. Evol. 8, 1972–1990 (2024).
Darwin, C. The Origin of Species (Norton, 1975).
Oakley, T. H. & Speiser, D. I. How complexity originates: the evolution of animal eyes. Annu. Rev. Ecol. Evol. Syst. 46, 237–260 (2015).
Gregory, T. R. The evolution of complex organs. Evol. Educ. Outreach 1, 358–389 (2008).
Bishopric, N. H. Evolution of the heart from bacteria to man. Ann. N. Y. Acad. Sci. 1047, 13–29 (2005).
Jacob, F. Evolution and tinkering. Science 196, 1161–1166 (1977).
Griffith, O. W. & Wagner, G. P. The placenta as a model for understanding the origin and evolution of vertebrate organs. Nat. Ecol. Evol. 1, 0072 (2017).
Lynch, V. J. et al. Ancient transposable elements transformed the uterine regulatory landscape and transcriptome during the evolution of mammalian pregnancy. Cell Rep. 10, 551–561 (2015).
Rawn, S. M. & Cross, J. C. The evolution, regulation, and function of placenta-specific genes. Annu. Rev. Cell Dev. Biol. 24, 159–181 (2008).
Marçais, G. & Kingsford, C. A fast, lock-free approach for efficient parallel counting of occurrences of k-mers. Bioinformatics 27, 764–770 (2011).
Vurture, G. W. et al. GenomeScope: fast reference-free genome profiling from short reads. Bioinformatics 33, 2202–2204 (2017).
Hu, J., Fan, J., Sun, Z. & Liu, S. NextPolish: a fast and efficient genome polishing tool for long-read assembly. Bioinformatics 36, 2253–2255 (2020).
Durand, N. C. et al. Juicer provides a one-click system for analyzing loop-resolution Hi-C experiments. Cell Syst. 3, 95–98 (2016).
Dudchenko, O. et al. De novo assembly of the Aedes aegypti genome using Hi-C yields chromosome-length scaffolds. Science 356, 92–95 (2017).
Manni, M., Berkeley, M. R., Seppey, M., Simão, F. A. & Zdobnov, E. M. BUSCO update: novel and streamlined workflows along with broader and deeper phylogenetic coverage for scoring of eukaryotic, prokaryotic, and viral genomes. Mol. Biol. Evol. 38, 4647–4654 (2021).
Stanke, M. & Waack, S. Gene prediction with a hidden Markov model and a new intron submodel. Bioinformatics 19, ii215–225 (2003).
Kent, W. J. BLAT—the BLAST-like alignment tool. Genome Res. 12, 656–664 (2002).
Birney, E., Clamp, M. & Durbin, R. GeneWise and Genomewise. Genome Res. 14, 988–995 (2004).
Stuart, T. et al. Comprehensive integration of single-cell data. Cell 177, 1888–1902 (2019).
Wolock, S. L., Lopez, R. & Klein, A. M. Scrublet: computational identification of cell doublets in single-cell transcriptomic data. Cell Syst. 8, 281–291 (2019).
Young, M. D. & Behjati, S. SoupX removes ambient RNA contamination from droplet-based single-cell RNA sequencing data. Gigascience 9, giaa151 (2020).
Haghverdi, L., Lun, A. T. L., Morgan, M. D. & Marioni, J. C. Batch effects in single-cell RNA-sequencing data are corrected by matching mutual nearest neighbors. Nat. Biotechnol. 36, 421–427 (2018).
Welch, J. D. et al. Single-cell multi-omic integration compares and contrasts features of brain cell identity. Cell 177, 1873–1887 (2019).
Gayoso, A. et al. A Python library for probabilistic analysis of single-cell omics data. Nat. Biotechnol. 40, 163–166 (2022).
Chen, S., Zhou, Y., Chen, Y. & Gu, J. fastp: an ultra-fast all-in-one FASTQ preprocessor. Bioinformatics 34, i884–i890 (2018).
Kim, D., Paggi, J. M., Park, C., Bennett, C. & Salzberg, S. L. Graph-based genome alignment and genotyping with HISAT2 and HISAT-genotype. Nat. Biotechnol. 37, 907–915 (2019).
Pertea, M. et al. StringTie enables improved reconstruction of a transcriptome from RNA-seq reads. Nat. Biotechnol. 33, 290–295 (2015).
Langmead, B., Wilks, C., Antonescu, V. & Charles, R. Scaling read aligners to hundreds of threads on general-purpose processors. Bioinformatics 35, 421–432 (2019).
Li, H. et al. The Sequence Alignment/Map format and SAMtools. Bioinformatics 25, 2078–2079 (2009).
Quinlan, A. R. & Hall, I. M. BEDTools: a flexible suite of utilities for comparing genomic features. Bioinformatics 26, 841–842 (2010).
Ramírez, F., Dündar, F., Diehl, S., Grüning, B. A. & Manke, T. deepTools: a flexible platform for exploring deep-sequencing data. Nucleic Acids Res. 42, W187–W191 (2014).
Zhang, Y. et al. Model-based analysis of ChIP-seq (MACS). Genome Biol. 9, R137 (2008).
Wu, T. et al. clusterProfiler 4.0: a universal enrichment tool for interpreting omics data. Innovation 2, 100141 (2021).
Aibar, S. et al. SCENIC: single-cell regulatory network inference and clustering. Nat. Methods 14, 1083–1086 (2017).
Jin, S. et al. Inference and analysis of cell–cell communication using CellChat. Nat. Commun. 12, 1088 (2021).
Qiu, X. et al. Reversed graph embedding resolves complex single-cell trajectories. Nat. Methods 14, 979–982 (2017).
Buchfink, B., Reuter, K. & Drost, H.-G. Sensitive protein alignments at tree-of-life scale using DIAMOND. Nat. Methods 18, 366–368 (2021).
Katoh, K. & Standley, D. M. MAFFT multiple sequence alignment software version 7: improvements in performance and usability. Mol. Biol. Evol. 30, 772–780 (2013).
Stamatakis, A. RAxML version 8: a tool for phylogenetic analysis and post-analysis of large phylogenies. Bioinformatics 30, 1312–1313 (2014).
Löytynoja, A. Phylogeny-aware alignment with PRANK. Methods Mol. Biol. 1079, 155–170 (2014).
Talavera, G. & Castresana, J. Improvement of phylogenies after removing divergent and ambiguously aligned blocks from protein sequence alignments. Syst. Biol. 56, 564–577 (2007).
Yang, Z. PAML 4: phylogenetic analysis by maximum likelihood. Mol. Biol. Evol. 24, 1586–1591 (2007).
Wang, D., Zhang, Y., Zhang, Z., Zhu, J. & Yu, J. KaKs_Calculator 2.0: a toolkit incorporating gamma-series methods and sliding window strategies. Genomics Proteomics Bioinform. 8, 77–80 (2010).
Szklarczyk, D. et al. The STRING database in 2023: protein–protein association networks and functional enrichment analyses for any sequenced genome of interest. Nucleic Acids Res. 51, D638–D646 (2023).
Kiełbasa, S. M., Wan, R., Sato, K., Horton, P. & Frith, M. C. Adaptive seeds tame genomic sequence comparison. Genome Res. 21, 487–493 (2011).
Blanchette, M. et al. Aligning multiple genomic sequences with the threaded blockset aligner. Genome Res. 14, 708–715 (2004).
Hubisz, M. J., Pollard, K. S. & Siepel, A. PHAST and RPHAST: phylogenetic analysis with space/time models. Brief. Bioinform. 12, 41–51 (2011).
Zhang, Z., Schwartz, S., Wagner, L. & Miller, W. A greedy algorithm for aligning DNA sequences. J. Comput. Biol. 7, 203–214 (2000).
Li, H. Protein-to-genome alignment with miniprot. Bioinformatics 39, btad014 (2023).
Wang, Y. et al. The 3D Genome Browser: a web-based browser for visualizing 3D genome organization and long-range chromatin interactions. Genome Biol. 19, 151 (2018).
Frazer, K. A., Pachter, L., Poliakov, A., Rubin, E. M. & Dubchak, I. VISTA: computational tools for comparative genomics. Nucleic Acids Res. 32, W273–W279 (2004).
Demichev, V., Messner, C. B., Vernardis, S. I., Lilley, K. S. & Ralser, M. DIA-NN: neural networks and interference correction enable deep proteome coverage in high throughput. Nat. Methods 17, 41–44 (2020).
Ye, L. YeLi0909/vertebrate-lung: vertebrate-lung v1.1.1. Zenodo https://doi.org/10.5281/zenodo.14546703 (2024).
Acknowledgements
The project was supported by the National Natural Science Foundation of China (32122021, 32370452, 82200040, 32225009 and 32100367), the National Key R&D Program of China (2022YFC3400300), the New Cornerstone Investigator Program to W.W., the 1000 Talent Project of Shaanxi Province to K.W. and Q.Q., and the Fundamental Research Funds for the Central Universities.
Author information
Authors and Affiliations
Contributions
K.W., Q.Q., Y.L. and W.W. designed this project and research aspects. C.F., B.W. and M.H. performed sample collection. T.X., F.Z., Y.L. and J.H. contributed to sequencing library construction for CUT&Tag. M.H. performed the scRNA-seq and bulk RNA-seq data analysis for white-spotted bamboo shark. J.Z. conducted the search for CNE. Y.L. conducted the remaining data analysis. C.Z., W.X., Z.L., L.Z. and P.X. provided valuable suggestions for the study. Y.Z. and Z.Z. contributed to the experimental components of this project. Y.L. and K.W. contributed to figure design. Y.L., K.W., Q.Q. and W.W. wrote the manuscript. K.W. and W.W. amended it.
Corresponding authors
Ethics declarations
Competing interests
The authors declare no competing interests.
Peer review
Peer review information
Nature Ecology & Evolution thanks Florent Murat and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Extended data
Extended Data Fig. 1 Comparative analysis of cell types and gene expressions across vertebrate species using single-cell RNA sequencing data.
This figure displays a phylogenetic tree of various species (from Senegal bichir to rat), with corresponding cell counts and UMAP plots showing cell type clustering. Each species’ plot is color-coded to represent different cell types, belonging to immune, stromal, endothelial and epithelial cells. Adjacent to the UMAP plots, a dot plot heatmap illustrates gene expression patterns across cell types, with dot size indicating the percentage of cells expressing each gene and color intensity showing expression levels. The UMAP plots for pig, human, mouse, and rat are provided at the bottom. The light color block highlights the data generated for this study.
Extended Data Fig. 2 Cross-species integration of lung cells from nine vertebrate species.
a, UMAP plots showcasing lung cell distributions from each species, derived from merged SAMAP coordinates. Species range from Senegal bichir to rat, illustrating evolutionary diversity. b, Stacked bar chart quantifying the proportional composition of cell types across species. This visualization highlights interspecies variations in lung cellular makeup. c, Comparative UMAP plots demonstrating the integration efficacy of six different computational methods (CCA, RPCA, MNN, scANVI, scVI, and LIGER) on lung cells from all nine species. The upper row is color-coded by species origin, while the lower row is color-coded by cell types as annotated from each species.
Extended Data Fig. 3 Conservation of cellular communication, transcription factor networks, and gene expression in vertebrate lungs.
a, Co-expression network of 23 conserved lung transcription factors and their partial target genes. This network illustrates the complex regulatory relationships governing lung cell identity and function across species. b, Comparative VEGF signaling networks in Senegal bichir, African lungfish, and mouse lungs, derived from CellChat analysis. Circle sizes indicate cell type proportions, while edge widths represent intercellular communication probabilities. This visualization highlights conserved signaling patterns across evolutionarily distant species. c, Dot plot depicting the expression patterns of lung-specific genes across various mouse tissue cell populations. Rows represent individual genes, columns represent cell types, and color intensity indicates scaled average expression levels.
Extended Data Fig. 4 The shared and diverged gene expression pattern between mouse and chicken.
a, UMAP clustering diagram of mouse lung development. The left plot shows the overall cell type distribution, with major populations such as stromal cells, epithelial cells, endothelial cells, and specialized cell types (AT1, AT2, secretory, and ciliated cells) clearly demarcated. The right series of plots demonstrate the temporal progression of lung cell populations from embryonic day 12 (E12) through postnatal day 14 (P14), showcasing the dynamic changes in cellular composition during development. b, Similar and diverged gene expression patterns in epithelial cells between mouse (top row) and chicken (bottom row) during lung development. The left four genes (nkx2-1, sftpc, lpcat1, fgfr2) exhibit similar expression patterns across both species, indicating conserved developmental processes. In contrast, the right two genes (etv5, shh) show divergent expression patterns, suggesting species-specific adaptations in lung development. c, A subset of genes with mammalian-specific expression patterns during lung development. The dot plot compares expression levels across various species, from chicken to rat, emphasizing genes with higher expression or prevalence in mammalian lungs. d, The dynamic proportions of major cell types during lung development in both chicken (top) and mouse (bottom). Notably, the proportion of stromal cells decreases over time in both species, indicating a conserved developmental trend. e, Heatmap of 32 shared signaling pathways in stromal cells between mouse and chicken. f, The expression of key genes in chicken stromal cells that are known to be highly expressed in mouse lung stromal cells during development. g, Heatmaps displaying genes with progressively increasing expression levels in chicken endothelial cells (left) and mouse endothelial cells (right). h, Bubble plot showing enriched GO terms in endothelial cells (385 genes). i, Venn diagram showing the overlap between lung development genes and the adult lung gene sets (the upper part is the gene set with conserved expression in adult lungs, and the lower part is the gene set specifically expressed in the lungs).
Extended Data Fig. 5 Evolution of lung-related genes in cartilaginous fish and analysis of lung-associated enhancers.
a, Phylogenetic trees depicting the evolutionary history of cd14 (left) and of ctss and ctsk (right) across various species. Numbers at each node represent posterior probabilities (as percentages), indicating the level of support for each branching event. b, Bar plot showing the proportion of lung-related genes originating from 2R-WGD, which is significantly higher than the background proportion. c, Phylogenetic tree depicting the evolutionary history of the sftpb gene across various species. d, Heatmap showing the expression levels of lung-specific genes in various tissues of the bamboo shark (bulk RNA-seq). e, Comparison of evolutionary rates for lung-related genes (left) and other genes (right) between cartilaginous and bony fish. These plots show the distribution of Ka (nonsynonymous substitution rate), Ks (synonymous substitution rate), and Ka/Ks ratio, all calculated based on the ancestral sequence. f, Bar plots comparing the expression levels (in FPKM) of two key lung-specific genes, sftpb and abca3, across different tissues in three species: African lungfish, Senegal bichir, and bamboo shark. The plots reveal a degree of co-expression of these genes in lungfish and bichir, particularly in lung tissues, while showing no apparent correlation in the bamboo shark. g, UMAP plot showing the distribution of cells co-expressing sftpb and abca3 in the esophagus and stomach from white-spotted bamboo shark.
Extended Data Fig. 6 Evolution of lung-related regulatory elements.
a,b, Heatmaps showing the distribution of CUT&Tag signals for histone modifications H3K27ac and H3K4me1 around gene transcription start sites (TSS) in embryonic chicken lungs at 9, 11, and 19 days, and at postnatal day 23 (a), as well as in 4-week-old mouse lungs (b). c, Stacked plot depicting the relative genomic positions of selected genes and their nearby CNEs in humans, with different rows representing the evolutionary origins of the CNEs. d, Detailed view of 10 CNEs near the TBX4 gene. Top: human lung Hi-C interactions, TAD (Topologically Associating Domain) distribution, and CNE positions (using hg38 as reference). Middle: H3K27ac signals in chicken and mouse embryonic lungs. Bottom: Sequence conservation across multiple species.
Extended Data Fig. 7 Expression of lung-specific genes in swim bladder.
a, Left: UMAP plot showing integrated cell clustering for lungs of African lungfish, Senegal bichir, and swim bladder of zebrafish. Right: Bar chart showing proportion of each cell population (stromal, immune, epithelial, endothelial) in these three species. b, Dot plot displaying expression of lung-specific genes in different cell populations of the zebrafish swim bladder.
Extended Data Fig. 8 Mammalian-specific adaptations in lung structure and function.
a, UMAP plots illustrating the further specialization of mammalian lung endothelial cells, particularly highlighting a distinct population of lung capillary cells not identified in the five non-mammalian species studied. The upper panel shows mammalian lung endothelial cells, while the lower panel represents non-mammalian species. b, Dot plots demonstrating that capillary marker genes are not prominently expressed in lung endothelial cells of non-mammalian species. Triangles indicate endothelial cells in each species. This plot compares gene expression across different cell types and species. c, RNA in situ hybridization images of ca4 (capillary endothelial cell marker) and vwf (vascular endothelial cell marker) in lung sections from mouse, bearded dragon, and African bullfrog. The results indicate that ca4 expression is not detectable in the lungs of these two non-mammalian species, supporting the findings from (b). d, Clustering relationships among AT1, AT2, and respiratory epithelial cells. The left panel shows a phylogenetic tree constructed using expression data, while the right panel represents a similarity matrix. e, UMAP plot of respiratory epithelial cells from nine vertebrate species, generated using SAMAP. Colors represent cell types (left) and species (right). f, Dot plots illustrating the expression patterns of typical AT2 marker genes and AT1-specific genes (found to be upregulated in mammalian AT1 cells compared to non-mammalian respiratory epithelial cells) in vertebrate lung epithelial cells. Mammals are shown on the top, and non-mammals on the bottom. The scattered expression patterns in non-mammals suggest the specialization of AT1 cells in mammals.
Extended Data Fig. 9 Mammalian-specific genes may drive cell type specialization.
a, Phylogenetic tree illustrating the origin of the sfta2 gene from a duplication of the rhcg gene. b, Phylogenetic tree illustrating the origin of the scgb3a2 gene. c, Enrichment analysis of genes upregulated in the lungs of sfta2 knockout homozygous mice. The plot shows enriched GO terms, ReactomeDB pathways, and WikiPathways, with dot size indicating gene count and color representing significance.
Supplementary information
Supplementary Table
The SupplementaryTable.xlsx file contains 11 sheets.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Li, Y., Hu, M., Zhang, Z. et al. Origin and stepwise evolution of vertebrate lungs. Nat Ecol Evol 9, 672–691 (2025). https://doi.org/10.1038/s41559-025-02642-6
DOI: https://doi.org/10.1038/s41559-025-02642-6
- Springer Nature Limited