Aging: How Science Works
Preface
the interplay of transcription factors and chromatin, which is the core mechanism
for the proper function of all our tissues and cell types in health, disease and aging.
Next, we shift to cellular mechanisms and discuss the impact of biochemistry and
immunity on the process of aging. We then discuss the relation of aging to chronic
disorders, such as the metabolic syndrome, cancer and neurodegenerative diseases,
as well as to nutrition and physical activity. Do we get these disorders
because we are aging, or are we aging because we get one or more of
these diseases? By studying premature aging syndromes we not only shed light on
biological processes but also ask ourselves what we expect from life and how we use
the time we are given. Eventually, we will have to admit that from an evolutionary
perspective aging makes sense. But what can we do to age as healthily as possible, and
can we slow down the aging process through healthy diets and physical activity?
This book is linked to a series of lecture courses in “Molecular Medicine and
Genetics”, “Molecular Immunology”, “Cancer Biology” and “Nutrigenomics” that
have been given by one of us (C. Carlberg) in different forms since 2002 at the University of
Eastern Finland in Kuopio. Over the last 20 years the topic of aging has become increasingly
important in these courses. The core of this book is related to our textbook Molecular
Medicine: How Science Works (ISBN 978-3-031-27132-8), which in large part
is based on our book Nutrigenomics: How Science Works (ISBN 978-3-030-47663-2).
By combining a basic understanding of molecular and cellular mechanisms with
clinical examples, the authors hope to make this textbook a personal experience. A
glossary in the appendix explains the major specialist terms.
We hope that readers will enjoy this rather visual book and become as enthusiastic
about understanding the process of aging as the authors are.
Olsztyn, Poland/Kuopio, Finland Carsten Carlberg
Oslo, Norway Stine M. Ulven
Düsseldorf, Germany Eunike Velleuer
March 2024
Contents
1 Human Genome, Development, Evolution and Aging
1.1 The Human Genome and Its Variation
1.2 Principles of Development and Aging
1.3 Evolutionary Perspective on Aging
1.4 Hallmarks of Aging
Bibliography
2 Principles of Gene Expression and Epigenetics
2.1 Gene Regulation
2.2 Chromatin Structure and Epigenetics
2.3 Information Storage by Chromatin Modifications
2.4 Epigenetics Enables Gene Expression
Bibliography
3 Epigenetics, Memory and Aging
3.1 Transgenerational Epigenetics
3.2 Epigenetics of Aging
3.3 Epigenetics and Time: The Circadian Clock
Bibliography
4 Biochemistry of Aging
4.1 Principles of Metabolism
4.2 Aging and Conserved Nutrient Sensing Pathways
4.3 Neuroendocrine Regulation of Energy Metabolism and Aging
Bibliography
5 Molecular and Cellular Basis of Aging
5.1 Mitochondria and Endoplasmic Reticulum Dysfunctions
5.2 Apoptosis, Autophagy and the Loss of Proteostasis
5.3 Stem Cell Exhaustion and Cellular Senescence
5.4 Long- and Short-Lived Cells
Bibliography
6 Immunity and Aging
6.1 Innate and Adaptive Immunity
6.2 Relation of Epigenetics and Immunity
6.3 Inflammation and Aging
6.4 The Aging Immune System
6.5 The Microbiome in Aging
Bibliography
7 Chronic Diseases and Aging
7.1 The Global Burden of Diseases
7.2 The Metabolic Syndrome: Obesity, T2D and CVD
7.3 Cancer
7.4 Neurodegenerative Diseases
Bibliography
8 Premature Aging
8.1 DNA Repair Mechanisms
8.2 Premature Aging Syndromes Associated with Genomic Instability
8.3 Laminopathies
Bibliography
9 Healthy Aging and Longevity
9.1 Impact of Energy Balance and Dietary Macronutrient Composition
9.2 Impact of Physical Activity
9.3 Aging Clocks
9.4 The Socio-Economic Need of Healthy Aging
Bibliography
Glossary
Abbreviations
1,25(OH)2D3 1,25-dihydroxyvitamin D3
25(OH)D3 25-hydroxyvitamin D3
3D 3-dimensional
5caC 5-carboxylcytosine
5fC 5-formylcytosine
5hmC 5-hydroxymethylcytosine
5hmU 5-hydroxymethyluracil
5mC 5-methylcytosine
A Adenine
AC Adenylate cyclase
ACAC Acetyl-CoA carboxylase
ADP Adenosine diphosphate
AGRP Agouti-related neuropeptide
AKT AKT murine thymoma viral oncogene homolog
AMP Adenosine monophosphate
AMPK Adenosine monophosphate-activated protein kinase
AP1 Activator protein 1 (JUN-FOS heterodimer)
APC APC regulator of WNT signaling pathway
APO Apolipoprotein
AR Androgen receptor
ARC Arcuate nucleus
ARID5B AT-rich interaction domain 5B
ASC Apoptosis-associated speck-like protein containing a CARD
ATF Activating transcription factor
ATM ATM serine/threonine kinase
ATP Adenosine triphosphate
ATR ATR serine/threonine kinase
ATRX ATRX chromatin remodeler
Aβ Amyloid β
BAK BCL2 antagonist/killer 1
BANF1 BAF nuclear assembly factor 1
BAX BCL2 associated X, apoptosis regulator
BCL2 BCL2 apoptosis regulator
BCL2L1 BCL2-like 1
BCR B cell receptor
BDNF Brain-derived neurotrophic factor
BER Base excision repair
BLM BLM RecQ like helicase
BMAL1 Basic helix-loop-helix ARNT like 1
BMF Bone marrow failure
BMI Body mass index
BRCA1 BRCA1 DNA repair associated
BRIP1 BRCA1 interacting protein C-terminal helicase 1
C Cytosine
CAMKK2 Ca2+/calmodulin-dependent protein kinase kinase 2
CAMP Cathelicidin
CASP Caspase
CCK Cholecystokinin
CCL Chemokine C-C motif ligand
CDK Cyclin-dependent kinase
CDKN1A Cyclin-dependent kinase inhibitor 1A, also called p21
CDKN2A Cyclin-dependent kinase inhibitor 2A, also called p16
CEBP CCAAT enhancer binding protein
CFU Colony forming unit
CHEK Checkpoint kinase
CLOCK Clock circadian regulator
CNS Central nervous system
CNV Copy number variation
COPD Chronic obstructive pulmonary disease
COVID-19 Coronavirus disease 2019
CpG CpG dinucleotide
CREB3L3 cAMP responsive element binding protein 3-like 3
CRP C-reactive protein
CRTC2 CREB-regulated transcription coactivator 2
CRY1 Cryptochrome circadian clock 1
CSNK2A1 Casein kinase 2α 1
CTCF CCCTC binding factor
CVD Cardiovascular disease
CXXC1 CXXC finger protein 1
CYP27B1 Cytochrome P450 family 27 subfamily B member 1
DAF Abnormal dauer formation
DALY Disability-adjusted life-year
DAMP Damage-associated molecular pattern
DAXX Death domain associated protein
DDR DNA damage response
DEFB4 Defensin beta 4A
DNMT DNA methyltransferase
DOHaD Developmental origins of health and disease
EGF Epidermal growth factor
EIF2AK3 Eukaryotic translation initiation factor 2α kinase 3
ENCODE Encyclopedia of DNA elements
EP300 E1A-binding protein p300, also called KAT3B
eQTL Expression quantitative trait locus
ERCC ERCC excision repair, TFIIH core complex helicase subunit
ERN1 Endoplasmic reticulum to nucleus signaling 1
ES Embryonic stem
ESR Estrogen receptor
FAS FAS cell surface death receptor
FASLG FAS ligand
FGF Fibroblast growth factor
FOXO Forkhead box O
FTO Fat mass and obesity associated
G Guanine
GABA Gamma-aminobutyric acid
Gb Giga (1,000,000,000) base pairs
GH1 Growth hormone 1
GHR GH receptor
GIS1 GLG1-2 suppressor
GLP1 Glucagon-like peptide 1
GPAT Glycerol phosphate acyl transferase
GWAS Genome-wide association study
HAT Histone acetyltransferase
HbA1c Glycated hemoglobin
HBV Hepatitis B virus
HDAC Histone deacetylase
HDL High-density lipoprotein
HIF1 Hypoxia-inducible factor 1
HIV-1 Human immunodeficiency virus 1
HP1 Heterochromatin protein 1, official name CBX5
HPV Human papilloma virus
HR Homologous recombination
HSC Hematopoietic stem cell
HSF1 Heat shock transcription factor 1
HSP Heat shock protein
IgA Immunoglobulin A
IGF Insulin-like growth factor
IGF1R IGF1 receptor
IKBK Inhibitor of kappa light polypeptide gene enhancer in B cells, kinase
IL Interleukin
ILC Innate lymphoid cell
ILP Insulin-like peptide
indel Insertion-deletion
INF Interferon
INSR Insulin receptor
iPS Induced pluripotent stem
IRS Insulin receptor substrate
IRX Iroquois homeobox
IVF In vitro fertilization
kb Kilo (1000) base pairs
KDM Lysine demethylase
KLF4 Krüppel-like factor 4
KMT Lysine methyltransferase
LAD Lamina-associated domain
LCR Locus control region
LDL Low-density lipoprotein
LINE Long interspersed nuclear element
LKB1 Liver kinase B1
LPS Lipopolysaccharide
LTR Long terminal repeat
LXR Liver X receptor
MAF Minor allele frequency
MAPK Mitogen-activated protein kinase
Mb Mega (1,000,000) base pairs
MBD Methyl-DNA-binding domain
MC4R Melanocortin 4 receptor
M-CFU Myeloid stem cells
mCH Non-CpG methylation
MCP1 Monocyte chemoattractant protein 1
MDM2 MDM2 proto-oncogene, E3 ubiquitin protein ligase
MECP2 Methyl-CpG-binding protein 2
MHC Major histocompatibility complex
miRNA Micro RNA
MLH1 MutL homolog 1
MMP Matrix metalloproteinase
MMR Mismatch repair
mRNA Messenger RNA
MSN Multicopy suppressor of SNF1 mutation
MYC MYC proto-oncogene, BHLH transcription factor
NAD Nicotinamide adenine dinucleotide
NAFLD Non-alcoholic fatty liver disease
NAMPT Nicotinamide phosphoribosyltransferase, also called visfatin
NANOG Nanog homeobox
NCOR1 Nuclear receptor corepressor 1
ncRNA Non-coding RNA
NER Nucleotide excision repair
NFAT Nuclear factor of activated T cells
NFκB Nuclear factor kappa B
NGS Next-generation sequencing
NHEJ Non-homologous end-joining
NK Natural killer
NLR NOD-like receptor
NLRP NLR protein
NPY Neuropeptide Y
NRIP1 Nuclear receptor interacting protein 1
NTS Nucleus tractus solitarius
OCT4 Octamer-binding transcription factor 4, official name POU5F1
OR Odds ratio
PALB2 Partner and localizer of BRCA2
PAMP Pathogen-associated molecular pattern
PARP1 Poly(ADP-ribose) polymerase 1
PBMC Peripheral blood mononuclear cell
PER1 Period circadian clock 1
PGC Primordial germ cell
PI3K Phosphatidylinositol-4,5-bisphosphate 3-kinase
Pol II RNA polymerase II
POMC Proopiomelanocortin
POU1F1 POU class 1 homeobox 1
PPAR Peroxisome proliferator-activated receptor
PPARGC1α Peroxisome proliferator-activated receptor γ, coactivator 1α
PRC Polycomb repressive complex
PRK Protein kinase
PRKDC Protein kinase, DNA-activated, catalytic subunit
PROP1 PROP paired-like homeobox 1
PRR Pattern-recognition receptor
PTEN Phosphatase and tensin homolog
PTGS2 Prostaglandin-endoperoxide synthase 2, also known as COX2
RAPTOR Regulatory associated protein of TOR
RAS Rat sarcoma
RB1 Retinoblastoma 1
RECQL4 RecQ like helicase 4
REST RE1-silencing transcription factor
REV-ERB Reverse-Erb, official gene symbol NR1D1
RNA-seq RNA sequencing
ROR RAR-related orphan receptor
ROS Reactive oxygen species
rRNA Ribosomal RNA
S6K S6 kinase
SAM S-adenosyl-L-methionine
SARS-CoV-2 Severe acute respiratory syndrome coronavirus 2
SASP Senescence-associated secretory phenotype
SCFA Short chain fatty acid
SCN Suprachiasmatic nucleus
SINE Short interspersed nuclear element
SIRT Sirtuin
SNP Single nucleotide polymorphism
snRNA Small nuclear RNA
SNV Single nucleotide variant
SORT1 Sortilin 1
SOX2 SRY-box 2
SREBF1 Sterol regulatory element-binding transcription factor 1
SWI/SNF Switching/sucrose non-fermenting
T Thymine
T1D Type 1 diabetes
T2D Type 2 diabetes
T3 Triiodothyronine
TAD Topologically associated domain
TBC1D TBC1 domain family, member 1
TCGA The Cancer Genome Atlas
TCR T cell receptor
TDG Thymine-DNA glycosylase
TERC Telomerase RNA component
TERT Telomerase reverse transcriptase
TET Ten-eleven translocation
TGFβ Transforming growth factor β
TH T helper
TIFIA Transcription initiation factor IA
TLR Toll-like receptor
TNF Tumor necrosis factor
TOR(C) Target of rapamycin (complex)
TP53 Tumor protein p53
TRAF2 TNF receptor-associated factor 2
TREG Regulatory T
tRNA Transfer RNA
TSC2 Tuberous sclerosis 2
TSS Transcription start site
U Uracil
UCP Uncoupling protein
ULK1 UNC-51-like kinase 1
UTR Untranslated region
UV Ultraviolet
VDR Vitamin D receptor
VEGF Vascular endothelial growth factor
VLDL Very low-density lipoprotein
WHO World Health Organization
WRN Werner syndrome RecQ-like helicase
XBP1 X-box-binding protein 1
ZCWPW1 Zinc finger CW-type and PWWP domain containing 1
ZMPSTE24 Zinc metallopeptidase STE24
α-MSH α-melanocyte-stimulating hormone
Chapter 1
Human Genome, Development,
Evolution and Aging
Abstract In this introductory chapter, we will briefly describe the genetic basis of
the variation of human populations and individuals, which contributes to disease
risk and speed of aging. The most recent information on human genome variation is
based on big biology consortia like the 1000 Genomes project or the UK Biobank.
Genome-wide genotyping and whole genome sequencing allow the study and analysis
of complex diseases on the basis of dozens to hundreds of genetic variants, such as
single nucleotide variants (SNVs) and copy number variations (CNVs), in regulatory
and coding regions of genes. Next, we discuss the principles of (embryonic) development
and stem cells. In this context we introduce aging as the natural progressive
decline in the function of cells, tissues and organs that leads to impaired functions
of the body. Accordingly, older age is the primary risk factor for numerous
non-communicable diseases. Major hallmarks of aging are cellular senescence, genome
instability, epigenetic alterations and telomere attrition, which we discuss in more
detail in the following chapters.
Keywords Human genome · Big biology projects · 1000 Genomes project · Cell
lineage commitment · Cellular reprogramming · Lifespan · Healthspan ·
Senescence · Hallmarks of aging
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2024
C. Carlberg et al., Aging, https://doi.org/10.1007/978-3-031-61257-2_1
1.1 The Human Genome and Its Variation
The human genome is important for aging research, because interindividual differences
in the process of aging are driven in part by genetic variations. The reference
haploid sequence of the human genome (Box 1.1) was released in 2001 by the first
big biology consortium, the Human Genome project. In relation to the reference
genome, every human carries millions of variants, most of which are SNVs. These
are genetic variants where exactly one nucleotide is altered (Fig. 1.1). In contrast,
structural variants of the genome mostly affect more than one nucleotide. These can
be insertion-deletion (indel) variants, where in most cases only a few bases are added
or removed, but there are also indels of up to 80 kilo base pairs (kb) in
length. Indels that are not multiples of 3 base pairs and are located within protein-coding
regions result in frameshift mutations, i.e., from the position of the mutation
onwards the whole amino acid sequence of the encoded protein is changed. Furthermore,
CNVs are structural variants that consist of deletions or insertions of DNA
stretches in one genome compared to another. These variants can be heterozygous or
homozygous. A predominant class of insertions is that of ancient transposons. These
DNA stretches persist in the genome as short interspersed nuclear elements (SINEs,
e.g., Alu elements) and long interspersed nuclear elements (LINEs, Box 1.1).
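The frameshift rule described above can be illustrated with a minimal Python sketch. The function names `is_frameshift` and `codons` and the example sequence are assumptions made for illustration and are not taken from the book; the sketch only looks at the net length change of an indel inside a coding region.

```python
def is_frameshift(ref: str, alt: str) -> bool:
    """Return True if replacing ref by alt shifts the reading frame.

    Assumption: both alleles lie entirely within a protein-coding region.
    """
    length_change = abs(len(alt) - len(ref))
    return length_change % 3 != 0


def codons(seq: str) -> list[str]:
    """Split a coding sequence into complete codons, dropping trailing bases."""
    usable = len(seq) - len(seq) % 3
    return [seq[i:i + 3] for i in range(0, usable, 3)]


# Hypothetical coding sequence (not one of the sequences in Fig. 1.1)
reference = "ATGGCTATGGCA"
two_base_deletion = reference[:4] + reference[6:]  # removes the bases "CT"

print(codons(reference))           # ['ATG', 'GCT', 'ATG', 'GCA']
print(codons(two_base_deletion))   # ['ATG', 'GAT', 'GGC']
print(is_frameshift("GCT", "G"))   # True: 2 bases deleted
print(is_frameshift("G", "GATC"))  # False: 3 bases inserted
```

After the 2-base deletion every codon downstream of the variant is read differently, whereas an insertion of 3 bases would add one codon and leave the downstream frame intact.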
GACTGAGGCA AGGCTATG GCTGCATG---------GATGC
GACTGGGGCA AGG--ATG GCTGCATGATC---GATGATGC
GACTGAGGCA AGG--ATG GCTGCATG---------GATGC
GCAGCTGACAA...GTTGCATG ACTTGACTGCAAATCGTCAGTC
GCAGCTGA---------GCATG ACTTGACTACGATTTGCCAGTC
GCAGCTGA---------GCATG ACTTGACTGCAAATCGTCAGTC
Fig. 1.1 Types of variations present in human genome sequences. The haploid reference genome is
indicated at the top of each variant example, while the individual’s diploid genome is shown below.
The genetic variants can be either heterozygous or homozygous
Box 1.1: The Human Genome
The human genome is the complete DNA sequence of the anatomically modern
human and was obtained by the Human Genome project (www.genome.gov/10001772)
via whole genome sequencing. This reference sequence represents
the assembly of the genomes of a few young healthy male donors. With the
exception of germ cells, i.e., female oocytes and male sperm, each human cell
contains a diploid genome formed by 2 × 3.05 giga base pairs (Gb) that is
distributed on 2 × 22 autosomal chromosomes and two X chromosomes for
females or an XY chromosome set for males. In addition, every mitochondrion
contains 16.6 kb of DNA. The haploid human genome contains some 20,000
protein-coding genes and about twice as many non-coding RNA (ncRNA)
genes. The protein-coding sequence covers less than 2% of the human genome,
i.e., 98% of the human genome is non-coding and primarily has regulatory
function.
Some 54% of the sequence of the human genome is formed by repetitive
DNA (often also referred to as “junk DNA”), which is sorted into the following
categories (by order of frequency):
• LINEs (500–8000 base pairs): 20.71%
• SINEs (100–300 base pairs): 12.79%
• Retrotransposons, such as long terminal repeats (LTRs) (200–5000 base pairs): 8.85%
• Minisatellites and microsatellites (2–100 base pairs): 4.93%
• Satellites (200–2000 base pairs): 2.54%
LINEs and SINEs are identical or nearly identical DNA sequences that
are separated by large numbers of nucleotides, i.e., the repeats are spread
throughout the whole genome. LTRs are characterized by sequences that
are found at each end of retrotransposons. DNA transposons (also known as
“jumping DNA”) are full-length autonomous elements that encode a transposase,
i.e., an enzyme that transposes DNA from one position in the genome to
another.
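As a quick arithmetic check of the repeat classes listed in this box, the following Python sketch converts each percentage into megabases of the haploid genome (3.05 Gb, as stated above) and adds up the listed classes. The dictionary keys are shortened labels chosen for this sketch, and `repeat_megabases` is a hypothetical helper name.

```python
HAPLOID_GENOME_BP = 3.05e9  # haploid genome size as given in Box 1.1

# Percentages exactly as listed in Box 1.1 (labels shortened for this sketch)
REPEAT_CLASSES = {
    "LINEs": 20.71,
    "SINEs": 12.79,
    "LTR retrotransposons": 8.85,
    "Mini- and microsatellites": 4.93,
    "Satellites": 2.54,
}


def repeat_megabases(percent: float, genome_bp: float = HAPLOID_GENOME_BP) -> float:
    """Convert a genome percentage into megabases (Mb)."""
    return percent / 100 * genome_bp / 1e6


listed_total = sum(REPEAT_CLASSES.values())
for name, percent in REPEAT_CLASSES.items():
    print(f"{name}: {percent:.2f}% is about {repeat_megabases(percent):.0f} Mb")
print(f"Listed classes together: {listed_total:.2f}%")
```

The five listed classes add up to about 49.8%, somewhat less than the "some 54%" quoted for repetitive DNA overall; the box does not itemize the remainder.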
The different types of human genetic variants are referred to as common (or polymorphisms)
when they have a minor allele frequency (MAF) of at least 1% within
the studied population, or as rare when they have a MAF of less than 1%. SNVs,
which in the case of common variants are also referred to as single nucleotide polymorphisms
(SNPs), represent the most frequent class of genetic variations among individuals.
Approximately 7 million SNVs show a MAF of more than 5% (www.ncbi.nlm.nih.gov/SNP).
The 1000 Genomes project and other large whole genome sequencing
projects indicated that in addition there is a huge number of rare and novel SNVs
(some 500 million). Nevertheless, the majority of variants of any given individual
are common in the whole population. Therefore, the average difference in nucleotide
sequence of a pair of unrelated humans is on the order of only 1 in 1000, i.e., SNVs
create a 0.1% variation of the genome. This proportion is low compared with other
species and confirms the recent origin of anatomically modern humans (Homo
sapiens) from a small founding population.
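The 1% MAF threshold and the "1 in 1000" estimate can be made concrete with a short Python sketch. The function names and the cohort numbers are hypothetical; only the 1% threshold, the 0.1% difference and the haploid genome size of 3.05 Gb (Box 1.1) come from the text.

```python
def minor_allele_frequency(allele_count: int, total_alleles: int) -> float:
    """Frequency of the less common allele at a biallelic site."""
    if total_alleles <= 0 or not 0 <= allele_count <= total_alleles:
        raise ValueError("allele_count must lie between 0 and total_alleles")
    frequency = allele_count / total_alleles
    return min(frequency, 1 - frequency)


def classify_variant(maf: float, threshold: float = 0.01) -> str:
    """Label a variant as common (MAF of at least 1%) or rare (MAF below 1%)."""
    return "common" if maf >= threshold else "rare"


# Hypothetical cohort: 2500 diploid individuals carry 5000 alleles per site
maf = minor_allele_frequency(40, 5000)
print(maf, classify_variant(maf))  # 0.008 rare

# Expected number of differing positions between two haploid genomes
HAPLOID_GENOME_BP = 3.05e9
expected_differences = HAPLOID_GENOME_BP / 1000
print(f"{expected_differences:,.0f} positions")  # 3,050,000 positions
```

A difference of 1 in 1000 therefore corresponds to roughly 3 million positions per haploid genome copy.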
The impact of SNVs on the coding sequence of the human genome is well
established. Synonymous variations do not alter the encoded protein, while non-synonymous
variations cause a change in the amino acid sequence (missense mutation)
or introduce a premature stop codon (nonsense mutation). On average, a typical
human genome contains some 150 SNVs resulting in protein truncation, about 10,000
SNVs changing amino acids and approximately 500,000 SNVs affecting transcription
factor binding sites. Interestingly, each individual is heterozygous for 50–100
genetic variants that can cause inherited disorders in homozygous offspring,
including different types of cancer (Sect. 7.3) or premature aging syndromes (Sect. 8.2).
This creates a large demand and a challenge for genetic counseling based on whole
genome sequencing. Moreover, gene-environment interactions provided by lifestyle
factors, such as the personal choice of food (Sect. 9.1) or physical activity (Sect. 9.2),
create an additional level of complexity.
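The distinction between synonymous, missense and nonsense SNVs made at the start of this paragraph can be sketched in Python using the standard genetic code. The function name `classify_coding_snv` and the example codon are assumptions for illustration; the sketch treats a single codon in isolation and ignores splicing and other context.

```python
from itertools import product

# Standard genetic code, codons enumerated in the order T, C, A, G
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_coding_snv(codon: str, position: int, alt_base: str) -> str:
    """Classify a single-base change at position 0, 1 or 2 of a codon."""
    codon, alt_base = codon.upper(), alt_base.upper()
    if codon not in CODON_TABLE or len(alt_base) != 1 or alt_base not in BASES:
        raise ValueError("expected a DNA codon and a single DNA base")
    if not 0 <= position <= 2 or codon[position] == alt_base:
        raise ValueError("position must be 0-2 and the base must change")
    mutated = codon[:position] + alt_base + codon[position + 1:]
    ref_aa, alt_aa = CODON_TABLE[codon], CODON_TABLE[mutated]
    if ref_aa == alt_aa:
        return "synonymous"
    if alt_aa == "*":
        return "nonsense"
    if ref_aa == "*":
        return "stop-loss"  # not one of the three classes discussed in the text
    return "missense"


print(classify_coding_snv("GAA", 2, "G"))  # synonymous (Glu -> Glu)
print(classify_coding_snv("GAA", 1, "T"))  # missense (Glu -> Val)
print(classify_coding_snv("GAA", 0, "T"))  # nonsense (Glu -> stop)
```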
Indels as well as CNVs in exonic sequences can result in either non-frameshift or
frameshift mutations. Moreover, CNVs in intronic sequences may lead to alternative
splicing (Sect. 2.1). More than 60,000 unique CNVs are known and some of them are
quite common in human populations. Every individual has structural variants that
cover between 9 and 25 Mb (mega base pairs) of sequence, i.e., 0.3–0.8% of the whole
genome. However, the vast majority of SNVs and CNVs are located in regulatory
and not in coding regions of genes, i.e., the phenotypic consequences of most genetic
variants, such as accelerated aging, are based on epigenetic or
gene regulatory processes rather than on a change in protein function (Chap. 2).
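The quoted range of 0.3–0.8% follows from dividing 9 and 25 Mb by the haploid genome size. The short Python sketch below reproduces this, assuming that "the whole genome" refers to the 3.05 Gb haploid sequence of Box 1.1; `genome_fraction_percent` is a hypothetical helper name.

```python
HAPLOID_GENOME_MB = 3050  # 3.05 Gb expressed in Mb (assumed reference size)


def genome_fraction_percent(covered_mb: float) -> float:
    """Share of the haploid genome covered by a given number of megabases."""
    return covered_mb / HAPLOID_GENOME_MB * 100


for covered_mb in (9, 25):
    print(f"{covered_mb} Mb is about {genome_fraction_percent(covered_mb):.1f}% of the genome")
```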
Mendelian disorders, such as cystic fibrosis or Huntington’s disease, are monogenetic,
i.e., in these cases a variant of a single gene can explain the occurrence of the rare
disease. However, non-communicable diseases like type 2 diabetes (T2D), cardiovascular
disease (CVD) or cancer have a multigenetic basis. These diseases and
many other traits, such as body height, have been investigated over the past 20 years
by genome-wide association studies (GWASs). This method employs an agnostic
approach in the search for unknown disease variants, i.e., millions of SNVs are tested
for association with a disease in large cohorts of patients versus healthy controls.
GWASs with 2000–5000 individuals confidently identified common variants with
effect sizes, referred to as odds ratios (ORs), of 1.5 or greater, i.e., an approximately
50% increased risk for the tested disease. Larger sample sizes were achieved by pooling several
GWASs through meta-analyses. For example, sample sizes of at least 60,000 individuals
provide sufficient power to identify the majority of variants with ORs of 1.1,
i.e., an approximately 10% increased risk. Disease- and trait-associated genomic loci can be found in
the database GWAS Catalog (www.ebi.ac.uk/gwas). Monogenetic forms of diseases
or traits have high ORs (Fig. 1.2, left). Moreover, there is a large number of common
variants with a small to modest OR that have a role in common non-communicable
diseases and traits (Fig. 1.2, right). For example, the trait “body height” depends
on at least 180 gene loci, i.e., it is a paradigm of a complex/polygenic trait. In
Europe, this trait has changed significantly (on average by 10 additional centimeters)
within the last few generations under the environmental trigger of improved quality
and quantity of nutrition. Moreover, individual differences in lifespan are to some
extent (3–5 years) based on genetic variations.
[Fig. 1.2 image not reproduced. The plot shows effect size (OR; categories low, modest,
intermediate and high, with tick marks at 1.1, 1.5, 3.0 and 50.0) against allele frequency
(categories very rare, rare, low and common, with tick marks at 0.001, 0.005 and 0.05).
Labeled regions: rare alleles causing Mendelian disease; few examples of high-effect
common variants influencing common disease; low-frequency variants with intermediate
effect; rare variants of small effect, very hard to identify by genetic means; common
variants implicated in common disease by GWAS.]
Fig. 1.2 Identifying genetic variants by risk allele frequency. The strength of a genetic effect is
indicated by ORs. Most emphasis and interest lies in identifying associations with characteristics
shown within the diagonal box. Whole genome sequencing of large numbers of individuals identifies
further low frequency SNVs with intermediate ORs (center)
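The odds ratios used above as effect sizes can be computed from a simple 2 × 2 table of carriers and non-carriers among patients and controls. The Python sketch below uses invented counts and the hypothetical function name `odds_ratio`. An OR compares odds rather than probabilities, so reading it as a percentage increase in risk is an approximation that works best for rare diseases.

```python
def odds_ratio(case_carriers: int, case_noncarriers: int,
               control_carriers: int, control_noncarriers: int) -> float:
    """Odds of carrying the variant among cases divided by the odds among controls."""
    cells = (case_carriers, case_noncarriers, control_carriers, control_noncarriers)
    if min(cells) <= 0:
        raise ValueError("all four cells must be positive for this simple estimate")
    return (case_carriers / case_noncarriers) / (control_carriers / control_noncarriers)


# Hypothetical GWAS counts for one SNV: 1000 patients and 1000 healthy controls
or_value = odds_ratio(case_carriers=300, case_noncarriers=700,
                      control_carriers=200, control_noncarriers=800)
print(f"OR = {or_value:.2f}")  # OR = 1.71
```

In this invented example the variant would exceed the OR of 1.5 that the text names as confidently detectable with 2000–5000 individuals; whether a real association is detected also depends on allele frequency and on correction for multiple testing, which the sketch leaves out.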
Despite some notable successes in revealing numerous novel SNVs and genomic
loci associated with complex phenotypes, on average not much more than 10%
of the heritability of most complex, polygenic traits and diseases has been
explained by common variants. This applies also to the trait aging, i.e., genetic variations
explain interindividual differences in aging only to a small extent (Sect. 1.2).
Thus, when exclusively SNV analysis is performed, the missing or unsolved heritability
does not allow assigning an individual any reliable estimate of
his/her risk for a particular disease or speed of aging. The only well-known exceptions
are age-related macular degeneration and type 1 diabetes (T1D), for which the
combinations of common and rare variants can provide a quantifiable risk profile.
For comparison, only 20% of the heritability of coronary heart disease is
explained by in total 80 genetic loci, 20% of T2D by some 100 loci, 20% of inherited
breast cancer by some 150 loci, 33% of inherited prostate cancer by some 100 loci
and 30% of Alzheimer’s disease by some 20 loci. Some of the missing heritability
may be explained by rare variants with high ORs, which are poorly captured by
standard GWASs. In addition, environmental exposures, including those experienced
as a fetus, affect the epigenome and may explain large parts of the missing
heritability (Sect. 3.1).
GWAS analysis has indicated that some 88% of trait-associated variants are
located outside of protein-coding regions of the human genome. The functional characterization
of regulatory SNVs (Fig. 1.3), such as the identification of transcription
factor binding to the variant genomic region, can suggest possible therapeutic interventions,
e.g., when the respective transcription factor is “druggable” (i.e., there is a
synthetic or natural compound that modulates its activity). Gene regulatory events
that are related to regulatory SNVs depend not only on the sequence of the respective
genomic site but also on its accessibility within chromatin. This emphasizes the
impact of epigenetics on regulatory variation (Sect. 2.4).
In contrast to the genome, which is identical in all 400 tissues and cell types
of an individual, the epigenome and consequently the expression of genes depends
on the individual tissue and the signals that it is exposed to, i.e., it represents the
dynamic state of the cell. The NGS (next-generation sequencing) method RNA-seq
(RNA sequencing) in combination with SNV information is the basis of the
approach of eQTL (expression quantitative trait locus) mapping. This method allows
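[Fig. 1.3 image not reproduced. Top panel: with the A allele (5’ GAACTGTC 3’ /
3’ CTTGACAG 5’) the transcription factor binds, KMTs and HATs add methyl (Me) and
acetyl (ac) marks, Pol II is active and the gene is expressed (trait example: tall).
Bottom panel: with the G allele (5’ GAACCGTC 3’ / 3’ CTTGGCAG 5’) there is no gene
expression (trait example: short).]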
Fig. 1.3 The basis of human trait variation. Small variations within the DNA binding site for a
transcription factor can facilitate or even enhance the association of this protein, as with the
A allele (top), or inhibit its binding, as with the G allele (bottom). The binding of the transcription factor
influences the local chromatin structure via the activation of chromatin modifying enzymes, such
as HATs (histone acetyltransferases) and/or KMTs (lysine methyltransferases), which eventually
leads to the activation of Pol II (RNA polymerase II) and the transcription of the respective gene.
This may have a positive effect on the trait of interest, such as body height. In contrast, when
the transcription factor does not bind, the respective genomic region remains inactive and the gene
is not transcribed. This may have a negative effect on the studied trait
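The two double-stranded sequences of Fig. 1.3 differ at a single base pair, and the A and G named in the caption sit on the lower (3’ to 5’) strand. The following Python sketch, with hypothetical helper names, derives the lower strands from the upper ones and locates the variant position.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def complement(strand: str) -> str:
    """Return the base-paired partner strand, written so that positions align."""
    return strand.translate(COMPLEMENT)


def differing_positions(first: str, second: str) -> list[int]:
    """Zero-based positions at which two equally long sequences differ."""
    return [i for i, (a, b) in enumerate(zip(first, second)) if a != b]


upper_top_panel = "GAACTGTC"     # 5'->3' strand, top panel of Fig. 1.3
upper_bottom_panel = "GAACCGTC"  # 5'->3' strand, bottom panel of Fig. 1.3

print(complement(upper_top_panel))     # CTTGACAG
print(complement(upper_bottom_panel))  # CTTGGCAG

for i in differing_positions(upper_top_panel, upper_bottom_panel):
    # Bases on the lower strand at the variant position: A (top) and G (bottom)
    print(i, complement(upper_top_panel)[i], complement(upper_bottom_panel)[i])
```

The single differing position is the fifth base pair: T/A in the top panel and C/G in the bottom panel.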