Introduction

Obesity is associated with premature mortality and is a serious public health threat that accounts for a large proportion of the worldwide non-communicable disease burden, including type 2 diabetes, cardiovascular disease, hypertension and certain cancers1,2. Mechanical issues resulting from substantially increased weight, such as osteoarthritis and sleep apnoea, also affect people’s quality of life3. The impact of obesity on communicable disease, in particular viral infection4, has recently been highlighted by the discovery that individuals with obesity are at increased risk of hospitalization and severe illness from COVID-19 (refs5,6,7).

On the basis of the latest data from the NCD Risk Factor Collaboration, in 2016 almost 2 billion adults (39% of the world’s adult population) were estimated to be overweight (defined by a body mass index (BMI) of ≥25 kg m−2), 671 million (12% of the world’s adult population) of whom had obesity (BMI ≥30 kg m−2), a tripling in the prevalence of obesity since 1975 (ref.8) (Fig. 1). Although the rate of increase in obesity seems to be declining in most high-income countries, it continues to rise in many low-income and middle-income countries and prevalence remains high globally8. If current trends continue, it is expected that 1 billion adults (nearly 20% of the world population) will have obesity by 2025. Particularly alarming is the global rise in obesity among children and adolescents; more than 7% had obesity in 2016 compared with less than 1% in 1975 (ref.8).
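The BMI cut-offs quoted above can be made concrete with a minimal sketch. The function names (`bmi`, `is_overweight`, `has_obesity`) are illustrative assumptions and do not come from the cited studies; only the thresholds of 25 and 30 kg m−2 are taken from the text.

```python
def bmi(weight_kg: float, height_m: float) -> float:
    """Return body mass index in kg m^-2 (weight divided by height squared)."""
    if weight_kg <= 0 or height_m <= 0:
        raise ValueError("weight and height must be positive")
    return weight_kg / height_m ** 2


def is_overweight(bmi_value: float) -> bool:
    """Overweight as defined in the text: BMI of 25 kg m^-2 or more.

    Note that this category includes individuals with obesity.
    """
    return bmi_value >= 25.0


def has_obesity(bmi_value: float) -> bool:
    """Obesity as defined in the text: BMI of 30 kg m^-2 or more."""
    return bmi_value >= 30.0


if __name__ == "__main__":
    # Hypothetical adults, all 1.7 m tall, with different body weights.
    for weight in (60.0, 80.0, 90.0):
        value = bmi(weight, 1.7)
        print(f"{weight:.0f} kg: BMI {value:.1f}, "
              f"overweight={is_overweight(value)}, obesity={has_obesity(value)}")
```

Because overweight is defined as BMI ≥25 kg m−2, the obesity group is a subset of the overweight group, which is why the two checks are kept as separate predicates rather than mutually exclusive labels.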

Fig. 1: Prevalence of obesity in males and females according to age and geographical region.

The prevalence of obesity has risen steadily over the past four decades in children, adolescents (not shown) and adults worldwide. a | Prevalence of obesity (body mass index (BMI) ≥30 kg m−2) in women and men ≥20 years of age, from 1975 to 2016. b | Prevalence of obesity (weight ≥2 s.d. above the median of the WHO growth reference) in 5-year-old girls and boys from 1975 to 2016. Geographical regions are represented by different colours. Graphs are reproduced from the NCD Risk Factor Collaboration (NCD RisC) website and are generated from data published in ref.8.

Although changes in the environment have undoubtedly driven the rapid increase in prevalence, obesity results from an interaction between environmental and innate biological factors. Crucially, there is a strong genetic component underlying the large interindividual variation in body weight that determines people’s response to this ‘obesogenic’ environment. Twin, family and adoption studies have estimated the heritability of obesity to be between 40% and 70%9,10. As a consequence, genetic approaches can be leveraged to characterize the underlying physiological and molecular mechanisms that control body weight.

Classically, we have considered obesity in two broad categories (Fig. 2): so-called monogenic obesity, which is inherited in a Mendelian pattern, is typically rare, early-onset and severe and involves either small or large chromosomal deletions or single-gene defects; and polygenic obesity (also known as common obesity), which is the result of hundreds of polymorphisms that each have a small effect. Polygenic obesity follows a pattern of heritability that is similar to other complex traits and diseases. Although often considered to be two distinct forms, gene discovery studies of monogenic and polygenic obesity have converged on what seems to be broadly similar underlying biology. Specifically, the central nervous system (CNS) and neuronal pathways that control the hedonic aspects of food intake have emerged as the major drivers of body weight for both monogenic and polygenic obesity. Furthermore, early evidence shows that the expression of mutations causing monogenic obesity may — at least in part — be influenced by the individual’s polygenic susceptibility to obesity11.

Fig. 2: Key features of monogenic and polygenic forms of obesity.

In this Review, we summarize more than 20 years of genetic studies that have characterized the molecules and mechanisms that control body weight, specifically focusing on overall obesity and adiposity, rather than fat distribution or central adiposity. Although most of the current insights into the underlying biology have been derived from monogenic forms of obesity, recent years have witnessed several successful variant-to-function translations for polygenic forms of obesity. We also explore how the ubiquity of whole-exome sequencing (WES) and genome sequencing has begun to blur the line that used to demarcate the monogenic causes of obesity from common polygenic obesity. Syndromic forms of obesity, such as Bardet–Biedl and Prader–Willi syndromes, among many others12, are not reviewed here. Although obesity is often a dominant feature of these syndromes, the underlying genetic defects are often chromosomal abnormalities and typically encompass multiple genes, making it difficult to decipher the precise mechanisms directly related to body-weight regulation. Finally, as we enter the post-genomic era, we consider the prospects of genotype-informed treatments and the possibility of leveraging genetics to predict and hence prevent obesity.

Gene discovery approaches

The approaches used to identify genes linked to obesity depend on the form of obesity and genotyping technology available at the time. Early gene discovery studies for monogenic forms of obesity had a case-focused design: patients with severe obesity, together with their affected and unaffected family members, were examined for potential gene-disrupting causal mutations via Sanger sequencing. By contrast, genetic variation associated with common forms of obesity has been identified in large-scale population studies, either using a case–control design or continuous traits such as BMI. Gene discovery for both forms of obesity was initially hypothesis driven; that is, restricted to a set of candidate genes that evidence suggests have a role in body-weight regulation. Over the past two decades, however, advances in high-throughput genome-wide genotyping and sequencing technologies, combined with a detailed knowledge of the human genetic architecture, have enabled the interrogation of genetic variants across the whole genome for their role in body-weight regulation using a hypothesis-generating approach.

Gene discovery for monogenic obesity

Many of the candidate genes and pathways linked to body-weight regulation were initially identified in mice, such as the obese (ob)13 and diabetes (db)14 mouse lines, in which severe hyperphagia and obesity spontaneously emerged. Using reverse genetics, the ob gene was shown to encode leptin, a hormone produced from fat, and it was demonstrated that leptin deficiency resulting from a mutation in the ob gene caused the severe obesity seen in the ob/ob mouse15 (Fig. 3). Shortly after the cloning of ob, the db gene was cloned and identified as encoding the leptin receptor (LEPR)16. Reverse genetics was also used to reveal that the complex obesity phenotype of Agouti ‘lethal yellow’ mice is caused by a rearrangement in the promoter sequence of the agouti gene that results in ectopic and constitutive expression of the agouti peptide17,18, which antagonizes the melanocortin 1 and 4 receptors (MC1R and MC4R)19,20. This finding linked the melanocortin pathway to body-weight regulation, thereby unveiling a whole raft of new candidate genes for obesity.

Fig. 3: Timeline of key discoveries in obesity genetics.

Genes identified for monogenic obesity in a given year are shown on the left. Discoveries made for polygenic obesity are shown on the right, including a cumulative count of newly discovered loci per year and by ancestry. Although candidate gene and genome-wide linkage studies became available in the late 1990s, findings were limited, and these study designs are not as frequently used as genome-wide association studies.

Once the genes for leptin and its receptor were identified, they became candidate genes for human obesity, and in 1997 the first humans with congenital leptin deficiency were identified21. This discovery was rapidly followed by the report of humans with mutations in the gene encoding the leptin receptor (LEPR)22, as well as in genes encoding multiple components of the melanocortin pathway, including PCSK1 (ref.23), MC4R24,25,26 and POMC27,28,29, all of which were found to result in severe early-onset obesity (Table 1).

Table 1 Overview of all genes implicated in severe and early-onset obesity

Advances in high-throughput DNA sequencing led to candidate gene screening being replaced by WES, an unbiased approach that allows all coding sequences to be screened for mutations. It rapidly became clear, however, that whereas candidate gene studies yielded few mutations, WES identified so many potential obesity-associated variants that the noise often masked the true causative mutations. With improved algorithms to predict the pathogenicity of mutations, as well as a rapidly expanding toolkit of functional assays, it has since become easier to filter the likely pathogenic mutations. Several success stories have been reported in which WES has identified novel pathways and genes linked to obesity, such as the class 3 semaphorins (SEMA3A–G), which have been shown to direct the development of certain hypothalamic neurons, including those expressing pro-opiomelanocortin (POMC)30 (see ‘Other neuronal circuits and molecules linked to severe obesity’).

Most monogenic obesity mutations have been identified in cohorts of patients with severe and early-onset (<10 years old) obesity. Additionally, as monogenic obesity often demonstrates a recessive inheritance pattern31, consanguinity in populations has further increased the chance of identifying mutations, owing to greater chances of homozygosity of deleterious mutations32. For example, studies have reported that mutations in the genes encoding leptin, LEPR and MC4R explain 30% of cases of severe obesity in children from a consanguineous Pakistani population33, and single-gene defects more broadly account for nearly 50%34.

Gene discovery for polygenic obesity

The discovery of genes that influence polygenic obesity, which is common in the general population, started off slowly with candidate gene studies and genome-wide linkage studies. The candidate gene approach was first applied in the mid-1990s and aimed to validate genes identified through human and animal models of extreme obesity for a role in common obesity (Fig. 3). Common variants in such candidate genes were tested for association with obesity risk, BMI or other body composition traits. Over the subsequent 15 years, hundreds of genes were studied as candidates, but variants in only six (ADRB3 (ref.35), BDNF36, CNR1 (ref.37), MC4R38, PCSK1 (ref.39) and PPARG40) showed reproducible association with obesity outcomes. The genome-wide linkage approach made its entrance into the field towards the end of the 1990s (Fig. 3). Genome-wide linkage studies rely on the relatedness of individuals and test whether certain chromosomal regions co-segregate with a disease or trait across generations. Even though more than 80 genome-wide linkage studies identified >300 chromosomal loci with suggestive evidence of linkage with obesity traits, few loci were replicated and none was successfully fine-mapped to pinpoint the causal gene or genes41. Ultimately, candidate gene and genome-wide linkage studies, constrained by small sample sizes, sparse coverage of genetic variation across the genome and lack of replication, only had a marginal impact on the progression of gene discovery for common obesity outcomes.

However, the pace of gene discovery for common diseases accelerated with the advent of genome-wide association studies (GWAS) (Fig. 3). The first GWAS for obesity traits were published in 2007 and identified a cluster of common variants in the first intron of the FTO locus that was convincingly associated with BMI42,43. Many more GWAS followed and, to date, nearly 60 GWAS have identified more than 1,100 independent loci associated with a range of obesity traits44 (Supplementary Tables 1,2).

As sample sizes increase with each consecutive GWAS, the statistical power to identify more loci also increases, in particular for loci that are less common and/or have smaller effects. For example, the first GWAS were relatively small (n = ~5,000) and identified only the FTO locus42,43. The BMI-increasing allele of FTO is common, particularly in populations of European ancestry (minor allele frequency (MAF) 40–45%), and has a relatively large effect on BMI (0.35 kg m−2 per allele; equivalent to 1 kg for a person who is 1.7 m tall). Ten years and numerous GWAS later, the most recent GWAS for BMI included nearly 800,000 individuals and identified more than 750 loci, with MAFs as small as 1.6% and per-allele effects as low as 0.04 kg m−2 per allele (equivalent to 120 g for a person who is 1.7 m tall)45. Combined, these genome-wide significant loci explained 6% of variation in BMI45. Large-scale international collaborations have been formed, such as the Genetic Investigation of Anthropometric Traits (GIANT) consortium, that combine summary statistics of individual GWAS to generate data sets comprising hundreds of thousands of individuals. Furthermore, many GWAS efforts have maximized sample size by focusing on BMI as the primary obesity outcome, an inexpensive and easy-to-obtain measurement that is readily available in most studies. As such, the vast majority of loci have been identified first in GWAS of BMI, but their effects typically transfer to other overall adiposity outcomes.
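The body-weight equivalents quoted for the per-allele effects follow from the definition of BMI: a change in BMI multiplied by height squared gives the corresponding change in weight. The short sketch below reproduces that arithmetic; the function name `bmi_effect_to_kg` is an illustrative assumption, and the calculation simply holds height fixed at 1.7 m as in the text.

```python
def bmi_effect_to_kg(effect_bmi: float, height_m: float = 1.7) -> float:
    """Convert a per-allele BMI effect (kg m^-2) into a body-weight difference (kg).

    Since BMI = weight / height**2, a BMI difference at fixed height
    corresponds to a weight difference of effect_bmi * height**2.
    """
    if height_m <= 0:
        raise ValueError("height must be positive")
    return effect_bmi * height_m ** 2


if __name__ == "__main__":
    # FTO locus: 0.35 kg m^-2 per allele, about 1.01 kg at 1.7 m.
    fto_kg = bmi_effect_to_kg(0.35)
    # Smallest reported effect: 0.04 kg m^-2 per allele, about 116 g at 1.7 m.
    smallest_g = bmi_effect_to_kg(0.04) * 1000
    print(f"FTO: {fto_kg:.2f} kg per allele")
    print(f"Smallest effect: {smallest_g:.0f} g per allele")
```

At 1.7 m, height squared is 2.89 m2, so the two effects come to roughly 1.01 kg and 116 g, consistent with the rounded figures of 1 kg and 120 g given above.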

Even though BMI is widely used, it is considered a crude proxy of overall adiposity because it does not distinguish between lean and fat mass46. Therefore, GWAS have been performed for more refined obesity traits, such as body fat percentage47,48, fat-free mass49, imaging-derived adipose tissue50, circulating leptin levels51 and LEPR levels52. In addition, two GWAS have focused on persistent healthy thinness, assuming that genes that determine resistance to weight gain may also inform obesity prevention and weight loss maintenance53,54. Although GWAS of more refined and alternative obesity outcomes are generally much smaller than those for BMI, the phenotypes are often a more accurate representation of body-weight regulation and, as such, the loci identified tend to more often point to relevant biological pathways that underlie obesity.

Almost all GWAS loci for obesity outcomes were first identified in adults. Most of these loci also associate with obesity and/or BMI in children and adolescents, highlighting the fact that the genetic underpinning of obesity is relatively constant across the course of life55,56,57. Similarly to gene discovery for other common diseases, the obesity genetics field has suffered from a strong bias in population representation, with the vast majority of GWAS being performed in populations that are exclusively or predominantly of European ancestry. Nevertheless, some loci have first been discovered in populations of Asian58, African59,60, Hispanic or other ancestry61, despite their much smaller sample sizes. Broadly, loci identified in one ancestry demonstrate good transferability (that is, directionally consistent associations) across other ancestries, even though effect sizes and allele frequencies may differ. The modest-to-high genetic correlations across ancestries observed for BMI (r = 0.78) are consistent with good transferability62, but also suggest that ancestry-specific loci remain to be discovered. Besides increasing the sample sizes of GWAS in populations of non-European ancestry, demographic, evolutionary and/or genomic features of specific populations (such as founder, consanguineous or isolated populations) have been leveraged for gene discovery, identifying genetic variants with large effects that are common in the discovery population, such as CREBRF, first identified in Samoan populations, and ADCY3, first identified in the Greenlandic population, but rare or nonexistent in most others63,64,65,66. CREBRF has been shown to play a role in cellular energy storage and use, and may be implicated in cellular and organismal adaptation to nutritional stress65. ADCY3 colocalizes with MC4R at the primary cilia of a subset of hypothalamic neurons that have been implicated in body-weight regulation67.

GWAS have typically focused on biallelic, common genetic variation (MAF >5%), but have also been used to screen for the role of copy number variants (CNVs) in obesity. So far, only a few CNVs have been identified that have a convincing association with BMI, such as the 1p31.1 45-kb deletion near NEGR1 (ref.68), which encodes a cell-adhesion molecule expressed in the brain69; the 16p12.3 21-kb deletion upstream of GPRC5B70, which may modulate insulin secretion71; the 10q11.22 CNV in PPYR1 (also known as NPY4R)72, which encodes a potent anti-obesity agent known to inhibit food intake73; and the 1p21.1 multi-allele CNV encompassing AMY1A74, which produces salivary α-amylase, a key enzyme in starch digestion75.

To determine the role of other types of variation in obesity, alternative genome-wide screens have been performed. For example, the impact of low-frequency and rare protein-coding variants has been tested using exome sequencing and exome array data76,77,78,79. It was speculated that low-frequency (MAF 1–5%) and rare (MAF <1%) variants would have larger effects than common variants, and thus be easier to detect. Nevertheless, even large-scale studies identified only a few robust associations for rare coding variants. For example, exome-wide screening based on array data from more than 400,000 individuals identified p.Tyr35Ter (rs13447324) in MC4R; p.Arg190Gln (rs139215588) and p.Glu288Gly (rs143430880) in GIPR, which stimulates insulin secretion and mediates fat deposition80; p.Arg95Ter (rs114285050) in GPR151, which modulates habenular function that controls addiction vulnerability81; and p.Arg769Ter (rs533623778) in PKHD1L1, which has been implicated in cancer development77,78. A recent study that leveraged WES data for more than 600,000 individuals identified 16 genes for which the burden of rare nonsynonymous variants was associated with BMI, including five brain-expressed G protein-coupled receptors (CALCR, MC4R, GIPR, GPR151 and GPR75)79.
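The frequency classes used in this section (common, MAF >5%; low-frequency, MAF 1–5%; rare, MAF <1%) can be written as a small helper. This is a sketch under stated assumptions: the function names are hypothetical, the handling of the exact boundary values (1% and 5% both assigned to the low-frequency class) is our reading of the ranges in the text, and the example frequencies are illustrative rather than taken from any specific variant.

```python
def minor_allele_frequency(allele_freq: float) -> float:
    """Return the MAF of a biallelic variant given the frequency of one allele."""
    if not 0.0 <= allele_freq <= 1.0:
        raise ValueError("allele frequency must lie between 0 and 1")
    # The minor allele is whichever of the two alleles is less frequent.
    return min(allele_freq, 1.0 - allele_freq)


def maf_class(maf: float) -> str:
    """Classify a variant by MAF using the thresholds quoted in the text."""
    if not 0.0 <= maf <= 0.5:
        raise ValueError("MAF must lie between 0 and 0.5")
    if maf > 0.05:
        return "common"
    if maf >= 0.01:
        return "low-frequency"
    return "rare"


if __name__ == "__main__":
    # Illustrative frequencies: a common allele, a low-frequency one and a rare one.
    for freq in (0.42, 0.016, 0.005):
        print(f"MAF {freq:.3f}: {maf_class(freq)}")
```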

As obesity is a complex, multifactorial condition, some GWAS have integrated demographic factors (such as sex and age82) and environmental factors (such as physical activity83, diet84 or smoking85) into their analyses. Despite sample sizes of more than 200,000 individuals, these genome-wide gene-by-environment (G×E) interaction analyses remain challenging and so far only 12 loci have been identified, the effects of which on obesity are attenuated or exacerbated by non-genetic factors. Nevertheless, the G×E interaction between the FTO locus and a healthy lifestyle has been robustly replicated. Specifically, increased physical activity or a healthy diet can attenuate the effect of the FTO locus on obesity risk by 30–40%86,87.

The increasing availability of large-scale cohorts and biobanks, such as the UK Biobank, the Million Veteran Program, All of Us, Biobank Japan and 23andMe, combined with ongoing work by the GIANT consortium, will boost sample sizes further to easily exceed 4 million participants in meta-analyses, expediting the discovery of many more obesity-associated loci. However, translation of GWAS-identified loci into new biological insights remains a major challenge.

From genes to biology

Despite the difficulties in validating causative mutations and variants, genetic studies into both rare and common obesity over the past two decades have revealed two surprisingly cogent, overarching biological messages: first, the leptin–melanocortin pathway is a key appetitive control circuit31,88 (Fig. 4); and second, genes that are either enriched or exclusively expressed within the brain and CNS have a central role in obesity89.

Fig. 4: The leptin–melanocortin pathway.

Pro-opiomelanocortin (POMC)-expressing neurons and agouti-related protein (AGRP)-expressing neurons within the arcuate nucleus of the hypothalamus (ARC) act to sense circulating leptin (LEP) levels, which reflect fat mass. These neurons signal to melanocortin 4 receptor (MC4R)-expressing neurons in the paraventricular nucleus of the hypothalamus (PVN), which controls appetite, thus linking long-term energy stores to feeding behaviour. Binding of class 3 semaphorins (SEMA3) to their receptors NRP and PLXNA influences the projection of POMC neurons to the PVN. Binding of brain-derived neurotrophic factor (BDNF) to its receptor neurotrophic receptor tyrosine kinase 2 (NTRK2) is thought to be an effector of leptin-mediated synaptic plasticity of neurons, including those in the ARC and PVN. The transcription factor SIM1 is crucial for the proper development of the PVN. +, agonist; −, antagonist; LEPR, leptin receptor; MRAP2, melanocortin receptor accessory protein 2; MSH, melanocyte-stimulating hormone; SH2B1, SH2B adaptor protein 1.

The leptin–melanocortin pathway and MC4R

Leptin is a key hormone secreted by adipocytes, which circulates at levels in proportion to fat mass90. Leptin also responds to acute changes in energy state, as its levels decrease with food deprivation and are restored during re-feeding. Administration of leptin to fasted mice abrogates many of the neuroendocrine consequences of starvation, suggesting that the normal biological role of leptin is to initiate the starvation response91. Leptin signals through the LEPR, which exists in several different isoforms. However, obesity-related effects of leptin are predominantly mediated by a long isoform that contains an intracellular domain (LEPRb), which is expressed in various regions of the CNS90.

Within the arcuate nucleus (ARC) of the hypothalamus, LEPRb is found on two populations of neurons at the heart of the melanocortin pathway, one of which expresses POMC and the other agouti-related protein (AGRP)92 (Fig. 4). POMC is post-translationally processed by prohormone convertases to produce several biologically active moieties, including β-lipotrophin and β-endorphin, and, crucially, the melanocortin peptides adrenocorticotrophin (ACTH) and α-, β- and γ-melanocyte-stimulating hormone (MSH)93. The ARC POMC neurons project to MC4R neurons within the paraventricular nucleus (PVN) where melanocortin peptides signal to decrease food intake92. By contrast, AGRP acts as an endogenous antagonist of MC4R to increase food intake92,94. MC3R is another centrally expressed receptor that binds to both melanocortin peptides and AGRP; however, as mice with targeted deletions in the gene are not obese but instead have altered fat to lean mass ratio, MC3R is less likely to be related to food intake and more likely to be involved in nutrient partitioning95,96.

We can state with confidence that the fine balance of melanocortinergic agonism and AGRP antagonism of MC4R, in response to peripheral nutritional cues such as leptin, plays a central part in influencing appetitive drive92. The genetic evidence clearly supports this contention, with mutations in most genes of the melanocortin pathway resulting in hyperphagia and severe obesity in both humans and mice31,88. In fact, the vast majority of single-gene disruptions causing severe early-onset obesity in humans fall within this pathway, including LEPR, POMC, AGRP, MC4R, PCSK1 (ref.23), SH2B1 (ref.97), PHIP98, MRAP2 (ref.99) and SIM1 (ref.100) (Fig. 4; Table 1). Mutations in MC4R, in particular, are the most common single-gene defect leading to hyperphagia and obesity. Pathogenic mutations in MC4R are found in up to 5% of cases of severe childhood obesity101 and up to 0.3% of the general population101,102. Of note, the degree of receptor dysfunction, as measured by in vitro assays, can predict the amount of food eaten at a test meal by an individual harbouring that particular mutation101. Thus MC4R does not act in a binary on/off manner, but as a rheostat; put simply, the melanocortin pathway is a ‘tunable’ system. In addition to regulating food intake, it also regulates food preference, with individuals who carry mutations in MC4R showing a preference for food with higher fat content103.

The importance of the melanocortin pathway in regulating feeding behaviour is highlighted by the identification of naturally occurring mutations in pathway genes in a wide range of different species where the appropriate selection pressure has been present (Table 1). For example, studies have found that 20–25% of Labrador retrievers, which are known to be more food-motivated than other dog breeds, carry a 14-bp deletion in POMC that disrupts the β-MSH and β-endorphin coding sequences and is associated with greater food motivation and increased body weight104. Also, certain breeds of pig have been shown to carry MC4R missense mutations that are associated with fatness, growth and food intake traits105. MC4R mutations even contribute to the adaptation and survival of blind Mexican cavefish to the nutrient-poor conditions of their ecosystem106.

Other neuronal circuits and molecules linked to severe obesity

It is now clear that in addition to engaging classical neuropeptide–receptor systems within the brain, leptin also rapidly modifies synaptic connections between neurons107, and that this structural plasticity is crucial to its downstream functions. One of the ways in which this plasticity is thought to be achieved is via brain-derived neurotrophic factor (BDNF) signalling to its receptor TrkB. BDNF is widely expressed in the CNS where it plays an important part in neuronal development108,109. In the hippocampus, BDNF contributes to synaptic plasticity and long-term potentiation associated with memory and learning110. However, evidence has emerged that implicates BDNF and TrkB in the regulation of mammalian eating behaviour and energy balance111. BDNF is downregulated by nutritional deprivation and upregulated by leptin within the ventromedial nucleus (VMN) of the hypothalamus112, although this regulation is probably indirect, as very few VMN BDNF neurons express the LEPR113 (Fig. 4) and some evidence indicates that it acts at least in part downstream of melanocortin signalling112. Crucially, genetic disruption of BDNF114,115 and TrkB112,116 in both humans and mice results in hyperphagia and severe obesity.

Another group of neuronal proteins important in the development of neuronal circuitry and linked to energy balance are the class 3 semaphorins (SEMA3A–G). A study in humans found that 40 rare loss-of-function variants in SEMA3A–G and their receptors (PLXNA1–4, NRP1 and NRP2) were significantly enriched in 982 individuals with severe obesity compared with 4,449 controls30. Disruption of several of these genes in zebrafish caused increased somatic growth and/or adiposity, and experiments with mouse hypothalamic explants suggest that SEMA3 signalling via NRP2 receptors drives the development of POMC projections from the ARC to the PVN30. However, given that these results are from a single study, more data are required to confirm the exact role of class 3 semaphorins in energy homeostasis.

Insights from genetic loci linked to common obesity

Unlike candidate gene studies, GWAS make no a priori assumptions about the underlying biology that links genetic variants to a disease of interest. While this agnostic approach allows for new biological insights, the vast majority of GWAS-identified variants map to the non-coding parts of genes or to regions between genes. As such, they do not directly disrupt the protein-coding regions, but instead overlap with regulatory elements that influence expression of genes in close proximity or even over long distances.

However, even if the causative genes are unknown, pathway, tissue and functional enrichment analyses based on the genes located in the GWAS loci can provide insights into potential mechanisms. Since the very first GWAS for BMI68,117, such analyses have pointed to the CNS being a key player in body-weight regulation, consistent with insights from human and animal models of extreme obesity. Recent analyses that include the latest BMI-associated loci, combined with updated multi-omics databases and advanced computational tools, have further refined these observations. In addition to the hypothalamus and pituitary gland (which are both known appetite regulation sites), other brain areas have been highlighted, including the hippocampus and the limbic system (which are involved in learning, cognition and emotion) and the insula and the substantia nigra (which are related to addiction and reward)58,89,118,119. The enrichment of immune-related cells (such as lymphocytes and B cells) and adipose tissue was found to be weaker58.

Although enrichment analyses provide preliminary insights into the broad biology represented by genes in the GWAS loci, determining which genes, variants and/or underlying mechanisms are causal has proved an arduous task. For example, the FTO locus, which was identified more than a decade ago and harbours six genes, is the most extensively studied GWAS-identified obesity locus (Fig. 5). Despite its highly significant and widely replicated association with obesity120, the causal variants and/or genes in the FTO locus have not yet been pinpointed with convincing evidence, and the mechanisms by which the locus affects body weight have not been fully elucidated. Early functional follow-up analyses suggested that FTO itself might be responsible, as Fto deficiency in mice results in a lean phenotype, whereas Fto overexpression is associated with increased body weight121,122. Studies in mice have suggested that FTO plays a role in cellular nutrient sensing123,124. Other studies found evidence that FTO influences brain regions that affect appetite, reward processing and incentive motivation by regulating ghrelin levels in humans125 or by controlling dopaminergic signalling in mice126,127. In addition, variants in the FTO locus were shown to alter a regulatory element that controls the transcription of Rpgrip1l in mice, a ciliary gene located immediately upstream of Fto128,129,130. Mice with reduced Rpgrip1l activity exhibit hyperphagic obesity, possibly mediated through diminished leptin signalling128,129,130. In recent years, studies in human and animal models have shown that variants in the FTO locus directly interact with the promoter of Irx3, a gene located 0.5 Mb downstream of FTO. Irx3-deficient mice were found to exhibit weight loss and increased metabolic rate with browning of white adipose tissue, without changes in physical activity or appetite131,132. 
Further in-depth functional characterization showed that rs1421085 in the FTO locus disrupts a conserved binding motif for the transcriptional repressor ARID5B, which leads to a doubling of IRX3 and IRX5 expression during early adipocyte differentiation132. The authors argue that increased expression of these genes results in a developmental shift from energy-dissipating beige adipocytes to energy-storing white adipocytes, a fivefold reduction in mitochondrial thermogenesis and increased lipid storage132. However, given that multiple studies have shown that the FTO locus is robustly associated with food intake, with no evidence to date linking it to changes in energy expenditure, the relevance of this observation to the actual observed human phenotype still needs to be explored133. A recent study reports that the FTO locus affects gene expression in multiple tissues, including adipose tissue and brain, and, more broadly, that the genetic architecture of disease-associated loci may involve extensive pleiotropy and allelic heterogeneity across tissues134.

Fig. 5: Schematic representation of the FTO locus and its neighbouring genes on human chromosome 16q12.2.

FTO contains nine exons (depicted by blue rectangles) and the body mass index (BMI)-associated SNP identified in genome-wide association studies (depicted by a red ×) maps to intron 1. IRX3 and RPGRIP1L have both been proposed to be the causal genes for obesity within the locus and to act on body weight through distinct mechanisms. HFD, high-fat diet.

Besides the FTO locus, functional follow-up analyses have been performed for only a few obesity-associated GWAS loci. For example, early studies identified a cluster of variants just downstream of TMEM18 (refs68,117). TMEM18 encodes a poorly characterized transmembrane protein that is highly conserved across species and widely expressed across tissues, including in several regions of the brain135,136. Tmem18 deficiency in mice results in a higher body weight owing to increased food intake, whereas Tmem18 overexpression reduces food intake and limits weight gain136. A knockdown experiment in Drosophila melanogaster suggests that TMEM18 affects carbohydrate and lipid levels by disrupting insulin and glucagon signalling137.

Two other GWAS loci for which functional analyses have been performed are located just upstream of CADM1 (ref.82) and in CADM2 (ref.70), genes that encode cell-adhesion proteins of the immunoglobulin superfamily and mediate synaptic assembly in the CNS138. The BMI-increasing alleles at each locus are associated with increased expression of CADM1 and CADM2 in the hypothalamus139,140. Deficiency of either Cadm1 or Cadm2 in mice results in a lower body weight and increased insulin sensitivity, glucose tolerance and energy expenditure without any change in food intake139,140. Conversely, increased neuronal expression of either Cadm1 or Cadm2 is associated with elevated body weight139,140. Furthermore, CADM1 is expressed in POMC neurons and Cadm1 deficiency leads to an increase in the number of excitatory synapses, suggestive of an increased synaptic plasticity140. Cadm2-deficient mice exhibit increased locomotor activity and higher core body temperature139.

Another GWAS locus, just upstream of NEGR1, harbours two deletions associated with increased obesity risk68,117,141. These deletions do not overlap with the coding sequence of NEGR1, but encompass a conserved transcription factor-binding site for NKX6.1, a potent transcriptional repressor68,141. Loss of binding of NKX6.1 leads to higher NEGR1 expression141, which is consistent with the observation that BMI-increasing alleles (that is, deletions) at this locus are associated with higher NEGR1 expression in the brain. Similar to CADM1 and CADM2, NEGR1 is a cell-adhesion molecule of the immunoglobulin superfamily that is expressed in several regions of the brain and has been shown to have a role in brain connectivity69,142, a process believed to be important in obesity143. NEGR1 deficiency in mice was shown to result in lower body weight, mainly due to reduced lean mass, mediated by lower food intake144. However, two other functional studies, one in mice and one in rats, found that knockdown of Negr1 expression resulted in the opposite phenotype — increased body weight and food intake145,146. While NEGR1 deficiency in mice was found to impair core behaviours, so far, findings and proposed mechanisms are not fully aligned69,147,148,149.

Taken together, functional follow-up analyses for these loci are slowly expanding our understanding of the pathophysiology that drives weight gain. However, many more obesity-associated loci are waiting to be translated into new biological insights. A major hurdle in translating GWAS loci into plausible candidate genes and appropriate paradigms for functional research is the annotation of the associated variants in a locus. Defining the regulatory function of the non-coding variants, identifying their putative effector transcripts and determining their tissues of action remains an ongoing challenge. The advent of high-throughput genome-scale technologies for mapping regulatory elements, combined with comprehensive multi-omics databases, advanced computational tools and the latest genetic engineering and molecular phenotyping approaches, is poised to speed up the translation of GWAS loci into meaningful biology150.

Converging results from monogenic and polygenic forms of obesity

Gene discovery is often dichotomized by allele frequency and disease prevalence; that is, mutations are sought for monogenic forms of obesity and common variants for polygenic obesity (Fig. 2). However, it is increasingly recognized that monogenic and polygenic forms of obesity are not discrete entities. Instead, they lie on a spectrum and share — at least in part — the same biology. As GWAS have continued to discover more obesity-associated loci, an increasing number of these loci harbour genes that were first identified for extreme and early-onset obesity in humans or animal models, including MC4R151,152, BDNF117, SH2B1 (refs68,117), POMC70, LEP51,153, LEPR52,154, NPY155, SIM1 (ref.155), NTRK2 (ref.58), PCSK1 (ref.154) and KSR2 (ref.77). In fact, most of these genes encode components of the leptin–melanocortin and BDNF–TrkB signalling pathways (Table 1). Thus, whereas genetic disruption of components of these pathways results in severe obesity, genetic variants in or near these same genes that have more subtle effects on their expression will influence where an individual might sit in the normal distribution of BMI.

Although most genes were first identified for extreme forms of obesity, a locus harbouring ADCY3 was first identified in GWAS for common obesity77, and ADCY3 was subsequently confirmed as having a role in extreme obesity63,64. ADCY3 encodes an adenylate cyclase that catalyses the synthesis of cAMP, an important second messenger in signalling pathways. There is some evidence that the ADCY3 protein colocalizes with MC4R at the primary cilia of PVN neurons67 and that cilia are required specifically on MC4R-expressing neurons for the control of energy homeostasis156. In mice, disruption of Adcy3 or Mc4r in the cilia of these neurons impairs melanocortin signalling, resulting in hyperphagia and obesity67.

As more GWAS loci are reported, we expect that findings across different lines of obesity research will continue to converge, providing accumulating evidence for new biology.

From genes to clinical care

Genetic insights from gene discovery efforts are increasingly being used in the context of precision medicine in ways that directly affect health. Knowing a patient’s genotype may enable a more precise diagnosis of the type of obesity, which in turn allows the prescription of personalized treatment or prevention strategies. Furthermore, knowing an individual’s genetic susceptibility to obesity early in life may help to more accurately predict those most at risk of gaining weight in the future.

Use of genotype information in treatment of obesity

When a disease is caused by a single mutation and the environmental contribution is limited, as is the case for some forms of extreme and early-onset obesity, a genetic test can be instrumental in correctly diagnosing patients. Although no standard genetic testing panel is currently available for extreme and early-onset obesity, some clinics, research centres and pharmaceutical companies sequence well-known candidate genes to identify the functional mutation that may be the cause of a patient’s excess body weight. Such a genetic diagnosis can lessen the feelings of guilt and blame for the patient, and alleviate social stigma and discrimination. Importantly, a genetic diagnosis can inform disease prognosis and, in some cases, it will determine treatment. To date, there are two treatments for obesity that are tailored to patient genotype.

The prototype of genotype-informed treatment for obesity is the administration of recombinant human leptin in patients who are leptin-deficient owing to mutations in the LEP gene157,158. Although congenital leptin deficiency is exceptionally rare (only 63 cases have been reported to date28), leptin replacement therapy has been remarkably beneficial for these patients by substantially reducing food intake, body weight and fat mass, and normalizing endocrine function157,158. It has literally transformed their lives.

The second genotype-informed treatment for obesity is setmelanotide, a selective MC4R agonist that was recently approved by the FDA for rare monogenic obesity conditions including LEPR, PCSK1 and POMC deficiency159. Setmelanotide acts as a substitute for the absent MSH in patients with POMC deficiency owing to mutations in POMC or PCSK1, and in patients with LEPR deficiency owing to mutations in LEPR, which is essential for POMC function160,161,162. Daily subcutaneous injection of setmelanotide results in substantial weight loss and in reduction of hunger160,161,162. After a 1-year treatment with setmelanotide in phase III trials, patients with POMC deficiency lost on average 25.6% of their initial weight, with 80% of patients achieving at least a 10% weight loss162. The adverse effects of setmelanotide treatment are minor, and include hyperpigmentation, nausea and/or vomiting, penile erection and injection site reactions. Weight loss in patients with LEPR deficiency was less pronounced; on average, they lost 12.5% of their initial weight, with only 45% of patients achieving at least a 10% weight loss162. The difference in weight loss between the two patient groups may be because POMC deficiency directly affects the production of MC4R ligands (α-MSH and β-MSH), whereas LEPR deficiency affects signalling upstream of POMC162. As such, setmelanotide may be able to completely restore MC4R signalling in POMC deficiency, but only partially in LEPR deficiency. Even though the average weight loss in POMC-deficient patients was twice that in LEPR-deficient patients, the reduction in hunger was substantially larger in LEPR-deficient patients (−43.7%) than in POMC-deficient patients (−27.1%)162. The reasons for the discrepancy between weight loss and reduction in hunger remain to be studied in greater depth. 
It has been estimated that in the USA, >12,800 individuals carry mutations in the melanocortin pathway for whom setmelanotide may be more effective for weight loss than any other treatment163. Although 12,800 carriers represent only a small fraction (0.004%) of the adult population in the USA, and not all of these mutation carriers are overweight or obese, for the patients for whom setmelanotide is effective, it may end a lifelong battle to lose weight163. In patients without such genetic defects, neither setmelanotide nor leptin administration has, to date, demonstrated a substantial effect on weight loss164,165.

These two genotype-informed treatments show how insight into the underlying biological mechanisms can guide the development of molecules and medications that restore impaired pathways, at least in monogenic forms of obesity caused by deficiency of one protein. Nevertheless, there remain substantial obstacles in the transition from conventional to precision medicine for monogenic obesity, which would require the adoption of systematic WES for individuals suspected to be carriers of deleterious mutations, and eventually even standardized screening at birth. We are clearly a long way from such a scenario at present.

Use of genotype information in prediction of obesity

As more variants are being discovered for common obesity, there is a growing expectation that genetic information will soon be used to identify individuals at risk of obesity. Knowing a person’s genetic susceptibility would allow for a more accurate prediction of who is at risk of gaining weight and give an opportunity to intervene earlier to prevent obesity more effectively. Genetic susceptibility to complex disease, including obesity, is assessed using a polygenic score (PGS). PGSs to assess obesity susceptibility are based on GWAS for BMI (PGSBMI), the latest of which includes data on more than 2 million variants and explains 8.4% of the variation in BMI166. The average BMI of individuals with a high PGSBMI (top decile) is 2.9 kg m−2 (equivalent to 8 kg in body weight) higher, and their odds of severe obesity (BMI ≥40 kg m−2) are 4.2-fold higher, than those of individuals with a lower PGSBMI (lowest nine deciles)166.
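The text does not describe how a PGS is calculated. As a minimal sketch, assuming the commonly used additive construction in which a score is the sum of effect-allele counts weighted by per-variant GWAS effect sizes, the calculation might look as follows. The function name, the weights and the genotypes are hypothetical and chosen only for illustration; they are not taken from ref.166.

```python
# Hypothetical illustration of an additive polygenic score.
# All effect sizes and genotypes below are invented for the example.

def polygenic_score(dosages, weights):
    """Return the sum over variants of (effect-allele count x effect size)."""
    if len(dosages) != len(weights):
        raise ValueError("dosages and weights must have the same length")
    return sum(d * w for d, w in zip(dosages, weights))

# Assumed per-variant effect sizes (for example, BMI units per effect allele).
weights = [0.35, 0.10, -0.05, 0.20]

# Effect-allele counts (0, 1 or 2) at each variant for three made-up individuals.
cohort = {
    "person_a": [2, 1, 0, 2],
    "person_b": [0, 0, 2, 0],
    "person_c": [1, 1, 1, 1],
}

scores = {name: polygenic_score(g, weights) for name, g in cohort.items()}

# Rank individuals from highest to lowest score.
for name, score in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {score:.2f}")
```

A real PGSBMI sums over millions of variants rather than four, and the weights are typically adjusted for linkage disequilibrium before use, which this sketch does not attempt.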

Despite these strong associations with BMI and obesity, the predictive performance of the PGSBMI is weak, which is unsurprising given its limited explained variance. For example, using the same PGSBMI and data from the UK Biobank, we estimate that the area under the receiver operating characteristic curve (AUCROC) for predicting obesity is only 0.64. This means that the probability that a randomly chosen individual with obesity has a higher PGSBMI than a randomly chosen individual without obesity is 0.64. However, for a PGS to have clinical utility, the AUCROC needs to be much higher (>0.80). In addition, we calculated the extent to which a PGSBMI ≥90th percentile correctly classifies individuals with obesity (Fig. 6). We found that such a predictive test (PGSBMI ≥90th percentile) has a positive predictive value of 0.43, meaning that of those who were predicted to develop obesity, only 43% actually developed obesity. Its sensitivity is 0.19, which means that of the individuals who developed obesity, only 19% had been correctly classified by the PGSBMI. Given that the current treatment options for obesity are low risk, or even generally beneficial, the high proportion of false positives is less concerning than the low sensitivity, as some at-risk individuals may miss the opportunity for early prevention.
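The probabilistic reading of the AUCROC given above can be made concrete with a small sketch. It estimates the AUC directly as the fraction of case and non-case pairs in which the case has the higher score, counting ties as half. The function name and the toy scores are hypothetical; this is not the UK Biobank analysis described in the text.

```python
# Hypothetical toy data: the scores below are invented for illustration.

def auc_by_pairs(case_scores, control_scores):
    """Estimate AUC as P(score of a case > score of a non-case).

    Every case is compared with every non-case; a tie counts as half a win.
    """
    wins = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))

# Made-up polygenic scores for individuals with and without obesity.
with_obesity = [3, 5, 6, 8]
without_obesity = [1, 2, 4, 5, 7]

auc = auc_by_pairs(with_obesity, without_obesity)
print(f"AUC = {auc:.3f}")
```

An AUC of 0.5 corresponds to a score that ranks cases above non-cases no better than chance, which is why a value of 0.64 indicates only modest discrimination.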

Fig. 6: Predicting obesity using a polygenic score.

The outcome is illustrated for a polygenic score (PGS) that assumes that individuals with a score in the highest decile (≥90th percentile (pct)) will develop obesity, has a positive predictive value of 0.4 and a sensitivity of 0.19. Of ten individuals with a high score classified by the PGS as ‘with obesity’, four will be classified correctly but the other six will be misclassified and will not develop obesity — a positive predictive value of 0.4. Likewise, 17 of the 90 individuals with a score <90th pct who are predicted to not develop obesity, will develop obesity. Thus, only four of the 21 individuals who developed obesity were correctly classified by the PGS — a sensitivity of 0.19. Misclassified individuals are indicated by the red boxes, individuals correctly classified as ‘with obesity’ are indicated by a blue box. Adapted with permission from ref.170, Elsevier.
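The arithmetic behind this illustration can be checked with a short sketch that rebuilds the 2×2 classification table from the counts given in the caption (100 individuals, 10 flagged by the score, 4 of those flagged correctly, 17 missed). The variable names are our own, and the specificity is derived from the same counts rather than reported in the text.

```python
# Counts taken from the Fig. 6 illustration of 100 individuals.
total = 100
flagged = 10     # score >= 90th percentile, classified as 'with obesity'
true_pos = 4     # flagged and developed obesity
false_neg = 17   # not flagged but developed obesity

# Remaining cells of the 2x2 table follow from the counts above.
false_pos = flagged - true_pos
true_neg = total - flagged - false_neg

# Positive predictive value: share of flagged individuals who develop obesity.
ppv = true_pos / (true_pos + false_pos)
# Sensitivity: share of individuals who develop obesity that were flagged.
sensitivity = true_pos / (true_pos + false_neg)
# Specificity: derived here from the same counts; not reported in the text.
specificity = true_neg / (true_neg + false_pos)

print(f"PPV = {ppv:.2f}")
print(f"sensitivity = {sensitivity:.2f}")
print(f"specificity = {specificity:.2f}")
```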

Thus, the current PGSBMI has a high rate of misclassification and does not reliably predict who is at risk of developing obesity and who is not. The predictive ability of PGSs is expected to improve as GWAS increase in sample size and algorithms to calculate the scores become more refined. Nevertheless, given the importance of socio-demographic, lifestyle and clinical risk factors in the aetiology of obesity, it is unlikely that a PGSBMI will ever be able to accurately predict obesity on its own. Instead, effective prediction models will have to combine genetic and non-genetic factors, including a broad spectrum of demographic, environmental, clinical and possibly molecular markers.

Conclusions and future perspectives

What initially began as two apparently distinct approaches, one studying rare Mendelian causes of extreme obesity and the other exploring complex polygenic influences on the population distribution of body weight, has eventually converged on the central role of the brain in regulating body weight. In particular, both approaches have highlighted the roles of the leptin–melanocortin pathway and TrkB–BDNF signalling. Perhaps it seems obvious now, but it was by no means certain that, just because genetic disruption of a pathway resulted in a severe phenotype, polymorphisms within that same pathway would produce a more subtle and nuanced result.

The GWAS approach is hypothesis-free, with the promise to reveal new genes that point to new biology and pathways. However, for the vast majority of the >1,000 GWAS-identified loci, we do not know which genes are causal, what cells, tissues and organs they act in to affect body weight, and we do not understand the underlying mechanisms. The translation from variant to function is a well-known challenge167, but with increasing availability of new omics data, high-throughput technologies and advanced analytical approaches, there is an unprecedented opportunity to speed up the translation of hundreds of GWAS loci.

Sample size remains a major driver for gene discovery. In an ongoing collaboration that combines data from more than 3 million individuals of diverse ancestry from the GIANT consortium, the UK Biobank and 23andMe, the number of BMI-associated GWAS loci is set to double. Also, a recent WES effort of more than 640,000 individuals has demonstrated that rare mutations are discoverable when sample sizes are sufficiently large79. However, alternative study designs, a focus on more refined phenotypes or a focus on population subgroups (that is, more homogeneous groups of individuals with similar outcomes) could further add to gene discovery.

Translation of only a few dozen of the GWAS-identified loci could tremendously improve our insights into the biology of obesity and possibly reveal new therapeutic targets. It would also take us a little closer to the ‘holy grail’ — the ability to move away from a failed ‘one-size-fits-all’ strategy, and towards true precision medicine for obesity, metabolic disease and other diet-related illnesses.