University of Zurich - UZH · 2010. 11. 29. · Genome-wide association study identifies five...

48
University of Zurich Zurich Open Repository and Archive Winterthurerstr. 190 CH-8057 Zurich http://www.zora.uzh.ch Year: 2009 Genome-wide association study identifies five susceptibility loci for glioma Shete, S; Hosking, F J; Robertson, L B; Dobbins, S E; Sanson, M; Malmer, B; Simon, M; Marie, Y; Boisselier, B; Delattre, J Y; Hoang-Xuan, K; El Hallani, S; Idbaih, A; Zelenika, D; Andersson, U; Henriksson, R; Bergenheim, A T; Feychting, M; Lönn, S; Ahlbom, A; Schramm, J; Linnebank, M; Hemminki, K; Kumar, R; Hepworth, S J; Price, A; Armstrong, G; Liu, Y; Gu, X; Yu, R; Lau, C; Schoemaker, M; Muir, K; Swerdlow, A; Lathrop, M; Bondy, M; Houlston, R S Shete, S; Hosking, F J; Robertson, L B; Dobbins, S E; Sanson, M; Malmer, B; Simon, M; Marie, Y; Boisselier, B; Delattre, J Y; Hoang-Xuan, K; El Hallani, S; Idbaih, A; Zelenika, D; Andersson, U; Henriksson, R; Bergenheim, A T; Feychting, M; Lönn, S; Ahlbom, A; Schramm, J; Linnebank, M; Hemminki, K; Kumar, R; Hepworth, S J; Price, A; Armstrong, G; Liu, Y; Gu, X; Yu, R; Lau, C; Schoemaker, M; Muir, K; Swerdlow, A; Lathrop, M; Bondy, M; Houlston, R S (2009). Genome-wide association study identifies five susceptibility loci for glioma. Nature Genetics, 41(8):899-904. Postprint available at: http://www.zora.uzh.ch Posted at the Zurich Open Repository and Archive, University of Zurich. http://www.zora.uzh.ch Originally published at: Nature Genetics 2009, 41(8):899-904.

Transcript of University of Zurich - UZH · 2010. 11. 29. · Genome-wide association study identifies five...

Page 1: University of Zurich - UZH · 2010. 11. 29. · Genome-wide association study identifies five susceptibility loci for glioma Abstract To identify risk variants for glioma, we conducted

University of ZurichZurich Open Repository and Archive

Winterthurerstr. 190

CH-8057 Zurich

http://www.zora.uzh.ch

Year: 2009

Genome-wide association study identifies five susceptibility locifor glioma

Shete, S; Hosking, F J; Robertson, L B; Dobbins, S E; Sanson, M; Malmer, B; Simon,M; Marie, Y; Boisselier, B; Delattre, J Y; Hoang-Xuan, K; El Hallani, S; Idbaih, A;

Zelenika, D; Andersson, U; Henriksson, R; Bergenheim, A T; Feychting, M; Lönn, S;Ahlbom, A; Schramm, J; Linnebank, M; Hemminki, K; Kumar, R; Hepworth, S J;Price, A; Armstrong, G; Liu, Y; Gu, X; Yu, R; Lau, C; Schoemaker, M; Muir, K;

Swerdlow, A; Lathrop, M; Bondy, M; Houlston, R S

Shete, S; Hosking, F J; Robertson, L B; Dobbins, S E; Sanson, M; Malmer, B; Simon, M; Marie, Y; Boisselier, B;Delattre, J Y; Hoang-Xuan, K; El Hallani, S; Idbaih, A; Zelenika, D; Andersson, U; Henriksson, R; Bergenheim, AT; Feychting, M; Lönn, S; Ahlbom, A; Schramm, J; Linnebank, M; Hemminki, K; Kumar, R; Hepworth, S J; Price,A; Armstrong, G; Liu, Y; Gu, X; Yu, R; Lau, C; Schoemaker, M; Muir, K; Swerdlow, A; Lathrop, M; Bondy, M;Houlston, R S (2009). Genome-wide association study identifies five susceptibility loci for glioma. Nature Genetics,41(8):899-904.Postprint available at:http://www.zora.uzh.ch

Posted at the Zurich Open Repository and Archive, University of Zurich.http://www.zora.uzh.ch

Originally published at:Nature Genetics 2009, 41(8):899-904.

Shete, S; Hosking, F J; Robertson, L B; Dobbins, S E; Sanson, M; Malmer, B; Simon, M; Marie, Y; Boisselier, B;Delattre, J Y; Hoang-Xuan, K; El Hallani, S; Idbaih, A; Zelenika, D; Andersson, U; Henriksson, R; Bergenheim, AT; Feychting, M; Lönn, S; Ahlbom, A; Schramm, J; Linnebank, M; Hemminki, K; Kumar, R; Hepworth, S J; Price,A; Armstrong, G; Liu, Y; Gu, X; Yu, R; Lau, C; Schoemaker, M; Muir, K; Swerdlow, A; Lathrop, M; Bondy, M;Houlston, R S (2009). Genome-wide association study identifies five susceptibility loci for glioma. Nature Genetics,41(8):899-904.Postprint available at:http://www.zora.uzh.ch

Posted at the Zurich Open Repository and Archive, University of Zurich.http://www.zora.uzh.ch

Originally published at:Nature Genetics 2009, 41(8):899-904.

Page 2: University of Zurich - UZH · 2010. 11. 29. · Genome-wide association study identifies five susceptibility loci for glioma Abstract To identify risk variants for glioma, we conducted

Genome-wide association study identifies five susceptibility locifor glioma

Abstract

To identify risk variants for glioma, we conducted a meta-analysis of two genome-wide associationstudies by genotyping 550K tagging SNPs in a total of 1,878 cases and 3,670 controls, with validation inthree additional independent series totaling 2,545 cases and 2,953 controls. We identified five risk locifor glioma at 5p15.33 (rs2736100, TERT; P = 1.50 x 10(-17)), 8q24.21 (rs4295627, CCDC26; P = 2.34x 10(-18)), 9p21.3 (rs4977756, CDKN2A-CDKN2B; P = 7.24 x 10(-15)), 20q13.33 (rs6010620,RTEL1; P = 2.52 x 10(-12)) and 11q23.3 (rs498872, PHLDB1; P = 1.07 x 10(-8)). These data show thatcommon low-penetrance susceptibility alleles contribute to the risk of developing glioma and provideinsight into disease causation of this primary brain tumor.

Page 3: University of Zurich - UZH · 2010. 11. 29. · Genome-wide association study identifies five susceptibility loci for glioma Abstract To identify risk variants for glioma, we conducted

Genome-wide association study identifies six susceptibility loci for glioma

Sanjay Shete1*, Fay J Hosking2*, Lindsay B Robertson2, Sara E Dobbins2, Marc Sanson3,

Beatrice Malmer4, Matthias Simon5, Yannick Marie3, Blandine Boisselier3, Jean-Yves Delattre3,

Khe Hoang-Xuan3, Soufiane El Hallani3, Ahmed Idbaih3, Diana Zelenika6, Ulrika Andersson7,

Roger Henriksson7, A Tommy Bergenheim8, Maria Feychting8, Stefan Lönn9, Anders Ahlbom7,

Johannes Schramm9, Michael Linnebank10, Kari Hemminki11, Rajiv Kumar11, Sarah J

Hepworth12, Amy Price2, Georgina Armstrong1, Yanhong Liu1, Xiangjun Gu1, Robert Yu1, Minouk

Schoemaker , 14 Kenneth Muir13, Anthony Swerdlow14, Mark Lathrop4,15, Melissa Bondy1¥, Richard

S Houlston2¥

1. Department of Epidemiology, The University of Texas M.D. Anderson Cancer Center, Houston,

Texas 77030, USA.

2. Section of Cancer Genetics, Institute of Cancer Research, Sutton, Surrey. UK.

3. Service de Neurologie Mazarin et INSERM U 711, Hôpital de la Salpêtrière 47, Bd de l'Hôpital

75 013 Paris. France.

4. Department of Radiation Sciences, Oncology, Umeå University, Sweden.

5. Neurochirurgische Universitätsklinik, Sigmund-Freud-Str. 25, 53105 Bonn, Germany.

6. Centre National de Génotypage, IG/CEA, 2 rue Gaston Crémieux CP 5721, 91057 Evry

Cedex, France.

7. Department of Clinical Neuroscience, Umeå University, Sweden.

8. Institute of Environmental Medicine, Karolinska Institutet, Sweden.

9. Department of Medical Epidemiology and Biostatistics Karolinska Institutet, Sweden

10. Department of Neurology, University Hospital Zurich, Frauenklinikstrasse 26, CH-8091

Zurich, Switzerland.

11. Division of Molecular Genetic Epidemiology, German Cancer Research Center (DKFZ), Im

Neuenheimer Feld 580, D-69120 Heidelberg, Germany.

12. Centre for Epidemiology and Biostatistics, Faculty of Medicine and Health, University of

Leeds, UK.

13. Division of Epidemiology and Public Health, University of Nottingham Medical School,

Queen's Medical Centre, Nottingham, UK.

14. Section of Epidemiology, Institute of Cancer Research, Sutton, UK.

15. Foundation Jean Dausett-CEPH, 27 Rue Julliette Dodu, 75010 Paris, France

*equal status at this position

¥Corresponding authors

1

Page 4: University of Zurich - UZH · 2010. 11. 29. · Genome-wide association study identifies five susceptibility loci for glioma Abstract To identify risk variants for glioma, we conducted

Richard S Houlston

e-mail: [email protected]

Tel: +44 (0) 208 722 4175

Fax: +44 (0) 208 722 4365

Melissa Bondy

e-mail: [email protected]

Tel: 001 713-794-5264

Fax: 001 713-792-9568

2

Page 5: University of Zurich - UZH · 2010. 11. 29. · Genome-wide association study identifies five susceptibility loci for glioma Abstract To identify risk variants for glioma, we conducted

To identify risk variants for glioma we conducted a meta-analysis of 2 genome-wide

association studies, based on genotyping 550K tagging SNPs in a total of 1,878 cases

and 3,670 controls, with validation in 3 additional independent series totaling 2,545 cases

and 2,953 controls. We identified 6 risk loci for glioma at 5p15.33 (rs2736100, TERT; P =

1.50 x 10-17), 8q24.21 (rs4295627, CCDC26; P = 2.34 x 10-18), 9p21.3 (rs4977756,

CDKN2A/p14(ARF)/CDKN2B; P = 7.24 x 10-15), 20q13.33 (rs6010620, RTEL1; P = 2.52 x 10-

12), 11q13.4 (rs7124728; P = 8.18 x 10-9) and 11q23.3 (rs498872, PHLDB1; P = 1.07 x 10-8).

These data show that common low-penetrance susceptibility alleles contribute to the risk

of developing glioma and provide insight into disease causation of this primary brain

tumor.

Gliomas account for ~80% of primary malignant brain tumors with ~21,000 individuals being

diagnosed with glioma each year in the US1. Most gliomas are associated with a dismal

prognosis despite surgery, radiotherapy and chemotherapy1. To date no lifestyle exposures have

been consistently linked to an increased risk of glioma except for ionizing radiation which is only

responsible for a very small number of cases1. Evidence for an inherited predisposition to glioma

is provided by the single gene disorders neurofibromatosis, tuberous sclerosis, retinoblastoma,

Li-Fraumeni and Turcot’s syndromes .1 These syndromes are, however, rare and collectively only

make a minor contribution to the 2-fold increased risk seen in first-degree relatives of patients

with primary brain tumors (PBT)2. It is therefore likely that part of the familial risk is a

consequence of the co-inheritance of multiple low-risk variants, some of which may be common

and hence detectable through association analyses. Predicated on this hypothesis we have

conducted 2 genome-wide association (GWA) studies to identify common variants influencing

the risk of developing glioma. Pooling data from both GWA scans and following up the best-

supported associations in 3 independent case-control series has enabled us to identify 6 novel

predisposition loci for glioma.

The two GWA studies were conducted by centers in the UK and US; henceforth referred to as

the UK-GWA and US-GWA studies respectively (Figure 1). In both GWA studies genotyping of

cases was conducted using Illumina Infinium HD Human610-Quad BeadChips.

The UK-GWA study was based on genotyping 636 glioma cases ascertained through the

INTERPHONE Study3. After applying strict quality control criteria, genotype data were available

for 631 cases (Figure 1). For controls we made use of publicly accessible Illumina Hap550K

BeadChip genotype data generated on 1,438 individuals from the British 1958 Birth Cohort (58C,

also known as the National Child Development Study)4, which included all live births in England,

Wales and Scotland during a single week in 1958.

3

Page 6: University of Zurich - UZH · 2010. 11. 29. · Genome-wide association study identifies five susceptibility loci for glioma Abstract To identify risk variants for glioma, we conducted

The US-GWA study was based on genotyping 1,281 glioma cases ascertained through an

ongoing collection being made by the MD Anderson Cancer Center. After applying strict quality

control criteria, genotype data were available for 1,247 cases (Figure 1). For controls we made

use of publicly accessible Illumina Hap550K BeadChip genotype data generated on 2,243

individuals from the Cancer Genetic Markers of Susceptibility (CGEMS) studies of breast and

prostate cancer5,6.

Across both case series total of 572,571 SNPs were satisfactorily genotyped (99.4%), with mean

individual sample call rates (the percentage of samples for which a genotype was obtained for

each SNP) of 99.8%. We excluded 32 individuals because of non-Western European Ancestry

and 2 because of cryptic relatedness (Figure 1; Supplementary Figure 1). 521,318 SNP

genotypes were available on all 1,878 cases and 3,670 controls in the combined data. Prior to

undertaking the meta-analysis we searched for potential errors and biases in the 2 GWA studies

and imposed a high stringency for quality control for SNPs taken forward. Specifically, we

considered only the 454,576 autosomal SNPs which had call rates >95% in all case and control

series, which showed no extreme departure from Hardy-Weinberg equilibrium (HWE; P>10-5 in

controls) and had minor allele frequencies (MAF) exceeding 1% in cases and controls (Figure 1).

Comparison of the observed and expected distributions showed little evidence for an inflation of

the test statistics in the 2 datasets (inflation factor7 = 1.02 and 1.05 for the UK-GWA and US-

GWA studies respectively based on the 90% least significant SNPs; Supplementary Figure 2),

thereby excluding the possibility of significant hidden population substructure or differential

genotype calling between cases and controls. Using data from both GWA studies we derived

joint odds ratios (ORs) and confidence intervals under a fixed effects model for each SNP and

associated P values from the standard normal distribution. The distribution of the association P

values was significantly skewed from the null distribution; 34 of the SNPs having a P value<10-5,

greater than the 4.5 conservatively expected under the null hypothesis (P <10-15, binomial test).

To identify true risk alleles amongst these 34 SNPs, we conducted replication in 3 independent

case-control series involving a total of 5,498 individuals that passed quality control filtering

(French series, 1,392 cases and 1,602 controls; German series 504 cases and 573 controls;

Swedish series, 649 cases and 778 controls). 31 of the 34 SNPs were satisfactorily genotyped in

all 3 case-control series. On the basis of the combined analyses we found that signals from 15 of

the 31 SNPs, representing 6 genomic regions, reached strong levels of evidence satisfying the

generally accepted threshold for genome-wide statistical significance (i.e. P < 10-7; Table 1,

Supplementary Table 1). For 3 SNPs, genotyping was not possible in at least one of the 3 series

using the analytical platforms available within each study centre. In the French case-control

4

Page 7: University of Zurich - UZH · 2010. 11. 29. · Genome-wide association study identifies five susceptibility loci for glioma Abstract To identify risk variants for glioma, we conducted

series SNPs rs9656979 and rs7257116 were not genotyped. An association between these 2

SNPs and glioma risk was however not supported in either German or Swedish series

(Supplementary Table 1). While rs7300686 was not typed in either the German or Swedish

series an association with glioma risk was not supported in the French case-control series

(Supplementary Table 1). Hence failure to genotype all additional case-control series for the

three SNPs has not impacted on the study findings.

Under a fixed-effects model the strongest signal was attained at rs4295627 (combined OR =

1.36, 95% confidence interval [CI]: 1.29 - 1.43; P = 2.34 x 10-18) which localizes to 8q24.21

(130754639 bps; Figure 2, Supplementary Figure 3, Table 1). There was, however evidence of

between-study heterogeneity (Phet = 0.01) ascribable to the association being modest in the

Swedish series. Under a random effects model, the combined OR was 1.36 (95% CI: 1.19-1.58;

P = 2.01 x 10-5) using all case-control series data and 1.45 (95% CI: 1.32-1.57; P = 1.02 x 10-8)

excluding the Swedish study. rs4295627 maps to intron 3 of CCDC26 (coiled-coil domain

containing 26) a retinoic acid (RA) modulator of myeloid differentiation and death8. Retinoic acid

induces caspase-8 transcription through phosphorylation of CREB and increases apoptotic

responses to death stimuli in neuroblastoma cells9. Together with IFN-, RA induces

differentiation, apoptotic death in glioblastoma cells with down regulation of telomerase activity10

and it has been trialed as a chemotherapeutic agent11. There is some evidence to suggest a

second 8q24.21 locus defined by rs891835, which maps to intron 1 of CCDC26 207kb telomeric

to rs4295627 (130,560,934 bps; combined ORtrend = 1.24, 95% CI: 1.17–1.3; P = 7.54 x 10-11)

and displays low LD between with rs4295627 (D’ and r2 = 0.59, 0.19 and 0.58, 0.26 in HapMap

and GWA scan respectively). The OR for rs4295627 with adjustment for rs891835 was 1.30

(95% CI: 1.20-1.41; P = 2.80 x 10-10). Similarly, adjusting rs891835 for rs4295627 still provided

evidence of an association, albeit only at a nominal level of significance (OR= 1.08, 95% CI:

1.00-1.17; P = 0.03). There was also an increasing trend in OR with an increasing number of risk

alleles (P = 2.67 x 10-13) and comparison of haplotype frequencies indicates that the prevalence

of two distinct haplotypes differed between cases and controls (Supplementary Table 2). When

we imputed 8q24.21 genotypes using genome-wide data rs1077236, mapping to 130709683bps,

provided a marginally better association signal than rs4295627 in this dataset (P-values 2.75 x

10-9 and 1.60 x 10-8 respectively; Supplementary Figure 4). Hence, the possibility remains that

rs4295627 and rs891835, although only weakly associated with each other, are in LD with one

or more causal variants in this region.

The second strongest statistical evidence for an association was for rs2736100 (combined OR =

0.79, 95% CI: 0.73 – 0.84; P = 1.50 x 10-17) mapping to 5p15.33 which localizes to intron 2 of

TERT (1339516 bps; Figure 2, Supplementary Figure 3, Table 1). While this SNP is unlikely to

5

Page 8: University of Zurich - UZH · 2010. 11. 29. · Genome-wide association study identifies five susceptibility loci for glioma Abstract To identify risk variants for glioma, we conducted

be directly causal, current knowledge implicates variation in TERT as the probable basis for the

association signal. TERT is the reverse transcriptase component of telomerase, essential for

telomerase activity in maintaining telomeres and is a critical to cell immortalization. While

5p15.33 copy number change is not common in glioblastoma multiforme (GBM; Supplementary

Figure 5) TERT expression has been shown to correlate with glioma grade and prognosis12.

Several lines of evidence raise the possibility of a second locus defined by rs2853676 which also

maps to intron 2 of TERT, 2kb telomeric to rs2736100 (1341547 bps; combined OR = 0.79, 95%

CI: 0.76-0.83; P = 4.21 x 10-14) and is in weak LD with rs2736100 (D’ and r2 = 0.67, 0.15 and

0.81, 0.24 in HapMap and GWA scan respectively). The OR for rs2736100 with adjustment for

rs2853676 was 0.84 (95% CI: 0.79 - 0.89; P = 5.33 x 10-8) and similarly, adjusting rs2853676 for

rs2736100 also provided evidence of an association, with little change in the risk estimate (OR=

0.87, 95% CI: 0.81 – 0.93; P = 6.90 x 10-5). There was an increasing trend in OR with an

increasing number of risk alleles (P = 1.19 x 10-19) and comparison of haplotype frequencies

provided evidence of two distinct haplotypes differing between cases and controls

(Supplementary Table 2). Recent data has implicated variation at 5p15.33 (TERT-CLMPTM1L)

defined by rs2736098 with risk of multiple tumor types including lung and bladder cancer13. As

LD between rs2736098 and both rs2736100 and rs2853676 is poor (D’ and r2 0.48, 0.11 and

0.28, 0.02 in HapMap and GWA scan respectively) whether the 5p15.33 association with glioma

has the same causal basis remains to be established.

The third strongest statistical evidence for an association was for rs4977756 (combined OR =

1.24, 95% CI: 1.19 - 1.30; P = 7.24 x 10-15), which maps 59kb telomeric to CDKN2B

(22058652bps; Figure 2, Supplementary Figure 3, Table 1) within a 122kb region of LD at

9p21.3. This region encompasses the CDKN2A/p14(ARF)/CDKN2B tumor suppressor genes.

CDKN2A encodes p16(INK4A) a negative regulator of cyclin-dependant kinases and p14 (ARF1)

an activator of p53. CDKN2A/p14(ARF)/CDKN2B has an established role in glioma with

mutations in CDKN2A reported to be detectable in ~50% of tumors14,15 and loss of expression

linked to poor prognosis12. Furthermore, germline mutation of the locus, specifically p14ARF has

been reported to cause the melanoma-astrocytoma syndrome16,17. Regulation of p16/p14ARF

has been observed to be important for sensitivity to ionizing radiation, the only environmental

factor strongly linked to gliomagenesis18. While rs4977756 is unlikely to be directly causal these

data provide evidence that common variation in addition to rare mutations in this region is a

basis of glioma predisposition.

The fourth strongest statistical evidence for an association was for rs6010620 (combined OR =

1.28, 95% CI: 1.21 – 1.35; P = 2.52 x 10-12) which localizes to intron 12 (61780283 bps) of the

RAD3-like helicase RTEL1 (regulator of telomere elongation helicase 1)19 and maps within a

65kb region of LD at 20q13.33 (Figure 2, Supplementary Figure 3, Table 1). Amplification of

6

Page 9: University of Zurich - UZH · 2010. 11. 29. · Genome-wide association study identifies five susceptibility loci for glioma Abstract To identify risk variants for glioma, we conducted

20q13.33 encompassing RTEL1 is seen in ~30% of glioma. While copy number change

correlates with RTEL1 mRNA expression levels a similar relationship is seen with other genes

mapping to the region (Supplementary Figure 5). RTEL1 has recently been shown to maintain

genomic stability directly by suppressing homologous recombination19. These data coupled with

the established role of TERT in glioma biology makes RTEL1 an attractive candidate for the

association signal at 20q13.33. Alternatively, the 20q13.33 association may be mediated through

variation in other genes mapping to the region of LD, such as TNFRSF6B (alias DcR3) whose

expression in gliomas has been correlated with tumor grade20,21. TNFRSF6B is thought to play a

regulatory role in suppressing FasL- and LIGHT-mediated cell death and acts as a decoy

receptor competing with death receptors for ligand binding and hence immune evasion of

gliomas20,21.

The fifth strongest statistical evidence for an association was for rs7124728 (combined OR =

1.18, 95% CI: 1.13–1.24; P = 8.18 x 10-9). rs7124728 maps within a 92kb region of LD at

11q13.4 (70554083 bps; Figure 2, Supplementary Figure 3, Table 1), a region bereft of genes or

predicted transcripts. Furthermore, there are no protein-coding transcripts or predicted

genes/microRNAs in the vicinity of the marker (i.e. 300kb flanking sequence).

The sixth strongest signal was at rs498872 (combined OR = 1.18, 95% CI: 1.13 – 1.24; P = 1.07

x 10-8) which maps to the 5’ UTR of PHLDB1 (pleckstrin homology-like domain, family B,

member 1 protein; alias LL5) within a large LD block of 101kb on 11q23.3 (117982577bps;

Figure 2, Supplementary Figure 3, Table 1). There was some evidence of between-study

heterogeneity (Phet = 0.04) ascribable no association detectable in the German series. Under a

random effects model, the combined OR was 1.16 (95% CI: 1.05-1.28; P = 2.24 x 10-3) for all

studies and 1.22 (95% CI: 1.16-1.28; P = 2.56 x 10-10, Phet= 0.40) excluding the German series.

Although there is no direct evidence for a role of PHLDB1 or other genes mapping to the region

of LD at 11q23.3 (e.g. MLL, ARCN1, TREH) this genomic region is deleted in ~14% of gliomas

(Supplementary Figure 5) and commonly deleted in neuroblastoma22.

Given biological differences between the histological forms of glioma23, we have sought to

analyze the association between SNP genotypes and risk of tumors of glial origin, GBM and

astrocytic linage through case-only logistic regression. Accepting the limitations of pooling data

without central histopathological review, there was evidence that GBMs may have a different risk

profile notably with respect to rs4295627 (CCDC26), rs498872 (PHLDB1) and rs2736100

(TERT) (Supplementary Table 3). Due to significant between study heterogeneity we cannot

however exclude the possibility that the impact of variants on risk is generic in nature and is not

directed to influencing the risk of developing a specific histological form of glioma.

7

Page 10: University of Zurich - UZH · 2010. 11. 29. · Genome-wide association study identifies five susceptibility loci for glioma Abstract To identify risk variants for glioma, we conducted

When we modeled pairwise combinations of all of the SNPs we only found evidence of

interactive effects between 8q24.21 SNPs (Supplementary Table 4). For all other pairwise

combinations each locus had an independent role in glioma development (P interaction >0.10). The

risk of glioma increases with increasing numbers of variant alleles for the 6 loci (ORper-allele=1.24,

95% CI: 1.21-1.28; P = 1.30 x 10-61; Supplementary Table 5). The proportion of case and control

subjects grouped according to the number of risk alleles is shown in Figure 3. The distribution of

risk alleles follows a normal distribution in both case and controls, with a shift towards a higher

number of risk alleles in cases. Figure 3 also shows the ORs relative to the median number of 5

risk alleles. Individuals with 10+ risk alleles have at least a 3-fold increase in risk of developing

glioma compared to those with a median number of risk alleles. These ORs may be slight

underestimates because the additive model imposed for allele counting assumes equal

weighting across all SNPs. We estimate the 6 loci we have identified to date through our GWA

account for ~10% of the excess familial risk of glioma (assuming both naïve multiplicative or

additive models). It is acknowledged, however, that the present data provide only crude

estimates of the overall effect on susceptibility attributable to variation at the 6 loci as the effect

of the actual common causal variant responsible for the association, once identified, will typically

be larger. Furthermore, many of the loci may carry additional causal variants, potentially

including low-frequency variants with larger influence on glioma risk.

While fine-mapping and resequencing is required to identify the specific variants underlying each

of the 6 associations so far identified, we conducted 2 analyses to investigate the potential basis

of causality. We interrogated HapMap to identify non-synonymous SNPs (nsSNPs) highly

correlated with the most strongly associated SNP within each region of association. The

strongest LD was between rs6010620 and nsSNP rs3848668 (D’ = 1.0, r2 = 0.05; Supplementary

Table 6). Accepting the caveat that HapMap is not comprehensive these data suggest the

associations identified so far are most probably mediated through LD with sequence changes

that influence gene expression rather than protein sequence or through LD with low frequency

variants (i.e. variants with MAF of 0.001-0.01) that are not catalogued by HapMap. To explore

whether any of the associations reflect cis-acting regulatory effects on a nearby gene, we

searched for genotype–expression correlations in lymphoblastoid cell lines24,25, and in normal

brain tissue26 using publicly available data. We did not, however, find any significant relationship

between any of the SNP genotypes and gene expression (Supplementary Figure 6). These

findings do not exclude subtle effects on gene expression or possibly that the variants exert

effects through untested transactivation of genes.

These results provide the first unambiguous evidence that common genetic variation influences

the risk of developing glioma. Our findings also provide insight into the biological basis of

predisposition. They highlight the importance of variation in genes encoding components of

8

Page 11: University of Zurich - UZH · 2010. 11. 29. · Genome-wide association study identifies five susceptibility loci for glioma Abstract To identify risk variants for glioma, we conducted

CDKN2A-CDK4 signaling pathway in the development of glioma. Moreover this pathway,

elucidated through the extended interaction network of CDKN2A, incorporates TERT (through

mutual interaction with HSP90) and other genes (including CCDC26) we have identified as risk

factors for glioma. The identification of additional variants associated with glioma predisposition

should refine the allelic architecture of disease susceptibility and potentially provide further

insights into the biology of this PBT.

9

Page 12: University of Zurich - UZH · 2010. 11. 29. · Genome-wide association study identifies five susceptibility loci for glioma Abstract To identify risk variants for glioma, we conducted

METHODS

Subjects

UK- GWA study

The UK study was based on 636 cases (401 male, 285 female; mean age at diagnosis 46 years;

SD 12) ascertained through the INTERPHONE Study3. Briefly, the INTERPHONE Study was an

international multi-centre case-control study of primary brain tumors coordinated by the

International Agency for Research on Cancer (IARC), with material collected between

September 2000 and February 2004. UK patients with PBT were collected through

neurosurgery, neuropathology, oncology, and neurology centers in the Thames regions of

Southeast England and the Northern UK including central Scotland, the West Midlands, West

Yorkshire, and the Trent area. Cases were patients with glioma [International Classification of

Diseases (ICD), 10th revision, code C71; International Classification of Diseases for oncology

(ICD-O), 2nd ed., codes 9380-9384, 9390-9411, 9420-9451, and 9505]. Cases with previous

brain tumors were excluded. To minimize population stratification cases with self reported non-

UK ancestry were excluded from the present study. Individuals from the 1958 Birth Cohort

served as source of controls4.

US-GWA study

The US study was based on 1,247 cases (768 male, 479 female; mean age at diagnosis 47

years; SD 13) ascertained through the MD Anderson Cancer Center, Texas, between 1990 and

2008. Cases were patients with glioma (ICD10 code C71; ICD-O codes 9380-9384, 9390-9411,

9420-9451, and 9505). Individuals from CGEMS5,6 served as controls.

Replication series

French Glioma Collection (FGC): systematic series of 1,439 patients with histologically proven

glioma (WHO classification AIII, AI, AIII, OII, OIII, OAII, OAII, OAIII, GBM-IV) ascertained

through the Service de Neurologie Mazarin, Groupe Hospitalier Pitié-Salpêtrière Paris. The

controls are taken from the SU.VI.MAX (SUpplementation en VItamines et

MinrauxAntioXydants) study of 12,735 healthy subjects (women aged 35 to 60 years; men aged

45 to 60 years) recruited from across France in 199427.

Swedish Glioma Collection (SGCCS): 215 glioma cases ascertained as part of the

INTERPHONE Study3 conducted between 2000 and 2002 in Sweden, 134 cases from Umeå

University Hospital, 122 from the Northern Sweden Health and Disease Study (NSHDS)28 and

203 from neurosurgery university clinics in Sweden. 383 controls for the INTERPHONE study

10

Page 13: University of Zurich - UZH · 2010. 11. 29. · Genome-wide association study identifies five susceptibility loci for glioma Abstract To identify risk variants for glioma, we conducted

were randomly selected from the population register, frequency matched to brain tumor cases

on age, sex, and geographical region. In addition 400 controls were ascertained from NSHDS

that were age, gender and geographically representative of the cohort.

German Glioma Collection (GGC): 564 patients who underwent surgery for a glioma at the

Department of Neurosurgery, University of Bonn Medical Center, between 1996 and 2008. All

histological diagnoses were made at the Institute for Neuropathology/German Brain Tumor

Reference Center, University of Bonn Medical Center. Control subjects were 576 healthy

individuals (288 men, 288 women; mean age 31, range 18-45) of German origin recruited in

2004 by the Institute of Transfusion Medicine and Immunology, Mannheim.

Ethics

Collection of blood samples and clinico-pathological information from patients and controls was

undertaken with informed consent and relevant ethical review board approval in accordance with

the tenets of the Declaration of Helsinki.

Genotyping

DNA was extracted from samples using conventional methodologies and quantified using

PicoGreen (Invitrogen, Carlsbad, USA). A genome-wide scan of tag SNPs was conducted using

the Illumina Infinium HD Human610-Quad BeadChips according to the manufacturer's protocols

(Illumina, San Diego, USA; Supplementary Methods). DNA samples with GenCall scores <0.25

at any locus were considered “no calls”. A DNA sample was deemed to have failed if it

generated genotypes at <95% of loci. A SNP was deemed to have failed if fewer than 95% of

DNA samples generated a genotype at the locus. To ensure quality of genotyping, a series of

duplicate samples were genotyped in the same batches.

Subsequent genotyping of SNPs was conducted using either competitive allele-specific PCR

KASPar chemistry (KBiosciences Ltd, Hertfordshire, UK) or single-base primer extension

chemistry MALDI-TOF MS detection (Sequenom, San Diego, USA). All primers and probes used

are available on request. Genotyping quality control was further evaluated through inclusion of

duplicate DNA samples in SNP assays. For all SNP assays >99% concordant results were

obtained. Samples having SNP call rates <90% were excluded from the analysis.

Statistical analysis

Statistical analyses were undertaken using R (v2.6), STATA (version 8; State College, Texas,

US) and PLINK (version 1.05)29 software. Genotype data were used to search for duplicates and

11

Page 14: University of Zurich - UZH · 2010. 11. 29. · Genome-wide association study identifies five susceptibility loci for glioma Abstract To identify risk variants for glioma, we conducted

closely related individuals amongst all samples in each of the GWA studies. Identity by state

(IBS) values were calculated for each pair of individuals, and for any pair with allele sharing of

>80%, the sample generating the lowest call rate was removed from further analysis. To identify

individuals who might have non-Western European ancestry, we merged our case and control

data with the 60 western European (CEU), 60 Nigerian (YRI), and 90 Japanese (JPT) and 90

Han Chinese (CHB) individuals from the International HapMap Project. For each pair of

individuals we calculated genome-wide IBS distances, on markers shared between HapMap and

our SNP panel, and used these as dissimilarity measures upon which to perform principal

component analysis. The first two principal components for each individual were plotted and any

individual not present in the main CEU cluster (i.e. outside 5% from cluster centroids) was

excluded from subsequent analyses.

In the UK-GWA study, genotyped samples were excluded from analyses for the following

reasons: relatedness (n=0) and non-CEU ancestry (n=3). In the US-GWA study, genotyped

samples were excluded from analyses for the following reasons: relatedness (n=2) and non-CEU

ancestry (n=23).

The adequacy of the case-control matching and possibility of differential genotyping of cases

and controls were formally evaluated using Q-Q plots of test statistics. Deviation of the genotype

frequencies in the controls from those expected under Hardy-Weinberg Equilibrium (HWE) was

assessed by a 2 test, or Fisher’s exact test where an expected cell count was <5.

The association between each SNP and risk of glioma was assessed by the Cochran-Armitage

trend test. Odds ratios and associated 95% CIs were calculated by unconditional logistic

regression. Relationships between multiple SNPs showing association with glioma risk in the

same region were investigated using logistic regression analysis, and the impact of additional

SNPs from the same region was assessed by a likelihood-ratio test.

The combined effect of each pair of risk locus was investigated by logistic regression modeling

with evidence for interactive effects between SNPs assessed by a likelihood ratio test. The OR

and trend test for increasing numbers of deleterious alleles was estimated by counting two for a

homozygote and one for a heterozygote.

Meta-analysis was conducted using standard methods based on weighted average of study-

specific estimates of the ORs, using inverse variance weights. Cochran’s Q statistic to test for

heterogeneity and the I2 statistic to quantify the proportion of the total variation due to

heterogeneity were calculated. The sibling relative risk attributable to a given SNP was

calculated using the formula:

12

Page 15: University of Zurich - UZH · 2010. 11. 29. · Genome-wide association study identifies five susceptibility loci for glioma Abstract To identify risk variants for glioma, we conducted

where p is the population frequency of the minor allele, q=1-p, and r and r are the relative risks

(estimated as OR) for heterozygotes and rare homozygotes, relative to common homozygotes.

Assuming a multiplicative interaction the proportion of the familial risk attributable to a SNP was

calculated as log( )/log( ), where is the overall familial relative risk estimated from

epidemiological studies, assumed to be 2.1 .

1 2

*0 0

2 A naïve estimation of the contribution of all of the

loci identified to the excess familial risk of glioma under an additive model was calculated using

the formula:

Bioinformatics

LD metrics between SNPs reported in HapMap were based on Data Release 2/phase III Feb09

on NCBI B35 assembly, dbSNPb125, except for between rs2736098 and rs2736100 and

rs2853676 which were only available in Data Release 23a/phase II March08.

We used Haploview software (v3.2) to infer the LD structure of the genome in the regions

containing loci associated with disease risk. Prediction of the un-typed SNP in the case–control

data sets of GWA data was carried out using IMPUTE and SNPTEST on HapMap (HapMap

Data Rel 2/phase III Feb09 on NCBI B35 assembly, dbSNPb125).

To examine for a relationship between SNP genotype and mRNA expression in lymphocytes

and normal human cortex we used publicly available data. 90 Caucasian derived Epstein-Barr

virus–transformed lymphoblastoid cell lines were analyzed using Sentrix Human-6 Expression

BeadChips (Illumina, San Diego, USA). Online recovery of data was performed using

WGAViewer Version 1.25 Software24,25. 193 normal human cortex cells were analyzed using

Illumina HumanRef-seq-8 Expression BeadChip arrays26. Where SNPs genotyped in our study

had not been typed HAPMAP data were used to identify correlated SNPs in high LD (r2 >0.8).

Differences in the distribution of levels of mRNA expression between SNP genotypes were

compared using a Wilcoxon-type test for trend. Relationship between copy number change and

mRNA expression at loci in glioma was investigated using data from The Cancer Genome Atlas

(TCGA) 14. The Bioconductor module CGHcall was used to assign copy number status.

We searched for interactions between proteins encoded by genes mapping to association

signals using the program PINA30.

13

Page 16: University of Zurich - UZH · 2010. 11. 29. · Genome-wide association study identifies five susceptibility loci for glioma Abstract To identify risk variants for glioma, we conducted

URLs

The R suite can be found at http://www.r-project.org/

Bioconductor: http://www.bioconductor.org/

Detailed information on the tag SNP panel can be found at http://www.illumina.com/

dbSNP: http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?CMD=search&DB=snp

HAPMAP: http://www.hapmap.org/

http://pipeline.lbl.gov/cgi-bin/gateway2

KBioscience: http://kbioscience.co.uk/

WGAViewer: http://www.genome.duke.edu/centers/pg2/downloads/wgaviewer.php

1958 Birth Cohort: http://www.cls.ioe.ac.uk/studies.asp?section=000100020003

PLINK: http://pngu.mgh.harvard.edu/~purcell/plink/

Cancer Genetic Markers of Susceptibility (CGEMS): http://cgems.cancer.gov/

The Cancer Genomics Data Portal: http://cbio.mskcc.org/cancergenomics-dataportal

The Cancer Genome Atlas (TCGA) data portal: http://tcga-data.nci.nih.gov/tcga/homepage.htm

A Survey of Genetic Human Cortical Gene Expression: http://labs.med.miami.edu/myers/

IMPUTE: http://www.stats.ox.ac.uk/~marchini/software/gwas/impute.html

SNPTEST: http://www.stats.ox.ac.uk/~marchini/software/gwas/snptest.html

PINA (Protein Interaction Network Analysis platform): http://csbi.ltdk.helsinki.fi/pina/

Note: Supplementary information is available on the Nature Genetics website

ACKNOWLEDGEMENTS

The Wellcome Trust provided principal funding for the study. In the UK additional funding was

provided by Cancer Research UK (C1298/A8362 supported by the Bobby Moore Fund) and the

European Union (CPRB LSHC-CT-2004-503465). The UK and Swedish INTERPHONE studies

were supported by the European Union Fifth Framework Program “Quality of life and

Management of Living Resources” (contract number QLK4-CT-1999-01563), and the

International Union against Cancer (UICC). The UICC received funds for this purpose from the

Mobile Manufacturers' Forum and GSM Association. Provision of funds via the UICC was

governed by agreements that guaranteed INTERPHONE's complete scientific independence.

These agreements are publicly available at http://www.iarc.fr/ENG/Units/RCAd.html. The

Swedish centre was also supported by the Swedish Research Council, the Cancer Foundation

of Northern Sweden, the Swedish Cancer Society, and the Nordic Cancer Union and the UK

centre by the Mobile Telecommunications and Health Research (MTHR) Programme and the

14

Page 17: University of Zurich - UZH · 2010. 11. 29. · Genome-wide association study identifies five susceptibility loci for glioma Abstract To identify risk variants for glioma, we conducted

Health and Safety Executive, Department of Health and Safety Executive and the UK Network

Operators (O2, Orange, T-Mobile, Vodafone, and ‘3’). The views expressed in the publication are

those of the authors and not necessarily those of the funding bodies. In the US funding was

provided by NIH grants 5R01 CA119215 and 5R01 CA070917. Additional support was obtained

from the American Brain Tumor Association and the National Brain Tumor Society. MD

Anderson acknowledges the work of Phyllis Adatto, Fabian Morice, Hui Zhang, Victor Levin,

Alfred Yung, Mark Gilbert, Raymond Sawaya, Vinay Puduvalli, Charles Conrad, Fredrick Lang,

and Jeffrey Weinberg from the Brain and Spine Center. In France funding was provided by the

Délégation à la Recherche Clinique (MUL03012), the Association pour la Recherche sur les

Tumeurs Cérébrales (ARTC), the Institut National du Cancer (INCa; PL 046) and the French

Ministry of Higher Education and Research. In Germany funding was provided to MS, JS and ML

by the Deutsche Forschungsgemeinschaft (Si 552, Schr 285), the Deutsche Krebshilfe (70-

2385-Wi2, 70-3163-Wi3, 10-6262) and BONFOR. In Sweden the collection of samples from the

Northern Sweden Health of disease study were collected with support from Umeå University

Hospital and NIH funded part of GLIOGENE collection (R01 CA11915). BM was supported by

Acta Oncologica foundation as a research fellow at Royal Swedish Academy of Science. The

UK-GWA study made use of genotyping data on the 1958 Birth Cohort. Genotyping data on

controls was generated and generously supplied to us by Panagiotis Deloukas of the Wellcome

Trust Sanger Institute. A full list of the investigators who contributed to the generation of the data

is available from www.wtccc.org.uk. The US-GWA study made use of control genotypes from the

CGEMS prostate and breast cancer studies. A full list of the investigators who contributed to the

generation of the data is available from http://cgems.cancer.gov/. The results published here are

in whole or part based upon data generated by The Cancer Genome Atlas pilot project

established by the NCI and NHGRI. Information about TCGA and the investigators and

institutions that constitute the TCGA research network can be found at

http://cancergenome.nih.gov. Finally, we are grateful to all the patients and individuals for their

participation. We would also like to thank the clinicians and other hospital staff, cancer registries

and study staff who contributed to the blood sample and data collection for this study.

AUTHOR CONTRIBUTIONS

RSH and MB designed the study. RSH drafted the manuscript, with extensive contributions from

FH and SS. FH and SS performed statistical analyses. SED, FH and SS performed

bioinformatics analyses. LBR performed laboratory management and oversaw genotyping of UK

cases and performed genotyping of German and Swedish case-control series. In the UK: AS,

MS, KM, JSH developed patient recruitment, sample acquisition and performed sample

collection of cases. In the US: MB and GA developed patient recruitment; GA performed curation

15

Page 18: University of Zurich - UZH · 2010. 11. 29. · Genome-wide association study identifies five susceptibility loci for glioma Abstract To identify risk variants for glioma, we conducted

and organization of samples and data; YL performed management of genotyping. In Germany:

MS and JS developed patient recruitment and blood sample collection. MS oversaw DNA

isolation and storage. KH and ML collected control samples. MS and ML case ascertainment

and supervision of DNA extractions. KH and RK procurement of German control samples. In

France: MS, J-Y D, K H-X, AI developed patient recruitment; YM, BB and S El-H developed

sample acquisition and performed sample collection of cases. In Sweden: For the Swedish

INTERPHONE Study MF, SL, and AA developed study design and conducted patient

recruitment and control selection, MF, SL, AA, BM, and RH organized sample acquisition and

performed sample collection of case and controls. BM and RH performed laboratory

management and oversaw DNA extraction. The NSHDS samples were collected by Umeå

University, (Principal Investigator Göran Hallmans), and the additional samples collected at the

neurosurgery department in Umeå from 2005 and onwards (TB, RH) and through the national

GLIOGENE study (Principal Investigator BM). All authors contributed to the final paper.

COMPETING INTERESTS STATEMENT

The authors declare no competing financial interests

16

Page 19: University of Zurich - UZH · 2010. 11. 29. · Genome-wide association study identifies five susceptibility loci for glioma Abstract To identify risk variants for glioma, we conducted

REFERENCES 1. Bondy, M.L. et al. Brain tumor epidemiology: consensus from the Brain Tumor

Epidemiology Consortium. Cancer 113, 1953-68 (2008). 2. Hemminki, K. & Li, X. Familial risks in nervous system tumors. Cancer Epidemiol

Biomarkers Prev 12, 1137-42 (2003). 3. Cardis, E. et al. The INTERPHONE study: design, epidemiological methods, and

description of the study population. Eur J Epidemiol 22, 647-64 (2007). 4. Power, C. & Elliott, J. Cohort profile: 1958 British birth cohort (National Child

Development Study). Int J Epidemiol 35, 34-41 (2006). 5. Hunter, D.J. et al. A genome-wide association study identifies alleles in FGFR2

associated with risk of sporadic postmenopausal breast cancer. Nat Genet 39, 870-4 (2007).

6. Yeager, M. et al. Genome-wide association study of prostate cancer identifies a second risk locus at 8q24. Nat Genet 39, 645-9 (2007).

7. Devlin, B. & Roeder, K. Genomic control for association studies. Biometrics 55, 997-1004 (1999).

8. Yin, W., Rossin, A., Clifford, J.L. & Gronemeyer, H. Co-resistance to retinoic acid and TRAIL by insertion mutagenesis into RAM. Oncogene 25, 3735-44 (2006).

9. Jiang, M., Zhu, K., Grenet, J. & Lahti, J.M. Retinoic acid induces caspase-8 transcription via phospho-CREB and increases apoptotic responses to death stimuli in neuroblastoma cells. Biochim Biophys Acta 1783, 1055-67 (2008).

10. Das, A., Banik, N.L. & Ray, S.K. Differentiation decreased telomerase activity in rat glioblastoma C6 cells and increased sensitivity to IFN-gamma and taxol for apoptosis. Neurochem Res 32, 2167-83 (2007).

11. Jaeckle, K.A. et al. Phase II evaluation of temozolomide and 13-cis-retinoic acid for the treatment of recurrent and progressive malignant glioma: a North American Brain Tumor Consortium study. J Clin Oncol 21, 2305-11 (2003).

12. Wager, M. et al. Prognostic molecular markers with no impact on decision-making: the paradox of gliomas based on a prospective study. Br J Cancer 98, 1830-8 (2008).

13. Rafnar, T. et al. Sequence variants at the TERT-CLPTM1L locus associate with many cancer types. Nat Genet 41, 221-7 (2009).

14. Comprehensive genomic characterization defines human glioblastoma genes and core pathways. Nature 455, 1061-8 (2008).

15. Parsons, D.W. et al. An integrated genomic analysis of human glioblastoma multiforme. Science 321, 1807-12 (2008).

16. Randerson-Moor, J.A. et al. A germline deletion of p14(ARF) but not CDKN2A in a melanoma-neural system tumour syndrome family. Hum Mol Genet 10, 55-62 (2001).

17. Bahuau, M. et al. Germ-line deletion involving the INK4 locus in familial proneness to melanoma and nervous system tumors. Cancer Res 58, 2298-303 (1998).

18. Simon, M., Voss, D., Park-Simon, T.W., Mahlberg, R. & Koster, G. Role of p16 and p14ARF in radio- and chemosensitivity of malignant gliomas. Oncol Rep 16, 127-32 (2006).

19. Barber, L.J. et al. RTEL1 maintains genomic stability by suppressing homologous recombination. Cell 135, 261-71 (2008).

20. Roth, W. et al. Soluble decoy receptor 3 is expressed by malignant gliomas and suppresses CD95 ligand-induced apoptosis and chemotaxis. Cancer Res 61, 2759-65 (2001).

21. Arakawa, Y. et al. Frequent gene amplification and overexpression of decoy receptor 3 in glioblastoma. Acta Neuropathol 109, 294-8 (2005).

22. Guo, C. et al. Allelic deletion at 11q23 is common in MYCN single copy neuroblastomas. Oncogene 18, 4948-57 (1999).

23. Endersby, R. & Baker, S.J. PTEN signaling in brain: neuropathology and tumorigenesis. Oncogene 27, 5416-30 (2008).

24. Stranger, B.E. et al. Genome-wide associations of gene expression variation in humans. PLoS Genet 1, e78 (2005).

17

Page 20: University of Zurich - UZH · 2010. 11. 29. · Genome-wide association study identifies five susceptibility loci for glioma Abstract To identify risk variants for glioma, we conducted

18

25. Stranger, B.E. et al. Relative impact of nucleotide and copy number variation on gene expression phenotypes. Science 315, 848-53 (2007).

26. Myers, A.J. et al. A survey of genetic human cortical gene expression. Nat Genet 39, 1494-9 (2007).

27. Hercberg, S. et al. The SU.VI.MAX Study: a randomized, placebo-controlled trial of the health effects of antioxidant vitamins and minerals. Arch Intern Med 164, 2335-42 (2004).

28. Hallmans, G. et al. Cardiovascular disease and diabetes in the Northern Sweden Health and Disease Study Cohort - evaluation of risk factors and their interactions. Scand J Public Health Suppl 61, 18-24 (2003).

29. Purcell, S. et al. PLINK: a tool set for whole-genome association and population-based linkage analyses. Am J Hum Genet 81, 559-75 (2007).

30. Wu, J. et al. Integrated network analysis platform for protein-protein interactions. Nat Methods 6, 75-7 (2009).

Page 21: University of Zurich - UZH · 2010. 11. 29. · Genome-wide association study identifies five susceptibility loci for glioma Abstract To identify risk variants for glioma, we conducted

Table 1: Summary of results for glioma SNPs replicated. Gene(s) mapping within 20kb of each SNP are detailed

GWA studies Replication studies Combined

SNP Chr Gene Location

(bp)

Ancestral allele

frequency Ancestral

allele

Non-ancestral

allele OR

(95% CI) P OR

(95% CI) P OR

(95% CI) P P het

rs2736100 5 TERT 1,339,516 0.51 T G 0.83

(0.75-0.91) 2.21E-06 0.75

(0.67-0.83) 2.87E-13 0.79

(0.73-0.84) 1.50E-17 0.18

rs2853676 5 TERT 1,341,547 0.27 G A 1.22

(1.14-1.31) 5.30E-06 1.3 (1.21-

1.38) 1.06E-09 1.26

(1.2-1.32) 4.21E-14 0.67

rs10464870 8 CCDC26 130,547,005 0.21 T C 1.24

(1.15-1.34) 3.90E-06 1.22

(1.13-1.31) 1.77E-05 1.23

(1.17-1.3) 3.04E-10 0.05

rs891835 8 CCDC26 130,560,934 0.22 T G 1.24

(1.15-1.33) 3.92E-06 1.24

(1.15-1.33) 4.43E-06 1.24

(1.17-1.3) 7.54E-11 0.01

rs6470745 8 CCDC26 130,711,103 0.20 A G 1.3

(1.2-1.39) 5.79E-08 1.31

(1.22-1.41) 9.09E-09 1.3

(1.24-1.37) 2.77E-15 0.01

rs16904140 8 CCDC26 130,734,825 0.21 G A 1.25

(1.16-1.35) 1.41E-06 1.28

(1.19-1.37) 1.14E-07 1.27

(1.2-1.33) 7.88E-13 0.01

rs4295627 8 CCDC26 130,754,639 0.17 T G 1.33

(1.23-1.42) 1.47E-08 1.39

(1.3-1.49) 2.20E-11 1.36

(1.29-1.43) 2.34E-18 0.01

rs1063192 9 CDKN2A/B 21,993,367 0.44 T C 1.21

(1.13-1.29) 1.44E-06 1.21

(1.14-1.29) 6.97E-07 1.21

(1.16-1.27) 4.61E-12 0.81

rs2157719 9 CDKN2A/B 22,023,366 0.57 G A 0.82

(0.74-0.9) 6.80E-07 0.82

(0.75-0.9) 4.42E-07 0.82

(0.77-0.88) 1.41E-12 0.68

rs1412829 9 CDKN2A/B 22,033,926 0.42 T C 1.22

(1.14-1.3) 7.23E-07 1.23

(1.15-1.3) 1.80E-07 1.22

(1.17-1.28) 6.23E-13 0.67

rs4977756 9 CDKN2A/B 22,058,652 0.40 A G 1.25

(1.17-1.32) 2.39E-08 1.24

(1.16-1.31) 5.90E-08 1.24

(1.19-1.3) 7.24E-15 0.94

rs7124728 11 70,554,083 0.44 C T 1.38

(1.3-1.47) 2.28E-13 1.05

(0.97-1.13) 2.24E-01 1.18

(1.13-1.24) 8.18E-09 0.01

rs498872 11 PHLDB1 117,982,577 0.31 C T 1.26

(1.17-1.34) 1.03E-07 1.12

(1.04-1.2) 4.56E-03 1.18

(1.13-1.24) 1.07E-08 0.04

rs6010620 20 RTEL1 61,780,283 0.23 G A 1.28

(1.18-1.38) 8.38E-07 1.28

(1.18-1.38) 6.49E-07 1.28

(1.21-1.35) 2.52E-12 0.38

rs2297440 20 RTEL1 61,782,743 0.22 C T 1.28

(1.18-1.38) 1.01E-06 1.26

(1.16-1.35) 4.44E-06 1.27

(1.2-1.34) 2.06E-11 0.40

Page 22: University of Zurich - UZH · 2010. 11. 29. · Genome-wide association study identifies five susceptibility loci for glioma Abstract To identify risk variants for glioma, we conducted

UK Casesn = 636HumanHap610Quad. 575,837 SNPs

1958 Birth Cohort Controlsn = 1,438 HumanHap550541,327 SNPs

US Casesn = 1,281HumanHap610 Quad. 575,837 SNPs

521,318 SNPs common across all case and control series

Meta-analysis of 454,576 SNPs

UK-GWA study: 631 cases and 1434 controls

US-GWA study: 1247 cases and 2236 controls

66,742 SNPs removed due to:

• Low MAF (<1% in controls)

• Call rate (< 95% in cases/controls)

• Failed HWE (10-5 level) in controls

Samples removed:

• 4 non-Western Europeans

Samples removed:

• 2 failed genotyping

• 3 non-Western Europeans

Samples removed:

• 9 failed genotyping

• 23 non-Western Europeans

• 2 closely related (IBS > 80%)

Samples removed:

• 7 low call rates

• 2 non-Western Europeans

UK-GWA study US-GWA study

CGEMs Breast Controls n = 1,142HumanHap500 533,334 SNPs

CGEMs Prostate Controlsn = 1,101HumanHap300 + HumanHap240534,379 SNPs

Figure 1: Patient and single SNP exclusion schema for genome-wide studies

Page 23: University of Zurich - UZH · 2010. 11. 29. · Genome-wide association study identifies five susceptibility loci for glioma Abstract To identify risk variants for glioma, we conducted

CDKN2A CDKN2B

30

15

cM/M

b

rs1063192rs2157719 rs1412829

rs4977756

0

2

4

6

8

10

12

14

16

18

61500 61600 61700 61800 61900 62000 62100

0

2

4

6

8

10

12

14

16

18

70400 70420 70440 70460 70480 70500 70520 70540 70560 70580 70600

cM/M

b

Kb

CR600271

rs10464870

rs6470745

rs4295627

rs16904140

CCDC26 MLZE

-log 1

0P

NKD2 SLC12A7

SLC6A19 SLC6A18 TERT CLPTM1L SLC6A3 LPCAT4

rs2736100

rs2853676

Kb

cM/M

b

Kb

-log 1

0P

20

cM/M

b

Kb

-log 1

0P

rs7124728

90

45

KbKb-lo

g 10P

10

Kb

30

15

cM/M

b

rs6010620rs2297440

KCNQ2 PRPF6

ARFRP1

RTEL1

TNFRSF6B

EEF1A2 PTK6 TPD52L2 UCKL1ZGPATSTMN3

LIME1

(a) 8q24.21 (b) 5p15.33

(c) 9p21.3 (d) 20q13.33

-log 1

0P

rs891835

(e) 11q23.3 (f) 11q13.4

50rs498872

UBE4A

ATP5L ARCN1 TREH BCL9L

MLL PHLDB1 CXCR5DDX6

-log 1

0P

Kb

cM/M

b

25

30

60

0

2

4

6

8

10

12

14

16

18

117500 117600 117700 117800 117900 118000 118100 118200 118300 118400 118500

0

2

4

6

8

10

12

14

16

18

130300 130400 130500 130600 130700 130800 130900 1310000

2

4

6

8

10

12

14

16

18

1000 1100 1200 1300 1400 1500 1600 1700

0

2

4

6

8

10

12

14

16

18

21800 21850 21900 21950 22000 22050 22100 22150 22200

Page 24: University of Zurich - UZH · 2010. 11. 29. · Genome-wide association study identifies five susceptibility loci for glioma Abstract To identify risk variants for glioma, we conducted

Figure 2: LD structure and association results for the 6 confirmed glioma disease-associated regions: (a) 8q24.21; (b) 5p15.33; (c) 9p21.3; (d) 20q13.33; (e) 11q23.3; and (f) 11q13.4. Chromosomal positions based on NCBI build 36 coordinates, showing Ensemble (release 48) genes. Armitage trend test P values (as –log10 values; left y axis) are shown for SNPs analysed in the GWA-studies in blue and for the combined analysis including the replication series in red. Recombination rates in HapMap CEU across the region are shown in black (right y axis). Also shown are the relative positions of genes mapping to each region of association. Exons of genes have been redrawn to show the relative positions in the gene, therefore maps are not to physical scale.

Page 25: University of Zurich - UZH · 2010. 11. 29. · Genome-wide association study identifies five susceptibility loci for glioma Abstract To identify risk variants for glioma, we conducted

0

5

10

15

20

25

30

1 2 3 4 5 6 7 8 9 10 11 12Number of risk alleles

Fre

qu

ency

(%

)

ControlsCases

0

0.5

1

1.5

2

2.5

3

3.5

4

<=1 2 3 4 5 6 7 8 9 >=10Number of risk alleles

Od

ds

Rat

io

(a) (b)

Figure 3: (a) Distribution of risk alleles in glioma cases (blue bars) and controls (green bars); (b) Plot of the increasing ORs for glioma with increasing number of risk alleles. The ORs are relative to the median number of five risk alleles; Vertical bars correspond to 95% confidence intervals.

Page 26: University of Zurich - UZH · 2010. 11. 29. · Genome-wide association study identifies five susceptibility loci for glioma Abstract To identify risk variants for glioma, we conducted

OR 95% CI P

rs2736100 5 230 TT 645 TG 372 GG 546 TT 1103 TG 584 GG 0.82 0.72-2.78 1.01E-04rs2853676 5 109 AA 568 AG 570 GG 149 AA 894 AG 1191 GG 1.28 1.17-3.24 1.35E-05rs10464870 8 683 TT 488 TC 76 CC 1370 TT 761 TC 101 CC 1.26 1.14-3.22 9.62E-05rs891835 8 661 TT 498 TG 88 GG 1344 TT 773 TG 115 GG 1.28 1.16-3.24 2.16E-05rs6470745 8 699 AA 469 AG 79 GG 1415 AA 730 AG 90 GG 1.32 1.2-3.28 4.20E-06rs16904140 8 85 AA 476 AG 686 GG 110 AA 773 AG 1352 GG 1.22 1.11-3.18 5.29E-04rs4295627 8 735 TT 451 TG 60 GG 1496 TT 667 TG 72 GG 1.35 1.22-3.31 1.68E-06rs1063192 9 326 TT 602 TC 319 CC 700 TT 1092 TC 436 CC 1.25 1.15-3.21 7.62E-06rs2157719 9 335 AA 601 AG 311 GG 726 AA 1078 AG 428 GG 0.80 0.7-2.76 6.15E-06rs1412829 9 347 TT 596 TC 302 CC 749 TT 1072 TC 412 CC 1.25 1.16-3.21 5.59E-06rs4977756 9 377 AA 594 AG 276 GG 782 AA 1083 AG 370 GG 1.23 1.13-3.19 3.46E-05rs7124728 11 76 TT 740 TC 431 CC 424 TT 1145 TC 663 CC 1.53 1.42-3.49 2.56E-14rs498872 11 156 TT 589 TC 502 CC 211 TT 954 TC 1070 CC 1.27 1.17-3.23 5.07E-06rs6010620 20 46 AA 405 AG 796 GG 123 AA 785 AG 1327 GG 1.20 1.08-3.16 0.0025rs2297440 20 45 TT 397 TC 804 CC 119 TT 777 TC 1330 CC 1.22 1.09-3.18 0.0016rs7300686 12 307 TT 454 TC 466 CC 586 TT 1074 TC 508 CC 1.36 1.26-3.32 4.72E-10rs17748 11 84 TT 487 TC 675 CC 119 TT 753 TC 1363 CC 1.25 1.14-3.21 1.23E-04rs9656979 8 351 TT 630 TC 266 CC 742 TT 1098 TC 393 CC 1.20 1.1-3.16 3.65E-04rs9369226 6 57 AA 428 AG 762 GG 73 AA 659 AG 1502 GG 0.79 0.67-2.75 1.73E-04rs171125 16 19 AA 249 AG 978 GG 6 AA 359 AG 1838 GG 1.43 1.26-3.39 2.33E-05rs6931798 6 56 TT 426 TC 765 CC 71 TT 656 TC 1500 CC 1.26 1.14-3.22 2.11E-04rs6869535 5 28 AA 307 AG 912 GG 69 AA 682 AG 1483 GG 1.32 1.19-3.28 5.17E-05rs2110922 2 188 TT 579 TG 478 GG 273 TT 980 TG 957 GG 0.85 0.75-2.81 0.0016rs2072532 2 390 TT 606 TC 251 CC 572 TT 1128 TC 533 CC 1.21 1.11-3.17 1.71E-04rs7257116 19 148 TT 581 TC 518 CC 339 TT 1095 TC 796 CC 1.22 1.12-3.18 1.36E-04rs1509937 10 517 AA 552 AG 178 GG 997 AA 995 AG 243 GG 1.15 1.05-3.11 0.0062rs2206920 20 18 AA 239 AG 990 GG 18 AA 342 AG 1850 GG 1.32 1.16-3.28 8.44E-04rs12531711 7 966 AA 262 AG 19 GG 1807 AA 407 AG 20 GG 1.23 1.08-3.19 0.0089rs10488631 7 965 TT 263 TC 19 CC 1810 TT 405 TC 20 CC 1.24 1.09-3.2 0.0063rs10924559 1 29 AA 324 AG 872 GG 52 AA 460 AG 1699 GG 0.80 0.66-2.76 0.0014rs1384847 4 0 AA 170 AG 1077 GG 28 AA 360 AG 1847 GG 1.40 1.21-3.36 3.85E-04rs1941114 18 18 AA 268 AG 961 GG 60 AA 602 AG 1573 GG 0.72 0.58-2.68 8.90E-06rs7325443 13 50 TT 427 TC 770 CC 73 TT 633 TC 1529 CC 1.27 1.15-3.23 1.54E-04rs7325927 13 129 TT 456 TC 659 CC 147 TT 807 TC 1281 CC 0.82 0.71-2.78 3.27E-04

Supplementary Table 1: Genotypes for the 34 SNPs in cases and controls in each of the case-control series. Also shown are Odds ratios andassociated 95% confidence intervals, for each of the SNPs genotyped.

ChrSNP

ResultsCase genotypes Control genotypes

Page 27: University of Zurich - UZH · 2010. 11. 29. · Genome-wide association study identifies five susceptibility loci for glioma Abstract To identify risk variants for glioma, we conducted

rs2736100 rs2853676 Cases Controls OR 95% CI P

TT GG 678 1'414 1.00 ReferenceTT GA 91 164 1.13 (0.86 - 1.49) 0.37TT AA 3 8 0.93 (0.24 - 3.53) 0.91TG GG 975 1'521 1.33 (1.18 - 1.5) 5.36E-06TG GA 1'158 1'569 1.53 (1.36 - 1.73) 3.17E-12TG AA 60 73 1.67 (1.17 - 2.39) 0.005GG GG 310 449 1.45 (1.22 - 1.72) 2.67E-05GG GA 654 842 1.64 (1.43 - 1.88) 2.75E-12GG AA 372 390 1.97 (1.66 - 2.33) 7.77E-15

Per risk allele 1.08 (1.06-1.09) 1.19E-19P interaction 0.308

rs4295627 rs891835 Cases Controls OR 95% CI P

TT TT 3'349 2'044 1.00 ReferenceTT TG 951 573 1.02 (0.90 - 1.15) 0.76TT GG 82 44 0.90 (0.62 - 1.30) 0.58TG TT 538 384 1.19 (1.03 - 1.38) 0.02TG TG 1'144 946 1.40 (1.25 - 1.54) 4.66E-10TG GG 141 145 1.73 (1.36 - 2.20) 7.82E-06GG TT 16 17 1.89 (0.95 - 3.78) 0.07GG TG 91 82 1.48 (1.09 - 2.01) 0.01GG GG 84 116 2.35 (1.76 - 3.14) 7.04E-09

Per risk allele 1.08 (1.07-1.11) 2.47E-19P interaction 0.03

Supplementary Table 2: Risk of glioma by combined genotypes for rs2736100-rs285

Page 28: University of Zurich - UZH · 2010. 11. 29. · Genome-wide association study identifies five susceptibility loci for glioma Abstract To identify risk variants for glioma, we conducted

Haplotype*Case

FrequencyControl

FrequencyP

TG 0.409 0.467 4.23E-17GA 0.296 0.247 5.1E-16GG 0.269 0.26 0.1439TA 0.026 0.026 0.733

*Haplotype consists of rs2736100 and rs2853676

Haplotype*Case

FrequencyControl

FrequencyP

TTATGT 0.424 0.462 2.00E-04TTACGT 0.202 0.198 0.6561CGGCAG 0.143 0.099 7.46E-12CGATGT 0.087 0.094 0.2041TTGCAG 0.062 0.059 0.6354TGGCAG 0.02 0.02 0.9459TTGCAT 0.016 0.02 0.1366TTACAT 0.015 0.017 0.4222CTATGT 0.011 0.011 0.9447

*Haplotype consists of rs10464870, rs891835, rs6470745,rs9656979, rs16904140 and rs4295627

53676 and rs4295627-rs891835.

Page 29: University of Zurich - UZH · 2010. 11. 29. · Genome-wide association study identifies five susceptibility loci for glioma Abstract To identify risk variants for glioma, we conducted

SNP Comparison P Between study P

rs4295627 Glial vs GBM 2.52E-09 0.111Astrocytic vs GBM 5.12E-05 0.013Glial vs astrocytic 0.026 0.016Glial & astrocytic vs GBM 3.80E-08 0.014

rs498872 Glial vs GBM 0.009 0.748Astrocytic vs GBM 0.030 0.145Glial vs astrocytic 0.158 0.406Glial & astrocytic vs GBM 0.005 0.999

rs4977756 Glial vs GBM 0.230 9.60E-05Astrocytic vs GBM 0.0008 0.003Glial vs astrocytic 0.411 0.121Glial & astrocytic vs GBM 0.010 0.001

rs2736100 Glial vs GBM 3.30E-04 0.069Astrocytic vs GBM 1.91E-04 0.068Glial vs astrocytic 0.548 0.172Glial & astrocytic vs GBM 2.55E-05 0.047

rs6010620 Glial vs GBM 0.003 0.429Astrocytic vs GBM 0.138 0.126Glial vs astrocytic 0.070 0.003Glial & astrocytic vs GBM 0.016 0.043

rs7124728 Glial vs GBM 0.030 3.60E-06Astrocytic vs GBM 0.411 5.00E-08Glial vs astrocytic 0.203 1.00E-06Glial & astrocytic vs GBM 0.153 3.60E-10

Supplementary Table 3: Clinico-pathological association testing.Tumors of glial origin defined by ICD-O codes 93803, 94423, 94503, 94513, GBM by ICD-O codes 94401-13, and astrocytic tumors by ICD-O codes 94003-245.

Page 30: University of Zurich - UZH · 2010. 11. 29. · Genome-wide association study identifies five susceptibility loci for glioma Abstract To identify risk variants for glioma, we conducted

TERT 11q13.4 PHLDB1

rs2853676 rs10464870 rs891835 rs6470745 rs16904140 rs4295627 rs1063192 rs2157719 rs1412829 rs4977756 rs7124728 rs498872 rs6010620 rs2297440

rs27361000.31

(10,731)0.68

(10,800)0.73

(10,667)0.74

(10,850)0.30

(10,836)0.30

(10,846)0.34

(10,774)0.20

(10,825)0.41

(10,810)0.27

(10,811)0.19

(10.725)0.55

(10,825)0.60

(10,763)0.63

(10,623)

rs28536760.26

(10,758)0.78

(10,629)0.38

(10,808)0.41

(10.798)0.41

(10,803)0.74

(10,739)0.85

(10,782)0.97

(10,777)0.40

(10,765)0.56

(10,703)0.20

(10,783)0.66

(10,728)0.56

(10,598)

rs104648700.62

(10,705)5.57x10-5

(10,887)3.92x10-12

(10,880)0.02

(10,880)0.86

(10,813)0.80

(10,865)0.55

(10,848)0.73

(10,848)0.73

(10,765)0.18

(10,860)0.94

(10,797)0.86

(10,661)

rs8918351.56x10-4

(10,753)1.45x10-4

(10.740)0.03

(10,747)0.77

(10,677)0.71

(10,728)0.53

(10,713)0.67

(10,711)0.55

(10,625)0.24

(10,729)0.33

(10,668)0.22

(10,572)

rs64707450.33

(10,926)0.48

(10,933)0.17

(10,861)0.19

(10,911)0.51

(10,900)0.33

(10,898)0.36

(10,806)0.39

(10,912)0.73

(10,846)0.61

(10,711)

rs169041400.26

(10,923)0.41

(10,852)0.49

(10,898)0.96

(10,888)0.50

(10,884)0.41

(10,798)0.41

(10,903)0.89

(10,836)0.78

(10,703)

rs42956270.38

(10,859)0.39

(10,911)0.82

(10,896)0.49

(10,894)0.55

(10,806)0.44

(10,909)0.75

(10,846)0.73

(10,708)

rs10631920.60

(10,838)0.58

(10,827)0.76

(10,831)0.14

(10,736)0.73

(10,839)0.87

(10,772)0.97

(10,636)

rs21577190.54

(10,874)0.54

(10,874)0.23

(10,789)0.70

(10,889)0.80

(10,823)0.87

(10,687)

rs14128290.59

(10,857)0.26

(10,771)0.44

(10,873)0.60

(10,814)0.65

(10,674)

rs49777560.12

(10,777)0.29

(10,869)0.70

(10,808)0.84

(10,670)

11q13.4 rs71247280.29

(10,869)0.25

(10,720)0.22

(10,588)

PHLDB1 rs4988720.19

(10,820)0.12

(10,690)

RTEL1 rs60106200.28

(10,632)

Supplementary Table 4: Pairwise analysis of all SNPs associated with glioma risk. For each row-column combination, numbers show the P value forinclusion of an interaction term between the two SNPs. Numbers in parentheses are the number of samples from which statistics are calculated.

CDKN2A/B

RTEL1CCDC26

TERT

CCDC26

CDKN2A/B

Page 31: University of Zurich - UZH · 2010. 11. 29. · Genome-wide association study identifies five susceptibility loci for glioma Abstract To identify risk variants for glioma, we conducted

Number of risk alleles Contols (%) Cases (%) OR (95% CI)

0-1 33 (0.52) 8 (0.19) 0.45 (0.21-0.99)2 150 (2.38) 32 (0.78) 0.4 (0.27-0.59)3 483 (7.67) 184 (4.48) 0.71 (0.59-0.86)4 1016 (16.13) 489 (11.9) 0.9 (0.78-1.03)5 1542 (24.48) 825 (20.08) 1.0 (0.89-1.13) Reference6 1520 (24.13) 1010 (24.59) 1.24 (1.11-1.39)7 954 (15.15) 902 (21.96) 1.77 (1.56-2)8 417 (6.62) 434 (10.56) 1.95 (1.66-2.28)9 155 (2.46) 174 (4.24) 2.1 (1.66-2.65)

10+ 29 (0.46) 50 (1.22) 3.22 (2.02-5.13)Total 6299 (100) 4108 (100) 1.24 (1.21-1.28) per allele

P trend = 1.30 x 10-61

Supplementary Table 5: Odds ratios corresponding to increasing numbers of risk alleles in rs4295627 (CCDC26 ), rs498872 (PHLDB1 ), rs497756 (CDKN2A, CDKN2B ), rs2736100 (TERT ), rs6010620 (RTEL1 ) and rs7124728 (11q13.4). The median number of risk alleles, 5, is used as the reference group for the odds ratios.

Page 32: University of Zurich - UZH · 2010. 11. 29. · Genome-wide association study identifies five susceptibility loci for glioma Abstract To identify risk variants for glioma, we conducted

Region Gene nsSNP Change5p15.33; rs2736100 TERT D' v rs2736100 r2 v rs2736100

rs35719940 T1062A na nars34062885 R948S na nars34094720 Y412H na nars11952056 T191S na na

SLC8A1 rs5557 V692E na na

9p21.3; rs4977756 CDKN2A D' v rs4977756 r2 v rs4977756rs45476696 S153G na nars3731249 T148A low MAF low MAFrs6413464 S127A na na

rs34170727 C124R na nars6413463 Q123H na nars35741010 T102A na nars34886500 W99R na nars11552822 Y84D na nars34968276 Q83H na nars11552823 L81P na nars36204594 V60A na nars36204273 Q58R na na

11q23.3; rs498872 PHLDB1 D' v rs498872 r2 v rs498872rs35842472 S347T na nars35219502 I477T na nars11216935 F687L na nars2298484 H868A na nars12787512 C985S na nars497554 W1192R na na

D' v rs498872 r2 v rs498872TREH rs6589669 S578* na na

rs6589670 A561P na nars6589671 A558P low MAF low MAFrs2276064 W486R low MAF low MAFrs11827611 H449Y na nars2276065 A389T na nars34978247 R140K na na

20q13.33; rs6010620 RTEL1 D' v rs6010620 r2 v rs6010620rs3848668 S124N 1 0.054rs16983886 N659K na nars16983889 I660M na nars35640778 Q684R na nars12480603 D870E na nars6089959 S951G na nars3208008 C1042Q na na

ARFRP1 rs35037612 E135K na nars35892492 K105E na nars11547196 C74W na na

ZGPAT rs1291212 R61S 0.758 0.031rs6089764 L105P na na

Page 33: University of Zurich - UZH · 2010. 11. 29. · Genome-wide association study identifies five susceptibility loci for glioma Abstract To identify risk variants for glioma, we conducted

rs17855481 R413G na na

LIME1 rs1151625 L211P 0.758 0.031

Supplementary Table 6: Correlations between the SNPs and nsSNPs mapping withgenes.

i

Page 34: University of Zurich - UZH · 2010. 11. 29. · Genome-wide association study identifies five susceptibility loci for glioma Abstract To identify risk variants for glioma, we conducted
Page 35: University of Zurich - UZH · 2010. 11. 29. · Genome-wide association study identifies five susceptibility loci for glioma Abstract To identify risk variants for glioma, we conducted

in nearby

Page 36: University of Zurich - UZH · 2010. 11. 29. · Genome-wide association study identifies five susceptibility loci for glioma Abstract To identify risk variants for glioma, we conducted

(a) (b)

(c) (d)

Supplementary Figure 1: Identification of individuals in the GWA scan of non-European ancestry. The first two principal components of the analysis were plotted. HapMap CEU individuals are plotted in red; CHB+JPT are plotted in blue; YRI individuals are plotted in black; individuals in the GWA-UK and GWA-US are plotted in grey. GWA-US controls are plotted in (a), cases in (b); GWA-UK controls are plotted in (c) and cases in (d).

Page 37: University of Zurich - UZH · 2010. 11. 29. · Genome-wide association study identifies five susceptibility loci for glioma Abstract To identify risk variants for glioma, we conducted

(a) (b)

Supplementary Figure 2: Quantile-quantile (Q-Q) plot of Cochran-Armitage test for trend for the two GWA studies. Results from UK-GWA are plotted in (a) and those from the US-GWA in (b). The black line represents the null hypothesis of no true association.

Page 38: University of Zurich - UZH · 2010. 11. 29. · Genome-wide association study identifies five susceptibility loci for glioma Abstract To identify risk variants for glioma, we conducted

Kb

30

15

cM/M

b

rs6010620rs2297440

KCNQ2

PRPF6

ARFRP1

(b) 5p15.33

0

2

4

6

8

10

12

14

16

18

1000 1100 1200 1300 1400 1500 1600 1700

0

2

4

6

8

10

12

14

16

18

130300 130400 130500 130600 130700 130800 130900 131000

0

2

4

6

8

10

12

14

16

18

21800 21850 21900 21950 22000 22050 22100 22150 22200

cM/M

b

Kb

CR600271

rs10464870

rs6470745

rs4295627

rs16904140

CCDC26 MLZE

-log 1

0P

NKD2 SLC12A7

SLC6A19 SLC6A18 TERT CLPTM1L SLC6A3 LPCAT4

rs2736100

rs2853676

Kb

cM/M

b

Kb

-log 1

0P

90

45

KbKb-lo

g 10P

RTEL1

TNFRSF6B

EEF1A2 PTK6 TPD52L2 UCKL1ZGPATSTMN3

LIME1

(a) 8q24.21

CDKN2A CDKN2B

30

15

cM/M

b

rs1063192rs2157719 rs1412829

rs4977756

(c) 9p21.3 (d) 20q13.33

-log 1

0P

rs891835

30

60

0

2

4

6

8

10

12

14

16

18

61500 61600 61700 61800 61900 62000 62100

Page 39: University of Zurich - UZH · 2010. 11. 29. · Genome-wide association study identifies five susceptibility loci for glioma Abstract To identify risk variants for glioma, we conducted

cM/M

b

Kb

(e) 11q23.3

50

KbcM

/Mb

25

0

2

4

6

8

10

12

14

16

18

70400 70420 70440 70460 70480 70500 70520 70540 70560 70580 70600

20

-log 1

0P

rs7124728

10

(f) 11q13.4

0

2

4

6

8

10

12

14

16

18

117500 117600 117700 117800 117900 118000 118100 118200 118300 118400 118500

UBE4A

rs498872

ATP5L ARCN1 TREH BCL9L

MLL PHLDB1 CXCR5DDX6

-log 1

0P

Supplementary Figure 3: Regional plots of the 6 confirmed associations (a-f). For each of the regions: (a) 8q24.21; (b) 5p15.33; (c) 9p21.3; (d) 20q13.33; (e) 11q23.3; and (f) 11q13.4 the upper panel shows single marker association statistics (as –log10 values) from the GWA-studies in blue and for the combined analysis including the replication series in red as a function of genomic position (NCBI build 36.1). Also shown are the relative positions of genes mapping to each region of association. Exons of genes have been redrawn to show the relative positions in the gene, therefore maps are not to physical scale. In the lower panel are the estimated statistics of the square of the correlation coefficient (r2), derived from HapMap. The values indicate the LD relationship between each pair of SNPs; the darker the shading, the greater extent of LD.

Page 40: University of Zurich - UZH · 2010. 11. 29. · Genome-wide association study identifies five susceptibility loci for glioma Abstract To identify risk variants for glioma, we conducted

0

1

2

3

4

5

6

7

8

9

130000 130200 130400 130600 130800 131000 131200

Location (kb)

-log 1

0P

rs1077236

rs6470745

rs4295627

Supplementary Figure 4: Imputation of 8q24.21 SNPs. In total, 894 HapMap SNPs were successfully imputed in the interval between 130,051,729 and 131,225,253bps using available SNP genotype data from GWA scans (225 SNPs). Imputed data integrity was verified where possible, by crosschecking the concordance of imputed genotypes with that of available Illumina SNP genotype data. Directly genotyped SNPs are denoted in blue, those imputed are shown in green.

Page 41: University of Zurich - UZH · 2010. 11. 29. · Genome-wide association study identifies five susceptibility loci for glioma Abstract To identify risk variants for glioma, we conducted
Page 42: University of Zurich - UZH · 2010. 11. 29. · Genome-wide association study identifies five susceptibility loci for glioma Abstract To identify risk variants for glioma, we conducted
Page 43: University of Zurich - UZH · 2010. 11. 29. · Genome-wide association study identifies five susceptibility loci for glioma Abstract To identify risk variants for glioma, we conducted

Supplementary Figure 5: 5p15.33 (TERT), 8q24.21 (CCDC26), 9p21 (CDKN2A, CDKN2B), 11q23.3 (PHLDB1) and 20q13.33 (RTEL1, ARFRP1, STMN3, SLC2A4RG, ZGPAT) copy number variation (CNV) and mRNA expression in glioma. Plots show mRNA expression stratified by CNV status based on analysis of 188 glioblastoma multiforme (GBM) tumors. Expression data was derived from the MSKCC Cancer Genomics Data Portal (http://cbio.mskcc.org/cancergenomics-dataportal/). MSKCC expression data was generated by combining GBM expression data from the The Cancer Genome Atlas. The bioconductor module CGHcall was used to determine CNV status from Agilent human genome CGH microarray data taken from TCGA data portal. It should be noted that rates of amplification and deletion for a set of randomly selected probes are 9.9% and 13.4% respectively in the tumors. Differences in the distribution of expression by SNP genotype were compared using a Wilcoxon-type test.

Page 44: University of Zurich - UZH · 2010. 11. 29. · Genome-wide association study identifies five susceptibility loci for glioma Abstract To identify risk variants for glioma, we conducted

Le

vel

of

gen

e ex

pre

ssio

n

rs2736100 genotype

TERT

rs2297440 genotype

Le

vel

of

gen

e ex

pre

ssio

n

Le

vel

of

gen

e ex

pre

ssi

on

Lev

el o

f g

ene

exp

ress

ion

Lev

el o

f g

ene

exp

ress

ion

Lev

el o

f g

ene

exp

ress

ion

Lev

el o

f g

ene

exp

ress

ion

rs6010620 genotype

rs6010620 genotype

rs6010620 genotype rs2297440 genotype

rs2297440 genotype

STMN3

SLC2A4RGSLC2A4RG

ARFRP2 ARFRP2

STMN3

P = 0.53

P = 0.25

P = 0.25

P = 0.90 P = 0.90

P = 0.60 P = 0.60

26 46 18

49 356 6

35 49

635

494935

6

635

4949 35

6

(a)

Page 45: University of Zurich - UZH · 2010. 11. 29. · Genome-wide association study identifies five susceptibility loci for glioma Abstract To identify risk variants for glioma, we conducted

Le

vel

of

gen

e ex

pre

ssi

on

Lev

el o

f g

ene

exp

ress

ion

Lev

el o

f g

ene

exp

res

sio

n

Lev

el o

f g

ene

exp

res

sio

n

Lev

el o

f g

ene

exp

ress

ion

Lev

el o

f g

ene

exp

ress

ion

Lev

el o

f g

ene

exp

ress

ion

Lev

el o

f g

ene

exp

ress

ion

rs2157719 genotype rs1412829 genotype

rs4977756 genotype rs1063192 genotype rs2157719 genotype

rs1412829 genotype rs4977756 genotype

rs1063192 genotype

CDKN2A CDKN2ACDKN2A

CDKN2A CDKN2B CDKN2A

CDKN2ACDKN2A

P = 0.91

P = 0.74

P = 0.53

P = 0.85

P = 0.39

P = 0.46

P = 0.90

P = 0.59

12 45 33 35 45 10 1042 38

39 438

12 45 33 35 45 10

1042 38 39 43

8

Page 46: University of Zurich - UZH · 2010. 11. 29. · Genome-wide association study identifies five susceptibility loci for glioma Abstract To identify risk variants for glioma, we conducted

56 293

29

6 5 4734

3355

5811 2

P = 0.89

P = 0.72 P = 0.28

P = 0.26P = 0.79

Le

vel

of

gen

e ex

pre

ssio

n

Lev

el o

f g

ene

exp

res

sio

n

Lev

el o

f g

ene

exp

res

sio

n

Lev

el o

f g

ene

exp

ress

ion

Lev

el o

f g

ene

exp

ress

ion

PHLDB1

CDKN2ACDKN2B

CDKN2B CDKN2A

rs17748 genotype

rs10965224 genotype rs10965224 genotype

rs564398 genotype rs564398 genotype

(b)

Page 47: University of Zurich - UZH · 2010. 11. 29. · Genome-wide association study identifies five susceptibility loci for glioma Abstract To identify risk variants for glioma, we conducted

107 78

822 14

6

107 78 8107

758

107 78 8

P = 0.25P = 0.98

P = 0.008

P = 0.10

P = 0.32

Lev

el o

f g

ene

exp

res

sio

n

Lev

el o

f g

ene

exp

res

sio

n

Le

vel

of

gen

e ex

pre

ssi

on

Lev

el o

f g

ene

exp

ress

ion

Lev

el o

f g

ene

exp

ress

ion

ARFRP1

STMN3 SLC2A4RG

TNFRSF6B – 1st probe TNFRSF6B – 2nd probe

rs2297441 genotype

rs2297441 genotype rs2297441 genotype

rs2297441 genotype rs2297441 genotype

Page 48: University of Zurich - UZH · 2010. 11. 29. · Genome-wide association study identifies five susceptibility loci for glioma Abstract To identify risk variants for glioma, we conducted

711

18

TNFRSF6B – 2nd probe

Lev

el o

f g

ene

exp

ress

ion

rs2738758 genotype

14 57 95

14 57 95 1454 95

P = 0.12

P = 0.362

P = 0.06

P = 0.308

Le

vel

of

gen

e ex

pre

ssi

on

Lev

el o

f g

ene

exp

ress

ion

Lev

el o

f g

ene

exp

ress

ion

ARFRP1 STMN3 SLC2A4RG

rs2738758 genotype

1457 95

P = 0.40

TNFRSF6B – 1st probe

rs2738758 genotype

rs2738758 genotype rs2738758 genotype

Supplementary Figure 6: Relationship between (a) lymphocyte and (b) normal cortical mRNA expression levels of PHLDB1 and rs17748 genotype, CDKN2A/CDKN2B and rs10965224/rs564398/rs1063192/rs2157719/rs1412829/rs49777656 genotype, ARFRP1/STMN3/SLC2A4RG/TNFRSF6B and rs2297441/rs2738758/rs2297440/rs6010620 genotype and TERT and rs2736100 genotype. Expression of PHLDB1, CDKN2A, CDKN2B, ARFRP1, STMN3, SLC2A4RG and TNFRSF6B in lymphocytes based on expression data from analysis of 90 Epstein-Barr virus–transformed lymphoblastoid cell lines using Sentrix Human-6 Expression BeadChip (Illumina, San Diego, USA)24,25. Expression of PHLDB1, CDKN2A, CDKN2B, ARFRP1, STMN3, SLC2A4RG and TNFRSF6B in normal human cortex based on expression data from analysis of 193 normal human brain samples using Affymetrix GeneChip Human Mapping 500K Arrays and Illumina HumanRefseq-8 Expression BeadChip platforms26. Expression and/or genotype information was not available for every sample and thus the total number of samples may be <193. Where SNPs genotyped in our study were not typed HAPMAP data was used to identify correlated SNPs in high LD (r2>0.8). The number of samples with each genotype is given for each plot. Combining the lowest frequency homozygote with the heterozygote and re-performing the statistical test produced no further significant results. Differences in the distribution of expression by SNP genotype were compared using a Wilcoxon-type test for trend.

Lev

el o

f g

ene

exp

ress

ion