Genetic Characteristics of Korean Patients with Autosomal...
Transcript of Genetic Characteristics of Korean Patients with Autosomal...
저 시-비 리- 경 지 2.0 한민
는 아래 조건 르는 경 에 한하여 게
l 저 물 복제, 포, 전송, 전시, 공연 송할 수 습니다.
다 과 같 조건 라야 합니다:
l 하는, 저 물 나 포 경 , 저 물에 적 된 허락조건 명확하게 나타내어야 합니다.
l 저 터 허가를 면 러한 조건들 적 되지 않습니다.
저 에 른 리는 내 에 하여 향 지 않습니다.
것 허락규약(Legal Code) 해하 쉽게 약한 것 니다.
Disclaimer
저 시. 하는 원저 를 시하여야 합니다.
비 리. 하는 저 물 리 목적 할 수 없습니다.
경 지. 하는 저 물 개 , 형 또는 가공할 수 없습니다.
의학박사 학위논문
Genetic Characteristics of Korean
Patients with Autosomal Dominant
Polycystic Kidney Disease by
Targeted Exome Sequencing
2019년 4월
서울대학교 대학원
임상의과학과 임상약리학
김 현 숙
Genetic Characteristics of Korean
Patients with Autosomal Dominant
Polycystic Kidney Disease by
Targeted Exome Sequencing
지도교수 장 인 진
이 논문을 김현숙 박사 학위논문으로 제출함
2019년 4월
서울대학교 대학원
임상의과학과 임상약리학
김 현 숙
김현숙의 박사 학위논문을 인준함
2019년 월
위 원 장 (인)
부 위 원 장 (인)
위 원 (인)
위 원 (인)
위 원 (인)
- I -
Abstract
Patients with Autosomal Dominant
Polycystic Kidney Disease by
Targeted Exome Sequencing
Hyunsuk Kim
Department of Clinical Medical Sciences
The Graduate School
Seoul National University
Autosomal dominant polycystic kidney disease (ADPKD) is a monogenic
disorder that can affect the kidneys as well as the liver and one of the main
causes of end-stage renal disease (ESRD). Genetic information is of the utmost
importance in understanding pathogenesis of ADPKD. Therefore, this study
aimed to demonstrate the genetic characteristics of ADPKD and their effects on
kidneys and liver in 749 Korean ADPKD subjects from 524 unrelated families.
Genetic studies of PKD1/2 were performed using targeted exome sequencing
combined with Sanger sequencing in exon 1 of the PKD1 gene and a multiple
ligation probe assay. Total kidney volume was calculated by ellipsoid method,
total liver volume was measured by stereology and adjusted for height (htTKV,
htTLV). Using htTLV, disease severity was classified as no or mild (htTLV
<1,600 mL/m), moderate (htTLV 1,600-2,400 mL/m), and severe PLD (htTLV>
2,400 mL/m). Associations between the severity of PKD or PLD and the type
and position of genetic mutations were analyzed using a multilevel model
controlling for family-level clustering (within-family correlations). The mutation
- II -
detection rate was 80.7% (423/524 families, 331 mutations) and 70.7% was novel.
PKD1 protein-truncating (PKD1-PT) genotype was associated with younger
age at diagnosis, larger htTKV, lower renal function compared to
PKD1-non-truncating and PKD2 genotypes. The PKD1 genotype showed
earlier onset of ESRD compared to PKD2 genotype (53.3[46.7, 58.7] vs. 63.1[54.0,
67.5] years old, P=0.003). In frailty model controlled for age, gender, and familial
clustering effect, PKD2 genotype had 0.2 times lower risk for reaching ESRD
than PKD1-PT genotype (p=0.037).
As for liver phenotype, severe PLD showed more rapid progression to
end-stage renal disease and higher mortality. Sex, age, multiple pregnancies, and
female sex hormone use were associated with moderate to severe PLD.
Family-level clustering effects existed in moderate to severe PLD implying
genetic link to PLD phenotype, however, unlike kidney, PLD severity did not
correlate with PKD1/2 genotype but the frequency of No Mutation was
increased only in mild PLD group. Additionally, the C-type lectin domain might
be a potential warm spot for moderate to severe PLD in both PT and
non-truncating PKD1 mutations.
In conclusion, the results suggest that genotyping can contribute to selecting
rapid progressors of kidneys for new emerging therapeutic interventions among
Koreans. However, additional genetic effects on PLD have to be investigated.
……………………………………………………………………………………………
keywords : Autosomal dominant polycystic kidney disease,
targeted exome sequencing, rapid progressor, moderate to severe
polycystic liver disease, Genetic variation, phenotype
Student Number : 2015-30809
- III -
CONTENTS
Abstract·…………………………………………………………………Ⅰ
Contents…………………………………………………………………III
List of tables and figures……………………………………………IV
1st part: Kidney
Introduction………………………………………………………………1
Results……………………………………………………………………3
Discussione………………………………………………………………7
Methods…………………………………………………………………12
Figure Legends and Tables·…………………………………………20
Supplementary information……………………………………………27
2nd part: Liver
Introduction………………………………………………………………67
Results……………………………………………………………………68
Discussion·………………………………………………………………74
Methods·…………………………………………………………………79
Figure Legends and Tables·…………………………………………82
Supplementary information……………………………………………90
References··……………………………………………………………99
Abstract(Korean)····…………………………………………………109
- IV -
LIST OF TABLES AND FIGURES
1st part: Kidney
Table 1. Frequency of each class and grade in 524 families………………22
Table 2. Baseline Clinical Characteristics According to PKD Genotypes…………23
Table 3. A. Multilevel multivariate linear regression of log(htTKV) or eGFR
according to genotype B. Multivariate linear regression of log(htTKV) or eGFR
for comparing each slope according to genotype……………………………………24
Supplementary Table S1. Demographic Findings of 749 Patients (524 Families)…27
Supplementary Table S2. Mutations found in this study……………………………28
Supplementary Table S2A. PKD1 protein-truncating mutations in study families·…28
Supplementary Table S2B. PKD1 in-frame insertion/deletions in study families……41
Supplementary Table S2C. PKD1 non-truncating mutations in study families………42
Supplementary Table S2D. PKD2 protein-truncating mutations in study families·…47
Supplementary Table S2E. PKD2 non-truncating mutations in study families………50
Supplementary Table S3. Design of Custom Bait·……………………………………51
Supplementary Table S4. Quality of Targeted Exome Sequencing·………………51
Supplementary Table S5. Definition of Grades··………………………………………52
Supplementary Table S6. Results from Validation Study(n=80)……………………53
Supplementary Table S7. MLPA validation using Long Range PCR sequencing…56
Figure 1. Summary of identified mutations……………………………………………25
Figure 2. Cross-sectional analysis between age and log(htTKV) or eGFR A.
Scatter plot and trend line of log (htTKV) according to genotype. B. Scatter
plot and trend line of log (htTKV) according to genotype…………………………26
- V -
Figure 3. Renal outcomes and death according to type of PKD1/2 mutation. A.
Survival analysis of ESRD progression. B. Survival analysis of all-cause
mortality………………………………………………………………………………………26
Supplementary Figure S1·…………………………………………………………………58
Supplementary Figure S2·…………………………………………………………………59
Supplementary Figure S3·…………………………………………………………………60
Supplementary Figure S4·…………………………………………………………………61
Supplementary Figure S5·…………………………………………………………………62
Supplementary Figure S6·…………………………………………………………………63
Supplementary Figure S7·…………………………………………………………………64
Supplementary Figure S8 Inverse relationship of GC content and the depth of
coverage for PKD1…………………………………………………………………………·65
Supplementary Figure S9. Normalized read depth in PKD1 and PKD2 of
mutations that detected via MLPA………………………………………………………66
2st part: Liver
Table 1. Frequency of each class and grade in 524 families………………84
Table S1. Distribution of PKD genotype according to liver cysts·…………………93
Table S2. Mutations of moderate to severe PLD families·…………………………95
Table S3. Mutations of moderate to severe PLD males·……………………………95
Figure 1. A. ESRD progression according to PLD severity B. All cause mortality
according to PLD severity…………………………………………………………………86
Figure 2. Impact of female sex hormone on severity of PLD··……………………86
- VI -
Figure 3. A. Distrubution of htTLV among siblings B. htTLV of siblings
excluding the subjects having highest htTLV C. Distrubution of htTLV among
siblings excluding famales over 45 years D. htTLV of siblings excluding the
subjects of highest htTLV and famales over 45 years·…………………………… 87
Figure 4. A. Location of mutation of PKD1 in moderate to severe PLD B.
Location of mutation of PKD1 in moderate to severe PLD…………………………89
Figure S1. Distribution of htTLV according to age and sex·………………………90
Figure S2. Distribution patterns of PKD genotypes according to severity………91
Figure S3. A. htTLV according to sex and class of PKD gene B. htTLV
according to sex and class of PKD gene………………………………………………92
Figure S4. Distribution patterns of PKD genotypes according to no liver cyst
and liver cyst groups·………………………………………………………………………94
Figure S5. Mutations of moderate to severe PLD males……………………………97
Figure S6. Diagram of study patient selection in HOPE PKD cohort……………98
1
1stpart: Kidney
INTRODUCTION
Autosomal dominant polycystic kidney disease (ADPKD) is a potentially lethal
monogenic disorder characterized by numerous cysts in the kidneys and
progressive renal failure1. Since the disease is caused by a mutation either in
PKD1 or PKD2, genetic information is of the utmost importance in
understanding the pathogenesis of ADPKD.
The ADPKD mutation database (PKDB, http://pkdb.mayo.edu/) is the largest
international repository for variants found in PKD genes. By September 2018,
1895 pathogenic mutations in the PKD1 gene and 438 in the PKD2 gene were
registered in the PKDB. However, variants registered in the PKDB mainly came
from the results of genetic studies in Western population. There have been only
a few reports that presented genetic profiles on Asian ADPKD patients, since
analyzing PKD1/2 mutations has been difficult in clinical practice. First,
two-thirds of the 5’ region of PKD1 are duplicated by gene conversion (GC) on
chromosome 16 within six pseudogenes (PKD1P1-P6)2. Because these
pseudogenes have a sequence that is 97.7% identical to the original PKD1 gene
sequence, cloning and sequencing PKD1 has been difficult with direct
sequencing. Moreover, high allelic heterogeneity of both PKD1 and PKD2
makes molecular diagnosis of ADPKD more challenging. Previously, indirect
methods such as two-dimensional gene scanning, denaturing gradient gel
electrophoresis, and single-strand conformation polymorphism analysis have been
applied to find pathogenic mutations3-6. However, these indirect methods
demonstrated false-positive results with a low detection rate compared to direct
sequencing. Therefore, a long-range polymerase chain reaction (PCR) with
subsequent nested PCR was developed to isolate the entire PKD1 gene without
2
interference from the six pseudogenes (PKD1P1-P6)7. However, direct Sanger
sequencing is time-consuming and costly. Therefore, there has been an urgent
need for faster and more efficient way to detect mutations in ADPKD.
Targeted exome sequencing (TES) using next-generation sequencing (NGS) is
one of the most suitable methods to study monogenic disorders such as
ADPKD. It has advantages over Sanger sequencing including that a large
amount of data can be analyzed automatically in a short period of time and can
be done cheaply. In addition, researchers can focus on a few genes and exons to
discover mutation spots. Therefore, TES, rather than whole exome sequencing
or whole genome sequencing, has been used more popularly as a genetic
screening tool to detect mutations in ADPKD. Since TES became available in
clinical practice for genetic diseases, the detection rate gradually improved 7.
The importance of genotype on clinical outcome has been demonstrated from
many previous studies. Cornec-Le Gall et al.8 reported the influence of PKD1
mutation type on renal survival. HALT-PKD and CRISP Investigators showed
the difference of renal survival in PKD1 NT carriers according to their
conservation and chemical change9. Previously, anecdotal report from Asian
countries reported novel mutations in Asian ADPKD populations 10-12. However,
they were either case series or included small samples. No well-designed cohort
studies have been conducted to evaluate genetic profiles and genotype-
phenotype associations in Asian populations. The coHOrt for genotype-
PhenotypE correlation in ADPKD (HOPE-PKD) is a large prospective,
multicenter Korean ADPKD cohort constructed to investigate the natural course
of the disease and genotype-phenotype associations. Demographic and clinical
information of the subjects, as well as family trees, is collected, and specimens
including DNA, serum, and urine sample are collected and stored from the
subjects once a year13. To describe the genetic characteristics of ADPKD in
3
Koreans, I performed a cohort-based genetic analysis using the HOPE-PKD
cohort.
RESULTS
Demographic Characteristics of 749 Subjects of the Korean HOPE-PKD
Cohort
Among 749 subjects, 360 (48.1%) were males and 389 (51.9%) were females.
The mean age was 46.4 ± 13.3 years (male, 44.6 ± 14.4; female, 48.0 ± 12.0; P
value for gender difference, <0.001). The mean estimated glomerular filtration
rate (eGFR) was 65.8 ± 38.9 mL/min/1.73m2 (male, 65.6±39.4; female, 66.0±38.5;
P value for gender difference, 0.896, see Supplementary Table S1 online).
Comprehensive Genetic Analysis by Combining TES, PKD1 Exon
1-Specific Sanger Sequencing, Multiple Ligation Probe Assay, and
Familial Segregation Analysis
Of the entire sample of 524 families, variants were found by TES in 471
families (89.9%). Among them, definitively pathogenic (DP), highly likely
pathogenic (HLP), or likely pathogenic (LP) variants were categorized as
mutations. Thus, mutations were found in 409 families. In 62 families with likely
neutral (LN) or indeterminate (I) variants and 53 families with no variants
(NV), subsequent PKD1 exon 1 Sanger sequencing and/or multiple ligation
probe assay (MLPA) were carried out. In LN or I families, six mutations were
additionally detected through PKD1 exon 1 Sanger sequencing, and three
mutations were detected through MLPA. In NV families, one and four mutations
were found through PKD1 exon 1 Sanger sequencing and MLPA, respectively.
Among 129 families with novel LP variants found by TES, 88 families
underwent additional familial segregation analysis. Forty families were
4
determined to have HLP mutations, and 2 families were proven to have LN
mutations. Thus, the mutation detection rate was 80.7% (423/524 families) by
TES, PKD1 exon 1 Sanger sequencing, MLPA, and familial segregation analysis
(see Supplementary Fig. S1 online).
Baseline Clinical Characteristics According to PKD Genotypes and
Summary of Identified Mutations
A total of 364 variants were found in 476 of the 524 families, and of these,
331 mutations were detected among 423 families in the HOPE-PKD cohort
(Figure 1). Among the 749 subjects, 68% had PKD1 variants, 16% had PKD2
variants, and 16% were classified as having no mutations (NM) (Figure 1A).
Excluding NM, 81% of the 749 subjects and 82% of the 524 families had PKD1
mutations (Figure 1B). Figure 1C shows the distribution of mutations by class
among the 630 subjects with mutations. The frameshift mutation (39%) was the
most common mutation class in the PKD1 genotype, while nonsense mutations
(52%) were most common in the PKD2 genotype. The prevalence of classes of
variants (number of families that shared the same variants) among the 524
families was as follows: PKD1: nonsense, 86 (19); frameshift 136 (24), missense,
90 (17); PKD2: nonsense, 39 (20); frameshift, 11 (0); missense, 12 (4) (Figure
1D, Left, Table 1). The prevalence of the grades of the variants (number of
families that shared the same variants) in the 524 families was as follows:
PKD1: DP, 248 (46); HLP, 61 (20); and LP (0), 39; PKD2, DP, 63 (22); HLP, 10
(4); and LP, 2 (0) (Figure 1D, Right, Table 1).
Comparison of Mutation Frequency in Affected Polycystin Domains
Between HOPE-PKD and other Cohorts
5
Among the 331 mutations, 97 were registered in either the PKDB or Toronto
Genetic Epidemiology Study of PKD14. Therefore 70.7% (234/331) of the
mutations were novel (see Supplementary Table S2 online). Mutations were
found more frequently in the C-type lectin and G protein-coupled receptor
proteolytic site (GPS) domains of polycystin-1 in the Korean HOPE-PKD cohort
than in those previously reported from PKDB. Likewise, mutations were detected
more frequently in the linker domain of polycystin-2 in my patients compared to
those in the PKDB. The standardized residual of the Chi-square test (cut-off of
significance, 2.58) was 2.8 in the C-type lectin of polycystin-1 and the linker
domain of polycystin-2, and 2.6 in the GPS domain of polycystin-1 (see
Supplementary Fig. S2 online).
Baseline Clinical Characteristics According to PKD Genotypes
For genotype-phenotype analysis, PKD1 or PKD2 mutations were further
classified into four PKD genotypes according to the previously described manner
14; PKD1 protein-truncating (PKD1-PT), PKD1 small (fewer than five amino
acids) in-frameshift indels (PKD1-ID), PKD1 non-truncating (PKD1-NT), and
PKD2 genotypes. A total of 630 patients in 423 families after excluding NM
were re-classified into four genotype groups: PKD1-PT, 371 (59%); PKD1-ID,
21 (3%); PKD1-NT, 119 (19%); PKD2, 119 (19%). PKD1-PT genotype was
associated with younger age at the first visit (40.2 ± 12.9), earlier onset of
hypertension (37.3 ± 10.1), younger ESRD age (52.5 ± 9.5), larger
height-adjusted total kidney volume (htTKV) (898 [476, 1546], mL/m), and lower
eGFR (63.5 ± 42.2, mL/min/1.73 m2) compared to other genotypes. The median
follow-up duration was 77.3 months (Table 2).
6
As the Mayo imaging classification changed from 1A to 1E, the proportion of
PKD1-PT genotype increased. Conversely, as the genotype changed from PKD2
to PKD1-PT, the proportion of Mayo imaging classification 1C-E increased (P
<0.001 by linear by linear test, see Supplementary Fig. S3A online). During 77.3
months of follow-up, the intraclass correlation of individuals was 0.84 [0.82,
0.86] (See Supplementary Fig. S3B online). Therefore, the Mayo imaging
classification of each subject was mostly retained over time.
HtTKV and eGFR According to PKD Genotypes
When log(htTKV) was plotted according to age, the PKD1-PT genotype
showed a similar slope to PKD1-ID and PKD2 genotypes (Figure 2A and
Table 3B). However, the PKD1-PT and PKD1-ID genotypes showed larger
htTKV values compared to PKD2 genotype (Table 3A). Overall, the PKD1-PT
and PKD2 genotypes showed a steeper slope than the PKD1-NT genotype and
NM group (Table 3B). In other words, the log(htTKV) in the PKD1-PT and
PKD2 genotypes increased rapidly as age increased, but the log(htTKV) of the
PKD1-NT genotype and NM group increased less intensely. The PKD1-PT
genotype also showed a steeper slope of eGFR than the other genotype groups,
except the PKD1-ID genotype (Figure 2B and Table 3B).
Multilevel multivariate linear regression analysis demonstrated that the
PKD1-PT genotype had a larger htTKV than the PKD1-NT or PKD2
genotypes after adjusting for age, gender, and the familial clustering effect
(Figure 2B and Table 3A). The family-level clustering effect was statistically
significant (P=0.001). The regression model of eGFR showed that the PKD1-PT
genotype was associated with a lower eGFR than PKD1-NT or PKD2
7
genotypes (Figure 2B and Table 3A). The family-level clustering effect was
also significant (P=0.029).
Incidence of ESRD and Death According to PKD Genotypes
The PKD1 genotype showed an earlier onset of ESRD than PKD2 genotype
(53.3[46.7, 58.7] vs. 63.1[54.0, 67.5], P=0.003) (see Supplementary Fig. S4 online).
Among the 630 patients, with the exclusion of the NM subjects (n=119) from
the total of 749 subjects, 122 ESRD events were observed during follow-up
(Table 2). The ESRD incidence was 21.7% (110/508) in PKD1 and 10% (12/119)
in PKD2 (P=0.003). The incidence of ESRD was lower in the PKD2 genotype
than for other genotypes (PKD2, 12 (10.1%) vs. PKD1-PT, 84 (22.6%);
PKD1-ID, 4 (19.0%); PKD1-NT, 22 (18.5%), P for trend = 0.004). A multi-level
multivariate Cox proportional hazard (frailty) model was used to compare the
effect of genotype on ESRD. When PKD1-PT genotype was used as a
reference, PKD2 genotype showed a 0.22 times lower hazard ratio than
PKD1-PT genotype. The familial clustering effect was significant (P=0.037)
(Figure 3A).
There were 23 death events during follow up. Either incidence of death or age
of death did not show a significant difference among genotypes (Table 2). In
the frailty model, the incidence was too small to find any significant difference
according to genotypes (Figure 3B).
DISCUSSION
As new therapeutic agents became available as disease-course modifying
interventions, understanding the natural course of ADPKD and identifying
8
high-risk candidates for new clinical interventions have become important.
Although defining genetic variations and their associations with phenotypes is
the first step to understand the natural course of genetic diseases, there have
been only a few reports on genetic profiles of Asian ADPKD populations4,10,11,15.
In this study, I investigated the genetic characteristics of PKD1/2 mutations
using a large prospective Korean ADPKD cohort. In brief, 331 mutations were
detected by TES using NGS combined with PKD1 exon 1 Sanger sequencing
and MLPA. The detection rate of mutations was 80.7% (423/524 families).
Among them, 70.7% of mutations were novel. The prevalence of PKD1 and
PKD2 among 524 families was 84% and 16%, respectively. Mutations were
more frequently found in the C-type of lectin and GPS domains of polycystin-1
and the linker domain of polycystin-2 among the Korean ADPKD population
than among Western populations.
TES using NGS was applied in this study. There was a concern about the
decreased efficacy of TES in mutation detection because of large amount of
pseudogenic lesion in PKD1. However, demonstrated from the milestone study
by Trujillano and colleagues16, NGS was not inferior in mutation detection and
could be a possible substitute for Sanger sequencing because Sanger sequencing
itself is expensive and time consuming. NGS is economically feasible,
convenient, and has the possibility of automation. However, low or high GC
content can reduce the mutation detection rate by NGS by affecting the probe
hybridization and PCR amplification steps. The false negative rate is expected to
be elevated for PKD1 exon 1, which has many GC-rich regions. Moreover,
large deletions can easily be missed by NGS. In this study, I tried to
compensate for these limitations by exon-specific Sanger sequencing of PKD1
exon 1 and MLPA. However, the detection rate of in my study was lower than
that with Sanger sequencing with long-range PCR11. One possible reason for the
9
low detection rate is that I performed target enrichment with whole genomic
DNA instead of pure PKD1 and PKD2 genomic regions. A previous study
suggested using PCR to generate locus-specific amplicons of PKD1 before
sequencing. Therefore, refining the NGS pipeline to increase sequencing depth
through locus-specific PCR amplicon enrichment should be considered to improve
the detection rate. Another possible explanation is that my population may be
enriched with novel Asian-specific missense variants with unknown significance.
Therefore, further functional analysis and Sanger sequencing of PKD1/2 should
be considered in the NM cases to further improve the detection rate.
The proportions of PKD1 and PKD2 were similar to those from previous
data7. However, the mean age of ESRD in Korean patients with ADPKD was
52.6 years for the PKD1 genotype and 60.5 for the PKD2 genotype, which
seemed to be younger than that of Westerners, especially for PKD2 subjects.
This discrepancy may be related to selection bias, because my study only
included ADPKD patients from tertiary hospitals. This means that milder cases
who tended to visit primary clinics could have been excluded from this analysis.
Nonetheless, further study is needed to evaluate factors causing faster renal
progression in my ADPKD patients compared to those from other ethnic
populations.
Excluding NM, 331 mutations were ultimately detected in the current study.
When compared with those reported in other large cohort studies, the
distribution of the mutation classes was similar to those presented in previous
studies. Interestingly, my study revealed a high proportion of novel variants.
This can be explained by the founder effect or repetitive mutations, as a recent
Chinese study also showed similar findings (personal communication).
As part of that effort, I compared the frequency of mutations in certain
domains of polycystin-1 or polycystin-2 with the results from the PKDB for the
10
C-type lectin and GPS domains of polycystin-1 and the linker domain of
polycystin-2. The C-type lectin domain of polycystin-1 (protein product of
PKD1) binds carbohydrate matrices in vitro and may be involved in protein
carbohydrate interactions in vivo. Polycystin-1 undergoes cleavage at the
juxtamembrane GPS, a process likely to be essential for its biological activity.
This GPS is essential for the intracellular transport. Finally, the interaction of
polycystin-2 with polycystin-1 is mediated by the C-terminal coiled-coil domain
of polycystin-2, and the linker region between the EF-hand and coiled-coil
domains is not sufficient to mediate the association by itself. The presence of
the N-terminal linker to the polycystin-2 coiled-coil domain allows a tighter
association. The implications of these findings for the prognosis for Asian
patients with ADPKD should be clarified in the future.
This study enrolled patients with typical ADPKD imaging findings if the
subjects had no familial history. Therefore, the NM group seems to be an entity
that shares other clinico-genetic characteristics. Cases of NM were not directly
included in the analysis, but most of the patients with NM showed similar or
milder clinical features compared to PKD2 genotype, suggesting that they are
groups with mutations in regions such as introns or mosaicism, which are
difficult to detect using TES. Additionally, Other genes (i.e. GANAB, DNAJB11)
involved in atypical presentations of ADPKD could have been part of the
targeted NGS approach. Further studies to clarify the meaning of NM are
warranted.
I found families that had two or more novel missense variants. In the absence
of a functional assay, assessing the putative pathogenicity of missense variants
is not reliable. To validate the clinical pathogenicity of missense variants, I
analyzed family-based segregation studies including at least one affected and/or
one unaffected member in each family. In most of the cases, family-based
11
segregation analysis was very helpful to discriminate the pathogenicity of
missense variants. However, for a complete segregation study, it is necessary to
investigate all household members accessible through the family tree. Complete
segregation can be determined especially clearly when the affected subject is
examined together with both parents, which is not always possible. When the
affected subject is examined with only his/her siblings or children, the power to
completely discriminate the pathogenicity of the variant may be lowered.
However, I made every effort to overcome these limitations by enrolling
additional families until the pathogenicity of the variants was clearly determined.
My study showed that PKD1-PT genotype was the most common genotype
(59%). The distribution of each genotype was similar to what has been reported
in previous studies7. Subjects with the PKD1-PT genotype had an early onset
of hypertension, larger htTKV, lower eGFR, and more frequent ESRD incidence
than other genotypes. The family-level clustering effect for log(htTKV) and
ESRD onset was significant. However, it was also noteworthy that there existed
discrepancies between genotypes and phenotypes. For example, the proportions of
mild phenotypes of Mayo class 1A and 1B were 7% and 25% in individuals with
the PKD1-PT genotype, which suggests that other modifying factors may exist.
A strength of the current study is that it is unique as it represents the first
large cohort analysis of an Asian population. This study reveals many novel
mutations to the PKD1 and PKD2 genes as well as novel mutation enrichment
sites within the ADPKD proteins. My study made efforts to overcome the
shortcomings of NGS by performing PKD1 exon-specific Sanger sequencing,
MLPA, and a familial segregation study. In addition, the HOPE-PKD cohort is
fairly well designed and has a substantial number of subjects. Nonetheless, there
are some limitations. Since this study was conducted on patients in tertiary
hospitals in South Korea, there is a possibility of selection bias that only severe
12
patients come to the hospital. Thorough evaluation and collection of family
history was difficult since not all the members were registered at the hospital.
Therefore, it was difficult to investigate the exact family-level clustering effect.
Moreover, I could not find familial clustering effects for ESRD or mortality,
because of the small number of events. Due to the lost to follow-up of the
patient and the absence of DNA samples, I were unable to perform PKD1/PKD2
LR-PCR/Sanger sequencing for the subjects with NM. Therefore, I could not
present the exact specificity parameter of NGS. In addition, I used the ellipsoid
equation to measure TKV instead of stereology, which is less accurate.
In conclusion, I performed a comprehensive genotype study using one of the
largest Asian ADPKD cohorts, HOPE-PKD, and demonstrated the characteristics
of variants and mutations in Korean patients. The results also strongly suggest
that an analysis of the ADPKD genotype can contribute to the selection of
rapidly progressing patients for new emerging therapeutic interventions. In
regard to the prognostic score combining genetic and clinical information, the
PRO-PKD score has been developed earlier for the European. Likewise, the
development of a score suitable for Koreans is deemed necessary in future
studies.
METHODS
Subjects
A total of 866 subjects from 641 unrelated families registered in the
HOPE-PKD cohort between 2009 and 2016 were screened. All methods were
carried out in accordance with relevant guidelines and regulations. The study
protocol was approved by the Institutional Review Board of Seoul National
13
University Hospital (IRB No; 1205-112-411). The informed consent was obtained
from all subjects before performing study.
Subjects aged over 18 who had compatible imaging findings using the Unified
criteria17 with or without family history were enrolled in HOPE-PKD cohort.
Among the screened patients, 117 patients were excluded from the analysis
including those who had active cancer (n=4) or liver cirrhosis (n = 4) and
chronic HBV or HCV hepatitis (n = 18) that may change liver volume or affect
renal function and those who had no kidney or liver volume data available
(n=91, see Supplementary Fig. S5 online). A total of 749 subjects from 524
unrelated families were included in the analysis.
Data Collection
Baseline demographic profiles were collected including age, gender, height,
comorbidities (hypertension, diabetes, chronic liver disease, and cancer), familial
history of ESRD and their age at the diagnosis of ESRD. Hypertension was
considered present when the subject had either systolic blood pressure > 140
mmHg or diastolic blood pressure > 90 mmHg or was receiving treatment with
antihypertensive medications.
Serum creatinine was measured using the Jaffe method and traced using
isotope dilution mass spectrometry. The Chronic Kidney Disease Epidemiology
Collaboration formula was used to calculate eGFR18. ESRD was defined as either
an eGFR < 15 mL/min/1.73m2 or initiation of renal replacement therapy.
Participants underwent non-contrast computed tomography (CT) exams every 2
to 3 years using multi-detector CT scanners. TKV was measured by the
ellipsoid method an dadjusted to height19 by a single skilled technician.
14
Variant Screening of PKD1 and PKD2 by TES
Genetic analysis of the PKD1 and PKD2 genes was performed using TES.
For cases with inadequate results, additional Sanger sequencing for exon 1 of
PKD1 gene and MLPA were done. For determining pathogenicity, familial
segregation analysis was performed.
For TES, a customized 120-mer RNA bait was designed to capture all exons
of the PKD1 and PKD2 genes and their flanking intronic sequences16 (see
Supplementary Table S3 and S4 online). The bait was tiled 3 times to increase
the coverage of poorly covered regions. For the generation of standard target-all
exon capture libraries, the Agilent SureSelect™ target enrichment protocol for
the Illumina paired-end sequencing library (version B.3, June 2015) was used
together with an input of 500 ng of genomic DNA and sequenced using the
HiSeq™ 2500 platform (Illumina, San Diego, CA, USA). In all cases, the
SureSelectXT Custom probe set was used.
Sequencing reads were first aligned with the human genome reference
sequence (hg19) using BWA version 0.7.5a with the MEM algorithm (default
options). To minimize false-positive and false-negative calls, I only used
uniquely mapped reads that were properly paired to avoid mapping reads that
aligned to both target regions and pseudogenes. In addition, only selected reads
with a high mapping quality (>30) were used. SAMTOOLS version 1.2 and
Picard version 1.127 (http://picard.sourceforge.net) were used to process
SAM/BAM files to duplicate the marking. Specifically, local realignment to
reduce any misalignment in the duplicated regions was performed.
I used RealignerTargetCreator and IndelRealigner from GATK version 3.3-1
with known single-nucleotide polymorphisms (SNPs) and indels from dbSNP142,
Mills and 1000G gold-standard indels at hg19 sites, and 1000G phase 1 indels at
15
hg19 sites. Known SNPs and indels were also used to perform a base
calibration. For calling variants, Unified Genotyper was used, and the called
variants were recalibrated by GATK based on dbSNP142, Mills indels, HapMap,
and Omni. ANNOVAR was used to annotate the variants. The depth of
coverage approach was applied to detect any large rearrangement or copy
number variations. The CalculateHsMetrics module in PICARD was used to
calculate the statistics of each parameter of the samples and depth of each exon,
and the copy number ratio was characterized by the normalized depth of the
exons from all the patients (see Supplementary Fig. S6 online).
Prioritization of Detected Variants
To identify causal variants, I first selected exonic and splicing variants
including nonsynonymous variants and small indels. Variants with an allele
frequency over 1% were discarded based on NHLBI-ESP 6500, the 1000 Genome
Project, and an in-house database consisting of exomes of 192 (Samsung
Medical Center, Korea) and 397 (Korean Bioinformation Center, Korea) normal
Korean individuals. Variants that were not reported in dbSNP142 were included,
and those with low-quality reads (<20) and low genotype quality (<30) were
excluded (see Supplementary Fig. S6 online).
The detected variants were classified into seven categories of nonsense,
frameshift, typical splicing, large deletion, missense, small in-frame indel and
NV. Seven classes of variants were further divided into five grades according to
their pathogenicity: DP, HLP, LP, LN, and I. The DP variants were defined as
all the variants that result in protein truncation, including nonsense, frameshift,
typical splicing, and large deletion variants or those previously established as DP
in PKDB or previous articles. The HLP variants were defined as missense
16
variants or small in-frame indels that were previously established as HLP in
PKDB or previous articles. The LP variants were defined as missense variants
or small in-frame indels that were previously established as LP in PKDB or
previous articles or variants that were predicted to be damaging using in-silico
analysis and segregated within a family. Variants were thought to be damaging
if the results of in-silico analysis satisfied two of the following conditions: 1)
damaging as predicted by a SIFT score ≤0.05, 2) damaging as predicted by
Polyphen-2 (HumDiv), and 3) GERP ++ score ≥4. The definitions of grades are
described in Supplementary Table S5 and Supplementary Fig. S7 online.
For genotype-phenotype analysis, PKD1 or PKD2 mutations were further
classified into four PKD genotypes according to the previously described manner
14;PKD1-PT, PKD1-ID, PKD1-NT, and PKD2 genotypes.
Validation of variants detected by TES using LR-PCR/Sanger sequencing
The validation set (n=80) was developed to confirm the variants that was
discovered by NGS approach by LR-PCR/Sanger sequencing. The pathogenic
variants detected by TES were confirmed by direct sequencing of the
corresponding gene regions using the ABI3730xl Genetic Analyzer (Applied
Biosystems, Foster City, CA). For the duplicated part of PKD1, long-range PCR
followed by nested-PCR was performed using PKD1-specific primers. The
results are summarized in the Supplementary Table S6 online.
Among 80 variants, 74 were PKD1 variants and 6 were PKD2. Among 74
PKD1 variants, 70 variants were detected in duplicated region (exon 1-33 in
PKD1 gene). I performed target specific Sanger sequencing for all 80 variants
found by NGS approach. Among them, 79 variants were confirmed by Sanger
sequencing (98.8%). The unconfirmed variant was a missense variant c.C2081T
17
(p.Pro694Leu) in exon 10 of PKD1. Although I did not perform LR-PCR/Sanger
sequencing for whole PKD1/PKD2 regions, I believe TES is efficient in variant
detection even in the duplicated region.
Nevertheless, there were two challenges using NGS approach. The first
problem was low read depth or coverage of exon 1 of PKD1 gene. This resulted
from high GC contents of exon 1 compared to other exons (See Supplementary
Figure S8). To overcome the problem, I performed additional exon 1 Sanger
sequencing of PKD1 gene and found seven additional variants. The other
problem was pseudogenic issue. In addition to validation study, I discarded
variants with high allele frequency > 1%. The average allele frequency of the
variants detected in the pseudogenic region was 3.7 * 10-3. Since I performed
variant calling only for the mapping quality > 20, the mapping quality of the
reads simultaneously aligned to PKD1 and PKD1P1 to 6 was filtered out to 0,
and variant calling was performed only using PKD1 unique mapped read.
Therefore, it is difficult to know pseudogene variants with my analysis method.
I have examined the low VAF mutations in the NGS data, but it is difficult to
know and verify whether variants are also in PKD1P1 to 6.
Improving Mutation Detection Rate by Exon 1-Specific Sanger Sequencing
and MLPA
The subjects with LN or I variants or those in whom NV was detected
underwent Sanger sequencing of PKD1 exon 1 and MLPA. Firstly, Sanger
sequencing of PKD1 exon 1 was performed because variants might not be
detected by NGS owing to low coverage of exon 1. For the duplicated part of
PKD1, long-range PCR followed by nested PCR was performed using
PKD1-specific primers as described before with minor modifications 20,21. In
18
brief, 300-400 ng of genomic DNA was amplified in a final volume of 50 μL,
containing, 0.2 μmol/L of each primer (Cosmogentech, Korea), 0.5 mol/L betaine
(Sigma, USA), 5% dimethyl sulfoxide (Sigma, USA) in the AccuPower ProFi
Taq PCR Premix (Bioneer, Korea) as a Touchdown protocol. The long-range
PCR products were purified with the Qiaquick PCR purification kit and were
further amplified using a nested PCR. The long-range PCR for the PKD1 exon
was done using the AccuPower HotStart PCR Premix (Bioneer, Korea). Sanger
sequencing was performed using the ABI3730xl Genetic Analyzer (Applied
Biosystems, Foster City, CA, USA). In addition, MLPA was performed to detect
large deletion cases if both TES and PKD1 exon 1 sequencing revealed NM.
MLPA analysis was performed with a SALSA MLPA KIT P351-B1/P352-B1
PKD1-PKD2 kit (MRC-Holland, Amsterdam, the Netherlands) according to the
manufacturer’s instructions 12 and single probe exon deletions via MLPA was
confirmed by direct sequencing that they are not due to SNP under the probe
(for the primer, see Supplementary Table S7 online). Finally, I looked into 7
variants found through MLPA. The read depth did not show any particular
problems for the cases. However, only 3 out of 6 variants were detected. The
Normalized read depth in PKD1 and PKD2 of mutations that detected via MLPA
was shown in Supplementary Figure S9 online. It was not a matter of read
depth but a technical limitation of NGS approach in the case of large deletion.
Familial Segregation Analysis
When a novel LP variant appeared after variant calling, familial segregation
analysis was performed to confirm pathogenicity of the variant. Family members
of the subject with LP variant were additionally enrolled for familial segregation
study. The parents of the subjects were recruited first. If the parents were
absent, siblings with and without ADPKD were recruited for segregation
19
analysis. The presence or absence of ADPKD was screened with portable
sonography (SONON 300C, Healcerion, Korea), and additional CT scans were
performed in non-diagnostic cases. For familial segregation analysis, I utilized
both direct Sanger sequencing and TES.
Statistical Analysis
Statistical analyses were performed using SPSS version 23.0 (IBM Corp.,
Armonk, NY, USA). Analysis of covariance, the Mann-Whitney test, and the
chi-square test were performed to compare variables between two groups. P
values <0.05 were considered statistically significant. A multilevel linear
regression model in Stata 14 (StataCorp, College Station, TX, USA) was
performed to identify the correlations between htTKV or eGFR and PKD
genotypes. To control the clustering effect, variance estimation was also done
by the clustered sandwich estimator. For survival analysis, multivariate analysis
using a multilevel Cox proportional hazard model (frailty model) adjusting for
age, gender, and the family-level clustering effect was performed to determine
risk factors for ESRD or mortality. The linearity assumption of continuous
variables was verified, and proportional hazards assumption of categorical
variables was verified using a log-minus-log plot.
20
FIGURE LEGENDS and TABLES
Figure 1. Summary of identified mutations
A. Prevalence of PKD1 vs. PKD2 mutations and cases of no mutation
among 749 subjects (524 families); B. Prevalence of PKD1 vs. PKD2
mutations among 630 subjects (423 families without NM families [n=101]);
C. Frequency of each class in 423 families according to PKD1/2 status;
D. Frequency of each mutation class and grade in 524 families.
Of the 749 subjects, 68% were classified as PKD1, 16% as PKD2, and 16%
as no mutation (NM). Excluding NM subjects, 81% of the 630 subjects and 82%
of the 423 families had PKD1 mutations. The mutation classes in the 423
families that were obtained by excluding the 101 NM families from the original
524 families are shown in Fig. 1C. In PKD1, frameshift mutations were the
most common (39%), and in PKD2, nonsense mutations were the most common
(52%). The prevalence of classes of variants (number of families that shared the
same variants) among the 524 families was shown in Fig 1D.
Abbreviations. DP, definitively pathogenic; HLP, highly likely pathogenic; LP,
likely pathogenic; LN, likely neutral; I, indeterminate; NV, no variant.
Figure 2. Cross-sectional analysis between age and log(htTKV) or eGFR
A. Scatter plot and trend line of log (htTKV) according to genotype. B.
Scatter plot and trend line of log (htTKV) according to genotype.
PKD1-PT genotype had a larger htTKV or lower eGFR than the PKD1-NT
or PKD2 genotypes. log(htTKV) increased rapidly with age in the PKD1-PT
and PKD2 genotypes, but it did so less intensely in the PKD1-NT genotype
21
and those with NM. The PKD1-PT genotype also showed a steeper slope for
eGFR than the other genotypes except PKD1-ID genotype.
Abbreviations. htTKV, height-adjusted total kidney volume; eGFR, estimated
glomerular filtration rate; PKD1-PT, PKD1 protein-truncating; PKD1-NT,
PKD1 non-truncating; PKD1-ID, PKD1 small in-frameshift indels.
Figure 3. Renal outcomes and death according to type of PKD1/2
mutation. A. Survival analysis of ESRD progression. B. Survival analysis
of all-cause mortality
When PKD1-PT genotype was used as a reference, the PKD2 genotype
showed a 0.22 times lower HR of ESRD. The familial clustering effect was
significant (P=0.037). The incidence of death did not have difference according to
genotypes.
Abbreviations. ESRD, end-stage renal disease; PKD1-PT, PKD1
protein-truncating; PKD1-NT, PKD1 non-truncating; PKD1-ID, PKD1 small
in-frameshift indels; HR, hazard ratio; CI, confidence interval.
22
Table 1. Frequency of each class and grade in 524 families
A. Class Nonsense
Frameshift
Typ icalsplicing
L a r g edel/dup
Missense
In-frameindel NV
■ PKD1 86 (19) 136 (24) 16 (3) 8 (0) 90 (17) 12 (3)
■ PKD2 39 (20) 11 (0) 8 (1) 5 (1) 12 (4) 0 (0)
■ NM 53 (20) 48
Total 125 (39) 147 (24) 24 (4) 13 (1) 155 (41) 12 (3) 524 (112)
B. Grade DP HLP LP LN I NV
■ PKD1 284 (46) 61 (20) 39 (0)
■ PKD2 63 (22) 10 (4) 2 (0)
■ NM 39 (19) 14 (1) 48
Total 311 (68) 71 (24) 41 (0) 39 (19) 14 (1) 524 (112)
The number in () means the number of families that shared the same variants
Abbreviation. DP, definitively pathogenic; HLP, highly likely pathogenic; LP, likely
pathogenic; LN, likely neutral; I, indeterminate; NV, no variant
23
Table 2. Baseline Clinical Characteristics According to PKD Genotypes
Variables Total PKD1-PT PKD1-ID PKD1-NT PKD 2 P value
N (%) 630 371 (59.0) 21 (3.0) 119 (19.0) 119 (19.0)
Male, n (%) 304 (48.3) 175 (47.2) 12 (57.1) 68 (57.1) 49 (41.2) 0.975
Age, [mean±SD] 45.6±12.9 44.2±12.9 45.0±10.6 43.3±12.6 48.9±13.0 0.001
1st visit age, 569,
[mean±SD]41.9±12.9 40.2±12.9 39.8±10.0 43.3±12.6 45.9±13.0 <0.001
Hypertension, 590,n (%)
451 (71.6) 266 (71.7) 16 (80.0) 88 (77.2) 81 (75.7) 0.969
Hypertension Dx age,
316,[mean±SD]38.8±10.0 37.3±10.1 38.6 ±7.4 39.0 ± 8.9 43.4±10.5 <0.001
ESRD 122 (19.4) 84 (22.6) 4 (19.0) 22 (18.5) 12 (10.1) 0.004
ESRD age, 122,
[mean±SD]53.3±9.7 52.5±9.5 54.5±14.3 52.5±7.7 60.5±10.7 0.085
Death 23 (3.7) 13 (3.5) 1 (4.8) 5 (4.2) 4 (3.4) 0.908
Death age, 23,
[mean±SD]62.8±13.8 61.4±16.2 63.4 63.9±9.1 65.7±14.6 0.745
htTKV,
[median [IQR]]
781
[432, 1356]
898
[476, 1546]
901
[544, 1271]
609
[397, 1179]
629
[373, 1160]<0.001
eGFR by CKD-EPI,
620, [mean±SD]65.9±40.0 63.5± 42.2 64.7±38.3 66.6±37.8 73.1±33.8 0.052
*Grade,
[n(%)]
1A 44 (7.0) 18 (4.9) 1 (4.8) 14 (11.8) 11 (9.2) <0.001
1B 158 (25.1) 70 (18.9) 2 (9.5) 35 (29.4) 51 (42.9)
1C 206 (32.7) 127 (34.2) 11 (52.4) 36 (30.3) 32 (26.9)
1D 153 (24.3) 99 (26.7) 7 (33.3) 25 (21) 22 (18.5)
1E 69 (11.0) 57 (15.4) 0 (0) 9 (7.6) 3 (2.5)
P for trend by the Jonckheere-Terpstra test
The eGFR of subjects with dialysis or kidney transplantation was regarded as 15
mL/min/1.73 m2.
Abbreviations. PT, protein-truncating; ID, insertion/deletion; NT, non-truncating, SD,
standard deviation; Dx, diagnosis, ESRD, end-stage renal disease; eGFR, estimated
glomerular flow rate; CKD-EPI, Chronic Kidney Disease Epidemiology Collaboration
24
Table 3. A. Multilevel multivariate linear regression of log(htTKV) or
eGFR according to genotype B. Multivariate linear regression of
log(htTKV) or eGFR for comparing each slope according to genotype
A. Log(htTKV) eGFR
*Variable Coefficient [95% CI] P value Coefficient [95% CI] P value
Age 0.012 [0.010, 0.0.013] <0.001 -1.840 [-2.012, -1.668] <0.001
Male 0.102 [0.062, 0.143] <0.001 -6.535 [-11.030, -2.041] <0.001
PKD1 PT reference reference
PKD1 ID -0.033 [-0.186, 0.120] 0.670 2.668 [-112.914, 17.250] 0.720
PKD1 NT -0.153 [-0.220, -0.085] <0.001 7.278 [0.521, 14.034] 0.035
PKD2 -0.161 [-0.232, -0.090] <0.001 18.476 [11.469, 25.482] <0.001
Constant 2.378 [2.297, 2.458] <0.001 148.405 [139.565, 157.246] <0.001
B. b t P value b t P value
PKD1 PT vs. PKD1 ID -0.02 -0.103 0.918 -0.114 -0.671 0.502
PKD1 PT vs. PKD1 NT -0.479 -3.189 0.002 0.393 2.891 0.004
PKD1 PT vs. PKD2 -0.505 -4.036 <0.001 0.428 3.659 <0.001
PKD1 PT vs. NM -0.719 -4.785 <0.001 0.622 4.507 <0.001
PKD1 ID vs. PKD1 NT -0.453 -1.032 0.304 0.667 1.68 0.095
PKD1 ID vs. PKD2 0.035 0.086 0.931 0.545 1.524 0.13
PKD1 ID vs. NM -0.713 -1.47 0.144 0.718 2.351 0.02
PKD1 NT vs. PKD2 0.601 2.476 0.014 -0.26 -1.156 0.249
PKD1 NT vs. NM -0.244 -0.936 0.35 0.242 1.006 0.316
PKD2 vs. NM -0.874 -3.614 <0.001 0.544 2.448 0.015
* adjusted for age and gender
• P value of Log(htTKV) for familial influence = 0.001
• P value of eGFR for familial influence = 0.029
** Dependent variable=Log(htTKV), eGFR
• By linear regression for verifying moderating effect adjusting for age and
gender
• b is the linearity between the two genotypes, and when |b|>1, it is significant;
and t is the significance of b coefficient, and when |t| is greater, it is more
significant.
25
Figure 1
26
Figure 2
Figure 3
27
SUPPLEMENTARY INFORMATION
Supplementary Table S1. Demographic Findings of 749 Patients (524
Families)
Variable Total Male Female P value
N (%) 749 360 (48.1) 389 (51.9)
Age [mean±SD] 46.4±13.3 44.6±14.4 48.0±12.0 <0.001
Age groups, n(%)
~30 88 (11.8) 62 (17.2) 26 (6.7) 0.001
30~40 149 (19.9) 69 (19.2) 80 (20.6)
40~50 210 (28.0) 93 (25.8) 117 (30.1)
50~60 183 (24.4) 86 (23.9) 97 (24.9)
60~70 95 (12.7) 41 (11.4) 54 (13.9)
70~ 24 (3.2) 9 (2.5) 15 (3.9)
eGFR, [mean±SD] 65.8±38.9 65.6±39.4 66.0±38.5 0.896
CKD stages, n(%) 0.945
Stage I 233 (32.6) 107 (31.1) 126 (34.0)
Stage II 198 (27.7) 96 (26.7) 102 (27.5)
Stage IIIa 85 (11.9) 42 (11.7) 43 (11.6)
Stage IIIb 40 (5.6) 21 (5.8) 19 (5.1)
Stage IV 22 (3.1) 12 (3.3) 10 (2.7)
Stage V 137 (19.2) 66 (18.3) 71 (19.1)
eGFR of subjects with RRT or kidney transplantation was regarded as 15 mL/min/
1.73 m2.
Abbreviations. SD, standard deviation; eGFR, estimated glomerular flow rate
28
Family ID Exon Codon cDNA change Protein change Predictedeffect
Domain PKDB CanadaDB
Koreanindividuals
Koreanfamilies
716 1 5 c.13_25del p.Ala5TrpfsX53 Frameshift LRR 1 0 1 1
61 1 9 c.25_26dup p.Ala9TrpfsX64 FrameshiftExtracellularsequence
0 0 1 1
126 1 15 c.43_62del p.Leu15AlafsX92 Frameshift Extracellularsequence 0 0 3 1
57 1 33 c.78_96dup p.Cys33ArgfsX87 Frameshift LRR 0 0 1 1
473, 886 1 44 c.129_135delCGGCGCC p.Gly44ProfsX27 Frameshift LRR 0 0 2 2
904 1 48 c.142del p.Arg48AlafsX25 Frameshift LRRCT 0 0 1 1
260 1 56 c.165_171delGCTGCGG p.Leu56ArgfsX15 Frameshift LRR 1 0 1 1
185 4 123 c.369del p.Ser123ArgfsX167 Frameshift Extracellularsequence 0 0 1 1
224 4 142 c.423_424del p.Glu142AlafsX36 Frameshift LRRCT 0 0 2 1
415 5 275 c.822_832del p.Ala275GlyfsX92 Frameshift PKD 1 0 0 1 1
181, 478,657 5 286 c.856_862del p.Gly286X Frameshift PKD 1 1 3x 3 3
570 6 442 c.1324del p.Ala442ProfsX23 Frameshift C-type lectin 0 0 1 1
Supplementary Table S2. Mutations found in this study
Supplementary Table S2A. PKD1 protein-truncating mutations in study families
29
16 7 499 c.1495del p.Glu499SerfsX59 Frameshift C-type lectin 0 0 3 1
67 8 557 c.1669_1670del p.Leu557AspfsX29 FrameshiftExtracellularsequence
0 0 1 1
649 9 592 c.1776del p.Glu592AsnfsX192 Frameshift Extracellularsequence
0 0 1 1
599 10 630 c.1889dup p.Asp630GlyfsX83 FrameshiftExtracellularsequence
0 0 1 1
229 10 668 c.2001dup p.Gly668TrpfsX46 FrameshiftLDL-receptor
class A;aTypical
0 0 2 1
363 10 681 c.2040dup p.Ala681CysfsX33 FrameshiftExtracellularsequence 0 0 1 1
302, 414 10 696 c.2085dup p.Ala696ArgfsX18 Frameshift Extracellularsequence 0 0 2 2
793 11 797 c.2390dup p.Val797GlyfsX19 Frameshift PKD 2 0 0 1 1
611 11 824 c.2470del p.Leu824CysfsX74 Frameshift Extracellularsequence 0 0 2 1
79, 204 11 832 c.2494dup p.Arg832ProfsX40 Frameshift Extracellularsequence 0 0 2 2
252 11 873 c.2618_2621del p.Val873AlafsX24 Frameshift PKD 3 1 1x 5 1
20 11 887 c.2659del p.Trp887GlyfsX11 Frameshift PKD 3 0 0 12 1
73 11 906 c.2716del p.Glu906SerfsX21 Frameshift PKD 3 0 0 2 1
110 11 928 c.2784_2796del p.Glu928ValfsX18 Frameshift PKD 3 0 0 1 1
541 12 955 c.2865del p.Val955TrpfsX20 Frameshift PKD 4 1 0 1 1
318 12 966 c.2896del p.Arg966GlyfsX10 Frameshift PKD 4 0 0 1 1
30
197 13 1029 c.3085_3088del p.Ala1029CysfsX8 Frameshift PKD 5 0 0 1 1
146 14 1075 c.3223_3226del p.Phe1075ArgfsX28 Frameshift PKD 5 0 0 1 1
54 15 1108 c.3323del p.Ser1108LeufsX7 Frameshift PKD 5 0 0 5 1
4 15 1117 c.3349dup p.Gln1117ProfsX19 Frameshift PKD 5 0 0 3 1
748 15 1168 c.3503_3504del p.Pro1168ArgfsX42 Frameshift PKD 6 1 0 1 1
900 15 1174 c.3520_3521del p.Gln1174AlafsX36 Frameshift PKD 6 0 0 1 1
245 15 1205 c.3613_3622del p.Asp1205SerfsX12 Frameshift PKD 6 0 0 3 1
160 15 1228 c.3684del p.Val1228TrpfsX44 Frameshift PKD 7 1 0 2 1
47 15 1248 c.3744_3754del p.Asp1248AlafsX48 Frameshift PKD 7 0 0 2 1
265 15 1303 c.3906dup p.Ala1303ArgfsX8 Frameshift PKD 8 0 0 1 1
209 15 1347 c.4041_4042del p.His1347GlnfsX83 Frameshift PKD 8 1 X1 1 1
365, 887 15 1357 c.4069del p.Leu1357TrpfsX9 Frameshift PKD 8 1 0 2 2
76 15 1357 c.4070del p.Leu1357ArgfsX9 Frameshift PKD 8 1 0 5 1
15 15 1439 c.4315_4324delinsCAACTGTTG p.Gly1439GlnfsX5 Frameshift PKD 9 0 0 3 1
279 15 1460 c.4379_4380del p.Val1460GlyfsX62 Frameshift PKD 9 0 0 1 1
368 15 1467 c.4401_4407del p.Val1467AlafsX64 Frameshift PKD 9 0 0 1 1
369 15 1478 c.4434del p.Leu1478TrpfsX55 Frameshift PKD 8 0 0 1 1
888 15 1529 c.4586del p.Gly1529AlafsX5 Frameshift PKD 10 0 0 1 1
31
732 15 1551 c.4650dup p.Leu1551AlafsX27 Frameshift PKD 11 0 0 1 1
21 15 1560 c.4679_4691del p.Val1560GlyfsX3 Frameshift PKD 11 0 0 4 1
199 15 1599 c.4797del p.Tyr1599X Frameshift PKD 11 0 novel 1 1
500 15 1642 c.4924del p.Arg1642AlafsX80 Frameshift REJ 0 0 1 1
1, 33, 336,457, 542,562, 575,635, 666
15 1672 c.5014_5015del p.Arg1672GlyfsX98 Frameshift PKD 12 1 29x 17 9
147, 553 15 1841 c.5517_5521dup p.Val1841GlyfsX110 Frameshift PKD 14 0 0 2 2
901 15 1877 c.5629del p.Ala1877ProfsX72 Frameshift PKD14 0 0 1 1
582 15 2075 c.6224_6225insGTTG p.Ser2075LeufsX6 Frameshift PKD 11 0 0 1 1
364 15 2114 c.6341_6344del p.Tyr2114X Frameshift PKD 17 0 0 1 1
238 15 2183 c.6549_6550del p.Glu2183ValfsX77 Frameshift REJ 0 0 3 1
215 15 2257 c.6770dup p.Pro2257AlafsX4 Frameshift REJ 0 0 1 1
84 15 2270 c.6808_6811del p.Asp2270HisfsX43 Frameshift REJ 1 0 1 1
376 16 2332 c.6994_7000del p.Ala2332TrpfsX7 Frameshift REJ 1 0 1 1
17 16 2334 c.6994_7000dup p.Val2334GlyfsX88 Frameshift REJ 1 0 6 1
136 16 2335 c.7003dup p.Glu2335GlyfsX85 Frameshift REJ 0 0 2 1
232 18 2449 c.7345dup p.Thr2449AsnfsX52 Frameshift REJ 0 0 1 1
288 18 2470 c.7408_7414del p.Pro2470TrpfsX148 Frameshift REJ 0 0 1 1
32
60 19 2527 c.7579_7580del p.Val2527LeufsX67 Frameshift REJ 0 0 1 1
183 19 2542 c.7625del p.Gly2542ValfsX78 Frameshift REJ 0 0 1 1
285 20 2581 c.7741_7742dup p.Thr2581GlnfsX39 Frameshift REJ 0 0 1 1
157, 655 20 2606 c.7816del p.Gln2606SerfsX14 Frameshift REJ 0 0 4 2
559 21 2648 c.7942del p.Glu2648ArgfsX6 Frameshift REJ 0 0 1 1
85 21 2658 c.7973_7974del p.Val2658GlyfsX2 Frameshift REJ 0 0 2 1
277, 539 22 2674 c.8019dup p.Pro2674AlafsX148 Frameshift REJ 0 0 2 2
78, 128 23 2776 c.8327_8330del p.Leu2776ArgfsX98 Frameshift REJ 0 0 4 2
551 23 2800 c.8399del p.Pro2800GlnfsX75 Frameshift REJ 0 0 1 1
315 23 2859 c.8570_8574dup p.Ala2859ProfsX18 Frameshift Extracellularsequence 0 0 1 1
592 23 2861 c.8581dup p.Ile2861AsnfsX76 Frameshift Extracellularsequence 0 0 1 1
586 23 2881 c.8642_8655del p.Asp2881GlyfsX51 Frameshift Extracellularsequence 0 0 1 1
556, 589,746 24 2951 c.8851dup p.Arg2951ProfsX4 Frameshift Extracellular
sequence 0 0 3 3
49, 789 25 3028 c.9083_9084del p.Glu3028GlyfsX40 Frameshift GPS 1 0 2 2
754 26 3080 c.9240_9241del p.Ala3080CysfsX96 Frameshift Transmembrane 0 0 1 1
397 28 3228 c.9683dup p.Leu3228ProfsX24 Frameshift PLAT 0 0 1 1
425 29 3262 c.9780_9784dup p.Ile3262SerfsX56 Frameshift Cytoplasmic 0 0 1 1
33
sequence
881 29 3281 c.9841del p.Ala3281ProfsX35 FrameshiftCytoplasmicsequence
0 0 1 1
627 29 3281 c.9843del p.Thr3281ProfsX34 FrameshiftCytoplasmicsequence
0 0 1 1
407 31 3382 c.10144dup p.Thr3382AsnfsX8 FrameshiftCytoplasmicsequence
0 0 1 1
766 31 3388 c.10162_10163dup p.Glu3388LeufsX8 FrameshiftCytoplasmicsequence 0 0 1 1
118 33 3419 c.10255dup p.Trp3419LeufsX7 Frameshift Cytoplasmicsequence
0 0 2 1
107 35 3529 c.10585_10586dup p.Gln3529HisfsX56 Frameshift Cytoplasmicsequence 0 0 1 1
39 36 3569 c.10706_10707del p.Val3569GlyfsX56 Frameshift Transmembrane 0 0 2 1
119 36 3581 c.10742del p.Pro3581ArgfsX3 Frameshift Extracellularsequence 1 0 2 1
161 37 3613 c.10836_10837del p.Tyr3613LeufsX12 Frameshift Cytoplasmicsequence 0 0 1 1
96 37 3655 c.10963del p.Leu3655TrpfsX28 Frameshift Cytoplasmicsequence 1 0 1 1
42 40 3757 c.11270_11274del p.Leu3757ProfsX56 Frameshift Extracellularsequence 0 0 2 1
188 40 3789 c.11365_11366del p.Asn3789TrpfsX25 Frameshift Extracellularsequence 0 0 2 1
14 40 3793 c.11376dup p.Thr3793AspfsX22 Frameshift Extracellularsequence 0 0 2 1
786 40 3793 c.11378del p.Thr3793SerfsX32 Frameshift Extracellularsequence 0 0 1 1
34
676 41 3817 c.11450dup p.Tyr3817LeufsX142 FrameshiftExtracellularsequence
1 0 1 1
141 41 3843 c.11527del p.Asp3843ThrfsX101 FrameshiftExtracellularsequence
0 0 1 1
600 42 3851 c.11551del p.Leu3851TrpfsX93 FrameshiftExtracellularsequence
1 0 1 1
9 42 3888 c.11661dup p.Ala3888CysfsX72 Frameshift Cytoplasmicsequence
0 0 3 1
396 42 3895 c.11684del p.Gly3895AlafsX49 FrameshiftExtracellularsequence
0 0 1 1
572 43 3994 c.11973_11980dup p.Leu3994ProfsX47 FrameshiftTransmem
brane0 0 1 1
812 43 3995 c.11984_11996del p.Phe3995SerfsX39 Frameshift Transmembrane 0 0 1 1
298 44 4014 c.12042del p.Phe4014LeufsX24 Frameshift Cytoplasmicsequence 0 0 2 1
322 44 4033 c.12097del p.Val4033TrpfsX5 Frameshift Transmembrane 0 0 1 1
36, 159 44 4034 c.12100del p.Val4034CysfsX4 Frameshift Transmembrane 0 0 6 2
540 45 4059 c.12175dup p.Gln4059ProfsX97 Frameshift Extracellularsequence 1 0 1 1
583 45 4066 c.12197del p.Pro4066LeufsX131 Frameshift Extracellularsequence 0 0 1 1
293 45 4073 c.12217_12218del p.Leu4073ValfsX82 Frameshift Extracellularsequence 0 0 1 1
77 45 4083 c.12247_12259del p.Pro4083TrpfsX110 Frameshift Extracellularsequence 0 0 2 1
94, 501 45 4102 c.12305del p.Ala4102ValfsX95 Frameshift Transmembrane 0 0 3 2
35
522 45 4103 c.12307_12310del p.Val4103PhefsX93 FrameshiftTransmem
brane0 0 1 1
38, 88 46 4202 c.12605_12632del p.Arg4202ProfsX146 FrameshiftCytoplasmicsequence
1 0 2 2
28 46 4224 c.12669_12670del p.Gln4224ValfsX2 FrameshiftCytoplasmicsequence
0 0 2 1
167, 382 3 117 c.350T>G p.Leu117X Nonsense Extracellularsequence
0 0 2 2
105, 860 4 135 c.405G>A p.Trp135X Nonsense LRRCT 0 0 4 2
89 5 232 c.696T>A p.Cys232X Nonsense WSC 0 0 1 1
100, 535 5 236 c.706C>T p.Gln236X Nonsense WSC 1 0 3 2
102 5 400 c.1198C>T p.Arg400X Nonsense Extracellularsequence 1 0 1 1
526 6 437 c.1309C>T p.Gln437X Nonsense C-type lectin 0 0 1 1
132 6 439 c.1316G>A p.Trp439X Nonsense C-type lectin 0 0 3 1
243 7 523 c.1568C>A p.Ser523X Nonsense C-type lectin 0 0 2 1
6 9 597 c.1789C>T p.Gln597X Nonsense Extracellularsequence 0 0 2 1
576 10 680 c.2040T>G p.Tyr680X Nonsense Extracellularsequence 0 0 1 1
619 11 718 c.2152C>T p.Gln718X Nonsense Extracellularsequence 1 0 1 1
557 11 901 c.2703G>A p.Trp901X Nonsense PKD 3 0 0 1 1
385 12 987 c.2959C>T p.Gln987X Nonsense PKD4 1 0 1 1
36
269 13 1023 c.3067C>T p.Gln1023X Nonsense PKD 5 0 0 1 1
2 15 1112 c.3334G>T p.Glu1112X Nonsense PKD 5 0 0 2 1
273 15 1116 c.3346C>T p.Gln1116X Nonsense PKD 5 0 0 1 1
784 15 1172 c.3514C>T p.Gln1172X Nonsense PKD 6 1 0 1 1
631 15 1174 c.3520C>T p.Gln1174X Nonsense PKD 6 0 0 1 1
153 15 1203 c.3607C>T p.Gln1203X Nonsense PKD 6 0 0 2 1
469 15 1436 c.4306C>T p.Arg1436X Nonsense PKD 9 1 1x 1 1
24, 59, 91,135, 354,
67415 1483 c.4447C>T p.Gln1483X Nonsense PKD 10 1 0 8 6
403 15 1599 c.4797C>A p.Tyr1599X Nonsense PKD 11 0 novel 2 1
150 15 1621 c.4861C>T p.Gln1621X Nonsense PKD 11 1 0 2 1
66, 127,439, 552 15 1826 c.5477G>A p.Trp1826X Nonsense PKD 14 0 0 5 4
864 15 1837 c.5510G>A p.Trp1837X Nonsense PKD 14 0 0 1 1
207 15 1874 c.5621G>A p.Trp1874X Nonsense PKD 14 0 0 1 1
543 15 1903 c.5707C>T p.Gln1903X Nonsense PKD 15 1 0 1 1
227 15 2039 c.6115C>T p.Gln2039X Nonsense PKD 16 1 1x 2 1
462, 493 15 2067 c.6199C>T p.Gln2067X Nonsense PKD 17 1 0 2 2
667 15 2164 c.6491C>G p.Ser2164X Nonsense REJ 1 0 1 1
37
29, 137,892
15 2184 c.6549dup p.Glu2184X Nonsense REJ 1 0 5 3
156 15 2187 c.6561G>A p.Trp2187X Nonsense REJ 0 0 1 1
81 15 2246 c.6736C>T p.Gln2246X Nonsense REJ 1 0 1 1
454 15 2305 c.6913C>T p.Gln2305X Nonsense REJ 0 0 1 1
35, 43 17 2388 c.7164C>G p.Tyr2388X Nonsense REJ 0 0 3 2
734 18 2405 c.7214G>A p.Trp2405X Nonsense REJ 0 0 1 1
345 18 2430 c.7288C>T p.Arg2430X Nonsense REJ 1 7x 1 1
206 19 2519 c.7555C>T p.Gln2519X Nonsense REJ 1 0 1 1
70, 267 20 2606 c.7816C>T p.Gln2606X Nonsense REJ 1 0 2 2
324, 883 21 2639 c.7915C>T p.Arg2639X Nonsense REJ 1 5x 3 2
903 22 2699 c.8095C>T p.Gln2699X Nonsense REJ 1 0 1 1
629 23 2738 c.8212G>T p.Glu2738X Nonsense REJ 0 0 1 1
452 23 2742 c.8224G>T p.Glu2742X Nonsense REJ 0 0 1 1
797 23 2810 c.8428G>T p.Glu2810X Nonsense REJ 1 3x 1 1
564 23 2824 c.8470C>T p.Gln2824X Nonsense REJ 0 0 1 1
761, 810 27 3183 c.9547C>T p.Arg3183X Nonsense PLAT 1 0 2 2
588 28 3195 c.9585G>A p.Trp3195X Nonsense PLAT 0 0 2 1
388 28 3206 c.9616C>T p.Gln3206X Nonsense PLAT 1 0 1 1
38
890 30 3350 c.10048A>T p.Lys3350X NonsenseCytoplasmicsequence
0 0 1 1
145 32 3395 c.10180C>T p.Gln3394X NonsenseCytoplasmicsequence
1 0 1 1
95 34 3475 c.10420C>T p.Gln3474X NonsenseCytoplasmicsequence
1 0 2 1
104 34 3478 c.10432G>T p.Glu3478X Nonsense Cytoplasmicsequence
0 0 1 1
45 34 3488 c.10459C>T p.Gln3487X NonsenseCytoplasmicsequence
1 0 1 1
148 36 3603 c.10806G>A p.Trp3602X NonsenseTransmem
brane0 0 2 1
7 37 3620 c.10855A>T p.Lys3619X Nonsense Cytoplasmicsequence 0 0 2 1
113 38 3702 c.11101C>T p.Gln3701X Nonsense Extracellularsequence 1 0 2 1
567 39 3755 c.11263G>T p.Glu3755X Nonsense Extracellularsequence 0 0 1 1
607 40 3796 c.11388T>A p.Tyr3796X Nonsense Extracellularsequence 0 0 1 1
239 41 3807 c.11420G>A p.Trp3807X Nonsense Extracellularsequence 1 0 5 1
5, 117 41 3808 c.11423G>A p.Gly3808X Nonsense Extracellularsequence 1 0 5 2
75 42 3871 c.11611G>T p.Glu3871X Nonsense Extracellularsequence 1 0 2 1
74 43 3921 c.11763G>A p.Trp3921X Nonsense Cytoplasmicsequence 1 3x 1 1
93 44 4003 c.12007C>T p.Gln4003X Nonsense Cytoplasmicsequence 1 4x 1 1
39
22 44 4041 c.12121C>T p.Gln4041X Nonsense PKD 3 1 8x 6 1
109 45 4065 c.12195C>A p.Cys4065X NonsenseExtracellularsequence
0 0 1 1
64 45 4126 c.12377dup p.Tyr4126X Nonsense Cytoplasmicsequence
0 0 2 1
210 46 4246 c.12736C>T p.Gln4246X NonsenseCytoplasmicsequence
0 0 1 1
625 12 951 c.2853+1G>ATypicalsplicing
PKD 4 1 0 1 1
549 15 1098 c.3295+1G>TTypicalsplicing
PKD 5 0 0 1 1
274 20 2568 c.7703+1G>C Typicalsplicing REJ 0 0 1 1
246, 538,656, 884 21 2673 c.8017-2_8017-1d
elTypicalsplicing REJ 1 0 5 4
129 23 2672 c.8017-2A>G Typicalsplicing REJ 1 0 2 1
171 26 3067 c.9201+1G>C Typicalsplicing
Transmembrane 1 0 1 1
638 33 3406 c.10217+1G>A Typicalsplicing
Cytoplasmicsequence 0 0 1 1
213 32 3476 c.10429-2_10429-1delCA
Typicalsplicing
Cytoplasmicsequence 0 0 1 1
595 39 3671 c.11014-1del Typicalsplicing
Cytoplasmicsequence 0 0 2 1
34 41 3757 c.11270-2A>C Typicalsplicing
Extracellularsequence 0 0 3 1
608 44 3903 c.11710-1G>T Typicalsplicing
Transmembrane 0 0 1 1
40
194 44 3904 c.11713-1G>ATypicalsplicing
Transmembrane
0 0 1 1
312 45 4000 c.12001-2A>GTypicalsplicing
Transmembrane
0 0 1 1
299 4 exon 4 exon 4 Large deletion 0 0 1 1
590 6 exon 6 exon 6 Large deletion 0 0 1 1
316 15 exon 15 exon 15 Large deletion 0 0 1 1
628 15 2219 c.6657_6671del p.Arg2219_Pro2224del
Large deletion REJ 1 0 1 1
162 19 exon19 dup exon19 dup Largeduplication 0 0 2 1
878 21 exon 21 exon 21 Large deletion 0 0 1 1
23 46 3791 c.11373_11390del p.Gly3791_Ser3797del Large deletion Extracellular
sequence 0 0 3 1
643 46 4201 c.12601_12628del p.Gly4201SerfsX147 Large deletion Cytoplasmicsequence 0 0 1 1
Supplementary Table S2B. PKD1 in-frame insertion/deletions in study families
41
FamilyID
Exon Codon cDNA change Protein change Domain PROVEANscore
PKDB
CanadaDB
Koreanindividu
als
Koreanfamilies
Segregation
patients/control
99 36 3565 c.10694_10696del p.Val3565delTransmem
brane-7.49 0 0 2 1
26, 27,679
10 689 c.2065_2067del p.Ser689delExtracellularsequence
-5.37 0 0 6 3 5,0
133 15 2088 c.6258_6263dup p.Pro2087_Arg2088dup PKD 17 -5.26 0 0 1 1 1,1
216 15 2217 c.6650_6664del p.Val2217_Leu2221del REJ -16.42 1 0 1 1
482 15 2251 c.6752_6754del p.Val2251del REJ -7.19 0 0 1 1
87 20 2613 c.7837_7839del p.Leu2613del REJ -9.98 1 0 3 1 2,0
130 23 2770 c.8308_8310del p.Asn2770del REJ -11.28 0 0 3 1 3,0
124,208 24 2979 c.8935_8937del p.Phe2979del Extracellular
sequence -10.76 1 0 2 2
37 25 3031 c.9092_9094del p.Leu3031del GPS -14.07 0 0 2 1 2,0
Supplementary Table S2C. PKD1 non-truncating mutations in study families
42
FamilyID
Exon Codon cDNAchange
Proteinchange
Domain SIFT polyphen
GERP++
PKDB
CanadaDB
Koreanindividual
Koreanfamily
Segregation
patients/control
334 2 75 c.224C>T p.Ser75Phe LRR 1 0 1 4.27 1 0 4 1
355 2 93 c.278T>C p.Leu93Pro C-typelectin
0 1 4.27 0 0 1 1
898 3 120 c.359T>C p.Ile120ThrExtracellularsequence
00.873
4.26 1 0 1 1
304 5 381 c.1141G>A p.Gly381ArgExtracellularsequence
0.090.999
5.05 0 0 1 1
264 6 419 c.1256G>A p.Cys419Tyr C-typelectin 0 1 5.05 0 0 1 1
186 7 466 c.1396G>A p.Val466Met C-typelectin 0.01 1 4.82 1 2x 2 1
281, 297,465 7 515 c.1543G>T p.Gly515Trp C-type
lectin 0 1 4.85 1 0 4 3
41, 306 7 464 c.1391T>C p.Leu464Pro C-typelectin 0.37 0.99
9 4.82 0 0 3 2 3,0
233 10 693 c.2078G>A p.Gly693Glu Extracellularsequence 0.01 1 5.17 0 0 3 1 3,4
519 10 699 c.2095T>C p.Ser699Pro Extracellularsequence 0.21 0.94
3 4.01 0 0 1 1 2,1
342 11 791 c.2371C>T p.Arg791Trp PKD 4 0.08 0.999 4.43 0 0 1 1 2,0
616 11 869 c.2605C>T p.Arg869Cys PKD 12 0.04 0.992 3.48 0 0 1 1 0,1
248 11 944 c.2830C>T p.Arg944Cys PKD 4 0 1 4.96 1 0 1 1
43
241 11 845 c.2534T>C p.Leu845SerExtracellularsequence
0 1 5.09 1 0 2 1
405 12 960 c.2878G>A p.Gly960Ser PKD 4 0 1 4.79 1 0 1 1 0,1
444 14 1077 c.3229G>A p.Val1077Ile PKD 5 0.230.982 5.33 0 0 1 1
614 14 1055 c.3163T>G p.Trp1055Gly PKD 5 0.09 1 5.29 0 0 1 1
745 15 1587 c.4759C>T p.Arg1587Cys PKD 11 0.02 1 4.14 0 0 1 1
485 15 2085 c.6254C>G p.Pro2085Arg PKD 17 0.04 1 5.49 0 0 1 1
591 15 2215 c.6643C>T p.Arg2215Trp REJ 0.010.999 5.49 1 1x 1 1
530 15 2235 c.6704C>T p.Ser2235Leu REJ 0 0.999 5.35 0 0 2 1
624, 848 15 1414 c.4241G>C p.Trp1414Ser PKD 9 0.01 1 5.71 0 0 2 2 2,0
332 15 1544 c.4630G>A p.Val1544Met PKD 10 0 1 5.36 0 0 2 1 2,0
189 15 1547 c.4640G>A p.Arg1547His PKD 10 0.14 1 4.35 0 0 1 1
219 15 1666 c.4997G>C p.Trp1666Ser PKD 12 0 1 5.41 0 0 1 1
865 15 1832 c.5495G>T p.Gly1832Val PKD14 0.01 1 4.67 0 0 1 1
98 15 2159 c.6475G>T p.Val2159Leu REJ 0.74 1 5.49 0 0 1 1
561 15 1109 c.3327T>A p.Asn1109Lys PKD 5 0.01 0.998 -4.05 0 0 1 1
83 15 1652 c.4955T>C p.Leu1652Pro PKD 12 0 1 5.30 0 0 2 1 2,0
261 15 2006 c.6016T>G p.Trp2006Gly PKD 16 0 1 5.59 0 0 1 1
44
528 15 2132 c.6395T>G p.Phe2132Cys PKD 17 0.18 1 4.38 1 0 1 1
328 18 2445 c.7333A>C p.Thr2445Pro REJ 0.130.999
4.81 0 0 2 1
555 18 2434 c.7300C>T p.Arg2434Trp REJ 0 1 3.78 1 0 1 1
348 18 2436 c.7307G>T p.Gly2436Val REJ 0.04 1 4.81 0 0 1 1
569 18 2476 c.7427G>A p.Cys2476Tyr REJ 0 1 4.77 0 0 1 1
513 19 2557 c.7670A>G p.Asp2557Gly REJ 0 1 5.14 1 0 1 1
668 19 2516 c.7546C>T p.Arg2516Cys REJ 0.01 1 4.96 1 8x 1 1
476 19 2497 c.7490G>T p.Gly2497Val PKD 16 0 1 4.96 0 0 1 1
773 19 2526 c.7576T>C p.Cys2526Arg REJ 0.24 0.999 4.96 0 0 1 1
644, 893 23 2863 c.8587A>G p.Ile2863Val Extracellularsequence 0.33 0.87
4 4.89 0 0 2 2 2,0
108, 169,176, 211,214, 253,471, 777,
854
23 2771 c.8311G>A p.Glu2771Lys REJ 0.01 1 4.89 0 0 12 9 10,1
231 23 2783 c.8347G>C p.Ala2783Pro REJ 0.04 1 4.62 0 0 1 1
713 24 2963 c.8888T>C p.Ile2963Thr Extracellularsequence 0 1 4.55 0 0 2 1 2,0
201, 594 25 3012 c.9034A>C p.Thr3012Pro GPS 0.02 1 3.45 0 0 4 2 3,0
677 25 3046 c.9136C>T p.Arg3046Cys GPS 0 0.994 4.38 0 0 1 1
45
143 25 3043 c.9128G>A p.Cys3043Tyr GPS 0 1 4.38 0 0 1 1
642 25 3014 c.9041T>C p.Leu3014Pro GPS 0.12 1 4.55 0 0 1 1
429 26 3132 c.9395C>T p.Ser3132Leu PLAT 0 1 4.51 1 0 1 1
837 26 3113 c.9338G>A p.Gly3113GluCytoplasmicsequence
0.02 1 4.51 0 0 1 1
158, 203 27 3135 c.9404C>T p.Thr3135Met PLAT 0 1 4.49 1 0 2 2
350, 615 28 3194 c.9581C>T p.Ala3194Val PLAT 0.01 1 4.84 0 0 2 2 4,0
69 29 3253 c.9758T>C p.Leu3253ProCytoplasmicsequence
0.01 1 4.69 0 0 2 1 2,0
50 33 3468 c.10402G>A p.Asp3468Asn
Cytoplasmicsequence 0.01 0.99
9 4.45 1 0 1 1
25 36 3591 c.10769C>T p.Ser3590Phe Transmembrane 0 1 3.98 0 0 2 1 2,0
192 36 3603 c.10807G>A p.Glu3603Lys Cytoplasmicsequence 0.01 1 3.98 1 0 2 1
585 37 3612 c.10835T>C p.Leu3612Pro Cytoplasmicsequence 0.01 1 4.81 0 0 1 1
236 38 3673 c.11018T>C p.Leu3673Pro Extracellularsequence 0.02 0.99
7 3.46 0 0 2 1
164 39 3749 c.11246G>A p.Arg3749Gln Extracellularsequence 0.07 1 4.04 1 0 1 1
894 39 3752 c.11255G>A p.Arg3752Gln Extracellularsequence 0.08 1 4.04 1 0 1 1
90 41 3822 c.11465T>A p.Leu3822Gln Extracellularsequence 0 1 4.49 0 0 1 1
212 42 3846 c.11536A>G p.Ser3846Gly Extracellular 0.02 0.95 3.88 0 0 1 1
46
sequence 8
296 42 3846 c.11538C>G p.Ser3846ArgExtracellularsequence
0.010.996
3.88 0 0 1 1
30 42 3853 c.11558T>C p.Leu3853ProExtracellularsequence
0 1 3.97 0 0 4 1 2,0
527 37 3634 c.10901C>A p.Ala3634AspCytoplasmicsequence
0.570.999
3.91 0 0 1 1
242 4 154 c.461C>T p.Thr154Met LRRCT 00.547 -3.45 0 0 2 1
249 15 1768 c.5303C>A p.Thr1768Asn
PKD 13 0.05 0.984
2.6 0 0 2 1
496 15 2095 c.6285C>A p.Asp2095Glu PKD 17 0.13 0.976 3.51 0 0 1 1
578 25 2997 c.8991C>G p.Ser2997Arg Extracellularsequence 0.07 0.99
9 3.57 0 0 1 1
320, 510 37 3650 c.10948G>A p.Gly3650Ser Cytoplasmicsequence 0.09 1 3.71 1 0 3 2
421 15 1706 c.5116G>A p.Ala1706Thr PKD 12 0.42 0.506 4.44 0 0 1 1
466 15 1951 c.5852G>A p.Arg1951Gln PKD 15 0.82 0.504 4.24 0 0 1 1
646 23 2765 c.8294G>A p.Arg2765His REJ 0.36 0.543 3.87 0 0 1 1
289 43 3949 c.11846T>C p.Leu3949Pro Transmembrane 0.17 0.99
9 3.12 0 0 1 1
47
Family ID Exon Codon cDNA change Protein change PredictedEffect
Domain PKDB
CanadaDB
Koreanindividuals
Koreanfamilies
101 1 63 c.187_188insAG p.Ala63GlufsX55 FrameshiftCytoplasmicsequence
0 0 1 1
902 1 158 c.471dup p.Glu158ArgfsX55 Frameshift Poly-Arg 0 0 1 1
618 1 197 c.588_595del p.Leu197SerfsX13 Frameshift Cytoplasmicsequence
0 0 1 1
134 5 381 c.1142del p.Gly381GlufsX71 Frameshift Extracellularsequence
0 0 2 1
329 7 569 c.1704dup p.Val569CysfsX4 Frameshift Transmembrane 0 0 4 1
72 11 710 c.2127dup p.Lys710X Frameshift EF-handdomain 0 0 1 1
152 11 714 c.2142del p.Lys714AsnfsX2 Frameshift EF-handdomain 0 0 1 1
97 11 720 c.2159dup p.Asn720LysfsX5 Frameshift EF-handdomain 0 0 2 1
346 11 722 c.2164del p.Val722TrpfsX15 Frameshift EF-handdomain 0 0 1 1
220 11 738 c.2213_2214del p.Phe738X Frameshift EF-handdomain 0 0 1 1
366 11 747 c.2240del p.Lys747ArgfsX23 Frameshift EF-handdomain 0 0 1 1
80 1 80 c.239C>A p.Ser80X Nonsense Cytoplasmicsequence 0 0 1 1
596, 641 1 87 c.260G>A p.Trp87X Nonsense Poly-Glu 0 0 2 2
32, 154, 437 1 183 c.547C>T p.Gln183X Nonsense Cytoplasmicsequence 0 0 5 3
474 2 201 c.603G>A p.Trp201X Nonsense Transmembrane 1 0 1 1
Supplementary Table S2D. PKD2 protein-truncating mutations in study families
48
727 2 223 c.667G>T p.Glu223X NonsenseCytoplasmicsequence
0 0 1 1
10, 654 3 266 c.796G>T p.Glu266X NonsenseExtracellularsequence
1 0 3 2
58, 222, 565 4 306 c.916C>T p.Arg306X NonsenseExtracellularsequence
1 0 3 3
389 4 320 c.958C>T p.Arg320X Nonsense Polycystinmotif
1 7x 1 1
142, 237 4 361 c.1081C>T p.Arg361X Nonsense Extracellularsequence
1 0 4 2
40 5 417 c.1249C>T p.Arg417X NonsenseExtracellularsequence 1 10x 2 1
31, 163, 836 6 455 c.1365G>A p.Trp455X NonsenseExtracellularsequence 1 0 4 3
651, 660 6 464 c.1390C>T p.Arg464X Nonsense Extracellularsequence 1 0 3 2
270 9 654 c.1960C>T p.Arg654X Nonsense Extracellularsequence 1 5x 1 1
275 10 684 c.2052C>A p.Tyr684X Nonsense Cytoplasmicsequence 0 0 2 1
106, 198,479, 626,
83810 701 c.2102C>G p.Ser701X Nonsense Cytoplasmic
sequence 0 0 5 5
18, 205,546, 573 11 742 c.2224C>T p.Arg742X Nonsense EF-hand
domain 1 0 6 4
12 12 776 c.2326C>T p.Gln776X Nonsense EF-handdomain 0 0 2 1
123, 155,558, 636 13 803 c.2407C>T p.Arg803X Nonsense Linker 1 6x 10 4
140 14 874 c.2620A>T p.Lys874X Nonsense Coiled coil 0 0 1 1
86 1 198 c.595+1G>C Typicalsplicing
Cytoplasmicsequence 0 0 1 1
372, 633 4 365 c.1094+1G>T Typicalsplicing
Extracellularsequence 0 0 2 2
49
44 5 365 c.1095-2A>GTypicalsplicing
Extracellularsequence
1 0 4 1
460 5 365 c.1095-2A>TTypicalsplicing
Extracellularsequence
0 0 1 1
319 6 516 c.1548+1G>ATypicalsplicing
Transmembrane
0 0 1 1
401 8 572 c.1717-2A>C Typicalsplicing
Cytoplasmicsequence
0 0 1 1
3 11 747 c.2240+1G>T Typicalsplicing
EF-handdomain
1 0 2 1
349 1 exon 1 exon 1Largedeletion 0 0 1 1
11 2 Intron2-exon5 p.M1fsXLargedeletion
Cytoplasmicsequence 0 0 8 1
574, 609 3 Exon 3-5 Exon 3-5 Largedeletion 0 0 2 2
897 8 exon 8 exon 8 Largedeletion 0 0 1 1
50
FamilyID
Exon Codon cDNAchange
Proteinchange
Domain SIFT Polyphen
GERP++
PKDB
CanadaDB
Koreanindividuals
Koreanfamilies
Segregation
patients/control
435 1 179 c.536C>A p.Pro179His Cytoplasmicsequence
0.01 0.838 3.98 0 0 1 1
13 4 290 c.869G>T p.Gly290Val Extracellularsequence
0 0.99 5.07 0 0 4 1 2,0
335, 340 4 322 c.965G>A p.Arg322GlnPolycystin
motif 0 0.999 5.62 1 0 5 2
82, 899 4 325 c.974G>A p.Arg325GlnPolycystin
motif 0.01 0.988 -3.84 1 3x 4 2
240 5 389 c.1166C>T p.Ala389Val Extracellularsequence 0 0.907 4.9 0 0 2 1 3,1
394 5 374 c.1121T>G p.Leu374Trp Extracellularsequence 0 0.892 4.9 0 0 1 1
122, 292 5 381 c.1141G>A p.Gly381Arg Extracellularsequence 0 0.966 4.9 0 0 3 2 3,1
282, 472 14 890 c.2668G>A p.Glu890Lys Coiled coil 0.21 1 5.01 0 0 2 2 2,1
Supplementary Table S2E. PKD2 non-truncating mutations in study families
51
Supplementary Table S3. Design of Custom Bait
Category Description
Sequencing technology Illumina
Sequencing protocol Paired-End Short Read (100bp)
Tiling frequency 3x
Bait length 120
Avoid standard repeat masked regions Yes
Avoid overlap 20
Layout strategy Centered
Strand Sense
Total input targets 1018
Total valid targets 256
Average number of baits per target 5.88
Total targets with bait coverage 2038
Total number of baits 1498
Baits removed due to avoid overlap 46
Total baits covered by baits 59913
Abbreviations. bp, basepair
Supplementary Table S4. Quality of Targeted Exome Sequencing
Cohort (N=749) Average S.D.
QC passed reads 9141842.96 6008387.66
Unique reads 7826567.35 5238200.17
Aligned unique reads 7815828.12 5227207.04
PKD1 target mean coverage (X) 1523.12 973.96
% of PKD1 target bases ≥1X 0.9931 0.0076
% of PKD1 target bases ≥10X 0.9700 0.0073
% of PKD1 target bases ≥20X 0.9650 0.0061
% of PKD1 target bases ≥50X 0.9587 0.0084
% of PKD1 target bases ≥100X 0.9478 0.0207
PKD2 target mean coverage (X) 1422.98 903.62
% of PKD2 target bases ≥1X 0.9999 0.0009
% of PKD2 target bases ≥10X 0.9960 0.0074
% of PKD2 target bases ≥20X 0.9879 0.0158
% of PKD2 target bases ≥50X 0.9594 0.0297
% of PKD2 target bases ≥100X 0.9226 0.0447
Abbreviations. QC, quality control; S.D, standard deviation
52
Supplementary Table S5. Definition of Grades
Grade of Mutations
Definitely
pathogenic
(DP)
• Stop-gain single-nucleotide variants (SNVs)
• Frameshift indels
• In-frameshift indels with ≥5 amino acids
• Typical splice site mutations
• Either newly discovered from the samples or previously
registered in the Mayo DB or TGESP(1) data
Highly likely
pathogenic
(HLP)
• Nonsynonymous SNVs or in-frameshift indels of <5 amino acids
that were established as HLP in the Mayo DB or TGESP DB
• Novel LP variants segregated in family(s)
Likely
pathogenic
(LP)
• Nonsynonymous SNVs that were listed as LP in Mayo DB,
• Possibly causal nonsynonymous SNVs that satisfied 2 of the
following 3 conditions
ØBeing damaging, as predicted by a SIFT score ≤0.05
ØBeing damaging, as predicted by Polyphen-2 ([HumDiv]
≥0.453 or [HVAR] ≥0.447)
ØHaving a genomic evolutionary rate profiling [GERP] ++
score ≥2
• Possibly causal in-frameshift indels of <5 amino acids that
satisfied Protein Variation Effect Analyzer PROVEAN] score of
≤2.5
No mutations (NM)
Likely neutral
(LN)
• Previously registered in the Mayo DB or TGESP DB
• Newly found in normal controls
Indeterminate
(I)
• Nonsynonymous SNV previously registered in the Mayo DB or
TGESP DB
• Nonsynonymous SNV or in-frameshift indels of <5 amino acids
with obscure pathogenicity in SIFT, Polyphen-2 [HumDiv or
HVAR], [GERP] ++, or [PROVEAN]
No variant
(NV)• No possible pathogenic variants
53
Gene Exonic Func Exon cDNA change Protein change DupR Sanger
PKD1 frameshiftdeletion
exon15 c.5014_5015del p.Arg1672GlyfsX
98 Y Confirmed
PKD1 frameshiftinsertion
exon16 c.6994_7000dup p.Val2334GlyfsX
88 Y Confirmed
PKD1 stopgain SNV exon15 c.4447C>T p.Gln1483X Y Confirmed
PKD1 stopgain SNV exon15 c.6549dup p.Glu2184X Y Confirmed
PKD1 frameshiftdeletion
exon25 c.9083_9084del p.Glu3028GlyfsX
40 Y Confirmed
PKD1 stopgain SNV exon15 c.4447C>T p.Gln1483X Y Confirmed
PKD1 stopgain SNV exon20 c.7816C>T p.Gln2606X Y Confirmed
PKD1 frameshiftdeletion
exon15 c.5014_5015del p.Arg1672GlyfsX
98 Y Confirmed
PKD1 frameshiftdeletion
exon15 c.4070del p.Leu1357Argfs
X9 Y Confirmed
PKD1 frameshiftinsertion
exon11 c.2494dupC p.Arg832ProfsX4
0 Y Confirmed
PKD1 stopgain SNV exon15 c.4447C>T p.Gln1483X Y Confirmed
PKD1 stopgain SNV exon5 c.1198C>T p.Arg400X Y Confirmed
PKD1 frameshiftdeletion
exon11 c.2618_2621del p.Val873AlafsX2
4 Y Confirmed
PKD1frameshiftdeletion
exon15 c.5014_5015del
p.Arg1672GlyfsX98 Y Confirmed
PKD1 NA exon23
c.8017-2A>G Y Confirmed
PKD1 stopgain SNV exon15
c.4447C>T p.Gln1483X Y Confirmed
PKD1 stopgain SNV exon15
c.6549dup p.Glu2184X Y Confirmed
PKD2 stopgain SNV exon13 c.2407C>T p.Arg803X N Confirmed
PKD1 frameshiftdeletion
exon15 c.3684del p.Val1229TrpfsX
44 Y Confirmed
PKD1 stopgain SNV exon15 c.6549dup p.Glu2184X Y Confirmed
PKD1 nonsynonymous SNV
exon23 c.8311G>A p.Glu2771Lys Y Confirmed
PKD1nonframeshift
deletionexon24 c.8935_8937del p.Phe2979del Y Confirmed
Supplementary Table S6. Results from Validation Study(n=80)
54
PKD1nonsynonymous SNV
exon27 c.9404C>T p.Thr3135Met Y Confirmed
nonsynonymous SNV
exon15 c.4051C>T p.Arg1351Trp Y ConfirmedPKD1
stopgain SNV exon15 c.3334G>T p.Glu1112X Y ConfirmedPKD1
frameshiftinsertion
exon15 c.3349dup p.Gln1117ProfsX
19 Y ConfirmedPKD1
stopgain SNV exon9 c.1789C>T p.Gln597X Y ConfirmedPKD1
frameshiftinsertion
exon15
c.4315_4324delinsCAACTGTTG
p.Gly1439GlnfsX5 Y ConfirmedPKD1
frameshiftdeletion
exon7 c.1495del p.Glu499SerfsX5
9 Y ConfirmedPKD1
frameshiftdeletion
exon11 c.2659del p.Trp887GlyfsX1
1 Y ConfirmedPKD1
frameshiftdeletion
exon15 c.4679_4691del p.Val1560GlyfsX
3 Y ConfirmedPKD1
frameshiftdeletion
exon46
c.12672_12673del
p.Gln4224HisfsX2 N ConfirmedPKD1
frameshiftdeletion
exon15 c.3744_3754del p.Asp1249AlafsX
48 Y ConfirmedPKD1
frameshiftdeletion
exon19 c.7579_7580del p.Val2527LeufsX
67 Y ConfirmedPKD1
frameshiftdeletion
exon11 c.2659del p.Trp887GlyfsX1
1 Y ConfirmedPKD1
stopgain SNV exon15 c.5477G>A p.Trp1826X Y ConfirmedPKD1
frameshiftdeletion
exon8 c.1669_1670del p.Leu557AspfsX
29 Y ConfirmedPKD1
frameshiftdeletion
exon11 c.2716del p.Glu906SerfsX2
1Y ConfirmedPKD1
frameshiftdeletion
exon23
c.8327_8330delp.Leu2776Argfs
X98 Y ConfirmedPKD1
stopgain SNVexon17 c.7164C>G p.Tyr2388X Y ConfirmedPKD1
stopgain SNV exon17 c.7164C>G p.Tyr2388X Y ConfirmedPKD1
frameshiftdeletion
exon15 c.6808_6811del p.Asp2270HisfsX
43 Y ConfirmedPKD1
frameshiftdeletion
exon21 c.7973_7974del
p.Val2658GlyfsX2 Y ConfirmedPKD1
nonframeshiftdeletion
exon20
c.7837_7839del p.Leu2613del Y ConfirmedPKD1
stopgain SNV exon5
c.696T>A p.Cys232X Y ConfirmedPKD1
stopgain SNV exon3 c.350T>G p.Leu117X Y ConfirmedPKD1
55
stopgain SNVexon5
c.706C>T p.Gln236X Y ConfirmedPKD1
ConfirmedYp.Leu557AspfsX29c.1669_1670delexon
8frameshiftdeletionPKD1
ConfirmedYp.Tyr1599Xc.4797C>Aexon15stopgain SNVPKD1
ConfirmedYp.Trp1826Xc.5477G>Aexon15stopgain SNVPKD1
ConfirmedYp.Leu2776ArgfsX98c.8327_8330delexon
23frameshiftdeletionPKD1
ConfirmedYp.Asn2770delc.8308_8310delexon23
nonframeshiftdeletionPKD1
ConfirmedYp.Trp439Xc.1316G>Aexon6stopgain SNVPKD1
ConfirmedYp.Glu2335GlyfsX85c.7003dupGexon
16frameshiftinsertionPKD1
ConfirmedYp.Phe1075ArgfsX28c.3223_3226delexon
14frameshiftdeletionPKD1
ConfirmedYp.Val1841GlyfsX110c.5517_5521dupexon
15frameshiftinsertionPKD1
ConfirmedNp.Trp3603Xc.10809G>Aexon36stopgain SNVPKD1
ConfirmedYp.Gln1621Xc.4861C>Texon15stopgain SNVPKD1
ConfirmedNp.Gln183Xc.547C>Texon1stopgain SNVPKD2
ConfirmedNp.Lys714AsnfsX2c.2140delAexon
11frameshiftdeletionPKD2
ConfirmedYp.Gln1203Xc.3607C>Texon15stopgain SNVPKD1
ConfirmedNp.Gln183Xc.547C>Texon1stopgain SNVPKD2
ConfirmedYp.Trp2187Xc.6561G>Aexon15stopgain SNVPKD1
ConfirmedYp.Gln2606SerfsX14c.7816delexon
20frameshiftdeletionPKD1
ConfirmedNp.Val4034CysfsX4c.12100delexon
44frameshiftdeletionPKD1
ConfirmedNp.Phe3614LeufsX11
c.10839_10840del
exon37
frameshiftdeletionPKD1
ConfirmedNp.Ser3591Phec.10772C>Texon36
nonsynonymous SNVPKD1
ConfirmedYp.Ser689delc.2065_2067delexon10
nonframeshiftdeletionPKD1
ConfirmedYp.Ser689delc.2065_2067delexon10
nonframeshiftdeletionPKD1
ConfirmedYp.Ala81Valc.242C>Texon2
nonsynonymous SNVPKD1
56
ConfirmedYp.Val908Metc.2722G>Aexon11
nonsynonymous SNVPKD1
NotdetectedYp.Pro694Leuc.2081C>Texon
10nonsynonymous SNVPKD1
ConfirmedYp.Ser689delc.2065_2067delexon10
nonframeshiftdeletionPKD1
ConfirmedYp.Leu1652Proc.4955T>Cexon15
nonsynonymous SNVPKD1
ConfirmedYp.Leu3031delc.9092_9094delexon25
nonframeshiftdeletionPKD1
ConfirmedYp.Ala81Valc.242C>Texon2
nonsynonymous SNVPKD1
ConfirmedYp.Pro2087_Arg2088dupc.6258_6263dupexon
15nonframeshift
insertionPKD1
ConfirmedYp.Leu3031delc.9092_9094delexon25
nonframeshiftdeletionPKD1
ConfirmedYp.Cys3043Tyrc.9128G>Aexon25
nonsynonymous SNVPKD1
ConfirmedNp.Ile4105Leuc.12313A>Cexon45
nonsynonymous SNVPKD1
1. Long Range PCR information
Exon Forward primer Reverse primer Tm
('C)
Product
size(bp)
2-75'-CCCCGAGTAGCTGGAAC
TACAGTTACACACT-3'
5'-CGTCCTGCTGTGCCAGA
GGCG-3'70 4041
13-155'-TGGAGGGAGGGACGCC
AATC-3'
5'-GTCAACGTGGGCCTCCA
AGT-3'65 4391
15-215'-ATCCCTGGGGGTCCTAC
CATCTCTTA-3'
5'-ACACAGGACAGAACGG
CTGAGGCTA-3'68 5253
2. Sequencing primers information
Exon primer
name
Sequence Genomic
position(of5'firstbp;
hg19)
Distance to
exon-intron
junction
2-3PKD1_2-3-F 5'-GGGATGCTGGCAATGTGTGGGAT-3' 2,169,468 -89 exon 2
PKD1_2-3-R 5'-GGACCAACTGGGAGGGCAGAA-3' 2,169,038 77 exon 2
4 PKD1_4-F 5'-GGCGGTGCTGTCAGGGTG-3' 2,168,948 -102 exon 4
Supplementary Table S7. MLPA validation using Long Range PCR
sequencing
57
PKD1_4-R 5'-CCAGAGAGGCCTTCCTGAGC-3' 2,168,582 95 exon 4
5(A)PKD1_5A-F 5'-GAACAGCATGGGAGCCTGTGAGT-3' 2,168,561 -95 exon 5
PKD1_5A-R 5'-AGCCGGCCCAGCGGCATC-3' 2,168,033Within an
exon 5
5(B)PKD1_5B-F 5'-GCCTGTCCCTCTGCTCCG-3' 2,168,262
Within an
exon 5
PKD1_5B-R 5'-GTGTCAACGGTCAGTGTGGGC-3' 2,167,696 96 exon 5
6PKD1_6-F 5'-GTGTCTGCTGCCCACTCCC-3' 2,167,787 -124 exon 6
PKD1_6-RF 5'-CTCCTTCCTCCTGAGACTCCC-3' 2,167,397 93 exon 6
15(A)PKD1_15A-F 5'-TTCTGCCGAGCGGGTGGG-3' 2,161,938 -64 exon 15
PKD1_15A-R 5'-CATGTCGAAGGTCCACGTGATGT-3' 2,161,427Within an
exon 15
15(B)PKD1_15B-F 5'-GACATGAGCCTGGCCGTGG-3' 2,161,516
Within an
exon 15
PKD1_15B-R 5'-CCACCTCTGGCTCCACGCA-3' 2,161,027Within an
exon 15
15(C)PKD1_15C-F 5'-CACGCGGAGCGGCACGTT-3' 2,161,121
Within an
exon 15
PKD1_15C-R 5'-GGTGACCTCCGGACCCTC-3' 2,160,626Within an
exon 15
15(D)PKD1_15D-F 5'-TCTGCTGTGGGCCGTGGG-3' 2,160,706
Within an
exon 15
PKD1_15D-R 5'-CTGTACCGTGTGGTTGGTGGG-3' 2,160,215Within an
exon 15
15(E)PKD1_15E-F 5'-ACAGCATCTTCGTCTATGTCCTG-3' 2,160,303
Within an
exon 15
PKD1_15E-R 5'-GGTTCCCTGCCGTCATGGTG-3' 2,159,812Within an
exon 15
15(F)PKD1_15F-F 5'-GGGCTGAGCTGGGAGACCT-3' 2,159,899
Within an
exon 15
PKD1_15F-R 5'-GACAGCTGAGCCGGCAGC-3' 2,159,417Within an
exon 15
15(G)PKD1_15G-F 5'-CTGTGGGCCAGCAGCAAGGT-3' 2,159,494
Within an
exon 15
PKD1_15G-R 5'-CGTGCGGTTCTCACTGCCCA-3' 2,159,012Within an
exon 15
15(H)PKD1_15H-F 5'-GACGTCACCTACACGCCCG-3' 2,159,095
Within an
exon 15
PKD1_15H-R 5'-CCTCCCAGCGGTACTCAGTCT-3' 2,158,603Within an
exon 15
15(I)PKD1_15I-F 5'-GATGCGGCGATCACAGCGCA-3' 2,158,688
Within an
exon 15
PKD1_15I-R 5'-GGCCAGCCCTGGTGGCAA-3' 2,158,164Within an
exon 15
21PKD1_21-F 5'-AGTCGTGGGCATCTGCTGGC-3' 2,155,550 -75 exon 21
PKD1_21-R 5'-CAAGCTGCCCGTCTGCCCT-3' 2,155,240 83 exon 21
Reference. Ying-Cai Tan et al., J Mol Diagn. 2012 Jul;14(4):305-13. doi: 10.1016/ j.
jmoldx. 2012. 02. 007
58
Supplementary Figure S1
Of the entire sample of 524 families, variants were found by targeted exome
sequencing (TES) in 471 families (89.9%). Among them, mutations were found
in 409 families. In 62 families with likely neutral (LN) or indeterminate (I)
variants and 53 families with no variants (NV), subsequent PKD1 exon 1
Sanger sequencing (SS) and/or multiple ligation probe assay (MLPA) revealed
10 additional mutations. An overall mutation detection rate was 80.7% (423/524
families) by TES, PKD1 exon 1 Sanger sequencing, and MLPA.
59
Supplementary Figure S2
Mutations were found more frequently in the C-type lectin and G
protein-coupled receptor proteolytic site (GPS) domains of polycystin-1 in the
cohort compared to those reported from the ADPKD mutation database (PKDB).
Likewise, mutations were detected more frequently in the linker domain of
polycystin-2 in my patients compared to those reported from PKDB. The
standardized residual of the chi-square test (cut-off of significance, 2.58) was
2.8 in the C-type lectin of polycystin-1 and the linker domain of polycystin-2,
and 2.6 in the GPS domain of polycystin-1.
60
Supplementary Figure S3
A. As the Mayo imaging classification changed from 1A to 1E, the proportion
of PKD1-protein truncating (PKD1-PT) genotype increased. Conversely, as the
genotype changed from PKD2 to PKD1-PT, the proportion of Mayo imaging
classification 1C-E increased (P <0.001 by linear by linear test).
B. This figure shows the longitudinal distribution of Mayo class of individuals.
For the median follow up duration, the ICC of the intraclass correlation of
individuals was 0.84 [0.82, 0.86] showing that Mayo imaging classification of
each subject was mostly retained over time.
Abbreviations. PKD1-PT, PKD1 protein truncating; PKD1 ID, PKD1 small
in-frameshift indels; PKD1-NT, PKD1 non-truncating; NM, no mutations.
61
Supplementary Figure S4
The ESRD incidence was 21.7% (110/508) in PKD1 and 10% (12/119) in
PKD2. The histogram represented only those who reached ESRD. The PKD1
genotype showed an earlier onset of ESRD than PKD2 genotype (53.3[46.7, 58.7]
vs. 63.1[54.0, 67.5], P=0.003)
62
Supplementary Figure S5
A total of 866 subjects from 641 unrelated families registered in the
HOPE-PKD cohort between 2009 and 2016 were screened. Among the screened
patients, 117 patients were excluded from the analysis including those who had
active cancer (n=4) or chronic liver disease (n=22) and those who had no kidney
or liver volume data available (n=91). A total of 749 subjects from 524 unrelated
families were included in the analysis.
63
Supplementary Figure S6
Sequencing reads were first aligned with the human genome reference
sequence (hg19) using BWA version 0.7.5a with the MEM algorithm (default
options). To minimize false-positive and false-negative calls, I only used
uniquely mapped reads that were properly paired to avoid mapping reads that
aligned to both target regions and pseudogenes. In addition, only selected reads
with a high mapping quality (>30) were used. SAMTOOLS version 1.2 and
Picard version 1.127 (http://picard.sourceforge.net) were used to process
SAM/BAM files to duplicate the marking. Specifically, local realignment to
reduce any misalignment in the duplicated regions was performed.
I used RealignerTargetCreator and IndelRealigner from GATK version 3.3-1
with known single-nucleotide polymorphisms (SNPs) and indels from dbSNP142,
Mills and 1000G gold-standard indels at hg19 sites, and 1000G phase 1 indels at
hg19 sites. Known SNPs and indels were also used to perform a base
64
calibration. For calling variants, Unified Genotyper was used, and the called
variants were recalibrated by GATK based on dbSNP142, Mills indels, HapMap,
and Omni. ANNOVAR was used to annotate the variants. The depth of
coverage approach was applied to detect any large rearrangement or copy
number variations. The CalculateHsMetrics module in PICARD was used to
calculate the statistics of each parameter of the samples and depth of each exon,
and the copy number ratio was characterized by the normalized depth of the
exons from all the patients
Supplementary Figure S7
The detected variants were classified into seven categories of nonsense,
frameshift, typical splicing, large deletion, missense, small in-frame indel and no
variant. Seven classes of variants were further divided into five grades
according to their pathogenicity: definitively pathogenic (DP), highly likely
pathogenic (HLP), likely pathogenic (LP), likely neutral (LN), and indeterminate
65
(I). The DP variants were defined as all the variants that result in protein
truncation, including nonsense, frameshift, typical splicing, and large deletion
variants or those previously established as DP in the ADPKD mutation database
(PKDB) or previous articles. The HLP variants were defined as missense
variants or small in-frame indels that were previously established as HLP in
PKDB or previous articles. The LP variants were defined as missense variants
or small in-frame indels that were previously established as LP in PKDB or
previous articles or variants that were predicted to be damaging using in-silico
analysis and segregated within a family. Variants were thought to be damaging
if the results of in-silico analysis satisfied two of the following conditions: 1)
damaging as predicted by a SIFT score ≤0.05, 2) damaging as predicted by
Polyphen-2 (HumDiv), and 3) GERP ++ score ≥4.
Supplementary Figure S8 Inverse relationship of GC content and the depth
of coverage for PKD1. (Solid line: GC contents (%), Dotted line: depth of
coverage)
The problem of NGS was low read depth or coverage of exon 1 of PKD1
gene. This resulted from high GC contents of exon 1 compared to other exons.
66
Supplementary Figure S9. Normalized read depth in PKD1 and PKD2 of
mutations that detected via MLPA
67
2nd part: Liver
INTRODUCTION
Autosomal dominant polycystic kidney disease (ADPKD) is a highly prevalent
and potentially lethal monogenic disorder characterized by numerous kidney
cysts and progressive renal failure. Several studies have examined the natural
course of the kidney manifestations according to genotype. The Halt Progression
of Polycystic Kidney Disease (HALT-PKD)1 and the Toronto Genetic
Epidemiology Study of PKD (TGESP), as well as studies conducted in Italy2
and China3, 4, showed remarkable correlations between the genotype and the
kidney phenotype. In addition to the causative genes, especially PKD1 and
PKD2, locus heterogeneity, allelic heterogeneity, and hypomorphic mutations are
also known to affect the prognosis of the kidney in ADPKD5-7. Furthermore, I
have found a significant correlation between renal progression and genotype,
according to PKD1 protein-truncating (PT) mutations, PKD1 non-truncating
(NT) mutations, and No mutations (NM) in part I.
Meanwhile, ADPKD is frequently accompanied by many extra-renal
manifestations, among which the most common manifestation is polycystic liver
disease (PLD)8. PLD in ADPKD can cause various symptoms, including early
satiety or abdominal discomfort, that lower the quality of life and serious
complications such as infection and malnutrition9. D'Agnolo et al. recently
reported that combined kidney and liver volume was associated with the
presence and severity of pain and gastrointestinal symptoms in ADPKD, with a
more prominent role for height-adjusted total liver volume (htTLV) than for
height-adjusted total kidney volume (htTKV)10. Moreover, I reported that
abdominal symptoms and hepatic complications increased as PLD progressed in
patients with ADPKD8.
68
The increased life expectancy that has been achieved through advancements in
ADPKD management has paradoxically brought to light hepatic manifestations
that would not have had a chance to surface11; therefore, it is expected that
more patients will suffer from hepatic manifestations, and we have no excuse to
procrastinate research into the hepatic manifestations of ADPKD.
Unlike kidney manifestations, few studies have investigated the natural course
and risk factors for PLD progression in ADPKD. Some previous studies showed
that age, female sex8, pregnancy, and the use of female sex hormones12 were
potential risk factors associated with PLD progression in ADPKD. Only a few
studies have investigated genotype-phenotype associations of PLD in ADPKD;
for instance, Zoghby et al. failed to show a significant association between the
genotype and the severity or growth rate of PLD in ADPKD patients13.
Therefore, the impact of genotype on PLD progression has not been thoroughly
investigated.
This study examined non-genetic (age, female sex, pregnancy, and the use of
female sex hormones) and genetic (family-level clustering, type, and position of
the PKD1/2 mutation) risk factors of PLD in ADPKD in a prospective cohort,
the coHOrt for genotype-PhenotypE correlation in ADPKD (HOPE-PKD), with
749 Korean patients.
RESULTS
Non-genetic factors influencing PLD severity
The htTLV of 749 subjects (360 male and 389 female) ranged from 336 mL to
7,885 mL. Subjects were divided into 3 groups (mild, moderate, and severe)
according to htTLV (Figure S1). Severe PLD was found in 64 cases (8.5%),
and moderate PLD in 55 cases (7.4%) (Table 1).
69
In comparison with no or mild PLD, moderate and severe PLD had more
females (% of females; none or mild vs. moderate vs. severe, 47.5% vs. 63.6%
vs. 85.9%, P for trend < 0.001) and was older (45.2 vs. 53.0 vs. 52.3 years, P
for trend < 0.001). Patients with moderate to severe PLD were more likely to be
hypertensive (73.9% vs. 84.6% vs. 85.2%, P for trend = 0.012).
Most of the interventions, such as hepatic artery embolization, percutaneous
drainage, partial hepatectomy, or liver transplantation, were performed in the
severe PLD group (0 (0%) vs. 2 (3.6%) vs. 18 (28.6%), P for trend < 0.001).
As described before8, the median htTKV was higher in patients with moderate
to severe PLD (678 vs. 1,024 vs. 1,102 mL/m, P for trend <0.001). The mean
eGFR was lower (69.2 vs. 49.4 vs. 46.9 mL/min/1.73m2, P for trend <0.001), and
the prevalence of end-stage renal disease(ESRD) was higher in patients with
moderate to severe PLD (15.4% vs. 30.9% vs. 35.9%, P for trend <0.001). The
proportion of grades according to the Mayo classification14 was different among
the 3 groups (P for trend <0.001) (Table 1).
In a fraility model, the hazard ratio (HR) for ESRD in the severe PLD group
was significantly higher than in the patients with no or mild PLD (HR [95%
CI]: mild vs. moderate vs. severe; reference vs. 1.684 [0.931, 3.046] vs. 2.012
[1.164, 3.477], Figure 1A). Addtionally, the familial clustering effect according to
3 PLD group was significant (P value for familial influence = 0.024). Even
though incidence of death and age of death failed to show trend according to 3
groups (Table 1), the HR for all-cause mortality with Cox proportional hazard
model became higher as the severity of PLD increased (HR [95% CI]: reference
vs. 3.155 [1.038, 9.586] vs. 6.723 [2.418, 18.697], Figure 1B). In summary, severe
PLD was associated with advanced renal disease and PLD was significantly
associated with mortality.
70
Among females, htTLV ranged from 458 to 7,885 mL/m. The mean age was
older in the moderate to severe PLD group (46.5 vs. 54.9 vs. 52.3 years, P for
trend <0.001, Table 1)
Pregnancy-related parameters were surveyed in 216 female participants, from
whom information was gathered regarding previous full-term or preterm
deliveries, abortions, and the use of female sex hormones. The number of
pregnancies was found to be significantly higher in patients with moderate to
severe PLD (Linear by linear test, P=0.006, Fig. 2A). The prevalence of sex
hormone use was associated with the severity of PLD (Linear by linear test,
P<0.001, Fig. 2B).
Familial clustering effect
A total of 138 families with more than 2 family members were included in the
analysis. The number of family members ranged from 2 to 7. Among these
families, 49 had 1 or more members with moderate to severe PLD. There were
5 concordant moderate to severe PLD families that had 2 or more members
(3.6%, 5/138), with 10.2% (5/49) of families having 1 or more member with
moderate to severe PLD. Among 5 concordant families, 2 were concordant for
severe PLD.
Since families contained multiple generations, only siblings with an age
difference of less than 5 years were included in the additional analysis.
Concordant moderate to severe PLD was found in approximately 7.7% (5/66) of
all siblings and approximately 27.8% (5/18) in families with 1 or more individual
with moderate to severe PLD. In other words, concordant families were rarely
observed. It was confirmed that the plotted htTLV of the 65 siblings was
71
remarkably different from that of the siblings in families with moderate to
severe PLD. In Fig. 3A, the concordant siblings appeared to be rarely observed.
However, when the siblings were plotted after the subject with the highest
htTLV was removed from each family, it was confirmed that the subjects with
low htTLV values showed the same pattern as that in Fig. 1A (Fig. 3B). I
divided each family into three groups (mild, moderate, and severe) based on the
member with the largest htTLV among the siblings. Then, I observed the trend
of htTLV for the members, except for those with the largest htTLV. Even
though the median age and sex distribution were not statistically different, the
prevalence of females was higher and htTLV increased with the severity of
PLD (P for trend <0.001, Table in Fig. 3B). A similar trend was shown in the
analysis excluding women aged 45 years or older whose liver is clearly large (P
for trend=0.003, Table in Fig. 3D).
In summary, families that have siblings with higher htTLV values also have
other siblings with relatively higher htTLV values. Through this data, it was
suggested that htTLV showed a family-level clustering effect anticipating that
the effect would be from the genetic power. The next step was to identify the
genotype-phenotype correlation of PLD.
The impacts of the PKD1 and PKD2 genes on PLD severity
Distribution of PKD genotype according to PLD severity
A total of 630 patients in 423 families after excluding NM were re-classified
into 4 genotype groups: PKD1-PT, 371 (59%); PKD1-ID, 21 (3%); PKD 1-NT,
119 (19%); PKD 2, 119 (19%). PKD1-PT larger htTKV (898 [476, 1546], mL/m)
but htTLV and intervention did not correlated with genotype. Genotype was
devided into 5 (PKD1 PT, PKD1 ID, PKD1 NT, PKD2, and NM) according to
72
Part 1. I analyzed the difference of htTLV according to 5 genotypes (Figure S2,
P value =0.077 by Linear by linear test). However, the prevalence of NM
mutations decreased in the moderate and severe PLD groups (P>0.001, by
Linear by linear test).
To analyze the family-level clustering effect of htTLV, a multilevel mixed
effect linear regression model was constructed with cross sectional htTLV,
adjusting for age, sex, and the family-level clustering effect. When log(htTLV)
was plotted according to age, the PKD1-PT genotype seemed to larger
log(htTLV) comparing to other genotypes, but there was no signficance.
However, the NM genotypes showed smaller log(htTLV) values compared to
PKD1 PT genotype (Figure S3 and Table Left). About the slope (Figure S3
and Table right) the PKD1-PT genotypes showed a steeper slope than the
NM group. In summary, there was no statistical significance ecept for PKD1
PT vs. NM group.
htTLV had a significant familial clustering effect, with an intra-class
correlation (ICC) of 0.15. However, the effect size was small when compared to
that of htTKV (family-level clustering effect, P<0.001; ICC, 0.43).
To summarize, the severity of PLD was associated with age, female sex, a
history of pregnancies, and the use of female sex hormone. Both htTKV and
htTLV also showed a family-level clustering effect. However, we could not
found the PKD1/2 genotype-phenotype correlation in log(htTLV).
Subgroup analysis: comparison of the no liver cyst and liver cyst groups
Although kidney involvement in ADPKD is generally known to have a
penetrance of 100%, liver cysts did not develop in some cases. Therefore,
additional analyses of the characteristics of patients with and without liver cysts
73
according to the class of the PKD gene were conducted. In a previous study,
liver cysts appeared in subjects aged 40 or older in approximately 90% of
subjects. Only subjects older than 40 years were included in the analysis, and
subjects over 40 years with <4 liver cysts were classified as the no liver cyst
group. Subjects over age 40 years with ≥4 liver cysts were defined as the liver
cyst group for analysis. The no liver cyst group consisted of 54 subjects
(10.5%), the liver cyst group consisted of 458 individuals (89.5%), and the
proportion of males was greater in the no liver cyst group (n=32, 59.3%). The
median htTLV and htTKV values were higher in the liver cyst group, and
eGFR tended to be higher in the liver cyst group, but without statistical
significance (61.1 vs. 51.8, mL/min/1.73 m2,P=0.067). In addition, the prevalence
of PRT was higher in the liver cyst group, and milder phenotype was found to
be more common in the no liver cyst group. In general, the renal phenotype
tended to be milder in the no liver cyst group without clinical significance
(Table S1). As for the prevalence of PKD classes in the 2 groups, NM was
more common in the no liver cyst group (Fig. S4).
Subgroup analysis: description of concordant moderate to severe PLD
families
There were 5 families concordant for moderate to severe PLD (A-E) in the
present study. All but 1 (D) family had significantly damaging variants. Because
a PKD1 missense variant of I was found in the family with PKD2, this variant
was likely to be pathogenic, but this possibility needs to be confirmed in other
families (Table S2).
Subgroup Analysis: Description of Moderate to Severe PLD Males
74
There were 29 male subjects with moderate to severe PLD. Although the
sample size was not large, the proportion of PKD1 was slightly larger compared
to the genotype distribution of 749 persons (PKD1 of total subjects, 68% vs.
PKD1 of moderate to severe PLD males, 76% [Figure S6]). In other words, the
possibility that the frequency of PKD1 is high in males with a large liver can
be considered (Table S3).
Location (domain) of mutations in moderate to severe PLD families
Fig. 4 shows the PKD1 gene locations of subjects in Mayo DB and the
study (SNUH) by domain. As each domain size is different, the prevalence of
each domain was analyzed by size and PT versus NT. Compared to the Mayo
DB data, the prevalence of C-type lectin in moderate to severe PLD was
observed to be higher in both PT and NT mutations; thus, it was suggested
that the prevalence of C-type lectin was significantly higher in families with
PLD individuals when prevalence was analyzed by dividing the subjects in my
study based on an htTLV of 1600 mL/m (multivariate-adjusted with age, sex
and family-level clustering effect, P=0.001). Therefore, C-type lectin can be
suggested as a warm spot for moderate to severe PLD (Fig. 4).
DISCUSSION
In summary, 1) the correlations of female sex, age, renal progression, multiple
pregnancies, and the use of female sex hormones and PLD were confirmed.
Moreover, the moderate to severe PLD participants had poorer renal outcomes
and higher mortality. 2) It is certain that a family-level clustering effect of PLD
existed, although the effect size was not notable compared to that observed for
the kidney. 3) The proportion of PKD1 PT mutations was not related to log
75
(htTLV). 4) In patients with no liver cysts, milder kidney disease was present.
5) The C-type lectin domain was a possible warm spot for moderate to severe
PLD in Korean ADPKD patients.
So far, HALT-PKD remains the only major study focused on the hepatic
manifestations of ADPKD, and revealed that the hepatic manifestations had no
correlation with PKD1/2. However, as a previous Korean study of the hepatic
presentation of ADPKD found that the liver prognosis tended to mirror the
kidney prognosis 8, the present study started with the hypothesis that the liver
would also be influenced by the PKD1/2 genes. Finally, the present study
proved for the first time, through a multilevel analysis, that htTLV showed a
family-level clustering effect. However, the influence of pregnancy and female
sex hormone use was stronger than the genetic effect in women.
In a subgroup analysis, subjects with no liver cysts and their genotypes were
investigated. In the current study, those with no liver cysts had milder renal
presentations. This is a valuable finding in that it is brand-new information that
can be used as an indicator for predicting the prognosis of ADPKD. Moreover,
In general, it is known that there is no hot spot for ADPKD; however, I found
a warm spot. The C-type lectin domain in the PKD1 gene was related to the
prevalence of moderate to severe PLD. C-type lectin is involved in cell-matrix
stability and/or cell-matrix signaling; during kidney development, it binds
carbohydrates in a calcium-dependent manner, and interacts with extracellular
matrix proteins. In the liver, it is associated with remodeling of the extracellular
matrix surrounding the cysts.15, 16 However, for this aspect of the study to yield
more concrete outcomes, a larger number of moderate to severe PLD patients
are required, and in vitro experiments need to be conducted in the future to
verify the results.
76
I have attempted genetic analysis with the NGS technique as the main
pipeline. To date, NGS has been considered to have limitations for use as a gold
standard in detecting PKD gene mutations. However, as shown in the recent
China study 17, NGS very accurately finds a variant and can process a large
amount of data at low cost and high speed, which are benefits. To overcome
the shortcomings of NGS, MLPA and Sanger techniques were used in this
study. NGS is expected to be more useful in clinical practice than the Sanger
technique, and the genetic diagnosis of polycystic kidney disease will be
promoted through NGS.
For now, making a genetic diagnosis of ADPKD is challenging. The gold
standard, Sanger sequencing, also shows approximately a 10% non-detection
rate, and 9.3% non-detection rate was shown in my study based on NGS.
Moreover, even when LN, I, and NC mutations were defined as NM
approximately 80% yield was shown. Therefore, the clinical circumstances of the
roughly 20% of the group without a mutation should be investigated as well. In
this study, NM was analyzed as a group, and although there were some
variations, the clinical course of participants with NM was mild compared to
that of patients with other genotypes, indicating that a genetic diagnosis can be
properly made with NGS. Based on this, I suggest that the results presented in
my study can be used to help inform NM patients about their prognosis.
In the current cohort, I followed ESRD progression and mortality. The
moderate to severe PLD group had more subjects with ESRD progression and
mortality. Attention must be focused on the result that the severe PLD group
had a 6.1 times higher mortality rate than the mild PLD group, and efforts to
identify the risk factors and to reduce the morbidity and mortality of severe
PLD patients are clearly needed.
77
In this study, the liver phenotype correlated well with the kidney phenotype.
The familial clustering effect in the liver phenotype was also proven in the
sibling analysis and frailty model. However, the fact that there is no
genotype-phenotype correlation in PLD is significantly complex. It is also
regrettable to conclude that there is no association between the genotype and
phenotype in the liver.
In this study, the PKD1 / 2 locus heterogeneity and allelic heterogeneity in
PLD were not clarified. We can think of the following reasons.
First, this study did not investigate all known genes that could cause PLD.
Moreover, we should also consider the effects of modifier genes and epigenetics.
Previous studies have suggested that the common phenotype and various
mutations of ADPKD and ADPLD provide more complex polycystic disease 18.
PRKCSH, SEC63, LRP5, ALG8, and SEC61B, known as ADPLD genes, modulate
polycystin 1 function 19. Additionally, the possibility that the GANAB gene
affects the ADPKD and ADPKD may also be considered 20. Further research is
needed on the possibility that unknown genes including above mentioned genes
may act as modifier.
Second, it is possible that the hormonal effect has been a factor disturbing the
genetic effect of PKD1 / 2. In previous retrospective studies, pregnancy did not
affect the severity of PLD, but there was a report that PLD became worse
when estrogen was administered 21. As shown in the result, the possibility that
women, pregnancy, and sex hormonal effect affected the liver phenotype and the
possibility that genotype correlation is covered by these factors should be
considered. In addition, we need additional studies on other enviromenal factors
besides hormal effects.
78
Third, the moderate to severe PLD scarcity. Unlike in the kidneys, moderate
to severe PLD is significantly few in number; therefore, this may have resulted
in the dilution of statistical significance. It is deemed necessary to investigate
the risk factors with more populations in order to confirm the
genotype-phenotype correlation in the liver. In addition to the PKD1 and PKD2
genes, it would also be substantial to study the 3rd gene effect.
A major strength of the current study is that it is the first to elucidate the
correlation between PLD and the PKD1/2 genes in ADPKD, which is supported
by the existence of a family-level clustering effect. Additionally, the cohort in
this study was fairly well designed and had a substantial number of subjects.
Considering that most of these subjects are now receiving prospective follow-up,
it is expected that further, more concrete analyses of this cohort will be possible
in the future.
There were some limitations in the current study. It was difficult to correctly
identify patients’ family history because all family members were not registered
at the same hospital. As only patients with an objective computed tomography
(CT)-based index measurement were enrolled, it was harder to investigate the
family-level clustering effect. Based on their history of moderate to severe PLD,
about 25% of families might have family-level clustering in moderate to severe
PLD (data not shown). As for the kidney and liver size, I used cross-sectional
data, therefore, a longitudinal study is needed to investigate serial htTKV or
htTLV changes.
In conclusion, the results strongly suggest that there is family-level clustering
in htTLV severity, although the magnitude is not overwhelming. I conducted a
thorough analysis despite having limited data, and verified the effect of the
PKD1/2 genes on PLD. However, it might be possible that the information was
not comprehensive enough, and that other variables could not be controlled for.
79
In the future, further research on PLD must be carried out, sufficient data must
be gathered, and corresponding follow-up periods must be implemented to obtain
more sophisticated and detailed correlations. Consequently, it will be possible to
predict the prognosis of PLD in ADPKD more clearly through the examination
of family members’ medical history, an establishment of a national policy to
allow cooperation between hospitals, and international cooperation.
METHODS
Patients
A total of 866 subjects from 641 unrelated families registered in the
HOPE-PKD cohort between 2009 and 2016 were screened. Subjects aged over
18 who have compatible imaging findings with or without family history were
enrolled in HOPE-PKD cohort. Among the screened patients, 117 patients were
excluded from the analysis including those who have active cancer (n=4) or
chronic liver disease (n=22) and those who have no kidney or liver volume data
available (n=91, Figure S5). A total of 749 subjects from 524 unrelated families
were included in the analysis.
Collection of family history and clinical information of subjects
Baseline demographic profiles were collected, including variables such as age,
sex, height, and the presence or absence of hypertension, diabetes, chronic liver
disease, or cancer, familial history of PLD and ESRD. Hypertension was defined
as a blood pressure >140/90 mm Hg or a history of treatment with
antihypertensive medications. Chronic liver disease was defined as chronic
hepatitis (toxic, alcoholic, viral, or autoimmune) or liver cirrhosis. PLD was
80
defined as 4 or more liver cysts. Interventions were defined as transarterial
embolization, percutaneous drainage, partial hepatectomy, or liver transplantation.
Their median follow up duration was 77.3 months.
To measure liver and kidney volume, participants underwent CT exams that
were performed biennially in all patients using multi-detector CT scanners 22.
Volumetry was done manually by a single skilled technician based on a review
of each CT scan. Total liver volume was measured by 3-dimensional stereology
(Vitrea, Vital Images, Inc., Minnetonka, USA). Total kidney volume was
measured by the ellipsoid method 8. A concordant PLD family was defined by
the presence of at least 2 or more individuals with moderate to severe PLD.
Serum creatinine was measured using the Jaffe method, and traced using
isotope dilution mass spectrometry. The Chronic Kidney Disease Epidemiology
Collaboration Formula was used to calculate the estimated glomerular filtration
rate (eGFR) 23. ESRD was defined as either estimating GFR < 15mL/min/1.73m2
or initiation of renal replacement therapy.
Classification of PLD and definition of concordant PLD families
PLD groups were classified according to htTLV as no or mild
(height-adjusted total liver volume [htTLV] ≥ 1600 mL/m), moderate (1600
mL/m < htTLV ≤ 2400 mL/m), or severe (htTLV ≤ 2400 mL/m). The cutoff
value of htTLV ≥ 1600 mL/m for moderate to severe PLD was used based on
the probability of developing 2 or more abdominal symptoms and (hepatic)
complications 8.
in the scatter plot of the distribution of htTLV values ≥ 1600 mL/m by age
and sex (Fig. S2), age was found to affect the total liver volume, so I
attempted to adjust the findings regarding PLD according to age. Because
81
htTLV distribution differs by sex, I analyzed correlations between the PKD
genotypes and htTLV by dividing the patients according to sex. An htTLV
more than 2,400 mL/m was defined as severe PLD based on the definition
presented by Hogan et al. 24 of a total liver volume of 4,000 mL/m or more as
severe PLD; this corresponds to an htTLV of approximately 2,400 mL/m, or
about 3 times the normal liver size.
PKD1 and PKD2 genes screening by NGS, Exon1-Specific Sanger
sequencing and MLPA
PKD genotyping using next-generation screening (NGS), Exon1-Specific
Sanger sequencing and MLPA was performed using the method described in
Part 1.
Statistical analysis
Statistical analyses were performed using SPSS version 23.0 (IBM Corp.,
Armonk, NY, USA). I performed the ANCOVA test, linear-by-linear test, the
Jonckheere-Terpstra test, or the chi-square test to compare variables among the
3 PLD groups. P values <0.05 were considered to indicate statistical significance.
ML-win (University of Bristol, Bristol, UK) was used for comparisons between
htTLV groups with the family-level clustering effect taken into consideration.
Multilevel multiple logistic regression and a multilevel linear regression model
using Stata 14 (StataCorp., Texas, USA) were used to identify the correlations
between htTLV and gene class. In order to control for the clustering effect,
variance estimation was also done by the clustered sandwich estimator.
82
FIGURE LEGENDS and TABLES
Figure 1. A. ESRD progression according to PLD severity B. All cause
mortality according to PLD severity
In a fraility model, the hazard ratio (HR) for ESRD in the severe PLD group
was significantly higher than in the patients with no or mild PLD. Addtionally,
the familial clustering effect according to 3 PLD group was significant (P value
for familial influence = 0.024). The HR for all-cause mortality with Cox
proportional hazard model became higher as the severity of PLD increased.
Figure 2. Impact of female sex hormone on severity of PLD
The number pregnancies was found to be significantly higher in patients with
moderate to severe PLD (P=0.006, Fig. 2A). The prevalence of sex hormone use
was associated with the severity of PLD (P<0.001, Fig. 2B).
Figure 3. A. Distrubution of htTLV among siblings B. htTLV of siblings
excluding the subjects having highest htTLV C. Distrubution of htTLV
among siblings excluding famales over 45 years D. htTLV of siblings
excluding the subjects of highest htTLV and famales over 45 years
When the htTLV of 65 siblings was plotted, they were confirmed to be
noticeably different among siblings in the families that had moderate to severe
PLD. Therefore, it appeared that concordant siblings were rarely observed (Fig.
3A). However, when the siblings were plotted after the subject with the highest
htTLV was removed from each family, it was confirmed that the subjects with
low htTLV values showed the same pattern as found in Fig. 1A (Fig. 3B).
When the individuals with the highest htTLV in each family were divided into 3
groups, even though the median age and sex distribution were not statistically
different, htTLV increased with the severity of PLD (Table in Fig. 3B). A
similar trend was shown in the analysis excluding women aged 45 years or
83
older whose liver is clearly large (Table in Fig. 3C and D). Subjects with *
have multiple siblings and represent overlapping subjects.
Figure 4. A. Location of mutation of PKD1 in moderate to severe PLD B.
Location of mutation of PKD1 in moderate to severe PLD
Fig. 4 shows the PKD1 gene locations of subjects in Mayo DB and the study
(SNUH) by domain. As each domain size is different, the prevalence of each
domain was analyzed by size and PT versus NT. Compared to the Mayo DB
data, the prevalence of C-type lectin in moderate to severe PLD was observed
to be higher in both PT and NT mutations; thus, it was confirmed that the
prevalence of C-type lectin was significantly higher in families with PLD
individuals when prevalence was analyzed by dividing the subjects in my study
based on an htTLV of 1,600 mL/m (multivariate-adjusted with age, sex and
family-level clustering effect, P=0.001). Therefore, C-type lectin can be
suggested as a warm spot for moderate to severe PLD.
84
Table 1. Frequency of each class and grade in 524 families
Variables No or mild Moderate Severe P value
N (%) 630 (84.1) 55 (7.4) 64 (8.5)
Female, n (%) 299 (47.5) 35 (63.6) 55 (85.9) <0.001
Age, [mean±SD] 45.2±13.7 53.0±9.6 52.3±9.1 <0.001
1st visit age, 587,[mean±SD]
41.0±13.2 47.6±10.0 48.5±8.7 <0.001
Hypertension, 703, n (%) 436 (73.9) 44 (84.6) 52 (85.2) 0.012
Hypertension Dx age,372, [median [IQR]]
38.3[32.5,46.3]
41.4[33.7, 48.1]
43.5[39.8, 45.3]
0.011
Intervention, 747, n (%) 0 (0) 2 (3.6) 18 (28.6) <0.001
htTLV, [median [IQR]]882[749, 1048]
1881[1774, 2116]
3186[2775, 3949]
<0.001
htTKV, [median [IQR]]678[396, 1228]
1024[456, 1927]
1102[753, 1578]
<0.001
eGFR by CKD-EPI, 738,[mean±SD]
69.2±38.4 49.4±35.2 46.9±37.9 <0.001
RRT, n (%) 97 (15.4) 17 (30.9) 23 (35.9) <0.001
RRT age, [median [IQR]]53.0[46.2, 59.6]
55.1[52.1, 60.5]
53.3[51.1, 63.9]
0.140
death, n (%) 12 (1.9) 5 (9.1) 8 (12.5) 1.000
death age, [median [IQR]]65.0[54.9, 71.6]
58.9[53.6, 75.4]
64.3[60.5, 72.4]
0.721
*Grade,[n(%)]
1A 61 (9.7) 5 (9.1) 5 (7.8)
<0.001
1B 173 (27.5) 15 (27.3) 10 (15.6)
1C 193 (30.6) 20 (36.4) 25 (39.1)
1D 137 (21.7) 13 (23.6) 17 (26.6)
1E 66 (10.5) 2 (3.6) 7 (10.9)
no or mild (htTLV < 1600 mL/m), moderate (1600 mL/m < htTLV ≤ 2400
mL/m), or severe (htTLV ≤ 2400 mL/m) Linear by linear test,
Jonckheere-Terpstra test, or chi-square test as appropriate; interventions: hepatic
85
artery embolization, percutaneous drainage, hepatectomy, or liver transplantation;
eGFR of subjects with RRT or kidney transplantation was regarded as 15
mL/min/1.73 m2.
Abbreviations: SD, standard deviation; Dx, diagnosis; IQR, interquartile range;
htTLV, height-adjusted total liver volume; htTKV, height-adjusted total kidney
volume; eGFR, estimated glomerular flow rate; CKD-EPI, The Chronic Kidney
Disease Epidemiology Collaboration; RRT, renal replacement therapy.
86
Figure 1. A. ESRD progression according to PLD severity B. All cause
mortality according to PLD severity
Figure 2. Impact of female sex hormone on severity of PLD
87
Figure 3. A. Distrubution of htTLV among siblings B. htTLV of siblings excluding the subjects having highest
htTLV C. Distrubution of htTLV among siblings excluding famales over 45 years D. htTLV of siblings excluding the
subjects of highest htTLV and famales over 45 years
88
89
Figure 4. A. Location of mutation of PKD1 in moderate to severe
PLD B. Location of mutation of PKD1 in moderate to severe PLD
90
SUPPLEMENTARY INFORMATION
Figure S1. Distribution of htTLV according to age and sex
91
Figure S2. Distribution patterns of PKD genotypes according to
severity
92
Figure S3. A. htTLV according to sex and class of PKD gene B.
htTLV according to sex and class of PKD gene
93
Table S1. Distribution of PKD genotype according to liver cysts
VariablesNo liver cyst,
>40 yr
Liver cyst,
>40 yrP value
N (%) 54 (10.5) 458 (89.5)
Male, 512, n (%) 32 (59.3) 197 (43.0) 0.023
Age, 512, [mean±SD] 54.2 ± 10.5 53.4 ± 8.7 0.876
1st visit age,391,
[mean±SD]47.7 ± 11.1 48.9 ± 9.7 0.566
Hypertension,
478, n (%)41 (75.9) 363 (79.3) 0.389
Hypertension Dx age,
372, [median [IQR]]44.0 [35.4, 51.7] 42.4 [37.4, 48.0] 0.467
htTLV, 512,
[median [IQR]]806 [697, 915] 1030 [827, 1530] <0.001
htTKV, 510,
[median [IQR]]613 [338, 1000] 976 [547, 1611] <0.001
eGFR by EPI, 508,
[mean±SD]61.1 ± 34.9 51.8 ± 35.1 0.067
ESRD, 512, n (%) 7 (13.0) 120 (26.2) 0.031
ESRD age, 512,
[median IQR]]51.9 [45.5, 64.0] 53.7 [48.4, 60.4] 0.690
Grade, 512,
[n(%)]
1A 14 (25.9) 41 (9.0)
<0.001
1B 19 (35.2) 118 (25.8)
1C 14 (25.9) 156 (34.1)
1D 5 (9.3) 110 (24.0)
1E 2 (3.7) 33 (7.2)
94
Chi-square test; eGFR of subjects with RRT or kidney
transplantation was regarded as 15 mL/min/1.73 m2.
Abbreviations: SD, standard deviation; Dx, diagnosis; IQR,
interquartile range; htTLV, height-adjusted total liver volume; htTKV,
height-adjusted total kidney volume; eGFR, estimated glomerular flow
rate; CKD-EPI, The Chronic Kidney Disease Epidemiology
Collaboration; RRT, renal replacement therapy.
Figure S4. Distribution patterns of PKD genotypes according to
no liver cyst and liver cyst groups
95
ID Genotype Function AA Domain
A PKD1 PTFrameshift
insertionp.(Ala3888Cysfs*72) Cytoplasmic
B PKD1 PTLarge
deletionp.(Gly3792_Ser3797del) Extracellular
C PKD1 PT Stopgain p.(Ser523*) C-type lectin
D PKD1+2Missense(I)
+ Large deletion
p.(Gly1275Val)
+ Intron2-exon5PKD
E PKD1 ID Inframe indel p.(Ser689del) Extracellular
Age htTLV Genotype ExonicFunc AA Domain
30.1 2391 PKD2 Stopgain p.(Arg361*) ExtracellularSequence
31.9 1657 PKD2Missense(I)+largedeletionlargedeletion
p.(Gly1275Val)+ Intron2-exon5
CytoplasmicSequence
57.0 2084 PKD2 Stopgain p.(Arg803*) Linker
43.5 1950 PKD1 PT frameshiftdeletion p.(Gln1174Alafs*36) PKD
46.1 1668 PKD1 PT Stopgain p.(Gln1621*) PKD
46.1 2850 PKD1 PT frameshiftinsertion
p.(Pro2674Alafs*148) REJ
47.9 2831 PKD1 PT frameshiftinsertion p.(Thr2582Glnfs*39) REJ
49.4 1615 PKD1 PT frameshiftinsertion p.(Ala3888Cysfs*72) Cytoplasmic
Sequence
Table S2. Mutations of moderate to severe PLD families
Table S3. Mutations of moderate to severe PLD males
96
51.0 1873 PKD1 PT Stopgain p.(Gln1483*) PKD
51.2 1631 PKD1 PTframeshiftdeletion p.(Val4034Cysfs*4) transmem
brane
53.0 2401 PKD1 PT frameshiftdeletion p.(Gly3792_Ser3797del) Extracellular
Sequence
53.1 1664 PKD1 PT frameshiftdeletion p.(Arg4202Profs*146) Cytoplasmic
Sequence
54.8 1743 PKD1 PTframeshiftinsertion p.(Glu2335Glyfs*85) REJ
56.0 1970 PKD1 PT frameshiftinsertion p.(Leu1551Alafs*27) PKD
58.0 1975 PKD1 PT frameshiftdeletion p.(Val4034Cysfs*4) transmem
brane
58.1 2460 PKD1 PT frameshiftdeletion p.(Arg1672Glyfs*98) PKD
58.8 2249 PKD1 PT frameshiftinsertion p.(Gly668Trpfs*46)
LDL-receptorclass A
61.7 3199 PKD1 PT Stopgain p.(Lys3619*) CytoplasmicSequence
63.4 3142 PKD1 PT frameshiftdeletion p.(Leu4073Valfs*82) Extracellular
Sequence
40.9 2464 PKD1 NT Missense p.(Cys3043Tyr) GPS
43.7 2040 PKD1 NT Missense p.(Ala2783Pro) REJ
46.7 3156 PKD1 NT Missense p.(Leu3673Pro) ExtracellularSequence
52.2 2154 PKD1 NT Missense p.(Leu464Pro)C - t y p electin
57.3 1833 PKD1 NT Missense p.(Thr2445Pro) REJ
39.2 1778 PKD1 ID small indel p.(Ser689del) ExtracellularSequence
38.8 2352 NM
52.1 2768 NM
57.6 1781 NM
65.3 1822 NM
97
Figure S5. Mutations of moderate to severe PLD males
98
Figure S6. Diagram of study patient selection in HOPE PKD
cohort
99
REFERENCES
1stpart:Kidney
1. Heyer, CM, Sundsbak, JL, Abebe, KZ, Chapman, AB, Torres, VE,
Grantham, JJ, Bae, KT, Schrier, RW, Perrone, RD, Braun, WE:
Predicted mutation strength of nontruncating PKD1 mutations aids
genotype-phenotype correlations in autosomal dominant polycystic
kidney disease. Journal of the American Society of Nephrology, 27:
2872-2884, 2016.
2. Carrera, P, Calzavara, S, Magistroni, R, Den Dunnen, JT, Rigo, F,
Stenirri, S, Testa, F, Messa, P, Cerutti, R, Scolari, F: Deciphering
Variability of PKD1 and PKD2 in an Italian Cohort of 643 Patients
with Autosomal Dominant Polycystic Kidney Disease (ADPKD).
Scientific reports, 6: 30850, 2016.
3. Yang, T, Meng, Y, Wei, X, Shen, J, Zhang, M, Qi, C, Wang, C, Liu,
J, Ma, M, Huang, S: Identification of novel mutations of PKD1
gene in Chinese patients with autosomal dominant polycystic
kidney disease by targeted next-generation sequencing. Clinica
Chimica Acta, 433: 12-19, 2014.
4. Jin, M, Xie, Y, Chen, Z, Liao, Y, Li, Z, Hu, P, Qi, Y, Yin, Z, Li, Q,
Fu, P: System analysis of gene mutations and clinical phenotype in
Chinese patients with autosomal-dominant polycystic kidney
disease. Scientific reports, 6: 35945, 2016.
100
5. Rossetti, S, Kubly, VJ, Consugar, MB, Hopp, K, Roy, S, Horsley,
SW, Chauveau, D, Rees, L, Barratt, TM, Van't Hoff, WG:
Incompletely penetrant PKD1 alleles suggest a role for gene dosage
in cyst initiation in polycystic kidney disease. Kidney international,
75: 848-855, 2009.
6. Hateboer, N, v Dijk, MA, Bogdanova, N, Coto, E, Saggar-Malik,
AK, San Millan, JL, Torra, R, Breuning, M, Ravine, D, Group,
EP-PS: Comparison of phenotypes of polycystic kidney disease
types 1 and 2. The Lancet, 353: 103-107, 1999.
7. Hateboer, N, Veldhuisen, B, Peters, D, Breuning, MH, San-Millán,
JL, Bogdanova, N, Coto, E, Dijk, MA, Afzal, AR, Jeffery, S:
Location of mutations within the PKD2 gene influences clinical
outcome. Kidney international, 57: 1444-1451, 2000.
8. Kim, H, Park, HC, Ryu, H, Kim, K, Kim, HS, Oh, KH, Yu, SJ,
Chung, JW, Cho, JY, Kim, SH, Cheong, HI, Lee, K, Park, JH, Pei,
Y, Hwang, YH, Ahn, C: Clinical Correlates of Mass Effect in
Autosomal Dominant Polycystic Kidney Disease. PloS one, 10:
e0144526, 2015.
9. Ryu, H, Kim, H, Park, HC, Kim, H, Cho, EJ, Lee, KB, Chung, W,
Oh, KH, Cho, JY, Hwang, YH: Total kidney and liver volume is a
major risk factor for malnutrition in ambulatory patients with
autosomal dominant polycystic kidney disease. 18: 22, 2017.
10. D'Agnolo, HMA, Casteleijn, NF, Gevers, TJG, de Fijter, H, van
Gastel, MDA, Messchendorp, AL, Peters, DJM, Salih, M,
101
Soonawala, D, Spithoven, EM, Visser, FW, Wetzels, JFM, Zietse, R,
Gansevoort, RT, Drenth, JPH: The Association of Combined Total
Kidney and Liver Volume with Pain and Gastrointestinal Symptoms
in Patients with Later Stage Autosomal Dominant Polycystic
Kidney Disease. American journal of nephrology, 46: 239-248, 2017.
11. Irazabal, MV, Rangel, LJ, Bergstralh, EJ, Osborn, SL, Harmon, AJ,
Sundsbak, JL, Bae, KT, Chapman, AB, Grantham, JJ, Mrug, M,
Hogan, MC, El-Zoghby, ZM, Harris, PC, Erickson, BJ, King, BF,
Torres, VE: Imaging classification of autosomal dominant polycystic
kidney disease: a simple model for selecting patients for clinical
trials. Journal of the American Society of Nephrology : JASN, 26:
160-172, 2015.
12. Sherstha, R, McKinley, C, Russ, P, Scherzinger, A, Bronner, T,
Showalter, R, Everson, GT: Postmenopausal estrogen therapy
selectively stimulates hepatic enlargement in women with autosomal
dominant polycystic kidney disease. Hepatology, 26: 1282-1286,
1997.
13. Chebib, FT, Jung, Y, Heyer, CM, Irazabal, MV, Hogan, MC, Harris,
PC, Torres, VE, El-Zoghby, ZM: Effect of genotype on the severity
and volume progression of polycystic liver disease in autosomal
dominant polycystic kidney disease. Nephrology Dialysis
Transplantation, 31: 952-960, 2016.
14. Kim, H, Koh, J, Park, SK, Oh, KH, Kim, YH, Kim, Y, Ahn, C, Oh,
YK: Baseline Characteristics of the Autosomal Dominant Polycystic
102
Kidney Disease Subcohort of the KoreaN Cohort Study for
Outcomes in Patients With Chronic Kidney Disease (KNOW‐
CKD). Nephrology, 2018.
15. Stevens, LA, Schmid, CH, Greene, T, Zhang, YL, Beck, GJ,
Froissart, M, Hamm, LL, Lewis, JB, Mauer, M, Navis, GJ:
Comparative performance of the CKD Epidemiology Collaboration
(CKD-EPI) and the Modification of Diet in Renal Disease (MDRD)
Study equations for estimating GFR levels above 60 mL/min/1.73
m2. American Journal of Kidney Diseases, 56: 486-495, 2010.
16. Hogan, MC, Masyuk, TV, Page, LJ, Kubly, VJ, Bergstralh, EJ, Li,
X, Kim, B, King, BF, Glockner, J, Holmes, DR, 3rd, Rossetti, S,
Harris, PC, LaRusso, NF, Torres, VE: Randomized clinical trial of
long-acting somatostatin for autosomal dominant polycystic kidney
and liver disease. Journal of the American Society of Nephrology :
JASN, 21: 1052-1061, 2010.
17. Irazabal, MV, Rangel, LJ, Bergstralh, EJ, Osborn, SL, Harmon, AJ,
Sundsbak, JL, Bae, KT, Chapman, AB, Grantham, JJ, Mrug, M:
Imaging classification of autosomal dominant polycystic kidney
disease: a simple model for selecting patients for clinical trials.
Journal of the American Society of Nephrology, 26: 160-172, 2015.
18. Prydz, K, Fagereng, GL, Tveit, H: The Role of Glycans in Apical
Sorting of Proteins in Polarized Epithelial Cells. In: Glycosylation.
InTech, 2012.
103
19. Weston, BS, Bagnéris, C, Price, RG, Stirling, JL: The polycystin-1
C-type lectin domain binds carbohydrate in a calcium-dependent
manner, and interacts with extracellular matrix proteins in vitro.
Biochimica et Biophysica Acta (BBA)-Molecular Basis of Disease,
1536: 161-176, 2001.
20. Jin, M, Xie, Y, Chen, Z, Liao, Y, Li, Z, Hu, P, Qi, Y, Yin, Z, Li, Q,
Fu, P, Chen, X: System analysis of gene mutations and clinical
phenotype in Chinese patients with autosomal-dominant polycystic
kidney disease. Sci Rep, 6: 35945, 2016.
2stpart:Liver
1. Heyer, CM, Sundsbak, JL, Abebe, KZ, Chapman, AB, Torres, VE,
Grantham, JJ, Bae, KT, Schrier, RW, Perrone, RD, Braun, WE:
Predicted mutation strength of nontruncating PKD1 mutations aids
genotype-phenotype correlations in autosomal dominant polycystic
kidney disease. Journal of the American Society of Nephrology, 27:
2872-2884, 2016.
2. Carrera, P, Calzavara, S, Magistroni, R, Den Dunnen, JT, Rigo, F,
Stenirri, S, Testa, F, Messa, P, Cerutti, R, Scolari, F: Deciphering
Variability of PKD1 and PKD2 in an Italian Cohort of 643 Patients
with Autosomal Dominant Polycystic Kidney Disease (ADPKD).
Scientific reports, 6: 30850, 2016.
104
3. Yang, T, Meng, Y, Wei, X, Shen, J, Zhang, M, Qi, C, Wang, C, Liu,
J, Ma, M, Huang, S: Identification of novel mutations of PKD1
gene in Chinese patients with autosomal dominant polycystic
kidney disease by targeted next-generation sequencing. Clinica
Chimica Acta, 433: 12-19, 2014.
4. Jin, M, Xie, Y, Chen, Z, Liao, Y, Li, Z, Hu, P, Qi, Y, Yin, Z, Li, Q,
Fu, P: System analysis of gene mutations and clinical phenotype in
Chinese patients with autosomal-dominant polycystic kidney
disease. Scientific reports, 6: 35945, 2016.
5. Rossetti, S, Kubly, VJ, Consugar, MB, Hopp, K, Roy, S, Horsley,
SW, Chauveau, D, Rees, L, Barratt, TM, Van't Hoff, WG:
Incompletely penetrant PKD1 alleles suggest a role for gene dosage
in cyst initiation in polycystic kidney disease. Kidney international,
75: 848-855, 2009.
6. Hateboer, N, v Dijk, MA, Bogdanova, N, Coto, E, Saggar-Malik,
AK, San Millan, JL, Torra, R, Breuning, M, Ravine, D, Group,
EP-PS: Comparison of phenotypes of polycystic kidney disease
types 1 and 2. The Lancet, 353: 103-107, 1999.
7. Hateboer, N, Veldhuisen, B, Peters, D, Breuning, MH, San-Millán,
JL, Bogdanova, N, Coto, E, Dijk, MA, Afzal, AR, Jeffery, S:
Location of mutations within the PKD2 gene influences clinical
outcome. Kidney international, 57: 1444-1451, 2000.
105
8. Kim, H, Park, HC, Ryu, H, Kim, K, Kim, HS, Oh, KH, Yu, SJ,
Chung, JW, Cho, JY, Kim, SH, Cheong, HI, Lee, K, Park, JH, Pei,
Y, Hwang, YH, Ahn, C: Clinical Correlates of Mass Effect in
Autosomal Dominant Polycystic Kidney Disease. PloS one, 10:
e0144526, 2015.
9. Ryu, H, Kim, H, Park, HC, Kim, H, Cho, EJ, Lee, KB, Chung, W,
Oh, KH, Cho, JY, Hwang, YH: Total kidney and liver volume is a
major risk factor for malnutrition in ambulatory patients with
autosomal dominant polycystic kidney disease. 18: 22, 2017.
10. D'Agnolo, HMA, Casteleijn, NF, Gevers, TJG, de Fijter, H, van
Gastel, MDA, Messchendorp, AL, Peters, DJM, Salih, M,
Soonawala, D, Spithoven, EM, Visser, FW, Wetzels, JFM, Zietse, R,
Gansevoort, RT, Drenth, JPH: The Association of Combined Total
Kidney and Liver Volume with Pain and Gastrointestinal Symptoms
in Patients with Later Stage Autosomal Dominant Polycystic
Kidney Disease. American journal of nephrology, 46: 239-248, 2017.
11. Irazabal, MV, Rangel, LJ, Bergstralh, EJ, Osborn, SL, Harmon, AJ,
Sundsbak, JL, Bae, KT, Chapman, AB, Grantham, JJ, Mrug, M,
Hogan, MC, El-Zoghby, ZM, Harris, PC, Erickson, BJ, King, BF,
Torres, VE: Imaging classification of autosomal dominant polycystic
kidney disease: a simple model for selecting patients for clinical
trials. Journal of the American Society of Nephrology : JASN, 26:
160-172, 2015.
106
12. Sherstha, R, McKinley, C, Russ, P, Scherzinger, A, Bronner, T,
Showalter, R, Everson, GT: Postmenopausal estrogen therapy
selectively stimulates hepatic enlargement in women with autosomal
dominant polycystic kidney disease. Hepatology, 26: 1282-1286,
1997.
13. Chebib, FT, Jung, Y, Heyer, CM, Irazabal, MV, Hogan, MC, Harris,
PC, Torres, VE, El-Zoghby, ZM: Effect of genotype on the severity
and volume progression of polycystic liver disease in autosomal
dominant polycystic kidney disease. Nephrology Dialysis
Transplantation, 31: 952-960, 2016.
14. Irazabal, MV, Rangel, LJ, Bergstralh, EJ, Osborn, SL, Harmon, AJ,
Sundsbak, JL, Bae, KT, Chapman, AB, Grantham, JJ, Mrug, M:
Imaging classification of autosomal dominant polycystic kidney
disease: a simple model for selecting patients for clinical trials.
Journal of the American Society of Nephrology, 26: 160-172, 2015.
15. Prydz, K, Fagereng, GL, Tveit, H: The Role of Glycans in Apical
Sorting of Proteins in Polarized Epithelial Cells. In: Glycosylation.
InTech, 2012.
16. Weston, BS, Bagnéris, C, Price, RG, Stirling, JL: The polycystin-1
C-type lectin domain binds carbohydrate in a calcium-dependent
manner, and interacts with extracellular matrix proteins in vitro.
Biochimica et Biophysica Acta (BBA)-Molecular Basis of Disease,
1536: 161-176, 2001.
107
17. Jin, M, Xie, Y, Chen, Z, Liao, Y, Li, Z, Hu, P, Qi, Y, Yin, Z, Li, Q,
Fu, P, Chen, X: System analysis of gene mutations and clinical
phenotype in Chinese patients with autosomal-dominant polycystic
kidney disease. Sci Rep, 6: 35945, 2016.
18. Cornec-Le Gall E, Torres VE, Harris PC. Genetic complexity of
autosomal dominant polycystic kidney and liver diseases. Journal of
the American Society of Nephrology. 2018;29(1):13-23.
19. Besse W, Dong K, Choi J, Punia S, Fedeles SV, Choi M, et al.
Isolated polycystic liver disease genes define effectors of
polycystin-1 function. The Journal of clinical investigation.
2017;127(5):1772-85.
20. Porath B, Gainullin VG, Cornec-Le Gall E, Dillinger EK, Heyer
CM, Hopp K, et al. Mutations in GANAB, encoding the glucosidase
IIα subunit, cause autosomal-dominant polycystic kidney and liver
disease. The American Journal of Human Genetics.
2016;98(6):1193-207.
21. D'agnolo HM, Drenth JP. Risk factors for progressive polycystic
liver disease: where do we stand? Nephrology Dialysis
Transplantation. 2015;31(6):857-9.
22. Kim, H, Koh, J, Park, SK, Oh, KH, Kim, YH, Kim, Y, Ahn, C, Oh,
YK: Baseline Characteristics of the Autosomal Dominant Polycystic
Kidney Disease Subcohort of the KoreaN Cohort Study for
108
Outcomes in Patients With Chronic Kidney Disease
(KNOW‐CKD). Nephrology, 2018.
23. Stevens, LA, Schmid, CH, Greene, T, Zhang, YL, Beck, GJ,
Froissart, M, Hamm, LL, Lewis, JB, Mauer, M, Navis, GJ:
Comparative performance of the CKD Epidemiology Collaboration
(CKD-EPI) and the Modification of Diet in Renal Disease (MDRD)
Study equations for estimating GFR levels above 60 mL/min/1.73
m2. American Journal of Kidney Diseases, 56: 486-495, 2010.
24. Hogan, MC, Masyuk, TV, Page, LJ, Kubly, VJ, Bergstralh, EJ, Li,
X, Kim, B, King, BF, Glockner, J, Holmes, DR, 3rd, Rossetti, S,
Harris, PC, LaRusso, NF, Torres, VE: Randomized clinical trial of
long-acting somatostatin for autosomal dominant polycystic kidney
and liver disease. Journal of the American Society of Nephrology :
JASN, 21: 1052-1061, 2010.
109
국문초록
상염색체우성다낭신(다낭신)은 신장 및 간을 침범하는 단일유전자 질환
이며, 말기신부전의 주요 원인이다. 유전 정보는 다낭신의 병인을 이해하
는 데 있어 가장 중요하다. 따라서 본 연구는 524 명의 비혈연 가계 및
749 명의 다낭신 환자를 대상으로 다낭신의 유전적 특성과 PKD1/2 유전
자 돌연변이가 신장 및 간에 미치는 영향을 확인하고자 하였다.
표적 엑솜 시퀀싱 및 PKD1 유전자의 exon 1의 Sanger 시퀀싱,
multiple ligation probe assay를 사용하여 PKD1/2의 유전 연구를 수행했
다. 총신장부피는 타원체법으로 계산하였고, 총 간부피는 sterology로 측
정하여 키로 보정 하였다(htTKV, htTLV). htTLV를 사용하여 질병의
중증도는 경증 (htTLV <1,600 mL / m), 중증도 (htTLV 1,600-2,400 mL
/ m) 및 중증 다낭간 (htTLV> 2,400 mL / m)로 분류되었다. 다낭신 또
는 다낭간의 중증도와 유전자 돌연변이의 유형 및 위치의 연관성을 가족
수준 클러스터링 (가족 내 상관 관계)을 제어하는 fraility 모델을 사용하
여 분석했다. 돌연변이 검출률은 80.7 % (423/524 가족, 331 돌연변이) 였
고 70.7 %는 새로운 돌연변이였다. PKD1 단백질 절단 유전자형
(PKD1-PT)은 진단 연령이 낮고, htTKV가 크며, PKD1 비 절단 및
PKD2 유전자형과 비교하여 신장 기능이 낮았다. PKD1 유전자형은
PKD2 유전자형 (53.3 [46.7, 58.7] vs. 63.1 [54.0, 67.5], P = 0.003)에 비해
빠르게 말기신부전으로 진행하였다. PKD2 유전자형은 연령, 성별, 가족
수준 클러스터링을 통제한 모델에서 PKD1-PT 유전자형보다 말기신부전
의 위험도가 0.2 배로 낮았다 (p = 0.037).
110
간 표현형에 관해서는, 심한 다낭간은 말기신부전으로의 보다 빠른 진
행과 더 높은 사망률을 보였다. 성별, 연령, 많은 임신 횟수 및 여성 호르
몬의 사용은 중등도 및 중증의 다낭간과 관련이 있었다. 중등도 및 중증
의 다낭간에 가족 수준의 클러스터링 효과가 있었다. 그러나 신장과 달리
다낭간의 심한 정도는 PKD1 / 2 유전자형과 상관이 없었고 단지 돌연변
이가 검출되지 않은 환자에서 경증 다낭간의 빈도가 증가했다. 또한,
C-type lectin domain은 PKD1 단백질 절단과 비절단 돌연변이 모두에서
중등도 및 중증의 다낭간에 대한 잠재적 온점 일 수 있음을 확인하였다.
결론적으로, 본 연구는 유전형이 한국인의 새로운 치료 요법에 대한 신
장의 빠른 진행자를 선택하는데 기여할 수 있음을 시사한다. 그러나 PLD
에 대한 추가적인 유전적 영향을 연구해야 한다.
…………………………………………………………………………………
주요어 : 상염색체우성다낭신, 표적엑솜시퀀싱, 빠른
진행자, 중증도 및 중증의 다낭간, 유전자변이, 표현형
학 번 : 2015-30809