Complete Mutation Analysis Panel of the 39 Human HOX Genes

KENJIRO KOSAKI,1* RIKA KOSAKI,1,2 TAICHI SUZUKI,1 HIROSHI YOSHIHASHI,1 TAKAO TAKAHASHI,1 KATSUMI SASAKI,1 MASARU TOMITA,3 WILLIAM MCGINNIS,4 AND NOBUTAKE MATSUO1,5

1 Department of Pediatrics, Keio University School of Medicine, Tokyo, Japan
2 Department of Medical Genetics, Saitama Children’s Medical Center, Keio University, Tokyo, Japan
3 Laboratory for Bioinformatics, Keio University, Fujisawa, Japan
4 Department of Biology, University of California, San Diego, San Diego, California
5 National Children’s Hospital, Tokyo, Japan

ABSTRACT

Background: The HOX gene family consists of highly conserved transcription factors that specify the identity of the body segments along the anteroposterior axis of the embryo. Because the phenotypes of mice with targeted disruptions of Hox genes resemble some patterns of human malformations, mutations in HOX genes have been expected to be associated with a significant number of human malformations. Thus far, however, mutations have been documented in only three of the 39 human HOX genes (HOXD13, HOXA13, and HOXA11), partly because current knowledge of the complete coding sequence and genome structure is limited to only 20 of the 39 human HOX genes.

Methods: Taking advantage of the human and mouse draft genome sequences, we attempted to characterize the remaining 19 human HOX genes by bioinformatic analysis including phylogenetic footprinting, the probabilistic prediction method, and comparison of genomic sequences with the complete set of the human anonymous cDNA sequences.

Results: We were able to determine the full coding sequences of 19 HOX genes and their genome structure and successfully designed a complete set of PCR primers to amplify the entire coding region of each of the 39 HOX genes from genomic DNA.

Conclusions: Our results indicate the usefulness of bioinformatic analysis of the draft genome sequences for clinically oriented research projects. It is hoped that the mutation panel provided here will serve as a launchpad for a new discourse on the genetic basis of human malformations.

Teratology 65:50–62, 2002. © 2002 Wiley-Liss, Inc.

INTRODUCTION

The HOX gene family consists of highly conserved transcription factors that specify the identity of the body segments along the anteroposterior axis of the embryo. Since the discovery of the first prototypic HOX gene Antennapedia in Drosophila (McGinnis et al., ’84), homologous genes have been found in the genome of a variety of animals (McGinnis and Krumlauf, ’92). HOX genes have arisen from a common ancestral gene and form a cluster in which HOX genes are arranged in tandem. The clustered organization of Hox genes is conserved from nematodes to vertebrates. The number of Hox genes within the ancestral complex increased during evolution and this was followed by successive duplications of the cluster to give rise to the four vertebrate HOX clusters. Based on sequence similarity of the HOX genes on separate clusters, HOX genes have been aligned into 13 groups called “paralogues” (Scott, ’92). The composition of each cluster was conserved during tetrapod evolution and each HOX gene within a cluster displays a pattern of expression in the body axis that is dependent on its relative position within the cluster. A specific subset of paralogues was subsequently lost from each cluster, leaving 39 HOX genes in the present mammalian genome. Humans have four clusters on separate chromosomal locations: HOXA on chromosome 7p15.3, HOXB on chromosome 17q21.3, HOXC on chromosome 12q13.3, and HOXD on chromosome 2q31 (Apiou et al., ’96).

Mice with targeted disruptions of the Hox genes exhibit so-called homeotic defects due to misspecification of body segments along the anteroposterior axis of the central nervous system, axial skeleton, limbs, gut, urogenital tract, and external genitalia (Redline et al., ’92). Based on the similarity of the phenotypes of these mice to human malformations, mutations in HOX genes have been expected to be associated with a significant number of human syndromes (Redline et al., ’92; Mark et al., ’97; Veraksa et al., ’00). Documentation of patients with homeotic defects in limbs, genitalia, and other internal organs, and deletions of the entire HOXD cluster (Del Campo et al., ’99) or the HOXA cluster (Devriendt et al., ’99) has confirmed the critical role of HOX genes in the pathogenesis of human birth defects.

Grant sponsor: Pharmacia Fund for Growth & Development Research; Grant sponsor: Keio Gijuku Academic Development Funds.

*Correspondence to: Kenjiro Kosaki, MD, Department of Pediatrics, Keio University School of Medicine, 35 Shinanomachi, Shinjuku-ku, Tokyo, 160-8582 Japan. E-mail: [email protected]

Received 5 May 2001; Accepted 18 August 2001

TERATOLOGY 65:50–62 (2002)
DOI 10.1002/tera.10009

© 2002 WILEY-LISS, INC.

Thus far, mutations have been documented in only three of the 39 human HOX genes. HOXD13 mutations lead to synpolydactyly, a dominantly inherited limb malformation with a distinctive combination of syndactyly and polydactyly (Akarsu et al., ’96; Muragaki et al., ’96). HOXA13 mutations lead to the hand-foot-genital syndrome, another dominantly inherited condition with a combination of distal limb abnormalities and malformations of the lower urogenital tract (Mortlock and Innis, ’97; Goodman et al., ’00). HOXA11 mutations lead to amegakaryocytic thrombocytopenia with radio-ulnar synostosis (Thompson and Nguyen, ’00). Mutations in the remaining 36 HOX genes may well lead to human malformation syndromes.

The complete coding sequences and exon-intron structures of only 20 of the 39 human HOX genes are known. Taking advantage of the draft genome sequences of human and mouse HOX gene clusters, in the present study we attempted to characterize the remaining 19 human HOX genes by three different bioinformatic analyses: phylogenetic footprinting (Gumucio et al., ’93), probabilistic prediction of the coding sequences (Burge and Karlin, ’97), and comparison of genomic sequences with the complete set of the human anonymous cDNA sequences (Claverie, ’97).

Phylogenetic footprinting is the identification of functionally important segments of the genome sequence based on comparisons between the genome sequences of two or more species. Segments with a high degree of identity, referred to as phylogenetic footprints, serve as reliable guides to protein coding sequences and regulatory elements (Gumucio et al., ’93; Elgar, ’96; Hardison et al., ’97; O’Brien et al., ’99).

The probabilistic prediction method for coding sequences statistically identifies potential genes in genomic DNA (Burge and Karlin, ’97). The method is based on a mathematical model of the gene structure of human genomic sequences that incorporates descriptions of the basic transcriptional, translational, and splicing signals, as well as length distributions and compositional features of exons, introns, and intergenic regions.

ESTs, or expressed sequence tags, are short DNA sequences read from one or both ends of expressed gene fragments that are selected at random. More than a million human EST sequences are now available in the public EST databases. Despite the fragmentary and inaccurate nature of ESTs, comparison of local genomic sequences with the complete set of ESTs represents a highly efficient way of identifying exons (Claverie, ’97).

Determination of the complete genome structure of 19 human HOX genes allowed us to develop a complete mutation analysis system for all 39 HOX genes as a first step toward comprehensive evaluation of the relationship between HOX mutations and human malformations.

MATERIALS AND METHODS

Computer programs

The following computations were performed on a Pentium-based PC running the Linux 2.2.12 operating system. The exon/intron structure was predicted by the computer program sim4 (Florea et al., ’98), which efficiently aligns a transcribed and spliced DNA sequence with a genomic sequence containing that gene. Interspersed repetitive elements in the human genome sequences were localized with RepeatMasker software (Smit, Green, http://ftp.genome.washington.edu/RM/RepeatMasker.html). Nucleotide sequences or amino acid sequences were analyzed for homology by the BLAST family of programs (Altschul et al., ’97). Probabilistic prediction of the coding potential was performed using the computer program Genscan (Burge and Karlin, ’97). Auxiliary programs were written in UNIX shell script language using the EMBOSS software toolbox (Rice et al., ’00).

Percent identity plots

The human and mouse genome sequences covering each of the four Hox gene clusters were compared using the PipMaker website (http://bio.cse.psu.edu) (Schwartz et al., ’00). This web-based tool compares two long genomic DNA sequences to identify conserved segments. The degree of conservation at a particular location in the genomic DNA sequence is measured as the percent-identity value. When the percent-identity value is calculated along the entire genomic DNA, the result can be expressed as a percent-identity plot (PIP plot), in which the x-axis represents the location along the genomic sequence and the y-axis represents the percentage sequence identity (pip). The plot shows the position in one sequence of each aligning gap-free segment together with its percent identity.

Fig. 1. Overall strategy. The complete genome structure of 19 human HOX genes was determined in a stepwise manner as depicted.

Within the genomic regions with high pip values, subregions with coding potential were selected by two complementary methods: 1) the probabilistic prediction method for coding sequences using the Genscan program (Burge and Karlin, ’97); and 2) comparison of genomic sequences of the human HOX clusters with the complete set of the human anonymous cDNA sequences (expressed sequence tag [EST] database, Claverie, ’97) using the BLAST program (Altschul et al., ’97). The locations of identity with the human EST database sequences, interspersed repeat sequences, and CpG islands were depicted on the PIP plot. To test the validity of the computer-based prediction, predicted exons were depicted together with exons that had already been published in the GenBank database before the present study.
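The percent-identity measure behind a PIP plot can be illustrated with a short sketch. This is not the PipMaker implementation; it only assumes that a pairwise alignment is already available as two gapped strings of equal length, and the function names `gap_free_segments` and `pip_points` are hypothetical. The 50% floor mirrors the 50–100% y-axis range of the plots.

```python
def gap_free_segments(aln_a, aln_b):
    """Split a pairwise alignment into gap-free segments.

    Returns (start, length, percent_identity) tuples, where start is the
    0-based position in the ungapped first sequence.
    """
    segments = []
    pos_a = 0                     # position in the ungapped first sequence
    seg_start, matches, length = None, 0, 0
    for x, y in zip(aln_a, aln_b):
        if x == "-" or y == "-":
            # A gap in either sequence closes the current segment.
            if length:
                segments.append((seg_start, length, 100.0 * matches / length))
            seg_start, matches, length = None, 0, 0
        else:
            if seg_start is None:
                seg_start = pos_a
            length += 1
            matches += (x.upper() == y.upper())
        if x != "-":
            pos_a += 1
    if length:
        segments.append((seg_start, length, 100.0 * matches / length))
    return segments


def pip_points(segments, min_identity=50.0):
    """Keep only the segments that would appear on a 50-100% identity plot."""
    return [s for s in segments if s[2] >= min_identity]


if __name__ == "__main__":
    # Toy alignment, not real HOX sequence.
    for start, length, pid in pip_points(
            gap_free_segments("ACGTAC--GTTA", "ACGAACTTG-TA")):
        print(start, length, round(pid, 1))
```

Each tuple corresponds to one horizontal bar on the plot: its x-extent is given by the start and length, and its height by the percent identity.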

Stepwise analyses of the genome structure

The genome structure of the following human HOX genes had been annotated in the GenBank database before the present study: HOXA1, HOXA2, HOXA4, HOXA5, HOXA7, HOXA9, HOXA10, HOXA11, HOXA13, HOXB3, HOXB4, HOXB5, HOXB6, HOXC5, HOXD1, HOXD3, HOXD4, HOXD11, HOXD12, and HOXD13.

TABLE 1. Hox cDNA and genomic sequences of humans and other species*

Gene  cDNA sequence no.  Human genome sequence no.  BAC/PAC clone no.  Notes

HOXA
A1    NM_005522          AC004079^b                 RP1-167F23
A2    NM_006735          AC004079^b                 RP1-167F23
A3    NA                 AC004079^b                 RP1-167F23         mouse cDNA = Y11717
A4    NM_002141          AC004080^b                 RP1-170O19
A5    NM_019102          AC004080^b                 RP1-170O19
A6    NA                 AC004080^b                 RP1-170O19         mouse cDNA = AF247663
A7    NM_006896          AF032095^b                 RP1-170O19
A9    NM_002142          AF010258^b                 RP1-170O19
A10   NM_018951          AF040714^b                 RP1-170O19
A11   NM_005523          AF071164^b                 RP1-170O19
A13   NM_000522          U82827^b                   RP1-170O19

HOXB
B1    NM_002144          AC009789                   RP11-361K8
B2    NM_002145          AC009789                   RP11-361K8
B3    NM_002146          AF287967^b                 RP11-361K8
B4    AK000839           F287967^b                  RP11-361K8
B5    NM_002147          AF287967^b                 RP11-361K8
B6    NM_018952          X58431^b                   RP11-361K8
B7    NM_004502          AC009789                   RP11-361K8
B8    AY014293-4^a       AC009789                   RP11-361K8         mouse cDNA = X13721
B9    AY014295-6^a       AC009789                   RP11-361K8         mouse cDNA = S66855
B13   NM_006361          AC068531                   RP11-463M16

HOXC
C4    NM_014620          AY014297-8^a,c             RP11-834C11^d
C5    NM_018953          X61755^b                   RP11-834C11
C6    NM_004503          AC023794                   RP11-834C11
C8    AY014299-300^a     AC012531                   RP11-834C11        Mouse genomic DNA = M35603
C9    AY014301-2^a       AC012531                   RP11-83K1          Mouse cDNA = NM_008272
C10   NM_017409          AC012531                   RP11-83K1
C11   NM_014212          AC012531                   RP11-83K1
C12   AH010514           RP11-83K1                  RP11-83K1          Zebrafish genomic DNA = AF071260
C13   AF263466           AC012531                   RP11-83K1

HOXD
D1    NA                 AF202118^b                 RP11-387A1
D3    NM_006898          Y09980^b                   RP11-387A1
D4    NM_014621          X17360^b                   RP11-387A1
D8    NM_019558^a        AC016915                   RP11-387A1         Horn shark genomic DNA = AF224263
D9    NM_014213          AC009336                   RP11-387A1
D10   NM_002148          AC009336                   RP11-387A1
D11   NM_021192          AF154915^b                 RP11-387A1
D12   NM_021193          AF154915^b                 RP11-387A1
D13   NM_000523          AB032481^b                 RP11-387A1

*NA, not available in the GenBank database.
a Human HOX cDNA sequence determined during the study and deposited in GenBank.
b Exon-intron boundaries already annotated in the GenBank file at the time of the study.
c Exon-intron boundaries obtained by long-range PCR from genomic DNA.
d Part of the gene not included in the genomic clone.

The complete genome structure of the 19 human HOX genes was determined in a stepwise manner, as depicted in Figure 1. Whenever a human cDNA sequence of a human HOX gene was available in the GenBank database, it was compared to the human genomic sequence by using the BLASTN program. Exon-intron boundaries were determined using the computer program sim4.

Fig. 2. Comparison of the nucleotide sequences between human and mouse HOX clusters depicted as percent identity plots (PIP plots). The PIP plots show the human genomic sequence on the x-axis and the percentage sequence identity (50–100%) on the y-axis. The annotation is shown at the top of each main plot, with confirmed and putative exons depicted as numbered black boxes. The other icons along the top of the box depict repeats and CpG islands: grey pointed boxes are L1 repeats, light grey triangles are SINEs other than MIR, black triangles are MIRs, black pointed boxes are LINE2s, dark grey triangles are LTR elements, dark grey pointed boxes are other kinds of interspersed repeats, such as DNA transposons, short dark grey boxes are CpG islands where the ratio CpG/GpC exceeds 0.75, and short white boxes are CpG islands where the ratio CpG/GpC lies between 0.6 and 0.75. For each of the putative genes, gene predictions are shown as dark-grey boxes and EST hits as light-grey boxes. A: HOXA cluster on chromosome 7p15. B: HOXB cluster on 17q21. C: HOXC cluster on 12q13. D: HOXD cluster on 2q31.

If the human cDNA sequence of a human HOX gene was unavailable but the corresponding mouse cDNA sequence was available in the GenBank database, the mouse cDNA sequence was aligned with the human genomic sequence by using the BLASTN program, and inspection of the alignment revealed the exon-intron boundaries. Predicted exon sequences were concatenated to deduce the complete human cDNA sequence (Kosaki et al., ’01).
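The concatenation step can be sketched as follows. This is a minimal illustration rather than the procedure actually scripted for the study: exon coordinates are assumed to be 1-based and inclusive, the toy sequence is invented, and the helper names `splice_exons` and `introns_are_gt_ag` are hypothetical. The second helper checks the canonical GT...AG intron boundaries, a common sanity check on boundaries read off an alignment.

```python
def splice_exons(genomic, exons):
    """Concatenate exon intervals (1-based, inclusive) from a genomic sequence."""
    ordered = sorted(exons)
    for (_, end_prev), (start_next, _) in zip(ordered, ordered[1:]):
        if start_next <= end_prev:
            raise ValueError("exon intervals overlap")
    return "".join(genomic[start - 1:end] for start, end in ordered)


def introns_are_gt_ag(genomic, exons):
    """Return True if every intron between consecutive exons starts with GT
    and ends with AG (the canonical splice-site dinucleotides)."""
    ordered = sorted(exons)
    for (_, end_prev), (start_next, _) in zip(ordered, ordered[1:]):
        intron = genomic[end_prev:start_next - 1].upper()
        if not (intron.startswith("GT") and intron.endswith("AG")):
            return False
    return True


if __name__ == "__main__":
    # Toy example: exons in upper case, flanks and intron in lower case.
    genomic = "ccATGGCGgtaagtttcagAAATGAcc"
    exons = [(3, 8), (20, 25)]
    print(splice_exons(genomic, exons))       # deduced coding sequence
    print(introns_are_gt_ag(genomic, exons))  # splice-site check
```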

When neither human nor mouse full coding cDNA sequences were available in the GenBank database, the phylogenetic footprints of the HOX clusters were delineated by comparison of the human genome sequence containing the HOX cluster with the corresponding mouse genome (Gumucio et al., ’93).

Primer design

Primers for PCR analysis were designed, with the help of the computer program Amplify (William Engels, University of Wisconsin, Madison, WI), to cover the target region together with at least 10 flanking bases according to the following criteria. The estimated annealing temperature, predicted using the equation T = 59.9°C + 0.41 × (%GC) − 600/length, was set in the range of 55–58°C. The primer pairs should not be capable of dimerization. The 3′ end of the primers should not form a hairpin structure (3 bp). A G or C residue was included at the 3′ end of primers (so-called “GC clamp”). Priming sites were chosen so that the 3′ end of the primers was at least 20 nucleotides away from the region targeted for sequencing. When one exon was to be amplified in two or more PCR amplicons, the overlap between the amplicons was greater than 20 bases.
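Two of these criteria, the estimated annealing temperature and the GC clamp, are simple enough to check programmatically. The sketch below is not the Amplify program; it only evaluates the equation given above and the 3′-end rule, and the function names are hypothetical. Dimerization and hairpin checks are deliberately left out. The example primer is the first forward primer listed for HOXA1 in Table 2.

```python
def gc_percent(primer):
    """Percentage of G and C residues in a primer."""
    primer = primer.upper()
    return 100.0 * sum(primer.count(base) for base in "GC") / len(primer)


def estimated_tm(primer):
    """T = 59.9 + 0.41 * (%GC) - 600 / length, in degrees Celsius."""
    return 59.9 + 0.41 * gc_percent(primer) - 600.0 / len(primer)


def has_gc_clamp(primer):
    """True if the 3' terminal residue is G or C."""
    return primer.upper()[-1] in "GC"


def passes_basic_criteria(primer, tm_min=55.0, tm_max=58.0):
    """Check the 55-58 degree window and the GC clamp only."""
    return has_gc_clamp(primer) and tm_min <= estimated_tm(primer) <= tm_max


if __name__ == "__main__":
    primer = "GAAAGTTGGCACAGTCACGCCG"  # HOXA1 exon 1 forward primer (Table 2)
    print(round(gc_percent(primer), 1), round(estimated_tm(primer), 1),
          passes_basic_criteria(primer))
```

For this 22-mer with 13 G or C residues the equation gives roughly 56.9°C, inside the stated window.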

PCR and sequencing

PCR was performed with a PTC200 thermal cycler (MJ Research, Waltham, MA) with the AmpliTaq Gold amplification kit (Perkin-Elmer, Foster City, CA), Platinum Taq PCR system (Invitrogen, Carlsbad, CA), or XL long-range PCR system (Perkin-Elmer). Approximately 50 ng of PCR products were sequenced with the gene-specific primers by using the dideoxy terminator method with the BigDye Terminator Cycle Sequencing kit (Applied Biosystems). Reaction products were analyzed by an ABI3100 capillary sequencer (Applied Biosystems). Nucleotide sequences were aligned and analyzed using Sequencher software (Gene Codes, Ann Arbor, MI).


RT-PCR

RT-PCR was performed using commercially available cDNA templates (Multiple tissue cDNA panel™, Clontech, Palo Alto, CA). cDNA derived from the erythroleukemia cell line K562 was amplified with the HOXC12-specific primer pair (HC47: CCAATGGGTGACTGGTGCAG and HC52: CCCTCTTTTGCTCTCTGCCAG). cDNA from fetal kidney was amplified with the HOXD8-specific primer pair (D8RTF: CTCGCTCTCTGGCTGCTTAGCG and D8RTR: TAGGTGGTGTCCACAGCATATGG). The cycling conditions were 1 cycle of 94°C for 10 min; 35 cycles of 94°C for 1 min, 58°C for 1 min, and 72°C for 1 min; and 1 cycle of 72°C for 10 min. When RT-PCR was performed, genomic DNA was amplified in parallel to ensure that the primer pairs were specific for the cDNA.

PCR from genomic DNA

Equal amounts of genomic DNA from 50 normal Japanese individuals were pooled, amplified by PCR, and sequenced to verify the specificity of the PCR primers and to identify polymorphisms (Kwok et al., ’94). Each coding exon (i.e., from the start codon to the stop codon) of the HOX genes, together with at least 10 bases of flanking intron, was amplified from 50 ng of the pooled genomic DNA by using the primer pairs listed in Table 2. Unless otherwise stated in Table 2, the cycling conditions were 1 cycle of 94°C for 10 min; 35 cycles of 94°C for 1 min, 58°C for 1 min, and 72°C for 1 min; and 1 cycle of 72°C for 10 min.
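The standard cycling program can be written down as data, which makes the programmed hold time easy to verify. The representation below is only an illustrative sketch (the `PROGRAM` structure and `total_minutes` helper are hypothetical, not a thermal cycler file format), and the total excludes temperature ramping.

```python
# Each stage: (number of cycles, [(temperature in deg C, minutes), ...])
PROGRAM = [
    (1, [(94, 10)]),                     # initial denaturation / enzyme activation
    (35, [(94, 1), (58, 1), (72, 1)]),   # denature, anneal, extend
    (1, [(72, 10)]),                     # final extension
]


def total_minutes(program):
    """Sum of programmed hold times, ignoring ramp time between steps."""
    return sum(cycles * sum(minutes for _, minutes in steps)
               for cycles, steps in program)


if __name__ == "__main__":
    print(total_minutes(PROGRAM), "min of programmed hold time")
```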

Polymorphism detection

TABLE 2. PCR primers for analysis of human HOX genes

Exon  Forward primer (name)  Reverse primer (name)  Size

HOXA
A1
exon1  GAAAGTTGGCACAGTCACGCCG (A1)  GATGAGAGATTTCCAGAGTAAACAGCG (A2)  463 bp
exon1  CTACGCGTTAAATCAGGAAGCAGAC (A3)  GAAATTAAGCATACCAGCTCCTTCCAG (A4)  387 bp
exon1  CTTTACCTGAGTGTTGCCATGAAGC (A5)  GATAAGCTAAGCATGTGCTTTGGGTAAG (A6)  529 bp
A2
exon1  CGGTCCAATTTCAACCTTGTCTC (A7)  GATGTCAGGCACTCAGCGAGC (A8)  289 bp
exon1  CTCCAAGGAGAAGGCCATGAATTAC (A9)  AAGAGGGTCCCAGAGACCTGGG (A10)  448 bp
exon2  CCAGTGGAATAACCCCGCTC (A11)  GAAACTTTGGGAGTCGCCATTGTG (A12)  500 bp
exon2  GAGCAAGCCCTTAGCGTCTC (A13)  GCAAAACCACCTGGTCAAAGGAGT (A14)  492 bp
A3
exon1  CAAACCCCTGTCAGAGTGTGC (A15)  GGGTTGTTGCTGGCATTCTGAG (A16)  448 bp
exon1  GCACACGAACTGAGTGAGGCGT (A17)  TCCATCGCTCCTAGGCTGTGCT (A18)  436 bp
exon2  GACCTGACGGATGCAGGAACC (A19)  GGGCAGCCCGTAGGTACCCT (A20)  515 bp
exon2  CATTCGCTGGTCAACAGCGTC (A21)  GTGAGCTTGGGTGCTTCCTGA (A22)  482 bp
exon2  CCCTCTTTGGTCTAACTCACCTCCC (A23)  CGTCACATAAACTATAAAAACGCCTTACCAACGAG (A24)  380 bp
A4
exon1  GGAGGAGTGGGCACTTGACAGCGGG (A25)  CGCCGCGGTAGCCATAGGGG (A26)  465 bp
exon1  GAGCCCACTGCCTCCTACTAC (A27)  GCACATACCCACATCTCACCG (A28)  565 bp^b
exon1  GTTTTATGTATCTTGCGTGAACTTGGTGTC (A29)  GTGGATGAGGAACGGAGCAGGAGAAGAGAA (A30)  575 bp
A5
exon1  GTGCAAGGGTGCTATAGACG (A31)  CGCCGCTGGAGTTGCTTAGG (A32)  449 bp
exon1  GTCCACGCACTCTCCTCAGCAC (A33)  GGAGAAATGAGACCAAGAGAGACTG (A34)  396 bp
exon2  CCTGGCGGACTTTGGAAGAC (A35)  CGCTATAATGGCAATAAACAGGCTCATGATT (A36)  472 bp
A6
exon1  TTCGGCCATCCAGAAACAAACCAG (A37r)  GGGAGAAAAGTGCAGGTAGTCC (A38r)  421 bp
exon1  TCGAGTCTCCCGGACAAGAC (A39)  CCTCTGCCATGGCCTGATAG (A40)  498 bp
exon2  TCTTGTGGGAGGCACTGGGCTG (A41)  CAAAGCCGAAGGAGGTTGCAGCG (A42)  385 bp
A7
exon1  GGGTGTAATGTTATCATATATCACTCTACCTCG (A43)  GCTATGGGCTCCACGCAATG (A44)  656 bp
exon2  GACAGCAGCCTAACGAGTGC (A45)  GTAAGTAAAACCAGTGAGTCTCTTAAAGACG (A46)  459 bp
A9
exon1  TGCAGTTTCATAATTTCCGTGGGTCG (A73)  TCAACTGGAGGAGAACCACAAGC (A74)  551 bp
exon1  GGCCTTATGGCATTAAACCTGAACC (A75)  TCTTCACTGCTCTCCAGACTTGG (A76)  331 bp
exon2  AGCGGGATGTGCGTCTTCTGC (A49)  CTACGAGCCAGCCTGAACAG (A50)  472 bp
A10
exon1  GCTATCTGCTCCCTTCGCCAAA (A51)  GCGGGAAGGGAGCCAGTTCG (A52)  561 bp
exon1  TACTGCCTCTACGACTCGGCG (A53)  GCGCGGGCTCCTAGTTTTCTG (A54)  540 bp
exon2  CCGTGTGGCCTCGACTTAATC (A55)  GGCAGAGCCTGAAGACAGAG (A56)  395 bp
A11
exon1  GGAGAATCATGTTAAGCTCGGCTAC (A57)  TGGTCGAAAGCCTGTGGCAGGAC (A58)  476 bp^f
exon1  CAGTCTCGTCCAATTTCTATAGCACC (A59)  CGCTGCCTTTATACGTACTGGAG (A60)  366 bp^f
exon2  GCAGTCGGAGCGTTAAAGGCA (A61)  CACCATGTGGCTTGACTTTGTCAAGG (A62)  477 bp
A13
exon1  CGCGCGCTCTTCACTTCTTG (A63)  ATCAGGTTGCGGCACTGGTTG (A64)  342 bp^b
exon1  GCCGACGAGCTCAACAAGAAC (A65)  TAGTAGCCGCTGCCGAAGTAG (A66)  455 bp^b
exon1  AGGCCGCCAAGCAATGCAGCCC (A67)  CAGAGTGGACTTCCAGAGGTGG (A68)  476 bp
exon1  GGCTGGAACGGCCAAATGTACT (A71)  CTAGCCGAGGTCTCCACAAG (A72)  302 bp
exon2  GATCGAGCTGTCGCCTATGC (A69)  GCAAAGCAACGAGTTCTGAAGCG (A70)  376 bp

HOXB
B1
exon1  GTTGTAGGGCAAGAGGGTGTC (B1)  GCCTCCGTCTCCTTCTGATTG (B2)  387 bp
exon1  CCTGCGTTTCAGCAGAACTCC (B3)  CAGGGGAAGCAGAGATGCTTTG (B4)  475 bp
exon2  GAGAGAATTGACCTGGCCTTTCTC (B5)  AGTGCCTGGAAGCCCCATTG (B6)  406 bp
B2
exon1  TTCCGATCCTCCCTCCTGAC (B7)  GGCGGATTTCTTCTCTTTCATCCAAG (B8)  372 bp^c
exon1  GAGACCCAGGAGCCAAAAGC (B9)  AAGGAACCCCAACAGGCTCG (B10)  308 bp
exon2  GTCCGAACTGAGGTTGGGCT (B11)  CCTCTAAGCGAACGGCTAAAGG (B12)  479 bp
exon2  CGTGGGAAGCCTGCTGTCAC (B13)  TCCAGTAGACGGCCAAGGAG (B14)  407 bp
B3
exon1  GGTATCAGGCCTTTCCAAGTTGC (B15)  CAGTGACATTCCTGGCTCCGA (B16)  557 bp
exon2  CGATGCATTCAATTTCGGCGTGTTC (B17)  GGAACCAGATCTTGATCTGCCG (B18)  344 bp^b
exon2  GTGGAGCTGGAGAAGGAGTTC (B19)  ACGTACACCGGACTGCCCTG (B20)  461 bp
exon2  CGAGTATGAGCCGCACGTCC (B21)  GGTTCTGACCAGGAAGCCTGGGT (B22)  444 bp^e
B5
exon1  CGTGAAGCACAGGGTTATAACGAC (B23)  GGCCTCGTCTATTTCGGTGAAATTG (B24)  470 bp
exon1  CTGCACCAACGGCGACAGCC (B25)  GGCTCTGTCTACGAAGCTATGAG (B26)  422 bp^e
exon2  CGGAATAATGTTGCCTTGCGGCT (B27)  GTAACACAAGGCGAGGCAGG (B28)  418 bp
B6
exon1  CAAATCATAAACCCGGCGGAGC (B29)  GCTCACTAGTTCTGTGTCCCG (B30)  540 bp^e
exon2  GTCTGTGTTCGAGGGTCGACT (B31)  GGTCTCTCTGACGCCCGTGGC (B32)  411 bp
B7
exon1  CTCGTAAAACCGACACTAAAACGTCC (B33)  GAGGCGCCTTCAGGGTAATGTG (B34)  519 bp
exon2  GTGCTGGGATTACAGGTTTGAGC (B35)  TCCTGATTCAGTTCCCAGAGCTG (B36)  337 bp
B8
exon1  TGACTAGGAGCCGGCGAAT (B37)  TGCACCAGGTCTGGATCCT (B38)  442 bp^e
exon1  CGGCAATTTCTACGGCTACGAC (B39)  CTTCCAGAAGCTGGAGGAAAT (B40)  285 bp
exon2  GCTCGTTCACGAATCTTCCAACC (B41)  GCCAGAGCTCTCTCGGGCAG (B42)  493 bp^e
B9
exon1  GAAAGCCCTCACACCGGTCC (B43)  GGCGTGCCCTGTTTGAGCAG (B44)  502 bp
exon1  GCAGGTACCTCCGCACCTGGCT (B45)  GTTTCCCTTCACATTGTCCGCAG (B46)  340 bp
exon2  CCTTCCTGGGAGCTTGAAGAC (B47)  GCAGTCGTCACATAACTAAGAGTGAG (B48)  441 bp
B13
exon1  GCGAATGCAGGCGACTTGCG (B49)  CCCGGATATCCCGGATAGAAG (B50)  537 bp^f
exon1  TGTCCCGGAGCTCGCTGAAACC (B51)  GCTACTCAGACCTCCTTCAGAG (B52)  486 bp^f
exon2  GTGCACCTGTGAGATGACTTAGC (B53)  GCCTCTCAGCAGAGTCCTTG (B54)  458 bp

HOXC
C4
exon1  CTAGTAGGAGGGCTTTATGGAGC (C1)  CAGAGCGACTGTGATTTCTCGG (C2)  364 bp
exon1  CATCACCACCAGGAGCTGTACC (C3)  GTTCAATTGTGCTAACCCAGAGTCG (C4)  447 bp^e
exon2  CTCCTCCTGTTTCTCAGAGCTG (C4ex2F)  TAAGAGTCAATTTGTGTGTGAGGGGAG (C4ex2R)  521 bp^d
C5
exon1  CCCCTCAACTTCAAAGAGTCACAAATC (C5)  CTTTGATCTCCCCACTGCTCTTAG (C6)  434 bp
exon1  CGCCTTCCAACTCTCTCCACG (C7)  GGCCTATAAAACCCAGGCGGA (C8)  353 bp
exon2  GGGAACGCTGCAAGCTATTCAC (C9)  GGAAAGGCGAAAAGGAAAGGCGCAG (C10)  346 bp
C6
exon1  CTAGTTCCGAGTACAAACTGGAGAC (C11)  GGTCTGTGTGTTATGTCCTAAGGTG (C12)  370 bp
exon1  GAATGTCGTGTTCAGTTCCAGCC (C13)  GCCGGTCATAAAGCCAGTTCAG (C14)  284 bp
exon2  AGTCTGCAGAGGACGCTTTGC (C15)  GTGCCAGTGATAAATACAGAGTGTGTG (C16)  459 bp
C8
exon1  CGGGGTACTCGTGAGCCAGA (C47C8)  GCGCCTCGTAGCCATAGAATTTGGAGGC (C48(C8))  335 bp
exon1  CAACTCAGGCTACCAGCAGAAC (C49C8)  GGAGCCTTCCCAACTTCGAG (C50(C8))  401 bp
exon2  GAGTGGCAAAGGAAAACCAAGCC (C17)  GACAGTCGTAAACTTCTCAATTTATCTGCTAC (C18)  446 bp
C9
exon1  CAATGAGCTGTGGGGAAAAGGC (C19)  TCGAGCCAAGTCCGCATGTAGC (C20)  413 bp
exon1  GTGGTATATCACCCGTACGGC (C21)  GGAGGGCTGTAATTCGGCTG (C22)  400 bp^e
exon2  GGTAGAGTAGCAGAAGTCCTGGGC (C23)  GCTGATTGGCTTTTGTCTATCTTGC (C24)  441 bp
C10
exon1  GCTCCTCCGCTGTAGTATTGC (C25)  GGGAGTAGAGAGCTGCCTCG (C26)  439 bp
exon1  GTCTGCTGCATGTACAGCGCA (C27)  GAAGACCAAGTGCCGGAACGTC (C28)  479 bp
exon2  CACAGCAGGCTGGCAAAGGC (C29)  GCAGTCGCATTGCATTTATACTCAGG (C30)  475 bp
C11
exon1  GGATAACGCGTCATCTCGCCT (C31)  CATGAGGATCTCGGTGACGG (C32)  437 bp
exon1  GGCCGACGAGCTTATGCACC (C33)  CGGACGAGCTGGGATTTGTG (C34)  352 bp
exon1  GTTTCTTCGACAACGCCTACTGC (C35)  GGGAGAGAGGGCTTTGTAGG (C36)  407 bp
exon2  GGGTCTCACGTGTCTCTCT (C37)  CCACAGTCCAGTTTTCCACCG (C38)  405 bp^e
C12
exon1  CCAATGGTGACTGGTGCAG (C47)  GTACTTGAAGCCCGAGCGCAG (C48)  405 bp
exon1  GAGGACGGCAAGGGTTACTAC (C49)  TGGTAAGAGGTCGGGTTTGAATCC (C50)  405 bp^b
exon2  AAGGGCACTGGACTGGTCAC (C51)  CCCTCTTTTGCTCTCTGCCAG (C52)  405 bp
C13
exon1  TCCCTAGCTCGCTGCCTCT (C39)  GGTTCACGTTGTGCGACAGG (C40)  478 bp
exon1  GCGCCGTCTATACGGACATCC (C41)  CCATTGGAGAGAGCCCAGTG (C42)  399 bp
exon1  GCCAAGGAGTTCGCCTTCTAC (C43)  GCATTCCTCAGTGCAGCTCG (C44)  392 bp^e
exon2  TATCTCAGTCCAGCCGCTTGCCTCAC (C45)  GTTCGGTTATGGTACAAAGCGGAG (C46)  372 bp

HOXD
D1
exon1  GCCACTATTTACCTCCGGCTCAC (D53)  GTGGCGTAGTGGACGTGAGA (D54)  568 bp
exon1  CTACGAACCTGGTGCCGCACCTGCC (D55)  GAGGCAGACTCCATGGAATTCG (D56)  541 bp
exon2  CCCTGTCTTTACGTTGCAGGCAAA (D1)  GTCGCCTGGGACTTCTGCAGG (D2)  405 bp
D3
exon1  GCAGTGAAGGATACAGTGGTAGTC (D3)  TGCTGCTCTGAGTTCAGACCAGG (D4)  371 bp
exon1  GAACTCAATGGCAGCTGCATGCG (D5)  GCCCTGACTAGGCTTCCTTG (D6)  362 bp
exon2  CTTGGAATCATACCTCTCA (D7)  CGGCCAGGCCGTACATATTG (D8)  558 bp
exon2  CTAGCCAGTCCCCTGAGCGCA (D9)  GAGGTAATTTTGCCGCGAGTTCG (D10)  552 bp
D4
exon1  CTGCCCAACTTTATTCAGTTGACAGC (D49)  TTCCAGAGGACCCAGGCTAACG (D50)  599 bp
exon2  GCTGACCTGCCTGTCCTGTCTG (D11)  CTGGTGCGCAGGGAGAGATG (D21)  430 bp
D8
exon1  CTCGCTCTCTGGCTGCTTAGCG (D13)  GCTGTCTCTGTAAGTTATCGTATCCG (D14)  498 bp^e
exon1  CTGCAGCTCTATGGCAACAGC (D15)  GGGAGTTTTAAAGCCAGAACGTGAG (D16)  507 bp
exon2  CAGCAACATGCAGAGGTACCATAAC (D17)  TAGGTGGTGTCCACAGCATATGG (D18)  440 bp

The sequence of the PCR products was determined by using fluorescently labeled dideoxynucleotide terminators. When the sequence traces revealed double peaks, each of the 50 genomic DNA samples was amplified and sequenced individually to estimate allele frequencies. For HOXB13, each genomic DNA sample was amplified and sequenced individually because we are currently performing mutation analysis of HOXB13 as a separate project.
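Once the 50 samples have been genotyped individually at a variant site, the allele frequencies follow from simple allele counting. The sketch below assumes diploid genotype calls represented as pairs of bases; the function name and the example genotype counts are invented for illustration and are not data from this study.

```python
from collections import Counter


def allele_frequencies(genotypes):
    """Estimate allele frequencies from diploid genotype calls.

    genotypes: iterable of (allele1, allele2) tuples, one per individual.
    Returns a dict mapping each allele to its frequency among all
    chromosomes (2 per individual).
    """
    counts = Counter()
    for first, second in genotypes:
        counts[first] += 1
        counts[second] += 1
    total = sum(counts.values())
    return {allele: n / total for allele, n in counts.items()}


if __name__ == "__main__":
    # Hypothetical calls for 50 individuals at one site.
    calls = [("C", "C")] * 45 + [("C", "T")] * 4 + [("T", "T")] * 1
    print(allele_frequencies(calls))
```

With 50 individuals there are 100 chromosomes, so a variant carried by a single heterozygote corresponds to an estimated allele frequency of 0.01.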

RESULTS

Comparison of the human and mouse HOX sequences

Comparison of the cDNA sequences for known human HOX genes against the database of the human draft genome sequence revealed that the human HOX clusters are contained in the following BAC or PAC clones: RP1-167F23 (HOXA1–3), RP1-170O19 (HOXA4–13), RP11-361K8 (HOXB1–9), RP11-463M16 (HOXB13), RP11-834C11 (HOXC4–8), RP11-83K1 (HOXC9–13), and RP11-387A1 (HOXD1–13) (Table 1). The results of the phylogenetic footprinting analysis of the human and mouse genome sequences are depicted in Figure 2 as a percent identity plot.

Genome structure of the HOX genes

Bioinformatic analyses revealed the following (Fig. 1). The genome structure of 11 (HOXB1, HOXB2, HOXB7, HOXB13, HOXC4, HOXC6, HOXC10, HOXC11, HOXC13, HOXD9, and HOXD10) of the remaining 19 HOX genes was determined by comparing the human genome sequences with human HOX cDNA deposited in GenBank. The complete coding sequence data for murine cDNA, but not for human cDNA, were available for HOXA3, HOXA6, HOXB8, HOXB9, HOXC8, and HOXC9. The genome structure of these six genes was determined by comparing the human genome sequence with mouse Hox cDNA sequences. Only partial human or mouse cDNA sequences have been published for either HOXC12 or HOXD8. The genome structure of human HOXC12 and HOXD8 was determined with the help of the Genscan computer program and comparison with the EST database sequences.

Characterization of HOXC12

Only partial sequences have been published for human and mouse HOXC12 cDNA. The Genscan program identified a potential coding region between HOXC13 and HOXC11, and expression of this hypothetical transcript was examined by RT-PCR using a pair of primers (HC47: CCAATGGGTGACTGGTGCAG and HC52: CCCTCTTTTGCTCTCTGCCAG). RT-PCR of cDNA from the erythroleukemia cell line K562 yielded a single-size product, and sequencing of the RT-PCR product revealed that this genomic region encodes an open reading frame. The deduced amino acid sequence of this open reading frame is highly homologous to the zebrafish HOXC12 protein.
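How a sequenced product might be screened for an open reading frame is sketched below. The sketch assumes a deliberately simple definition (an ATG followed by the first in-frame stop codon on the forward strand). The function name and the toy sequence are hypothetical; the sequence is not that of HOXC12.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def longest_orf(seq):
    """Return (start, end) of the longest ATG-to-stop ORF on the forward strand.

    'end' is exclusive and includes the stop codon. Returns None when no
    complete ORF is found in any of the three forward reading frames.
    """
    seq = seq.upper()
    best = None
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first in-frame start codon opens a candidate ORF
            elif start is not None and codon in STOP_CODONS:
                if best is None or (i + 3 - start) > (best[1] - best[0]):
                    best = (start, i + 3)
                start = None  # look for the next ORF in this frame
    return best


toy = "GGATGAAACCCGGGTAGTT"  # hypothetical toy sequence
span = longest_orf(toy)
print(span, toy[span[0]:span[1]])
```

A real analysis would also scan the reverse strand and translate the frame for comparison with known homologues, as was done here against the zebrafish protein.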

TABLE 2. PCR primers for analysis of human HOX genes (continued)*

Gene  Exon  Forward primer  Name  Reverse primer  Name  Size (note)

D9  exon1  CTGCAGCCTGCGAACTAGTC  D45  TCCATCCAGGAGCGCACGTAGC  D46  430 bp
D9  exon1  CCTCTACCACCCGTACGTTC  D47  TCCTTCAGCGAACAGCCTGG  D48  493 bp
D9  exon1  GAGTTCTCGTGCAACTCGTTCC  D59  GGTAGGTGGTTATGGGAAGCTC  D60  365 bp
D9  exon2  CACAAGTTGTGAAAAGCGACCATCC  D51  AAGTCCAAGTCGCTGGAGAGTTTC  D52  483 bp

D10  exon1  CGGCAGAGGCATCCACAATTAC  D19  GGACAAGACTCAGGGACCAG  D20  514 bp
D10  exon1  CAATTGCTGCATGTATTCTGATAAGCGCA  D21  GAGAAGCGGGGACTATCTCAGG  D22  579 bp
D10  exon2  GGCCAAGGGTACATTTGAACAGTC  D23  CTCTCCAATCCTGGCCTCTGG  D24  496 bp

D11  exon1  CAATCGATGGCTCAGGTTGCTG  D25  TAGTCGCGGAAGGCCACTTC  D25R  265 bp
D11  exon1  TCGTCCTGCCAGATGACTTTCC  D26F  GAGAAGCTCGCGTTGCATGG  D26  278 bp (c)
D11  exon1  GCTCTTCAAGGCGCCTGAGCCGGTG  D29  GATGTATAAACCTCTTCGAATGCTTATAAAG  D30  490 bp (b)
D11  exon1  CGGGGGGCTACGCTCCCTAC  D57  CGAAGCCCTGTGGCAAGATGC  D28  239 bp (b)
D11  exon2  CCATATATCATCCCCCACGACG  D3fin  TGAAATACTGCAGACGGTCTCTGTTC  D31R2  375 bp (a)
D11  exon2  AGACTTCAACTCTCTCGGATGCTC  D32F  GACGTCATTAAACCCAAGGACAGTG  D32  232 bp

D12  exon1  GTCCAATCGTCTGAGCCTGTC  D33  CAGCGTAGTCATACTTGGCCGCT  D34  492 bp
D12  exon1  CGAAGAGCAGGCTAAGTTCTATGC  D35  GCCCGTACTCTGTCTATTGCG  D36  407 bp
D12  exon2  CGCAATAGACAGAGTACGGGC  D37  GTCTGTAAATCTCTTGGTGGCTTCTG  D38  434 bp

D13  exon1  GCCGCGCCATGGTGTCCTGC  D39  TGGGTGCTGGGCACTCTTTG  D39R  394 bp (b)
D13  exon1  TTTGCGTACCCCGGGACCTC  D40F  GCGATGACTTGAGCGCATTCTG  D40  236 bp
D13  exon1  CAGCGCTGGGCTACGGCTACCACT  D41  CAAGTAGGGGCGCATACTCTTAG  D42  503 bp (e)
D13  exon2  CTCAGCTAGGTGCTCCGAATATC  D43  GGCCTTTTGGAGACAACCGAATG  D44  369 bp (e)

*Unless otherwise noted, the Applied Biosystems TaqGold PCR system was used, with denaturation at 95°C for 1 min, annealing at 58°C for 1 min, and extension at 72°C for 1 min, for 35 cycles.
(a) GIBCO/BRL Platinum Taq PCR system with Enhancer 1× PCRx mix.
(b) GIBCO/BRL Platinum Taq PCR system with Enhancer 2× PCRx mix.
(c) GIBCO/BRL Platinum Taq PCR system with Enhancer 3× PCRx mix.
(d) GIBCO/BRL Platinum Taq PCR system.
(e) Applied Biosystems XL long-range PCR system.
(f) Annealing at 60°C for 1 min.
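Primer orientation in a table of this kind can be checked by reverse complementing. The following is a minimal sketch with a hypothetical helper name; the two sequences are copied from the D12 rows of Table 2. The check shows that the reverse primer D36 is the exact reverse complement of the forward primer D37, so the two amplicons share that primer site.

```python
# Translation table for the four unambiguous DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an unambiguous DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


d36 = "GCCCGTACTCTGTCTATTGCG"  # reverse primer of the D35/D36 amplicon
d37 = "CGCAATAGACAGAGTACGGGC"  # forward primer of the D37/D38 amplicon
print(reverse_complement(d37) == d36)
```

The same helper can be used to place a reverse primer on the forward strand of a genomic sequence before computing an expected amplicon size.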


Characterization of HOXD8

Only partial sequences have been published for human and mouse HOXD8 cDNA. The Genscan program identified a potential coding region between HOXD9 and HOXD4. Because the homeobox motif was not included in the hypothetical protein predicted by Genscan, we suspected that this hypothetical protein might be truncated at its carboxy end. An EST database search (Claverie, ’97) revealed that an EST sequence is transcribed from the genomic region corresponding to the 3′ end of the hypothetical transcript predicted by Genscan. We assumed that the hypothetical transcript and the EST are transcribed from the same gene and designed the RT-PCR primers accordingly.

Expression of the hypothetical transcript was assessed by RT-PCR using a pair of primers (D8RTF: CTCGCTCTCTGGCTGCTTAGCG and D8RTR: TAGGTGGTGTCCACAGCATATGG). RT-PCR of cDNA from fetal kidney yielded a single product. Sequencing of the PCR product revealed that this genomic region encodes an open reading frame containing the homeodomain.

Complete mutation analysis panel

The genome structure of 19 human HOX genes was determined as described above, and the exon-intron boundary sequences of all 39 HOX genes were delineated. Based on this sequence information, together with the published sequence information for the 20 other HOX genes, we were able to design PCR primers to amplify the entire coding region of each of the 39 HOX genes in 125 amplicons from genomic DNA (Table 2). Each PCR amplification generated only one product under the conditions specified in Table 2. The size of the PCR product always matched the size predicted from the genomic sequence. Although many of the HOX genes contain CG-rich regions that tend to be resistant to conventional PCR, we were able to obtain specific amplicons by use of three different commercially available PCR kits (Taq Gold PCR kit and XL long-range PCR kit [Applied Biosystems]; Platinum Taq DNA polymerase High Fidelity with PCRx Enhancer [GIBCO BRL]).
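The difficulty posed by CG-rich templates can be made concrete by computing the GC content of individual primers. The sketch below is an illustration only: the helper names are hypothetical, the two sequences are copied from Table 2 (D39 and D30), and the melting temperature uses the common rule of thumb of 2°C per A or T and 4°C per G or C, which is a rough estimate for short oligonucleotides and not a method described in this study.

```python
def gc_fraction(seq):
    """Fraction of G and C bases in a DNA sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def rule_of_thumb_tm(seq):
    """Approximate melting temperature in degrees C: 2*(A+T) + 4*(G+C)."""
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


primers = {
    "D39": "GCCGCGCCATGGTGTCCTGC",             # HOXD13 exon 1, forward
    "D30": "GATGTATAAACCTCTTCGAATGCTTATAAAG",  # HOXD11 exon 1, reverse
}
for name, seq in primers.items():
    print(name, len(seq), round(gc_fraction(seq), 2), rule_of_thumb_tm(seq))
```

Primers with a high GC fraction, such as D39, sit in the kind of region for which the enhancer-containing or long-range PCR systems listed in the Table 2 footnotes were used.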

Polymorphisms in HOX genes

Sequencing of HOXC11 amplified with primers HC33 (GGCCGACGAGCTTATGCACC) and HC34 (CGGACGAGCTGGGATTTGTG) revealed a proline (ccg) to serine (tcg) polymorphism (P130S, GenBank accession number NP_055027) in exon 1, whereas sequencing of HOXB13 amplified with primers HB49 (GCGAATGCAGGCGACTTGCG) and HB50 (CCCGGATATCCCGGATAGAAG) revealed a threonine (acg) to methionine (atg) polymorphism (T41M, GenBank accession number AA839863) in exon 1. Sequencing of HOXD9 amplified with primers HD59 (GAGTTCTCGTGCAACTCGTTCC) and HD60 (GGTAGGTGGTTATGGGAAGCTC) revealed a trinucleotide-repeat-length polymorphism in exon 1. There were three or four glutamines in exon 1 of HOXD9 (starting from Q266, GenBank accession number XP_002545), and the glutamine tract was encoded by a CAG repeat. Results of genotyping these three polymorphisms in normal Japanese individuals are summarized in Table 3.
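Allele frequencies follow directly from the genotype counts in Table 3. The sketch below derives them and adds a Hardy-Weinberg chi-square statistic as a consistency check. Only the genotype counts come from Table 3; the function names are hypothetical, and the Hardy-Weinberg comparison is an illustration that was not part of the reported analysis.

```python
def allele_frequencies(n_aa, n_ab, n_bb):
    """Return (freq_a, freq_b) from counts of AA, AB, and BB genotypes."""
    total_alleles = 2 * (n_aa + n_ab + n_bb)
    freq_a = (2 * n_aa + n_ab) / total_alleles
    return freq_a, 1 - freq_a


def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-square statistic comparing observed genotype counts with
    Hardy-Weinberg expectations (cells with zero expectation are skipped)."""
    n = n_aa + n_ab + n_bb
    p, q = allele_frequencies(n_aa, n_ab, n_bb)
    expected = (p * p * n, 2 * p * q * n, q * q * n)
    observed = (n_aa, n_ab, n_bb)
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)


# Genotype counts from Table 3: (homozygous first allele, heterozygous,
# homozygous second allele).
table3 = {
    "HOXC11 P130S (Pro, Ser)": (46, 4, 0),
    "HOXB13 T41M (Thr, Met)": (195, 5, 0),
    "HOXD9 ((CAG)3, (CAG)4)": (10, 26, 14),
}
for label, counts in table3.items():
    p, q = allele_frequencies(*counts)
    print(label, round(p, 4), round(q, 4), round(hwe_chi_square(*counts), 3))
```

With one degree of freedom, a statistic well below 3.84 gives no indication of departure from Hardy-Weinberg proportions at the 5% level, although the sample sizes here are small.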

DISCUSSION

We delineated the complete coding sequences and genome structure of 19 human HOX genes by using two complementary approaches: phylogenetic footprinting (Gumucio et al., ’93) and probabilistic prediction of coding sequences (Burge and Karlin, ’97). By also using previously published data on the 20 other HOX genes, the exon-intron boundary sequences of all the 39 HOX genes were delineated. Based on this sequence information, we successfully developed a set of PCR primers to amplify the entire coding region of each of the 39 HOX genes from genomic DNA in 125 amplicons.

Phylogenetic footprinting analysis proved successful in detecting coding sequences embedded in draft genome sequences. All 19 HOX genes characterized in the present study were recognized as potential coding regions by the phylogenetic footprinting analysis. The effectiveness of phylogenetic footprinting analysis as exemplified in the present study justifies concurrent sequencing of the mouse genome with the human genome (Oeltjen et al., ’97; Ansari-Lari et al., ’98; Brickner et al., ’99; Jang et al., ’99). Statistical prediction was shown to be sensitive as well. In retrospect, the genome structure of 35 of the 39 human HOX genes was correctly predicted by the Genscan program, as depicted in Figure 2. It is noteworthy that combined use of phylogenetic footprinting and statistical prediction allowed detection of HOXC12, a gene expressed in only certain tissues during specific developmental stages.

The mutation analysis panel herein described will allow a systematic survey of the HOX mutations of patients with homeotic mutation-like traits (Redline et al., ’92; Mark et al., ’97; Veraksa et al., ’00), such as VATER association, Rokitansky sequences, Poland anomaly, various limb reduction defects or genital defects, and the Moebius sequence. We suspect that simultaneous analysis of multiple HOX genes, rather than a single HOX gene, will be necessary to uncover the molecular basis of these homeotic defects, because double or triple HOX mutant mice exhibit distinctive phenotypes, even when single mutants show only a subtle phenotype (Horan et al., ’95; Capecchi, ’97).

TABLE 3. Genotype frequencies for three amino acid variants in Japanese populations

Genotype  Frequency

P130S polymorphism in HOXC11
Pro/Pro (ccg/ccg)  46 (92%)
Pro/Ser (ccg/tcg)  4 (8%)
Ser/Ser (tcg/tcg)  0 (0%)
Total  50

T41M polymorphism in HOXB13
Thr/Thr (acg/acg)  195 (97.5%)
Thr/Met (acg/atg)  5 (2.5%)
Met/Met (atg/atg)  0 (0%)
Total  200

Number-of-CAG repeat polymorphism starting from Q266 of HOXD9
(CAG)3/(CAG)3  10 (20%)
(CAG)3/(CAG)4  26 (52%)
(CAG)4/(CAG)4  14 (28%)
Total  50

We identified two common polymorphisms, an amino acid insertion in HOXD9 and an amino acid substitution in HOXC11, by sequencing PCR products amplified from mixed genomic DNA (Kwok et al., ’94), adding to three known polymorphisms [HOXA10 (Kolon et al., ’99), HOXA1, and HOXB1 (Ingram et al., ’00)]. HOX gene polymorphism(s) may contribute to susceptibility to teratogens, because altered Hox gene expression has been implicated in the pathogenesis of malformations after exposure to valproic acid (Faiella et al., ’00), maternal diabetes (Jacobs et al., ’98), and hyperthermia (Li and Shiota, ’99) in mouse models. Hence, the HOX gene polymorphisms detected in the present study warrant further evaluation through functional assays and epidemiological studies.

In summary, we have developed a comprehensive system for mutation analysis of the 39 human HOX genes, by taking advantage of the human and mouse draft genome sequences. Our results illustrate the usefulness of bioinformatic analysis of the draft genome sequences for clinically oriented research projects. It is hoped that the mutation analysis panel provided here will serve as a launchpad for exploring the association of HOX gene mutations with human malformations.

ACKNOWLEDGMENTS

This work was supported in part by grants from the Pharmacia Fund for Growth & Development Research, Keio Gijuku Academic Development Funds.

LITERATURE CITED

Akarsu AN, Stoilov I, Yilmaz E, Sayli BS, Sarfarazi M. 1996. Genomic structure of HOXD13 gene: a nine polyalanine duplication causes synpolydactyly in two unrelated families. Hum Mol Genet 5:945–952.

Altschul SF, Madden TL, Schaffer A, Zhang J, Zhang Z, Miller W, Lipman DJ. 1997. Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res 25:3389–3402.

Ansari-Lari MA, Oeltjen JC, Schwartz S, Zhang Z, Muzny DM, Lu J, Gorrell JH, Chinault AC, Belmont J, Miller W, Gibbs RA. 1998. Comparative sequence analysis of a gene-rich cluster at human chromosome 12p13 and its syntenic region in mouse chromosome 6. Genome Res 8:29–40.

Apiou F, Flagiello D, Cillo C, Malfoy B, Poupon MF, Dutrillaux B. 1996. Fine mapping of human HOX gene clusters. Cytogenet Cell Genet 73:114–115.

Brickner AG, Koop BF, Aronow BJ, Wiginton DA. 1999. Genomic sequence comparison of the human and mouse adenosine deaminase gene regions. Mamm Genome 10:95–101.

Burge C, Karlin S. 1997. Prediction of complete gene structures in human genomic DNA. J Mol Biol 268:78–94.

Capecchi MR. 1997. Hox genes and mammalian development. Cold Spring Harb Symp Quant Biol 62:273–281.

Claverie JM. 1997. Computational methods for the identification of genes in vertebrate genomic sequences. Hum Mol Genet 6:1735–.

Del Campo M, Jones MC, Veraksa AN, Curry CJ, Jones KL, Mascarello JT, Ali-Kahn-Catts Z, Drumheller T, McGinnis W. 1999. Monodactylous limbs and abnormal genitalia are associated with hemizygosity for the human 2q31 region that includes the HOXD cluster. Am J Hum Genet 65:104–110.

Devriendt K, Jaeken J, Matthijs G, Van Esch H, Debeer P, Gewillig M, Fryns JP. 1999. Haploinsufficiency of the HOXA gene cluster, in a patient with hand-foot-genital syndrome, velopharyngeal insufficiency, and persistent patent Ductus botalli. Am J Hum Genet 65:249–251.

Elgar G. 1996. Quality not quantity: the pufferfish genome. Hum Mol Genet 5:1437–1442.

Faiella A, Wernig M, Consalez GG, Hostick U, Hofmann C, Hustert E, Boncinelli E, Balling R, Nadeau JH. 2000. A mouse model for valproate teratogenicity: parental effects, homeotic transformations, and altered HOX expression. Hum Mol Genet 9:227–236.

Florea L, Hartzell G, Zhang Z, Rubin GM, Miller W. 1998. A computer program for aligning a cDNA sequence with a genomic DNA sequence. Genome Res 8:967–974.

Goodman FR, Bacchelli C, Brady AF, Brueton LA, Fryns JP, Mortlock DP, Innis JW, Holmes LB, Donnenfeld AE, Feingold M, Beemer FA, Hennekam RC, Scambler PJ. 2000. Novel HOXA13 mutations and the phenotypic spectrum of hand-foot-genital syndrome. Am J Hum Genet 67:197–202.

Gumucio DL, Shelton DA, Bailey WJ, Slightom JL, Goodman M. 1993. Phylogenetic footprinting reveals unexpected complexity in trans factor binding upstream from the epsilon-globin gene. Proc Natl Acad Sci USA 90:6018–6022.

Hardison R, Slightom JL, Gumucio DL, Goodman M, Stojanovic N, Miller W. 1997. Locus control regions of mammalian beta-globin gene clusters: combining phylogenetic analyses and experimental results to gain functional insights. Gene 205:73–94.

Horan GS, Ramirez-Solis R, Featherstone MS, Wolgemuth DJ, Bradley A, Behringer RR. 1995. Compound mutants for the paralogous HOXA4, HOXB4, and HOXD4 genes show more complete homeotic transformations and a dose-dependent increase in the number of vertebrae transformed. Genes Dev 9:1667–1677.

Ingram JL, Stodgell CJ, Hyman SL, Figlewicz DA, Weitkamp LR, Rodier PM. 2000. Discovery of allelic variants of HOXA1 and HOXB1: genetic susceptibility to autism spectrum disorders. Teratology 62:393–405.

Jacobs HC, Bogue CW, Pinter E, Wilson CM, Warshaw JB, Gross I. 1998. Fetal lung mRNA levels of Hox genes are differentially altered by maternal diabetes and butyrate in rats. Pediatr Res 44:99–104.

Jang W, Hua A, Spilson SV, Miller W, Roe BA, Meisler MH. 1999. Comparative sequence of human and mouse BAC clones from the mnd2 region of chromosome 2p13. Genome Res 9:53–61.

Kolon TF, Wiener JS, Lewitton M, Roth DR, Gonzales ET, Lamb DJ. 1999. Analysis of homeobox gene HOXA10 mutations in cryptorchidism. J Urol 161:275–280.

Kosaki K, Suzuki T, Kosaki R, Yoshihashi H, Itoh M, Goto Y, Matsuo N. 2001. Human homolog of the mouse imprinted gene Impact resides at pericentric region of chromosome 18 within the critical region for bipolar affective disorder. Mol Psychiatry 6:87–91.

Kwok PY, Carlson C, Yager TD, Ankener W, Nickerson DA. 1994. Comparative analysis of human DNA variations by fluorescence-based sequencing of PCR products. Genomics 23:138–144.

Li ZL, Shiota K. 1999. Stage-specific homeotic vertebral transformations in mouse fetuses induced by maternal hyperthermia during somitogenesis. Dev Dyn 216:336–348.

Mark M, Rijli FM, Chambon P. 1997. Homeobox genes in embryogenesis and pathogenesis. Pediatr Res 42:421–429.

McGinnis W, Krumlauf R. 1992. Homeobox genes and axial patterning. Cell 68:283–302.

McGinnis W, Levine MS, Hafen E, Kuroiwa A, Gehring WJ. 1984. A conserved DNA sequence in homoeotic genes of the Drosophila Antennapedia and bithorax complexes. Nature 308:428–433.

Mortlock DP, Innis JW. 1997. Mutation of HOXA13 in hand-foot-genital syndrome. Nat Genet 15:179–181.


Muragaki Y, Mundlos S, Upton J, Olsen BR. 1996. Altered growth and branching patterns in synpolydactyly caused by mutations in HOXD13. Science 272:548–551.

O’Brien SJ, Menotti-Raymond M, Murphy WJ, Nash WG, Wienberg J, Stanyon R, Copeland NG, Jenkins NA, Womack JE, Marshall Graves JA. 1999. The promise of comparative genomics in mammals. Science 286:458–481.

Oeltjen JC, Malley TM, Muzny DM, Miller W, Gibbs RA, Belmont JW. 1997. Large-scale comparative sequence analysis of the human and murine Bruton’s tyrosine kinase loci reveals conserved regulatory domains. Genome Res 7:315–329.

Redline RW, Neish A, Holmes LB, Collins T. 1992. Homeobox genes and congenital malformations. Lab Invest 66:659–670.

Rice P, Longden I, Bleasby A. 2000. EMBOSS: the European Molecular Biology Open Software Suite. Trends Genet 16:276–277.

Schwartz S, Zhang Z, Frazer KA, Smit A, Riemer C, Bouck J, Gibbs R, Hardison R, Miller W. 2000. PipMaker—a web server for aligning two genomic DNA sequences. Genome Res 10:577–586.

Scott MP. 1992. Vertebrate homeobox gene nomenclature. Cell 71:551–553.

Thompson AA, Nguyen LT. 2000. Amegakaryocytic thrombocytopenia and radio-ulnar synostosis are associated with HOXA11 mutation. Nat Genet 26:397–398.

Veraksa A, Del Campo M, McGinnis W. 2000. Developmental patterning genes and their conserved functions: from model organisms to humans. Mol Genet Metab 69:85–100.
