Order No.: xxxx

Doctoral School of Physics and Chemical Physics (École Doctorale de Physique, Chimie-Physique) - ED 182

UDS - IPHC - CNRS/IN2P3

THESIS

presented to obtain the degree of

Doctor of the University of Strasbourg

Discipline: Electronics, Electrical Engineering and Automation

Specialty: Instrumentation and Microelectronics

by

Wu GAO

Design of a Monolithic Front-End Readout Chip with a High-Precision TDC and a Time-Based ADC in CMOS Technology for PET Imaging

(French title: Conception d'un circuit de lecture monolithique "Front-End" avec un CTN haute précision et un CAN basé sur le temps en technologie CMOS pour l'imagerie TEP)

Publicly defended on 12 January 2011 before the jury:

Thesis supervisor: Yann HU, Professor - UDS, Strasbourg, France

Thesis co-supervisor: Deyuan GAO, Professor - NPU, Xi'an, China

Reviewer: Patrick GARDA, Professor - UPMC, Paris, France

Reviewer: Yu-Shan LI, Professor - Xidian Univ., Xi'an, China

Examiner: David BRASSE, Research Scientist (Chargé de Recherches) - CNRS, France

Examiner: Tingcun WEI, Professor - NPU, Xi'an, China

Examiner: Christine HU-GUO, Principal Research Engineer (Ingénieur de Recherches) - CNRS, France

Examiner: Xiaoya FAN, Professor - NPU, Xi'an, China

If I have been able to see further, it was only because I stood on the

shoulders of giants.

Newton

Contents

Acknowledgments
Résumé
Abstract

1 Introduction
1.1 Research background
1.1.1 PET imaging
1.1.2 Proposed small animal PET
1.1.3 Front-end electronics
1.2 Advances in PET front-end chips
1.3 Proposed work
1.3.1 Motivation
1.3.2 Main contributions of this work
1.4 Thesis overview

2 A survey on front-end electronics for photodetectors
2.1 Overview of front-end electronic systems
2.2 Photo-electric conversion
2.3 Signal acquisition
2.3.1 Voltage-sensitive amplifiers
2.3.2 Current-sensitive amplifiers
2.3.3 Charge-sensitive amplifiers
2.4 Pulse height analysis
2.4.1 CR-RC shaping
2.4.2 Semi-Gaussian shaping
2.5 Peak detect sample and hold
2.5.1 Peak sampling using a fixed delay
2.5.2 Peak-track-and-hold
2.6 Analog-to-digital conversion
2.7 Time discriminator
2.8 Time-to-digital conversion
2.9 Conclusions

3 Design of Front-End Analog Signal Processing Circuits
3.1 Specifications and architectures
3.2 Circuit descriptions
3.2.1 Preamplifier with the variable gain stage
3.2.2 CR-RC shaper
3.2.3 Time-stamp circuits
3.2.4 Analog memory
3.3 Experimental results and discussions
3.3.1 Linearity measurement
3.3.2 "Time Walk" of the triggers
3.3.3 Trigger efficiency
3.3.4 Crosstalk between channels
3.3.5 Noise and power dissipation
3.3.6 Comparison of overall performance
3.4 Conclusions

4 Design of Low-Jitter Multiphase Delay-Locked Loops
4.1 Overview of DLL techniques
4.1.1 Architectures and operational principle
4.1.2 Behavior models
4.1.3 Jitter models
4.1.4 Circuit techniques for charge-pump DLLs
4.2 Proposed multiphase charge-pump DLL
4.2.1 Proposed architecture
4.2.2 Circuit description
4.2.3 Experimental results
4.3 Optimized charge-pump DLL
4.3.1 Optimized VCDL
4.3.2 Dynamic phase detector
4.3.3 Optimized charge pump
4.3.4 Optimized loop filter
4.3.5 Experimental results
4.4 Conclusions

5 Design of Multi-Channel Coarse-Fine Time-to-Digital Converters
5.1 Design considerations
5.2 Design of a 625-ps multi-channel coarse-fine TDC
5.2.1 Proposed architecture
5.2.2 Circuit description
5.2.3 Experimental results and discussions
5.3 Design of a multi-channel TDC based on a DLL array
5.3.1 Time interpolation using a DLL array
5.3.2 Proposed TDC based on a DLL array
5.3.3 Experimental results and discussion
5.4 Conclusions

6 Design of a Multi-Channel Time-Based Analog-to-Digital Converter
6.1 Overview of time-based ADCs
6.1.1 Pulse-width-modulation ADC
6.1.2 VCDL-based ADC
6.1.3 VCO-based ADC
6.1.4 Classic Wilkinson ADC
6.1.5 Improved ramp ADC
6.1.6 Comparison of time-based ADCs
6.2 Proposed time-based ADC for PET imaging
6.2.1 Ramp generator
6.2.2 Comparator
6.2.3 Digital DLL
6.2.4 Gray counter
6.2.5 Sampling and readout circuits
6.2.6 Timing controller
6.3 Error analysis
6.3.1 Errors introduced by the ramp generator
6.3.2 Errors introduced by the comparator
6.3.3 Errors introduced by the counter and the DLL
6.3.4 DNL model
6.4 Experimental results
6.4.1 Performance of the VTC
6.4.2 Performance of the digital DLL
6.4.3 Performance of the whole ADC
6.5 Conclusions

7 Conclusions
7.1 Proposed work
7.2 Future work and perspectives

Biography

Acknowledgments

It goes without saying that I am indebted to all the people whose contributions, small and large, made my work and my life easier during the period I spent working on this thesis. The list of their names would be too long to write down. However, my sincerest appreciation goes to my advisors, Prof. Yann Hu and Prof. Deyuan Gao. Not only have they shown a passion for research and an instinctive sense for electronic phenomena, but they have also been good teachers and mentors to me. I thank Prof. Patrick Garda and Prof. Yu-shan Li for reading my thesis. I would also like to thank Dr. Christine Hu-Guo, Dr. David Brasse, Prof. Tingcun Wei and Prof. Xiaoya Fan for their technical support and valuable discussions.

I gratefully acknowledge Bernard Humbert, Christian Fuchs, David Bonnet and Patrick Bard from IPHC for their help with my PhD research. Their knowledge and guidance were essential to completing my project. I thank Marc Winter, Claude Colledani, Nicolas Ollivier-Henry, Guy Doziere, Andrea Brogna, Mokrane Dahoumane, Andrei Dorokhov, Wojciech Dulinski, Abdelkader Himmi, Frederic Morel, Isabelle Valin, Sylviane Molinet, Christian Illinger and Gilles Claus for their technical support in the design and testing of my prototypes. I thank Xiaochao Fang, Nicolas Pillet, Jia Wang, Yunan Fu, Xiaomin Wei, Ying Zhang, Liang Zhang, Jerome Nanni and Min Fu for their kind help and valuable suggestions.

This work was partly carried out at Northwestern Polytechnical University, Xi'an, China. I would like to acknowledge the help of Prof. Shengbing Zhang, Prof. Danghui Wang, Dr. Jianfeng An, Dr. Meng Zhang, Dr. Xiaoping Huang and Dr. Ran Zheng. I also thank my colleagues Yinjun Yang, Jianing Bai and Chao Chen of the Engineering Research Center of Embedded System Integration, Ministry of Education, China.

Throughout my graduate studies, I was fortunate to receive a scholarship from the Chinese government and, later, an EGIDE scholarship from France. I am thankful for the support of the Chinese and French people.

In addition, I am grateful to my parents for their love, prayers, and support. My final thoughts and thanks go to Miss Jia Ling, who has shown me kindness and tolerance throughout my PhD studies.

Strasbourg, 2010


Résumé

Introduction

Positron emission tomography (PET) is a noninvasive molecular imaging technique that measures the in vivo biodistribution of imaging agents labeled with radioisotopes. The principle is based on detecting the gamma radiation that results from the annihilation of positrons emitted by the radiotracer. A pair of 511 keV photons results from the annihilation of a positron and an electron. A line of response (LOR) is defined by two 511 keV photons emitted in opposite directions and detected in coincidence. The energy deposited by each photon in the detector is converted into an electrical signal that is amplified and digitized by a chain of readout electronics. When the system requires hundreds or thousands of LORs, a multi-channel application-specific integrated circuit (ASIC) is needed to provide a compact signal-processing platform. It comprises charge-sensitive amplifiers (CSAs), shapers and/or discriminators for each detector channel. This thesis deals with the design of a readout ASIC dedicated to a multi-channel photodetector (MCP, Photonis Corp.) equipped with LYSO crystals.

In most PET systems, the geometry of the detection cell is based on a structure made of crystal elements coupled to a reduced number of photomultiplier tubes. Usually, the spatial resolution and the detection efficiency of PET imaging systems are closely linked; in other words, it is very difficult to obtain high resolution together with good efficiency. In this study, the crystals are oriented in the axial direction and read out on both sides by individual photodetector channels, which makes the spatial resolution and the detection efficiency independent of each other. The detector module is shown in Figure 1. In this original geometry, the position of the photon interaction within a crystal is related to the peak value of the output pulse generated by the photodetector. Consequently, this position can be computed from the measured pulse peak on each side of the crystal of interest. To overcome the difficulties caused by technology limitations, this thesis proposes a new method that uses current-mode readout circuits and time-based digital converters to process the energy information. The weak signals from the detector are read out by a charge amplifier with a regulated-cascode structure and conditioned by a semi-Gaussian shaping circuit.


The peak value of the shaped signal in each channel is stored in an analog memory controlled by a monostable circuit and digitized by a multi-channel, low-power, high-resolution (> 10 bits) analog-to-digital converter (ADC).
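As an illustration of how an interaction position could be derived from the two pulse peaks, the sketch below uses a first-order light-sharing estimator based on the asymmetry of the two amplitudes. This estimator, the function name `interaction_position_mm` and the 20 mm crystal length are assumptions made for the example; the text only states that the position is computed from the peak measured on each side, and a real detector would need a measured calibration curve.

```python
def interaction_position_mm(peak_a: float, peak_b: float,
                            crystal_length_mm: float) -> float:
    """Estimate the axial position (distance from end A, in mm).

    Assumption: the light collected at each end varies linearly with
    position, so the amplitude asymmetry maps linearly onto the crystal.
    """
    total = peak_a + peak_b
    if total <= 0.0:
        raise ValueError("the sum of the two peaks must be positive")
    # +1 if all light reached end A, -1 if all light reached end B.
    asymmetry = (peak_a - peak_b) / total
    # Map the asymmetry range [-1, 1] onto [L, 0] measured from end A.
    return 0.5 * crystal_length_mm * (1.0 - asymmetry)


# Hypothetical 20 mm crystal: equal peaks place the event at the center.
print(interaction_position_mm(1.0, 1.0, 20.0))
# A larger peak at end A moves the estimate toward end A.
print(interaction_position_mm(3.0, 1.0, 20.0))
```

The estimate depends only on the ratio of the two peaks, which is one reason accurate peak digitization matters in this geometry.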

Figure 1: The detection module. (a) Rendered image of the four detection modules; (b) rendered image of the complete system including the front-end electronics. Labels in the figure: MCP PMT, LYSO:Ce crystal array, detection module, acquisition board (DAQ).

Moreover, time-of-flight (ToF) PET imaging has demonstrated a superior ability to deliver a better reconstructed image than conventional PET. The ToF PET approach measures the difference in time of flight between the two detected photons and yields an approximate position of the annihilation point. This method is mainly limited by the ability to measure the arrival time of the two photons. The ASIC must include a high-precision time-to-digital converter (TDC) to reach the required time resolution while maintaining good stability. Note that a time resolution of 500 ps improves the signal-to-noise ratio of the reconstructed image by a factor of 4. This effect is smaller for small-animal imaging. Nevertheless, developing a PET imaging system dedicated to small animals is a very good opportunity to test an original approach for improving the time resolution. To measure this tiny time difference, a bin size of 100 ps is necessary. Design techniques for wide-range, high-resolution TDCs are presented later in this summary. In particular, a TDC based on an array of low-jitter delay-locked loops (DLLs) is explored to improve the resolution.
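To give a sense of scale for these timing figures, the sketch below converts a time-of-flight resolution into the corresponding localization uncertainty along the line of response, using the standard relation Δx = c·Δt/2. The function name is chosen for this example only.

```python
SPEED_OF_LIGHT_M_PER_S = 299_792_458.0


def tof_position_uncertainty_mm(time_resolution_ps: float) -> float:
    """Localization uncertainty along the line of response, in mm.

    The two photons travel in opposite directions, so a timing
    difference dt corresponds to a displacement of c * dt / 2.
    """
    time_resolution_s = time_resolution_ps * 1e-12
    return 0.5 * SPEED_OF_LIGHT_M_PER_S * time_resolution_s * 1e3


for resolution_ps in (500.0, 100.0):
    dx = tof_position_uncertainty_mm(resolution_ps)
    print(f"{resolution_ps:.0f} ps -> about {dx:.1f} mm")
```

A 500 ps resolution corresponds to roughly 75 mm and 100 ps to roughly 15 mm, which suggests why the benefit is smaller for a small-animal scanner and why a fine bin size is sought.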

Objectives

This thesis focuses on the design and development of a monolithic front-end ASIC composed of current-mode readout circuits and time digitizers for a small-animal PET imaging system. The architecture of the proposed readout circuit is shown in Figure 2.

Figure 2: Proposed architecture of the monolithic front-end readout circuits. Labels in the diagram: Detectors; Front-End Readout Chains (Channel #1 to Channel #N, with outputs Vin<1> to Vin<N> and Tin<1> to Tin<N>); Analog-to-Digital Interface (High-resolution ADC, High-precision TDC); Monolithic Front-End ASIC; outputs E and T.

The design techniques must address the following points.

• Low-noise analog readout circuits. As a rule, a high signal-to-noise ratio (SNR) in the front-end electronics means better spatial resolution for the PET imaging system. In addition, because CMOS technology uses the same substrate for both NMOS and PMOS transistors, ground noise from the digital part and the data converters will affect the performance of the analog front end.

• Multiple channels. In general, multi-channel ASICs make the front-end electronics compact. The number of channels depends on the chosen detector. The MCP-PMT photodetectors offered by Photonis contain many channels, either 8 × 8 or 32 × 32 anodes, and therefore require a multi-channel ASIC. Two considerations must be taken into account: the number of integrated channels, and the strategy for data conversion and readout. For a 64-channel ASIC with a 16-bit TDC, the total depth is 64 × 16 bits = 128 bytes. Moreover, a high sampling rate produces a large amount of data to store and then read out.

• Wide dynamic range. Gamma-ray detection for axial-readout PET imaging requires a CSA with a wide dynamic range and a high-resolution ADC. The trade-off among power consumption, speed and resolution determines the choice of a suitable architecture. In addition, because the timing information must be extracted from a fast pulse, a high-resolution comparator and a TDC with sub-nanosecond conversion must be designed. The fabrication process significantly affects the performance of time-measurement circuits. Designing such circuits requires both skill and experience.

• Low power consumption. For PET imaging, the power consumption of the front-end electronics becomes a crucial parameter. The large number of channels requires the lowest possible consumption per channel. Otherwise, heat dissipation would raise the temperature inside the detector ring, and the temperature rise would in turn degrade the performance of the electronic components.

• Technology-node scaling. Considering the performance and cost of the ASIC, relatively recent CMOS technologies such as 90 nm or 65 nm can be considered. The higher integration density offered by these technologies encourages analog designers to consider new front-end circuits. Although digital circuits then benefit from lower power dissipation, higher speed and lower cost, analog circuits must face challenges such as reduced supply voltage, low intrinsic gain, leakage currents and increased 1/f noise.

Two solutions have been proposed to overcome these challenges. The first is to apply the "digitally-assisted analog design" method, in which analog blocks are implemented with digital circuits that can easily be reused in a future technology. The other is to use current-mode analog circuits coupled with digitization of the signal by a time converter. In theory, the supply reduction does not affect current-mode circuits, unlike voltage-mode circuits. However, 1/f noise and leakage currents become non-negligible in current-mode circuits. The same principle can be applied to time-mode circuits such as the time amplifiers needed in time-to-digital converters (TDCs). Unlike voltage-mode circuits, time-mode circuits benefit more readily from integration at 90 nm or below. One obvious feature is that gate and wire delays decrease, which increases the dynamic range. However, time-mode circuits using this technique have a reduced processing speed, so a trade-off between the time-based architecture and circuit speed must be taken into account. This thesis is devoted to current-mode readout circuits and time-based data converters.


Work carried out

Three prototypes were designed in 0.35 µm CMOS technology.

Analog front-end signal-processing circuits

The first prototype is a 10-channel analog front-end signal-processing circuit. The schematic of one front-end readout channel is shown in Figure 3. A preamplifier is connected directly to the output of each anode of the MCP photodetector. The weak current generated by the detector is first amplified by the preamplifier. The preamplifier output is split and sent to an "energy channel" for the energy measurement and to a "trigger channel" for the TDC.

Figure 3: Schematic of one channel of the front-end readout chip. Labels in the diagram: Detector (LYSO+MCP); Preamplifier; Gain Adjustment; Integrator; Shaper; Analog Memory; Output Buffer; Current Comparator; Monostable; outputs Energy (to ADC), Hold (to TDC) and Hit (to TDC).

In the energy channel, a gain-adjustment stage is used to trim the amplification gain precisely, because of the gain dispersion among the photodetector anodes. In addition, the output current must be compensated because of the photodetector leakage current. The adjusted current signal is then integrated by an RC cell and shaped by a second-order CR-RC shaper. The shaper output is stored in a sample-and-hold circuit implemented with MOS switches and capacitors. This operation allows continuous readout with no dead time in acquisition mode. The differential architecture of the slow shaper is designed to optimize the signal-to-noise ratio.
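The behavior of a CR-RC shaper can be illustrated with its textbook step response. The sketch below assumes an ideal shaper with equal differentiation and integration time constants, for which the normalized output is (t/τ)·exp(−t/τ); the value τ = 300 ns is a hypothetical choice for the example, not a figure taken from the chip.

```python
import math


def cr_rc_step_response(t_s: float, tau_s: float) -> float:
    """Normalized response of an ideal CR-RC shaper to a unit step.

    Assumption: both time constants are equal to tau_s.
    """
    if t_s < 0.0:
        return 0.0
    x = t_s / tau_s
    return x * math.exp(-x)


def find_peak(tau_s: float, points_per_tau: int = 1000,
              span_in_tau: int = 10) -> tuple[float, float]:
    """Locate the peak time and peak value by a simple grid search."""
    best_t, best_v = 0.0, 0.0
    for i in range(points_per_tau * span_in_tau + 1):
        t = i * tau_s / points_per_tau
        v = cr_rc_step_response(t, tau_s)
        if v > best_v:
            best_t, best_v = t, v
    return best_t, best_v


TAU_S = 300e-9  # hypothetical shaping time constant
peak_time_s, peak_value = find_peak(TAU_S)
print(f"peaking time = {peak_time_s * 1e9:.0f} ns, peak = {peak_value:.4f}")
```

Under these assumptions the pulse peaks one time constant after the step, at 1/e of the step amplitude. This is the instant at which a hold signal would need to sample the analog memory.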

The trigger channel, which generates a precise trigger, consists of a current comparator followed by a monostable circuit. The input current is compared with a reference current whose value is equivalent to an energy level of 53 fC. The monostable circuit provides the "hold time", that is, the time needed to sample the peak value of the shaped pulse into the analog memory.


Figure 4 shows a photograph of the layout of this ten-channel prototype. Its area is 2.8 × 2.18 mm². The prototype was simulated and measured. The input dynamic range extends from a few fC to 104 pC, and the output range of the front-end circuit goes from 1.2 V to 3.2 V.

Figure 4: Photograph of the prototype chip (IMOTEPA).

TDC based on a counter and an array of DLLs

The second prototype is a multi-channel TDC based on a dual-counter architecture and an array of DLLs. Its architecture is shown in Figure 5. The TDC comprises an array of 5 DLLs, two 10-bit Gray counters, 64-channel readout circuits and parallel-in serial-out registers. The proposed architecture has been studied since 1996, when it was used to obtain a resolution of 89 ps with an 80 MHz clock in 0.7 µm CMOS technology. In the present study, all circuits are designed in 0.35 µm CMOS technology. In addition, low-jitter circuit techniques for delay-locked loops are adopted. As a result, the proposed TDC aims at a bin size of 71 ps with a 100 MHz clock.
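The quoted bin sizes can be checked with a short calculation. The sketch below assumes the interpretation suggested by the labels of Figure 5: delay lines of 35 and 28 cells are each locked to one clock period, so their cell delays are T/35 and T/28, and the effective bin is the difference between the two, T/140. The function name is hypothetical, and applying the same cell counts to the 1996 design is an assumption of this example.

```python
from fractions import Fraction


def dll_array_bin_ps(clock_hz: int, cells_short: int,
                     cells_long: int) -> Fraction:
    """Effective bin size, in ps, of a DLL-array time interpolator.

    Assumption: both delay lines span exactly one clock period, and the
    bin is the difference between their cell delays (vernier principle).
    """
    period_ps = Fraction(10**12, clock_hz)
    short_cell_ps = period_ps / cells_short  # line with more cells
    long_cell_ps = period_ps / cells_long    # line with fewer cells
    return long_cell_ps - short_cell_ps


for clock_hz in (100_000_000, 80_000_000):
    bin_ps = dll_array_bin_ps(clock_hz, cells_short=35, cells_long=28)
    print(f"{clock_hz / 1e6:.0f} MHz -> {float(bin_ps):.1f} ps per bin")
```

With 35 and 28 cells, a 100 MHz clock gives about 71.4 ps, and an 80 MHz clock gives about 89.3 ps, in line with the figures quoted above.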

In this design, a Gray counter is proposed. Because only one bit changes at each clock tick, the Gray counter consumes less power and generates less noise than a binary counter of the same resolution. For the same reason, the Gray counter can also run faster than a binary counter. However, a Gray-to-binary converter must be used to produce binary representations of the code.
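The single-bit-change property and the Gray-to-binary conversion can be sketched in software as follows. This is a behavioral illustration using the standard reflected Gray code, not a description of the on-chip logic; the function names are chosen for the example.

```python
def binary_to_gray(value: int) -> int:
    """Reflected Gray code of a non-negative integer."""
    return value ^ (value >> 1)


def gray_to_binary(gray: int) -> int:
    """Inverse conversion: XOR together all right shifts of the code."""
    value = 0
    while gray:
        value ^= gray
        gray >>= 1
    return value


COUNTER_BITS = 10  # matches the 10-bit counters described above
codes = [binary_to_gray(n) for n in range(2**COUNTER_BITS)]

# Exactly one bit toggles between consecutive counts, wrap-around included.
for i, code in enumerate(codes):
    following = codes[(i + 1) % len(codes)]
    assert bin(code ^ following).count("1") == 1

print(f"{binary_to_gray(5):04b}")  # Gray code of binary 0101
```

Because a latched Gray code differs from its neighbors by one bit, sampling the counter during a transition can be wrong by at most one count, which is a common reason for this choice in TDC designs.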

Figure 5: Architecture of a TDC based on a counter and an array of delay-locked loops. Labels in the diagram: Array of DLLs (N=35, tm = 4Δt, Tdelay = 140Δt; N=28, tm = 5Δt, Tdelay = 140Δt); Clk_100M; Resetb; 10-bit Gray-code Counter #1 and Counter #2; Readout Circuits for Channel #0 to Channel #N (inputs Hit<0> to Hit<N>); Parallel-In-Serial-Out Registers; 18-bit Time Words.

A three-channel prototype based on the proposed architecture and circuit techniques was designed in 0.35 µm CMOS technology. The chip integrates an array of four DLLs with 35 delay cells and one DLL with 28 delay cells, two 10-bit Gray counters, readout circuits for the 3 channels, and a JTAG controller. A photograph of the chip is shown in Figure 6. The chip size is 3.6 × 2.5 mm². A TDC resolution of 100 ps can be reached with good linearity.

Multi-channel ADC based on high-resolution time measurement

The third prototype is a multi-channel ADC based on high-resolution time measurement. In IMOTEPAD, the front-end ASIC developed at IPHC, the ADC function is not implemented on chip. A discrete 14-bit, 20 MS/s ADC follows each ASIC to digitize the voltage outputs of the 64 serialized signals. Although this scheme can perform the digitization, the accuracy of the voltage signals is generally degraded by charge injection from the MOS switches in the 64-to-1 multiplexer and by the sample-and-hold operations in the ADC. At the same time, synchronizing the output data makes the PCB design difficult. To overcome these problems, an integrated multi-channel ADC is proposed to replace the discrete ADC.

Figure 6: Photograph of the high-precision three-channel TDC prototype (3.6 mm × 2.5 mm; the bias circuits and counters are labeled).

Figure 7 shows the architecture of the proposed ADC. The new architecture is based on the improved ramp ADC architecture, which consists of two parts: a voltage-to-time converter (VTC) and a time-to-digital converter (TDC). The VTC consists of a ramp generator and comparators. The ramp generator is an integrator driven by a highly linear current source. All components of the ramp generator are integrated. The comparator is built from multi-stage amplifiers to generate highly sensitive "Hit" pulses with a fixed latency. The VTC supports a conversion accuracy of 14 bits. The TDC consists of a Gray counter, a multiphase clock generator (MCG) implemented by a DLL, registers and encoders. The Gray-code counter and the digital delay-locked loop are chosen for low-power design considerations.
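The conversion principle can be summarized with a small behavioral model: the ramp crosses the input voltage after a time proportional to that voltage, and the TDC expresses this time as a coarse count of clock periods plus a fine count of clock phases. The sketch below is an idealized illustration with no offset, noise or nonlinearity; the ramp slope and the 10 ns clock period are hypothetical values, and only the 16 phases echo the 16-cell DLL mentioned in the text.

```python
from fractions import Fraction
from math import floor


def ramp_adc_code(vin_v: Fraction, ramp_slope_v_per_s: Fraction,
                  clock_period_s: Fraction, phases: int) -> tuple[int, int, int]:
    """Ideal ramp (Wilkinson-style) conversion.

    Returns (coarse_count, fine_phase, combined_code).
    Exact rational arithmetic avoids rounding surprises at bin edges.
    """
    crossing_time_s = Fraction(vin_v) / Fraction(ramp_slope_v_per_s)
    fine_bin_s = Fraction(clock_period_s) / phases
    code = floor(crossing_time_s / fine_bin_s)
    coarse_count, fine_phase = divmod(code, phases)
    return coarse_count, fine_phase, code


# Hypothetical operating point: 0.1 V/us ramp, 10 ns clock, 16 phases.
SLOPE = Fraction(100_000)            # volts per second
CLOCK_PERIOD = Fraction(1, 10**8)    # seconds
print(ramp_adc_code(Fraction("1.0001"), SLOPE, CLOCK_PERIOD, 16))
```

With these illustrative numbers, one fine bin lasts 0.625 ns and corresponds to 62.5 µV of input, so the voltage resolution is set jointly by the ramp slope and the time resolution of the TDC.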

The blocks in the proposed ADC are mainly implemented with digital circuits, except for the ramp generator, the comparators and the associated bias circuits. The result is a "big digital, small analog" architecture that is well suited to technology scaling. In addition, the delay lines are controlled by digital signals generated by the DLL. Compared with analog control signals, digital signals can drive large loads, so that several conversion channels can be integrated.

A prototype based on the proposed ADC architecture and circuits was designed and fabricated in 0.35 µm CMOS technology. The prototype comprises a ramp generator, a 10-bit Gray counter, a DLL with 16 delay cells, 8 channels of signal sampling and readout, a timing controller, bias circuits, and a JTAG controller. A photograph of the ADC prototype is shown in Figure 8. Its size is 2190 × 2600 µm².

Figure 7: Block diagram of the proposed ADC.

Figure 8: Photograph of the proposed ADC.

Main contributions

The main contributions of this work can be listed as follows.

• A survey of advances in front-end electronics and signal-processing techniques for PET imaging. The various existing techniques are presented in more detail, together with their advantages and drawbacks.

• Proposal of a monolithic front-end readout architecture with a 100 ps TDC and a 12-bit time-based ADC. These ideas may open a new research direction for axial-geometry PET systems coupled to photodetectors on both sides and dedicated to small-animal imaging.

• Research on precise time-based multiphase clock generation using low-jitter DLLs. Design techniques for low-jitter charge-pump DLLs and for time interpolation using an array of DLLs are also presented. New techniques are proposed to build precise multiphase time interpolation using an array of DLLs.

• Design techniques for a high-precision multi-channel TDC using Gray-counter circuits and an array of DLLs. A three-channel TDC prototype was designed and fabricated in 0.35 µm CMOS technology. The circuit was tested on a 6-layer test board (printed circuit board, PCB). A bin size of 71 ps was obtained with a 100 MHz clock.

• Design techniques for a digital DLL with linear delay lines. Digital DLL circuits are proposed to ease redesign when the technology changes. A prototype DLL with 16 delay cells was also designed.

• Design technique for a 12-bit multi-channel time-based ADC using the Wilkinson architecture and digital TDC techniques. An 8-channel prototype was designed and fabricated in 0.35 µm CMOS technology. The chip was tested on a four-layer PCB. The measured DNL and INL are 0.5 LSB and 0.75 LSB, respectively.

Conclusions

This thesis presents the design of front-end readout circuits dedicated to measuring the energy and the coincidence time in PET imaging systems based on a detector module made of an inorganic LYSO:Ce scintillating crystal read out at both ends by two MCP-PMT photodetectors (Photonis Corp.). Since 2007, three prototypes have been designed and fabricated in 0.35 µm CMOS technology: an analog front-end signal processing circuit, a TDC based on counter and DLL-array techniques, and a high-resolution multi-channel time-based ADC.

In future designs, the current-mode readout circuits and the time-based data converters will be integrated according to the specific applications. Because CMOS technology has scaled down to the nanometer range, the design challenges raised by technology scaling will be taken into account.

Abstract

Positron Emission Tomography (PET) is a noninvasive molecular imaging technique that measures the in vivo biodistribution of imaging agents labeled with positron-emitting radionuclides. The physical principle is based on the detection of the gamma radiation that results from the annihilation of the positrons emitted by the radiotracer. The annihilation of a positron with an electron produces a pair of 511 keV photons. A line of response (LOR) is defined by two 511 keV photons emitted in opposite directions and detected in coincidence. The energy deposited by each photon in the detector is converted into an electrical signal, which is amplified and digitized by a chain of readout electronics. When the system requires many hundreds or even thousands of LORs, a multi-channel application-specific integrated circuit (ASIC) is necessary to provide a compact platform for processing the many signals from the detector elements. This thesis focuses on the design of a full-custom front-end readout ASIC dedicated to the Photonis Corp. microchannel plate (MCP) photodetector coupled to LYSO crystals.

In most PET systems, the geometry of the detector module is based on a block structure in which the crystal elements are coupled to a reduced number of photomultiplier tubes (PMTs). In such systems, spatial resolution and detection efficiency limit each other, so high resolution cannot be obtained together with good efficiency. In this study, the crystals are oriented in the axial direction and read out on both sides by individual photodetector channels, which allows the spatial resolution and the detection efficiency to be independent of each other. With this configuration, the position at which the photon hits the scintillation crystal is determined from the amount of charge generated by the photodetectors. As a result, both the energy and the time information must be measured.

For the energy measurement, the weak current signals from the detectors are read out by a regulated-cascode preamplifier and shaped by a CR-RC shaper. The peak value of the shaped signal in each channel is captured by an analog memory driven by a monostable circuit and digitized by an analog-to-digital converter (ADC). In the previous work, the digitization was performed by a discrete commercial 14-bit, 20-MSample/s ADC chip. To achieve a more compact size and to improve the conversion precision, this thesis proposes an integrated multi-channel time-based ADC that replaces the external ADC chip. The proposed ADC, which is realized with a Wilkinson-type architecture and a digital delay-locked loop (DLL), has several attractive features such as high resolution, low power dissipation and small die area.
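As a rough illustration of why a Wilkinson-type converter depends on fine time measurement, the sketch below converts a held voltage into a code by timing an ideal linear ramp. The 1.2 V to 3.2 V span, the 12-bit resolution and the rate of about 1 MS/s are figures quoted in this abstract. The ideal ramp, the assumption that the whole 1 µs conversion period is available for the ramp, and all function names are illustrative assumptions rather than a description of the fabricated circuit.

```python
# Idealized Wilkinson (ramp) conversion: the input is compared with a linear
# ramp, and the time until the ramp reaches the input is counted in time bins.
V_MIN = 1.2      # V, bottom of the analog range quoted in the abstract
V_MAX = 3.2      # V, top of the analog range quoted in the abstract
N_BITS = 12      # resolution quoted in the abstract
T_CONV = 1e-6    # s, ASSUMPTION: the ramp may use the full ~1 MS/s period


def time_bin(t_conv=T_CONV, n_bits=N_BITS):
    """Width of one time bin if 2**n_bits bins must fit into t_conv."""
    return t_conv / (1 << n_bits)


def voltage_lsb(v_min=V_MIN, v_max=V_MAX, n_bits=N_BITS):
    """Voltage step that corresponds to one output code."""
    return (v_max - v_min) / (1 << n_bits)


def wilkinson_code(v_in, v_min=V_MIN, v_max=V_MAX,
                   t_conv=T_CONV, n_bits=N_BITS):
    """Output code for v_in using an ideal ramp from v_min to v_max."""
    if not v_min <= v_in <= v_max:
        raise ValueError("input outside the ramp range")
    ramp_slope = (v_max - v_min) / t_conv             # V/s
    t_cross = (v_in - v_min) / ramp_slope             # s until ramp == v_in
    code = int(t_cross / time_bin(t_conv, n_bits))    # elapsed whole bins
    return min(code, (1 << n_bits) - 1)               # clamp at full scale


if __name__ == "__main__":
    print(f"time bin  : {time_bin() * 1e12:.1f} ps")
    print(f"LSB       : {voltage_lsb() * 1e3:.3f} mV")
    print(f"code(2.0V): {wilkinson_code(2.0)}")
```

Under these assumptions, one time bin is 1 µs / 4096 ≈ 244 ps and one LSB is 2 V / 4096 ≈ 0.49 mV, which suggests why sub-nanosecond interpolation of the kind a DLL provides is attractive for the counter.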

For the time measurement, this thesis proposes a multi-channel 625-ps time-to-digital converter (TDC) for coincidence events, realized with counter-based circuits and time interpolation using a low-jitter charge-pump DLL. In addition, PET with time-of-flight (TOF) capability has been shown to provide a better reconstructed image than conventional positron tomography. In the TOF-PET approach, the measured time-of-flight difference between the two 511-keV photons of each detected event gives an approximate position of the annihilation. The accuracy of this estimate is directly limited by the precision with which the arrival times of the two photons can be measured. The ASIC therefore needs a high-precision TDC to achieve the required time resolution with good stability. This thesis proposes a coarse-fine TDC based on a low-jitter DLL array for resolution enhancement. Precise multiphase clock generation using low-jitter DLL techniques is also discussed.

Three prototype chips were designed in AMS 0.35 µm CMOS technology. In the front-end analog signal processing chip, the dynamic range, the linearity and the power dissipation are optimized. An input dynamic range from a few fC to more than 100 pC is achieved. The analog output range of the front-end readout circuits is 1.2 V to 3.2 V. The shaping time is 280 ns and the power dissipation is below 15 mW. In the TDC prototype based on a DLL array, the RMS jitter and the peak-to-peak jitter of the DLL are reduced to 7 ps and 21 ps, respectively, and the bin size of the TDC is reduced to 71 ps with a 100 MHz reference clock. In the multi-channel time-based ADC chip, a maximum resolution of 12 bits, a sampling rate of about 1 MS/s and a power dissipation of 3 mW + 0.2 mW/channel are achieved.

The main contributions of this work are listed as follows.

• Research on advances in front-end readout and signal processing techniques for PET imaging. The most widely used techniques are surveyed and summarized in this work, and future research directions for front-end readout ASICs are pointed out.

• Proposals for a monolithic architecture of the front-end readout ASIC with a high-precision TDC and a time-based ADC. This idea may open a new research direction for small-animal PET imaging systems with axially oriented crystals coupled to photodetectors at both ends.

• Research on precise multiphase timing generation using low-jitter delay-locked loop (DLL) techniques. Both a single low-jitter charge-pump DLL and a DLL array are realized for time interpolation. In addition, a digital DLL with linear delay elements is constructed to address the challenges of technology scaling.

• Design and test of a multi-channel high-precision coarse-fine TDC using counter-based circuits and DLL techniques. TDCs based on both a single DLL and a DLL array are studied. The results indicate that this TDC architecture is well suited to PET imaging applications. Moreover, resolution enhancement using a DLL array is reported for the first time in this field.

• Design and test of a multi-channel time-based ADC using the architecture of the Wilkinson ramp ADC and digital TDC techniques. This ADC, which can digitize the voltage signals from a large number of readout channels, makes possible a one-chip solution with all-digital outputs for the proposed PET imaging system.

In future developments, the performance of a monolithic front-end readout ASIC that includes the front-end analog processing circuits, the multi-channel TDC circuits and the proposed time-based ADC circuits will be evaluated. Moreover, since CMOS technology scaling has moved the process node into the nanometer range, the design challenges due to technology scaling will be taken into account.

Keywords: Positron Emission Tomography (PET), Front-End Electronics, ASIC, Charge-Sensitive Amplifier (CSA), Time-of-Flight, Analog-to-Digital Converter (ADC), Time-to-Digital Converter (TDC)


Chapter 1

Introduction

1.1 Research background

The beginning of the 20th century was strongly influenced by the Nobel Prize in Physics awarded to Henri Becquerel, Pierre Curie and Marie Curie for their work on radioactivity. This work was subsequently applied to the diagnosis, first, of cancerous diseases and then of heart and neurodegenerative diseases. Radioactivity has enabled the advent of a new noninvasive medical imaging technique for diagnosis. Unlike radiographic imaging, which can only visualize the anatomy, this technique reports the activity of biological mechanisms at the molecular level [1]. It is called positron emission tomography (PET) and is an important modality in the molecular imaging family.

Over the past 50 years, the techniques and approaches used in PET scanners have changed radically, although the scanners still look alike from the outside. Many researchers proposed new concepts, made important contributions to improving performance and worked to bring the scanners into practical use. Generally, the major achievements of PET instruments depend on improvements in spatial resolution, sensitivity and count-rate performance. These performances are mainly determined by the front-end electronics of PET imaging systems.

This thesis is mainly devoted to the design of PET front-end electronics. In particular, it focuses on the design techniques of a monolithic front-end readout application-specific integrated circuit (ASIC) for a small-animal PET imaging system.

1.1.1 PET imaging

Since 1950, the observation of cerebral lesions in humans by gamma detection has called for a new imaging technique capable of identifying sparsely radiolabeled regions. From 1970, the development of the first computer-assisted PET answered this demand thanks to the coincidence detection of the annihilation γ photons of a positron emitter. In the 1980s, PET research focused on the brain and the heart, so PET scanners were designed to cover these organs. Later, because whole-body scans offer many benefits for clinical diagnosis, whole-body clinical PET scanners were developed. In the 1990s, PET instruments still used only two-dimensional data acquisition. These instruments had low sensitivity and could not meet the requirements of clinical diagnosis. With the development of novel scintillation crystals such as LSO and LYSO, which offer better timing resolution, the performance of three-dimensional PET scanning systems was clearly enhanced. Meanwhile, the algorithms for 3D image reconstruction were improved as well.

Figure 1.1: Principle of clinical PET imaging systems [2]. (a) Generation of the positron; (b) generation of γ-rays from the annihilation of a positron and an electron; (c) structure of the tracer 18FDG; (d) detector ring and scanning; (e) computerized tomography; (f) image reconstruction.

The principle, detector and electronics of clinical PET imaging are shown in Figure 1.1. PET images are generated with radioactive tracers that are ingested, inhaled or injected into the body of the patient. The tracers employed for PET imaging have two important properties. First, they bind to cancerous cells, so that a high concentration accumulates there. For example, fluorine-18 (18F) is a radioactive label that can be attached to glucose molecules, so it can be used to measure glucose metabolism. Since malignant tumors have an increased glucose metabolism, more fluorine-18 accumulates in the cancerous cells. Second, the tracer emits a positron when it decays. When the positron reaches thermal energies, it annihilates with one of the electrons in the patient's body, producing a pair of 511 keV photons traveling in opposite directions. The two photons have sufficient energy to pass through the patient undeflected and reach a detector ring, where they are detected by detector modules [2, 3].

The detector ring consists of scintillators coupled to a photodetector array. The scintillator absorbs the energy of the incident photons and converts it into small flashes of light. The light is collected by the photodetector and converted into an electrical signal. This weak signal is then amplified, digitized and read out for subsequent analysis.

With this physical phenomenon, a line of response (LOR) is defined by two 511 keV photons emitted in opposite directions and detected in coincidence. By recording a large number of coincident events, many LORs can be constructed. An image can then be reconstructed by applying a tomographic algorithm to the data from the whole set of LORs.
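The coincidence step described above can be sketched in a few lines. The example below is a minimal illustration, assuming that each single event is reduced to a time stamp and a detector channel and that two events form a coincidence when their time stamps differ by no more than a fixed window. The function name, the event format and the window value are hypothetical and are not taken from the system described in this thesis.

```python
from typing import List, Tuple

Single = Tuple[float, int]   # (time stamp in ns, detector channel id)


def find_coincidences(singles: List[Single],
                      window_ns: float) -> List[Tuple[int, int]]:
    """Pair time-adjacent single events that fall within window_ns.

    Returns (channel_a, channel_b) pairs; each pair defines one LOR.
    Simplifications: an event is used in at most one pair, and two events
    on the same channel are never paired.
    """
    events = sorted(singles)            # order by time stamp
    lors = []
    i = 0
    while i + 1 < len(events):
        (t0, ch0), (t1, ch1) = events[i], events[i + 1]
        if t1 - t0 <= window_ns and ch0 != ch1:
            lors.append((ch0, ch1))
            i += 2                      # both events are consumed
        else:
            i += 1                      # first event has no partner
    return lors


if __name__ == "__main__":
    singles = [(100.0, 3), (100.4, 17), (250.0, 5),
               (400.0, 8), (400.2, 21), (900.0, 8)]
    print(find_coincidences(singles, window_ns=1.0))
```

The list of channel pairs is the raw material for reconstruction: each pair identifies the two detector elements that bound one LOR.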

In the mid-1990s, advances in the detection of γ radiation and the growing development of molecules labeled with a positron emitter led to the design of PET systems dedicated to imaging small animals, mainly the mouse. Its genome, which is 95 % identical to that of humans, allows the study of human diseases and their associated treatments [1].

These systems for small animals require better performance, in particular spatial resolution and detection efficiency, than their counterparts intended for humans. This improvement is necessary because the mouse is about 30 times smaller than a human and its brain volume is about 2500 times smaller.

The most obvious challenge for PET imaging technology derives from the large difference in physical size between the human subjects for which clinical PET systems were developed and the laboratory rat or mouse. The current generation of small-animal PET imaging systems achieves a spatial resolution in the range of 1 to 2 mm full width at half maximum (FWHM), with point-source detection sensitivities in the range of 1 % to 15 %, providing image quality in small animals that is beginning to approach the qualitative and quantitative capabilities of clinical PET imaging [4, 5]. To meet these requirements, the detectors, the front-end electronics and the overall system design must all be taken into account.

1.1.2 Proposed small animal PET

Since 2004, a project dedicated to a multi-modality imaging platform including Computerized Tomography (CT), Single Photon Emission Computed Tomography (SPECT) and PET for biomedical research on small animals has been carried out at the Institut Pluridisciplinaire Hubert Curien (IPHC)¹. The diagram of the future instrument is illustrated in Figure 1.2.

¹ IPHC is a multidisciplinary institute founded jointly by the University of Strasbourg and the National Center for Scientific Research (CNRS/IN2P3), France.


Figure 1.2: Diagram of an imaging platform dedicated to biomedical research on small animals. In this imaging platform, a microPET, a microSPECT and a microCT are combined to create both profile and molecular-level images of small animals.

In this project, the three imaging systems dedicated to small animals are called microCT, microSPECT and microPET, respectively. The microCT and microSPECT systems have been developed and tested, whereas the microPET system is still under development.

The detector module for PET imaging includes two parts: scintillation crystals and photodetectors. The scintillator absorbs high-energy radiation such as X-rays or γ-rays and converts a fraction of the absorbed energy into visible or ultraviolet photons. The photodetector converts these optical photons into electric charge. In most PET systems dedicated to small-animal imaging, the geometry of the detector module is based on a block structure in which the crystal elements are coupled to a reduced number of photomultiplier tubes, as shown in Figure 1.3 (a). The γ-rays are absorbed through the front face, while the PMTs are coupled to the back face. The detection efficiency is limited by this radial geometry, in which the detection modules point perpendicularly toward the subject. This arrangement on the one hand degrades the spatial resolution away from the center of the field of view and on the other hand restricts the detection efficiency through the size of the detection modules. While the lower limit of spatial resolution in PET imaging appears to have been reached, increasing the detection efficiency remains a constant challenge. A higher efficiency yields images in a shorter time and with a lower injected activity. This translates into more subjects studied per day and a smaller amount of radioactive material used.

Figure 1.3: The arrangement of the scintillator crystals and photodetectors. (a) Scintillator crystals with individually coupled photodetectors [6]; (b) scintillating crystals with axially oriented photodetectors [1, 7].

In our configuration, shown in Figure 1.3 (b), the crystals are oriented in the axial direction and read out on both sides by individual photodetector channels. This configuration was first introduced by the pioneers of PET in the 1970s. However, few PET systems have used it because of the complicated electronics it requires. Its key feature is that the spatial resolution and the detection efficiency become independent of each other, so the approach can be used for high-resolution PET imaging. With the proposed detector module, both energy and timing signals must be read out by the front-end electronics. This requires not only several types of electronics but also high performance. Moreover, the required front-end electronics must be fully customized.
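The text does not state here how the axial position is recovered from the two ends of a crystal. One estimator that is commonly used with dual-ended readout, assumed here purely for illustration, is the asymmetry of the two collected charges with a linear calibration. In the sketch below, the 25 mm crystal length comes from the prototype description in this chapter, while the linear model, the calibration constant `asym_at_end` and the function names are hypothetical.

```python
def axial_position_mm(q_left, q_right,
                      crystal_length_mm=25.0, asym_at_end=1.0):
    """Estimate the interaction position along the crystal axis.

    ASSUMPTION: the asymmetry (q_right - q_left) / (q_right + q_left)
    varies linearly from -asym_at_end at the left end to +asym_at_end at
    the right end. asym_at_end is a hypothetical calibration constant.
    Returns the position in mm, measured from the crystal center.
    """
    total = q_left + q_right
    if total <= 0:
        raise ValueError("no charge collected")
    asymmetry = (q_right - q_left) / total
    return 0.5 * crystal_length_mm * asymmetry / asym_at_end


def energy_proxy(q_left, q_right):
    """Sum of both ends, used as a simple energy estimate (assumption)."""
    return q_left + q_right


if __name__ == "__main__":
    for q_l, q_r in [(10.0, 10.0), (15.0, 5.0), (0.5, 19.5)]:
        z = axial_position_mm(q_l, q_r)
        print(f"q_left={q_l:5.1f}  q_right={q_r:5.1f}  z={z:+6.2f} mm")
```

In such a scheme an interaction near one end leaves only a small charge at the far end, which is one way to see why each readout channel has to cope with a wide range of input charge.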

Figure 1.4: The first prototype of the detector module for the proposed microPET for biomedical research on small animals [1]. (a) Detector modules; (b) the detector module with front-end electronics. Labels in the figure: MCP PMT, LYSO crystal, detector module, DAQ board.

The first PET prototype, shown in Figure 1.4 (a), consists of four modules arranged around the animal. Each module consists of a matrix of 32 × 24 LYSO(Ce) crystals of 1.5 mm × 1.5 mm × 25 mm, each read out at both ends by a Photonis Corp. MCP PMT. Consequently, the PET imaging system is composed of 3072 crystals and 6144 electronic channels.
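The crystal and channel counts follow directly from the module geometry. The short check below reproduces the arithmetic; the variable names are illustrative only.

```python
MODULES = 4                      # modules arranged around the animal
CRYSTALS_PER_MODULE = 32 * 24    # crystal matrix of one module
READOUT_ENDS_PER_CRYSTAL = 2     # each crystal is read out at both ends

crystals = MODULES * CRYSTALS_PER_MODULE
channels = crystals * READOUT_ENDS_PER_CRYSTAL

if __name__ == "__main__":
    print(f"crystals per module : {CRYSTALS_PER_MODULE}")
    print(f"crystals in total   : {crystals}")
    print(f"electronic channels : {channels}")
```

Each module face carries 32 × 24 = 768 readout channels, which gives 4 × 768 = 3072 crystals and 2 × 3072 = 6144 electronic channels in total.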

Figure 1.5: Images of the PLANACON MCP PMT and its output waveform. (a) Photocathode face of the PLANACON MCP PMT; (b) anode face of the PLANACON MCP PMT; (c) typical output waveform.

Images of the PLANACON MCP PMT used here are shown in Figure 1.5 (a) and (b). The MCP PMT contains an array of photodetector pixels, all of the same size. The typical waveform of the output signal of one MCP channel is shown in Figure 1.5 (c). The amplitude of the output signal is about 5 mV across a 50 Ω resistor, and the pulse width is about 2.5 ns. The signals generated by the MCP PMT have the following features.

• The signals of many channels must be read out in parallel and simultaneously. For the selected MCP, the number of channels is 768 or 1024.

• The output pulses are weak, high-frequency current signals. The pulse width is a few nanoseconds. Across a 50 Ω resistor, the amplitude of the measured waveform is a few millivolts.

• The signal is noisy.

Furthermore, the detector arrangement shown in Figure 1.3 (b) imposes some additional requirements.

• A large dynamic range is needed for the charge measurement, from a few fC to more than 100 pC. This range corresponds to the variation of the input signal with the position of the scintillation along the axial extent of the crystal. To resolve the input signals, an ADC with a resolution of more than 10 bits is required.

• Both the energy and the timing information should be obtained in digital form for easier storage, transfer and processing.
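The link between the charge range and the ADC resolution can be made concrete with a small calculation. The sketch below assumes a linear ADC whose full scale is set to 100 pC (the text only says "more than 100 pC") and asks how much charge one code represents for a given number of bits. The function names and the example step sizes are illustrative assumptions.

```python
import math


def charge_lsb_fc(full_scale_pc, n_bits):
    """Charge per ADC code, in fC, for a linear ADC spanning 0..full_scale_pc."""
    return full_scale_pc * 1000.0 / (1 << n_bits)


def bits_for_step(full_scale_pc, step_fc):
    """Smallest number of bits whose LSB is no larger than step_fc."""
    return math.ceil(math.log2(full_scale_pc * 1000.0 / step_fc))


if __name__ == "__main__":
    for bits in (10, 12, 14):
        print(f"{bits:2d} bits -> {charge_lsb_fc(100.0, bits):8.3f} fC per code")
    print("bits for a 25 fC step:", bits_for_step(100.0, 25.0))
```

With these assumptions a 10-bit converter resolves about 98 fC per code and a 12-bit converter about 24 fC per code, which illustrates why the lower end of the charge range pushes the resolution requirement beyond 10 bits.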


1.1.3 Front-end electronics

To achieve a compact size, high-performance data acquisition (DAQ) and digital signal processing, an ASIC is required that includes multi-channel front-end analog signal processing circuits, time discriminators and digitizers. The diagram of the front-end readout circuit for each MCP detector channel is shown in Figure 1.6. A preamplifier is employed to reduce the input impedance, so that the effect of crosstalk on linearity is reduced. The output of the preamplifier then flows into two branches. The upper branch measures the energy of the particles, which at this point has been converted into a voltage signal. The peak value of the shaped pulse is sampled and held, and the analog voltage is then digitized by an ADC. The lower branch performs the time measurement: the time stamps of the particles are produced by a TDC. To achieve good linearity and high yield, a 64-channel ASIC is defined.
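The peak that the sample-and-hold stage has to capture in the energy branch can be illustrated with the textbook CR-RC response. The sketch below assumes an ideal CR-RC shaper with equal time constants driven by a unit step, for which the normalized output is (t/τ)·exp(−t/τ). It takes τ = 280 ns from the shaping time quoted in the abstract; treating that figure as the time constant τ is an assumption, as are the function names.

```python
import math

TAU_NS = 280.0   # ASSUMPTION: quoted shaping time used as the time constant


def crrc_step_response(t_ns, tau_ns=TAU_NS):
    """Normalized output of an ideal CR-RC shaper for a unit step input."""
    if t_ns < 0:
        return 0.0
    x = t_ns / tau_ns
    return x * math.exp(-x)


def find_peak(tau_ns=TAU_NS, step_ns=1.0, span_taus=10):
    """Scan the response numerically; return (peak_time_ns, peak_value)."""
    n = int(span_taus * tau_ns / step_ns)
    samples = [(crrc_step_response(i * step_ns, tau_ns), i * step_ns)
               for i in range(n + 1)]
    value, t_peak = max(samples)
    return t_peak, value


if __name__ == "__main__":
    t_peak, value = find_peak()
    print(f"peak at {t_peak:.0f} ns, normalized amplitude {value:.4f}")
```

For this ideal response the peak occurs at t = τ with a normalized amplitude of 1/e, so the hold instant must be placed about one time constant after the pulse arrives.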

Figure 1.6: The diagram of the front-end readout circuit for each detector channel.

Since 2005, three prototypes have been successfully developed in a cooperation between the microelectronics group and the biomedical imaging group of IPHC. The first is a ten-channel front-end analog signal processing circuit named IMOTEPA [8]. A regulated cascode (RGC) preamplifier was proposed for IMOTEPA in order to reduce the crosstalk between adjacent channels. An integrator together with a CR-RC shaper processes the energy information, while the time information is obtained by a current-mode discriminator at the output of the RGC preamplifier. The peak value of the shaper output is sampled at an instant generated by a monostable circuit. The second is a 16-channel TDC named IMOTEPD, which is based on two 10-bit counters and a 32-phase delay-locked loop (DLL) [9]. IMOTEPD measures time differences with a bin size of 625 ps. In 2009, a monolithic ASIC [10] combining IMOTEPA and IMOTEPD, named IMOTEPAD, was taped out. IMOTEPAD has also been extended to 64 channels for compatibility with the Photonis Corp. MCP PMT.
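The figures quoted for IMOTEPD fit a simple coarse-fine scheme. If a 32-phase DLL divides one clock period into equal bins of 625 ps, the clock period is 32 × 625 ps = 20 ns, that is, a 50 MHz clock; this value is inferred here and is not stated in the text. The sketch below assumes that a counter supplies the coarse count and the DLL phase index supplies the fine part. How the two 10-bit counters are actually combined on the chip is not modeled, and all names are illustrative.

```python
N_PHASES = 32                         # DLL phases per clock period
BIN_PS = 625                          # bin size quoted for IMOTEPD
CLOCK_PERIOD_PS = N_PHASES * BIN_PS   # 20 000 ps, inferred clock period
COUNTER_BITS = 10                     # width of one coarse counter


def timestamp_ps(coarse_count, fine_phase):
    """Combine a coarse counter value and a DLL phase index into a time."""
    if not 0 <= coarse_count < (1 << COUNTER_BITS):
        raise ValueError("coarse count out of range")
    if not 0 <= fine_phase < N_PHASES:
        raise ValueError("fine phase out of range")
    return coarse_count * CLOCK_PERIOD_PS + fine_phase * BIN_PS


def interval_ps(start, stop):
    """Time from start to stop, each given as (coarse, fine).

    A single wrap-around of the coarse counter is handled by the modulo.
    """
    full_range = (1 << COUNTER_BITS) * CLOCK_PERIOD_PS
    return (timestamp_ps(*stop) - timestamp_ps(*start)) % full_range


if __name__ == "__main__":
    print("clock period:", CLOCK_PERIOD_PS, "ps")
    print("interval    :", interval_ps((3, 30), (4, 2)), "ps")
    print("with wrap   :", interval_ps((1023, 31), (0, 0)), "ps")
```

With a 10-bit coarse counter and a 20 ns period, the unambiguous range of one time stamp in this model is 1024 × 20 ns = 20.48 µs.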

The ASIC and other components are mounted on a DAQ board. Figure 1.7 shows the schematic of the data acquisition board, which uses four ASICs to process 256 channels. Both the energy and the time stamps are processed by the proposed ASIC. The energy outputs of each ASIC are converted by an ADC into 14-bit values, which are written sequentially into a first-in-first-out (FIFO) memory. Simultaneously, the associated time-stamp values are put in the same FIFO queue. The data are sorted so that only relevant results are sent to a PC. Gigabit Ethernet is used to transfer the data. An FPGA serves as the ASIC readout sequencer, manages the Ethernet communication and configures the front-end ASIC settings such as threshold voltages, bias voltages, gains and holding delays. For each MCP detector, six DAQ boards are required to process the signals from 768 channels, as shown in Figure 1.4 (b).

Figure 1.7: The schematic of the data acquisition board which can process 256-channel signals

for each MCP detector [11].

1.2 Advances in PET front-end chips

In the late 1980s, ASIC techniques found their way into front-end electronics for PET imaging. The design of front-end circuits is determined by the specific application, the detector module used and the overall system requirements. The development of front-end chips is therefore complex, full-custom work for ASIC designers. Previous work on front-end readout chips is reviewed below.

Early techniques

A VLSI architecture for PET front-ends was introduced in 1988 [12]. However, no follow-up reports on the implementation of the proposed architecture were published. The reason was that first-generation PET was a two-dimensional imaging instrument that did not require complicated front-end electronics. Five years later, an ASIC implementation of digital front-end electronics for a high-resolution PET scanner was proposed by D. Newport et al. [13]. The proposed ASIC consisted of 37,000 gates of digital circuitry and was realized in a 1-µm CMOS sea-of-gates technology. The front-end was still organized around discrete on-board devices, such as the preamplifier, together with this digital ASIC.

ASICs for PMT

Front-end electronics for a variable-field PET camera using the PMT-quadrant-sharing detector array design were introduced by W. Wai-Hoi in 1997 [14]. This work established the basic architecture of front-end electronic systems for PMT-based PET. PET with PMT-quadrant-sharing detectors was a main branch of the early developments. Five years later, B. Swann et al. presented a custom mixed-signal CMOS integrated circuit for this family of PET scanners [15]. The chip integrated front-end readout circuits together with time-measurement circuits for LSO/PMT detectors in a 0.5-µm n-well CMOS process. The time resolution was 312.5 ps, which was relatively advanced at that time. However, the energy digitizer was not integrated with the other blocks. The characteristics of this front-end ASIC were then presented in [16]. The electronics were also compatible with BGO-based detectors.

ASICs for APD

In 1999, a novel APD-based detector module for multi-modality PET/SPECT/CT scanners was proposed [17], and ASICs dedicated to APD detectors began to be developed. Meanwhile, the concept of small-animal PET was proposed, and researchers started to study front-end electronics for such instruments. The 2001 and 2002 proceedings of the IEEE international symposium on nuclear science and medical imaging contained a large number of papers on front-end signal processing ASICs, which indicates that front-end ASIC design had become an active research direction. In [18], M. L. Woodring et al. proposed the development of an ASIC for APD-based small-animal PET. Front-end electronics and data acquisition for small-animal PET were also reported in [19, 20].

In 2004, since position-sensitive APDs performed better than PMTs, research on front-end readout circuits dedicated to APDs for PET became more and more important. However, the signals generated by APDs are weaker than those from PMTs, so the conventional techniques of PMT readout ASICs cannot be applied directly to APDs. Instead, novel electronic architectures and design techniques, in particular low-noise front-end readout circuits, had to be developed. The contributions [21, 22, 23, 24] mainly presented low-noise front-end readout circuits and signal processing techniques for APD-based PET. They established an important basis for subsequent research on PET imaging systems.


ASICs for TOF PET

In conventional PET, the timing information is used only to determine whether two detected photons are in "time coincidence", that is, whether they originate from the same positron annihilation event. It cannot determine which voxel along the line is the source of the two photons; therefore, all voxels along the line are given the same emission probability. The timing data cannot help the reconstruction algorithm any further. TOF PET, in contrast, uses the time-of-flight difference to better locate the annihilation position of the emitted positron [25, 26].
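The link between timing precision and localization along the LOR is the standard relation Δx = c·Δt/2, where Δt is the difference between the two arrival times and Δx is the offset of the annihilation point from the midpoint of the LOR. The sketch below evaluates this relation for the two TDC bin sizes quoted in the abstract (625 ps and 71 ps). The function name is illustrative, and the bin size is used here as a simple stand-in for the timing uncertainty, which in a real system also depends on the detector and the discriminator.

```python
C_MM_PER_PS = 0.299792458   # speed of light in mm per picosecond


def tof_offset_mm(delta_t_ps):
    """Offset from the LOR midpoint for an arrival-time difference delta_t_ps.

    The offset points toward the detector that fired first.
    """
    return 0.5 * C_MM_PER_PS * delta_t_ps


if __name__ == "__main__":
    for bin_ps in (625.0, 71.0):
        print(f"{bin_ps:6.1f} ps  ->  {tof_offset_mm(bin_ps):6.2f} mm")
```

A time step of 625 ps corresponds to about 94 mm along the LOR, whereas 71 ps corresponds to about 11 mm, which shows why TOF operation calls for a much finer TDC than plain coincidence detection.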

PET with time-of-flight (TOF) capability needs a time-to-digital converter (TDC) to measure the tiny time interval between the two photons absorbed by the crystals. Using high-precision TDCs calls for novel front-end ASIC architectures because of the high-resolution discrimination required and problems such as "time walk".

A multi-channel readout ASIC for TOF-PET was introduced in [27]. An intrinsic channel-to-channel time resolution of 105 ps (FWHM) was observed with test pulses. In a PET setup with two LYSO crystals equipped with photomultipliers, the coincidence time resolution was 330 ps (FWHM). An energy resolution of 13 % was obtained for 511 keV signals from a Na-22 source using the on-chip charge integrator.

ASICs for PET with DOI

Measuring the depth of interaction (DOI) can provide a more precise location of the annihilation. A 64-channel mixed-signal front-end integrated circuit (IC) was presented in [28] for reading out a photodiode (PD) array coupled to LSO scintillator crystals in a PET imaging application. Each channel consisted of a low-noise charge-sensitive preamplifier (CSA), a CR-RC pulse shaper and a winner-take-all (WTA) multiplexer that selects the channel with the largest input signal. This analog multiplexer used the same number of transistors as a digital multiplexer but did not need a digital decoder. J. F. Pratte et al. presented a fast shaping amplifier for PET/CT APD detectors with depth of interaction in [29]. The circuits, fabricated in a 0.35-µm technology, achieved a time resolution of 1.49 ns.

Innovative front-end electronics and digital signal processing

With the development of small-animal PET, spatial and timing resolutions have become finer and finer. For example, the spatial resolution of MicroPET-II [30] has been reduced to 1 mm³, and the time resolution is down to the subnanosecond range. ASIC designers therefore face more and more challenges, and novel front-end electronic architectures and signal processing techniques are required.

J. D. Martinez et al. published their work on a high-speed data acquisition and digital signal processing system for PET imaging techniques applied to mammography [31]. It was the first proposal to use a digital signal processor (DSP) for front-end data acquisition and signal processing. This idea initiated a new research direction for PET. Subsequent work on this topic can be found in [32, 33, 34].


The design of these novel electronics follows two directions. The first is the pipeline architecture using a free-running ADC and a digital processing algorithm. The concept is based on dead-time-free pipelined processing of the photosensor signals. After shaping and sampling by a free-running ADC, the pulses are digitally filtered to extract time and energy. The data are processed and selected inline before storage. The idea was also proposed by P. Guerra [33]. Moreover, a new embedded digital front-end for high-resolution PET scanners was introduced. This method was proposed again by J. F. Genat, University of Chicago [35], who compared it with other strategies and concluded that it can reach a time resolution of several picoseconds. The second direction is the one-chip solution of front-end readout circuits with integrated TDC and ADC, so that the output signals of the PET front-end are digital. These digital data can be read out easily and efficiently. Moreover, the acquired data can be processed by both FPGAs and imaging-specific DSPs. Such products have been launched by Texas Instruments [36].
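The first direction, digital extraction of time and energy from free-running ADC samples, can be sketched with two very simple estimators. The example below is a minimal stand-in and not the filtering used in [33] or [35]: the energy is taken as the sum of baseline-subtracted samples, and the arrival time is taken as the linearly interpolated instant at which the pulse first crosses a threshold. The function name, the sample values and the parameters are hypothetical.

```python
def extract_energy_and_time(samples, sample_period_ns, baseline, threshold):
    """Return (energy, t_cross_ns) from a list of ADC samples.

    energy     : sum of baseline-subtracted samples (digital integration).
    t_cross_ns : time of the first upward crossing of `threshold` above the
                 baseline, linearly interpolated between two samples, or
                 None if the pulse never crosses the threshold.
    """
    pulse = [s - baseline for s in samples]
    energy = sum(pulse)
    for i in range(1, len(pulse)):
        lo, hi = pulse[i - 1], pulse[i]
        if lo < threshold <= hi:
            frac = (threshold - lo) / (hi - lo)   # position between samples
            return energy, (i - 1 + frac) * sample_period_ns
    return energy, None


if __name__ == "__main__":
    samples = [10, 10, 10, 30, 70, 50, 30, 15, 10]
    energy, t_cross = extract_energy_and_time(
        samples, sample_period_ns=5.0, baseline=10, threshold=10)
    print("energy (ADC counts):", energy)
    print("arrival time (ns)  :", t_cross)
```

Interpolating between samples is what lets such a scheme report times finer than the sampling period; practical systems replace both estimators with filters matched to the pulse shape.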

In addition, Xie Qingguo et al. proposed a new pulse processing method based on multi-threshold voltage sampling and time measurement [37]. The method relies on modeling the front-end signal with standard expressions. By sampling several points of each front-end signal, the waveform can be reconstructed, and the peak value and timing information can then be calculated by offline software.

1.3 Proposed work

1.3.1 Motivation

Most front-end electronics are based on a signal acquisition scheme named "charge integration". The front-end readout circuits need to include several different blocks such as a charge-sensitive amplifier (CSA), a pulse shaper, a discriminator, a sample-and-hold circuit, and an analog-to-digital interface (ADC and TDC). The diagram of multi-channel front-end readout circuits for PET imaging is shown in Figure 1.8.

The integration of these blocks is complicated work which involves front-end readout strategies, signal processing methods, very-large-scale integration (VLSI), analog and mixed-signal IC design and test, and fabrication technologies. The design and development of this ASIC must overcome several major challenges.

Low-noise analog front-ends

Generally, a larger signal-to-noise ratio (SNR) of the front-end electronics means better energy and spatial resolution of PET imaging systems. Low-noise analog front-ends are therefore necessary. However, the front-end ASIC should include the digital control part and even mixed-signal data converters to digitize the front-end analog signals. Since CMOS technology uses the same substrate for both NMOS and PMOS transistors, the performance of the relatively quiet analog front-ends will be affected by the substrate noise from the digital part and the data converters. The same happens on the power supply wires, where noise comes not only from the on-chip power supplies but also from the supplies outside the chip. Moreover, the circuits consisting of MOS transistors and passive components such as resistors, inductors and capacitors also produce noise.

Figure 1.8: The diagram of front-end readout circuits for PET imaging. [Diagram labels: detectors; front-end readout circuits for channels #1 to #N with outputs Vin<1> to Vin<N> and Tin<1> to Tin<N>; analog-to-digital interface with a high-resolution ADC (output E) and a high-precision TDC (output T); monolithic front-end ASIC.]

Multiple Channels

Since the Photonics PLANACON MCP contains many channels, a dedicated multi-channel ASIC is required. Two considerations should be taken into account. The first is the number of integrated channels. For 8 × 8 anodes, a 64-channel ASIC matches one detector; for 32 × 32 anodes, however, the number of channels rises to 1024. Designing a single 1024-channel chip is a big challenge. The number of channels will be limited by the nonlinearity and the yield of the fabricated chip. The second is the data conversion and read-out strategy. For a 64-channel ASIC, if the TDC word length is 16 bits, the total data amount to 128 bytes per sample. For a continuous-sampling system, this large amount of data must be stored and read out. Thus, the data quantity and the readout speed are related to the number of channels of the ASIC.
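The data volume quoted above follows from simple arithmetic. The short Python sketch below reproduces it and scales it to other channel counts. The function names and the 1 MHz sampling rate used in the demo are illustrative assumptions, not values taken from this work.

```python
def tdc_bytes_per_sample(n_channels: int, bits_per_channel: int) -> float:
    """Raw TDC payload in bytes produced by one sampling of all channels."""
    return n_channels * bits_per_channel / 8


def readout_rate_bytes_per_s(n_channels: int, bits_per_channel: int,
                             sample_rate_hz: float) -> float:
    """Sustained readout rate for a continuously sampling system.

    sample_rate_hz is a hypothetical parameter; the text does not give one.
    """
    return tdc_bytes_per_sample(n_channels, bits_per_channel) * sample_rate_hz


if __name__ == "__main__":
    # 64 channels x 16 bits, as in the text
    print(tdc_bytes_per_sample(64, 16), "bytes per sample")
    # 32 x 32 anodes read by a single (hypothetical) 1024-channel chip
    print(tdc_bytes_per_sample(1024, 16), "bytes per sample")
    # Assumed 1 MHz sampling rate, for illustration only
    print(readout_rate_bytes_per_s(64, 16, 1e6) / 1e6, "MB/s")
```

The linear growth with the channel count is the reason why the readout strategy has to be chosen together with the number of integrated channels.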

Large dynamic range

The detection of γ-rays for PET imaging requires a CSA with a large dynamic range and an associated ADC with high resolution. The tradeoff between power, speed and resolution should be taken into account to choose the proper architecture and techniques. Moreover, since the time information needs to be extracted from a high-frequency pulse, a high-speed, high-resolution comparator and a sub-nanosecond TDC should be designed. In addition, a periodic wide-range time window should be defined according to the frequency and the quantity of the collected data.

Low power dissipation

For PET imaging, the power dissipation of the front-end electronics is a crucial parameter. Since many electronic channels are required, the power per channel should be kept as low as possible. Otherwise, a large power dissipation will increase the temperature inside the detector ring. Both biological mechanisms and electronic performance will be affected by the temperature increase.

Technology scaling

The scaling of CMOS technologies driven by Moore's law is a double-edged sword for IC designers. The CMOS technology roadmap is shown in Table 1.1. Although digital circuits benefit from technology scaling through improvements in power dissipation, speed and cost, analog circuits have to face challenges such as reduced supply voltage, low intrinsic gain, increased 1/f noise and leakage current. This pushes analog designers to consider novel front-end circuits that can withstand the effects of scaling. It is known that the CMOS technologies used for front-end electronics lag behind the mainstream technologies available at the same time. This is mainly determined by the integration of special circuit types such as low-noise charge-sensitive amplifiers, analog pulse shapers, and mixed-signal data converters.

Table 1.1: CMOS technology roadmap

Parameter / year          1997   1999   2001   2003   2006   2009
Feature size (µm)         0.25   0.18   0.15   0.13   0.10   0.07
Supply (V)                2.5    1.8    1.6    1.5    1.2    0.9
Vth (V)                   0.5    0.47   0.44   0.42   0.4    0.37
Interconnect (km/chip)    0.82   1.5    2.2    2.8    5.1    10

Building on previous work, in particular the research carried out at IPHC, this study focuses on the design of a high-precision TDC and an integrated time-based ADC. Both will be built into a monolithic front-end readout chip for PET imaging systems based on a detector module in which the crystal element is read out by photodetectors at both sides.

PET with time-of-flight (TOF) capability can provide a better reconstructed image than conventional positron tomography. In the 1980s, TOF-PET systems were built with an achieved timing resolution of ∼500 ps [26]. At that time, the available electronics drastically limited the performance of TOF-PET. Nowadays, electronics operating in the GHz range is routine and application-specific integrated circuits (ASICs) are commonly used [38]. The ASIC needs to include a high-precision time-to-digital converter (TDC) for each detector element to reach the required time resolution (i.e., several hundred picoseconds) with good stability. In particular, for the new generation of small-animal PET using novel scintillator materials (such as LaBr3: 30 % Ce) and semiconductor-based scintillators, a detection timing resolution below 100 ps is achievable. Thus, the resolution of the TDC should be several tens of picoseconds or less. However, IMOTEPAD of IPHC contains a TDC with a bin size of 625 ps, which should be reduced to obtain a higher resolution. As a result, this study partly focuses on the design of a high-precision TDC with a bin size of less than 100 ps.
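To put the bin sizes above into perspective, the following Python sketch evaluates two textbook relations: the rms quantization error of an ideal TDC, bin/√12, and the position uncertainty along the line of response for a coincidence timing uncertainty Δt, Δx = c·Δt/2. It is only a rough illustration: it assumes an ideal, uniformly quantizing TDC and ignores jitter, nonlinearity and detector effects. The function names are hypothetical.

```python
import math

C_LIGHT_M_PER_S = 299_792_458.0  # speed of light in vacuum


def tdc_rms_quantization_ps(bin_ps: float) -> float:
    """RMS quantization error of an ideal TDC with uniform bins."""
    return bin_ps / math.sqrt(12.0)


def tof_position_uncertainty_mm(dt_ps: float) -> float:
    """Position uncertainty along the line of response, dx = c * dt / 2."""
    return C_LIGHT_M_PER_S * dt_ps * 1e-12 / 2.0 * 1e3


if __name__ == "__main__":
    for bin_ps in (625.0, 100.0):
        print(f"bin {bin_ps:6.1f} ps -> rms quantization error "
              f"{tdc_rms_quantization_ps(bin_ps):6.1f} ps")
    for dt_ps in (500.0, 100.0):
        print(f"timing {dt_ps:6.1f} ps -> position uncertainty "
              f"{tof_position_uncertainty_mm(dt_ps):5.1f} mm")
```

Under these assumptions, a 625 ps bin alone contributes a quantization error larger than the 100 ps detection timing resolution mentioned above, which is consistent with the choice of a sub-100 ps bin size as the design target.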

Moreover, although the time stamps from the IMOTEPAD are digitized by the integrated TDC, the energy information is still delivered as analog voltage signals, which are converted by an external ADC. The resolution of the ADC is up to 14 bits and the sampling rate is about 20 MSamples/s. Since 64 voltage signals have to be processed, several problems have been found.

Firstly, the precision of the voltage signals is affected by the multiplexer realized with MOS switches and by the sample-and-hold circuit in the external ADC. In IMOTEPAD, the 64 channel voltage signals are output serially to be compatible with the external ADC. A 64-to-1 multiplexer is required to select the voltage signals. Since the MOS switches in the multiplexer are not ideal, the amplitudes of the voltage signals are affected by charge injection and clock feedthrough in the MOS switches. In a multiplexer-based readout, this error can be reduced but cannot be eliminated. In addition, the precision of the output voltage signal is further reduced by the sample-and-hold operation in the external ADC. Besides the charge injection in the MOS switches, the voltage droop due to the leakage of the holding capacitor is another cause.

Secondly, the speed of the signal processing is limited by the IMOTEPAD-based architecture. Since the MOS switches have parasitics, the 64 voltage signals, which are charged or discharged during the selecting operation, require a settling time that is related to the dimensions of the MOS switches. This time can be reduced but cannot be eliminated. In addition, the large parasitics of the wires and nodes on the PCB also limit the speed of signal processing. These features are not suitable for PET systems with small dead-time windows.

To overcome the above problems, an integrated ADC is proposed in this work. The integration of the front-end analog signal processing circuits, the TDC and the ADC has further advantages.

• More compact size. Once the ADC is integrated into the front-end ASIC, the discrete ADCs and their associated circuits can be removed. Thus, fewer components are needed on the DAQ board, so the board becomes more compact. Furthermore, the cost decreases because the number of components in the whole system is reduced.

• Faster signal acquisition and processing. Compared to analog multiplexing, the parallel-to-serial conversion of the digital signals produced by an integrated ADC can operate at very high speed. In addition, the 64-to-1 multiplexer can be removed and no additional sample-and-hold circuits are required. Since smaller parasitics appear in the signal channels, higher-speed signal acquisition and processing can be achieved.

• Higher reliability. Since the energy and timing outputs are purely digital signals, the signal processing is more robust. The whole PET system will have higher reliability.

To sum up, it is important to design an integrated ADC so that a monolithic front-end ASIC can be achieved. Besides, the above prototypes were designed and fabricated in AMS 0.35 µm technology, which falls far behind state-of-the-art commercial deep-submicron CMOS technology. The migration of the existing circuits to a more advanced technology is urgent. Considering the challenges of analog and mixed-signal circuits in deep-submicron CMOS technologies, a time-based ADC is proposed in the second part of this thesis.

1.3.2 Main contributions of this work

• A survey of advances in front-end electronics dedicated to radiation photodetectors. The signal processing and associated electronics are discussed. The most commonly used blocks such as the preamplifier, the shaper, peak-detect-and-hold circuits, analog-to-digital converters, the time discriminator, and time-to-digital converters are introduced.

• A proposal for a monolithic architecture of the front-end readout ASIC with the integrated high-precision TDC and the time-based ADC. This idea introduces a new research direction for small-animal PET imaging systems with axially oriented crystals coupled to photodetectors at both sides. The monolithic front-end readout ASIC, which outputs only digital signals for both energy and timing information, will simplify data acquisition through digital signal processing.

• Design of multi-channel front-end analog signal processing circuits with an RGC preamplifier and a high-speed current comparator. The design and characteristics of the blocks such as the RGC preamplifier with a variable gain stage, the CR-RC shaper, the analog memory, the current comparator, the current-steering DAC and the monostable circuit are described.


• Design techniques of precise multiphase clock generation based on delay-locked loop (DLL) techniques. The principle and architectures of low-jitter charge-pump DLLs are discussed in detail. Novel circuit techniques to reduce the jitter of the circuits are proposed. In addition, a DLL array is constructed for precise multiphase clock generation. Moreover, based on the principle and the architecture of an analog DLL, a digital DLL using linear delay cells is proposed to overcome the challenges due to technology scaling. A digital filter algorithm is presented as well. Ideally, jitter-tolerant performance of the digital DLL can be obtained.

• Design techniques of a multi-channel high-resolution TDC using counter-based circuits and DLL techniques. A 16-channel 625-ps TDC using a single DLL and a 3-channel TDC using a DLL array are designed and fabricated in AMS 0.35 µm CMOS technology. The 625-ps TDC chip has been tested and successfully applied to a 64-channel front-end readout chip. Moreover, the prototype chip of the TDC using a DLL array is shown to achieve a bin size of 71 ps with a reference clock of 100 MHz.


• Design techniques of a multi-channel time-based ADC using ramp ADC and digital TDC techniques. An 8-channel ADC prototype is designed and fabricated in AMS 0.35 µm CMOS technology. The chip has been tested on a four-layer PCB test board. The proposed ADC achieves a typical resolution of 12 bits, a sampling rate of ∼1 MS/s, and a power dissipation of 3 mW + 0.2 mW/channel.
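The ADC figures quoted above can be related to the timing requirement of a ramp-based conversion with a back-of-the-envelope calculation. The Python sketch below assumes a plain single-slope conversion in which the full sampling period is available for the ramp, so that 2^N time steps must fit into one period. This is an assumption made for illustration; the actual architecture is described in Chapter 6 and may differ. The power model simply evaluates the quoted expression, and all function names are hypothetical.

```python
def ramp_time_step_s(n_bits: int, sample_rate_hz: float) -> float:
    """Time per LSB if 2**n_bits steps must fit in one sampling period.

    Assumes a single-slope ramp spanning the whole period (illustrative only).
    """
    return 1.0 / (sample_rate_hz * 2 ** n_bits)


def adc_power_mw(n_channels: int, shared_mw: float = 3.0,
                 per_channel_mw: float = 0.2) -> float:
    """Total power from the quoted model: shared part plus per-channel part."""
    return shared_mw + per_channel_mw * n_channels


if __name__ == "__main__":
    step = ramp_time_step_s(12, 1e6)
    print(f"time step for 12 bits at 1 MS/s: {step * 1e12:.1f} ps")
    print(f"power for 8 channels: {adc_power_mw(8):.1f} mW")
```

Under this assumption the required time step is a few hundred picoseconds, far shorter than the 10 ns period of a 100 MHz clock, which suggests why a ramp ADC of this resolution and speed is combined with TDC techniques rather than with a plain counter.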

1.4 Thesis overview

The arrangement of this thesis is as follows.

• Chapter 1 introduces the research background and the motivation of the PhD work.

The PET imaging system based on a detector consisting of LYSO scintillation

crystals read out at both sides by two MCPs is described. The proposed work and

the arrangement of the thesis are given.

• Chapter 2 presents a detailed survey of front-end electronics for radiation photodetectors. In this chapter, an overview of front-end electronic systems and modern front-end electronics is given. In addition, the front-end signal processing and associated electronics are presented. In the section on photo-electric conversion, a short introduction to the most commonly used scintillation crystals and photodetectors is given. In the section on signal acquisition, the voltage-sensitive amplifier, the current-sensitive amplifier and the charge-sensitive amplifier are discussed. In the section on pulse height analysis, the CR-RC shaper and the semi-Gaussian shaper are presented. In the section on peak detect and hold, three methods and associated circuits are discussed. In the section on analog-to-digital conversion, the concepts and a performance comparison of the most commonly used ADC architectures are given. In the section on time discrimination, high-speed high-resolution voltage comparators, high-speed current comparators and the constant fraction discriminator are discussed. In the section on time-to-digital conversion, the concepts, performance figures of merit and different architectures are described.


• Chapter 3 describes the design of the front-end analog signal processing circuits. According to the requirements of the proposed PET imaging system, a front-end readout chain based on an RGC preamplifier and a high-speed comparator is described. In the second section, the schematics of the preamplifier with a variable gain stage, the CR-RC shaper, the analog memory, the time-stamp circuits and the current-steering DAC are presented. The experimental results and analysis are given in the third part.

• Chapter 4 presents the design techniques of delay-locked loops used for precise multiphase clock generation. The state of the art of DLLs is introduced first. A behavioral model and a jitter model are also given. In addition, a charge-pump multiphase DLL with a Start Controller is designed. The blocks such as the current-starved delay cells, the bang-bang phase detector, the charge pump and the loop filter are discussed. Moreover, the experimental results and analysis are given. To achieve better jitter performance, an optimized charge-pump DLL is presented in the third part. The improved blocks such as the delay cell with the DC current source, the dynamic phase detector, the charge pump using feedback adjustment and the loop filter with testing circuits are described. The experimental results of this optimized DLL are also given.

• Chapter 5 presents the design and characteristics of a multi-channel coarse-fine TDC based on a counter and DLL techniques. The design considerations are given first. In addition, the principle and the architecture of a coarse-fine TDC are presented. In the second section, the design of a multi-channel 625-ps TDC based on dual binary counters and a single DLL is described. The experimental results of a 16-channel prototype chip are given. To obtain a smaller bin size, the design of a multi-channel coarse-fine TDC using a DLL array is given. Finally, the experimental results of a 3-channel prototype chip are discussed.

• Chapter 6 presents the design techniques of the integrated multi-channel time-based ADC. Several classic architectures of time-based ADCs are investigated in the first section. In addition, the design considerations of the analog-to-digital interface for the front-end electronics are given. In the second section, the architecture and circuit techniques of the proposed time-based ADC are presented. The schematics and simulation results of the ramp generator, the high-speed high-resolution comparator, the digital DLL and the Gray-code counter are described. In the third part, the error analysis of the proposed ADC is given. Finally, the experimental results of an 8-channel prototype chip are given.

• Chapter 7 concludes the thesis. In the first section, several conclusions of this thesis are given. The advantages and design challenges of the monolithic front-end ASIC including front-end analog signal processing circuits, the multi-channel coarse-fine TDC and the time-based ADC are discussed. In the second part, the perspectives of this study are described. In addition to the monolithic front-end ASIC, both the novel front-end electronics using pipeline sampling with a high-speed ADC and the circuits dedicated to multi-threshold-voltage sampling approaches are research trends for PET imaging applications.

Chapter 2

A survey on front-end electronics for photodetectors

Nuclear radiation detectors provide electronic signals (usually electric charges) when the radiation interacts with the material of the detector. The quantity of released charge is strictly related to the energy of the radiation; e.g., in the case of semiconductor detectors it is a linear function of the energy. Therefore, by measuring the quantity of released charge, one can determine the energy of the radiation. Since the direct measurement is difficult, the charges are usually converted into voltage or current signals. In this way, the energy measurement can be performed by measuring the amplitude of a voltage pulse. In addition, the arrival (hit) time is very useful for some imaging systems. The discrimination of time stamps is another technique dedicated to radiation detectors.

Front-end electronics is one of the most important parts of a PET imaging system, which is a typical semiconductor detector system based on radiation photodetectors such as CdTe, PMT, APD and SiPM detectors. The front-end electronic system can be divided into several signal processing steps such as photo-electric conversion, signal acquisition, pulse height analysis, analog-to-digital conversion, time discrimination and time-to-digital conversion. Each step has its specific circuits. This chapter gives an overview of the electronics for all steps. The objective is to assess the possibility of integrating these steps into a single chip.

2.1 Overview of front-end electronic systems

The front-end electronic system described here is dedicated solely to radiation photodetectors. The basic architectures [39] are summarized in Figure 2.1. In all cases, the preamplifier and the shaper are common elements. However, the choice of the subsequent signal processing circuits depends on the specific application. As shown in Case (a) of Figure 2.1, the shaper is followed by a threshold comparator which detects the presence of a signal carrying charge above a threshold value. Several imaging systems employing pixel or strip detectors are based on this simple architecture [40]. This case has the drawback that the spatial resolution is limited by the intrinsic geometry, such as the dimensions of the pixels. Thus, the charge information representing the energy level should be retained and used in a centroid evaluation algorithm. This can be realized by the circuits shown in Cases (b) and (c) of Figure 2.1. Case (b) uses the width of the shaped pulse to represent the energy level. The shaped pulse is compared with a threshold voltage or current by a high-speed comparator. The width of the generated square pulse is measured by a high-precision counter. The final output of the counter is approximately proportional to the energy level. Since all output bits are digital signals, this method is widely used. However, because the transformation is nonlinear, the result of this method is not precise. Case (c) uses an analog memory to store the peak value of the shaped pulse. The stored voltage is then buffered by a voltage buffer. This method can be applied to the centroid evaluation algorithm in analog form. Alternatively, the stored value can be converted into a digital representation by an ADC, as shown in Case (d). Thus, a digital centroid evaluation algorithm can be used to improve the spatial resolution.
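The digital centroid evaluation mentioned for Case (d) can be sketched in a few lines of Python. The version below is a generic energy-weighted mean (center of gravity) over the channels that fired. It is a minimal illustration under the assumption of one-dimensional channel positions and digitized energies, not the specific algorithm of [39] or [40]; the function name and example values are hypothetical.

```python
from typing import Sequence


def centroid(positions: Sequence[float], energies: Sequence[float]) -> float:
    """Energy-weighted mean position of one event (center of gravity).

    positions: channel (pixel or strip) coordinates, e.g. in mm.
    energies:  digitized energies of the same channels, e.g. ADC codes.
    """
    if len(positions) != len(energies):
        raise ValueError("positions and energies must have the same length")
    total = sum(energies)
    if total <= 0:
        raise ValueError("total energy must be positive")
    return sum(p * e for p, e in zip(positions, energies)) / total


if __name__ == "__main__":
    # Three neighbouring channels at 0, 1 and 2 mm (illustrative values)
    print(centroid([0.0, 1.0, 2.0], [100, 200, 100]))
    # Unequal sharing pulls the estimate toward the channel with more charge
    print(centroid([0.0, 1.0], [100, 300]))
```

Because the estimate interpolates between channels, its granularity is finer than the pixel pitch, which is why the charge information has to be retained rather than reduced to a one-bit threshold decision as in Case (a).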

Figure 2.1: Basic architectures of the analog section in front-end systems [39].

As shown in Figure 2.1, a timing channel derived from the preamplifier output is becoming more and more important with the growing demand for time-correlated imaging in some chemistry and biology studies. Besides, three-dimensional imagers based on time-domain reflectometry have been obtained from two-dimensional imaging systems by associating an accurate timing channel with each pixel [40].

The features of these front-end systems can be summarized as follows.

• Low noise: Since the detection of particles is very sensitive, low noise is very important in the front-end electronics. In particular, the preamplifier must have low noise in order to identify the output signals of the detectors.

• Low power: Front-end electronics process signals from high-energy particles, and their performance is easily affected by temperature. Thus, a low-power design is required to ensure that the correct energy and timing information can be detected.

• High speed: The output pulses of the detector contain high frequencies. The width of a pulse is about several nanoseconds, so the preamplifier should have a large bandwidth. Besides, the time window of the particles is short. To achieve high detection efficiency, high-speed front-end signal processing is required.

• Large dynamic range: The input charge depends on the energy of the particles and the configuration of the detector module. Its value ranges from several fC to several hundred pC. Thus, the dynamic range is very large.

• Low material: In many front-end electronic systems, thousands to millions of channels are required. To obtain a small volume and low cost, the amount of material should be kept low.

• Radiation hardness: The front-end electronics can be damaged by high-energy particles. In particular, circuits fabricated on silicon substrates usually suffer from radiation effects. Thus, radiation hardness should be taken into account.

• High reliability: Front-end electronic systems should be reliable so that the detected signals are trustworthy and available. Besides, high reliability is particularly required in medical imaging.
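To give a feeling for the dynamic range requirement in the list above, the Python sketch below expresses a charge range in decibels and in equivalent bits. The end points of 1 fC and 100 pC are illustrative assumptions chosen to match "several fC to several hundred pC" in order of magnitude; they are not measured values, and the function names are hypothetical.

```python
import math


def dynamic_range_db(q_min_c: float, q_max_c: float) -> float:
    """Dynamic range of the input charge in dB, 20 * log10(Qmax / Qmin)."""
    if q_min_c <= 0 or q_max_c < q_min_c:
        raise ValueError("require 0 < q_min_c <= q_max_c")
    return 20.0 * math.log10(q_max_c / q_min_c)


def equivalent_bits(q_min_c: float, q_max_c: float) -> float:
    """Number of bits needed to span Qmax with a step of Qmin."""
    if q_min_c <= 0 or q_max_c < q_min_c:
        raise ValueError("require 0 < q_min_c <= q_max_c")
    return math.log2(q_max_c / q_min_c)


if __name__ == "__main__":
    q_min, q_max = 1e-15, 100e-12  # assumed 1 fC and 100 pC
    print(f"{dynamic_range_db(q_min, q_max):.0f} dB")
    print(f"{equivalent_bits(q_min, q_max):.1f} bits")
```

With these assumed end points the range spans five decades, which is one reason why variable-gain stages are often considered when such a range must be covered with a converter of moderate resolution.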

Conventional front-end electronics are generally realized with discrete circuits on one or several PCBs. If the system needs many detector channels, the data acquisition (DAQ) boards occupy a large space. In addition, since front-end electronics process weak signals (current or voltage), on-board signals are usually affected by electromagnetic interference (EMI) or environmental noise. Besides, discrete circuits dissipate a lot of power, which is not desirable for biomedical imaging applications. As a result, the integration of the front-end circuits into a monolithic application-specific integrated circuit (ASIC) has become a trend.

In the past quarter century, low-noise, low-power multi-channel front-end ASICs have become available. Since many front-end channels are integrated into a single chip, the instruments become compact while high performance and low power consumption can be achieved. With the development of microelectronics and computer science, digital signal processing techniques have found applications in front-end electronics. The energy and the timing information of the particles need to be converted into digital representations for easier storage, transfer and processing. This relies on integrated data converters such as analog-to-digital converters (ADCs) and time-to-digital converters (TDCs). In particular, the concept of the mixed-signal system-on-a-chip (SoC) has been applied to front-end electronics. In recent years, embedding a dedicated medical-imaging digital signal processor (DSP) into the front-end ASIC has become a reality. Intelligent sensing SoCs have become a new research direction in front-end electronics.

The modern front-end electronics dedicated to a photodetector are shown in Figure 2.2. They basically consist of a preamplifier, a slow shaper, an analog memory, an ADC, a fast shaper, a time discriminator, a TDC and digital signal processing circuits. From the signal processing point of view, the signal flow from the photodetector to the digital signal processing part can be divided into several steps: photo-electric conversion, signal acquisition, pulse height analysis, peak-detect-and-hold, analog-to-digital conversion, time discrimination, time measurement and digitizing, and digital signal processing. In the following sections, the state of the art of the signal processing and the associated electronics is discussed in detail.

Figure 2.2: Basic architecture and signal flow of modern front-end electronics systems. (a) Photo-electric conversion; (b) Signal acquisition; (c) Pulse height analysis; (d) Peak-detect-and-hold; (e) Analog-to-digital conversion; (f) Time discriminator; (g) Time measurement and digitizing; (h) Digital signal processing. [Diagram labels: detector, charge-sensitive preamplifier, slow shaper, analog memory, ADC with output E = g(x), fast shaper, TDC with output T = f(x), DSP.]

The front-end electronics and signal processing have been summarized in a few publications. Table 2.1 lists some representative ones. They are also the main references of this chapter.

2.2 Photo-electric conversion

The objective of semiconductor detection systems is to fulfill both the photo-electric conversion and the electric signal processing. The photo-electric conversion is realized by the detector module, which is generally composed of two parts: scintillation crystals and photodetectors. The scintillator absorbs high-energy radiation such as X- or γ-rays and converts a fraction of the absorbed energy into visible or ultraviolet photons. The photodetector is the element that converts these photons into electrical charges.

The traditional detector module was made of a single crystal of thallium-doped sodium iodide (NaI[Tl]), individually coupled to a photomultiplier tube (PMT) [44]. Later, bismuth germanate (Bi4Ge3O12 or BGO) was discovered and employed due to its much greater efficiency in detecting γ-rays. Other scintillators include barium fluoride (BaF2), yttrium aluminate (YAlO3[Ce] or YAP), and cerium-doped gadolinium oxyorthosilicate (Gd2SiO5[Ce] or GSO). In recent years, new scintillators such as cerium-doped lutetium oxyorthosilicate (Lu2SiO5 or LSO) and cerium-doped lutetium yttrium oxyorthosilicate (LYSO) have improved PET performance.

Table 2.1: Several contributions on front-end electronics for high-energy physics and medical imaging

No.  Name                                                              Author                      Press         Year  Reference
1    Low-noise wide-band amplifiers in bipolar and CMOS technologies   Z.Y. Chang, W.M.C. Sansen   Kluwer        1991  [41]
2    Semiconductor detector systems                                    H. Spieler                  Oxford Univ.  2005  [42]
3    Medical Imaging: Principles, Detectors and Electronics            K. Iniewski                 Wiley         2009  [43]

Photodetectors are sensors that absorb the energy of photons and create electron-hole pairs. That is, they convert light into voltage or current signals that can be processed by analog and digital circuits. There is a large variety of photodetectors. The most commonly used ones include CdTe or CdZnTe detectors, photomultiplier tubes (PMTs), hybrid photon detectors (HPDs), silicon photomultiplier (SiPM) arrays, avalanche photodiode (APD) arrays, micro-channel plates (MCPs), charge-coupled device (CCD) detectors and CMOS detectors.

• CdTe/CdZnTe: Cadmium telluride (CdTe) and cadmium zinc telluride (CdZnTe) are regarded as semiconductor materials suited to detecting X- and γ-rays. The high atomic number of the materials gives a high quantum efficiency suitable for a detector operating typically in the 10 ∼ 500 keV range.

• PMTs: They are a class of vacuum tubes and extremely sensitive detectors of light in the ultraviolet, visible, and near-infrared ranges of the electromagnetic spectrum. They can have a single anode or multiple anodes. The gain is about 10^6.

• HPDs: They combine the photoconversion principle of a PMT with the spatial resolution and the low fluctuations of a semiconductor device.

• SiPMs: A novel sensor for detecting optical photons. They operate at a low bias voltage of 20-60 V, provide unprecedented amplitude resolution and already show photon detection efficiencies (PDE) comparable to or better than those of bialkali PMTs. SiPM technology has not yet matured, but the parameters are steadily improving.

• APDs: They are highly sensitive semiconductor devices that exploit the photoelectric effect to convert light into electricity. Their advantages are compactness, high quantum efficiency (up to a factor of 4 higher than PMTs), good spatial uniformity and insensitivity to magnetic fields. However, their drawbacks are low gain (∼50), large leakage current (up to 40 nA) and large capacitance (∼10 pF).

• MCP PMT: The principle is similar to that of PMTs. However, the dimensions of the anodes are greatly reduced. Besides, the time response is improved.

• CCD: A type of sensor that detects visible photons. They are implemented as shift registers that move charge between capacitive bins in the device. They are usually integrated into image sensors, which are widely applied in digital imaging. High resolution can be achieved, but the speed is limited.

• CMOS: A novel type of sensor that detects visible photons (CMOS image sensor) or charged particles (monolithic active pixel sensor). The CMOS detector can be integrated together with custom readout circuits. Thus, many additional features such as various pixel architectures, random pixel access, analog signal processing, on-chip bias generation, on-chip digitizing and radiation hardness can be implemented in CMOS technology. Low power and high readout speed can be achieved.

Figure 2.3: The simplified model of the photodetector. (a) DC model; (b) Small-signal model. [Diagram labels: iS, Cdet, Ldet, iLeakage, VDDhv, Vout, vout.]

Although they are realized by different methods, these photodetectors share a similar model. Figure 2.3(a) shows the simplified DC model, which consists of a bias resistor and a reverse-biased diode. Generally, the bias voltage is high. Figure 2.3(b) illustrates the small-signal model of the photodetectors. This model is composed of four parts: the current source, the detector capacitor, the detector inductor and the leakage current source. The equivalent capacitance of a pixel detector is 0.1∼10 pF/pixel. The capacitance of a photomultiplier (PM) is larger, with a value of 3∼30 pF. The current source results from the moving charges. In a pixel detector, the generated charge is about 100 e-/µm. In PMTs, the output for one photoelectron corresponds to 10^5 ∼ 10^7 e-. Thus, the current source can be modeled as an impulse. We have

$$ i_S = Q_0 \cdot \delta(t) \qquad (2.1) $$


where Q0 is the initial quantity of charge. Since the signal contains high-frequency components, the inductance of the wire should be taken into account. The inductance is about 1 nH. Moreover, the leakage current should be modeled. It depends on the type of photodetector and ranges from several nA to several µA. Note that the high-voltage bias, the neighboring channels and the calibration circuits are missing from this model. However, the simplified model is sufficient for small-signal analysis.

2.3 Signal acquisition

Reading the signals from photodetectors is a key technique in front-end electronics. Since the current source is generated from a capacitive detector, two physical parameters should be read out. One is the quantity of charge, whose measurement is called the 'energy measurement'. The other is the arrival time of the detected particles. For the former, a preamplifier and an integrator are required. For the latter, a discriminator together with time measurement circuits such as a TDC should be adopted.

The pulses from the detector are very weak and require amplification. Thus, each detector channel must be coupled to a preamplifier. Three types of amplifiers are available: the voltage-sensitive amplifier, the current-sensitive amplifier and the charge-sensitive amplifier. Selecting the proper preamplifier architecture is critical for the whole system. The choice mainly depends on the type of photodetector and on the signal processing method applied after the readout operation.

2.3.1 Voltage-sensitive amplifiers

The voltage-sensitive amplifier is one of the most widely used approaches in front-end electronics. The equivalent model of the voltage-sensitive amplifier is shown in Figure 2.4(a). The signal voltage at the input stage is given as

$$ v_i = \frac{R_i}{R_S + R_i} \, v_S \qquad (2.2) $$

where RS and Ri are the source resistance and the equivalent input resistance of the amplifier, respectively, and vS is the signal voltage source. From this equation, vi is approximately equal to vS when Ri >> RS. Since the gain of the amplifier is finite, Ri should be increased so that a larger fraction of the signal voltage reaches the amplifier input. Thus, maximizing the input resistance is very important in the design of a voltage-sensitive amplifier.
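As a quick numerical illustration of Equation 2.2, the following Python sketch evaluates the fraction of vS that reaches the amplifier input. The function name and the resistance values are illustrative assumptions and are not taken from a specific design.

```python
def input_voltage_fraction(r_source, r_input):
    """Return vi / vS for the resistive divider of Equation 2.2."""
    return r_input / (r_source + r_input)


# Hypothetical example: a 1 kOhm source driving three input resistances.
R_SOURCE = 1e3
for r_in in (1e3, 1e5, 1e7):
    fraction = input_voltage_fraction(R_SOURCE, r_in)
    print(f"Ri = {r_in:.0e} ohm -> vi/vS = {fraction:.4f}")
```

The fraction approaches 1 as Ri grows relative to RS, which is the reason for maximizing the input resistance.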

The simplified configuration of the voltage-sensitive amplifier for detectors is shown in Figure 2.4(b). A coupling capacitor is connected in series between the detector and the preamplifier to convert the charge or weak current into a voltage. Assuming the quantity of charge is ∆Q, the voltage variation across Ccoupling is

$$ \Delta V = \frac{\Delta Q}{C_{coupling}} \qquad (2.3) $$

Figure 2.4: Signal acquisition using a voltage-sensitive amplifier. (a) Basic equivalent model; (b) Simplified schematic.

Thus, the output of the preamplifier is given as

$$ \Delta V_{out,vsa} = -A \cdot \frac{\Delta Q}{C_{coupling}} \qquad (2.4) $$

where A is the gain of the preamplifier. By setting proper values of A and Ccoupling, the variation of the charge can be converted into a voltage signal.

Figure 2.5: The most widely used CMOS voltage-mode amplifiers [45]. (a) Common-source amplifier; (b) Source follower or common-drain amplifier; (c) Common-gate amplifier; (d) Cascode amplifier; (e) Folded cascode amplifier; (f) Differential-pair amplifier.

Real amplifiers in CMOS technology can be implemented in different ways. Figure 2.5 shows the most widely used CMOS amplifiers: the common-source amplifier, the source follower or common-drain amplifier, the common-gate amplifier, the cascode stage, the folded cascode and the differential pair. The first three are the basic single-stage amplifiers from which complex analog circuits can be constructed. The cascode and folded cascode amplifiers can achieve a high gain-bandwidth product. Thus, they have been widely used in front-end readout circuits. For low-noise applications and high-EMI environments, a differential pair is preferred as the preamplifier of the front-end readout electronics. The analysis and the design of these voltage-mode CMOS amplifiers can be found in [45]. However, since low noise and radiation hardness usually have to be considered in front-end applications, advanced design techniques should be developed according to the specific application.

Voltage amplifiers can achieve high gain by using cascade architectures. However, the gain is limited by power supply scaling. In addition, the gain and the bandwidth cannot be increased simultaneously. The response speed is usually limited by the use of transistors with large dimensions. Thus, voltage amplifiers are only suitable for applications that require moderate gain and moderate speed.

2.3.2 Current-sensitive amplifiers

Since the photodetector converts light into electron-hole pairs, its output is naturally a current signal. The current-sensitive amplifier is designed to read out the current signals from the detectors. The model of the current-sensitive amplifier is shown in Figure 2.6(a). In this model, the signal source is represented by a current source and a parallel shunt resistor. The current flowing into the amplifier is

$$ i_i = \frac{R_S}{R_S + R_i} \, i_S \qquad (2.5) $$

To maximize ii, Ri should be much smaller than RS. Thus, one should reduce the input resistance in the design of a current-sensitive amplifier.

Figure 2.6: Signal acquisition using a current-sensitive amplifier. (a) Basic equivalent model; (b) Simplified schematic.


The simplified front-end circuit using a current-sensitive amplifier is shown in Figure 2.6(b). The input of the current-sensitive amplifier is directly connected to the output of the photodetector so that the current can flow directly into the amplifier. Since the output of a current-sensitive amplifier is also a current, an open-loop integrator is required to convert this current into a voltage signal. The output voltage of the current-sensitive amplifier with the integrator is

$$ V_{out,curr} = \frac{A}{R_{int} C_{int}} \cdot i_S(t) \cdot e^{-\frac{t}{R_{int} C_{int}}} \qquad (2.6) $$

where A is the gain of the current amplifier, and Rint and Cint are the resistance and the capacitance of the integrator, respectively.

Figure 2.7: The most widely used CMOS current-mode amplifiers [46, 47]. (a) Simple current amplifier based on a current mirror; (b) A simple current amplifier using a three-transistor cascode current mirror; (c) A current amplifier using a four-transistor cascode current mirror; (d) A modified current mirror using a cascode current mirror for low-voltage power supply; (e) A regulated cascode current amplifier.

The implementation of the current amplifier in CMOS technology mainly relies on current mirrors. The most widely used current amplifiers [46] are shown in Figure 2.7. The simplest current amplifier, shown in Figure 2.7(a), is realized by a current mirror consisting of two transistors. The gain of this current mirror is determined by the ratio of the width-to-length ratios (W/L) of the two transistors. However, the gain is nonlinear due to the channel-length modulation effect. To enhance the performance of such a current amplifier, the architectures shown in Figure 2.7(b)∼(d) have been proposed. In Figure 2.7(b), M3 is added to keep Vds2 close to Vds1 or to maintain a constant difference between them. However, Vds2 still varies when the voltage level of the output changes. Figure 2.7(c) uses two additional transistors connected to the current mirror, forming a four-transistor cascode. The channel-length modulation effect can be clearly suppressed. However, the drawback of this circuit is that the voltage headroom is reduced. For applications with a low-voltage power supply, an improved scheme is given in Figure 2.7(d).

To achieve high linearity, increasing the transistor length is another option for the current mirror or the cascode current mirror. However, a large length introduces a large parasitic capacitance, which limits the bandwidth of the amplifier. For high-speed applications, the regulated cascode (RGC) amplifier [47] is preferred. Figure 2.7(e) shows a CMOS implementation of the RGC amplifier. Using a local feedback loop, this amplifier can readily achieve high linearity at high speed. A gain-bandwidth product of 10 GHz can be obtained. This is an attractive feature for front-end circuits in both communications and high-energy physics.

Although current amplifiers suffer from nonlinear gain, they have the advantages of high speed and low input impedance. Besides, current-mode amplifiers can overcome the challenges caused by technology scaling and are recommended for submicron CMOS technologies.

2.3.3 Charge-sensitive amplifiers

In front-end electronics, the most widely used configuration is the charge-sensitive amplifier (CSA). The CSA collects the charge from the photodetector and converts it into a voltage signal. The equivalent model is shown in Figure 2.8(a). Since the photodetector can be modeled as a current source and a detector capacitor, the charge can also be collected by the input capacitor. Assuming that ∆Q denotes the variation of the charge, the voltage change across the capacitors is given as

$$ \Delta V = \frac{\Delta Q}{C_{det} + C_{in}} \qquad (2.7) $$

where Cin represents the equivalent capacitance at the input stage. Assuming for simplicity that no current flows into the preamplifier, the quantity of charge collected at the input stage of the amplifier is given as

$$ \Delta Q_{in} = C_{in} \cdot \Delta V = \frac{C_{in}}{C_{det} + C_{in}} \cdot \Delta Q \qquad (2.8) $$

In order to collect a larger share of the charge, the value of Cin should be larger. This means that the input capacitance of the amplifier should be made as large as possible.

The simplified schematic of the front-end circuit using a CSA is shown in Figure 2.8(b). A feedback RC network is connected between the input and the output of the CSA. The function of this circuit is to convert a part of the charge into the output voltage. If no signal current flows into the preamplifier, the transfer function [42] is given as

$$ A_Q = \frac{A}{1 + A} \cdot \frac{1}{C_f} \approx \frac{1}{C_f} \quad (A \gg 1) \qquad (2.9) $$


Figure 2.8: Signal acquisition using a charge-sensitive amplifier. (a) Basic equivalent model; (b) Simplified schematic.


where A = ∂vO/∂vi is the voltage gain. Note that the gain of a CSA is inversely proportional to the feedback capacitance. Thus, the feedback capacitor is a key element in the design of a CSA.

Because feedback is used, the feedback capacitor appears at the input (A+1) times larger than a capacitor of the same value connected directly to the input stage. Neglecting the parasitic capacitance, the equivalent input capacitance is Cf (A + 1). Thus, the real efficiency of charge collection is

$$ \eta_{csa,real} = \frac{C_{in}}{C_{det} + C_{in}} \approx \frac{C_f (A + 1)}{C_{det} + C_f (A + 1)} \qquad (2.10) $$

It is clear that the efficiency of the charge collection is determined by the detector capacitor and the charge-sensitive amplifier. To achieve a more sensitive conversion from charge to voltage, the collection efficiency should be made as large as possible. Thus, one should increase the value of Cf (A + 1).
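Equation 2.10 can be explored numerically. The sketch below, written in Python under the same assumption as the text (parasitic capacitance neglected), computes the collection efficiency for a given detector capacitance, feedback capacitance and voltage gain. The function name and the component values are hypothetical and serve only as an illustration.

```python
def collection_efficiency(c_det, c_f, gain):
    """Charge collection efficiency of a CSA following Equation 2.10.

    c_det : detector capacitance in farads
    c_f   : feedback capacitance in farads
    gain  : open-loop voltage gain A of the amplifier
    """
    c_in = c_f * (gain + 1.0)  # equivalent input capacitance, Cf(A+1)
    return c_in / (c_det + c_in)


# Hypothetical values: 10 pF detector, 1 pF feedback capacitor.
for a in (10, 100, 1000):
    eta = collection_efficiency(10e-12, 1e-12, a)
    print(f"A = {a:5d} -> efficiency = {100.0 * eta:.2f} %")
```

With these assumed values the efficiency rises from roughly one half at A = 10 to about 99 % at A = 1000, which illustrates why a large Cf (A + 1) is desirable.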

Because a CSA processes charge, low-noise performance is very important. For a charge amplifier, the evaluated parameter is the equivalent noise charge (ENC), which is the noise referred to the input and expressed in electrons. The ENC can be calculated as

$$ ENC = \frac{\sqrt{V_{n,o}^2}}{A_Q} \qquad (2.11) $$

where V²n,o is the output voltage noise in V² and AQ is the gain of the CSA in V/C. Thus, the ENC is proportional to the output voltage noise and inversely proportional to the gain of the CSA. Since AQ is mainly determined by the feedback capacitor, reducing the ENC depends on decreasing the output noise of the CSA. This requires low-noise design techniques for charge amplifiers.
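The following Python sketch applies Equation 2.11 to a hypothetical case. It assumes A >> 1, so that AQ ≈ 1/Cf from Equation 2.9, and divides by the elementary charge to express the result in electrons. The noise level and the capacitor value are illustrative assumptions.

```python
ELECTRON_CHARGE = 1.602176634e-19  # elementary charge in coulombs


def enc_electrons(v_noise_rms, c_f):
    """ENC in electrons from the rms output noise (Equation 2.11).

    Assumes the charge gain is A_Q = 1 / Cf, i.e. A >> 1 in Equation 2.9.
    """
    charge_gain = 1.0 / c_f                      # V/C
    enc_coulombs = v_noise_rms / charge_gain     # C
    return enc_coulombs / ELECTRON_CHARGE


# Hypothetical example: 1 mV rms output noise with a 100 fF feedback capacitor.
print(f"ENC = {enc_electrons(1e-3, 100e-15):.0f} electrons")
```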

The CMOS implementation of a CSA is shown in Figure 2.9. It consists of two voltage amplifier stages. The first stage is a folded cascode amplifier, which is used to obtain both high gain and large bandwidth. The second stage is a common-source amplifier, which provides a large output dynamic range. The feedback resistor is realized by a MOS transistor operated in the linear region. The capacitor is usually implemented by precise poly-poly capacitors or metal-insulator-metal (MiM) capacitors.

Figure 2.9: Schematic of a cascode charge-sensitive preamplifier [48].

In the past thirty years, many contributions have addressed the topic of charge-sensitive amplifiers. Here, the author gives three examples. In the 1990s, W. Sansen et al. proposed a low-noise CMOS CSA for large-capacitance detectors [49, 41]. The CSA was designed and integrated in a standard 3-µm CMOS technology with a 1-pF feedback capacitance for a 40-pF detector capacitance. The core amplifier was basically a folded cascode amplifier. The input MOS transistor contributed more than 90 % of the total amplifier noise. Thus, optimizing the noise of the input transistor is key in this kind of CSA. Later, G. Gramegna et al. presented a CMOS preamplifier for low-capacitance detectors (0.1 ∼ 1 pF) [48]. A cascode architecture with an NMOS input transistor was chosen. Implemented in a 1.2 µm CMOS process, the preamplifier achieves an ENC of 35 e- + 58 e-/pF at a shaping time of 23 µs with a power consumption of about 3.2 mW. The feedback capacitor is connected to the output of the CSA. Recently, low-gain APDs have been used for PET imaging. Their readout front-ends should offer both low noise and high gain. A new CSA based on a single-ended-input split-leg cascode configuration was proposed in [50]. This CSA can be used to read out an APD detector with a 3-pF capacitance. Although low noise and high gain are obtained, this CSA requires a bias current of 2 mA. Thus, its static power dissipation is large.

2.4 Pulse height analysis

Pulse height analysis is the operation that extracts the useful information from readout signals mixed with electronic noise and measures the pulse height. To obtain the peak value of the pulse, shaping techniques should be employed. Pulse shaping optimizes the energy resolution and minimizes the risk of overlap between successive pulses. The most widely used shaping techniques are CR-RC shaping and semi-Gaussian shaping. In fact, the latter refers to high-order CR-(RC)^n shaping. Other shaping techniques include delay-line shaping, quasi-triangular and trapezoidal pulse shaping, gated-integrator pulse shaping and so on. These shaping techniques can be used for specific applications. In this section, the CR-RC and semi-Gaussian shaping techniques are discussed in detail.

2.4.1 CR-RC shaping

For some low-gain semiconductor detectors, the amplitude of the output signal is low while its frequency is high. A band-pass filter is required to separate the real signals from the noise. The most widely used CR-RC pulse shaper, which consists of a high-pass filter followed by a low-pass filter, is shown in Figure 2.10(a). The high-pass filter is realized by a capacitor and a resistor in series, followed by an operational amplifier (op amp) which isolates the load. The low-pass filter is realized by an RC section and an op amp. In fact, the CR and RC sections behave as a "differentiator" and an "integrator", respectively. Thus, both the high-frequency electronic noise and the low-frequency noise can be suppressed, and the signal-to-noise ratio (SNR) of front-end signals can be improved by using a CR-RC shaper.

Figure 2.10: Schematic of the CR-RC shaper circuit. (a) Open-loop CR-RC shaper; (b) Active CR-RC shaper.

Besides, the signal can be modulated. Firstly, the frequency content of front-end signals can be adjusted by setting proper values of the resistors and the capacitors. This makes it possible to process front-end signals with low-frequency analog circuits. Secondly, the moment at which the pulse reaches its maximum value can be adjusted through the resistance and the capacitance. In particular, this moment can be made identical for all levels of input charge.


The transfer function of the shaper shown in Figure 2.10(a) is given as

$$ H_a(s) = \frac{s R_1 C_1}{1 + s R_1 C_1} \cdot \frac{A^2}{1 + s R_2 C_2} \qquad (2.12) $$

If R1C1 = R2C2 = τp and A = 1, Equation 2.12 can be rewritten as

$$ H_a(s) = \frac{s \tau_p}{1 + s \tau_p} \cdot \frac{1}{1 + s \tau_p} \qquad (2.13) $$

Thus, this system has one zero and two coincident poles. Its timing characteristics depend mainly on τp. If the input of the shaper is a voltage step, which can be modeled as U(t − τintr), the output of the shaper is given as

$$ V_{out,shpr} = \frac{t}{\tau_p} \cdot e^{-\frac{t}{\tau_p}} \cdot U(t - \tau_{intr}) \qquad (2.14) $$

where τintr refers to the integrating time and the delay of the integrator. When t > τintr, the maximum value of Vout,shpr can be derived from

$$ \frac{\partial V_{out,shpr}}{\partial t} = 0 \qquad (2.15) $$

Solving this equation, we have

$$ t_{max} = \tau_p \qquad (2.16) $$
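The result of Equation 2.16 can be checked numerically. The Python sketch below samples the step response of Equation 2.14, taking τintr = 0 for simplicity (an assumption made only for this illustration), and locates the maximum for several input amplitudes. The function names and the 3 µs shaping time are illustrative choices.

```python
import math


def crrc_step_response(t, tau_p, amplitude=1.0):
    """CR-RC shaper output of Equation 2.14 with tau_intr assumed to be 0."""
    if t < 0.0:
        return 0.0
    return amplitude * (t / tau_p) * math.exp(-t / tau_p)


def find_peak(tau_p, amplitude=1.0, steps_per_tau=1000, span_in_tau=10):
    """Return (peaking time, peak value) found on a uniform time grid."""
    dt = tau_p / steps_per_tau
    n_steps = steps_per_tau * span_in_tau
    best = max(range(n_steps + 1),
               key=lambda i: crrc_step_response(i * dt, tau_p, amplitude))
    t_peak = best * dt
    return t_peak, crrc_step_response(t_peak, tau_p, amplitude)


# The peaking time should stay at tau_p for every input amplitude.
for amp in (0.1, 0.5, 1.0):
    t_peak, v_peak = find_peak(3e-6, amp)
    print(f"amplitude {amp:.1f} V -> peak {v_peak:.4f} V at {t_peak * 1e6:.2f} us")
```

The peak height scales linearly with the input amplitude (it equals the amplitude divided by e), while the peaking time does not move. This is the property exploited by the fixed-delay sampling scheme.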

In front-end electronics, τp is also called the "shaping time". From Equation 2.12 to Equation 2.16, we find that the shaping time is determined by the values of R1C1, R2C2 and A.

In practice, the CR-RC shaper is used in the form of the circuit shown in Figure 2.10(b). This circuit is also called an "active shaper". The low-pass section is connected between the negative input port and the output of the op amp. A reference voltage is connected to the positive port of the op amp. Thus, the DC value of the shaper output can be easily adjusted. The transfer function of this circuit is given as

$$ H_b(s) = -\frac{A \cdot s R_2 C_1}{s R_2 C_1 + (A + 1)(1 + s R_1 C_1)(1 + s R_2 C_2)} \qquad (2.17) $$

If A >> 1, we have

$$ H_b(s) \approx -\frac{s R_2 C_1}{(1 + s R_1 C_1)(1 + s R_2 C_2)} \qquad (2.18) $$

Equation 2.18 has the same form as Equation 2.12. This means that the active shaper can achieve the same performance as the circuit shown in Figure 2.10(a).

The typical output waveform of an active CR-RC shaper is shown in Figure 2.11. The amplitudes of the outputs increase linearly when the input voltage is varied in steps of 0.1 V. All shaper outputs have the same shaping time. However, one can note that the outputs return to the baseline rather slowly after the peak. This means that the decay time of the shaper output is long, which limits its application in systems with a high signal rate. Pile-up is the typical problem when this decay time is not short enough.


Figure 2.11: Output waveform of an active CR-RC shaper with a shaping time of 3 µs. The input voltage varies from 0.1 V to 1 V.

2.4.2 Semi-Gaussian shaping

To overcome the drawbacks of the CR-RC shaper, the decay time should be reduced. Meanwhile, the output pulse can be made more symmetrical, allowing higher signal rates for the same shaping time. This can be realized by a semi-Gaussian shaper [51], which was proposed in the 1970s. The principle of this shaper is to use a multiple-stage low-pass section to shape the pulse so that it becomes as symmetrical as possible. The diagram of the semi-Gaussian CR-(RC)^n shaper and the schematic of the integrator are shown in Figure 2.12(a) and (b), respectively.

Figure 2.12: The semi-Gaussian shaper. (a) The architecture; (b) The schematic of the integrator.

The transfer function of a semi-Gaussian shaper can be given as

$$ H_{SG}(s) = \left( \frac{s \tau_p}{1 + s \tau_p} \right) \cdot \left( \frac{A}{1 + s \tau_p} \right)^n \qquad (2.19) $$


where τp is the time constant of the differentiator and the integrators, and A is the DC gain of each integrator. Compared to the CR-RC shaper, an nth-order integrator is used. The step response of the semi-Gaussian shaper is given as

$$ V_{out,sgshpr} = \left( \frac{t}{\tau_p} \right)^n e^{-\frac{t}{\tau_p}} \, U(t - \tau_{intr}) \qquad (2.20) $$

Similarly, one can find that the shaping time τp is a constant parameter that is independent of the order of the integrator. Note, however, that the pulse described by Equation 2.20 reaches its maximum at t = nτp, so τp has to be scaled with n if the peaking time itself is to be kept constant. Besides, more integrators mean a larger improvement of the SNR and more symmetric semi-Gaussian waveforms. These results are shown in Figure 2.13. However, more integrators occupy a larger die area, which is not acceptable in some multi-channel front-end electronic systems.
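The effect of the order n can be illustrated with a short Python sketch based on Equation 2.20, again taking τintr = 0 as a simplifying assumption. The first helper locates the peaking time for a given τp and n. The second one scales τp to t_peak / n (an assumption of this sketch, so that every order peaks at the same instant) and measures how long the pulse takes to fall below 1 % of its peak. All names and numerical values are hypothetical.

```python
import math


def semi_gaussian_step(t, tau_p, order):
    """Step response of Equation 2.20 with tau_intr assumed to be 0."""
    if t < 0.0:
        return 0.0
    return (t / tau_p) ** order * math.exp(-t / tau_p)


def peaking_time(tau_p, order, steps_per_tau=1000):
    """Locate the maximum of the step response on a uniform grid."""
    dt = tau_p / steps_per_tau
    n_steps = steps_per_tau * (order + 10)  # scan well past the peak
    best = max(range(n_steps + 1),
               key=lambda i: semi_gaussian_step(i * dt, tau_p, order))
    return best * dt


def return_to_baseline_time(t_peak, order, fraction=0.01, steps=20000):
    """Time at which the pulse drops below `fraction` of its peak value.

    tau_p is scaled to t_peak / order so that all orders peak at t_peak.
    """
    tau_p = t_peak / order
    peak = semi_gaussian_step(t_peak, tau_p, order)
    dt = 20.0 * t_peak / steps
    for i in range(steps + 1):
        t = t_peak + i * dt
        if semi_gaussian_step(t, tau_p, order) < fraction * peak:
            return t
    return None  # not reached within the scanned window


for n in (1, 2, 4):
    t_pk = peaking_time(1e-6, n)
    t_ret = return_to_baseline_time(3e-6, n)
    print(f"n = {n}: peak at {t_pk * 1e6:.2f} us for tau_p = 1 us; "
          f"back to 1 % at {t_ret * 1e6:.1f} us for a 3 us peaking time")
```

For a common peaking time, the higher-order pulses return to the baseline sooner, which is the reason why semi-Gaussian shaping tolerates higher signal rates than simple CR-RC shaping.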

Figure 2.13: Typical output waveforms of a semi-Gaussian shaper [41, 42]

2.5 Peak detect sample and hold

The shaped pulse contains the information about the energy of the particles. To complete the energy measurement, the peak value of this shaped pulse should be detected, sampled and held for the subsequent digitization. Thus, peak-detect-and-hold circuits are an important part of the front-end electronics.

Several methods can be employed for the precise detection of the peak voltage. First, with CR-RC or semi-Gaussian shaping, the shaping time is the same for all levels of input charge. Thus, the peak value can be sampled after this fixed delay time. The circuit can be implemented using an analog memory realized by switched-capacitor circuits and a monostable circuit. Second, the peak value of the shaped pulse can be tracked and stored by a dedicated circuit called a peak-track-and-hold circuit. This method does not depend on time stamps. Third, the shaped pulse can be oversampled by a high-speed ADC. The pulse can then be recovered using digital signal processing algorithms, and the peak value can be calculated by off-line programs.


2.5.1 Peak sampling using a fixed delay

Figure 2.14: Schematic of the front-end electronics to realize pulse height measurements using a CR-RC shaper or semi-Gaussian shaper with a monostable circuit.

It is known that the common feature of the CR-RC shaper and the semi-Gaussian shaper is a fixed shaping time for all levels of input charge. Thus, one can use a circuit to control the delay time of the discriminated pulses. The measurement scheme is shown in Figure 2.14. In this scheme, a discriminator with a low threshold voltage generates the time-stamp pulse. This time-stamp pulse is propagated through the monostable circuit, which generates a precise, adjustable delay so that a sampling pulse for the analog memory is produced. Since the shaping time is fixed for all input charges, the peak voltage of the shaper output can be sampled by this sampling pulse.

The precision of the measured pulse height depends on the kT/C noise generated when the sample-and-hold operation is performed. However, the circuit implementation of this method is very simple. Besides, this method can be employed in applications that have a large dynamic range.
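To give an order of magnitude for the kT/C noise mentioned above, the following Python sketch evaluates the standard expression sqrt(kT/C) for the rms voltage noise sampled on a hold capacitor. The temperature of 300 K and the capacitor values are assumptions chosen for illustration.

```python
import math

BOLTZMANN = 1.380649e-23  # Boltzmann constant in J/K


def ktc_noise_rms(capacitance, temperature=300.0):
    """rms voltage of the kT/C noise sampled on a capacitor, in volts."""
    return math.sqrt(BOLTZMANN * temperature / capacitance)


# Hypothetical hold capacitors.
for c_hold in (0.1e-12, 1e-12, 10e-12):
    v_n = ktc_noise_rms(c_hold)
    print(f"C = {c_hold * 1e12:5.1f} pF -> kT/C noise = {v_n * 1e6:.1f} uV rms")
```

A larger hold capacitor lowers the sampled noise, at the cost of area and of a slower acquisition.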

2.5.2 Peak-track-and-hold

A feature of the front-end signal is that the shaped pulse has only one peak. Thus, the pulse can also be tracked and its peak value stored. The most widely used architecture is an amplifier charging a capacitor through a nonlinear device. As shown in Figure 2.15, if the input voltage is higher than the output voltage, the output of the OTA is high. The diode conducts and the capacitor is charged. Once the output voltage exceeds the input voltage, the output of the OTA goes low. The diode is then reverse biased so that the charging operation stops. As a result, the peak voltage of the input pulse is held on the capacitor.
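The behavior described above can be summarized by an idealized behavioral model. The Python sketch below is not a circuit simulation: it assumes an ideal diode, no droop, no offset and no charge injection, and it uses a sampled CR-RC pulse as a hypothetical input. The held value only rises while the input exceeds it.

```python
import math


def peak_track_and_hold(samples, initial=0.0):
    """Idealized peak detector: the capacitor charges only while Vin > Vout."""
    held = initial
    trace = []
    for v_in in samples:
        if v_in > held:      # OTA output high, diode conducts
            held = v_in
        trace.append(held)   # otherwise the diode is off and the value is held
    return trace


def crrc_pulse(amplitude, tau_p, n_points=2000, span_in_tau=10.0):
    """Sampled CR-RC pulse used here as a hypothetical test input."""
    dt = span_in_tau * tau_p / n_points
    return [amplitude * (i * dt / tau_p) * math.exp(-i * dt / tau_p)
            for i in range(n_points)]


pulse = crrc_pulse(1.0, 3e-6)
held_trace = peak_track_and_hold(pulse)
print(f"held value = {held_trace[-1]:.4f} V, true peak = {max(pulse):.4f} V")
```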

The discrete realization was first proposed in [52]. The drawback of this architecture is that the voltage step on the diode introduces charge injection on the hold capacitor. Moreover, there is no discharging path. In 1994, M.W. Kruiskamp et al. presented an improved architecture which uses a current mirror to overcome these problems [53]. The modified architecture is shown in Figure 2.16. When the input voltage is higher than the output voltage (as in the initial condition), the current mirror conducts and the capacitor is charged. When the voltage level of the output exceeds that of the input, the current mirror is shut off so that the charging operation stops. Thus, the peak voltage is held on the capacitor. A MOS switch is provided to discharge the capacitor. This successful architecture achieves very good performance. The pedestals were found to be less than 5 mV for the positive-pulse channel and less than 35 mV for the negative-pulse channel, while the droop rate was less than 8 µV/µs [54].

Figure 2.15: Front-end electronics to realize pulse height measurements using a CR-RC shaper or semi-Gaussian shaper followed by a peak-track-and-hold block, which tracks the pulse and stores the peak value. The circuit is implemented by an OTA, a diode, a capacitor and a voltage buffer with a feedback loop.

Figure 2.16: The modified architecture of peak-track-and-hold circuits [53].

However, this architecture also suffers from some disadvantages. Firstly, the current mirror and the reset switch cannot be totally shut off. This introduces leakage current while charging and discharging, so the error of the held voltage cannot be eliminated. Secondly, the accuracy of the peak-height measurement is affected by factors such as the offset voltage and poor common-mode rejection of the amplifier, the low slew rate and the parasitic capacitance.


Figure 2.17: The two-phase peak-track-and-hold circuit [55]. Φ1 (write): S11, S12 ON and S21, S22, S23 OFF. Φ2 (read): S11, S12 OFF and S21, S22, S23 ON.

In 2002, G.D. Geronimo et al. proposed a two-phase peak-detect-and-hold circuit, shown in Figure 2.17. In the charging mode, the switches S11 and S12 are closed and S21 ∼ S23 are open. The circuit then behaves as the modified peak-detect-and-hold circuit shown in Figure 2.16. In the reading mode, the switch states are reversed: S11 and S12 are open and S21 ∼ S23 are closed. The circuit then operates as a hold circuit followed by a voltage buffer. This circuit offers several advantages. Firstly, the amplifier offset voltage can be canceled. Secondly, the current mirror and the reset switch can be totally shut off by using two-phase non-overlapping timing. Thirdly, the precision can be improved by using a rail-to-rail OTA and minimum-size transistors. A prototype was designed in TSMC 0.35 µm CMOS technology. An absolute accuracy of 0.2 % at 500 ns and a droop rate of 0.25 µV/µs have been achieved [55].

2.6 Analog-to-digital conversion

With the development of VLSI and computer science, front-end electronics has entered the era of "Go Digital As Soon As Possible". This trend motivates the development of ADCs with high resolution, high sampling rate and low power. In recent years, the evolution of integrated ADCs has made it possible to integrate ADC circuits into front-end readout ASICs. As a result, signal digitization is becoming more and more important in the development of front-end chips.

Here, analog-to-digital conversion refers to the conversion of analog voltage signals into digital signals. The ADC process consists of two steps: a sample-and-hold (S/H) operation followed by digital quantization. For a given continuous-time analog input Vin, the ADC outputs a series of digital codes. The transfer function of a 3-bit ADC is shown in Figure 2.18.

The relationship between the analog input and the digital outputs is given as [56, 57]

$$ V_{in} = V_{FS} \sum_{k=0}^{N-1} \frac{D_k}{2^{k+1}} + \varepsilon \qquad (2.21) $$


Figure 2.18: Ideal transfer function of a 3-bit ADC.

where VFS is the full-scale voltage, Dk is the kth individual output bit (D0 being the most significant bit) and ε is the quantization error. This equation can also be rewritten using the quantum voltage level, VLSB:

$$ V_{in} = V_{LSB} \sum_{k=0}^{N-1} D_k 2^{N-1-k} + \varepsilon \qquad (2.22) $$

where VLSB = VFS/2^N. Due to the finite resolution, the quantization error ε exists in all ADCs. Its rms value is given as

$$ \varepsilon = \frac{V_{LSB}}{\sqrt{12}} \qquad (2.23) $$

For a full-scale sine-wave input, we have

$$ SNR = 6.02N + 1.76 \ \text{dB} \qquad (2.24) $$

Thus, the SNR of the ADC is limited by the resolution. Moreover, the SNR in dB increases linearly with the resolution N, by about 6 dB per bit. This equation determines the resolution required of an ADC when the SNR is given.
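Equation 2.24 can be checked with a simple behavioral experiment. The Python sketch below quantizes a full-scale sine wave with an ideal mid-rise quantizer and compares the measured SNR with the 6.02N + 1.76 dB rule. The quantizer model, the record length and the number of cycles are assumptions of this sketch and do not describe a particular ADC.

```python
import math


def quantize(x, n_bits, full_scale=1.0):
    """Ideal mid-rise quantizer covering [-full_scale, +full_scale)."""
    levels = 2 ** n_bits
    lsb = 2.0 * full_scale / levels
    code = math.floor(x / lsb)
    code = max(-levels // 2, min(levels // 2 - 1, code))  # clip to valid codes
    return (code + 0.5) * lsb


def measured_snr_db(n_bits, n_samples=4096, cycles=127):
    """SNR of a quantized full-scale sine, from signal and error powers."""
    signal_power = 0.0
    noise_power = 0.0
    for i in range(n_samples):
        x = math.sin(2.0 * math.pi * cycles * i / n_samples)
        err = quantize(x, n_bits) - x
        signal_power += x * x
        noise_power += err * err
    return 10.0 * math.log10(signal_power / noise_power)


def ideal_snr_db(n_bits):
    """Equation 2.24."""
    return 6.02 * n_bits + 1.76


for n in (6, 8, 10):
    print(f"N = {n:2d}: measured {measured_snr_db(n):.2f} dB, "
          f"ideal {ideal_snr_db(n):.2f} dB")
```

The number of cycles is chosen odd and coprime with the record length so that the samples cover many different phases of the sine wave.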

All real ADCs have additional noise sources and distortion processes that degrade the performance of the ADC below the SNR discussed above [56, 57]. The dynamic behavior can be described by several dynamic parameters. The signal-to-noise-and-distortion ratio (SNDR) is the ratio of the input signal amplitude to the rms sum of all other spectral components. For an M-point FFT of a sine-wave test, if the fundamental is in frequency bin m (with amplitude Am), the SNDR can be calculated from the FFT amplitudes as

$$ SNDR = 10 \log \left[ A_m^2 \left( \sum_{k=1}^{m-1} A_k^2 + \sum_{k=m+1}^{M/2} A_k^2 \right)^{-1} \right] \qquad (2.25) $$


Thus, the effective number of bits (ENOB) can be defined as

$$ ENOB = \frac{SNDR - 1.76 \ \text{dB}}{6.02 \ \text{dB/bit}} \qquad (2.26) $$

The ENOB, which should be obtained from a test result, is a simple way to evaluate the dynamic performance of an ADC.
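Equations 2.25 and 2.26 translate directly into a few lines of code. The Python sketch below computes the bin amplitudes with a plain DFT (written out explicitly to stay within the standard library), evaluates the SNDR and derives the ENOB. The test signal is an ideally quantized sine wave, which is an assumption of this sketch. A real measurement would use captured ADC codes instead.

```python
import cmath
import math


def dft_amplitudes(samples):
    """Amplitudes |X_k| for bins k = 0 .. M/2 of a real sequence."""
    m_points = len(samples)
    amplitudes = []
    for k in range(m_points // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * i / m_points)
                  for i, x in enumerate(samples))
        amplitudes.append(abs(acc))
    return amplitudes


def sndr_db(amplitudes, fundamental_bin):
    """Equation 2.25: fundamental power over the power of all other bins.

    Bin 0 (DC) is skipped because the sums in Equation 2.25 start at k = 1.
    """
    signal = amplitudes[fundamental_bin] ** 2
    others = sum(a * a for k, a in enumerate(amplitudes)
                 if k not in (0, fundamental_bin))
    return 10.0 * math.log10(signal / others)


def enob(sndr):
    """Equation 2.26."""
    return (sndr - 1.76) / 6.02


def quantized_sine(n_bits, m_points=256, fundamental_bin=5):
    """Hypothetical test record: full-scale sine through an ideal quantizer."""
    levels = 2 ** n_bits
    lsb = 2.0 / levels
    record = []
    for i in range(m_points):
        x = math.sin(2.0 * math.pi * fundamental_bin * i / m_points)
        code = max(-levels // 2, min(levels // 2 - 1, math.floor(x / lsb)))
        record.append((code + 0.5) * lsb)
    return record


amps = dft_amplitudes(quantized_sine(6))
sndr = sndr_db(amps, 5)
print(f"SNDR = {sndr:.2f} dB, ENOB = {enob(sndr):.2f} bits")
```

For an ideal quantizer the ENOB obtained in this way should be close to the nominal resolution. For a real ADC it is lower because of the additional noise and distortion.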

Moreover, ADC performance is mainly limited by the noise introduced by the S/H circuit and by the signal distortion due to quantization [58, 59]. For all types of ADCs, resolution, sampling rate and power dissipation are three universal performance parameters. To evaluate ADCs, two figures of merit (FOMs) are widely used. They are

$$ P = 2^B \cdot f_s \qquad (2.27) $$

$$ F = \frac{2^B \cdot f_s}{P_{diss}} \qquad (2.28) $$

where P is the figure of merit for the combined performance of resolution and speed, and F is the figure of merit for power efficiency with respect to resolution and speed. B, fs and Pdiss refer to the resolution, the sampling rate and the power dissipation, respectively.
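The two figures of merit of Equations 2.27 and 2.28 are easy to evaluate when comparing candidate converters. The following Python sketch does so for two hypothetical ADCs. The specifications used below are invented for illustration and do not refer to real devices.

```python
def fom_resolution_speed(bits, f_sample):
    """Equation 2.27: P = 2^B * fs."""
    return (2 ** bits) * f_sample


def fom_power_efficiency(bits, f_sample, p_diss):
    """Equation 2.28: F = 2^B * fs / Pdiss."""
    return fom_resolution_speed(bits, f_sample) / p_diss


# Hypothetical candidates: (name, bits, sampling rate in S/s, power in W).
candidates = [
    ("ADC A", 10, 1.0e6, 1.0e-3),
    ("ADC B", 8, 50.0e6, 40.0e-3),
]
for name, bits, fs, power in candidates:
    p = fom_resolution_speed(bits, fs)
    f = fom_power_efficiency(bits, fs, power)
    print(f"{name}: P = {p:.3e}, F = {f:.3e}")
```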

For radiation photodetector applications, a large number of channels and a dynamic range higher than 6 bits are routine. Selecting the proper ADC architecture for a particular application is a formidable task. Nowadays, a large number of ADC architectures are available. In [59], the authors surveyed nearly 1,000 commercial ADC chips released in the past 20 years. Part of the results are summarized in Figure 2.19 and Figure 2.20.

Flash architectures [60, 61] are basically excluded by the constraints on power dissipation and die area. Semi-flash or pipeline architectures [62], which form the basis of most modern commercial ADCs, are better suited. However, these ADCs are more difficult to design, in particular if good differential linearity is required. Successive-approximation-register (SAR) architectures [63] are easier to design, but their area may become prohibitive if a large dynamic range and good linearity are needed [64]. Sigma-delta (Σ-∆) architectures can reach high resolution, even up to 24 bits. However, their speed is generally limited to less than 1 MSamples/s. Finally, the single-ramp or Wilkinson architecture has the advantages of low power dissipation and suitability for multiple channels. The sampling rate of Wilkinson ADCs is only several hundred kSamples/s when the reference clock is about 100 MHz. However, since they can achieve low power, high resolution and multiple channels, Wilkinson ADCs are very suitable and widely used for front-end electronics. Since ADCs have been developed for several decades, a large number of contributions including scientific papers, dissertations and books have been published in this field. The representative contributions are summarized in Table 2.2.
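The sampling rate quoted above for Wilkinson ADCs follows from the way a single-ramp converter works: in the worst case the counter has to run through all 2^N codes. The Python sketch below estimates this limit. It assumes one clock cycle per code and neglects the ramp reset and any other overhead, so it gives an optimistic bound rather than the rate of a specific design.

```python
def wilkinson_max_conversion_time(n_bits, f_clock):
    """Worst-case conversion time in seconds: 2^N clock cycles."""
    return (2 ** n_bits) / f_clock


def wilkinson_sampling_rate(n_bits, f_clock):
    """Sampling rate in S/s guaranteed under the worst-case assumption."""
    return f_clock / (2 ** n_bits)


# 100 MHz reference clock, as mentioned in the text.
for n in (8, 10, 12):
    rate = wilkinson_sampling_rate(n, 100e6)
    t_conv = wilkinson_max_conversion_time(n, 100e6)
    print(f"N = {n:2d}: {rate / 1e3:8.1f} kS/s (conversion time {t_conv * 1e6:.2f} us)")
```

With a 100 MHz clock the bound is a few hundred kSamples/s at 8 bits and falls by a factor of two for each additional bit.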

2.7 Time discriminator

In front-end electronics, a time discriminator plays an important role for time mea-

surements. Generally, the time discriminator is realized by a high-speed high-resolution

Time discriminator 41

Figure 2.19: Survey results of the number of bits versus the sampling rate for ADCs released in the past 20 years [59].

Figure 2.20: Survey results of the power dissipation versus the sampling rate for ADCs released in the past 20 years [59].


Table 2.2: Several contributions on analog-to-digital converters (ADC)


| No. | Title | Author(s) | Publisher | Year | Ref. |
|-----|-------|-----------|-----------|------|------|
| 1 | Data converters | G.B. Clayton | Wiley | 1982 | [65] |
| 2 | Principle of Data Conversion System Design | B. Razavi | Wiley | 1995 | [56] |
| 3 | Simplified design of data converters | J.D. Lenk | Elsevier | 1997 | [66] |
| 4 | Delta-Sigma data converters: theory, design and simulation | S.R. Norsworthy, et al. | IEEE | 1997 | [67] |
| 5 | CMOS data converter for communications | M. Gustavsson, et al. | Kluwer | 2000 | [68] |
| 6 | Data converters for Wireless standards | C. Shi, et al. | Kluwer | 2002 | [69] |
| 7 | CMOS telecom data converters | A. Rodriguez, et al. | Kluwer | 2003 | [70] |
| 8 | CMOS integrated analog-to-digital and digital-to-analog converters | Rudy J. van de Plassche | Springer | 2003 | [71] |
| 9 | Data converters | F. Maloberti | Springer | 2007 | [72] |
| 10 | Analog Circuit Design: Smart Data Converters, Filters on chip, Multimode Transmitters | A. Roermund, H. Casier, M. Steyaert | Springer | 2009 | [73] |

comparator. Depending on the type of input signal, either current-mode or voltage-mode comparators can be employed.

The design techniques for a high-speed, high-resolution voltage-mode comparator were described in [74]. The proposed comparator consists of a preamplifier followed by several regenerative stages and a latch as the output stage. Although this architecture was designed for high-resolution ADCs, it has also been employed in front-end electronics. Reducing the offset voltage of each stage is a major challenge in the design of these comparators. Input offset storage (IOS) or output offset storage (OOS) can be used to reduce the offset voltages. In [75], an auto-zeroed high-speed comparator using both IOS and OOS was proposed for a monolithic active pixel sensor. The output stage of these comparators is realized by a high-speed latch using a positive feedback loop. However, for a continuous-time discriminator, the last stage should be replaced by a current amplifier, as reported in [64]. Since the detector output is a current pulse, current-mode comparators [76, 77, 78, 79] are also preferred in front-end electronics. In these contributions, a positive feedback mechanism is used to enhance the speed of the current comparators.

For photodetectors, the amplitude of the output current signals and the slope of their falling edges depend on the amount of injected charge. The different slopes of the falling edges for different amounts of injected charge introduce a small timing error, defined as "Time Walk" and shown in Figure 2.21. Time walk is an intrinsic error that can be measured and corrected by off-line programs. Alternatively, it can be corrected on-line by using a constant fraction discriminator (CFD) [80, 81, 82].


Figure 2.21: "Time Walk" due to the variable slope of the current pulse.

Figure 2.22: A simple model of CFD circuits.

A simple model of the CFD circuit is shown in Figure 2.22. The circuit is composed of a delay element, a fraction circuit, a summing circuit and a zero-crossing discriminator. The fraction circuit attenuates and inverts the input signal, while the delay element delays it. The outputs of the fraction circuit and the delay element are summed to produce a bipolar signal. The instant at which this signal crosses zero from the negative lobe to the positive lobe is independent of the amplitude of the input signal. Thus, by using a zero-crossing discriminator, time marks without 'time walk' can be obtained.
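The amplitude independence of the CFD zero crossing can be checked numerically. The following Python sketch is only an illustration under stated assumptions: the pulse shape, its 10 ns peaking time, the fraction of 0.3, the delay of 4 ns and the 0.1 threshold are hypothetical values chosen for the example, and a positive-going pulse is used for simplicity. It compares a fixed-threshold (leading-edge) discriminator with the delay, fraction and sum operations described above.

```python
import math

def pulse(t, amplitude, tau=10.0):
    """Unipolar test pulse amplitude*(t/tau)*exp(1 - t/tau); peaks at t = tau (ns)."""
    if t <= 0.0:
        return 0.0
    return amplitude * (t / tau) * math.exp(1.0 - t / tau)

def first_rising_crossing(signal, level, t_stop=50.0, dt=0.001):
    """First time (ns) at which signal(t) rises through `level` (linear interpolation)."""
    steps = int(round(t_stop / dt))
    prev = signal(0.0)
    for i in range(1, steps + 1):
        cur = signal(i * dt)
        if prev < level <= cur:
            return (i - 1) * dt + dt * (level - prev) / (cur - prev)
        prev = cur
    return None

def leading_edge_time(amplitude, threshold=0.1):
    """Fixed-threshold discriminator: the trigger time depends on the amplitude."""
    return first_rising_crossing(lambda t: pulse(t, amplitude), threshold)

def cfd_time(amplitude, fraction=0.3, delay=4.0):
    """CFD model: delayed pulse plus inverted, attenuated pulse; trigger on zero crossing."""
    def shaped(t):
        return pulse(t - delay, amplitude) - fraction * pulse(t, amplitude)
    return first_rising_crossing(shaped, 0.0)

for a in (1.0, 3.0, 10.0):
    print(f"A={a:5.1f}  leading edge: {leading_edge_time(a):.3f} ns  CFD: {cfd_time(a):.3f} ns")
```

In this model the leading-edge trigger moves by several hundred picoseconds when the amplitude changes by a factor of ten, whereas the CFD trigger time stays the same, because both terms of the summed signal scale with the amplitude.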


2.8 Time-to-digital conversion

A TDC is an essential electronic block that quantizes a small time difference between two signals (defined as "Start" and "Stop") and provides a digital representation of this time interval. The function of a TDC is similar to that of an ADC, except that the TDC processes a time difference rather than a voltage or current difference, as shown in Figure 2.23(a). The measured time is defined as the phase difference between the positive edges of Start and Stop (Figure 2.23(b)). Figure 2.23(c) shows the transfer characteristic of a 3-bit TDC. The input is a continuous time interval and the outputs are digital codes. Because of mismatches and noise, the real transfer curve deviates from the ideal curve and produces conversion errors.

Figure 2.23: Basis of time-to-digital conversion. (a) TDC block symbol; (b) Time interval of the Start and Stop signals; (c) Transfer curve of a 3-bit TDC.

The relationship between the measured time and the output digital codes is given by

$$T_{in} = T_{LSB} \cdot \sum_{k=0}^{n-1} D_k \cdot 2^k \qquad (2.29)$$

where $T_{in}$ is the measured time interval between Start and Stop, $T_{LSB}$ is the minimum unit of the time measurement, $n$ is the number of bits and $D_k$ are the digital codes of the TDC outputs.
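As a minimal sketch of Equation 2.29, the following Python functions convert between a time interval and the output bits of an ideal TDC. The function names and the value of TLSB (100 ps) are assumptions made for illustration only; times are expressed as integer picoseconds.

```python
def tdc_code_to_time(bits, t_lsb_ps):
    """Equation 2.29: bits[k] is D_k (0 or 1), least significant bit first."""
    return t_lsb_ps * sum(d << k for k, d in enumerate(bits))

def time_to_tdc_code(t_in_ps, t_lsb_ps, n_bits):
    """Ideal quantizer: floor(T_in / T_LSB), clipped to n bits, LSB first."""
    code = min(t_in_ps // t_lsb_ps, (1 << n_bits) - 1)
    return [(code >> k) & 1 for k in range(n_bits)]

bits = time_to_tdc_code(537, 100, 3)      # 537 ps with an assumed T_LSB of 100 ps
print(bits, tdc_code_to_time(bits, 100))  # the reconstructed time is a multiple of T_LSB
```

The difference between the input interval and the reconstructed time is the quantization error, which is always smaller than TLSB for an ideal converter within its range.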

Since the operation of a TDC is similar to that of an ADC, the figures of merit of a TDC can be defined in the same way as those of an ADC. The number of output bits, the dynamic measurement range, the nonlinearity and the conversion speed are important characteristics for evaluating a TDC. In addition, power dissipation, dead time or hit rate, and single-shot precision should be considered in a TDC design. The precision of a TDC is evaluated by its minimum quantization unit TLSB and its resolution σ.

Integrated TDCs are newer electronic components than ADCs. Since the 1990s, diverse TDC architectures have been explored. Useful references on TDC design are summarized in Table 2.3.


Table 2.3: Several contributions on time-to-digital converters (TDC)


| No. | Title | Author | Source | Year | Ref. |
|-----|-------|--------|--------|------|------|
| 1 | Design and characterization of CMOS high-resolution time-to-digital converters | M. Mota | PhD thesis, Univ. Tech. de Lisboa | 2000 | [83] |
| 2 | An integrated CMOS high-precision time-to-digital converter based on stabilised three-stage delay line interpolation | A. Mantyniemi | PhD thesis, Univ. of Oulu | 2004 | [84] |
| 3 | Noise shaping techniques for analog and time to digital converter using voltage controlled oscillator | M. Straayer | PhD thesis, MIT | 2008 | [85] |
| 4 | Time-to-digital converter | S. Henzler | Springer | 2010 | [86] |

An overview of TDC architectures is given below. Generally, the existing TDCs can be divided into four generations: analog TDCs, counter-based TDCs, sub-gate TDCs and sub-picosecond TDCs.

Analog TDC

The first class is the analog TDC, which is based on current integration combined with a high-resolution ADC. As shown in Figure 2.24, the architecture of the analog TDC consists of a time-to-amplitude converter (TAC) and a high-resolution high-speed ADC [87, 88]. The TAC is generally implemented by a current-integration circuit consisting of a charge pump and a capacitor. A sample-and-hold circuit is required to provide a stable voltage signal. A high-resolution ADC digitizes this sampled voltage into binary codes. Analog TDCs can achieve high-resolution time measurements. However, their applications are limited because the TAC and the ADC are mainly analog circuits, so that their implementation in deep sub-micrometer processes is complicated.

Counter-based TDC

The counter-based TDC is probably the oldest and simplest scheme for time measurement. The measured time equals the counted number multiplied by the clock period. Nowadays, counter-based TDCs are widely used because they offer a wide measurement range and are easy to design in several technologies, such as CMOS/BiCMOS processes, field-programmable gate arrays (FPGA) [89], and GaAs superconductive process [90].


Figure 2.24: Architecture of a TDC using current integration and analog-to-digital conversion.

The most commonly used counter-based TDCs are driven by a reference clock and reset by the Start signal. The outputs of the counter are sampled by the Stop signal, and the sampled data are then stored in registers. Counter-based TDCs usually suffer from the metastability of the D flip-flops in the counter. In addition, the conversion precision is limited by clock jitter and electrical noise. However, the architecture can be optimized by using a Gray-code counter or a dual-counter architecture. In [91, 83], a dual-counter TDC was introduced to overcome the problems due to metastability. The architecture and the timing are shown in Figure 2.25.

Figure 2.25: The TDC using dual counters to overcome the metastability of the D flip-flops in the digital counters [83].
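The basic counting principle (measured time equals count multiplied by clock period) can be sketched as follows. This is a simplified behavioural model, not the dual-counter circuit of [83]: it assumes ideal rising clock edges at integer multiples of the clock period and ignores metastability and jitter. The function name and the numerical values (a 100 MHz clock, times in picoseconds) are hypothetical.

```python
def counter_tdc(t_start_ps, t_stop_ps, t_clk_ps):
    """Count the rising clock edges (at multiples of t_clk_ps) after Start, up to Stop."""
    count = t_stop_ps // t_clk_ps - t_start_ps // t_clk_ps
    return count, count * t_clk_ps

count, measured = counter_tdc(2_500, 47_300, 10_000)   # assumed 100 MHz reference clock
true_interval = 47_300 - 2_500
print(count, measured, true_interval - measured)       # 4 40000 4800
```

The residual of 4800 ps in this example illustrates why a plain counter is limited to a resolution of one clock period, and why interpolation techniques are needed for finer bins.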

DLL-based TDC

Digital gates have a fixed delay which can be used for time measurements. However, the delay of circuits built from digital gates cannot be controlled. Besides, the unit gate delay is limited by the fabrication technology. One way to obtain a controlled sub-gate delay is to develop analog delay cells, such as current-starved delay cells [92] and differential delay cells, and use them to construct a voltage-controlled delay line (VCDL). The VCDL can be embedded in a delay-locked loop (DLL) and easily generates multiphase delayed clocks within one clock period [93]. The architecture of the TDC using a single DLL is shown in Figure 2.26. The resolution of this TDC is determined by the minimum delay of the delay cell and the jitter performance of the DLL. To achieve a smaller bin size, a higher clock frequency or time interpolation should be used.

Figure 2.26: Architecture of the TDC based on a single DLL [83].

Time interpolation using an array of DLLs is one of the most effective methods to improve the resolution [91, 83]. Two kinds of DLLs with different delay elements are used to construct the array. The bin size depends on the difference between the delays of the delay cells in the two kinds of DLLs. Smaller time taps can be obtained at the cost of a larger die area. An unfortunate feature of the TDC based on the DLL array is that the array scheme cannot produce a number of clock phases that is a power of 2. This results in digital outputs with pseudo-binary codes. However, the measured results can easily be processed by off-line programs.

Unlike the TDC using a DLL array, another method of improving the resolution is to use multiple sampling signals obtained as delayed versions of the hit signal. These sampling signals can be generated by a resistor-capacitor (RC) delay line [94].

The Vernier delay line (VDL), whose principle originates from the Vernier ruler, is also a good solution for improving the resolution [95]. Two DLLs are required to generate the precise reference delays. The key point is that the difference between the delays of the cells in the two delay lines should be exactly equal to the clock period divided by the number of delay cells. In practice, the sampling process is equivalent to flash sampling. The bin size of the VDL-based TDC is determined by the difference between the delays of the cells in the two delay lines.
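The Vernier principle can be illustrated with a small behavioural model. This sketch follows the generic textbook picture of a Vernier delay line, which is an assumption here and not necessarily the circuit of [95]: Start propagates through a slow line, Stop through a fast line, and the output is the stage at which Stop catches up with Start. The cell delays of 110 ps and 100 ps and the number of stages are hypothetical.

```python
def vernier_tdc(t_in_ps, tau_slow_ps, tau_fast_ps, n_stages):
    """Return (stage, measured time) where Stop catches up with Start, or (None, None)."""
    bin_size = tau_slow_ps - tau_fast_ps              # resolution = delay difference
    for stage in range(1, n_stages + 1):
        start_arrival = stage * tau_slow_ps           # Start launched at t = 0
        stop_arrival = t_in_ps + stage * tau_fast_ps  # Stop launched at t = t_in
        if stop_arrival <= start_arrival:
            return stage, stage * bin_size
    return None, None                                 # interval exceeds the line range

print(vernier_tdc(37, 110, 100, 16))   # (4, 40): 37 ps resolved with a 10 ps bin
```

Although each cell delay is about 100 ps in this example, the effective bin size is the 10 ps difference between the two lines, at the cost of a measurement range limited to the number of stages multiplied by the bin size.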

Cyclic TDC

A cyclic TDC using a pulse-shrinking delay line [96, 97] is a low-power TDC which can achieve a resolution of a few picoseconds with good linearity. This TDC exploits the inhomogeneity of the gates in a cyclic delay line to implement the pulse-shrinking mechanism. The architecture is shown in Figure 2.27. Since it consists of purely digital circuits, the cyclic TDC finds applications in many fields. In particular, it is well suited for use as a digital phase detector [98] in an all-digital PLL. Moreover, the cyclic TDC can be implemented not only in standard CMOS technology but also in an FPGA. Furthermore, its low power dissipation is attractive for low-power portable devices. The drawback is that the precision is limited by the mismatches of the gates.

Figure 2.27: Architecture of a cyclic TDC.

GRO TDCs

Gated-ring-oscillator (GRO) TDCs [99] are a novel time-measurement technique. The GRO TDC is similar to the oscillator-based TDC [100], which uses the multiple outputs of the oscillator for phase measurements. However, a GRO TDC allows the oscillator to make phase transitions only during the interval being measured. The outputs of the gated ring oscillator can be used as clocks that drive counters. The architecture and principle are shown in Figure 2.28.

Figure 2.28: Architecture of GRO TDC.

Since the resolution of the GRO TDC is independent of the mismatches of the inverters, a precision of 100 fs or less can be achieved. Moreover, GRO TDCs are realized with digital circuits, which are well suited to technology scaling and also lead to low static power dissipation. However, the gated ring oscillator may fail to oscillate when the measurement interval is enabled. Thus, the design of the gated ring oscillator is an important issue.


Time-amplifier-based TDCs

The concept of time amplification (TA) is an effective solution for resolution enhancement. The operation is similar to that of a voltage amplifier, except that the TA takes a tiny time difference as input and outputs a larger time difference. In [101], a TA-based TDC is proposed which achieves a resolution of 1.25 ps. The schematic of a time amplifier and the architecture of the TA-based TDC are shown in Figure 2.29. The design of time amplifiers is a new research direction in TDC techniques. However, these techniques are still immature and require further improvement.

Figure 2.29: Time amplifier and DLL-based TDC. (a) Schematic of a time amplifier; (b) Architecture of the TA-based TDC.

2.9 Conclusions

This chapter has presented a survey of the front-end electronics of photodetectors. First, an overview of front-end electronic systems was given. A modern front-end system is divided into several steps: photo-electric conversion, signal acquisition, pulse height analysis, peak-detect-and-hold operation, and signal digitizing. The signal processing and the CMOS circuits for each step were discussed in the subsequent sections.

The next chapter will discuss the design of the front-end analog processing circuits, which include the RGC preamplifier, the shaper, the analog memory, the discriminator and the monostable circuit.

Chapter 3

Design of Front-End Analog Signal Processing Circuits

The most widely used front-end readout circuits are based on a voltage-mode cascode charge-sensitive amplifier (CSA), which can achieve high gain, low noise and large bandwidth. However, the voltage-mode CSA suffers from a large input capacitance when the low-noise design requires an input transistor with a large W/L ratio. As a result, the readout speed is greatly limited. Moreover, CMOS technology scaling introduces several design challenges, such as reduced voltage headroom and limited intrinsic transistor gain. To overcome these issues, current-mode amplifiers are preferred. As a solution, this work proposes a regulated cascode (RGC) transimpedance amplifier as the front-end preamplifier, which can obtain a -3 dB frequency larger than 3 GHz. Since it processes current signals, the RGC amplifier can easily be migrated to advanced technologies.

This chapter presents the design of front-end analog signal processing circuits based on the RGC preamplifier and a current-mode discriminator. Both the RGC preamplifier and the discriminator are optimized for the proposed small-animal PET imaging system. The schematics of the blocks are discussed in detail. Moreover, the experimental results and an analysis of the performance are given.

3.1 Specifications and architectures

A multi-channel front-end readout ASIC is necessary for the data acquisition of the proposed PET imaging system. This chip performs the charge detection, which is very important for the whole system. Because the quality of the reconstructed images is directly affected by the performance of the chip, the characteristics of the chip will be carefully examined and discussed.

In order to fulfill the requirements of the PET system, a 64-channel front-end readout chip has to meet several important challenges:

• A large dynamic range must be achieved for the measurement of charges varying from a few fC to 104 pC. This range corresponds to the variation of the input signal induced by the different positions of the scintillation along the axial extent of the crystal.

• The nonlinearity is less than 3 % and the signal-to-noise ratio (SNR) is 40 dB. Two operating modes are provided to improve the linearity. One is the "calibration mode", where the input range is from a few fC to 480 fC (corresponding to a fraction of a photoelectron up to 3 pe-). The other is the "acquisition mode", where the input covers the full dynamic range, from 2.4 pC to 104 pC (corresponding to 15 pe- to 650 pe-).

• The input impedance is low (180 Ω) and the CR-RC shaping time is 300 ns.

• The threshold of the comparator for trigger generation is low enough to ensure a trigger below the photoelectron charge, while the trigger jitter is less than 0.1 ns.
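The charge values and photoelectron counts quoted in the list above can be cross-checked with a short calculation. The sketch below is an illustrative consistency check only; the variable names are hypothetical, and the detector gain it prints is merely the value implied by the quoted numbers, not a figure stated in the specification.

```python
E_CHARGE_C = 1.602176634e-19  # elementary charge in coulombs

# (photoelectrons, charge in fC) pairs quoted in the specification list
SPEC_POINTS = [(3, 480.0), (15, 2400.0), (650, 104000.0)]

def charge_per_pe_fc(n_pe, charge_fc):
    """Charge delivered per photoelectron, in fC."""
    return charge_fc / n_pe

per_pe = [charge_per_pe_fc(n, q) for n, q in SPEC_POINTS]
implied_gain = per_pe[0] * 1e-15 / E_CHARGE_C   # electrons per photoelectron

print(per_pe)                                   # the three pairs give the same value
print(f"implied detector gain ~ {implied_gain:.3g}")
```

All three pairs correspond to 160 fC per photoelectron, so the quoted ranges are mutually consistent; this charge corresponds to a multiplication of roughly one million electrons per photoelectron.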

The schematic of one channel of the front-end readout chip [8, 102] is shown in Figure 3.1. A preamplifier is directly bonded to the output of each pixel of the photodetector. The weak current pulse generated by the detector is first amplified by the preamplifier. The output of the preamplifier is split and transmitted to a so-called "energy channel" for the energy measurement and a so-called "timing channel" for the trigger.

Figure 3.1: Proposed one-channel schematic of the front-end ASIC [8, 102].

In the energy channel, a gain-adjustment stage is used to adjust the amplification gain precisely, in order to compensate for the gain dispersion of the photodetector's anodes. Moreover, the output current has to be compensated for the leakage current of the photodetector. The adjusted current signal is then integrated by an RC cell and shaped by a second-order CR-RC shaper. The output of the shaper is stored in a sample-and-hold circuit implemented with MOS switches and capacitors. This operation allows a continuous readout without any dead time in the acquisition mode. The differential architecture of the slow shaper is designed to optimize the signal-to-noise ratio (SNR).

The timing channel that generates the precise triggers consists of a current com-

parator followed by a monostable circuit. The input current is compared to a reference

current whose value is equivalent to the energy level of 53 fC. The monostable circuit

provides the "holding time", which corresponds to the time needed to sample the peak

value of the shaped pulse into the analogue memory.


3.2 Circuit descriptions

3.2.1 Preamplifier with the variable gain stage

The charge-sensitive preamplifier plays a critical role in the front-end ASIC design. Generally, three kinds of preamplifiers can be chosen: common-source voltage amplifiers [103], including cascode architectures, differential amplifiers [104] and current-mode amplifiers [105]. Voltage amplifiers can achieve the low-noise performance required by capacitive detectors by using a large input transistor. However, the large input capacitance decreases the timing sensitivity. Moreover, the output dynamic range is limited by the output transistors. The differential amplifier offers better matching of the input differential pair and can be used with large detector capacitances (for example, 80 pF [104]). Nevertheless, differential signal processing is more complicated than that in a single-ended voltage amplifier. The third class is the current-mode preamplifier, whose attractive features are large bandwidth and low input impedance. However, the current amplifier has a low gain and usually needs a gain-enhancement stage.

In our case, the preamplifier deals with weak current pulses from MCP PMTs, so several significant performance parameters, such as low input impedance, large dynamic range and large bandwidth, should be considered. Moreover, the time-stamping precision of 1 ns should be satisfied. These requirements mean that the preamplifier should have a fast response with a small propagation delay. Owing to the sufficiently large gain of the MCP PMTs, it is not necessary to use a low-noise CSA. As a result, a current-mode amplifier using the regulated cascode (RGC) architecture is proposed as our solution. The basic RGC transimpedance amplifier is shown in Figure 3.2(a).

Figure 3.2: Regulated cascode transimpedance amplifier. (a) The schematic; (b) Small-signal analysis model, where the body effect is neglected.

With a feedback loop composed of the transistors M1 and M2, an extremely low input impedance can be obtained. The value of the input impedance is given by

$$Z_{in} \approx \frac{1}{g_{m1}(1 + g_{m2}R_2)} \qquad (3.1)$$

where $g_{m1}$ and $g_{m2}$ are the transconductances of M1 and M2, respectively. This value is $1 + g_{m2}R_2$ times smaller than that of a common-gate amplifier. With this kind of architecture, the crosstalk between channels is minimized.

The output impedance is given by

$$Z_{out} \approx \left[ R_s + R_{ds1} - g_{m1}g_{m2}(R_2 \parallel R_{ds2})R_sR_{ds1} \right] \parallel R_1 \qquad (3.2)$$

where $R_{ds1}$ and $R_{ds2}$ are the equivalent resistances of M1 and M2, respectively. The output impedance mainly depends on R1, R2 and Rs. To obtain a large output impedance, large R1 and Rs should be employed, which would lead to a large die size. In our design, R1 and Rs are therefore realized by current mirrors. In addition, the output of the TIA is changed to current mode. The output current can easily be adjusted by a programmable circuit such as a current-steering digital-to-analog converter (DAC). Meanwhile, this current signal can be discriminated by a current-mode comparator without any pulse-shaping operation.

Figure 3.3: One-channel schematic of the front-end ASIC.

Figure 3.3 shows the preamplifier architecture with the variable gain stage. The preamplifier comprises the transistors MN1∼MN4 and MP5. The gain of the RGC amplifier is given by Equation 3.3.

$$A_{RGC} = \frac{\partial I_{out}}{\partial I_{in}} \approx -\frac{g_{m1}(1 + g_{m2}R_1)\,r_{ds1}r_{ds3}}{1/g_{m5} + r_{ds1} + g_{m1}(1 + g_{m2}R_1)\,r_{ds1}r_{ds3}} \qquad (3.3)$$

where $g_{m1}$, $g_{m2}$ and $g_{m5}$ are the transconductances of the transistors MN1, MN2 and MP5, respectively, and $r_{ds1}$ and $r_{ds3}$ are the output resistances of MN1 and MN3 due to channel-length modulation, respectively. The absolute value of $A_{RGC}$ is less than 1. Thus, a gain-adjustment circuit is required. The variable gain stage is realized by the current mirrors M61∼M66, which are switched and controlled by a 6-bit register. The dimensions of M61∼M66 increase successively by a factor of two. This solution has been chosen for its simplicity and the very good accuracy of the current gains. Although few transistors are employed, the challenge of this design is the layout mismatch. The variable gain stage can also be realized by a 6-bit current-steering DAC.

The current-to-voltage conversion is performed in the last stage. The current is integrated on R2 and C2. The transfer function from the variable gain stage to the output of the integrator is given by Equation 3.4.

$$\frac{\partial V_{shpr}}{\partial I_{out}} \approx \frac{(W/L)_6}{(W/L)_5} \cdot \frac{(W/L)_7}{(W/L)_8} \cdot \sum_{i=0}^{5} 2^{i} D[i] \cdot \frac{R_2}{1 + sR_2C_2} \qquad (3.4)$$

where $s$ is the complex frequency. This equation shows that the gain of the amplifier is determined by the dimensions of MP5, M6, MN7 and MN8, and by the register value that controls the variable gain. The values of R2 and C2 result from a tradeoff between integration accuracy and decay time.

In our design, a very low input impedance of 180 Ω with a bandwidth of 500 MHz has been obtained. A tradeoff between the dimensions of the transistors M1∼M5 and the value of the resistor R1 has also been made in order to maximize the dynamic range, to optimize the noise performance and to maintain good feedback loop stability.

A tradeoff has been made between accuracy and integration time. The parameters of the RC circuit are 100 kΩ for the resistance R and 10 pF for the capacitance C. R and C were dimensioned to ensure a small nonlinearity (< 2 % for an adjustable gain of 1) independently of the input signal amplitude.

3.2.2 CR-RC shaper

The integrated signal from the preamplifier is shaped by a slow shaper. In the frequency domain, this kind of shaper can be considered as an active band-pass filter which increases the SNR. In this study, a second-order semi-Gaussian (CR-RC) shaper is chosen. The schematic is shown in Figure 3.4.

The circuit consists of passive resistors and capacitors with a high-performance operational amplifier. Since the time window is 10 µs, the definition of the shaping time is critical for the front-end readout circuits. In our case, the theoretical value of the shaping time is chosen as 300 ns. In order to deal with mismatches, the values of R1, C1, R2 and C2 are selected as 100 kΩ, 3 pF, 300 kΩ and 1 pF, respectively.
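The relation between the selected component values and the 300 ns shaping time can be verified with a short calculation. The sketch below assumes an ideal, buffered CR-RC network with equal time constants, whose unit-step response is (t/τ)·exp(−t/τ); this idealization, and the function names, are assumptions made for illustration and ignore the finite gain and bandwidth of the OPAMP.

```python
import math

def time_constant_ns(r_ohm, c_farad):
    """RC time constant expressed in nanoseconds."""
    return r_ohm * c_farad * 1e9

def cr_rc_step_response(t_ns, tau_ns):
    """Ideal CR-RC response to a unit step for equal time constants tau."""
    if t_ns < 0:
        return 0.0
    return (t_ns / tau_ns) * math.exp(-t_ns / tau_ns)

tau1 = time_constant_ns(100e3, 3e-12)   # R1 * C1
tau2 = time_constant_ns(300e3, 1e-12)   # R2 * C2

# Search the peaking time on a 1 ns grid over the first 2 µs
peak_time_ns = max(range(0, 2001), key=lambda t: cr_rc_step_response(t, tau1))
print(round(tau1, 3), round(tau2, 3), peak_time_ns)
```

Both RC products equal 300 ns, and in this ideal model the shaped pulse peaks one time constant after the step, at 1/e (about 37 %) of the step amplitude.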

Moreover, the shaping time is affected by the performance of the OPAMP. First, as discussed previously, the gain of the OPAMP should be large enough that the OPAMP can be treated as an ideal circuit. Second, the bandwidth should be sufficient for the shaped signals. Last, the output dynamic range should be large enough that the shaped signal is not distorted. As a solution, a two-stage OPAMP with a cascode PMOS differential pair is proposed. The schematic is shown in Figure 3.5. In this architecture, the PMOS differential pair with a current-mirror load is employed to obtain low noise. MP5 and MP7 are added to the differential pair to achieve both high gain and large bandwidth. The slow shaper is characterized by a gain-bandwidth product (GBW) of 60 MHz.

Figure 3.4: Schematic of the proposed shaper circuit.

Figure 3.5: Schematic of the proposed high-performance two-stage OPAMP.

Another advantage of the proposed OPAMP is that the bias currents are supplied by separate circuits, so that the noise performance can be optimized. The bias circuit of the OPAMP is shown in Figure 3.6.

3.2.3 Time-stamp circuits

Architecture

With the current-mode preamplifier, the time stamp can be obtained by a high-speed current-mode comparator directly from the output of the RGC amplifier (Icmp in Figure 3.3), unlike other designs that acquire the time information from the output of a fast shaper. The output current of the preamplifier is compared to a programmable reference current (Iref). A trigger is generated by the comparison. This triggering pulse is then sent to a wide-range high-resolution TDC for measurement and digitization. Besides, the trigger is used for sampling the peak voltage of the shaped signal. This time-stamp method is shown in Figure 3.7. In this design, a current-steering DAC controlled by JTAG registers is used to generate the reference current.

Figure 3.6: Schematic of the bias circuits for the proposed OPAMP.

Figure 3.7: Proposed time-stamp method for PET imaging.

High-speed current comparator

The delay of the comparator introduces an error in the actual trigger instant. As a solution, a current-mode architecture with a very high slew rate is selected. The design idea originates from earlier contributions on current comparators, which are shown in Figure 3.8. Using a positive-feedback loop to enhance the circuit speed is routine in analog design. However, exchanging the positions of the NMOS and PMOS transistors at the input node is a creative idea. As shown in Figure 3.8(a), H. Traff proposed a novel high-speed current comparator that uses this method to accelerate the comparison [76]. The drawback of this circuit is that the existence of a dead zone limits the performance, in particular for very low input currents. To overcome this problem, L. Ravezzi et al. proposed a new architecture [78], as shown in Figure 3.8(b). By adding two transistors, MP5 and MN6, between MP3 and MN4, the gate voltages of MP1 and MN2 are adjusted. Thus, these two transistors are driven further into saturation, so that the comparator is more sensitive to the input current. The two diode-connected transistors in Figure 3.8(b) need very wide channels in order to provide the current that charges and discharges the output node and to avoid an excessive voltage drop across the two diode-connected transistors due to the body effect [79]. To optimize the performance, the diode realized by a PMOS transistor was removed by H. Lin [79]. Meanwhile, a fast inverter was proposed by adding two controlled transistors, MP8 and MN11, in Figure 3.8(c). With this improvement, input currents from several µA down to several nA can be processed. In addition, the delay of the comparator is on the order of nanoseconds.

Figure 3.8: Design of high-speed weak current comparator. (a) Circuit of the positive-feedback high-speed current comparator proposed by H. Traff in 1992 [76]; (b) Modified high-speed current comparator using two diode transistors [78]; (c) Optimized circuit for high-speed low-current comparator using one diode transistor and fast inverters [79].

In our study, since the input current has to be compared to a reference current, a dual-input circuit is added. In addition, the slew rate should be enhanced to obtain smaller delay errors. Moreover, since Iin is a negative current, an additional inverter is used to adjust the final phase of the acquired pulse. The proposed high-speed low-current comparator is shown in Figure 3.9.

Figure 3.9: Proposed high-speed current comparator as a discriminator for time-stamp circuits.

The subtraction is performed in the first step. The difference between the input current and the reference current is given by Equation 3.5.

$$\Delta I_{in} = \frac{(W/L)_3}{(W/L)_0} \cdot I_{in} - \frac{(W/L)_2}{(W/L)_1} \cdot I_{ref} \qquad (3.5)$$

where $I_{in}$ and $I_{ref}$ are the input current and the reference current, respectively, and $(W/L)_0$ ∼ $(W/L)_3$ are the width-to-length ratios of MP0, MN1, MN2 and MP3, respectively. The advantage of the input stage is that small differences between $I_{in}$ and $I_{ref}$ can be amplified. If $\frac{(W/L)_3}{(W/L)_0} = \frac{(W/L)_2}{(W/L)_1} = \alpha$, Equation 3.5 can be rewritten as

$$\Delta I_{in} = \alpha (I_{in} - I_{ref}) \qquad (3.6)$$

where $\alpha > 0$. Besides, the input impedance seen by $\Delta I_{in}$ is

$$R_{in} = \frac{1}{g_{m4} + g_{m5}} \qquad (3.7)$$

where gm4 and gm5 are the transconductances of MN4 and MP5. The value of this impedance can be very large. Thus, V1 ∼ V3 are very sensitive to a weak ∆Iin, and hence to the input current. When ∆Iin flows into the circuit, if V1 stays constant, V2 decreases and V3 increases. With the positive feedback, MP8 is quickly turned on while MN9 is cut off. Thus, V5 goes HIGH and V6 goes LOW. Finally, the output of the comparator is HIGH. When ∆Iin flows out of the circuit, the opposite occurs. In our case, since the input current is an analog pulse, the output of the comparator is a square pulse. Due to its low input impedance and very high gain, this structure achieves the required high-speed response of less than 1 ns. Nevertheless, the pulses generated by the afterglow of the LYSO scintillating crystal are also detected because of the comparator's low current threshold. An autocorrelation circuit has been added to avoid random coincidences, which are sources of errors. This circuit consists of delay cells with delays varying from 6 to 24 ns and an AND gate. It suppresses these parasitic pulses and provides a time trigger for effective signals only.
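The pulse-rejection principle can be illustrated with a small behavioral sketch. It assumes, as one plausible reading of the description above, that the comparator output is ANDed with a delayed copy of itself, so that only pulses longer than the delay produce a trigger. The 1 ns sampling step, the function name `delayed_and` and the pulse widths are illustrative assumptions, not values taken from the chip.

```python
def delayed_and(signal, delay):
    """AND a sampled logic signal with a copy of itself delayed by `delay` samples."""
    delayed = ([0] * delay + signal)[:len(signal)]
    return [a & b for a, b in zip(signal, delayed)]


# Hypothetical comparator output sampled every 1 ns:
# a 3 ns parasitic pulse followed by a 30 ns effective pulse.
comparator_out = [0] * 5 + [1] * 3 + [0] * 20 + [1] * 30 + [0] * 10

trigger = delayed_and(comparator_out, delay=6)  # assumed 6 ns delay cell
print("samples with trigger high:", sum(trigger))
```

In this model the short pulse never overlaps its delayed copy and is therefore rejected, while the long pulse yields a trigger that starts one delay after its leading edge.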

Current-Steering DAC

The theoretical Iref is equivalent to a fraction of one photoelectron. However, it is better to make this value adjustable in the design. For our applications, an 8-bit current-steering DAC configured by the JTAG controller is embedded in the circuit, so that the value can be adjusted easily. The operational principle is depicted in Figure 3.10.

Figure 3.10: Principle of the binary-weighted current-steering DAC.

This class of DAC can achieve high speed and good linearity. The switches are controlled by the input digital bits. If a bit is "1", the switch is ON so that the corresponding current is selected. Otherwise, the current in the branch equals zero. The currents of all branches are summed and sent to a high-impedance current mirror. If the gain of the current mirror is unity, the output current can be written as

$$I_{out} = \sum_{i=0}^{7} D_i \cdot 2^i \cdot I_0 \tag{3.8}$$

where D0 ∼ D7 are the input digital codes and I0 is the reference current.
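As a quick check of Equation 3.8, the sum can be evaluated bit by bit. The sketch below is only a numerical illustration: the function name and the 10 nA unit current are assumptions, not design values.

```python
def dac_output_current(code, i0):
    """Evaluate Equation 3.8 for an 8-bit code and a unit current i0 (amperes)."""
    if not 0 <= code <= 255:
        raise ValueError("code must fit in 8 bits")
    bits = [(code >> i) & 1 for i in range(8)]  # D0 ... D7
    return sum(d * 2 ** i * i0 for i, d in enumerate(bits))


i0 = 10e-9  # hypothetical unit current of 10 nA
for code in (0, 1, 128, 255):
    print(code, dac_output_current(code, i0))
```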

The key technique of the current-steering DAC is the generation of the reference current. The currents in the different branches should be matched to improve the linearity of the conversion. Generally, the reference current (I0) is generated by a replica-based circuit connected to a bandgap voltage. To obtain better performance, the current branches are arranged as a current array [106]. The size of the array is 16 × 16 for an 8-bit DAC. The architecture is shown in Figure 3.11. The input digital signal is divided into two segments. Each segment handles a 4-bit code, which is encoded into a 16-bit thermometer code. The codes from the Y encoder control the column switches while those from the X encoder control the row switches. As a result, the currents in all branches of a row contribute to the output current when this row is selected.

Figure 3.11: Architecture of high-performance current steering DAC.
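The row/column selection can be sketched as follows. This is a behavioral illustration under an assumption the text does not state explicitly: the upper 4 bits select the fully switched-on rows and the lower 4 bits select the cells of the next, partially filled row. The names `thermometer` and `active_cells` are hypothetical.

```python
def thermometer(value, width=16):
    """Thermometer code: the first `value` positions are 1, the rest 0."""
    return [1 if k < value else 0 for k in range(width)]


def active_cells(code):
    """Count the unit current cells switched on in a 16 x 16 array for an 8-bit code."""
    row_sel, col_sel = code >> 4, code & 0xF  # assumed split into two 4-bit segments
    rows, cols = thermometer(row_sel), thermometer(col_sel)
    count = 0
    for r in range(16):
        for c in range(16):
            full_row = rows[r] == 1                      # every cell of a selected row
            partial_row = r == row_sel and cols[c] == 1  # cells of the boundary row
            if full_row or partial_row:
                count += 1
    return count


print(active_cells(0x5A))  # 5 full rows plus 10 cells of the next row
```

With this arrangement the number of active unit cells equals the input code, so each code step switches exactly one additional cell.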

3.2.4 Analog memory

The function of the analog memory is to store the peak value of the shaped voltage, which is proportional to the input charge and hence to the energy. In our study, since the signals from the detectors are shaped by a CR-RC shaper with a fixed shaping time of 300 ns, the peak value can be sampled by the trigger with a monostable circuit that generates a precise delay. However, pile-up and random pulses should be considered in the design. To overcome these problems, a dual sample-and-hold circuit is proposed. The schematic of the analog memory is shown in Figure 3.12.

Two non-overlapping clocks are generated from the triggers to control the switches. Suppose the two clocks are Φ1 and Φ2. When Φ1 is high and Φ2 is low, S11 and S22 are ON. The upper capacitor is used to store the peak value while the sampled value in the lower capacitor is buffered. When Φ1 is low and Φ2 is high, the situation is reversed. The switches are realized by CMOS transistors with feedthrough cancellation. The capacitors are implemented as integrated poly-poly capacitors.

Figure 3.12: Schematic of the proposed analog memory. (Input Vshpr, output Vamo; switches S11, S12, S21, S22; storage capacitors Cs1 and Cs2; clocks Φ1 and Φ2.)

The time constant of the analog memory, which is built from switched-capacitor circuits, is critical for this method. If the switches have the same W/L ratio and Cs1 = Cs2 = Cs, it can be modeled by the following equation.

$$\tau_{am} = 2 R_{on} (C_s + C_{in,amp}) \tag{3.9}$$

where Ron is the equivalent resistance of a switch when it is ON and Cin,amp is the equivalent input capacitance of the voltage buffer. Here, τam should be small enough for the signal to be processed correctly. In the design, τam ≤ 5 ns is selected. The capacitance of Cs is 3 pF.
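Equation 3.9 can be turned around to bound the switch on-resistance. The sketch below uses the stated values τam ≤ 5 ns and Cs = 3 pF; the buffer input capacitance of 0.5 pF is a hypothetical value chosen only for illustration, since it is not given in the text.

```python
def max_switch_resistance(tau_max, c_s, c_in_amp):
    """Largest Ron (ohms) satisfying tau_am = 2 * Ron * (Cs + Cin,amp) <= tau_max."""
    return tau_max / (2.0 * (c_s + c_in_amp))


# tau_max and c_s come from the text; c_in_amp is an assumed value.
r_on_max = max_switch_resistance(tau_max=5e-9, c_s=3e-12, c_in_amp=0.5e-12)
print(f"Ron must stay below about {r_on_max:.0f} ohm")
```

Even if the buffer capacitance were neglected, the bound would only relax to roughly 830 Ω, so the switches need an on-resistance of a few hundred ohms at most.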

3.3 Experimental results and discussions

The first prototype chip with 10 analogue channels, called IMOTEPA, has been implemented in the AMS 0.35 µm CMOS process. It also includes a JTAG controller and an 8-bit bias DAC. Figure 3.13 shows a photo of the layout of this prototype. Its die area is 2.8 mm × 2.18 mm. The prototype chip has been simulated and measured. The results are described as follows.

3.3.1 Linearity measurement

The linearity of the front-end readout chip, which is directly related to the annihilation position of the photon along the axis of the crystal, is very important for PET imaging systems. We focus on linearity measurements in two different modes, called "direct-signal mode" and "hold mode". In "direct-signal mode", the trigger is not activated and the signal appears as shaped by the integrator and the shaper. In "hold mode", the trigger is activated. The peak values of the signals are held and stored in the analogue memory for readout. In addition, the chip operates with variable gain stages, which are realized by switched current mirrors. The current gain is selected by a DAC. We also measure the nonlinearity for the different gain stages.


Figure 3.13: Photo of the prototype chip (IMOTEPA).

Figure 3.14: Output transient responses of the shaper in "direct-signal mode".

Figure 3.14 shows the measured transient responses of the shaper in "direct-signal mode". The acquisition dynamic range extends from 2.4 pC to 104 pC. In this experiment, we focus on the measured shaping time and the peak values of the shaped pulses. The measured shaping time is 280 ns and is independent of the input charge. The peak values are proportional to the input charge. From this figure, the calculated gain is 13.1 mV/pC versus 11.25 mV/pC in simulation. The difference between these two values is due to layout mismatch.

The nonlinearity in "hold mode" is shown in Figure 3.15. The injected charge varies from 10 pC to 104 pC. For all channels, the nonlinearity is less than 3 %. Figure 3.16 shows the nonlinearity of channel 9 when the input is fixed and the gain code is varied. Almost the same curve is obtained for all channels. Lower nonlinearity is achieved at higher gain stages: the nonlinearity is less than 0.5 % when the gain code is greater than 4. The nonlinearity for gain codes 1 to 3 is larger due to noise and layout mismatch.

Figure 3.15: Nonlinearity of the 10 channels for injected charges from 10 to 104 pC in "hold mode".

3.3.2 "Time Walk" of the triggers

Figure 3.17 shows the time walk of all channels as a function of the injected charge. There is a scatter between channels, which can be adjusted with the thresholds of the trigger comparators. For a threshold above 3 photoelectrons (at a gain of 10^6), no noticeable time walk is observed and its value is less than 3 ns for all channels. To confirm this value, time-walk measurements with photodetector output signals are necessary.

In our case, a large input dynamic range must be handled, so there is a large "time walk" between the lowest and the highest energy levels. Consequently, the commonly used time-walk compensation techniques should be improved. Time walk is a systematic problem in front-end electronics. Some designs use CFD techniques to solve this problem. However, a CFD requires additional circuits to process the time information. Fortunately, the characteristics of the time walk can be measured experimentally. By collecting enough experimental data, a model can be found so that the time walk can be corrected in software.

Figure 3.16: Nonlinearity of the energy value versus the gain code (results from Channel 9).

Figure 3.17: Test results of "time walk" of the 10 channels.
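A software correction of this kind could look like the sketch below. The model form (delay = a + b/Q), the function names and the calibration points are all assumptions made for illustration; the calibration values are synthetic and are not measured data from the chip.

```python
def fit_time_walk(charges, delays):
    """Least-squares fit of delay = a + b / charge; returns (a, b)."""
    xs = [1.0 / q for q in charges]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(delays) / n
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, delays))
    b = s_xy / s_xx
    return mean_y - b * mean_x, b


def correct_timestamp(t_raw, charge, a, b):
    """Subtract the modeled time walk for the measured charge."""
    return t_raw - (a + b / charge)


# Synthetic calibration points (NOT measured data): charge in pC, delay in ns.
cal_charges = [10.0, 20.0, 50.0, 104.0]
cal_delays = [1.0 + 20.0 / q for q in cal_charges]

a, b = fit_time_walk(cal_charges, cal_delays)
print(correct_timestamp(103.0, 10.0, a, b))
```

In practice the model would be fitted per channel on measured data, and a different functional form may describe the real curve better.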


3.3.3 Trigger efficiency

For the designed chip, the specifications require a discriminator threshold below the charge corresponding to one photoelectron. Figure 3.18 shows the trigger efficiency for the 10 channels of the chip. The tests show that, for most of the channels, triggers are generated with a threshold below 53 fC, which corresponds to 1/3 pe- for a photodetector with a gain of 10^6.

Figure 3.18: Trigger efficiency for the 10 channels.
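The correspondence between photoelectrons and charge quoted in this chapter follows directly from the detector gain and the elementary charge. The short check below reproduces the 53 fC threshold and the 2.4 pC level for 15 pe-; only the function name is an assumption.

```python
ELEMENTARY_CHARGE = 1.602176634e-19  # coulombs


def charge_from_photoelectrons(n_pe, gain):
    """Charge (coulombs) delivered by n_pe photoelectrons at a given detector gain."""
    return n_pe * gain * ELEMENTARY_CHARGE


threshold = charge_from_photoelectrons(1 / 3, 1e6)
smallest_input = charge_from_photoelectrons(15, 1e6)
print(f"1/3 pe- -> {threshold * 1e15:.1f} fC")
print(f"15 pe-  -> {smallest_input * 1e12:.2f} pC")
```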

3.3.4 Crosstalk between channels

For all tested chips, the measured crosstalk between adjacent channels is less than 0.2 %. This value has been confirmed with real photodetector output signals. It means that the RGC architecture chosen for the preamplifier fully satisfies our requirements.

3.3.5 Noise and power dissipation

In acquisition mode, the measured output RMS noise is about 300 µV when the input signal is 2.4 pC (corresponding to 15 pe-). This value corresponds to 1.4 × 10^5 e-, which means that the preamplifier is not a low-noise design. However, the signal amplitude of the MCP detector is large as well, so the SNR is large. The measured SNR is about 40 dB. These values confirm the simulation results. The power consumption is about 15 mW/channel, in agreement with the simulation.
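These figures can be cross-checked numerically. The sketch below assumes that the output noise is referred to the input through the measured gain of 13.1 mV/pC reported in Section 3.3.1; whether this is exactly the gain used for the quoted value is an assumption.

```python
import math

ELEMENTARY_CHARGE = 1.602176634e-19  # coulombs


def equivalent_noise_electrons(v_noise, gain_v_per_c):
    """Input-referred noise in electrons for an output noise voltage and a charge gain."""
    return v_noise / gain_v_per_c / ELEMENTARY_CHARGE


def snr_db(q_in, gain_v_per_c, v_noise):
    """Signal-to-noise ratio in dB at the shaper output."""
    return 20.0 * math.log10(q_in * gain_v_per_c / v_noise)


gain = 13.1e-3 / 1e-12  # 13.1 mV/pC expressed in V/C (assumed to apply here)
enc = equivalent_noise_electrons(300e-6, gain)
snr = snr_db(2.4e-12, gain, 300e-6)
print(f"ENC = {enc:.3g} e-, SNR = {snr:.1f} dB")
```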


3.3.6 Comparison of overall performance

Table 3.1 summarizes the overall performance of the prototype chip. The measured results match the specifications well. Meanwhile, compared to other ASICs, our proposed chip can easily achieve a large dynamic range thanks to the current-mode preamplifier and discriminator. In general PET imaging systems with LYSO detectors, the gain is strongly non-uniform. Our proposed ASIC can be easily adjusted to compensate for it. Moreover, the nonlinearity is greatly improved by using the calibration mode and the crosstalk is minimized with the current-mode circuits.

Table 3.1: Comparison of the overall performance of the front-end readout ASICs

Ref.                | [28]              | [15]            | [27]      | [107]           | This work
Detector            | LSO/PMT           | LSO/PMT         | LYSO/PMT  | LYSO/APD        | LYSO/MCP
Process             | 0.5 µm, 3.3 V     | 0.5 µm, 5 V     | 0.35 µm   | 0.35 µm         | 0.35 µm
No. of channels     | 64                | 4               | N/A       | 32              | 10
Die area (mm2)      | N/A               | 20              | 2.84×3.04 | 2.85×2.85       | 2.80×2.18
DR                  | N/A               | 90 mV ∼ 980 mV  | N/A       | N/A             | few fC ∼ 104 pC
Shaping time        | 0.7 µs ∼ 0.9 µs   | 70 ns           | N/A       | 100 ns ∼ 200 ns | 280 ns
Gain                | 20 mV/1000e-      | N/A             | 32        | N/A             | 2.18 mV/1000e-
Noise               | 33e- + 7.3e-/pF   | N/A             | N/A       | 560e- + 30e-/pF | 3.5 × 10^4 e-/pF
Power diss. (mW/Ch) | 5.4               | 106.3           | 150       | 6.9             | 15
Applications        | Medical imaging   | PET front-end   | TOF PET   | PET with TOF    | Small animal PET

3.4 Conclusions

This chapter has presented the design and characterization of a new front-end readout ASIC dedicated to PET imaging systems for biomedical research. A 10-channel prototype chip has been developed to provide trigger and energy signals from the photodetectors. The triggers will be sent to a multi-channel TDC for the measurement of time stamps. Meanwhile, the energy signals will be sent to a multi-channel ADC for digitization.

The main simulation and measurement results have been presented and discussed. They show that the chip operates correctly and that the results match the specifications well. Low nonlinearity and high trigger efficiency are achieved. Moreover, the crosstalk performance shows that the preamplifier architecture is well suited to MCP detectors.

The next chapters will present time-based data converters, including time-to-digital converters and a multi-channel fast ramp ADC with TDC techniques. Since delay-locked loops are the basis of these time-based converters, design techniques for low-jitter DLLs will be discussed first.

Chapter 4

Design of Low-Jitter Multiphase Delay-Locked Loops

In recent years, time-based data converters have become a new research direction in signal processing. Time-based data converters, which use a delay time as the quantization unit, require a reference clock generator to provide precise time intervals. Both PLLs and DLLs are good solutions for this application. However, DLLs are preferred for generating multiphase clocks due to their lower jitter, unconditional stability and faster locking time.

Generally, DLLs can be divided into three classes: analog DLLs, also called charge-pump DLLs, all-digital DLLs and mixed-signal DLLs. An analog DLL is a feedback loop consisting of a delay line, a phase detector, a charge pump and a loop filter. Digital and mixed-signal DLLs have the same architecture as the analog DLL. The difference is that their internal blocks are realized by digital or mixed-signal circuits. Since they achieve the best jitter performance, analog DLLs are usually selected to generate multiphase delayed clocks. Once the architecture and the technology are chosen, the circuit techniques determine the jitter of the output clocks. For an analog DLL, the jitter mainly originates from the voltage-controlled delay line, the charge pump and the loop filter. The performance of these internal blocks therefore plays an important role in the circuit design.

This chapter is mainly dedicated to design techniques for low-jitter analog DLLs. The architecture, blocks and design criteria of charge-pump DLLs will be discussed.

4.1 Overview of DLL techniques

4.1.1 Architectures and operational principle

Analog DLL

As shown in Figure 4.1, a conventional analog DLL is mainly composed of four parts: a voltage-controlled delay line (VCDL), a phase detector, a charge pump, and a loop filter. Clk_ref is delayed in the VCDL, whose last output is Clk_out. The phase difference between Clk_ref and Clk_out is detected by the phase detector. For simplicity, we assume that the phase detector detects the difference between rising edges and that the delay time of the VCDL decreases when the controlled voltage increases. If the rising edge of Clk_ref is detected earlier than that of Clk_out, Up goes "High" and Down is fixed to "Low". The upper switch in the charge pump turns ON while the lower one simultaneously turns OFF. Thus, the capacitor in the loop filter is charged and the controlled voltage increases, so the delay time of the VCDL decreases. The rising edge of Clk_out moves closer to that of Clk_ref, which means that the phase difference between Clk_ref and Clk_out decreases. Since the reference clock runs continuously, this phase difference theoretically decreases to zero. The phases of Clk_ref and Clk_out are finally locked.

Figure 4.1: The architecture of a charge-pump delay locked loop.

On the contrary, if the rising edge of Clk_ref is detected later than that of Clk_out, Down goes "High" and Up is fixed to "Low". The capacitor in the loop filter is discharged and the controlled voltage decreases. Thus, the delay time of the VCDL increases. The rising edge of Clk_out again moves closer to that of Clk_ref.

If no difference between Clk_ref and Clk_out is detected, both Up and Down are fixed to "Low". Both switches in the charge pump are OFF. As a result, the capacitor is neither charged nor discharged, so the controlled voltage keeps a constant value.

The delay time of the delay line varies with the controlled voltage. The direction of the variation depends on the delay characteristic of the delay cell. If the slope of the delay time versus the controlled voltage is negative, the delay time of the VCDL decreases when the controlled voltage increases. Otherwise, the delay time of the VCDL increases. In general, the polarities of the VCDL and the phase detector must be matched so that the loop forms a negative feedback.
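The locking process described above can be reproduced with a simple cycle-by-cycle model. The sketch assumes a linear VCDL (delay = d0 − K·Vc), a target delay of one reference period, and Up/Down pulses whose width equals the timing error; all numerical values are hypothetical and serve only to show the convergence.

```python
def simulate_charge_pump_dll(t_ref, d0, k_vcdl, i_c, c, n_cycles):
    """Iterate the loop once per reference cycle and return (Vc, final delay)."""
    v_c = 0.0
    for _ in range(n_cycles):
        delay = d0 - k_vcdl * v_c  # VCDL: the delay falls as Vc rises
        error = delay - t_ref      # > 0 means the Clk_out edge arrives late
        # Up (error > 0) charges C, Down (error < 0) discharges it; the
        # pulse width is taken equal to the timing error.
        v_c += i_c * error / c
    return v_c, d0 - k_vcdl * v_c


# Hypothetical values: 10 ns period, 14 ns initial delay, 4 ns/V VCDL gain,
# 10 uA charge-pump current and a 1 pF loop capacitor.
v_c, delay = simulate_charge_pump_dll(
    t_ref=10e-9, d0=14e-9, k_vcdl=4e-9, i_c=10e-6, c=1e-12, n_cycles=500)
print(f"Vc = {v_c:.3f} V, residual error = {(delay - 10e-9) * 1e12:.3f} ps")
```

In this model the error shrinks by the factor (1 − K·Ic/C) per cycle, so a smaller current or a larger capacitor gives slower but smoother locking.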

Since they are composed of analog circuits, analog DLLs are very difficult to migrate to more advanced technologies. Moreover, analog DLLs usually suffer from non-locking or false-locking problems. However, analog DLLs can obtain sub-gate-delay resolution with low jitter. This is a clear advantage for the generation of multiphase delayed clocks.

Figure 4.2: The architecture of a digital delay locked loop.

Digital DLL

A digital DLL is shown in Figure 4.2. It consists of a delay line, a phase selector, a phase detector (PD), a loop filter (LF) and a finite state machine (FSM). The delay line is composed of delay cells with a fixed delay time. Thus, a phase selector is needed to select the proper delayed clock. The phase of this selected clock is then compared to that of the reference clock. If the phase of the selected clock is ahead of that of the reference clock, the output of the digital loop filter is increased. The control word is then generated by the FSM, so that the next delayed clock is selected. This operation is repeated by the feedback loop until the DLL is locked. When the DLL is locked, the output clock and the reference clock have little or no phase difference. This means that the output clock and the reference clock are synchronized.
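The tap-selection loop can be sketched as a simple search. This is a behavioral illustration only: the lock criterion (stop when the early/late decision flips), the function name and the 300 ps cell delay are assumptions, and the loop filter and FSM are collapsed into a single step per iteration.

```python
def lock_digital_dll(period, cell_delay, n_cells, start_tap=1, max_steps=1000):
    """Step the selected tap until the early/late decision flips; return the tap."""
    tap = start_tap
    last_direction = 0
    for _ in range(max_steps):
        early = tap * cell_delay < period      # selected clock ahead of the reference
        direction = 1 if early else -1
        if last_direction and direction != last_direction:
            return tap                          # within one cell delay of the target
        new_tap = min(max(tap + direction, 1), n_cells)
        if new_tap == tap:
            return tap                          # delay line range exhausted
        tap, last_direction = new_tap, direction
    return tap


# Times in picoseconds: 10 ns period, hypothetical 300 ps cells, 64 taps.
tap = lock_digital_dll(period=10_000, cell_delay=300, n_cells=64)
print(tap, tap * 300 - 10_000)
```

The residual phase error is bounded by one cell delay, which illustrates why the resolution of a digital DLL is limited by the fixed delay of its cells.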

The advantage of the digital DLL is that it can be implemented with digital gates. This feature helps to cope with technology scaling. However, the delay time of the delay elements depends on the fabrication process. Moreover, the delay time of the delay cells cannot be adjusted. Besides, the jitter performance of digital DLLs is worse than that of analog DLLs. Digital DLLs are generally applied for clock synchronization. If a digital DLL is to be used as a multiphase generator, its architecture should be improved.

Dual-loop DLL

A dual-loop DLL, depicted in Figure 4.3, is composed of two DLLs, a digital DLL

and an analog DLL. In the digital DLL, the delayed clock is compared to the reference

clock until the digital DLL is locked. The selected clock is sent to the VCDL of the analog

DLL for further phase adjustments. The output clock of the VCDL is then compared to

the reference clock until the analog DLL is locked. The output clock and the reference

clock are synchronized.


Figure 4.3: The architecture of a dual-loop delay locked loop.

The dual-loop architecture is a mixed architecture combining the digital DLL and the analog DLL. Thus, its performance is better than that of the digital DLL, while its precision is determined by the analog DLL. The dual-loop DLL can achieve fast locking, wide range and low jitter. However, the architecture is complicated, so the design of such a circuit is very difficult.

Mixed-signal DLL

With technology scaling, digitally-assisted analog design has become a trend. The mixed-signal DLL is a typical example of this methodology. Figure 4.4 shows a mixed-signal DLL, which consists of a VCDL, a phase detector, a bidirectional (Up/Down) counter and a digital-to-analog converter (DAC). The operational principle is similar to that of the charge-pump-based analog DLL, except that the charge pump and the loop filter are replaced by the counter and the DAC. When a phase difference between Clk_ref and Clk_out is detected, the Up and Down signals indicating the phase state are generated. The output of the bidirectional counter is increased or decreased according to the Up and Down signals. This digital number is then converted to an analog voltage that controls the VCDL so that the proper delay is obtained.

Compared to the charge-pump DLL, the counter realized by digital circuits is simpler and more robust than the charging/discharging operation. Besides, the power dissipation is lower than that of charge-pump DLLs. However, the overall performance depends on the DAC. Overshoot and undershoot of the controlled voltage contribute to clock jitter.

Figure 4.4: The architecture of a mixed-mode delay locked loop.

Other improved DLLs

An all-analog multiphase DLL architecture that achieved both wide-range operation

and low-jitter performance was presented in [108]. A replica delay line was attached to a

conventional DLL. The proposed DLL incorporated dynamic phase detectors and triply

controlled delay cells with cell-level duty-cycle correction capability to generate equally

spaced eight-phase clocks.

A dual-loop DLL was presented in [109] to overcome the problem of a limited delay range by using multiple VCDLs. A reference loop and four other VCDLs were used to generate multiphase clocks. This architecture enabled the DLL to emulate an infinite-length VCDL with multiple finite-length VCDLs.

A DLL with wide-range operation and a fixed latency of one clock cycle was proposed in [110]. This DLL used a phase selection circuit and a start-controlled circuit to enlarge the operating frequency range and eliminate harmonic locking problems.

A new DLL circuit that uses a replica delay line and a cycle period detector was

proposed in [111] to solve the false lock problem in the conventional DLLs. The auxiliary

loop in the proposed DLL monitored the lock state of the main loop by estimating the

cycle period of the input clock and decided whether the main loop was in the coarse lock

state or not.

A 0.7-2 GHz precise multiphase DLL using a digital calibration circuit was presented in [112]. The mismatch-induced timing error among the multiphase clocks can be self-calibrated by this digital calibration circuit. A start-controlled circuit was proposed to enlarge the operating frequency range of the DLL.

A fast-lock mixed-mode DLL (MMDLL) for wide-range operation and multiphase outputs was introduced in [113, 114]. The architecture used a mixed-mode time-to-digital converter (TDC) scheme as a frequency range selector, together with a start-up circuit and a coarse tuning circuit, to offer a faster lock time.


A 0.5-5 GHz wide-range multiphase DLL with a calibrated charge pump was presented in [115]. A multiperiod-locked technique was used to extend the input frequency range of the MDLL and avoid the harmonic-lock problem. The charge pump current was also calibrated to reduce the static phase error.

4.1.2 Behavior models

A DLL behaves like a voltage buffer. It is necessary to model its behavior before designing the circuits. Here, the charge-pump DLL is considered as an example. Its operation can be modeled by the following equations [116, 117].

$$\Phi_{Clk\_out}(t) = \Phi_{Clk\_ref}(t - \alpha T) - K_{VCDL} \cdot V_c(t) \tag{4.1}$$

$$V_c(t) = \frac{\Phi_{Clk\_ref}(t) - \Phi_{Clk\_out}(t)}{C} \int_{-\infty}^{t} I_c(t)\,dt \tag{4.2}$$

where ΦClk_ref and ΦClk_out denote the phases of Clk_ref and Clk_out in rad, respectively. α (≥ 1) is the total delay of the VCDL expressed in clock periods. KVCDL, which equates to ∂ΦClk_out/∂ΦClk_ref, is the gain of the VCDL in s/V. Ic is the charging or discharging current. C is the capacitance of the loop filter. Thus, the phase transfer function derived by the discrete Z-transform is given by

$$H(z) = \frac{\Phi_{Clk\_out}(z)}{\Phi_{Clk\_ref}(z)} = \frac{1 - K_T z^{\alpha-1}}{z^{\alpha} - K_T z^{\alpha-1}} \tag{4.3}$$

where KT is the loop gain, defined as KVCDL · KCP. Here, KCP = ∂VC/∂∆Φ is the gain of the charge pump and loop filter.

4.1.3 Jitter models

Since the DLL is used as a clock synchronizer or a reference clock generator, jitter plays an important role in the design. Three kinds of jitter should be taken into account.

• Input jitter. The jitter of the input clock is due to the noise of the clock generator and ambient noise. This jitter, once fed into the DLL, can be accumulated by the VCDL. As a result, it should be kept as small as possible.

• Jitter due to the VCDL. Since the VCDL consists of many replicated delay cells, mismatch between the delay cells usually causes jitter. Meanwhile, power supply noise also introduces jitter into the output clock.

• Jitter due to the charge pump and loop filter. The charging and discharging mechanism introduces periodic ripples on the controlled voltage. Moreover, the mismatch between the charging and discharging currents contributes ripples on the controlled voltage as well. The fluctuation of the controlled voltage makes the VCDL add jitter to the output clocks.


Figure 4.5: The architecture of a charge-pump delay locked loop with jitter models.

The DLL architecture with the jitter model is shown in Figure 4.5. Since these three kinds of jitter are independent, the transfer function for each jitter source can be derived from this model [116, 117]. The transfer function of the input jitter is given by

$$J_{in}(z) = \frac{\Theta_{Clk\_out}(z)}{\Theta_{in}(z)} = \frac{1 - K_T z^{\alpha-1}}{z^{\alpha} - K_T z^{\alpha-1}} \tag{4.4}$$

The transfer function of the jitter due to the controlled voltage is given by

$$J_{V_C}(z) = \frac{\Theta_{Clk\_out}(z)}{\Theta_{V_C}(z)} = K_{VCDL} \tag{4.5}$$

The transfer function of the jitter in the VCDL is contributed by each delay cell, thus,

$$J_{VCDL}(z) = \frac{\Theta_{Clk\_out}(z)}{\Theta_{Out,n-1}(z)} \cdot \frac{\Theta_{Out,n-1}(z)}{\Theta_{Out,n-2}(z)} \cdots \frac{\Theta_{Out,2}(z)}{\Theta_{Out,1}(z)} = \frac{K_T (1 - z^{-\alpha})}{(1 - z^{-1})(1 - K_T z^{-1})} \tag{4.6}$$

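Assuming all jitters have a Gaussian distribution, by solving Equations 4.4 ∼ 4.6, we have

$$\delta_{OUT} \approx \sqrt{\delta_{IN}^2 + \delta_{VCDL}^2 + \delta_{V_C}^2} \tag{4.7}$$

where the jitter contributed by the input clock is

$$\delta_{IN} = \frac{2\pi \cdot \theta_{in}}{T_{clk}} \cdot \sqrt{\frac{1 - K_T}{1 + K_T + K_T^2}} + \delta K_{clk} \tag{4.8}$$

The jitter contributed by the VCDL is

$$\delta_{VCDL} = \frac{2\pi}{T_{clk}} \cdot \frac{\theta_{VCDL}}{\sqrt{1 - K_T^2}} \tag{4.9}$$

The jitter contributed by the controlled voltage is

$$\delta_{V_C} = \frac{2\pi}{T_{clk}} \cdot K_{VCDL} \cdot \Delta V_c \tag{4.10}$$

The parameter KT in the above equations is given by

$$K_T = \frac{2\pi \cdot K_{VCDL} \cdot I_{CP}}{C} \tag{4.11}$$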

The definitions of the other parameters are listed below.

• θin is the input phase error caused by the signal noise.

• δKclk is the time error caused by the noise and the slope variation of the clock.

• θVCDL is the VCDL output phase error.

• ∆Vc is the voltage noise on the capacitor (C) in the loop filter.

• KVCDL represents the gain of the VCDL in s/V.

• ICP is the charge and discharge current in the charge pump.

The VCDL and the loop filter contribute significantly to the total jitter. Generally, the worst-case jitter occurs at the last delay cell of the VCDL due to the accumulation of jitter. The optimization of the jitter performance therefore depends on the jitter of the last delayed clock in the VCDL. The total jitter is mainly determined by the value of KT in Equations 4.7 ∼ 4.11. To reduce δOUT, KVCDL and ICP should be reduced or the capacitance C should be increased.
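The effect of these design parameters can be explored numerically. The sketch below evaluates only Equations 4.7, 4.10 and 4.11; the function names and all component values are hypothetical and are chosen merely to show how the loop gain scales with the capacitor.

```python
import math


def loop_gain(k_vcdl, i_cp, c):
    """Equation 4.11: K_T = 2*pi*K_VCDL*I_CP / C."""
    return 2.0 * math.pi * k_vcdl * i_cp / c


def control_voltage_jitter(t_clk, k_vcdl, delta_vc):
    """Equation 4.10: jitter contributed by noise on the controlled voltage."""
    return 2.0 * math.pi / t_clk * k_vcdl * delta_vc


def total_jitter(delta_in, delta_vcdl, delta_vc):
    """Equation 4.7: quadrature sum of the three independent contributions."""
    return math.sqrt(delta_in ** 2 + delta_vcdl ** 2 + delta_vc ** 2)


# Hypothetical values: 1 ns/V VCDL gain, 10 uA pump current, 10 ns clock
# period and 1 mV of noise on the controlled voltage.
k_vcdl, i_cp, t_clk, delta_vc = 1e-9, 10e-6, 10e-9, 1e-3
for c in (1e-12, 2e-12):
    print(f"C = {c * 1e12:.0f} pF -> K_T = {loop_gain(k_vcdl, i_cp, c):.4f}")
print("delta_Vc =", control_voltage_jitter(t_clk, k_vcdl, delta_vc), "rad")
```

Doubling C halves KT, while the contribution of the controlled voltage scales linearly with KVCDL, which matches the qualitative guidance above.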

4.1.4 Circuit techniques for charge-pump DLLs

Although a large spectrum of DLLs has been proposed, the charge-pump DLL is the mainstream architecture for multiphase clock generation. The reason is that the charge-pump DLL achieves the best jitter performance among the conventional architectures. Thus, it is necessary to review the existing circuit techniques for charge-pump DLLs.

Based on the architecture of Figure 4.1, different kinds of blocks have been developed to optimize the performance.

Delay cells

The most commonly used delay cells in charge-pump DLLs can be categorized into two groups. One is the full-swing or single-ended architecture. The other is the partial-swing or differential-pair configuration [118]. Full-swing delay cells can be current-starved delay cells, linear delay elements, or weighted current-adder inverter cells. The delay time of current-starved delay cells is controlled by the magnitude of the bias current. The delay times of linear delay elements and weighted current-adder inverter cells are controlled by changing the capacitance and the resistance, respectively. The schematics of these delay cells are shown in Figure 4.6.

Figure 4.6: The schematic of delay cells. (a) Current-starved delay cell; (b) Differential delay cell; (c) Weighted current-adder inverter cell; (d) Linear delay element using variable MOS capacitor.

The full-swing delay cells have a simple architecture and low power consumption. However, their drawback is that the delay time is easily affected by power supply noise and the operating frequency is limited by the fabrication process. In contrast, the differential-pair delay cells can obtain a smaller delay time and a better power-supply-noise-rejection ratio (PSRR). In addition, they can operate in a higher frequency range. The disadvantage of the differential-pair delay cell is its high power dissipation.

Phase detectors

The phase detector plays an important role in DLLs, since the detection precision depends on its performance. In the past decades, different phase detectors have been developed. Phase detectors are generally divided into two classes: two-state phase detectors and three-state phase detectors. For two-state phase detectors, when a phase difference is detected, either Up or Down goes to the high voltage level; when no phase difference is detected, both Up and Down are set to the low level. The bang-bang phase detector (discussed in the next section) is a typical two-state phase detector.

Some other phase detectors can be categorized as three-state phase detectors. The most commonly used three-state phase detectors are shown in Figure 4.7.

Figure 4.7: The schematic of three-state phase detectors. (a) Three-state detector using standard D flip-flop; (b) Phase detector using RS latches; (c) Phase detector using true-single-phase-clock (TSPC) flip-flop.

The three-state phase detector using standard D flip-flops is shown in Figure 4.7 (a). This phase detector can be easily constructed from standard gates. When it operates, both Up and Down go to the high level, but the widths of their high levels differ. The difference equals the phase difference between Clk1 and Clk2. The waveforms of this three-state phase detector are shown in Figure 4.8. When the phase of Clk2 is behind that of Clk1, the high-level width of Up is larger than that of Down. The situation is opposite when the phase of Clk2 is ahead of that of Clk1. When the phases of Clk1 and Clk2 are equal, the high-level widths of Up and Down are equal. Note that in each state there is a moment when Up and Down are simultaneously high. This means that a direct current path is opened between the power supply and the ground, which dissipates significant power. Moreover, the standard D flip-flop has a limited response speed. A dead zone exists when the phase difference between the two clocks is very small, for example a few picoseconds.

Figure 4.8: The waveform of a three-state phase detector. (a) The phase of Clk_out is behind that of Clk_ref; (b) The phase of Clk_out is before that of Clk_ref; (c) The phase of Clk_out equates to that of Clk_ref.
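The pulse widths of the D flip-flop based detector can be captured by a small timing model. It assumes the usual arrangement in which Up is set by the rising edge of Clk1, Down by the rising edge of Clk2, and both are cleared a fixed reset delay after both are high; the reset delay of 100 ps and the function name are illustrative assumptions.

```python
def pfd_pulse_widths(t_clk1, t_clk2, t_reset):
    """Return the (Up, Down) high-level widths for one pair of rising edges."""
    reset_time = max(t_clk1, t_clk2) + t_reset  # both outputs cleared together
    up_width = reset_time - t_clk1              # Up was set by the Clk1 edge
    down_width = reset_time - t_clk2            # Down was set by the Clk2 edge
    return up_width, down_width


# Edge times in picoseconds with an assumed 100 ps reset delay.
for t1, t2 in ((0, 50), (50, 0), (0, 0)):
    up, down = pfd_pulse_widths(t1, t2, t_reset=100)
    print(f"Clk1 at {t1}, Clk2 at {t2}: Up = {up} ps, Down = {down} ps")
```

The difference between the two widths equals the edge spacing, and the overlap of duration `t_reset` is the interval in which both charge-pump switches conduct.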

To reduce the dead zone due to mismatches, the three-state phase detector can be realized in the forms shown in Figure 4.7 (b) and (c). These two phase detectors use higher-speed elements such as RS latches and dynamic TSPC flip-flops. The phase detector using dynamic TSPC flip-flops can resolve sub-picosecond phase differences, so it is widely used in high-speed DLL design.

Charge pumps

The charge pump also plays an important role in analog DLL design. The most commonly used charge pumps are the single-ended architecture with high-speed switches and current mirrors and the dual-ended architecture with differential pairs. The schematics of these two charge pumps are shown in Figure 4.9.

Figure 4.9: Circuits for charge pumps. (a) Charge pump using a single-ended current mirror with high-speed switches; (b) Differential charge pump.

Single-ended charge pumps, shown in Figure 4.9(a), are easy to realize because the current mirror has good stability. However, dummy transistors should be added to match the operating conditions of the switches. In addition, since both NMOS and PMOS transistors are employed as switches, the Up and Down signals need complementary logic levels, so one of them must be inverted. The delay of the inverter introduces current glitches; a resistor implemented by CMOS switches is therefore used to balance the delay. The differential charge pump is shown in Figure 4.9(b). Unlike the single-ended architecture, its switches are placed in the charging/discharging branches rather than in the current-mirror circuits. The charging and discharging operations are controlled by UP, DN and their inverted signals (UPB and DNB). If UP is high and DN is low, the charging current mirrors are enabled and the capacitor in the loop filter is charged. Conversely, if UP is low and DN is high, the discharging current mirrors are enabled and the capacitor in the loop filter is discharged. Compared with the single-ended charge pump, this architecture does not suffer from current mismatch due to feedthrough or from current glitches due to unequal delays. Its disadvantage is a large power dissipation.

Improved charge-pump circuits are mostly derived from these two architectures. The single-ended charge pump is widely used in low- and moderate-speed DLLs, whereas the differential charge pump, which has good frequency characteristics and a good power-supply rejection ratio (PSRR), is usually employed in high-speed DLLs.

Loop filters

A loop filter embedded in a PLL or a DLL has two functions: it generates the control voltage and it reduces noise. In PLLs, the two-pole loop filter shown in Figure 4.10(a) is usually used. In a DLL, however, a large capacitance is required to reduce the ripple of the control voltage, so the one-pole filter shown in Figure 4.10(b) is preferred. The capacitor of this one-pole filter can be implemented with a MOS transistor to reduce the die area. Moreover, some applications require a reset circuit to set the initial control voltage.

Figure 4.10: Circuits for loop filters. (a) Two-pole filter; (b) One-pole filter, whose capacitor can be realized by MOS transistors.

4.2 Proposed multiphase charge-pump DLL

The aim of this section is to design a charge-pump analog DLL that will be used as a multiphase clock generator. The DLL has the following requirements:

• Clk_ref and Clk_out are locked with a delay of exactly one clock cycle.

• A large number of delay cells is required in the VCDL.

• Low jitter has to be achieved.

4.2.1 Proposed architecture

The proposed DLL [119] is shown in Figure 4.11. It consists of a phase detector, a charge pump, a loop filter, a VCDL and a Start Controller. The multiphase delayed clocks (Q_1, Q_2, ..., Q_N) are generated by the VCDL, whose delay is adjusted according to the phase difference between Clk_out and Clk_ref. Since conventional DLLs often fail to lock or falsely lock to a delay of more than one clock cycle [110], a Start Controller is embedded in our DLL to solve these problems.

As shown in Figure 4.11, this digital block generates four signals: Up0, Down0, Select and Charge. When the Reset signal is asserted after power-on, Up and Down are driven by the Start Controller. The capacitor is first charged to the positive supply (Vdda), which sets the VCDL to its smallest delay. When the DISCHARGE flag is asserted, the charge pump and the loop filter are switched to discharge the capacitor. As a result, the delay of the VCDL increases and the delay difference between Clk_ref and Clk_out is reduced. When the delay difference becomes very small, the DLL_LOCK flag, which indicates the locking state of the DLL, goes high. The Start Controller is then released and the phase detector is selected as the phase comparator. From then on, the DLL behaves as a charge-pump analog DLL.
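The start-up sequence described above can be sketched as a small behavioral model in Python. This is only an illustration under stated assumptions: the linear VCDL curve, its delay limits, the 10 ns clock period and the 10 ps lock window are hypothetical, and the names `vcdl_delay` and `start_up` are introduced here. The 10 µA current and the 90 pF capacitor are the values quoted later in this chapter.

```python
VDDA = 3.3  # supply voltage in volts


def vcdl_delay(vc, d_min=4e-9, d_max=20e-9):
    """Hypothetical linear VCDL curve: smallest delay at Vc = Vdda."""
    return d_min + (d_max - d_min) * (VDDA - vc) / VDDA


def start_up(t_clk=10e-9, i_cp=10e-6, c_lf=90e-12, lock_window=10e-12,
             max_cycles=100_000):
    """Precharge the capacitor to Vdda, then discharge it once per clock
    cycle until the VCDL delay is within lock_window of one clock period.
    Returns (cycles, vc, error) at the moment DLL_LOCK would be raised."""
    vc = VDDA                      # capacitor precharged: minimum delay
    dv = i_cp * t_clk / c_lf       # voltage removed in one clock cycle
    for cycle in range(max_cycles):
        error = t_clk - vcdl_delay(vc)   # remaining delay difference
        if abs(error) <= lock_window:
            return cycle, vc, error      # hand over to the phase detector
        vc -= dv                         # forced discharge (DISCHARGE flag)
    raise RuntimeError("no lock within max_cycles")


cycles, vc, error = start_up()
print(cycles, round(vc, 3), error)
```

Because the delay starts below one clock period and only increases, this model can never settle at a delay of two or more periods, which is the purpose of the Start Controller.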

Figure 4.11: The diagram of the proposed charge-pump DLL (phase detector, Start Controller, MUX, charge pump, loop filter and VCDL).

4.2.2 Circuit description

In this section, the blocks of the proposed DLL are described in detail. Commonly used circuits are selected: the current-starved delay cell, the Bangbang phase detector, the charge pump and the one-pole loop filter. Their CMOS implementations are discussed below.

Voltage-controlled delay line (VCDL)

In this study, the current-starved delay cell is chosen for its compact area and low power. The architecture of the VCDL and the schematics of the bias circuit and of a current-starved delay cell are shown in Figure 4.12. The VCDL consists of a bias circuit and a delay chain of serially connected delay cells. Two dummy delay cells are placed at the beginning and at the end of the delay line to ensure that all delay cells see the same environment. The bias circuit generates the control voltages Vcp and Vcn, which set the delay of the current-starved delay cells. The DC characteristics of Vcp and Vcn versus Vc are shown in Figure 4.13. They indicate that the relationship between Vcp (or Vcn) and Vc is nonlinear. In addition, the voltage difference between Vcp and Vcn decreases as Vc increases.

Figure 4.12: The architecture of the VCDL and the schematic of the bias circuit and a current-starved delay cell.

The delay cell is composed of two current-starved inverters with the same architecture and the same transistor dimensions. Its delay is determined by the control voltage and the transistor dimensions. The delay curve of the current-starved delay cell used here is shown in Figure 4.14. As the control voltage Vc increases, the delay decreases from 800 ps to 200 ps. The slope of the curve is larger for Vc between 0.5 V and 1.5 V and smaller for Vc above 1.5 V. This can be explained by the operating region of the control transistors in the delay cells. In the former case, the control transistors operate in the linear region, so their current is proportional to the gate-source voltage. In the latter case, the control transistors enter the saturation region, so the variation of their current depends on the channel-length modulation effect.

Let λp and λn denote the channel-length modulation factors of the PMOS and NMOS transistors, respectively, gm,M1 the transconductance of M1, and Cpar the parasitic capacitance at the inputs of the inverter of the delay cell. The gain of a single delay cell is then given by

$$K^{a}_{DC} = -\frac{g_{m,M1}\, C_{par}}{\lambda_n \left(g_{ds,M1}\right)^2 + \lambda_p \left(g_{ds,M1} \cdot \dfrac{\lambda_p}{\lambda_n}\right)^2} \qquad (4.12)$$


Figure 4.13: DC characteristics of Vcp and Vcn versus Vc.

Figure 4.14: Delay curve of the used current-starved delay cells.

The duty cycle of the output clocks of the VCDL is very important for multiphase clock generation. Since the load capacitance of the inverter is the same for every half of a delay cell, the duty cycle depends on the process and on the matching of the charging and discharging currents through the transistors. Assuming that the bias circuit and the control transistors provide the same amount of current, the duty cycle is given by

$$\eta_{dc} = \frac{\mu_p C_{ox} \left(\dfrac{W}{L}\right)_p}{\mu_n C_{ox} \left(\dfrac{W}{L}\right)_n} \times 100\% \qquad (4.13)$$

where ηdc denotes the duty cycle of the delay cell, µp and µn are the carrier mobilities of the PMOS and NMOS transistors, Cox is the oxide capacitance per unit area, and (W/L)p and (W/L)n are the dimensions of the PMOS and NMOS transistors in the inverter, respectively. In practice, a duty cycle of exactly 50 % is very difficult to obtain because of transistor mismatch; the duty-cycle error is about ±2 %. Duty-cycle correction is therefore usually required for precise reference clocks.

Bangbang phase detector (PD)

Among the commonly used architectures, the two-state Bangbang phase detector based on cross-coupled RS latches is often used in DLLs [120, 94]. Its schematic is shown in Figure 4.15. The circuit is built only from NAND gates. To preserve symmetry, however, the input ports of the NAND gates must be connected so that Clk_ref and Clk_out see the same load. Dummy NAND gates are added for the same reason.

The Bangbang phase detector can only detect two phase states: Clk_out lags Clk_ref, as shown in Figure 4.16(a), or Clk_out leads Clk_ref, as shown in Figure 4.16(b). Because of this two-state detection, Up and Down take only two combinations, even when the time difference between the input clocks is small: either Up is high and Down is low, or Up is low and Down is high. However, when the phase difference between Clk_ref and Clk_out is very small, the phase detector can no longer resolve the state because of the circuit parasitics. The phase detector then operates in its dead zone: Up and Down toggle periodically like the input clocks, and their initial levels are determined by the last state that the phase detector was able to detect. This behavior is shown in Figure 4.16(c). In the DLL, this condition is also called the "locking state". The size of the dead zone depends on the CMOS process used. For a standard 0.35 µm CMOS technology, the dead-zone phase difference is about 10 ps.

In a DLL using a Bangbang phase detector, the capacitor in the loop filter is charged or discharged for a whole clock period once the phase detector has entered the dead zone. This produces a periodic ripple on the control voltage, whose amplitude is proportional to the period of the input clocks.
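The proportionality between ripple and clock period follows from the charge delivered to the capacitor in one period, ΔV = I·T/C. A minimal Python sketch is given below; the function name is introduced here, the 10 µA current and 90 pF capacitance are the values used elsewhere in this chapter, and the 10 ns period (a 100 MHz clock) is an assumption made for the example.

```python
def ripple_per_cycle(i_cp, t_clk, c_lf):
    """Voltage change on the loop-filter capacitor when a constant current
    i_cp (A) flows for one clock period t_clk (s): dV = I * T / C."""
    return i_cp * t_clk / c_lf


# 10 uA and 90 pF are the values used in this chapter; the 10 ns period
# (100 MHz clock) is an assumption made for this example.
dv = ripple_per_cycle(10e-6, 10e-9, 90e-12)
print(f"{dv * 1e3:.2f} mV per clock cycle")
```

With these numbers the step is about 1.1 mV per cycle, so several consecutive cycles in the same direction give a ripple of a few millivolts.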

Figure 4.15: Schematic of a Bangbang phase detector, which consists of three cross-coupled RS latches.

Figure 4.16: Typical waveforms of a Bangbang phase detector (signals Clk_ref, Clk_out, Up and Down; panels (a), (b) and (c)).

Proposed charge pump

Figure 4.17 shows the proposed charge-pump circuit, which consists of a reference current generator and a single-ended charge pump. The single-ended architecture is chosen because the operating frequency of the DLL is about 100 MHz and low power dissipation is required.

The reference current is generated with an external resistor so that it can be adjusted conveniently. Since a Bangbang phase detector is used, small and stable charging and discharging currents must be generated. Dummy transistors are added to the current mirror to ensure current matching. Dummy transistors are also added to the switches driven by the Up and Down signals in order to reduce clock feedthrough and charge injection.

Figure 4.17: Schematic of the proposed charge pump circuit (reference current circuit with external resistor RExternal, and charge pump; branch currents of 10 µA).

Proposed loop filter

The proposed loop filter, shown in Figure 4.18, consists of a MOS switch and an integrated capacitor. According to the jitter model, the capacitance should be as large as possible to obtain low jitter. The objective of this design is an RMS jitter of less than 10 ps, for which a capacitance of 90 pF is chosen. The capacitor is implemented as a MOS capacitor occupying 90 µm × 570 µm. The MOS capacitor is divided into four banks to improve the yield.
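As a quick plausibility check of these figures, the average capacitance per unit area and the capacitance per bank can be computed directly. The sketch below is only arithmetic on the numbers quoted above; it assumes that the whole 90 µm × 570 µm rectangle is occupied by the capacitor and that the four banks are equal, neither of which is stated in the text. The function name is introduced here.

```python
def mos_cap_density(c_total, width_um, length_um):
    """Average capacitance per unit area in F/um^2, assuming the whole
    rectangle is occupied by the capacitor (an idealization)."""
    return c_total / (width_um * length_um)


density = mos_cap_density(90e-12, 90.0, 570.0)
per_bank = 90e-12 / 4   # assumes four equal banks
print(f"{density * 1e15:.2f} fF/um^2, {per_bank * 1e12:.1f} pF per bank")
```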

Figure 4.18: Schematic of the loop filter (MOS capacitor divided into Bank 1 to Bank 4).

Figure 4.19 shows the simulated capacitance versus gate voltage at different temperatures. It indicates that the operating gate voltage should be larger than 0.65 V in the best case. In this operating region, the capacitance is 90 pF.

Figure 4.19: Parameter-scan characteristic curves for the charging current and the capacitor in the loop filter.

Floor plan and layout

The challenge in laying out a mixed-mode DLL is to reduce the jitter caused by noise and crosstalk. The VCDL should be placed far from the phase detector and the loop filter, but the control-voltage wire should be as short as possible. In addition, the use of the DLL in a multi-channel TDC has to be considered. As a result, the VCDL is placed near the readout circuits of the TDC and the loop filter is placed far from the digital circuits.


Figure 4.20: Layout of the DLL prototype with 32 delay cells (750 µm × 270 µm), showing the loop filter, the Start Controller and phase detector, the voltage-controlled delay line (32 delay cells), the band-gap reference circuit and the resistor.

4.2.3 Experimental results

A prototype circuit using the proposed architecture and the circuits described above has been designed in AMS 0.35 µm CMOS technology. The layout is shown in Figure 4.20; the die area is 270 µm × 750 µm. The prototype includes a VCDL consisting of 32 delay cells, a Bangbang phase detector, a charge pump, a loop filter with a 90 pF capacitor and a Start Controller. The prototype has been tested, and the output waveform is shown in Figure 4.21. Note that there is a large delay between the input clock and the first output of the VCDL. The input clock therefore cannot serve as the reference phase; the clocks used for the phase comparison must be the first output (Q<0>) and the last output (Q<31>).

Figure 4.21: Test waveform of the DLL with 32 delay cells using the Bangbang phase detector. The figure shows that the DLL is well locked. The first output (Q<0>) and the last output (Q<31>) have a small time difference.


The waveform illustrates that the DLL is well locked. However, there is a small time difference between Q<0> and Q<31>, and Q<31> lies to the left of Q<0>. This is due to the initialization method of the proposed DLL. Since the capacitor in the loop filter is initially set to Vdda (3.3 V), the VCDL starts at its minimum delay, so the phase of Q<31> is ahead of that of Q<0>. To lock the DLL, Q<31> is shifted to the right from its initial position until the time difference can no longer be detected; at this moment the DLL enters the locking state. Consequently, a residual phase difference between Q<0> and Q<31> remains because of the dead zone of the phase detector. This phase difference in the locking state is called the "phase offset". Its measured peak-to-peak value is about 100 ps and its RMS value is about 20 ps.

With a charging/discharging current of 10 µA, the control voltage fluctuates with a period of three clock cycles and the amplitude of the voltage ripple is about 5 mV. Since the capacitor is charged or discharged during each clock cycle, the change in the control voltage changes the delay of the VCDL. However, because the charging/discharging current is small, the change within a single cycle is not enough to reverse the state of the phase detector. If a larger current or a more sensitive phase detector were used, the control voltage would fluctuate with a period of one clock cycle and the ripple would be smaller.

The measured RMS jitter is about 7.8 ps and the peak-to-peak jitter is 29.5 ps. These values meet the specifications well, although the peak-to-peak jitter should be reduced further.

4.3 Optimized charge-pump DLL

Many applications require multiphase delayed clocks with a delay step of less than 100 ps. A single DLL is not suitable for generating clocks with such a resolution because of the limits on the operating frequency and the mismatch of the delay cells. Time interpolation is usually used to solve this problem; the Vernier method and the array of DLLs are two dedicated techniques. Both architectures require more precise DLLs, and the DLL proposed in Section 4.2 cannot satisfy their requirements. Optimization of the proposed DLL is therefore necessary. In [120, 83], the proposed DLL array consists of five DLLs and generates 140 delayed clock phases within one clock period. A bin size of 89 ps is achieved with a reference clock of 80 MHz.
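The quoted bin size follows from dividing one reference period into equally spaced phases. A short Python check is sketched below; the function name is introduced here, and the second call (32 phases at 50 MHz) is included only as a further example of the same arithmetic.

```python
def bin_size(f_ref, n_phases):
    """Ideal delay step when one reference period is divided into
    n_phases equally spaced clock phases."""
    return 1.0 / (f_ref * n_phases)


print(f"DLL array : {bin_size(80e6, 140) * 1e12:.1f} ps")   # 140 phases at 80 MHz
print(f"single DLL: {bin_size(50e6, 32) * 1e12:.1f} ps")    # 32 phases at 50 MHz
```

The first value rounds to the 89 ps quoted above. Reaching the same step with a single delay line would require 140 matched delay cells, which is where the mismatch limitation mentioned in the text comes in.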

Since the jitter of a DLL array is worse than that of a single DLL, both low jitter and low phase offset are required. Novel design techniques must therefore be adopted to optimize the proposed DLL. In this section, a three-state phase detector using true-single-phase-clock (TSPC) flip-flops is proposed to reduce the dead zone and the phase offset. Moreover, the architectures of the current-starved delay cell and of the charge pump are improved.


4.3.1 Optimized VCDL

According to Eq. 4.9, the RMS jitter contributed by the VCDL is determined by the loop gain of the DLL. Since KT is proportional to KVCDL according to Eq. 4.11, a smaller δVCDL is obtained if KVCDL is reduced. Moreover, δVC decreases as well according to Eq. 4.10. Thus, to achieve lower jitter, KVCDL, and hence the gain of the delay cell, should be decreased.

Figure 4.22: Modified current-starved delay cells with the novel bias circuit.

In this design, direct-current (DC) sources are added to the delay cell to reduce its gain. The new schematic is shown in Figure 4.22. The DC current sources are realized by pull-up and pull-down MOS transistors (Mbp and Mbn in Figure 4.22). The new gain is given by

$$K^{b}_{DC} = -\frac{g_{m,M1}\, C_{par}}{\lambda_n \left(g_{ds,M1} + g_{ds,Mbn}\right)^2 + \lambda_p \left(g_{ds,M1} \cdot \dfrac{\lambda_p}{\lambda_n} + g_{ds,Mbp}\right)^2} \qquad (4.14)$$

where gds,Mbp and gds,Mbn are the conductances of Mbp and Mbn. Compared with the conventional architecture, the absolute value of $K^{b}_{DC}$ is smaller than that of $K^{a}_{DC}$ because gds,Mbp and gds,Mbn are added to the denominator. The delay versus the control voltage is shown in Figure 4.23, which illustrates that the slope of the curve is reduced by adding the DC current sources.
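The effect of the added conductances can be checked numerically. The Python sketch below assumes that Eq. 4.12 and Eq. 4.14 are single fractions with the two squared terms summed in the denominator; this reading is an interpretation of the printed formulas. All parameter values are placeholders chosen only to show the trend, and the function name is introduced here.

```python
def delay_cell_gain(gm, c_par, gds_m1, lam_n, lam_p, gds_mbn=0.0, gds_mbp=0.0):
    """Delay-cell gain in the single-fraction reading of Eq. 4.12 / 4.14.
    With gds_mbn = gds_mbp = 0 the expression reduces to Eq. 4.12."""
    denominator = (lam_n * (gds_m1 + gds_mbn) ** 2
                   + lam_p * (gds_m1 * lam_p / lam_n + gds_mbp) ** 2)
    return -gm * c_par / denominator


# All numbers below are placeholders chosen only to show the trend.
params = dict(gm=1e-4, c_par=20e-15, gds_m1=1e-6, lam_n=0.05, lam_p=0.08)
k_a = delay_cell_gain(**params)                               # without Mbp, Mbn
k_b = delay_cell_gain(**params, gds_mbn=2e-6, gds_mbp=2e-6)   # with Mbp, Mbn
print(abs(k_b) / abs(k_a))
```

Because both added conductances are positive, the denominator can only grow, so the ratio printed is below one for any choice of positive parameters.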

In addition to the delay cell, the bias circuit is improved as well, since the delay characteristics of the delay cells are partly determined by the bias circuit. To control the maximum delay of the delay cell, a diode-connected transistor (MNx in Figure 4.22) and a resistor (Rx in Figure 4.22) are added to the bias circuit. MNx is used to adjust the upper end of the delay curve. Rx increases the gate voltage of MNx and allows the dimensions of M1, which controls the lower end of the delay curve, to be reduced. When a very small delay is required, M1 needs a large W/L ratio to provide a large current. However, a large M1 has a large parasitic capacitance, which limits the sensitivity to Vc. With Rx = 1 kΩ, the W/L ratio of M1 can be reduced to 1/3 of that in the previous version, which means that the operating frequency of the circuit can be increased threefold.

Figure 4.23: The delay time of a delay cell (in ps, axis from 100 to 800) versus the control voltage Vc (in V, axis from 0 to 3.2), without and with Mbp and Mbn. It illustrates that the slope of the curve is reduced by adding DC current sources.

The characteristics of the delay cell depend on the fabrication process and on parasitic parameters, so the delay curve simulated with the given model is not precise. An improved approach is to use programmable current-starved delay cells, whose schematic is shown in Figure 4.24. The DC current is controlled by the registers Rsn and Rsp, which can be connected to a JTAG interface. The advantage of this arrangement is that the slope can be adjusted through an external register. However, adding more transistors to the delay cell increases the die area of the VCDL.

4.3.2 Dynamic phase detector

Although the Bangbang phase detector performs well in low-jitter DLL design, its dead zone of ±10 ps cannot be eliminated. To construct a DLL array, phase differences of a few picoseconds or less must be detected, so a new phase detector is required. A phase detector built from true-single-phase-clock (TSPC) flip-flops is a good solution.

Figure 4.25 shows the schematic of a dynamic phase detector that consists of two TSPC flip-flops. Each TSPC flip-flop includes only 8 transistors. In addition, the circuit is symmetric with respect to Clk_ref and Clk_out. The dynamic PD is a three-state phase detector. If Clk_out leads or lags Clk_ref, Up and Down are High and Low, respectively, or the inverse. If Clk_ref and Clk_out have no phase difference, both Up and Down are short pulses. These pulses open a DC path in the charge pump. If the two pulses are well matched, the control voltage on the capacitor is not changed by the charging/discharging current and noise. This helps to reduce δVC in Eq. 4.10. As a result, the jitter of the DLL can be improved by adjusting the transistor dimensions in the TSPC flip-flops.

Figure 4.24: Current-starved delay cell with programmable DC current source (control bits sp0 to sp2 and sn0 to sn2).

Figure 4.25: Schematic of the dynamic phase detector, which consists of two TSPC flip-flops.

The dynamic PD has a simple architecture, can operate at high clock frequencies and has no dead zone. In the AMS 0.35 µm CMOS process, time differences of a few picoseconds can be detected with this phase detector. To illustrate the properties of the dynamic phase detector, simulation results of the DLL using the Bangbang PD and the dynamic PD are compared. The waveforms of the DLL are shown in Figure 4.26. With the Bangbang PD, the control voltage fluctuates with the toggling of Up and Down because of the two-state detection, and a voltage ripple of about 5 mV is generated. With the dynamic PD, only a very small ripple is generated, because Up and Down turn on the switches in the charge pump simultaneously, so that only a small current charges and discharges the capacitor. The simulated ripple of the control voltage is less than 50 µV in the 0.35 µm CMOS process.

Figure 4.26: Comparison of the waveforms of the two-state Bangbang PD and the three-state dynamic PD when the DLL is locked.

Figure 4.27: Comparison of the transfer characteristic curves of the Bangbang PD and the dynamic PD. The dynamic PD has no dead zone and generates less jitter, whereas the Bangbang PD has a dead zone of ±10 ps in 0.35 µm CMOS technology.

The dead-zone curves of the two phase detectors are compared in Figure 4.27. The DLL using a Bangbang PD has a dead zone of ±10 ps, whereas the DLL using a dynamic PD shows no dead zone. Moreover, the peak-to-peak jitter of the DLL using a Bangbang PD is larger than that of the DLL using a dynamic PD. Thus, with all other blocks identical, the DLL using a dynamic PD achieves better jitter performance. However, its power dissipation is higher because a current path is open while the DLL is locked.
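The link between the dead zone and the residual phase offset can be illustrated with a toy lock-acquisition loop in Python. This is a simplified sketch, not a model of the actual circuits: the detector is reduced to a sign decision with a dead zone, the dynamic PD is idealized as a detector with zero dead zone, the loop corrects the phase error by a fixed 1 ps per cycle, and all times are integers in picoseconds. The function names and the starting error of -200 ps are assumptions.

```python
def detect(dt_ps, dead_zone_ps):
    """Sign decision with a dead zone: +1 or -1 according to the sign of
    the phase error, 0 if |dt| is too small to be resolved."""
    if abs(dt_ps) <= dead_zone_ps:
        return 0
    return 1 if dt_ps > 0 else -1


def residual_offset(dt0_ps, step_ps, dead_zone_ps, max_cycles=10_000):
    """Correct the phase error by step_ps per cycle until the detector can
    no longer resolve it; return the error that remains."""
    dt = dt0_ps
    for _ in range(max_cycles):
        decision = detect(dt, dead_zone_ps)
        if decision == 0:
            break
        dt -= decision * step_ps
    return dt


print(residual_offset(-200, 1, 10))  # detector with a 10 ps dead zone
print(residual_offset(-200, 1, 0))   # idealized detector without dead zone
```

In this model the loop stops as soon as the error enters the dead zone, so the residual offset sits at the edge of the dead zone on the side from which the loop approached.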

4.3.3 Optimized charge pump

In the proposed charge pump, the charging and discharging currents are mismatched because of the channel-length modulation of the transistors at different DC levels of the control voltage (Vc). The key to optimizing the charge pump is to reduce this mismatch. In this study, both the charge pump and the reference current circuit are improved. The improved charge-pump circuit is shown in Figure 4.28. The simple current mirror of the proposed charge pump is replaced by a high-impedance cascode current mirror, so that the charging and discharging currents can be matched precisely. To cope with the varying DC level of Vc, a feedback circuit is added to the charge pump to sense the variation of Vc [110]. The feedback circuit consists of four transistors: two NMOS transistors for the discharging current mirror and two PMOS transistors for the charging current mirror. Take the discharging current Icn as an example. If Vc increases, the drain-source voltage of Mn0 increases, and thus Icn increases. Meanwhile, the gate-source voltage of Mn2 increases, so the current of Mn2 increases. This reduces the current flowing into Mn4 and lowers the gate voltage of Mn0. As a result, Icn decreases again. The same mechanism applies to Icp. With properly sized transistors, the charging and discharging currents are thus clamped by the feedback mechanism.

The current matching between the charging current (Icp) and the discharging current (Icn) is shown in Figure 4.29. With a capacitor of 100 pF, the usable range of the charging and discharging currents is 10 µA to 30 µA when the control voltage is between 1 V and 2 V.
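To see why a simple current mirror without feedback is sensitive to the level of Vc, the first-order channel-length modulation model I = Iref·(1 + λ·VDS) can be applied to both output transistors. The Python sketch below is a simplified illustration: the λ values, the 3.3 V supply and the 10 µA reference are assumed numbers, the voltage drop across the switches is ignored, and the function name is introduced here.

```python
def simple_mirror_currents(vc, i_ref=10e-6, vdda=3.3, lam_n=0.05, lam_p=0.05):
    """First-order output currents of a simple (non-cascode) charge pump.

    Channel-length modulation: I = I_ref * (1 + lambda * V_DS).
    The NMOS sink sees V_DS = Vc, the PMOS source sees V_SD = Vdda - Vc.
    All default parameter values are assumptions for illustration.
    """
    i_cn = i_ref * (1.0 + lam_n * vc)            # discharging current
    i_cp = i_ref * (1.0 + lam_p * (vdda - vc))   # charging current
    return i_cp, i_cn


for vc in (1.0, 1.65, 2.5):
    i_cp, i_cn = simple_mirror_currents(vc)
    print(f"Vc = {vc:.2f} V  Icp - Icn = {(i_cp - i_cn) * 1e9:+.1f} nA")
```

In this model the two currents are equal only at one value of Vc (mid-supply for equal λ) and diverge on either side, which is the mismatch that the cascode mirror and the feedback circuit are meant to suppress.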

4.3.4 Optimized loop filter

The proposed loop filter works as intended, but the control voltage cannot be measured because no test port was provided. In the optimized circuit, a test circuit is added so that the variation of the control voltage can be observed with a high-precision oscilloscope. Figure 4.30 shows the improved loop filter with the test circuit. An operational amplifier (OPAMP) is inserted to isolate the external capacitive load. A CMOS switch is placed between the MOS capacitor and the OPAMP to reduce the influence of noise from the OPAMP.


Figure 4.28: Schematic of the improved charge pump circuit (reference current circuit with external resistor RExternal and 20 µA branch currents; charge pump with Mn0 to Mn4, Mp2 and Mp3). A feedback circuit is added to reduce the mismatch between the charging and discharging currents so that the jitter due to Vc can be reduced.

Figure 4.29: Simulated charging and discharging current versus the control voltage. The data are collected for input currents from 10 µA to 40 µA. With a 100 pF capacitor, the usable range of the charging and discharging currents is 10 µA to 30 µA when the control voltage is between 1 V and 2 V.

4.3.5 Experimental results

The DLL using the optimized blocks is mainly intended for constructing a DLL array. The experimental results of the optimized DLL are listed in Table 4.1. Compared with the proposed charge-pump DLL, the optimized DLL achieves not only lower jitter but also a lower phase offset. The RMS jitter and the peak-to-peak jitter are 2.0 ps and 12.4 ps, respectively, both better than those of the proposed charge-pump DLL. The phase offset has been reduced to about 1 ps.

Figure 4.30: Schematic of the improved loop filter for the DLL array.

Table 4.1: Performance comparison of the proposed and the optimized charge-pump DLL in the locking state

| Characteristics | Proposed charge-pump DLL | Optimized charge-pump DLL |
|---|---|---|
| Type of PD | Two-state Bangbang PD | Three-state TSPC PD |
| Fluctuation period (clock cycles) | 3 | 1 |
| Ripple voltage (peak) | 5 mV | 36.6 µV |
| JCC RMS jitter | 7.8 ps | 2.0 ps |
| Peak-to-peak jitter | 29.5 ps | 12.4 ps |
| Phase offset | 18.7 ps | ∼1 ps |

4.4 Conclusions

This chapter describes the design techniques of low-jitter multiphase DLLs. First, the state of the art of analog and digital DLLs is reviewed, the behavioral model and the jitter model are described, and the circuit techniques are surveyed. Second, a mixed-signal low-jitter DLL is proposed; its internal blocks and experimental results are discussed in detail. Third, to obtain lower jitter, the conventional circuits are optimized. The current-starved delay cell is optimized by adding DC current sources, which reduce the slope of the delay versus control-voltage characteristic. A dynamic phase detector replaces the Bangbang phase detector to achieve sensitive phase detection with no dead zone. The charge pump is optimized with a feedback circuit that overcomes the mismatch between the charging and discharging currents caused by the level of the control voltage. The loop filter is extended with a test circuit that isolates it from the external capacitive load. The optimized DLL achieves better jitter performance and a smaller phase offset than the proposed charge-pump DLL. It is therefore more suitable for multiphase clock generation, in particular for the construction of a DLL array, whose architecture and circuit techniques are introduced in the next chapter.

Chapter 5

Design of Multi-Channel Coarse-Fine Time-to-Digital Converters

Coincidence electronics is a necessary part of PET imaging systems. In the past, AND logic implemented in FPGAs or custom chips was used to perform real-time coincidence detection. This method offered little flexibility in timing assignment. Nowadays, time stamps generated by a time-to-digital converter (TDC) are used in coincidence measurements. A multi-channel sub-nanosecond TDC is required to provide time stamps for the coincidence decisions in the proposed small-animal PET imaging system.

In addition, PET with time-of-flight (TOF) capability can provide a better reconstructed image than conventional positron tomography [25]. In the TOF-PET approach, for each detected event, the measured difference in time of flight between the two 511 keV photons provides an approximate position of the annihilation. The accuracy of this approximation is directly limited by the precision with which the arrival times of the two photons can be measured. In the 1980s, TOF-PET systems were built with a timing resolution of ∼500 ps [26]. At that time, the available electronics drastically limited the performance of TOF-PET. Nowadays, electronics operating in the GHz range is routine and application-specific integrated circuits (ASICs) are commonly used [38]. The ASIC needs to include a high-precision time-to-digital converter (TDC) for each detector element to reach the required time resolution (i.e., less than 100 ps) with good stability.
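The relation between timing resolution and position accuracy can be made concrete with the standard TOF relation Δx = c·Δt/2, where Δx is the localization uncertainty along the line joining the two detectors. The Python sketch below evaluates it for the two timing resolutions mentioned above; the function name is introduced here for illustration.

```python
C_LIGHT = 299_792_458.0  # speed of light in m/s


def tof_position_uncertainty(dt):
    """Localization uncertainty (m) along the line of response for a
    timing resolution dt (s): dx = c * dt / 2."""
    return C_LIGHT * dt / 2.0


for dt in (500e-12, 100e-12):
    dx_cm = tof_position_uncertainty(dt) * 100.0
    print(f"{dt * 1e12:.0f} ps -> {dx_cm:.1f} cm")
```

A resolution of 500 ps corresponds to about 7.5 cm and 100 ps to about 1.5 cm, which shows why a TDC resolution below 100 ps is of interest for TOF-PET.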

TDCs dedicated to PET imaging need a wide measurement range, high resolution and multiple channels. Among existing TDCs, counter-based architectures with time interpolation are well suited to this application. This chapter therefore focuses on the design of multi-channel coarse-fine TDCs using counter-based circuits and low-jitter DLL techniques. First, TDCs for PET imaging are reviewed. Then, the principle and architecture of coarse-fine TDCs are discussed. Finally, two prototypes are presented: one using counter-based circuits with a single DLL and one using a DLL array.


5.1 Design considerations

A sub-nanosecond TDC is required in our proposed PET imaging system. The specifications are as follows.

• Number of channels: 64

• Bin size: ≤ 625 ps

• Reference clock: 50 MHz

• Power dissipation: < 5 mW/channel

A survey of TDC architectures is necessary. Few contributions are dedicated to high-resolution TDCs in the field of PET imaging. Their characteristics are reviewed below.

• A TDC performing coincidence detection in a liquid-xenon PET prototype was introduced in [121]. The architecture was based on dual counters and a DLL with 128 delay cells. Designed in 0.35 µm CMOS technology, the TDC was able to operate at a temperature of 150 K with a resolution better than 250 ps.

• A CMOS TDC with 100 ps time resolution for PET imaging applications was proposed in [92]. The architecture combined an accurate digital counter with an analog time-interpolation circuit to measure the time interval. Thanks to the coarse counter, the dynamic range was programmable without degrading the timing resolution. The fine conversion used a time-to-amplitude converter followed by a 5-bit flash ADC. The bin size was 312.5 ps, with a DNL below ±0.2 LSB and an INL below ±0.3 LSB.

• A fine-resolution, process-scalable CMOS TDC architecture was presented in [122]. It used a hierarchical delay processing structure to achieve single-cycle latency and high-speed operation. The TDC had a timing resolution of 31 ps and a power consumption of less than 1 mW.

• A TDC based on the Vernier method with a timing resolution of 1.3 ns was realized in a single FPGA [123]. This resolution meets the requirements of coincidence measurement for LYSO PET detectors with a coincidence-timing window of 9 ns to 15 ns.

• A full-custom 16-channel TDC with a bin size of 625 ps was proposed at IPHC in 2007 [9]. It was designed in 0.35 µm CMOS technology. The coarse conversion was realized by dual 10-bit counters with a reference clock of 50 MHz. The fine conversion was based on multiphase sampling using a charge-pump DLL with 32 delay cells. The dynamic range was 10 µs.


To sum up, these contributions use several different architectures: a TDC using a counter and a DLL, a TDC using a TAC and an ADC, a TDC using a hierarchical delay structure, and a TDC using the Vernier method. However, other TDC architectures should also be considered. The distribution of TDC architectures in terms of resolution versus measured range is shown in Fig. 5.1. It indicates that a coarse-fine TDC based on a counter and time interpolation [121, 92] is very suitable for PET imaging applications, which need TDCs with a large dynamic range, high resolution and multiple channels. This kind of TDC is called a 'coarse-fine TDC'. The coarse conversion and the fine conversion are realized by counter-based circuits and DLL-based circuits, respectively.

[Figure 5.1 graphic. Horizontal axis: resolution (seconds), 1 ps to 10 ns. Vertical axis: measured range (seconds), 10 ns to 10 µs. Labelled regions: GRO, Time Amp.; Clinic, DLL+RC, VDL; Single DLL; TAC+ADC, DLL array; Counter-based TDCs; Hybrid TDCs: Counter-based + DLL/VDL/GRO/TA.]

Figure 5.1: The distribution diagram of the TDCs with the resolution versus the measured

range according to the survey on the TDC architectures [124].

The timing of a coarse-fine TDC based on the counter and the time interpolation is shown in Figure 5.2. In the coarse conversion, a reset signal serves as the "Start" in each time window. The counter is reset when Reset goes LOW and starts counting when Reset goes HIGH. If a Hit signal changes from LOW to HIGH within the time window, the current count is sampled. This sampled number is proportional to the time difference between the positive edges of Reset and Hit. However, the count only identifies the clock period in which the Hit signal arrives. The position within that period is resolved by the fine conversion, which is realized by time interpolation and multiphase sampling. To obtain n-bit fine resolution, 2^n delayed clocks must be generated in one period. The states of the delayed clocks are sampled by the Hit signal. The sampled data are thermometer codes or pseudo-thermometer codes, which need to be converted to binary codes.
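The multiphase sampling and encoding step can be illustrated with a small behavioural model. This is only a sketch under stated assumptions: the delayed clocks are ideal, have exactly 50 % duty cycle, and clock k is delayed by k bins relative to the reference. The function names `sample_phases` and `encode_fine` are hypothetical and do not come from the design itself.

```python
def sample_phases(t_hit_ps, t_ref_ps=20_000, n_bits=5):
    """Return the 2**n_bits delayed-clock states seen at time t_hit_ps.

    Assumption: ideal clocks with 50 % duty cycle; clock k is the
    reference clock delayed by k * (t_ref_ps / 2**n_bits).
    """
    n_phases = 1 << n_bits
    tau_ps = t_ref_ps // n_phases      # delay between adjacent phases
    half_ps = t_ref_ps // 2            # high time of each clock
    return [1 if (t_hit_ps - k * tau_ps) % t_ref_ps < half_ps else 0
            for k in range(n_phases)]


def encode_fine(bits):
    """Convert a pseudo-thermometer code into a binary fine code.

    The fine code is the index of the last clock that has already risen,
    i.e. the single position where a 1 is followed (cyclically) by a 0.
    """
    n = len(bits)
    edges = [k for k in range(n) if bits[k] == 1 and bits[(k + 1) % n] == 0]
    if len(edges) != 1:
        raise ValueError("not a valid pseudo-thermometer code")
    return edges[0]


if __name__ == "__main__":
    bits = sample_phases(1_300)        # hit 1.3 ns after the clock edge
    print("".join(map(str, bits)), "->", encode_fine(bits))
```

With a 20 ns period and 32 phases, a hit at 1.3 ns falls in the third 625 ps bin, so the model returns fine code 2. A code with more than one 1-to-0 transition is rejected, which is one plausible use for an error flag.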

[Figure 5.2 graphic. Panel (a), coarse conversion: waveforms of Hit (stop), Clk_ref, Reset (start) and the counter value (0, 1, 2, ..., Nc) over cycles N and N+1, with the coarse time Tc and the fine time Tf marked. Panel (b), fine conversion (zoom): Clk_ref, the delayed clocks Q[0] to Q[2^n−1], and the acquired data, for example 0000111...100...00.]

Figure 5.2: The timing of the coarse-fine TDC based on the counter and the time interpolation.

The architecture of the coarse-fine TDC is shown in Figure 5.3. It consists of a counter, a multiphase clock generator, Hit registers, an encoder, fine registers, coarse registers and a delay circuit. Several architectures can be selected for the counter, for example a binary counter, a Gray-code counter or a dual-counter architecture. The multiphase clock generator, which is usually realized by a DLL or a DLL array, is the key part of the TDC: the precision of the TDC is determined by its performance. Apart from these two parts, the other blocks are digital circuits that implement the sampling strategy and the data processing. These digital circuits can be replicated to construct a multi-channel TDC. In this case, data storage and transfer become a key point. Serial, parallel, first-in-first-out (FIFO) and/or mixed methods can be selected.

The coarse-fine TDC can achieve a very short conversion time, or latency, which directly relates to the detection efficiency of the front-end electronics. A large dynamic range is provided by the counter. Moreover, the dynamic range can be programmed by controlling the period of the reset signal. The time measured by the TDC in each cycle is given as

$$T_{tdc}(t - kT) = T_{LSB} \cdot \sum_{i=0}^{N-1} 2^{i} D_i(k) \qquad (5.1)$$

where T is the dynamic range, which equals the period of the time window, k is the index of the time window, T_tdc(t − kT) is the total time measured by the TDC in the kth cycle, N is the number of bits of the TDC output, D_i(k) is the ith output bit in the kth cycle, and T_LSB is the bin size of the TDC. Assuming that the coarse resolution and the fine resolution are m and n bits, respectively, we have

$$N = m + n \qquad (5.2)$$


[Figure 5.3 graphic. Block diagram: the multiphase clock generator (DLL/ADLL), driven by Clk, feeds 2^n delayed clocks to the fine Hit registers (2^n bits), followed by the encoder (2^n to n) and the fine register (n bits). The counter feeds the coarse Hit registers (m bits) and the coarse register (m bits). Hit and a delayed Hit (Hold) clock the registers. Outputs: fine time (n bits) and coarse time (m bits).]

Figure 5.3: The architecture of a coarse-fine TDC using flash sampling. Here, the counter can be a binary counter or a Gray-code counter. The multiphase clock generator can be a single DLL or a DLL array.

The dynamic range of the coarse-fine TDC depends on the coarse resolution and the period of the reference clock. That is,

$$DR = 2^{m} \cdot T_{ref} \qquad (5.3)$$

where T_ref is the period of the reference clock. The bin size of the TDC is determined by the time difference between two adjacent delayed clocks in the DLL. This time difference, which depends on the clock period and the number of delayed clocks, is given as

$$T_{LSB} = \frac{T_{ref}}{2^{n}} \qquad (5.4)$$

Thus, the total measured time can be written as

$$T_{tdc}(t - kT) = T_{ref} \times \sum_{i=0}^{m-1} 2^{i} D_i(k) + T_{LSB} \times \sum_{j=0}^{n-1} 2^{j} D_j(k) \qquad (5.5)$$

Note that the measured range is determined by m, while the bin size is proportional to the period of the reference clock and inversely proportional to the number of delayed clocks. Thus, a coarse-fine TDC design must trade off the frequency of the reference clock, the resolution of the counter and the number of delay cells in the DLL.
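The relations between dynamic range, bin size and total measured time can be checked numerically. The sketch below is illustrative only; the function names are hypothetical, and times are kept in picoseconds to avoid rounding surprises. The example values (m = 9, n = 5, 20 ns reference period) correspond to a 50 MHz clock.

```python
def dynamic_range_ps(m_bits, t_ref_ps):
    """Dynamic range DR = 2**m * T_ref."""
    return (1 << m_bits) * t_ref_ps


def bin_size_ps(n_bits, t_ref_ps):
    """Bin size T_LSB = T_ref / 2**n."""
    return t_ref_ps / (1 << n_bits)


def measured_time_ps(coarse_code, fine_code, n_bits, t_ref_ps):
    """Total time: coarse code weighted by T_ref, fine code by T_LSB."""
    return coarse_code * t_ref_ps + fine_code * bin_size_ps(n_bits, t_ref_ps)


if __name__ == "__main__":
    t_ref = 20_000  # ps, i.e. a 50 MHz reference clock
    print(dynamic_range_ps(9, t_ref) / 1e6, "us")      # 10.24 us
    print(bin_size_ps(5, t_ref), "ps")                  # 625.0 ps
    print(measured_time_ps(3, 7, 5, t_ref), "ps")       # 64375.0 ps
```

The last value equals T_LSB multiplied by the concatenated code (3·32 + 7 = 103), which shows that the split coarse/fine form and the single N-bit sum describe the same quantity.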


5.2 Design of a 625-ps multi-channel coarse-fine TDC

Since 2007, the microelectronics group of IPHC has been running a project to develop a sub-nanosecond TDC dedicated to PET imaging. This section discusses the design and characteristics of a successfully developed 16-channel TDC with 625-ps resolution. Its advantages and drawbacks are discussed as well.

5.2.1 Proposed architecture

The architecture of the first prototype (named "IMOTEPD") is shown in Figure 5.4. It consists of two 9-bit counters, a 32-phase DLL, 16-channel readout circuits, parallel-in-serial-out (PISO) registers and bias circuits.

The coarse conversion is implemented by two 9-bit counters with associated readout circuits. Since binary counters are employed, the coarse readout circuit is relatively simple: only 9-bit Hit registers are required. Second-stage registers are required for the parallel-in-serial-out operations. With a reference clock of 50 MHz, the coarse conversion achieves a maximum measured range of 2^9 × 20 ns = 10.24 µs.

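[Figure 5.4 graphic. Block diagram: Counter#0 and Counter#1 (9 bits each) feed Register#0 and Register#1, a 2-to-1 multiplexer (×9) selected by the MSB of the fine conversion, and the coarse register (9 bits). The 32-phase DLL, driven by Clk, feeds the 32-bit Hit registers, the 32-to-5 encoder and the fine register (5 bits). Hit<0> and a delayed Hit (Hold) clock the registers. The channels Hit<0> to Hit<15> send their time words (5 + 9 bits) to the readout interface.]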

Figure 5.4: The proposed architecture of a 16-channel 625-ps TDC based on a 9-bit counter

and a DLL with 32 delay cells.

The circuits for the fine conversion are composed of a 32-phase analog DLL and associated readout circuits. 32-bit Hit registers are required to sample the states of the delayed clocks. The sampled data are pseudo-thermometer codes, which need to be converted into 5-bit binary codes. Thus, a 32-to-5 encoder is required. Similarly, 5-bit second-stage registers are required to store the converted digital signals. Since the reference clock is 50 MHz, the delay between adjacent delayed clocks is 20 ns/32 = 625 ps. The DLL is realized using the circuits described in Section 4.2 of Chapter 4.

One advantage of coarse-fine TDCs is that the readout circuits can be extended to multiple channels. In IMOTEPD, 16-channel readout circuits are designed. The readout circuits in each channel include Hit registers for both the coarse and the fine conversion, an encoder, and second-stage PISO registers. They are purely digital circuits, which are easily implemented with a top-down design methodology. Moreover, digital circuits are robust against crosstalk. Thus, the time-to-digital conversion can be performed in parallel for all channels.

5.2.2 Circuit description

To construct a sub-nanosecond TDC, both the counters in the coarse conversion and the multiphase clock generator in the fine conversion should be carefully designed.

32-phase DLL

The generation of the 32 delayed clock phases by the delay line of the DLL is shown in Figure 5.5. Achieving low jitter is important for the DLL. The jitter introduced by both the architecture and the circuit techniques should be taken into account. The design techniques of the DLL have been discussed in Chapter 4. Since only one DLL is integrated, a bang-bang phase detector with a large filter capacitance and a small charging/discharging current is selected to achieve low jitter in IMOTEPD. The MOS capacitance in the loop filter is larger than 90 pF, and the charging/discharging current is about 10 µA.

[Figure 5.5 graphic. Delay line of ∆t cells between Clk_in and Clk_out with outputs Q_0, Q_1, ..., Q_30, Q_31 and the phase detector; waveforms showing successive 625 ps delays over one 20 ns clock period.]

Figure 5.5: The delay line in the DLL used for the IMOTEPD. With a 50 MHz reference clock,

32 delayed clocks with precise delay of 625 ps can be obtained.

The driving capability of the delay cells should be well designed. Since the 16-channel readout circuits must be driven, the driving capability of each delay cell should be high enough to ensure that the conversion is correct in all channels. Besides, the operation of the DLL should be taken into account. The first and the last outputs of the DLL are compared in the phase detector, so the corresponding delay cells require two output signals, which means that two buffers must be connected to each of them. To achieve a precise delay time, the other delay cells should be connected to two buffers as well. The architecture and the schematic are shown in Figure 5.6. This architecture ensures that every delay cell sees the same load, so that the delay time is precise.

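[Figure 5.6 graphic. Delay line of ∆t cells between Clk_in and Clk_out, each cell driving two buffers; outputs Clk_ref, Q_0, Q_1, ..., Q_30, Q_31 and Clk_out go to the readout and to the phase detector. Buffer schematic (input I, output Z): first inverter 0.8 µm/0.35 µm and 0.5 µm/0.35 µm, second inverter 16 µm/0.35 µm and 10 µm/0.35 µm.]

Figure 5.6: The proposed two-buffer architecture for delay cells in the DLL. The buffer is realized by two inverters; the transistors of the second inverter are 20 times wider than those of the first one.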

The buffer is realized by two inverters. The driving capability depends on the output current of the inverter. Neglecting the channel-length modulation effect, the output current of the PMOS is given as

$$I_{D,p} = \frac{1}{2} \mu_p C_{OX} \left(\frac{W}{L}\right)_p \left(|V_{GS}| - |V_{thp}|\right)^2 \qquad (5.6)$$

and the output current of the NMOS is

$$I_{D,n} = \frac{1}{2} \mu_n C_{OX} \left(\frac{W}{L}\right)_n \left(V_{GS} - V_{thn}\right)^2 \qquad (5.7)$$

Consider the loads of the two inverters. The first inverter is loaded by the input capacitance of the second inverter, whereas the second inverter is loaded by the parasitic capacitance of the 16-channel readout circuits. Thus, the output current of the second inverter should be larger than that of the first one. This is implemented by increasing the dimensions of the MOS transistors in the second inverter.

The duty cycle of the delayed clocks should also be considered. Since the outputs of the delay cells are used as reference clocks for multiphase sampling, the duty cycle of the delayed clocks should be 50 % so that the sampled data are symmetric from the MSB to the LSB. Otherwise, errors will occur in the encoding operation. To obtain a good duty cycle, the output current of the PMOS should be equal to that of the NMOS when the transistors are ON. According to Equations 5.6 and 5.7, we have

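$$\frac{1}{2} \mu_p C_{OX} \left(\frac{W}{L}\right)_p \left(V_{dd} - |V_{thp}|\right)^2 = \frac{1}{2} \mu_n C_{OX} \left(\frac{W}{L}\right)_n \left(V_{dd} - V_{thn}\right)^2 \qquad (5.8)$$

Solving this equation, the ratio of the dimensions of the PMOS and the NMOS in the inverter is given as

$$\frac{(W/L)_p}{(W/L)_n} = \frac{\mu_n}{\mu_p} \cdot \frac{\left(V_{dd} - V_{thn}\right)^2}{\left(V_{dd} - |V_{thp}|\right)^2} \qquad (5.9)$$

For the 0.35 µm technology used, we have

$$\frac{(W/L)_p}{(W/L)_n} \approx 1.6 \qquad (5.10)$$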

Thus, the widths of the transistors in the first inverter are 0.8 µm for the PMOS and 0.5 µm for the NMOS. The widths of the transistors in the second inverter are 16 µm for the PMOS and 10 µm for the NMOS. The lengths of all transistors in the buffer are 0.35 µm. The proposed buffer can drive a load capacitance larger than 1 pF. The simulated duty cycle is (50 ± 0.07) %. However, the duty-cycle error accumulates along the delay chain. The first delayed clock has the best duty cycle, and the last delay cell has the worst, degrading to (50 ± 2.24) %.
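The sizing rule and the quoted numbers can be cross-checked with a short script. This is a sketch, not the design flow that was actually used: the function names are hypothetical, the mobility and threshold values passed in the example are placeholders rather than process data, and the linear accumulation of the duty-cycle error is an assumption that happens to reproduce the quoted ±2.24 % over 32 cells.

```python
def pn_width_ratio(mu_n, mu_p, vdd, vthn, vthp):
    """(W/L)_p / (W/L)_n that equalises PMOS and NMOS currents.

    Square-law model without channel-length modulation.
    """
    return (mu_n / mu_p) * (vdd - vthn) ** 2 / (vdd - abs(vthp)) ** 2


# Transistor widths quoted in the text, in micrometres (all L = 0.35 um).
FIRST_INVERTER_UM = {"wp": 0.8, "wn": 0.5}
SECOND_INVERTER_UM = {"wp": 16.0, "wn": 10.0}


def accumulated_duty_error_pct(per_cell_pct, n_cells):
    """Worst-case duty-cycle error after n_cells, assuming linear growth."""
    return per_cell_pct * n_cells


if __name__ == "__main__":
    for name, inv in (("first", FIRST_INVERTER_UM), ("second", SECOND_INVERTER_UM)):
        print(name, "inverter Wp/Wn =", inv["wp"] / inv["wn"])
    print("scale factor =", SECOND_INVERTER_UM["wp"] / FIRST_INVERTER_UM["wp"])
    print("duty error after 32 cells =", accumulated_duty_error_pct(0.07, 32), "%")
    # Placeholder device parameters, for illustration only:
    print("ratio =", pn_width_ratio(mu_n=2.0, mu_p=1.0, vdd=3.3, vthn=0.5, vthp=-0.5))
```

Both inverters keep the same PMOS/NMOS width ratio, so scaling the second stage by 20 raises the drive current without changing the duty-cycle balance.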

Dual-counter architecture

The coarse conversion implemented by a dual-counter architecture is shown in Figure 5.7. The resolution of the binary counter is 9 bits. The counter core is implemented by D-flip-flop-based digital circuits, which can be synthesized from Verilog code. However, two design issues should be considered.

• The reset circuits should be well designed. Since the TDC operates within the coincidence time window, a reset circuit is required to generate the initial count. However, the delay of this circuit and the jitter accumulation introduce nonlinearity in the initial count.

• The test circuits of the counter should be considered. Since the proposed counter is a purely digital circuit, the test can be realized with BIST circuits based on the JTAG interface. The D flip-flops can be replaced by boundary-scan registers.

The schematic of the 9-bit binary counter is shown in Figure 5.8. The circuit is based on registers with combinational logic gates. In previous designs, the outputs of the counter were taken from the Q ports of the D flip-flops. Since these outputs are used in the subsequent operations, they see different load conditions, which introduce different output delays. In our design, the outputs are taken from the QN ports of the D flip-flops instead.

Figure 5.7: The coarse conversion using the dual-counter architecture.

[Figure 5.8 graphic. Chain of D flip-flops with Q and QN outputs producing Q<0>, Q<1>, Q<2>, ..., Q<8>; the QN0 to QN8 signals feed the combinational logic; inputs TIE1, Hit_clear and Clk.]

Figure 5.8: The schematic of the proposed 9-bit binary counter.

The reset circuits of the counters are shown in Figure 5.9. In our design, the clock frequency is 50 MHz, which corresponds to a clock period of 20 ns. In addition, since the time window of the system is 10 µs, 500 clock periods must be counted. Thus, when the outputs of the counter are (111110100)B, the counter should be reset and start a new counting operation from zero. Since the outputs of the counter are generated at the positive edge of the clock, their states can be sampled at the next negative edge of the clock. In our design, an external reset signal 'Hit_clear_ext' is provided to control the time window precisely. Thus, a multiplexer is used to select between Hit_clear_int and Hit_clear_ext. The selecting signal Hit_clear_sel is set by a boundary-scan register via the JTAG interface.
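The terminal-count detection can be checked with a few lines of code. The sketch below is a behavioural illustration with hypothetical names; it assumes that the internal reset is derived from the counter bits that are 1 in the binary representation of 500, and it ignores the half-cycle sampling delay and any reset latency of the real circuit.

```python
TERMINAL_COUNT = 0b111110100  # 500 decimal: bits 8, 7, 6, 5, 4 and 2 are set


def terminal_reached(count):
    """True when every bit that is 1 in the terminal count is set.

    While counting up from zero, an AND over these bits fires for the
    first time exactly at the terminal count.
    """
    return (count & TERMINAL_COUNT) == TERMINAL_COUNT


def window_length_ns(t_clk_ns=20):
    """Count clock periods from zero until the internal reset fires."""
    count = 0
    while not terminal_reached(count):
        count += 1
    return count * t_clk_ns


if __name__ == "__main__":
    print(TERMINAL_COUNT)          # 500
    print(window_length_ns())      # 10000 ns, i.e. a 10 us window
```

Decoding only the set bits keeps the detection gate small, at the cost of relying on the counter always starting from zero.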

[Figure 5.9 graphic. Two D flip-flops clocked by Clk_50M; a gate detecting Q<8>, Q<7>, Q<6>, Q<5>, Q<4> and Q<2>, i.e. (111110100)B = 500D, generates Hit_clear_int; a multiplexer controlled by Hit_clear_sel selects between Hit_clear_int and Hit_clear_ext to produce Hit_clear; RstB is the reset input.]

Figure 5.9: The reset circuits of the counters.

16-channel readout circuits

The readout circuits, including Hit registers, an encoder and second-stage registers, can be reused and extended to multiple channels. The readout circuits in each channel comprise two parts: coarse readout circuits and fine readout circuits. The coarse readout circuits are composed of two 9-bit Hit registers, a multiplexer and 9-bit second-stage registers. The select signal of the multiplexer comes from the MSB of the fine data. Thus, the sampling clock of the second-stage registers should be delayed until the select signal is ready. The fine readout circuits consist of 32-bit Hit registers, a 32-to-5 encoding circuit and 5-bit second-stage registers.

The data format of the TDC is shown in Figure 5.10. The final 16-bit format includes the coarse data, the fine data and two flag bits named 'Error' and 'Hold'. "Error" indicates whether the converted fine data are valid. "Hold" validates the final data word: if Hold is HIGH, the final data are valid; otherwise, the output data can be ignored. The fields are concatenated into the final time word.

[Figure 5.10 graphic. Data format from MSB to LSB: Coarse Time (MSB to LSB), Fine Time (MSB to LSB), Error, Hold.]

Figure 5.10: The data format of the time words for the proposed TDC.
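A possible software view of the 16-bit time word is sketched below. The field widths (9-bit coarse, 5-bit fine, two flags) follow the text, but the exact bit positions are an assumption based on the field order suggested by Figure 5.10 (coarse time at the MSB side, then fine time, Error, and Hold at the LSB). The function names are hypothetical.

```python
COARSE_BITS = 9
FINE_BITS = 5


def pack_time_word(coarse, fine, error, hold):
    """Pack the fields into 16 bits: [15:7] coarse, [6:2] fine, [1] Error, [0] Hold.

    The bit positions are an assumed layout, not a documented one.
    """
    if not 0 <= coarse < (1 << COARSE_BITS):
        raise ValueError("coarse code out of range")
    if not 0 <= fine < (1 << FINE_BITS):
        raise ValueError("fine code out of range")
    return (coarse << 7) | (fine << 2) | (int(bool(error)) << 1) | int(bool(hold))


def unpack_time_word(word):
    """Return (coarse, fine, error, hold) from a 16-bit time word."""
    coarse = (word >> 7) & ((1 << COARSE_BITS) - 1)
    fine = (word >> 2) & ((1 << FINE_BITS) - 1)
    return coarse, fine, bool((word >> 1) & 1), bool(word & 1)


if __name__ == "__main__":
    word = pack_time_word(coarse=123, fine=17, error=False, hold=True)
    print(f"{word:016b}", unpack_time_word(word))
```

A reader of the data stream would first test Hold, then Error, and only then convert the coarse and fine fields into a time.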

Since the TDC includes multi-channel readout circuits, the readout strategy for the time words of all channels should be taken into account. Parallel-in-serial-out (PISO) registers are adopted as the second-stage registers in this design. The architecture of the PISO registers is shown in Figure 5.11. In the write mode, the data of each channel are written into the registers in parallel. In the read mode, the registers are connected in a serial chain and the data are read out serially. Each one-bit register can be realized by a standard D flip-flop with switches.

The timing of the readout circuits is shown in Figure 5.12. Here, the 64-channel configuration is considered. Clk_Rd is the readout clock, whose frequency is 10 MHz. Hit_clear is the reset signal of each time window; its period is 10 µs. R/Wb is the enable signal that controls the read and write modes of the registers. The 16-bit data of each channel are read out in one period of Clk_Rd. The total readout time is therefore 64 × 100 ns = 6.4 µs for the 64-channel architecture.
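The readout budget can be verified with a short calculation. The sketch below uses hypothetical function names and assumes, as stated in the text, that one channel word is transferred per period of the readout clock.

```python
def readout_time_us(n_channels, f_clk_rd_mhz=10.0, periods_per_channel=1):
    """Time needed to shift out all channel words, in microseconds."""
    return n_channels * periods_per_channel / f_clk_rd_mhz


def fits_in_window(n_channels, window_us=10.0, f_clk_rd_mhz=10.0):
    """Check that the whole readout completes within one time window."""
    return readout_time_us(n_channels, f_clk_rd_mhz) <= window_us


if __name__ == "__main__":
    for channels in (16, 64, 128):
        print(channels, "channels:", readout_time_us(channels), "us,",
              "fits" if fits_in_window(channels) else "does not fit")
```

With a 10 MHz readout clock, 64 channels leave a margin of 3.6 µs in the 10 µs window, whereas 128 channels would not fit without a faster clock or a second readout chain.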


[Figure 5.11 graphic. Chain of D flip-flops DFF7 to DFF0 with parallel data inputs, clock pulses and a serial data output.]

Figure 5.11: The architecture of the PISO registers used in the TDC. In the write mode, the data from the Hit registers and the encoder are written in parallel into the registers. In the read mode, the data are then read out serially. Each one-bit register can be realized by a standard D flip-flop with transmission-gate switches.

[Figure 5.12 graphic. Waveforms of Clk_Rd, Hit_clear and R/Wb over cycles N and N+1; channels 1 to 64 are read out during 6.4 µs of each 10 µs window.]

Figure 5.12: The timing of the readout method for 64-channel converted data.

5.2.3 Experimental results and discussions

A 16-channel prototype (IMOTEPD) was designed in 0.35 µm CMOS technology. The layout of IMOTEPD is shown in Figure 5.13. The prototype integrates a DLL with 32 delay cells, two 9-bit binary counters, 16-channel digital readout circuits and a JTAG controller.

The TDC was measured with a reference clock of 50 MHz, and a bin size of 625 ps was obtained. The differential nonlinearity is shown in Figure 5.14. The typical DNL of IMOTEPD is ± 0.35 LSB, which corresponds to a DNL error of 218.75 ps. The contributions to the DNL include the jitter of the input clock, the jitter of the DLL, and the noise and mismatches of the circuits. Note that the worst DNL occurs at both ends of the output code range. This is mainly due to the offset delay between the first and the last outputs of the DLL. The measured offset is about 200 ps. Besides, the power dissipation is 28.8 mW, which corresponds to about 1.8 mW/channel.


Figure 5.13: The layout of the TDC prototype named IMOTEPD. The prototype integrates a DLL that uses the circuits described in Section 4.2 of Chapter 4, two 9-bit binary counters, 16-channel digital readout circuits and a JTAG controller.

Figure 5.14: The differential nonlinearity of IMOTEPD. The maximum value is ± 0.35 LSB (where 1 LSB = 625 ps).

Based on the design and characteristics of IMOTEPD, the circuits of the 625-ps TDC were optimized and then embedded into a 64-channel front-end ASIC named "IMOTEPAD". In addition to the DLL and the two counters, the TDC part consists of 64-channel readout circuits, PISO registers and a JTAG controller. The design challenges of the TDC circuits lie in the optimization of the conversion linearity and in the 64-channel layout of the circuits.

The improved nonlinearity of the fine conversion is shown in Figure 5.15 and Figure 5.16. The results were obtained by a code density test on a collection of 640,000 events (about 20,000 points for each digital code), sampled at 27 °C with a power supply of 3.3 V. The DNL of the fine conversion is ± 106.2 ps, which corresponds to ± 0.17 LSB. The INL of the fine conversion is ± 193.7 ps, which corresponds to ± 0.31 LSB. The test results show that the proposed coarse-fine TDC achieves good linearity.
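The code density test mentioned above can be sketched as follows. This is a generic textbook formulation, not necessarily the exact procedure used for the measurements: hits uniformly distributed in time are histogrammed per output code, the DNL of each code is its relative deviation from the mean count, and the INL is the running sum of the DNL. The histogram in the example is synthetic, not measured data.

```python
def dnl_inl_from_histogram(hist):
    """Compute DNL and INL (in LSB) from a code density histogram.

    hist[c] is the number of hits recorded for output code c when the
    hit times are uniformly distributed over the measurement range.
    """
    ideal = sum(hist) / len(hist)          # expected count per code
    dnl = [count / ideal - 1.0 for count in hist]
    inl = []
    running = 0.0
    for d in dnl:
        running += d                       # INL = cumulative sum of DNL
        inl.append(running)
    return dnl, inl


if __name__ == "__main__":
    # Synthetic example: 32 codes, 640,000 hits, one wide and one narrow bin.
    hist = [20_000] * 32
    hist[3], hist[4] = 23_400, 16_600
    dnl, inl = dnl_inl_from_histogram(hist)
    bin_ps = 625.0
    print("max |DNL| =", max(abs(d) for d in dnl), "LSB =",
          max(abs(d) for d in dnl) * bin_ps, "ps")
    print("max |INL| =", max(abs(i) for i in inl), "LSB")
```

With about 20,000 hits per code, the statistical uncertainty of each bin count is below 1 %, so deviations of a few percent of an LSB can be resolved.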

Figure 5.15: The measured differential nonlinearity (DNL) of the TDC circuits built in IMOTEPAD. The DNL of the fine conversion is ± 106.2 ps, which corresponds to ± 0.17 LSB for a bin size of 625 ps.

The test results show that the proposed coarse-fine TDC is very suitable for PET imaging due to its multi-channel architecture, good linearity and low power. However, the resolution of the TDC is only 625 ps, which is not fine enough for TOF PET imaging applications.

Figure 5.16: The measured integral nonlinearity (INL) of the TDC circuits built in IMOTEPAD. The INL of the fine conversion is ± 193.7 ps, which corresponds to ± 0.31 LSB for a bin size of 625 ps.

5.3 Design of a multi-channel TDC based on a DLL array

With the coarse-fine TDC architecture, several methods can be used to achieve a smaller bin size. Firstly, one can increase the frequency of the reference clock. For example, half the bin size (312.5 ps) can be achieved with a 100 MHz clock. However, the operating range of the proposed DLL does not support frequencies above 75 MHz, so this method is limited. Secondly, one can reduce the bin size by increasing the number of delay cells in the DLL while keeping the reference clock at 50 MHz. Nevertheless, the number of delay cells is limited by their minimum delay time. Moreover, the mismatch of the delay cells does not allow long chains of delay cells in the DLL. Thirdly, time interpolation can be used to improve the precision. A TDC based on a DLL array is a good solution for PET imaging. Since only the single DLL is replaced by a DLL array, a new TDC with a smaller bin size can easily be realized on the basis of the coarse-fine architecture.

5.3.1 Time interpolation using a DLL array

The architecture of a DLL array [83] is shown in Figure 5.17. Two kinds of DLL are used to construct the array for the time interpolation. The input clock is connected to the input of the vertical DLL. The inputs of the horizontal DLLs are connected to the outputs of the vertical DLL. If each DLL is locked to one clock period, the outputs of the horizontal DLLs can be reorganized to obtain a smaller time interval unit, which equals the difference between the cell delays of the two kinds of DLL.

Assuming that the numbers of delay cells in the two kinds of DLL are m and n (m < n), respectively, when all DLLs are locked the cell delays of the two kinds of DLL are given as

$$T_m = \frac{T_{clk}}{m} \qquad (5.11)$$

$$T_n = \frac{T_{clk}}{n} \qquad (5.12)$$


[Figure 5.17 graphic. A vertical DLL with m delay cells, whose outputs feed the inputs of horizontal DLLs with n delay cells each; clock and delayed-clock connections are indicated.]

Figure 5.17: The topology of a DLL array [120, 83]. Two kinds of DLL are used to construct the array. The cell delays of the two kinds of DLL are Tm and Tn, respectively. The bin size of the DLL array (ADLL) is given by the difference between Tm and Tn (where Tm > Tn). Each DLL is locked to one clock period.

where Tm and Tn are the cell delays of the two kinds of DLL and Tclk is the period of the input clock. The time interval of the clocks generated by the horizontal DLL array is given as

$$\Delta t = T_m - T_n \qquad (5.13)$$

The number of horizontal DLLs is constrained by the following equation to ensure that the time interpolation is correct.

$$F = \frac{T_{clk}}{\Delta t \cdot n} = \frac{m}{n - m} \qquad (5.14)$$

where F is the number of horizontal DLLs with n delay cells. Equation 5.14 can also be rewritten as

$$\frac{m}{n} = \frac{F}{F + 1} \qquad (5.15)$$

Note that, in reduced form, the denominator of the ratio of m to n exceeds the numerator by 1. Once Tclk and ∆t are given, the values of F, m and n can be obtained from the above equations. For example, assuming Tclk = 10 ns and ∆t ≤ 100 ps, and taking F = 4, we have

$$F = 4 \Rightarrow \frac{m}{n} = \frac{4}{5} \qquad (5.16)$$

Assuming m = 4k and n = 5k, where k is an integer and k > 1,

$$\Delta t = 10\,\text{ns} \cdot \left(\frac{1}{4k} - \frac{1}{5k}\right) \le 100\,\text{ps} \Rightarrow k \ge 5 \qquad (5.17)$$

If k = 5, then m = 20 and n = 25.
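The parameter search in this example can be reproduced with exact rational arithmetic. The sketch below is illustrative; the function name `array_parameters` is hypothetical, and F is treated as a given design choice.

```python
from fractions import Fraction


def array_parameters(t_clk_ps, dt_max_ps, f):
    """Find the smallest k such that m = F*k, n = (F+1)*k meets dt <= dt_max.

    Returns (k, m, n, dt_ps) with dt = T_clk/m - T_clk/n as an exact Fraction.
    """
    k = 1
    while True:
        m, n = f * k, (f + 1) * k
        dt = Fraction(t_clk_ps, m) - Fraction(t_clk_ps, n)
        if dt <= dt_max_ps:
            return k, m, n, dt
        k += 1


if __name__ == "__main__":
    k, m, n, dt = array_parameters(t_clk_ps=10_000, dt_max_ps=100, f=4)
    print(k, m, n, float(dt))          # 5 20 25 100.0
    print("F check:", Fraction(m, n - m))
```

For Tclk = 10 ns and F = 4 the bin size is 500 ps / k, so k = 5 is the first value that reaches 100 ps, in agreement with the result above.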

Nevertheless, the number of delay cells in both kinds of DLL is also constrained by the process parameters and by the jitter performance of the DLLs. The minimum delay of a delay cell is generally limited by the process. If τp denotes the minimum delay for a given process, Tm and Tn must satisfy

$$T_m \ge \tau_p \Rightarrow \frac{T_{clk}}{m} \ge \tau_p \Rightarrow m \le \frac{T_{clk}}{\tau_p} \qquad (5.18)$$

In the same way, we have

$$n \le \frac{T_{clk}}{\tau_p} \qquad (5.19)$$

For example, τp is approximately 150 ps for 0.35 µm technology, so Tclk/τp ≈ 66.7 and m and n cannot exceed 66 if Tclk = 10 ns.

In addition, the worst-case jitter occurs at the last delay cell of the delay line in a DLL because of jitter accumulation. The jitter of the delayed clocks in the array is worse than that of a single DLL. The jitter of the last output in the DLL array can be given as

$$\delta_{OUT} \approx \sqrt{\delta_{in}^2 + \delta_{VCDL,m}^2 + \delta_{Vc,m}^2 + \delta_{VCDL,n}^2 + \delta_{Vc,n}^2} \qquad (5.20)$$

where δin is the jitter due to the input clock, δVCDL,m and δVCDL,n are the jitter due to the VCDLs in the DLL with m delay cells and the DLL with n delay cells, respectively, and δVc,m and δVc,n are the jitter generated by the loop filters of these two DLLs, respectively. The RMS jitter is worse than that of a single DLL given in Equation 4.7. Thus, the jitter should not exceed a given fraction of the time unit of the DLL array. That is,

$$\sqrt{\delta_{in}^2 + \delta_{VCDL,m}^2 + \delta_{Vc,m}^2 + \delta_{VCDL,n}^2 + \delta_{Vc,n}^2} \le \eta \cdot \Delta t \qquad (5.21)$$

where η is the error tolerance ratio for the time measurement. Normally, this value is set to less than 5 % in engineering practice.
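The two constraints discussed above, the minimum cell delay and the jitter budget, can be expressed as simple checks. The sketch below uses hypothetical function names; the jitter values in the example are placeholders chosen for illustration, not simulated or measured results.

```python
import math


def max_delay_cells(t_clk_ps, tau_p_ps):
    """Largest integer number of cells with T_clk / cells >= tau_p."""
    return math.floor(t_clk_ps / tau_p_ps)


def output_jitter_ps(d_in, d_vcdl_m, d_vc_m, d_vcdl_n, d_vc_n):
    """Root-sum-square of the independent jitter contributions."""
    return math.sqrt(d_in**2 + d_vcdl_m**2 + d_vc_m**2 + d_vcdl_n**2 + d_vc_n**2)


def jitter_within_budget(jitter_ps, dt_ps, eta=0.05):
    """Check the budget: jitter must not exceed eta times the bin size."""
    return jitter_ps <= eta * dt_ps


if __name__ == "__main__":
    print("max cells:", max_delay_cells(10_000, 150))
    # Placeholder contributions in ps (illustrative only):
    jitter = output_jitter_ps(2.0, 1.0, 1.0, 1.0, 1.0)
    print("output jitter:", round(jitter, 2), "ps")
    print("within 5 % of a 100 ps bin:", jitter_within_budget(jitter, 100.0))
```

Because the contributions add in quadrature, the largest single term dominates the total, so reducing the input-clock jitter in this placeholder example would help more than reducing any one DLL term.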

In this study, a DLL array consisting of five DLLs is proposed. The parameters of the DLL array are F = 4, m = 28 and n = 35. Thus, 140 delayed clocks can be generated in one clock period. The principle of the time interpolation using a DLL array is shown in Figure 5.18. The vertical DLL generates 28 delayed clocks with a delay step of 5∆t in one clock period. Each horizontal DLL generates 35 delayed clocks with a delay step of 4∆t in one clock period. Thus, clocks with a delay step of ∆t can be obtained by reorganizing the output clocks of the horizontal DLLs.

The achieved time unit is determined by the frequency of the input clock. Figure 5.19 shows the bin size versus the clock frequency. The time unit varies from 178 ps to 60 ps as the clock frequency increases from 40 MHz to 120 MHz. This indicates that, with a DLL array, a very small time unit can be achieved without requiring a fast technology.
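The reorganization of the horizontal DLL outputs can be modelled as a lookup table from phase index to (DLL, tap). The sketch below assumes ideal locking and one particular indexing convention: horizontal DLL i is fed by the vertical DLL output delayed by 5i bins, and tap j is taken after j + 1 cells of 4 bins each. The function names are hypothetical.

```python
def build_lookup(f=4, m=28, n=35):
    """Map each of the F*n phases to the (horizontal DLL, tap) producing it."""
    n_phases = f * n                   # 140 phases per clock period
    step_v = n_phases // m             # vertical DLL cell delay: 5 bins
    step_h = n_phases // n             # horizontal DLL cell delay: 4 bins
    lookup = [None] * n_phases
    for i in range(f):                 # index of the horizontal DLL
        for j in range(n):             # tap index inside that DLL
            phase = (i * step_v + (j + 1) * step_h) % n_phases
            lookup[phase] = (i, j)
    return lookup


def bin_size_ps(f_clk_mhz, n_phases=140):
    """Bin size of the array for a given input clock frequency."""
    return 1e6 / (f_clk_mhz * n_phases)


if __name__ == "__main__":
    table = build_lookup()
    print("all phases covered:", None not in table)
    print("phases 0..3 come from:", table[:4])
    for f_mhz in (40, 100, 120):
        print(f_mhz, "MHz ->", round(bin_size_ps(f_mhz), 1), "ps")
```

Every phase from 0 to 139 is produced exactly once because the four input offsets (0, 5, 10, 15 bins) fall into four different residue classes modulo the 4-bin step of the horizontal DLLs.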


[Figure 5.18 graphic. DLL28 produces dly0, dly1, dly2 and dly3 in steps of 5∆t, with Tclk = 140∆t. DLL35 #0 to #3 each produce taps in steps of 4∆t (for example 4, 8, 12, ..., 136 for #0 and 1, 5, 9, ..., 137 for #1). Interpolating the four DLL35 outputs gives the DLL array phases 0, 1, 2, 3, ..., 139 with τ = ∆t.]

Figure 5.18: The principle of the time interpolation using a DLL array.

Figure 5.19: Bin size versus the frequency of the input clock.

5.3.2 Proposed TDC based on a DLL array

Architecture

The architecture of a multi-channel TDC based on a DLL array and the dual-counter architecture is shown in Figure 5.20. The TDC is composed of a DLL array, two 10-bit Gray-code counters, 64-channel readout circuits and parallel-in-serial-out (PISO) registers. This architecture was first studied in [91, 83], where it was used to obtain an 89 ps resolution with an 80 MHz clock in a commercial 0.7 µm technology. In this study, all circuits are designed in AMS 0.35 µm CMOS technology. Moreover, low-jitter DLL circuit techniques are adopted. The objective of the proposed TDC based on a DLL array is to achieve a bin size of 71 ps with a 100-MHz clock.


[Figure 5.20 graphic. Array of DLLs: one DLL with N = 28 cells (tm = 5∆t, Tdelay = 140∆t) feeding four DLLs with N = 35 cells (tn = 4∆t, Tdelay = 140∆t), each with a phase comparator ΦC, producing 140 phases. Two 10-bit Gray-code counters (Counter#1, Counter#2) are driven by Clk_100M and Resetb. The readout circuits for channels #0 to #N, triggered by Hit<0> to Hit<N>, deliver 18-bit time words to the parallel-in-serial-out registers.]

Figure 5.20: The architecture of the multi-channel TDC based on a DLL array.

In this design, a Gray-code counter is proposed. Since only one bit changes at each counting transition, the Gray-code counter consumes less power and generates less noise than a binary counter with the same resolution. Besides, the Gray-code counter can operate faster than the binary counter thanks to this characteristic. The architecture of the coarse conversion using dual Gray-code counters is similar to the architecture using dual binary counters in IMOTEPD. However, a Gray-to-binary converter must be used to obtain a binary-code representation.
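The Gray-code property and the Gray-to-binary conversion can be illustrated in software. The sketch below uses the standard reflected binary Gray code, which is an assumption: the text does not state which Gray code the counter implements. The function names are hypothetical.

```python
def binary_to_gray(value):
    """Reflected binary Gray code of a non-negative integer."""
    return value ^ (value >> 1)


def gray_to_binary(gray):
    """Inverse conversion: XOR of all right shifts of the Gray code."""
    value = 0
    while gray:
        value ^= gray
        gray >>= 1
    return value


if __name__ == "__main__":
    for count in range(6):
        g = binary_to_gray(count)
        print(count, format(g, "010b"), gray_to_binary(g))
```

Since consecutive codes differ in a single bit, a Hit that samples the counter during a transition can be wrong by at most one count, whereas a binary counter sampled mid-transition may return an arbitrary mixture of old and new bits.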

DLL array circuits

The design criteria of the DLL array are low jitter and low phase offset of the DLLs. The circuit techniques for low-jitter DLLs have been discussed in Chapter 4. Here, the optimized DLL techniques of Section 4.3 are employed to construct the array. In addition, since the output clocks of the horizontal DLLs are used for the time interpolation, the phase offset of the DLLs should be made as small as possible. Moreover, since the horizontal DLLs have the same architecture, design reuse is an effective way to reduce the number of components and the power dissipation. The bias current circuits and the start controller can be shared by all DLLs. Besides, since the circuits suffer from unavoidable mismatch in the fabrication process, mismatch should be taken into account in the layout design of the delay cells. The diagram of the proposed DLL array is shown in Figure 5.21. The 140 output clocks generated by the four DLL35s are reordered using a lookup list for the multiphase sampling.

[Figure 5.21 labels: DLL28; DLL35#0 to DLL35#3; bias circuits (Ibias); start control; signals Clk_ref, Reset, Clk_dly0 to Clk_dly3, Clk0<34:0> to Clk3<34:0>; lookup list producing Clk_out<139:0>.]

Figure 5.21: The diagram of the proposed DLL array.
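The lookup list that reorders the 140 output clocks can be sketched as follows. The sketch assumes that each DLL35 tap adds 4∆t, that successive DLL35s start 5∆t apart (one DLL28 cell), and that phases wrap modulo 140∆t; the names `phase_index` and `build_lookup_list` are hypothetical, and the ordering used on the chip may differ.

```python
N_DLL35 = 4        # number of horizontal DLLs (DLL35#0 to DLL35#3)
TAPS = 35          # delay cells per DLL35
PHASES = 140       # total number of clock phases in one period
START_STEP = 5     # assumed start offset between DLL35s, in units of delta-t
TAP_STEP = 4       # assumed delay of one DLL35 cell, in units of delta-t


def phase_index(dll: int, tap: int) -> int:
    """Phase (in units of delta-t) of a given tap, wrapped to one period."""
    return (START_STEP * dll + TAP_STEP * tap) % PHASES


def build_lookup_list():
    """Map every phase index 0..139 to the (dll, tap) pair that produces it."""
    lookup = [None] * PHASES
    for dll in range(N_DLL35):
        for tap in range(TAPS):
            idx = phase_index(dll, tap)
            if lookup[idx] is not None:
                raise ValueError("two taps map to the same phase")
            lookup[idx] = (dll, tap)
    return lookup


lookup_list = build_lookup_list()
```

Because the start offsets 0, 5, 10 and 15 have different remainders modulo 4, every phase from 0 to 139 is produced by exactly one tap under these assumptions.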

Pipeline readout circuits with DCC circuits

A pipeline readout strategy is adopted in this design to support multi-channel operation. In the coarse conversion, two 10-bit registers, a Gray-to-binary encoder and a 10-bit BSR register are employed for state sampling, data conversion and storage, respectively. In the fine conversion, a duty-cycle-correction (DCC) circuit, a 140-bit register, a thermometer-to-binary converter and an 8-bit register are employed. A 10-MHz clock (ClkRd) is selected as the reading clock for the second-stage registers. An enable signal (R/Wb) controls the READ and WRITE operations of the registers. The readout circuits for the fine conversion are shown in Figure 5.22.
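To make the relation between the coarse and fine results concrete, the following sketch combines a 10-bit coarse count and an 8-bit fine code into one time value. The packing order, the 10-ns clock period and the names `pack_time_word` and `time_word_to_seconds` are assumptions made for illustration; the selection between the two counters is not modeled.

```python
PHASES = 140        # fine bins per reference clock period
COARSE_BITS = 10    # width of the coarse (counter) result
FINE_BITS = 8       # width of the fine (interpolation) result


def pack_time_word(coarse: int, fine: int) -> int:
    """Concatenate the coarse count and the fine code into one word."""
    if not 0 <= coarse < (1 << COARSE_BITS):
        raise ValueError("coarse count out of range")
    if not 0 <= fine < PHASES:
        raise ValueError("fine code out of range")
    return (coarse << FINE_BITS) | fine


def time_word_to_seconds(word: int, clock_period_s: float = 10e-9) -> float:
    """Convert a packed word to a time, one fine bin being Tclk / 140."""
    coarse = word >> FINE_BITS
    fine = word & ((1 << FINE_BITS) - 1)
    return (coarse * PHASES + fine) * clock_period_s / PHASES


word = pack_time_word(3, 70)
elapsed = time_word_to_seconds(word)  # three and a half clock periods
```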

Generally, the delayed clocks from the VCDL in the DLL array have duty-cycle errors due to the current mismatch between the PMOS and NMOS transistors in the delay cells. If the duty cycle is not 50 %, an error will occur in the encoding operation. To solve this problem, a DCC circuit should be inserted when the input signal is a clock. In our design, a frequency divider using a D flip-flop is proposed. This scheme is the simplest one for the delayed clocks from a DLL because the clock periods are constant. Dividing the clock frequencies does not influence the characteristics of the time interpolation; on the contrary, the DCC circuit makes the encoding operation easier. The disadvantage of this method is that the nonlinearity of the TDC is affected by the mismatch of the D flip-flops. In our design, a 140-bit DCC is placed in each pipeline readout chain.
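The effect of the divide-by-two flip-flop on the duty cycle can be checked with a small sampled-waveform model. This is a behavioral sketch only; the sample-based representation and the names `divide_by_two` and `duty_cycle` are assumptions.

```python
def divide_by_two(clock_samples):
    """Toggle the output on every rising edge of the sampled input clock."""
    output = []
    q = 0
    previous = 0
    for sample in clock_samples:
        if sample == 1 and previous == 0:
            q ^= 1  # a D flip-flop with Qb fed back to D toggles here
        output.append(q)
        previous = sample
    return output


def duty_cycle(samples):
    """Fraction of samples at logic high."""
    return sum(samples) / len(samples)


# Input clock with a 30 % duty cycle: 3 high and 7 low samples per period.
distorted_clock = ([1] * 3 + [0] * 7) * 8
corrected_clock = divide_by_two(distorted_clock)
```

Since the output only changes on rising edges and the input period is constant, the high and low phases of the divided clock each last exactly one input period, whatever the input duty cycle.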

Thermometer-to-Binary converter

With the above DCC circuit, the sampled states of the 140-phase clocks fall into two cases. The sampled data in each case are pure thermometer codes. The lookup table of the thermometer-to-binary conversion is listed in Table 5.1. The sampled data from the first delayed clock (Clk_dly_0) serves as a selection signal and the sampled data of


[Figure 5.22 labels: Hit; Clk_ref; Clk_out; Clk_dly_0, Clk_dly_1, ..., Clk_dly_139; DCC; D flip-flops (D, Q, Qb); encoder (thermometer-to-binary conversion); Fine_Registers; 8-bit Fine_data; Hit_fine.]

Figure 5.22: Proposed pipeline readout circuits for the fine conversion in the prototype TDC. An open-loop duty-cycle-correction (DCC) circuit should be placed when the delayed signal is a clock.

the other delayed clocks (Clk_dly_[1:139]) are used for the thermometer-to-binary conversion. The thermometer-to-binary encoder is realized with multiplexer-based circuits because of their simple architecture and relatively few elements. With this architecture, the total conversion time is about 5.1 ns.

Table 5.1: Lookup table of the thermometer-to-binary conversion

Thermometer patterns
Case A         Case B         Binary code
1000...0000    0111...1111    0
1100...0000    0011...1111    1
1110...0000    0001...1111    2
1111...0000    0000...1111    3
...            ...            ...
1111...1000    0000...0111    136
1111...1100    0000...0011    137
1111...1110    0000...0001    138
1111...1111    0000...0000    139
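The mapping of Table 5.1 can be expressed in a few lines of software. The sketch below is a behavioral model, not the multiplexer-based encoder itself; the function name `thermometer_to_binary` and the list-of-bits input format are assumptions.

```python
def thermometer_to_binary(bits):
    """Decode a sampled pattern according to Table 5.1.

    bits[0] is the sample of Clk_dly_0 and selects the case:
    1 means Case A (leading ones), 0 means Case B (leading zeros).
    The code is the length of the leading run minus one.
    """
    if not bits:
        raise ValueError("empty pattern")
    select = bits[0]
    run = 0
    for bit in bits:
        if bit != select:
            break
        run += 1
    # A pure thermometer code has no further bit equal to the selected value.
    if any(bit == select for bit in bits[run:]):
        raise ValueError("not a pure thermometer code")
    return run - 1


code_a = thermometer_to_binary([1] * 137 + [0] * 3)   # Case A row for 136
code_b = thermometer_to_binary([0] * 137 + [1] * 3)   # Case B row for 136
```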


5.3.3 Experimental results and discussion

A three-channel prototype chip (named 'TDC-ADLL') based on the proposed architecture and circuit techniques has been designed in the AMS 0.35 µm CMOS IP4M technology. This chip integrates a DLL array composed of four DLLs with 35 delay cells and one DLL with 28 delay cells, two 10-bit Gray-code counters, 3-channel readout circuits, and a JTAG controller. The photo of the prototype chip is shown in Figure 5.23. The die size is 3.6 mm × 2.5 mm.

[Figure 5.23 labels: bias circuits; counters; die dimensions 3.6 mm × 2.5 mm.]

Figure 5.23: The photo of the three-channel high-resolution TDC prototype (TDC-ADLL). In this chip, a DLL array consisting of five DLLs generates 140 clock phases in one clock period. With a clock of 100 MHz, a typical resolution of 71 ps is obtained.

Test setup

To test the proposed TDC based on a DLL array, a test module is designed and implemented on a six-layer PCB, which is shown in Figure 5.24. Due to the limited number of pads in the prototype chip, only DLL28 and DLL35〈0〉 in the DLL array can be tested by our test module. The initial states and the test process are controlled through a JTAG interface. The data are directly written or read via custom software running on the PC. The voltage signals are measured by a high-precision oscilloscope. The writing and reading of digital signals are performed by a Pattern Generator and a Logic Analyzer, respectively. In addition, the test of the DLLs requires a Serial Data Analyzer to measure the jitter performances.

DLL characteristics

The operational range of both DLL28 and DLL35 is 50 MHz to 120 MHz. This range is limited by the characteristics of the VCDL. With a reference clock of 100 MHz, the


Figure 5.24: The photo of the test board of the high-resolution TDC.

delay times per cell of DLL28 and DLL35 are 357.1 ps and 285.7 ps, respectively. The test waveform of DLL35〈0〉 is shown in Figure 5.25. The jitter performances are measured at

Figure 5.25: Test waveform of the first output and the last output of DLL35〈0〉 when the frequency of the clock is 100 MHz.

the last output of each DLL by a Serial Data Analyzer. Both the cycle-to-cycle jitter and the peak-to-peak jitter are measured with a charge/discharge current of 40 µA. The RMS jitter of DLL28 and DLL35〈0〉 is 7.2 ps and 7.9 ps, respectively. The peak-to-peak jitter of DLL28 and DLL35〈0〉 is 19.8 ps and 20.2 ps, respectively.

The delay errors between two adjacent clocks in each DLL are tested by scanning the Hit signal with a 10 ps interval. The data are read out via the JTAG interface. For each clock, 1000 test points are collected in one clock period. The test results are shown in Figure 5.26. When the frequency of the reference clock is 100 MHz, the maximum delay errors in DLL35〈0〉 to DLL35〈3〉 are 12.5 ps, 22.4 ps, 23.7 ps and 34.2 ps, respectively. The RMS delay errors are 7.3 ps, 9.2 ps, 10.1 ps and 13.0 ps, respectively. The values increase from DLL35〈0〉 to DLL35〈3〉 due to the jitter accumulation in DLL28 and the mismatch among the four DLLs.

Figure 5.26: Test results of the DLL35s when the frequency of the clock is 100 MHz.

Nonlinearity

The nonlinearity of the conversion is obtained by a code density test, in which events are collected with a random sampling signal at a temperature of 27 °C and a power supply of 3.3 V. The DNL and INL performances are shown in Figure 5.27. The maximum values of DNL and INL are 0.58 LSB and 0.63 LSB, respectively.
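For reference, a code density evaluation can be sketched as follows. The sketch assumes that the random hits are uniformly distributed over the clock period, so that every bin ideally collects the same number of events, and it takes the INL as the running sum of the DNL, which is one common convention; the function name `code_density_dnl_inl` is hypothetical and this is not the analysis script used for Figure 5.27.

```python
def code_density_dnl_inl(histogram):
    """Compute DNL and INL (in LSB) from the hit count of each output code."""
    total = sum(histogram)
    n_codes = len(histogram)
    if n_codes == 0 or total == 0:
        raise ValueError("histogram must contain events")
    ideal = total / n_codes                     # expected hits per bin
    dnl = [count / ideal - 1.0 for count in histogram]
    inl = []
    accumulated = 0.0
    for value in dnl:
        accumulated += value                    # INL as cumulative DNL
        inl.append(accumulated)
    return dnl, inl


# Small illustrative histogram with four codes (not measured data).
dnl, inl = code_density_dnl_inl([90, 110, 100, 100])
```

In this toy example the first bin is 10 % too narrow and the second 10 % too wide, so the INL returns to zero after the second code.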

Conversion time

In the proposed prototype chip, the conversion time of the TDC is mainly determined by the circuit delay and the signal processing in the samplers, the encoders and the readout circuits. The measured conversion time is less than 100 ns, which corresponds to 10 Tclk. This value is the same as that of the TDC using a single DLL.

Power dissipation

Finally, the total power dissipation of the prototype is 129 mW, of which 23 mW is static power and 106 mW is dynamic power. The static power is mainly produced by


Figure 5.27: DNL and INL of the TDC using the DLL array embedded in the 3-channel TDC. The data are obtained from events collected with a random sampling signal at a temperature of 27 °C and a power supply of 3.3 V.

the DLL array. This value can be reduced by increasing the dimensions of the transistors in the delay cells or by using a more advanced CMOS technology.

Table 5.2: Overall performances of the proposed multi-channel TDCs

Items                          Prototype I (IMOTEPD)       Prototype II (TDC-ADLL)
Number of channels             16                          3
Die size                       3.7 mm × 8.3 mm             3.6 mm × 2.5 mm
Dynamic range                  10 µs (maximum)             10 µs (maximum)
Reference clock                50 MHz (typical)            100 MHz (typical)
Bin size (LSB)                 625 ps (typical)            71 ps (typical)
Number of bits                 16                          20
Conversion time                ≤ 10 Tclk                   ≤ 10 Tclk
RMS jitter of DLLs             ≤ 50 ps                     ≤ 7 ps
Peak-to-peak jitter of DLLs    ≤ 120 ps                    ≤ 21 ps
DNL                            ± 0.17 LSB                  ± 0.58 LSB
INL                            ± 0.35 LSB                  ± 0.63 LSB
Power dissipation              28.8 mW (1.8 mW/channel)    23 mW (7.6 mW/channel)


5.4 Conclusions

This chapter presents the design techniques of multi-channel coarse-fine TDCs using a single DLL or a DLL array for PET imaging. The wide range is achieved by counter-based circuits. The precision depends on multiphase sampling based on low-jitter DLL techniques. In the proposed TDCs, the key technique is the generation of multiphase clocks and the strategy of time interpolation. Both the charge-pump DLL and the DLL array are discussed. The former can achieve a resolution of several hundred picoseconds and the latter can obtain a bin size of several tens of picoseconds.

To evaluate the performances, two prototype chips are designed and tested. The overall performances of these prototype chips are listed in Table 5.2. The results show that this kind of TDC can achieve a wide measurement range, high precision and multiple channels. Moreover, the proposed TDCs offer high conversion speed, good conversion linearity and low power.

Chapter 6

Design of a Multi-Channel Time-Based Analog-to-Digital Converter

In the previous work, the ADC function was not realized in the front-end readout chip (IMOTEPAD). Instead, a discrete 14-bit 20-MSamples/s ADC follows each front-end ASIC to quantize the 64-channel voltage signals. Although this scheme can perform the digitization, the precision of the voltage signals is usually degraded by the charge injection of the MOS switches in the 64-to-1 multiplexer and by the sample-and-hold operation in the ADC. Meanwhile, the synchronization of the output data makes the PCB design very difficult. To overcome these problems, an integrated multi-channel ADC is proposed in this chapter to replace the discrete ADC.

For the analog-to-digital interface of an imaging system shown in Figure 6.1(a), three schemes are widely used. The first one is the architecture shown in Figure 6.1(b). It uses a parallel single-channel ADC in each front-end readout channel and a digital multiplexer. Slow or moderate-speed ADCs such as SAR architectures can satisfy the design requirements in this architecture. However, this method occupies a large die area in the ASIC and dissipates a large amount of power.

In addition, Figure 6.1(c) shows the method using an analog multiplexer and a high-speed ADC such as a flash or pipeline architecture. Flash ADCs, which dissipate a large amount of power, are not suitable for a high-resolution design. Although pipeline ADCs can achieve good tradeoffs between speed, resolution and power, their implementations are very complex in CMOS technologies. Meanwhile, using a single pipeline ADC requires an analog multiplexer to realize the parallel-to-serial conversion of the multiple voltage signals. The operations in this multiplexer degrade the precision of the voltage signals.

The third scheme is the analog-to-digital interface using a multi-channel ADC shown

in Figure 6.1(d). The multi-channel ADC can save both the die area and the power dis-

sipation. Moreover, the reference clock and the enable signals can be easily synchronized

in the integrated multi-channel ADC.

A Wilkinson ADC is widely used in the front-end electronics of imaging applications because of its low power dissipation and because it is easily extended to multiple


[Figure 6.1 labels: (a) particles, detectors, front-end readout chains (channels #1 to #N with amplifiers A and Av), analog-to-digital interface with inputs Vin<1> to Vin<N>, digital signal processing (DSP), n-bit output to PC; single-channel ADCs ADC(#1) to ADC(#N) followed by a multiplexer with n-bit output ADC_out; switches SW<1> to SW<N> followed by a high-speed ADC with output ADC_out; multi-channel ADC (channels #1 to #N) followed by a multiplexer with output ADC_out.]

Figure 6.1: Design considerations of the ADC for imaging detector systems. (a) Block diagram of typical front-end electronics with ADC technology; (b) the analog-to-digital interface using parallel single-channel ADCs; (c) the analog-to-digital interface using a single high-speed ADC; (d) the analog-to-digital interface using a multi-channel ADC.

channels. In this method, the analog-to-digital conversion is divided into two steps: the voltage-to-time conversion and the time-to-digital conversion. The voltage-to-time conversion is implemented with a comparator and a ramp generator. In the conventional ramp ADC, the time-to-digital conversion is realized by a high-resolution counter with a stable reference clock. This architecture suffers from a long conversion time, which limits its applications. Usually, a high-frequency clock is used to improve the conversion speed. However, the clock frequency is limited by the CMOS technology used. As an alternative, multiphase sampling techniques are proposed for high-precision time-to-digital conversion. The sampling rate is improved by using a moderate-resolution counter and DLL-based circuits. This Wilkinson ADC is an important representative of the family of "time-based ADCs".

In this study, a multi-channel time-based ADC is proposed to construct the analog-to-digital interface. The main specifications of the proposed ADC are as follows.

• Number of channels: 64

• Resolution: 12 bits (typical)

• Sampling rate: ∼ 1 MHz

• Power dissipation: < 1 mW/channel

6.1 Overview of Time-based ADCs

A time-based ADC is a converter that quantizes analog signals such as voltage and current in units of delay time. Its diagram is shown in Figure 6.2. Generally, the operation of a time-based ADC is divided into two steps, voltage-to-time conversion and time-to-digital conversion. According to this concept, pulse-width-modulation ADCs, VCDL-based ADCs, VCO-based ADCs, classic Wilkinson ADCs and their improved architectures can all be classified as time-based ADCs [125, 126].

[Figure 6.2 labels: Input; Analog-to-Time Conversion (ATC); Start and Stop; Time-to-Digital Conversion (TDC); output 0101...0101.]

Figure 6.2: Diagram of the time-based ADC.

6.1.1 Pulse-width-modulation ADC

The pulse-width-modulation ADC is one of the oldest ADC architectures. It is described in a patent [127] and in The Data Converter Handbook from Analog Devices [128]. This ADC used a sampling pulse to take a sample of the analog signal. The sampled voltage was then changed to a time-domain signal by a pulse width modulator. The resulting time interval, which was proportional to the sampled analog signal, was quantized by a higher-speed clock: the number of clock pulses was counted by a 5-bit counter to give the final binary code. The challenges of this architecture mainly lie in the implementation of the pulse width modulation and the precision of the sampling clocks.

6.1.2 VCDL-based ADC

In the field of time-based ADCs, designers have tried to find a linear voltage-controlled delay line (VCDL) which can directly transfer voltage-domain signals to time-domain signals [129]. The VCDL-based ADC was an idea derived from the research on delay-locked loops. The signal voltage was sampled and stored on a capacitor. This stored voltage then drove a voltage-controlled delay circuit so that a time interval between Start and Stop was generated. This time interval was finally digitized by a high-resolution TDC. The challenges of the VCDL-based ADC lay in the design of a linear voltage-controlled delay cell and of a fast high-resolution TDC.

6.1.3 VCO-based ADC

Another type of time-based ADC used a voltage-controlled oscillator (VCO), which realizes voltage-to-frequency conversion [130]. This idea came from the phase-locked loop. In this scheme, a limited sampling time, Tsample, was first set to provide a time window. When the tuning voltage changed, the frequency of the output clock varied, and the number of pulses within the window was counted by a counter. To speed up the operation, a ring VCO was usually used. However, a time residue existed when the tuning voltage varied. To quantize this residue error, noise-shaping techniques were used. The challenges of this ADC lay in the precision and linearity of the VCO and in the noise-shaping techniques.

6.1.4 Classic Wilkinson ADC

The Wilkinson ADC, or ramp ADC, was designed by D.H. Wilkinson in the 1950s. The operation of the ADC is based on the comparison of the input voltage with a linear ramp voltage.

The architecture of the Wilkinson ADC is shown in Figure 6.3. It generally consists of a ramp generator, a comparator, an n-bit counter, and n-bit registers. A high-linearity ramp voltage is generated by an integrator and reset periodically by a MOS switch. When the ramp voltage starts to increase, the counter starts to count. The ramp voltage is compared with the input voltage by a high-speed comparator with a fixed latency. When the ramp voltage exceeds the input voltage, a pulse is generated. This pulse, called 'Hit', is used to sample the outputs of the counter, which runs on a reference clock. The sampled counter value is then registered and encoded. The operational timing of the ramp ADC is shown in Figure 6.4. For a multi-channel topology, the ramp generator and the counter are shared by all channels. Each channel consists of one comparator and registers.
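The operation described above can be summarized by a small behavioral model in which one ramp and one counter are shared by several channels. It is an idealized sketch (perfectly linear ramp, zero comparator latency); the function name `wilkinson_convert_channels` and the default range of 1.0 V to 3.0 V with 12 bits are assumptions chosen for illustration.

```python
def wilkinson_convert_channels(v_inputs, v_start=1.0, v_stop=3.0, n_bits=12):
    """Return, for each input voltage, the counter value latched by its Hit."""
    n_codes = 1 << n_bits
    step = (v_stop - v_start) / n_codes      # ramp rise per clock period
    latched = [None] * len(v_inputs)
    for count in range(n_codes):             # shared counter
        v_ramp = v_start + step * (count + 1)    # shared ramp voltage
        for channel, v_in in enumerate(v_inputs):
            # Each channel has its own comparator and register.
            if latched[channel] is None and v_ramp > v_in:
                latched[channel] = count
    # Inputs never crossed by the ramp saturate at full scale.
    return [n_codes - 1 if code is None else code for code in latched]


codes = wilkinson_convert_channels([1.0, 2.0, 3.0])
```

The loop over `count` makes the main drawback visible: a full conversion always takes up to 2^n clock periods, regardless of the number of channels.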

In 1992, O.B. Milgrome et al. proposed a 12-bit ramp ADC for VLSI applications in nuclear science [131]. The tested chip contained a linear ramp circuit, precise high-speed comparators, a pipelined counter, and double-buffering storage latches fabricated in a 2 µm CMOS technology. The prototype of these circuits successfully combined digital frequencies in excess of 70 MHz with analog signals smaller than 1 mV. Test results showed 1/4096 rms errors at conversion rates above 30 kHz, with less than 4 mW/channel power dissipation.

In 1997, an 8-channel ADC ASIC using a Wilkinson-type architecture was fabricated in a 1.2 µm CMOS process for use in multi-channel applications such as the PHENIX detector [132]. The ADC design features include a differential positive-ECL input for the high-speed clock and selectable control for 11-bit or 12-bit conversions, making it suitable for


Figure 6.3: Block diagram of Wilkinson Ramp ADC.


Figure 6.4: Timing of Wilkinson ramp ADC.

use in multiple PHENIX subsystems [132].

In 2005, a group at the University of Pavia, Italy, proposed a Wilkinson-type A/D converter together with all the digital logic required for reading out a 16×16 array of X-ray detectors [133]. The proposed ADC architecture and read-out strategy made it possible to handle an event rate as large as $10^{6}$ events/s over the whole array and $10^{4}$ events/s over a single row of the array with a resolution of 10 bits, consuming only 77 mW from a 3.3 V power supply. The A/D converter and the logic were embedded in an ASIC to be bump-bonded on top of the detector, which also includes the front-end electronics required for processing the sensor output signals. Two years later, a complete read-out channel suitable for large arrays of X-ray detectors was mature and used for spectrometry applications in space [134]. It basically consisted of a front-end circuit for processing the detector signal, a Wilkinson A/D converter for the analog-to-digital conversion and the digital logic required to ensure the correct handshaking between all the blocks of the read-out channel. This chip was fabricated in a 0.35 µm CMOS technology. The on-board A/D converter features 10 bits of resolution with a maximum conversion time of 210 µs. The INL and DNL of the whole read-out channel are equal to 3.3 LSB and 0.2 LSB, respectively.


6.1.5 Improved ramp ADC

The maximum conversion time of a classic ramp ADC can be given as

$$T_{c(\max)} = T_{clk} \cdot 2^{n} + T_{dly} + T_{clbr} \qquad (6.1)$$

where $T_{clk}$ is the period of the reference clock, $n$ is the number of bits of the ADC, $T_{dly}$ is the total delay time of the registers and $T_{clbr}$ is the time for the calibration. Since $T_{dly}$ and $T_{clbr}$ are fixed delay times in a specific application, the maximum conversion time is mainly determined by $T_{clk}$ and $n$. To reduce $T_{clk}$, a fast technology with a high-frequency clock should be employed. For example, to achieve 12-bit resolution and 1-MHz sampling rate, the ADC needs a clock of up to 3.2 GHz. The development of such an ADC is a big challenge in CMOS technologies. On the other hand, decreasing $n$ becomes more attractive. However, this method achieves a fast conversion only at the cost of the resolution of the ADC.

To keep the same level of resolution, the classic architecture has been improved in several contributions. The most widely used method is to refine the time resolution by time interpolation and multiphase sampling techniques. The timing of the proposed multiphase sampling is shown in Figure 6.5. Multiphase delayed clocks are generated by using DLL techniques. The states of the delayed clocks are sampled into thermometer codes by the same Hit signal. The thermometer codes are then converted to binary codes. If $2^{m}$ clocks are generated, an $m$-bit binary code can be obtained.

Figure 6.5: Timing of the time interpolation and the multiphase sampling.

With this method, the time measurement uses a two-level conversion scheme. Counter-based circuits are employed for the coarse conversion; time interpolation and multiphase sampling techniques are adopted for the fine conversion. A tradeoff between the clock frequency and the resolution should be made. If the coarse resolution and the fine resolution are 5 bits and 7 bits, respectively, the reference clock of the ADC goes down to about 32 MHz for a sampling rate of 1 MHz. These specifications are acceptable and relatively easy to realize for an ADC in modern CMOS technologies.
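The numbers of the two-level example can be checked with a short calculation based on Equation (6.1). The sketch neglects the fixed delays Tdly and Tclbr, which is an assumption, and the helper names `required_clock_hz` and `fine_bin_s` are hypothetical.

```python
def required_clock_hz(counter_bits: int, sample_rate_hz: float) -> float:
    """Clock needed so that 2**counter_bits periods fit in one conversion."""
    return (2 ** counter_bits) * sample_rate_hz


def fine_bin_s(clock_hz: float, fine_bits: int) -> float:
    """Time bin when one clock period is split into 2**fine_bits phases."""
    return 1.0 / (clock_hz * 2 ** fine_bits)


# 5-bit coarse counter and 7-bit fine interpolation at 1 MS/s.
clock = required_clock_hz(5, 1e6)    # 32 MHz reference clock
bin_size = fine_bin_s(clock, 7)      # roughly 244 ps per fine bin
```

Under these assumptions the fine interpolation keeps the 12-bit time resolution (about 244 ps over a 1 µs ramp) while the counter clock stays at 32 MHz.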

In 1998, J.L. Cura et al. proposed an integrated ramp ADC with 12-bit resolution and 3 µs conversion time in a 1.2 µm CMOS technology [135]. The circuit featured a linear ramp generator whose slope depended on the polarity of the input signal, and a time converter using interpolation techniques within one cycle of the clock. If the input voltage exceeded VDDA/2, a ramp starting from VDDA with a negative slope was generated. Otherwise, a ramp starting from ground with a positive slope was generated. This approach reduced the conversion time. The architecture allowed a wide signal input range and an equivalent clock speed in excess of 1 GHz, leading to a significant reduction in the conversion time, while improving the DNL and INL.

In [64], an ADC was designed by using the Wilkinson architecture and multiphase sampling techniques. In this novel architecture, a single DLL was placed in each channel. The chip was fabricated in the AMS 0.35 µm CMOS technology. With a 100-MHz clock, the maximum conversion time was 1.34 µs, corresponding to a 746 kHz sampling rate. The attractive feature of such an architecture is its low power dissipation. With a power supply voltage of 3.3 V, the power consumption was only 3.3 mW + 0.5 mW/channel [64].

In [136], a 10-bit 20 MSamples/s integrating ADC using a single-slope ramp ADC followed by a TDC with 20 ps resolution was proposed. Both the resolution and the sampling rate were at a moderate level, which will be very useful for future applications. However, this contribution only contained simulation results of the schematics. The test results have not yet been published.

In [137], an 80 MS/s ADC based on single-slope conversion was presented, which used a gated ring oscillator (GRO) TDC to achieve an ENOB of 6.45 bits. The resulting 0.13 µm CMOS prototype circuit was simple and compact in its implementation and consumed 6.4 mW of power. In this circuit, the input voltage was sampled on a capacitor. The voltage on the capacitor then decreased linearly by connecting a constant current source in parallel. The ramp voltage was compared with the threshold voltage to generate a pulse which was quantized by the GRO TDC. Although a high sampling rate was obtained, the resolution was sacrificed. However, this example indicated that time-based ADCs can achieve a high conversion rate and can therefore be used for high-speed applications.

6.1.6 Comparison of time-based ADCs

The performances of time-based ADCs depend upon both the voltage-to-time conversion and the time-to-digital conversion. The performances of the available ADCs are compared in Table 6.1. It is indicated that time-based ADCs can obtain both high resolution and moderate speed. One notes that the Wilkinson ADCs and their improved architectures can achieve a resolution of 12 bits and a sampling rate in the range of several kS/s to several hundred kS/s. If the resolution is sacrificed, the improved ramp


Table 6.1: Comparison of available time-based ADCs

Type of ADC      Resolution (bit)   Sampling rate (Hz)   Power       Year   Reference
PWM              5                  6k                   N/A         1942   [128]
VCO-based        8                  100M                 N/A         2006   [130]
Wilkinson        12                 30k                  4mW/Ch      1992   [131]
Wilkinson        12                 75k                  5mW         1997   [132]
Wilkinson        10                 5k                   0.4mW/Ch    2007   [134]
Improved Ramp    12                 300k                 N/A         1998   [135]
Improved Ramp    12                 746k                 0.5mW/Ch    2007   [64]
Improved Ramp    10                 20M                  N/A         2007   [136]
Improved Ramp    6                  80M                  6.4mW       2009   [137]

architecture can speed up the ADC to a sampling rate in the range of several MS/s, even up to 80 MS/s. The second feature of time-based ADCs is that the VCO-based ADC can achieve 8-bit resolution and a 100-MS/s sampling rate with low power dissipation. These performances are competitive with those of pipeline ADCs, and time-based ADCs may even substitute for pipeline ADCs in future CMOS technologies.

In this study, an ADC is to be developed for PET imaging. The resolution is at least 12 bits and the sampling rate is about 1 MS/s. Thus, the improved ramp ADC is selected as our solution.

6.2 Proposed time-based ADC for PET imaging

Figure 6.6 shows the architecture of the proposed ADC. The new architecture, which is based on the improved ramp ADC architecture, consists of two parts, a voltage-to-time converter (VTC) and a time-to-digital converter (TDC). The voltage-to-time converter consists of a ramp generator and comparators. The ramp generator is an integrator driven by a high-linearity current source. All components of the ramp generator are integrated. The comparator is composed of multi-stage amplifiers to produce highly sensitive Hit pulses with a fixed latency. The VTC supports a conversion precision of 14 bits. The time-to-digital converter is composed of a Gray-code counter, a multiphase clock generator (MCG) realized by a digital DLL, registers and encoders. The Gray-code counter and the digital DLL are selected for low-power design considerations.

The blocks in the proposed ADC are mainly realized by digital circuits, except for the ramp generator, the comparators and the associated bias circuits. Thus, this scheme is a 'Big-Digital-Small-Analog' architecture which is suitable for technology scaling. In addition, the delay lines are controlled by digital signals generated by the digital DLL. Compared to control voltages, digital signals can drive large loads, so that more conversion channels can be integrated.

Figure 6.6: Block diagram of the proposed ADC. Multiphase sampling techniques are used to enhance the sampling rate.

6.2.1 Ramp generator

The ramp generator can be implemented by a switched-capacitor integrator with a

reset function. The model of the ramp voltage is given as

$$V_{ramp}(t) = \frac{A}{A+1}\left(V_{ref} + \frac{1}{C_f}\int_{0}^{T} I_{ref}(t)\,dt\right) \qquad (6.2)$$

where $A$ is the gain of the operational amplifier, $V_{ref}$ and $I_{ref}$ are the reference voltage and the reference current, respectively, $C_f$ is the integrating capacitance and $T$ is the period of the reset signal. The linearity of the ramp voltage mainly depends on the reference current, which is generated by a constant current mirror.

The proposed schematic of the ramp generator is shown in Figure 6.7. The integrator is composed of a high-gain operational amplifier, an integrating capacitor, and a high-speed MOS switch. The switch is controlled by a reset signal which determines the operational range of the ramp voltage. The slope of the ramp is adjusted by changing the value of the reference current and the capacitor. The baseline of the ramp voltage is set by a


[Figure 6.7 labels: blocks "Reference current generation" and "Integrator"; amplifiers A1 and A3; Vref1, Vref2, Vref3; resistor R; transistors MP1 to MP8 and MN9 to MN12; Iref; Cf; Reset and Resetb; Vramp; supplies VDDA and GND.]

Figure 6.7: Schematic of the proposed high-linearity ramp generator [132].

reference voltage (Vref3) which depends on the specific application. To generate a high-linearity current, a cascode current mirror (MP2 ∼ MP8, MN9 and MN10) is employed to convey the input current to the integrator. The amplitude of the reference current of the integrator is well controlled by the value of an integrated resistor and two reference voltages (Vref1 and Vref2). The current is given as

$$I_{ref} = \frac{V_{ref2} - V_{ref1}}{R} \cdot \frac{(W/L)_{12}}{(W/L)_{11}} \qquad (6.3)$$

where $(W/L)_{11}$ and $(W/L)_{12}$ are the width-to-length ratios of the transistors MN11 and MN12, respectively.

In our design, to achieve a dynamic range of 1.0 V ∼ 3.0 V, the theoretical values of Vref3, Iref, R and Cf are 1.0 V, 15 µA, 50 kΩ, and 5 pF, respectively. The transient behaviors of the ramp generator are shown in Figure 6.8. After a reset signal, the ramp voltage increases linearly. As illustrated in Figure 6.8(c), the slope at both ends is worse than that in the middle. This is because the circuits are affected by the nonlinearity of the parasitic capacitors and the limited output range of the OPAMP.
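With the theoretical values above, the ideal ramp slope and the time needed to sweep the dynamic range follow directly from Equation (6.2). The calculation below assumes an ideal integrator (infinite amplifier gain) and a constant reference current; the helper names `ramp_slope_v_per_s` and `ramp_time_s` are hypothetical.

```python
def ramp_slope_v_per_s(i_ref_a: float, c_f_f: float) -> float:
    """Slope of an ideal integrator ramp: dV/dt = Iref / Cf."""
    return i_ref_a / c_f_f


def ramp_time_s(v_low: float, v_high: float, i_ref_a: float, c_f_f: float) -> float:
    """Time for the ideal ramp to sweep from v_low to v_high."""
    return (v_high - v_low) / ramp_slope_v_per_s(i_ref_a, c_f_f)


slope = ramp_slope_v_per_s(15e-6, 5e-12)           # about 3e6 V/s, i.e. 3 V/us
sweep_time = ramp_time_s(1.0, 3.0, 15e-6, 5e-12)   # about 0.67 us
```

A sweep time of roughly 0.67 µs leaves some margin for reset and readout within the 1 µs period of a 1 MS/s conversion.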

The post-layout simulation results are shown in Figure 6.9. They indicate that the slope of the ramp voltage decreases when the circuit goes from the worst corner to the best corner. In addition, the range at the upper end is limited by the corner. Thus, the dynamic range of the ADC is partly determined by the ramp generator. Moreover, the linearity improves from the worst corner to the best corner.

6.2.2 Comparator

A high-speed and highly sensitive comparator is required for the fast and high-resolution conversion in the proposed ADC. Here, a cascade high-gain comparator is selected. The schematic of the proposed comparator is shown in Figure 6.10. The first three stages


Figure 6.8: Transient behaviors of the proposed ramp generator from the post-layout simulation. Note that the nonlinearity occurs at both ends of the ramp voltage. (a) Reset; (b) Vramp; (c) slope of the ramp voltage.

are implemented by differential pairs with a gain of about 10 each. Two current sources are connected in parallel with the PMOS transistors to reduce the kickback noise. In order to preserve the sensitive time information, a current amplifier is used as the output stage instead of a high-speed latch. The gain of the output stage is about 20. As a result, the total DC gain is about 86 dB.

Since the range of the input voltage is 2 V, a 12-bit resolution gives a minimum
resolved voltage of 2 V/2^12 ≈ 0.49 mV for the comparator. At this voltage level, the noise
and the offset voltage of the comparator must be handled carefully. For an SNR larger
than 20 dB, the equivalent input noise voltage should be less than 50 µV. The
offset voltage, which is introduced by the selected architecture and the mismatch of the
transistors, cannot be eliminated only by increasing the area of the transistors. Thus, an
offset cancellation strategy should be used. In this design, output-offset storage (OOS),
shown in Figure 6.11, is adopted. Although the comparator with auto-zero techniques
can reduce the offset voltage, large phase noise is introduced by the operation of the
switches.
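The two numbers in this paragraph follow from simple arithmetic, sketched below. The sketch assumes that the 20 dB SNR is a voltage ratio between one LSB and the equivalent input noise, which is an interpretation rather than something the text states explicitly; the function names are hypothetical.

```python
def lsb_voltage(full_scale_v, bits):
    """Minimum resolved voltage (one LSB) for a given input range and resolution."""
    return full_scale_v / (2 ** bits)


def max_noise_for_snr(signal_v, snr_db):
    """Largest noise voltage that still keeps signal/noise at snr_db (voltage ratio)."""
    return signal_v / (10 ** (snr_db / 20.0))


v_lsb = lsb_voltage(2.0, 12)                  # 2 V range, 12 bits
v_noise_max = max_noise_for_snr(v_lsb, 20.0)  # assumed: SNR referred to one LSB
print(f"V_LSB = {v_lsb * 1e3:.3f} mV, noise limit = {v_noise_max * 1e6:.1f} uV")
```

Under this assumption the noise limit comes out just under 50 µV, consistent with the requirement quoted above.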

The simulation results of the proposed comparator are shown in Figure 6.12, where
the output signals are collected while the input voltage is swept in steps of 0.5 mV. The
time error due to the offset voltage is about 25 ps. With auto-zero
techniques, the peak-to-peak value of the jitter due to the operation of the switches is


Figure 6.9: Post-layout simulation results of Vramp at different corners. (a) worst
(vdda = 3 V; temp = 80 °C); (b) typical (vdda = 3.3 V; temp = 27 °C); (c) best (vdda
= 3.6 V; temp = -50 °C).

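[Figure 6.10 schematic labels: inputs Vinp and Vinn; output Vout; supply Vdda and GND; bias current Iref = 10 µA; preamplifier of three stages (×10 each) followed by an output amplifier (×20).]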

Figure 6.10: Proposed high-precision high-speed comparator for ramp ADCs.

about 3 ps.

The AC simulation results of the comparator with auto-zero techniques
are shown in Figure 6.13. The open-loop gain of the comparator is about
94.8 dB, which means that the designed comparator is sensitive enough for this application.

6.2.3 Digital DLL

Analog DLLs can generate low-jitter, high-precision multiphase clocks, but their
high power consumption limits their use in low-power electronics. Moreover, the analog
DLL cannot solve the design problems due to technology scaling. An
effective solution is to use a digital DLL.

[Figure 6.11 schematic labels: three preamplifier stages A1 and an output stage A2; switches controlled by the phases Φ1 and Φ2; inputs Vinp and Vinn; common-mode voltage Vcmo; output Vout.]

Figure 6.11: High-precision high-speed comparator with output-offset-storage techniques.

Figure 6.12: Post simulation results of the comparator with and without auto-zero techniques.

Few contributions on the design of digital DLLs for multiphase clock generation
have been presented. In [138], a digital DLL using a simple structure with a counter-based
delay line was designed to provide synchronous clock distribution in high-speed digital
systems. More recently, digital DLLs were developed to generate fixed-delay multiphase
clocks for DDR SDRAM controllers [139]. The delay elements in these designs were
composed of standard logic gates, whose delay time was limited by the technology used.
Moreover, the jitter performance was not improved. For the multi-channel ADC/TDC
dedicated to front-end signal processing, sub-gate delay and low jitter are
required. As a result, a novel digital DLL should be developed.

Architecture and operation principle

The proposed digital DLL is depicted in Figure 6.14. It consists of a digitally controlled
delay line composed of linear delay elements, a digital phase detector, an up/down


Figure 6.13: AC response of the comparator. The open-loop gain is 94.8 dB when the frequency

is less than 1 MHz.

counter and a digital filter. Clk_ref propagates through the digitally controlled delay
line (DCDL) and is compared with Clk_out in the phase detector (PD). If a delay difference
is detected, the closed loop automatically corrects it by changing the digital control
code so as to adjust the delay time of the DCDL. The steady state is reached when
Clk_out and Clk_ref have no phase difference, or one too small to be detected by the
PD.

[Figure 6.14 diagram labels: Clk_ref; Start; Digital-Controlled Delay Line built from delay elements τ with outputs Q<0>, Q<1>, ..., Q<30>, Q<31>; Phase Detector with Up and Down outputs; Up/Down Counter (Vcnt<3:0>); Digital Filter (Vmid<3:0>); Clk_out; annotation: digital DLL with 625-ps delay elements, 625 ps per element when the DLL is locked.]

Figure 6.14: Diagram of the low-jitter digital DLL.

The locked-state waveforms of the digital DLL are shown in Figure 6.15. By using


[Figure 6.15 waveform labels: Clk_ref; Up/Downb; A<3:0> with the code sequence 8, 9, 10, 9, 8, 9, 10, 9, 8 around the target number; delay time around the target delay.]

Figure 6.15: Waveform when the digital DLL is locked.

a two-state bang-bang PD, the digital numbers at the counter outputs change
periodically according to the Up/Downb signal. The delay time of the DCDL, which is
related to the jitter performance, therefore also fluctuates between a smallest and a largest
value. The target delay lies in the middle of these two values. The peak values of the
digital ripple can be sampled with the Up/Downb signal, and their average can be
computed from these two values. Thus, the jitter of the delay time is reduced.

Circuit descriptions

The digitally controlled linear delay element is presented in Figure 6.16. It consists of
two standard inverters and four digitally controlled MOS capacitors. The width-to-length
ratio of a unit MOS capacitor is 0.5 µm/0.5 µm. The sizes of the four MOS capacitors
are 1, 2, 4 and 8 units, respectively. The gate ports are connected to the output of
the first inverter. Thus, the total capacitance at the output of the first inverter varies
according to the value of the 4-bit digital number. With this circuit, good linearity of
the delay time is easily achieved. The characteristics of this delay cell are shown in
Figure 6.17. A slope of 9.9 ps per code step is obtained, and the linearity is almost the same as the
ideal curve. The drawback of this structure is that the delayed clocks have poor duty-cycle
performance because the two standard inverters see different load conditions.
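The reason the binary-weighted capacitors give a linear delay characteristic can be sketched in a few lines. The model below assumes that each enabled unit capacitor adds the same delay increment (the 9.9 ps slope quoted above); the zero-code delay `base_delay_ps` is a placeholder value chosen only for illustration, and the function names are hypothetical.

```python
CAP_WEIGHTS = (1, 2, 4, 8)  # unit capacitors switched by bits 0..3 of the code


def enabled_units(code):
    """Number of unit MOS capacitors loading the first inverter for a 4-bit code."""
    if not 0 <= code <= 15:
        raise ValueError("code must be a 4-bit value")
    return sum(w for bit, w in enumerate(CAP_WEIGHTS) if (code >> bit) & 1)


def cell_delay_ps(code, base_delay_ps=545.0, step_ps=9.9):
    """Assumed linear model: delay = base delay + step per enabled unit capacitor."""
    return base_delay_ps + step_ps * enabled_units(code)


for code in (0, 7, 8, 15):
    print(code, enabled_units(code), round(cell_delay_ps(code), 1))
```

Because the weights are 1, 2, 4 and 8, the number of enabled units equals the code itself, so every code step adds one unit of load and, in this idealized model, one delay increment.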

In this study, the digital bang-bang phase detector shown in Figure 6.18 is employed
because it performs well in detecting picosecond-level phase differences and
operates in two states. When the DLL is locked, the periodically
changing Up/Down signals can be used as the sampling clock for the peak-value detection.
Unfortunately, a dead zone exists due to mismatch. A phase difference below 10 ps
cannot be detected in the 0.35 µm CMOS process used.

The 4-bit bidirectional (Up/Down) counter is implemented with JK flip-flops and
combinational logic circuits. The Up/Down flag is connected to one of the PD outputs
(Up or Downb). The state of the counter changes on the negative edges of
Clk_ref.

The digital filter is composed of sampling registers, a 4-bit full adder and a
multiplexer. The boundary values of the counter are detected and stored in the registers. The
data is added by one. The carry bit and the three most significant bits (MSBs) are the
outputs of the filter. These four bits are the average of the boundary values of


Figure 6.16: Schematic of the linear delay element.

Figure 6.17: Characteristics of the linear delay element.

the counter. The multiplexer is employed to select the digital-controlled numbers.
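A behavioral sketch of this averaging step is given below. It assumes that the two stored boundary values are summed in the 4-bit adder, so that the carry bit together with the three MSBs of the sum equals the sum shifted right by one bit; this is an interpretation of the description, not the actual netlist. For brevity the boundaries are found with `min` and `max` over a window of codes, whereas the hardware samples them with the Up/Downb signal.

```python
def average_boundaries(low_code, high_code):
    """Average two 4-bit boundary codes using only adder outputs (carry + 3 MSBs)."""
    total = low_code + high_code      # 5-bit result: carry bit plus 4 sum bits
    carry = (total >> 4) & 1          # carry-out of the 4-bit adder
    msb3 = (total >> 1) & 0b111       # three most significant sum bits
    return (carry << 3) | msb3        # equals total // 2


def filter_codes(codes):
    """Return the averaged control code for a window of dithering counter codes."""
    return average_boundaries(min(codes), max(codes))


# Locked-state code sequence as drawn in Figure 6.15.
print(filter_codes([8, 9, 10, 9, 8, 9, 10, 9, 8]))
```

For the locked-state sequence of Figure 6.15 the boundaries are 8 and 10, and the filter output is the mid code 9.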

The layout of the whole digital DLL with 16 delay cells is shown in Figure 6.19. The
die size is 690 µm × 73 µm, which is only one third of that of the analog DLL shown in
Figure 4.20.

6.2.4 Gray counter

A Gray counter dedicated to time measurements is required to achieve high
linearity for each code. This requires the circuit to be symmetrical
for each output bit. Moreover, the metastability of the registers should be taken into
account. Thus, the design proposed in [140] is employed. The diagram of the Gray counter
is shown in Figure 6.20.

In this study, a 10 bit Gray counter is designed. Thus, the maximum coarse resolution



Figure 6.18: Schematic of the phase detector.

[Figure 6.19 layout labels: Digital-Controlled Delay Line, Phase Detector, Up/Down Counter, Digital Filter; dimensions 690 µm × 73 µm.]

Figure 6.19: The layout of the proposed DLL.

[Figure 6.20 block labels: Gray-to-Binary Conversion blocks; incrementer (+, INC); Gray code Reg (D, Q); signals Clk, RSTN, GRAY, GNEXT, BNEXT, BIN.]

Figure 6.20: The block diagram of the Gray-code counter [140].

of the ADC is 10 bits. However, in practical applications, the coarse resolution can
be 6 to 10 bits, with a nominal value of 8 bits.
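The signal names in Figure 6.20 (GRAY, BIN, BNEXT, GNEXT) suggest a convert, increment, convert-back structure. The following behavioral sketch illustrates that idea in software; it is an assumption about the data flow, not a description of the circuit in [140], and the function names are hypothetical.

```python
def gray_to_binary(gray):
    """Convert a Gray-coded integer to plain binary."""
    binary = 0
    while gray:
        binary ^= gray   # XOR of all right-shifted copies of the Gray code
        gray >>= 1
    return binary


def binary_to_gray(binary):
    """Convert a binary integer to its reflected Gray code."""
    return binary ^ (binary >> 1)


def gray_counter_next(gray, width=10):
    """One clock step of a Gray counter: GRAY -> BIN -> BNEXT -> GNEXT."""
    bin_now = gray_to_binary(gray)            # BIN
    bnext = (bin_now + 1) % (1 << width)      # BNEXT, wraps at 2**width
    return binary_to_gray(bnext)              # GNEXT, stored in the register


state = 0
for _ in range(5):
    print(format(state, "010b"))
    state = gray_counter_next(state)
```

Only one output bit changes per step, including at the wrap-around, which is the property that makes a Gray counter attractive when its outputs are sampled asynchronously.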

6.2.5 Sampling and readout circuits

The outputs of both the counter and the timing generator controlled by the digital
DLL are sampled by a pulse signal (Hit) generated by the comparators in the proposed
ADC. The flash sampling scheme of the coarse-fine TDC discussed in Chapter 5 can
sample the timing states as fast as possible. However, the stored data may be
wrong due to the metastability of the D flip-flops.

An architecture using two counters was proposed in [91] to avoid the metastability of
the flip-flops. This method requires the MSB of the fine conversion to select the sampled
data from the two counters. If the MSB of the fine conversion is not correct, the sampled
data is also wrong.

This work proposes a novel sampling scheme to overcome these problems. The
schematic is shown in Figure 6.21. This scheme uses only one counter. The outputs of
the counter are sampled on both the positive and negative edges of the reference clock, Clk.
Three flag signals, Stop, POS and SEL, are generated from Hit and Clk. The Stop
and POS signals sample the outputs of Register0 and Register1. The sampled data are
then selected by the SEL signal, which is generated from Stop and POS. The Start
signal, generated from Hit, is connected to the input of the delay line. The states of the
delayed timing signals are sampled by the Stop signal.

[Figure 6.21 schematic labels: Counter; Register0, Register1 and Register2 (D flip-flops); Clk; Hit; Stop; POS; SEL; 0/1 multiplexer; Data0, Data1, CoarseData; Start; Delay line of delay elements τ; Encoder; FineData.]

Figure 6.21: The novel sampling scheme of TDC.

The proposed sampling scheme effectively reduces the nonlinearity
of the coarse conversion caused by the metastability of the flip-flops and by clock jitter.
The drawback of this scheme is that the sampling signals, Start, Stop and POS,
introduce errors due to metastability. However, these errors can be kept below
one LSB, which is much smaller than in the previous architectures.

6.2.6 Timing controller

In order to reduce the number of pads, an on-chip timing controller (TCON) is
embedded in the proposed ADC. The timing of the whole ADC is shown in Figure 6.22. The
generated signals include the reset signal of the time window (Hit_clear), the enable signal
of the ADC PISO registers (Read_enable), the non-overlapping clocks for voltage
sampling and offset cancellation (Phi1 and Phi2), and the resolution control signals (EOC6
to EOC9). To generate these timing signals, a 10-bit counter provides
the reference timing, and a state machine modeled in Verilog HDL generates the
signals.

Figure 6.22: The controlled timing of the whole ADC.

6.3 Error analysis

The quantization process in ideal ADCs introduces irreversible errors. First, the
mean-square quantization error, represented by $e_{qnt}^{2}$, can be written as

$$e_{qnt}^{2} = \frac{V_{LSB}^{2}}{12} \tag{6.4}$$

where $V_{LSB}$ is the minimum resolved voltage. However, other errors exist due
to the circuit architecture, temperature variation, technology mismatch and
electronic noise. These errors introduce offset and nonlinearity and degrade the
dynamic performance [141].
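As a worked illustration of Eq. 6.4, assume the 2 V input range and the 12-bit resolution quoted for the comparator in Section 6.2.2. The rms quantization error then evaluates to roughly:

```latex
\[
V_{LSB} = \frac{2\,\mathrm{V}}{2^{12}} \approx 0.488\,\mathrm{mV},
\qquad
e_{qnt} = \frac{V_{LSB}}{\sqrt{12}} \approx \frac{0.488\,\mathrm{mV}}{3.46} \approx 0.141\,\mathrm{mV}.
\]
```

This figure is only the ideal quantization floor; the circuit-related errors discussed in the following subsections add to it.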

6.3.1 Errors introduced by the ramp generator

The ramp generator introduces errors through the finite gain of the OPAMP, the input
offset voltage $V_{os}$, unstable reference signals such as $I_{ref}$ and $V_{ref}$, and the mismatch of
the capacitor. Neglecting the variation of the reset period, the DC error voltage is
given by

$$V_{e,DC} = \frac{A}{1+A}\left[\Delta V_{ref} + V_{os} - \frac{I_{ref}\,t}{C\,(1+C/\Delta C)}\right] + \frac{A}{(1+A)(C+\Delta C)}\int_{0}^{T}\Delta I_{ref}\,dt \tag{6.5}$$


where $\Delta V_{ref}$, $\Delta I_{ref}$ and $\Delta C$ are the variations of $V_{ref}$, $I_{ref}$ and $C$, respectively, and
$A$ is the gain of the operational amplifier. To minimize this error, $A$ should be large
enough. In our case, the gain of the two-stage cascode OPAMP is about 85 dB. In
addition, $\Delta V_{ref}$, $\Delta I_{ref}$ and $\Delta C$ should be decreased. For a given capacitance, $V_{e,DC}$
decreases as $I_{ref}$ increases.

The error introduced by the charge injection of the MOS switch and the delay of the
RC network should also be considered. Assuming that about 1/3 of the injected
charge affects the output voltage, the error voltage due to the charge injection is

$$V_{e,ch} \approx \frac{C_{ox}\,(WL)_{sw}\,(V_{dd}-V_{T})}{3\,C} \tag{6.6}$$

where $C_{ox}$ is the gate oxide capacitance density, $(WL)_{sw}$ represents the dimensions of
the switch transistor, $V_{dd}$ is the power supply voltage and $V_{T}$ is the threshold voltage. $V_{e,ch}$ mainly
depends on the dimensions of the switch transistor. A dummy transistor driven by an inverted
clock should be placed to minimize the charge injection. Moreover, the error due to the
delay is

$$V_{e,dly} \approx \frac{I_{ref}\,R_{eq}\,C_{L}}{C} \tag{6.7}$$

where $R_{eq}$ represents the equivalent series resistance of the metal and $C_{L}$ is the load
capacitance at the generator output. $V_{e,dly}$ exists because of the delay introduced by $R_{eq}$ and $C_{L}$.
It is proportional to the slope of the ramp voltage, whose value is $I_{ref}/C$.

As a result, the total error voltage introduced by the ramp generator can be written as

$$V_{e,ramp} \approx \sqrt{V_{e,DC}^{2} + V_{e,ch}^{2} + V_{e,dly}^{2}} \tag{6.8}$$

According to Eq. 6.2 and Eq. 6.8, the time error due to the ramp generator is given as

$$T_{e,ramp} \approx \frac{A+1}{A}\cdot\frac{C}{I_{ref}}\cdot V_{e,ramp} \tag{6.9}$$
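Eq. 6.8 and Eq. 6.9 can be evaluated numerically as sketched below. The three error voltages used in the example are placeholders invented for illustration, not simulated or measured values; the capacitance, reference current and 85 dB gain are the figures quoted in the text, and the function names are hypothetical.

```python
import math


def ramp_error_voltage(ve_dc, ve_ch, ve_dly):
    """Eq. 6.8: root-sum-square of the three ramp error voltages (in volts)."""
    return math.sqrt(ve_dc ** 2 + ve_ch ** 2 + ve_dly ** 2)


def ramp_time_error(ve_ramp, c_farad, iref_amp, gain):
    """Eq. 6.9: convert the ramp error voltage into a time error (in seconds)."""
    return (gain + 1.0) / gain * c_farad / iref_amp * ve_ramp


# Placeholder error voltages (assumed for illustration only).
ve = ramp_error_voltage(100e-6, 50e-6, 20e-6)
gain = 10 ** (85.0 / 20.0)  # 85 dB OPAMP gain as a linear ratio
te = ramp_time_error(ve, c_farad=5e-12, iref_amp=15e-6, gain=gain)
print(f"Ve,ramp = {ve * 1e6:.1f} uV, Te,ramp = {te * 1e12:.1f} ps")
```

Since the terms add in quadrature, the largest individual contribution dominates the total, which indicates where design effort is best spent.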

6.3.2 Errors introduced by the comparator

The comparison of the input voltage and the ramp voltage is critical. The errors
introduced by the comparator are mainly due to the input offset voltage $V_{os,cmp}$ and the
delay of the circuits. The time error due to the offset voltage is given as

$$T_{e,off} \approx \frac{C}{I_{ref}}\,V_{os,cmp} \tag{6.10}$$

which shows that $T_{e,off}$ is proportional to $V_{os,cmp}$. Generally, offset cancellation
should be used to minimize $V_{os,cmp}$. However, the propagation delay time of the
comparator cannot be reduced. This delay time can be modeled as

$$T_{e,prpgtn} \approx R_{1}C_{1}\left[\ln(2) + \ln\!\left(\frac{2A_{v1}}{2A_{v1}-1}\right) + \ln\!\left(\frac{2A_{v1}^{2}}{2A_{v1}^{2}-1}\right)\right] + R_{2}C_{L}\,\ln\!\left(\frac{2A_{v1}^{3}}{2A_{v1}^{3}-1}\right) \tag{6.11}$$


where $A_{v1}$ represents the gain of the preamplifier, $R_{1}$ and $R_{2}$ are the equivalent output
resistances of the first three stages and the last stage, respectively, $C_{1}$ is the input
capacitance of the preamplifier and $C_{L}$ is the load capacitance of the comparator. Both $T_{e,off}$
and $T_{e,prpgtn}$ affect the precision of the generated sampling signal and the linearity of the
ADC. The time error due to the comparator is given as

$$T_{e,cmp} \approx \frac{C}{I_{ref}}\,V_{os,cmp} + R_{1}C_{1}\left[\ln(2) + \ln\!\left(\frac{2A_{v1}}{2A_{v1}-1}\right)\right] \tag{6.12}$$
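The propagation delay model of Eq. 6.11 is easy to explore numerically. The sketch below implements the expression as written; the function name and any R and C values passed to it are hypothetical, and only the preamplifier gain of about 10 comes from the text.

```python
import math


def propagation_delay(r1, c1, r2, cl, av1):
    """Eq. 6.11: propagation delay of the four-stage comparator (in seconds)."""
    def stage_term(n):
        x = 2.0 * av1 ** n
        return math.log(x / (x - 1.0))

    preamp = r1 * c1 * (math.log(2.0) + stage_term(1) + stage_term(2))
    output = r2 * cl * stage_term(3)
    return preamp + output


# Normalized example: R1*C1 = 1, output-stage term switched off, Av1 = 10.
print(propagation_delay(1.0, 1.0, 0.0, 0.0, 10.0))
```

With a stage gain of 10, the logarithmic terms after ln(2) are small corrections, so in this model the delay is dominated by the R1·C1 time constant of the first stage.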

6.3.3 Errors introduced by the counter and the DLL

The errors due to the counter and the DLL lie in the jitter of the input
clock and the jitter generated by the circuits. For a counter, the jitter is mainly due to
the input jitter and the electronic noise. The total jitter due to the counter is

$$\delta_{e,cnt} \approx \sqrt{\delta_{in}^{2} + \delta_{noise}^{2}} \tag{6.13}$$

where $\delta_{in}$ is the jitter of the input clock and $\delta_{noise}$ is the jitter due to the electronic noise.
For a digital DLL, the jitter model [119] is described by

$$\delta_{e,DLL} \approx \sqrt{\delta_{in}^{2} + \delta_{DCDL}^{2} + \delta_{Vd}^{2}} \tag{6.14}$$

where $\delta_{DCDL}$ is the jitter from the DCDL and $\delta_{Vd}$ is the jitter from the digital filter. Indeed,
the worst-case jitter occurs at the output of the last delay cell of the DCDL because
of jitter accumulation.

6.3.4 DNL model

In addition to the errors discussed above, the mismatches of the elements introduced
during fabrication also cause errors. However, mismatch is very hard to
model. In this dissertation, the error due to the mismatches is represented by a variable
named $\sigma_{mismatch}$. Other sources of error also affect the performance
of the ADC. For simplicity, however, the DNL model of the proposed ADC is
calculated as

$$DNL(i) = \frac{V_{i}-V_{LSB}}{V_{LSB}} \approx \frac{T_{e,ramp}+T_{e,cmp}+\delta_{e,cnt}+\delta_{e,DLL}}{T_{bin}} + \sigma_{mismatch}(i) \tag{6.15}$$

for performance evaluation. According to Eq. 6.15, both a high-performance VTC
and a low-jitter TDC are needed.
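A minimal sketch of how Eq. 6.15 could be used as an error budget is shown below. The individual time errors in the example are placeholders chosen for illustration, not results from this work; the 625 ps bin width is taken from the delay-element value of the digital DLL, and the function names are hypothetical.

```python
def dnl_estimate(te_ramp, te_cmp, jitter_cnt, jitter_dll, t_bin, sigma_mismatch=0.0):
    """Eq. 6.15: DNL estimate in LSB from time errors normalized to the bin width."""
    return (te_ramp + te_cmp + jitter_cnt + jitter_dll) / t_bin + sigma_mismatch


def dnl_from_widths(code_widths, lsb):
    """Definition of DNL: relative deviation of each code width from one LSB."""
    return [(w - lsb) / lsb for w in code_widths]


# Placeholder time errors (assumed), with a 625 ps bin.
print(dnl_estimate(50e-12, 25e-12, 5e-12, 10e-12, 625e-12))
print(dnl_from_widths([1.0, 1.5, 0.5], 1.0))
```

The budget form makes it clear that, for a fixed bin width, every picosecond of VTC error or TDC jitter translates directly into a fraction of an LSB.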

6.4 Experimental results

A prototype circuit based on the architecture and the circuits of the proposed ADC,
named 'PETADC', has been designed and fabricated in AMS 0.35 µm CMOS
technology. The prototype chip is composed of a ramp generator, a 10-bit Gray counter,
a digital DLL with 16 delay cells, 8-channel signal sampling and readout circuits, a timing
controller, bias circuits and a JTAG controller. A photograph of the ADC prototype is shown
in Figure 6.23. The die size is 2190 µm × 2600 µm.

Figure 6.23: Microphotograph of the proposed ADC.

6.4.1 Performances of VTC

The co-simulation results of the comparator and the ramp generator are shown in
Figure 6.24. For the given slope of the ramp voltage, the comparison between the input
voltage and the ramp voltage introduces a fixed delay time of 35 ns. When the range of
the reset signal varies from 640 ns (corresponding to 6 bits) to 4.096 µs (corresponding
to 12 bits), the delay time stays within (35 ± 0.2) ns. This delay time contributes an
offset to the whole A/D conversion. Fortunately, it can be measured
on board and calibrated out by off-line software.

6.4.2 Performances of the digital DLL

The post-layout simulation results of the delayed clocks are shown in Figure 6.25. When the
clock frequency is 100 MHz, the locking time is 14 clock periods. The simulated delay time of the
delay cell ranges from 545.04 ps to 686.74 ps, with a slope of 9.9 ps per code step. These values
correspond to a clock range of 91.0 MHz to 114.7 MHz. The delay time of the delay cells
is 622 ps ± 9.9 ps when the clock is 100 MHz. The jitter performance depends on the
jitter of Clk_ref and the jitter generated by the DLL. For the DDLL core, the rms jitter


Figure 6.24: Output waveform of the comparator with the ramp generator. Note that the

comparator introduces a delay time of 35 ns. This delay time depends on the slope of the ramp

voltage.

and the peak-to-peak jitter are 0.1 ps and 9.9 ps, respectively, at a 100 MHz clock in post-layout simulation.
However, jitter-tolerant performance can be achieved when the DDLL core and the
digital filter are used together as a multiphase clock generator.
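The clock range quoted for the DLL follows from the delay range of one cell, as the sketch below shows. It assumes that the DLL locks when the total delay of the 16 cells equals one reference clock period, which is consistent with 16 × 625 ps = 10 ns at 100 MHz; the function name is hypothetical.

```python
def lock_frequency_mhz(cell_delay_ps, n_cells=16):
    """Clock frequency (MHz) whose period equals the total delay of the line."""
    period_ps = cell_delay_ps * n_cells
    return 1e6 / period_ps  # 1 / (period in ps) expressed in MHz


f_min = lock_frequency_mhz(686.74)  # slowest cell setting
f_max = lock_frequency_mhz(545.04)  # fastest cell setting
print(f"lock range: {f_min:.1f} MHz to {f_max:.1f} MHz")
```

Under this assumption the computed range reproduces the 91.0 MHz to 114.7 MHz interval stated above, and a 625 ps cell delay corresponds to exactly 100 MHz.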

There is almost no static power dissipation because the circuits are all digital. The
dynamic power is about 3 mW.

Figure 6.25: The output waveform of the DDLL.

6.4.3 Performances of the whole ADC

The resolution of the ADC depends on both the coarse resolution and the fine resolution.
The coarse resolution is programmable; the available coarse resolutions are from 6 bits to 9


Table 6.2: Overall performance of the proposed time-based ADC

Parameter           [64]                           This work
Architecture        Wilkinson ADC + analog DLL     Wilkinson ADC + digital DLL
Technology          AMS CMOS 0.35 µm               AMS CMOS 0.35 µm
Resolution          12 bits (typical)              12 bits (typical)
No. of channels     4                              8
Clock frequency     100 MHz                        100 MHz
Sampling rate       < 746 kHz                      190 kS/s to 1.28 MS/s
DNL                 > 1 LSB                        ±0.5 LSB
INL                 ∼1 LSB                         ±0.75 LSB
Power dissipation   3.3 mW + 0.5 mW/channel        3 mW + 0.2 mW/channel

bits. The fine resolution is determined by the multiphase sampling. In the first prototype,
only a 4-bit fine resolution is obtained from the 16-phase delayed clocks. However, the simulated
ENOB is about 3.56 bits. Thus, the available resolution of the ADC is 9 to 12 bits. The
nominal resolution is 12 bits.

With a 100 MHz clock, the maximum conversion time equals 5.26 µs for 12-bit
resolution and 0.78 µs for 9-bit resolution. These correspond to sampling rates of
190 kSamples/s and 1.28 MSamples/s, respectively. The simulated DNL and INL are
0.5 LSB and 0.75 LSB. The power dissipation of the ADC includes two parts. The static
power of the VTC is about 3 mW, which is mainly contributed by the static current in the
ramp generator, the comparator and their bias circuits. Within each channel, however, only
the comparator dissipates static power, and its value is only 200 µW. Thus,
the static power consumption of the ADC is 3 mW + 0.2 mW/channel. The dynamic power
is about 20 mW.
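The relation between the conversion times and the sampling rates quoted above is a simple reciprocal, sketched here for completeness. The helper name is hypothetical, and the sketch assumes one conversion per sample with no additional dead time.

```python
def sampling_rate_hz(conversion_time_s):
    """Maximum sampling rate when each sample takes one full conversion time."""
    return 1.0 / conversion_time_s


for bits, t_conv in ((12, 5.26e-6), (9, 0.78e-6)):
    rate = sampling_rate_hz(t_conv)
    print(f"{bits}-bit: {t_conv * 1e6:.2f} us -> {rate / 1e3:.0f} kS/s")
```

The results round to about 190 kS/s and 1.28 MS/s, matching the figures in the text and in Table 6.2.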

The results are listed in Table 6.2. They indicate that the proposed
ADC has better linearity and lower power dissipation than the
ADC proposed in [64].

6.5 Conclusions

This chapter proposes a time-based ADC for PET imaging. The ADC consists of
a voltage-to-time converter (VTC) and a time-to-digital converter (TDC). The VTC is realized by a
ramp generator and high-resolution, high-speed comparators. The TDC is implemented
by a digital DLL, a Gray-code counter and digital readout circuits. A prototype chip is
designed to evaluate the performance of the ADC using the proposed architecture and
circuit techniques. The simulated results show that the proposed time-based ADC is
suitable for small-animal PET imaging.

For future developments, the prototype will be tested to evaluate its performance.
Besides, the circuits of the proposed ADC can be extended to 64 channels and
integrated with analog front-end processing circuits and time-measurement circuits.

Chapter 7

Conclusions

7.1 Proposed work

This study presents the design of a full-custom front-end readout chip dedicated to
the measurement of both energy and time stamps for PET imaging systems
based on detector modules consisting of LYSO(Ce) scintillator crystals, each
read out at both ends by two Photonics Corp. MCP PMTs.

In the energy measurements, the weak current signals from the detectors are read out
by a regulated-cascode preamplifier and shaped by a CR-RC shaper. The peak values of
the shaped signals in each channel are detected by an analog memory with a monostable
circuit and digitized by an ADC. In the previous work at IPHC, the digitizing function was
realized by a discrete commercial 14-bit, 20-MSamples/s ADC chip. In order to achieve
a more compact size and to improve the conversion precision, this thesis proposes an
integrated multi-channel time-based ADC to replace the external ADC chip.
The proposed ADC, which is realized with a Wilkinson-type architecture and a
digital DLL, has several attractive features such as high resolution, low power dissipation
and small die area.

In the time measurements, this thesis proposes a multi-channel 625-ps TDC realized
by counter-based circuits and time interpolation using a low-jitter charge-pump DLL
for the coincidence events. Besides, PET with time-of-flight (TOF) capability has been shown
to provide a better reconstructed image than conventional positron
tomography. In the TOF-PET approach, for each detected event, the measurement of the TOF
difference between the two 511-keV photons provides an approximate value for the position
of the annihilation. The approximation is directly limited by the capability of measuring
the arrival times of the two photons. The ASIC therefore needs to include a high-precision TDC
to achieve the required time resolution with good stability. This thesis proposes a
coarse-fine TDC based on a low-jitter DLL array for resolution enhancement. Precise
multiphase clock generation using low-jitter DLL techniques is discussed.
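To give a feeling for why the TDC bin size matters in the TOF approach, the sketch below converts a time-difference uncertainty into a position uncertainty along the line of response, using the standard relation Δx = c·Δt/2. The two bin sizes are those of the TDCs discussed in this thesis; treating a single bin as the timing uncertainty is a simplifying assumption, and the function name is hypothetical.

```python
C_LIGHT = 299_792_458.0  # speed of light in vacuum, m/s


def tof_position_offset_m(delta_t_s):
    """Position shift along the line of response for a TOF difference delta_t."""
    return 0.5 * C_LIGHT * delta_t_s


for bin_ps in (625.0, 71.0):
    dx_cm = tof_position_offset_m(bin_ps * 1e-12) * 100.0
    print(f"{bin_ps:.0f} ps -> {dx_cm:.1f} cm")
```

A 625 ps bin corresponds to roughly 9 cm, while a 71 ps bin corresponds to about 1 cm, which illustrates the benefit of the resolution enhancement.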

Since 2007, three prototypes have been designed and fabricated in AMS 0.35 µm
CMOS technology. In the front-end analog signal processing chip, the dynamic range,
the linearity and the power dissipation are optimized. An input dynamic range from
a few fC to more than 100 pC can be achieved. The analog output range of the
front-end readout circuits is from 1.2 V to 3.2 V. The shaping time is 280 ns and the power
dissipation is reduced to less than 15 mW. In the TDC prototype based on a DLL array,
the RMS jitter and the peak-to-peak jitter of the DLL are reduced to 7 ps and 21
ps, respectively. The bin size of the TDC has been reduced to 71 ps with a reference clock
of 100 MHz. In the multi-channel time-based ADC chip, a maximum resolution of 12
bits, a sampling rate of ∼1 MS/s and a power dissipation of 3 mW + 0.2 mW/channel
are achieved.

The main contributions of this thesis are listed as follows.

• Research on advances in novel techniques of front-end readout and signal processing
for PET imaging. The most widely used techniques are surveyed and summarized in this
work. Moreover, future research directions for front-end readout ASICs are
pointed out.

• Proposal of a monolithic architecture for the front-end readout ASIC with a
high-precision TDC and a time-based ADC. This idea introduces a new research
direction for small-animal PET imaging systems with axially oriented crystals coupled
to photodetectors at both ends.


loop (DLL) techniques. Not only a single low-jitter charge-pump DLL but also the

DLL array are realized for time interpolations. Besides, a digital DLL with linear

delay elements is also constructed to overcome the challenges of technology scaling.

• Design and test of a multi-channel high-precision coarse-fine TDC using
counter-based circuits and DLL techniques. TDCs based on both a single DLL and a
DLL array are studied. The results indicate that this TDC architecture is very suitable
for PET imaging applications. Moreover, resolution enhancement using a
DLL array is reported for the first time in this field.


Wilkinson ramp ADC and digital TDC techniques. The creative ADC which can

digitize the voltage signals from a large number of readout channels provides a

possibility to achieve the one-chip solution with all-digital outputs for the proposed

PET imaging system.

7.2 Future work and perspectives

For the future developments, the performance evaluation of the proposed time-based

ADC circuits will be carried out. Moreover, since CMOS technologies have moved their

process nodes to nanometers, design considerations for the challenges due to technology

scaling will be taken into account. Furthermore, several new research directions should

be pointed out.


• Monolithic front-end readout ASIC with digital outputs

For the charge-integration method, both the energy and the time-stamp
information should be acquired with digital outputs. A trend for front-end ASICs is
to adopt a one-chip solution in which both the front-end readout chains and the analog-to-digital
interface are integrated into a single die.

Several advantages can be expected. First, the digital output signals are robust and
easily processed by a programmable FPGA and a personal computer. In fact, this has been the
most widely used approach for front-end electronics over the last thirty years. Second,
higher precision can be achieved: since all blocks are integrated into a single die, the
parasitic parameters and the interference are reduced, and fewer duplicated blocks
are needed. Third, lower power consumption
can be obtained, because the supply and bias circuits of the integrated blocks
can be managed together. Last, the system becomes more compact and uses less material, so the cost of the
whole system is reduced as well.

However, several challenges of monolithic integration should be addressed. First,
substrate coupling and power-supply noise reduce the overall performance of the front-end
electronic system. As discussed in the previous parts, the front-end readout chains and
the analog-to-digital interface are mainly composed of mixed-signal circuits. Since
both analog and digital circuits are fabricated on a common substrate, the digital
substrate noise couples into the analog part. As a result, the precision of the analog
circuits is degraded. The same holds for the noise from the power supplies. Second, technology
scaling reduces the intrinsic gain of the transistor and the voltage headroom of analog
circuits, which makes the design of analog circuits more difficult. This challenge
arises in circuits using submicron CMOS technologies. Moreover, more and
more nanometer effects appear below the 130 nm node and make physical
verification especially challenging.

Besides, two research trends should be pointed out for PET imaging applications.

• Energy sampling using a high-speed ADC

The concept is based on dead-time-free pipelined processing of the photosensor
signals. After shaping and sampling by a free-running ADC, the pulses are digitally
filtered to extract time and energy. The data are processed and selected online before
storage. The diagram of this scheme is depicted in Figure 7.1.

According to the Nyquist-Shannon sampling theorem, if the signal x(t) contains no
frequency higher than fs, it is completely determined by giving its ordinates at a series
of points spaced 1/(2fs) seconds apart. Thus, for digital shaping, the pulses can be
recovered by using a high-speed ADC as long as its sampling rate is larger than twice
the highest signal frequency.

High energy resolution can be achieved with this method, which mainly
profits from the high-performance analog shaper and modern digital circuits. In addition,
this method is suitable for multi-channel applications, since the digital signal processing
circuit can be shared by all front-end channels. Besides, this architecture can integrate


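[Figure 7.1 diagram labels: detector biased from Vdd_hv through Rbias; preamplifier (-Ao) with feedback Rf and Cf; CR-RC shaper or semi-Gaussian shaper (V1, V2, V3); high-speed ADC; DSP performing a fit of the form x(t) = a0·t^n + a1·t^(n-1) + a2·t^(n-2) + … and solving x(t) = 0; outputs: energy and time stamps.]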

Figure 7.1: The front-end electronics using novel digital shaping. The shaped pulse is oversam-

pled by a high-speed ADC and then processed by all-digital circuits.

intelligent blocks such as a micro-controller (µC) and a digital signal processor (DSP) together
with the front-end electronics into a front-end system-on-chip (SoC). The bottleneck of this
architecture is the design of the low-noise front-ends and the high-speed ADC. The challenges
of mixed-signal integration should be addressed as well.

• Time sampling using a high-precision TDC

In recent years, high-precision TDC chips such as the CERN HPTDC, TDC130 [142]
and FPGA-based TDCs [89] have been developed. The HPTDC provides 8
channels with 25 ps resolution, and TDC130 can even achieve a resolution of 10 ps.
Meanwhile, FPGA-based TDCs can be used for picosecond-level time measurements. A
novel signal processing approach based on multi-threshold-voltage sampling is proposed
in [37, 143]. In this method, the signals from the detector are modeled as a fast linear
rising edge followed by a slower exponential decay. The principle is shown in Figure 7.2. The
signal can be sampled in the energy domain, and the time intervals derived from the
discriminators can be measured by high-precision TDCs. The sampled data can then be processed
by off-line software, and the energy, the arrival time and the decay time
can be obtained by calculation.
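The off-line calculation can be illustrated with a small reconstruction sketch. It assumes the pulse model described above (linear rising edge, then exponential decay), four thresholds, and noiseless crossing times; the fitting procedure, the function names and the use of the pulse area as an energy estimate are all assumptions made for illustration and are not taken from [37, 143].

```python
import math


def linear_fit(xs, ys):
    """Least-squares fit ys = intercept + slope * xs; returns (intercept, slope)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def reconstruct_pulse(thresholds_v, t_rise_ns, t_fall_ns):
    """Recover arrival time, decay time, amplitude and area from threshold crossings."""
    # Rising edge (linear): t = t0 + (rise_time / A) * V
    t0, ns_per_volt = linear_fit(thresholds_v, t_rise_ns)
    # Falling edge (exponential): t = c - tau * ln(V)
    c, minus_tau = linear_fit([math.log(v) for v in thresholds_v], t_fall_ns)
    tau = -minus_tau
    # Peak amplitude A: both edges meet where t0 + k*A = c - tau*ln(A) (bisection).
    lo, hi = max(thresholds_v), 1e3 * max(thresholds_v)
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if t0 + ns_per_volt * mid - c + tau * math.log(mid) > 0.0:
            hi = mid
        else:
            lo = mid
    amplitude = 0.5 * (lo + hi)
    area = amplitude * (0.5 * ns_per_volt * amplitude + tau)  # triangle + tail
    return {"t0_ns": t0, "tau_ns": tau, "amplitude_v": amplitude, "area_v_ns": area}


def synthetic_crossings(thresholds_v, t0, rise_time, tau, amplitude):
    """Ideal crossing times of the assumed pulse model, for testing the fit."""
    t_rise = [t0 + rise_time * v / amplitude for v in thresholds_v]
    t_fall = [t0 + rise_time + tau * math.log(amplitude / v) for v in thresholds_v]
    return t_rise, t_fall


levels = [0.1, 0.2, 0.4, 0.8]
rise, fall = synthetic_crossings(levels, t0=5.0, rise_time=2.0, tau=40.0, amplitude=1.0)
print(reconstruct_pulse(levels, rise, fall))
```

With real data the crossing times carry TDC quantization error and noise, so the accuracy of the recovered parameters depends on the TDC resolution and on the choice of thresholds.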

The proposed hardware dedicated to this method is shown in Figure 7.3. The
circuits of each detector channel consist of a transimpedance amplifier and four discriminators
followed by a 4-channel high-precision TDC. The proposed hardware is very
simple and can be realized with either discrete or integrated circuits. A prototype
implemented with discrete circuits on PCBs was reported in [143]. However, the
monolithic integration of these circuits is an important research direction for PET
imaging, in particular for detectors with a large number of channels.



Figure 7.2: The principle of the signal processing based on multi-threshold-voltage sampling

method.

[Figure 7.3 diagram labels: detector biased from Vdd_hv; amplifier (-Ao); threshold V1; Digital signal processing (DSP).]

Figure 7.3: The hardware to realize the multi-threshold-voltage sampling method.


List of Figures

1 The detection module. (a) Computer-generated image of the four detection modules; (b) Computer-generated image of the complete system integrating the front-end electronics. . . . . viii

2 The proposed architecture of the monolithic front-end readout circuits. . . . . ix

3 Schematic of one channel of the front-end readout chip. . . . . xi

4 Photo of the prototype chip (IMOTEPA). . . . . xii

5 The architecture of a TDC based on a counter and an array of delay-locked loops. . . . . xiii

6 Photo of the three-channel high-precision TDC prototype. . . . . xiv

7 Block diagram of the proposed ADC. . . . . xv

8 Photo of the proposed ADC. . . . . xv

1.1 Principle of clinical PET imaging systems [2]. (a) The generation of the positron; (b) The generation of γ-rays from the annihilation of a positron and an electron; (c) The architecture of the tracer 18FDG; (d) Detector ring and scanning; (e) Computerized tomography; (f) Image reconstruction. . . . . 2

1.2 Diagram of an imaging platform dedicated to biomedical research on small animals. In this imaging platform, a microPET, a microSPECT and a microCT are combined to create both profile and molecular-level images of small animals. . . . . 4

1.3 The arrangement of the scintillator crystals and photodetectors. (a) Scintillator crystal with individually coupled photodetector [6]; (b) Scintillating crystals with axial-oriented photodetector [1, 7]. . . . . 5

1.4 The first prototype of the detector module for the proposed MicroPET for biomedical research on small animals [1]. (a) Detector modules; (b) The detector module with front-end electronics. . . . . 5

1.5 Images of the PLANACON MCP and its output waveform. (a) Photocathode face of the PLANACON MCP PMT; (b) Anode face of the PLANACON MCP PMT; (c) Typical output waveform. . . . . 6

1.6 The diagram of the front-end readout circuit for each detector channel. . . . . 7

1.7 The schematic of the data acquisition board which can process 256-channel sig-

nals for each MCP detector [11]. . . . . . . . . . . . . . . . . . . . . . . . 8

1.8 The diagram of front-end readout circuits for PET imaging. . . . . . . . . . . 12

2.1 Basic architectures of the analog section in front-end systems [39]. . . . . . . . 20


2.2 Basic architecture and signal flow of modern front-end electronics systems. (a) Photo-electric conversion; (b) Signal acquisition; (c) Pulse height analysis; (d) Peak-detect-and-hold; (e) Analog-to-digital conversion; (f) Time discriminator; (g) Time measurement and digitizing; (h) Digital signal processing. . . . . 22

2.3 The simplified model of the photodetector. (a) DC model; (b) Small-signal model. . . . . 24

2.4 Signal acquisition using a voltage-sensitive amplifier. (a) Basic equivalent model; (b) Simplified schematic. . . . . 26

2.5 The most commonly used CMOS voltage-mode amplifiers [45]. (a) Common-source amplifier; (b) Source follower or common-drain amplifier; (c) Common-gate amplifier; (d) Cascode amplifier; (e) Folded cascode amplifier; (f) Differential-pair amplifier. . . . . 26

2.6 Signal acquisition using a current-sensitive amplifier. (a) Basic equivalent model; (b) Simplified schematic. . . . . 27

2.7 The most commonly used CMOS current-mode amplifiers [46, 47]. (a) Simple current amplifier based on a current mirror; (b) A simple current amplifier using a three-transistor cascode current mirror; (c) A current amplifier using a four-transistor cascode current mirror; (d) A modified current mirror using a cascode current mirror for a low-voltage power supply; (e) A regulated cascode current amplifier. . . . . 28

2.8 Signal acquisition using a charge-sensitive amplifier. (a) Basic equivalent model; (b) Simplified schematic. . . . . 30

2.9 Schematic of a cascode charge-sensitive preamplifier [48]. . . . . 31

2.10 Schematic of the CR-RC shaper circuit. (a) Open-loop CR-CR shaper; (b) Active CR-RC shaper. . . . . 32

2.11 Output waveform of an active CR-RC shaper with a shaping time of 3 µs. The input voltage varies from 0.1 V to 1 V. . . . . 34

2.12 The semi-Gaussian shaper. (a) The architecture; (b) The schematic of the integrator. . . . . 34

2.13 Typical output waveforms of a semi-Gaussian shaper [41, 42]. . . . . 35

2.14 Schematic of the front-end electronics to realize pulse height measurements using a CR-RC shaper or semi-Gaussian shaper with a monostable circuit. . . . . 36

2.15 Front-end electronics to realize pulse height measurements using a CR-RC shaper or semi-Gaussian shaper followed by a peak-track-and-hold block, which tracks the pulse and stores the peak value. The circuit is implemented with an OTA, a diode, a capacitor and a voltage buffer with a feedback loop. . . . . 37

2.16 The modified architecture of peak-track-and-hold circuits [53]. . . . . 37

2.17 The two-phase peak-track-and-hold circuit [55]. . . . . . . . . . . . . . . . . 38

2.18 Ideal transfer function of a 3-bit ADC. . . . . . . . . . . . . . . . . . . . . 39

2.19 Survey results of the number of bits versus the sampling rate for ADCs released

in the past 20 years [59]. . . . . . . . . . . . . . . . . . . . . . . . . . . . 41

2.20 Survey results of the power dissipation versus the sampling rate for ADCs re-

leased in the past 20 years [59]. . . . . . . . . . . . . . . . . . . . . . . . . 41

2.21 "Time Walk" due to the variable slope of the current pulse. . . . . . . . . . . 43

2.22 A simple model of CFD circuits. . . . . . . . . . . . . . . . . . . . . . . . 43


2.23 Basis of time-to-digital conversion. (a) TDC block symbol; (b) Time interval between the Start and Stop signals; (c) Transfer curve of a 3-bit TDC. . . . . 44

2.24 Architecture of a TDC using current integration and analog-to-digital conversion. . . . . 46

2.25 The TDC using dual counters to overcome the metastability of the D flip-flops in the digital counters [83]. . . . . 46

2.26 Architecture of the TDC based on a single DLL [83]. . . . . . . . . . . . . . 47

2.27 Architecture of a cyclic TDC. . . . . . . . . . . . . . . . . . . . . . . . . 48

2.28 Architecture of GRO TDC. . . . . . . . . . . . . . . . . . . . . . . . . . . 48

2.29 Time amplifier and DLL-based TDC. (a) Schematic of a time amplifier; (b) Architecture of the TA-based TDC. . . . . 49

3.1 Proposed one-channel schematic of the front-end ASIC [8, 102]. . . . . 52

3.2 Regulated cascode transimpedance amplifier. (a) The schematic; (b) Small-signal analysis model, where the body effect is neglected. . . . . 53

3.3 One-channel schematic of the front-end ASIC. . . . . . . . . . . . . . . . . 54

3.4 Schematic of the proposed shaper circuit. . . . . . . . . . . . . . . . . . . . 56

3.5 Schematic of the proposed high-performance two-stage OPAMP. . . . . . . . . 56

3.6 Schematic of the bias circuits for the proposed OPAMP. . . . . . . . . . . . 57

3.7 Proposed time-stamp method for PET imaging. . . . . . . . . . . . . . . . . 57

3.8 Design of the high-speed weak-current comparator. (a) Circuit of the positive-feedback high-speed current comparator proposed by H. Traff in 1992 [76]; (b) Modified high-speed current comparator using two diode transistors [78]; (c) Optimized circuit for a high-speed low-current comparator using one diode transistor and fast inverters [79]. . . . . 58

3.9 Proposed high-speed current comparator as a discriminator for time-stamp cir-

cuits. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59

3.10 Principle of the Binary-weight current steering DAC. . . . . . . . . . . . . . 60

3.11 Architecture of high-performance current steering DAC. . . . . . . . . . . . . 61

3.12 Schematic of the proposed analog memory. . . . . . . . . . . . . . . . . . . 62

3.13 Photo of the prototype chip (IMOTEPA). . . . . . . . . . . . . . . . . . . 63

3.14 Output transient responses of the shaper in "direct-signal mode". . . . . . . . 63

3.15 Nonlinearity of the 10 channels for different injected charges from 10 to 104 pC in "hold mode". . . . . 64

3.16 Nonlinearity of the energy value versus the gain code (results from Channel 9). . . . . 65

3.17 Test results of the "time walk" of the 10 channels. . . . . 65

3.18 Trigger efficiency across the 10 channels. . . . . 66

4.1 The architecture of a charge-pump delay locked loop. . . . . . . . . . . . . 70

4.2 The architecture of a digital delay locked loop. . . . . . . . . . . . . . . . . 71

4.3 The architecture of a dual-loop delay locked loop. . . . . . . . . . . . . . . 72

4.4 The architecture of a mixed-mode delay locked loop. . . . . . . . . . . . . . 73

4.5 The architecture of a charge-pump delay locked loop with jitter models. . . . 75


4.6 The schematic of delay cells. (a) Current-starved delay cell; (b) Differential delay cell; (c) Weighted current-adder inverter cell; (d) Linear delay element using a variable MOS capacitor. . . . . 77

4.7 The schematic of three-state phase detectors. (a) Three-state detector using standard D flip-flops; (b) Phase detector using RS latches; (c) Phase detector using true-single-phase-clock (TSPC) flip-flops. . . . . 78

4.8 The waveform of a three-state phase detector. (a) The phase of Clk_out lags that of Clk_ref; (b) The phase of Clk_out leads that of Clk_ref; (c) The phase of Clk_out equals that of Clk_ref. . . . . 78

4.9 Circuits for charge pumps. (a) Charge pump using a single-ended current mirror with high-speed switches; (b) Differential charge pump. . . . . 79

4.10 Circuits for the loop filter. (a) Two-pole filter; (b) One-pole filter; the capacitor can be realized by MOS transistors. . . . . 80

4.11 The diagram of the proposed charge-pump DLL. . . . . . . . . . . . . . . . 81

4.12 The architecture of the VCDL and the schematic of the bias circuits and a current-starved delay cell. . . . . 82

4.13 DC characteristics of Vcp and Vcn versus Vc. . . . . . . . . . . . . . . . . . 83

4.14 Delay curve of the used current-starved delay cells. . . . . . . . . . . . . . . 83

4.15 Schematic of a Bangbang phase detector which consists of three cross-coupled RS latches. . . . . 84

4.16 Typical waveforms of a Bangbang phase detector. . . . . 85

4.17 Schematic of the proposed charge pump circuit. . . . . 85

4.18 Schematic of the loop filter. . . . . 86

4.19 Characteristic curve of the parameter scan for the charging current and the capacitor in the loop filter. . . . . 86

4.20 Layout of the DLL prototype with 32 delay cells. . . . . 87

4.21 Test waveform of the DLL with 32 delay cells. The Bangbang phase detector is employed. The figure shows that the DLL is well locked. The first output (Q<0>) and the last output (Q<31>) have a small time difference. . . . . 87

4.22 Modied current-starved delay cells with the novel bias circuit. . . . . . . . . 89

4.23 The delay time of a delay cell versus the controlled voltage. It illustrates that

the slope of curve is reduced by adding DC current sources. . . . . . . . . . 90

4.24 Current-starved delay cell with programmable DC current source. . . . . . . 91

4.25 Schematic of the dynamic phase detector, which consists of two TSPC flip-flops. . . . . 91

4.26 The comparison of the waveform for both two-state Bangbang PD and three-

state dynamic PD when DLL is locked. . . . . . . . . . . . . . . . . . . . 92

4.27 Comparison of the transfer characteristic curve for both Bangbang PD and dy-

namic PD. The dynamic PD has no dead zone and smaller generated jitters.

However, Bangbang PD has a dead zone of ± 10 ps in 0.35 µm CMOS technology. 92

4.28 Schematic of the improved charge pump circuit. A feedback circuit is added to

improve the mismatch of the charging and discharging current so that the jitter

due to Vc can be reduced. . . . . . . . . . . . . . . . . . . . . . . . . . . 94


4.29 Simulated charging and discharging current versus the controlled voltage. The

data are collected from the input current varying from 10 µA to 40 µA. With a

100-pF capacitor, the available range of the charging and discharging current is

10 µA ∼ 30 µA while the controlled voltage operates in the range of 1 V ∼ 2 V. 94

4.30 Schematic of the improved loop filter for the DLL array. . . . . 95

5.1 The distribution diagram of the TDCs with the resolution versus the measured

range according to the survey on the TDC architectures [124]. . . . . . . . . . 99

5.2 The timing of the coarse-fine TDC based on the counter and the time interpolation. . . . . 100

5.3 The architecture of a coarse-fine TDC using flash sampling. Here, the counter can be a binary counter or a Gray-code counter. The multiphase clock generator can be a single DLL or a DLL array. . . . . 101

5.4 The proposed architecture of a 16-channel 625-ps TDC based on a 9-bit counter

and a DLL with 32 delay cells. . . . . . . . . . . . . . . . . . . . . . . . . 102

5.5 The delay line in the DLL used for the IMOTEPD. With a 50 MHz reference

clock, 32 delayed clocks with precise delay of 625 ps can be obtained. . . . . . 103

5.6 The proposed two-buffer architecture for delay cells in the DLL. The buffer is realized by two inverters; the dimension of the second inverter is 20 times that of the first one. . . . . 104

5.7 The coarse conversion using the dual-counter architecture. . . . . . . . . . . 106

5.8 The schematic of the proposed 9-bit binary counter. . . . . . . . . . . . . . 106

5.9 The reset circuits of the counters. . . . . . . . . . . . . . . . . . . . . . . 107

5.10 The data format of the time words for the proposed TDC. . . . . . . . . . . 107

5.11 The architecture of the PISO registers used in the TDC. In the write mode, the data from the Hit registers and the encoder are written in parallel into the registers. In the read mode, the data are then read out serially. The one-bit register can be realized by a standard D flip-flop with transmission-gate switches. . . . . 108

5.12 The timing of the readout method for 64-channel converted data. . . . . 108

5.13 The layout of the TDC prototype, named IMOTEPD. The prototype integrates a DLL which uses the circuits described in Section 4.2 in Chapter 4, two 9-bit binary counters, 16-channel digital readout circuits and a JTAG controller. . . . . 109

5.14 The differential nonlinearity of IMOTEPD. The maximum value is ± 0.35 LSB (where 1 LSB = 625 ps). . . . . 109

5.15 The tested differential nonlinearity (DNL) of the TDC circuits built in IMOTEPAD. The DNL of the fine conversion is ± 106.2 ps, which corresponds to ± 0.17 LSB when the bin size is 625 ps. . . . . 110

5.16 The tested integral nonlinearity (INL) of the TDC circuits built in IMOTEPAD. The INL of the fine conversion is ± 193.7 ps, which corresponds to ± 0.31 LSB when the bin size is 625 ps. . . . . 111

5.17 The topology of a DLL array [120, 83]. Two kinds of DLL are used to construct the array. The time taps of the delay cells in the two classes of DLL are Tm and Tn, respectively. The bin size of the ADLL is given by the delay difference between Tm and Tn (where Tm > Tn). Each DLL is locked in one clock period. . . . . 112


5.18 The principle of the time interpolation using a DLL array. . . . . . . . . . . 114

5.19 Bin size versus the frequency of the input clock. . . . . . . . . . . . . . . . 114

5.20 The architecture of the multi-channel TDC based on a DLL array. . . . . . . 115

5.21 The diagram of the proposed DLL array. . . . . . . . . . . . . . . . . . . 116

5.22 Proposed pipeline readout circuits for the fine conversion in the prototype TDC. An open-loop duty-cycle-correction (DCC) circuit should be placed when the delayed signal is a clock. . . . . 117

5.23 The photo of the three-channel high-resolution TDC prototype (TDC-ADLL). In this chip, a DLL array consisting of five DLLs can generate 140 clock phases in one clock period. With a clock of 100 MHz, a typical resolution of 71 ps can be obtained. . . . . 118

5.24 The photo of the test board of the high-resolution TDC. . . . . 119

5.25 Test waveform of the first output and the last output of the DLL35〈0〉 when the frequency of the clock is 100 MHz. . . . . 119

5.26 Test results of the DLL35s when the frequency of the clock is 100 MHz. . . . . 120

5.27 DNL and INL of the TDC using the DLL array embedded in the 3-channel TDC. The data are obtained by collecting events with a random sampling signal at 27 °C and a power supply of 3.3 V. . . . . 121

6.1 Design considerations of the ADC for imaging detector systems. (a) Block diagram of typical front-end electronics with ADC technology; (b) The analog-to-digital interface using parallel single-channel ADCs; (c) The analog-to-digital interface using a single high-speed ADC; (d) The analog-to-digital interface using a multi-channel ADC. . . . . 124

6.2 Diagram of the time-based ADC. . . . . . . . . . . . . . . . . . . . . . . 125

6.3 Block diagram of Wilkinson Ramp ADC. . . . . . . . . . . . . . . . . . . . 127

6.4 Timing of Wilkinson ramp ADC. . . . . . . . . . . . . . . . . . . . . . . 127

6.5 Timing of the time interpolation and the multiphase sampling. . . . . . . . . 128

6.6 Block diagram of proposed ADC. The multiphase sampling techniques are pro-

posed for the enhancement of the sampling rate. . . . . . . . . . . . . . . . 131

6.7 Schematic of the proposed high-linearity ramp generator [132]. . . . . . . . . 132

6.8 Transient behaviors of proposed ramp generator from the post simulation. Note

that the nonlinearity of the ramp voltage occurred at the both sides of the ramp

voltage. (a) Reset; (b)Vramp;(c)Slope of the ramp voltage. . . . . . . . . . . 133

6.9 Post-simulated results of Vramp at different corners. (a) worst (vdda = 3 V; temp = 80 °C); (b) typical (vdda = 3.3 V; temp = 27 °C); (c) best (vdda = 3.6 V; temp = -50 °C). . . . . 134

6.10 Proposed high-precision high-speed comparator for ramp ADCs. . . . . 134

6.11 High-precision high-speed comparator with output-offset-storage techniques. . . . . 135

6.12 Post simulation results of the comparator with and without auto-zero techniques. 135

6.13 AC response of the comparator. The open-loop gain is 94.8 dB when the fre-

quency is less than 1 MHz. . . . . . . . . . . . . . . . . . . . . . . . . . . 136

6.14 Diagram of the low-jitter digital DLL. . . . . . . . . . . . . . . . . . . . . 136


6.15 Waveform when the digital DLL is locked. . . . . . . . . . . . . . . . . . . 137

6.16 Schematic of the linear delay element. . . . . . . . . . . . . . . . . . . . . 138

6.17 Characteristics of the linear delay element. . . . . . . . . . . . . . . . . . . 138

6.18 Schematic of the phase detector. . . . . . . . . . . . . . . . . . . . . . . . 139

6.19 The layout of the proposed DLL. . . . . . . . . . . . . . . . . . . . . . . . 139

6.20 The block diagram of the Gray-code counter [140]. . . . . . . . . . . . . . . 139

6.21 The novel sampling scheme of TDC. . . . . . . . . . . . . . . . . . . . . . 140

6.22 The controlled timing of the whole ADC. . . . . . . . . . . . . . . . . . . . 141

6.23 The microphoto of the proposed ADC. . . . . . . . . . . . . . . . . . . . . 144

6.24 Output waveform of the comparator with the ramp generator. Note that the

comparator introduces a delay time of 35 ns. This delay time depends on the

slope of the ramp voltage. . . . . . . . . . . . . . . . . . . . . . . . . . . 145

6.25 The output waveform of the DDLL. . . . . . . . . . . . . . . . . . . . . . 145

7.1 The front-end electronics using novel digital shaping. The shaped pulse is over-

sampled by a high-speed ADC and then processed by all-digital circuits. . . . . 152

7.2 The principle of the signal processing based on multi-threshold-voltage sampling

method. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 153

7.3 The hardware to realize the multi-threshold-voltage sampling method. . . . . . 153


List of Tables

1.1 CMOS technology roadmap . . . . . . . . . . . . . . . . . . . . . . . . . 13

2.1 Several contributions on front-end electronics for high-energy physics and med-

ical imaging . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23

2.2 Several contributions on analog-to-digital converters (ADC) . . . . . . . . . . 42

2.3 Several contributions on time-to-digital converters (TDC) . . . . . . . . . . . 45

3.1 Comparison of the overview performances of the front-end readout ASICs . . . 67

4.1 The performance comparison of the proposed and the optimized charge-

pump DLL in the locking states . . . . . . . . . . . . . . . . . . . . . . 95

5.1 Lookup table of the thermometer-to-binary conversion . . . . . . . . . . 117

5.2 Overall performances of the proposed multi-channel TDCs . . . . . . . . 121

6.1 Comparison of available time-based ADCs . . . . . . . . . . . . . . . . . . 130

6.2 Overall performances of the proposed time-based ADC . . . . . . . . . . . . 146


Bibliography

[1] S. Salvador. Conception et realisation d'un module de detection d'un tomographe

a emission de positrons dedie a l'imagerie du petit animal. PhD thesis, University

of Strasbourg, 2009.

[2] D.W. Townsend. Physical principles and technology of clinical pet imaging. Annals Academy of Medicine, 33(2):133–145, 2004.

[3] D. J. Burdette. Very high resolution small animal pet. 2000.

[4] D. Brasse, I. Piqueras, and J.-L. Guyonnet. Design of a small animal pet system high detection efficiency. In Nuclear Science Symposium Conference Record, 2004 IEEE, volume 4, pages 2412–2416, oct. 2004.

[5] G. D. Hutchins, M. A. Miller, V. C. Soon, and T. Receveur. Small animal pet imaging. ILAR Journal, 49(1):54–65, 2008.

[6] William W. Moses. Photodetectors for nuclear medical imaging. Nuclear Instruments and Methods in Physics Research Section A: Accelerators, Spectrometers, Detectors and Associated Equipment, 610(1):11–15, 2009. New Developments In Photodetection NDIP08, Proceedings of the Fifth International Conference on New Developments in Photodetection.

[7] S. Salvador, D. Huss, and D. Brasse. Design of a high performances small animal pet system with axial oriented crystals and doi capability. Nuclear Science, IEEE Transactions on, 56(1):17–23, feb. 2009.

[8] N. Ollivier-Henry, J. D. Berst, and C. Colledani et al. A front-end readout mixed chip for high-efficiency small animal pet imaging. Nuclear Instruments and Methods in Physics Research Section A: Accelerators, Spectrometers, Detectors and Associated Equipment, 571(1-2):312–316, 2007.

[9] N. Ollivier-Henry, P. Bard, and D. Brasse et al. Imotepd: A low-jitter 16-channels time-to-digital converter based on delay locked loop for small animal pet imaging applications. records of 2008 IEEE nuclear science symposium and medical imaging conference, Dresden (Germany), Oct. 2008.

[10] X. Fang, N. Ollivier-Henry, and W. Gao et al. Imotepad: 64-channel front-end asic for small animal pet imaging. Nuclear Instruments and Methods in Physics Research Section A: Accelerators, Spectrometers, Detectors and Associated Equipment, Submitted.

[11] IPHC Microelectronic Group. Design review du micro-circuit "imotepad 64".

T.B.D, Mar. 2009.

[12] W. F. Jones, L. G. Byars, and M. E. Casey. Positron emission tomographic images and expectation maximization: a vlsi architecture for multiple iterations per second. IEEE Transactions on Nuclear Science, 35(1):620–624, 1988.

[13] D. F. Newport and J. W. Young. An asic implementation of digital front-end electronics for a high resolution pet scanner. IEEE Transactions on Nuclear Science, 40(4):1017–1019, 1993.

[14] W. Wai-Hoi, G. Hu, and N. Zhang et al. Front end electronics for a variable field pet camera using the pmt-quadrant-sharing detector array design. IEEE Transactions on Nuclear Science, 44(3):1291–1296, 1997.

[15] B. K. Swann, J. M. Rochelle, and D. M. Binkley et al. A custom mixed signal cmos integrated circuit for high performance pet tomograph front-end applications. Nuclear Science Symposium and Medical Imaging Conference, 1:24–28, 2002.

[16] M. S. Musrock, J. W. Young, and J. C. Moyers et al. Performance characteristics

of a new generation of processing circuits for pet applications. IEEE Transactions


[17] A. Saoudi and R. Lecomte. A novel apd-based detector module for multi-modality pet/spect/ct scanners. IEEE Transactions on Nuclear Science, 46(3):479–484, 1999.

[18] M. L. Woodring, J. F. Christian, and K. S. Shah et al. Development of an asic for apd-based small animal pet. Nuclear Science Symposium and Medical Imaging Conference, 2:737–741, 2001.

[19] Z. Deng, J. Y. Yeom, and T. Ishitsu et al. Design of new front-end electronics for animal pet. Nuclear Science Symposium and Medical Imaging Conference, 3:1543–1546, 2002.

[20] T. K. Lewellen, C. M. Laymon, and R. S. Miyaoka et al. Development of a data acquisition system for the mices small animal pet scanner. Nuclear Science Symposium and Medical Imaging Conference, 2:1066–1070, 2002.

[21] F. Habte and C. S. Levin. Study of low noise multichannel readout electronics for high sensitivity pet systems based on avalanche photodiode arrays. IEEE Transactions on Nuclear Science, 51(3):764–769, 2004.

[22] J. F. Pratte, G. De Geronimo, and S. Junnarkar et al. Front-end electronics for the ratcap mobile animal pet scanner. IEEE Transactions on Nuclear Science, 51(4):1318–1323, 2004.


[23] K. S. Shah, R. Grazioso, and R. Farrell et al. Position sensitive apds for small animal pet imaging. IEEE Transactions on Nuclear Science, 51(1):91–95, 2004.

[24] J. F. Pratte, S. Robert, and G. De Geronimo et al. Design and performance of 0.18um cmos charge preamplifiers for apd-based pet scanners. IEEE Transactions


[25] Maurizio Conti. State of the art and challenges of time-of-flight pet. Physica Medica, 25(1):1–11, 2009.

[26] William W. Moses. Recent advances and future advances in time-of-flight pet. Nuclear Instruments and Methods in Physics Research Section A: Accelerators, Spectrometers, Detectors and Associated Equipment, 580(2):919–924, 2007. Imaging 2006 - Proceedings of the 3rd International Conference on Imaging Techniques in Subatomic Physics, Astrophysics, Medicine, Biology and Industry.

[27] P. Fischer, I. Peric, M. Ritzert, and T. Solf. Multi-channel readout asic for tof-pet.

2000.

[28] M. Pedrali-Noy, G. Gruber, B. Krieger, E. Mandelli, G. Meddeler, W. Moses, and

V. Rosso. Petric - a positron emission tomography readout integrated circuit. 2000.

[29] J. F. Pratte, C. M. Pepin, and D. Rouleau et al. Design of a fast shaping amplifier for pet/ct apd detectors with depth-of-interaction. IEEE Transaction on nuclear science, 2:807–811, 2002.

[30] Y. C. Tai, A. F. Chatziioannou, and R. W. Silverman et al. Micropet ii: an ultra-high resolution small animal pet system. Nuclear Science Symposium Conference Record, 3:1848–1852, 2002.

[31] J. D. Martinez, J. M. Benlloch, and J. Cerda et al. High-speed data acquisition and digital signal processing system for pet imaging techniques applied to mammography. IEEE Transactions on Nuclear Science, 51(3):407–412, 2004.

[32] P. E. Vert, J. Lecoq, and G. Montarou et al. Innovative electronics architecture for pet imaging. Nuclear Science Symposium Conference Record, pages 3057–3059, 2006.

[33] P. Guerra, J. Espinosa, and J. E. Ortuno et al. New embedded digital front-end for high resolution pet scanner. IEEE Transactions on Nuclear Science, 53(2):770–775, 2006.

[34] S. S. Junnarkar, J. Fried, and S. Southekal et al. New time to digital converter, signal processing, data acquisition, calibration and test hardware for ratcap. Nuclear Science Symposium Conference Record, pages 4597–4601, 2007.


[35] J.-F. Genat, G. Varner, F. Tang, and H. Frisch. Signal processing for picosecond resolution timing measurements. Nuclear Instruments and Methods in Physics Research A, 607:387–393, 2009.

[36] Texas Instruments. Dsps in medical imaging.

http://focus.ti.com/general/docs/gencontent.tsp?contentId=51435, White pa-

per, 2008.

[37] Q. Xie, C. M. Kao, Z. Hsiau, and C. T. Chen. A new approach for pulse processing in positron emission tomography. IEEE Transactions on Nuclear Science, 604(1-2):327–330, 2009.

[38] N. Ollivier-Henry, J.D. Berst, C. Colledani, Ch. Hu-Guo, N.A. Mbow, D. Staub, J.L Guyonnet, and Y. Hu. A front-end readout mixed chip for high-efficiency small animal pet imaging. Nuclear Instruments and Methods in Physics Research Section A: Accelerators, Spectrometers, Detectors and Associated Equipment, 571(1-2):312–316, 2007. Proceedings of the 1st International Conference on Molecular Imaging Technology - EuroMedIm 2006.

[39] L. Fabris and P. Manfredi. Optimization of front-end design in imaging and spectrometry applications with room temperature semiconductor detectors. IEEE Transaction on Nuclear Science, 49(4):1978–1986, 2002.

[40] P.F. Manfredi and M. Manghisoni. Front-end electronics for pixel sensors. Nucl. Instrum. Methods A, 465:140–147, 2001.

[41] Z. Y. Chang and W. M.C. Sansen. Low-noise wide-band amplifiers in bipolar and cmos technologies. Kluwer Academic Publishers, 1991.

[42] Helmuth Spieler. Semiconductor detector systems. Oxford university press, 2005.

[43] K. Iniewski. Medical imaging: principles, detectors and electronics. A John Wiley and Sons, Inc., Publication, 2009.

[44] C. L. Melcher. Scintillation crystals for pet. The Journal of Nuclear Medicine, 41(6):1051–1055, 2000.

[45] B. Razavi. Design of analog cmos integrated circuits. The McGraw-Hill Companies,

Inc., 2001.

[46] K. Koli. Cmos current amplifier: Speed versus nonlinearity. PhD thesis, Helsinki University of Technology, 2000.

[47] S.M. Park and H.-J. Yoo. 1.25-gb/s regulated cascode cmos transimpedance amplifier for gigabit ethernet applications. IEEE Journal of solid-state circuits, 39(1):112–121, 2004.


[48] G. Gramegna, P. O'Connor, P. Rehak, and S. Hart. Cmos preamplier for low-

capacitance detectors. Nuclear Instruments and Methods in Physics Research A,

390:241245, 1997.

[49] W. Sansen and Z.Y. Chang. Limits of low noise performance of detector read-

out front ends in cmos technology. IEEE Transaction On Circuits and Systems,

37(11):13751381, 1990.

[50] K. T. Z. Oo, E. Mandelli, and W. W. Moses. A high-speed low-noise 16-channel

csa with automatic leakage compensation in 0.35-µm cmos process for apd-based

pet detectors. IEEE Trasaction On Nuclear Science, 54(3):444453, 2007.

[51] T. Noulis, C. Deradonis, S. Siskos, and G. Sarrabayrouse. Detailed study of par-

ticle detectors ota-based cmos semi-gaussian shapers. Nuclear Instruments and

Methods in Physics Research Section A: Accelerators, Spectrometers, Detectors and

Associated Equipment, 583(2-3):469-478, 2007.

[52] P.F. Buckens and M.S. Veatch. A high performance peak-detect and hold circuit

for pulse height analysis. IEEE Trans. Nucl. Sci., 39(4):753-757, 1992.

[53] M.W. Kruiskamp and D.M.W. Leenaerts. A cmos peak detect sample and hold

circuit. IEEE Trans. Nucl. Sci., 41(1):295-298, 1994.

[54] M.N. Ericson, M.L. Simpson, and C.L. Britton. A low-power cmos peak detect and

hold circuit for nuclear pulse spectroscopy. IEEE Trans. Nucl. Sci., 42(4):724-728,

1995.

[55] G.D. Geronimo, P. O'Connor, and A. Kandasamy. Analog cmos peak detect and

hold circuits. Part 1. Analysis of the classical configuration. Nuclear Instruments and

Methods in Physics Research A, 484:533-543, 2002.

[56] B. Razavi. Principles of data conversion system design. Wiley-IEEE Press, 1995.

[57] K.H. Lundberg. Analog-to-digital converter testing.

http://web.mit.edu/klund/www/papers, 2002.

[58] R.H. Walden. Analog-to-digital converter survey and analysis. IEEE J. Select.

Areas Commun., 17(4):539-550, Apr. 1999.

[59] B. Le, T.W. Rondeau, J.H. Reed, and C.W. Bostian. Analog-to-digital converters.

IEEE Signal Processing Magazine, pages 69-77, Nov. 2005.

[60] S. Shahramian, S.P. Voinigescu, and A.C. Carusone. A 35-gs/s, 4-bit flash adc

with active data and clock distribution trees. Solid-State Circuits, IEEE Journal

of, 44(6):1709-1720, June 2009.

[61] W. Gao, Ch. Hu-Guo, T. Wei, D. Gao, and Y. Hu. A 12-bit 2.5ms/s multi-channel

ramp analog-to-digital converter for imaging detectors. In Imaging Systems and

Techniques, 2009. IST '09. IEEE International Workshop on, pages 183-186, May

2009.

[62] J. Bouvier, M. Dahoumane, D. Dzahini, J.Y. Hostachy, E. Lagorio, O. Rossetto,

H. Ghazlane, and D. Dallet. A low power and low signal 5-bit 25 ms/s pipelined

adc for monolithic active pixel sensors. Nuclear Science, IEEE Transactions on,

54(4):1195-1200, Aug. 2007.

[63] S. Gambini and J. Rabaey. Low-power successive approximation converter with 0.5

v supply in 90 nm cmos. Solid-State Circuits, IEEE Journal of, 42(11):2348-2356,

Nov. 2007.

[64] E. Delagnes, D. Breton, F. Lugiez, and R. Rahmanifard. A low power multi-

channel single ramp adc with up to 3.2 ghz virtual clock. Nuclear Science, IEEE

Transactions on, 54(5):1735-1742, Oct. 2007.

[65] G.B. Clayton. Data converters. Wiley, 1982.

[66] J.D. Lenk. Simplified design of data converters. Elsevier, 1997.

[67] S.R. Norsworthy. Delta-sigma data converters: theory, design and simulation. IEEE,

1997.

[68] M. Gustavsson. Cmos data converter for communications. Kluwer, 2000.

[69] C. Shi et al. Data converters for wireless standards. Kluwer, 2002.

[70] A. Rodriguez et al. Cmos telecom data converters. Kluwer, 2003.

[71] Rudy J. van de Plassche. Cmos integrated analog-to-digital and digital-to-analog

converters. Springer, 2nd Edition, 2003.

[72] F. Maloberti. Data converters. Springer, 2007.

[73] A. Roermund, H. Casier, and M. Steyaert. Analog circuit design: Smart data

converters, filters on chip, multimode transmitters. Springer, 2009.

[74] B. Razavi and B.A. Wooley. Design techniques for high-speed, high-resolution

comparators. Solid-State Circuits, IEEE Journal of, 27(12):1916-1926, Dec. 1992.

[75] Y. Degerli, N. Fourches, M. Rouger, and P. Lutz. Low-power autozeroed high-speed

comparator for the readout chain of a cmos monolithic active pixel sensor based

vertex detector. Nuclear Science, IEEE Transactions on, 50(5):1709-1717, Oct.

2003.

[76] H. Traff. Novel approach to high speed cmos current comparators. Electronics

Letters, 28(3):310-312, 1992.

[77] G. Palmisano and G. Palumbo. High performance cmos current comparator design.

IEEE Trans. Circuits Syst. II, 43:785-790, 1996.

[78] L. Ravezzi, D. Stoppa, and G.-F. Dalla Betta. Simple high-speed cmos current

comparator. Electronics Letters, 33(22):1829-1830, 1997.

[79] H. Lin, J. Huang, and S.-C. Wong. Simple high-speed cmos current compara-

tor. Proceedings of IEEE International Symposium on Circuits and Systems, 2000,

Geneva, 2:713-716, 2000.

[80] M.L. Simpson, C.L. Britton, A.L. Wintenberg, and G.R. Young. An integrated,

cmos, constant-fraction timing discriminator for multichannel detector systems.

Nuclear Science, IEEE Transactions on, 42(4):762-766, Aug. 1995.

[81] M.L. Simpson and M.J. Paulus. Discriminator design considerations for time-

interval measurement circuits in collider detector systems. Nuclear Science, IEEE

Transactions on, 45(1):98-104, Feb. 1998.

[82] D.M. Binkley, B.S. Puckett, B.K. Swann, J.A. Rochelle, M.S. Musrock, and M.E.

Casey. A 10-mc/s, 0.5-µm cmos constant-fraction discriminator having built-in

pulse tail cancellation. Nuclear Science, IEEE Transactions on, 49(3):1130-1140,

Jun. 2002.

[83] M. Mota. Design and characterization of cmos high-resolution time-to-digital con-

verters. PhD dissertation, Universidade Técnica de Lisboa, 2000.

[84] A. Mantyniemi. An integrated cmos high-precision time-to-digital converter based

on stabilised three-stage delay line interpolation. PhD dissertation, University of

Oulu, 2004.

[85] M.A.Z. Straayer. Noise shaping techniques for analog and time to digital converter

using voltage controlled oscillator. PhD thesis, MIT, 2008.

[86] S. Henzler. Time-to-digital converter. Springer, Springer Series in Advanced

Microelectronics, Jun. 2007.

[87] M. Tanaka, H. Ikeda, M. Ikeda, and S. Inaba. Development of monolithic time-

to-amplitude converter for high precision tof measurement. Nuclear Science, IEEE

Transactions on, 38(2):301-305, Apr 1991.

[88] F. Bigongiari, R. Roncella, R. Saletti, and P. Terreni. A 250-ps time-resolution

cmos multihit time-to-digital converter for nuclear physics experiments. Nuclear

Science, IEEE Transactions on, 46(2):73-77, Apr 1999.

[89] Mircea Bogdan, Henry Frisch, Mary Heintz, Alexander Paramonov, Harold Sanders,

Steve Chappa, Robert DeMaat, Rod Klein, Ting Miao, Peter Wilson, and

Thomas J. Phillips. A 96-channel fpga-based time-to-digital converter (tdc) and

fast trigger processor module with multi-hit capability and pipeline. Nuclear In-

struments and Methods in Physics Research Section A: Accelerators, Spectrometers,

Detectors and Associated Equipment, 554(1-3):444-457, 2005.

[90] A.F. Kirichenko, S. Sarwana, O.A. Mukhanov, I.V. Vernik, Y. Zhang, J. Kang,

and J.M. Vogt. Rsfq time digitizing system. Applied Superconductivity, IEEE

Transactions on, 11(1):978-981, Mar 2001.

[91] J. Christiansen. An integrated high resolution cmos timing generator based on an

array of delay locked loops. Solid-State Circuits, IEEE Journal of, 31(7):952-957,

Jul 1996.

[92] B.K. Swann, B.J. Blalock, L.G. Clonts, D.M. Binkley, J.M. Rochelle, E. Breeding,

and K.M. Baldwin. A 100-ps time-resolution cmos time-to-digital converter for

positron emission tomography imaging applications. Solid-State Circuits, IEEE

Journal of, 39(11):1839-1852, Nov. 2004.

[93] F. Baronti, L. Fanucci, D. Lunardini, R. Roncella, and R. Saletti. On the differential

nonlinearity of time-to-digital converters based on delay-locked-loop delay lines.

Nuclear Science, IEEE Transactions on, 48(6):2424-2431, Dec 2001.

[94] M. Mota and J. Christiansen. A high-resolution time interpolator based on a delay

locked loop and an rc delay line. IEEE Journal of Solid-State Circuits,

34(10):1360-1366, 1999.

[95] P. Dudek, S. Szczepanski, and J.V. Hatfield. A high-resolution cmos time-to-digital

converter utilizing a vernier delay line. Solid-State Circuits, IEEE Journal of,

35(2):240-247, Feb 2000.

[96] P. Chen, Shen-Luan Liu, and Jingshown Wu. A cmos pulse-shrinking delay element

for time interval measurement. Circuits and Systems II: Analog and Digital Signal

Processing, IEEE Transactions on, 47(9):954-958, Sep 2000.

[97] C.-C. Chen, P. Chen, C.-S. Hwang, and W. Chang. A precise cyclic cmos time-

to-digital converter with low thermal sensitivity. IEEE Transactions on Nuclear

Science, 52(4):954-958, Aug. 2005.

[98] Yue Liu, U. Vollenbruch, Yangjian Chen, C. Wicpalek, L. Maurer, T. Mayer,

Z. Boos, and R. Weigel. A 6ps resolution pulse shrinking time-to-digital converter

as phase detector in multi-mode transceiver. In Radio and Wireless Symposium,

2008 IEEE, pages 163-166, 22-24 2008.

[99] M. Z. Straayer and M. H. Perrott. A multi-path gated ring oscillator tdc with first-

order noise shaping. IEEE Journal of Solid-State Circuits, 44(4):1089-1098, April

2009.

[100] I. Nissinen, A. Mantyniemi, and J. Kostamovaara. A cmos time-to-digital converter

based on a ring oscillator for a laser radar. Proc. IEEE ESSCIRC, pages 469-472,

Apr. 2003.

[101] M. Lee and A. A. Abidi. A 9b 1.25 ps resolution coarse-fine time-to-digital converter

in 90 nm cmos that amplifies a time residue. Proc. Symp. VLSI Circuits, pages 168-169,

Jun. 2007.

[102] N. A. MBOW. Conception et intégration en technologie cmos d'un circuit de lecture

et d'identification de coïncidences à résolution temporelle de l'ordre de la nanoseconde

destiné à l'imagerie biomédicale. PhD thesis, University of Strasbourg, 2009.

[103] Y. Hu, J.L. Solere, D. Lachartre, and R. Turchetta. Design and performance of a

low-noise, low-power consumption cmos charge amplifier for capacitive detectors.

IEEE Transactions on Nuclear Science, 45(1):119-123, 1998.

[104] H. Mathez, G.-N. Lu, P. Pittet, et al. A charge-sensitive amplifier associated

with apd or pmt for 511 kev photon-pair detection. Nuclear Instruments and

Methods in Physics Research Section A: Accelerators, Spectrometers, Detectors and

Associated Equipment, In Press (Uncorrected Proof, Available online), 2009.

[105] S. M. Park and C. Toumazou. Gigahertz low noise cmos transimpedance amplifier.

IEEE International Symposium on Circuits and Systems, Hong Kong:209-212, 1997.

[106] J. Deveugele and M.S.J. Steyaert. A 10-bit 250-ms/s binary-weighted current-

steering dac. Solid-State Circuits, IEEE Journal of, 41(2):320-329, Feb. 2006.

[107] M. Koizumi, J. Kataoka, and S. Tanaka et al. Development of a low-noise analog

front-end asic for apd-pet detectors. Nuclear Instruments and Methods in Physics

Research Section A: Accelerators, Spectrometers, Detectors and Associated Equipment,

604(1-2):327-330, 2009.

[108] Y. Moon, J. Choi, K. Lee, D.-K. Jeong, and M.-K. Kim. An all-analog multiphase

delay-locked loop using a replica delay line for wide-range operation and low-jitter

performance. IEEE Journal of Solid-State Circuits, 35(3):377-384, 2000.

[109] Y.-J. Jung, S.-W. Lee, D. Shim, W. Kim, C. Kim, and S.-I. Cho. An all-analog

multiphase delay-locked loop using a replica delay line for wide-range operation and

low-jitter performance. IEEE Journal of Solid-State Circuits, 36(5):377-384, 2001.

[110] H. Chang, J. Lin, C. Yang, et al. A wide-range delay-locked loop with a fixed

latency of one clock cycle. IEEE Journal of Solid-State Circuits, 37:1021-1027, 2002.

[111] E. Song, S.-W. Lee, J.-W. Lee, J. Park, and S.-I. Chae. A reset-free anti-harmonic

delay-locked loop using a cycle period detector. IEEE journal of solid-state circuits,

39(11):2055-2061, 2004.

[112] H.-H. Chang, J.-Y. Chang, C.-Y. Kuo, and S.-I. Liu. A 0.7-2-ghz self-calibrated

multiphase delay-locked loop. IEEE Journal of Solid-State Circuits, 41(5):1051-1061,

2004.

[113] K.-H. Cheng and Y.-L. Lo. A fast-lock mixed-mode dll with wide-range operation

and multiphase outputs. Proceedings of Design, Automation and Test in Europe,

2006. DATE '06., 2, 2006.

[114] K.-H. Cheng and Y.-L. Lo. A fast-lock wide-range delay-locked loop using

frequency-range selector for multiphase clock generator. IEEE Transactions on

Circuits and Systems II: Express Briefs, 54(7):561-565, 2007.

[115] C.-N. Chuang and S.-I. Liu. A 0.5-5-ghz wide-range multiphase dll with a cal-

ibrated charge pump. IEEE Transactions on Circuits and Systems II: Express

Briefs, 54(11):939-943, 2007.

[116] R.L. Aguiar and D.M. Santos. Simulation and modelling of digital delay locked

loop. 42nd Midwest Symposium on Circuits and Systems, 2:843-846, 1999.

[117] R.L. Aguiar and D.M. Santos. Modelling charge-pump delay locked loops. Proceed-

ings of ICECS '99. The 6th IEEE International Conference on Electronics, Circuits

and Systems, 2:823-826, 1999.

[118] A. Ghaffari and A. Abrishamifar. A novel wide-range delay cell for dlls. Proceedings

of ICECE '06. The 4th International Conference on Electrical and Computer

Engineering, pages 497-500, 2006.

[119] W. Gao, D. Brasse, Ch. Hu-Guo, D. Gao, and Y. Hu. Precise multiphase clock

generation using low-jitter delay-locked loop techniques for positron emission tomography

imaging. Nuclear Science, IEEE Transactions on, 57(3):1063-1070, Jun.

2010.

[120] J. Christiansen. An integrated cmos 0.15 ns digital timing generator for tdc's and

clock distribution systems. IEEE Transactions on Nuclear Science, 42(4):753-757,

1995.

[121] O. Bourrion and L. Gallin-Martel. An integrated cmos time-to-digital converter for

coincidence detection in a liquid xenon pet prototype. Nuclear Instruments and

Methods in Physics Research Section A: Accelerators, Spectrometers, Detectors and

Associated Equipment, 563(1):100-103, 2006. Proceedings of the 7th International

Workshop on Radiation Imaging Detectors - IWORID 2005.

[122] A. S. Yousif and J. W. Haslett. A fine resolution tdc architecture for next generation

pet imaging. IEEE Transactions on Nuclear Science, 54(5):1574-1582, 2007.

[123] X. Kang, S. Wang, Y. Liu, et al. A simple smart time-to-digital convertor based

on vernier method for a high resolution lyso micropet. volume 4, pages 2892-2896,

Oct. 2007.

[124] W. Gao, D. Gao, Ch. Hu-Guo, and Yann Hu. Integrated high-resolution multi-

channel time-to-digital converters (tdcs) for pet imaging. Book title: Biomedical

Engineering, Trends, Researches and Technologies, ISBN: 978-953-7619-X-X, Dec.

2010.

[125] S. Naraghi. Time-based analog-to-digital converters. PhD thesis, The University

of Michigan, 2009.

[126] S. Naraghi, M. Courcy, and M.P. Flynn. A 9bit 14 uw 0.06mm2 pulse position modulation

adc in 90nm digital cmos. In IEEE International Solid-State Circuits Conference

(ISSCC 2009), Feb. 2009.

[127] A.H. Reeves. Electric signaling system. U.S. Patent 2272070, Feb. 1942.

[128] W. Kester, editor. Analog Devices: The data conversion handbook.

http://analog.com/library/analogDialogue/archives/39-60/data_conversion_handbook.html,

2005.

[129] H. Pekau, A. Yousif, and J.W. Haslett. A cmos integrated linear voltage-to-pulse-

delay-time converter for time-based analog-to-digital converters. In ISCAS2006,

pages 2373-2376, 2006.

[130] J. Kim and S. Cho. A time-based analog-to-digital converter using a multiphase

voltage-controlled oscillator. In ISCAS2006, pages 3934-3937, 2006.

[131] O.B. Milgrome, S.A. Kleinfelder, and M.E. Levi. A 12 bit analog to digital converter

for vlsi applications in nuclear science. IEEE Transactions on Nuclear Science,

volume 39, pages 771-776, 1992.

[132] M.S. Emery, S.S. Frank, C.L. Britton Jr., A.L. Wintenberg, M.L. Simpson, M.N.

Ericson, G.R. Young, L.G. Clonts, and M.D. Allen. A multi-channel adc for use in

the phenix detector. Nuclear Science, IEEE Transactions on, 44(3):374-378, Jun

1997.

[133] V. Ferragina, P. Malcovati, F. Borghetti, A. Rossini, et al. Implementation of

a novel read-out strategy based on a wilkinson adc for a 16x16 pixel x-ray detector

array. In IEEE International Symposium on Circuits and Systems, 2005. ISCAS 2005,

volume 6, pages 5569-5572, 2005.

[134] A. Rossini, S. Caccia, G. Bertuccio, et al. A complete read-out channel with

embedded wilkinson a/d converter for x-ray spectrometry. IEEE Transactions on

Nuclear Science, volume 54, pages 1216-1222, 2007.

[135] J.L. Cura and D.M. Santos. A novel 12-bit, 3 us, integrating-type cmos analog-

to-digital converter. In Integrated Circuit Design, 1998. Proceedings. XI Brazilian

Symposium on, pages 74-76, Oct. 1998.

[136] T. Fusayasu. A fast integrating adc using precise time-to-digital conversion. In

IEEE Nuclear Science Symposium Conference Record, 2007. NSS'07., pages 302-304,

2007.

[137] M. Park and M.H. Perrott. A single-slope 80ms/s adc using two-step time-to-digital

conversion. In Circuits and Systems, 2009. ISCAS 2009. IEEE International Symposium

on, pages 1125-1128, 2009.

[138] C.-S. Hwang, W.-C. Chung, C.-Y. Wang, H.-W. Tsao, and S.-I. Liu. A 2 v clock

synchronizer using digital delay-locked loop. ASICs, 2000. AP-ASIC 2000. Proceedings

of the Second IEEE Asia Pacific Conference on, pages 91-94, 2000.

[139] C.-C. Chung, P.-L. Chen, and C.-Y. Lee. An all-digital delay-locked loop for ddr

sdram controller applications. VLSI Design, Automation and Test, 2006 International

Symposium on, pages 1-4, 26-28 2006.

[140] C.E. Cummings. Synthesis and scripting techniques for designing multi-asynchronous

clock designs. Sunburst Design, Inc.,

http://www.sunburst-design.com/papers/CummingsSNUG2001SJ_AsyncClk.pdf, Rev 1.2, 2001.

[141] W. Gao, D. Gao, Ch. Hu-Guo, and Y. Hu. Design of a 12-bit 2.5 ms/s multi-channel

single-ramp analog-to-digital converter (adc) for imaging detector systems. IEEE

Transactions on Instrumentation and Measurement, in press.

[142] CERN. Tdc. http://tdc.web.cern.ch/tdc/, 2006.

[143] H. Kim, C. M. Kao, Q. Xie, et al. A multi-threshold sampling method for tof pet

signal processing. Nuclear Instruments and Methods in Physics Research Section A:

Accelerators, Spectrometers, Detectors and Associated Equipment, 602(2):618-621,

2009.

Biography

Mr. Wu GAO was born in Hubei Province, P. R. China, in 1982. He received his

B.S. and M.S. degrees from Northwestern Polytechnical University (NPU), Xi'an, China, in

2004 and 2007, respectively, both in computer science and technology. From Sept. 2006 to

Sept. 2007, he studied at NPU. Since Oct. 2007, he has been working towards his PhD degree in

microelectronics at the Université de Strasbourg (UDS), France.

From Feb. 2004 to Sept. 2007, he worked at the Engineering Research Center of Embedded

System Integration (ERCESI), Ministry of Education, Xi'an, China, as an IC design

engineer, developing driver and controller ICs for TFT-LCDs and power supply ICs for

portable consumer electronics. Since Oct. 2007, he has been a PhD student in the

microelectronics group of the Institut Pluridisciplinaire Hubert Curien (IPHC), UDS/CNRS/IN2P3,

Strasbourg, France. His research interests focus on the design of analog

and mixed-signal integrated circuits and the development of front-end readout chips for

high-energy physics and biomedical imaging applications.

Publications

• W. Gao, D. Gao, D. Brasse, Ch. Hu-Guo, and Y. Hu. "Precise Multiphase Clock

Generation Using Low-Jitter Delay-Locked Loop Techniques for Positron Emission

Tomography Imaging". IEEE Transactions on Nuclear Science, Vol. 57, No. 3, pp.

1063-1070, June 2010.

• W. Gao, D. Gao, Ch. Hu-Guo, and Y. Hu. "Design of a 12-bit 2.5 MS/s multi-

channel single-ramp analog-to-digital converter (ADC) for imaging detector systems".

IEEE Transactions on Instrumentation and Measurement. (In press)

• W. Gao, D. Gao, Ch. Hu-Guo and Y. Hu. "Design Techniques of High-Resolution

Multi-Channel Time-to-Digital Converters (TDCs) for PET Imaging", Biomedical

Engineering, Trends, Researches and Technologies (English Book). ISBN 978-953-

7619-X-X, published by INTECH, 2010. (Invited chapter, in press)

• X. Fang, W. Gao, Ch. Hu-Guo, D. Brasse, B. Humbert and Y. Hu. "Development

of a Low-Noise Front-End Readout Chip Integrated with a High-Resolution TDC

for APD-Based Small-Animal PET", IEEE Transactions on Nuclear Science. (In

press)

• N. Ollivier-Henry, W. Gao, X. Fang, N. A. Mbow, D. Brasse, B. Humbert, C.

Hu-Guo, C. Colledani and Y. Hu. "Design and Characteristics of a Full-Custom

Multi-Channel Front-End Readout ASIC for Small Animal PET Imaging", IEEE

Transactions on Biomedical Circuits and Systems. (In press)

• X. Fang, N. Ollivier-Henry, W. Gao, Ch. Hu-Guo, C. Colledani, B. Humbert, D.

Brasse and Y. Hu. "IMOTEPAD: 64-Channel Front-End ASIC for Small Animal

PET Imaging", Nuclear Instruments and Methods in Physics Research Section A:

Accelerators, Spectrometers, Detectors and Associated Equipment. (Accepted)

Communications

• W. Gao, D. Gao, T. Wei, Ch. Hu-Guo, and Y. Hu. "A 12-bit Low-Power Multi-

Channel Ramp ADC Using Digital DLL Techniques for High-Energy Physics and

Biomedical Imaging". Proceedings of The 10th International Conference on Solid-

State and Integrated Circuit Technology (ICSICT2010), Nov., 2010 (Shanghai,

China, 15-minute Oral Talk)

• W. Gao, D. Gao, Ch. Hu-Guo, T. Wei, and Y. Hu. "A Low-Jitter Multiphase

Digital Delay-Locked Loop for Nuclear Instruments and Biomedical Imaging Ap-

plications". Proceedings of The 5th IEEE Conference on Industrial Electronics and

Applications (ICIEA2010), pp. 1715-1718, June, 2010 (Taichung, Taiwan, China,

Poster)

• W. Gao, X. Fang, Ch. Hu-Guo, D. Brasse, B. Humbert, and Y. Hu. "A Low-

Noise Front-End Readout Chip Integrated with a 89-ps TDC for APD-Based Small-

Animal PET", Proceedings of International Conference on Advancements in Nuclear

Instrumentation, Measurement Methods and their Applications (ANIMMA2009),

pp.1-6, Jun., 2009 (Marseille, France, 20-minute Oral Talk)

• W. Gao, Ch. Hu-Guo, T. Wei, D. Gao, and Y. Hu. "A 12-bit 2.5 MS/s multi-

channel ramp analog-to-digital converter (ADC) for imaging detectors". Proceedings

of the IEEE International Workshop on Imaging Systems and Techniques (IST2009), pp. 183-186,

May, 2009. (Shenzhen, China, 20-minute Oral Talk)

• W. Gao, D. Gao, T. Wei, Ch. Hu-Guo, and Y. Hu. "A High-Resolution Multi-

Channel Time-to-Digital Converter (TDC) for High-Energy Physics and Biomedical

Imaging Applications", Proceedings of The 4th IEEE Conference on Industrial

Electronics and Applications (ICIEA2009), pp.1133-1138, May, 2009 (Xi'an, China,

20-minute Oral Talk)

• W. Gao, Ch. Hu-Guo, N. Ollivier-Henry, Y. Hu, D. Gao and T. Wei. "A 71ps-

Resolution Multi-Channel CMOS Time-to-Digital Converter for Positron Emission

Tomography Imaging Applications". Proceedings of International Conference on

Imaging Theory and Applications (IMAGAPP2009), pp. 171-176, Feb., 2009 (Lisbon,

Portugal, 20-minute Oral Talk)

(5 oral talks, 1 poster)

Strasbourg, 2010
