[IEEE 2011 IEEE International Symposium on Circuits and Systems (ISCAS) - Rio de Janeiro, Brazil...



Abstract—A monolithic switched-capacitor voltage converter to drive the virtual VSS for low standby power “drowsy” modes is presented. The voltage converter has been fabricated on a 0.13-μm digital CMOS process. Test chip measurements demonstrate current drive capability from 10 μA to 100 μA with power efficiency over 86%. The test chip measurements also demonstrate that the chip level power efficiency of the converter is at least 2x better than the conventional linear regulation approach, which is commonly used in high volume commercially available designs.

I. INTRODUCTION

CMOS technology scaling has led to a tremendous growth in integrated circuit (IC) applications. Standby and dynamic power consumption have become critical factors for battery-powered portable devices. Handheld devices such as cell phones must conserve power to extend battery life; an IC standby current (ISB) of 100 μA or less is expected [1]. However, performance and transistor scaling drive designs to faster and leakier transistors that do not intrinsically meet this ISB requirement. This requires the inclusion of low standby power states, which are particularly effective for static random access memory (SRAM). Ideally, the IC operating state is not lost in the low standby power drowsy mode.

Drowsy modes apply supply collapse and may apply reverse body bias (RBB) to reduce transistor leakage currents [1-3]. One method to apply NMOS RBB is to fix Vsubstrate to VSS and raise the internal virtual VSS connected to the NMOS transistor sources. PMOS RBB is applied by raising the N-well bias above the memory supply voltage, VDD, typically to an I/O voltage. A large SRAM on a 90-nm Intel CMOS process gives a fair idea of the effect of RBB on subthreshold current [4], where a 16x reduction in ISB is achieved. This circuit design technique does not require process changes, although process changes can improve its efficacy [5].

T. Chen and V. Chaudhary are now with Intel Corp.

Additional voltages used in these drowsy schemes must be generated on-die to avoid increasing the system level (bill of materials) cost. The power supply and regulation scheme employed in the 90-nm XScale [6] caches is shown in Fig. 1. Large NMOS transistors MN provide a conducting path to the VSS pad when the circuit is in active operation. Similarly, a large PMOS transistor MP connects the N-well bias to the VDD pad. In drowsy mode, transistors MN and MP are cut off and internal supply nodes Vwell and VSS_bias are controlled by PMOS body and VSS regulator circuits, respectively. A conventional linear VSS regulator is used. The basic regulator design on a 0.18-μm process is described in detail in [4]. The regulator should contribute very little to the overall power dissipation. This linear regulator has poor power efficiency, given as 1 − (VSS_bias / VDD). For example, for a 100 μA ISB with a 1 V VDD and a 650 mV VSS_bias, 65% of the power is dissipated by the power transistor in the linear voltage converter. Thus, the maximum power efficiency, neglecting control circuits such as the amplifier and driver (on the order of 5 μW), is 35%.
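The linear regulator arithmetic can be checked with a short Python sketch. The function name is illustrative, and treating the roughly 5 μW of control circuit power as a simple addition to the supply power is an assumption made here, not a statement from the measurements.

```python
def linear_regulator_efficiency(vdd, vss_bias, i_sb, p_control=0.0):
    """Fraction of the supply power that reaches the SRAM in drowsy mode.

    The memory sees (vdd - vss_bias); the regulator pass transistor drops
    vss_bias while carrying the same standby current i_sb.
    """
    p_load = (vdd - vss_bias) * i_sb      # power used by the memory itself
    p_total = vdd * i_sb + p_control      # supply power plus overhead (assumed additive)
    return p_load / p_total


ideal = linear_regulator_efficiency(1.0, 0.650, 100e-6)
with_overhead = linear_regulator_efficiency(1.0, 0.650, 100e-6, p_control=5e-6)
print(f"ideal: {ideal:.1%}, with 5 uW control overhead: {with_overhead:.1%}")
```

Under this assumption the control overhead pushes the efficiency slightly below the 35% upper bound.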

This paper describes the design and implementation of a monolithic switched capacitor (SC) voltage converter for providing SRAM bias voltages in low standby power drowsy modes. SC voltage converters have significantly better power efficiency than their linear counterparts. In Section II, the design and analysis of SC voltage converters, as well as the specific design, are presented. Section III describes the physical implementation on a foundry 0.13-μm digital process and test chip measurement results. The paper concludes with a brief overview of the SC voltage converter performance.

II. VOLTAGE CONVERTER ANALYSIS AND DESIGN

For monolithic integration, an SC-based voltage converter is a very suitable candidate since it requires only semiconductor switches and capacitors [7,8]. In addition, on-die SC-based voltage converters are more power efficient and require less effective area than other approaches. They also scale easily with the CMOS fabrication process.

Efficient Voltage Conversion for SRAM Low Standby Power Modes

Lawrence T. Clark, Tai-Hua Chen, and Vikas Chaudhary

School of Electrical, Computer and Energy Engineering, Arizona State University, Tempe, AZ 85287, USA


Fig. 1. Drowsy mode circuit configuration using a linear regulator to provide the raised SRAM transistor source bias VSS_bias. The PMOS transistor bulks can be regulated as shown or directly connected to a higher VDDIO voltage during drowsy mode reduced standby power operation.

978-1-4244-9474-3/11/$26.00 ©2011 IEEE

A. Switched Capacitor Voltage Converters

The basic stack-of-two SC voltage converter is illustrated in Fig. 2(a). The SC voltage converter operates by switching capacitors between series and parallel configurations. In the charging configuration (φ1), the capacitors are connected in series and draw charge from the main power supply (see Fig. 2(b)). In the discharging phase φ2, as shown in Fig. 2(c), the capacitors are in parallel and deliver charge to the load. Non-overlapping clocks are employed to avoid potential short circuit current from VDDL to VSS during switching. In the charging phase, the power supply VDD supplies charge to the top plate of capacitor C1. In the meantime, displacement currents charge C2, as well as powering the output load. In the discharging phase, C1 is disconnected from VDD and connected in parallel with C2 to the output. In this configuration, the load is supplied only by the capacitors.

An important limitation of SC voltage converters is the limited output voltage range that they can provide with high efficiency. The number, not the size, of the stacked capacitors sets the natural output voltage [8]. At this natural output voltage, the voltages across capacitors C1 and C2 are nearly equal. Thus, there is minimal energy loss when switching between configurations. The efficiency decreases as the desired output voltage diverges from the natural output voltage. A high clock rate is required to keep VDDL as close as possible to the natural output voltage. However, a higher frequency clock consumes more active power in the clock buffers/drivers, which in turn lowers the overall power conversion efficiency.
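The frequency dependence described above can be illustrated with a minimal Python sketch of the stack-of-two converter in its step-down (VDDL) form. It assumes ideal switches, equal capacitors, a purely resistive load, and no clock driver power; the 100 pF and 10 kΩ values are hypothetical and are not taken from the test chip.

```python
import math


def simulate_stack_of_two(vdd, cap, r_load, f_clk, cycles=50):
    """Idealized stack-of-two SC converter with C1 = C2 = cap.

    Returns (time-averaged output voltage, power efficiency) for the last
    simulated clock cycle. Switches are ideal and clock power is ignored.
    """
    half = 0.5 / f_clk                  # duration of each phase (50% duty cycle)
    tau = r_load * 2.0 * cap            # the load sees C1 + C2 in both phases
    decay = math.exp(-half / tau)
    v1 = v2 = 0.0                       # capacitor voltages; v2 is the output node
    for _ in range(cycles):
        # Phase 1: C1 and C2 in series across VDD; charge q flows through both.
        q = (vdd - v1 - v2) * cap / 2.0
        v2 += q / cap
        start1 = v2
        v2 *= decay                     # the load discharges the output node
        q_in = q + cap * (start1 - v2)  # total charge drawn from VDD this cycle
        v1 = vdd - v2
        # Phase 2: C1 and C2 in parallel on the output; VDD is disconnected.
        start2 = (v1 + v2) / 2.0        # charge sharing between equal capacitors
        v1 = v2 = start2 * decay
        # Load energy: integral of v^2 / R over each exponential decay.
        e_out = cap * (1.0 - decay**2) * (start1**2 + start2**2)
        v_avg = (start1 + start2) * tau * (1.0 - decay) / (2.0 * half)
    return v_avg, e_out / (vdd * q_in)


for f_clk in (400e3, 4e6, 40e6):
    v_out, eff = simulate_stack_of_two(vdd=1.0, cap=100e-12, r_load=10e3, f_clk=f_clk)
    print(f"{f_clk / 1e6:5.1f} MHz: VDDL = {v_out * 1e3:.0f} mV, efficiency = {eff:.1%}")
```

In this idealized model both the output voltage and the efficiency approach their natural values as the clock frequency rises. A real converter also pays for clock buffer and driver power, which this sketch leaves out, so the real efficiency does not keep improving with frequency.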

B. Design Considerations

The input and output power analysis is straightforward. In the charging phase, the SC voltage converter draws current IDD from input VDD, as shown in Fig. 2(b). The displacement current Igen splits into two branches: one branch to charge C2 and the other, Iload, to the load. Consequently, ISS collects Icap and Iload and flows to VSS. Iload dominates Igen since the path through RL is a better conducting path to VSS than the path through C2. Eventually Iload approximately equals IDD. In the discharging phase (see Fig. 2(c)), the parallel capacitors deliver current to the load. Since the capacitors are discharging, the current direction is opposite to that of the load current. At the node VDDL the sum of Icap1, Icap2, and Iload is zero. Similarly, the sum of currents is almost zero at each of the nodes VSS1, VSS2, and VSS3, and no current flows from VSS2 to VSS.

The effective current from VDD or to the VSS is dependent on the duty cycle of the driving clock as

$$ I_{DD\_eff} = \frac{Q_{\phi 1} + Q_{\phi 2}}{T} = \frac{Q_{\phi 1}}{T}, \tag{1} $$

where T is the switching clock period, Qφ1 is the charge IDD·t1 drawn from VDD during the φ1 interval t1, and Qφ2 is 0 since the SC voltage converter is disconnected from VDD in φ2. Assuming a 50% duty cycle clock, the input power of the SC voltage converter is

$$ P_{IN} = V_{DD} \cdot I_{DD\_eff} = V_{DD} \cdot \frac{I_{DD}}{2} \tag{2} $$

as charge is injected from VDD only in φ1. At the natural operation point, the output voltage VDDL is half of VDD. In φ1, Iload ≈ IDD and Iload is also a function of VDDL and RL. In φ2, Iload is still a function of VDDL and RL. Ideally, Iload again approximately equals IDD. Consequently, the output power of the SC voltage converter is

$$ P_{OUT} = V_{DDL} \cdot I_{load} = \frac{V_{DD}}{2} \cdot I_{DD}. \tag{3} $$

Comparing (2) and (3) shows that under ideal-load conditions POUT equals PIN, which gives 100% power efficiency. In practical applications VDDL cannot reach its natural value due to the energy required by switching, so the power efficiency is less than 100%.
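The ideal power balance of (1) through (3) can be evaluated numerically. The following Python sketch takes Qφ1 as the charge IDD·t1 with t1 = T/2 for the 50% duty cycle; the function name and the 100 μA, 4 MHz operating point are chosen here for illustration only.

```python
def ideal_power_balance(vdd, i_dd, period):
    """Evaluate Eqs. (1)-(3) for an ideal stack-of-two SC converter."""
    t1 = period / 2.0                       # phase-1 duration, 50% duty cycle
    q_phi1 = i_dd * t1                      # charge drawn from VDD in phase 1
    q_phi2 = 0.0                            # converter is disconnected from VDD in phase 2
    i_dd_eff = (q_phi1 + q_phi2) / period   # Eq. (1)
    p_in = vdd * i_dd_eff                   # Eq. (2)
    v_ddl = vdd / 2.0                       # natural output voltage
    i_load = i_dd                           # ideal-load condition, Iload = IDD
    p_out = v_ddl * i_load                  # Eq. (3)
    return p_in, p_out, p_out / p_in


p_in, p_out, efficiency = ideal_power_balance(vdd=1.0, i_dd=100e-6, period=1.0 / 4e6)
print(f"P_IN = {p_in * 1e6:.1f} uW, P_OUT = {p_out * 1e6:.1f} uW, efficiency = {efficiency:.0%}")
```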

Fig. 2. SC voltage converter (a) circuit topology, simplified (b) charging phase configuration, and (c) discharging phase configuration.

Fig. 3. Proposed SC voltage converter for drowsy cache VSS generation.


C. Application to SRAM Drowsy Modes

Fig. 3 shows the proposed SC voltage converter for raising VSSH in drowsy memory applications. Raising VSSH applies RBB and collapses the transistor VDS and VGS, reducing both Ioff and Igate. Here, the memory virtual VSSH is raised (rather than the virtual VDD lowered as in Fig. 2). In the proposed circuit (see Fig. 3), during the discharging phase, the capacitors are absorbing charge from the load instead of providing it. The fundamental operation is the same as the example illustrated previously. In the charging phase, C1 and C2 are connected in series and are switched to a parallel formation across VDD and VSSH during discharging.

III. IMPLEMENTATION AND MEASUREMENT RESULTS

A. Circuit Design

In [1], the virtual VDD and VSSH for drowsy mode operation are 1 V and 650 mV, respectively. The output voltage of the SC voltage converter is set by the number of series capacitors. To set the natural output voltage precisely to 650 mV, a stack height of 20 is required, which requires more die area and increases the control circuit complexity. A three-capacitor stack (each capacitor dropping 1/3 VDD) can only raise the virtual VSSH to 666.67 mV or higher, which is too high to ensure data retention. Hence, the proposed SC voltage converter employs the stack-of-two configuration exactly as shown in Fig. 3. The target value is 150 mV away from its natural output voltage (500 mV), so a lower frequency clock is required. The natural output voltage is compromised to gain headroom for data retention, lower power consumption in the control circuits, and simpler circuit timing. The resulting efficiency is still a significant improvement over the linear voltage converter.
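The stack height figures follow from simple rational arithmetic, sketched below in Python. The sketch assumes equal capacitors, so that an N-capacitor stack offers natural tap voltages at k/N of VDD; the function names are illustrative.

```python
from fractions import Fraction


def minimum_stack_height(v_target_mv, vdd_mv):
    """Smallest number of equal series capacitors with a tap exactly at the target."""
    return Fraction(v_target_mv, vdd_mv).denominator


def natural_levels_mv(stack_height, vdd_mv):
    """Tap voltages k/N * VDD available from an N-capacitor stack."""
    return [vdd_mv * k / stack_height for k in range(1, stack_height)]


print(minimum_stack_height(650, 1000))   # 650 mV / 1 V reduces to 13/20
print(natural_levels_mv(2, 1000))        # stack of two
print(natural_levels_mv(3, 1000))        # stack of three
```

A target of 650 mV from a 1 V supply reduces to 13/20, hence the stack height of 20, while the nearest taps of the two-capacitor and three-capacitor stacks sit at 500 mV and about 666.67 mV.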

SC voltage converter simulation results with fully extracted parasitics, delivering a 100 μA output current, are shown in Fig. 4. The clock frequency is 4 MHz. The simulation drives the voltage from VDD to the target output voltage of 650 mV in approximately 500 μs. The target output voltage is chosen based on the data retention voltage taken from the Intel XScale microprocessor caches on a 90-nm process [6] at worst-case drowsy mode operation temperature (50°C). The output voltage ripple is 100 μV, less than 0.2% of the output voltage. When the SC voltage converter is disabled, the configuration is fixed to either charging or discharging mode. In either case there is no dc current from VDD to VSS and thus no static power consumption when the SC voltage converter is inactive.

B. Measured Results

The proposed SC voltage converter has been fabricated on a 0.13-μm bulk CMOS process technology, as shown in Fig. 5. The on-die capacitors are implemented using a multi-layer interdigitated structure [9] using the top five layers of metal. This type of on-die capacitor utilizes both plate and fringe capacitances to improve the capacitance density. In this process, each 100 μm by 100 μm, M4 to M8 interdigitated metal capacitor has a capacitance of 9.31 pF, as confirmed by measurements using an LCR meter. Except for the upper metals, the converter integration area penalty is minimal, as the clock and control circuits occupy less than 10% of the area under the capacitors. Other circuits and interconnections can co-exist on M1-M3, or the capacitors can use fewer metal layers to allow more circuit layers beneath them. Consequently, the capacitors can occupy areas above the SRAMs for which they provide the drowsy voltages.
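The measured capacitor value implies a capacitance density that can be used for rough area planning. In the Python sketch below, the 100 pF total is a hypothetical example, since the total converter capacitance is not stated here.

```python
def capacitance_density_ff_per_um2(cap_pf, width_um, height_um):
    """Capacitance per unit area in fF/um^2 for a rectangular capacitor."""
    return cap_pf * 1e3 / (width_um * height_um)


def capacitor_area_um2(cap_needed_pf, density_ff_per_um2):
    """Area needed for a given capacitance at the given density."""
    return cap_needed_pf * 1e3 / density_ff_per_um2


density = capacitance_density_ff_per_um2(9.31, 100.0, 100.0)
print(f"density: {density:.3f} fF/um^2")
# Hypothetical total capacitance, for illustration only.
print(f"area for 100 pF: {capacitor_area_um2(100.0, density):.0f} um^2")
```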

The fabricated SC voltage converter was tested at room temperature with a VDD of 1 V. A resistive load mimics the memory circuit in Drowsy mode, with the values varied to account for different load characteristics. Fig. 6(a) shows the measured output voltage vs. clock frequency. Higher clock frequencies drive the output closer to the natural value, as expected. The dashed line represents the target output voltage of 650 mV. Higher clock frequencies also improve the current drive capability, as evident in Fig. 6(b).

Target values for SoC on-die SRAM standby current vary with die size and technology, from 50 μA to 100 μA. A 4 MHz clock satisfies the latter requirement with a 101.98 μA output current and a 650.5 mV output voltage, while a 2 MHz clock satisfies the former requirement with a 49.75 μA output current and a 626.5 mV output voltage. Finally, Fig. 6(c) shows the measured power efficiency including all overhead, i.e., switch and clock driver energy.
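The two reported operating points can be cross-checked against the load resistances listed in Fig. 6. The Python sketch below assumes the test resistor is connected between VDD and the VSSH output, in place of the memory. That connection is an inference from the drowsy configuration, not something stated explicitly, but it reproduces the reported currents closely.

```python
def load_current_ua(vdd, v_ssh, r_load_ohm):
    """Current in microamperes through a load connected between VDD and VSSH."""
    return (vdd - v_ssh) / r_load_ohm * 1e6


i_4mhz = load_current_ua(1.0, 0.6505, 3427.0)   # 4 MHz point, RL = 3.427 kOhm
i_2mhz = load_current_ua(1.0, 0.6265, 7510.0)   # 2 MHz point, RL = 7.510 kOhm
print(f"4 MHz: {i_4mhz:.2f} uA, 2 MHz: {i_2mhz:.2f} uA")
```

Both values land within a few hundredths of a microampere of the reported 101.98 μA and 49.75 μA.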

Fig. 4. Simulated drowsy mode SC voltage converter output. The inset shows the clock and output ripple waveforms.

Fig. 5. Test die photomicrograph before bonding. Other probe structure pads and the pad ring are also evident. The SC voltage converter layout is superimposed. The converter area is dominated by the interdigitated metal capacitors. The switch circuitry is superimposed on that area.


C. Analysis

The 0.13-μm fabricated SC voltage converter is more than 70% efficient throughout the target load current range, a 2x improvement over a linear voltage converter (Fig. 6). The load current range was chosen to represent the ±3σ corner SRAM transistor variation. The performance of the SC voltage converter is highly dependent on the clock frequency. There are several approaches to regulate the SC voltage converter but all dissipate significant amounts of power. Alternatively, the converter clock can be divided down from the system reference clock and can employ a simple multiplexer to switch from one frequency to another. The control circuit and scheme for this approach are straightforward. In [1,6] the output voltage is pre-programmed by an MOS resistive divider. Testing with guard band is used to ensure that the target VSSH voltage is sufficient to ensure memory state retention under worst-case conditions.

Each die will have a different data retention voltage and loading characteristic due to systematic and random process variations. Thus the SC converter operating frequency must be set so that each die retains state, preferably with a properly guard banded test [10]. Unlike the linear regulator approach, no feedback regulation is needed. Fig. 7 shows the trajectory of the VSSH node when the power gating transistor (MN in Fig. 1) is gated off. The SC converter is operating when the gate of transistor MN is de-asserted. The low pass characteristic provided by the decoupling capacitance on the VSSH node ensures no overshoot on VSSH.
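One way to picture the test-time frequency setting is the Python sketch below. It is a hypothetical procedure, not the one used on the test chip: `measure_vssh` stands in for a per-die tester routine, the divider set and voltage limits are made up, and it relies on VSSH falling as the clock frequency rises, as in Fig. 6(a).

```python
def select_clock_divider(f_ref_hz, dividers, measure_vssh, vssh_max, guard_band):
    """Pick the largest divider (lowest clock) that keeps VSSH within the limit.

    measure_vssh(f_clk_hz) is a hypothetical per-die test routine returning the
    settled VSSH in volts at the given converter clock frequency.
    """
    limit = vssh_max - guard_band
    for divider in sorted(dividers, reverse=True):   # try the slowest clocks first
        if measure_vssh(f_ref_hz / divider) <= limit:
            return divider
    return None   # no setting retains state with the requested margin


def toy_die(f_clk_hz):
    """Made-up die model: VSSH moves toward 500 mV as the clock speeds up."""
    return 0.5 + 0.15 * (4e6 / f_clk_hz)


chosen = select_clock_divider(16e6, (2, 4, 8, 16), toy_die, vssh_max=0.68, guard_band=0.02)
print(f"selected divider: {chosen}")
```

Preferring the lowest passing frequency keeps the clock driver power small while still meeting the guard banded retention limit.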

IV. CONCLUSIONS

A fully integrated SC voltage converter for SRAM RBB and supply collapse drowsy data retention modes has been described. Test chip measurements of the voltage converter fabricated on a foundry 0.13-μm process demonstrate functionality. The test converter delivers the required output voltage of 650 mV with current drive ranging from 9.7 μA to 107 μA. Power efficiency is increased by over 2x compared to previous monolithic linear dc-dc converters. At the IC level, this translates into a 50% reduction in ISB, e.g., from 100 μA to 50 μA. A similar circuit configuration will work for memories that reduce the VDD. The clock frequency provides a simple low energy control mechanism, which can be set at test time to generate the necessary voltage level to guarantee state retention for each SRAM or IC.

REFERENCES

[1] L. Clark et al., “Standby power management for a 0.18 μm microprocessor,” Proc. ISLPED, 2002, pp. 7–12.

[2] N. Kim et al., “Drowsy instruction caches: leakage power reduction using dynamic voltage scaling and cache sub-bank prediction,” in Proc. IEEE/ACM Int’l Symp. Microarchitecture, Istanbul, Turkey, Nov. 18–22, 2002, pp. 219–230.

[3] H. Mizuno and T. Nagano, “Driving source-line cell architecture for sub-1-V high-speed low-power applications,” IEEE J. Solid-State Circuits, vol. 31, no. 4, pp. 552–557, 1996.

[4] R. Islam, A. Brand, and D. Lippincott, “Low power SRAM techniques for handheld products,” Proc. ISLPED, Aug. 2005, pp. 198–202.

[5] S. Zhao et al., “Transistor optimization for leakage power management in a 65 nm CMOS technology for wireless and mobile applications,” Proc. IEEE Symp. VLSI Technology, June 2004, pp. 14–15.

[6] F. Ricci et al., “A 1.5 GHz 90 nm embedded microprocessor core,” IEEE Symp. VLSI Circuits Tech. Dig., June 2005, pp. 12–15.

[7] T. Umeno et al., “A new approach to low ripple-noise switching converters on the basis of switched-capacitor converters,” Proc. ISCAS, June 1991, pp. 1077–1080.

[8] K. Ngo and R. Webster, “Steady-state analysis and design of a switched-capacitor dc-dc converter,” IEEE Trans. Aero. Elect. Syst., vol. 30, no. 1, pp. 92–101, Jan. 1994.

[9] A. Imamura et al., “Bending-comb capacitor with a small parasitic inductance,” IEEE Symp. VLSI Circuits, June 2002, pp. 22–25.

[10] L. Clark, D. McCarroll, and E. Bawolek, “Characterization and debug of reverse-body bias low power modes,” Electronic Device Failure Analysis, vol. 6, no. 1, Feb. 2004, pp. 13–21.

Fig. 6. Measurement results of the SC voltage converter: (a) output voltage (V) vs. clock frequency (Hz), (b) output current (μA) vs. clock frequency (Hz), and (c) power efficiency (%) vs. clock frequency (MHz) for the different loads (RL = 35.05 kΩ, 20.09 kΩ, 9.970 kΩ, 7.510 kΩ, and 3.427 kΩ), which represent the current demands of the regulated SRAM across process corners.

Fig. 7. Simulated VSSH node voltage when entering the drowsy mode at VDD = 1.0 V and 50°C. No overshoot occurs even without feedback regulation on the SC voltage converter providing a nominal 666 mV output.
