© 2014 IEEE

Proceedings of the IEEE International Solid-State Circuits Conference (ISSCC 2014), San Francisco, USA, February 9-13, 2014

# A Sub-ns Response On-Chip Switched-Capacitor DC-DC Voltage Regulator Delivering 3.7W/mm<sup>2</sup> at 90% Efficiency Using Deep-Trench Capacitors in 32nm SOI CMOS

T. Andersen, F. Krismer, J. W. Kolar, T. Toifl, C. Menolfi, L. Kull, T. Morf, M. Kossel, M. Brändli, P. Buchmann, P. A. Francese




### 4.7 A Sub-ns Response On-Chip Switched-Capacitor DC-DC Voltage Regulator Delivering 3.7W/mm<sup>2</sup> at 90% Efficiency Using Deep-Trench Capacitors in 32nm SOI CMOS

Toke Meyer Andersen<sup>1,2</sup>, Florian Krismer<sup>1</sup>, Johann Walter Kolar<sup>1</sup>, Thomas Toifl<sup>2</sup>, Christian Menolfi<sup>2</sup>, Lukas Kull<sup>2</sup>, Thomas Morf<sup>2</sup>, Marcel Kossel<sup>2</sup>, Matthias Brändli<sup>2</sup>, Peter Buchmann<sup>2</sup>, Pier Andrea Francese<sup>2</sup>

#### <sup>1</sup>ETH, Zurich, Switzerland, <sup>2</sup>IBM Research, Rüschlikon, Switzerland

For an on-chip or fully integrated microprocessor power-delivery system, the on-chip power converter must 1) be designed using the same technology as the microprocessor, 2) deliver high power density to supply a microprocessor core with small area overhead, 3) achieve high efficiency, and 4) perform fast regulation over a wide voltage range for dynamic voltage and frequency scaling (DVFS). On-chip switched-capacitor (SC) converters have gained increasing popularity for this application due to their ease of integration using only transistors and capacitors readily available in the chosen technologies [1-6].

Historically, on-chip SC converters have been perceived as low-power converters with output powers below 150mW [1-5]. However, because output power scales with chip area, SC converters are not inherently limited to low power; the 1.65W maximum output power in [6] and the 840mW maximum output power presented in this paper exemplify the feasibility of high-power on-chip SC converters. SC designs [1-4] utilize reconfigurable power stages for increased output and/or input voltage ranges as well as interleaving techniques to minimize the output voltage ripple, e.g., in [1], where  $3.8mV_{pp}$  output ripple is reported for a 41-phase interleaved SC converter. Designs in bulk CMOS are limited in efficiency to 81% in [1] and in power density to 0.19W/mm<sup>2</sup> in [2]. Regarding on-chip capacitor technologies, MIM capacitors are used in a 22nm tri-gate technology in [3], and 93% efficiency is reported using ferroelectric capacitors in [4], but both designs achieve low power densities (<0.1W/mm<sup>2</sup>). Deep-trench capacitors have shown superior efficiency and power density, e.g., 4.6W/mm<sup>2</sup> at 86% efficiency for a single-phase unregulated on-chip SC converter [5]. Furthermore, multi-GHz sampling frequencies are used in hysteretic control loops to achieve fast response times to transient events, e.g., 3-to-5ns response time in [3] and <1ns response time in [2]. In this paper, we utilize the deep-trench capacitors and thin-oxide transistors available in 32nm SOI CMOS to design a high-power (840mW) and fast-response (<1ns) 16-phase interleaved reconfigurable on-chip SC converter that achieves 86.4% maximum efficiency at 2.2W/mm<sup>2</sup> in the 2:1 configuration and 90.0% maximum efficiency at 3.7W/mm<sup>2</sup> in the 3:2 configuration.

The overall system diagram of the implemented SC converter is depicted in Fig. 4.7.1. Two capacitors can be configured to provide either a 2:1 or a 3:2 ideal voltage conversion ratio by toggling between a charging and a discharging state at 50% duty cycle. A 16-phase interleaving technique is employed to reduce the input current and output voltage ripples, thereby eliminating the need for a dedicated output decoupling capacitor. The on-chip load consists of a programmable resistor array, which can be externally programmed through the digital configuration interface. The gear signal, which sets the power stage in the 2:1 or 3:2 configuration, is also externally controlled. The clocked comparator compares  $V_{out}$  with  $V_{hys}$  and produces a clock signal clk<sub>trig</sub> for the digital clock interleaver, which generates the clock signals clk<sub>0-15</sub> for each SC converter unit. The 250ps sampling period (4GHz) of the comparator ensures <1ns response time to transient events.
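As a rough illustration of the two gears, the following sketch computes the ideal (no-load) output voltage of each conversion ratio and the standard upper bound on SC converter efficiency, $V_{out}/(M \cdot V_{in})$, where $M$ is the ideal step-down ratio. It assumes the 1.8V nominal input of this design; the function names and the `pick_gear` helper are hypothetical, since the gear signal in the implemented converter is set externally.

```python
# Ideal step-down factors of the two gears (output = factor * input at no load).
GEARS = {"2:1": 1 / 2, "3:2": 2 / 3}


def ideal_vout(vin, gear):
    """No-load output voltage of the given gear."""
    return vin * GEARS[gear]


def efficiency_bound(vin, vout, gear):
    """Upper bound on efficiency when regulating below the ideal output."""
    v_ideal = ideal_vout(vin, gear)
    if vout > v_ideal:
        raise ValueError("target lies above the ideal output of this gear")
    return vout / v_ideal


def pick_gear(vin, vout):
    """Hypothetical helper: lowest-ratio gear whose ideal output covers vout."""
    feasible = [g for g in GEARS if ideal_vout(vin, g) >= vout]
    if not feasible:
        raise ValueError("no gear can reach the requested output voltage")
    return min(feasible, key=lambda g: ideal_vout(vin, g))


if __name__ == "__main__":
    vin = 1.8  # nominal input voltage in volts
    for vout in (0.7, 0.84, 1.1):
        gear = pick_gear(vin, vout)
        bound = efficiency_bound(vin, vout, gear)
        print(f"Vout={vout:.2f}V -> gear {gear}, efficiency bound {bound:.1%}")
```

The bound only accounts for the intrinsic charge-transfer loss; measured efficiencies are lower because of switching, gate-drive, and parasitic losses.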

The power stage implementation shown in Fig. 4.7.2 consists of 2 capacitors and 12 switches, and it can be reconfigured between the 2:1 and 3:2 ideal voltage conversion ratios [2]. A gate driver generates the gate signals  $v_{g1-9(s)}$  for the charging and discharging states for each transistor as shown in the table in Fig. 4.7.2. In recent works, the generation of  $v_{g1-9}$  depends on several internal nodes and/or additional external voltage supplies [1,2]. This design instead proposes a simplified gate driver implementation that depends only on  $V_{in}$ ,  $V_{out}$ , and gnd. The level-shifted non-overlapping gate signals  $v_{g,(n/p)(H/L)}$  are generated as in [5], and multiplexers controlled by the gear configuration signal are used to change the clock feeds for  $v_{g(4,5,7,7s)}$ . All transistors  $M_{1-9(s)}$  are thin-oxide devices ( $V_{max} \approx 1.2V$ ) for low on-state resistance and fast transition times. Since transistor  $M_5$  should always be off in the 2:1 configuration,  $M_{5s}$  is implemented to protect the gate-source of  $M_5$  in the charging state against overvoltage (gnd–Vin). With a nominal input voltage of  $V_{in}=1.8V$  and an output voltage range of 0.7 to 1.1V, node  $V_x$  approximately equals  $V_{out}/2$  in the discharging state of the 3:2 configuration, thereby exposing the drain-source of  $M_6$  to overvoltage  $(V_{in}-V_{out,min}/2=1.45V)$ . The stacking transistor  $M_{6s}$  effectively protects  $M_6$  against this overvoltage situation. Because the transistors are symmetrical,  $M_7$  would undesirably turn on in the discharging state of the 3:2 configuration owing to a positive gate-drain voltage  $(V_{out}/2)$ ;  $M_{7s}$  ensures that  $M_7$  remains turned off.
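The overvoltage figure quoted for $M_6$ can be checked with a few lines of arithmetic. The sketch below assumes, as the text does, that node $V_x$ sits at roughly $V_{out}/2$ in the discharging state of the 3:2 configuration, and takes 1.2V as the approximate thin-oxide rating; the function and constant names are illustrative only.

```python
V_MAX = 1.2  # approximate thin-oxide voltage rating from the text, in volts


def m6_stress_without_stacking(vin, vout):
    """Drain-source voltage M6 would see if V_x is about Vout/2 and M6s were absent."""
    return vin - vout / 2


def exceeds_rating(voltage, v_max=V_MAX):
    """True when the magnitude of the stress is above the device rating."""
    return abs(voltage) > v_max


if __name__ == "__main__":
    vin = 1.8  # nominal input voltage in volts
    for vout in (0.7, 0.9, 1.1):
        stress = m6_stress_without_stacking(vin, vout)
        flag = "exceeds" if exceeds_rating(stress) else "within"
        print(f"Vout={vout:.1f}V: Vds(M6) ~ {stress:.2f}V ({flag} {V_MAX}V rating)")
```

Under these assumptions the stress stays above the rating across the whole 0.7 to 1.1V output range, which is consistent with the need for the stacked device $M_{6s}$.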

The 16-phase digital clock interleaver, of which a 4-phase example implementation is shown in Fig. 4.7.3, produces 16 time-interleaved clock phases. Every second output of the shift register is inverted to average out dissimilar currents delivered in the charging and discharging states of the 3:2 configuration, thereby keeping the output voltage ripple  $V_{ripple}$  low. The comparator clock frequency  $f_{trig}$  is determined by the number of interleaved stages *N* and a specified maximum switching frequency  $f_{sw,max}$  of each converter unit following the equation in Fig. 4.7.3. Furthermore,  $f_{trig}$  is upper limited by the total loop latency  $t_{lat}$  (combined latency of the comparator, the digital controller, and the gate driver) to ensure that the SC converter has time to react to a trigger signal before the next comparison event. With 16 phases in this design,  $f_{trig}$ =4GHz results in  $f_{sw,max}$ =125MHz; furthermore, the loop latency is minimized to  $t_{lat}$ =200ps.
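The timing numbers in this paragraph can be reproduced with a small behavioural sketch. The relation $f_{sw,max} = f_{trig}/(2N)$ used below is inferred from the quoted values (4GHz, 16 phases, 125MHz) rather than copied from Fig. 4.7.3, and the round-robin toggle model is an assumption about the interleaver's behaviour, not a description of the actual shift-register circuit.

```python
def max_switching_frequency(f_trig, n_phases):
    """Per-unit switching frequency if every comparator trigger toggles one phase."""
    return f_trig / (2 * n_phases)


def latency_ok(f_trig, t_lat):
    """The loop must settle within one comparator period."""
    return t_lat < 1 / f_trig


def interleave(n_phases, n_triggers):
    """Round-robin model: trigger k toggles phase k mod N.

    Every second phase starts inverted, mirroring the alternate-output
    inversion described in the text.
    """
    state = [k % 2 for k in range(n_phases)]
    history = [tuple(state)]
    for k in range(n_triggers):
        state[k % n_phases] ^= 1
        history.append(tuple(state))
    return history


if __name__ == "__main__":
    f_trig, n, t_lat = 4e9, 16, 200e-12
    print(f"f_sw,max = {max_switching_frequency(f_trig, n) / 1e6:.0f} MHz")
    print(f"latency fits in one comparator period: {latency_ok(f_trig, t_lat)}")
    hist = interleave(n, 2 * n)
    print(f"state repeats after 2N triggers: {hist[0] == hist[-1]}")
```

In this model each phase completes one full charge/discharge cycle after 2N triggers, which is where the factor of two in the frequency relation comes from.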

The on-chip programmable load, which is implemented as an array of 31 switchable resistors (giving 32 different load values, including no load), can provide a load step between any two load levels within 50ps. Such fast load steps are used to evaluate the <1ns response under worst-case conditions. Fig. 4.7.4 shows the measured transient responses when stepping between 0.1x and 1x nominal load, corresponding to 30mA and 365mA output current. As observed, the output voltage is maintained several nanoseconds after the transient event, verifying the <1ns response of the regulation loop. The output voltage droop is caused by the large input voltage droop, which is due to 1) the slower response of the external input power supply and 2) the parasitic capacitances and inductances of the power distribution network connecting to the chip.

The measured efficiencies and power densities are shown in Fig. 4.7.5 for three different load levels, where resistances and voltages are measured using Kelvin contacts. For  $V_{\rm in}$ =1.8V, the efficiency is >70% over the specified output voltage range of 0.7 to 1.1V with maximum efficiencies of 86.4% and 90.0% in the 2:1 and 3:2 configurations, respectively. For high loads, the maximum power density, and thereby the maximum output voltage  $V_{\rm out,max}$ , is limited by the 4GHz comparator clock frequency, whereas for  $V_{\rm out}$ = $V_{\rm out,max}$ , the maximum power density is limited by the maximum on-chip load.

Figure 4.7.6 shows how this design compares to prior art, revealing >9× improvement in power density while providing <1ns response time and 90% efficiency. An output ripple of  $30mV_{pp}$  is achieved without using a dedicated decoupling capacitor, and the 840mW maximum output power verifies the feasibility of high power on-chip SC converters.

The on-chip SC converter micrograph is shown in Fig. 4.7.7. The total converter area including gate drivers and the digital controller is 0.15mm<sup>2</sup>.
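To relate the reported power densities to absolute power, the sketch below multiplies each density at maximum efficiency by the 0.15mm<sup>2</sup> converter area and estimates the corresponding loss from the efficiency. The derived output-power and loss values are back-of-the-envelope estimates from the rounded figures in this paper, not measured results, and the helper names are illustrative.

```python
AREA_MM2 = 0.15  # total converter area reported in the text

# (power density in W/mm^2, efficiency) at the maximum-efficiency points
OPERATING_POINTS = {
    "2:1": (2.17, 0.864),
    "3:2": (3.71, 0.900),
}


def output_power(density_w_per_mm2, area_mm2=AREA_MM2):
    """Output power implied by a power density and the converter area."""
    return density_w_per_mm2 * area_mm2


def power_loss(p_out, efficiency):
    """Loss implied by efficiency = P_out / (P_out + P_loss)."""
    return p_out * (1 / efficiency - 1)


if __name__ == "__main__":
    for gear, (density, eta) in OPERATING_POINTS.items():
        p_out = output_power(density)
        p_loss = power_loss(p_out, eta)
        print(f"{gear}: P_out ~ {p_out * 1e3:.0f} mW, P_loss ~ {p_loss * 1e3:.0f} mW")
```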

#### References

[1] G.V. Piqué, "A 41-Phase Switched-Capacitor Power Converter with 3.8mV Output Ripple and 81% Efficiency in Baseline 90nm CMOS," *IEEE ISSCC Dig. Tech. Papers*, pp. 98-100, Feb. 2012.

[2] H.-P. Le, J. Crossley, S.R. Sanders, and E. Alon, "A Sub-ns Response Fully Integrated Battery-Connected Switched-Capacitor Voltage Regulator Delivering 0.19W/mm<sup>2</sup> at 73% Efficiency," *IEEE ISSCC Dig. Tech. Papers*, pp. 372-373, Feb. 2013.

[3] R. Jain, B. Geuskens, M. Khellah, et al., "A 0.45-1V Fully Integrated Reconfigurable Switched Capacitor Step-Down DC-DC Converter with High Density MIM Capacitor in 22nm Tri-Gate CMOS," *IEEE Symp. VLSI Circuits*, pp. 174-175, Jun. 2013.

[4] D. El-Damak, S. Bandyopadhyay, and A.P. Chandrakasan, "A 93% Efficiency Reconfigurable Switched-Capacitor DC-DC Converter Using On-Chip Ferroelectric Capacitors," *IEEE ISSCC Dig. Tech. Papers*, pp. 374-375, Feb. 2013.

[5] T. M. Andersen, F. Krismer, J.W. Kolar, et al., "A 4.6W/mm<sup>2</sup> Power Density 86% Efficiency On-Chip Switched Capacitor DC-DC Converter in 32 nm SOI CMOS," *IEEE Applied Power Electronics Conf. and Exposition (APEC)*, pp. 692-699, Mar. 2013.

[6] H. Meyvaert, T. Van Breussegem, and M. Steyaert, "A 1.65W Fully Integrated 90nm Bulk CMOS Intrinsic Charge Recycling Capacitive DC-DC Converter: Design & Techniques for High Power Density," *IEEE Energy Conversion Congress and Exposition (ECCE)*, pp. 3234-3241, Sep. 2011.











Figure 4.7.5: Measured efficiencies and power densities for  $V_{in}$ =1.8V over the full output voltage range.









Figure 4.7.4: Transient responses for  $V_{in}$ =1.8V and  $V_{out}$ =840mV.

| Design                                | Piqué [1]<br>ISSCC 2012 | Le [2]<br>ISSCC 2013 | El-Damak [4]<br>ISSCC 2013        | Jain [3]<br>VLSI 2013      | Meyvaert [6]<br>ECCE 2011 | This Work                              |
|---------------------------------------|-------------------------|----------------------|-----------------------------------|----------------------------|---------------------------|----------------------------------------|
| Technology                            | 90nm bulk               | 65nm bulk            | 32nm SOI                          | 22nm tri-gate              | 90nm bulk                 | 32nm SOI                               |
| Conversion ratios (M)                 | 2:1, 3:2                | 5:2, 3:1             | 3:1, 2:1,<br>3:2, 1:1             | 2:1, 3:2,<br>5:4, 1:1      | 2:1                       | 2:1, 3:2                               |
| Capacitor<br>type                     | MOS +<br>fringe metal   | MOS                  | Ferroelectric                     | MIM                        | MOS                       | Deep Trench                            |
| Interleaving                          | 41                      | 18                   | 4                                 | 8                          | 21                        | 16                                     |
| Cfly / Cout                           | 14pF / 85pF             | 3.88nF / 0           | 1nF / 10nF                        | - / 100pF                  | 0.57nF / 0                | 1nF / 0                                |
| Vin                                   | 1.1 to 2V               | 3 to 4V              | 1.5V                              | 1.23V                      | 2.35 to 2.6V              | 1.8V                                   |
| Vout                                  | 0.7V                    | 1V                   | 0.4 to 1.1V                       | 0.45 to 1V                 | 1.03                      | 0.7 to 1.1V                            |
| P <sub>out,max</sub>                  | 9.5mW                   | 121mW                | 1.1mW                             | 88mW                       | 1.65W                     | 840mW                                  |
| t<sub>response</sub>                  | -                       | <1ns                 | <1ms                              | 3 to 5ns                   | ≈15µs                     | <1ns                                   |
| V <sub>droop</sub>                    | -                       | 76mV                 | -                                 | <25mV                      | 95mV                      | 94mV (due to<br>V <sub>in</sub> droop) |
| V <sub>ripple,pp</sub><br>@ nom. load | 3.8mV                   | -                    | -                                 | 43mV                       | -                         | 30mV                                   |
| η <sub>max</sub> @ M                  | 75%, 81%                | 71.5%, 73%           | 90%, 91%,<br>93%, 80%             | 82%, 71%,<br>73%, 68%      | 69%                       | 86.4%, 90.0%                           |
| ρ (W/mm²)<br>@ η <sub>max</sub>       | 0.038, 0.025            | 0.19, 0.19           | 0.0006, 0.0010,<br>0.0013, 0.0016 | 0.062, 0.100, 0.126, 0.243 | 0.42                      | 2.17, 3.71                             |

Figure 4.7.6: Performance summary and comparison with recently published on-chip SC voltage regulators.


