A New Design Optimization Methodology of Fully Differential Dynamic Comparator

The need to reduce the time to market for high-performance integrated circuits has become a primary concern in modern electronics design. Many efforts are currently being made to streamline the design process for increasingly complex circuits while providing optimal performance, especially in nanoscale technologies. This paper presents a new and effective methodology for the design of fully differential comparators that achieves high-performance operation using a dynamic topology and nanoscale technology. The proposed methodology is not process dependent and can be applied to similar conventional comparator structures to optimize the operation speed while ensuring good offset cancellation, efficient noise immunity, and reduced design time and complexity. The design steps include theoretical analysis and simulation-based optimization of the comparator speed, as well as offset and noise reduction within a minimal design time. All the analog and digital building blocks are designed using dynamic topologies, including the clock generator, to ensure high speed and synchronized operation. The resulting circuit is a new two-stage dual-clock fully differential comparator. Compared with its equivalent counterparts, it provides improved operation speed and reduced offset voltage and kickback noise. This comparator is designed in the TSMC 65 nm CMOS process. It achieves a 1.25 GHz operation speed, presents less than 9 mV offset error, and generates a kickback noise of less than 40 mV with a 10 kΩ input resistance during the reset phase only. It consumes 213 µW from a 1.2 V power supply at 1.25 GHz.


Introduction
The scaling of silicon technologies has been one of the primary factors that have allowed for outpacing the exponential increase of performance demand over the past few decades. Transistor scaling increases the integration density and operation speed. At the same time, the resulting circuits are more sensitive to random and systematic errors, such as offset and noise. Additional circuitry for error compensation and noise suppression is then needed, leading to a drastic increase in design time and effort. Therefore, in modern circuit design, optimization methodologies to improve performance have become mandatory, not only to meet increasingly stringent design constraints but also to compensate for increased errors and noise while optimizing the time to market. Recently, optimization methodologies have become a major research field in MOS circuit design [1]-[3].
Dynamic comparators are widely used in advanced mixed-signal systems, such as analog-to-digital converters (ADCs). The design constraints of these systems are usually stringent, depending closely on those of the comparator. To improve the immunity of ADCs to sensed common noise, a specific variant of the dynamic comparator is usually used [4]-[7]; it is a six-terminal circuit that compares an input voltage difference to a reference voltage difference [8] and is commonly called a differential pair comparator or fully differential dynamic comparator (FDDC). However, it is slower than the common four-terminal circuit and is more sensitive to kickback noise, as well as to process and mismatch variations [9]. Achieving high performance and good noise immunity with a six-terminal dynamic comparator requires more design effort and time than with the common four-terminal one. Therefore, design methodologies could help designers significantly reduce design time and effort.
An FDDC was employed in [4] for its low kickback noise, good power efficiency, and simple dynamic structure. To reduce mismatch effects on loop stability, the authors kept the comparator gain at low values, leading to a considerable decrease in the operation speed. As for immunity to comparator noise, the authors applied a noise-shaping successive approximation register quantizer to all stages in a pipelined ADC. Although the proposed technique has advantages other than the noise immunity of the comparator, it remains complex and specific to the designed ADC. In [5], a charge distribution FDDC was used to implement a level-crossing ADC. It was constructed with two separate comparators to compare the differential input voltage to a differential reference voltage. The two separate comparators were more sensitive than an all-in-one FDDC to process and mismatch variations and to noise unbalance. The suppression of the sensed common noise was less efficient. In addition, the comparison was performed over two clock cycles, which affected the operation speed. Moreover, a static second stage was added to the comparator to increase the gain, making the comparators even slower. Another FDDC was used in [7] to implement a SAR-assisted noise-shaping pipeline ADC. The proposed structure included self-calibrated current sources to compensate for mismatches and operated with two synchronized clocks. The circuit design achieved good performance. However, the proposed comparator was specific to the designed ADC and the operation speed was very low.
As for offset compensation, mismatches are usually calibrated off-chip to reduce the design complexity [4], [5]. In [7], a background calibration for interstage offset was proposed to compensate for comparator mismatches. Even if the operation speed was not altered, there were "dead zones" in the calibration scheme that reduced its efficiency. Moreover, the proposed scheme mainly relies on the overall system architecture and can hardly be reproduced with a different circuit design.
The comparator gain is also an important feature for implementing high-resolution ADCs. It is usually increased by using preamplification stages or multistage comparators. In [5], a three-stage comparator was used, but only the first stage was dynamic. Thus, the comparator gain was high, whereas the operation speed was low. Likewise, a two-stage dynamic comparator, in which only the first stage is dynamic, was presented in [10]. In [11], a three-stage, fully dynamic comparator was proposed. However, the presented structure was not fully differential, and the three stages operated over the same clock period.
The current paper presents a new two-stage fully dynamic fully differential comparator. The decision is made over the entire clock period. Also, additional circuitry is added to generate synchronized clocks, to reduce kickback noise, and to compensate for mismatches. The whole system is fully dynamic without a considerable increase in design complexity. It achieves a fully differential comparison, optimal operation speed, good immunity to kickback noise, and self-calibrated offset voltage. The proposed design is process independent and can be used in different applications.
Section 2 presents the proposed system architecture of the FDDC, including clock generation, kickback noise immunity, and offset calibration. The new two-stage FDDC is presented and discussed in section 3. Its operation is also detailed and compared with the one-stage circuit. Section 4 describes the proposed circuit and how it ensures immunity to kickback noise while also detailing the clock generator design. Section 5 presents the proposed design technique for a digital offset self-calibration scheme using full custom dynamic circuits. Section 6 presents the simulation results and circuit characterization. A comparison to state-of-the-art performances is also addressed.

Proposed System Architecture
A comparator is essentially a one-bit ADC. When the difference between the compared voltages is a few hundred millivolts or more, the decision process is usually accurate and fast. However, as the input voltage decreases to a few millivolts or less, the decision process becomes much slower and more sensitive to the input signal quality, as well as to circuit nonidealities such as offset and switching noise. Indeed, analog signals usually present noise. Noise is random and common to the comparator inputs. On the other hand, a dynamic comparator is usually designed with small MOS devices, which makes the circuit more sensitive to process and mismatch variations, especially when designed in nanometer-scale technologies. Moreover, because of the dynamic operation of the comparator, there are large voltage variations in the internal nodes between devices. These variations are coupled through parasitic capacitors to the comparator inputs as a voltage signal, creating a disturbance that is usually called kickback noise. This switching noise is added to the analog input signal and affects the comparison results. Kickback noise cannot be removed, but there are a few techniques to reduce its effects on the decision process [12], [13].
Fig. 1(a) shows a four-terminal comparator, which is known as the strong-arm latch comparator and has been widely used in ADC design [14]. It presents two inputs and two outputs. One input is generated from an external voltage source, while the other comes from a resistive ladder. This affects the two inputs with different noise levels, making the comparison process effective only when the sensed voltage is greater than the difference between the two input noise signals. In contrast, a six-terminal comparator is shown in Fig. 1(b); this is called the differential pair comparator [8], [15] or FDDC [16], [17], and has been widely used in pipeline and SAR ADCs [18]. This comparator presents four inputs and two outputs.
Its positive output OP goes high when the differential input voltage exceeds the differential reference voltage, and its negative output OM is the complement:

$$(OP, OM) = \begin{cases} (\text{'1'}, \text{'0'}) & \text{if } (V_{IN+} - V_{IN-}) > (V_{REF+} - V_{REF-}) \\ (\text{'0'}, \text{'1'}) & \text{otherwise} \end{cases} \qquad (1)$$

Thus, the common noise in each differential input is cancelled separately, which considerably improves the comparator precision.

This section describes the top-level architecture of the proposed FDDC, including immunity to kickback noise and self-calibration of the offset voltage. The proposed comparator is a new two-stage FDDC. The clock generator produces two synchronized clock signals, clk and clks, to ensure the operation of the first and second stages, respectively. To reduce the input noise, an RC circuit can be added at the comparator inputs as a first-order filter, but at the price of a reduced operation speed. In the proposed solution, the resistance of a CMOS switch and the parasitic capacitor Cp at the comparator inputs together form an RC filter. These two components are too small to affect the comparator speed but also too small to ensure the cancellation of the kickback noise. Therefore, in the proposed scheme, the switches, together with the input parasitic capacitors, are used as track-and-hold circuit blocks, which are controlled to reduce the effect of noise on the decision process. Two clock signals, clkn and clkn', are used to control the switches so that they operate only during the comparator reset time (when clk = '0') before a new cycle begins. Thus, kickback noise appears at the comparator inputs for a limited period, during which the decision process cannot be affected.
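The decision rule of the six-terminal comparator and its rejection of noise common to each input pair can be illustrated with a minimal behavioural sketch. The function name `fddc_decision` and the voltage values (in millivolts) are hypothetical and chosen only for illustration; the model is ideal and ignores offset, kickback, and timing.

```python
def fddc_decision(vin_p, vin_m, vref_p, vref_m):
    """Ideal FDDC decision: return (OP, OM) as logic levels 1/0."""
    op = 1 if (vin_p - vin_m) > (vref_p - vref_m) else 0
    return op, 1 - op


# Hypothetical operating point, in millivolts:
# 3 mV input difference against a 1 mV reference difference.
clean = fddc_decision(603, 600, 951, 950)

# Same point with 20 mV of noise common to the input pair
# and 5 mV of noise common to the reference pair.
noisy = fddc_decision(603 + 20, 600 + 20, 951 + 5, 950 + 5)

print(clean, noisy)
```

Because each noise term is added to both terminals of a pair, it drops out of the corresponding difference, so the two calls return the same decision.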
The clock generator provides four synchronized clock signals: clk, clks, clkn, and clkn'. These clock signals ensure a three-phase comparator operation: track-and-hold, decision, and reset. The circuit is designed so that the track-and-hold, as well as a part of the decision process, are performed during the reset time, which improves the comparator speed compared with the state-of-the-art method. The comparator is described in detail in section 3, while noise suppression and clock generation are presented in section 4.
To compensate for the comparator offset errors, a three-phase operation system is proposed. First, the initial reset phase is controlled using the external signal reset. When this signal is high, the two N-bit outputs d+ and d− of the two counters are initialized to zero. Thus, the initial reset phase allows for initializing the capacitor banks to equal initial charges. This represents the initial state S0 of the two FSMs used in the self-calibration process. At that time, the eight input switches

New Two-Stage Fully Differential Comparator
The operation speed is a primary constraint in the comparator design. The comparison speed can be defined as the time required to provide a valid output decision. A dynamic comparator operates under a clock signal clk, alternating decision and reset phases in each clock cycle.
The two phases of decision and reset usually correspond to the two clock levels '1' (on) and '0' (off). Thus, denoting the decision and reset times by ton and toff, respectively, the total comparison time tclk is equal to:

$$t_{clk} = t_{on} + t_{off} \qquad (2)$$

A track-and-latch circuit, basically a Set Reset (SR) latch, is usually added to the comparator outputs to retrieve static output signals. Thus, the decision time ton is typically the sum of two times: the comparison time tc needed by the dynamic comparator to produce a valid output, and the SR latch time tSR required by the SR latch to change state according to the comparator outputs. The decision time ton is then equal to:

$$t_{on} = t_{c} + t_{SR} \qquad (3)$$

Once the SR latch state has changed, the comparator outputs can be reset to the initial value without affecting the SR latch state until the next decision process begins. Inserting (3) into (2), the total comparison time tclk is then defined in terms of the comparison time tc, the SR latch response time tSR, and the reset time toff:

$$t_{clk} = t_{c} + t_{SR} + t_{off} \qquad (4)$$

The comparison time tc depends on the internal capacitor sizes, internal feedback loops, and the value of the resolved input voltage. For an input voltage of a few hundred millivolts, tc can be as small as nanoseconds or picoseconds, depending on the comparator structure. However, when resolving input values near 0 V, the comparator output evolution becomes slow and tc tends to infinity. Therefore, to sense microvolt and nanovolt input values in a reduced time, it is necessary to minimize the comparator internal capacitors by using small devices, and to improve the comparator structure by creating positive feedback loops, immunity to switching noise, and compensation for process and mismatch variations.
A double-tail comparator and a three-stage triple-latch comparator are designed in a 28 nm MOS process in [11]. The first one is a two-stage double-tail comparator that includes only one positive feedback loop, while the second one includes three positive feedback loops. The first one achieves a comparison time tc equal to 50 ps against 27 ps for the second comparator when resolving a 5 mV input value. Nevertheless, in the two comparators, the stages operate during the same clock period, which makes tc the sum of the response times of all stages in series. Moreover, there is no improvement for tSR and toff in the total comparison time tclk in (4). In [3], a two-stage dual-clock latch comparator is proposed. The comparator includes one feedback loop. However, the second stage is controlled by a second clock, reducing the on-time ton in (2) to tc only. Thus, the total comparison time tclk defined in (4) becomes:

$$t_{clk} = t_{c} + t_{off} \qquad (5)$$

Moreover, the second stage is built with a stack of two elements only, which reduces the total capacitance seen at the outputs of the first stage, leading to a minimal comparison time tc. The comparator is designed in a 180 nm MOS process and achieves a comparison time of 900 ps when resolving a 25 µV input value. However, the second stage operates correctly only when one of the first-stage outputs decreases to a threshold value. If both outputs reach this value, the second-stage outputs will not be complementary, and the comparison decision will not be valid. This happens when resolving small input values and when the PMOS threshold voltage |VTHP| is larger than VDD/2, which is usually the case in scaled technologies like 65 nm and below.
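The effect of moving the SR latch delay out of the clock on-time can be checked numerically. The sketch below encodes the single-clock budget, tclk = tc + tSR + toff, and the dual-clock budget, tclk = tc + toff. The function names and the timing values are hypothetical and are not taken from the simulations.

```python
def clock_period_single_clock(t_c, t_sr, t_off):
    """Single-clock comparator: t_clk = t_c + t_SR + t_off."""
    return t_c + t_sr + t_off


def clock_period_dual_clock(t_c, t_off):
    """Dual-clock comparator: the SR latch delay no longer adds to t_clk."""
    return t_c + t_off


# Hypothetical timing values in picoseconds (illustration only).
T_C, T_SR, T_OFF = 300, 100, 400

single = clock_period_single_clock(T_C, T_SR, T_OFF)
dual = clock_period_dual_clock(T_C, T_OFF)

# Maximum comparison rate in GHz is 1000 / period_in_ps.
print(single, 1000 / single)
print(dual, 1000 / dual)
```

With these placeholder numbers the period shortens by exactly tSR, which is the saving the dual-clock arrangement is meant to provide.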
In the present work, a new two-stage FDDC is proposed in which the comparison speed is optimized with no restriction on the technology used. Indeed, as shown in Fig. 3, each stage includes a positive feedback loop, which reduces the comparison time tc compared with [3]. In addition, the positive feedback loop in the second stage provides complementary outputs, regardless of the technology parameters used. Moreover, the two stages operate under two different clock signals as in [3], which reduces the total comparison time tclk to the sum of the comparison time tc and the reset time toff, as defined in (5).

The voltage difference between the sums of the voltages applied to the two input branches of the first stage is denoted as ΔVINPUT and is equal to:

$$\Delta V_{INPUT} = (V_{IN+} + V_{REF-}) - (V_{IN-} + V_{REF+}) \qquad (6)$$
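The short derivation below shows why noise common to each differential pair does not reach the decision. It assumes that ΔVINPUT is the difference between the two branch sums (VIN+ + VREF−) and (VIN− + VREF+) described for the first stage; the noise terms n_IN and n_REF are hypothetical symbols introduced only for this illustration.

```latex
\begin{align*}
\Delta V_{INPUT}
  &= (V_{IN+} + V_{REF-}) - (V_{IN-} + V_{REF+}) \\
  &= (V_{IN+} - V_{IN-}) - (V_{REF+} - V_{REF-}) \\
  &= \Delta V_{IN} - \Delta V_{REF}, \\[4pt]
\Delta V_{INPUT}^{\,noisy}
  &= \big[(V_{IN+} + n_{IN}) + (V_{REF-} + n_{REF})\big]
   - \big[(V_{IN-} + n_{IN}) + (V_{REF+} + n_{REF})\big] \\
  &= \Delta V_{IN} - \Delta V_{REF}.
\end{align*}
```

Under this assumption, the sign of ΔVINPUT matches the decision condition ΔVIN > ΔVREF, and both noise terms cancel term by term.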

Clock Generator and Kickback Noise Suppression
The generation of synchronized clock signals is achieved by sequential circuits using data flip-flops (DFFs). The true single-phase clock (TSPC) DFF presented in [19] is used to design the clock generator of the proposed system in Fig. 2(c). It is a nine-transistor, three-stage DFF operating with a single clock signal and including no more than three stacked devices per stage. This circuit is shown in Fig. 4, where a reset command and an inverter are added to the output. This structure is convenient and should provide an operation speed greater than that of the comparator. A detailed description of the circuit can be found in [19].

State S0 of the clock generator also corresponds to the reset of the comparator first stage. Vector c = (clk clks clkn) is then equal to (0 0 1). State S1 corresponds to the reset of the two comparator stages, for which c is equal to (0 1 0). State S2 is the state where the comparator first-stage operation begins, which corresponds to c equal to (1 0 0). State S3 is the continuation of state S2, with c still equal to (1 0 0). This last state is required because the first-stage operation is slower than the second one. Therefore, the high and low levels of clk must last longer than those of clks and clkn.
The finite state machine (FSM) is depicted in Fig. 5(b). It has no inputs and generates the three outputs: clk, clks, and clkn. The gate-level and circuit-level syntheses are given in Fig. 5(c) and (d), respectively. An inverter is added to generate the complement of clkn. State S0 is the sample-and-hold state, while state S2 is the decision phase. Inserting states S1 and S3 in between S0 and S2 reduces the effects of kickback noise on the decision process. Indeed, as discussed in [13], isolating the decision process from the sample-and-hold phase can significantly reduce the effect of kickback noise on the decision process. However, the clock generation in [13] used delay circuits, and the outputs were not synchronized. Hence, the design was specific to the chosen clock timing, as well as to the technology used. In contrast, the proposed design generates synchronized outputs and can be reproduced without considering the technology used or transistor size.
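The four-state clock generator can be sketched as a Moore machine with no inputs. The output vectors (clk clks clkn) per state follow the values given in the text; the cyclic transition order S0, S1, S2, S3, S0 and the one-state-per-master-clock-period timing are assumptions made for this sketch, and the name `clkn_b` for the complement clkn' is hypothetical.

```python
from itertools import islice

# State -> (clk, clks, clkn), as listed in the text.
OUTPUTS = {
    "S0": (0, 0, 1),  # sample-and-hold, first stage in reset
    "S1": (0, 1, 0),  # both stages in reset
    "S2": (1, 0, 0),  # first-stage operation begins
    "S3": (1, 0, 0),  # continuation of S2
}

# Assumed transition order: a simple four-state ring.
NEXT_STATE = {"S0": "S1", "S1": "S2", "S2": "S3", "S3": "S0"}


def clock_generator(start="S0"):
    """Yield (state, signals) once per master clock period."""
    state = start
    while True:
        clk, clks, clkn = OUTPUTS[state]
        signals = {"clk": clk, "clks": clks, "clkn": clkn, "clkn_b": 1 - clkn}
        yield state, signals
        state = NEXT_STATE[state]


trace = list(islice(clock_generator(), 8))
for state, signals in trace:
    print(state, signals)
```

In this model clk stays high for two consecutive states and low for two, while clks and clkn are each high for a single state, which reflects the requirement that the levels of clk last longer than those of clks and clkn.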

Proposed Offset Self-Calibration Technique
In Fig. 2(c), the proposed offset regulator receives the comparator static outputs Q+ and Q− and generates two N-bit outputs d+ and d−. These outputs are then used to control the 2N binary-weighted transistors (Mdi+) and (Mdi−) shown in Fig. 3(a). The least significant bit (LSB) transistor is set to minimal dimensions, while, for the other weighted transistors, the channel width is doubled at each step up to the most significant bit (MSB) transistor.
The main idea is to create a progressive charge imbalance to compensate for the comparator offset, as in [3], [20]. However, in [3], a high-level design methodology for the self-calibration scheme is proposed. As a result, the circuit is slow and large because of the large number of chained gates. In [20], by contrast, the offset regulation is off-chip and too complex for a circuit-level design. In the present work, the proposed offset regulator is minimalist and can easily be designed at the circuit level.
The proposed offset regulator block diagram is presented in Fig. 6. The circuit input stage is an FSM, which receives the comparator static outputs Q+ and Q− and generates two control digits e+ and e−. The generated outputs are used to control two N-bit counters, which generate two N-bit control words to calibrate the two capacitor banks in the comparator shown in Fig. 3(a). In these capacitor banks, two cases are not used, to avoid a significant variation in the capacitive compensation load: "all transistors are on" and "all transistors are off", which correspond to d+ and d− equal to 0 and $2^N - 1$, respectively. Fig. 8(a) shows the FSM of an N-bit counter.
The module-level design of the counter is shown in Fig. 8(b), while the proposed circuit-level design is shown in Fig. 8(c). In the proposed circuit-level design, the first DFF is reset to '1' instead of '0' to initialize the N-bit counter to 1 instead of 0. Fig. 9 shows the modified DFF. However, to avoid the counter reaching $2^N - 1$, the on time of the external signal calib is set to exactly $2^N - 2$ cycles.
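The calibration sequence (counters starting at 1, one increment per comparison cycle on the side that reduces the offset, and a stop before the code $2^N - 1$) can be sketched behaviourally. The linear offset model, the function name `calibrate`, and the parameter `step_mv` are hypothetical; the real correction per code step depends on the transistor sizing and is not given here.

```python
def calibrate(v_os_mv, step_mv, n_bits=6):
    """Return (d_plus, d_minus) after calibration with a 0 V input applied."""
    d_plus = d_minus = 1          # counters are initialized to 1, not 0
    d_max = 2 ** n_bits - 2       # counting must stop before 2**N - 1

    def q_plus(dp, dm):
        # Hypothetical linear model: each unit of code imbalance
        # shifts the effective offset by step_mv millivolts.
        return 1 if v_os_mv - step_mv * (dm - dp) > 0 else 0

    first = q_plus(d_plus, d_minus)   # sign of the initial offset
    while q_plus(d_plus, d_minus) == first:
        if first == 1 and d_minus < d_max:
            d_minus += 1              # positive offset: increment d-
        elif first == 0 and d_plus < d_max:
            d_plus += 1               # negative offset: increment d+
        else:
            break                     # correction range exhausted
    return d_plus, d_minus


print(calibrate(50, 3))    # offset within the correction range
print(calibrate(150, 1))   # offset beyond the range: counter saturates
```

In this sketch an offset larger than the available range leaves the incremented counter at its last allowed code, $2^N - 2$, which is the situation the calib signal is described as guarding against.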

Simulation and Comparison
To validate the proposed design methodology and evaluate the proposed circuit performances, the proposed two-stage FDDC shown in Fig. 3 has been designed and simulated in the TSMC 65 nm CMOS process.

Thus, the decision time ton is equal to 400 ps and 360 ps in the basic and proposed comparators, respectively. The speed improvement of 40 ps in the proposed comparator is then about 10%, as in [3]. However, in [3], the two-stage comparator could not operate properly when powered by voltages equal to 1.2 V and below. The proposed design operation is independent of the technology used, as discussed in section 3.
In the second simulation set, the clock generator shown in Fig. 5 reduced, the circuit remains immune to kickback noise during the decision phase, which is essential to ensure high accuracy.
In the fourth simulation set, the offset correction is simulated using the circuit shown in Fig. 13.

In the fifth simulation set, the operation of the offset regulator FSM shown in Fig. 7(a) is evaluated using the circuit shown in Fig. 15. In this circuit, an offset voltage equal to 50 mV is added in series with a positive comparator input. In Fig. 18, the trip points VTR+ and VTR− are determined as 741 µV and −77 µV, respectively. The offset voltage VOS is determined using (7) and is equal to 332 µV. Thus, the proposed self-calibration method has effectively reduced the offset voltage from 50 mV to a few hundred microvolts.
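Equation (7) is not reproduced in this section. A quick check, assuming it defines the offset as the mean of the two trip points (an assumption that reproduces the quoted 332 µV), is given below; the function name is hypothetical.

```python
def offset_from_trip_points(v_tr_plus_uv, v_tr_minus_uv):
    """Offset voltage taken as the mean of the two trip points (in µV)."""
    return (v_tr_plus_uv + v_tr_minus_uv) / 2


v_os_uv = offset_from_trip_points(741, -77)
print(v_os_uv)  # 332.0
```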
In the sixth simulation set, the circuit in Fig. 15 is used again with an offset voltage equal to 150 mV to evaluate the maximum offset correction that the designed system can achieve. Fig. 19 shows the resulting offset regulator FSM outputs. The system goes through states S0, S1, S2, and S3. However, the control signal calib is set to '0' before the FSM reaches state S2, that is, before Q+ and Q− change to the opposite logic levels. Indeed, the control signal calib is used to disable the counter incrementation when it reaches $2^N - 2$, as discussed in section 5. Therefore, the enable signal E− is no longer identical to e−. Fig. 20 shows the resulting counter outputs d+ and d−, which are equal to 1 and 62, respectively.

In the seventh simulation set, the offset voltage is determined while considering the process and mismatch variations. Fig. 22 shows the offset variation of the designed FDDC under mismatch variation, with and without offset calibration, for a 100-run Monte Carlo simulation. Without offset calibration, the offset voltage VOS has a maximum variation of ±160 mV. This offset is reduced to ±9 mV after calibration. The standard deviation is reduced from 41.3 mV to 2.23 mV after calibration, a decrease of more than 18 times. The proposed design achieves an effective self-calibration of the offset voltage. The maximal offset correction can be improved by increasing the channel length of the calibration transistors, as discussed in [3], or by increasing the number of charges in the capacitor banks.
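The two figures quoted above can be cross-checked with a few lines of arithmetic, using only values stated in the text (six-bit capacitor banks, standard deviations of 41.3 mV and 2.23 mV). The variable names are illustrative.

```python
# Last allowed counter code for six-bit capacitor banks.
n_bits = 6
max_code = 2 ** n_bits - 2

# Reduction factor of the offset standard deviation after calibration.
sigma_before_mv = 41.3
sigma_after_mv = 2.23
reduction = sigma_before_mv / sigma_after_mv

print(max_code)
print(round(reduction, 1))
```

The saturated counter value of 62 reported for d− matches $2^N - 2$ with N = 6, and the ratio of the two standard deviations lies between 18 and 19.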
The proposed design performance is summarized in Table 1. This table also presents the performance achieved in current related works on FDDCs. The proposed design is the only one that includes offset calibration and noise cancellation in FDDCs. It achieves the second-best energy efficiency after a 40 nm CMOS design [22]. However, in [22], no offset regulation is proposed; adding one would increase the consumed power and decrease the operation speed.
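The energy-efficiency metric used in Table 1 is not given in this section. One common figure, assumed here for illustration, is the energy per comparison, obtained by dividing the power consumption by the comparison rate.

```python
power_w = 213e-6   # 213 µW, from the reported power consumption
rate_hz = 1.25e9   # 1.25 GHz comparison rate

# Energy per comparison in femtojoules (assumed metric: power / rate).
energy_fj = power_w / rate_hz * 1e15
print(round(energy_fj, 1))
```

With the reported figures this gives roughly 170 fJ per comparison.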

Conclusions
The current paper presented a new and effective design methodology for FDDCs, including kickback noise immunity and offset self-calibration. In the proposed design, the kickback noise is almost null during the decision phase and less than 40 mV during the reset phase. Moreover, the proposed FDDC achieves an effective digital offset self-calibration, in which the offset voltage is reduced more than 18 times. The proposed circuit is designed with minimalist building blocks and consumes no more than 213 µW at a 1.25 GHz comparison rate. It achieves high performance compared with the current state of the art in terms of offset calibration, noise cancellation, operation speed, power consumption, and design simplicity. Moreover, the proposed design methodology is generic and independent of the technology used.

Figure 1: Strong-arm latch comparator (a) dynamic comparator (b) fully differential dynamic comparator.
The inputs are a differential analog input signal and a differential reference voltage. The two outputs are complementary: a positive output OP and a negative output OM. The positive output OP goes high when the differential analog input voltage VIN+ − VIN− is greater than the reference voltage difference VREF+ − VREF−.

Fig. 2 describes the proposed system. The symbol shown in Fig. 2(a) presents the input and output terminals of the system. Fig. 2(b) illustrates the clock diagram of the external and internal clock signals, while Fig. 2(c) depicts the top-level architecture.

Figure 2: Proposed system (a) symbol view (b) clock diagram (c) architecture.
control signals d+ and d− by 1 to compensate for the mismatches in the comparator as well as in the switches at the comparator inputs. This process continues as long
The circuit operates as follows: in the first stage, a differential analog input voltage ΔVIN = (VIN+ − VIN−) and a differential reference voltage ΔVREF = (VREF+ − VREF−) are applied to the four input pair transistors (M1−4). The voltages VIN+ and VREF− are applied to transistors (M1,4), which have a common drain. These transistors generate two currents and feed node X− with a current that is the image of the sum of the two applied voltages (VIN+ + VREF−). Likewise, considering the circuit symmetry, transistors (M2,3) feed node X+ with a current that is the image of the sum of the two applied voltages (VIN− + VREF+). When the clock signal clk is low, the tail transistors (M5,6) turn off, while the reset transistors (M11−14) turn on. This initializes the latch nodes X+, X−, O+ and O− to VDD. Conversely, when clk goes high, the tail transistors (M5,6) turn on while the reset transistors (M11−14) turn off. At this time, the four input pair transistors feed the latch nodes X+ and X− with a differential current ΔIX = IX+ − IX−, which is the image of the voltage difference between the sums of the applied voltages.

The resulting ΔIX activates the latch transistors (M7−10), which operate as a strong positive feedback loop to regenerate the outputs O+ and O− to complementary logic levels. The generated outputs are then applied to the input transistors (Ms1,s2) of the second stage. Transistors (Mdi+(i=1..N)) and (Mdi−(i=1..N)) form two capacitor banks, each one including N binary-weighted charges. These capacitor banks are controlled by two N-bit inputs, d+ = (di+(i=1..N)) and d− = (di−(i=1..N)), and are used to compensate for process and mismatch variations. This specific structure of the charges also reduces the switching noise and improves the operation speed [3].

Considering the second stage, when clks is high, outputs Os+ and Os− are initialized to '0', turning off transistors (Ms3,s4). As clks becomes low, the reset transistors (Ms5,s6) turn off. As shown in Fig. 3(b), this happens at the end of the reset phase of the first stage, where both outputs O+ and O− are initialized to VDD. Thus, transistors (Ms1,s2) are off like the other four. When one of the first-stage outputs O+ or O− begins decreasing, transistor (Ms1) or (Ms2), respectively, begins operating to charge the output voltage Os+ or Os−, respectively, to VDD. When the applied input voltage difference is too small, both O+ and O− can decrease before regenerating to logic levels. Then, transistors (Ms3,s4) operate as positive feedback to maintain one of the outputs at '0' while the other one charges to VDD. Without these transistors, both outputs Os+ and Os− may end up at VDD. In this case, when these signals are applied to the SR latch, they create an undefined state, resulting in a wrong decision on outputs Q+ and Q−.

The last stage is a NOR gate SR latch. It maintains its state when the applied signals Os+ and Os− are initialized to '0' and keeps or changes the state when the outputs are complementary, resulting in static outputs Q+ and Q−.
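The role of the NOR-based SR latch can be illustrated with a minimal behavioural model: it holds its state while both second-stage outputs are reset to '0' and updates only when they are complementary. The assignment of Os+ to the set input and Os− to the reset input is an assumption made for this sketch.

```python
def nor_sr_latch(s, r, q):
    """One settled update of a NOR-based SR latch; returns the new q."""
    if s and r:
        # Both inputs high is the forbidden input of a NOR SR latch.
        raise ValueError("S = R = 1 leaves the latch in an undefined state")
    if s:
        return 1
    if r:
        return 0
    return q  # S = R = 0: hold the previous state


# Assumed mapping: S = Os+, R = Os-.
q = 0
history = []
for os_p, os_m in [(0, 0), (1, 0), (0, 0), (0, 1), (0, 0)]:
    q = nor_sr_latch(os_p, os_m, q)
    history.append(q)

print(history)  # [0, 1, 1, 0, 0]
```

The sequence alternates reset phases, where both inputs are '0' and the latch holds, with decisions, which is why the static output remains valid after the dynamic outputs are reset. The forbidden input corresponds to the case where both Os+ and Os− reach VDD, which the positive feedback in the second stage is intended to prevent.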

Fig. 5 shows the design details of the proposed clock generator. The circuit generates four output signals: clk, clks, clkn, and clkn'. Because the last two are complementary, the circuit states can be defined according to the three outputs clk, clks, and clkn only. These three outputs are denoted by vector c = (clk clks clkn) = (x x x), where x is equal to '1' or '0'. As described in Fig. 5(a) and (b), the circuit goes through four states: S0, S1, S2, and S3. First, the clock generator is initialized to state S0 with an external reset = '1'. This state corresponds to the sample-and-hold phase, in which the external signals are connected to the comparator inputs (Fig. 2(c)).

Figure 5: Proposed clock generator design (a) clock diagram (b) Moore finite state machine (c) gate-level design (d) circuit-level design.

Figure 6: Block diagram of the proposed offset regulator.
The codes d+ and d− equal to 0 and $2^N - 1$, for which all the calibrating transistors are respectively on and off, are excluded. Therefore, the two N-bit control signals d+ and d− should be initialized to 1, for which all the binary-weighted transistors are on, except for the LSB transistor. Then, according to the Q+ and Q− levels, the incrementation of d+ and d− is either stopped by setting E+ and E− to '0' or continued by setting E+ and E− to '1'. The incrementation should stop before reaching $2^N - 1$, for which all transistors are blocked. The case of d+ and d− equal to $2^N - 2$ turns off all the calibrating transistors, except the LSB one. The parasitic capacitors of the blocked transistors can be neglected compared with those of the on transistors.
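The relation between a control code and the calibrating load can be made concrete with a small sketch. It assumes active-low control (a bit at '0' keeps its transistor on), which is inferred from the statement that code 1 turns off only the LSB transistor, and it assumes that the load is proportional to the total width of the on transistors, counted in LSB units. The function name `on_units` is hypothetical.

```python
def on_units(d, n_bits=6):
    """Total width, in LSB units, of the transistors that are on for code d."""
    # Bit i of d at '0' keeps the transistor of weight 2**i on
    # (active-low control, inferred from the text).
    return sum(2 ** i for i in range(n_bits) if not (d >> i) & 1)


for code in (0, 1, 2, 62, 63):
    print(code, on_units(code))
```

Under these assumptions the load equals $2^N - 1 - d$, so each increment of the code removes exactly one LSB unit of load, and the allowed range 1 to $2^N - 2$ excludes the all-on and all-off cases.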

Figure 7: Proposed FSM design to control the two counters, (a) Moore FSM, (b) proposed circuit-level design.
Each conducting transistor is then equivalent to a capacitor. As a result, when d+ and d− are equal to 1, the N−1 largest capacitors are in parallel. This sets the calibrating capacitive load at the maximum value on both sides of the comparator. When applying a 0 V input voltage, the comparator output Q+ is either high or low. When Q+ is high, the comparator is considered as exhibiting a positive offset voltage. To compensate for this offset, d− is incremented by 1 (d− = d−(initial) + 1 = 2). This corresponds to a first step decrease of the capacitive load on the right side of the comparator with respect to the left side. Hence, in the next comparison cycle, the positive offset voltage either decreases toward 0 V or becomes negative. If the offset voltage is still positive in the next cycle, that is, Q+ is still high, d− is incremented again. This

Figure 8: Proposed N-bit counter design (a) Moore FSM (b) module-level design (c) proposed circuit-level design to start the counter from 1.
Then, the system enters a final state S7 where both outputs e+ and e− are set back to '0' again. Because the circuit is symmetrical, considering Q+ and Q− equal to '0' and '1', respectively, leads to states S4, S5 and S6, which are symmetrical to states S1, S2 and S3, respectively. Fig. 7(b) shows the proposed FSM circuit synthesis. It uses dynamic circuits and the DFF shown in Fig. 4. To generate static outputs e+ and e− with maximal operation speed, switched circuits with positive feedback are used.

Figure 9: First DFF of the N-bit counter (a) symbol (b) circuit-level design.
has been designed in the TSMC 65 nm CMOS process using standard-threshold MOS devices. The offset calibrating capacitor banks are set to six bits. The basic comparator shown in Fig. 1(b), followed by a NAND-based SR latch, is also designed using the same standard-threshold devices and serves as a basis for comparison to show the advantages of the proposed structure. In the first simulation set, both FDDCs are simulated at room temperature under nominal operating conditions. They are powered by a 1.2 V supply voltage and operate at a 1.25 GHz clock frequency. A first DC voltage source, set to −300 µV, provides the differential input voltage, while a second DC voltage source, set to VCM = 950 mV, is connected to both reference inputs to set VREF = (VREF+ − VREF−) to 0 V.

Fig. 10 shows the transient analysis results for both comparators. This figure is used to determine the decision time ton for both structures. In Fig. 10(a), the decision time ton of the basic comparator, as defined in (3), is the time between the instant clk goes high and the instant the negative output Q− of the SR latch crosses the mid-supply voltage (VDD/2 = 600 mV). In this first case, the Q+ and Q− transitions must happen during the clk on-time. Otherwise, the decision could not be made, and the comparator output would be invalid. In Fig. 10(a), the Q+ transition happens slightly before the clk transition. In the proposed circuit, the decision time ton is equal to the comparison time tc, as discussed in Section 3. The comparison time tc in Fig. 10(b) corresponds to the time between the instant clk goes high and the instant the negative output Os− of the second stage crosses 600 mV. In this second case, the Os− transition must happen during the clk on-time. However, since the Os− logic level is maintained during the reset, the Q+ and Q− transitions can happen at any time in the clock cycle, even after the clk transition.
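The measurement described above, the delay from the rising clk edge to the output crossing VDD/2, can be extracted from sampled waveforms along the following lines. This is only a sketch under stated assumptions: the function names and the sample data are hypothetical, the crossing instants are found by linear interpolation between samples, and the output crossing is accepted in either direction once clk has risen.

```python
def crossing_time(t, v, threshold, direction="any", t_start=float("-inf")):
    """First instant at or after t_start where v crosses threshold.

    Uses linear interpolation between consecutive samples. direction is
    "rising" or "any". Returns None if no crossing is found.
    """
    for k in range(1, len(t)):
        if t[k] <= t_start:
            continue
        v0, v1 = v[k - 1], v[k]
        rising = v0 < threshold <= v1
        falling = v0 > threshold >= v1
        hit = rising if direction == "rising" else (rising or falling)
        if not hit:
            continue
        frac = (threshold - v0) / (v1 - v0)
        t_cross = t[k - 1] + frac * (t[k] - t[k - 1])
        if t_cross >= t_start:
            return t_cross
    return None


def decision_time(t, clk, out, vdd=1.2):
    """Delay between clk rising through VDD/2 and out crossing VDD/2."""
    half = vdd / 2
    t_clk = crossing_time(t, clk, half, direction="rising")
    if t_clk is None:
        return None
    t_out = crossing_time(t, out, half, t_start=t_clk)
    return None if t_out is None else t_out - t_clk


if __name__ == "__main__":
    # Synthetic waveforms (time in ps), not simulation data from the paper.
    t = [0.0, 100.0, 200.0, 300.0, 400.0]
    clk = [0.0, 0.0, 1.2, 1.2, 1.2]
    out = [1.2, 1.2, 1.2, 0.0, 0.0]
    print(decision_time(t, clk, out))  # about 100 ps for this synthetic data
```

The same routine applies to either structure: Q− of the SR latch would be passed as `out` for the basic comparator, and Os− for the proposed two-stage comparator.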

Figure 10: Transient analysis of the fully differential dynamic comparator (a) basic comparator (b) proposed two-stage comparator.

Figure 11: Transient evolution of the generated clocks.

Figure 12: Kickback noise simulation (a) simulation circuit (b) kickback noise at the comparator inputs.
(a). The differential analog input is connected to a triangular voltage source VINPUT = VIN+ − VIN− with a slope equal to 1 mV/10 ns. The differential reference inputs VREF+ and VREF− are connected to a common mode voltage source VCM = VREF+ = VREF− = 950 mV.

Figure 13: Offset self-calibration simulation (a) simulation circuit (b) ideal transfer characteristic (c) real transfer characteristic.

The differential reference voltage VREF = VREF+ − VREF− is then equal to 0 V. Thus, considering the two voltage differences, VINPUT and VREF, the ideal transfer characteristic of the dynamic comparator would be similar to the one presented in Fig. 13(b). Here, both hysteresis and offset are null. However, in real conditions, the comparator always exhibits hysteresis and offset [21]. Fig. 13(c) shows the realistic transfer characteristic. The hysteresis window is centered on VM and delimited by trip points VTR+ and VTR−. The offset voltage VOS is defined as the difference between VM and VREF:
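Written out from the description above, and assuming that VM is the midpoint of the two trip points because the hysteresis window is centered on it, the definition would take the following form. The midpoint expression is an inference from the text, not a formula quoted from the paper.

```latex
V_M = \frac{V_{TR+} + V_{TR-}}{2}, \qquad V_{OS} = V_M - V_{REF}
```

With VREF = 0 V, as in this test bench, VOS reduces to VM.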

Figure 16: FSM input and output signals of the offset regulator when VOS = 50 mV.

Fig. 16 shows the applied input signals reset, calib, and calib'. The reset action initializes both FSM outputs e+ and e− to '0', which corresponds to state S0 of the FSM shown in Fig. 7(a). Then, with Q+ and Q− equal to '1' and '0', respectively, the FSM outputs e+ and e− become '0' and '1', respectively, which corresponds to state S1. After 35 clock cycles, Q+ and Q− change to the opposite logic levels, leading e+ and e− to change to '1' and '0', respectively. This change lasts two clock cycles, which corresponds to state S2 followed by state S3 in the FSM. After these two cycles, the outputs e+ and e− are set back to '0', which corresponds to the last FSM state S7. Two signals E+ and E−, which are identical to e+ and e−, are also generated. Indeed, because calib = 1, the AND logic operation between calib and each of e+ and e−, as shown in Fig. 6, results in the two signals E+ and E−. These signals are applied to the enable inputs of two 6-bit counters, leading to two offset calibration control signals d+ and d−, respectively. Fig. 17 shows the generated control signals, where d+ and d− are equal to 3 and 37, respectively.
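The state sequence described above can be mimicked with a small behavioral model of the Moore FSM driving the two counters. This is a sketch, not the authors' circuit. The output table follows the text, while the unconditional S2 → S3 → S7 (and S5 → S6 → S7) transitions, the order of sampling within a cycle, and the clamp at 2^N − 2 are assumptions. The names `OUTPUTS`, `next_state`, and `calibrate` are hypothetical.

```python
# Moore outputs (e_plus, e_minus) for each state, as read from the text.
OUTPUTS = {
    "S0": (0, 0), "S1": (0, 1), "S2": (1, 0), "S3": (1, 0),
    "S4": (1, 0), "S5": (0, 1), "S6": (0, 1), "S7": (0, 0),
}


def next_state(state, q_plus):
    """Transition table; the unconditional two-cycle steps are an assumption."""
    if state == "S0":
        return "S1" if q_plus else "S4"
    if state == "S1":
        return "S1" if q_plus else "S2"
    if state == "S4":
        return "S5" if q_plus else "S4"
    return {"S2": "S3", "S3": "S7", "S5": "S6", "S6": "S7", "S7": "S7"}[state]


def calibrate(q_plus_samples, calib=1, n_bits=6):
    """Run the FSM over sampled Q+ values and return (state, d_plus, d_minus).

    Both counters start at 1. Each cycle, a counter increments when its
    enable E = e AND calib is 1. Counts are clamped at 2**n_bits - 2
    (assumption) so that the all-blocked code is never reached.
    """
    state, d_plus, d_minus = "S0", 1, 1
    d_limit = 2 ** n_bits - 2
    for q_plus in q_plus_samples:
        e_plus, e_minus = OUTPUTS[state]
        d_plus = min(d_plus + (e_plus & calib), d_limit)
        d_minus = min(d_minus + (e_minus & calib), d_limit)
        state = next_state(state, q_plus)
    return state, d_plus, d_minus


if __name__ == "__main__":
    # Q+ high for the cycle leaving S0 plus 35 cycles in S1, then low.
    print(calibrate([1] * 36 + [0] * 4))  # -> ('S7', 3, 37)
```

With this sampling convention, the model ends in S7 with d+ = 3 and d− = 37, which is consistent with the values quoted for Fig. 17. A different timing convention would shift the counts by one.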

Fig. 21 is used to determine the maximal offset correction. The offset voltage obtained after correction is 4.33 mV. Thus, for an initial offset of 150 mV, the system can achieve a maximal offset correction of 145.67 mV.

Figure 19: FSM input and output signals of the offset regulator when VOS = 150 mV.

Figure 22: Monte Carlo simulation of the offset voltage (a) without calibration (b) with calibration.

Table 1: Summary and comparison of the characteristics of fully differential dynamic comparators.