

<u>ISSN:</u> <u>2278 – 0211 (Online)</u>

# Implementation Of Clocked Pair Shared Flip-Flop For Low Power Applications By Double Edge Triggering Techniques

L.Shanmughaprakash P.G Student/VLSI Design, SRM University, Chennai, India N.Saraswathi Asst.Prof selection grade, SRM University, Chennai, India A.Jagadeeswaran P.G Student/VLSI Design, Nandha Engineering College , Erode, Tamilnadu, India

# Abstract:

Power consumption is one of the most important constraints on system performance. This work focuses on low power clocking methods that reduce power consumption in chips and flip-flops by reducing the number of clocked transistors. Using the Clocked Pair Shared Flip-Flop (CPSFF) reduces the number of clocked transistors by about 40%, which can reduce the clock driving power by about 24% in most SoCs. In addition, low swing and double edge clocking can be used to build the clocking system. A simple double edge triggered (DET) transistor design is used to reduce the leakage current. In this scheme, the transistor sizes and the pulse generation circuit can be further reduced for power saving. The proposed structure is designed in UMC 180 nm CMOS technology using a SPICE tool.

Key words: Flip-flop, low power, svl.

#### 1.Introduction

System-on-chip (SoC) designs integrate hundreds of millions of transistors on one chip, whereas packaging and cooling have only a limited ability to remove the excess heat. As a result, power consumption becomes the bottleneck in achieving high performance. The clock system, which consists of the clock distribution network and sequential elements (flip-flops and latches), is one of the most power consuming components in a VLSI system. It accounts for 30% to 60% of the total power dissipation in a system [3]. Reducing the power consumed by flip-flops therefore has a deep impact on the total power consumed. A large portion of the on-chip power is consumed by the clock drivers, so care must be taken to reduce the clock load when designing a clocking system.

Many contemporary microprocessors selectively use master-slave and pulse-triggered flip-flops. Traditional master-slave single-edge flip-flops, for example the transmission gate flip-flop, are made up of two stages, one master and one slave. Another edge-triggered flip-flop is the sense amplifier-based flip-flop (SAFF). All of these hard-edged flip-flops are characterized by a positive setup time, causing large D-to-Q delays. Alternatively, pulse-triggered flip-flops reduce the two stages to one stage.

- Setup time and hold time describe the timing requirements on the D input of a flip-flop with respect to the Clk input. Together they define a window of time during which the D input must be valid and stable in order to ensure valid data on the Q output.
- Setup time (Tsu) is the time for which the D input must be valid before the flip-flop samples it (Figure 1).
- Hold time (Th) is the time for which the D input must remain valid after the flip-flop samples it (Figure 1).
- Propagation delay (Tpd) is the time taken for the sampled D input to propagate to the Q output.
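The timing definitions above can be illustrated with a minimal sketch. The class name, function names, and the nanosecond values below are illustrative assumptions, not figures taken from this paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FlipFlopTiming:
    """Hypothetical timing parameters, all in the same time unit (e.g. ns)."""
    t_su: float  # setup time: D must be stable this long before the edge
    t_h: float   # hold time: D must be stable this long after the edge
    t_pd: float  # propagation delay from the sampling edge to Q


def violates_window(timing: FlipFlopTiming, clock_edge: float, data_change: float) -> bool:
    """True if D changes inside the (edge - Tsu, edge + Th) window."""
    return clock_edge - timing.t_su < data_change < clock_edge + timing.t_h


def q_valid_time(timing: FlipFlopTiming, clock_edge: float) -> float:
    """Time at which Q reflects the sampled D value."""
    return clock_edge + timing.t_pd


ff = FlipFlopTiming(t_su=0.2, t_h=0.1, t_pd=0.3)  # illustrative values only
print(violates_window(ff, clock_edge=10.0, data_change=9.9))  # inside the window
print(violates_window(ff, clock_edge=10.0, data_change=9.5))  # early enough
print(q_valid_time(ff, clock_edge=10.0))
```

A data change 0.1 ns before the edge falls inside the setup window and is flagged, whereas a change 0.5 ns before the edge is safe.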



Figure 1: Timing Diagram



Figure 2: Clocking system

Power consumption has recently become one of the most critical constraints in high-performance integrated circuit design; the clocking system is shown in Figure 2. There are three main sources of power dissipation in the latch: internal power dissipation of the latch, including the power dissipated in switching the output loads; local clock power dissipation, which is the portion of power dissipated in the local clock buffer driving the clock input of the latch [10]; and local data power dissipation, which is the portion of power dissipated in the logic stage driving the data input of the latch.

#### 2. Techniques For Implementing Double Edge Triggering Flip-Flops

A double edge triggered flip-flop (DEFF) design generally uses more clocked transistors than a single edge triggered flip-flop (SEFF) design. However, the DEFF design should not increase the clock load too much; it should aim at saving energy on both the distribution network and the flip-flops. It is preferable to reduce the clock load of a circuit by minimizing the number of clocked transistors [6]. Furthermore, circuits with reduced switching activity are preferable. Low swing capability is very helpful for further reducing the voltage on the clock distribution network for power saving, where applicable. Because voltage scaling reduces power efficiently, cluster voltage scaling (CVS) systems are widely used. The explicit pulse generation scheme for DET is shown in Figures 3(a) and 3(b).



Figure 3(a): Explicit pulse generation of DET



*Figure 3(b): Evaluation of clock pulse*

Most flip-flops are designed to operate on a single clock edge, i.e. either the positive edge or the negative edge. In double edge triggering [8], the flip-flop operates on both clock edges. With this method the opposite clock edge is not wasted and the speed of operation is increased.
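As a behavioural illustration of this idea, the sketch below models an ideal flip-flop that captures data on rising edges only or on both edges. The function names and sample waveforms are assumptions made for illustration; this is not a transistor-level model of any circuit in this paper.

```python
def sample_on_edges(clk, data, edges="rising"):
    """Q sequence of an ideal flip-flop (Q starts at 0).

    clk, data: equal-length lists of 0/1 samples, one per time step.
    edges: "rising" for single edge triggering, "both" for double edge.
    """
    q, out = 0, []
    for i in range(len(clk)):
        prev = clk[i - 1] if i > 0 else clk[0]
        rising = prev == 0 and clk[i] == 1
        falling = prev == 1 and clk[i] == 0
        if rising or (falling and edges == "both"):
            q = data[i]  # capture D at the active edge
        out.append(q)
    return out


def transitions(signal):
    """Number of 0->1 and 1->0 transitions in a sampled signal."""
    return sum(a != b for a, b in zip(signal, signal[1:]))


data = [1, 1, 0, 0, 1, 1, 0, 0, 1]
full_rate_clk = [0, 1, 0, 1, 0, 1, 0, 1, 0]  # single edge, full frequency
half_rate_clk = [0, 1, 1, 0, 0, 1, 1, 0, 0]  # double edge, half frequency

q_set = sample_on_edges(full_rate_clk, data, edges="rising")
q_det = sample_on_edges(half_rate_clk, data, edges="both")
print(q_set == q_det)
print(transitions(full_rate_clk), transitions(half_rate_clk))
```

In this example both flip-flops capture the same data samples, while the half-rate clock makes half as many transitions on the distribution network.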

# 3. Literature Survey Of Existing Systems On CCFF, CDMFF, CPSFF

# 3.1. Conditional Data Mapping Flip-Flop (CDMFF)

The conditional data mapping flip-flop (CDMFF) uses seven clocked transistors, about a 50% reduction in the number of clocked transistors; hence CDMFF uses less power than CCFF and CDFF. This shows the effectiveness of reducing the number of clocked transistors to achieve low power. However, there is redundant clocking capacitance in CDMFF.



Figure 4: Circuit Diagram of CDMFF

When the data remains at 0 or 1, the precharging transistors P1 and P2 keep switching without performing useful computation, resulting in redundant clocking. Clearly, this redundant power consumption needs to be reduced. Further, CDMFF has a floating node on the critical path because its first stage is dynamic. When the clock signal CLK transits from 0 to 1, CLKDB stays at 1 for a short while, which produces an implicit pulse window for evaluation. During that window, both P1 and P2 are off, as shown in Figure 4.

In addition, if D transits from 0 to 1, the pull-down network is disconnected by N3 using the data mapping scheme (N6 turns off N3); if D is 0, the pull-down network is disconnected from GND too [10]. Hence the internal node is connected to neither Vdd nor GND during most pulse windows; it is essentially floating periodically. As feature sizes shrink, an undriven dynamic node becomes more prone to noise. If nearby noise discharges the node, pMOS transistor P3 will be partially on, and a glitch will appear on the output node. In a nanoscale circuit, a glitch not only consumes power but can also propagate to the next stage, which makes the system more vulnerable to noise.

# March, 2013

Hence, CDMFF cannot be used in a noise-intensive environment. Unlike CDMFF, other dynamic flip-flops employ structures that prevent the floating node. For example, SDFF and CDFF have a keeper at the node, while HLFF and CCFF have a transistor connecting the node to Vdd. Both methods serve to increase the noise robustness of the node.

#### 3.2. Clocked Pair Shared Flip-Flop (CPSFF)

CDFF and CCFF use many clocked transistors. CDMFF reduces the number of clocked transistors, but it has redundant clocking as well as a floating node. To ensure an efficient and robust implementation of a low power sequential element, the Clocked Pair Shared flip-flop (CPSFF) is proposed; it uses fewer clocked transistors than CDMFF and overcomes the floating node problem in CDMFF. In the clocked-pair-shared flip-flop, the clocked pair (N3, N4) is shared by the first and second stages. An always-on pMOS, P1, is used to charge the internal node instead of the two clocked precharging transistors (P1, P2) in CDMFF.

Compared with CDMFF, a total of three clocked transistors are removed, so the clock load seen by the clock driver is decreased, resulting in an efficient design. Further, the transistor N7 in the clocked inverter in CDMFF is removed. CPSFF uses four clocked transistors rather than the seven in CDMFF, an approximately 40% reduction in the number of clocked transistors. Furthermore, the internal node is connected to Vdd by the always-on P1, so it is not floating, which enhances the noise robustness of the node. This solves the floating node problem in CDMFF. The always-on P1 is a weak (small) pMOS transistor. This scheme combines pseudo-nMOS [16] with a conditional mapping technique in which a feedback signal, comp, controls nMOS N1. When input D stays at 1 and Q=1, N5 is on and N1 shuts off, avoiding redundant switching activity at node X as well as any short circuit current. pMOS P2 pulls Q up when D transits to 1. The second nMOS branch (N2) is responsible for pulling down the output Q if D=0 and Y=1 when the clock pulse arrives. The pMOS in I1 turns on nMOS N2 when D=0.
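The quoted reduction follows directly from the transistor counts given above (seven clocked transistors in CDMFF, four in CPSFF). A short check, with a helper name chosen here purely for illustration:

```python
def reduction_percent(before: int, after: int) -> float:
    """Percentage reduction when a count drops from `before` to `after`."""
    return 100.0 * (before - after) / before


# Clocked transistor counts as stated in the text: CDMFF = 7, CPSFF = 4.
print(round(reduction_percent(7, 4), 1))  # 42.9
```

The exact figure is about 43%, which the text rounds to approximately 40%.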



Figure 5: Circuit Diagram of CPSFF

Although P1 is always ON, a short circuit occurs only once, when D makes a transition from 0 to 1, and the discharge path is disconnected after two gate delays by comp (turning off N1). After that, if D remains at 1, the discharge path is already disconnected by N1, so there is no short circuit. The clocked-pseudo-nMOS scheme differs from conventional pseudo-nMOS logic in that clocked transistors are used in the pull-down branch. P1, N1, N3, and N4 should be properly sized to ensure a correct noise margin.

Several of the low power techniques described in Section 4.1 can be easily incorporated into the new flip-flop. Unlike CDMFF, low swing is possible for CPSFF since the incoming low voltage clock does not drive pMOS transistors. Low swing voltage clock signals can be connected to the nMOS transistors N3 and N4. In addition, it is easy to build a double edge triggered flip-flop based on the simple clocking structure of CPSFF. Further, CPSFF can be used as a level converter flip-flop automatically, because the incoming clock and data signals only drive nMOS transistors. Inputs are driven by inverters, and the output drives a capacitive load of 14 minimum-sized inverters.

# 4. Proposed Techniques For CPSFF With Double Edge Triggering Flip-Flops

#### 4.1. Power Consumption Reduction Techniques

- Double Edge Triggering.
- Low Swing Voltage.
- Conditional Operation and Clock Gating.
- Reducing Leakage Power.
- Reducing Short Current Power.
- Reducing Capacity Of Clock Load.

These methods are briefly explained as follows:

#### 4.1.1. Double Edge Triggering

Using half the frequency on the clock distribution network saves approximately half of the power consumed by the network. However, the flip-flop must then be triggered on both clock edges. For example, the clock branch shared implicit pulsed flip-flop (CBS-ip DEFF) [15], [16] is a double edge triggered flip-flop. The double clock edge triggering method reduces power by decreasing the frequency term in the power equation.
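The power equation itself is not reproduced in this paper. The sketch below assumes the standard CMOS dynamic power relation, P = α·C·V²·f, which appears to be the equation meant; the capacitance, supply voltage, and frequency values are illustrative assumptions only.

```python
def dynamic_power(activity: float, capacitance: float, voltage: float, frequency: float) -> float:
    """Standard CMOS dynamic power, P = alpha * C * V^2 * f (assumed relation)."""
    return activity * capacitance * voltage ** 2 * frequency


# Illustrative numbers: 1 pF clock load, 1.8 V supply, clock activity of 1.
single_edge = dynamic_power(1.0, 1e-12, 1.8, 200e6)  # full-rate clock
double_edge = dynamic_power(1.0, 1e-12, 1.8, 100e6)  # half-rate clock, same data rate
print(double_edge / single_edge)
```

With everything else held constant, halving the clock frequency halves the power dissipated on the clock network under this model.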

#### 4.1.2.Low Swing Voltage

Using a low swing voltage on the clock distribution network reduces the clocking power consumption, since power is a quadratic function of voltage. To use low swing clock distribution, the flip-flop must be a low swing flip-flop. For example, the low swing double-edge flip-flop (LSDFF) is a low swing flip-flop. In addition, the level converter flip-flop is a natural candidate for a low swing environment. For example, CD-LCFF-ip can be used as a low swing flip-flop since its incoming signals only drive nMOS transistors. The low swing method reduces power consumption by decreasing the voltage.
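Taking the quadratic dependence stated above at face value, the relative saving from a reduced clock swing can be estimated as follows. The function name and voltage values are assumptions for illustration; the actual saving in a real low swing scheme depends on how the reduced swing is generated.

```python
def swing_power_ratio(v_low: float, v_full: float) -> float:
    """Clock power at swing v_low relative to full swing v_full, assuming P ~ V^2."""
    return (v_low / v_full) ** 2


# Illustrative: halving a 1.8 V clock swing to 0.9 V.
print(swing_power_ratio(0.9, 1.8))
```

Under this assumption, a clock swing of half the supply voltage leaves one quarter of the original clock network power.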

There are two ways to reduce the switching activity:

- Conditional operation
- Clock gating

#### 4.1.3. Conditional Operation

In dynamic flip-flops such as the hybrid latch flip-flop (HLFF) [6] and the semi-dynamic flip-flop (SDFF), there are redundant switching activities at the internal node. When the input stays at logic one, the internal node keeps charging and discharging without performing any useful computation. The conditional operation technique is needed to control this redundant switching. For example, in CDFF, a feedback transistor is inserted in the discharging path of the first stage, which turns off the discharging path when the input stays at 1, so the internal node is not discharged at every clock cycle. In CCFF, a clocked NOR gate is used to control an nMOS transistor in the discharging path for the same purpose. The redundant switching activity is removed in both cases. This reduces the power consumption by decreasing the data activity term in the power equation.
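The effect of conditional operation can be illustrated with a simple cycle-level count. This is an abstract sketch under the assumption that, without the feedback, the internal node discharges on every cycle in which D=1, and with the feedback it discharges only when D=1 while the stored value is still 0. The function name and data pattern are hypothetical.

```python
def internal_discharges(data, conditional: bool) -> int:
    """Count internal-node discharge events over a sequence of D values.

    data: one D value (0 or 1) per clock cycle; the stored Q starts at 0.
    conditional: if True, the discharge path is cut once Q already equals 1.
    """
    q, count = 0, 0
    for d in data:
        if d == 1 and (not conditional or q == 0):
            count += 1  # internal node is discharged in this cycle
        q = d           # the flip-flop captures D
    return count


data = [1, 1, 1, 1, 0, 1, 1, 0]
print(internal_discharges(data, conditional=False))  # 6
print(internal_discharges(data, conditional=True))   # 2
```

For this pattern the unconditional flip-flop discharges six times, while the conditional one discharges only on the two 0-to-1 transitions of D.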

#### 4.1.4.Clock Gating

When a certain block is idle, the clock signal to that block can be disabled to save power. A gated master-slave flip-flop has been proposed in the literature for this purpose. Both conditional operation and clock gating reduce power by decreasing the switching activity.
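A minimal behavioural sketch of clock gating is shown below. It uses a plain AND of clock and enable, which is an idealisation assumed here for illustration; practical gating cells usually add a latch so that enable changes cannot create glitches on the gated clock.

```python
def gated_clock(clk, enable):
    """Ideal AND-type clock gating: the block sees the clock only while enabled."""
    return [c & e for c, e in zip(clk, enable)]


def toggles(signal) -> int:
    """Number of transitions in a sampled signal."""
    return sum(a != b for a, b in zip(signal, signal[1:]))


clk = [0, 1, 0, 1, 0, 1, 0, 1]
enable = [1, 1, 1, 1, 0, 0, 0, 0]  # block is idle during the second half

gclk = gated_clock(clk, enable)
print(gclk)
print(toggles(clk), toggles(gclk))
```

The gated clock toggles four times instead of seven, so the idle block's clock load is switched correspondingly less often.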

#### 4.1.5. Reducing Leakage Power

Dual Vt/MTCMOS [4] can be used to reduce the leakage power in standby mode. As feature sizes shrink, the leakage current increases rapidly; the MTCMOS technique, as well as transistor stacking, dynamic body biasing, and supply voltage ramping, can be used to reduce standby leakage power consumption. A data retention flip-flop has also been proposed in the literature.

#### 4.1.6. Reducing Short Current Power

A split path can reduce the short-circuit current power, since the pMOS and nMOS transistors are driven by separate signals.

#### 4.1.7. Reducing Capacity Of Clock Load

80% of non-clocked nodes have a switching activity of less than 0.1. Reducing the power of clocked nodes is therefore important, since a clocked node has 100% activity. One effective approach to low power clocking system design is to reduce the capacitive clock load by minimizing the number of clocked transistors. Any local clock load reduction also decreases the global power consumption. This method reduces power by decreasing the clock capacitance term in the power equation.

### **5. Implementation Of CPSFF With DET**

A double edge triggered flip-flop generally has more clocked transistors than a single edge triggered flip-flop, so this method is best suited to circuits with a reduced number of clocked transistors [17]. In dual edge triggering the flip-flop is triggered on both edges of the clock pulse, so half of the clock operating frequency is sufficient, which reduces the power consumption.



Figure 6: Circuit Diagram of CPSFF with DET

Instead of applying the clock signal directly to the flip-flop, a dual pulse is applied using the dual pulse generator scheme [14] shown in Figure 6. The flip-flop evaluates its output on both edges of the clock.

Delay buffers play a vital role in interactive and non-interactive applications such as IP telephony, interactive voice/video, video conferencing, video-on-demand (VOD), streaming audio/video, and virtual reality. The level of *delay* requirement is determined by the degree of interactivity. For example, interactive voice applications have strict delay requirements, whereas video applications are less demanding. The relaxed *delay* requirements for streaming applications are on the order of seconds. *Delay* requirements are important in satellite communication to *synchronize* the data packets from earth station to satellite and vice versa. The existing delay buffer is a ring counter with the clock gated by a C-element.

#### 6.Simulation Result

It is difficult to apply the low power techniques introduced in the previous section to CDMFF. For example, the clock structure with precharging transistors P1 and P2 in CDMFF [1] makes it difficult to apply double edge triggering. Nor can CDMFF be used in a low swing clock environment: the incoming low swing clock signal cannot drive the pMOS transistors P1 and P2 in the high voltage block (VDDH), because the pMOS transistors will not be turned off by a low swing voltage, resulting in short circuit power consumption.

When D=1 and Q=0 (the previous output state), Qb=1 is applied to the transmission gate, whose output becomes 1, while Q=0 turns transistor N5 OFF. Since N5 is connected to ground and is OFF, it can be neglected. The value 1 is therefore passed to transistor N1, switching it ON. Because the gate of transistor P1 is connected to GND, P1 is always ON, so Vdd (logic 1) is always passed through P1.

When both CLK and CLKDB are HIGH, both transistors N3 and N4 are ON. Because these transistors are connected to ground, they act as the pull-down network and pass the output to ground. When D=1 is applied, its inverted value (0) turns transistor N2 OFF, so that pull-down branch can be neglected. As transistor P1 is always ON, the internal node is not floating, and transistor P2 turns ON. Thus Qn+1=1.

The pulse generator consists of two transmission gates (TGs) and four inverters, as shown in Figure 5. When clk=1, the upper TG is ON and the lower TG is OFF, and the output pulse=0. When clk makes a transition from 1, pulse=1 immediately; the output of inverter I3 becomes '1' after three inverter delays. Similarly, when clk=0, the lower TG is responsible for producing the *pulse* at the negative edge of the clock. The pulse generator is an external circuit and may drive one or more flip-flops. Whenever the pulse is high, the Q output follows the D input.
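The behaviour described above can be approximated by comparing the clock with a delayed copy of itself, which yields a short pulse after every clock transition. This is a behavioural abstraction assumed here for illustration, not a model of the transmission gate circuit; the three-step delay stands in for the three inverter delays, and all names are hypothetical.

```python
def delayed(signal, steps: int):
    """Copy of `signal` delayed by `steps` samples (first value held initially)."""
    return [signal[max(i - steps, 0)] for i in range(len(signal))]


def dual_edge_pulse(clk, delay_steps: int = 3):
    """Pulse that is high for `delay_steps` samples after each clock transition."""
    return [c ^ d for c, d in zip(clk, delayed(clk, delay_steps))]


def pulsed_latch(pulse, data):
    """Q follows D while the pulse is high and holds its value otherwise."""
    q, out = 0, []
    for p, d in zip(pulse, data):
        if p:
            q = d
        out.append(q)
    return out


clk = [0] * 5 + [1] * 5 + [0] * 5   # one rising and one falling edge
data = [1] * 8 + [0] * 7

pulse = dual_edge_pulse(clk)
print(pulse)
print(pulsed_latch(pulse, data))
```

A pulse appears after both the rising edge (index 5) and the falling edge (index 10), and Q is updated from D during each pulse window.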



Figure 7: Simulation result of CDMFF



Figure 8: Simulation result of CPSFF

The waveform of CPSFF is shown in Figure 8. The circuit has input signals for clock and data, and outputs Q and Qb.



Figure 9: Simulation result of DET implemented on CPSFF

As feature sizes shrink, an undriven dynamic node becomes more prone to noise. If nearby noise discharges the node, pMOS transistor P3 will be partially on, and a glitch will appear on the output node [14]. In a nanoscale circuit, a glitch not only consumes power but can also propagate to the next stage, which makes the system more vulnerable to noise. The power comparison is given in Table 1 and charted in Figure 10.

| P-FF                   | CCFF   | CDMFF  | CPSFF  | DET CPSFF |
|------------------------|--------|--------|--------|-----------|
| Number of transistors  | 26     | 22     | 17     | 24        |
| Average power (mW)     | 4.025  | 2.691  | 2.162  | 1.501     |
| Max. power (µW)        | 1.923  | 1.423  | 1.009  | 0.775     |
| Min. power (µW)        | 0.2383 | 0.2067 | 0.1621 | 0.122     |

Table 1: Comparison of power analysis
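The relative savings implied by the average power row of Table 1 can be computed directly. The dictionary and function names below are illustrative; the numbers are the average power values reported in the table.

```python
# Average power values from Table 1 (units as given in the table).
average_power = {
    "CCFF": 4.025,
    "CDMFF": 2.691,
    "CPSFF": 2.162,
    "DET CPSFF": 1.501,
}


def saving_percent(reference: str, candidate: str = "DET CPSFF") -> float:
    """Average power saving of `candidate` relative to `reference`, in percent."""
    ref = average_power[reference]
    return 100.0 * (ref - average_power[candidate]) / ref


for name in ("CCFF", "CDMFF", "CPSFF"):
    print(name, round(saving_percent(name), 1))
```

On these figures, DET CPSFF uses roughly 63% less average power than CCFF, 44% less than CDMFF, and 31% less than CPSFF.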



Figure 10: Power comparison of various FFs

# 7. Conclusion & Future Enhancements

One effective method for low power clocking systems, namely reducing the capacitive clock load by minimizing the number of clocked transistors, is analyzed. Following this approach, a novel DET-CPSFF is proposed, which reduces the number of local clocked transistors by about 40%. In terms of clock driver power consumption, the new CPSFF outperforms prior flip-flop designs by about 24%. Furthermore, several low power techniques, including low swing and double edge clocking, can be incorporated into the new flip-flop to build clocking systems.

Future enhancements can reduce the power further and thereby reduce the cost of the system, while keeping the design simple. The approach can also be applied to further flip-flop topologies.

#### 8. References

1. P. Zhao, J. McNeely, W. Kuang, N. Wang, and Z. Wang, "Design of sequential elements for low power clocking system," IEEE, 2011.
2. H. Kawaguchi and T. Sakurai, "A reduced clock-swing flip-flop (RCSFF) for 63% power reduction," IEEE J. Solid-State Circuits, vol. 33, no. 5, pp. 807–811, May 1998.
3. Chandrakasan, W. Bowhill, and F. Fox, Design of High-Performance Microprocessor Circuits, 1st ed. Piscataway, NJ: IEEE Press, 2001.
4. J. Tschanz, S. Narendra, Z. P. Chen, S. Borkar, M. Sachdev, and V. De, "Comparative delay and energy of single edge-triggered & dual edge-triggered pulsed flip-flops for high-performance microprocessors," in Proc. ISLPED, Huntington Beach, CA, Aug. 2001, pp. 207–212.
5. P. Zhao, J. McNeely, P. Golconda, M. A. Bayoumi, W. D. Kuang, and B. Barcenas, "Low power clock branch sharing double-edge triggered flip-flop," IEEE Trans. Very Large Scale Integr. (VLSI) Syst., vol. 15, no. 3, pp. 338–345, Mar. 2007.
6. L. Kim and S. Kang, "A low-swing clock double edge-triggered flip-flop," IEEE J. Solid-State Circuits, vol. 37, no. 5, pp. 648–652, May 2002.
7. J. Suzuki, K. Odagawa, and T. Abe, "Clocked CMOS calculator circuitry," IEEE J. Solid-State Circuits, vol. SC-8, pp. 462–469, Dec. 1973.
8. N. Nedovic, V. G. Oklobdzija, and W. W. Walker, "A clock skew absorbing flip-flop," in 2003 IEEE ISSCC, San Francisco, Feb. 2003.
9. J.-S. Wang, "A new true-single-phase-clocked double-edge-triggered flip-flop for low-power VLSI design," in Proc. IEEE ISCAS, 1997, pp. 1896–1899.
10. A. Jagadeeswaran and C. N. Marimuthu, "Power optimization techniques for sequential elements using pulse triggered flip-flops with SVL logic," IOSR-JVSP, vol. 1, no. 4, pp. 31–36, Nov.–Dec. 2012.
11. W. Zhao and Y. Cao, "New generation of predictive technology model for sub-45nm design exploration," in IEEE Intl. Symp. on Quality Electronic Design, 2006.
12. H. Partovi, R. Burd, U. Salim, F. Weber, L. DiGregorio, and D. Draper, "Flow-through latch and edge-triggered flip-flop hybrid elements," in IEEE International Solid-State Circuits Conference, 1996.
13. N. Nedovic, M. Aleksic, and V. G. Oklobdzija, "Conditional techniques for low power consumption flip-flops," in ICECS '01, vol. 2, pp. 803–806.
14. D. Markovic and R. W. Brodersen, "Analysis and design of low-energy flip-flops," in ISLPED '01, pp. 52–55.
15. B. Nikolic et al., "Improved sense-amplifier-based flip-flop: design and measurements," IEEE J. Solid-State Circuits, vol. 35, no. 6, pp. 876–884, June 2000.
16. M. Cooke et al., "Energy recovery clocking scheme and flip-flops for ultra low-energy applications," in ISLPED '03, pp. 54–59.
17. K. H. Cheng and Y. H. Lin, "A dual-pulse-clock double edge triggered flip-flop for low voltage and high speed application," in ISCAS '03, vol. 5, pp. 425–428.