# Generalization of Logic Picture-Based Power Estimation Tool

M.H. Amin, M.F. Fouda, A.M. Eltantawy, Elect. & Comm. Dept., Faculty of Engineering, Cairo University, Giza, Egypt, engm.hosny, mfatouh, a.shafiey@aucegypt.edu

M.B. Abdelhalim, College of Computing & IT, AASTMT, Cairo, Egypt, mbakr@ieee.org

H. H. Amer, Electronics Eng. Dept., American Univ. in Cairo, Cairo, Egypt, hamer@aucegypt.edu

#### *Abstract*— in this paper, the Logic Picture-based power estimation tool is extended to include more features. The Logic Picture is a technique used for CMOS dynamic power estimation in both combinational and sequential logic circuits. This technique proved to be more accurate and less time consuming compared to other techniques. This work enhances the tool by calculating the maximum power consumption in sequential circuits. Furthermore, it is extended to include all types of Flip-Flops and takes into account the power consumption in the internal nodes of these Flip-Flops. Finally, it is shown how to incorporate all the features and enhancements for Design Space Exploration in sequential circuits.

*Index Terms*— Maximum Power Estimation, Dynamic Power Estimation, Logic Picture, Toggle Rate, Flip-Flop Power Consumption, Power-Aware Design-Space Exploration.

## I. INTRODUCTION

With the smaller features in today's CMOS integrated circuits and the higher transistor density per chip, power consumption has become a very important factor in the design cycle. Therefore, it is important to have a CAD tool that would accurately calculate the average and maximum dynamic power consumption in a digital CMOS circuit [1]. In [2, 3], the concept of Logic Pictures (LPs) was introduced for calculating the exact average dynamic power consumption of combinational and sequential circuits. Moreover, the maximum power consumption in combinational circuits was obtained. However, this approach focused only on D-type flip-flop circuits.

In this paper, several issues are studied to enhance the power estimation tool proposed in [2, 3]. First, it is shown how to calculate the maximum power consumption in sequential circuits. Next, the approach is extended to other types of flip-flops, namely J-K and T flip-flops. Then, the power consumed inside the flip-flops is modeled and estimated. Finally, all modifications are integrated to show how the tool is used for Design Space Exploration: different realizations of the same FSM are compared in terms of average and maximum power consumption to find the best implementation.

This paper is organized as follows: Section II discusses the maximum power estimation in sequential circuits. Next, Section III extends the tool for other types of Flip-Flops, namely JK and T Flip-Flops. Power consumption inside all Flip-Flop types is then studied in Section IV. Section V shows how all the enhancements are integrated to allow for efficient Design Space Exploration. Finally, conclusions are drawn in Section VI.

## II. MAXIMUM POWER IN SEQUENTIAL CIRCUITS

A search is conducted for the LP transition that consumes the maximum power in the circuit. This maximum power is proportional to the number of nodes that switch during the transition, each weighted by its capacitance. For simplicity, node capacitances can be approximated by node fan-out [4, 5]. In the example shown in Fig. 1, all nodes have a fan-out of one except the output node (Y), which has a fan-out of 2. The resulting fan-out scaled number of switching nodes is given in Table 1 for all LP transitions.

Table 1 shows that the maximum number of fan-out scaled node transitions is four. This value occurs in multiple LP transitions, e.g., from LP2 to LP4 and from LP6 to LP2. These are the transitions that produce the maximum instantaneous power consumption. To account for the maximum power dissipation over a long period of time (i.e., energy consumption, which affects power supply design and heat-sink requirements), it is necessary to find which of these transitions occur more frequently than others. Hence, the probability of LP transitions as well as the steady-state probability of LPs should be taken into account. The resulting Probability Weighted Node Switching (PWNS) can be calculated by equation (1).

Table 1: Fan-out scaled switching nodes

| To<br>From | LP1 | LP2 | LP3 | LP4 | LP5 | LP6 |
|------------|-----|-----|-----|-----|-----|-----|
| LP1        | 0   | 2   | 2   | 0   | 0   | 0   |
| LP2        | 0   | 0   | 0   | 4   | 2   | 4   |
| LP3        | 0   | 0   | 0   | 4   | 4   | 4   |
| LP4        | 0   | 0   | 0   | 0   | 2   | 2   |
| LP5        | 0   | 0   | 0   | 2   | 0   | 2   |
| LP6        | 2   | 4   | 4   | 0   | 0   | 0   |

Table 2: Probability weighted node switching (PWNS)

| To<br>From | LP1 | LP2    | LP3   | LP4   | LP5   | LP6    |
|------------|-----|--------|-------|-------|-------|--------|
| LP1        | -   | -      | -     | -     | -     | -      |
| LP2        | -   | -      | -     | 0.125 | -     | 0.0625 |
| LP3        | -   | -      | -     | 0.25  | 0.125 | 0.125  |
| LP4        | -   | -      | -     | -     | -     | -      |
| LP5        | -   | -      | -     | -     | -     | -      |
| LP6        | -   | 0.1875 | 0.375 | -     | -     | -      |

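$$\text{PWNS}(LP_i, LP_j) = \text{Fan-Out Scaled Switching Nodes}(LP_i, LP_j) \times \text{Transition Probability}(LP_i, LP_j)_{\text{Steady-State}} \tag{1}$$

$$\text{Transition Probability}(LP_i, LP_j)_{\text{Steady-State}} = \text{Prob}(LP_j \mid LP_i) \times \text{Prob}(LP_i)_{\text{Steady-State}} \tag{2}$$

Here $LP_i$ is the present LP (the "From" row of Tables 1 and 2) and $LP_j$ is the next LP (the "To" column).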

More details about (2) can be found in [3]. Table 2 lists the PWNS of all LP transitions that consume the maximum power. It shows that the transition from $LP_6$ to $LP_3$ has the highest PWNS and is therefore the most frequent maximum-power transition.
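The search described in this section can be sketched in a few lines of Python. This is only an illustration and not the tool's implementation: the matrix and dictionary are transcribed from Table 1 and Table 2, and the names `FANOUT_SCALED`, `PWNS`, and `max_transitions` are hypothetical.

```python
# Table 1: fan-out scaled switching nodes (rows = "from" LP, columns = "to" LP).
LPS = ["LP1", "LP2", "LP3", "LP4", "LP5", "LP6"]
FANOUT_SCALED = [
    [0, 2, 2, 0, 0, 0],
    [0, 0, 0, 4, 2, 4],
    [0, 0, 0, 4, 4, 4],
    [0, 0, 0, 0, 2, 2],
    [0, 0, 0, 2, 0, 2],
    [2, 4, 4, 0, 0, 0],
]

# Table 2: PWNS of the transitions that reach the maximum value.
PWNS = {
    ("LP2", "LP4"): 0.125, ("LP2", "LP6"): 0.0625,
    ("LP3", "LP4"): 0.25, ("LP3", "LP5"): 0.125, ("LP3", "LP6"): 0.125,
    ("LP6", "LP2"): 0.1875, ("LP6", "LP3"): 0.375,
}


def max_transitions(table, names):
    """Return the peak table entry and every (from, to) pair that reaches it."""
    peak = max(max(row) for row in table)
    pairs = [
        (names[i], names[j])
        for i, row in enumerate(table)
        for j, value in enumerate(row)
        if value == peak
    ]
    return peak, pairs


peak, candidates = max_transitions(FANOUT_SCALED, LPS)
# Among the maximum-power transitions, pick the one with the largest PWNS.
dominant = max(candidates, key=lambda pair: PWNS[pair])
# Equation (1) rearranged: transition probability = PWNS / fan-out scaled count.
implied_probability = PWNS[dominant] / peak

print(peak, len(candidates), dominant, implied_probability)
```

The sketch finds a peak of four, reached by the seven transitions listed in Table 2, with LP6 to LP3 as the dominant one.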



Fig. 1 Example circuit based on D FF



Fig. 2 Example state diagram.

## III. EXTENDING THE TOOL FOR OTHER TYPES OF FLIP-FLOPS

Previous work in [3] focused only on sequential circuits implemented using the D-FF; this is the special case where the output of the FF (the next state) has the same value as the input, and hence they have the same toggle rate. In this section, the tool is extended to other flip-flop types (namely J-K and T FFs) to make the tool applicable for any sequential circuit implementation because, in general, the toggle rate of the next state is different from that of the FF input and must be calculated separately.

Table 3: State Table for the JK FF-based circuit

| Present<br>State | Inputs |    | Internal Nodes |    |    | Next<br>State |
|------------------|--------|----|----------------|----|----|---------------|
| Y                | X1     | X0 | N0             | N1 | N2 | N3            |
| 0                | 0      | 0  | 0              | 0  | 1  | 0             |
| 0                | 0      | 1  | 1              | 0  | 1  | 1             |
| 0                | 1      | 0  | 1              | 0  | 1  | 1             |
| 0                | 1      | 1  | 1              | 1  | 1  | 1             |
| 1                | 0      | 0  | 0              | 0  | 0  | 1             |
| 1                | 0      | 1  | 1              | 0  | 0  | 1             |
| 1                | 1      | 0  | 1              | 0  | 0  | 1             |
| 1                | 1      | 1  | 1              | 1  | 1  | 0             |

The FSM example in [3] (that was used to build the circuit shown in Fig. 1) is shown in Fig. 2. It is used again to realize circuits using the JK-FF (Fig. 3) and the T-FF (Fig. 4). The improved tool is then used to calculate the state tables shown in Table 3 and Table 4. The LPs for the J-K implementation and those of the T implementation are shown in Table 5 and Table 6, respectively. Using the steady-state probabilities of the circuit's states, Prob(S0)=0.25 and Prob(S1)=0.75 [3], the LP probabilities are obtained as shown in Table 7. The transition probabilities between LPs are then calculated to finally get the toggle rate of every circuit node in Table 8. Monte Carlo simulations were performed to verify the results, and the obtained toggle rates were found to be identical to the toggle rates calculated using the proposed tool.

Table 4: State Table for the T FF-based circuit

| Present<br>State | Inputs |    | Internal Nodes |    |    |    | Next<br>State |
|------------------|--------|----|----------------|----|----|----|---------------|
| Y                | X1     | X0 | N0             | N1 | N2 | N3 | N4            |
| 0                | 0      | 0  | 0              | 0  | 0  | 0  | 0             |
| 0                | 0      | 1  | 0              | 1  | 0  | 1  | 1             |
| 0                | 1      | 0  | 0              | 0  | 1  | 1  | 1             |
| 0                | 1      | 1  | 1              | 1  | 1  | 1  | 1             |
| 1                | 0      | 0  | 0              | 0  | 0  | 0  | 1             |
| 1                | 0      | 1  | 0              | 0  | 0  | 0  | 1             |
| 1                | 1      | 0  | 0              | 0  | 0  | 0  | 1             |
| 1                | 1      | 1  | 1              | 0  | 0  | 1  | 0             |

Table 5: LPs for the JK FF-based circuit

|                | LP1  | LP2  | LP3  | LP4  | LP5  | LP6  |
|----------------|------|------|------|------|------|------|
| Node<br>values | 0010 | 1011 | 1111 | 0001 | 1001 | 1110 |
| State          | S0   | S0   | S0   | S1   | S1   | S1   |

Table 6: LPs for the T FF-based circuit

|                | LP1   | LP2   | LP3   | LP4   | LP5   | LP6   |
|----------------|-------|-------|-------|-------|-------|-------|
| Node<br>values | 00000 | 01011 | 00111 | 11111 | 00001 | 10010 |
| State          | S0    | S0    | S0    | S0    | S1    | S1    |

Table 7: LP Probabilities for the (a) JK FF-based circuit and (b) T FF-based circuit

| (a) JK FF-based circuit |        | (b) T FF-based circuit |        |
|-------------------------|--------|------------------------|--------|
| LPs                     | Prob   | LPs                    | Prob   |
| LP1@S0                  | 0.0625 | LP1@S0                 | 0.0625 |
| LP2@S0                  | 0.125  | LP2@S0                 | 0.0625 |
| LP3@S0                  | 0.0625 | LP3@S0                 | 0.0625 |
| LP4@S1                  | 0.1875 | LP4@S0                 | 0.0625 |
| LP5@S1                  | 0.375  | LP5@S1                 | 0.5625 |
| LP6@S1                  | 0.1875 | LP6@S1                 | 0.1875 |
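As an illustration of how the LP probabilities follow from the state probabilities, the sketch below rebuilds column (a) of Table 7 from the rows of Table 3. It assumes that the four input vectors (X1, X0) are equiprobable, which is consistent with the tabulated values but is an assumption of this sketch; the names `STATE_PROB`, `INPUT_PROB`, and `lp_probabilities` are hypothetical.

```python
from collections import defaultdict

STATE_PROB = {0: 0.25, 1: 0.75}  # Prob(S0), Prob(S1) as quoted from [3]
INPUT_PROB = 0.25                # assumption: four equiprobable (X1, X0) vectors

# Table 3: (Y, X1, X0) -> node values N0 N1 N2 N3 of the JK FF-based circuit.
JK_STATE_TABLE = {
    (0, 0, 0): "0010", (0, 0, 1): "1011", (0, 1, 0): "1011", (0, 1, 1): "1111",
    (1, 0, 0): "0001", (1, 0, 1): "1001", (1, 1, 0): "1001", (1, 1, 1): "1110",
}


def lp_probabilities(state_table):
    """Sum Prob(state) * Prob(input) over all rows that share a logic picture."""
    probs = defaultdict(float)
    for (y, _x1, _x0), nodes in state_table.items():
        probs[(nodes, y)] += STATE_PROB[y] * INPUT_PROB
    return dict(probs)


jk_lp_prob = lp_probabilities(JK_STATE_TABLE)
for (nodes, state), p in jk_lp_prob.items():
    print(f"{nodes}@S{state}: {p}")
```

Rows with identical node values collapse into one LP, which is why LP2 and LP5 carry twice the probability of their neighbours.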

Table 8: Toggle Rates

| (a) JK FF-based circuit |             |
|-------------------------|-------------|
| Node                    | Toggle Rate |
| N0                      | 0.375       |
| N1                      | 0.375       |
| N2                      | 0.28125     |
| N3                      | 0.375       |

| (b) T FF-based circuit |             |
|------------------------|-------------|
| Node                   | Toggle Rate |
| N0                     | 0.375       |
| N1                     | 0.25        |
| N2                     | 0.25        |
| N3                     | 0.375       |
| N4                     | 0.375       |
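The toggle-rate calculation can be sketched as follows. This is a minimal illustration under stated assumptions, not the tool itself: the last node of each LP (N3 for the JK circuit, N4 for the T circuit) is taken as the next state, successive input vectors are assumed independent, and the function name `toggle_rates` is hypothetical. The LP data come from Tables 5, 6, and 7.

```python
STATE_PROB = {0: 0.25, 1: 0.75}  # Prob(S0), Prob(S1) as quoted from [3]

# (node values, present state, LP probability) from Tables 5, 6 and 7.
JK_LPS = [
    ("0010", 0, 0.0625), ("1011", 0, 0.125), ("1111", 0, 0.0625),
    ("0001", 1, 0.1875), ("1001", 1, 0.375), ("1110", 1, 0.1875),
]
T_LPS = [
    ("00000", 0, 0.0625), ("01011", 0, 0.0625), ("00111", 0, 0.0625),
    ("11111", 0, 0.0625), ("00001", 1, 0.5625), ("10010", 1, 0.1875),
]


def toggle_rates(lps):
    """Accumulate, per node, the probability of every LP transition that flips it."""
    n_nodes = len(lps[0][0])
    rates = [0.0] * n_nodes
    for src_nodes, _src_state, src_prob in lps:
        next_state = int(src_nodes[-1])  # assumption: last node is the next state
        for dst_nodes, dst_state, dst_prob in lps:
            if dst_state != next_state:
                continue  # this transition cannot occur
            # Prob(src) * Prob(dst | state reached from src)
            p = src_prob * dst_prob / STATE_PROB[dst_state]
            for k in range(n_nodes):
                if src_nodes[k] != dst_nodes[k]:
                    rates[k] += p
    return rates


jk_rates = toggle_rates(JK_LPS)
t_rates = toggle_rates(T_LPS)
print(jk_rates)
print(t_rates)
```

Under these assumptions the sketch reproduces the values of Table 8 for both realizations.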



## IV. POWER ESTIMATION INSIDE FLIP-FLOPS

The logic picture technique for estimating the average and maximum instantaneous power consumption in sequential circuits handles the different types of Flip-Flops as black boxes [2, 3]. However, these Flip-Flops contain nodes that may toggle with any change in either the input or the clock. This section addresses this limitation by examining the behavior of the internal nodes of the different types of Flip-Flops.

To simplify the analysis, it is assumed that Flip-Flops work in normal operation mode, i.e. clear and set signals are deactivated. Also, the inputs of the Flip-Flops change exactly at the active edge of the clock.

The problem can be formulated as follows: it is required to obtain the internal switching activity in terms of the probabilities of the logic pictures of the external nodes, which are the outputs and the inputs of the Flip-Flop. The main idea is that the internal structure of the Flip-Flop can be considered a regular sequential circuit, but with one input of deterministic behavior (CLK) and internal nodes with feedback connections. Hence, the same concept of Logic Pictures is applied to this circuit with small modifications. The Logic Pictures inside the Flip-Flop are called Internal Logic Pictures (ILPs).

The switching of the internal nodes occurs at two instants (the two edges of the clock). Figure 5 shows the switching activity on the internal nodes of the D Flip-Flop shown in Figure 6 [6].

Table 9: Stage 1 D Flip-Flop ILP Table

|      | Qold | D | S1 | S2 | S3 | S4 | Qbar | Q |
|------|------|---|----|----|-----------|----|------|---|
| ILP1 | 0    | 0 | 0  | 1  | 1         | 1  | 1    | 0 |
| ILP2 | 0    | 1 | 1  | 1  | 1         | 0  | 1    | 0 |
| ILP3 | 1    | 0 | 0  | 1  | 1         | 1  | 0    | 1 |
| ILP4 | 1    | 1 | 1  | 1  | 1         | 0  | 0    | 1 |

An off-line analysis is performed to obtain the total transitions of the internal nodes, i.e., the sum of the transitions at both clock edges. The procedure is summarized in the following steps, each illustrated on the D Flip-Flop of Figure 6. The same procedure has also been applied to the JK and T Flip-Flops; the JK Flip-Flop implementation used can be found in [7].



Fig. 5 Timing diagram example for the internal nodes of 74LS74 D-FF [6]

Fig. 6 74LS74 D-FF internal structure [6].



1. Construct an ILP table of all possible logic pictures inside the Flip-Flop at stage 1 (CLK=0 for a positive edge triggered FF and CLK=1 for a negative edge triggered FF). This can be done either by simulation or by hand analysis. Table 9 lists the internal logic pictures of the D Flip-Flop example in Figure 5.
2. Get the transition of each node at the active edge (the positive edge in this example). This results in the 1<sup>st</sup> transition vector of each ILP. Table 10 shows the ILP table of the transitional stage. Different nodes may not switch instantaneously to their next values; however, each node switches only once before reaching a final steady-state ILP.
3. Get all the possible next ILPs at stage 1 again, and get the 2<sup>nd</sup> transition vector from the transitional state to each of these ILPs. The sum of the two vectors gives the total transitions when moving from one ILP to another. Table 11 summarizes the transition vectors between the ILPs.
4. Finally, the total number of transitions of the internal nodes, as shown in Table 11, can be calculated in terms of the probabilities of the inputs and outputs. This is the only step performed at run time; the other steps must be performed offline for every supported type of Flip-Flop. Equation (3) is used to calculate the switching activity of any internal node.
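Steps 1 to 3 can be illustrated with a short Python sketch for the D Flip-Flop. It rests on an interpretation of Table 10 that should be treated as an assumption: after the active edge the new D value may be 0 or 1, and the S4 entry depends on that new value. The node order (S1, S2, S3, S4, Q) and the names `STAGE1`, `AFTER_EDGE`, and `transition_vectors` are likewise choices made for this sketch.

```python
# Stage-1 ILPs from Table 9, keyed by (Qold, D); values are (S1, S2, S3, S4, Q).
STAGE1 = {
    (0, 0): (0, 1, 1, 1, 0),  # ILP1
    (0, 1): (1, 1, 1, 0, 0),  # ILP2
    (1, 0): (0, 1, 1, 1, 1),  # ILP3
    (1, 1): (1, 1, 1, 0, 1),  # ILP4
}

# Transitional state after the active edge (Table 10, as interpreted above):
# AFTER_EDGE[sampled D][new D] -> (S1, S2, S3, S4, Q).
AFTER_EDGE = {
    0: {0: (0, 1, 0, 1, 0), 1: (0, 1, 0, 1, 0)},
    1: {0: (1, 0, 1, 1, 1), 1: (1, 0, 1, 0, 1)},
}


def diff(a, b):
    """Transition vector: 1 where a node changes value, 0 otherwise."""
    return tuple(int(x != y) for x, y in zip(a, b))


def transition_vectors(src, dst):
    """Return the (1st, 2nd, total) transition vectors for the move src -> dst."""
    _qold, d = src
    _next_qold, next_d = dst
    mid = AFTER_EDGE[d][next_d]
    first = diff(STAGE1[src], mid)    # at the active (positive) edge
    second = diff(mid, STAGE1[dst])   # at the following (negative) edge
    total = tuple(a + b for a, b in zip(first, second))
    return first, second, total


# After the edge Q equals the sampled D, so the next ILP must have Qold == D.
for src in STAGE1:
    for dst in STAGE1:
        if dst[0] == src[1]:
            print(src, "->", dst, transition_vectors(src, dst))
```

With this reading, the eight reachable ILP pairs yield the same vectors as those listed in Table 11.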

Table 11: Total transitions

| Possible<br>Transitions | @+ve edge   | @-ve edge   | Total       |
|-------------------------|-------------|-------------|-------------|
| ILP1 >> ILP1            | [0 0 1 0 0] | [0 0 1 0 0] | [0 0 2 0 0] |
| ILP1 >> ILP2            | [0 0 1 0 0] | [1 0 1 1 0] | [1 0 2 1 0] |
| ILP2 >> ILP3            | [0 1 0 1 1] | [1 1 0 0 0] | [1 2 0 1 1] |
| ILP2 >> ILP4            | [0 1 0 0 1] | [0 1 0 0 0] | [0 2 0 0 1] |
| ILP3 >> ILP1            | [0 0 1 0 1] | [0 0 1 0 0] | [0 0 2 0 1] |
| ILP3 >> ILP2            | [0 0 1 0 1] | [1 0 1 1 0] | [1 0 2 1 1] |
| ILP4 >> ILP3            | [0 1 0 1 0] | [1 1 0 0 0] | [1 2 0 1 0] |
| ILP4 >> ILP4            | [0 1 0 0 0] | [0 1 0 0 0] | [0 2 0 0 0] |



where  

$$P(ILP_i) = P(S_k) \times P(\text{FF input vector} \mid S_k) \tag{4}$$

Note that $S_k$ is defined as the state that includes $ILP_i$. Moreover, the above procedure is repeated for all FFs in any circuit under study.

Table 12: Toggle rates of D-FF internal nodes

| Signal              | Toggle Rate |
|---------------------|-------------|
| S1                  | 0.375       |
| S2                  | 1.5         |
| S3                  | 0.5         |
| S4                  | 0.375       |
| S5                  | 0.375       |
| Q (internal effect) | 0.375       |

To verify the validity of this procedure, it has been applied to the D, JK, and T Flip-Flop implementations of the example finite state machine to obtain the switching activity of the internal nodes of the Flip-Flops. Monte Carlo simulation has been performed to verify the results. Table 12 lists the toggle rates of the internal nodes for the D Flip-Flop implementation of the finite state machine.
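The run-time step can be sketched as a probability-weighted sum of the total transition vectors of Table 11. The weighting used here, Prob(ILP_i) times the probability of the next FF input, is an assumption of this sketch rather than a transcription of the tool. It also assumes equiprobable circuit inputs, so that D = 1 with probability 0.75 in either state of the example FSM, and the names `P_D1`, `TOTALS`, `ilp_prob`, and `internal_toggle_rates` are hypothetical.

```python
STATE_PROB = {0: 0.25, 1: 0.75}  # Prob(S0), Prob(S1) as quoted from [3]
P_D1 = {0: 0.75, 1: 0.75}        # assumption: Prob(D = 1 | present state)

# Table 11 totals, keyed by ((Qold, D) of source ILP, (Qold, D) of next ILP).
TOTALS = {
    ((0, 0), (0, 0)): (0, 0, 2, 0, 0),
    ((0, 0), (0, 1)): (1, 0, 2, 1, 0),
    ((0, 1), (1, 0)): (1, 2, 0, 1, 1),
    ((0, 1), (1, 1)): (0, 2, 0, 0, 1),
    ((1, 0), (0, 0)): (0, 0, 2, 0, 1),
    ((1, 0), (0, 1)): (1, 0, 2, 1, 1),
    ((1, 1), (1, 0)): (1, 2, 0, 1, 0),
    ((1, 1), (1, 1)): (0, 2, 0, 0, 0),
}


def prob_d(d, state):
    """Probability that the FF input equals d while the FSM is in `state`."""
    return P_D1[state] if d else 1.0 - P_D1[state]


def ilp_prob(ilp):
    """Equation (4): Prob(state) * Prob(FF input vector | state)."""
    qold, d = ilp
    return STATE_PROB[qold] * prob_d(d, qold)


def internal_toggle_rates():
    """Weight each total transition vector by the probability of its ILP move."""
    rates = [0.0] * 5
    for (src, dst), total in TOTALS.items():
        next_state, next_d = dst
        p = ilp_prob(src) * prob_d(next_d, next_state)
        for k, transitions in enumerate(total):
            rates[k] += p * transitions
    return rates


rates = internal_toggle_rates()
print(rates)
```

Under these assumptions the five vector positions evaluate to 0.375, 1.5, 0.5, 0.375, and 0.375, in line with the first five rows of Table 12.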

The switching of the internal nodes also must be considered in the calculation of the maximum instantaneous power and its related parameters discussed in Section II.

## V. DESIGN SPACE EXPLORATION

Based on the previous results, the proposed tool can be used for Design Space Exploration (DSE). For a given FSM, the tool can determine which type of FF is better from the point of view of the average switching power consumption ($P_{sw}$). Since $P_{sw} = \frac{1}{2} V_{dd}^2 f_{clk} \sum C_i \alpha_i$, where $\alpha_i$ is the toggle rate of node $i$ and $C_i$ is the node capacitance, which is proportional to the fan-out of node $i$ ($F_i$) [4, 5], the switching power is proportional to $\sum F_i \alpha_i$ because all other parameters are constant for a specific technology. Interconnect capacitance could be considered for power estimation; however, it depends on the circuit layout and does not affect the toggle rate estimation methodology.

As an example, the different realizations of the finite state machine in Fig. 2 are explored. **First, the Flip-Flops are considered black boxes.** In all the above circuits, the fan-out is 1 for all nodes except node Y in Fig. 1 and node N4 in Fig. 4, which have a fan-out of 2. This gives $\sum F_i \alpha_i = 2.125$ for Fig. 1, $\sum F_i \alpha_i = 2$ for Fig. 4, and $\sum F_i \alpha_i = 1.40625$ for Fig. 3. Therefore, the JK-FF is the best flip-flop selection for this design when the Flip-Flops are treated as black boxes.
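The fan-out weighted sums quoted above can be checked with a short sketch that uses the toggle rates of Table 8. The helper `switching_power` shows how an absolute figure would follow from the $P_{sw}$ expression; the supply voltage, clock frequency, and unit capacitance passed to it are purely hypothetical placeholders, as are all function and variable names.

```python
# Toggle rates from Table 8 and fan-outs as stated in the text.
JK_TOGGLE = {"N0": 0.375, "N1": 0.375, "N2": 0.28125, "N3": 0.375}
JK_FANOUT = {"N0": 1, "N1": 1, "N2": 1, "N3": 1}

T_TOGGLE = {"N0": 0.375, "N1": 0.25, "N2": 0.25, "N3": 0.375, "N4": 0.375}
T_FANOUT = {"N0": 1, "N1": 1, "N2": 1, "N3": 1, "N4": 2}


def weighted_activity(toggle, fanout):
    """Sum of F_i * alpha_i over all nodes."""
    return sum(fanout[node] * rate for node, rate in toggle.items())


def switching_power(activity, vdd, f_clk, c_unit):
    """P_sw = 1/2 * Vdd^2 * f_clk * sum(C_i * alpha_i), with C_i = c_unit * F_i."""
    return 0.5 * vdd ** 2 * f_clk * c_unit * activity


jk_activity = weighted_activity(JK_TOGGLE, JK_FANOUT)
t_activity = weighted_activity(T_TOGGLE, T_FANOUT)
print(jk_activity, t_activity)

# Hypothetical technology values, for illustration only.
print(switching_power(jk_activity, vdd=1.2, f_clk=100e6, c_unit=1e-15))
```

Because the technology-dependent factors are common to all realizations, comparing `jk_activity` with `t_activity` is enough to rank them.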

From the point of view of the maximum switching power consumption, Table 1 shows that the maximum value of fan-out scaled switching nodes is **four** for the D-FF based realization. The enhanced tool produces the same value for the JK-FF based realization and a larger value (6 in this case) for the T-FF based realization.
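The maximum values for the JK-FF and T-FF realizations can be reproduced from the LPs of Tables 5 and 6 with the sketch below. It assumes that the last node of each LP is the next state, so that a transition is only possible when the destination LP belongs to that state; the function name `max_fanout_scaled_switching` is hypothetical.

```python
# LP name -> (node values, present state), from Tables 5 and 6.
JK_LPS = {
    "LP1": ("0010", 0), "LP2": ("1011", 0), "LP3": ("1111", 0),
    "LP4": ("0001", 1), "LP5": ("1001", 1), "LP6": ("1110", 1),
}
T_LPS = {
    "LP1": ("00000", 0), "LP2": ("01011", 0), "LP3": ("00111", 0),
    "LP4": ("11111", 0), "LP5": ("00001", 1), "LP6": ("10010", 1),
}


def max_fanout_scaled_switching(lps, fanout):
    """Return the peak fan-out scaled switching and the transitions reaching it."""
    peak, pairs = 0, []
    for src, (src_nodes, _src_state) in lps.items():
        next_state = int(src_nodes[-1])  # assumption: last node is the next state
        for dst, (dst_nodes, dst_state) in lps.items():
            if dst_state != next_state:
                continue  # this transition cannot occur
            # Add the fan-out of every node whose value changes.
            cost = sum(f for a, b, f in zip(src_nodes, dst_nodes, fanout) if a != b)
            if cost > peak:
                peak, pairs = cost, [(src, dst)]
            elif cost == peak and cost > 0:
                pairs.append((src, dst))
    return peak, pairs


jk_peak, jk_pairs = max_fanout_scaled_switching(JK_LPS, [1, 1, 1, 1])
t_peak, t_pairs = max_fanout_scaled_switching(T_LPS, [1, 1, 1, 1, 2])
print(jk_peak, jk_pairs)
print(t_peak, t_pairs)
```

In this sketch the JK peak of 4 comes from the LP4 to LP6 transition, while the T peak of 6 comes from LP1 to LP4, where every node toggles.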

However, if the internal nodes of the Flip-Flops are considered, the results are different. The D-FF is then the best choice for this simple FSM, with $\sum F_i \alpha_i = 7.675$. In addition, it has the lowest maximum fan-out scaled node switching. Table 13 summarizes the results for both the average power and the maximum instantaneous power for all the discussed realizations of the FSM example.

Table 13: Power estimation for different FSM realizations

| Realization | External<br>power | Max fan-out scaled<br>node switching<br>(external) | Overall<br>power | Max fan-out scaled<br>node switching<br>(overall) |
|-------------|-------------------|----------------------------------------------------|------------------|---------------------------------------------------|
| D-FF        | 2.125             | 4                                                  | 7.675            | 12                                                |
| JK-FF       | 1.40625           | 4                                                  | 8.40625          | 16                                                |
| T-FF        | 2                 | 6                                                  | 9                | 19                                                |
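The selection step of the Design Space Exploration can be sketched as a simple minimum search over the rows of Table 13. The dictionary keys and the function `best_realization` are hypothetical names chosen for this illustration; the numbers are transcribed from the table.

```python
# Table 13, one entry per realization.
TABLE_13 = {
    "D-FF": {"external_power": 2.125, "external_peak": 4,
             "overall_power": 7.675, "overall_peak": 12},
    "JK-FF": {"external_power": 1.40625, "external_peak": 4,
              "overall_power": 8.40625, "overall_peak": 16},
    "T-FF": {"external_power": 2, "external_peak": 6,
             "overall_power": 9, "overall_peak": 19},
}


def best_realization(metric):
    """Return the realization with the smallest value of the given metric."""
    return min(TABLE_13, key=lambda name: TABLE_13[name][metric])


for metric in ("external_power", "overall_power", "overall_peak"):
    print(metric, "->", best_realization(metric))
```

The ranking changes with the metric: the JK-FF wins on external power, whereas the D-FF wins once the internal nodes are included.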

## VI. CONCLUSIONS

In this paper, it was shown how to use the concept of Logic Pictures to accurately calculate the maximum power consumption in sequential circuits. Furthermore, the concept was used to estimate the power of realizations based on different Flip-Flop types. Finally, the concept was extended to estimate the power consumed inside the Flip-Flops. These enhancements allow for efficient Design Space Exploration with respect to different realizations of sequential circuits.

Even though this technique is illustrated using a simple sequential circuit, it is important to note that it can be easily applied to larger and more complex circuits.

### REFERENCES

[1] M. M. Mano and C. R. Kime, "Logic and Computer Design Fundamentals", 4<sup>th</sup> Edition, Prentice Hall, 2007.

[2] M. F. Fouda, M. B. Abdelhalim, and H. H. Amer, "Average and Maximum Power Consumption of Digital CMOS Circuits Using Logic Pictures", *ICCES'09*, Cairo, Egypt, pp. 225-230, 2009.

[3] M. F. Fouda, M. B. Abdelhalim, and H. H. Amer, "Power Consumption of Sequential CMOS Circuits Using Logic Pictures", *BEC'10*, Tallinn, Estonia, pp. 133-136, 2010.

[4] M. Xakellis and F. Najm, "Statistical Estimation of the Switching Activity in Digital Circuits", *DAC'94*, San Diego, CA, USA, pp. 728-733, 1994.

[5] F. Aloul and A. Sagahyroon, "Estimation of the Weighted Maximum Switching Activity in Combinational CMOS Circuits", *ISCAS'06*, Kos Island, Greece, pp. 2929–2932, 2006.

[6] SN54/74LS74A D flip-flop datasheet, Motorola Inc.
[7] SN54/74LS109A JK flip-flop datasheet, Motorola Inc.

Table 10: Stage 2 D Flip-Flop ILP Table

| Before the active edge |    |    |    |    |      |   | After the active edge |    |    |    |       |      |   |
|------------------------|----|----|----|----|------|---|-----------------------|----|----|----|-------|------|---|
| D                      | S1 | S2 | S3 | S4 | Qbar | Q | D                     | S1 | S2 | S3 | S4    | Qbar | Q |
| 0                      | 0  | 1  | 1  | 1  | 1    | 0 | 0 / 1                 | 0  | 1  | 0  | 1 / 1 | 1    | 0 |
| 1                      | 1  | 1  | 1  | 0  | 1    | 0 | 0 / 1                 | 1  | 0  | 1  | 1 / 0 | 0    | 1 |
| 0                      | 0  | 1  | 1  | 1  | 0    | 1 | 0 / 1                 | 0  | 1  | 0  | 1 / 1 | 1    | 0 |
| 1                      | 1  | 1  | 1  | 0  | 0    | 1 | 0 / 1                 | 1  | 0  | 1  | 1 / 0 | 0    | 1 |

After the active edge, D may take either value; the two S4 entries correspond to D = 0 and D = 1, respectively.