Friday, May 15, 2009

When NOT to have a "shut-down" power domain?

There has been a lot of talk about low power these days, and one widely published and used technique is to create a shut-down region, sometimes also referred to as multi-supply design or MTCMOS-based design. You can search the net and find thousands of articles on why this is useful and the advantages it can bring.

But have you ever wondered when NOT to create a shut-down power domain in your chip?

Here are possible disadvantages of creating a shut-down region in your design:

- If your circuit is such that there is very little opportunity to put it to sleep.
- If the time it takes to wake up the circuit is not acceptable for the application in which it is being used.
- Most EDA tools want the shut-down power domain in a separate hierarchy module. This is not difficult to do in itself, but if you are inheriting RTL that is already 90% written for a previous chip, regrouping your hierarchy to meet this need becomes challenging.
- Moreover, even if you have identified a hierarchy (module) to be shut down, not every element in it may be able to shut down. This introduces additional design constraints, such as adding retention flops or creating always-on islands for those elements.
- MTCMOS switches occupy a finite area on your chip, so adding them gives you a bigger die. In addition, you'll have to add control circuitry to shut down and wake up these cells, which means more cells and an even bigger die.
- MTCMOS switches are leaky when in the on-state. This adds to the leakage power of the design while it is on, which becomes an issue when your leakage power specs are very tight.
- MTCMOS requires accurate analysis of rush current and circuit transients caused by the sudden wake-up of shut-down regions. While this is doable, it puts an extra burden on the back-end designer to come up with the right number of switch cells and a robust power plan.
- Using shut-down regions adds extra steps to your chip design flow in terms of IR analysis, verification and implementation. If you have a tight design schedule and this flow is not in place, it is a sure no-go. It also means adding new tools and licenses to your flow, which may be another reason against it from a business standpoint. (Yeah, a lot of you will disagree with this, but how many times have you used a certain tool just because it was the only one available to you?)
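
To put rough numbers on the first few bullets, here is a minimal back-of-the-envelope sketch in Python. It is not taken from any tool or library data: the function name, its parameters and every number in it are hypothetical, and the model is deliberately crude (average power only, no IR drop, no area cost). It simply weighs the leakage saved while asleep against the switch overhead while awake and the energy spent on each wake-up.

```python
def net_gating_savings_mw(leak_mw, sleep_fraction, switch_overhead_mw,
                          wakeups_per_s, wake_energy_uj,
                          residual_leak_fraction=0.05):
    """Average power saved (mW) by power gating a domain.

    A negative result means the gated design burns MORE power on average
    than the ungated one. All inputs are illustrative assumptions.
    """
    if not 0.0 <= sleep_fraction <= 1.0:
        raise ValueError("sleep_fraction must be between 0 and 1")

    # Awake: the domain leaks as before, plus whatever the switch fabric
    # and its control logic add.
    awake_mw = (1.0 - sleep_fraction) * (leak_mw + switch_overhead_mw)

    # Asleep: only a small fraction of the original leakage remains.
    asleep_mw = sleep_fraction * leak_mw * residual_leak_fraction

    # Each wake-up costs energy (rush current recharging the domain).
    # events/s * uJ = uW, so scale by 1e-3 to get mW.
    wake_mw = wakeups_per_s * wake_energy_uj * 1e-3

    return leak_mw - (awake_mw + asleep_mw + wake_mw)


# Hypothetical block: 10 mW leakage, 0.5 mW switch overhead,
# 100 wake-ups per second at 2 uJ each.
rarely_asleep = net_gating_savings_mw(10.0, 0.05, 0.5, 100, 2.0)
mostly_asleep = net_gating_savings_mw(10.0, 0.80, 0.5, 100, 2.0)
print(f"asleep  5% of the time: {rarely_asleep:+.2f} mW")
print(f"asleep 80% of the time: {mostly_asleep:+.2f} mW")
```

With these made-up numbers, the block that is almost never asleep comes out slightly worse with gating than without it, while the block that sleeps most of the time saves most of its leakage. Your own break-even point will depend entirely on your real leakage, switch and wake-up figures.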

Any more you can think of? Feel free to comment...

Saturday, May 2, 2009

Temperature Inversion. Is this a new 45nm effect?

TSMC recently announced that temperature inversion will be one of the challenging effects for the 45nm process.

Here is the article I am referring to: http://www.edn.com/article/CA6434612.html

So, what is temperature inversion, and how does it affect delay and leakage power?

Actually, this effect was present at the 65nm process node as well, but people just didn't notice it as much. The basic concept is that a transistor's threshold voltage increases at lower temperature, which in turn causes more delay at lower temperatures than at higher ones.
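
The concept can be illustrated with a small Python sketch. Note that this is my own toy model and not foundry data: it uses the textbook alpha-power-law delay approximation, in which delay is proportional to VDD / (mobility * (VDD - Vth)^alpha), and every coefficient below (threshold voltage, its temperature coefficient, the mobility exponent, alpha) is an assumed, illustrative value. Carrier mobility improves as the chip gets colder, which is why cold used to mean fast; the rising threshold voltage works against that, and at low supply voltages it wins.

```python
def relative_delay(vdd, temp_c, vth_300k=0.45, vth_tempco=1.0e-3,
                   mobility_exp=1.5, alpha=1.3):
    """Relative gate delay (arbitrary units) from the alpha-power law.

    All default coefficients are illustrative assumptions:
      vth_300k     threshold voltage at 300 K, in volts
      vth_tempco   threshold voltage drop per kelvin of heating, in V/K
      mobility_exp mobility scales as (T / 300 K) ** -mobility_exp
      alpha        velocity-saturation exponent
    """
    temp_k = temp_c + 273.15
    # Threshold voltage rises as the temperature falls.
    vth = vth_300k - vth_tempco * (temp_k - 300.0)
    overdrive = vdd - vth
    if overdrive <= 0:
        raise ValueError("vdd must exceed the threshold voltage")
    # Mobility rises as the temperature falls.
    mobility = (temp_k / 300.0) ** (-mobility_exp)
    return vdd / (mobility * overdrive ** alpha)


for vdd in (1.2, 0.6):
    cold = relative_delay(vdd, -40.0)
    hot = relative_delay(vdd, 125.0)
    slower = "Tmin" if cold > hot else "Tmax"
    print(f"VDD = {vdd:.1f} V: slowest at {slower}")
```

With these assumed coefficients, the higher supply voltage behaves the old way (slowest when hot), while the lower supply voltage is slowest when cold. That flip is the temperature inversion.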

This runs contrary to the usual practice of designing your chip for worst-case and best-case conditions, with worst case being the worst PVT and, traditionally, the maximum temperature. At 65nm and below this no longer holds: your worst case may not be PV-Tmax but PV-Tmin. So if you design your chip at the traditional worst-case condition (say 125C) and meet the clock frequency, you can still have timing paths failing at Tmin (say -40C).

This is easy to handle. Most P&R tools support multi-corner multi-mode analysis, so you can plug in the data and let the tool optimize for all those scenarios, or determine your worst delay scenario and work on just that. The challenging part is that you need your libs characterized not only for the worst (high) temperature corner but also for the low temperature corner in worst mode.

That's not all. Your leakage power measurement is also impacted. Leakage power is generally pretty good at Tmin but is worst at Tmax. If you are optimizing for timing at Tmin and also measure leakage at Tmin, you get the wrong impression that your chip dissipates very little power. You need to measure leakage power at Tmax. There are two ways today's P&R tools handle this:
a) The vendor will tell you to cut and paste the worst leakage numbers into the Tmin libraries, so that you measure worst delay and worst leakage using the same .libs. These libraries can only be used by the P&R tool; you still use the original libs when you go to sign-off.
b) Some tools allow leakage-corner-based reporting. Here you don't need to create separate mix-and-match libs as in case (a); instead you just tell the P&R tool which libs you want leakage power measured on.
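
The idea behind option (b) can be sketched in a few lines of Python. This is not any P&R tool's actual interface: the corner names, the dictionary layout and all the numbers are hypothetical. It just shows timing and leakage each being reported from their own worst corner rather than both from one library.

```python
# Hypothetical per-corner results for one design; every value is made up.
corners = {
    "ss_0p9v_m40c": {"path_delay_ns": 2.10, "leakage_mw": 0.8},
    "ss_0p9v_125c": {"path_delay_ns": 1.95, "leakage_mw": 12.5},
}


def worst_corner(results, metric):
    """Name of the corner with the largest value of the given metric."""
    return max(results, key=lambda name: results[name][metric])


def report(results):
    """Pick the timing corner and the leakage corner independently."""
    timing = worst_corner(results, "path_delay_ns")
    leakage = worst_corner(results, "leakage_mw")
    return {
        "timing_corner": timing,
        "worst_delay_ns": results[timing]["path_delay_ns"],
        "leakage_corner": leakage,
        "worst_leakage_mw": results[leakage]["leakage_mw"],
    }


print(report(corners))
```

In this made-up data the Tmin corner sets the worst delay while the Tmax corner sets the worst leakage. Reading both numbers off the Tmin corner alone would understate leakage badly.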

There are some more effects, like HVT cells being more leaky at lower temperature nodes, and their implications. I'll discuss those in one of the later posts.