Overcoming electromigration analysis limitations for larger on-die power grids
A joint Mentor Graphics-University of Toronto (U of T) team has developed a novel physics-based approach for assessing electromigration (EM) that, in contrast to existing methods, offers the speed and accuracy required to analyze today’s increasingly large power grids.
Electromigration is the mass transport of metal atoms due to momentum transfer between electrons (driven by an electric field) and the atoms in a metal line. Continued IC scaling shrinks metal line cross sections, which raises current densities. As a result, EM degradation has become a major reliability concern for the design of on-die power grids in large integrated circuits.
According to research team member Dr. Valeriy Sukharev, Technical Lead and Principal Engineer in the Design-to-Silicon division at Mentor, one reason for the increasing concern is that current EM assessment techniques, such as Black’s empirical model, are no longer sufficiently accurate.
Current electromigration strategies
Standard EM analysis typically breaks up a general grid metal structure (the so-called ‘power grid’) into branches; assesses the reliability of each branch separately using Black’s model; and then uses the earliest branch failure time as the failure time for the whole grid. The Mentor/U of T team argues that this approach is highly inaccurate for a number of reasons.
First, it ignores material flow throughout the tree, from branch to branch. As a result, if the individual branches happen to be so short that they are deemed immortal (due to the Blech effect in short lines), then the entire grid appears to be immortal. This is highly optimistic and can be entirely misleading to designers. In reality, due to material flow across the tree, failures can and do happen even if the branches are short.
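The branch-by-branch flow described above can be sketched in a few lines of Python. This is a minimal illustration, not any tool's actual implementation: it uses the textbook form of Black's equation, MTF = A * j^(-n) * exp(Ea / kT), and the Blech criterion that a line is immortal when its current-density-length product falls below a critical value. The function names, the fitting parameters (A, n, Ea) and the units are assumptions chosen for the example.

```python
import math

K_BOLTZMANN_EV = 8.617333262e-5  # Boltzmann constant in eV/K


def black_mtf(j, temp_k, a=1.0, n=2.0, ea_ev=0.9):
    """Black's equation: MTF = A * j**(-n) * exp(Ea / (k*T)).

    a, n and ea_ev are illustrative fitting parameters, not calibrated values.
    """
    return a * j ** (-n) * math.exp(ea_ev / (K_BOLTZMANN_EV * temp_k))


def is_blech_immortal(j, length, jl_crit):
    """Blech criterion: a line with j*L below the critical product never fails."""
    return j * length < jl_crit


def series_grid_mtf(branches, temp_k, jl_crit):
    """Traditional flow: rate each branch alone, report the earliest failure.

    branches is a list of (current_density, length) pairs in arbitrary but
    consistent units. Returns math.inf if every branch is deemed immortal.
    """
    mtfs = []
    for j, length in branches:
        if is_blech_immortal(j, length, jl_crit):
            mtfs.append(math.inf)  # branch is filtered out as immortal
        else:
            mtfs.append(black_mtf(j, temp_k))
    return min(mtfs, default=math.inf)


# A tree made only of short branches is reported as immortal, which is the
# optimistic outcome the article warns about.
short_tree = [(1.0, 5.0), (2.0, 2.0)]
print(series_grid_mtf(short_tree, temp_k=378.0, jl_crit=10.0))
```

Because each branch is rated in isolation, nothing in this sketch can represent material flowing from one branch into the next, which is exactly the limitation the team identifies.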
Second, the technique’s inherent assumption of no material flow between branches effectively means that the reliabilities of nearby metal lines are treated as independent of each other. Here, the traditional approach can produce results that are highly pessimistic. In practice, connected lines influence each other: two identical connected lines that carry the same current density can have quite different values for mean time-to-failure (MTF). Due to the resulting pessimism, designers are faced with an apparently ever-shrinking margin between the design-specific current densities and the current densities allowed by EM design rules. This makes EM sign-off very hard to achieve and leads to the overuse of metal resources in the grid.
Meanwhile, traditional EM checking tools assume that a power grid fails as soon as any one of its lines fails. This has been referred to as a series model of grid failure. The main problem here is that such an assumption ignores the inherent redundancy in the many parallel paths of the power grid.
There are already some alternatives and variations available but these also have serious limitations.
One alternative, the mesh model, deems a grid to have failed not when the first line fails, but when the voltage drop at any grid node exceeds a user specification. However, the mesh model still uses Black’s model to compute the EM degradation of individual lines. Another alternative is to adapt Korhonen’s physical EM models to interconnect trees. To get a more realistic estimate of grid reliability, one could use a mesh model and abandon Black’s model in favor of more physical EM models. However, the current implementations of the physical EM models are slow, requiring up to 32 hours to estimate the failure time for a 400K-node grid. They also fail to accurately account for the generation of multiple voids across the whole tree.
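The difference between the series and mesh failure criteria can be seen in a toy case. The sketch below assumes a single grid node fed by several identical parallel lines from an ideal supply, with a fixed load current drawn at the node; the function names and numbers are hypothetical. It also assumes each line's failure time is known in advance and does not change as its neighbors fail, which a real grid would not guarantee because current redistributes after each failure.

```python
def series_failure_time(line_ttfs):
    """Series model: the grid fails as soon as its first line fails."""
    return min(line_ttfs)


def mesh_failure_time(line_ttfs, r_line, i_load, v_drop_max):
    """Mesh model for one node fed by identical parallel lines.

    Lines are removed in order of their failure times. The node fails when
    the IR drop across the surviving lines exceeds v_drop_max, or when the
    last line is lost and the node is disconnected.
    """
    if not line_ttfs:
        raise ValueError("at least one line is required")
    surviving = len(line_ttfs)
    for t in sorted(line_ttfs):
        surviving -= 1
        if surviving == 0:
            return t  # node is cut off from the supply
        drop = i_load * r_line / surviving  # parallel resistance is R/k
        if drop > v_drop_max:
            return t
    raise AssertionError("unreachable: the loop always returns")


ttfs = [5.0, 8.0, 12.0, 20.0]  # hypothetical line failure times, in years
t_series = series_failure_time(ttfs)
t_mesh = mesh_failure_time(ttfs, r_line=1.0, i_load=0.1, v_drop_max=0.06)
print(t_series, t_mesh)
```

In this example the node tolerates the loss of two lines before the voltage-drop limit is crossed, so the mesh criterion credits the redundancy that the series criterion ignores.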
The Mentor/U of T proposal
The proposal from Mentor and U of T comprises a fast physics-based electromigration checking approach that accounts for material flow and the coupling of stress in interconnect trees, allowing for arbitrarily complex geometries. The process removes unrealistic assumptions inherent in traditional industrial tools.
Computational speed is improved by using an efficient filtering scheme and a fast predictor-based approach. This has been shown to have minimal impact on accuracy. The MTFs estimated using the physics-based approach are on average 3x longer than those based on a (calibrated) Black’s model. In addition, the new method is suitable for use on very large power grids.
The method begins by establishing a one-dimensional (1D) physical model for EM degradation within branches, as proposed by Korhonen. That model is then extended by introducing boundary laws at junctions to track material flow and stress evolution in multi-branch interconnect trees.
The extended Korhonen model starts out as a system of partial differential equations (PDEs) coupled by the boundary laws, which are then scaled and discretized to reduce the model to a system of ordinary differential equations (ODEs). The method then numerically solves the ODE system at successive time-points to track the stress evolution and find the corresponding void nucleation times.
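To make the PDE-to-ODE step concrete, here is a minimal single-branch sketch. It assumes a scaled, dimensionless form of Korhonen's equation, ds/dt = d/dx (ds/dx + g), on a line of unit length with both ends blocked (zero atomic flux), where s is the stress and g is the scaled EM driving force. Splitting the line into cells turns the PDE into one ODE per cell, which is then stepped forward in time until the stress reaches a critical value for void nucleation. The junction boundary laws of the extended model are not reproduced here, and the explicit Euler integrator, cell count and function names are choices made for this illustration rather than the team's solver.

```python
def stress_rhs(s, g, h):
    """Right-hand side of the ODE system, one equation per cell.

    face[i] holds (ds/dx + g) at the boundary between cells i-1 and i.
    The two outer faces stay at zero because the line ends are blocked.
    """
    n = len(s)
    face = [0.0] * (n + 1)
    for i in range(1, n):
        face[i] = (s[i] - s[i - 1]) / h + g
    return [(face[i + 1] - face[i]) / h for i in range(n)]


def time_to_nucleation(g, s_crit, n_cells=20, t_max=5.0):
    """Step the ODEs until the peak stress reaches s_crit.

    Returns (time, stress_profile). The time is None if the stress never
    reaches s_crit before t_max, i.e. the line behaves as immortal.
    """
    h = 1.0 / n_cells
    dt = 0.4 * h * h  # explicit Euler is stable for dt <= h*h/2
    s = [0.0] * n_cells
    t = 0.0
    while t < t_max:
        if max(s) >= s_crit:
            return t, s
        ds = stress_rhs(s, g, h)
        s = [si + dt * di for si, di in zip(s, ds)]
        t += dt
    return None, s


t_none, s_final = time_to_nucleation(g=1.0, s_crit=10.0)  # threshold never reached
t_low, _ = time_to_nucleation(g=1.0, s_crit=0.2)
t_high, _ = time_to_nucleation(g=2.0, s_crit=0.2)
print(t_none, t_low, t_high)
```

With blocked ends the stress settles into a linear profile whose end-to-end difference is proportional to g, so a weakly driven line never reaches the threshold, while doubling g brings nucleation forward.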
The random nature of EM degradation is accounted for using a Monte Carlo method. Successive samples of grid time-to-failure are found, until the estimate of the overall MTF has converged.
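The stopping rule for such a Monte Carlo loop can be sketched as follows. This assumes a common convergence test, namely that sampling stops once the confidence-interval half-width of the running mean falls below a chosen fraction of that mean; the article does not state which criterion the team uses. The sampler is a hypothetical stand-in that draws from a lognormal distribution, whereas in a real flow each sample would come from a full grid simulation with randomized parameters.

```python
import math
import random


def estimate_mtf(sample_grid_ttf, rel_tol=0.05, z=1.96,
                 min_samples=30, max_samples=100_000):
    """Draw grid time-to-failure samples until the mean has converged.

    Uses Welford's running mean/variance update so each new sample costs
    constant time. Returns (mtf_estimate, samples_used).
    """
    n, mean, m2 = 0, 0.0, 0.0
    while n < max_samples:
        x = sample_grid_ttf()
        n += 1
        delta = x - mean
        mean += delta / n
        m2 += delta * (x - mean)
        if n >= min_samples:
            std = math.sqrt(m2 / (n - 1))
            # Stop when the z-sigma half-width is within rel_tol of the mean.
            if z * std / math.sqrt(n) <= rel_tol * mean:
                break
    return mean, n


# Hypothetical stand-in for one grid simulation: a lognormal failure time.
rng = random.Random(0)
mtf, used = estimate_mtf(lambda: rng.lognormvariate(0.0, 0.5))
print(mtf, used)
```

Tightening rel_tol increases the number of samples roughly with the inverse square of the tolerance, which is why the cost of each individual sample matters so much for large grids.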
Computation speed is enhanced by using a filtering scheme that estimates upfront the set of trees that are most likely to impact the MTF assessment of the grid, with minimal impact on accuracy. The process also includes a predictive scheme that allows for faster MTF estimation by extrapolating the solution (stress curve) obtained from a few initial time-points.
Evaluation
The Mentor/U of T approach was tested on a number of IBM power grid benchmarks on a quad-core 3.4GHz Linux machine with 32GB of RAM.
The MTFs estimated using the physics-based approach were on average 3x longer than those based on a (calibrated) Black’s model, supporting the claim that Black’s model is not accurate enough for modern power grids and confirming the need for physical models.
With a run-time of less than three hours for the largest grid (700K nodes), the approach has been shown to be suitable for large VLSI circuits.
The details of the physics-based process involve some highly complex equations. Dr. Sukharev and his colleagues from U of T’s Electrical and Computer Engineering faculty, Dr. Farid Najm and PhD candidate Sandeep Chatterjee, describe these in detail in a recent award-winning paper, “Fast Physics-Based Electromigration Checking for On-Die Power Grids”. The paper received the IEEE/ACM William J. McCalla ICCAD Best Paper Award for the Back-End category at the recent ICCAD conference in Austin, and can now be downloaded from the ACM Digital Library (subscription).
Dr. Sukharev’s earlier white paper on the subject, Electromigration Analysis at Advanced Nodes, is also available for download free of charge.