A hierarchical methodology removes DFT from the critical path for large designs. The methodology is compatible with other techniques such as channel sharing, which can further reduce ATPG turn-around time and test cost.
Hierarchical DFT, in which DFT insertion and sign-off pattern generation are done at the block level, is now standard procedure for most large IC designs. In a hierarchical methodology, DFT work can start early, which eliminates spikes in the workload towards tapeout and reduces compute resource requirements. The combination of hierarchical DFT and embedded deterministic test (EDT) channel sharing maximizes channel allocation to further improve ATPG turn-around time and test cost.
Why use hierarchical DFT
Leading semiconductor companies are seeing a dramatic increase in design size, while the number of top-level pins available for test access remains relatively static. In addition to being very large, these designs target advanced process nodes and often need to meet quality standards that require the use of several fault models. The combination of large design size, advanced nodes, test pin limitations, and quality requirements poses a huge challenge for the ATPG tool and for the DFT engineers responsible for the task. This is where a divide-and-conquer approach with hierarchical DFT saves the day. Because many design tasks, like synthesis and physical layout, are already implemented hierarchically, performing DFT hierarchically is consistent with that approach. Figure 1 illustrates the concept of hierarchical DFT.
Numerous published case studies show the benefits of hierarchical DFT. These include:
- A performance gain of up to 10X in ATPG, diagnosis, and pattern verification;
- A pattern count reduction of up to 2X;
- Getting DFT out of the critical path; and
- Enabling core re-use.
Patterns are generated at the core level with much faster runtimes and fewer compute resources (memory) than would be needed for full-chip ATPG. With the use of an IEEE 1687 IJTAG infrastructure, hierarchical DFT is highly automated, flexible, and scalable.
Manage pin limitations and boost compression with EDT channel sharing
In large designs, the number of chip-level pins available for scan test data is limited. There are several techniques to manage this. One is input channel broadcasting, where a set of scan channel input pins is shared among multiple identical cores. Modern multicore architectures, however, contain many heterogeneous IP cores, each with a different EDT controller. For this situation, channel sharing is a good option. Figure 2 illustrates the concept.
EDT channel sharing allows scan input channels to be shared across multiple, heterogeneous cores. The compression architecture separates control channels from data channels. The control channels can then be individually accessed and uniquely allocated to each core, while the data channels are shared among a group of cores. Channel sharing boosts compression ratios by about another 2X and lends added flexibility to DFT planning in SoC design flows.
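As a rough illustration of why separating control and data channels saves pins, the sketch below compares the input channel budget with and without sharing. The core names and channel counts are hypothetical, and the model assumes that a shared data channel group only needs to be as wide as the widest core it serves; neither is taken from a real design or tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Core:
    """Hypothetical description of one core's EDT input channels."""
    name: str
    control_channels: int  # uniquely allocated to this core
    data_channels: int     # candidates for sharing with other cores


def dedicated_input_channels(cores):
    """Every core gets its own control and data channels."""
    return sum(c.control_channels + c.data_channels for c in cores)


def shared_input_channels(cores):
    """Control channels stay per core; data channels are shared.

    Assumption: the shared group is as wide as the widest core's
    data channel requirement.
    """
    control = sum(c.control_channels for c in cores)
    data = max((c.data_channels for c in cores), default=0)
    return control + data


# Illustrative numbers only.
cores = [Core("cpu", 1, 6), Core("gpu", 1, 8), Core("modem", 1, 4)]
dedicated = dedicated_input_channels(cores)
shared = shared_input_channels(cores)
print(f"dedicated: {dedicated} input channels, shared: {shared} input channels")
```

In this toy example the input channel count drops from 21 to 11, which is in line with the roughly 2X figure quoted above, although the actual gain depends on the mix of cores.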
However, all cores sharing data channels must be present when patterns are generated, because the ATPG tool needs to be able to predict the expected outputs based on the input stimulus. In a non-hierarchical flow, all cores are present during pattern generation as a rule. But in a hierarchical DFT approach, patterns are generated at the wrapped core level, then retargeted and merged at the chip level. In this case, channels must be shared inside a hierarchical boundary defined for pattern generation. This involves grouping cores together in design regions that are also wrapped. Creating this new level of hierarchy allows for channel sharing across cores inside the region, and patterns generated at that region level are retargeted to the chip level.
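The constraint described above, that every core in a channel-sharing group must sit inside the same wrapped region, can be expressed as a simple planning check. The following is a minimal sketch with hypothetical core, region, and group names; it is not part of any ATPG tool's interface.

```python
def find_split_sharing_groups(region_of, sharing_groups):
    """Return the sharing groups whose cores span more than one wrapped region.

    region_of:      maps core name -> name of the wrapped region containing it
    sharing_groups: maps group name -> list of cores sharing data channels
    """
    split = {}
    for group, cores in sharing_groups.items():
        regions = {region_of[core] for core in cores}
        if len(regions) > 1:
            split[group] = sorted(regions)
    return split


# Hypothetical floorplan: two wrapped regions, three sharing groups.
region_of = {
    "cpu0": "region_a", "cpu1": "region_a", "dsp": "region_a",
    "gpu": "region_b", "modem": "region_b", "isp": "region_b",
}
sharing_groups = {
    "group_1": ["cpu0", "cpu1"],
    "group_2": ["gpu", "modem"],
    "group_3": ["dsp", "isp"],  # crosses the region boundary
}

split = find_split_sharing_groups(region_of, sharing_groups)
print(split)  # {'group_3': ['region_a', 'region_b']}
```

A group flagged by this check would need either a different grouping of cores or a larger wrapped region that contains all of its members.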
The benefits of a hierarchical DFT flow were documented in a case study at DATE 2016. On a 4.3M-flop design with 18 cores, the ATPG turn-around time was found to be 11X faster with a hierarchical flow than with a traditional flow. The load on the DFT engineer was found to be more uniform, saving weeks of work at the most critical time of the project.
In a more recent study published by Spreadtrum at DTIS 2018, the combination of hierarchical DFT and channel sharing was studied on a new-generation, high-end mobile chipset. The project team compared the results to data from a functionally similar, previous-generation mobile chip.
The previous chip had about 3 million scan flops and was designed for 28nm. It used channel sharing and a flat DFT methodology in which all cores were tested together through 100 chip-level pins. The large memory footprint needed to load the design limited the number of machines available for ATPG, which took several weeks.
The design in the Spreadtrum study targeted 14nm technology and contained nearly 7 million scan flops, but was still limited to 100 chip-level test pins. The design team was concerned that ATPG would be impossible to complete within the design schedule.
Channel sharing with hierarchical DFT requires that patterns be generated for the entire group of cores that share data channels. This design had five such “sub-chip” regions that each contained multiple EDT blocks that used channel sharing. Patterns generated at those sub-chip levels were retargeted to the chip level in the same automated fashion as any block-level patterns.
The new ATPG methodology resulted in an average 3X reduction in memory footprint, even before accounting for the 2.3X increase in design size. That is, the 7 million scan flop design had a 3X lower ATPG memory footprint than the 3 million scan flop design. The reduced memory footprint freed up compute resources for use in other critical tapeout tasks. ATPG runtime, again not scaled to a per-flop basis, was reduced by over 10X. The overall reduction in test time was about 1.7X, and test coverage for both stuck-at and transition delay increased.
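To make the "not scaled to a per-flop basis" point concrete, the arithmetic can be worked through as follows. This is a back-of-the-envelope sketch that uses the rounded figures quoted above (about 3 million and nearly 7 million scan flops, 3X memory, over 10X runtime), so the normalized results are approximate and the runtime figure is a lower bound.

```python
# Rounded figures from the case study as quoted in the text.
prev_flops = 3_000_000      # previous-generation chip (approximate)
new_flops = 7_000_000       # new-generation chip (approximate)
memory_reduction = 3.0      # absolute ATPG memory footprint reduction
runtime_reduction = 10.0    # absolute ATPG runtime reduction (lower bound)

# Growth in design size between the two generations.
size_ratio = new_flops / prev_flops

# Scaling the absolute reductions by the size growth gives per-flop figures.
per_flop_memory_reduction = memory_reduction * size_ratio
per_flop_runtime_reduction = runtime_reduction * size_ratio

print(f"design size growth:         {size_ratio:.1f}X")
print(f"per-flop memory reduction:  {per_flop_memory_reduction:.1f}X")
print(f"per-flop runtime reduction: over {per_flop_runtime_reduction:.1f}X")
```

With these rounded inputs, the 3X absolute memory reduction corresponds to roughly 7X per scan flop, and the 10X runtime reduction to more than 23X per scan flop.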
The savings to the overall schedule were greater than the runtime improvements. In a flat approach, ATPG must wait for the full-chip netlist to be completed, putting DFT work in the critical path to tapeout. In a hierarchical flow, ATPG occurs earlier in the schedule as each core is completed. By the time the top-level design is complete and ready for tapeout, the test patterns have already been generated and verified.
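The scheduling argument can be illustrated with a toy model. All day counts below are invented for illustration and do not come from either case study; the model simply assumes that flat ATPG starts only once the top-level netlist is ready, while core-level ATPG starts as soon as each core is ready and is followed by a short retargeting step at the top level.

```python
def flat_finish_day(top_ready_day, flat_atpg_days):
    """Flat flow: ATPG cannot start until the full-chip netlist exists."""
    return top_ready_day + flat_atpg_days


def hierarchical_finish_day(core_ready_day, core_atpg_days,
                            top_ready_day, retarget_days):
    """Hierarchical flow: each core runs ATPG as soon as it is ready.

    Retargeting starts once the top level is ready and the last
    core-level pattern set is done, whichever is later.
    """
    last_core_done = max(
        core_ready_day[name] + core_atpg_days[name] for name in core_ready_day
    )
    return max(top_ready_day, last_core_done) + retarget_days


# Hypothetical project timeline, in days from project start.
core_ready_day = {"core_a": 20, "core_b": 35, "core_c": 50}
core_atpg_days = {"core_a": 5, "core_b": 5, "core_c": 5}
top_ready_day = 60

flat_done = flat_finish_day(top_ready_day, flat_atpg_days=21)
hier_done = hierarchical_finish_day(
    core_ready_day, core_atpg_days, top_ready_day, retarget_days=2
)
print(f"flat flow finishes on day {flat_done}")          # day 81
print(f"hierarchical flow finishes on day {hier_done}")  # day 62
```

In this model the hierarchical flow adds only the retargeting step after the top level is ready, because the core-level runs have already finished in parallel with the rest of the design work.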
The rapid pace of design scaling calls for changes to traditional DFT flows. The basic hierarchical methodology involves inserting DFT and generating sign-off test patterns at the core level, then retargeting the patterns to the top level. Another technique for managing pin-limited designs is EDT channel sharing. A recent case study by Spreadtrum demonstrates that the two techniques are compatible and result in combined benefits to memory footprint, ATPG runtime, test channel allocation, and overall tapeout schedule. The value of hierarchical DFT is well established, and it does not preclude using other techniques like channel sharing to further improve DFT efficiency.
D. Trock, R. Fisette, “Recursive hierarchical DFT methodology with multi-level clock control and scan pattern retargeting,” Design, Automation & Test in Europe (DATE) 2016.
B. Lu, et al., “The test cost reduction benefits of combining a hierarchical DFT methodology with EDT channel sharing — A case study,” International Conference on Design & Technology of Integrated Systems in Nanoscale Era (DTIS) 2018.
About the author
Geir Eide is the product marketing director for Tessent ATPG and Compression at Mentor, A Siemens Business. As a 20-year veteran of the test and DFT industry, Geir has presented seminars on DFT and test throughout the world. Geir earned his MS in Electrical and Computer Engineering from the University of California at Santa Barbara.