How high-level synthesis helps optimize low power designs – Part Two

By Paul Dempsey | Posted: December 16, 2013
Topics/Categories: EDA - ESL, IC Implementation

Continuing our series on high-level synthesis (HLS) for low power design, Part Two details how HLS helps you make and evaluate architectural decisions.

The first part of this series provides a primer on high-level synthesis and can be read here.

Decisions made at the architectural level have the greatest impact on quality of results (QoR). That is true whether optimizing for power, performance, or area. Some experts, such as Dr. Gary Delp at LSI, claim that architectural decisions can reduce power by 80% [1]. Put another way, architectural decisions that are not power-aware (usually power is not considered at all) can lead to designs that consume 5X the power of those built on the more efficient architectures.
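The step from "80% reduction" to "5X" is simple arithmetic, sketched below in Python. The function name is ours, chosen for illustration only; the 80% figure is the one cited above.

```python
def power_ratio(reduction_fraction: float) -> float:
    """Return how many times more power the unoptimized design consumes.

    If the better architecture saves `reduction_fraction` of the power,
    it consumes (1 - reduction_fraction) of the original, so the original
    consumes 1 / (1 - reduction_fraction) times as much.
    """
    if not 0.0 <= reduction_fraction < 1.0:
        raise ValueError("reduction_fraction must be in [0, 1)")
    return 1.0 / (1.0 - reduction_fraction)


if __name__ == "__main__":
    # An 80% reduction corresponds to a 5X gap between the two designs.
    print(f"{power_ratio(0.80):.1f}X")
```

The same relationship shows why large percentage savings matter so much: a 50% reduction is a 2X gap, while a 90% reduction would be a 10X gap.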

The goal of this series of articles is to explain how high-level synthesis can be used most effectively when creating low power designs. It assumes that we do not need to convince the designer of the importance of good architectural decisions (but for some specific examples of how HLS helped design teams maximize the impact of architectural decisions, please see the success stories on our web site, www.ForteDS.com).

This installment describes specific low power architectural techniques that can easily be applied with HLS. It also shows how the HLS environment helps the designer evaluate multiple scenarios.

Low power architectural decisions

The number and types of scenarios a designer may consider when evaluating architectures are limited only by imagination and deadlines. The latter is usually the critical factor. The role of HLS in architectural decision-making is to make the process quicker and more concrete. High-level synthesis then provides a rapid path to implementation once the best architecture has been determined.

Memory tends to be power-hungry, so optimizing its use is a common technique with a high impact. The memory architecture is usually determined first, and then RTL code is designed around that.

Given resources and other demands in a traditional RTL flow, usually only one memory architecture can be explored in full. By contrast, HLS allows designers to prototype and implement RTL code while exploring multiple memory architectures. A designer can literally click a button (or press a key, if you prefer) and vary the number of memory ports or even swap between full-speed, half-speed, and quarter-speed memories. Figure 1 shows an example of how to change memory types within the HLS environment using Forte’s Cynthesizer tool suite.

Figure 1: Changing the memory architecture with HLS (Source: Forte Design Systems)
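To give a sense of how quickly the exploration space grows, here is a minimal Python sketch that enumerates the memory variants a designer might queue up. It is not Cynthesizer syntax: the port counts, the run-label format, and the function name are all assumptions made for illustration. Only the three speed grades come from the text above.

```python
from itertools import product

# Hypothetical exploration axes. The port counts are an assumption;
# the speed grades are the ones mentioned in the article.
PORT_COUNTS = (1, 2)
SPEED_GRADES = ("full-speed", "half-speed", "quarter-speed")


def memory_candidates():
    """Return one entry per (ports, speed) combination to be synthesized."""
    return [
        {
            "ports": ports,
            "speed": speed,
            # Illustrative label for the HLS run, e.g. "mem_2p_half-speed".
            "label": f"mem_{ports}p_{speed}",
        }
        for ports, speed in product(PORT_COUNTS, SPEED_GRADES)
    ]


if __name__ == "__main__":
    for candidate in memory_candidates():
        print(candidate["label"])
```

Even these two small axes produce six candidate architectures, which is already more than most teams could hand-code in RTL.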


After evaluating a range of available memory types, a full RTL implementation can be synthesized via HLS. The reports and analysis provided in the HLS environment help the designer determine the viability of the RTL implementation. HLS reports can be analyzed to determine if, for example, the performance and QoR look acceptable. The RTL implementations can be simulated to generate more detailed performance and power metrics. The RTL code can also be synthesized to gates with an existing RTL toolset, and the gates then used to get more accurate estimates. Figure 2 shows an analysis of multiple implementations within HLS.

Figure 2: Evaluating architectural trade-offs in the HLS environment (Source: Forte Design Systems)
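The comparison itself can be thought of as a small filtering exercise. The Python sketch below assumes each implementation has been reduced to three metrics (power, area, latency); the class, the function names, and every number in the table are hypothetical and are not taken from any tool report. It keeps only the candidates that are not beaten on all metrics by another candidate, then picks the lowest-power one that meets a latency limit.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Implementation:
    name: str
    power_mw: float
    area_um2: float
    latency_cycles: int

    def metrics(self):
        # Lower is better for every metric in this sketch.
        return (self.power_mw, self.area_um2, self.latency_cycles)


def dominates(a, b):
    """True if `a` is no worse than `b` everywhere and better somewhere."""
    pairs = list(zip(a.metrics(), b.metrics()))
    return all(x <= y for x, y in pairs) and any(x < y for x, y in pairs)


def pareto_front(impls):
    """Drop every implementation that another one dominates."""
    return [c for c in impls if not any(dominates(o, c) for o in impls)]


def lowest_power(impls, max_latency_cycles):
    """Lowest-power implementation that meets the latency limit, or None."""
    feasible = [i for i in impls if i.latency_cycles <= max_latency_cycles]
    return min(feasible, key=lambda i: i.power_mw) if feasible else None


# Hypothetical results for four memory architectures.
RESULTS = [
    Implementation("1-port full-speed", 12.0, 40000.0, 640),
    Implementation("2-port full-speed", 15.5, 52000.0, 320),
    Implementation("1-port half-speed", 8.0, 41000.0, 1280),
    Implementation("2-port half-speed", 16.0, 53000.0, 700),
]

if __name__ == "__main__":
    for impl in pareto_front(RESULTS):
        print("worth keeping:", impl.name)
    best = lowest_power(RESULTS, max_latency_cycles=700)
    print("lowest power within 700 cycles:", best.name if best else "none")
```

With these made-up numbers, the 2-port half-speed variant drops out because the 1-port full-speed variant is better on every metric, and the 1-port full-speed variant is the lowest-power choice that still meets the 700-cycle limit.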


Memories are one example of the low power architectural trade-offs that can be explored in great depth using HLS. Performance trade-offs are another. At the most trivial level, performance constraints can be loosened or tightened to help concretely evaluate trade-offs between power, performance, and area. Instead of discussing HLS’ generic capability for exploring performance, we’ll look at its application for power specifically below.

Clock tree power can account for 50% or more of the overall consumption of a complex system-on-chip. Techniques such as clock gating and reducing clock frequency are commonly used to reduce that consumption. Clock gating will be a major topic of a future installment, so here we will discuss the use of HLS when designers are creating architectures with multiple clock frequencies.

When designing RTL by hand, there is one target clock speed for each block, and the clock frequency of a block is not changed unless there is a crisis in meeting the constraint. There are several reasons for this inflexibility.

The main one is that it is simply not practical to hand-write multiple RTL implementations for multiple clock speeds, picking only one and discarding the rest.

Another consideration is that designs with multiple clock speeds require clock-domain crossing (CDC) IP between blocks of different frequencies. HLS can mitigate both of these concerns by automatically implementing the RTL, including the CDC IP.

A third limitation is that it is not feasible to route an arbitrary number of clocks. This can be mitigated by only considering a defined set of clocks, starting with those that must be available for performance or I/O reasons.

Directing HLS to generate RTL for different clock speeds is as simple as changing a number (clock speed) in a constraints file. Once generated, each RTL implementation can be evaluated, simulated, and synthesized to determine the impact on power and QoR in general.
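The kind of sweep described above can be sketched in a few lines of Python. This is a first-order estimate, not a substitute for simulation or gate-level analysis: it uses the standard dynamic power relation P = a * C * V^2 * f, holds the supply voltage constant, and assumes a fixed set of available clocks. The clock list, cycle counts, activity factor, capacitance, and voltage are all hypothetical values chosen for illustration.

```python
# Defined set of clocks assumed to be available on the chip (MHz).
AVAILABLE_CLOCKS_MHZ = (100, 200, 400)

# Hypothetical cycles per result, as if reported by one HLS run per clock.
# Slower clocks allow more work per cycle, so fewer cycles are needed.
CYCLES_PER_RESULT = {100: 40, 200: 64, 400: 100}


def dynamic_power_mw(activity, cap_nf, vdd_v, freq_mhz):
    """First-order dynamic power: activity * C * V^2 * f.

    With C in nF and f in MHz the product comes out directly in mW.
    """
    return activity * cap_nf * vdd_v ** 2 * freq_mhz


def throughput_per_s(freq_mhz, cycles_per_result):
    """Results produced per second at the given clock."""
    return freq_mhz * 1e6 / cycles_per_result


def pick_clock(required_results_per_s, activity=0.15, cap_nf=2.0, vdd_v=0.9):
    """Return (clock_mhz, power_mw) for the lowest-power feasible clock."""
    feasible = [
        f for f in AVAILABLE_CLOCKS_MHZ
        if throughput_per_s(f, CYCLES_PER_RESULT[f]) >= required_results_per_s
    ]
    if not feasible:
        return None
    best = min(
        feasible,
        key=lambda f: dynamic_power_mw(activity, cap_nf, vdd_v, f),
    )
    return best, dynamic_power_mw(activity, cap_nf, vdd_v, best)


if __name__ == "__main__":
    choice = pick_clock(required_results_per_s=3.0e6)
    if choice is None:
        print("no available clock meets the throughput target")
    else:
        clock, power = choice
        print(f"{clock} MHz, about {power:.1f} mW dynamic power")
```

With these assumed numbers the 100 MHz implementation misses the 3 million results per second target, so the 200 MHz implementation is chosen over the 400 MHz one at roughly half the estimated dynamic power.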

Going deeper

We have discussed the importance of high-level synthesis in making and evaluating architectural decisions for low power design, including a few of the more important trade-offs to consider.  Future installments will outline how HLS increases the value designers get from their existing low power tools, and discuss optimizations that can only be implemented using HLS.

High-level synthesis resources beyond low power

There is a great deal more to high-level synthesis than its qualities for low power design. For a more general introduction to its strengths and advantages, one good resource is the Forte Design Systems YouTube channel (www.youtube.com/ForteDesignSystems).

The YouTube channel also includes an introduction to the SystemC C++ class library, which allows complex designs to be created and verified at a high level.

References

[1] Delp, G., et al., “Design & Verification of Low Power SoCs,” ISQED09, Session 5D.

About the author

David Pursley is director of Product Marketing for Forte Design Systems. He previously held various positions as a field applications engineer, technical marketing engineer, marketing manager, and product line manager in the fields of Electronic Design Automation (EDA) and Embedded Computing Technology (ECT).

David would welcome readers’ comments on this and other articles in the HLS for low power series and on HLS generally. You can contact him at dpursleyATForteDSDOTcom.

Company details

Forte Design Systems
Corporate Headquarters
Suite 302
100 Century Center Court
San Jose
CA 95112
USA

T: +1 800-800-6494
W: www.forteds.com/lowpower
