Slow winter or new spring for hardware design?

By Jack Erickson | Posted: November 1, 2013
Topics/Categories: EDA - ESL | Tags: 5G, low-power design, multicore, TLM | Organizations: Cadence Design Systems

Jack Erickson is director of product management at Cadence Design Systems. He has held numerous AE and marketing positions over the past 15 years at Cadence, covering simulation, synthesis, physical synthesis, floorplanning, and equivalence checking.

If you’re looking for an entertaining gonzo take on the history and current state of hardware design, I highly recommend “The Slow Winter” by James Mickens, the “Galactic Viceroy of Research Excellence” at Microsoft.

The premise is that it’s no longer lucrative or cool to be a hardware designer because process scaling has hit a wall in the form of voltage leaks, excessive heat, and…Pauly Shore. Thus, gone are the days when hardware designers could afford to buy Matisse paintings and throw them out a hotel window while partying with Aerosmith.

These are valid points – process scaling has slowed not only for these reasons, but also because of the economics of undertaking a chip design project at advanced nodes. The trend then shifted to throwing more and more processor cores onto a chip. But there’s a limit to that – how many of us need to render nine copies of the Avatar planet in 1080p while simulating nuclear explosions?

So what is a hardware designer to do? The article paints a pretty bleak picture of the protagonist’s new living conditions and his behavior in meetings as he tries to evangelize the need to control power consumption. This is probably a bit of an exaggeration. At the same time, it’s probably a good thing that the excesses of the Matisse-chucking era are a thing of the past. Constraints drive innovation, and hardware designers are innovative, so we are beginning to see new kinds of hardware innovation.

For instance, the Moto X has “eight cores” (not all on one SoC) – but these are not just eight parallel application processors designed to replicate a desktop computer in your hand. Yes, four of those are GPU cores (the Adreno 320…more on those in a bit), and two are application processors. The other two are the interesting ones. There’s a natural language processor, a DSP that processes voice commands and has an always-on listening state that lets you wake the phone with a command. And there’s a contextual awareness processor, a microcontroller that fuses the various always-on sensor inputs with a software layer to display the appropriate data depending on whether the phone is resting on a table, being pulled out of a pocket, having a hand waved over it, etc.

The “M7 motion co-processor” in Apple’s new iPhone 5S performs a similar role. Both of these approaches enable more natural interaction with the phone while conserving power, because the largest power consumers can stay shut off until they are needed. This type of approach combines new hardware architectures with accompanying software to make your phone less of a traditional computer and more of an on-demand outboard brain.

And if the notion of all this compute horsepower in a smartphone seems silly, give your phone to a 12-year-old for a couple of hours. Those four GPU cores will be put to good use with all the high-end graphics of today’s games. If that kind of load were run on a general-purpose application processor, as in the old days, we would in fact have to connect each processor to a dedicated coal plant. This is why graphics cores are taking an ever-larger share of the area of these SoCs and are becoming prominent differentiators. Clearly there is a great deal of hardware innovation still happening in this space.

The same goes for video, where our 12-year-old consumer creates YouTube videos, watches Netflix movies, and FaceTimes his friends. Thankfully for the sake of your wireless bill, the recently ratified H.265 spec (also known as High-Efficiency Video Coding, or HEVC) improves the data compression of video streams. This reduces the bandwidth required for today’s videos, or enables transmission of higher-resolution video like 4K. Of course, this requires new encoding and decoding algorithms. These first appear as software running on existing processors. But now that the standard is solidified, design teams are implementing these algorithms as dedicated hardware in order to improve performance and reduce power.

And as more ultra-HD video starts being produced and consumed, more wireless bandwidth will be required, which is why carriers are already looking to the next-generation 5G standards and the hardware that will form their backbone. As the linked article states, this is in the 2020 timeframe, which will be when our 12-year-old consumer of today is on campus at a university, possibly developing the next bandwidth-consuming platform, much as Google, Napster, Facebook, and Snapchat got their start.

It’s well publicized that there has recently been an explosion of innovation on the software side, while at the same time hardware improvements no longer come for “free” due to the slowing of process scaling. However, there is still a great deal of innovation happening on the hardware side in the form of co-design with software, as well as hardware algorithm design and implementation. This type of innovation is far more challenging than riding the process scaling wave, which is why those who are still designing hardware after the party is over are the true innovators.

But this more challenging type of innovation requires more powerful tools for exploring and verifying the software-hardware interaction and figuring out how best to implement these algorithms as hardware. SystemC with TLM enables very high-level models and algorithms to be refined toward hardware yet still simulated fast enough for software verification. But this only recently became useful, once high-level synthesis provided an automated path from this level down into the production implementation flows. HLS is being used more and more often by these innovators to explore the different micro-architecture options and then implement these algorithms in hardware according to the application-specific constraints.
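To make the idea of a high-level model concrete, here is a minimal sketch in plain C++ of the kind of untimed algorithmic description that can serve as a starting point for refinement. It is not SystemC, not TLM, and not tied to any particular HLS tool. The filter, its four taps, its coefficients, and every name in it (`FirModel`, `kTaps`, `kCoeffs`, `impulse_matches_coeffs`) are hypothetical and chosen only for illustration.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical 4-tap FIR filter. Tap count and coefficients are
// illustrative assumptions, not taken from any real design.
constexpr std::size_t kTaps = 4;
constexpr std::array<std::int32_t, kTaps> kCoeffs = {1, 2, 2, 1};

// Untimed algorithmic model: one call to step() consumes one input sample
// and produces one output sample. It uses fixed-size storage, fixed loop
// bounds, and integer arithmetic, which is the general style that
// synthesis-oriented C++ tends to favor (an assumption about tools in
// general, not a statement about a specific one).
class FirModel {
public:
    std::int32_t step(std::int16_t sample) {
        // Shift the delay line by one position, oldest sample falls off.
        for (std::size_t i = kTaps - 1; i > 0; --i) {
            delay_[i] = delay_[i - 1];
        }
        delay_[0] = sample;

        // Multiply-accumulate across all taps.
        std::int32_t acc = 0;
        for (std::size_t i = 0; i < kTaps; ++i) {
            acc += kCoeffs[i] * static_cast<std::int32_t>(delay_[i]);
        }
        return acc;
    }

private:
    std::array<std::int16_t, kTaps> delay_{};  // zero-initialized state
};

// Testbench-style check: a unit impulse should play back the coefficients
// in order. The same check can be reused at each refinement step.
inline bool impulse_matches_coeffs() {
    FirModel fir;
    for (std::size_t i = 0; i < kTaps; ++i) {
        const std::int16_t in = (i == 0) ? 1 : 0;
        if (fir.step(in) != kCoeffs[i]) {
            return false;
        }
    }
    return true;
}
```

The same source can be exercised quickly in a software testbench (for example, `assert(impulse_matches_coeffs());`) and, in a flow of the kind described above, handed to a synthesis tool that decides how far to unroll or pipeline the two loops. That decision is where the micro-architecture trade-offs between area, throughput, and power get explored.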

Yes, the easy giants may be dead, but there is still plenty of opportunity out there now that we have the tools needed to conquer the harder ones. Bring on the ghosts that Schrödinger left behind!