Mastering the memory maze

By Lane Mason | Posted: December 1, 2007
Topics/Categories: EDA - ESL | Tags: memory, non-volatile memory | Organizations: Cadence Design Systems

Since the early 1980s, most of the semiconductor business has been enthralled by the microprocessor, the PC and commodity DRAMs. For all the talk of potential ‘better markets’ and ‘more profitable businesses to be in’, PCs and their brethren came to represent 35-40% of the industry’s output. They constituted the prime platform for not only computing, but for watching DVDs and sometimes TV, for playing video games, for instant messaging, for surfing the web, and for email. No matter how ill suited it was to performing the multitude of tasks laid at its doorstep, the PC won out over dedicated devices.

But the industry is changing, at first around the edges, but perhaps more substantially among core products and next-generation platforms. There are important features, functions, characteristics, consumer demands, and capabilities that the PC today cannot manage very well. Thus, even so broadly defined a platform has found that its market does have limitations.

The consumer electronics (CE) sector has surged forward and started to drive the development of the chip market. Embedded DRAM? Not in a PC – but it was in the Nintendo GameCube games console. Knock-your-socks-off graphics and G(raphics)DDR3 chips with 700MHz clocks? Again, they are for gaming, not PCs. Meanwhile, the notion of ‘portable electronics’ is no longer best exemplified by what happens in laptops, but by Apple iPods, cell phones, pagers and PDAs. These products are incorporating ever more complex multi-chip packages (MCPs), have given birth to a unique and revitalized LP DRAM roadmap, and are pushing battery and all aspects of low power (LP) technology to their limits. You will not find an LP DRAM or mobile DDR DRAM in a laptop computer.

Figure 1. Low power DRAM selection

PCs have almost been forced into a mature product phase. This implies slow evolution, predictable roadmaps, and low gross margins for the end system and for most of its (non-monopolized) components. All the action and all the uncharted territory are in the consumer space.

Although we are entering the DDR3 era, a lot more is happening in the memory products space, particularly now that PCs have a reduced influence. So how can we map this new marketplace? It is pretty hard to find mainstay DRAM in the consumer space, and when found, it occurs only in small quantities. Rather, the CE world is built on LP DRAMs; GDRAMs; a disproportionate amount of PSRAMs and legacy SDRAMs; cell phones with NOR flash, bare die and multi-chip packages; and NAND flash. Indeed, NAND is as much the consumer electronics ‘memory of choice’ as DRAM is for the PC.

In that context, we want to look across many of the specialized memory requirements of the CE market, and at the memory products that are set to be the mainstays of its growth during the coming decades.

Taking the overarching trend into consideration, the challenge for the CE product designer is that his or her device needs are almost completely out of phase with the PC market. So, what are the benefits and some of the difficulties in selecting from among different memory types for consumer applications?

We can initially define two broad categories of DRAM that can be used in consumer products. The first we can call ‘PC DRAM’. It is aimed at PCs and servers. It is most commonly found on DIMMs. It meets the JEDEC JESD79 (DDR1), JESD79-2 (DDR2) or JESD79-3 (DDR3) specs. And it is generally priced on a commodity basis. The second category is ‘Consumer DRAM’. It includes memories that are not often found on DIMMs and which meet other JEDEC specs: for example, Low-Power DDR (LPDDR, LPDDR2), Pseudo-Static RAM (PSRAM), and Graphics DDR (GDDR).

Both PCs and CE devices require low cost. Arguably the PC segment requires the lower DRAM cost as mainstream PCs use at least eight or 16 memory chips, and servers typically use a minimum of 36 high-density memory chips. By contrast, a consumer product is likely to use just one or two minimum-density memory chips (Figure 1). So, on a price-per-bit basis, PC DRAM is usually the lowest cost DRAM you can buy, sometimes by a factor of two in comparison with some types of Consumer DRAM. For example, at the time of writing, the spot price for a fast 512Mbit DDR2 DRAM is less than $2 and for slower speed grades the price is closer to $1 – that is a half-billion transistor chip for less than what you might have paid for your morning coffee (albeit from the gas station rather than a specialty shop).

Not surprisingly, CE manufacturers are drawn to off-chip DRAMs and PC DRAM in particular because they offer amazing speed and density at extremely low cost. However, these selfsame characteristics also make PC DRAMs a challenge for designers. Let us look more closely at the tension here.

Speed

PC memories are priced on a speed basis, the faster the better. However, the speed of mainstream PC DRAM often exceeds what consumer applications need. That speed capability can hit highly sensitive CE price metrics in different ways – whether it is for I/Os that are capable of high-speed operation (but which draw more power than is needed at lower speeds) or the extra cost of a part that was designed for a higher speed than the base specification. The CE designer should remember that most DRAM families embrace a wide range of operating frequencies. For example, although many DDR2 devices being produced right now have 333MHz (DDR2-667) or 400MHz (DDR2-800) maximum speed grades, the JEDEC spec for DDR2 also requires these devices to operate at a minimum speed of 125MHz. This allows PC DRAM to be used in more consumer applications.
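To make the speed-grade arithmetic concrete, the short Python sketch below converts a DDR clock frequency into a per-pin transfer rate and a peak bus bandwidth. The function names and the 16bit bus width are illustrative assumptions, not taken from the JEDEC documents. The only rule relied upon is that a double-data-rate interface moves two data words per clock cycle.

```python
def transfers_per_second_m(clock_mhz: int) -> int:
    """Per-pin transfer rate in MT/s: DDR moves data on both clock edges."""
    return 2 * clock_mhz


def peak_bandwidth_mbyte_s(clock_mhz: int, bus_width_bits: int) -> float:
    """Peak bandwidth in MByte/s for a bus of the given width."""
    return transfers_per_second_m(clock_mhz) * bus_width_bits / 8


# 125MHz is the DDR2 minimum cited above; 333MHz and 400MHz are the
# DDR2-667 and DDR2-800 grades (333MHz is a rounded 333.33MHz).
ASSUMED_BUS_WIDTH_BITS = 16  # hypothetical width, chosen only for illustration

for clock in (125, 333, 400):
    rate = transfers_per_second_m(clock)
    bandwidth = peak_bandwidth_mbyte_s(clock, ASSUMED_BUS_WIDTH_BITS)
    print(f"{clock}MHz clock: {rate}MT/s per pin, {bandwidth:.0f}MByte/s peak")
```

A designer who needs only a fraction of the top-grade bandwidth can use the same calculation to check whether a slower, cheaper speed grade still covers the application.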

Cost equations

There are two cost equations that come into play, and both are different for a PC and a CE product. The first is the usage cost equation. In a PC or server, there is one memory controller and four, eight, 16, 32 or more memory chips. Clearly, the complexity in this interface belongs in the one memory controller. But in a consumer device there is often one memory controller and one memory chip, so the complexity can logically go in either place.

The second issue is the design cost equation. The number of chipsets or CPUs that talk to DRAM is quite small. At the time of writing, there are more than 10 available PC motherboards carrying DDR3 and all use the Intel P35 ‘Bearlake’ chipset. For the PC, the design cost-benefit equation puts only the bare minimum of necessary circuitry in the DRAM itself, and all the complexity of interfacing into the memory goes into the chipset or CPU. The consumer designer is not so lucky.

Each consumer product needs its own DRAM interface and even though there will be many consumer chips that interface to DRAM, most of the accompanying complexity is still in the memory controller in the consumer chip.

Density

The next challenge for consumer device manufacturers is the density of PC DRAM. On a per-bit basis, DDR2 is already cheaper than DDR1 and has been for more than a year. Assuming that a DDR2/DDR3 price crossover will happen sometime in 2009, a consumer manufacturer specifying a product today could reasonably expect that DDR3 will become cheaper than DDR2 on a per-bit basis during that design’s lifetime. However, they need to carefully consider whether they can use the density offered by DDR3. The specification starts at 512Mbit, but available parts start at 1Gbit, or 128MByte.

Figure 2. Example 128MByte system needing 21Gbit/s peak bandwidth

The density of PC DRAM is driven by the PC industry, of course. Microsoft recommends a minimum of 1GByte of DRAM to run Windows Vista, so it is immediately obvious why the 512Mbit devices will not be produced. If you put eight DRAM chips on a DIMM, each one byte wide, and if your minimum recommended amount of memory is 1GByte in the system, then each of those DRAM chips needs to be 1Gbit or 128MByte.
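The DIMM arithmetic in the paragraph above can be checked in a few lines of Python. This is a minimal sketch: the function name is hypothetical, and binary units (1GByte = 2^30 bytes) are assumed.

```python
GBYTE = 1024 ** 3  # bytes, binary units assumed
MBYTE = 1024 ** 2


def chip_density_bits(system_bytes: int, chips: int) -> int:
    """Bits each chip must hold if the memory is split evenly across chips."""
    return system_bytes * 8 // chips


CHIPS_PER_DIMM = 8
CHIP_WIDTH_BITS = 8  # each chip is one byte wide

density = chip_density_bits(1 * GBYTE, CHIPS_PER_DIMM)
print("Data bus width:", CHIPS_PER_DIMM * CHIP_WIDTH_BITS, "bit")
print("Density per chip:", density // GBYTE, "Gbit")
print("Capacity per chip:", density // 8 // MBYTE, "MByte")
```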

Some 128MByte in a single chip is a lot of memory – many more megabytes than was used in any computer 15 years ago for logic design, 3D transistor extraction and simulation, or for general office applications. Some more modern examples of how you could use 128MByte would be approximately 20 frames of uncompressed 1080p HD video, or about 80 compressed JPEG files from an 8-megapixel camera.
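The video and camera figures can be reproduced with a rough calculation. The sketch below assumes 24bit color (three bytes per pixel) and binary megabytes; neither assumption is stated in the text, and the per-file JPEG size is simply what 80 files in 128MByte implies.

```python
MBYTE = 1024 ** 2
CAPACITY_BYTES = 128 * MBYTE


def frame_bytes(width: int, height: int, bytes_per_pixel: int = 3) -> int:
    """Size of one uncompressed frame (24bit color assumed by default)."""
    return width * height * bytes_per_pixel


def frames_that_fit(capacity: int, width: int, height: int) -> int:
    """Whole uncompressed frames that fit in the given capacity."""
    return capacity // frame_bytes(width, height)


frames = frames_that_fit(CAPACITY_BYTES, 1920, 1080)
print("Uncompressed 1080p frames in 128MByte:", frames)

# 80 JPEG files in 128MByte implies an average file size of about:
average_jpeg_mbyte = CAPACITY_BYTES / 80 / MBYTE
print(f"Implied average JPEG size: {average_jpeg_mbyte:.1f}MByte")
```

Under these assumptions the frame count comes out at 21, in line with the "approximately 20" quoted above.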

Power advantages

A memory system using PC DRAM will use a lot more power than a consumer DRAM. PC DRAM uses a ‘pseudo-differential’ I/O signaling system, Stub-Series Terminated Logic (SSTL). It offers great advantages in timing when used in a large memory system with many DRAM chips all sharing the same bus, but this level of timing accuracy may not be necessary for the point-to-point connections often found in a CE device.

A serious disadvantage is that these SSTL I/Os use about 10mA each, both on the chip side and on the DRAM side. That is a hefty burden if you have 16 or 32 I/Os all contributing to the power drain. However, Consumer DRAM may use different I/O types – pads with LVCMOS signaling levels in the case of LPDDR, or Pseudo-Open-Drain (POD) in the case of GDDR3. These I/Os typically use less power than the pads found in PC DRAM.
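As a rough illustration of how the I/O current adds up, the Python sketch below multiplies out the figures quoted above. It assumes the 10mA applies to every I/O at each end of the link (controller and DRAM), which is one reading of the text; real consumption depends on termination, voltage and bus activity, so treat the result as an order-of-magnitude estimate only.

```python
SSTL_MA_PER_IO = 10  # approximate per-I/O figure quoted in the text


def sstl_io_current_ma(io_count: int, ends: int = 2) -> int:
    """Total I/O current in mA; ends=2 counts both the chip and DRAM sides."""
    return io_count * SSTL_MA_PER_IO * ends


for io_count in (16, 32):
    total = sstl_io_current_ma(io_count)
    print(f"{io_count} SSTL I/Os: roughly {total}mA across both ends")
```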

Data widths

PC DRAM may be a poor fit in a consumer system because of the data widths in which it is available, specifically the lack of 32bit-wide parts in a single package. The objective of PC DRAM is often to build the highest density DIMM possible. In that case, placing 8bit-wide or even 4bit-wide DRAM chips on the DIMM is advantageous in building a high density DIMM because it allows eight or 16 chips to be placed in parallel on a 64bit memory data bus. In CE devices, 32bit-wide memories are often needed because of the interface speed required, because they are interfacing a DDR 32bit bus to a single-data-rate 64bit on-chip bus, or because both these requirements must be met. Unfortunately for the CE designer, 32bit PC DRAM is not within the specification and thus is generally unavailable.

Burst lengths

Each successive generation of PC DRAM has increased the amount of memory that must be fetched with each read transaction. SDR allowed single memory data word access, DDR1 fetches at least two words, DDR2 fetches four and DDR3 fetches eight (excepting DDR3’s BC4 mode which we do not discuss here). The DDR1, 2 and 3 fetch modes are known as 2n, 4n and 8n prefetch respectively.

In practical terms, this means that for the 32bit memory bus commonly found in CE devices, SDR fetches four bytes at a time, DDR1 fetches eight, DDR2 fetches 16, and DDR3 fetches 32. This is acceptable assuming that the CPU cache is 32byte or less, and/or that the memory traffic falls neatly into the fetch size of the memory chosen.

But two problems can occur. If we go to a 64bit memory bus – as happens in some consumer devices – then the DDR3 8n prefetch size of 64byte exceeds the 32byte cache line size of many CPUs. The other problem concerns video compression/decompression applications that require very short transfers.
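The fetch sizes discussed above follow from one multiplication: prefetch depth times bus width. The Python sketch below tabulates them against a 32byte cache line. The names are hypothetical, and the cache line size is the example figure used in the text, not a property of any particular CPU.

```python
PREFETCH_WORDS = {"SDR": 1, "DDR1": 2, "DDR2": 4, "DDR3": 8}
CACHE_LINE_BYTES = 32  # example cache line size from the text


def burst_bytes(prefetch_words: int, bus_width_bits: int) -> int:
    """Minimum bytes fetched per read: prefetch depth times bus width."""
    return prefetch_words * bus_width_bits // 8


for bus_width in (32, 64):
    for family, prefetch in PREFETCH_WORDS.items():
        size = burst_bytes(prefetch, bus_width)
        verdict = "exceeds" if size > CACHE_LINE_BYTES else "fits within"
        print(f"{family} on a {bus_width}bit bus: {size}byte burst "
              f"{verdict} a {CACHE_LINE_BYTES}byte cache line")
```

Any combination reported as exceeding the cache line fetches data that the CPU did not ask for, which wastes bandwidth unless the extra bytes are used.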

To solve burst length problems, one answer is to consider a device with a smaller prefetch. DDR1, LPDDR1 and GDDR1 all offer a 2n prefetch; DDR2, GDDR2, GDDR3 and LPDDR2 all offer 4n. Another option is to use a faster, narrower memory bus. For example, where a 333MHz 64bit DDR3 bus might be a problem, a 667MHz 32bit DDR3 bus may be acceptable, as each burst on the narrower, faster bus fetches half the amount of data (Figure 2).

Availability

PC DRAM is available from many different suppliers, so even the most rigorous second-, third-, or fourth-sourcing policies can be put in place. However, PC DRAM pricing can be volatile and large shifts in supply or demand affect prices. Consumer DRAM tends to have fewer manufacturers, so while it may not be as easy as PC DRAM to source, the price tends to be more stable.

Depending on the type of device the consumer manufacturer is building, they may have a requirement for a long lifetime availability of the DRAM parts – Denali Software has some customers requesting up to 10-year availability for memories.

Historically, PC DRAMs have been available for many years after a generation has stopped being designed into PCs. The chips are also produced by many manufacturers, so the long-term availability outlook for PC DRAM is generally good.

Consumer DRAM, however, tends to stop being produced when its market drops below a profitable level, and these availability windows can be harder to predict because of the often short lifetimes of high-volume consumer products. However, for the consumer DRAM available today, we can say that the shelf life for GDDR3 has particularly long prospects because the technology is used in the Sony PlayStation 3, the Nintendo Wii, and the Microsoft Xbox 360.

It is also important to remember that any two specifications of DRAM are generally incompatible with each other. For example, DDR2 cannot directly replace DDR1 and DDR3 cannot directly replace DDR2. One way to ensure better memory availability is therefore to design the memory controller, PHY interface, and I/Os to support more than one memory standard.

Even in the PC space, the Intel P35 chipset supports both DDR2 and DDR3, and the first generations of AMD CPUs that support DDR3 will also support DDR2. In the consumer space, some other popular configurations we see are combinations of DDR2 and LPDDR1 or of SDR, LPSDR, DDR1 and LPDDR1.

Designing PC DRAM into a consumer device

Several techniques can be used to help fit a PC DRAM into a consumer device. The price of the PC DRAM is usually right. The main problems are that the density, speed, power usage and data bus widths may be wrong.

Data width is the easiest to solve. Denali has a memory controller that runs at the same frequency as the memory but where the on-chip bus can be 1X, 2X or 4X the width of the memory bus. As an example, for a DDR2-800 part that has a 400MHz clock, that same 400MHz clock would be used for both the controller and the memory. This allows a CPU with a 16, 32 or 64bit on-chip bus to use a 16bit wide PC DRAM. Frequency matching between the memory bus speed and the on-chip bus speed can be handled with various synchronous or asynchronous on-chip bus interfaces provided with the controller. If the density and speed requirements of the CE device match the PC DRAM used, then this is a good approach.

If the speed and density problems are not solved using the strategy outlined above, a slightly more complex solution is to use a controller where the memory runs at twice the speed and half the width of the memory controller. This allows for the on-chip bus to be 2X, 4X or 8X the width of the memory bus. A good example of this might be a system with an on-chip bus that is 64bit wide and runs at 333MHz connected to a memory controller with a synchronous connection to the on-chip bus, but also connected to a 16bit wide DDR3-1333 memory running at 667MHz (1333Mbit per second per pin). This system is well balanced for throughput on both sides and allows the use of a narrow, dense, high-speed memory with a slower, wider on-chip bus.
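The claim that this example is well balanced can be checked by comparing peak throughput on each side. The Python sketch below assumes the 64bit on-chip bus is single data rate (one transfer per 333MHz clock), which the text implies but does not state; the constant and function names are illustrative.

```python
def bus_mbit_s(width_bits: int, mtransfers_per_s: int) -> int:
    """Peak throughput in Mbit/s: bus width times transfers per second."""
    return width_bits * mtransfers_per_s


# On-chip bus: 64bit wide, assumed single data rate at 333MHz.
ON_CHIP_MBIT_S = bus_mbit_s(64, 333)
# Memory bus: 16bit wide DDR3-1333, i.e. 1333Mbit per second per pin.
MEMORY_MBIT_S = bus_mbit_s(16, 1333)

WIDTH_RATIO = 64 // 16  # on-chip bus width relative to the memory bus
MISMATCH = abs(MEMORY_MBIT_S - ON_CHIP_MBIT_S) / ON_CHIP_MBIT_S

print(f"On-chip bus: {ON_CHIP_MBIT_S / 1000:.1f}Gbit/s peak")
print(f"Memory bus:  {MEMORY_MBIT_S / 1000:.1f}Gbit/s peak")
print(f"Width ratio: {WIDTH_RATIO}X, throughput mismatch: {MISMATCH:.2%}")
```

Both sides come out at roughly 21Gbit/s, matching the peak bandwidth in the Figure 2 caption, and the 4X width ratio is one of the 2X, 4X or 8X options described above.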

If this approach still does not meet the speed, power or density goals of the system, then the designer must use Consumer DRAM.

Designing Consumer DRAM into a consumer device

Designing in Consumer DRAM is much like designing in PC DRAM, except there are usually more choices. For example, both the LPDDR and GDDR DRAM families offer many choices of 256Mbit memory density with 32bit data widths optimized for speed or low power. A smaller selection of 128Mbit devices with 32bit data width can also be found.

Before DDR there was SDR (Single Data Rate) DRAM, which typically offered half the bandwidth of DDR. Some SDR devices are still available, including LPSDR in 64Mbit density and 16bit widths, and a few higher-power (3.3V) SDR devices of 16 and 64Mbit. Memory selection in the consumer DRAM area requires careful attention to the design goals, the specification of the system and memory, and the future availability of the chosen DRAM. In general we recommend supporting at least two different technologies, for example, both LPSDR and LPDDR, or GDDR1 and DDR1, or LPDDR and DDR2.

The future

One of the most interesting new developments for the makers of all kinds of CE equipment is the LPDDR2 specification, currently on a fast track to approval at JEDEC. Offering both DRAM and non-volatile variants that can share the same bus, and a very wide range of speeds and densities, LPDDR2 promises to find many applications in both mobile and non-mobile CE products.

In the graphics world, GDDR4 devices are already in production, and GDDR5 is at the specification stage. As the GDDR family becomes increasingly targeted at specific high-speed graphics applications, its usefulness as a general-purpose consumer DRAM may be limited.

In this article we have sought to establish the benefit and viability of using low-cost, high performance DRAM in consumer designs, as well as describing some of the difficulties in selecting different memory types for consumer applications. Denali’s own product line addresses this often confusing marketplace by offering memory controllers to interface to a wide range of memory devices and system applications.