A white paper from Synopsys lays out some of the challenges of making effective use of DDR memory subsystems in SoC designs.
The paper, authored by Tim Kogel, a solution architect at Synopsys, begins by pointing out that as SoCs become more complex, the demands on the memory subsystem from CPUs, GPUs, displays and other blocks are increasing more rapidly than the available bandwidth to the external memory. The demand on the memory subsystem is also shaped by the application that the SoC is running, further reducing its predictability.
Sophisticated memory controllers can be programmed to work around these issues, and the limitations imposed by the physical architecture of SDRAMs, using strategies including reordering memory transactions to banks of memory that have already been pre-charged for access. However, the use of these strategies has to be traded off against meeting any Quality of Service (QoS) guarantees for latency-sensitive traffic – a trade-off that can only be made with an understanding of the application’s needs.
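By way of illustration (this is not code from the paper, and the cycle costs are invented), a minimal Python sketch of that reordering idea: the scheduler prefers requests that target a bank whose row is already open, avoiding the cost of a precharge and activate before the column access.

```python
# Illustrative sketch of open-row-aware request reordering.
# ROW_HIT_COST / ROW_MISS_COST are made-up cycle counts, not real DDR timings.
ROW_HIT_COST = 1    # column access only
ROW_MISS_COST = 3   # precharge + activate + column access

def schedule(requests, open_rows):
    """Pick the next request, favouring row hits over strict FIFO order.

    requests  -- list of (bank, row) tuples, oldest first
    open_rows -- dict mapping bank -> currently open row
    """
    for i, (bank, row) in enumerate(requests):
        if open_rows.get(bank) == row:        # row hit: cheap, serve it now
            return requests.pop(i)
    return requests.pop(0)                    # no hits: fall back to oldest

def run(requests, open_rows):
    """Serve all requests, returning total cost in illustrative cycles."""
    cost = 0
    pending = list(requests)
    while pending:
        bank, row = schedule(pending, open_rows)
        cost += ROW_HIT_COST if open_rows.get(bank) == row else ROW_MISS_COST
        open_rows[bank] = row                 # the accessed row stays open
    return cost

reqs = [(0, 1), (0, 2), (0, 1), (1, 5)]
print(run(reqs, {}))  # 10 cycles; strict in-order service would cost 12
```

Reordering lets the second request to row 1 of bank 0 run as a row hit before the row is closed, which is exactly the kind of gain that has to be weighed against delaying latency-sensitive traffic.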
Kogel’s paper goes on to describe a typical multi-port DDR memory controller, and how it can be configured to balance memory implementation factors such as the choice of DDR memory protocol, device size and speed bin. It also discusses how many memory ports the controller should have: more ports with a parallel interconnect topology give better memory utilisation at the cost of greater use of routing resources, while fewer ports, with initiators multiplexed in the interconnect fabric, use less wiring but run the risk that the multiplexing stages will make poor arbitration decisions that reduce memory utilisation.
The paper also explores the trade-offs in sizing the command arrays and read reorder buffers in the scheduler – making them bigger gives the scheduler more flexibility, at the cost of greater area and power consumption. It also looks at some of the issues involved in meeting QoS requirements, including balancing the bandwidth and latency demands of initiators as varied as cameras, displays and GPUs, and the use of split command queues to manage low- and high-priority memory requests.
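A split command queue can be sketched in a few lines (again an illustration, not the controller's actual design): latency-sensitive commands go into a high-priority queue that is always drained first, while bandwidth-oriented traffic waits in a low-priority queue.

```python
from collections import deque

# Illustrative split command queues: two FIFOs with strict priority between them.
class SplitQueueScheduler:
    def __init__(self):
        self.high = deque()   # latency-sensitive (e.g. CPU) commands
        self.low = deque()    # bandwidth-oriented (e.g. display) commands

    def submit(self, cmd, high_priority=False):
        (self.high if high_priority else self.low).append(cmd)

    def next_command(self):
        """Serve the high-priority queue first; each queue stays in order."""
        if self.high:
            return self.high.popleft()
        if self.low:
            return self.low.popleft()
        return None

s = SplitQueueScheduler()
s.submit("display-read-0")
s.submit("cpu-read-0", high_priority=True)
s.submit("display-read-1")
print(s.next_command())  # cpu-read-0 overtakes the earlier display read
```

A real controller would add safeguards (such as ageing or bandwidth guarantees) so that a steady stream of high-priority commands cannot starve the low-priority queue indefinitely.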
Other issues covered include address mapping strategies, and the mapping of detailed QoS information that can be embedded in the AXI bus protocol to the limited number of priority classes offered by typical memory controllers.
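Both ideas can be made concrete with a small sketch (the field widths and the three-class limit below are assumptions for illustration, not values from the paper): a row/bank/column address mapping that places the bank bits in low-order address bits so linear streams spread across banks, and a function that folds the 4-bit AXI AxQOS value down to the small number of priority classes a typical controller supports.

```python
# Illustrative address mapping: byte address -> (row, bank, column) fields.
# The widths below are assumptions, not taken from any particular device.
COL_BITS, BANK_BITS, ROW_BITS = 10, 3, 14

def map_address(addr):
    """Split a byte address into (row, bank, column) fields."""
    col = addr & ((1 << COL_BITS) - 1)
    bank = (addr >> COL_BITS) & ((1 << BANK_BITS) - 1)
    row = (addr >> (COL_BITS + BANK_BITS)) & ((1 << ROW_BITS) - 1)
    return row, bank, col

def axi_qos_to_class(axqos, num_classes=3):
    """Collapse a 4-bit (0..15) AXI QoS value into num_classes priority levels."""
    return min(axqos * num_classes // 16, num_classes - 1)

print(map_address(0x12345))   # (9, 0, 837)
print(axi_qos_to_class(15))   # 2 - the highest QoS values share the top class
```

The compression in `axi_qos_to_class` is the crux of the mapping problem the paper raises: sixteen AXI levels collapse into a handful of classes, so initiators with distinct QoS values can end up competing in the same class.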
The paper concludes with a comparison of techniques to optimise the performance of DDR memory controllers, including spreadsheet-based analysis, simulation, emulation, FPGA-based prototyping, and virtual prototyping. There’s a more detailed description of virtual prototyping using the Synopsys Platform Architect for Multi Core Optimization environment, which includes a generic traffic generator, a memory subsystem model, and an interconnect model.
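To give a flavour of what a generic traffic generator does (this sketch assumes nothing about Platform Architect's actual API), the snippet below emits a stream of timed read/write transactions from a simple statistical profile – the kind of synthetic workload used to exercise a memory subsystem model before real software exists.

```python
import random

def traffic(seed, n, read_ratio=0.7, burst_bytes=64, period_ns=10):
    """Yield n synthetic transactions as (time_ns, kind, address, bytes).

    read_ratio, burst_bytes and period_ns are illustrative profile
    parameters, not fields of any real tool's configuration.
    """
    rng = random.Random(seed)   # seeded for reproducible experiments
    t = 0
    for _ in range(n):
        t += rng.randint(1, period_ns)                 # random inter-arrival gap
        kind = "R" if rng.random() < read_ratio else "W"
        addr = rng.randrange(0, 1 << 30, burst_bytes)  # burst-aligned, 1 GiB range
        yield (t, kind, addr, burst_bytes)

for txn in traffic(seed=1, n=3):
    print(txn)
```

Feeding streams like this, with different rates and burst sizes per initiator, into a memory subsystem model is what lets an architect compare controller configurations before committing to one.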
There’s more detail of how the environment is set up and used to explore the architecture and optimisation of a DDR memory subsystem and its controller here.