Keeping up with the bandwidth demands of embedded displays
Increasing resolutions and rising frame rates are making it more challenging than ever to drive embedded displays effectively.
Mobile devices such as phones and tablets now include 4K displays and cameras. Augmented reality (AR) and virtual reality (VR) applications demand even higher resolutions and frame rates to provide a realistic visual experience when the displays are mounted so close to the eye. And in automotive applications, dashboard displays are having to deliver more information than ever as they start to include autonomous driving overlays, external camera feeds, driver assistance features and infotainment.
The increase in the number of pixels to be processed and pushed to the displays is making heavy demands on embedded processors and display interfaces. AR and VR applications, for example, may need to be able to drive two 2k x 2k pixel displays, at frame rates of 90frame/s or more – from a thermally limited headset power budget of around 2W.
Synopsys and Arm have collaborated to provide an interoperable display processor unit (DPU) and MIPI DSI IP solution that enables 4K embedded displays for smartphones and AR/VR devices.
Building a better display processor
Arm provides Mali graphics and display processor IP to tackle these challenges. The latest additions to the Mali family are the Arm Mali-D71, Arm Assertive Display 5 and Arm CoreLink MMU-600, optimised for delivering 4K120 performance for next-generation premium mobile devices.
Arm’s Mali-D71 DPU scans out the display pixels from memory, and can do other tasks such as power-efficient composition, alpha blending of multiple layers, high-quality scaling and in-line rotation. It will also handle gamma correction and colour management for the final display. The DPU also has a co-processor interface through which partners can connect their own display processing algorithms. The DPU eventually drives parallel RGB signals out to a DSI controller developed by Synopsys.
Figure 1 Arm Mali-D71 and Synopsys DesignWare MIPI DSI Host Controller IP with DSC encoder (Source: Synopsys)
The Mali-D71 DPU does all these operations with a single pass through memory, so the system doesn’t have to do multiple region writes to memory. This saves power. The processor also uses a lossless frame-buffer compression technique to reduce system bandwidth.
Supporting hardware includes the MMU and a hardware composer that can take multiple display layers, composite them using the underlying hardware, and then send the result to the display.
The Mali-D71 DPU needs to address the performance requirements of 4K120 displays, so has been given additional composition layers, greater performance, and optimisations to support Android multi-window use cases that need additional scaling capability.
Arm’s DPU has also gained native support for command-mode panels that talk to DSI. Arm has also looked at ways of improving the way displays handle ultra-high-definition (UHD) and high dynamic range (HDR) content, for example by mapping HDR10 content to a standard display.
The Mali-D71 has two display pipelines, each of which can drive a display. Communication between the composition units and the display output units in the DPU makes it possible to dedicate the two display pipelines to drive a single display at double the performance. The architecture is also structured to double the number of composition layers available to eight when driving one display.
The main components of the DPU are the layer processing unit, the composition unit, and the display output unit. The layer processing unit is responsible for fetching a layer from memory in an efficient way and then feeding it to the composition unit, which composes the scene, does any scaling necessary, and then sends the data to the display output unit. This handles any final adjustments to match the data to the target panel’s characteristics, and then sends it on to the Synopsys DesignWare MIPI DSI Host Controller IP.
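The composition step can be illustrated with a minimal sketch of back-to-front alpha blending for a single pixel. This is not Arm's implementation: the function names, the use of straight (non-premultiplied) alpha and the floating-point arithmetic are all assumptions made for illustration.

```python
def blend_over(dst, src, alpha):
    """Blend one RGB pixel over another.

    Assumes straight (non-premultiplied) alpha in the range [0, 1].
    """
    return tuple(round(alpha * s + (1 - alpha) * d) for s, d in zip(src, dst))


def compose(layers, background=(0, 0, 0)):
    """Composite (rgb, alpha) layers for one pixel, bottom layer first."""
    out = background
    for rgb, alpha in layers:
        out = blend_over(out, rgb, alpha)
    return out


# An opaque red layer with a 25% opaque blue layer on top of it.
pixel = compose([((200, 0, 0), 1.0), ((0, 0, 200), 0.25)])
print(pixel)
```

A hardware composition unit would apply the same kind of per-pixel operation to every layer as it streams through, rather than writing intermediate results back to memory.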
Supporting hardware includes a block called the AFBC DMA unit, which is dedicated to easing image rotation.
The processor has other optimisations. For example, if it receives a 4K layer and two scaling engines are available, it can take the layer, split it in two and scale them in parallel. This is important because it enables the operating clock frequency to be reduced and so saves power.
The other optimisation is side-by-side processing, in which the processor uses the resources of both display pipelines but only drives one display. Effectively, a frame is split into two, the two halves are processed in parallel, any overlap between them is resolved, and then the resultant frame is sent to the single display. Again, this means the operating frequency can be reduced, allowing the processor’s supply voltage to be reduced to further cut power.
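The idea of splitting a frame and resolving the overlap can be sketched on a single row of pixels. The example below is an illustration only, not the Mali-D71 algorithm: a 3-tap box filter stands in for the real scaling, and the one-pixel overlap (`halo`) is an assumed value chosen to match that filter.

```python
def box3(row):
    """3-tap horizontal box filter with edge clamping (stand-in for real scaling)."""
    n = len(row)
    return [(row[max(i - 1, 0)] + row[i] + row[min(i + 1, n - 1)]) // 3
            for i in range(n)]


def side_by_side(row, halo=1):
    """Filter a row as two halves that could be processed in parallel.

    Each half is extended by `halo` pixels across the split so the filter
    sees its true neighbours; the extra outputs are then discarded.
    """
    mid = len(row) // 2
    left_out = box3(row[:mid + halo])[:mid]     # drop the overlap outputs
    right_out = box3(row[mid - halo:])[halo:]   # drop the overlap outputs
    return left_out + right_out


row = [0, 30, 60, 90, 120, 150, 180, 210]
# The split result matches processing the whole row in one pass.
print(side_by_side(row) == box3(row))
```

Because each half handles roughly half the pixels per line, each pipeline can in principle run at about half the pixel rate, which is where the frequency and voltage savings described above come from.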
The display output unit and the DPU together perform image-processing functions, such as gamma correction, dithering for any banding in the scene, and RGB to YUV conversions. They can also perform a light-splitting function on the output, so each frame can be split in two but with shared timing.
Building a better MIPI DSI host controller
The MIPI Alliance’s Display Serial Interface (DSI) is a high-speed serial interface between a host processor and a display module. This standard has been enhanced by MIPI’s adoption of Display Stream Compression (DSC), a visually lossless compression standard proposed by the Video Electronics Standards Association (VESA). DSC has a constant output bit rate, will output 8, 10 or 12bit signals and supports both YUV and RGB signalling.
The ability to compress display data is increasingly important. For example, a typical smartphone that requires WQHD+ resolution would demand that the Mali-D71 DPU drive 6.3Gbit/s on its RGB interface (given 8bit/component RGB data and a 60Hz refresh rate).
Synopsys’ DesignWare MIPI DSI Controller IP can look at this data stream, see that it can be optimised to be sent over four channels instead of eight, and put it through a DSC encoder to halve the overall bandwidth. Synopsys offers the DesignWare DSI Controller with a VESA DSC encoder. The figure below shows an example of a WQHD display at 60Hz with 24 bits per pixel (bpp), which needs 6.3Gbit/s to transfer data from the application processor to the display device. In this example, one link of one display output unit connects to one MIPI DSI host controller, which then connects to the display device using a 4-lane MIPI D-PHY. The DSI host controller with VESA DSC encoder supports visually lossless compression by a factor of two or three; with 2x compression, the required bandwidth falls to 3.16Gbit/s. The MIPI DSI controller also takes care of generating the packets according to the DSI specification.
Figure 2 Simplified diagram of a WQHD resolution display in a smartphone application (Source: Synopsys)
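The arithmetic behind these figures can be sketched as follows. The helper names are hypothetical, and the calculation counts visible pixels only, taking WQHD as 2560 x 1440. That gives a lower number than the 6.3Gbit/s quoted above; the article does not break down the difference, which presumably reflects blanking intervals, link overhead or the taller WQHD+ panel.

```python
def active_bandwidth_gbps(width, height, bits_per_pixel, refresh_hz):
    """Bandwidth of the visible pixels only, in Gbit/s (1 Gbit = 1e9 bits).

    Ignores blanking and protocol overhead (a simplifying assumption).
    """
    return width * height * bits_per_pixel * refresh_hz / 1e9


def compressed_gbps(raw_gbps, dsc_ratio):
    """Bandwidth after constant-ratio compression such as DSC at 2x or 3x."""
    return raw_gbps / dsc_ratio


wqhd = active_bandwidth_gbps(2560, 1440, 24, 60)
print(f"WQHD active pixels only: {wqhd:.2f} Gbit/s")

quoted = 6.3  # Gbit/s, the rounded figure quoted in the article
for ratio in (2, 3):
    print(f"{quoted} Gbit/s with {ratio}x DSC: "
          f"{compressed_gbps(quoted, ratio):.2f} Gbit/s")
```

Starting from the rounded 6.3Gbit/s figure, 2x compression gives 3.15Gbit/s, in line with the 3.16Gbit/s quoted above once rounding of the input is allowed for.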
For a VR or AR application, the head-mounted display may require 2K by 2K pixels per eye, at refresh rates of at least 90Hz to overcome the latency implications of VR. Supporting this demands an aggregate bandwidth of 26.6Gbit/s. Handling this means splitting the work between the two display pipelines of the DPU, and then sending the resultant data down two paths of Synopsys’ DesignWare MIPI DSI Controller. DSC encoding can help reduce the bandwidth here as well, by a factor of three.
Figure 3 Simplified diagram of a 4K resolution display in a smartphone or AR/VR application (Source: Synopsys)
Each of the DSI streams is now carrying 4.4Gbit/s to the PHY, which carries this bandwidth on four data lanes for each stream. At the display driver, the system needs to receive the two packets, do the DSC decoding to retrieve the data, and then synchronise the two halves of each frame so they can be displayed in sync on the target panel.
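The per-stream figure follows from the numbers already given: the 26.6Gbit/s aggregate is compressed by a factor of three and shared between two DSI streams. The short sketch below works this through, along with the resulting average rate per lane; the function name is hypothetical and an even split between streams and lanes is assumed.

```python
def per_stream_gbps(aggregate_gbps, dsc_ratio, streams):
    """Bandwidth carried by each DSI stream after compression.

    Assumes the compressed data is shared evenly between the streams.
    """
    return aggregate_gbps / dsc_ratio / streams


stream = per_stream_gbps(26.6, dsc_ratio=3, streams=2)
per_lane = stream / 4  # four D-PHY data lanes per stream, as described above

print(f"Per stream: {stream:.2f} Gbit/s")
print(f"Per lane:   {per_lane:.2f} Gbit/s")
```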
Using DSC reduces the number of lanes across which data is driven, saving cost. Using D-PHY (V1.1), it is possible to send WQHD images at 60frame/s on three lanes, rather than using six or eight lanes. The MIPI D-PHY protocol, V1.2, can carry the same amount of data using two lanes. By using DSC, designers will be able to further reduce the number of lanes to two, saving cost, transmission bandwidth, and power.
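A rough way to estimate lane counts is to divide the required bandwidth by the per-lane data rate and round up. The sketch below assumes nominal maximum rates of 1.5Gbit/s per lane for D-PHY V1.1 and 2.5Gbit/s per lane for V1.2; these values are not taken from the article. It also ignores packet overhead and design margin, so its results are indicative only and need not match the lane counts quoted above.

```python
import math

# Assumed nominal maximum per-lane rates in Gbit/s (not stated in the article).
LANE_RATE_GBPS = {"D-PHY V1.1": 1.5, "D-PHY V1.2": 2.5}


def lanes_needed(bandwidth_gbps, lane_rate_gbps):
    """Smallest whole number of lanes that can carry the bandwidth.

    Ignores protocol overhead and headroom (a simplifying assumption).
    """
    return math.ceil(bandwidth_gbps / lane_rate_gbps)


raw = 6.3  # Gbit/s, the WQHD 60Hz figure quoted earlier in the article
for phy, rate in LANE_RATE_GBPS.items():
    for ratio in (1, 2, 3):  # 1 means no DSC
        lanes = lanes_needed(raw / ratio, rate)
        print(f"{phy}, {ratio}x: {lanes} lane(s)")
```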
Synopsys’ DesignWare MIPI DSI Host Controller, with VESA DSC encoder, runs in a number of different modes.
The DesignWare MIPI DSI Host Controller IP with DSC encoder is configurable from one to four lanes, enabling aggregate bandwidths of up to 30Gbit/s. The host controller is also useful in situations, such as automotive displays, where the target display panel has limited switching speed. In these situations, using DSC can keep data rates below 1Gbit/s, helping to overcome this limitation.
The host controller IP can be coupled with the DesignWare MIPI D-PHY IP to create a complete, interoperable solution for integration into application processors.
Authors
Hezi Saar, staff product marketing manager, Synopsys.
Vassilis Androutsopoulos, senior product manager, Arm.