Five companies have come together to launch a foundation intended to develop standards for software that can run on a wide variety of computers based on a heterogeneous mix of processors.
The founding members of the Heterogeneous System Architecture (HSA) Foundation are AMD, ARM, Imagination Technologies, MediaTek, and Texas Instruments. The companies said they will work together to drive a single architecture specification and provide a simplified programming model to make it easier to write code that targets general-purpose processors and graphics processing units (GPUs) at the same time. Because the first standard to be brought under the HSA umbrella is AMD’s approach to I/O memory management, support for virtualized environments will form part of the portfolio of specifications that the group develops.
It’s unclear how the HSA Foundation will work with the Khronos Group, which hosts the OpenCL standard for portable GPU code, but the foundation is promoting the use of OpenCL as part of its efforts. At DATE, Chris Schlaeger, director of the operating system research center at AMD, said the company is extending OpenCL, and with it the design of GPUs, to let them handle more general-purpose code. The plan is to let CPUs and GPUs share data directly using standard C/C++ pointers, and to find ways to allow a GPU to timeslice between applications – something that is not easily done today. AMD is still working out how to context-switch a GPU with 20,000 registers.
However, based on the information available right now, it appears that the programming model will extend beyond OpenCL, which is a relatively high-level standard with limited opportunities for optimization at runtime. HSA will work on a ‘virtual instruction set architecture’, not dissimilar conceptually to the virtual instruction sets employed by Java and Microsoft’s Common Language Runtime, which provides the infrastructure for languages such as C# and F#. The intention, presumably, is for runtime engines to distribute code written to the ISA to actual hardware engines. It will be interesting to see the architecture for this and how it will handle parallelizable operations: vector instructions, anyone?
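To make the idea concrete, here is a minimal sketch of how a runtime might take code written to a virtual instruction set and hand it to one of several execution engines. Everything in it is an assumption for illustration: the opcodes, the `run_scalar` and `run_vector` backends, and the size-based `dispatch` rule are hypothetical and are not drawn from anything the HSA Foundation has published.

```python
import operator

# Hypothetical virtual instructions: (opcode, destination, source_a, source_b).
# These names are illustrative only and do not come from any HSA specification.
OPS = {"add": operator.add, "mul": operator.mul}

PROGRAM = [
    ("mul", "t", "x", "y"),    # t   = x * y, element by element
    ("add", "out", "t", "z"),  # out = t + z, element by element
]


def run_scalar(program, buffers):
    """Stand-in for a general-purpose core: walks one element at a time."""
    for op, dst, a, b in program:
        fn = OPS[op]
        result = []
        for lhs, rhs in zip(buffers[a], buffers[b]):
            result.append(fn(lhs, rhs))
        buffers[dst] = result
    return buffers


def run_vector(program, buffers):
    """Stand-in for a data-parallel engine: one instruction covers a whole buffer."""
    for op, dst, a, b in program:
        buffers[dst] = list(map(OPS[op], buffers[a], buffers[b]))
    return buffers


def dispatch(program, buffers, vector_threshold=4):
    """Pick a backend at runtime; the size threshold is an arbitrary assumption."""
    longest = max(len(values) for values in buffers.values())
    backend = run_vector if longest >= vector_threshold else run_scalar
    # Work on a copy so the caller's buffers are left untouched.
    return backend.__name__, backend(program, dict(buffers))


if __name__ == "__main__":
    chosen, result = dispatch(PROGRAM, {"x": [1, 2, 3, 4], "y": [5, 6, 7, 8], "z": [1, 1, 1, 1]})
    print(chosen, result["out"])
```

The point of the sketch is only that the same program text yields the same result on either engine, with the choice deferred until the runtime can see the workload. A real runtime would presumably weigh far more than buffer length.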