IP supplier FotoNation has decided to embrace the use of high-level synthesis (HLS) in the creation of cores for smartphones and other high-integration, low-power systems.
For more than a decade, the company has worked with a variety of algorithms that include techniques such as histogram of oriented gradients (HOG). But, in recent years, the engineering team has focused increasingly on algorithms that exploit neural networks. Corneliu Zaharia, vice president of VLSI at FotoNation, says: “We have been doing detection since 2007. In hardware, at that time, the use of neural networks was not an option. But the quality of the neural network is not matched by any traditional algorithm, especially on object detection. Now we have the power to use these algorithms.”
Although the compute power is now available to run neural-network algorithms, power efficiency remains a concern in smartphone and similar designs, making custom hardware a necessity. Potentially, high-level synthesis could improve the turnaround time of image-processing cores, particularly when it comes to customising cores for individual customers. But FotoNation wanted to be sure that adopting HLS would not force an unwanted tradeoff in compute efficiency or prove difficult to fit into its design flow.
“We have a working RTL flow. If something is working, why change?” Zaharia asks. “But we see design cycles are much faster now, especially in machine learning. New types of neural network appear every day. An algorithm we start with today could be obsolete in two years. We need to speed up the prototyping process. Also, most of our algorithms are hardware and software.
“We need a better way to do codesign and we on the hardware side would like to communicate with our software colleagues better. We think C-model simulation will help us a lot. And if we have a C model, maybe we don’t need to spend as much time as we do by building the RTL by hand from the model.”
High-level synthesis, in principle, provides the ability to construct IP cores rapidly from C models. To test whether that would be the case, FotoNation embarked on a two-phase trial of the Catapult HLS Platform provided by Mentor, a Siemens business. The team first built a test project based around a number of common image-processing tasks: resampling, kernel-based operations and gamma correction.
“We could do multiple pixels per clock easily. Doing that in RTL involves some complexity in redesigning the RTL. From C, we just have to tell the tool that we need a more parallel approach. We completed this around four times faster than the average RTL-based design and the quality of the source code it generated was good. It was readable: five years from now I would be able to look at the code and maintain it,” Zaharia says.
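The change Zaharia describes, going from one to several pixels per clock, can be sketched in C. Everything specific here is an assumption for illustration: the pragma spelling, the unroll factor and the toy gain operation are not FotoNation's design, and in practice the parallelism would be set through the tool. A standard C compiler simply ignores the unknown pragma, so the model still simulates as ordinary software.

```c
#include <stdint.h>

#define PIX_PAR 4  /* pixels processed per call; an assumed figure */

/* Illustrative multi-pixel kernel: fully unrolling the inner loop
 * asks the HLS tool to instantiate PIX_PAR parallel datapaths,
 * rather than redesigning the RTL by hand. */
void scale_pixels(const uint8_t in[PIX_PAR], uint8_t out[PIX_PAR],
                  uint8_t gain_num, uint8_t gain_den)
{
#pragma hls_unroll yes
    for (int p = 0; p < PIX_PAR; p++) {
        uint16_t v = (uint16_t)(in[p] * gain_num) / gain_den;
        out[p] = v > 255 ? 255 : (uint8_t)v;
    }
}
```

The point of the example is the workflow: widening the datapath is a directive change on unchanged source, not a structural rewrite.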
“There were no issues with clock frequency, but the area was a bit bigger [than one hand-coded in RTL], around 15 per cent bigger. There is maybe some room for improvement there. The results were quite different from block to block.”
After that trial, FotoNation started a full project with HLS, one that would perform face recognition based on neural networks. “The first challenge was how to implement the face detection on an FPGA. We had a first FPGA in about two weeks but we didn’t pay attention to area. The design occupied the whole FPGA, which is good for fast prototyping but not good for production.”
Work on the microarchitecture, such as the use of memories, helped improve the overall area efficiency of the generated IP. During this evaluation, FotoNation found that the people best placed to write the C code used for generation remained the hardware engineers, rather than architects or software developers, as they had a better understanding of how code changes would affect resource usage. With HLS in place, Zaharia says: “We can do prototyping very quickly. It’s very good for go/no-go on a project. We think HLS will help us perform design exploration faster, but we do still need to think about microarchitecture properly.”
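The memory work mentioned above commonly takes the form of structures such as line buffers, which hold a sliding window of rows on chip so a 2-D kernel does not re-read the frame. The following is a minimal sketch under assumed parameters (fixed line width, 8-bit pixels, a 3x3 window), not a description of FotoNation's implementation.

```c
#include <stdint.h>

#define WIDTH 640  /* assumed line width */

/* Illustrative line buffer: the two previous rows are kept in
 * small on-chip memories, so a 3x3 window consumes only one new
 * pixel per cycle from the stream instead of three row reads. */
typedef struct {
    uint8_t row0[WIDTH];  /* oldest row   */
    uint8_t row1[WIDTH];  /* previous row */
} line_buffer;

/* Push one pixel of the current row at column x and return the
 * 3x1 column of the window that column x contributes. */
void lb_push(line_buffer *lb, int x, uint8_t pixel, uint8_t col[3])
{
    col[0] = lb->row0[x];
    col[1] = lb->row1[x];
    col[2] = pixel;
    /* age the column: previous row becomes oldest,
     * current pixel becomes previous */
    lb->row0[x] = lb->row1[x];
    lb->row1[x] = pixel;
}
```

Mapping the two row arrays to on-chip RAM rather than registers is exactly the kind of microarchitectural decision the evaluation found still needs a hardware engineer's judgement.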