Debugging with virtual prototypes – Part Four

By Achim Nohl | Posted: February 4, 2014
Topics/Categories: Embedded - Architecture & Design, Integration & Debug, EDA - Verification | Tags: big.LITTLE, hypervizor, low power, multicore, software debug, virtual prototype, virtualization | Organizations: ARM, Synopsys

The fourth installment discusses the extra levels of debug capability available when using virtual prototypes through the example of an ARM big.LITTLE-based embedded system.

The other parts of this series are available as follows, though for this particular article we would strongly recommend that you first review Part Three, if you have not had the chance to do so already. It introduces some of the virtual prototyping concepts that apply to multicore designs and this type of asymmetric multicore architecture specifically.

To read Part One, which provides an introduction to core techniques for virtual prototypes, click here.

To read Part Two, which illustrates virtual prototype tools and techniques using the example of Linux bring-up on an ARM-based SoC, click here.

To read Part Three, as noted above, click here.

Debugging an ARM big.LITTLE system using virtual prototyping

This case study focuses on issues surrounding the hypervisor under a ‘Task Migration’ use model for a big.LITTLE scenario.

ARM introduced big.LITTLE processing in 2011. It is a heterogeneous architecture which seeks to create multicore processors that, in essence, adapt to the performance or power priorities of the tasks running on the final system at different times.

As such, its original structure combined a Cortex-A15 core, capable of handling heavier processing loads, with a more energy-efficient but less powerful Cortex-A7 core, which kicks in when a system is carrying out less onerous tasks.

The architecture is now coming to market in some of its first silicon implementations across a number of applications. Others will follow. It was supplemented in 2012 by a version of big.LITTLE processing that pairs the ARMv8-architecture Cortex-A53 and Cortex-A57 cores in a similarly compatible combination.

For the purposes of this article, the original Cortex-A15/Cortex-A7 combination is used.

big.LITTLE task migration use model

ARM supports a variety of use models for big.LITTLE processing. The most prominent are known as “Task Migration”, “CPU migration” and “Heterogeneous MultiProcessing” (HMP). We focus here on Task Migration, the use model that was supported first. Even though today’s big.LITTLE processing systems employ HMP, Task Migration is still relevant for illustrating the debugging concepts enabled by virtual prototypes.

The big.LITTLE Task Migration model enables seamless software migration from one processor cluster to the other, depending on the use context and performance requirements. It sounds like this requires a heavy rewrite or modification of the software stacks (e.g., Linux/Android). In fact, it doesn’t.

Task migration is achieved by having the software run not directly on the hardware, but on top of a new layer. This layer operates in hypervisor mode and performs the task migration without Linux/Android even knowing about it. Nevertheless, it is the Linux/Android power management structure that initiates the migration (Figure 1).

Figure 1 big.LITTLE processing – task migration (Source: Synopsys)

Hypervisors and interrupts

The hypervisor software layer prevents the hardware and the payload software from communicating with each other directly. We refer to Linux/Android as the payload because it is the software the user intends to run.

Why is the hypervisor needed? Interrupts are a good example of its utility.

Assume you are running multiple guest OSes on a system. Interrupts coming from the hardware could be for any of the OSes. In this scenario, a hypervisor will first intercept the interrupt and then decide to which guest OS it needs to be addressed.
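
As a minimal sketch of that routing decision, the following Python model assumes a static table that maps interrupt IDs to guest owners. The class name, method names, guest names and interrupt numbers are all hypothetical and are not taken from any real hypervisor.

```python
from typing import Dict, List, Optional


class InterruptRouter:
    """Toy model of a hypervisor deciding which guest OS owns an interrupt."""

    def __init__(self) -> None:
        self._owner: Dict[int, str] = {}            # interrupt ID -> guest name
        self.delivered: Dict[str, List[int]] = {}   # guest name -> IRQs received

    def assign(self, irq: int, guest: str) -> None:
        """Record that `guest` owns interrupt `irq` (hypothetical setup step)."""
        self._owner[irq] = guest
        self.delivered.setdefault(guest, [])

    def route(self, irq: int) -> Optional[str]:
        """Intercept an interrupt and forward it to its owner, if any."""
        guest = self._owner.get(irq)
        if guest is not None:
            self.delivered[guest].append(irq)
        return guest


router = InterruptRouter()
router.assign(34, "guest_linux")
router.assign(35, "guest_rtos")

first = router.route(35)     # owned by guest_rtos
unknown = router.route(99)   # no owner registered, so nothing is delivered
```

A real hypervisor would make this decision in its exception handler rather than in a lookup class, but the ownership question it answers is the same.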

But for big.LITTLE processing, things work the other way around. Multiple processor clusters share the same interrupt controller. The hypervisor ensures the transparency of the clusters for the OS and does the task migration. But the hypervisor needs its own interrupts and should not interfere with the OS.

Hardware-supported interrupt virtualization

To enable interrupt trapping in the hypervisor, ARM provides specific hardware support in the processors. In the processor’s hypervisor mode, a higher privileged exception vector enables the trapping of interrupts before the OS can react. When an interrupt comes in, it is the hypervisor that handles it first. If the interrupt is for the OS, the hypervisor configures a virtualized interrupt controller, which is a replica of the real interrupt controller as seen by the OS. The OS will then handle the interrupt as if there were no hypervisor in between. Memory management unit (MMU) virtualization plays an important role here as well.
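
The flow described above can be sketched roughly as follows. This is a simplified behavioral model, not ARM's hardware interface: the classes `VirtualInterruptController` and `Hypervisor`, their methods, and the interrupt numbers are assumptions made for illustration only.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass
class VirtualInterruptController:
    """Stand-in for the replica interrupt controller that the OS sees."""
    pending: List[int] = field(default_factory=list)

    def acknowledge(self) -> Optional[int]:
        # The guest OS reads the next pending interrupt as if from real hardware.
        return self.pending.pop(0) if self.pending else None


@dataclass
class Hypervisor:
    """Traps every physical interrupt before the OS can react."""
    own_irqs: Set[int]                      # interrupts the hypervisor keeps
    vic: VirtualInterruptController
    handled_locally: List[int] = field(default_factory=list)

    def on_physical_irq(self, irq: int) -> str:
        if irq in self.own_irqs:
            # Hypervisor-private interrupt: the OS never sees it.
            self.handled_locally.append(irq)
            return "hypervisor"
        # Otherwise mark it pending in the virtual controller for the OS.
        self.vic.pending.append(irq)
        return "guest"


vic = VirtualInterruptController()
hyp = Hypervisor(own_irqs={26}, vic=vic)

targets = [hyp.on_physical_irq(irq) for irq in (26, 42)]
guest_sees = vic.acknowledge()   # the OS only ever observes interrupt 42
```

The point of the split is that the OS interacts only with the virtual controller, so its interrupt handling code needs no knowledge of the hypervisor.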

In this configuration, an interrupt has to go quite a way before it arrives at the user’s application – all the way from the hardware, through the hypervisor and into the OS. This is where virtual prototypes become extremely useful in debugging such a lengthy chain, because all aspects of the system can be observed and traced (Figure 2).

Figure 2 Hypervisor and Linux integration tracing (Source: Synopsys)

We need an overview of what is happening. We need, as discussed in earlier installments, the bird’s eye view offered by virtual prototypes to assess where things go wrong. Where does the interrupt stop? Is it stuck in the hypervisor? Is it not arriving at the interrupt controller? The VP ensures that different layers of software and even hardware can be traced and debugged at any time to spot software integration bugs. The developer is aware of which CPU is active. He or she can identify when the actual interrupt arrived from hardware and trace through the many software layers within the system from the hypervisor up to Android. Virtual prototype tracing and debugging is aware of the hypervisor layer because it tracks the mode that is exposed from the underlying CPU models (in this case, the ARM Fast Models).
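
To make the question "where does the interrupt stop?" concrete, here is a small sketch that scans a list of trace events and reports the first layer an interrupt never reached. The event format, the layer names and the function `first_missing_layer` are hypothetical; a real virtual prototype exposes its trace data through its own tooling.

```python
from typing import List, NamedTuple, Optional

# Assumed order in which an interrupt travels through the system.
CHAIN = ["hardware", "interrupt_controller", "hypervisor", "os_kernel", "application"]


class TraceEvent(NamedTuple):
    time_ns: int
    layer: str
    irq: int


def first_missing_layer(trace: List[TraceEvent], irq: int) -> Optional[str]:
    """Return the first layer in CHAIN that never saw `irq`, or None if all did."""
    seen = {event.layer for event in trace if event.irq == irq}
    for layer in CHAIN:
        if layer not in seen:
            return layer
    return None


trace = [
    TraceEvent(100, "hardware", 42),
    TraceEvent(120, "interrupt_controller", 42),
    TraceEvent(180, "hypervisor", 42),
    # Interrupt 42 is never observed in the OS kernel in this trace.
    TraceEvent(200, "hardware", 43),
    TraceEvent(210, "interrupt_controller", 43),
    TraceEvent(260, "hypervisor", 43),
    TraceEvent(300, "os_kernel", 43),
    TraceEvent(350, "application", 43),
]

stuck_at = first_missing_layer(trace, 42)
complete = first_missing_layer(trace, 43)
```

In this made-up trace, interrupt 42 reaches the hypervisor but not the kernel, which points the investigation at the hypervisor's forwarding path.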

Furthermore, debugging services are always available. Their function does not depend on any embedded software daemons. This is very useful for debugging the interaction between the hypervisor and the Linux kernel. You simply attach one debugger to the hypervisor and another to the Linux kernel. This is a very powerful option when debugging task migration. Even during phases where one CPU is powering down and the other is powering up, debugging and tracing are possible without limitation, as shown earlier in Figure 2.

This is a delicate phase as the entire context of one CPU is saved and subsequently restored by the other CPU. This can entail obscure and hard-to-find defects (e.g., if the context saving is incomplete because the software forgets to save the secure mode registers). To catch such defects, both sides of the handover can be observed at once: Figure 3 shows one instance of the debugger attached to the Cortex-A15 cluster, which is powering down, while a second instance is connected to the Cortex-A7, which is powering up at the same time.
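
One way to picture the check that two attached debuggers make possible is a comparison of register snapshots taken on each side of the handover. The sketch below assumes snapshots are available as nested dictionaries; the bank names, register names and values are illustrative and not a complete ARM register list.

```python
from typing import Dict, List

# bank name -> register name -> value
Context = Dict[str, Dict[str, int]]


def context_mismatches(saved_from: Context, restored_to: Context) -> List[str]:
    """List every register of the outbound CPU that the inbound CPU lacks or differs on."""
    problems: List[str] = []
    for bank, regs in sorted(saved_from.items()):
        restored = restored_to.get(bank, {})
        for name, value in sorted(regs.items()):
            if name not in restored:
                problems.append(f"{bank}.{name}: not restored")
            elif restored[name] != value:
                problems.append(
                    f"{bank}.{name}: 0x{value:08x} != 0x{restored[name]:08x}"
                )
    return problems


# Snapshot of the cluster that is powering down (hypothetical values).
outbound = {
    "non_secure": {"sp": 0x8000, "lr": 0x1234},
    "secure": {"sp": 0x9000},
}
# Snapshot of the cluster that is powering up; the secure bank was forgotten.
inbound = {
    "non_secure": {"sp": 0x8000, "lr": 0x1234},
}

report = context_mismatches(outbound, inbound)
```

Here the report flags the secure-mode stack pointer as missing, which is the kind of incomplete context save described above.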

Figure 3 Multi-cluster debugging (Source: Synopsys)

Conclusion

Some readers have already used big.LITTLE-based SoCs and will have encountered these challenges without having access to virtual prototyping technologies. Others may soon encounter them.

As noted in earlier articles, the purpose of these design and debug examples is to demonstrate the extra capabilities that virtual prototyping brings to an engineer’s toolbox. They are seen here in the light of a technology that continues to gain adoption.

In particular, the bird’s eye view that virtual prototyping offers is very powerful in helping to debug changes that involve architectural shifts and novel uses of technologies such as the hypervisor.

Further virtual prototype information and earlier installments

This article is adapted in part from a white paper entitled “Debugging Embedded Software Using Virtual Prototypes”.

Part One: An introduction to core techniques for virtual prototypes – click here.

Part Two: An illustration of virtual prototype tools and techniques using the example of Linux bring-up on an ARM-based SoC – click here.

Part Three: Focusing on detecting heap memory corruption using software assertions – click here.

Author

Achim Nohl is a solution architect at Synopsys, responsible for virtual prototypes in the context of software development and verification. Achim holds a diploma degree in Electrical Engineering from the Institute for Integrated Signal Processing Systems at the Aachen University of Technology, Germany. Before joining Synopsys, Achim worked in various engineering and marketing roles for LISATek and CoWare. Achim also writes the blog Virtual Prototyping Tales on Embedded.com.

