Overcoming the limitations of data introspection for SystemC

By Christian Genz | Posted: December 1, 2009
Topics/Categories: EDA - ESL | Tags: architecture, SystemC to RTL

The verification, test and debug of SystemC models can be undertaken at an early stage in the design process. To support these techniques, the SystemC Verification Library uses a concept called data introspection. It lets a library routine extract information from SystemC compound types, or from a user-specified composite that is derived from a SystemC type. Unfortunately, data introspection has some limitations, and these become more apparent as a model uses more language features. For example, native C++ data types will not appear in metadata extracted by introspection.

This paper describes a non-intrusive analysis technique that aims to overcome the drawbacks with existing data introspection. It is a hybrid technique based on joining a parser that collects static information with a code generator that evaluates run-time information.

1. Introduction

The system description language SystemC is now widely accepted for use during hardware/software (HW/SW) co-design and has been approved as an IEEE standard. In complementing hardware description languages (HDLs), it furnishes system architects with concepts and techniques, such as object orientation, that were previously only available in software programming languages. Armed with these qualities, SystemC can help drastically shorten project development time by allowing multiple phases to be parallelized, such as test suite generation, synthesis, application programming and the authoring of driver software.

One way in which SystemC architects achieve major gains in productivity is through the use of features it makes available in C++. Such features include polymorphism to support the clever reuse of existing algorithms and data structures, as well as the insertion of arbitrary software libraries within HW/SW co-design prototypes. In addition, many levels of design abstraction are supported, allowing the description of cycle-accurate hardware structures as well as untimed software algorithms.

Source: University of Bremen

FIGURE 1 Applications for analysis

The power and the flexibility of the language underpin its greatest benefits, but these attributes can also have a vast impact on the complexity of SystemC design analysis. Figure 1 shows the application of metadata within the context of such applications as synthesis, verification, debug and visualization.

SystemC supports the extraction of metadata by using the kernel interface at run time. Additionally, the Open SystemC Initiative (OSCI) introduced the SystemC Verification Library (SCV) in December 2002 and this added data introspection to the language’s analysis capabilities. Data introspection uses run-time information to assemble the metadata for a given model, so the metadata lacks whatever information was lost during compilation. We have developed a tool that seeks to overcome such limitations and ensure that key principles are maintained as the metadata is generated. Specifically, we secure the following:

The circuit behavior—that is the operational semantics of the analyzed SystemC program—shall be described in metadata.
Not only SystemC data types but arbitrary data types like user-defined types will appear in analysis results. Without a knowledge of inheritance and member types inside classes, any observed design hierarchy will be incomplete.
All names of declarations (such as variables, data types and functions) shall be known after analysis. Otherwise, the names of ports, modules and signals will not be stored in the metadata or may be ambiguously identified using internal SystemC names (‘sc_object’).

Our approach is fully non-intrusive. No modifications are made to the compiler or to the SystemC library that is needed for elaboration. We alter neither the behavior nor the architecture of the SystemC model. As a result, the proposed methodology is suited to dynamic models whose growth when running may not be foreseeable. This is an important feature since SoC designs tend to contain software partitions that have this characteristic.

2. State extraction

To obtain the kind of complete hierarchy identified as necessary in Section 1, our approach distinguishes between model hierarchies of two different kinds. The first is static and can be derived by parsing. The second is dynamic and has to be examined using run-time information.

To derive the dynamic hierarchy, we extract the start state, which is part of every valid SystemC model that can be simulated. The observed system state is defined by a concrete set of values for all variables declared in the program. These variables in turn define the state space of the model at the beginning of a simulation, when the function sc_start is called.

Architecture

As shown in Figure 2, the extraction methodology is partitioned into four phases. First, syntactical analysis parses a given SystemC program and generates a parse tree that holds the static architecture of the input program. The parser and an additional scanner have been developed for this application and support special features, such as cross probing. Both tools have been implemented using the Purdue Compiler Construction Tool Set (PCCTS) [3].

Source: University of Bremen

FIGURE 2 Architecture of the approach

The parse tree is also called an ‘Abstract Syntax Tree’ (AST) and is used for communication within and across different phases. However, since our elaboration only takes SystemC programs as input, we implemented the inverse function of the parser. The AST-synthesis generates a SystemC program from its parse tree to establish communication between the instrumentation and the elaboration.

To obtain the result of a model’s elaboration in the form of an AST, the model must be annotated beforehand. A set of automatically generated functions is annotated on the model to implement our elaboration. The generation of those functions is realized by the instrumentation.

The outcome of the state extraction is to unroll all the generic nodes inside the AST; that is, to derive concrete values from all the expressions. Compared with standard SystemC data introspection, our technique expands the AST that has previously been computed by the parser, instead of only storing values of variables in the structures of the kernel.

AST-Synthesis

The AST is represented by an acyclic graph. Consisting of six different node types, it offers a way of expressing any arbitrary combination of control flow operation and data dependency. And because all the declarations that are allowed in C++ can be represented within the AST too, the state spaces can also be represented in an AST graph.

Because the semantics of the program are transferred directly to the AST, each manipulation of the AST influences the corresponding SystemC document that results from the AST-synthesis. Thus, each operation in the program can be captured at run time by adding a statement to the AST. The synthesis of the AST is much less complex than the syntactical analysis. While the analysis frequently faces difficulties with ambiguities that arise when it is deriving a parse tree, the generation of a program from its AST is unambiguous.
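The claim that generating a program from its AST is unambiguous can be illustrated with a minimal sketch. It assumes a toy tree with five node kinds; the names `Kind`, `Node`, `make` and `synthesize` are hypothetical, since the article does not list the six node types used by the actual tool.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical node kinds for illustration only; the article does not name
// the six node types used by the real tool.
enum class Kind { Literal, Identifier, Binary, Assign, Block };

struct Node {
    Kind kind;
    std::string text;                          // literal, name, or operator
    std::vector<std::unique_ptr<Node>> kids;   // ordered children
};

// Builds a leaf node.
std::unique_ptr<Node> make(Kind k, std::string text) {
    auto n = std::make_unique<Node>();
    n->kind = k;
    n->text = std::move(text);
    return n;
}

// Builds a node with exactly two children.
std::unique_ptr<Node> make(Kind k, std::string text,
                           std::unique_ptr<Node> lhs,
                           std::unique_ptr<Node> rhs) {
    auto n = make(k, std::move(text));
    n->kids.push_back(std::move(lhs));
    n->kids.push_back(std::move(rhs));
    return n;
}

// Walks the tree and emits source text. Each node kind has exactly one
// printing rule, so no ambiguity can arise in this direction.
std::string synthesize(const Node& n) {
    switch (n.kind) {
    case Kind::Literal:
    case Kind::Identifier:
        return n.text;
    case Kind::Binary:
        assert(n.kids.size() == 2);
        return "(" + synthesize(*n.kids[0]) + " " + n.text + " " +
               synthesize(*n.kids[1]) + ")";
    case Kind::Assign:
        assert(n.kids.size() == 2);
        return synthesize(*n.kids[0]) + " = " + synthesize(*n.kids[1]) + ";";
    case Kind::Block: {
        std::string out = "{ ";
        for (const auto& k : n.kids) out += synthesize(*k) + " ";
        return out + "}";
    }
    }
    return {};
}
```

Inserting an extra child into a `Block` node before calling `synthesize` changes the emitted program accordingly, which is the mechanism the instrumentation relies on.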

To be able to support the cross probing facilities as described above, tokens forming the AST integrate additional information besides line numbers, such as byte positions. Using this information, the generated code of the AST-synthesis is written to files that use the same names as the input files of the syntactical analysis.
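As a rough illustration of why byte positions help with cross probing, the following sketch stores a byte range next to the line number and uses it to recover the original text. The `Token` layout and the helpers `locate` and `probe` are assumptions made for this example, not the tool's actual data structures.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical token record: besides the line number it keeps the byte
// range of the lexeme, so a tool can map a node back to the exact source text.
struct Token {
    std::string file;
    std::size_t line;     // 1-based line number
    std::size_t begin;    // byte offset of the first character
    std::size_t length;   // number of bytes in the lexeme
};

// Finds the first occurrence of `lexeme` in `source` and builds a Token.
// A real scanner would emit tokens while lexing; searching keeps this short.
Token locate(const std::string& file, const std::string& source,
             const std::string& lexeme) {
    const std::size_t pos = source.find(lexeme);
    assert(pos != std::string::npos && "lexeme must occur in the source");
    std::size_t line = 1;
    for (std::size_t i = 0; i < pos; ++i)
        if (source[i] == '\n') ++line;
    return Token{file, line, pos, lexeme.size()};
}

// Cross probing: recover the original text from the stored byte range.
std::string probe(const Token& t, const std::string& source) {
    return source.substr(t.begin, t.length);
}
```

With only a line number, two identifiers on the same line could not be told apart; the byte range removes that ambiguity.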

Instrumentation

The instrumentation of the source code is realized via the generation of a set of functions. These functions will be executed in a binary compiled from the output files of the AST-synthesis. They are called ‘recorder functions’ and each records a state variable after a change in its value. Therefore, the recorder function stores each modified value in the AST of the corresponding variable during elaboration.

All user-defined data types are compounds of primitive types (e.g., ‘int’) and pointers. Hence, only entities that are declared as native types or pointers are going to be recorded when their values change. To store value changes of variables and avoid any unwanted impacts on the AST, the propagation of the value changes must happen directly after computing the respective expression but before adjacent expressions are elaborated.
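A recorder function of this kind might look like the sketch below. It is an assumption about the general shape, not the generated code itself: `record` and `history` are hypothetical names, a map stands in for the AST nodes, and only arithmetic types are handled.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Stand-in for the AST annotation store: variable name -> history of values.
// The real tool writes into AST nodes; a map keeps this sketch self-contained.
using History = std::map<std::string, std::vector<long>>;

History& history() {
    static History h;
    return h;
}

// Hypothetical recorder function for arithmetic types. It is wrapped around
// an assignment expression, so the new value is stored right after that
// expression has been evaluated and before the next statement runs.
// Returning the reference keeps the wrapped expression usable in place.
template <typename T>
T& record(const char* name, T& var) {
    assert(name != nullptr);
    history()[name].push_back(static_cast<long>(var));
    return var;
}
```

Instrumented code would then read `record("valA", valA = 12u);` instead of `valA = 12u;`, which leaves the value of the expression unchanged.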

In cases where the stack frame changes during elaboration (e.g., when passing function calls, statement blocks or overloaded operators), the stack frame of the corresponding AST has to change. Consequently, more recorder functions have to be generated to enlarge or shrink the respective AST when entering or leaving a statement block.
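One way to picture the recorders that enlarge and shrink the AST is a scope guard placed at the top of each block. The sketch assumes a simple vector in place of the AST's frame structure; `ScopeRecorder`, `frames` and `instrumented_calc` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Stand-in for the part of the AST that mirrors the call stack: one entry
// per active function call or statement block.
std::vector<std::string>& frames() {
    static std::vector<std::string> f;
    return f;
}

// Hypothetical scope recorder. The constructor grows the mirrored stack; the
// destructor shrinks it when control leaves the block, also on early return.
class ScopeRecorder {
public:
    explicit ScopeRecorder(std::string label) {
        frames().push_back(std::move(label));
    }
    ~ScopeRecorder() {
        assert(!frames().empty());
        frames().pop_back();
    }
    ScopeRecorder(const ScopeRecorder&) = delete;
    ScopeRecorder& operator=(const ScopeRecorder&) = delete;
};

// Example of instrumented code: returns the nesting depth seen in the loop body.
std::size_t instrumented_calc() {
    ScopeRecorder fn("calc");
    std::size_t depth_in_loop = 0;
    for (int i = 0; i < 1; ++i) {
        ScopeRecorder body("loop-body");
        depth_in_loop = frames().size();
    }
    return depth_in_loop;
}
```

Tying the shrink step to the destructor means that the mirrored stack stays consistent without a matching call at every exit point of a block.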

Source: University of Bremen

FIGURE 3 Hybrid analysis

To be able to expand the AST of a SystemC program during run-time, an inclusion directive is generated inside the AST that includes the source code of our parser. Additionally, the AST is extended by a sequence of instructions that causes the static (syntactical) analysis of the model during elaboration. So before elaboration starts, an exact copy of the AST that was also used as input during the instrumentation phase is handed over to the simulator. A simplified representation of the combination of static and dynamic analysis with the help of instrumentation can be seen in Figure 3.

State elaboration

Instead of elaboration by pure interpretation, as has been done in similar research [2], our approach follows a hybrid strategy. After static analysis has finished, a simulated analysis takes place for the purpose of elaboration. This makes it possible to elaborate expressions without knowing the respective source code. This is important when system calls are used or when a program is linked to external libraries, both common techniques in system design.

However, the state elaboration—and by extension the state space extraction—is limited to variables and functions that are declared inside the analyzed model. Other declarations (e.g., identifiers that have been declared in an external library exclusively) do not appear in the resulting AST. Only known entities that occur in the AST can be annotated with recorder functions. Finally, only nodes that are attached to recorder functions can be expanded to values.
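The restriction can be sketched as a simple partition of the identifiers a model uses. The function and type names here are hypothetical; the sketch only assumes that static analysis yields the set of names declared inside the model.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

struct Partition {
    std::vector<std::string> recordable;  // declared in the model: gets a recorder
    std::vector<std::string> undefined;   // only known externally: stays unexpanded
};

// Splits the identifiers used by a model according to whether the static
// analysis saw their declaration.
Partition partition_identifiers(const std::set<std::string>& declared,
                                const std::vector<std::string>& used) {
    Partition p;
    for (const auto& name : used) {
        assert(!name.empty());
        (declared.count(name) ? p.recordable : p.undefined).push_back(name);
    }
    return p;
}
```

Names that end up in `undefined` correspond to identifiers declared exclusively in an external library, which cannot be expanded to values.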

All the values of the elaborated state variables are written to the AST automatically because the instrumentation code for those variables is compiled automatically too. Hence after elaboration, the AST becomes a tree whose leaves will reflect one of the following:

constant values;
state variables;
undefined functions; or
undefined variables.

Consequently, all control and data operations of the AST are placed between the root and the leaves. Variables that describe the state space can be traversed in depth after elaboration to then be extracted.
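The depth-first extraction of state variables can be sketched as follows. The node layout is an assumption for illustration; only the four leaf categories are taken from the list above.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// The four leaf categories listed above, plus a marker for inner nodes.
enum class Leaf { None, Constant, StateVariable, UndefinedFunction, UndefinedVariable };

// Hypothetical elaborated AST node: inner nodes carry operations, leaves
// carry a category and, for state variables, the recorded value.
struct Node {
    std::string label;
    Leaf leaf = Leaf::None;
    long value = 0;
    std::vector<Node> kids;
};

// Depth-first walk that collects (name, value) for every state-variable leaf.
void collect_state(const Node& n,
                   std::vector<std::pair<std::string, long>>& out) {
    if (n.kids.empty()) {
        assert(n.leaf != Leaf::None && "every leaf must be categorized");
        if (n.leaf == Leaf::StateVariable) out.emplace_back(n.label, n.value);
        return;
    }
    for (const Node& k : n.kids) collect_state(k, out);
}
```

The collected pairs form the extracted state; leaves of the other three categories are visited but skipped.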

3. Example

We applied the implementation to various SystemC models. One specifically computed the Euclidean algorithm within a combinational process (Figure 4).

Source: University of Bremen

1  struct Euclid : public sc_module
2  {
3    sc_in<unsigned int> portA;
4    sc_in<unsigned int> portB;
5    sc_out<unsigned int> portC;
6    unsigned int valA;
7    unsigned int valB;
8
9    void calc ()
10   {
11     valA = portA.read();
12     valB = portB.read();
13     while (valA && valB)
14       max (valA, valB) -= min (valA, valB);
15
16     portC.write (valA);
17   }
18
19   SC_CTOR (Euclid)
20   {
21     SC_METHOD(calc);
22     sensitive << portA << portB;
23   }
24 };

FIGURE 4 Euclidean algorithm

Only the ports were declared using SystemC types here. The computation itself was done on two integer variables. The loop that calculated the greatest common divisor (line 13) was not a SystemC construct. Also, note that the functions min/max were user-defined. Thus, the interface of the module was clear to SCV. As far as the behavior was concerned, however, data introspection could only observe a black box unless manual annotations were added.
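For readers who want to try the computation outside SystemC, here is a plain C++ sketch in the spirit of calc(). The article does not show the user-defined min/max, so `max_ref` and `min_ref` are one possible definition, assumed here to return references so that the subtraction updates the larger variable in place.

```cpp
#include <cassert>

// Hypothetical definitions: the article only says min and max are user-defined.
// Returning references makes `max_ref(a, b) -= min_ref(a, b)` update the larger
// variable in place. On a tie, max_ref returns b and min_ref returns a, so the
// two calls never refer to the same variable.
unsigned int& max_ref(unsigned int& a, unsigned int& b) { return a > b ? a : b; }
unsigned int& min_ref(unsigned int& a, unsigned int& b) { return a > b ? b : a; }

// Subtraction-based Euclidean algorithm on plain integers, without ports.
// Loops until one operand reaches zero and returns the other one;
// for inputs (0, 0) the result is 0.
unsigned int gcd_by_subtraction(unsigned int valA, unsigned int valB) {
    while (valA != 0 && valB != 0)
        max_ref(valA, valB) -= min_ref(valA, valB);
    assert(valA == 0 || valB == 0);  // loop exit condition
    return valA != 0 ? valA : valB;
}
```

None of the names in this function are SystemC types, which is why introspection limited to SystemC types sees nothing of the loop or the two integer variables.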

After parsing, our approach considered valA and valB as part of the architecture. During elaboration, the attached SystemC applications were not only aware of the changing values of signals, but also had a direct knowledge of the control sequence that caused any of those changes.

4. Summary

This paper has described an analysis strategy that extends SystemC models with non-intrusive reflection capabilities. Our approach facilitates the state extraction of SystemC programs regardless of the abstraction level of the model. Advantages over purely simulation-based or purely static analysis techniques such as [1] and [2] have been shown, as well as the need for a combination of static and dynamic strategies.

Future work in this area will concentrate on extending this approach with syntactical control mechanisms for the instrumentation phase. This will allow arbitrary modifications to be applied to different models automatically, which is less error-prone and less time-consuming than manually inserting macros to add operations to SystemC models.

References

[1] F. Rogin, C. Genz, R. Drechsler, and S. Rülke, “An Integrated SystemC Debugging Environment,” Embedded Systems Specification and Design Languages: Selected contributions from FDL’07, E. Villar, Ed. Springer-Verlag, 2008, pp. 59–71.

[2] C. Genz, R. Drechsler, G. Angst, and L. Linhard, “Visualization of SystemC Designs,” IEEE International Symposium on Circuits and Systems, 2007, pp. 413–416.

[3] T. Parr, Language Translation using PCCTS and C++: A Reference Guide, 1997, Automata Publishing Company.

Acknowledgments

This work was supported in part by the German Federal Ministry of Education and Research (BMBF) and by Concept Engineering, Freiburg, Germany within Project Herkules.

The University of Bremen

Bibliothekstrasse 1

D-28359 Bremen

Germany

T: +49 421 218-63932

W: www.uni-bremen.de