Machine learning and visualization ‘needed for coverage’

By Chris Edwards |  1 Comment  |  Posted: February 6, 2018
Topics/Categories: Blog - EDA

Traditional functional coverage has run out of steam, and further progress depends on novel methods for understanding what tests are actually doing. That is the view of Greg Smith, director of verification innovation and methodology improvement at Oracle.

At the February DVClub Europe seminar organized by UK-based TV&S, Smith told attendees: “Functional coverage is equivalent to a horse and buggy when we are living in the age of Uber.”

Traditional functional coverage has been outpaced by complexity, Smith said. “When I first came out of school, we didn’t need functional coverage. But complexity has exploded.”

The problem that faces users of functional coverage is that it deals poorly with the contextual information that design and verification engineers need to understand how well a particular section of logic is performing – and whether it is broken or not.

“When the industry invented functional coverage, we could observe that a change occurred and that it caused a particular decision to be made but the observable effect was still local,” Smith argued. The problem is that the effects of that decision may ripple through to far-flung parts of the SoC. Bigger coverage objects would, in principle, capture those effects but those larger objects become prohibitively complex and time-consuming to express.

Example of a box plot used for rapid evaluation of the effectiveness of various tests (Source: Oracle)


“We’ve got to look at it from a different perspective. At Oracle we decided to apply machine learning, or more specifically data science, to verification. It pulls in techniques like statistical analysis, machine-generated random stimulus with feedback and even graph-based modeling and coverage,” Smith said.

“With machine learning, the good news is that the computers are not going to be finding the bugs by themselves anytime soon, so your jobs are safe. But machine learning brings new data analysis techniques and provides insights into otherwise hidden aspects of your design. Using the data that is readily available, you can apply it in new ways. You really can get a deep understanding of what your stimulus is doing or not doing. So you can identify where stimulus is working or totally missing.

“At Oracle we started using statistical analysis and we brought in some visualizations that showed us some things about the stimulus that we had no idea were happening. We also used pure static analysis and looked at metrics at the end of a test to see what happened and ask: was this what we wanted to do?”

Smith said it is relatively easy to get started. “We started by just consuming log file data. Comma-separated value files that we sucked into a database to perform analyses.”
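As a rough illustration of that starting point, the sketch below loads comma-separated event records into an in-memory SQLite database and counts events per test. The column names (`test`, `seed`, `event`), the sample rows and the choice of SQLite are assumptions made for the example; the talk did not describe Oracle's log format or database.

```python
import csv
import io
import sqlite3

# Hypothetical log extract: one row per event observed during a test run.
SAMPLE_CSV = """test,seed,event
stall_test,1,stall_a
stall_test,1,stall_b
stall_test,2,stall_a
dma_test,1,dma_start
"""


def load_events(csv_text, conn):
    """Load CSV rows into an `events` table and return the number of rows."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS events (test TEXT, seed INTEGER, event TEXT)"
    )
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = [(r["test"], int(r["seed"]), r["event"]) for r in reader]
    conn.executemany("INSERT INTO events VALUES (?, ?, ?)", rows)
    conn.commit()
    return len(rows)


def event_counts(conn):
    """Count how often each test produced each event."""
    query = (
        "SELECT test, event, COUNT(*) FROM events "
        "GROUP BY test, event ORDER BY test, event"
    )
    return conn.execute(query).fetchall()


conn = sqlite3.connect(":memory:")
loaded = load_events(SAMPLE_CSV, conn)
counts = event_counts(conn)
print(loaded, counts)
```

Once the events sit in a table, most of the questions raised in the talk reduce to grouping and counting queries of this kind.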

Static analysis of data provided the source of box-and-whisker plots that showed the number of times particular tests were triggered during verification. These graphs helped show where tests were likely failing to do a good job. “You can see how tests have run over time. If a bug morphs you may see that the test is no longer creating the types of event you expect. You can then go and look at why the test is no longer effective.”
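The numbers behind a box-and-whisker plot are simple to compute. The sketch below derives the five-number summary (minimum, quartiles, maximum) for per-run event counts using only the Python standard library. The test names and counts are invented for illustration, and the quartile method is one common convention, not necessarily the one used at Oracle.

```python
import statistics


def box_stats(counts):
    """Return (min, q1, median, q3, max) for a list of per-run event counts."""
    ordered = sorted(counts)
    q1, median, q3 = statistics.quantiles(ordered, n=4, method="inclusive")
    return ordered[0], q1, median, q3, ordered[-1]


# Hypothetical data: how often each test triggered its target event per run.
runs = {
    "stall_test": [12, 15, 14, 13, 16],
    "dma_test": [0, 0, 1, 0, 0],
}

for test, counts in runs.items():
    print(test, box_stats(counts))

# A median of zero suggests the test rarely creates the event it is meant to.
suspects = [t for t, c in runs.items() if box_stats(c)[2] == 0]
print(suspects)
```

Comparing these summaries across regressions is one way to notice that a test has stopped producing the events it was written for.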

Similar analyses are used for pulling tests out of a regression suite to reduce turnaround time without losing effective coverage. “Do I have tests that are doing the same thing? Maybe I want to choose just one of a group of tests.”
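One crude way to find tests that are "doing the same thing" is to group them by the set of events they produce and keep one test per group. The sketch below does exactly that. It is an assumption about how such pruning might work, not a description of Oracle's analysis, and a real flow would probably compare how often events occur as well as whether they occur.

```python
from collections import defaultdict


def group_by_signature(events_by_test):
    """Group tests whose sets of observed events are identical."""
    groups = defaultdict(list)
    for test, events in sorted(events_by_test.items()):
        groups[frozenset(events)].append(test)
    return sorted(groups.values())


def pick_representatives(events_by_test):
    """Keep the first test (alphabetically) from each group."""
    return [group[0] for group in group_by_signature(events_by_test)]


# Hypothetical event sets extracted from a regression database.
observed = {
    "t1": {"stall_a", "stall_b"},
    "t2": {"stall_b", "stall_a"},
    "t3": {"dma_start"},
}

print(group_by_signature(observed))
print(pick_representatives(observed))
```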

The graphical analysis can quickly find tests that are not performing their expected tasks, Smith said. For a particular unit test that was meant to expose three types of stall condition, “the designer found that the third wasn’t getting exercised at all. The irritator had a bug in it and never created the stall condition. This feedback came five minutes after running the tests and [the designer] didn’t have to write the coverage object to check for the condition.”
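The stall example comes down to comparing the events a test is expected to create with the events that were actually logged. A minimal sketch, with invented stall names and counts, might look like this:

```python
def missing_events(expected, observed_counts):
    """Return expected events that were never observed, in sorted order."""
    return sorted(e for e in expected if observed_counts.get(e, 0) == 0)


# Hypothetical names for the three stall conditions the unit test targets.
expected = ["stall_credit", "stall_fifo_full", "stall_replay"]
# Hypothetical counts pulled from the logged events of one regression.
observed = {"stall_credit": 412, "stall_fifo_full": 97}

print(missing_events(expected, observed))
```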

Genetic algorithms

To help automate verification, Oracle’s engineers are now working on genetic algorithms to alter stimulus to improve the test results based on feedback from prior runs. The work is in its early days and can take a large number of runs to deliver results but on a relatively small block, coverage improved from 72 to 98 per cent. “It took about a week and ten thousand tests but we were seeing how we could incorporate this feedback mechanism,” Smith said. Another application of machine learning was to analyze patterns and sequences from state machines to help determine their overall efficiency.
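To give a feel for the feedback loop involved, here is a toy genetic algorithm in which each genome is a list of stimulus settings and fitness is the number of coverage bins hit. Everything in it is hypothetical: the `coverage` function stands in for a real simulation run, and the population size, mutation rate and selection scheme are arbitrary choices rather than details from the talk.

```python
import random

N_BINS = 16  # hypothetical number of coverage bins


def coverage(genome):
    """Toy stand-in for a simulation run: the set of bins a stimulus hits."""
    return {(a * 3 + b) % N_BINS for a, b in zip(genome, genome[1:])}


def fitness(genome):
    """Score a stimulus by the number of distinct bins it hits."""
    return len(coverage(genome))


def mutate(genome, rng, rate=0.2):
    """Randomly replace some settings with new values."""
    return [rng.randrange(N_BINS) if rng.random() < rate else g for g in genome]


def crossover(left, right, rng):
    """Splice two parent genomes at a random cut point."""
    cut = rng.randrange(1, len(left))
    return left[:cut] + right[cut:]


def evolve(generations=30, pop_size=20, genome_len=12, seed=0):
    """Return the best genome and the best fitness seen in each generation."""
    rng = random.Random(seed)
    population = [
        [rng.randrange(N_BINS) for _ in range(genome_len)]
        for _ in range(pop_size)
    ]
    history = []
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        history.append(fitness(population[0]))
        parents = population[: pop_size // 2]  # fitter half survives unchanged
        children = [
            mutate(crossover(rng.choice(parents), rng.choice(parents), rng), rng)
            for _ in range(pop_size - len(parents))
        ]
        population = parents + children
    return max(population, key=fitness), history


best, history = evolve()
print(fitness(best), history)
```

Because the fitter half of each generation is carried over unchanged, the best score never gets worse from one generation to the next. In a real flow each fitness evaluation is a simulation, which is why thousands of runs can be needed.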

Smith said speed of feedback is important when using these kinds of analysis, which has an impact on data capture and storage. The analysis routines tend to take longer as the database grows. For this reason, the database does not collect data over long periods. Instead, it is reset after about a month of data collection. “We want the analysis to take no more than five minutes,” Smith said.

One Response to Machine learning and visualization ‘needed for coverage’

  1. Tudor Timi on February 7, 2018

    What he calls “statistical analysis” is what is already available in coverage analysis tools. I fail to see what is so different about dumping “events” in log files and generating those graphs, as he presented it, in comparison to analyzing the (filtered) coverage results. Coverage analysis tools (at least the ones from one vendor) can do coverage correlation that says which tests hit which coverage points. This way it’s possible to aggregate results over multiple runs and say what a test family/group/etc. is covering. The only thing I might see is that it’s (maybe) easier to look at a graph than to look at a coverage report (however filtered), but such visualizations can also be built on top of a traditional coverage database (using UCIS, for example).

    Also, the example where the “feedback came in 5 minutes” and without any coverage having to be written is unfair. Hooking up the dumping infrastructure to the relevant events had to be done, which I would expect takes as much time as writing coverage.

    I don’t doubt that the application of genetic algorithms is useful. (This is where you can probably start saying you’re doing machine learning – the dumping part is just the infrastructure for doing it.) This is, however, something that can be done on top of traditional coverage engines (again, using UCIS and maybe other proprietary APIs). Yoav Hollander has written a lot about coverage maximization on his Foretellix blog.

    The only reason I can come up with to not build such visualizations/maximization algorithms on top of traditional coverage is if the necessary APIs don’t exist. This was probably the case when they started out and they rolled their own data dumping stuff. Even so, now that the APIs exist, I would push toward using them. If something is missing I would open discussions with the tool vendors and favor enhancing the relevant standards.

    The original title of the talk, “Coverage is Useless”, is unfortunate, seeing as how much of what was described in it is, in spirit, exactly coverage, just not in the traditional form one expects it (SystemVerilog covergroups/coverpoints). It would have been much more interesting to talk in more detail about the real machine learning aspects (their genetic algorithm, for example).
