Doc Formal: The crisis of confidence facing verification
In this three-part series, I look at broad challenges facing semiconductor verification and explain how they have grown to leave us facing a crisis of confidence. I will explore some of the key reasons why, despite astronomical growth in constrained random, emulation, and FPGA prototyping, we continuously grapple with poor quality.
This crisis of confidence has reached fever pitch: verification schedules routinely run late, bugs are often missed, silicon re-spins happen, and even worse, disgruntled customers are walking away from projects, hanging silicon vendors out to dry! When did this start? How did we get ourselves into such a mess? And how do we get out of it?
The verification meta model
To best understand our plight, we need to appreciate that verification is never done in isolation, but is part of the bigger picture. It is one of the most expensive activities within any semiconductor design project, and its success – or otherwise – is heavily influenced by four components, all of which must interplay with each other. These components are the process, the verification technology, the verification methodology, and the engineers that use them. This quartet represents the four pillars of what we can call a ‘verification meta model’.
Just like the models used in engineering, this meta model assumes the presence of some inputs and produces outputs, and its behavior should reflect a predictable relationship that preserves those elements. Our meta model bases that relationship on a core (the process), with the technology, methodology, and human engineers forming its primary inputs.
So that we are all on the same page, let me expand upon the ideas behind each of the four pillars.
Process
A series of actions that are executed to fulfill an end goal.
In this sense, you can think of a process as a description of the concrete things that must be executed to carry out a task successfully. For our meta model, the task itself could be a complex set involving not only verification, but also tasks needed to enable verification (for example, training an engineer in a certain skill before he or she can become productive). A process in our meta model is a collection of heterogeneous tasks that need to be performed to obtain high quality verification without breaking project deadlines. “High quality” means finding as many bugs as possible early in the design cycle and ensuring that none leak through at the end of the project.
Verification technology
An application of a specific verification technique to find bugs.
In some cases, such as formal, the verification technology is used to build exhaustive proofs of correctness and/or compliance. It can also be a directed testing technique that establishes whether SoC bring up is being done successfully. Examples of well-known verification technologies include dynamic simulation, formal, emulation, and FPGA prototyping. Each describes a useful and complex family of techniques that are applied in practice, involving tools and methods to perform verification.
Verification methodology
The overall ethos guiding all aspects of verification as it relates to, and interplays with, the design process.
Whereas verification technology typically outlines the core principles of how a certain technology works in practice and provides tools (such as simulators, emulators, model checkers, or FPGA platforms), methodology is necessary to ensure the right application of a verification technique at the right time and by the right people in the design flow.
Design and verification engineers
The humans responsible for building hardware designs correctly (i.e., per the outlined requirements) without introducing or leaving bugs.
The DV engineer could be a designer who is bringing up his or her own design and should be careful not to introduce bugs into it. Or he or she might be the conscientious verification engineer responsible for flushing out all the bugs in a design created by others.
The verification breakdown
A verification meta model that works correctly can be expected to produce high-quality verification efficiently using a process that sets out the right combination of technologies and methodologies to maximize ROI and minimize risk. The end goal is to apprehend as many bugs as possible early in the DV cycle, thereby delivering on the shift-left verification paradigm that economizes both time and resources. In other words, a good verification meta model does not leave bugs in silicon. It certainly does not gift them to customers.
That seems straightforward enough. But why is it so hard to get a good verification meta model in place? Why does verification continue to be a bottleneck? Why does improper execution continue to lead to missed opportunities in the form of overlooked bugs, delayed schedules, frustrated management, unhappy customers, and stressed out DV engineers?
I see the breakdown in verification as the result of flaws in three main areas within the process: planning, training/mentorship, and methodology.
Factor 1: Lack of planning
One common response to the question, “Why is verification struggling?” is that there is a lack of planning. Planning itself is part of a well-oiled verification process; good planning falls out naturally from a good process. The thing to note is that, whereas planning is essential to achieve high quality verification goals, it is not a pillar of the meta model in the way the process is; rather, a good process enables good planning, not the other way around.
Planning includes clearly defining both the high-level and the low-level verification goals and describing a sequence of concrete actions that deliver them. The process, by contrast, pulls together various different plans in such a way that they work together and complement each other.
Factor 2: Lack of training/mentoring
Organizations should have plans not only for training DV engineers in required verification skills, but also for how a particular project should be designed and verified by the trained team. If we fail to define and implement plans for training new DV engineers on requisite verification skills, or if we fail to train experienced engineers on new verification techniques, then we will not obtain good results. The frequent lack of a verification training plan means that bright, talented engineers must work on verification without the appropriate skills. As a result, they can unwittingly cause massive delays, the process can yield poorly verified designs, or you can experience the worst of both. In my experience, most organizations do plan for training, but there does not seem to be a process for identifying what qualifies as good and relevant training, and what does not. In many cases the training itself can be good, but I have come across situations where expensive purchased training was not only bad, but also got things wrong.
Once engineers have been trained, where do they go? The path from acquiring fresh skills to delivering production-quality work is often long, with no clear milestones along the way. How do organizations ensure that the trained engineer is able to apply his or her skills properly on actual projects? The answer is good mentoring. When engineering teams fail to spot the value of a good mentor who can guide and support the freshly trained engineers, the whole team suffers, and so does the project. I suspect this happens because organizations do not value mentors. They are often seen as a cost rather than added value. However, the investment typically pays off several times over, as the projects that mentors influence are much more likely to succeed.
Factor 3: Lack of a cogent verification methodology
Even when project teams devote time and resources to comprehensive training and mentorship, their results can be limited by the verification technologies in use. The unfortunate state of affairs at many companies is that there remains a gap in establishing sensible methodologies around these verification technologies. The lack of a good methodology can have a debilitating effect on projects, even where there is an investment providing good training in basic verification skills in UVM, formal, or emulation.
In a nutshell, the crisis has emerged due to a lack of clear planning that describes which verification technology or combination of technologies should be applied, as well as how, by whom, when, and with what intent.
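To make that concrete, here is a minimal sketch of what a single plan entry could capture. It is only an illustration: the `PlanEntry` class, its field names, the `missing_fields` helper, and the sample values are all hypothetical and do not come from any particular tool or flow. The idea is simply that each entry records the technology, the how, the who, the when, and the intent, and that blanks are easy to spot.

```python
from dataclasses import dataclass, fields


@dataclass
class PlanEntry:
    """One line of a verification plan (all field names are hypothetical)."""
    technology: str  # which technology, e.g. "formal" or "emulation"
    how: str         # the methodology to follow
    owner: str       # by whom
    when: str        # at which point in the project
    intent: str      # the risk this entry is meant to mitigate


def missing_fields(entry: PlanEntry) -> list[str]:
    """Return the names of the fields left blank in a plan entry."""
    return [f.name for f in fields(entry) if not getattr(entry, f.name).strip()]


# Illustrative data only: the second entry names a technology but nothing else.
plan = [
    PlanEntry("formal", "exhaustive proofs on one block", "DV team",
              "RTL bring-up", "catch deadlocks early"),
    PlanEntry("emulation", "", "", "pre-tapeout", ""),
]

for entry in plan:
    gaps = missing_fields(entry)
    status = "complete" if not gaps else "missing: " + ", ".join(gaps)
    print(f"{entry.technology}: {status}")
```

An entry that names a technology but leaves the how, the owner, or the intent blank is exactly the kind of gap described above.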
Verification, in a practical sense, is nothing more than mitigating risk. Why, then, are risk assessment plans so often absent? An entire project team could be obsessed with meeting verification coverage goals, but thanks to metric-driven verification, the end goal of obtaining 100% coverage cannot be achieved without a verification meta model that optimally combines all the necessary components. Besides, 100% coverage alone is not the best sign-off criterion.
In the second installment of this series, I explain how to optimize the verification meta model and outline a design verification flow that can be applied to any team’s projects with specific reference to Requirements & Specifications and the Verification Strategy and Plan.
In the third and final installment, I extend the argument about a practical flow to the components of Debug and Signoff & Review.
Is your team experiencing a verification breakdown as a result of any of the factors that I’ve outlined here? Are you facing challenges that I have not mentioned? Let me know in the comments or on Twitter (@AshishDarbari).
And don’t forget to check out the last series of Doc Formal posts on the evolution of formal verification (Part One and Part Two).