Challenges of SoC Verification
Noah Bamford, Rekha K Bangalore, Eric Chapman, Hector Chavez, Rajeev Dasari, Yinfang Lin, Edgar Jimenez
Abstract
The challenge of System on a Chip (SoC) verification is becoming increasingly complex as submicron process technology shrinks die size, enabling system architects to include more functionality in a single-chip solution. A functional defect refers to feature sets, protocols or performance parameters not conforming to the specifications of the SoC. Some functional defects can be solved by software workarounds, but some require revisions of silicon. A revision of silicon not only costs millions of dollars but also impacts time to market, quality and customer commitments. Working silicon for the first revision of the SoC requires a robust module, chip and system verification strategy to uncover the logical and timing defects before tapeout. Different techniques are needed at each level (module, chip and system) to complete verification. In addition, verification should be quantified with a metric at every level of the hierarchy to assess functional holes and address them. The verification metric can be a combination of code coverage, functional coverage, assertion coverage, protocol coverage, interface coverage and system coverage. A successful verification strategy also requires the test bench to be scalable and configurable, to support reuse of functional tests, to integrate with tools and, finally, to link to validation. This paper discusses verification strategy and its pitfalls, and finally makes recommendations for a successful strategy.

I. INTRODUCTION

In 1965, Gordon Moore observed that the number of transistors in a single integrated circuit was doubling every two years, and he predicted that this exponential growth would continue in perpetuity. As advances in methodology, tools and process have enabled this growth, the task of verifying the functionality of the designs has also grown exponentially. The verification challenges are described below:

- Not only is the number of transistors increasing, the functionality is becoming more complex as more features are added.
- The cost of mistakes is increasing rapidly as the cost of masks increases.
- Advances in computer-aided design have enabled logic to grow at an exponential rate, which has resulted in growth of the state space at an exponential rate.
- As the state space for verification increases, new tools and methodologies are required.

In addition to verifying the additional logic features added to the specification for each new generation of SoC, the market viability of the SoC must be ensured. This means that its power characteristics must be acceptable and its performance must be adequate. It must be able to communicate with other chips and memories on a board and comply with a myriad of standards. Because of the increasing cost of tapeouts and the decreasing market window, all of this must be verified in the pre-silicon design phase. In order to meet this challenge, verification methodology is continually improving. This paper discusses building a verification infrastructure to meet the verification challenge and how to ensure the quality of the SoC prior to tapeout.

II. FUNCTIONAL DEFECTS

Before examining how to design a verification infrastructure and ensure the quality of a SoC, it is helpful to examine the types of defects verification hopes to catch. Functional defects are deviations from the specification and can be classified as either soft or hard. This classification is based on the severity of the defect and how it is fixed. A defect is said to be soft if the affected feature is not being used or the software can be changed to work around the defect. After being fixed, the defect still exists in hardware but is no longer a problem for the end user. A hard defect is one which must be fixed by changing the implementation of the circuit. As a soft defect does not require a re-spin, the cost of fixing it is much lower. An example of a soft defect is shown in the following case, which was seen while verifying a baseband processor.

In order to understand the defect, a little background is necessary. The baseband includes a Deep Sleep mode in which most
peripherals clocked by the high-speed clock are shut down (that clock is turned off) and a low-speed clock is used by those peripherals that need to remain awake. The defect occurred in the module used to control the entrance into and exit from this mode. During the exit from sleep mode, the module stayed in its WARM state one clock cycle longer than expected. Because of this, other modules inside the SoC had sync-up problems, resulting in poor call quality. The original problem observed was that the module was restarting one low-speed clock cycle later than specified when exiting sleep, which caused the phone to resynchronize with the network. A software patch was proposed to work around the problem and solve the synchronization issue. Because no hardware changes were required, the cost of the defect was not as great.

A hard functional defect arises from logic bugs, incorrect modeling, incorrect timing constraints or asynchronous interfaces in the SoC, or from unexpected process variations found during manufacturing. A hard functional defect usually requires a re-spin of the SoC and results in customer quality problems and product delays. An example of a hard functional defect can occur with a scenario as simple as the power-up sequence of the part. Analog modules used in the SoC should be verified standalone and with design marginalities to handle process variations. A power-on-reset analog module cannot be represented with its SPICE model during regressions of the full SoC, and the reset sequence of the SoC with a semi-accurate model can lead to unexpected scenarios in simulation. This includes the timing of the chip coming out of reset and the start of code execution. A design fix was recommended to correct this problem. It is also recommended to ensure that debug modes are built into the part to handle such scenarios, including alternate functional modes that can be used by the customer and the product debug team.

III. ADDRESSING VERIFICATION CHALLENGES

A verification strategy should define an infrastructure that meets the following requirements in order to facilitate verification in the pre-silicon environment:

1. Allows reuse between the module, chip and system levels, and across further phases.
2. Includes a robust verification metric which is reusable from one phase to the next (for example, from module to chip and system).
3. Has minimal redundancy in both the checking and the coverage in order to optimize the use of resources.
4. Provides techniques for building a robust, reusable verification test bench that accommodates different platforms: verification, validation (including ATE (Automatic Test Equipment) and EVB (Evaluation Board)) and system verification.
5. Allows for both direct and random stimulus and defines their use (a minimal sketch follows this list).
6. Addresses the challenges of building a robust, reusable verification test bench.
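As an illustration of requirement 5, the sketch below shows one way a single stimulus item can serve both random and directed use: the base class is randomized freely, and a derived class narrows the constraints for a directed case. The class, field and value choices are assumptions for illustration only, not taken from the SoC discussed in this paper.

// Hypothetical stimulus item: random by default, with a directed variant
// derived from it by adding a narrowing constraint.
class periph_config_item;
  rand bit [31:0] enable_mask;   // which sources are enabled
  rand bit        level_mode;    // 0 = edge sensitive, 1 = level sensitive
  constraint sane_c { enable_mask != 32'h0; }   // at least one source enabled
endclass

class periph_single_source_item extends periph_config_item;
  // Directed case: exactly one known source, level sensitive.
  constraint directed_c { enable_mask == 32'h0000_0010; level_mode == 1'b1; }
endclass

module stimulus_demo;
  initial begin
    periph_config_item        rnd = new();
    periph_single_source_item dir = new();
    repeat (4) begin
      void'(rnd.randomize());
      $display("random   mask=%h level=%0d", rnd.enable_mask, rnd.level_mode);
    end
    void'(dir.randomize());
    $display("directed mask=%h level=%0d", dir.enable_mask, dir.level_mode);
  end
endmodule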
Vertical Reuse between Module and Chip Verification

The verification strategies for the module (IP), chip and system levels should complement one another and avoid redundancy. Redundancy results in simulation overhead from verifying the same functionality while missing the corner-case bugs. An IP verification strategy should verify the IP against its specification. A chip-level verification strategy should verify that all IPs are integrated correctly, that the SoC meets its performance parameters and, finally, it should provide a flow for validation to verify that silicon meets the design requirements. The system verification strategy ensures that the use cases from the customer and the architecture are verified prior to tapeout. This means that the verification components for the IP, chip and system level test benches should be architected for integration and vertical reuse of verification components and verification intellectual property (VIP).

In order to meet design cycles set by market constraints, the specification and implementation phases of a peripheral IP often occur concurrently with the integration of the IP into a SoC. This poses a dilemma for verification engineers: should the effort be spent to create a standalone verification environment for the IP, or should the SoC verification environment be used? Given the limited resources available for verification, creating the standalone environment can lead to delays in SoC verification. In the parallel design and integration environment this predicament is especially pronounced. The need for standalone verification cannot be overlooked, but at the same time the SoC verification environment should not be ignored.

The need for standalone verification is simple: the cost-benefit equation for verification favors a small design over a large one. The cost of verifying a two-bit adder is significantly smaller than the cost of verifying a sixteen-bit adder, because the number of possible states that need to be verified in the two-bit adder is many factors smaller than in the sixteen-bit adder. The same is true in the choice of verifying one IP at a time versus verifying many IPs at the same time in the SoC environment. When an IP is verified standalone, the state space is reduced. This makes it possible, given time and machine resource constraints, to cover more of the possible design state space and therefore increase coverage.

One problem with concentrating on standalone verification in the concurrent design and integration environment is the impact on SoC verification. While the majority of bugs found during the typical design cycle are IP related, SoC-related defects are also a concern. The standalone environment may make assumptions about input signals that are incorrect, or the integration of an IP may be incorrect. For example, an input may have been treated as active high during standalone verification but may in fact be active low. The issue is how to maximize the effort spent on standalone verification (with its greater cost-benefit ratio) while minimizing the impact on SoC verification.
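A polarity mismatch of the kind described above can be caught at the SoC level with a small interface assertion. The following is a minimal sketch under assumed signal names (soc_reset_req, ip_reset_n); it is not taken from the design discussed here.

// Hypothetical SoC-level integration check: the IP documents its reset input as
// active-low, so whenever the chip-level reset controller asserts reset (high),
// the signal reaching the IP must be observed low in the same cycle.
module reset_polarity_check (
  input logic clk,
  input logic soc_reset_req,   // active-high request from the reset controller
  input logic ip_reset_n       // active-low reset pin on the IP
);
  property p_reset_polarity;
    @(posedge clk) soc_reset_req |-> !ip_reset_n;
  endproperty
  a_reset_polarity: assert property (p_reset_polarity)
    else $error("Reset polarity mismatch: IP reset_n not driven low");
endmodule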
One answer to this predicament is to take full advantage of vertical reuse. In general terms, vertical reuse stresses the development of verification components (VCs) that can be used in both the SoC and standalone environments. In order to ease the integration of these components into the SoC test bench, some constraints must be placed on the architecture of the test bench. A top-down design style facilitates development and should be used to reduce duplication. Duplication can drain resources both during the design stage of the VCs and during the simulation stage of the SoC test bench. For example, two separate IPs may require a common monitor for a bus. If both IPs develop separate monitors (each perhaps tweaked for that particular design), the development effort has been doubled. In addition, the SoC test bench will require both of these monitors to be running, doubling the simulation effort for that monitor.

In addition to inefficient use of resources, other issues can hamper vertical reuse if insufficient guidelines are given for the development of verification components. Using a common base, such as RVM from Synopsys, for all the VCs will allow them to communicate effectively even if they originally came from different test benches. Care must also be taken in naming global variables, global defines and components. If a component is specific to one IP, it should be named as such and not given a generic name. Wherever possible, the generic verification components, such as reset, clocking and common busses, should be identified and architected as early as possible. In naming global variables and defines, it is very easy for collisions to occur and to end up with either a compilation failure or, worse, a value that is not expected.
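A small sketch of the naming problem follows: two IP test benches that both define a generic macro will disagree once their files are compiled together at the SoC level, whereas IP-prefixed names cannot collide. The macro and package names below are hypothetical.

// Risky: a generic macro name defined by two different IP test benches.
// At the SoC level the later definition overrides the earlier one (at best
// with a tool warning), so one IP sees an unexpected value.
//   `define TIMEOUT 100     // from the UART test bench
//   `define TIMEOUT 2000    // from the DMA test bench

// Safer: prefix defines and package names with the owning IP.
`define UART_TB_TIMEOUT 100
`define DMA_TB_TIMEOUT  2000

package uart_tb_pkg;   // IP-specific package keeps globals out of the shared scope
  int unsigned uart_resp_timeout = `UART_TB_TIMEOUT;
endpackage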
An example of an IP test bench is shown below in Figure I.

Figure I. Example IP test bench.

There are other considerations that should be taken into account when developing verification components. In general, we can segment verification components into two areas: a checking area, which includes response checkers, assertions, coverage and monitors, and a stimulus area, which includes stimulus and drivers.

Implementation of System Stimulus and Software Drivers

One area of particular interest is the design of system use cases. It can take many man-years to develop the test cases used to verify that the SoC can perform the required tasks. The verification team often develops what appear to be pared-down versions of the actual applications. The software for SoC-level verification is important for all aspects of the design process. Great care should be taken to develop the drivers in a top-down fashion to maximize their reuse in pre-silicon verification as well as post-silicon validation.

The difficulty faced with pre-silicon software development is that code development is very time consuming in an event-based simulator. Tests that would take tens of milliseconds to run in real time can take days to run and verify in simulation. One way to speed up simulation of the software is to use a hardware emulator or an FPGA for development. Emulation tools allow the SoC test bench team to rapidly develop system-level drivers for the SoC-level tests to use after the drivers have been integrated into the environment. Writing SoC-level verification patterns using these drivers will verify the correct operation of the drivers, verify the correct operation of the design itself, and speed up the time taken to complete the verification tests. The drivers can then be ported with a high level of confidence to the post-silicon environment very quickly. Using the drivers in the full verification environment will also verify the correct operation of the emulation model. The emulation model can then be passed to the software teams to develop full application code for the SoC before silicon has been produced.

The integration of software drivers in the verification environment is thus beneficial to all aspects of the design process. The simulation environment serves as a tool to verify the operation of the software that was developed rapidly in the emulation environment. This flow greatly speeds up the overall release of the SoC as a product and creates a high level of confidence in all aspects of the design.
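One common way to keep such drivers portable between simulation, emulation and silicon is to write them against a thin register-access layer, so that only the lowest-level read/write routines change between platforms. The sketch below is a simplified illustration of that idea; the class, task names and register addresses are assumptions, not the drivers described in this paper.

// Hypothetical portable driver layer: the driver calls reg_write/reg_read,
// and only those two routines change between simulation, emulation and silicon.
package soc_driver_pkg;
  localparam int unsigned UART_BASE = 'h4000_0000;  // assumed address map
  localparam int unsigned UART_CTRL = UART_BASE + 'h0;
  localparam int unsigned UART_BAUD = UART_BASE + 'h4;

  // In simulation these methods drive the bus functional model;
  // post-silicon the same calls map onto real load/store accesses.
  virtual class reg_access;
    pure virtual task reg_write(input int unsigned addr, input int unsigned data);
    pure virtual task reg_read (input int unsigned addr, output int unsigned data);
  endclass

  task automatic uart_init(reg_access bus, int unsigned divisor);
    bus.reg_write(UART_BAUD, divisor);  // program the baud rate divisor
    bus.reg_write(UART_CTRL, 32'h1);    // enable the peripheral
  endtask
endpackage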
Implementation of Assertions and Response Checkers

As more peripherals are integrated into a SoC and various types of on-chip buses are introduced, it becomes a challenge to verify that these peripheral and bus interfaces conform to standard or custom protocols that are inflexible and timing sensitive. In addition, as the scale of the SoC and the interactions between multiple modules increase, the number of cycles of simulation traces needed to identify the exact cause of an error grows significantly in the traditional debugging process.

Assertions are formal properties that describe design functionality or temporal relationships in a concise, unambiguous and machine-executable way. Such formal semantics make it easier to express low-level signal behaviors such as interface protocols. In addition, assertions can be specified inline with the RTL code, adding observability to the design and helping to detect and diagnose problems quickly. Assertions can also provide coverage metrics that can be combined with other forms of coverage to confirm that the design has been thoroughly verified. Assertions thus not only provide a mechanism to verify design correctness but also a technique to ease debugging and measure verification quality.
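As a concrete illustration of an assertion written alongside the RTL, the fragment below checks a simple request/grant relationship and adds a cover property for an interesting scenario. The signal names and the two-cycle bound are assumptions chosen for illustration only.

// Hypothetical inline checks: every request must be granted within 2 cycles;
// the cover property records the back-to-back grant-then-request scenario.
module arbiter_checks (input logic clk, rst_n, req, gnt);
  a_req_gets_gnt: assert property (
    @(posedge clk) disable iff (!rst_n) req |-> ##[1:2] gnt)
    else $error("req not granted within 2 cycles");

  c_back_to_back: cover property (
    @(posedge clk) disable iff (!rst_n) gnt ##1 req);
endmodule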
Generally, design engineers write assertions to capture critical assumptions and design intentions during the implementation phase of the design. These assertions are embedded within the design and can be used to verify FSM state transitions, FIFOs, decoding logic, etc. Verification engineers focus on functional checks on external interfaces based on the design specifications; protocol violations or specification errors can be found in this way. These assertions are developed in separate files.

To solve the verification challenges of interface checking and to make debug more efficient, the assertion technique is adopted in the SoC verification flow. There are assertions for protocols and chip-level connectivity as well as for module-level interfaces. The assertion adoption flow is shown in Figure II. After the checking items to be covered by assertions have been identified, it can require considerable coding to describe complex design interfaces. Reusing libraries of pre-defined generic checkers (e.g. OVL) or assertion-based verification IP (AIP) eases the coding effort and speeds up the adoption of assertions. Assertions packaged in reusable checkers are used to verify the correctness of protocols and to gather functional coverage for protocol interfaces and connectivity.

Debugging assertions requires a fair amount of effort as part of the adoption flow. When a failure is flagged, it can be: 1) a real design bug; 2) an assertion coding error; or 3) a test bench bug. The correctness of assertions can be verified either through dynamic simulation or in a formal verification tool. Assertions can be verified in a formal verification tool first, because the formal technique is able to create stimulus using mathematical methods based on input assumptions, without the need to create a test bench. It examines all possible behaviors of the design and tries to prove that each assertion always holds true. For a failing case, the formal verification tool produces a counterexample showing how the assertion is violated. However, there may be properties that cannot be handled or proven by the formal tool, because formal analysis can require exponential memory and computing time; in addition, non-synthesizable assertions may not be supported by the formal tool. These assertions are verified in the dynamic simulation environment with a set of directed test cases or constrained-random stimulus.

Assertions provide coverage metrics to control the debugging process and to determine that all assertions have been adequately exercised. They also provide information about how well the design has been functionally tested. There are two types of coverage for assertions. One is structural coverage, which measures whether each assertion, and every path within an assertion, is exercised. The other is user-specified functional coverage, which can be sequence coverage defined using the cover property construct of SystemVerilog Assertions (SVA), or variable/expression coverage modeled with functional coverage constructs (e.g. the SystemVerilog covergroup). These are helpful for identifying functional test holes and for measuring whether interesting scenarios, such as a specific data path or corner cases, have been covered.
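The two coverage styles mentioned above can be sketched as follows: a cover property records that a temporal sequence occurred, while a covergroup samples variable values into bins. The FIFO signals and bins below are hypothetical and serve only to illustrate the distinction.

// Hypothetical functional coverage alongside assertions.
module fifo_coverage (input logic clk, rst_n, wr_en, rd_en, full, empty);

  // Sequence coverage (SVA cover property): a write attempted while the FIFO
  // is full, followed by a read that frees space -- a corner case worth seeing.
  c_write_when_full: cover property (
    @(posedge clk) disable iff (!rst_n) (wr_en && full) ##1 rd_en);

  // Variable coverage (covergroup): occupancy extremes and the normal case.
  covergroup cg_fifo_state @(posedge clk);
    option.per_instance = 1;
    cp_state: coverpoint {full, empty} {
      bins is_empty   = {2'b01};
      bins is_full    = {2'b10};
      bins in_between = {2'b00};
      illegal_bins both = {2'b11};   // full and empty at once is a design bug
    }
  endgroup
  cg_fifo_state cg = new();
endmodule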
In addition to augmenting the coverage metrics for SoC verification, assertions improve the quality of verification in other respects, such as documentation, debugging and reuse:

1. They help clarify specifications and documentation. Assertions provide an unambiguous way to describe design specifications in executable form, making it easier to capture mismatches between the design and its documentation and reducing the misinterpretation of specifications.
2. They accelerate debugging of the design. An assertion paired with a failure message makes it possible to identify a problem quickly. In addition, assertions inside a design or on external interfaces make it fast to locate the exact cause of an error.
3. They are reusable from the module level to the chip and system levels. Assertions verifying interface protocols can easily be customized as reusable checkers and integrated into chip- and system-level verification to verify the interconnections and communications among blocks. Assertions within a design can also be enabled at higher levels of integration.

However, since assertions can be used as checkers, avoiding duplicate checks between assertions and the response checker (RSC) becomes a challenge for the IP verification flow; otherwise the duplication leads to redundancy and simulation overhead. Take the example of an Interrupt Request Controller (IRQC), a module that monitors 32 interrupt inputs and generates two composite interrupt outputs; Figure III shows a brief block diagram of the IRQC. The IRQC supports an edge/level-sensitive mode and selectable synchronization for each interrupt source; in addition, each interrupt can be high-true or low-true and individually enabled to either output. The state registers in the IRQC provide visibility of all raw interrupt sources and of the unmasked sources pending for each interrupt output. The main checking items for the IRQC include:

1. General register read/write bus protocol on the external interface.
2. IRQC-specific bus error/wait behaviors, e.g.: an error occurs when writing to a read-only register; wait cycles are asserted when reading or writing a specific address.
3. Internal configuration/state register reads and writes. These registers define the fundamental operating modes of the design, and their states determine the design behavior.
4. Correctness of the two interrupt outputs based on the inputs and the design configuration/state.

Both assertions (e.g. in SVA) and an RSC (e.g. in Vera, from Synopsys) are able to perform the above checks, and it is quite possible for the implementations in the assertions and the RSC to overlap if different people are responsible for code development without a global guide or monitor. Building a plan up front that separates the checks between assertions and the RSC is essential; it acts as a guideline for the development phases. The plan is created according to the advantages and limitations of the different approaches. For example, assertions are best for verifying low-level signal relationships, whereas an RSC is more appropriate for checking transaction-level behaviors that involve extensive data structures or complex algorithms.
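For instance, checking item 2 (an error response on a write to a read-only register) maps naturally onto a short assertion. The bus signal names, register address and one-cycle error timing below are assumptions, since the actual IRQC interface is not given here.

// Hypothetical assertion for checking item 2: a write to a read-only register
// address must be answered with a bus error in the next cycle.
module irqc_bus_checks (
  input logic        clk, rst_n,
  input logic        wr_en,
  input logic [7:0]  addr,
  input logic        bus_error
);
  localparam logic [7:0] RAW_STATUS_ADDR = 8'h10;   // assumed read-only register

  a_ro_write_errors: assert property (
    @(posedge clk) disable iff (!rst_n)
      (wr_en && addr == RAW_STATUS_ADDR) |-> ##1 bus_error)
    else $error("Write to read-only register did not flag a bus error");
endmodule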
For the IRQC example, items 1 and 2 are suitable for implementation with assertions because they target interface protocols and temporal relationships, while it is better to use the RSC to implement items 3 and 4, since those relate to bus transactions and need to record design state over long time scales. In addition, assertions can be used in coverage analysis as coverage objects. It is helpful when EDA tools can automatically integrate all of the coverage points into individual metrics, which helps to isolate functional holes. With a clear plan, and by using assertions in the appropriate verification scopes, assertions are an efficient technique for verifying peripheral and on-chip bus protocols in the SoC; at the same time, the coverage gathered is useful for determining how thoroughly the SoC has been exercised and for increasing confidence that the SoC will work properly.
Figure III. IRQC block diagram: 32 interrupt inputs, configuration/state registers and two interrupt outputs.