You know the pattern. Week one of simulation, coverage climbs fast — 20%, 40%, 60%. Your constrained-random testbench is firing, the simulator logs look healthy, and the DV lead says things are on track. Then somewhere between 80% and 87%, the curve flattens. You run more seeds. You add more constraints. You let the farm run overnight for a week straight. The number creeps up by fractions of a percent. You are at the plateau.
This is not a simulation bug. It is not a testbench problem you can patch. It is a fundamental property of how constrained-random stimulus generation explores state space, and understanding the mechanics is the first step to doing something about it before tape-out pressure hits.
The State Space Coverage Geometry Problem
A constrained-random testbench generates stimuli within a defined constraint set. Early in the verification campaign, most of that space maps to previously uncovered RTL logic — every new transaction exercises something new. The coverage model registers fresh hits across toggle points, FSM state transitions, functional coverage bins. Progress looks linear.
The geometry changes as coverage increases. At 85%, the remaining uncovered bins are not randomly distributed through stimulus space. They are concentrated in narrow slices: specific FSM transition sequences that require exact multi-cycle setup conditions, functional coverage cross-products that require two independent constrained fields to hit a particular combined value simultaneously, error-injection paths that require the design to be in a specific internal state before the fault can be exercised.
A constrained-random generator behaves roughly like a uniform sampler over its constraint space (the exact distribution depends on the solver), so it hits these narrow slices at a rate proportional to their volume in that space. If a coverage bin requires one particular 4-field combination out of millions of possible combinations, the probability that any given random transaction hits it is on the order of one in millions. Depending on how many transactions your farm actually generates per hour, you might expect to see that bin hit once every few thousand simulation hours.
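To make the arithmetic concrete, here is a minimal sketch in Python. It assumes each transaction is an independent uniform draw over the full cross of the fields, so the wait for a bin follows a geometric distribution with mean 1/p. The field widths and the throughput figure are illustrative assumptions, not measurements from any real farm.

```python
from math import prod

def expected_transactions(field_sizes, matching_combinations=1):
    """Mean number of uniform random transactions before a bin is first hit.

    Assumes independent uniform draws over the full cross of the fields,
    so the wait is geometric with mean 1/p.
    """
    total = prod(field_sizes)          # size of the stimulus space
    p = matching_combinations / total  # per-transaction hit probability
    return 1 / p

# Hypothetical example: four 6-bit fields, exactly one combination hits the bin.
fields = [64, 64, 64, 64]
n = expected_transactions(fields)      # 64**4 = 16,777,216 transactions
txn_per_hour = 5_000                   # assumed farm throughput, illustrative only
hours = n / txn_per_hour
print(f"{n:,.0f} transactions, about {hours:,.0f} simulation hours")
```

Under these assumed numbers the expected wait lands in the low thousands of simulation hours for a single bin. Real solvers are not perfectly uniform, so treat this as an order-of-magnitude estimate.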
The plateau is not a failure of the testbench. It is the correct behavior of a uniform sampler encountering a non-uniform coverage structure.
Where the Hard Coverage Lives
Looking across RTL coverage databases, the bins that resist random simulation cluster into a few repeating categories.
FSM transition coverage holes. Most FSM states get hit early because they are on the main execution path. The transitions that stay uncovered are typically error recovery paths, power management state sequences, or protocol corner cases where entering the target state requires a precise predecessor sequence. A random testbench generates the right input values eventually, but cannot generate them in the right order unless you constrain the ordering explicitly, and you can't write that constraint without first knowing which bins are hard.
Functional coverage cross-product bins. Say you have a covergroup with a cross between a 4-bit opcode field and a 3-bit privilege level. That's 16 × 8 = 128 cross-product bins. Most fill quickly. The ones that don't are usually semantically meaningful combinations: a specific opcode that is only valid at privilege level 0, exercised at privilege level 2. Random generation will hit it eventually but very rarely. If your testbench constraint set prevents illegal opcodes globally, that bin may never be hit at all.
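A cross like this is small enough to enumerate directly. The sketch below, in Python rather than SystemVerilog, lists every bin of a 4-bit by 3-bit cross and checks which ones a legality constraint would exclude. The privileged opcode value `0xF` and the `constraint_allows` rule are hypothetical stand-ins for whatever your testbench actually enforces.

```python
from itertools import product

OPCODES = range(16)      # 4-bit opcode
PRIVILEGES = range(8)    # 3-bit privilege level

# Hypothetical rule for illustration: opcode 0xF is only legal at privilege 0.
PRIVILEGED_OPCODE = 0xF

def constraint_allows(opcode, priv):
    """Mimics a testbench constraint that forbids illegal opcode/privilege pairs."""
    return not (opcode == PRIVILEGED_OPCODE and priv != 0)

cross_bins = list(product(OPCODES, PRIVILEGES))
blocked = [b for b in cross_bins if not constraint_allows(*b)]

print(len(cross_bins))        # 128
print(len(blocked))           # 7
print((0xF, 2) in blocked)    # True
```

The seven blocked bins have zero probability under this constraint, no matter how many seeds run. If the covergroup still counts them as legal bins, they will show up as permanent holes.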
Deep sequential dependencies. Some coverage holes require a sequence of N correctly-ordered transactions before the target condition is reachable. Consider a memory controller that tracks four outstanding transactions to the same bank and then receives a fifth: random simulation might generate that scenario once every 10,000 bursts. At normal verification throughput, that's a lot of simulation time for one bin.
Toggle coverage on reset-dominated signals. Active-low reset pins, scan enable lines, and power domain crossing signals often have one toggle direction covered early and the other direction locked because it only transitions under conditions the random test generator never produces.
Why More Seeds Don't Help Past a Threshold
A common response to the plateau is to run more seeds with more variation. This helps up to a point — if the uncovered bins are reachable with existing constraints but just haven't been hit yet, more simulation time does close some gaps. But past a certain threshold, the remaining bins are structurally unreachable with your current constraint set, not just statistically unlikely.
The distinction matters. For statistically-hard bins, more simulation time is a valid (if slow) approach. For structurally-unreachable bins, adding simulation time does exactly nothing. You are burning compute and schedule weeks on a strategy that cannot work.
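The difference shows up in a simple bound. Assuming independent draws with per-transaction hit probability p, the chance of at least one hit in n transactions is 1 - (1 - p)^n. The sketch below solves that for n, and the function name and default confidence are illustrative choices, not part of any tool.

```python
import math

def transactions_for_confidence(p, confidence=0.95):
    """Transactions needed to hit a bin at least once with the given confidence.

    Solves 1 - (1 - p)**n >= confidence for n, assuming independent draws.
    Returns math.inf when p == 0, i.e. the bin is structurally unreachable.
    """
    if p <= 0:
        return math.inf
    if p >= 1:
        return 1
    return math.ceil(math.log(1 - confidence) / math.log(1 - p))

print(transactions_for_confidence(1e-6))  # large but finite: statistically hard
print(transactions_for_confidence(0.0))   # inf: no amount of simulation helps
```

A statistically hard bin has a finite, if painful, budget. A structurally blocked bin has none, which is why the two need different responses.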
Telling these two categories apart from the coverage report alone is difficult. The UCDB shows you which bins are uncovered. It does not tell you whether they're hard-but-reachable or structurally blocked by your constraint set. That diagnosis requires reading the RTL, understanding the covergroup definition, and working out why the constraint space doesn't reach that bin. This is exactly the kind of analysis that takes experienced DV engineers hours per bin.
The Directed-Test Writing Tax
Once you identify that a coverage bin is structurally blocked, the answer is a directed test: a test written to explicitly set up the required preconditions and then execute the sequence that hits the target bin. For a single FSM transition, this might take a few hours. For a complex cross-product scenario with deep setup dependencies, a day or more is not unusual.
A team at a growing AI accelerator startup we worked with had 47 uncovered functional coverage bins at week 6 of their verification campaign, with tape-out 4 weeks away. Their estimate for closing those 47 bins by hand: 3-4 weeks of directed-test writing by two senior DV engineers. That's almost exactly the time they had left, and it assumed no new escapes found in the process.
This is the real cost of the plateau: not the 15% of coverage that's missing, but the weeks of senior engineer time required to close it by hand.
What Coverage Closure Analysis Actually Requires
Closing coverage holes efficiently requires answering three questions for each uncovered bin:
- Is this bin reachable with the current constraint set, or is it structurally blocked?
- If blocked, what constraint modification or directed test would reach it?
- Given limited schedule, which bins carry the highest bug-escape risk if left unclosed?
The third question is the one most teams skip under time pressure. Not all uncovered bins represent equivalent risk. An uncovered FSM error recovery path is more dangerous than an uncovered cross-product bin for a corner-case opcode combination that has never appeared in real workloads. Risk triage means prioritizing directed-test effort on bins that, if they contain bugs, would cause silicon failures. It is the difference between efficient coverage closure and exhausting test writing that doesn't meaningfully reduce tape-out risk.
We're not saying you should accept low coverage. We're saying that closing every bin mechanically without understanding which ones represent actual defect risk is a poor allocation of a DV team's last few weeks before tape-out.
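One simple way to express risk triage is to rank bins by estimated risk per hour of directed-test effort and fill the remaining schedule greedily. The sketch below is a heuristic under stated assumptions, not a description of any particular tool: the `Bin` fields, the 0-to-1 risk scores, and the effort estimates are all hypothetical values an engineer would have to supply.

```python
from dataclasses import dataclass

@dataclass
class Bin:
    name: str
    risk: float          # assumed 0..1 score for silicon impact if a bug hides here
    effort_hours: float  # estimated directed-test effort

def triage(bins, budget_hours):
    """Greedy selection: highest risk per hour of effort first, within the budget."""
    ranked = sorted(bins, key=lambda b: b.risk / b.effort_hours, reverse=True)
    chosen, spent = [], 0.0
    for b in ranked:
        if spent + b.effort_hours <= budget_hours:
            chosen.append(b)
            spent += b.effort_hours
    return chosen

# Hypothetical bins and scores, for illustration only.
bins = [
    Bin("fsm_error_recovery", risk=0.9, effort_hours=6),
    Bin("rare_opcode_cross", risk=0.2, effort_hours=4),
    Bin("power_state_sequence", risk=0.7, effort_hours=10),
]
print([b.name for b in triage(bins, budget_hours=16)])
# ['fsm_error_recovery', 'power_state_sequence']
```

Greedy ranking is not guaranteed to be optimal, but it makes the trade-off explicit: the low-risk opcode cross is the bin that gets dropped when the budget runs out.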
How AI Coverage Analysis Changes This
The approach we take at Photoniq starts from the same UCDB database your simulator already produces. We read the coverage structure — toggle coverage, FSM state and transition coverage, line and branch coverage, functional coverage bins and their cross products — and build a structural model of how the uncovered regions of the coverage space relate to the RTL logic.
For each uncovered bin, the model estimates two things: what stimulus sequence characteristics are required to hit it (based on the RTL path structure and the constraint space), and what the relative defect risk of leaving it uncovered looks like (based on the structural properties of the logic and where in the design the hole sits).
The output is a ranked list of test scenarios: not random seeds with modified constraints, but specific stimulus patterns that the model predicts will exercise the uncovered logic. A DV engineer reading this list sees, for each recommendation: which bins it targets, what the predicted hit probability is, and why (the RTL path and coverage group relationship).
This doesn't eliminate directed-test writing. A senior DV engineer still needs to translate a test scenario description into actual SystemVerilog. What it eliminates is the analysis phase — the hours spent per bin working out what the required preconditions are, which coverage groups are related, and where to start. That analysis is where the weeks get consumed.
Coverage closure at 95%+ is achievable for most designs within a week when you start with a ranked, explained recommendation list instead of a raw UCDB and a deadline. The plateau is not an engineering inevitability. It is an information problem dressed up as a simulation problem.
Practical Diagnostics Before You Call It a Plateau
Before concluding you're at an irreducible plateau, a few diagnostics are worth running manually:
Check constraint solver utilization. If your constrained-random solver is generating a small number of distinct stimulus patterns due to overly tight constraints, adding randomization within the constraints won't help — you need to widen the constraint set. VCS and Questa both have constraint coverage reports that show you how often the solver is hitting the same solutions.
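If no built-in report is at hand, a rough version of this check can be run on logged stimulus. The sketch below assumes you can export each generated transaction as a tuple of field values. The field names and sample data are hypothetical, and the metric is just the fraction of distinct patterns.

```python
from collections import Counter

def stimulus_diversity(transactions):
    """Return (distinct_ratio, (most_common_pattern, count)) for hashable stimulus tuples."""
    if not transactions:
        raise ValueError("need at least one logged transaction")
    counts = Counter(transactions)
    distinct_ratio = len(counts) / len(transactions)
    return distinct_ratio, counts.most_common(1)[0]

# Hypothetical (opcode, burst_len) samples pulled from a simulation log.
samples = [(1, 4), (1, 4), (2, 8), (1, 4), (1, 4), (3, 4), (1, 4), (1, 4)]
ratio, (pattern, hits) = stimulus_diversity(samples)
print(f"distinct ratio {ratio}, pattern {pattern} seen {hits} times")
```

A low distinct ratio with one dominant pattern suggests the constraints are funnelling the solver into a handful of solutions, which points at widening the constraint set rather than adding seeds.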
Identify FSM coverage holes by predecessor state. Export your FSM state coverage from UCDB and map the uncovered transitions against their predecessor states. If the predecessor state itself has low coverage, you have a cascaded hole — fix the upstream coverage first.
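The predecessor check is easy to script once the data is exported. The sketch below assumes you already have the set of covered states and a list of uncovered (source, destination) transitions. The state names are hypothetical, and "low coverage" is simplified here to "never reached".

```python
def classify_fsm_holes(uncovered_transitions, covered_states):
    """Split uncovered (src, dst) transitions by whether the source state was ever reached."""
    cascaded, direct = [], []
    for src, dst in uncovered_transitions:
        (direct if src in covered_states else cascaded).append((src, dst))
    return cascaded, direct

# Hypothetical state names; real data would come from a coverage export.
covered_states = {"IDLE", "ACTIVE", "ERROR"}
uncovered = [("ERROR", "RECOVER"), ("SLEEP", "WAKE"), ("RECOVER", "IDLE")]
cascaded, direct = classify_fsm_holes(uncovered, covered_states)
print("fix upstream first:", cascaded)  # [('SLEEP', 'WAKE'), ('RECOVER', 'IDLE')]
print("target directly:", direct)       # [('ERROR', 'RECOVER')]
```

In this example, closing ERROR to RECOVER first makes RECOVER reachable, which in turn turns RECOVER to IDLE from a cascaded hole into a direct one.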
Cross-reference functional coverage bins against testbench constraint definitions. If the covergroup cross-product was defined after the constraint set was written, there may be legal combinations in the covergroup that the constraint set explicitly prohibits. These bins cannot be hit without a testbench modification, not just more seeds.
Identifying these categories separates the "run more simulation" problem from the "write a directed test" problem from the "fix the testbench" problem. Each has a different solution, and conflating them wastes schedule.
The plateau at 85% is not where verification ends. It is where the interesting work starts — if you know what you're looking at.