Triple Modular Redundancy (TMR) is a common technique for protecting the memory elements of digital processing systems exposed to radiation (for example in space, at high altitude, or near nuclear sources). This paper presents an approach that uses heuristic analysis to verify the correct implementation of TMR for the memory elements of a given netlist (i.e., a digital circuit specification). The purpose is to detect any issues that might arise during the use of automatic tools for TMR insertion, optimization, place and route, etc. Our analysis does not require a testbench and can achieve full, exhaustive coverage in less than an hour even for large designs. This is achieved by applying a divide et impera approach, splitting the circuit into smaller submodules without loss of generality, instead of applying formal verification to the whole netlist at once. The methodology has been applied to a production netlist of the LEON2-FT processor that had reported errors during radiation testing, and it revealed 351 unprotected flip-flops.

At high altitude or in space, without the protection of the earth’s magnetic field and atmosphere, integrated circuits are exposed to radiation and heavy ion impacts that can disrupt the circuits’ behavior. This paper focuses on Single-Event-Upsets (SEUs), or soft errors, usually caused by the transit of a single high-energy particle through the circuit. In particular, we consider single bit flips in memory elements embedded in logic, implemented as flip-flops. Protection against SEUs can be obtained in several ways, and in particular this work considers the protection strategy based on the triplication of the storage elements of a circuit, combined with majority voting (

TMR can be either implemented during high level design (

This paper presents a novel way to verify the TMR implementation of a given circuit by executing a heuristic netlist analysis. Our goal is to verify that TMR constructs are insensitive to single bit flips (i.e., the logic is triplicated), and transients on clock or reset (i.e., there are no common reset/clock lines between redundant memory elements). To reduce execution time, we use a

This paper is organized as follows: previous work on the subject is reviewed in ‘Previous Work’; ‘Proposed Approach’ details the algorithm together with the necessary definitions, its implementation and its complexity; experimental results are shown in ‘Experimental Results’; and ‘Conclusions and Future Work’ draws some concluding remarks.

In the past, different approaches have been proposed for design verification against soft errors. These approaches can be divided into two kinds: fault injection simulation and formal verification.

Fault injection simulators run a given testbench on the design under test (DUT), flipping either randomly or specifically targeted bits. The outputs of the DUT are then compared with a golden model running the same testbench, and discrepancies are reported. Fault injection simulators come in two different flavors: on the one side there are software-based simulators like MEFISTO-L (

Formal verification against soft-errors was introduced by (

This work tries to overcome these limitations and provide full verification of a TMR-based DUT with reasonable analysis time. The idea presented in this paper can be classified as a fault-injection simulation, but follows a different approach as compared to previous work: instead of trying to simulate the whole circuit at once and doing a timing accurate simulation, we focus on a behavioral, timeless, simulation of small submodules, extracted by automatic analysis of the DUT internal structure, with the specific goal of detecting any triplicated FF that is susceptible to the propagation of SEUs in the DUT.

The starting point of our analysis is a radiation hardened circuit, protected by triplication of storage elements and voting (TMR in

Starting from a given design with ^{n} possible FF configurations, for all of the

These submodules are the ^{nf} injections, with _{f}_{f}_{f}^{nf}(1 + _{f}_{f}_{f}

When TMR is applied, each logic cone contains part of the voting logic.

A logic cone where FF triplets have been identified: the valid configurations are shown.
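As a minimal sketch of what "valid configurations" means here, the snippet below enumerates the FF assignments of a cone in which all three members of every triplet agree. The triplet names and the function `valid_configurations` are hypothetical and chosen only for illustration; they are not taken from the tool described in this paper.

```python
from itertools import product


def valid_configurations(triplets):
    """Yield every FF assignment in which the members of each triplet agree."""
    for bits in product((0, 1), repeat=len(triplets)):
        config = {}
        for value, members in zip(bits, triplets):
            for ff in members:
                config[ff] = value
        yield config


# Hypothetical cone driven by two triplets (six FFs in total).
triplets = [("a0", "a1", "a2"), ("b0", "b1", "b2")]
configs = list(valid_configurations(triplets))
print(len(configs))  # 4 valid configurations instead of 2**6 = 64
```

With t triplets the loop visits 2^t assignments, which is why identifying triplets first shrinks the search space so strongly.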

The methodology here presented relies on some assumptions: the whole circuit is driven by only one clock and there are no combinatorial loops. Furthermore, it is assumed that there are no signal conflicts inside the netlist (i.e., two-valued logic) and that there are no timing violations. Finally, we assume that all FFs have one data input and one clock source.

To describe the algorithm, we need to introduce a special directed graph structure. The nodes of this graph have indexed inputs and are associated to a logic function and a value, as outlined in the following. We assume without loss of generality that every gate has just one output. Gates that have

A

_{0} is a set of edges (representing interconnection wires)

_{0}inputs. The set of valid input indices for a node

_{y}

The set of

Let us define the predicate

We define the set of nodes which are directly and indirectly connected to the inputs of a given node

A _{FF}

The value of a node _{FF}_{L}_{1}, _{2}, .._{n}_{FF}_{1}, …, _{n}

The proposed methodology is composed of 3 steps:

Triplet identification: determine all the FF triplets present in each logic cone

TMR structure analysis: perform an exhaustive fault injection campaign on all valid configurations

Clock and reset tree verification: assure that no FF triplet has common clock or set/reset lines

These steps are detailed in the following.

To determine a useful set of valid configurations for a logic cone (here represented by a subgraph), it is necessary to identify which FFs are triplicated, as all the FFs belonging to a triplet have to share the same value. However, the gate naming scheme is usually insufficient. A base assumption for triplet identification is that all triplicated FFs are driven by the same source. An algorithm based on this fact is able to find most triplets, but this simple mechanism is not always sufficient for more complex netlists.
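The base assumption above can be sketched as a simple grouping step: FFs whose data input is driven by the same net are collected, and groups of exactly three become candidate triplets. The dictionary layout and all net and FF names below are assumptions made for this illustration.

```python
from collections import defaultdict


def candidate_triplets(data_source):
    """Group FFs by the net driving their data input; keep groups of three."""
    groups = defaultdict(list)
    for ff, source in data_source.items():
        groups[source].append(ff)
    return sorted(tuple(sorted(ffs)) for ffs in groups.values() if len(ffs) == 3)


# Hypothetical netlist excerpt: FF name -> net driving its data input.
data_source = {
    "r0_a": "n12", "r0_b": "n12", "r0_c": "n12",  # clean triplet
    "r1_a": "n40", "r1_b": "n40", "r1_c": "n41",  # driver duplicated by synthesis
    "cnt": "n7",                                  # unprotected FF
}
print(candidate_triplets(data_source))  # [('r0_a', 'r0_b', 'r0_c')]
```

The second group is missed because one member is driven by a duplicated net, which is the kind of case that a purely structural check cannot resolve.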

During synthesis, netlists are often optimized in a way that voids this property.

Two nodes _{1} and _{2} are functionally identical (_{1} ≡ _{2}) if _{1}) = _{2}) and _{1}) = _{2}) for all possible configurations of

Testing for functionally identical inputs requires ^{pre_f fs(x)}). However, wrong triplet identification affects the verification of TMR protection only with the reporting of ^{1}

It is worth noting that other algorithms for functional equivalence checking can be used here.

Let us consider the example ofThe first step checks the sets of driving FFs for equality (lines 1–3) before starting from

In a second step the algorithm starts again from ^{2}

Assuming no FFs were duplicated during optimization.

in the same clock cycle, and the algorithm aborts. After terminating successfully, the algorithm returns the set of marked nodes (Alg. 1, line 5). For the example inThe third step verifies that all configurations for this set have the same values for _{i}_{j}

It is worth noting that the worst case scenario for this fast heuristic, i.e., when all FFs are reported as false positives, is when both subgraphs share only the driving FFs and the whole subgraph is duplicated. This is unlikely to happen when analyzing real world netlists, because synthesizers optimize away most redundant parts and introduce redundancy only in rare cases. For the designs used in this work, the non-shared subgraph size is typically less than nine gates as shown by

Before starting the analysis, we optimize our description by removing non-relevant elements, such as one-to-one buffer gates. As such buffers do not manipulate the logic value of a signal, the logic functions are not changed when they are removed.
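A minimal sketch of this buffer-removal step is given below, assuming a netlist stored as a dictionary from node name to a pair of gate function and input list. This representation, the gate label `"BUF"` and the function `remove_buffers` are assumptions for illustration only.

```python
def remove_buffers(gates):
    """Drop one-to-one buffers and rewire their fan-out to the original driver.

    gates maps a node name to (function_name, [input node names]).
    Names that do not appear as keys are treated as FFs or primary inputs.
    """
    def resolve(node):
        # Follow chains of buffers back to the first non-buffer driver.
        # Terminates because the netlist is assumed to have no combinatorial loops.
        while node in gates and gates[node][0] == "BUF":
            node = gates[node][1][0]
        return node

    return {
        name: (func, [resolve(i) for i in inputs])
        for name, (func, inputs) in gates.items()
        if func != "BUF"
    }


# Hypothetical fragment: ff1 -> b1 -> b2 -> g, with ff2 as the second input of g.
gates = {
    "b1": ("BUF", ["ff1"]),
    "b2": ("BUF", ["b1"]),
    "g": ("AND", ["b2", "ff2"]),
}
print(remove_buffers(gates))  # {'g': ('AND', ['ff1', 'ff2'])}
```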

If the TMR implementation were working correctly, a single bit-flip in one FF should not cause another FF to change its value. If a faulty triplicated FF/voter pair exists, there is at least one FF whose value can be changed by a single bit-flip in another FF. This is true only if the configuration before the bit-flip injection was a valid configuration. The algorithm tries to find such FFs, and if none are found, TMR is correctly implemented.
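The check described above can be illustrated with a small sketch: for every valid configuration of a cone, each driving FF is flipped in turn and the cone output is compared against the fault-free value. The cone functions, triplet names and the helper `sensitive_ffs` are hypothetical; the cone is modeled as a plain Python function rather than as a netlist graph.

```python
from itertools import product


def majority(a, b, c):
    """Two-out-of-three voter."""
    return (a & b) | (a & c) | (b & c)


def sensitive_ffs(cone, triplets):
    """Return the FFs whose single bit-flip changes the cone output
    in at least one valid configuration."""
    found = set()
    for bits in product((0, 1), repeat=len(triplets)):
        config = {ff: v for v, members in zip(bits, triplets) for ff in members}
        golden = cone(config)
        for ff in config:
            faulty = dict(config)
            faulty[ff] ^= 1  # inject a single bit-flip
            if cone(faulty) != golden:
                found.add(ff)
    return found


triplets = [("a0", "a1", "a2"), ("b0", "b1", "b2")]


def good_cone(v):
    # Both triplets are voted before being combined.
    return majority(v["a0"], v["a1"], v["a2"]) & majority(v["b0"], v["b1"], v["b2"])


def broken_cone(v):
    # Hypothetical defect: the voter on b was removed and b0 is used directly.
    return majority(v["a0"], v["a1"], v["a2"]) & v["b0"]


print(sensitive_ffs(good_cone, triplets))    # set()
print(sensitive_ffs(broken_cone, triplets))  # {'b0'}
```

An empty result means that no single bit-flip can reach the output of this cone, which is the property the analysis looks for.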

The main idea of the test algorithm is that complexity can be reduced by checking only small submodules instead of the whole system. To do this, we observe that within the current clock cycle a bit-flip in one FF can only propagate to the next FF. It is then possible to determine the set of all FFs which could potentially influence a given FF _{FF}

The algorithm takes each FF _{i}_{i}

has to be performed for all _{FF}_{1}, …, _{k}

Analyzing typical designs with the proposed algorithm showed that the majority of FFs are driven by a very small set of FFs pre_^{n} configurations to be evaluated, and heuristics have to be devised.

A naive approach would use “divide et impera,” splitting every node where |_{i}

Let _{i}_{i}_{i}_{i}

Afterwards, all possible bit configurations are assigned to the dummy inputs connected to _{i}

It is worth noting that this heuristic relies on the fact that synthesizers tend to keep the voting logic close to the originating FFs, and therefore splitting subgraphs with a large number of inputs

However, it cannot be excluded that some voting logic might be broken, resulting in some rare false positive alerts (see ‘Experimental Results’). This will

Verifying that the voters are correctly performing their task is not sufficient to guarantee that TMR structures are working. One also needs to show that transient errors on clock and reset lines are not affecting more than one FF at a time.

Using the detected triplets, it is possible to verify that FFs belonging to the same triplet do not share the same clock and reset lines. This is a simple structural analysis that does not require a heuristic.
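A minimal sketch of such a structural check is shown below, assuming that the net driving the clock pin and the reset pin of every FF has already been extracted into two dictionaries. The function `shared_control_lines` and all net names are hypothetical.

```python
def shared_control_lines(triplets, clock_of, reset_of):
    """Report triplets in which at least two members share a clock or reset net."""
    problems = []
    for members in triplets:
        for label, net_of in (("clock", clock_of), ("reset", reset_of)):
            nets = [net_of[ff] for ff in members]
            if len(set(nets)) < len(nets):
                problems.append((members, label))
    return problems


# Hypothetical triplet whose first two members share one reset branch.
triplets = [("a0", "a1", "a2")]
clock_of = {"a0": "clk_0", "a1": "clk_1", "a2": "clk_2"}
reset_of = {"a0": "rst_0", "a1": "rst_0", "a2": "rst_2"}
result = shared_control_lines(triplets, clock_of, reset_of)
print(result)  # [(('a0', 'a1', 'a2'), 'reset')]
```

Which net counts as "the same line" depends on how far back along the clock or reset tree the comparison is made; the sketch compares only the nets directly attached to the FF pins.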

Given a design with n FFs, an exhaustive analysis has 2^{n} possible FF configurations to test, so the number of node evaluations it requires grows exponentially in n.

Determining a subgraph to be analyzed for every node _{FF}_{x}_{x}_{x}^{t} valid configurations we have to evaluate for every subgraph (assuming FF triplication, we expect less than ^{t} ⋅ ^{2}) node evaluations, showing polynomial behavior and outperforming other exponential verification methods.

The algorithm presented in ‘TMR structure analysis’ was implemented as a C++ program called InFault. The graph is obtained in two steps: first a given Verilog netlist is converted into an intermediate file format, which is then read and analyzed by InFault. This separation makes the parser independent from the main program, allowing easy development of parsers for different input files.

The graph itself was implemented in a custom structure, using pointers whenever possible and STL (

The implementation was tested on the submodule netlists of a radiation-hardened LEON2-FT processor (

Comparing InFault to an exhaustive approach, for example for the pci submodule, we have that this module is verified in less than ^{7974} ≈ 4.9 ⋅ 10^{2405} evaluations, showing that InFault provides orders of magnitude of speedup.
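The order of magnitude quoted above can be checked with a short calculation. The sketch assumes that the exhaustive count is the number of gates multiplied by 2 raised to the number of FFs, using the pci figures from the results table (190,987 gates and 7,974 FFs). This formula is an assumption made here because it reproduces the quoted value; it is not stated explicitly in the text.

```python
import math

gates, ffs = 190_987, 7_974

# Work in log10 to avoid formatting a 2,406-digit integer.
log10_total = math.log10(gates) + ffs * math.log10(2)
exponent = math.floor(log10_total)
mantissa = 10 ** (log10_total - exponent)
print(f"{mantissa:.1f}e{exponent}")  # 4.9e2405
```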

Because the runtime of InFault depends on the threshold introduced in ‘Splitting algorithm,’ we tested several threshold values to measure the speed of the algorithm. In general, smaller thresholds give shorter runtimes at the cost of more false positive alerts, because more voters are broken during subgraph splitting. False positives have to be resolved by manual graph inspection or by other means.

As the threshold increases, the sum of false positives over all nine designs drops from 63 to 12, while the overall runtime rises from 19 min to 13 h. For a suggested threshold of 15, the runtime is around 25 min. Note that the runtime correlates strongly with the internal structure of the design, especially the subgraph sizes, and is therefore subject to fluctuations among the designs.

To show the effectiveness of the subgraph splitting, the algorithm was tested on the nine netlists, logging the different subgraph sizes before and after splitting (with threshold = 15). ^{32} valid configurations to be checked for each of those nodes, making the splitting heuristic an essential component of our approach. In fact, after splitting the situation is completely different: even though the splitting results in many more subgraphs to be checked, the subgraph sizes are much smaller. There are no subgraphs with more than 54 driving gates, giving no more than 2^{18} valid configurations.
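The bound of 2^{18} follows from the triplet constraint, as the short calculation below illustrates. It assumes that the 54 driving elements are all FFs and that every one of them belongs to a triplet, so that the members of each triplet must take the same value.

```python
driving_ffs = 54
triplets = driving_ffs // 3              # 18 triplets

valid = 2 ** triplets                    # members of a triplet share one value
unconstrained = 2 ** driving_ffs         # every FF free to differ

print(valid)          # 262144
print(unconstrained)  # 18014398509481984
```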

It is worth noting that the results depend on the complexity of the circuit, since the effectiveness of the splitting algorithm varies with the number of driving FFs. InFault might not achieve the same performance on other processors with very complex multi-layer logic.

To show its fault detecting capabilities, InFault was verified on a netlist (module

Times in minutes (m), hours (h), and days (d).

Testcase | # gates^{a} | # FFs | FT-U^{b} | InFault time (th-15) | InFault time (th-21) | FP^{c} |
---|---|---|---|---|---|---|
Resetgen | 648 | 30 | 8h | <1m | <1m | 0 |
pci mas | 14,379 | 453 | 5d 5h | <1m | 2m | 0 |
pci tar | 13,768 | 546 | 6d 7h | <1m | 10m | 0 |
mctrl | 35,357 | 1,251 | 14d 11h | 1m | 1m | 0 |
fpu | 66,967 | 1,437 | 16d 15h | 10m | 10m | 6 |
amod | 87,193 | 3,303 | 38d 5h | 1m | 3m | 2 |
iu | 147,894 | 4,224 | 48d 21h | 8m | 406m | 3 |
pci | 190,987 | 7,974 | 92d 7h | 4m | 264m | 2 |

^{a} Gate count after mapping library to standard logic cells.

^{b} Not exhaustive.

^{c} False positives, same results for both thresholds.

In this work we presented an algorithm to verify the TMR implementation of a given netlist. Because it performs exhaustive verification without a testbench, this approach does not depend on the quality and coverage of a given testbench, as other solutions do. First results show that exhaustive TMR verification of production-ready netlists can be carried out within a few hours. To the best of the authors’ knowledge, no other approach provides this kind of performance.

Future work includes replacing the current simulation/injection step with the identification of triplets followed by formal verification of the correct propagation of flip-flop values through the voting logic.

The author would like to thank Simon Schulz and David Merodio-Codinachs for the help provided in the development of InFault.

The author declares there are no competing interests.
