Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

What does it mean for hardware synthesised from Verilog code to be correct

I have read "Nonblocking Assignments in Verilog Synthesis, Coding Styles that Kill!" by Clifford Cummings. He says that the code at the bottom of this question is "guaranteed" to be synthesised into a three flip-flop pipeline, but it is not guaranteed to simulate correctly (example pipeb3, page 10; the "guaranteed" comment is on page 12). The document won a best paper award, so I assume the claim is true. http://www.sunburst-design.com/papers/CummingsSNUG2000SJ_NBA.pdf

My question: How is the correctness of Verilog synthesis defined if not by reference to the simulation semantics? Many thanks.

I suppose the bonus points question is: give the simplest possible Verilog program that has well-defined synthesis semantics and does not have well defined simulation semantics, assuming it is not the code below. Thanks again.

In fact, can someone give me a piece of Verilog thatis well defined when both simulated and synthesised, yet the two produce different results?

The code:

module pipeb3 q3, d, clk);
  output [7:0] q3;
  input [7:0] d;
  input clk;
  reg [7:0] q3, q2, q1;

  always @(posedge clk) q1=d;
  always @(posedge clk) q3=q2;
  always @(posedge clk) q1=d;
endmodule

PS: in case anyone cares, I though a plausible definition of a correct synthesis tool might be along the lines of "the synthesised hardware will do something that a correct simulator could". But this is inconsistent with the paper.

[I now think the paper is not right. Section 5.2 of the 1364-2001 standard clearly says that the meaning of a Verilog program is defined by its simulation that the standard then proceeds to define (non-determinism and all). There is no mention whatsoever of any "guarantees" that synthesis tools must provide over and above simulators.

There is another standard 1364.1-2002 that describes the synthesisable subset. There is no obvious mention that the semantics of synthesised hardware should somehow differ from simulation. Section 5.2.2 "Modelling edge-sensitive storage devices" says that non-blocking assignments should be used to model flip-flops. In standard-speak that means that the use of anything else is unsupported.

As a final note, the section referred to in the previous paragraph says that blocking assignments can be used to calculate the RHS of the non-blocking assignment. This appears to violate Cummings' recommendation #5.

Cliff Cummings is listed as a member of the working group of the 1364.1-2002 standard. This standard is listed as replaced on the IEEE website but I cannot tell what it was replaced by.]

like image 356
user1002059 Avatar asked Dec 20 '22 19:12

user1002059


2 Answers

All -

Time for me to chime in with useful background information and my own opinions.

First - The IEEE-1364.1-2002 Verilog RTL Synthesis Standard was never fully implemented by any vendor, which is why none of us were in any hurry to update the standard or to provide a SystemVerilog version of the synthesis standard. To my knowledge, the standard was not "replaced," and has just expired. To my knowledge, the attributes described in the Standard were never fully implemented by any vendor. The only useful feature in the Standard that I believe was implemented by all vendors was that a vendor is supposed to set the macro `define SYNTHESIS before reading any user code, so that you can now use `ifndef SYNTHESIS - `endif as a generic replacement for the vendor-specific // synopsys translate_on - // synopsys translate_off pragma-comments.

Verilog was invented as a simulation language and was never intended to be a synthesis language. In the late 1980's, Synopsys recognized that engineers really liked this Verilog-simulation language and started to define a subset of the language that they (Synopsys) would recognize and convert through synthesis into hardware. We now refer to this as the RTL synthesis subset, and that subset can grow over time as synthesis tool vendors discover unique and creative ways to convert a new type of description into hardware.

There really is no "correctness of Verilog synthesis defined." Don Mills and I wrote a paper in 1999 entitled, "RTL Coding Styles That Yield Simulation and Synthesis Mismatches," to warn engineers about legal Verilog coding styles that could infer synthesized hardware with different behavior. http://www.sunburst-design.com/papers/CummingsSNUG1999SJ_SynthMismatch.pdf

Consider this, if synthesized results always matched the behavior of Verilog simulations, there would be no need to run gate simulations. The design, as RTL-simulated, would be correct. Because there is no guaranteed match, engineers run gate-sims to prove that the gate behavior matches the RTL behavior, or they try to run equivalence checking tools to mathematically prove that the pre-synthesis RTL code is equivalent to the post-synthesis gate models, so that gate-sims are not required.

As for the bonus question, this is really hard, because Verilog semantics are rather well defined, even if the definition is that it is a legal race condition.

As far as well-defined code in simulation and synthesis with different results, consider:

module code1c (output reg o, input a, b);

  always
    o = a & b;
endmodule

In simulation, you never get past time-0. Simulation will loop forever because of the missing sensitivity list. Synthesis tools do not even consider the sensitivity list when inferring combinational logic, so you will get a 2-input and-gate and a warning about missing sensitivity list items that could cause a mis-match between pre- and post-synthesis simulations. In Verilog-2001 we added always @* to avoid this common problem, and in SystemVerilog we added always_comb to remove the sensitivity list and inform the synthesis tool of the designer-intended logic.

As far as whether the paper should offer guarantees on correct synthesis behavior, it probably should not, but the guarantees described in my paper define what an engineer can expect from a synthesis tool based on experience with multiple synthesis tools.

"As a final note, the section referred to in the previous paragraph says that blocking assignments can be used to calculate the RHS of the non-blocking assignment. This appears to violate Cummings' recommendation #5."

You are correct, this does violate coding guideline #5 and in my opinion should not be used.

Coding guideline #5 is frequently violated in VHDL designs because VHDL variables cannot trigger another process. I find the VHDL-camp evenly divided on this issue. Half say that you should not use variable assignments and the other half use variables to improve simulation performance but then are required to mix variable assignments with a final signal assignment to trigger other processes.

If you violate coding guideline #5 and if your code is correct, the simulation will work and the synthesis will also work, but if you have any mistakes in your code, it is very difficult to debug designs that violate coding guideline #5 because the waveform display for the combinational piece does not make sense. The output of the combinational logic in a waveform display only updates when reset is not asserted and on a clock edge, which is not how real combinational hardware behaves, and this has proven to be a difficult issue when debugging these designs using waveform displays (I did not include this information in the paper).

Regards - Cliff Cummings - Verilog & SystemVerilog Guru

like image 104
Cliff Cummings Avatar answered Dec 29 '22 05:12

Cliff Cummings


I believe the reason that will synthesize correctly is because in real silicon there's no difference between 'blocking' and 'nonblocking'.

Synthesis will read that and create three flip flops chained back to back, as you've described.

This won't be a problem in synthesis (assuming you're not violating flop hold time), because real gates exhibit delays. On the rising edge of clk, it will take several ns for the value d to propogate to q1. By the time d propagates to q1, q1 will have already been sampled by the second flop, similarly with q2 and q3.

The reason this doesn't work in simulation is because there are no gate delays. On the positive edge of clock, q1 will be instantly replaced with d, possibly before q1 was sampled by the second flop. In a real circuit (with proper setup and hold time), q1 is guaranteed to be sampled on the positive edge of clock before the first flop can change its output value.

like image 21
Tim Avatar answered Dec 29 '22 05:12

Tim