THE VLSI HOMEPAGE

A Practical Guide to VLSI Design and Verification.

FIFOs - Architecture and Design

Posted in Digital Design by Nigam on the September 20th, 2007

Introduction

A designer encounters FIFOs in speed-matching or data-width-matching applications. An example of speed matching is when data is transferred in bursts from a faster clock domain to a slower clock domain that samples the data. An example of data width matching is when the sampling clock domain is faster but its data width is narrower than that of the write side. FIFOs can be synchronous or asynchronous, i.e. the read and write clocks can be synchronous or asynchronous to each other.

Full and Empty flags

The FIFO full and empty status conditions are derived from the write and read pointers of the FIFO. The write pointer always points to the next word to be written and is incremented on a write to the FIFO. The read pointer points to the current word to be read, so that word is already driven as valid data on the output port; the reader does not have to spend an extra clock fetching it, which makes the design efficient.

The read and write pointers are equal both when the FIFO is empty and, because of wraparound, when it is full. To tell the two cases apart, an extra bit is added to each pointer as its MSB. If the remaining address bits are equal and the MSBs differ, the write pointer has wrapped around one more time than the read pointer, and the FIFO is full. If the MSBs are also the same, the FIFO is empty.
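The comparison can be sketched in Verilog as follows. This is a minimal illustration that assumes binary pointers living in the same clock domain (a synchronous FIFO); the module name, port names and the ADDR_BITS parameter are hypothetical and not taken from the post.

```verilog
// Sketch: full/empty flags from binary pointers that carry one extra wrap bit.
// Assumes wr_ptr and rd_ptr are in the same clock domain.
module fifo_flags (full, empty, wr_ptr, rd_ptr);

   parameter ADDR_BITS = 4;               // assumed: FIFO depth = 2**ADDR_BITS

   output               full, empty;
   input  [ADDR_BITS:0] wr_ptr, rd_ptr;   // bit ADDR_BITS is the extra wrap bit

   // Empty: every bit matches, including the wrap bit.
   assign empty = (wr_ptr == rd_ptr);

   // Full: the address bits match but the wrap bits differ.
   assign full  = (wr_ptr[ADDR_BITS]     != rd_ptr[ADDR_BITS]) &&
                  (wr_ptr[ADDR_BITS-1:0] == rd_ptr[ADDR_BITS-1:0]);

endmodule
```

This form of the full test applies to binary pointers only; Gray-coded pointers need a different comparison.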

Pointer Synchronization

Synchronizing the read and write pointers in an asynchronous FIFO is necessary because the write pointer is generated in the write clock domain and the read pointer in the read clock domain. To generate the empty and full status flags, these pointers must be transferred from one domain to the other.

Several techniques exist for synchronizing the pointers. One method is to synchronize the read and write strobes and use counters in the read and write domains. The read counter tracks the number of valid data entries, while the write counter tracks the number of free entries available to store data. Each read strobe decrements the read counter and is also synchronized to the write clock, where it increments the write counter. Similarly, each write strobe decrements the write counter and is synchronized to the read clock, where it increments the read counter.

The strobes are synchronized using toggle synchronizers, and because synchronization adds latency, the resulting empty/full status is pessimistic. The disadvantages of this method are that large FIFOs require large counters and that, since strobes must be spaced at least two cycles apart in the slow clock domain (see toggle synchronizer), the achievable data rate is reduced.
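For reference, a toggle synchronizer can be sketched roughly as below. This is an illustrative outline under stated assumptions, not the post's own code: the module and signal names are hypothetical, pulse_in is assumed to be one clk_in cycle wide, and a shared active-low reset is assumed for brevity.

```verilog
// Sketch: toggle synchronizer that carries a single-cycle strobe across domains.
module toggle_sync (pulse_out, pulse_in, clk_in, clk_out, rst_n);

   output pulse_out;
   input  pulse_in, clk_in, clk_out, rst_n;

   reg       toggle;   // clk_in domain: flips once per input strobe
   reg [2:0] sync;     // clk_out domain: two synchronizer stages plus one edge-detect stage

   // Source domain: convert each strobe into a level change.
   always @(posedge clk_in or negedge rst_n)
      if (!rst_n)        toggle <= 1'b0;
      else if (pulse_in) toggle <= ~toggle;

   // Destination domain: shift the level through the synchronizer chain.
   always @(posedge clk_out or negedge rst_n)
      if (!rst_n) sync <= 3'b000;
      else        sync <= {sync[1:0], toggle};

   // Any change of the synchronized level yields a one-cycle pulse in clk_out.
   assign pulse_out = sync[2] ^ sync[1];

endmodule
```

If two strobes arrive before the first level change has been sampled in the destination domain, the toggles cancel and both strobes are lost, which is why the strobes must be spaced apart.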

Another method is to synchronize the read and write pointers themselves. This is problematic with binary pointers, since more than one bit can change at a time and the synchronized value is then unpredictable. The solution is to use Gray code counters, which change only one bit at a time, synchronize the Gray-coded pointers, and generate the empty and full flags from them.
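A two-stage synchronizer for a Gray-coded pointer might look like the sketch below. The module name, port names and PTR_BITS parameter are hypothetical, and the sketch assumes ptr_gray comes straight from a register in the source clock domain so that at most one bit is changing when it is sampled.

```verilog
// Sketch: two-flop synchronizer for a Gray-coded pointer.
// clk and rst_n belong to the destination clock domain.
module gray_ptr_sync (ptr_sync, ptr_gray, clk, rst_n);

   parameter PTR_BITS = 5;                // assumed width: address bits + wrap bit

   output [PTR_BITS-1:0] ptr_sync;
   input  [PTR_BITS-1:0] ptr_gray;
   input                 clk, rst_n;

   reg [PTR_BITS-1:0] meta;               // first stage, may go metastable
   reg [PTR_BITS-1:0] ptr_sync;           // second stage, used by the flag logic

   always @(posedge clk or negedge rst_n) begin
      if (!rst_n) begin
         meta     <= {PTR_BITS{1'b0}};
         ptr_sync <= {PTR_BITS{1'b0}};
      end else begin
         meta     <= ptr_gray;
         ptr_sync <= meta;
      end
   end

endmodule
```

Because only one bit of a Gray-coded pointer changes per increment, a sample taken during a transition resolves to either the old or the new pointer value, never to an unrelated one.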

FIFO Depth

Calculating the depth of the FIFO requires the write and read clock frequency relation, burst rate on the write clock domain, synchronization latency and any idle cycles in the read domain.

Scenario 1:

Consider the case of a FIFO where the write clock frequency is 100 MHz and 50 words are written into the FIFO in 100 clocks while the read clock frequency is 50 MHz and one word is read out every clock.

In the worst case, the 50 words are written into the FIFO as a burst in 500 ns. In the same 500 ns, the read side can read only 25 words out of the FIFO. The remaining 25 words are read out during the 50 idle write clocks. So the depth of the FIFO should be at least 25 (+ synchronizer latency) = ~28.

The FIFO depth is calculated as

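Depth = Burst_size * { 1 - (Frd/(Fwr * Idle_cycles)) }

where Frd and Fwr are the read and write clock frequencies and Idle_cycles is the number of read clocks taken per word read (1 in Scenario 1, where a word is read on every clock). Synchronizer latency is added on top of this figure.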
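As a check, the formula can be applied to the numbers of Scenario 1. This worked example takes Idle_cycles = 1, meaning one read clock per word; that reading is an interpretation of the formula rather than something the post states.

```latex
\[
\text{Depth}
  = \text{Burst\_size}\left(1 - \frac{F_{rd}}{F_{wr}\cdot\text{Idle\_cycles}}\right)
  = 50\left(1 - \frac{50\ \text{MHz}}{100\ \text{MHz}\times 1}\right)
  = 25
\]
```

This matches the 25 words obtained by counting reads during the 500 ns burst.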

Scenario 2:

Consider the case of a FIFO where the write clock frequency is 100 MHz and 80 words are written into the FIFO in 100 clocks while the read clock frequency is 80 MHz and 8 words are read out every 10 clocks. There is no feedback mechanism to throttle the writes to the FIFO.

In the worst case, the 80 words are written into the FIFO as a burst in 800 ns. In that time, the read side can read only ~51 words ( (800/125) * 8 ). In the remaining 200 ns, only ~13 more words ( (200/125) * 8 ) can be read out, leaving 16 words in the FIFO. Since this 16-word backlog is added in every 100-clock write window and there is no way to throttle the writes, the FIFO would need infinite depth to make this design work!
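The same conclusion follows from comparing the average rates of the two sides. The symbols R_wr, R_rd and N below are introduced only for this illustration.

```latex
\[
R_{wr} = \frac{80\ \text{words}}{100 \times 10\ \text{ns}} = 80\ \text{Mwords/s},
\qquad
R_{rd} = \frac{8\ \text{words}}{10 \times 12.5\ \text{ns}} = 64\ \text{Mwords/s}
\]
\[
\text{Backlog after } N \text{ write windows of } 1000\ \text{ns}
  = N\,(80 - 64) = 16N\ \text{words}
\]
```

Since the average write rate exceeds the average read rate, the backlog grows without bound and no finite depth is sufficient.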

For more details on FIFO design and Verilog code, the reader is encouraged to read Cliff Cummings' paper on asynchronous FIFOs.

 


Gray code counters

Posted in Digital Design by Nigam on the September 17th, 2007

While designing modules with asynchronous clock transfers, one may encounter the problem of transferring a multi-bit data bus from one clock domain to another. Passing each bit through a double-flop synchronizer and hoping that all the bits are latched on the same clock is problematic. To eliminate this problem, we use Gray code counters, where only one bit changes on each clock transition.

The most common Gray code is the reflected one, where the second half of the sequence is exactly the mirror image of the first half with only the MSB inverted. As an example, the 3-bit Gray code sequence is 000, 001, 011, 010, 110, 111, 101, 100.

Figure: Gray code counter schematic (from Cliff Cummings' paper)

Gray code to equivalent binary conversion is simple and is shown below (^ is the XOR function):

bin[2] = gray[2];

bin[1] = gray[2] ^ gray[1];

bin[0] = gray[2] ^ gray[1] ^ gray[0];

The Verilog module is shown below.

CODE:
module gray2binary_converter (binary, gray);

    parameter NUM_BITS = 3;
    output [NUM_BITS-1:0] binary;
    input  [NUM_BITS-1:0] gray;

    reg [NUM_BITS-1:0] binary;
    integer i;

    // binary[i] is the XOR of gray[NUM_BITS-1] down to gray[i].
    always @(gray) begin
       for (i = 0; i < NUM_BITS; i = i + 1)
          binary[i] = ^(gray >> i); // the shift zero-fills the upper bits, which do not affect the XOR
    end

endmodule

Similarly, binary to Gray conversion is achieved by

gray[2] = binary[2];

gray[1] = binary[2] ^ binary[1];

gray[0] = binary[1] ^ binary[0];

The Verilog code is

CODE:
module binary2gray_converter (gray, binary);

   parameter NUM_BITS = 3;
   output [NUM_BITS-1:0] gray;
   input  [NUM_BITS-1:0] binary;

   // Each Gray bit is the XOR of a binary bit and its more significant neighbor.
   assign gray = (binary >> 1) ^ binary; // Right shift binary vector and XOR

endmodule
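The two conversions can be checked against each other with a small self-checking testbench. The sketch below is illustrative only: the testbench name and its signals are hypothetical, and it reuses the two conversion expressions inline instead of instantiating the modules so that it stays self-contained.

```verilog
// Sketch: round-trip check, binary -> Gray -> binary, over every code.
module gray_roundtrip_tb;

   parameter NUM_BITS = 3;

   reg  [NUM_BITS-1:0] binary;
   reg  [NUM_BITS-1:0] back;
   wire [NUM_BITS-1:0] gray = (binary >> 1) ^ binary;  // binary-to-Gray expression
   integer i, k, errors;

   initial begin
      errors = 0;
      for (i = 0; i < (1 << NUM_BITS); i = i + 1) begin
         binary = i;
         #1;                                 // let the continuous assignment settle
         for (k = 0; k < NUM_BITS; k = k + 1)
            back[k] = ^(gray >> k);          // Gray-to-binary expression
         if (back !== binary)
            errors = errors + 1;
         $display("binary=%b gray=%b back=%b", binary, gray, back);
      end
      if (errors == 0) $display("PASS");
      else             $display("FAIL: %0d mismatches", errors);
      $finish;
   end

endmodule
```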

A Gray code counter can be implemented using these functions; please refer to Cliff Cummings' excellent paper on asynchronous clock domains.
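One possible arrangement, sketched below, keeps a binary count in a register and registers its Gray-coded equivalent alongside it. This is an assumption-laden illustration and not necessarily the structure used in the referenced paper; the module name, the inc enable and the active-low reset are hypothetical.

```verilog
// Sketch: Gray code counter built from a binary counter plus binary-to-Gray conversion.
module gray_counter (gray, clk, rst_n, inc);

   parameter NUM_BITS = 3;

   output [NUM_BITS-1:0] gray;
   input                 clk, rst_n, inc;

   reg  [NUM_BITS-1:0] binary;            // internal binary count
   reg  [NUM_BITS-1:0] gray;              // registered Gray-coded output

   wire [NUM_BITS-1:0] binary_next = binary + inc;
   wire [NUM_BITS-1:0] gray_next   = (binary_next >> 1) ^ binary_next;

   always @(posedge clk or negedge rst_n) begin
      if (!rst_n) begin
         binary <= {NUM_BITS{1'b0}};
         gray   <= {NUM_BITS{1'b0}};
      end else begin
         binary <= binary_next;
         gray   <= gray_next;
      end
   end

endmodule
```

Registering the Gray value means the output comes directly from flip-flops, so no combinational glitches reach a synchronizer in the other clock domain.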


Design Verification Flow

Posted in Design Verification by Nigam on the September 16th, 2007

Verification consumes more than 70% of the effort and is on the critical path to tapeout in today's complex, multimillion-gate ASICs. To shorten time to market, current methodologies focus on running verification in parallel with design, partly automating the verification process, and verifying at higher levels of abstraction.

A typical verification flow is shown in the figure below.

Figure: Design Verification Flow

The architecture spec details the chip's functional requirements and the features to be supported, and is the starting point for both design and verification. The verification plan (VP) clearly defines the features that will be verified in the design and prioritizes testcases based on the schedule and critical features. The entire testsuite is covered in the plan, and both the design and verification teams review the VP to ensure that there are no holes in verification and that all features will be verified.

The verification environment can be a black-box model with no knowledge of implementation details: it applies the appropriate stimulus at the design inputs and checks the outputs against expected behavior. Because the entire design is black-boxed, the advantage is the reusability of generic verification IPs. Alternatively, it can be a white-box model with full controllability and observability of the design's internals.

The verification environment integrates the stimulus generators, bus functional models, checkers and scoreboards, usually written in a high-level verification language (HVL) like Specman 'e' or Vera. We will cover these modules in detail in another post.

Testcases can be either random or directed. Directed testcases, as the name indicates, stimulate only one particular feature in the design and observe the response. Random testcases exercise the design with stimulus that is randomized within bounded constraints. The VP determines the number of testcases to be written, and the regression suite covers all these testcases.
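The difference between the two kinds of testcase can be sketched with a plain Verilog testbench. Everything here is hypothetical: the stand-in design is a binary-to-Gray conversion, the checked property is that consecutive codes differ in exactly one bit, and NUM_RANDOM is an arbitrary trial count. A real environment would typically be written in an HVL instead.

```verilog
// Sketch: one directed testcase and a batch of random testcases with a checker.
module directed_random_tb;

   parameter NUM_BITS   = 3;
   parameter NUM_RANDOM = 20;    // assumed number of random trials

   reg [NUM_BITS-1:0] value;
   integer i, errors;

   // Stand-in design under test: binary-to-Gray conversion.
   function [NUM_BITS-1:0] to_gray;
      input [NUM_BITS-1:0] b;
      to_gray = (b >> 1) ^ b;
   endfunction

   // Checker helper: number of bits that differ between the Gray codes of b and b+1.
   function integer bits_changed;
      input [NUM_BITS-1:0] b;
      reg   [NUM_BITS-1:0] b_next, diff;
      integer k;
      begin
         b_next = b + 1'b1;                       // wraps at the top of the range
         diff   = to_gray(b) ^ to_gray(b_next);
         bits_changed = 0;
         for (k = 0; k < NUM_BITS; k = k + 1)
            bits_changed = bits_changed + diff[k];
      end
   endfunction

   task check;
      input [NUM_BITS-1:0] b;
      if (bits_changed(b) != 1) begin
         errors = errors + 1;
         $display("FAIL: expected exactly one bit change after %b", b);
      end
   endtask

   initial begin
      errors = 0;

      // Directed testcase: targets one specific feature, the wraparound step.
      check({NUM_BITS{1'b1}});

      // Random testcases: $random stimulus, bounded to NUM_BITS by truncation.
      for (i = 0; i < NUM_RANDOM; i = i + 1) begin
         value = $random;
         check(value);
      end

      if (errors == 0) $display("PASS");
      else             $display("FAIL: %0d errors", errors);
      $finish;
   end

endmodule
```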

Coverage metrics track the progress of verification and help identify the holes in verification. Any bug uncovered by the regression suite is filed in a bug-tracking database and fed back to the design team for correction. Verification is said to be 100% done when the coverage metrics meet the goals defined in the verification plan.

Janick Bergeron's "Writing Testbenches" covers the entire verification cycle in detail and is highly recommended for both new and experienced verification engineers.
