A Fast CRC Implementation on FPGA Using a Pipelined Architecture for the Polynomial Division

Fabrice MONTEIRO, Abbas DANDACHE, Amine M’SIR, Bernard LEPLEY
LICM, University of Metz, SUPELEC, Rue Edouard Belin, 57078 Metz Cedex
phone: +33(0)387547311, fax: +33(0)387547301, email: fabrice. [email protected] org

ABSTRACT
CRC error detection is a very common function in telecommunication applications. The evolution towards increasing data rates requires more and more sophisticated implementations. In this paper, we present a method to implement the CRC function based on a pipeline structure for the polynomial division. It improves the speed performance very effectively, allowing data rates from 1 Gbits/s to 4 Gbits/s on FPGA implementations, according to the parallelisation level (8 to 32 bits).
1 INTRODUCTION
CRC (Cyclic Redundancy Checking) codes are used in many telecommunication applications. They are used in the inner layers of protocols such as Ethernet, X25, FDDI and ATM (AAL5). However, on modern networks, the need for increasing data rates (over 1 Gbit/s) is setting very high constraints on performance. Indeed, the speed improvement (higher clock rates) due to technological evolution is unable to meet the demand. Consequently, new architectures must be devised. Targeting the applications to an FPGA device is an objective of this paper, as it allows low-cost designs. The simple and straightforward serial implementation is a pure hardware implementation of the CRC algorithm.
Unfortunately, on an FPGA implementation with a maximal clock frequency of 250 MHz, the maximal data rate is limited to 250 Mbits/s in the best case. Higher data rates can only be obtained through parallelisation. Some parallel architectures have been proposed in the past to address the need for high data throughput [1][2]. The main problem is usually to limit the rapidly increasing area overhead while improving the speed performance. In this paper, we present a parallel approach for the polynomial division based on a pipeline structure. The parallelisation can be taken to any level and is only limited by the area constraint set on the design. The data throughput is almost directly linked to the parallelisation level, as the maximal clock rate is not very sensitive to it.
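As a rough, illustrative reading of these figures, assume that one P-bit word is consumed per clock cycle, so that the data rate is about P times the clock frequency. The symbols D and f_clk are introduced here for illustration only and are not the paper's notation.

```latex
D \approx P \cdot f_{clk}, \qquad
P = 1,\ f_{clk} = 250\ \text{MHz} \;\Rightarrow\; D \approx 250\ \text{Mbits/s}
```

Under the same assumption, the 1 Gbits/s to 4 Gbits/s range quoted in the abstract for P = 8 to 32 would correspond to clock rates on the order of 125 MHz.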
2 PRINCIPLE
The polynomial division is the basic operation of CRC applications. The serial implementation of the division is shown in figure 1 for the case where the polynomial divisor is G(X) = G_0 + G_1·X + G_2·X^2 + G_3·X^3 = 1 + X + X^3. As mentioned previously, the data throughput of this serial implementation is quite low. Very high data rates can only be achieved with high clock frequencies, which in turn can only be obtained using rather expensive technological solutions.
Parallelisation of data processing is the main solution to improve the speed performance of a circuit (or system) if the clock rate must remain low. Pipelining may be used as an efficient parallelisation method when a repetitive process must be applied to large volumes of data. Previous works have addressed the parallelisation problem in large, demanding computational applications, especially in arithmetic (e.g. [3][4]) and error control coding circuits (e.g. [1][2][6]). In the serial architecture (figure 1), a new data bit is injected on each clock cycle. The previous cumulated remainder is simultaneously multiplied by X and divided by G(X).
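The serial step described above can be sketched in software as follows. This is a minimal illustration, not the authors' hardware description: the function name `serial_remainder` and the parameter `g_low` are hypothetical, and the sketch assumes that each data bit is injected at the X^0 position of the remainder register.

```python
def serial_remainder(bits, g_low=(1, 1, 0)):
    """Bit-serial remainder of D(X) modulo a monic G(X), one data bit per step.

    bits  : data bits, highest-degree coefficient first
    g_low : (G0, G1, ..., G_{n-1}), the low coefficients of the divisor;
            the default encodes G(X) = 1 + X + X^3 (assumed example)
    """
    n = len(g_low)
    r = [0] * n                      # remainder register [R0, R1, ..., R_{n-1}]
    for d in bits:
        fb = r[-1]                   # coefficient pushed out by the multiply by X
        # shift up (multiply by X), feed back G (divide), inject d at X^0
        r = [(fb & g_low[0]) ^ d] + [
            r[j - 1] ^ (fb & g_low[j]) for j in range(1, n)
        ]
    return r


if __name__ == "__main__":
    # D(X) = X^3, and X^3 mod (1 + X + X^3) = 1 + X
    print(serial_remainder([1, 0, 0, 0]))   # [1, 1, 0]
```

One call to the loop body corresponds to one clock cycle of the serial architecture, which is why the data rate cannot exceed one bit per cycle.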
On P successive clock cycles, P bits are injected and P successive multiplications and divisions are performed. The following equation (related to the example of figure 1) describes the operation performed on one clock cycle.

\[
T = \begin{bmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ G_0 & G_1 & G_2 \end{bmatrix}
  = \begin{bmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 1 & 1 & 0 \end{bmatrix}
\]

Figure 1: Serial polynomial division for G(X) = 1 + X + X^3

0-7803-7057-0/01/$10.00 ©2001 IEEE

A pipeline structure can be devised by the implementation of P successive multiplications and divisions. However, to keep the clock rate high, the P operations should not be performed in a single combinatorial block. Thus, the stages of the P-stage multiplying/dividing block must be separated by registers. This is the basic idea of the pipeline structure. Each of the P parallel bits of an input must be injected in its respective pipeline stage. Therefore, they must be injected on different clock cycles. This may be done if the bits are delayed in a shift-register structure (cf. the shift register path between [din_0, ..., din_{P-1}] and [dout_0, ..., dout_{P-1}] in figure 2, with P = 8 in this example and G(X) = 1 + X + X^3).

The operation performed when passing from stage k+1 to stage k of the pipeline (k > 0) is described in the following equation, where G(X) = 1 + X + X^3 as in figure 2, with R_{i,j} = 0 when i + j > P - 1.

The P bits of an input are processed in P clock cycles. At each clock cycle, the result of the processing of P bits is available at the output of the pipeline structure. This result (the remainder of the P bits divided by G(X)) must be cumulated in the [R_{0,0}, R_{0,1}, R_{0,2}] register using a recursive approach, similar to the scheme of the serial architecture of figure 1. The cumulated remainder at time t must be multiplied by X^P and then divided by G(X). Then, the new partial remainder coming out of the pipeline structure can be cumulated. This process is described in the following equation, where M denotes the matrix performing the multiplication by X^P followed by the division by G(X).

\[
[R_{0,0}, R_{0,1}, R_{0,2}]_{t+1} = [R_{0,0}, R_{0,1}, R_{0,2}]_t \cdot M + [R_{1,0}, R_{1,1}, R_{1,2}]_t \cdot T + [D_{0,P-1}, 0, 0]_t
\]

3 RESULTS
This architecture has been implemented on FPGA devices of the ALTERA FLEX10KE family. These devices have their maximal clock frequency limited to 250 MHz. The architecture was tested on the generating polynomials of table 1.

The results in table 2 were obtained on these FLEX10KE devices. The architecture tested in these examples implements a fully operational CRC checker. The synchronisation signals to write and read data, respectively on input and output, are fully implemented. The synthesis was performed using Synplify 5.3 and MaxPlusII 10.0. The architecture was tested for 3 different levels of parallelism on 6 different test divisor polynomials.

It can be noticed that G17(X) is used on Ethernet, FDDI and AAL5-ATM, while G14(X) is the test polynomial for the X25 protocol. The clock rates must be compared to the highest frequency (250 MHz) that can be achieved on FLEX10KE devices. The "lc" indication stands for "logic cells" and is an indication of the area consumption.

The results must be compared to those obtained in [5]. A data rate of 160 Mbits/s was obtained on an ALTERA FLEX10K device (max. clock rate of 125 MHz), on a 32-bit parallel runtime-configurable CRC implementation of the decoder, based on the use of a parallel combinatorial block for the polynomial division as presented in [1]. The gain obtained on the 32-bit parallel architecture is between 16 and 30 times, that is, 8 to 15 times using the same technology (cf. table 2). For any combination of the design parameters, the latency is always equal to P clock cycles, where P denotes the parallelisation level. It can be noticed that, for a given maximal polynomial divisor degree, the area consumption (number of logic cells) is almost proportional to the parallelisation level of the architecture. Furthermore, the results show that a large increase of the parallelisation level can be achieved with a moderate decrease of the maximal clock frequency.
The critical path is due to the M matrix. The complexity of this matrix depends on the chosen polynomial (number and position of the non-zero terms in the polynomial). It also depends on the parallelisation level, but not linearly. Actually, a higher parallelisation level can lead to a less dense matrix.
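The non-linear dependence of the matrix density on the parallelisation level can be illustrated with a small sketch. It assumes that M is the P-th power over GF(2) of the one-bit step matrix T (row-vector convention, multiplication by X followed by reduction modulo G(X)), which is one reading of the text rather than a definition given by the paper. The helper names `companion`, `power_gf2` and `ones` are hypothetical, and the small divisor G(X) = 1 + X + X^3 is used only as an example.

```python
def companion(g_low):
    """Step matrix T for row-vector updates r -> r*T (multiply by X, reduce by G)."""
    n = len(g_low)
    t = [[0] * n for _ in range(n)]
    for i in range(n - 1):
        t[i][i + 1] = 1              # coefficient R_i moves to R_{i+1}
    t[n - 1] = list(g_low)           # X^n is replaced by G0 + G1*X + ...
    return t


def matmul_gf2(a, b):
    """Matrix product over GF(2): AND for products, parity for sums."""
    n = len(a)
    return [[sum(a[i][k] & b[k][j] for k in range(n)) & 1 for j in range(n)]
            for i in range(n)]


def power_gf2(t, p):
    """T raised to the power p over GF(2) (assumed to play the role of M)."""
    n = len(t)
    m = [[int(i == j) for j in range(n)] for i in range(n)]
    for _ in range(p):
        m = matmul_gf2(m, t)
    return m


def ones(m):
    """Number of non-zero entries, a crude proxy for the XOR logic needed."""
    return sum(map(sum, m))


if __name__ == "__main__":
    T = companion((1, 1, 0))         # G(X) = 1 + X + X^3
    for p in (1, 2, 4, 7, 8):
        print(p, ones(power_gf2(T, p)))
```

For this toy divisor the count of non-zero entries rises to 7 at P = 4 and falls back to 3 at P = 7, where the matrix is the identity. The polynomials of table 1 would give different figures, but the same kind of non-monotonic behaviour is what the remark on matrix density refers to.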