1. Introduction
1.1 Orthogonal Frequency Division Multiplexing
Orthogonal frequency division multiplexing (OFDM) is widely preferred for high data rate applications because of its improved immunity to channel impairments such as intersymbol interference (ISI) and fading. OFDM is used as the downlink technique in fourth-generation mobile systems and in many other broadband applications such as HDSL, DVB, and DAB. With the increasing demand for high data rates, the concept of multiple-input multiple-output (MIMO) has also evolved. Many techniques have been proposed around this concept, such as vector permutation techniques and linear pre-coding techniques [1]. One of the important advantages of an OFDM transceiver system is its robustness against frequency-selective fading. Rahaman et al. [2] discussed multiple-antenna schemes over a fading channel, as well as the use of OFDM in relay and transmitting links to overcome or reduce these fading effects.
1.2 Fast Fourier Transform
The fast Fourier transform (FFT) and inverse FFT (IFFT) blocks are used for modulation and demodulation in an OFDM system. In this paper, we implement the FFT block in Verilog and compare its outputs with those of the same block designed in MATLAB, both to verify the design and to check the efficiency of the algorithms used.
The discrete Fourier transform (DFT) is computationally expensive, with a time complexity of $$O\left(n^{2}\right)$$, and implementing it directly in hardware is a complex task. The FFT has a wide range of applications in the telecom domain, so engineers have put great effort into achieving real-time hardware implementations. Several algorithms have been developed in recent years to implement the FFT efficiently, including the split-radix, twisted-radix, parallel FFT, mixed-radix, and vector-radix algorithms.
Cooley and Tukey proposed an FFT algorithm that divides the DFT into smaller DFTs combined through multiplication by twiddle factors. Dividing the DFT into smaller DFTs reduces the time complexity and thus increases efficiency. The general equation for the DFT is given by:
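In its standard textbook form, the DFT of an N-point sequence can be written as shown here. The symbols x(n), X(k), and W_N are assumptions, chosen to agree with the twiddle-factor notation used later in the paper; the paper's own numbered equation may be typeset differently.

$$X(k)=\sum_{n=0}^{N-1} x(n)\, W_{N}^{k n}, \quad k=0,1, \ldots, N-1, \qquad W_{N}=e^{-j 2 \pi / N}$$

Each output X(k) is a sum over all N inputs, which is where the quadratic cost of direct evaluation comes from.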
In a radix-2 implementation, the FFT problem is decimated by a factor of two in time. Many approaches have been proposed to implement FFT/IFFT efficiently, with the main aims of reducing computation time and the number of calculations while increasing accuracy. In a paper published in 2015 [3], a pure VHDL design was implemented and integrated with IP blocks using XILINX ISE 14.2. The modulator used 16-QAM, in which different bit combinations were converted into the corresponding constellation values. In the demodulator, care was taken that values such as 2.999 are interpreted as 3 rather than 2. Fatima [4] designed a 32-point radix-2 FFT using fewer slices and computations.
In 2010, Merlyn [5] implemented a prototype variable-length FFT/IFFT processor that covers the different specifications of OFDM applications. The work proposed a programmable MDC (mixed-radix FFT processor). The binary word was extended in the butterfly structure to avoid the danger of overflow.
In 2014, a paper on the design of a high-speed OFDM transmitter and receiver [6] analyzed 8-point IFFT/FFT with the radix-2 algorithm. These FFT and IFFT blocks are pass modules, i.e., from Path0 to Path7. Complexity increases as complex numbers enter these paths.
Mehta [7] implemented 512-point IFFT and 512-point FFT blocks using radix-2. In that work, the number of multipliers is reduced using an adder circuit and inter-carrier interference (ICI) self-cancellation techniques, which can improve the performance parameters.
In a paper published in 2013, Palekar and Ingole [8] proposed an efficient multiplication technique to reduce the partial products that arise in conventional multiplication in FFT/IFFT. 32-point FFT and IFFT blocks for OFDM using Vedic multiplication were coded in VHDL and synthesized using XILINX ISE software. The speed performance of this design easily satisfies most application requirements. In 2015, Sharma and Singh [9] carried out the same work with better resource utilization.
To study the effect of various design parameters on system performance, an OFDM system was simulated in MATLAB [10]. It was observed that, for a fixed signal-to-noise ratio (SNR), the optimum FFT/IFFT length was 1024 points, and that the best fixed SNR was 60 dB; beyond this value, varying the SNR had no further effect.
A significant method to improve the radix-2 and radix-4 algorithms was proposed in 2004: the output samples of the traditional algorithms are re-indexed, which saves effort when evaluating twiddle factors or accessing the lookup table [11]. Later, Rao et al. [12] used the same strategy efficiently to calculate a 32-point decimation-in-time (DIT) FFT. Mahajan and Chitode [13] implemented a novel 16-point radix-2 DIT FFT based on an 8-point FFT. Ramachandran and Vanmathi [14] implemented a 32-bit radix-2 DIT design on a Spartan 3E FPGA using fewer computations, as the outputs of shorter FFTs were reused when computing the final output.
Direct computation requires almost 4N real multiplications and 4N real additions for every value of k. It is computed by implementing:
where
This formula was implemented directly, which shows that the algorithm is slow and requires many computations; its time complexity is $$O\left(n^{2}\right)$$. Results for a 4-point FFT are shown in Fig. 1, where ‘o_re’ and ‘o_im’ are the real and imaginary outputs of the FFT function, respectively.
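The cost of direct summation can be illustrated with a minimal Python sketch. This is not the authors' Verilog; the function name `direct_dft` and the multiplication counter are hypothetical, and only the output names `o_re` and `o_im` are borrowed from Fig. 1. The sketch evaluates the standard DFT sum with separate real and imaginary arithmetic and counts the real multiplications it performs.

```python
import math


def direct_dft(x_re, x_im):
    """Evaluate an N-point DFT by direct summation using real arithmetic.

    Returns (out_re, out_im, real_mults), where real_mults is the number of
    real multiplications performed on the data samples.
    """
    n_points = len(x_re)
    out_re = [0.0] * n_points
    out_im = [0.0] * n_points
    real_mults = 0
    for k in range(n_points):
        acc_re = 0.0
        acc_im = 0.0
        for n in range(n_points):
            angle = 2.0 * math.pi * k * n / n_points
            c, s = math.cos(angle), math.sin(angle)
            # x(n) * exp(-j*angle): one complex product costs
            # four real multiplications.
            acc_re += x_re[n] * c + x_im[n] * s
            acc_im += x_im[n] * c - x_re[n] * s
            real_mults += 4
        out_re[k] = acc_re
        out_im[k] = acc_im
    return out_re, out_im, real_mults


# 4-point example with a purely real input (values chosen for illustration).
o_re, o_im, mults = direct_dft([1.0, 2.0, 3.0, 4.0], [0.0, 0.0, 0.0, 0.0])
print(o_re, o_im, mults)
```

For N = 4 the counter reaches 4N = 16 real multiplications per output and 64 in total, so the work grows with the square of the transform length.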
Fig. 1. Simulation result of 4-point FFT on ModelSim (Wave Window).
Table 1. Comparison of calculations between direct summation method and radix-2 method for 16-point FFT
The FFT was then implemented using the radix-2 algorithm, which has better efficiency and time complexity. The comparison is shown in Table 1.
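The scale of the difference can be estimated with the textbook operation counts for the two methods. The sketch below is an illustration under stated assumptions: it counts complex operations, the function names are hypothetical, and the figures in Table 1 may have been counted differently (for example, in real operations).

```python
def direct_counts(n_points):
    """Complex operation counts for direct evaluation of an N-point DFT."""
    return {
        "complex_mults": n_points * n_points,
        "complex_adds": n_points * (n_points - 1),
    }


def radix2_counts(n_points):
    """Complex operation counts for a radix-2 FFT; N must be a power of two."""
    if n_points < 2 or n_points & (n_points - 1):
        raise ValueError("radix-2 needs a power-of-two length of at least 2")
    stages = n_points.bit_length() - 1  # log2(N)
    return {
        "complex_mults": (n_points // 2) * stages,
        "complex_adds": n_points * stages,
    }


print(direct_counts(16))
print(radix2_counts(16))
```

Under these counts a 16-point transform needs 256 complex multiplications by direct summation and 32 with radix-2.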
1.3 ADC/DAC
The ADC and DAC are essential blocks in OFDM, used in the receiver and transmitter, respectively. In the transmitter, the IFFT can be applied only to digital data, and because the transmission medium is air, the digital output of the IFFT block must be converted to an analog signal by the DAC before it is transmitted. Similarly, the ADC block is used before the FFT block in the receiver, since the FFT also operates only on digital data.
Many designs have been modeled and simulated, such as generic ADC and DAC, flash ADC, successive approximation register (SAR) ADC, and pipelined ADC. Suarez [15], in “Behavioral Modeling of Data Converters using Verilog”, stated that these ADCs are modeled at the transistor level, which is useful up to 16 bits; for higher resolutions the model becomes so complex that the design is impractical. In a generic ADC, mismatches between ADC units are removed using dynamic element matching (DEM).
A flash ADC is designed using high-speed comparators in a cascade fashion. It is practical up to 8-bit resolution, because at higher resolutions the design becomes complex and consumes more power, but it is ideal for devices requiring large bandwidth.
An N-bit SAR ADC requires N comparison periods, as it starts the next conversion only after the previous one is complete. The pipelined ADC is the most popular ADC architecture, offering fewer bits of resolution at faster sample rates and more bits at lower rates. Ashraf [16] in 2016 designed an 8-bit SAR ADC with an input voltage of 1.2 V. The SAR control logic was designed asynchronously in Verilog. The design was implemented in Cadence Virtuoso using 1.8-nm technology. A low-power comparator was designed, resulting in an overall power consumption of 0.75 mW.
A 10-bit DAC with (6+4)-bit segmentation was designed in 2006 by Murali Shanmugasundaram and Shanthi Pavan [17]. Simulations were run on the full transistor-level schematic as well as on the macro design over a range of frequencies. The design uses current-controlled current sources that are time-varying in nature, which reduces the simulation time. It was implemented in a 0.35-µm CMOS process. In 2015, Santhanalakshmi and Yashoda [18] also designed a low-power SAR ADC using Verilog-A and showed that the bypass window technique is more efficient, with fewer transitions, i.e., 3 over 4.
Menzter and Wey [19] developed a mixed-signal model of a 10-bit pipelined ADC using Verilog. The overall aim of the project was to identify design flaws based on input/output tests. An FIR filter was implemented using an onboard ADC-DAC and an FPGA.
To optimize the design and computations, it was also modeled using MATLAB and Simulink [20]. Although transistor-level modeling is the most appropriate approach for mixed-signal circuits, alternative modeling is required to avoid its complexity and long computation time.
The ADC produces a ratiometric value: the analog-to-digital conversion depends entirely on the system voltage. Here, the maximum voltage is assumed to be 10 V, which is converted to 1023; any voltage below 10 V is converted to the corresponding ratiometric value after quantization.
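As a worked illustration of this ratiometric mapping, the conversion can be modeled as follows. The symbols V_in, V_FS, and D are introduced here for illustration only, and rounding to the nearest code is an assumption; a truncating quantizer would differ by at most one code.

$$D=\operatorname{round}\left(\frac{V_{\mathrm{in}}}{V_{\mathrm{FS}}} \times 1023\right), \qquad V_{\mathrm{FS}}=10 \mathrm{~V}$$

For example, V_in = 10 V gives D = 1023, and V_in = 4 V gives 409.2, which quantizes to D = 409.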
2. Proposed Work
2.1 Fast Fourier Transform
Fig. 2 illustrates the radix-2 butterfly structure for a 16-point FFT. This basic butterfly structure is used throughout the design of the 16-point FFT.
Fig. 2. 16-point FFT using radix-2.
The radix-2 FFT is computed by splitting Eq. (2) into its even-indexed and odd-indexed terms:
But $$W_{M}^{2}=W_{M / 2}$$. Using this substitution, the above equation becomes:
Where
and
and
This implies that F1(k) and F2(k) are periodic with period M/2.
Hence,
Eqs. (14) and (15) are used to compute the radix-2 FFT in our program.
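The steps of this derivation can be written out in the standard radix-2 decimation-in-time form. The transform length M, the summation index m, and the sequence names f_1(m) = x(2m) and f_2(m) = x(2m+1) are assumptions chosen to match the W_M, F1(k), and F2(k) notation of the text; the paper's own equation numbering is not reproduced.

$$X(k)=\sum_{m=0}^{M / 2-1} x(2 m)\, W_{M}^{2 m k}+\sum_{m=0}^{M / 2-1} x(2 m+1)\, W_{M}^{(2 m+1) k}$$

$$X(k)=\sum_{m=0}^{M / 2-1} f_{1}(m)\, W_{M / 2}^{m k}+W_{M}^{k} \sum_{m=0}^{M / 2-1} f_{2}(m)\, W_{M / 2}^{m k}=F_{1}(k)+W_{M}^{k} F_{2}(k)$$

Because F_1(k) and F_2(k) repeat with period M/2 and $$W_{M}^{k+M / 2}=-W_{M}^{k}$$, the two halves of the output follow from the same pair of half-length transforms:

$$X(k)=F_{1}(k)+W_{M}^{k} F_{2}(k), \qquad X\left(k+\frac{M}{2}\right)=F_{1}(k)-W_{M}^{k} F_{2}(k), \qquad k=0,1, \ldots, \frac{M}{2}-1$$

Each output pair therefore shares a single complex multiplication, which is the radix-2 butterfly.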
Since the input values are complex, two separate arrays are used: one for the real part and one for the imaginary part.
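A minimal Python sketch of a radix-2 decimation-in-time FFT that keeps the real and imaginary parts in separate arrays is given below. It is an illustration, not the authors' Verilog: the function names are hypothetical, twiddle factors are taken from the math library instead of CORDIC, and the stage ordering of the actual design may differ.

```python
import math


def bit_reverse(index, bits):
    """Reverse the lowest `bits` bits of `index`."""
    result = 0
    for _ in range(bits):
        result = (result << 1) | (index & 1)
        index >>= 1
    return result


def fft_radix2(x_re, x_im):
    """Radix-2 DIT FFT on separate real and imaginary arrays."""
    n = len(x_re)
    if n == 0 or n & (n - 1) or len(x_im) != n:
        raise ValueError("length must be a power of two, same for both arrays")
    bits = n.bit_length() - 1
    # Reorder the inputs so every stage can work in place.
    re = [0.0] * n
    im = [0.0] * n
    for i in range(n):
        j = bit_reverse(i, bits)
        re[j] = x_re[i]
        im[j] = x_im[i]
    half = 1
    while half < n:
        span = 2 * half
        for start in range(0, n, span):
            for k in range(half):
                angle = -2.0 * math.pi * k / span
                w_re, w_im = math.cos(angle), math.sin(angle)
                top = start + k
                bot = top + half
                # Butterfly: one complex multiply, one add, one subtract.
                t_re = w_re * re[bot] - w_im * im[bot]
                t_im = w_re * im[bot] + w_im * re[bot]
                re[bot] = re[top] - t_re
                im[bot] = im[top] - t_im
                re[top] += t_re
                im[top] += t_im
        half = span
    return re, im


y_re, y_im = fft_radix2([1.0, 2.0, 3.0, 4.0], [0.0] * 4)
print(y_re, y_im)
```

Working on two real arrays mirrors how a hardware description language without a native complex type has to store the data.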
The twiddle factors used in the FFT/IFFT blocks are generated using the CORDIC algorithm. For each butterfly calculation, the required angle is passed to the CORDIC algorithm, which returns the sine and cosine of that angle, except in the last stage. In the last stage, the twiddle factor is always 1, so the values are stored directly, i.e., the sine of the angle is 0 and the cosine of the angle is 1.
Storing the value 1 in a variable makes the last-stage computation straightforward and reduces the time by a factor of four compared with using the CORDIC algorithm in the fourth (last) stage of the 16-point FFT. Thus, in the last stage, both the complexity and the time taken are reduced.
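The idea can be sketched in Python with a rotation-mode CORDIC that returns the sine and cosine of a twiddle angle and bypasses the iterations when the angle is zero. This is a floating-point illustration under stated assumptions: the iteration count, the names `ITERATIONS`, `ATAN_TABLE`, `GAIN`, and `cordic_sin_cos`, and the quadrant handling are hypothetical, and a hardware version would normally use fixed-point shifts and adds.

```python
import math

ITERATIONS = 16  # assumed precision; not taken from the paper
ATAN_TABLE = [math.atan(2.0 ** -i) for i in range(ITERATIONS)]

# Pre-scale the start vector so the CORDIC gain cancels out.
GAIN = 1.0
for i in range(ITERATIONS):
    GAIN /= math.sqrt(1.0 + 2.0 ** (-2 * i))


def cordic_sin_cos(angle):
    """Return (sin, cos) of `angle` in radians, for angle in [-pi, pi]."""
    if angle == 0.0:
        # Unit twiddle factor: store the constants directly, no iterations.
        return 0.0, 1.0
    sign = 1.0
    # Fold the angle into [-pi/2, pi/2], where rotation mode converges.
    if angle > math.pi / 2:
        angle -= math.pi
        sign = -1.0
    elif angle < -math.pi / 2:
        angle += math.pi
        sign = -1.0
    x, y, z = GAIN, 0.0, angle
    for i in range(ITERATIONS):
        d = 1.0 if z >= 0.0 else -1.0
        shift = 2.0 ** -i
        x, y = x - d * y * shift, y + d * x * shift
        z -= d * ATAN_TABLE[i]
    return sign * y, sign * x


# Twiddle angles of a 16-point FFT: -2*pi*k/16 for k = 0..7.
for k in range(8):
    print(k, cordic_sin_cos(-2.0 * math.pi * k / 16))
```

The early return for a zero angle corresponds to storing sine 0 and cosine 1 directly, which skips every CORDIC iteration for that butterfly.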
The radix-2 code written in Verilog is generic, i.e., when the number of data points and the data values are varied, the design computes the FFT accordingly, so it can be used to calculate an FFT with any number of points. The output is shown in Figs. 3–7. The final output is available at 4015 ns; extra delays were added only to improve visibility in Fig. 7.
Fig. 3. Simulation result of 16-point FFT on ModelSim (Wave Window) [0 ns–1200 ns].
Fig. 4. Simulation result of 16-point FFT on ModelSim (Wave Window) [1200 ns–2300 ns].
Fig. 5. Simulation result of 16-point FFT on ModelSim (Wave Window) [2300 ns–3500 ns].
Fig. 6. Simulation result of 16-point FFT on ModelSim (Wave Window) [3500 ns–4000 ns].
Fig. 7. Simulation result of 16-point FFT on ModelSim (Wave Window - last stage) [4000 ns–4150 ns].
Figs. 3–7 show the simulation of the FFT for a 16-point input using the radix-2 algorithm. The parameters a_re, a_im, b_re, and b_im are the outputs of the radix-2 butterfly structure shown in Fig. 8.
The output of the proposed design is compared with a MATLAB simulation of the same algorithm, shown in Fig. 9, where ‘x’ is the input variable and ‘y’ is the output variable. The MATLAB simulation is used to verify the results.
Fig. 9. Simulation result of 16-point FFT on MATLAB (x is input and y is output).
The results of the Verilog design simulated on ModelSim are almost the same as those obtained in MATLAB. The 16-point FFT results from Verilog and MATLAB are compared in Table 2.
Table 2. Comparison of outputs from Verilog with MATLAB for 16-point FFT
2.2 ADC and DAC
The input is an analog signal that takes values from +10 V to -10 V, which are converted to digital values from +1023 to -1023. Any value greater than 10 V is limited to the maximum value, i.e., +1023. Similarly, any value lower than -10 V is limited to -1023. The signal is modulated using a carrier signal and quantized at a few intervals. The reverse procedure is used to implement the DAC. The simulation results are shown in Figs. 10 and 11. The simulation covers the conversion of analog values to digital values and the reconstruction of the analog values from the digital data.
Fig. 10. Simulation result of ADC and DAC on ModelSim (Transcript Window).
Fig. 11. Simulation result of ADC and DAC on ModelSim (Wave Window).
In Fig. 10, the transcript shows the input analog signal (‘analog value’), its converted digital signal (‘digital data’), and the analog signal reconstructed from the digital data (‘reconstructed value’). Fig. 11 shows the wave simulation, where ‘c’ is the input analog signal, ‘result_r’ is the converted digital signal, and ‘result’ is the reconstructed analog signal.
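The clipping and scaling behavior of this ADC and DAC pair can be sketched behaviorally in Python. The sketch is an assumption-laden illustration rather than the simulated Verilog model: the names `adc`, `dac`, `FULL_SCALE_VOLTS`, and `FULL_SCALE_CODE` are hypothetical, rounding to the nearest code is assumed, and the carrier modulation step is left out.

```python
FULL_SCALE_VOLTS = 10.0  # input range is -10 V to +10 V
FULL_SCALE_CODE = 1023   # output range is -1023 to +1023


def adc(voltage):
    """Clip the input to the full-scale range, then quantize to a code."""
    clipped = max(-FULL_SCALE_VOLTS, min(FULL_SCALE_VOLTS, voltage))
    # Rounding to the nearest code is an assumption of this sketch.
    return int(round(clipped / FULL_SCALE_VOLTS * FULL_SCALE_CODE))


def dac(code):
    """Reverse procedure: map a digital code back to a voltage."""
    clipped = max(-FULL_SCALE_CODE, min(FULL_SCALE_CODE, code))
    return clipped * FULL_SCALE_VOLTS / FULL_SCALE_CODE


for volts in (12.5, 10.0, 3.3, 0.0, -15.0):
    code = adc(volts)
    print(volts, code, dac(code))
```

Inputs outside the range saturate at the end codes, and any in-range value is reconstructed to within half a quantization step, about 4.9 mV for this scaling.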
3. Conclusion and Future Scope
OFDM is an important technology that supports the latest wireless communication standards; it is used in LTE, WiMAX, WLAN, etc. to achieve high data rates. The main modules of an OFDM system, i.e., the FFT and IFFT and the A/D and D/A converters, have been simulated successfully, with the FFT using the radix-2 algorithm to reduce the mathematical complexity.
This design works better for large N (in an N-point FFT) than radix-M (M>2) designs because each radix-2 butterfly requires only one multiplication, one addition, and one subtraction, whereas the other radices require more multiplications and thus lead to a more complex design. Also, the last stage of the design is much more time efficient than in existing designs, since the time taken in that stage has been optimized.
For the A/D and D/A systems, we implemented the architecture using common algorithms based on simple mathematics, which reduces complexity. The proposed FFT design is time efficient and uses resources in an optimized way.
The demand for high data rates in wireless communication has increased sharply in recent years. In this paper, we reviewed the traditional approaches in terms of their advantages and disadvantages and attempted to improve on them. Future networks will continue to rely on OFDM/OFDMA to satisfy this need.