Arithmetic Core And Digital Electronics
IEEE Arithmetic Core And Digital Electronics projects for M.Tech, B.Tech, BE, MS, MCA, and BCA students. Asictron Control System, Chennai, offers project training in IEEE 2018 / 2017 / 2016 Arithmetic Core And Digital Electronics.
-
Design and Implementation of Area-Efficient and Low-Power Configurable Booth-Multiplier
This paper presents an area-efficient, low-power architecture for a configurable Booth multiplier. It is synthesized and post-layout simulated in a 90 nm CMOS process, where it occupies 9511 μm² and consumes 1.73 mW at 167 MHz. The proposed architecture requires 43.12% less area and 75.65% less power than the state-of-the-art work.
-
VLSI Design of High Speed Vedic Multiplier for FPGA Implementation
In the modern world of digitization, real-time data processing demands higher system operating speeds. This processing frequently relies on multiplication, which is time consuming and introduces considerable delay, so there is a need to reduce this delay and achieve faster real-time processing. This paper proposes a novel architecture for signed multiplication using the Vedic algorithm. An 11×8-bit multiplier was designed with the proposed architecture and implemented using Xilinx ISE Design Suite 13.2 with a Spartan 3E as the target FPGA. The maximum clock speed achieved was 203.938 MHz.
-
ASIC Design of Signed and Unsigned Multipliers Using Compressors
Arithmetic operations are becoming a bigger concern in digital systems for applications such as the ALU (Arithmetic and Logic Unit) and DSP (Digital Signal Processing). Our work focuses on novel 4-2 and 5-2 compressors (CM) applied in multiplication architectures such as the unsigned Wallace tree multiplier, the Vedic multiplier using the Urdhva Tiryagbhyam sutra, the signed Baugh-Wooley Wallace tree multiplier, and the signed Booth multiplier with radix 2 and radix 4. The proposed compressor architectures show better results than the existing compressors. The ASIC design was implemented using a standard-cell 180 nm CMOS technology, and the Verilog HDL code was tested in the Xilinx tool with the ISE Simulator (ISim).
-
Area Efficient Modified Vedic Multiplier
This paper describes the design of a high-speed Vedic multiplier that uses techniques from Vedic mathematics, which is based on 16 sutras (algorithms), to improve performance. It presents the efficiency of the Urdhva Tiryagbhyam (vertical and crosswise) Vedic method of multiplication, which differs from the normal multiplication process. According to the paper, Urdhva Tiryagbhyam is the most efficient algorithm, giving minimum multiplication delay for all types of numbers irrespective of their size. The Vedic multiplier is coded in Verilog HDL, then simulated and synthesized using Xilinx software 12.2 on a Spartan 3E kit. The proposed multiplier is further compared with an array multiplier in terms of delay, memory, and power consumption.
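The vertical-and-crosswise idea can be illustrated with a small software model. The sketch below is not the paper's Verilog design; it is a bit-level Python illustration, and the function name `urdhva_multiply` and its `width` parameter are hypothetical. For each output column it sums the crosswise partial products whose bit positions add up to that column, then passes the carry to the next column.

```python
def urdhva_multiply(a, b, width):
    """Multiply two unsigned `width`-bit integers column by column
    (vertical and crosswise). Illustrative model only."""
    a_bits = [(a >> i) & 1 for i in range(width)]
    b_bits = [(b >> i) & 1 for i in range(width)]
    result = 0
    carry = 0
    for col in range(2 * width - 1):
        # Crosswise step: add every a[i] & b[j] with i + j == col.
        total = carry
        for i in range(width):
            j = col - i
            if 0 <= j < width:
                total += a_bits[i] & b_bits[j]
        result |= (total & 1) << col   # column sum bit
        carry = total >> 1             # carry into the next column
    # Whatever carry remains forms the most significant bit(s).
    result |= carry << (2 * width - 1)
    return result
```

In hardware, all column sums can be formed in parallel, which is where the claimed delay advantage comes from; the sequential loop here only mirrors the arithmetic.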
-
Run-time reconfigurable multi-precision floating point multiplier design for high speed, low-power applications
Floating point multiplication is one of the crucial operations in many application domains, such as image processing and signal processing. Every application has different requirements, however: some need high precision, others low power consumption or low latency. The IEEE-754 format is not flexible enough for these specifications, and its design is complex. Optimal run-time reconfigurable hardware implementations may need custom floating-point formats that do not necessarily follow the IEEE-specified sizes. In this paper, we present a run-time reconfigurable floating point multiplier implemented on an FPGA with a custom floating point format for different applications. The multiplier has 6 modes of operation, selected according to the accuracy or application requirement. Using an optimal design with custom IPs (Intellectual Properties), a better implementation is obtained by truncating the inputs before multiplication. A combination of the Karatsuba algorithm and the Urdhva-Tiryagbhyam algorithm (Vedic mathematics) is used to implement the unsigned binary multiplier, which further increases the efficiency of the multiplier.
-
Low Power Wallace Tree Multiplier Using Modified Full Adder
Achieving high-speed integrated circuits with low power consumption is a major concern for VLSI circuit designers. Most arithmetic operations use a multiplier, which is the major power-consuming element in digital circuits. In hardware, multiplication is basically realized as shift and add operations, so optimizing the adder improves the performance of the multiplier. In this paper, a modified full adder using a multiplexer is proposed to achieve low power consumption in the multiplier. To analyze the efficiency of the proposed design, the conventional Wallace tree multiplier structure is used. The designs are developed in Verilog HDL, and their functionality is verified through simulation using Quartus II. The designs are synthesized in Synopsys Design Compiler using SAED 90 nm CMOS technology. The ASIC synthesis results of the proposed multiplier show an average reduction of 37.45% in power consumption, 45.75% in area, and 17.65% in delay compared to the existing approaches.
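The abstract does not give the gate-level circuit of the modified full adder, so the following is only a sketch of one common multiplexer-based formulation, not the paper's design. The helper names `mux2` and `full_adder_mux` are hypothetical. The propagate signal `a XOR b` drives two 2:1 multiplexers: one selects the sum, the other selects the carry.

```python
def mux2(sel, d0, d1):
    """2:1 multiplexer: returns d1 when sel is 1, otherwise d0."""
    return d1 if sel else d0

def full_adder_mux(a, b, cin):
    """One-bit full adder built from an XOR and two multiplexers.
    Assumed formulation for illustration only."""
    p = a ^ b                      # propagate signal
    s = mux2(p, cin, cin ^ 1)      # sum = p XOR cin
    cout = mux2(p, a, cin)         # carry = cin if p else a (a == b here)
    return s, cout
```

When `p` is 0 the inputs `a` and `b` are equal, so the carry is simply `a`; when `p` is 1 exactly one input is set, so the carry follows `cin`.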
-
An efficient floating point multiplier design for high speed applications using Karatsuba algorithm and Urdhva-Tiryagbhyam algorithm.
Floating point multiplication is a crucial operation in high power computing applications such as image processing and signal processing. Multiplication is also the most time- and power-consuming operation. This paper proposes an efficient method for IEEE 754 floating point multiplication that gives a better implementation in terms of delay and power. A combination of the Karatsuba algorithm and the Urdhva-Tiryagbhyam algorithm (Vedic mathematics) is used to implement the unsigned binary multiplier for mantissa multiplication. The multiplier is implemented in Verilog HDL and targeted at Spartan-3E and Virtex-4 FPGAs.
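As a rough illustration of the Karatsuba part of such a mantissa multiplier, the sketch below splits each operand into halves and replaces four sub-multiplications with three. It is a software model under stated assumptions, not the paper's hardware: the name `karatsuba` and the `base_width` threshold are hypothetical, and the plain `x * y` in the base case merely stands in for the small Urdhva-Tiryagbhyam multiplier the abstract mentions.

```python
def karatsuba(x, y, width, base_width=4):
    """Multiply unsigned integers x and y (each at most `width` bits)
    using the Karatsuba decomposition. Illustrative model only."""
    if width <= base_width:
        # Stand-in for a small base multiplier (assumption).
        return x * y
    half = width // 2
    hi_width = width - half
    mask = (1 << half) - 1
    x_lo, x_hi = x & mask, x >> half
    y_lo, y_hi = y & mask, y >> half
    z0 = karatsuba(x_lo, y_lo, half, base_width)       # low * low
    z2 = karatsuba(x_hi, y_hi, hi_width, base_width)   # high * high
    # One extra multiplication recovers both cross terms.
    z1 = karatsuba(x_lo + x_hi, y_lo + y_hi, hi_width + 1, base_width) - z0 - z2
    return (z2 << (2 * half)) + (z1 << half) + z0
```

For example, `karatsuba(0xABCDEF, 0x123456, 24)` multiplies two 24-bit values, the mantissa width (including the hidden bit) of a single-precision operand.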
-
A FPGA IEEE-754-2008 DECIMAL64 FLOATING-POINT ADDER/SUBTRACTOR
Related Courses: B.Tech, M.Tech, MS
This paper describes the FPGA implementation of a Decimal Floating Point (DFP) adder/subtractor. The design performs addition and subtraction on 64-bit operands that use the IEEE 754-2008 decimal encoding of DFP numbers and is based on a fully pipelined circuit. The design presents novel hardware for the pre-signal generation stage and an enhanced version of a previously published leading-zero stage. It can operate at a frequency of 200 MHz on a Virtex-5 with a latency of 8 cycles. The presented DFP adder/subtractor supports operations on the decimal64 format and is easily extendable to the decimal128 format. To our knowledge, this is the first hardware FPGA design for adding and subtracting IEEE 754-2008 numbers using the decimal64 encoding.
-
An Efficient Implementation of Floating Point Multiplier
Related Courses: M.Tech, MS, MCA
In this paper we describe an efficient implementation of an IEEE 754 single precision floating point multiplier targeted for the Xilinx Virtex-5 FPGA. VHDL is used to implement a technology-independent pipelined design. The multiplier implementation handles the overflow and underflow cases. Rounding is not implemented, to give more precision when using the multiplier in a Multiply and Accumulate (MAC) unit. With a latency of three clock cycles, the design achieves 301 MFLOPs. The multiplier was verified against the Xilinx floating point multiplier core.
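The data path of a single precision multiplier without rounding can be sketched in a few lines. This is a behavioural Python model, not the paper's VHDL; the names `f32_to_bits`, `bits_to_f32`, and `fp32_mul_trunc` are hypothetical. It assumes that zero and subnormal inputs are treated as zero, that underflow is flushed to zero, that overflow returns infinity, and that NaN and infinity inputs are not handled.

```python
import struct

def f32_to_bits(x):
    """Return the IEEE 754 single precision bit pattern of x."""
    return struct.unpack(">I", struct.pack(">f", x))[0]

def bits_to_f32(bits):
    """Return the float encoded by a 32-bit IEEE 754 pattern."""
    return struct.unpack(">f", struct.pack(">I", bits))[0]

def fp32_mul_trunc(a_bits, b_bits):
    """Multiply two single precision bit patterns, truncating the
    mantissa instead of rounding. Simplified model (see assumptions)."""
    sign = ((a_bits >> 31) ^ (b_bits >> 31)) & 1
    exp_a = (a_bits >> 23) & 0xFF
    exp_b = (b_bits >> 23) & 0xFF
    if exp_a == 0 or exp_b == 0:
        return sign << 31                      # zero/subnormal input -> zero
    man_a = (a_bits & 0x7FFFFF) | 0x800000     # restore the hidden 1
    man_b = (b_bits & 0x7FFFFF) | 0x800000
    product = man_a * man_b                    # 48-bit product
    exponent = exp_a + exp_b - 127             # remove one bias
    if product >> 47:                          # product in [2, 4): normalize
        mantissa = product >> 24
        exponent += 1
    else:                                      # product in [1, 2)
        mantissa = product >> 23
    if exponent >= 255:
        return (sign << 31) | (0xFF << 23)     # overflow -> infinity
    if exponent <= 0:
        return sign << 31                      # underflow -> zero
    return (sign << 31) | (exponent << 23) | (mantissa & 0x7FFFFF)
```

A call such as `bits_to_f32(fp32_mul_trunc(f32_to_bits(1.5), f32_to_bits(2.0)))` exercises the path where no normalization shift is needed.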
-
High Speed 16-bit Digital Vedic Multiplier using FPGA
In the present paper our objective is to emphasize the importance of Vedic mathematics for digital applications. Ancient Vedic mathematics not only facilitates complex mathematical operations but is also useful for logical applications. In the present work we use the concept of Urdhva Tiryagbhyam, i.e., vertical and crosswise multiplication, and its implementation for 16-bit multiplication. This technique reduces the number of calculation steps and therefore the delay of a digital circuit. We implemented the design with the front-end language Verilog. Results obtained from simulation and synthesis, verified on a Spartan 3E FPGA using Xilinx ISE Suite, are discussed in detail. Compared with the most frequently used multipliers in digital circuits, the results show a 38% reduction in device utilization and a 62% reduction in delay.
-
Design and implementation of fast floating point multiplier unit
Floating point numbers are quantities that cannot be represented by integers, either because they contain fractional values or because they lie outside the range representable within the system's bit width. Multiplication of two floating point numbers is very important for processors. This project uses an architecture for a fast floating point multiplier compliant with the single precision IEEE 754-2008 standard. Compared to fixed point, the floating point representation can preserve resolution and accuracy. Pipelining is a technique in which multiple instructions are overlapped in execution; performing multiple operations at the same time increases the instruction throughput. Multipliers are key components in several high-performance computing systems, such as digital signal processors, FIR filters, and microprocessors. The main aim of the design is to make the multiplier faster by decreasing delay. The delay caused by carry propagation is reduced by using adders with the smallest power-delay constant.
-
FPGA Implementation of Vedic Floating Point Multiplier
Most scientific operations involve floating point computations. It is necessary to implement faster multipliers that occupy less area and consume less power. Multipliers play a critical role in any digital design. Even though various multiplication algorithms have been in use, the performance of Vedic multipliers has not drawn wider attention. Vedic mathematics involves the application of 16 sutras, or algorithms. One of these, the Urdhva Tiryagbhyam sutra for multiplication, is considered in this work. An IEEE-754 based Vedic multiplier has been developed to carry out both single precision and double precision floating point operations, and its performance has been compared with Booth and Karatsuba based floating point multipliers. These algorithms were implemented on a Xilinx FPGA, and a comparison based on resource utilization and timing performance has also been made.
Keywords: Vedic multiplication, FPGA, floating point
-
An Efficient Baugh-Wooley Architecture for Signed & Unsigned Fast Multiplication
This paper presents an efficient implementation of a high-speed multiplier using the shift and add method of the Baugh-Wooley multiplier. This parallel multiplier uses fewer adders and fewer iterative steps. As a result, it occupies less space than a serial multiplier. This is an important criterion because chip fabrication and high-performance systems require components that are as small as possible. Experimental results demonstrate that the proposed circuit not only improves accuracy but also reduces hardware complexity and power consumption: it requires a dynamic power of 15.3 mW and a maximum clock period of 3.912 ns, which is very efficient compared to the reference paper.
-
An Optimized Design of Reversible Quantum Comparator
Reversible computing has emerged as a promising technology with applications in emerging fields such as quantum computing and optical computing. This paper presents a reversible comparator based on a prefix tree grouping methodology. The proposed design is realized by cascading three stages. The first stage is a 1-bit reversible comparator that generates the ‘greater’ and ‘equal’ signals for each operand bit. These signals are combined using prefix tree grouping logic to generate the final ‘greater’ and ‘equal’ signals. Using these final signals, the ‘lesser’ signal is generated in the third stage. The design is optimized at the quantum level for efficient performance in all the cost metrics. The proposed 64-bit comparator design results in 14.3% reduced quantum delay, 7.8% reduced quantum cost, and 25% fewer garbage outputs when compared with the best existing prefix-based comparator design.
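The three-stage structure described above can be modelled with ordinary Boolean logic. The sketch below is a classical model, not a reversible-gate circuit, and the names `compare_bit`, `combine`, and `compare` are hypothetical. Stage one produces a (greater, equal) pair per bit, stage two merges adjacent pairs in a tree, and stage three derives the lesser signal.

```python
def compare_bit(a, b):
    """Stage 1: per-bit (greater, equal) signals for single bits a, b."""
    return a & (1 ^ b), 1 ^ (a ^ b)

def combine(hi, lo):
    """Stage 2: merge the signals of a more significant group `hi`
    with those of a less significant group `lo`."""
    g_hi, e_hi = hi
    g_lo, e_lo = lo
    return g_hi | (e_hi & g_lo), e_hi & e_lo

def compare(a, b, width):
    """Return (greater, equal, lesser) for unsigned `width`-bit a, b."""
    # Most significant bit first, so list order matches significance.
    signals = [compare_bit((a >> i) & 1, (b >> i) & 1)
               for i in reversed(range(width))]
    while len(signals) > 1:
        merged = [combine(signals[k], signals[k + 1])
                  for k in range(0, len(signals) - 1, 2)]
        if len(signals) % 2:
            merged.append(signals[-1])   # odd group passes through
        signals = merged
    greater, equal = signals[0]
    lesser = 1 ^ (greater | equal)       # Stage 3
    return greater, equal, lesser
```

Because each tree level halves the number of groups, a 64-bit comparison needs six combining levels in this model.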
-
Implementation of Open Core Protocol transaction Verification IP using System Verilog UVM methodology
This paper presents the implementation of a reusable Open Core Protocol (OCP) transaction verification IP (VIP) using the Universal Verification Methodology (UVM), and describes how easily functional coverage and SystemVerilog assertions (SVA) can be implemented with the SystemVerilog (SV) language. As design verification has become the most challenging task, it is an important factor in achieving the time to market (TTM) of a product. The paper briefly addresses how UVM and SV make developing a VIP easier and quicker than with traditional hardware description languages (HDLs) such as Verilog or VHDL and hardware verification languages (HVLs) such as Specman.
-
Low-Power and Area-Efficient Carry Select Adder
Related Courses: M.Tech, MS, MCA
The Carry Select Adder (CSLA) is one of the fastest adders used in many data-processing processors to perform fast arithmetic functions. From the structure of the CSLA, it is clear that there is scope for reducing its area and power consumption. This work uses a simple and efficient gate-level modification to significantly reduce the area and power of the CSLA. Based on this modification, 8-, 16-, 32-, and 64-b square-root CSLA (SQRT CSLA) architectures have been developed and compared with the regular SQRT CSLA architecture. The proposed design has reduced area and power compared with the regular SQRT CSLA, with only a slight increase in delay. This work evaluates the performance of the proposed designs in terms of delay, area, power, and their products by hand with logical effort and through custom design and layout in 0.18-μm CMOS process technology. The results analysis shows that the proposed CSLA structure is better than the regular SQRT CSLA.
-
Area-Time Efficient Scaling-Free CORDIC Using Generalized Micro-Rotation Selection
Related Courses: M.Tech, MS, MCA
This paper presents an area-time efficient CORDIC algorithm that completely eliminates the scale-factor. By suitable selection of the order of approximation of the Taylor series, the proposed CORDIC circuit meets the accuracy requirement and attains the desired range of convergence. In addition, we propose an algorithm to redefine the elementary angles to reduce the number of CORDIC iterations. A generalized micro-rotation selection technique based on high-speed most-significant-1 detection obviates the complex search algorithms for identifying the micro-rotations. The proposed CORDIC processor provides the flexibility to adjust the number of iterations depending on the accuracy, area, and latency requirements. Compared to the existing recursive architectures, the proposed one has a 17% lower slice-delay product on the Xilinx Spartan XC2S200E device.
-
Design and implementation of demodulation technique with complex dpll using cordic algorithm
Related Courses: BE, B.Tech, BCA
CORDIC (Coordinate Rotation Digital Computer) is a simple and efficient algorithm for calculating hyperbolic and trigonometric functions. It is commonly used when no multiplier hardware is available (e.g., in simple microcontrollers and FPGAs). The only operations it requires are addition, subtraction, bit shift, and table lookup. A pipelined architecture of the coordinate rotation algorithm is designed for computing the loop performance of a complex Digital Phase Locked Loop (DPLL) in an in-phase and quadrature channel receiver. The design of CORDIC in the vector rotation mode results in high system throughput due to its pipelined architecture, where latency is reduced in each pipelined stage. For on-chip application, the area reduction in the proposed design can be achieved by optimizing the number of micro-rotations. For better loop performance of the first-order complex DPLL and to minimize quantization error, the number of iterations is also optimized.
-
A New Approach for High Performance and Efficient Design of CORDIC Processor
Related Courses: BE, B.Tech, MCA, BCA
This paper presents a new approach for the high-performance and hardware-efficient design of a coordinate rotation digital computer (CORDIC) processor structure. The proposed approach completely eliminates the ROM required for the constant arctangent values. Furthermore, efficient designs of carry look-ahead adders (CLAs) that exploit one input being constant speed up the computation in the angle adder/subtractor datapath while maintaining regularity. The proposed architecture is implemented on an FPGA as well as in a 180 nm standard cell library. Compared to the basic structure, the proposed implementation shows about 39% delay improvement on the FPGA and about 34% delay improvement in standard cell technology. About 47% power savings have been achieved in the proposed structure.
-
Design of Plural-Multiplier Based on CORDIC Algorithm for FFT Application
Related Courses: BE, B.Tech, BCA
The CORDIC plural-multiplier is the key module affecting the speed and accuracy of an FFT processor. Considering these demands, this paper discusses the problems of the CORDIC algorithm in detail and gives the corresponding optimization methods. The hardware pipelining structure of the CORDIC multiplier is then put forward. Comparison of the RTL simulation results with MATLAB calculations indicates that the design is feasible and practical.
-
Hardware Efficient Architecture for Generating Sine/Cosine Waves
Related Courses: BE, B.Tech, BCA
This paper presents a hardware-efficient architecture for generating sine and cosine waves based on the CORDIC (Coordinate Rotation Digital Computer) algorithm. In its original form, CORDIC suffers from major drawbacks such as scale-factor calculation, latency, and optimal selection of micro-rotations. The proposed algorithm overcomes all these drawbacks. We use a leading-one bit detection technique to identify the micro-rotations. The scale-free design of the proposed algorithm is based on the Taylor series expansion of the sine and cosine waves. The 16-bit iterative architecture achieves approximately 4.5% and 6.7% lower slice-delay product compared to the other existing designs. The algorithm design and its VLSI implementation are detailed.
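For context, the conventional CORDIC that the abstract takes as its starting point can be sketched as follows. This is the textbook rotation-mode algorithm with a precomputed scale factor, not the paper's scale-free variant. The constants `FRAC_BITS` and `ITERATIONS` and the function name `cordic_sin_cos` are assumptions chosen for illustration; inputs are expected to lie roughly within ±π/2.

```python
import math

FRAC_BITS = 24      # fixed-point fraction bits (assumed)
ITERATIONS = 16     # number of micro-rotations (assumed)
ONE = 1 << FRAC_BITS

# Lookup table of elementary angles atan(2^-i) in fixed point.
ANGLES = [round(math.atan(2.0 ** -i) * ONE) for i in range(ITERATIONS)]

# The scale factor that conventional CORDIC must compensate for.
GAIN = 1.0
for i in range(ITERATIONS):
    GAIN *= math.sqrt(1.0 + 2.0 ** (-2 * i))
X_START = round(ONE / GAIN)

def cordic_sin_cos(theta):
    """Return (sin, cos) of theta using only shifts, adds, and a
    table lookup in the iteration loop."""
    x, y = X_START, 0
    z = round(theta * ONE)
    for i in range(ITERATIONS):
        if z >= 0:   # rotate counter-clockwise
            x, y, z = x - (y >> i), y + (x >> i), z - ANGLES[i]
        else:        # rotate clockwise
            x, y, z = x + (y >> i), y - (x >> i), z + ANGLES[i]
    return y / ONE, x / ONE
```

Pre-scaling the start vector by `1 / GAIN` is exactly the scale-factor handling that the proposed scale-free design aims to avoid.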
-
FPGA Design of a Fast 32-bit Floating Point Multiplier Unit
Related Courses: M.Tech, MS, MCA
This paper proposes an architecture for a fast 32-bit floating point multiplier compliant with the single precision IEEE 754-2008 standard. The design aims to make the multiplier faster by reducing the delay caused by carry propagation, implementing adders with the least power-delay constant. The multiplier module was implemented using a top-down approach. The sub-modules were written in Verilog HDL and then synthesized and simulated using Xilinx ISE 12.1, targeting the Spartan 3E.
-
Design & Implementation of Floating point ALU
Related Courses: M.Tech, MS, MCA
In this paper, DSP modules such as a floating point ALU are designed and implemented. The design is based on the high-performance FPGA "Cyclone II", and implementation is done after functional and timing simulation. The simulation tool used is ModelSim. The tool for synthesis and implementation is Quartus II. The experimental results show the functional and timing analysis for all the DSP modules, carried out using high-performance synthesis software from Altera.
-
FPGA Implementation of Sine and Cosine Value Generators using Cordic Algorithm for Satellite Attitude Determination and Calculators
Related Courses: BE, B.Tech, BCA
Nowadays, various Digital Signal Processing systems are implemented on a platform of programmable signal processors or on application-specific VLSI chips. The Coordinate Rotation Digital Computer (CORDIC) algorithm has turned out to be one such kind of programmable signal processor. In recent times, it has been a widely researched topic in the field of vector-rotated Digital Signal Processing (DSP) applications due to its simplicity. This paper presents the design of a pipelined architecture of the coordinate rotation algorithm for computing the loop performance of a complex Digital Phase Locked Loop (DPLL) in an in-phase and quadrature channel receiver. The design of CORDIC in the vector rotation mode results in high system throughput due to its pipelined architecture, where latency is reduced in each pipelined stage. For on-chip application, the area reduction in the proposed design is achieved by optimizing the number of micro-rotations. For better loop performance of the first-order complex DPLL and to minimize quantization error, the number of iterations is also optimized.
-
Design and Implementation of CORDIC Processor for Complex DPLL
Related Courses: BE, B.Tech, BCA
This paper presents an area-time efficient CORDIC algorithm that completely eliminates the scale-factor. By suitable selection of the order of approximation of the Taylor series, the proposed CORDIC circuit meets the accuracy requirement and attains the desired range of convergence. In addition, we propose an algorithm to redefine the elementary angles to reduce the number of CORDIC iterations. A generalized micro-rotation selection technique based on high-speed most-significant-1 detection obviates the complex search algorithms for identifying the micro-rotations. The proposed CORDIC processor provides the flexibility to adjust the number of iterations depending on the accuracy, area, and latency requirements. Compared to the existing recursive architectures, the proposed one has a 17% lower slice-delay product on the Xilinx Spartan XC2S200E device.
-
FPGA Implementation of a chaotic oscillator using RK4 method
Related Courses: M.Tech, MS, MCA
The dual deterministic-stochastic behavior of chaotic systems (CS) makes them extremely interesting in electronic engineering, as CS may replace noise sources in different applications. Consequently, it is convenient to have hardware implementations of both analog and digital versions. Discrete components, microcontrollers, Digital Signal Processors (DSPs), and Field Programmable Gate Arrays (FPGAs) are possible choices. For digital realizations, the Ordinary Differential Equations (ODEs) are replaced by a discrete-time system. Furthermore, numerical values are expressed in a numerical representation. It is well known that these two discretization processes may strongly affect the chaotic behavior of the system. In previous contributions we considered the use of Euler’s algorithm in two different numerical representations: (a) integer arithmetic and (b) the single precision floating point IEEE-754 standard. For applications that require good agreement between the analog chaotic system and its digital counterpart, more involved algorithms and/or numerical representations must be used. Guided by numerical simulations, in this paper we propose an improvement that replaces Euler’s algorithm with the fourth-order Runge-Kutta algorithm (RK4). To reduce the required hardware, a method based on block reuse is proposed. The procedure is exemplified on a Lorenz CS. The whole design was implemented on an FPGA, using only 12% of its logic elements, 13% of its embedded multipliers, and 34% of its memory bits.
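A software reference for the numerical scheme can help when checking a hardware implementation. The sketch below applies the classical fourth-order Runge-Kutta step to the Lorenz equations using Python floats. The parameter values (sigma = 10, rho = 28, beta = 8/3) are the commonly used ones and are an assumption here, since the abstract does not state them; the names `lorenz`, `rk4_step`, and `integrate` are hypothetical, and the block-reuse hardware method is not modelled.

```python
def lorenz(state, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    """Right-hand side of the Lorenz system (assumed parameters)."""
    x, y, z = state
    return (sigma * (y - x), x * (rho - z) - y, x * y - beta * z)

def rk4_step(f, state, h):
    """Advance `state` by one classical RK4 step of size h."""
    def shifted(k, scale):
        return tuple(s + scale * ki for s, ki in zip(state, k))
    k1 = f(state)
    k2 = f(shifted(k1, h / 2))
    k3 = f(shifted(k2, h / 2))
    k4 = f(shifted(k3, h))
    return tuple(s + h / 6 * (a + 2 * b + 2 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))

def integrate(f, state, h, steps):
    """Apply rk4_step repeatedly and return the final state."""
    for _ in range(steps):
        state = rk4_step(f, state, h)
    return state
```

For instance, `integrate(lorenz, (1.0, 1.0, 1.0), 0.01, 1000)` advances a trajectory by ten time units; Euler's method would need one function evaluation per step instead of the four used here.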
-
Low-Cost Scan-Chain-Based Technique to Recover Multiple Errors in TMR Systems
In this paper, we present a scan-chain-based multiple error recovery technique for triple modular redundancy (TMR) systems (SMERTMR). The proposed technique reuses scan-chain flip-flops fabricated for testability purposes to detect and correct faulty modules in the presence of single or multiple transient faults. In the proposed technique, the manifested errors are detected at the modules’ outputs, while the latent faults are detected by comparing the internal states of the TMR modules. Upon detection of any mismatch, the faulty modules are located and the state of a fault-free module is copied into the faulty modules. If a permanent fault is detected, the system is degraded to a master/checker configuration by disregarding the faulty module. FPGA-based fault injection experiments reveal that SMERTMR has error detection and recovery coverage of 100% and 99.7% in the presence of one and two faulty modules, respectively, while imposing negligible area and performance overheads on traditional TMR systems.
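The basic detect, locate, and copy idea can be illustrated with a much simplified model. The sketch below covers only the single-faulty-module case with module states held as integers; it does not model the scan chains or the paper's handling of two faulty modules. The names `majority` and `locate_and_repair` are hypothetical.

```python
def majority(a, b, c):
    """Bitwise majority voter over three module outputs."""
    return (a & b) | (a & c) | (b & c)

def locate_and_repair(states):
    """Compare three module states; if exactly one disagrees, copy a
    fault-free state into it. Returns (repaired_states, faulty_index),
    with faulty_index None when all states match."""
    a, b, c = states
    if a == b == c:
        return list(states), None
    if a == b:
        return [a, b, a], 2      # module 2 differs
    if a == c:
        return [a, a, c], 1      # module 1 differs
    if b == c:
        return [b, b, c], 0      # module 0 differs
    # No two modules agree: outside the scope of this simple model.
    raise ValueError("no two module states agree")
```

Calling `locate_and_repair([5, 9, 5])` identifies module 1 as faulty and restores it from a matching module.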