CORDIC
CORDIC
CORDIC
Sine/Cosine using
CORDIC Algorithm
CORDIC Algorithm
Prof. Kris
Prof. Kris
Gaj
Gaj
Gaurav Doshi,
Gaurav Doshi,
Hiren
Hiren
Shah
Shah
Outlines
Outlines
Introduction
Introduction
Basic Idea
Basic Idea
CORDIC Principles
CORDIC Principles
Hardware Implementation
Hardware Implementation
FPGA & ASIC Results
FPGA & ASIC Results
Conclusion
Conclusion
Introduction
Introduction
CORDIC (
CORDIC (
COordinate
COordinate
Rotation
Rotation
DIgital
DIgital
Computer)
Computer)
Introduced in 1959 by Jack E.
Introduced in 1959 by Jack E.
Volder
Volder
Efficient to compute sin,
Efficient to compute sin,
cos
cos
, tan,
, tan,
sinh
sinh
,
,
cosh
cosh
,
,
tanh
tanh
Its an Hardware Efficient Algorithm
Its an Hardware Efficient Algorithm
Iterative Algorithm for Circular Rotation
Iterative Algorithm for Circular Rotation
No Multiplication
No Multiplication
Delay/Hardware cost comparable to
Delay/Hardware cost comparable to
division or square rooting.
division or square rooting.
Why CORDIC ?
How to evaluate trigonometric functions?
How to evaluate trigonometric functions?
Table lookup
Table lookup
Polynomial approximations
Polynomial approximations
CORDIC
CORDIC
Compared to other approaches, CORDIC is
a clear winner when :
Hardware Multiplier is unavailable ( Hardware Multiplier is unavailable (eg eg. microcontroller) . microcontroller)
You want to save the gates required to implement You want to save the gates required to implement
( (eg eg. FPGA) . FPGA)
Basic Ideas
Basic Ideas
Embedding of elementary function
Embedding of elementary function
evaluation as a generalized rotation
evaluation as a generalized rotation
operation.
operation.
Decompose rotation operation into
Decompose rotation operation into
successive basic rotations.
successive basic rotations.
Each basic rotation can be realized
Each basic rotation can be realized
with shift
with shift
-
-
and
and
-
-
add arithmetic
add arithmetic
operations.
operations.
CORDIC Principles
CORDIC Principles
) sin( . ) cos( .
) sin( . ) cos( .
x y y
y x x
+ =
(x,y)
Y
X
(x,y)
Rotation of any (x,y) vector:
)] tan( . ).[ cos(
)] tan( . ).[ cos(
x y y
y x x
+ =
) tan(
) cos(
) sin(
: Note
=
Rearrange as:
sin
cos
Y
X
Basic idea
Rotate (1,0) by degrees
to get (x,y): x=cos(), y=sin()
Key Idea
Key Idea
Can compute rotation in steps where each
step is of size
Expansion Vector K
Expansion Vector K
K =
K =
cos
cos
depends on rotation
depends on rotation
angle
angle
1 1
,
,
2 2
,
,
3 3
n n
Since same angles are rotated
Since same angles are rotated
always K is a constant that can be
always K is a constant that can be
pre
pre
-
-
computed
computed
K = 1.646760258121
K = 1.646760258121
Iterative rotations
di decision (rotation mode)
Zi is introduced to keep track of the angle that
has been rotated (z0 = )
di = -1 if z
i
< 0
= 1 otherwise
Find Find
i i
such that such that tan( tan(
i i
)=2 )=2
- -i i
: (or, : (or,
i i
=tan =tan
- -1 1
(2 (2
- -i i
) ) ) )
Example: Example: = =30.0 30.0
Start with Start with
0 0
= 45.0 = 45.0 ( (> > 30.0 30.0) )
45.0 45.0 26.6 = 26.6 = 18.4 18.4 ( (< < 30.0 30.0) )
18.4 18.4 + + 14.0 = 14.0 = 32.4 32.4 ( (> > 30.0 30.0) )
32.4 32.4 7.1 = 7.1 = 25.3 25.3 (< (< 30.0 30.0) )
25.3 + 3.6 = 28.9 25.3 + 3.6 = 28.9 ( (< < 30.0 30.0) )
28.9 28.9 + + 1.8 = 30.7 1.8 = 30.7 ( (> > 30.0 30.0) )
. . . . . .
X
Y
30
Example: Rewriting Angles in Terms of
Example: Rewriting Angles in Terms of
i i
= 30.0
45.0 26.6 + 14.0 7.1 + 3.6
+ 1.8 0.9 + 0.4 0.2 + 0.1
= 30.1
45
Sequential/Iterative CORDIC
Cont..
Cont..
Maximum number of Clock Cycles to
Maximum number of Clock Cycles to
calculate output
calculate output
Minimum Clock Period per
Minimum Clock Period per
itration
itration
Variable Shifters do not map well on
Variable Shifters do not map well on
certain
certain
FPGAs
FPGAs
due to high Fan
due to high Fan
-
-
in
in
Parallel/Cascaded CORDIC
Parallel CORDIC Cont..
Parallel CORDIC Cont..
Combinational circuit
Combinational circuit
More Delay, but processing time is
More Delay, but processing time is
reduced as compared to iterative
reduced as compared to iterative
circuit.
circuit.
Shifters are of fixed shift, so they
Shifters are of fixed shift, so they
can be implemented in the wiring.
can be implemented in the wiring.
Constants can be hardwired instead
Constants can be hardwired instead
of requiring storage space.
of requiring storage space.
Pipeline Architecture
Pipeline Architecture
32
Bit
Parallel Pipelined CORDIC
Parallel Pipelined CORDIC
Parallel CORDIC can be pipelined by inserting
Parallel CORDIC can be pipelined by inserting
registers between the adders stages.
registers between the adders stages.
In most FPGA architectures there are already
In most FPGA architectures there are already
registers present in each logic cell, so pipeline
registers present in each logic cell, so pipeline
registers has no hardware cost.
registers has no hardware cost.
Number of stages after which pipeline register is
Number of stages after which pipeline register is
inserted can be modeled, considering clock
inserted can be modeled, considering clock
frequency of system.
frequency of system.
When operating at greater clock period power
When operating at greater clock period power
consumption in later stages reduces due to lesser
consumption in later stages reduces due to lesser
switching activity in each clock period.
switching activity in each clock period.
Redundant Addition
Redundant Addition
Main delay in critical path of the
Main delay in critical path of the
CORDIC iteration is that of the
CORDIC iteration is that of the
adder.
adder.
To reduce this delay we can use
To reduce this delay we can use
redundant adders.
redundant adders.
In signed digit number system
In signed digit number system
addition becomes carry free.
addition becomes carry free.
Example
Example
r = 10 , digit set [0,9]
r = 10 , digit set [0,9]
5 7 8 2 4 9
5 7 8 2 4 9
+ 6 2 9 3 8 9
+ 6 2 9 3 8 9
11 9 17 5 12 18 [0,18]
11 9 17 5 12 18 [0,18]
11 9 16 5 12 16 [0,16]
11 9 16 5 12 16 [0,16]
0 0 1 0 0 2 [0,2]
0 0 1 0 0 2 [0,2]
11 10 16 5 14 16 [0,16]
11 10 16 5 14 16 [0,16]
Example
Example
r = 2, digit set [
r = 2, digit set [
-
-
1,0,1]
1,0,1]
1 0 0 1 0 0 - -1 7 in decimal 1 7 in decimal
1 0 1 0 - -1 0 6 in decimal 1 0 6 in decimal
1 0 1 0 - -1 0 1 0 C C
i i
0 0 0 1 0 0 0 1 - -1 1 U U
i i
1 0 1 0 - -1 1 1 1 - -1 13 in decimal 1 13 in decimal
Language, Platform, Tools
Language, Platform, Tools
Language
Language
Verilog
Verilog
HDL
HDL
Platform
Platform
Xilinx FPGA
Xilinx FPGA
ASIC
ASIC
TSMC
TSMC
LibraryAldec
LibraryAldec
Active
Active
-
-
HDL
HDL
Tools
Tools
Aldec
Aldec
Active
Active
-
-
HDL,
HDL,
Synplify
Synplify
Pro, Xilinx ISE
Pro, Xilinx ISE
(Windows Platform)
(Windows Platform)
Cadence
Cadence
Verilog
Verilog
-
-
XL &
XL &
Simvision
Simvision
SD
SD
Adder
Adder
Area
Area
9138
9138
39019
39019
62937
62937
356015
356015
Power
Power
554W
554W
22.9m
22.9m
W
W
3mW
3mW
27.4m
27.4m
W
W
Arrival
Arrival
Time
Time
9.7
9.7
28.3
28.3
9.7
9.7
7.82
7.82
Clock Frequency
Clock Frequency
FPGA
FPGA
85 MHz
85 MHz
ASIC
ASIC
150Mhz
150Mhz
Application
Application
CORDIC sine/cosine generators in satellite data
CORDIC sine/cosine generators in satellite data
processing systems
processing systems
attitude determination
attitude determination
Cordic Cordic modules has been proposed for calculation of modules has been proposed for calculation of
Legendre Polynomials. Increase in speed 40% compared to Legendre Polynomials. Increase in speed 40% compared to s/w s/w
running on 333Mhz Pentium running on 333Mhz Pentium
generates a new
generates a new
frequency based upon an original reference
frequency based upon an original reference
frequency.
frequency.
Verification
Verification
C code was used to generate test vectors
C code was used to generate test vectors
360 360
= 2 = 2
32 32
1 1
= = 232 232 = (0B60B60) = (0B60B60)
16 16
360 360
i.e i.e 30 = 30 = 2 2
32 32
x 30 =(15555555) x 30 =(15555555)
16 16
360 360
Verification of Results done by simulator given by
Verification of Results done by simulator given by
Isreal
Isreal
Koren
Koren
Output Waveform
Output Waveform
-
-
Parallel
Parallel
Output Waveform
Output Waveform
-
-
Pipeline
Pipeline
Status of Design
Status of Design
Written
Written
100 %
100 %
Functional Simulation
Functional Simulation
100%
100%
Timing Simulation
Timing Simulation
50%
50%
Analyzed for Speed and Area
Analyzed for Speed and Area
80%
80%
Further Optimization
Further Optimization
Area Optimized
Area Optimized
If adders are
If adders are
shared.
shared.
Speed & Area Optimized
Speed & Area Optimized
Modified
Modified
CORDIC algorithm
CORDIC algorithm
Problem
Problem
Difficulty in implementation of
Difficulty in implementation of
Redundant Adders.
Redundant Adders.
Conclusion
Conclusion
A trade
A trade
-
-
off speed/area will determine
off speed/area will determine
the right structural approach to
the right structural approach to
CORDIC FPGA implementation for an
CORDIC FPGA implementation for an
application
application
References
References
Computer Arithmetic B. Parhami
Digital Arithmetic Milos.E
Survey of CORDIC algorithms for
FPGA based computers R.Andraka
FPGA Implementation of Sine and
Cosine Generators using the CORDIC
algorithm.
Computer Arithmetic I.Koren
(SD Adder)
Thank You !
Thank You !