Modica RASC
Application Specific Computing
Presented by:
Steve Modica
RASC Product Manager
[Figure: Altix node block diagram. Labels: two Itanium2 processors on a 6.4 GB/s Front Side Bus to the SHUB; 4 channels of DDR SDRAM (PC2100/PC2700) at 10.8 – 12.8 GB/s; 2 channels of NUMAlink4 at 12.8 GB/s; PIC to BASE I/O (Ethernet, SCSI Disk).]
SGI Confidential
Slide 3
SGI Altix™ 3700 Bx2 Platform Introduction: Building Blocks
• Itanium® 2 CR-brick
– CPU and memory
• M-brick
– Memory
• R-brick
– Router interconnect
• IX-brick
– Base I/O module
• PA-brick, PX-brick
– PCI-X expansion
• D-brick2
– Disk expansion
• SGI® Advanced Linux Environment With SGI ProPack
SGI Altix™ 3700 Bx2 Platform Introduction: System Topology Example
[Figure: system topology with two router planes, Router Plane 1 and Router Plane 2.]
Reconfigurable Application Specific Computing
Accelerating Interaction is critical
Style 1 -- Traditional FPGAs
– Work with traditional FPGAs in PCI / PCI-X slots
• Nallatech, Clearspeed, Annapolis Micro et al
– Development environments relatively advanced
• All driving to same goal of “write in C, run on FPGA”
– Leverages other industry efforts
• Cray, PCs, Clusters
Style 2 -- Tightly coupled
– Athena --- FPGA + memory for computation at high b/w
– Daytona --- FPGA + spigots for fast network
– Both being proto’d by a few customers
Memory bandwidth is the key to success
[Figure labels: IO, Compute, Graphics, Specialist Elements.]
The 3 Single-Paradigm Architectures
[Figure: chart of the Scalar, Vector, and Application-specific architectures. Axis label: Compute Intensity.]
• Ease of Use
– Languages
– Compilers
– Debuggers
– APIs
• Performance
– Bandwidth to/from System
– Scalability
[Figure: chart with an Efficiency axis running from Low to High. VHDL and Verilog are labeled; several further points are marked with an x.]
• Handel-C
– Runs on Windows only
– Plans to port to Linux in June of 2005
– Most efficient procedural language
• Starbridge VIVA
– Extremely easy to learn, Graphical, Object-oriented
– Develop on Windows only, execute anywhere.
– Easiest language to program, creates very efficient cores
– Large library of packaged algorithm primitives
• Mitrion C
– Runs natively on Altix
– Utilizes a processor abstraction
– Most useful debugging environment
• Impulse-C
– Runs on Windows
– Highly optimized for Streaming Applications
– Fastest language to port legacy C code
Algorithm.c
tmp = a & b;
d = tmp | c;

Debugger running in real time
(gdb) fpgastep
(gdb) p/x $a
$6 = 0x444433
(gdb) p/x $b
$7 = 0x111122
(gdb) p/x $tmp
$8 = 0x555533
(gdb) fpgastep
(gdb) p/x $tmp
$9 = 0x555533
(gdb) p/x $c
$10 = 0x331222
(gdb) p/x $d
$11 = 0x111022

[Figure: dataflow on the COP FPGA. a and b feed the & operator to produce tmp; tmp and c feed the | operator to produce d.]
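The register values in the debugger session above can be cross-checked in plain C on the host. The sketch below is a software model only and does not touch the FPGA tooling. `alg_as_listed` follows Algorithm.c as printed, while `alg_swapped` is a hypothetical variant with the two operators exchanged; both names and the `alg_result` type are illustrative assumptions. With the printed inputs, the values shown for $tmp and $d match the swapped variant, which suggests that either the listing or the printed values contain a slip. The slide does not say which.

```c
#include <stdint.h>

/* Intermediate and final value of one run of the two-statement algorithm. */
typedef struct {
    uint32_t tmp;
    uint32_t d;
} alg_result;

/* Algorithm.c as listed on the slide: tmp = a & b; d = tmp | c; */
static alg_result alg_as_listed(uint32_t a, uint32_t b, uint32_t c)
{
    alg_result r;
    r.tmp = a & b;
    r.d = r.tmp | c;
    return r;
}

/* Hypothetical variant with the operators exchanged: tmp = a | b; d = tmp & c; */
static alg_result alg_swapped(uint32_t a, uint32_t b, uint32_t c)
{
    alg_result r;
    r.tmp = a | b;
    r.d = r.tmp & c;
    return r;
}
```

For a = 0x444433, b = 0x111122, c = 0x331222, `alg_as_listed` gives tmp = 0x22 and d = 0x331222, while `alg_swapped` gives tmp = 0x555533 and d = 0x111022, the values printed in the session.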
[Figure: software stack in User Space. Labels: Application; Debugger (GDB); Open|Speedshop; Pro|Speedshop; Download Utilities; Abstraction Layer Library; Device Manager.]
The Abstraction Layer’s algorithm API mirrors the COP API, with a few additions that enable wide scaling.
[Figure: wide scaling. Labels: Application, Algorithm, Input Data, Output Data, multiple COPs, NUMAlink4.]
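One way to picture wide scaling is as splitting an input buffer into contiguous slices, one per COP, with each COP running the same algorithm on its slice. The sketch below is plain host-side C and does not call the Abstraction Layer or COP API. The names `cop_slice` and `slice_for_cop`, and the even-split policy, are assumptions made for illustration and are not taken from the slides.

```c
#include <assert.h>
#include <stddef.h>

/* A contiguous slice of the input buffer assigned to one COP (hypothetical type). */
typedef struct {
    size_t offset; /* index of the first element in the slice */
    size_t length; /* number of elements in the slice */
} cop_slice;

/*
 * Split `total` elements as evenly as possible across `n_cops` devices and
 * return the slice for device `cop` (0-based). The first `total % n_cops`
 * devices each receive one extra element.
 */
static cop_slice slice_for_cop(size_t total, size_t n_cops, size_t cop)
{
    assert(n_cops > 0 && cop < n_cops);
    size_t base = total / n_cops;
    size_t extra = total % n_cops;
    cop_slice s;
    s.length = base + (cop < extra ? 1 : 0);
    s.offset = cop * base + (cop < extra ? cop : extra);
    return s;
}
```

With 10 elements and 4 COPs, the slices are (offset 0, length 3), (3, 3), (6, 2), and (8, 2), so every element is covered exactly once.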
Altix 350 MOATB
[Figure: block diagram. Host side: Itanium2 processors and DDR SDRAM (PC2100/PC2700) on the SHUB, with PIC, PCI-X, and BASE I/O. MOATB side: NUMAlink Connectors to the TIO; SSP between the TIO and the Algorithm FPGA; 2MB QDR SRAM banks (SRAM 0, SRAM 1, SRAM 2) on the Algorithm FPGA, each with Addr & Ctrl lines and data paths labeled 36 and 72; a Loader FPGA on PCI 66MHz with the Select Map Programming Interface.]
[Figure: Algorithm FPGA memory interface. Labels: SSP; Core Services Block; Algorithm Block; QDR-II SRAM Bank 0, Bank 1, Bank 2; Read port 0, 1, 2; Write port 0, 1, 2; Reads @ 1.6GB/s; Writes @ 1.6GB/s; 3.2 GB/s.]
[Figure: Algorithm Block interface signals.]
Algorithm controller:
alg_clk
alg_rst
do_step
step_flag
alg_done
Debug port:
debug0 ... debug63
SRAM controller (one bank shown):
sram_wr_req
sram_wr_gnt
sram_wr_dvld
sram_wr_data[63:0]
sram_wr_addr[17:0]
sram_wr_be[7:0]
sram_rd_req
sram_rd_gnt
sram_rd_cmd_vld
sram_rd_addr[17:0]
sram_rd_dvld
sram_rd_data
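The SRAM interface widths can be checked against the figures quoted elsewhere on the slides. The sketch below assumes that each SRAM address selects one 64-bit word, which the slides do not state outright. Under that assumption an 18-bit address gives 2 MB per bank, matching the 2MB labels, and 1.6 GB/s in one direction corresponds to 200 million 64-bit words per second. Reading the 3.2 GB/s label as the combined read and write rate of one bank is likewise an assumption. The function names are illustrative.

```c
#include <stdint.h>

/* Bank capacity in bytes, assuming one data word per address. */
static uint64_t bank_bytes(unsigned addr_bits, unsigned data_bits)
{
    return ((uint64_t)1 << addr_bits) * (data_bits / 8u);
}

/* Words per second needed to sustain a given byte rate on one port. */
static uint64_t words_per_second(uint64_t bytes_per_second, unsigned data_bits)
{
    return bytes_per_second / (data_bits / 8u);
}

/* Combined rate of one bank when its read and write ports both run at full rate. */
static uint64_t bank_total_rate(uint64_t read_rate, uint64_t write_rate)
{
    return read_rate + write_rate;
}
```

Here `bank_bytes(18, 64)` is 2097152 bytes, `words_per_second(1600000000ULL, 64)` is 200000000, and `bank_total_rate(1600000000ULL, 1600000000ULL)` is 3200000000.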
Systems
Altix 3700/350 BX2 SHUB2 UV
[Figure: current blade. Labels: NUMAlink Connectors; TIO; SSP; Algorithm FPGA; 2MB QDR SRAM banks; PCI 66MHz.]
[Figure: blade with two V4LX200 FPGAs. Labels: each V4LX200 with SSRAM banks and a TIO with NL4 and SSP links; PCI; Selmap; Loader.]
[Figure: chassis. Labels: Blade Slots; TPS Power Supply Slots.]