International Journal of Parallel Programming, 2010
We present two designs (I and II) for IEEE 754 double-precision floating-point matrix multiplication, optimized for implementation on high-end FPGAs. Matrix multiplication forms the kernel in many important tile-based BLAS algorithms, making it an excellent candidate for acceleration. The designs, both based on the rank-1 update scheme, can handle arbitrary matrix sizes and are able to sustain their peak performance
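To illustrate what a rank-1 update scheme means for matrix multiplication, here is a minimal software sketch in Python. It is not the FPGA design from the paper: the function name `matmul_rank1` and the list-of-lists matrix representation are assumptions made for illustration only. The idea shown is that C = A * B can be accumulated as a sum of outer products, one per column of A paired with the matching row of B.

```python
def matmul_rank1(a, b):
    """Compute C = A * B as a sum of rank-1 (outer-product) updates.

    Hypothetical helper for illustration; matrices are non-empty lists of rows.
    """
    n, k = len(a), len(a[0])
    if len(b) != k:
        raise ValueError("inner dimensions of A and B must match")
    m = len(b[0])

    # Start from a zero matrix and accumulate one rank-1 update per step.
    c = [[0.0] * m for _ in range(n)]
    for p in range(k):
        col = [a[i][p] for i in range(n)]  # p-th column of A
        row = b[p]                         # p-th row of B
        # Rank-1 update: C += col (outer product) row
        for i in range(n):
            for j in range(m):
                c[i][j] += col[i] * row[j]
    return c


if __name__ == "__main__":
    a = [[1.0, 2.0], [3.0, 4.0]]
    b = [[5.0, 6.0], [7.0, 8.0]]
    print(matmul_rank1(a, b))
```

In this formulation every element of C receives exactly one multiply-add per step `p`, and the updates to different elements within a step are independent of one another, which is one reason outer-product formulations are commonly considered for parallel hardware.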
Papers by Vinay Kumar