3
$\begingroup$

Is there a well-known algorithm for modular multiplication of floating-point numbers?

I would like to multiply some large angle in single precision (6-7 significant digits) and reduce the product modulo 360 degrees, without losing too many significant digits.

float r = fmodf(a * b, 360);

My current approach is to split the numbers a and b into integer and fractional parts, compute the remainder (a_int * b_int) % 360 exactly in integer arithmetic, and then add in the smaller products involving the fractional parts.

Does a more efficient algorithm exist?


Edit: Here's an example expression:

float L = fmodf(13.17639647 * d, 360);

where d is a decimal number between -36525 and 36525 (it could theoretically be a wider range).

Unfortunately, double-precision math is not available.

$\endgroup$
5
  • $\begingroup$ What are the ranges of $a$ and $b$ ? (It is quite unusual to have an angle that is the product of two large numbers.) $\endgroup$
    – user16034
    Commented Jan 17, 2023 at 8:51
  • 2
    $\begingroup$ If your input numbers are floats, promoting to double to perform the computation will yield good accuracy. $\endgroup$
    – user16034
    Commented Jan 17, 2023 at 8:54
  • $\begingroup$ Please show us the exact formula that you use. $\endgroup$
    – user16034
    Commented Jan 17, 2023 at 8:58
  • $\begingroup$ If the inputs are arbitrary large, then you first reduce them modulo 360, which is an exact operation. Then you calculate the double precision product, modulo 360, and convert to single precision. $\endgroup$
    – gnasher729
    Commented Jan 17, 2023 at 10:08
  • $\begingroup$ I've edited the question to include an example expression. Unfortunately, double-precision is not available. $\endgroup$
    – phil5
    Commented Jan 17, 2023 at 15:14

2 Answers

3
$\begingroup$

A completely general solution for this is difficult to achieve if a wider floating-point format with at least twice the number of significand bits is not available. Normally, on common system platforms using C or C++, one could simply use float rem = (float) fmod ((double)a * (double)b, 360.0);. With float mapped to the IEEE-754 binary32 type and double mapped to the IEEE-754 binary64 type, both the multiplication and fmod() are exact, and only a single rounding error is incurred when the result is rounded back to float.

A high-performance implementation is possible in the absence of double support if a few parameters and restrictions can be enforced. This answer assumes that a C or C++ program is executed on a system platform that supports IEEE-754 arithmetic and with the binary32 data type exposed as float, and that the default rounding mode in effect is "to nearest or even". It is further assumed that the fused multiply-add (FMA) operation is provided in hardware, and exposed (for the binary32 type) via the standard math function fmaf(). This latter assumption holds for most current systems based on the ARM, x86, and Power processor architectures, as well as commonly used GPUs.

It is further assumed that the product $a \cdot b$ is restricted to $[-360 \cdot2^{22}, 360 \cdot 2^{22}]$, approximately $[-1.5\cdot 10^{9}, 1.5\cdot 10^{9}]$, such that $\lfloor \frac{a \cdot b}{360}\rfloor$ is exactly representable in a float number. With this restriction in place, the desired result can be computed with an error of at most 2 ulp as follows.

With the help of FMA and the reciprocal $\frac{1}{360}$ one can quickly compute $\mathrm{nint}(\frac{a \cdot b}{360})$, where $\mathrm{nint}(\cdot)$ is the nearest integer function, and implement Kahan's accurate difference-of-products computation $ab-cd$. The remainder is computed as $a\cdot b - \mathrm{nint}(\frac{a \cdot b}{360}) \cdot 360$. To match fmod() semantics, the remainder must have the same sign as the dividend $ab$. This can be achieved by conditionally adding / subtracting $360$ if the condition is not met.

The final float result may round to exactly $360$. If this is problematic, an additional check should be added to the algorithm below to map this to zero.

/*
  Compute a*b-c*d with error <= 1.5 ulp.

  Claude-Pierre Jeannerod, Nicolas Louvet, and Jean-Michel Muller,
  "Further Analysis of Kahan's Algorithm for the Accurate Computation 
  of 2x2 Determinants", Mathematics of Computation, Vol. 82, No. 284, 
  Oct. 2013, pp. 2245-2264
*/
float diff_of_products_kahan (float a, float b, float c, float d)
{
    float w = d * c;
    float e = fmaf (c, -d, w);
    float f = fmaf (a, b, -w);
    return f + e;
}

/* Fast computation of fmod (a * b, 360). This is only accurate as long as the 
   quotient round (a * b / 360) can be represented accurately by a 'float'. It
   is possible for the final result to equal 360 due to rounding error. If this
   is problematic, an additional correction could be added.
 */
float fmod360 (float a, float b)
{
    const float divisor = 360.0f;
    const float rcp_divisor = 2.77777777e-3f; // 1 / 360
    const float int_cvt_magic = 0x1.8p23f; // 2**23 + 2**22
    float prod, revs, rem;

    prod = a * b;

    /* approximate quotient and round it to the nearest integer */
    revs = fmaf (prod, rcp_divisor, int_cvt_magic);
    revs = revs - int_cvt_magic;

    /* compute preliminary remainder: a * b - revs * 360 */
    rem = diff_of_products_kahan (a, b, revs, divisor);

    /* remainder may have wrong sign as quotient rounded; fix as necessary */
    if ((rem * prod) < 0.0f) { // rem does not have same sign as product a*b
        rem = rem + copysignf (divisor, prod);
    }
    return rem;
}
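The 360-to-zero mapping mentioned above could look like this (fold_360_to_zero is just an illustrative name):

```c
#include <math.h>

/* Extra check mentioned above: if rounding produced a result of
 * exactly +/-360, fold it to a zero of the same sign. */
static float fold_360_to_zero (float rem)
{
    if (fabsf (rem) == 360.0f) {
        rem = copysignf (0.0f, rem);
    }
    return rem;
}
```

One would then write return fold_360_to_zero (rem); in place of the plain return rem; in fmod360.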
$\endgroup$
2
  • $\begingroup$ Awesome. I learned about FMA, hex floating-point literals, rounding, and Kahan difference of products all in one post! Is there anything stopping us from doing: float err = fmaf(a, b, -a*b); return fmodf(a*b, 360) + err; Perhaps fmodf is more expensive? $\endgroup$
    – phil5
    Commented Feb 14, 2023 at 22:40
  • 1
    $\begingroup$ @phil5 If in fmod(x,y) the magnitudes of x and y are very different, the function can be quite slow. That is the cost of always delivering an exact result. From the question and example, it would seem that in your use case, the magnitudes of x and y are not close. But I do not know your platform. You probably would want to build some test scaffolding for assessing accuracy and speed and try some alternatives you can think of. $\endgroup$
    – njuffa
    Commented Feb 14, 2023 at 22:44
-1
$\begingroup$

If only single precision arithmetic is available:

You replace a with a mod 360, and b with b mod 360; that reduction can be done exactly. You can then force a and b into the range -180 <= a, b <= +180.

Calculating (a * b) mod 360 directly would have a huge rounding error. Standard single-precision floating point has a 24-bit significand, and here the product satisfies $|a \cdot b| \le 32{,}400 < 2^{15}$ (we are a bit lucky there). We let $a = a_1 + a_2 + a_3$, and likewise for $b$, where $a_1$ is a multiple of $1/16$ (at most a 12-bit significand), $a_2$ is a multiple of $2^{-17}$ with an absolute value of at most $1/32$, and $a_3$ has an absolute value of at most $2^{-18}$.

You need to calculate 9 partial products, each of which is exact. $a_1 \cdot b_1$ is reduced modulo 360, so its magnitude is at most 180; $a_1 b_2$ and $a_2 b_1$ are at most $180/32 \le 6$, and the remaining products are smaller still. Reduce $a_1 b_2$ and $a_2 b_1$ modulo $1/16$, then add everything from the smallest to the largest, and apply one final reduction modulo 360.

$\endgroup$
2
  • $\begingroup$ Thanks for your answer. I don't quite see how you can recover the correct result if you first reduce a and b (real numbers) to 360 degrees. For example: fmod360(13.1764 * 36524.6667) = 303.6183, while fmod360(13.1764 * 164.6667) = 9.7143. $\endgroup$
    – phil5
    Commented Jan 18, 2023 at 18:12
  • $\begingroup$ For real numbers, $ab\bmod360\ne(a\bmod360)(b\bmod360)$. $\endgroup$
    – user16034
    Commented Feb 12, 2023 at 14:00
