Questions tagged [floating-point]
Approximate representation of real numbers using a fixed number of significant digits scaled by an integer exponent of a fixed base.
244 questions
1
vote
1
answer
39
views
Does adding two positive floating point numbers ever result in a smaller number?
When dealing with floating point numbers, is there ever a time when adding two non-negative, finite, non-NaN numbers produces a result that is less than the greater of the two? ...
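This one admits a quick empirical check. The sketch below (plain Python doubles, i.e. IEEE 754 binary64 under the default round-to-nearest; a probe, not a proof) searches pairs of non-negative finite values:

```python
import sys

# Under round-to-nearest the exact sum a + b is >= max(a, b), and
# max(a, b) is itself representable, so rounding cannot drop below it.
samples = [0.0, 5e-324, 1e-300, 0.1, 1.0, 1e16, sys.float_info.max / 2]
counterexamples = [(a, b) for a in samples for b in samples
                   if a + b < max(a, b)]
```

No counterexample turns up, consistent with the rounding argument in the comment.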
0
votes
0
answers
37
views
Convert float to integer representation
It's related to my previous question.
I want to map an arbitrary float space into an integer representation.
I have defined the transformation, where $x'$ is the integer representation and $x$ is the real value like <...
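The transformation in the question is elided, but one standard float-to-integer mapping (a sketch assuming IEEE 754 single precision; the function names are illustrative) simply reinterprets the bit pattern:

```python
import struct

def float_to_bits(x: float) -> int:
    """Reinterpret an IEEE 754 binary32 value as its 32-bit pattern."""
    return struct.unpack('<I', struct.pack('<f', x))[0]

def bits_to_float(b: int) -> float:
    """Inverse mapping: a 32-bit pattern back to the float it encodes."""
    return struct.unpack('<f', struct.pack('<I', b))[0]
```

For non-negative floats this mapping is order-preserving, which is often the property such a transformation needs.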
2
votes
1
answer
62
views
Is there such data type for low range float?
For integers, we have unsigned integers to represent positive integers, including zero, and we have signed integers to represent negative and positive. There are always trade-offs between them.
For ...
0
votes
0
answers
30
views
What is the purpose of rounding bit in floating point numbers?
Let's only consider single precision ieee-754 floating point numbers. I understand how to convert decimal floating-point number to its ieee-754 representation (Well... almost). My problem is with ...
0
votes
0
answers
29
views
Exact computation with fractional powers as integer bounds
Context: I am implementing the prime-counting function $\pi(x)$ with the Meissel-Lehmer algorithm, extended by Lagarias, Miller, and Odlyzko. The notation comes from an overview by Oliveira e Silva. $\...
2
votes
2
answers
62
views
Diverging floating point calculation
I remember from computer science classes an example where the same floating-point calculation would diverge to infinity. From memory, it was something like this:
...
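The half-remembered example may have been Muller's recurrence (an assumption; the asker's original could differ): with these starting values the exact sequence converges to 6, but 100 is the attracting fixed point, so any rounding error eventually drags the computed sequence there in every fixed precision:

```python
def muller(n: int) -> float:
    """Muller's recurrence: exactly it tends to 6; in doubles it tends to 100."""
    u_prev, u = 2.0, -4.0
    for _ in range(n):
        u_prev, u = u, 111.0 - 1130.0 / u + 3000.0 / (u * u_prev)
    return u
```

Running it for a few dozen iterations shows the value first approaching 6 and then snapping to 100.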
5
votes
2
answers
1k
views
Big Transition of Binary Counting in perspective of IEEE754 floating point
If I iterate over binary representations from 000...000 to 111...111, is there a significant transition at some point?
In the ...
-2
votes
8
answers
1k
views
Why do computers use binary numbers in IEEE754 fraction instead of BCD or DPD?
I asked a new question because it more accurately reflects what I meant to ask: why don't decimal floating-point numbers have CPU-level support the way binary floating-point numbers do in typical computers?
I will ...
-1
votes
1
answer
43
views
Why is it returning TRUE in the first case and FALSE in the second?
I understand 0.3 does not have an accurate binary representation.
Suppose I run the following code:
Why is the answer "True" in the first case and "False" in the second? Shouldn't ...
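The code is elided, but one common pair that produces exactly this True/False split (an assumed reconstruction, not the asker's actual snippet) is:

```python
# All of 0.1, 0.2 and 0.3 carry representation error; the errors just
# happen to cancel in the first sum and not in the second.
first = (0.1 + 0.1 == 0.2)   # True: the rounded sum is the double nearest 0.2
second = (0.1 + 0.2 == 0.3)  # False: the rounded sum is one ulp above it
```

So the outcome is not about 0.3 alone but about whether the accumulated rounding of the sum lands on the same double as the literal.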
1
vote
0
answers
50
views
What is the largest number of bits ever used in arbitrary/multiple precision floating point arithmetic?
I've been exploring the evolution of floating-point arithmetic formats from single to octuple precision. Here's what I THINK I have learned about the key specifications and capabilities for each ...
0
votes
1
answer
102
views
How reliable is a floating point operation (how often does it make mistakes)?
While computers are very reliable, they can also make errors because of noise. I would like to have an idea of the rough order of magnitude of error per floating point operation in a computer, in a ...
3
votes
4
answers
144
views
Are float pseudo-random number generators always implemented using integer generators underneath?
In C it's well known to use a simple routine for turning an integer RNG into a float RNG. Something like this:
...
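The elided routine is presumably the usual bits-to-unit-interval scaling; a Python sketch of the same idea:

```python
import random

def uniform01() -> float:
    # 53 random bits (one per double significand bit) scaled by 2**-53
    # gives a uniform value in [0, 1); every output is exactly representable.
    return random.getrandbits(53) * 2.0 ** -53
```

This is essentially what many library float generators do internally: draw integer bits, then multiply by a power of two.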
0
votes
1
answer
129
views
A floating-point rounding problem
I run the Python code below. x and y differ in their 4th and 5th number, and x has larger ...
0
votes
2
answers
106
views
How is a signed floating-point adder implemented?
The following picture is a block diagram of an arithmetic unit dedicated to IEEE 754 floating-point addition from Computer Organization and Design RISC-V Edition: The Hardware Software Interface 2nd ...
0
votes
1
answer
53
views
Absolute difference between the largest IEEE 754 number and its predecessor
In single precision format, the largest possible positive number is
$A = 0 ~~~ 11111110 ~~~ 111\ldots 111$
Its predecessor is
$B = 0 ~~~ 11111110 ~~~ 111 \ldots 110$
But what is the absolute ...
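Worked out directly: the two values differ by one ulp at exponent 127, i.e. $2^{127-23} = 2^{104}$. A small check of that arithmetic (decoding the bit patterns with `struct`):

```python
import struct

def f32(bits: int) -> float:
    """Decode a 32-bit pattern as an IEEE 754 binary32 value."""
    return struct.unpack('<f', struct.pack('<I', bits))[0]

A = f32(0b0_11111110_11111111111111111111111)  # largest finite binary32
B = f32(0b0_11111110_11111111111111111111110)  # its predecessor
gap = A - B   # one ulp at exponent 127: 2**(127 - 23) = 2**104
```

The conversion to Python doubles is exact (every binary32 value is a binary64 value), so the subtraction is exact too.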
2
votes
1
answer
131
views
Comparison of different algorithms for summing floating point numbers
I am exploring several approaches to summing floating point values, such as:
Naive summation, for comparison
Summing sorted values
Summing with NumPy, again for comparison
Kahan's algorithm
Pairwise ...
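For reference, a minimal version of the Kahan entry in that list (a sketch; the asker's own implementations are not shown):

```python
def kahan_sum(values):
    """Compensated (Kahan) summation: carry each step's rounding error forward."""
    total = 0.0
    c = 0.0                    # compensation for lost low-order bits
    for x in values:
        y = x - c              # apply the stored correction
        t = total + y
        c = (t - total) - y    # the part of y that was rounded away
        total = t
    return total

data = [1.0] + [1e-16] * 100_000   # naive summation never sees the small terms
```

Here the built-in `sum(data)` stays at 1.0 because each tiny addend is rounded away, while `kahan_sum(data)` recovers roughly 1.00000000001.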
1
vote
2
answers
520
views
Convert a rational number to a floating-point number exactly
We have two integers, $n$ and $d$. They are coprime (the only positive integer that is a divisor of both of them is $1$). They may be implemented as something that fits in a machine register, or they ...
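In Python this exact conversion already exists and can serve as a reference even if the target is another language: `Fraction.__float__` divides the exact integers, and CPython's integer true division is correctly rounded, so the result is the nearest double:

```python
from fractions import Fraction

# n/d -> nearest double, computed exactly via arbitrary-precision integers.
x = float(Fraction(1, 3))
y = float(Fraction(2**60 + 1, 2**61))
```

Both routes below agree because each one is correctly rounded.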
2
votes
2
answers
430
views
Multiplication of subnormal and normal numbers under IEEE 754
As far as I understand, when doing this operation, we first need to identify the subnormal value, normalise it, adjust the exponent, and then multiply the significands. Wouldn't it be advantageous to ...
3
votes
2
answers
374
views
Floating-point modular multiplication algorithm
Is there a well-known algorithm for modular multiplication of floating-point numbers?
I would like to multiply some large angle in single precision (6-7 significant digits) and wrap it back to 360 ...
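One relevant building block: IEEE 754 `fmod` is computed exactly (the remainder is representable, so no rounding occurs), which means a wrap adds no error beyond what the input angle already carries. A double-precision sketch (`wrap_degrees` is an illustrative name, not the asker's API):

```python
import math

def wrap_degrees(angle: float) -> float:
    # math.fmod is exact per IEEE 754; the only error in the result is
    # whatever error the input angle already contained.
    a = math.fmod(angle, 360.0)
    return a + 360.0 if a < 0.0 else a
```

The harder part of the original question, recovering precision lost in the large product itself, needs extra care (e.g. extended precision for the multiply), which this sketch does not address.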
1
vote
1
answer
132
views
FFT of logarithmic input data
Is there a reasonably accurate method of computing an FFT of logarithmically-represented input data (with a sign bit, that is $±2^{\text{double-precision value}}$)?
The naive method (convert to linear ...
1
vote
1
answer
73
views
Floating-point rounding - bit patterns of values that are halfway between two possible results
I am working through the book Computer Systems: A Programmer's Perspective.
The authors explain that round-to-even rounding can be applied for values that are halfway between two possible results. For ...
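A concrete place to see such halfway bit patterns: odd integers just above $2^{53}$ sit exactly midway between two adjacent doubles, so converting them exercises round-to-even directly:

```python
# 2**53 + 1 is halfway between 2**53 and 2**53 + 2; the neighbour with the
# even significand is 2**53, so the tie rounds down. 2**53 + 3 is halfway
# between 2**53 + 2 and 2**53 + 4; there the even neighbour is above.
low = float(2**53 + 1)
high = float(2**53 + 3)
```

Ties thus break downward or upward depending only on which neighbour has an even last significand bit.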
0
votes
0
answers
481
views
Why is the XMM register width 128 bits while double-precision floating point is 64 bits?
As far as I know, XMM registers are used to store floating point values. But the widest floating point format in the IEEE 754 standard is double precision (64-bit). So why is the XMM register width 128 bits? Is another 64-...
0
votes
1
answer
42
views
How much larger is the next representable value if 2^59 is stored in a double?
This is an exam question I couldn't solve
If I store 2^59 as double, that would give me
1 * 2^58. Is the answer just 2? I.e. next value is 2^60??
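The gap is not 2: a double has 52 stored fraction bits, so the spacing between consecutive doubles at $2^{59}$ is $2^{59-52} = 128$. A quick check:

```python
import math

x = 2.0 ** 59
gap = math.nextafter(x, math.inf) - x   # distance to the next representable double
```

So the next representable value is $2^{59} + 128$, not $2^{60}$.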
2
votes
0
answers
53
views
Best way to constrain a complex number to being within the unit circle?
What is the best way of implementing the following function,
$$f(x) = \frac{x}{\max(1, |x|)},$$
where $x$ is complex, using a Cartesian representation of $x$ with IEEE 754 floating point ...
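A direct rendering of the formula, for comparison (a sketch; it makes no claim to be optimal for the asker's accuracy goals):

```python
def clamp_to_unit_disc(z: complex) -> complex:
    # f(z) = z / max(1, |z|). abs() on a complex uses hypot-style scaling,
    # which avoids spurious overflow in re*re + im*im for large components.
    m = abs(z)
    return z / m if m > 1.0 else z
```

Note that even with an exact modulus the final division rounds, so the result's magnitude can still sit one ulp off 1.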
3
votes
0
answers
533
views
What are the use cases for the IEEE 754 inexact flag?
The IEEE 754 standard for floating point numbers defines a flag that is set when a result from floating point calculation isn't exact, i.e. has to be rounded. What algorithms are there that utilize ...
4
votes
3
answers
3k
views
Is significand same as mantissa in IEEE754?
I'm trying to understand IEEE 754 floating point. When I try to convert 0.3 from decimal to binary with an online calculator, it says the significand value is ...
1
vote
1
answer
42
views
How does CPU determine Reserved Exponent cases?
Using the IEEE 754 algorithm, I assume that it can be implemented in a branchless way.
But how does the CPU determine the special cases (reserved exponent values), e.g. exponent 11111111 with significand 000000000...?
...
1
vote
1
answer
184
views
What does it mean unambiguously that a number is value 0 up to numerical precision?
I was reading that a quantity $x$ is $0$ up to numerical precision. What does this statement formally mean, especially in the context of numerical methods or real computers?
I looked it up on Google ...
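There is no single formal definition, but a common informal reading (spelled out here as an assumption, not a standard) is: $|x|$ is at or below the roundoff level of the quantities it was computed from, i.e. machine epsilon times their magnitude:

```python
import sys

def is_zero_up_to_precision(x: float, scale: float = 1.0) -> bool:
    # "Zero up to numerical precision" read as: |x| is no larger than the
    # rounding granularity (machine epsilon) at the working magnitude.
    return abs(x) <= sys.float_info.epsilon * abs(scale)

residual = (0.1 + 0.2) - 0.3   # about 5.6e-17: pure roundoff, "zero" in this sense
```

The `scale` argument matters: a residual of 1e-10 is "zero" for quantities of order 1e6 but emphatically not for quantities of order 1.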
1
vote
0
answers
40
views
Standard for representing a float scaled to a particular range?
TL;DR Is there a "standard" way to represent a float scaled to a particular range, such that we get maximum precision for the given bit depth, within that range?
I'll start with my general ...
0
votes
2
answers
93
views
(Numerical Analysis) What is the largest double float represented for the gamma function and $n!$
Consider that
\begin{align}
\Gamma(n+1) = n!
\end{align}
for all nonnegative integers $n$. I then have the following two questions:
What is the largest value of $n$ for which $Γ(n+1)$ and $n!$ can be exactly ...
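The exactness question can be brute-forced: Python's `math.factorial` is exact, and comparing an `int` to a `float` in Python is an exact comparison, so we can test directly which $n!$ survive conversion to a double unchanged:

```python
import math

# n! is exactly a double iff its odd part fits in 53 significand bits.
exact = [n for n in range(1, 30)
         if float(math.factorial(n)) == math.factorial(n)]
```

This yields every $n$ up to 22; the odd part of $23!$ already needs 56 bits, so it is the first factorial that rounds.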
3
votes
1
answer
86
views
Bisecting Intervals of floating point numbers containing 0 and infinity fairly
It is seldom considered that floating point numbers are not evenly distributed on the real number line. I've been working with interval arithmetic and noticed that when bisecting $[a,b]$ on the real number line ...
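One known trick along these lines (a sketch for non-negative doubles; `fair_bisect` is an illustrative name): average the IEEE 754 bit patterns instead of the values, since for non-negative floats the bit patterns are ordered the same way as the values and spaced one per representable number:

```python
import struct

def d2bits(x: float) -> int:
    return struct.unpack('<Q', struct.pack('<d', x))[0]

def bits2d(b: int) -> float:
    return struct.unpack('<d', struct.pack('<Q', b))[0]

def fair_bisect(a: float, b: float) -> float:
    # For 0 <= a <= b: the midpoint in bit-pattern space leaves (roughly)
    # as many representable doubles on each side of the split.
    return bits2d((d2bits(a) + d2bits(b)) // 2)
```

For example `fair_bisect(1.0, 4.0)` gives 2.0, a geometric-mean-like split; the arithmetic midpoint 2.5 would leave far more representable doubles in the left half than the right. Handling signed or infinite endpoints needs the usual sign-flip mapping on the bit patterns first.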
2
votes
1
answer
451
views
Can Radix Sort be modified for signed ints and/or floats?
A few months ago I learned about the magic that allows radix sort to run in O(n) time and space. Most tutorials on radix sort say it is useful for very large ...
0
votes
1
answer
81
views
IEEE 754 conversion
I'm trying to convert 3.2 into IEEE 754 format. We find that $3 = (11)_2$ and we also find that
$0.2*2=0.4 -0$
$0.4*2=0.8 -0$
$0.8*2=1.6 -1$
$0.6*2=1.2 -1$
and this cycle repeats, so $0.2 = (0.00110011\ldots)_2$
...
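Finishing the working: normalising $(11.00110011\ldots)_2$ gives $1.10011\ldots_2 \times 2^1$, so the sign bit is 0, the biased exponent is $127 + 1 = 128$, and the 23-bit fraction rounds up to `100 1100 1100 1100 1100 1101`. A `struct`-based check of that hand computation:

```python
import struct

# Reinterpret the binary32 encoding of 3.2 as its integer bit pattern:
# sign 0 | exponent 1000 0000 | fraction 100 1100 1100 1100 1100 1101
bits = struct.unpack('<I', struct.pack('<f', 3.2))[0]
```

The pattern assembles to 0x404CCCCD; the final 1 in the fraction comes from rounding the infinite repeating tail upward.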
4
votes
0
answers
167
views
Uniformly random decimal numbers
Due to finite precision of number representations, we face situations like:
In: 0.1+0.1+0.1==0.3
Out: False
(on my ...
1
vote
0
answers
56
views
(Branchless) Bitonic Sorting Network for a Set of Floating Point Numbers
In the past I've implemented a branchless Bitonic Sorting Network on a gpu using CUDA, for integers.
I am facing a related problem:
In my Order Independent Transparency implementation, I would like to ...
0
votes
1
answer
71
views
How can vector angle comparison between lattice points be done without using floating-points? (Convex Hull)
Let's say I have a point $(x_0, y_0)$, and some other points $(x_1, y_1), (x_2, y_2) ... (x_n, y_n)$, such that all of them are lattice points; all have integer coordinates. Let's further assume that ...
1
vote
2
answers
1k
views
How many integers can be represented in double-precision floating-point form?
How do you calculate the number of integers that can be represented in double-precision floating-point form?
1
vote
1
answer
415
views
Prove every number in single-precision 32-bit floating-point format can be represented in 64-bit format
Theorem: Prove every number in single-precision 32-bit floating-point format can be represented in double-precision 64-bit floating-point format.
64-bit format:
Attempt: Let $ b = b_0 ,...,b_{31} $ ...
1
vote
1
answer
49
views
Why does floating point become less accurate as the powers of 2 increase?
https://fabiensanglard.net/floating_point_visually_explained/
I was reading this article where the exponent and the mantissa are explained as the window and offset respectively. As the gap between ...
0
votes
0
answers
282
views
Is there a way to convert FLOPS to bit operation per second
My problem is the following: I have $N$ inner products to compute in parallel every second.
Each of the vectors in those inner product is composed of $7$ bits.
I want to know for which $N$ it starts ...
2
votes
1
answer
301
views
Unit conversion - Better to divide by an integer or multiply by a double?
I currently have a long timestamp measured in units of 100ns elapsed since January 1st, 1900. I need to convert it to milliseconds.
I have the choice of either ...
1
vote
2
answers
1k
views
Half precision floating point question -- smallest non-zero number
There's a floating point question that popped up and I'm confused about the solution. It states that
IEEE 754-2008 introduces half precision, which is a binary
floating-point representation that uses ...
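The excerpt is truncated, but the smallest positive half-precision value can be checked directly: `struct`'s `'e'` format is IEEE 754 binary16 (1 sign, 5 exponent, 10 fraction bits), and the smallest subnormal, bit pattern `0x0001`, equals $2^{-14} \cdot 2^{-10} = 2^{-24}$:

```python
import struct

# Decode the half-precision bit pattern 0x0001: the smallest positive
# (subnormal) binary16 value, 2**-14 * 2**-10 = 2**-24.
smallest_half = struct.unpack('<e', (1).to_bytes(2, 'little'))[0]
```

The smallest positive *normal* half value is larger, $2^{-14}$; confusing the two is a common source of wrong answers on this kind of exercise.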
1
vote
1
answer
723
views
Floating Point Arithmetic with a 3-bit Mantissa
Find all values of $x \in \mathbb{R}$ such that $x + 1 = 1$ in floating point arithmetic with a 3-bit mantissa.
How do we represent the number 1 in floating point arithmetic with a 3-bit mantissa, I wonder? After that, ...
3
votes
2
answers
199
views
Python versus Matlab on the quantity 1/0
Python and Matlab seem to disagree on division by 0.
Python:
...
1
vote
3
answers
5k
views
Negative Numbers in 32 bit Floating Point IEEE Numbers
So I understand the logic behind converting positive decimal numbers to IEEE 32-bit floating numbers, but I'm not completely sure about the negative ones. If for example we have a decimal number, say -...
1
vote
2
answers
558
views
Adding two numbers in base 2 (floating point) vs multiplying two numbers in base 2 (floating point)
Is it true that adding two numbers in base 2 is more complex than multiplying them? If so can someone please explain why this is the case?
1
vote
2
answers
153
views
Prove that $1^\text{nan} = 1.00$
I know that for most computations involving NaN (not a number) the result is NaN itself, except for some cases.
For example, $1^{\text{nan}} = 1.00$, which is proven by mathematicians to be true.
I tried to ...
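"Proven by mathematicians" overstates it; this is a design decision in IEEE 754: `pow(1, y)` is defined to return 1 for every `y`, NaN included, because the result does not depend on `y`, so nothing is lost by not propagating the NaN. Python follows the same rule:

```python
import math

nan = float("nan")
a = math.pow(1.0, nan)   # IEEE 754 pow(1, y) = 1 for any y, even NaN
b = 1.0 ** nan           # CPython's float ** applies the same special case
```

By contrast, most other operations with a NaN operand (e.g. `nan + 1`, `nan * 0`) do return NaN.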
2
votes
2
answers
31
views
Floating point bitwise comparator. If f1 and f2 are floating point numbers with the following properties can we always say f1 > f2?
Recall floating-point representation:
Suppose $f$ is a floating-point number then we can express f as,
If $f$ is normal:
$$(-1)^{s}\cdot2^{e-127}(1 + \sum\limits_{k=1}^{23} b_{23-k}\cdot 2^{-k})$$
If $...
0
votes
0
answers
122
views
Convert $8.75×10^{6}$ to IEEE-32 format?
There is a similar question already asked on this site, but it does not have an answer as to how the $10^x$ was converted into $2^y$. I know how to convert 8.75 or 875 into IEEE representation. But what about ...
1
vote
1
answer
608
views
What is the machine epsilon and number of mantissa bits for TI-83?
I am trying to determine how many bits the TI-83 Plus uses to store floating point numbers. I am using the algorithm for approximating the machine epsilon given in "Numerical Mathematics and ...
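For comparison, the kind of probe that textbook describes, run on an IEEE 754 double in Python (on the calculator the same loop would be entered in its own language, and a decimal machine would naturally halve in base 10 instead):

```python
# Halve eps until adding eps/2 to 1.0 no longer changes the sum; the last
# eps that did change it is the machine epsilon.
eps = 1.0
while 1.0 + eps / 2.0 > 1.0:
    eps /= 2.0
# For binary64 this leaves eps == 2**-52, i.e. 52 stored fraction bits
# (53 significand bits counting the implicit leading 1).
```

Comparing the eps the TI-83 reports against powers of 2 (or 10) is exactly how one infers its significand size.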