The Go Programming Language Specification
Integer literals
An integer literal is a sequence of digits representing an integer constant.
Floating-point literals
A floating-point literal is a decimal representation of a floating-point constant. It has an integer part, a decimal point, a fractional part, and an exponent part. The integer and fractional part comprise decimal digits; the exponent part is an e or E followed by an optionally signed decimal exponent. One of the integer part or the fractional part may be elided; one of the decimal point or the exponent may be elided.
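For instance, each of the following is a valid floating-point literal under these rules (a small illustration of my own, not from the spec):

package main

import "fmt"

func main() {
	// All of these are valid floating-point literals per the rules above.
	a := 0.5   // integer part, decimal point, fractional part
	b := .5    // integer part elided
	c := 5.    // fractional part elided
	d := 5e-1  // decimal point elided, exponent present
	e := 5.0E2 // full form with an exponent
	fmt.Println(a, b, c, d, e) // 0.5 0.5 5 0.5 500
}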
Arithmetic operators
For two integer values x and y, the integer quotient q = x / y and remainder r = x % y satisfy the following relationships:
x = q*y + r and |r| < |y|, with x / y truncated towards zero.
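A quick sketch of those identities, including the negative cases (my own illustration, not from the spec):

package main

import "fmt"

func main() {
	pairs := [][2]int{{7, 3}, {-7, 3}, {7, -3}, {-7, -3}}
	for _, p := range pairs {
		x, y := p[0], p[1]
		q, r := x/y, x%y
		// x = q*y + r always holds, and |r| < |y|.
		fmt.Printf("x=%d y=%d q=%d r=%d q*y+r=%d\n", x, y, q, r, q*y+r)
	}
}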
You wrote, using integer literals and arithmetic (x / y truncates towards zero):
package main

import (
	"fmt"
	"strconv"
)

func main() {
	var num float64
	num = 5 / 3 // float64(int(5) / int(3))
	fmt.Printf("%v\n", num)
	numString := strconv.FormatFloat(num, 'f', -1, 64)
	fmt.Println(numString)
}
Playground: https://play.golang.org/p/PBqSbpHvuSL
Output:
1
1
You should write, using floating-point literals and arithmetic:
package main

import (
	"fmt"
	"strconv"
)

func main() {
	var num float64
	num = 5.0 / 3.0 // float64(float64(5.0) / float64(3.0))
	fmt.Printf("%v\n", num)
	numString := strconv.FormatFloat(num, 'f', -1, 64)
	fmt.Println(numString)
}
Playground: https://play.golang.org/p/Hp1nac358HK
Output:
1.6666666666666667
1.6666666666666667
Answer from peterSO on Stack Overflow

I know JavaScript has problems when numbers get big; the usual thing is that trailing digits are truncated to zero. I wondered what that looks like in Go, so I wrote a program:
see https://go.dev/play/p/2rbKFNiupQ_6
package main

import "fmt"

func main() {
	var v1 float64 = 1876219900889841660
	var v2 float64 = 1876219900889841661
	var v3 float64 = 1876219900889841662
	var v4 float64 = 1876219900889841663
	var v5 float64 = 1876219900889841664
	var v6 float64 = 1876219900889841665
	var v7 float64 = 1876219900889841666
	var v8 float64 = 1876219900889841667
	var v9 float64 = 1876219900889841668
	fmt.Printf("v1==v2: %v\n", v1 == v2) // true
	fmt.Printf("v2==v3: %v\n", v2 == v3) // true
	fmt.Printf("v3==v4: %v\n", v3 == v4) // true
	fmt.Printf("v4==v5: %v\n", v4 == v5) // true
	fmt.Printf("v5==v6: %v\n", v5 == v6) // true
	fmt.Printf("v6==v7: %v\n", v6 == v7) // true
	fmt.Printf("v7==v8: %v\n", v7 == v8) // true
	fmt.Printf("v8==v9: %v\n", v8 == v9) // true
	fmt.Printf("int64(v4): %d\n", int64(v4)) // 1876219900889841664
	fmt.Printf("int64(v9): %d\n", int64(v9)) // 1876219900889841664
	fmt.Printf("float64(v9): %.0f\n", v9)    // 1876219900889841664
}

Why are all the float64 values printed as 1876219900889841664? In JavaScript this is 1876219900889841700. Can anyone give an explanation, please? Thanks.
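For context: a float64 has a 53-bit significand, so between 2^60 and 2^61 (where these values live) adjacent representable values are 2^8 = 256 apart, and all nine literals round to the same nearest double. A sketch of my own showing the gap:

package main

import (
	"fmt"
	"math"
)

func main() {
	// Around 1.8e18 the gap between adjacent representable float64
	// values is 2^(60-52) = 256, so the nine literals above all round
	// to the same nearest double.
	var v float64 = 1876219900889841664
	next := math.Nextafter(v, math.Inf(1))
	fmt.Printf("%.0f\n", v)           // 1876219900889841664
	fmt.Printf("%.0f\n", next)        // 1876219900889841920
	fmt.Printf("gap: %.0f\n", next-v) // 256
}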
package main

import "fmt"

func main() {
	var x float64 = 5.7
	var y int = int(x)
	fmt.Println(y) // outputs "5"
}
Simply casting to an int truncates the float; if your system internally represents 2.0 as 1.9999999999, you will not get what you expect. The various printf conversions deal with this and properly round the number when converting. So, to get a more accurate value, the conversion is more complicated than you might first expect:
package main

import (
	"fmt"
	"strconv"
)

func main() {
	floats := []float64{1.9999, 2.0001, 2.0}
	for _, f := range floats {
		t := int(f)
		s := fmt.Sprintf("%.0f", f)
		if i, err := strconv.Atoi(s); err == nil {
			fmt.Println(f, t, i)
		} else {
			fmt.Println(f, t, err)
		}
	}
}
Code on Go Playground
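If rounding half away from zero is acceptable, newer Go releases (1.10+) also provide math.Round, which avoids the string round trip. A minimal sketch, not part of the original answer:

package main

import (
	"fmt"
	"math"
)

func main() {
	for _, f := range []float64{1.9999, 2.0001, 2.5, -2.5} {
		// math.Round rounds half away from zero; int() then truncates
		// the already-integral result exactly.
		fmt.Println(f, int(math.Round(f)))
	}
}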
Using math.Float32bits and math.Float64bits, you can see how Go represents the different decimal values as an IEEE 754 binary value:
Playground: https://play.golang.org/p/ZqzdCZLfvC
Result:
float32(0.1): 00111101110011001100110011001101
float32(0.2): 00111110010011001100110011001101
float32(0.3): 00111110100110011001100110011010
float64(0.1): 0011111110111001100110011001100110011001100110011001100110011010
float64(0.2): 0011111111001001100110011001100110011001100110011001100110011010
float64(0.3): 0011111111010011001100110011001100110011001100110011001100110011
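The listed patterns can be reproduced with something like the following; this is my reconstruction, not necessarily the code behind the playground link:

package main

import (
	"fmt"
	"math"
)

func main() {
	// %032b and %064b print the raw IEEE 754 bit patterns, zero-padded.
	fmt.Printf("float32(0.1): %032b\n", math.Float32bits(0.1))
	fmt.Printf("float32(0.2): %032b\n", math.Float32bits(0.2))
	fmt.Printf("float32(0.3): %032b\n", math.Float32bits(0.3))
	fmt.Printf("float64(0.1): %064b\n", math.Float64bits(0.1))
	fmt.Printf("float64(0.2): %064b\n", math.Float64bits(0.2))
	fmt.Printf("float64(0.3): %064b\n", math.Float64bits(0.3))
}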
If you convert these binary representations to decimal values and do your loop, you can see that for float32, the initial value of a will be:
0.20000000298023224
+ 0.10000000149011612
- 0.30000001192092896
= -7.4505806e-9
a negative value that can never sum up to 1.
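You can reproduce that hand calculation in Go by summing the exact float32 values in float64, so that intermediate float32 rounding doesn't hide the residual (a sketch of my own):

package main

import "fmt"

func main() {
	// Sum the exact float32 values of 0.2, 0.1, and 0.3 in float64,
	// mirroring the decimal calculation above.
	r := float64(float32(0.2)) + float64(float32(0.1)) - float64(float32(0.3))
	fmt.Println(r) // about -7.4505806e-09, matching the calculation above
}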
So, why does C behave differently?
If you look at the binary patterns (and know a little about how binary values are represented), you can see that Go rounds the last bit, while I assume C simply crops it instead.
So, in a sense, while neither Go nor C can represent 0.1 exactly in a float, Go uses the value closest to 0.1:
Go: 00111101110011001100110011001101 => 0.10000000149011612
C(?): 00111101110011001100110011001100 => 0.09999999403953552
Edit:
I posted a question about how C handles float constants, and from the answer it seems that any implementation of the C standard is allowed to do either. The implementation you tried it with just did it differently than Go.
I agree with ANisus that Go is doing the right thing. Concerning C, though, I'm not convinced by his guess.
The C standard does not dictate it, but most implementations of libc will convert the decimal representation to the nearest float (at least to comply with IEEE 754-2008 or ISO 10967), so I don't think this is the most probable explanation.
There are several reasons why the C program's behavior might differ... In particular, some intermediate computations might be performed with excess precision (double or long double).
The most probable cause I can think of is that you wrote 0.1 instead of 0.1f in C. In that case, you might have caused excess precision in the initialization (you sum float a + double 0.1 => the float is converted to double, then the result is converted back to float).
If I emulate these operations
float32(float32(float32(0.2) + float64(0.1)) - float64(0.3))
Then I find something near 1.1920929e-8f
After 27 iterations, this sums to 1.6f
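Written out in Go, that emulation looks like this (a sketch of my own, using float64 for the intermediates the way a C compiler with excess precision might):

package main

import "fmt"

func main() {
	// Emulate C float arithmetic where the compiler keeps intermediates
	// in double precision: float a = 0.2f; a += 0.1; a -= 0.3;
	a := float32(float64(float32(0.2)) + 0.1) // float + double 0.1, rounded back to float
	r := float32(float64(a) - 0.3)            // minus double 0.3, rounded back to float
	fmt.Println(r) // about 1.1920929e-08

	// Doubling that positive residual crosses 1.0 after 27 iterations, at 1.6.
	i := 0
	for r < 1.0 {
		r += r
		i++
	}
	fmt.Println(i, r) // 27 1.6
}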
From the documentation: it converts the float64 into a uint64 without changing the bits; it's the way the bits are interpreted that changes.
Here is the full source code of the Float64bits function:
func Float64bits(f float64) uint64 { return *(*uint64)(unsafe.Pointer(&f)) }
Don't be scared by that syntax trick of using an unsafe Pointer, it's quite common in Go's source code (avoids copying the data). So, that really is that simple: take the binary data of the given float and interpret it as an unsigned integer.
The reason it changes so much is the representation of floating-point numbers. According to the specification, a floating-point number is composed of a sign, an exponent, and a mantissa.
On a 64-bit float, there is 1 bit for the sign, 11 bits for the exponent, and 52 bits for the mantissa.
The representation of 4 as a 64-bit floating-point number is:
0b0100 0000 0001 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
SEEE EEEE EEEE MMMM MMMM MMMM MMMM MMMM MMMM MMMM MMMM MMMM MMMM MMMM MMMM MMMM
It turns out that this value is 4616189618054758400 if interpreted as an unsigned integer. You'll find plenty of great tutorials on the web regarding IEEE 754 to understand fully how the above value is a representation of 4.
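If you want to check that number and pull the three fields apart, something like this works (my own sketch, not from the answer):

package main

import (
	"fmt"
	"math"
)

func main() {
	bits := math.Float64bits(4)
	fmt.Println(bits) // 4616189618054758400

	sign := bits >> 63                 // 1 sign bit
	exponent := (bits >> 52) & 0x7FF   // 11 exponent bits, biased by 1023
	mantissa := bits & ((1 << 52) - 1) // 52 mantissa bits
	// sign 0, exponent 1025, mantissa 0: 1.0 * 2^(1025-1023) = 4
	fmt.Println(sign, exponent, mantissa) // 0 1025 0
}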
As the documentation says, the function just interprets the data that form the float as uint64.
An IEEE 754 double has this bit layout:
SEEEEEEE EEEEMMMM MMMMMMMM MMMMMMMM MMMMMMMM MMMMMMMM MMMMMMMM MMMMMMMM
These are 64 bits consisting of:
- S: one sign bit
- E: 11 exponent bits
- M: 52 mantissa bits
The value 4.0 equals this bit representation:
01000000 00010000 00000000 00000000 00000000 00000000 00000000 00000000
A detailed explanation of why it looks this way would be too lengthy. There are some special rules regarding the mantissa which play a key role here. We can ignore that for now; please see the linked doc if you are interested in all the dirty details of how numbers are represented in IEEE float format.
The above function does nothing other than treat these 64 bits as if they were a uint64. In the end, it is just casting a bunch of bits that happen to fit into a uint64. Hence, the resulting number is totally different from the float value.
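Note that the reinterpretation goes both ways: math.Float64frombits in the standard library performs the inverse cast (a quick sketch of my own):

package main

import (
	"fmt"
	"math"
)

func main() {
	f := 4.0
	u := math.Float64bits(f)     // the same 64 bits, read as a uint64
	g := math.Float64frombits(u) // and back again
	fmt.Println(u)      // 4616189618054758400
	fmt.Println(g)      // 4
	fmt.Println(f == g) // true
}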