#145 Floats are not exactly preserved


This project is archived and is in readonly mode.

#145 ✓invalid


Reported by Psycopg website | January 19th, 2013 @ 08:16 PM

Submitted by: Div Shekhar

Python float roundtrip to & from double precision seems to be losing precision.

Occurring on both Mac & Linux:
- OS X 10.8.2, Postgres.app 9.2.2.0, psycopg2 2.4.6
- Ubuntu 12.04 LTS, stock postgres (9.1.7), stock psycopg2 (2.4.5)

BTW, MySQL for Python seems to have the same issue, so I adapted the test program from their bug report:

http://sourceforge.net/p/mysql-python/bugs/292/

------ floatbug.py:

import math
import psycopg2
import struct

conn = psycopg2.connect(host="localhost", database="bindertestdb", user="bindertest", password="binderpassword")

before = 3.14159265358979323846264338327950288

cursor = conn.cursor()
cursor.execute('DROP TABLE IF EXISTS test')
cursor.execute('CREATE TABLE test (a DOUBLE PRECISION)')
cursor.execute('INSERT INTO test VALUES (%s)',(before,))
cursor.execute('SELECT a FROM test')

after = cursor.fetchall()[0][0]

print after == before
print "%.36g" % after # bug -> 3.14159265358979000737349451810587198

print "before: %.20f" % before
print "before [m] %.20f" % float("%.15g" % before)
print "after: %.20f" % after

print "before : ",bin(struct.unpack('Q', struct.pack('d', before))[0])
print "before [m] : ",bin(struct.unpack('Q', struct.pack('d', float("%.15g" % before)))[0])
print "after : ",bin(struct.unpack('Q', struct.pack('d', after))[0])

---- Output:

False
3.14159265358979000737349451810587198
before: 3.14159265358979311600
before [m] 3.14159265358979000737
after: 3.14159265358979000737
before : 0b100000000001001001000011111101101010100010001000010110100011000
before [m] : 0b100000000001001001000011111101101010100010001000010110100010001
after : 0b100000000001001001000011111101101010100010001000010110100010001

Comments and changes to this ticket

Daniele Varrazzo January 21st, 2013 @ 01:02 PM
- State changed from “new” to “invalid”
Python doesn't store all these digits: Python's float is only 64 bits.
```
before = 3.14159265358979323846264338327950288

>>> print before
3.141592653589793
```
Your own test shows it: printing more than 15 decimal digits only shows noise.
```
>>> "%.36g" % 3.14159265358979323846264338327950288
             '3.14159265358979311599796346854418516' # aligned for comparison
```
Even if Python passed all the digits you want (which it cannot, as they are stored nowhere), Postgres does its own clipping to the 64-bit float:
```
piro=> select '3.14159265358979323846264338327950288'::double precision;
      float8      
------------------
 3.14159265358979
(1 row)
```
and that's what is returned to psycopg.
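The effect can be reproduced without a database. The following is a minimal sketch in Python 3, assuming the server renders the value with 15 significant digits as in the psql output above (the same model the test program uses on its `[m]` lines). The helper name `bits` is made up for illustration.

```python
import struct


def bits(value):
    """Return the 64-bit IEEE 754 pattern of a float as a binary string."""
    (as_int,) = struct.unpack("<Q", struct.pack("<d", value))
    return format(as_int, "064b")


before = 3.14159265358979323846264338327950288

# Assumption: the text sent back by the server carries 15 significant digits.
short = float("%.15g" % before)
# 17 significant digits are always enough to identify a 64-bit float.
full = float("%.17g" % before)

print(repr(before))     # 3.141592653589793
print(short == before)  # False
print(full == before)   # True
print(bits(before))
print(bits(short))
```

The two bit patterns differ only in the lowest bits of the mantissa, which matches the `before` and `after` lines in the reported output.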

If you want larger precision you will have to use the decimal data type both in Postgres and in Python.
Div Shekhar January 21st, 2013 @ 01:42 PM
I understand the significant digits limit, but I only care that 'before == after' is False

The test shows the 64-bit float binary value has changed:

before : ...11000
after : ...10001

Python float and postgresql double precision are both 64-bit floating point so I would expect the value to roundtrip correctly. Am I missing something here?
Daniele Varrazzo January 21st, 2013 @ 02:49 PM
As shown above, Python and Postgres parse floats in different ways:
```
piro@risotto:~$ python
Python 2.7.2+ (default, Jul 20 2012, 22:15:08) 
[GCC 4.6.1] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> print repr(3.14159265358979323846264338327950288)
3.141592653589793

piro@risotto:~$ psql
psql (9.1.7)
Type "help" for help.

piro=> select '3.14159265358979323846264338327950288'::float8;
      float8      
------------------
 3.14159265358979
(1 row)
```
So the two models, Postgres and Python, just don't match. They may match using the binary communication protocol, but I'm not sure about that, and psycopg doesn't support it yet anyway.

More generally, asking for an exact match between floating point numbers is asking for trouble. Any robust application using floating point numbers should check that two numbers are close enough (i.e. abs(B-A) < epsilon), never that they are equal (B == A). This is basic scientific computing.
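As a sketch of that advice in Python 3: `math.isclose` (available from Python 3.5) is one way to write the tolerance check. The tolerance values below are assumptions chosen for illustration, and `after` is modelled by a 15-digit text round trip rather than fetched from a database.

```python
import math

before = 3.141592653589793
# Stand-in for the value read back from the database in the ticket.
after = float("%.15g" % before)

exact_match = (after == before)
# Relative tolerance; 1e-12 is an arbitrary choice for this example.
close_match = math.isclose(after, before, rel_tol=1e-12)
# The explicit form mentioned above, with an absolute epsilon.
epsilon = 1e-12
abs_match = abs(after - before) < epsilon

print(exact_match, close_match, abs_match)  # False True True
```

A relative tolerance scales with the magnitude of the values, while an absolute epsilon only makes sense when the expected range of the numbers is known.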

See also relevant Postgres docs at http://www.postgresql.org/docs/9.2/static/datatype-numeric.html#DATATYPE-FLOAT:

""" The data types real and double precision are inexact, variable-precision numeric types. [...]

Inexact means that some values cannot be converted exactly to the internal format and are stored as approximations, so that storing and retrieving a value might show slight discrepancies. Managing these errors and how they propagate through calculations is the subject of an entire branch of mathematics and computer science and will not be discussed here, except for the following points:
- If you require exact storage and calculations (such as for monetary amounts), use the numeric type instead. [...]
- Comparing two floating-point values for equality might not always work as expected. """
Python has similar notes at http://docs.python.org/2/tutorial/floatingpoint.html, and a reference to an exhaustive article.

So, just don't expect an exact roundtrip as there's not an exact representation. If you need exact precision in storage, you must use decimal in the database. This would roundtrip as expected:
```
In [1]: import psycopg2

In [2]: cnn = psycopg2.connect('')

In [3]: cur = cnn.cursor()

In [4]: before = 3.14159265358979323846264338327950288

In [5]: before
Out[5]: 3.141592653589793

In [11]: cur.execute("select %s::decimal", [before,])

In [12]: float(cur.fetchone()[0])
3.141592653589793
```
You can use this recipe from the FAQ to get Python floats from Postgres decimals: http://initd.org/psycopg/docs/faq.html#faq-float.
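Why the decimal route round-trips can be illustrated with the standard library alone. This is a sketch in Python 3 that involves no database: it assumes the float is sent as its `repr()` text and that a decimal value keeps every digit of the text it is given, which is what the session above suggests.

```python
from decimal import Decimal

text = "3.14159265358979323846264338327950288"

# A Decimal built from text keeps every digit, unlike a 64-bit float.
as_decimal = Decimal(text)
print(str(as_decimal) == text)  # True

# A float holds only the nearest 64-bit value...
before = float(text)
# ...but repr() yields text that identifies that value exactly, so going
# through a decimal and back to float returns the same number.
via_decimal = Decimal(repr(before))
print(float(via_decimal) == before)  # True
```

The decimal does not recover the digits the float never stored; it only avoids losing any further digits on the way through.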
Div Shekhar January 28th, 2013 @ 12:36 PM
Agreed.

Thanks for the detailed response, and - yes - I should not be doing exact compares on float.

BTW, I added a second roundtrip to the returned value and the float does NOT change again so there's no worry that the value will keep drifting.

---- add to end of the test:

cursor.execute('UPDATE test SET a=%s',(after,))
cursor.execute('SELECT a FROM test')
after2 = cursor.fetchall()[0][0]
print after2 == after # True!
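The lack of drift can also be checked without a database. This is a sketch in Python 3 that again models one trip through the server as formatting with 15 significant digits; that model and the helper name `text_roundtrip` are assumptions for illustration.

```python
def text_roundtrip(value, digits=15):
    """Model one store-and-fetch cycle as a decimal text round trip."""
    return float(f"{value:.{digits}g}")


before = 3.14159265358979323846264338327950288
after = text_roundtrip(before)
after2 = text_roundtrip(after)

print(after == before)   # False: the first trip changes the low bits
print(after2 == after)   # True: further trips leave the value alone
```

A 64-bit float can represent any decimal of up to 15 significant digits closely enough to print the same 15 digits again, so after the first trip the value is stable.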


