Floats are not exactly preserved
Reported by Psycopg website | January 19th, 2013 @ 08:16 PM
Submitted by: Div Shekhar
Python float roundtrip to & from double precision seems to be losing precision.
Occurring on both Mac & Linux:
- OS X 10.8.2, Postgres.app 188.8.131.52, psycopg2 2.4.6 - Ubuntu 12.04 LTS, stock postgres (9.1.7), stock psycopg2 (2.4.5)
BTW, MySQL for Python seems to have the same so I adapted the test program from their bug report.
conn = psycopg2.connect(host="localhost", database="bindertestdb", user="bindertest", password="binderpassword")
before = 3.14159265358979323846264338327950288
cursor = conn.cursor()
cursor.execute('DROP TABLE IF EXISTS test')
cursor.execute('CREATE TABLE test (a DOUBLE PRECISION)')
cursor.execute('INSERT INTO test VALUES (%s)',(before,))
cursor.execute('SELECT a FROM test')
after = cursor.fetchall()
print after == before
print "%.36g" % after # bug -> 3.14159265358979000737349451810587198
print "before: %.20f" % before
print "before [m] %.20f" % float("%.15g" % before)
print "after: %.20f" % after
print "before : ",bin(struct.unpack('Q', struct.pack('d',
print "before [m] : ",bin(struct.unpack('Q', struct.pack('d', float("%.15g" % before))))
print "after : ",bin(struct.unpack('Q', struct.pack('d', after)))
before [m] 3.14159265358979000737
before : 0b100000000001001001000011111101101010100010001000010110100011000
before [m] : 0b100000000001001001000011111101101010100010001000010110100010001
after : 0b100000000001001001000011111101101010100010001000010110100010001
Comments and changes to this ticket
- State changed from new to invalid
Python doesn't store all these digits: python's float is only 64 bits.
before = 3.14159265358979323846264338327950288 >>> print before 3.141592653589793
Your own test shows it: printing more than 15 decimal digits only shows noise.
>>> "%.36g" % 3.14159265358979323846264338327950288 '3.14159265358979311599796346854418516' # aligned for comparison
Even if Python passed all the digits you want (which it cannot as they are stored nowhere) Postgres does its own clipping to the 64 float:
piro=> select '3.14159265358979323846264338327950288'::double precision; float8 ------------------ 3.14159265358979 (1 row)
and that's what is returned to psycopg.
If you want larger precision you will have to use the decimal data type both in Postgres and in Python.
I understand the significant digits limit, but I only care that 'before == after' is False
The test shows the 64-bit float binary value has changed:
before : ...11000
after : ...10001
Python float and postgresql double precision are both 64-bit floating point so I would expect the value to roundtrip correctly. Am I missing something here?
As shown above, Python and Postgres parse floats in different ways:
piro@risotto:~$ python Python 2.7.2+ (default, Jul 20 2012, 22:15:08) [GCC 4.6.1] on linux2 Type "help", "copyright", "credits" or "license" for more information. >>> print repr(3.14159265358979323846264338327950288) 3.141592653589793 piro@risotto:~$ psql psql (9.1.7) Type "help" for help. piro=> select '3.14159265358979323846264338327950288'::float8; float8 ------------------ 3.14159265358979 (1 row)
so, the two models, Postgres and Python, just don't match. They may match using binary communication protocol but I'm not sure about that, and psycopg doesn't support it yet anyway.
More in general, asking for exact match between floating point numbers is asking for troubles. Any robust application using floating point number should check that two numbers are close enough (i.e. abs(B-A) < epsilon), never equal (B == A). This is basic scientific computing.
See also relevant Postgres docs at http://www.postgresql.org/docs/9.2/static/datatype-numeric.html#DATATYPE-FLOAT:
""" The data types real and double precision are inexact, variable-precision numeric types. [...]
Inexact means that some values cannot be converted exactly to the internal format and are stored as approximations, so that storing and retrieving a value might show slight discrepancies. Managing these errors and how they propagate through calculations is the subject of an entire branch of mathematics and computer science and will not be discussed here, except for the following points:
If you require exact storage and calculations (such as for monetary amounts), use the numeric type instead. [...]
Comparing two floating-point values for equality might not always work as expected. """
Python has similar notes at http://docs.python.org/2/tutorial/floatingpoint.html, and a reference to an exhaustive article.
So, just don't expect an exact roundtrip as there's not an exact representation. If you need exact precision in storage, you must use decimal in the database. This would roundtrip as expected:
In : import psycopg2 In : cnn = psycopg2.connect('') In : cur = cnn.cursor() In : before = 3.14159265358979323846264338327950288 In : before Out: 3.141592653589793 In : cur.execute("select %s::decimal", [before,]) In : float(cur.fetchone()) 3.141592653589793
you can use this recipe from the FAQ to get Python float from Postgres decimals: http://initd.org/psycopg/docs/faq.html#faq-float.
Thanks for the detailed response, and - yes - I should not be doing exact compares on float.
BTW, I added a second roundtrip to the returned value and the float does NOT change again so there's no worry that the value will keep drifting.
---- add to end of the test:
cursor.execute('UPDATE test SET a=%s',(after,))
cursor.execute('SELECT a FROM test')
after2 = cursor.fetchall()
print after2 == after # True!
Create your profile
Help contribute to this project by taking a few moments to create your personal profile. Create your profile »
Psycopg is the most used PostgreSQL adapter for the Python programming language. At the core it fully implements the Python DB API 2.0 specifications. Several extensions allow access to many of the features offered by PostgreSQL.