This project is archived and is in readonly mode.

#203 ✓resolved
Georges Racinet

OverflowError on a large object oid

Reported by Georges Racinet | April 1st, 2014 @ 08:49 AM

rel-2.5.2 os-linux
Debian 7, x86_64, libpq5 9.3.4-1.pgdg70+1 (latest version from apt.postgresql.org),
Python 2.7.3 (stock Debian) through a virtualenv

Hi, I'm getting an OverflowError about the OID that psycopg2 gave to me in the automatic creation sequence:

>>> from psycopg2 import connect
>>> cnx = connect("dbname=private_buildbot port=5434")
>>> lobj = cnx.lobject(0, 'wb')
>>> lobj.oid
2275696079L
>>> lobj.write('payload')
7
>>> oid = lobj.oid
>>> lobj.close()
>>> lobj_r = cnx.lobject(oid, 'rb')
Traceback (most recent call last):
  File "<console>", line 1, in <module>
OverflowError: signed integer is greater than maximum

The reproduction in systematic on this DB, but the problem does not occur on other ones (I suppose oids are smaller on them).
This is a continuous integration system, the cluster runs PostgreSQL 9.2.8, and has worked a lot. Large objects just work (psql with the 9.3 lib as well):

$ psql -p 5434 private_buildbot
psql (9.3.4, server 9.2.8)
Type "help" for help.

private_buildbot=> begin;
BEGIN
private_buildbot=> select lo_create(0);
 lo_create  
------------
 2275696090
(1 row)

private_buildbot=> select lo_open(2275696090, 131072);
 lo_open 
---------
       0
(1 row)

private_buildbot=> select lowrite(0, 'payload');
 lowrite 
---------
       7
(1 row)

private_buildbot=> select lo_close(0);
 lo_close 
----------
        0
(1 row)

private_buildbot=> select lo_open(2275696090, 131072);
 lo_open 
---------
       0
(1 row)

private_buildbot=> select loread(0, 10);
      loread      
------------------
 \x7061796c6f6164
(1 row)

I don't reproduce on the 9.3 cluster that runs on the same system, but that's most probably because OIDs are much smaller there:

$ psql -p 5435 private_buildbot
psql (9.3.4)
Type "help" for help.

private_buildbot=> select lo_create(0);
 lo_create 
-----------
   2644870
(1 row)

I would not bet on the versions to be that relevant.

Would simply checking the size of the current OID on the production system be enough to be safe ? Are they incremental, by the way ? The target application won't certainly have billions of large objects !

Is it worth the effort to pick random 32 bits positive number and impose them as OIDs of newly created large objects with a retry-if-busy pattern ?

Thanks for your help (and more generally for making psycopg2 in the first place)

Comments and changes to this ticket

  • Georges Racinet

    Georges Racinet April 2nd, 2014 @ 08:04 PM

    Disclaimer: I'm not familiar with C extensions, nor with libpq

    This excerpt from large_object.c

    static int
    lobject_init(PyObject *obj, PyObject *args, PyObject *kwds)
    {
        int oid = (int)InvalidOid, new_oid = (int)InvalidOid;
        const char *smode = "";
        const char *new_file = NULL;
        PyObject *conn = NULL;
    
        if (!PyArg_ParseTuple(args, "O!|iziz",
             &connectionType, &conn,
             &oid, &smode, &new_oid, &new_file))
            return -1;
    
        return lobject_setup((lobjectObject *)obj,
            (connectionObject *)conn, (Oid)oid, smode, (Oid)new_oid, new_file);
    }
    

    seems at least to contradict the earlier declaration as T_UINT

    static struct PyMemberDef lobjectObject_members[] = {
        {"oid", T_UINT, offsetof(lobjectObject, oid), READONLY,
            "The backend OID associated to this lobject."},
        {"mode", T_STRING, offsetof(lobjectObject, smode), READONLY,
            "Open mode."},
        {NULL}
    };
    

    and this, from postgres_ext.h:

    /*
     * Object ID is a fundamental type in Postgres.
     */
    typedef unsigned int Oid;
    

    so, shouldn't we replace this 'int' by 'unsigned' ?

  • Daniele Varrazzo

    Daniele Varrazzo April 2nd, 2014 @ 09:11 PM

    • State changed from “new” to “open”

    I'll take a look at that but I think you are on the right track. Can't remember much of the lobject api: can you choose your own oid or are they only assigned by the server? If we can it would be nice to add operations with a large oid in the test suite.

    You can make a test yourself if you want to try and fix it: PyArg_ParseTuple and family should use I instead of i to parse an unsigned int, Docs are here. And we should probably use the Oid type were we are using ints.

    Thank you very much!

  • Daniele Varrazzo

    Daniele Varrazzo April 3rd, 2014 @ 01:36 AM

    • State changed from “open” to “resolved”

    Fixed the object opening: can you please check if the rest of the operations work as expected? Thank you!

  • Georges Racinet

    Georges Racinet April 3rd, 2014 @ 08:39 AM

    It works for me, both on the affected system/cluster and on my development rig, thanks a lot !

    Our code does just uses open/read/write/unlink (in particular, no import/export), so I can't speak for the rest of the API.
    I'm wondering why there was this cast to int in the first place, so don't hesitate to tell if you'd like a test of more functions on that particular cluster.

    As an unrelated side note, we're using zc.buildout and are affected by what looks to be the same problem as #18 while trying to 'develop' (affects also the gp.vcsdevelop extension, not surprising since it uses buildout's develop() function).

  • Daniele Varrazzo

    Daniele Varrazzo April 3rd, 2014 @ 10:25 AM

    I've seen that error reported more recently in bug #192

    For me it's a bug in setuptools: it doesn't apply the setup.cfg. I've reported it here: https://github.com/pypa/pip/issues/1630

    Opening a ticket to try and work around that.

  • Daniele Varrazzo
  • Georges Racinet

    Georges Racinet April 3rd, 2014 @ 10:51 AM

    Great, I was considering building a custom 2.5.2.1-large-object-uid.tgz in the meanwhile, but I'll hold on.

Create your profile

Help contribute to this project by taking a few moments to create your personal profile. Create your profile ยป

<b>WARNING:</b> the informations in this tracker are archived. Please submit new tickets or comments to <a href="https://github.com/psycopg/psycopg2/issues">the new tracker</a>.
<br/>
Psycopg is the most used PostgreSQL adapter for the Python programming language. At the core it fully implements the Python DB API 2.0 specifications. Several extensions allow access to many of the features offered by PostgreSQL.

Shared Ticket Bins

Pages