This project is archived and is in readonly mode.

#179 open
Psycopg website

Mogrify performance on arrays

Reported by Psycopg website | August 28th, 2013 @ 02:18 PM

Submitted by: ChrisB

If I run cursor.mogrify on a 64000 double array it is taking 350ms. If I just do a 'ARRAY[' + ','.join(map(str, array)) + ']' and put it into a string format myself it only takes 55ms. Although the documentation tells you not to do this, it is bringing the insert time of my app down massively.

Comments and changes to this ticket

  • Daniele Varrazzo

    Daniele Varrazzo September 20th, 2013 @ 11:28 AM

    • State changed from “new” to “invalid”

    Thank you for inventing the SQL injection.

    Of course feel free to do it in your app if you know where the input is coming from, but don't even think to suggest it as a good idea.

  • Federico Di Gregorio

    Federico Di Gregorio September 20th, 2013 @ 12:35 PM

    • State changed from “invalid” to “open”
    • Tag cleared.
    • Assigned user set to “Federico Di Gregorio”

    Daniele, wait. He was talking about a simple benchmark and I can see where the bottleneck is. The mogrification of arrays is slow, because the arguments are checked one by one: we can probably make this faster by caching the adapter when the list is know to contain only objects of the same type. I'll investigate this.

  • Daniele Varrazzo

    Daniele Varrazzo May 5th, 2014 @ 11:26 PM

    I think the speedup you have in mind is not easy to implement; OTOH it can be generalized. The adaptation codepath is not straightforward as it depends on several choices: is the object __conform__, is there a prepare() to be called etc. We can cache a mapping class -> choices made and reuse it instead of exploring the adaptation tree for each object: this would speed up not only array adaptation but any operation such as execute(SQL, ["s1", "s2", "s3"]): we wouldn't have to check how to adapt the string object 3 times but only once.

    The cache should be short-lived as changing the adapters map should invalidate it and because the number of classes could be unbound (if they are created dynamically, e.g. namedtuples). But creating a new cache for every execute()/executemany() would be already a nice speedup and there should be no side effect (as there should be no program relying on a change of the adapters map during a concurrent execute()...).

Create your profile

Help contribute to this project by taking a few moments to create your personal profile. Create your profile ยป

<b>WARNING:</b> the informations in this tracker are archived. Please submit new tickets or comments to <a href="">the new tracker</a>.
Psycopg is the most used PostgreSQL adapter for the Python programming language. At the core it fully implements the Python DB API 2.0 specifications. Several extensions allow access to many of the features offered by PostgreSQL.

Shared Ticket Bins