This project is archived and is in readonly mode.
Possible multithreading problem in psycopg2.
Reported by Manu Cupcic | May 22nd, 2012 @ 02:46 PM
Hi,
I am facing a problem which I think is due to an issue in psycopg2.
The issue arise when running sqlalchemy queries in a multhithreaded environment. I hit a SIGABRT because of the following failed assertion :
[New Thread 0x47e82950 (LWP 13578)]
python: ../Objects/obmalloc.c:755: PyObject_Malloc: Assertion `bp != ((void *)0)' failed.
The first frames of the stacktrace are :
[#0](/projects/62710/tickets/0 "Ticket #0") 0x00007fbabe6aaed5 in raise () from /lib/libc.so.6
[#1](/projects/62710/tickets/1 "Ticket #1") 0x00007fbabe6ac3f3 in abort () from /lib/libc.so.6
[#2](/projects/62710/tickets/2 "Ticket #2") 0x00007fbabe6a3dc9 in __assert_fail () from /lib/libc.so.6
[#3](/projects/62710/tickets/3 "Ticket #3") 0x000000000045848e in PyObject_Malloc (nbytes=100) at ../Objects/obmalloc.c:755
[#4](/projects/62710/tickets/4 "Ticket #4") 0x0000000000459427 in _PyObject_DebugMalloc (nbytes=68) at ../Objects/obmalloc.c:1345
[#5](/projects/62710/tickets/5 "Ticket #5") 0x0000000000457cd3 in PyMem_Malloc (nbytes=68) at ../Objects/object.c:2015
[#6](/projects/62710/tickets/6 "Ticket #6") 0x00007fbab2f0a819 in psycopg_escape_string (obj=0x5e32b00,
from=0x6088ca4 "VH55YhzQi3H3xjHZfIQW8mQbQZ8cM8d4", len=32, to=0x0, tolen=0x45668a80)
at psycopg/utils.c:47
[#7](/projects/62710/tickets/7 "Ticket #7") 0x00007fbab2f1e563 in qstring_quote (self=0x558a160) at psycopg/adapter_qstring.c:82
[#8](/projects/62710/tickets/8 "Ticket #8") 0x00007fbab2f1e74d in qstring_getquoted (self=0x558a160, args=0x0)
at psycopg/adapter_qstring.c:112
[#9](/projects/62710/tickets/9 "Ticket #9") 0x000000000051e5ea in PyCFunction_Call (func=0x521c420, arg=0x7fbabf438060, kw=0x0)
at ../Objects/methodobject.c:82
[#10](/projects/62710/tickets/10 "Ticket #10") 0x000000000041e843 in PyObject_Call (func=0x521c420, arg=0x7fbabf438060, kw=0x0)
at ../Objects/abstract.c:1861
[#11](/projects/62710/tickets/11 "Ticket #11") 0x000000000041e99b in call_function_tail (callable=0x521c420, args=0x7fbabf438060)
at ../Objects/abstract.c:1892
[#12](/projects/62710/tickets/12 "Ticket #12") 0x000000000041f0cd in _PyObject_CallMethod_SizeT (o=0x558a160, name=0x7fbab2f2a049 "getquoted",
format=0x0) at ../Objects/abstract.c:2008
[#13](/projects/62710/tickets/13 "Ticket #13") 0x00007fbab2f1f65b in microprotocol_getquoted (obj=0x6088c70, conn=0x5e32b00)
at psycopg/microprotocols.c:240
[#14](/projects/62710/tickets/14 "Ticket #14") 0x00007fbab2f10ae7 in _mogrify (var=0x651db10, fmt=0x7fbaac1aaac0, curs=0x7fbaac405850,
new=0x45668df8) at psycopg/cursor_type.c:158
[#15](/projects/62710/tickets/15 "Ticket #15") 0x00007fbab2f11750 in _psyco_curs_execute (self=0x7fbaac405850, operation=0x7fbaac1aaac0,
vars=0x651db10, async=0) at psycopg/cursor_type.c:384
[#16](/projects/62710/tickets/16 "Ticket #16") 0x00007fbab2f11bdf in psyco_curs_execute (self=0x7fbaac405850, args=0x5cbb768, kwargs=0x0)
I then had a look at the code in both psycopg/adapter_qstring.c and psycopg/utils.c and noticed something I think is invalid :
In adapter_qstring.c, at line 81 :
Py_BEGIN_ALLOW_THREADS
buffer = psycopg_escape_string(self->conn, s, len, NULL, &qlen);
Py_END_ALLOW_THREADS
seems to indicate we are releasing the GIL to perform the string escaping. However, in utils.c, at line 46, I see :
if (to == NULL) {
to = (char *)PyMem_Malloc((len * 2 + 4) * sizeof(char));
...
}
I am not an expert at writing python C extensions, but I thought that you needed to keep hold of the GIL to perform operations on python datastructures. Moreover, a comment in pymem.h says "he GIL must be held when using these APIs [the PyMem_... functions]".
I think maybe holding the GIL when doing the malloc would fix my problem.
I am currently rerunning my failing program with a modified version of psycopg2 that doesn't release the GIL before escaping the string, and will update this thread with the results when I am sure I don't get the crash anymore.
I am aware that the fact I am the first one to experience this problem seems to point to an error on my part, but I can't find what I did wrong. Also, I seem to be able to fix the problems by keeping hold of the GIL a while longer.
Don't hesitate to ask if you need more information. I can also definitely create a patch and have you review it if you want.
Cheers,
Manu
rel-2.4.5
Comments and changes to this ticket
-
Daniele Varrazzo May 22nd, 2012 @ 03:15 PM
- State changed from new to open
I think you are right, thank you for the report.
At a first glance that seems the only place where we call psycopg_escape_string without GIL, but couldn't tell for other PyMem_Malloc.
The easiest fix is to drop the gil release around that call. The other option would be to use malloc instead, but psycopg_escape_string is used in several place and we should care to pair each use with the correct free(), which is a much more invasive change. Because the operation doesn't perform any I/O, I don't think releasing the GIL there is really worth, so I'd keep it simple.
Maybe the fact we have never hit this bug before is due to the time spent into that gil release being very negligible.
Fog, do you agree with this solution?
-
Manu Cupcic May 22nd, 2012 @ 03:40 PM
At a first glance that seems the only place where we call psycopg_escape_string without GIL, but couldn't tell for other PyMem_Malloc.
I came to the same conclusion (I am no expert though).
The easiest fix is to drop the gil release around that call. The other option would be to use malloc instead, but psycopg_escape_string is used in several place and we should care to pair each use with the correct free(), which is a much more invasive change. Because the operation doesn't perform any I/O, I don't think releasing the GIL there is really worth, so I'd keep it simple.
Agree with that too, not sure how long this call takes relative to an actual call to the database. Although the PQescapeStringConn function needs a pointer to a connection, so maybe it does something longish ?
Maybe the fact we have never hit this bug before is due to the time spent into that gil release being very negligible.
Yes. Possibly
Thanks for the really quick response
Manu
-
Federico Di Gregorio May 22nd, 2012 @ 03:56 PM
I'd avoid switching to malloc() because, at least in theory, PyMem_Malloc tells the interpreter how much memory pressure we're putting on it and that's a Good Thing. So, yes, removing the GIL release is the best thing, IMHO.
-
Daniele Varrazzo May 22nd, 2012 @ 04:26 PM
- State changed from open to resolved
Fixed in my devel. Thank you again, Manu.
-
Manu Cupcic May 23rd, 2012 @ 07:29 AM
Thanks,
I would just like to know if this is going to mean a new release soon, or if I should build the module from a patched fork for the time being.
Manu
-
Daniele Varrazzo May 23rd, 2012 @ 10:18 AM
This bug was terrible, I just wonder why it hasn't bit us before. So I'm for releasing soon, even if its longevity tells it's not hot urgent.
I try and run the full tests tonight or tomorrow at most. I don't think we have anything outstanding stopping us to release.
Create your profile
Help contribute to this project by taking a few moments to create your personal profile. Create your profile ยป
<b>WARNING:</b> the informations in this tracker are archived. Please submit new tickets or comments to <a href="https://github.com/psycopg/psycopg2/issues">the new tracker</a>.
<br/>
Psycopg is the most used PostgreSQL adapter for the Python programming language. At the core it fully implements the Python DB API 2.0 specifications. Several extensions allow access to many of the features offered by PostgreSQL.