This project is archived and is in readonly mode.

#223 ✓invalid
Psycopg website

Process Hangs

Reported by Psycopg website | July 17th, 2014 @ 06:04 PM

Submitted by: Brad

I am using:
Python 3.4.1
psycopg 2.5.3

I have an issue where I am making a lot of updates very fast (roughly 100 per second) on a single thread. Occasionally and intermittently the process hangs. By hanging, I mean the query was not executed and the process is unresponsive to any commands, but still open; I have to kill it manually (using the Linux kill command). When I check pg_stat_activity, I don't see the connection open or pending. I am using the latest available version of Postgres on AWS RDS.

Here is my execute code:
    def execute(self, query, params=None, fetch=False):
        self.logger.debug("starting _execute...")
        with psycopg2.connect(database = self.database,
                              user = self.user,
                              host = self.host,
                              password = self.password) as conn:
            with conn.cursor() as cur:
                cur.execute("SET TIME ZONE 'PDT8PST';")
                if params:
                    cur.execute(query, tuple(params))
                else:
                    cur.execute(query)
                if fetch:
                    return cur.fetchall()
        self.logger.debug("execute complete")

Comments and changes to this ticket

  • Brad Ruderman

    Brad Ruderman July 17th, 2014 @ 06:06 PM

    Reposting the code for formatting

      def _execute(self, query, params=None, fetch=False):
        self.logger.debug("starting _execute...")
        with psycopg2.connect(database = self.database,
                              user = self.user,
                              host = self.host,
                              password = self.password) as conn:
          with conn.cursor() as cur:
            cur.execute("SET TIME ZONE 'PDT8PST';") 
            if params:
              cur.execute(query,tuple(params))
            else:
              cur.execute(query)
            if fetch:
              return cur.fetchall()
        self.logger.debug("_execute complete")
    
  • Brad Ruderman

    Brad Ruderman July 17th, 2014 @ 07:33 PM

    [4420] pint_setup: init pint object at 0x7f31c51558b8, refcnt = 1
    [4420] pint_setup: good pint object at 0x7f31c51558b8, refcnt = 1
    [4420] microprotocol_getquoted: adapted to psycopg2._psycopg.Int
    [4420] pint_dealloc: deleted pint object at 0x7f31c51558b8, refcnt = 0
    [4420] microprotocols_adapt: trying to adapt int
    [4420] pint_setup: init pint object at 0x7f31c51558b8, refcnt = 1
    [4420] pint_setup: good pint object at 0x7f31c51558b8, refcnt = 1
    [4420] microprotocol_getquoted: adapted to psycopg2._psycopg.Int
    [4420] pint_dealloc: deleted pint object at 0x7f31c51558b8, refcnt = 0
    [4420] curs_execute: pg connection at 0x294cf80 OK
    [4420] pq_begin_locked: pgconn = 0x294cf80, autocommit = 0, status = 2
    [4420] pq_begin_locked: transaction in progress
    [4420] pq_execute: executing SYNC query: pgconn = 0x294cf80
    [4420]     UPDATE etl_mongo_oplog SET time = 1405625377, ordinal = 576 WHERE name = 'tracker'
    [4420] pq_execute: entering synchronous DBAPI compatibility mode
    [4420] pq_fetch: pgstatus = PGRES_COMMAND_OK
    [4420] pq_fetch: command returned OK (no tuples)
    [4420] _read_rowcount: PQcmdTuples = 1
    [4420] pq_fetch: fetching done; check for critical errors
    [4420] psyco_curs_execute: res = 1, pgres = (nil)
    [4420] psyco_curs_close: cursor at 0x7f31c6c656d8 closed
    [4420] pq_commit: pgconn = 0x294cf80, autocommit = 0, status = 2
    [4420] pq_execute_command_locked: pgconn = 0x294cf80, query = COMMIT
    [4420] conn_close: PQfinish called
    [4420] connection_dealloc: deleted connection object at 0x7f31c6987048, refcnt = 0
    [4420] cursor_dealloc: deleted cursor object at 0x7f31c6c656d8, refcnt = 0
    [4420] psyco_connect: dsn = '<%SENSITIVE DATA%>', async = 0
    [4420] connection_setup: init connection object at 0x7f31c6987048, async 0, refcnt = 1
    [4420] con_connect: connecting in SYNC mode
    
  • Daniele Varrazzo

    Daniele Varrazzo July 20th, 2014 @ 10:21 AM

    • State changed from “new” to “invalid”

    From what I can see here, it's the server not completing the connection. I tend to blame the AWS servers, not the client side.

    If you want to perform operations "super fast", don't open a new connection in the innermost loop: recycle connections using either the psycopg pool or PgBouncer. If you do so, the time zone can be set once per connection instead of once per query (it can also be set at connection time, with an environment variable or a connection parameter; see the docs).

    You are nowhere near the number of operations you can send to the database per second, but you have to keep the connection outside the loop.

    Not our bug as far as I can tell, but feel free to reopen if you find more evidence that it is.
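The advice above, reusing pooled connections and setting the time zone once per connection rather than once per query, can be sketched roughly as follows. The `PooledExecutor` name and structure are illustrative, not from the ticket; `psycopg2.pool.ThreadedConnectionPool`, `getconn()`/`putconn()`, and the libpq `options` connection parameter are real psycopg2/libpq features.

```python
# Rough sketch of the pooling advice: borrow connections from a pool
# instead of opening one per query. The class name and wiring are
# illustrative; `pool` can be any object with getconn()/putconn(),
# e.g. psycopg2.pool.ThreadedConnectionPool.
from contextlib import contextmanager


class PooledExecutor:
    def __init__(self, pool):
        self.pool = pool

    @contextmanager
    def _connection(self):
        # Borrow a connection, commit on success, always return it.
        conn = self.pool.getconn()
        try:
            yield conn
            conn.commit()
        except Exception:
            conn.rollback()
            raise
        finally:
            self.pool.putconn(conn)

    def execute(self, query, params=None, fetch=False):
        with self._connection() as conn:
            with conn.cursor() as cur:
                cur.execute(query, tuple(params) if params else None)
                if fetch:
                    return cur.fetchall()


# Real usage (not executed here) -- the time zone is set once per
# connection via a libpq option instead of once per query; names and
# credentials are placeholders:
#
#   import psycopg2.pool
#   pool = psycopg2.pool.ThreadedConnectionPool(
#       1, 10, database="mydb", user="me", host="db.example.com",
#       password="secret", options="-c TimeZone=US/Pacific")
#   executor = PooledExecutor(pool)
#   executor.execute("UPDATE t SET x = %s WHERE id = %s", (1, 2))
```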

  • Brad Ruderman

    Brad Ruderman July 20th, 2014 @ 02:16 PM

    I am not sure how you can claim the server is causing this error. The process hangs indefinitely. I have since implemented connection pooling, which seems to help, but I am not sure why the lack of it would cause this bug. Even when I include a connect timeout, which means the driver should give up after the timeout, the process still hangs indefinitely. There have been other reports of this in ticket #112. Not sure what other evidence I can provide. I did notice this line in the debug output:

    [4420] connection_dealloc: deleted connection object at 0x7f31c6987048, refcnt = 0

    Then 3 lines later:

    [4420] connection_setup: init connection object at 0x7f31c6987048, async 0, refcnt = 1

    Am I misunderstanding or is the driver deallocating the connection object from memory then re-referencing it later?

  • Daniele Varrazzo

    Daniele Varrazzo July 20th, 2014 @ 10:31 PM

    Ticket #112 you reference is about queries, not about connections.

    The driver is not referencing a deallocated object: it destroyed one and allocated a new one (when the conn name is re-bound in the loop), which happened to be created at the previous address, by chance or by optimization; the Python memory manager knows why.

    As for whether it's AWS or psycopg: try repeating your test against a regular Postgres server instead of AWS and see if you can reproduce the problem. If you can, and you provide a test showing the bug, I'll be happy to fix it. Note that the code you posted above is just a middle-man function that creates a connection and runs whatever query you pass it, not a complete test case.

    Thank you very much.

  • Brad Ruderman

    Brad Ruderman July 22nd, 2014 @ 12:36 AM

    Daniele-
    Why would the process still hang with a connect_timeout set? The process definitely hangs!

    Thanks!

  • Daniele Varrazzo

    Daniele Varrazzo July 22nd, 2014 @ 09:47 AM

    With AWS or with regular Postgres?

    You may be bumping into some quirk of AWS pooling or throttling that the libpq is not prepared to handle. Getting a C stack trace of a stuck process could be helpful.

  • Brad Ruderman

    Brad Ruderman July 22nd, 2014 @ 04:20 PM

    With AWS. How do I get the C stack trace? I have an strace of the issue.

  • Daniele Varrazzo

    Daniele Varrazzo July 24th, 2014 @ 10:41 AM

    You can attach gcc to the stuck process, or kill it generating a core.

    You can also try to make the process responsive again (killable at the Python level) by switching to green mode, as discussed in the last couple of messages on the mailing list (registering wait_select with set_wait_callback).

  • Daniele Varrazzo

    Daniele Varrazzo July 24th, 2014 @ 10:59 AM

    You can attach gcc to the stuck process

    sorry, "gdb"


WARNING: the information in this tracker is archived. Please submit new tickets or comments to the new tracker: https://github.com/psycopg/psycopg2/issues