View Issue Details

IDProjectCategoryView StatusLast Update
0004955Talerexchangepublic2017-10-18 15:42
ReporterMarcello StanisciAssigned ToChristian Grothoff 
PriorityurgentSeveritymajorReproducibilityhave not tried
Status closedResolutionfixed 
Product VersionSVN HEAD 
Target Version0.4Fixed in Version0.4 
Summary0004955: DB failure breaks 'test'.
DescriptionThe 'test' deployment fails always due to a DB error coming from
the exchange. It happens upon withdrawal, and it's not the first
time it pops up.

{
  "error": "internal error",
  "code": 1602,
  "hint": "Failure in database interaction"
}
Steps To ReproduceTry to withdraw from https://bank.test.taler.net/.
TagsNo tags attached.

Activities

Christian Grothoff

2017-03-14 15:10

manager   ~0011923

Actually relevant logs (before/after failure starts):

Mar 10 15:32:16-584184 taler-exchange-httpd-1449 INFO Handling request for URL '/keys'
Mar 10 16:11:14-470310 taler-exchange-httpd-1449 INFO Handling request for URL '/keys'
Mar 10 16:11:16-170098 taler-exchange-httpd-1449 INFO Handling request for URL '/wire'
Mar 10 16:11:16-195922 taler-exchange-httpd-1449 INFO Handling request for URL '/wire'
Mar 10 16:11:16-866038 taler-exchange-httpd-1449 INFO Handling request for URL '/admin/add/incoming'
Mar 10 16:11:16-866077 taler-exchange-httpd-1449 INFO Handling request for URL '/admin/add/incoming'
Mar 10 16:11:16-866088 taler-exchange-httpd-1449 INFO Handling request for URL '/admin/add/incoming'
Mar 10 16:11:16-866126 taler-exchange-httpd-1449 DEBUG Skipping signature check as master public key not given
Mar 10 16:11:16-866370 taler-exchange-httpd-1449 DEBUG Executing prepared SQL statement `reserve_get'
Mar 10 16:11:16-866615 taler-exchange-httpd-1449 DEBUG Reserve does not exist; creating a new one
Mar 10 16:11:16-866633 taler-exchange-httpd-1449 DEBUG Executing prepared SQL statement `reserve_create'
Mar 10 16:11:16-866903 taler-exchange-httpd-1449 DEBUG Executing prepared SQL statement `reserves_in_add_transaction'
Mar 10 16:11:17-065718 taler-exchange-httpd-1449 INFO Handling request for URL '/reserve/status'
Mar 10 16:11:17-065765 taler-exchange-httpd-1449 DEBUG Executing prepared SQL statement `reserves_in_get_transactions'
Mar 10 16:11:17-066119 taler-exchange-httpd-1449 DEBUG Executing prepared SQL statement `get_reserves_out'
Mar 10 16:11:31-084823 taler-exchange-httpd-1449 INFO Handling request for URL '/keys'
Mar 10 17:46:11-955034 taler-exchange-httpd-1449 INFO Handling request for URL '/keys'
Mar 10 17:46:13-575962 taler-exchange-httpd-1449 INFO Handling request for URL '/wire'
Mar 10 17:46:13-600633 taler-exchange-httpd-1449 INFO Handling request for URL '/wire'
Mar 10 17:46:14-323673 taler-exchange-httpd-1449 INFO Handling request for URL '/admin/add/incoming'
Mar 10 17:46:14-323717 taler-exchange-httpd-1449 INFO Handling request for URL '/admin/add/incoming'
Mar 10 17:46:14-323734 taler-exchange-httpd-1449 INFO Handling request for URL '/admin/add/incoming'
Mar 10 17:46:14-323774 taler-exchange-httpd-1449 DEBUG Skipping signature check as master public key not given
Mar 10 17:46:14-323815 taler-exchange-httpd-1449 ERROR Failed to start transaction:
Mar 10 17:46:14-323825 taler-exchange-httpd-1449 ERROR Assertion failed at plugin_exchangedb_postgres.c:1437.
Mar 10 17:46:14-323831 taler-exchange-httpd-1449 ERROR Assertion failed at plugin_exchangedb_postgres.c:1807.
Mar 10 17:46:14-323838 taler-exchange-httpd-1449 ERROR Assertion failed at taler-exchange-httpd_db.c:1833.
Mar 10 17:49:24-713446 taler-exchange-httpd-1449 INFO Handling request for URL '/keys'
Mar 10 18:14:57-727001 taler-exchange-httpd-1449 INFO Handling request for URL '/keys'
Mar 10 21:01:12-390177 taler-exchange-httpd-1449 INFO Handling request for URL '/keys'
Mar 10 21:01:14-276762 taler-exchange-httpd-1449 INFO Handling request for URL '/wire'
Mar 10 21:01:14-306117 taler-exchange-httpd-1449 INFO Handling request for URL '/wire'
Mar 10 21:01:15-300258 taler-exchange-httpd-1449 INFO Handling request for URL '/admin/add/incoming'
Mar 10 21:01:15-300301 taler-exchange-httpd-1449 INFO Handling request for URL '/admin/add/incoming'
Mar 10 21:01:15-300313 taler-exchange-httpd-1449 INFO Handling request for URL '/admin/add/incoming'
Mar 10 21:01:15-300351 taler-exchange-httpd-1449 DEBUG Skipping signature check as master public key not given
Mar 10 21:01:15-300365 taler-exchange-httpd-1449 ERROR Failed to start transaction:
Mar 10 21:01:15-300372 taler-exchange-httpd-1449 ERROR Assertion failed at plugin_exchangedb_postgres.c:1437.
Mar 10 21:01:15-300378 taler-exchange-httpd-1449 ERROR Assertion failed at plugin_exchangedb_postgres.c:1807.
Mar 10 21:01:15-300384 taler-exchange-httpd-1449 ERROR Assertion failed at taler-exchange-httpd_db.c:1833.

Christian Grothoff

2017-03-14 15:34

manager   ~0011924

Fixed the logging in 4b952b3..6d798ce, so we should now see something after 'Failed to start transaction:', which ought to help understand the issue.

Christian Grothoff

2017-03-14 15:55

manager   ~0011926

Need to wait until it happens again...

Christian Grothoff

2017-06-29 20:13

manager   ~0012285

Has this been observed recently?

Christian Grothoff

2017-07-04 23:45

manager   ~0012303

Here's the story about this long-standing exchange "database bug":

Basically, the exchange got a "connection down" error out of the blue,
and then all the transactions fail hard. Except, it wasn't out of the blue!

Reading the postgres logs, you can see that it happens exactly after
Postgres DATABASE shut down properly (and later restarted). Naturally,
if Postgres goes down, the client's connection goes down as well. And
we need to re-run the prepared statement logic go make it go back up.
I'm about to push changes that will make that happen, now that we
understand that it's not a client error causing the "fatal" error, but
the server disconnecting.

But why did the Postgres server restart? Did it crash? Is there another
problem? Great answer: nope, that is also perfectly normal as it
happened DURING an 'apt-get update'.

So whenever we did an 'apt-get upgrade' which touched Postgres (which
obviously happened very irregularly), the exchange DB would go down, and
all future transactions would fail.

With the change, the *NEXT* transaction will still fail hard (as PQ
doesn't detect that the socket died until we try to use it), but then
the exchange will 'heal' by closing the dead connection and re-opening
it from scratch.

Hence, IF the merchant/wallet re-try transactions (as they should), the
error will be invisible even to the one client that is hit by a Postgres
backend database restart.

Fixed with latest commits to exchange/merchant masters.

Issue History

Date Modified Username Field Change
2017-03-14 14:56 Marcello Stanisci New Issue
2017-03-14 14:56 Marcello Stanisci Status new => assigned
2017-03-14 14:56 Marcello Stanisci Assigned To => Christian Grothoff
2017-03-14 15:10 Christian Grothoff Note Added: 0011923
2017-03-14 15:34 Christian Grothoff Note Added: 0011924
2017-03-14 15:55 Christian Grothoff Status assigned => feedback
2017-03-14 15:55 Christian Grothoff Note Added: 0011926
2017-03-18 23:28 Christian Grothoff Priority normal => urgent
2017-03-18 23:28 Christian Grothoff Severity minor => major
2017-03-18 23:28 Christian Grothoff Product Version => SVN HEAD
2017-06-29 20:13 Christian Grothoff Note Added: 0012285
2017-07-04 23:45 Christian Grothoff Status feedback => resolved
2017-07-04 23:45 Christian Grothoff Resolution open => fixed
2017-07-04 23:45 Christian Grothoff Fixed in Version => 0.4
2017-07-04 23:45 Christian Grothoff Note Added: 0012303
2017-07-04 23:45 Christian Grothoff Target Version => 0.4
2017-10-18 15:42 Christian Grothoff Status resolved => closed