View Issue Details
ID | Project | Category | View Status | Date Submitted | Last Update |
---|---|---|---|---|---|
0008452 | Taler | exchange | public | 2024-02-20 16:20 | 2024-03-07 20:47 |
Reporter | Florian Dold | Assigned To | Christian Grothoff | ||
Priority | normal | Severity | minor | Reproducibility | have not tried |
Status | closed | Resolution | fixed | ||
Target Version | 0.9.4 | Fixed in Version | 0.9.4 | ||
Summary | 0008452: deposit long-polling does not work | ||||
Description | Check out branch dev/dold/deposit-longpolling and run `taler-harness run-integrationtests deposit`. The test hangs in deposit long-polling, whereas the master branch, which currently uses no long-polling and just retries with back-off, does not hang. The aggregator is run (one-shot), but the request GET /deposits/$H_WIRE/$MERCHANT_PUB/$H_CONTRACT_TERMS/$COIN_PUB is stuck and does not return. | ||||
Tags | No tags attached. | ||||
child of | 0008413 | closed | Florian Dold | wallet-core does not use long polling for deposit tracking |
|
I did a visual inspection of the code, and all of the logic required seems to be there... |
|
I see *empty* log files for the aggregator when running your test, which either suggests it did NOT run OK, *or* you didn't log on INFO level, which means the logging is pretty useless. Could you please check that the aggregator is run at INFO for all of your test suites? |
|
Ah, looks like the test didn't even get to that point yet, let me look into it. |
|
Test fails with the latest exchange because now, due to x-taler-bank validation, the missing ?receiver-name= parameter of the payto:// URI in your test stops you early:

    2024-02-21T20:15:21.097Z harness.ts INFO running command taler-exchange-offline -c '/tmp/taler-integrationtests-7kI6Oc/deposit/exchange-testexchange-1.conf' enable-account payto://x-taler-bank/localhost/myexchange upload
    FATAL: test failed for exchange-offline Unexpected exit code 2: "taler-exchange-offline"
    2024-02-21T21:15:21.113731+0100 taler-exchange-offline-55636 ERROR payto URI is malformed: 'receiver-name' parameter missing
    CommandError: Unexpected exit code 2
        at ChildProcess.<anonymous> (/home/grothoff/lib/taler-harness/node_modules/taler-harness/dist/taler-harness-bundled.cjs:45588:11)
        at ChildProcess.emit (node:events:513:28)
        at ChildProcess._handle.onexit (node:internal/child_process:293:12) {
      logName: 'exchange-offline',
      command: 'taler-exchange-offline' |
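The failure in the note above comes from the exchange rejecting an x-taler-bank payto URI that lacks a receiver-name query parameter. A minimal sketch of such a check, written in Python purely for illustration: the helper name `has_receiver_name` and the example parameter value are assumptions, and this is not the exchange's actual validation code.

```python
from urllib.parse import urlsplit, parse_qs


def has_receiver_name(payto_uri: str) -> bool:
    """Return True if the payto URI carries a non-empty receiver-name parameter.

    Hypothetical helper for illustration; not the exchange's real validator.
    """
    parts = urlsplit(payto_uri)
    if parts.scheme != "payto":
        return False
    # parse_qs drops parameters with empty values by default.
    params = parse_qs(parts.query)
    return "receiver-name" in params


# URI from the failing test run: no receiver-name, so the check fails.
print(has_receiver_name("payto://x-taler-bank/localhost/myexchange"))
# Same account with the parameter added (the value is an example).
print(has_receiver_name(
    "payto://x-taler-bank/localhost/myexchange?receiver-name=Exchange"))
```

The sketch only tests for the presence of the parameter; whatever else the exchange checks on payto URIs is not modelled here.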
|
I've fixed the receiver-name issue on master and also in the dev/dold/deposit-longpolling branch (with a rebase / force-push). Here's the commit where deposit again fails, now due to the long-polled request not returning a response:

    commit 4aacad96d0d02293d2d665e99a9bdf1c4adaf421 (HEAD -> dev/dold/deposit-longpolling, origin/dev/dold/deposit-longpolling)
    Author: Florian Dold <florian@dold.me>
    Date:   Wed Feb 28 01:29:22 2024 +0100

        wallet-core: enable deposit long-polling |
|
Wrong condition, fixed in 3f0bfbeb..295be9df. |
|
The long-poller returns now, but it only returns after 30 seconds instead of immediately when the aggregator runs. |
|
I've added a -t run of taler-exchange-transfer after a -t run of taler-exchange-wire, but it still hangs for 30 seconds. |
|
9173d274..b8c5d140 fixes it by working around Postgres #18380. |
|
The following bug has been logged on the website:

Bug reference: 18380
Logged by: Christian Grothoff
Email address: grothoff@gnu.org
PostgreSQL version: 16.2
Operating system: Debian GNU/Linux
Description:

We tested with Postgres 14, 15, 16.1 and 16.2; same results.

Overview:
========

We use LISTEN + NOTIFY to implement some long polling. This usually works. However, we stop getting NOTIFY messages after the LISTENing client has issued a particular SELECT query with a LEFT JOIN. Disabling "enable_mergejoin" fixes the issue. We believe that the "enable_mergejoin" logic somehow corrupts the notify state (likely an obscure, rare memory corruption issue). The bug is 100% reproducible on several of our systems.

How to reproduce:
===============

* Install GNUnet 0.21.0 [or: git.gnunet.org/gnunet.git master]
* Install GNU Taler exchange, checkout tag 'postgres-bug-mergejoin' [from git.taler.net/exchange.git]
* Install taler-wallet-core [from git.taler.net/taler-wallet-core]
* run
  $ taler-harness run-integrationtests deposit

Buggy outcome
=============

If enable_mergejoin is ON, the test reports this at the end:

    parent: got result {"name":"deposit","timeSec":35.168,"status":"pass"}

Note that the 35s is due to all kinds of things in the test, none related to actual Postgres query performance. This is NOT a performance issue. We can set the long-poll threshold higher, and the time will go up accordingly.

Expected behavior with work-around
==============================

If enable_mergejoin is OFF, long-polling works and the test reports something like this at the end:

    parent: got result {"name":"deposit","timeSec":16.094,"status":"pass"}

This is basically the fastest the integration test will run (~15-17 seconds); the NOTIFY happens as expected.

Relevant SQL query break-down
==========================

The query that causes things to break is in exchange/src/exchangedb/pg_lookup_transfer_by_deposit.c. The buggy sub-clause is marked with #if BUG below:

    PREPARE (pg,
             "get_deposit_without_wtid",
             "SELECT"
             " bdep.wire_salt"
             ",wt.payto_uri"
             ",cdep.amount_with_fee"
             ",denom.fee_deposit"
             ",bdep.wire_deadline"
    #if BUG
             ",agt.legitimization_requirement_serial_id"
    #endif
             ",aml.status"
             ",aml.kyc_requirement"
             " FROM coin_deposits cdep"
             " JOIN batch_deposits bdep"
             "   USING (batch_deposit_serial_id)"
             " JOIN wire_targets wt"
             "   USING (wire_target_h_payto)"
             " JOIN known_coins kc"
             "   ON (kc.coin_pub = cdep.coin_pub)"
             " JOIN denominations denom"
             "   USING (denominations_serial)"
    #if BUG
             " LEFT JOIN aggregation_transient agt "
             "   ON ( (bdep.wire_target_h_payto = agt.wire_target_h_payto) AND"
             "        (bdep.merchant_pub = agt.merchant_pub) )"
    #endif
             " LEFT JOIN aml_status aml"
             "   ON (wt.wire_target_h_payto = aml.h_payto)"
             " WHERE cdep.coin_pub=$1"
             "   AND bdep.merchant_pub=$3"
             "   AND bdep.h_contract_terms=$2"
             " LIMIT 1;");

If we #define BUG to 0 for the code above, we do NOT have to disable merge joins to get the desired behavior. |
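The symptom described in the notes above, a long-polled request that returns only after its 30-second timeout rather than as soon as the aggregator has run, is what one would expect whenever the wake-up notification never reaches the waiting client. A minimal simulation in Python, with a thread event standing in for the NOTIFY; all names and the shortened timeouts are illustrative assumptions, and no Postgres or exchange code is involved:

```python
import threading
import time


def long_poll(event: threading.Event, timeout_s: float) -> tuple[bool, float]:
    """Block until `event` is set or the timeout expires.

    Returns (notified, elapsed_seconds).
    """
    start = time.monotonic()
    notified = event.wait(timeout_s)
    return notified, time.monotonic() - start


def run(deliver_notification: bool, timeout_s: float = 0.5) -> tuple[bool, float]:
    """Simulate one long-polled request, with or without a delivered wake-up."""
    event = threading.Event()
    if deliver_notification:
        # Stand-in for the aggregator finishing and the notification arriving.
        threading.Timer(0.05, event.set).start()
    return long_poll(event, timeout_s)


print(run(deliver_notification=True))   # returns shortly after the notification
print(run(deliver_notification=False))  # returns only once the timeout expires
```

In the second call the waiter still returns, just late, which matches the reported behavior that the test passes but takes about as long as the long-poll threshold.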
Date Modified | Username | Field | Change |
---|---|---|---|
2024-02-20 16:20 | Florian Dold | New Issue | |
2024-02-20 16:20 | Florian Dold | Status | new => assigned |
2024-02-20 16:20 | Florian Dold | Assigned To | => Christian Grothoff |
2024-02-20 16:20 | Florian Dold | Relationship added | child of 0008413 |
2024-02-20 16:21 | Florian Dold | Steps to Reproduce Updated | |
2024-02-21 20:59 | Christian Grothoff | Note Added: 0021421 | |
2024-02-21 21:42 | Christian Grothoff | Note Added: 0021430 | |
2024-02-21 21:44 | Christian Grothoff | Note Added: 0021431 | |
2024-02-21 21:45 | Christian Grothoff | Note Added: 0021432 | |
2024-02-28 01:32 | Florian Dold | Note Added: 0021541 | |
2024-02-29 00:57 | Christian Grothoff | Note Added: 0021573 | |
2024-02-29 00:57 | Christian Grothoff | Status | assigned => resolved |
2024-02-29 00:57 | Christian Grothoff | Resolution | open => fixed |
2024-02-29 00:57 | Christian Grothoff | Fixed in Version | => 0.9.4 |
2024-02-29 00:57 | Christian Grothoff | Target Version | 0.10 => 0.9.4 |
2024-03-06 08:51 | Florian Dold | Status | resolved => assigned |
2024-03-06 08:51 | Florian Dold | Note Added: 0021711 | |
2024-03-06 09:08 | Florian Dold | Note Added: 0021715 | |
2024-03-06 22:21 | Christian Grothoff | Status | assigned => resolved |
2024-03-06 22:21 | Christian Grothoff | Note Added: 0021737 | |
2024-03-06 22:32 | Christian Grothoff | Note Added: 0021738 | |
2024-03-07 20:47 | Christian Grothoff | Status | resolved => closed |