View Issue Details
ID | Project | Category | View Status | Date Submitted | Last Update |
---|---|---|---|---|---|
0005337 | Taler | twister | public | 2018-06-01 16:39 | 2019-12-20 19:12 |
Reporter | Marcello Stanisci | Assigned To | Christian Grothoff | ||
Priority | normal | Severity | minor | Reproducibility | have not tried |
Status | closed | Resolution | fixed | ||
Product Version | git (master) | ||||
Target Version | 0.6 | Fixed in Version | 0.6 | ||
Summary | 0005337: Machine state not changed to "download started" | ||||
Description | If the twister is run under Valgrind, then it fails _always_ to set the machine state as REQUEST_STATE_DOWNLOAD_STARTED, making the first assert from curl_download_cb() fail. | ||||
Steps To Reproduce | it suffices to start the Web server which is found under src/test/ (twister repo), then start the twister under Valgrind (using the config file from the same directory), and finally issue a POST request to http://127.0.0.1:8888/ uploading a dummy JSON. | ||||
Additional Information | Logs from this kind of error. May 25 10:54:21-522105 twister-5532 ERROR Assertion failed at taler-twister-service.c:550. ==5532== ==5532== Process terminating with default action of signal 6 (SIGABRT) ==5532== at 0x5CB7E7B: raise (raise.c:51) ==5532== by 0x5CB9230: abort (abort.c:79) ==5532== by 0x5812918: GNUNET_abort_ (common_logging.c:284) ==5532== by 0x10AEF7: curl_download_cb (taler-twister-service.c:549) ==5532== by 0x5387291: ??? (in /usr/lib/x86_64-linux-gnu/libcurl-gnutls.so.4.4.0) ==5532== by 0x539F3E9: ??? (in /usr/lib/x86_64-linux-gnu/libcurl-gnutls.so.4.4.0) ==5532== by 0x53A96F5: ??? (in /usr/lib/x86_64-linux-gnu/libcurl-gnutls.so.4.4.0) ==5532== by 0x53AA330: curl_multi_perform (in /usr/lib/x86_64-linux-gnu/libcurl-gnutls.so.4.4.0) ==5532== by 0x10C3E5: curl_task_download (taler-twister-service.c:738) ==5532== by 0x58435DE: GNUNET_SCHEDULER_do_work (scheduler.c:2106) ==5532== by 0x584436F: select_loop (scheduler.c:2407) ==5532== by 0x584436F: GNUNET_SCHEDULER_run (scheduler.c:727) ==5532== by 0x584864C: GNUNET_SERVICE_run_ (service.c:1847) | ||||
Tags | No tags attached. | ||||
|
Commit c5a65df2f40125fdd7d35f298220 should fix this. The problem was that if the Web server responds before reading any POSTed data, then the curl download callback gets called _before_ the curl upload callback; moreover, it seems that the latter NEVER gets called in such scenarios. This way, no function ever set the state as "download started" and so the Twister got in an inconsistent state: download callback in action and state NOT as "download started". This fix just tolerates such scenarios by setting the state as "download started" INSIDE the download callback, printing a warning. The issue is left open for internal review. |
|
Did you push the commit? I don't see it in the master branch, and dev/marcello is untouched for 3 months. |
|
After several tries I can confirm that: if the problem is reproduced as in the description above, the libcurl's upload callback does NEVER get called (therefore no one sets the state as "download started"). On the other hand, if the scenario is the same but Valgrind is NOT used, then the upload callback goes first and sets the state as expected. Note, we already set two timeout options like CURLOPT_TIMEOUT and CURLOPT_CONNECTTIMEOUT to 600 seconds, so unless there is some other crazy timeout option to use, I'd say the problem is not even time-related. And yes, the mentioned commit above did get pushed on master. |
|
Additional issues: * if the download callback returns CURL_WRITEFUNC_PAUSE then the progress callback does NEVER get called (as the manpage suggests), therefore making it impossible to wake things up. * if the easy handle is instantiated using curl_easy_pause(handle, CURLPAUSE_RECV), then _no_ effect is produced: the download callback will just be called as if it were never get paused. Those issues arose because we do want to keep the state machine as it is, namely we do want the state to be set as "download started" when the download callback fires up. And to do so, we need to pause the download callback. |
|
If we returned CURL_WRITEFUNC_PAUSE, we need to call https://linux.die.net/man/3/curl_easy_pause to have the progress callback be called again later. |
|
Sure, and as of the docs, the best place to call curl_easy_pause is the progress function (that I was not able to get invoked); but now it is fixed (commit 26b8340), it just needed to be enabled by setting CURLOPT_NOPROGRESS to 0. But another problem arises: the progress function's argument 'dltotal' (= expected number of bytes to download) is passed as 0 always, and so it is not possible to understand when it's the right time to unpause the download callback from within the progress function.. The idea was to compare 'ultotal' and 'ulnow', and unpause the download callback whenever 'ulnow' reached 'ultotal'; but apparently this is not possible. |
|
I'm confused. You added the progress_cb -- but for what? We don't need it. |
|
I've pushed a possible fix in 26b8340..324f6db. Basically, the code now remembers that it suspended the curl download and resumes it if necessary. |
|
the progress_cb is told to be the best place to unpause what has been paused. See curl_easy_pause man page. |
|
26b8340..324f6db doesn't work. There are two problems with it: (1) once the download_cb fires up, the upload_cb doesn't get called _anymore_, even when the former goes to sleep, therefore no unpausing happens. (2) Your patch unpauses regardless of whether _all_ the bytes got uploaded first. |
|
Sorry, (2) doesn't apply on the upload_cb. |
|
Please let me know if this helps, or if there is another issue left. |
|
No it does not. As (1) says, the upload cb does never get called, therefore NO unpausing happens. We should find a way to tell libcurl "do run the upload cb first for all POSTs". |
|
Could you get a packet capture of the interaction that happens with wireshark and attach it to this bug report? I wonder if we're missing something, like a 100 continue or so. |
|
Wireshark only catches the first POST, and not the one between the Twister and the test webserver. Below you see the (nice) HTTP request when netcat is used as the test webserver. The Expect: header is something that I added now, and did NOT fix things. The request was generated with: $ curl -d '{}' -H 'Content-Type: application/json' http://localhost:8888/ marcello@laptop01:~$ nc -l -p 8080 POST / HTTP/1.1 Host: localhost:8888 User-Agent: curl/7.52.1 Accept: */* Content-Type: application/json Content-Length: 2 Expect: 100-continue {} |
|
2f69f626b0e47 fixes it. |
|
I don't think the proposed fix is sane, it is likely to create subtle problems and almost certainly does not fix the root cause. |
|
9ad1ebf8f6b126fdef9a0753305b77f30ebbfa3f fixes the "odd" behavior of the test web server. 51165a5b431fc5859afc81dba47833ad1e1f35d6 adjust the proxy to deal with odd behavior of test web server and (slightly) odd handling of such behavior by curl. The "odd" behavior was that the test Web server returned a (non-error) HTTP response before the upload was complete, causing curl to abandon the upload (but the proxy "insisting" that it should be completed). With the change, the Web server waits with the response until the upload is done, and the proxy _would_ abort uploads (and move towards the download phase) for Web servers that do return a response 'early'. |
Date Modified | Username | Field | Change |
---|---|---|---|
2018-06-01 16:39 | Marcello Stanisci | New Issue | |
2018-06-01 16:39 | Marcello Stanisci | Status | new => assigned |
2018-06-01 16:39 | Marcello Stanisci | Assigned To | => Marcello Stanisci |
2018-06-01 16:43 | Marcello Stanisci | Target Version | => 0.6 |
2018-06-01 16:49 | Marcello Stanisci | Note Added: 0012976 | |
2018-06-01 17:13 | Marcello Stanisci | Note Edited: 0012976 | |
2018-06-01 17:18 | Marcello Stanisci | Note Edited: 0012976 | |
2018-06-03 16:07 | Christian Grothoff | Note Added: 0012977 | |
2018-09-07 12:36 | Marcello Stanisci | Note Added: 0013228 | |
2018-09-17 22:25 | Marcello Stanisci | Note Added: 0013234 | |
2018-09-17 23:50 | Christian Grothoff | Note Added: 0013235 | |
2018-09-17 23:58 | Marcello Stanisci | Note Added: 0013236 | |
2018-09-18 12:22 | Christian Grothoff | Note Added: 0013237 | |
2018-09-18 12:22 | Christian Grothoff | Note Added: 0013238 | |
2018-09-18 12:25 | Marcello Stanisci | Note Added: 0013239 | |
2018-09-18 12:36 | Marcello Stanisci | Note Added: 0013240 | |
2018-09-18 12:37 | Marcello Stanisci | Note Added: 0013241 | |
2018-09-18 13:01 | Christian Grothoff | Note Added: 0013242 | |
2018-09-18 15:01 | Marcello Stanisci | Note Added: 0013243 | |
2018-09-18 17:42 | Christian Grothoff | Note Added: 0013244 | |
2018-09-18 18:16 | Marcello Stanisci | Note Added: 0013245 | |
2018-09-18 18:17 | Marcello Stanisci | Note Edited: 0013245 | |
2018-09-20 16:51 | Marcello Stanisci | Note Added: 0013246 | |
2018-09-20 16:52 | Marcello Stanisci | Status | assigned => resolved |
2018-09-20 16:52 | Marcello Stanisci | Resolution | open => fixed |
2018-09-20 17:14 | Christian Grothoff | Assigned To | Marcello Stanisci => Christian Grothoff |
2018-09-20 17:15 | Christian Grothoff | Status | resolved => assigned |
2018-09-20 17:15 | Christian Grothoff | Note Added: 0013247 | |
2018-09-20 20:12 | Christian Grothoff | Status | assigned => resolved |
2018-09-20 20:12 | Christian Grothoff | Note Added: 0013249 | |
2018-09-20 20:13 | Christian Grothoff | Product Version | => git (master) |
2018-09-20 20:13 | Christian Grothoff | Fixed in Version | => 0.6 |
2019-12-20 19:12 | Christian Grothoff | Status | resolved => closed |