View Issue Details

IDProjectCategoryView StatusLast Update
0005333Talertwisterpublic2018-06-03 14:33
ReporterMarcello StanisciAssigned ToMarcello Stanisci 
PrioritynormalSeverityminorReproducibilitysometimes
Status resolvedResolutionwon't fix 
Product VersionSVN HEAD 
Target Version0.6Fixed in Version0.6 
Summary0005333: Machine state not always changed.
DescriptionThere are two states that fail to be changed.

1. If the twister is run under Valgrind, then it fails _always_ to
    set the machine state as REQUEST_STATE_DOWNLOAD_STARTED, making
    the first assert from curl_download_cb() fail.

2. _Sometimes_, and NOT under Valgrind, the twister fails to set the
    machine state as REQUEST_STATE_UPLOAD_STARTED, making the first assert
    from curl_upload_cb() fail.
Steps To ReproduceTo reproduce 1., it suffices to start the Web server which is found under
src/test/ (twister repo), then start the twister under Valgrind (using the
config file from the same directory), and finally issue a POST request to
http://127.0.0.1:8888/ uploading a dummy JSON.

As said, 2. does appear in a sporadic way.
Additional InformationThis is the error thrown by Valgrind as for 1.

May 25 10:54:21-522105 twister-5532 ERROR Assertion failed at taler-twister-service.c:550.
==5532==
==5532== Process terminating with default action of signal 6 (SIGABRT)
==5532== at 0x5CB7E7B: raise (raise.c:51)
==5532== by 0x5CB9230: abort (abort.c:79)
==5532== by 0x5812918: GNUNET_abort_ (common_logging.c:284)
==5532== by 0x10AEF7: curl_download_cb (taler-twister-service.c:549)
==5532== by 0x5387291: ??? (in /usr/lib/x86_64-linux-gnu/libcurl-gnutls.so.4.4.0)
==5532== by 0x539F3E9: ??? (in /usr/lib/x86_64-linux-gnu/libcurl-gnutls.so.4.4.0)
==5532== by 0x53A96F5: ??? (in /usr/lib/x86_64-linux-gnu/libcurl-gnutls.so.4.4.0)
==5532== by 0x53AA330: curl_multi_perform (in /usr/lib/x86_64-linux-gnu/libcurl-gnutls.so.4.4.0)
==5532== by 0x10C3E5: curl_task_download (taler-twister-service.c:738)
==5532== by 0x58435DE: GNUNET_SCHEDULER_do_work (scheduler.c:2106)
==5532== by 0x584436F: select_loop (scheduler.c:2407)
==5532== by 0x584436F: GNUNET_SCHEDULER_run (scheduler.c:727)
==5532== by 0x584864C: GNUNET_SERVICE_run_ (service.c:1847)

These are the error logs given when 2. appears.

May 24 18:53:06-963367 test-merchant-api-twisted-11011 DEBUG Running command `create-proposal-0'
May 24 18:53:06-963583 taler-merchant-httpd-11040 WARNING Failed to parse JSON request body
May 24 18:53:06-963590 twister-11046 ERROR Assertion failed at taler-twister-service.c:596.
May 24 18:53:06-963633 twister-11046 ERROR Download curl failed: Operation was aborted by an application callback
May 24 18:53:23-014677 test-merchant-api-twisted-11011 WARNING Did NOT detect response as JSON
May 24 18:53:23-014705 test-merchant-api-twisted-11011 ERROR Assertion failed at testing_api_cmd_proposal.c:287.
TagsNo tags attached.

Activities

Marcello Stanisci

2018-05-25 20:33

manager   ~0012961

This error also used to appear in the past, lucky it popped up again:

May 25 20:26:52-382631 taler-exchange-httpd-13089 INFO Handling request for URL '/refresh/reveal'
May 25 20:26:52-382731 taler-exchange-httpd-13089 INFO Handling request for URL '/refresh/reveal'
May 25 20:26:52-382749 taler-exchange-httpd-13089 INFO Handling request for URL '/refresh/reveal'
May 25 20:26:52-382871 taler-exchange-httpd-13089 WARNING External protocol violation detected at json_helper.c:75.
May 25 20:26:52-383025 util-scheduler-13050 DEBUG Running task 0x5642bb31fb10
May 25 20:26:52-383064 test-exchange-api-new-twisted-13050 DEBUG Downloaded body: {"error":"parse error","field":"<no field>","line":0}
May 25 20:26:52-383080 test-exchange-api-new-twisted-13050 ERROR Unexpected response code 400 to command refresh-(flipped-)reveal
{"error": "parse error", "field": "<no field>", "line": 0}May 25 20:26:52-383118 test-exchange-api-new-twisted-13050 ERROR Failed at command `refresh-(flipped-)reveal'

Marcello Stanisci

2018-05-30 11:30

manager   ~0012971

Update as of the point number 1.

If the error reproduction is done by replacing the test
Web server with a netcat instance listening on port 8080,
then the error disappears entirely.

This is still far from resolving the issue, but it narrows
down what the main responsible actors are: Valgrind + the test Web server.

Marcello Stanisci

2018-05-30 11:48

manager   ~0012972

False. The error disappears when netcat does not
respond. If netcat is instructed to *respond* to the
connection (see snippet below), then the error appears again.

$ nc -l -p 8080 -c 'echo {"the":"body"}
echo
echo HTTP/1.0 200 OK'

Marcello Stanisci

2018-05-30 14:50

manager   ~0012973

Last edited: 2018-05-31 12:31

View 2 revisions

The take away from this situation as for the issue at bullet point 1 is:

if the server does NOT respond to the request, then the Valgrind-run twister
behaves well, otherwise the Valgrind-run twister fails just as it is reported.

EDIT:

if the test web server is patched to wait, say 10 secs, before sending the
response back , then the crash disappears. So this must have something to do with
"low level" TCP communication.

Marcello Stanisci

2018-06-01 14:07

manager   ~0012974

Stack trace of a download callback, this is just to point out
which functions at libcurl are responsible for calling it.

==5912== at 0x10AC17: curl_download_cb (taler-twister-service.c:461)
==5912== by 0x5380237: Curl_client_write (in /home/marko/.local/lib/libcurl.so.4.5.0)
==5912== by 0x53916CD: Curl_readwrite (in /home/marko/.local/lib/libcurl.so.4.5.0)
==5912== by 0x539A11D: multi_runsingle (in /home/marko/.local/lib/libcurl.so.4.5.0)
==5912== by 0x539AF78: curl_multi_perform (in /home/marko/.local/lib/libcurl.so.4.5.0)
==5912== by 0x10BBB5: curl_task_download (taler-twister-service.c:595)
==5912== by 0x58235DE: GNUNET_SCHEDULER_do_work (scheduler.c:2106)
==5912== by 0x582436F: select_loop (scheduler.c:2407)
==5912== by 0x582436F: GNUNET_SCHEDULER_run (scheduler.c:727)
==5912== by 0x582864C: GNUNET_SERVICE_run_ (service.c:1847)
==5912== by 0x10A3EA: main (taler-twister-service.c:2037)

Marcello Stanisci

2018-06-01 16:43

manager   ~0012975

Closing as all the individual reports in this bug
got their own new bug number now.

Issue History

Date Modified Username Field Change
2018-05-25 11:07 Marcello Stanisci New Issue
2018-05-25 11:07 Marcello Stanisci Status new => assigned
2018-05-25 11:07 Marcello Stanisci Assigned To => Marcello Stanisci
2018-05-25 11:07 Marcello Stanisci Target Version => 0.6
2018-05-25 20:33 Marcello Stanisci Note Added: 0012961
2018-05-30 11:30 Marcello Stanisci Note Added: 0012971
2018-05-30 11:48 Marcello Stanisci Note Added: 0012972
2018-05-30 14:50 Marcello Stanisci Note Added: 0012973
2018-05-31 12:31 Marcello Stanisci Note Edited: 0012973 View Revisions
2018-06-01 14:07 Marcello Stanisci Note Added: 0012974
2018-06-01 16:43 Marcello Stanisci Status assigned => closed
2018-06-01 16:43 Marcello Stanisci Resolution open => won't fix
2018-06-01 16:43 Marcello Stanisci Note Added: 0012975
2018-06-03 14:33 Christian Grothoff Status closed => resolved
2018-06-03 14:33 Christian Grothoff Product Version => SVN HEAD
2018-06-03 14:33 Christian Grothoff Fixed in Version => 0.6