View Issue Details
| ID | Project | Category | View Status | Date Submitted | Last Update |
|---|---|---|---|---|---|
| 0002962 | GNUnet | cadet service | public | 2013-07-22 20:14 | 2013-12-24 20:54 |
| Reporter | Florian Dold | Assigned To | Bart Polot | ||
| Priority | high | Severity | major | Reproducibility | always |
| Status | closed | Resolution | fixed | ||
| Product Version | Git master | ||||
| Target Version | 0.10.0 | Fixed in Version | 0.10.0 | ||
| Summary | 0002962: gnunet-set-profiler makes the mesh service trigger assertions in core and mesh | ||||
| Description | Running the set profiler with a relatively large number of non-common elements causes mesh to fail with assertions. Note that running gnunet-set-profiler requires a running peer with the set service started. | ||||
| Steps To Reproduce | $ gnunet-arm -s && gnunet-arm -i set $ gnunet-set-profiler -LINFO -A 100 -B 100 -C 100 | ||||
| Additional Information | [...] Jul 22 20:10:05-310633 mesh-29647 ERROR Assertion failed at core_api.c:1294. Jul 22 20:10:05-311606 mesh-29647 ERROR Assertion failed at gnunet-service-mesh.c:3514. Jul 22 20:10:05-311661 mesh-29647 ERROR Assertion failed at gnunet-service-mesh.c:3740. Jul 22 20:10:05-312461 mesh-29647 ERROR Assertion failed at gnunet-service-mesh.c:3514. Jul 22 20:10:05-312515 mesh-29647 ERROR Assertion failed at gnunet-service-mesh.c:3740. Jul 22 20:10:05-313643 mesh-29647 ERROR Assertion failed at gnunet-service-mesh.c:3514. Jul 22 20:10:05-313699 mesh-29647 ERROR Assertion failed at gnunet-service-mesh.c:3740. Jul 22 20:12:03-264894 mesh-29647 ERROR Assertion failed at gnunet-service-mesh.c:1112. | ||||
| Tags | No tags attached. | ||||
|
|
The lines did not correspond to assertions from the begining, and even now I can't reproduce the behavior. Does this still happen in svn HEAD? |
|
|
In r28256 I only get assertions from core, and messages are not delivered: Jul 23 00:15:43-978249 mesh-20519 ERROR Assertion failed at core_api.c:1294 |
|
|
I can't reproduce even this behavior. Can I ask you to: - change the GNUNET_break(0) to a GNUNET_abort() in core_api.c:1294 - make install in core and mesh - rerun the testcase, collect the coredump - post here a stacktrace |
|
|
Sure: (gdb) bt #0 0x00007f3091b98037 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56 #1 0x00007f3091b9b698 in __GI_abort () at abort.c:90 #2 0x00007f3092556439 in GNUNET_abort () at common_logging.c:293 #3 0x00007f309233dc2c in GNUNET_CORE_notify_transmit_ready (handle=<optimized out>, cork=cork@entry=0, priority=priority@entry=0, maxdelay=..., target=target@entry=0x7fff4c1b7890, notify_size=notify_size@entry=84, notify=notify@entry=0x411470 <queue_send>, notify_cls=notify_cls@entry=0x1ddea60) at core_api.c:1279 #4 0x00000000004116d1 in queue_send (cls=0x1ddea60, size=<optimized out>, buf=0x1de06cc) at gnunet-service-mesh.c:3734 #5 0x00007f309233cb33 in transmit_message (cls=<optimized out>, size=232, buf=0x1de0670) at core_api.c:648 #6 0x00007f309255277c in client_notify (cls=0x1de12d0, size=232, buf=0x1de0670) at client.c:1135 #7 0x00007f309255b62e in process_notify (connection=0x1dde360) at connection.c:1183 #8 transmit_ready (cls=0x1dde360, tc=<optimized out>) at connection.c:1314 #9 0x00007f309257dbf9 in run_ready (ws=0x1defb60, rs=0x1defad0) at scheduler.c:597 #10 GNUNET_SCHEDULER_run (task=task@entry=0x7f3092585aa0 <service_task>, task_cls=task_cls@entry=0x7fff4c1b7b90) at scheduler.c:786 #11 0x00007f30925888f6 in GNUNET_SERVICE_run (argc=argc@entry=3, argv=argv@entry=0x7fff4c1b7f18, service_name=service_name@entry=0x413221 "mesh", options=options@entry=GNUNET_SERVICE_OPTION_NONE, task=task@entry=0x4026c0 <run>, task_cls=task_cls@entry=0x0) at service.c:1816 #12 0x0000000000402509 in main (argc=3, argv=0x7fff4c1b7f18) at gnunet-service-mesh.c:5946 |
|
|
Hmmmmm... .this is really strange. This function (queue_send) is the one called by transmit_ready to fill the buffer, it cannot be "called twice" from there. I'll ask Christian tomorrow to see if there is some corner case hit here. |
|
|
Wait... core_api.c:1279!? That's plain impossible, the notify callback is hardcoded!! |
|
|
Sometimes gdb reports different line numbers than GNUNET_break, most probably due to optimizations. The assertion occurs in core_api.c:1294. |
|
|
It helps to export CFLAGS="-O0 $CFLAGS" to avoid the optimizations in gdb. |
|
|
Doh, should have done this at configure time, will report back once compilation is done ... However, I changed the GNUNET_abort to GNUNET_assert (0), and the line reported is 1294, but gdb still has 1279 in the stack trace. |
|
|
Now here's the stack trace without optimizations: #0 0x00007fc558519037 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56 56 ../nptl/sysdeps/unix/sysv/linux/raise.c: No such file or directory. (gdb) bt #0 0x00007fc558519037 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56 #1 0x00007fc55851c698 in __GI_abort () at abort.c:90 #2 0x00007fc558edb964 in GNUNET_abort () at common_logging.c:293 #3 0x00007fc558cc2456 in GNUNET_CORE_notify_transmit_ready (handle=0xcd9dd0, cork=0, priority=0, maxdelay=..., target=0x7fff21e6bc40, notify_size=84, notify=0x40bec5 <queue_send>, notify_cls=0xcdaa60) at core_api.c:1294 #4 0x000000000040cbc4 in queue_send (cls=0xcdaa60, size=140, buf=0xcdc33c) at gnunet-service-mesh.c:3734 #5 0x00007fc558cbebe1 in transmit_message (cls=0xcd9dd0, size=232, buf=0xcdc2e0) at core_api.c:648 #6 0x00007fc558ed9f89 in client_notify (cls=0xcda520, size=232, buf=0xcdc2e0) at client.c:1135 #7 0x00007fc558ee56e4 in process_notify (connection=0xcda360) at connection.c:1183 #8 0x00007fc558ee620b in transmit_ready (cls=0xcda360, tc=0x7fff21e6bf10) at connection.c:1314 #9 0x00007fc558f0c371 in run_ready (rs=0xcebad0, ws=0xcebb60) at scheduler.c:597 #10 0x00007fc558f0cbba in GNUNET_SCHEDULER_run (task=0x7fc558f1a2be <service_task>, task_cls=0x7fff21e6c180) at scheduler.c:786 #11 0x00007fc558f1be2e in GNUNET_SERVICE_run (argc=3, argv=0x7fff21e6c4f8, service_name=0x4176ba "mesh", options=GNUNET_SERVICE_OPTION_NONE, task=0x414e5a <run>, task_cls=0x0) at service.c:1816 #12 0x0000000000415b2a in main (argc=3, argv=0x7fff21e6c4f8) at gnunet-service-mesh.c:5946 |
|
|
The only change I could think of is the one in r28257. Does that help at all? |
|
|
Still getting the assertion :( Is there any more debug info that would help you? |
|
|
Not really. But you might try with 28258... |
|
|
r28258 did it for the core assertion! However, now I get the following assertions when running the set profiler with many non-common elements: ERROR Assertion failed at gnunet-service-mesh.c:3840 |
|
|
If you are running with reliable tunnels that shouldn't worry you, it's just about dropped traffic, which should be retransmitted. The message is ugly, though... I'll look into it tomorrow, as I have the feeling that there shouldn't be any dropped messages in a local test, but I'd have to see if I can reproduce it on the office desktop. |
|
|
what's the status on this? |
|
|
Oct 20 20:32:24-236468 mesh-16806 ERROR Assertion failed at gnunet-service-mesh.c:3111. Seems like I can still reproduce at least one assertion in SVN HEAD, by running e.g. $ gnunet-set-profiler -A 500 -B 100 -C 5 a few times (requires a running peer with set service started). Not every run triggers the assertion ... |
|
|
Somehow got another one: Oct 20 20:37:23-632411 mesh-16806 ERROR Assertion failed at gnunet-service-mesh.c:3099. |
|
|
Needs re-testing. |
|
|
Running set profiler finishes with a segfault in the mq code, cannot reproduce these. |
|
|
Program terminated with signal 11, Segmentation fault. #0 0x00007f8ca9798f24 in handle_client_set_error (cls=0x23c0600, error=GNUNET_MQ_ERROR_READ) at set_api.c:372 372 if (NULL != set->ops_head->result_cb) (gdb) bt #0 0x00007f8ca9798f24 in handle_client_set_error (cls=0x23c0600, error=GNUNET_MQ_ERROR_READ) at set_api.c:372 #1 0x00007f8ca99d814a in GNUNET_MQ_inject_error (mq=0x23e0870, error=GNUNET_MQ_ERROR_READ) at mq.c:220 #2 0x00007f8ca99d930a in connection_client_transmit_queued (cls=0x23e0870, size=0, buf=0x0) at mq.c:560 #3 0x00007f8ca99af5ac in client_notify (cls=0x23c1fe0, size=0, buf=0x0) at client.c:1120 #4 0x00007f8ca99bb774 in connect_error (cls=0x23e0720, tc=0x7fff43519f10) at connection.c:1232 #5 0x00007f8ca99e67eb in run_ready (rs=0x23be8d0, ws=0x23d44c0) at scheduler.c:593 #6 0x00007f8ca99e7079 in GNUNET_SCHEDULER_run (task=0x7f8ca99e16fc <program_main>, task_cls=0x7fff4351a1b0) at scheduler.c:808 #7 0x00007f8ca99e21be in GNUNET_PROGRAM_run2 (argc=7, argv=0x7fff4351a3d8, binaryName=0x4022e3 "gnunet-set-profiler", binaryHelp=0x4022de "help", options=0x4023a0 <options.11232>, task=0x401cfd <run>, task_cls=0x0, run_without_scheduler=0) at program.c:281 #8 0x00007f8ca99e22eb in GNUNET_PROGRAM_run (argc=7, argv=0x7fff4351a3d8, binaryName=0x4022e3 "gnunet-set-profiler", binaryHelp=0x4022de "help", options=0x4023a0 <options.11232>, task=0x401cfd <run>, task_cls=0x0) at program.c:317 #9 0x0000000000402117 in main (argc=7, argv=0x7fff4351a3d8) at gnunet-set-profiler.c:299 |
|
|
Should be fixed in r30849. |
|
|
Seems to no longer happen, probably was fixed along the way. |
| Date Modified | Username | Field | Change |
|---|---|---|---|
| 2013-07-22 20:14 | Florian Dold | New Issue | |
| 2013-07-22 20:14 | Florian Dold | Status | new => assigned |
| 2013-07-22 20:14 | Florian Dold | Assigned To | => Bart Polot |
| 2013-07-23 00:12 | Bart Polot | Note Added: 0007272 | |
| 2013-07-23 00:12 | Bart Polot | Status | assigned => feedback |
| 2013-07-23 00:17 | Florian Dold | Note Added: 0007273 | |
| 2013-07-23 00:17 | Florian Dold | Status | feedback => assigned |
| 2013-07-23 00:26 | Bart Polot | Note Added: 0007274 | |
| 2013-07-23 00:26 | Bart Polot | Status | assigned => feedback |
| 2013-07-23 00:32 | Florian Dold | Note Added: 0007275 | |
| 2013-07-23 00:32 | Florian Dold | Status | feedback => assigned |
| 2013-07-23 00:43 | Bart Polot | Note Added: 0007276 | |
| 2013-07-23 00:45 | Bart Polot | Note Added: 0007277 | |
| 2013-07-23 00:47 | Florian Dold | Note Added: 0007278 | |
| 2013-07-23 00:49 | Bart Polot | Note Added: 0007279 | |
| 2013-07-23 00:57 | Florian Dold | Note Added: 0007280 | |
| 2013-07-23 01:16 | Florian Dold | Note Added: 0007281 | |
| 2013-07-23 01:20 | Bart Polot | Note Added: 0007282 | |
| 2013-07-23 01:26 | Florian Dold | Note Added: 0007283 | |
| 2013-07-23 01:31 | Bart Polot | Note Added: 0007284 | |
| 2013-07-23 01:35 | Florian Dold | Note Added: 0007285 | |
| 2013-07-23 01:39 | Bart Polot | Note Added: 0007286 | |
| 2013-09-24 08:21 | Christian Grothoff | Product Version | => Git master |
| 2013-09-24 08:21 | Christian Grothoff | Target Version | => 0.10.0 |
| 2013-10-20 20:27 | Christian Grothoff | Note Added: 0007543 | |
| 2013-10-20 20:34 | Florian Dold | Note Added: 0007547 | |
| 2013-10-20 20:40 | Florian Dold | Note Added: 0007548 | |
| 2013-11-15 13:07 | Bart Polot | Note Added: 0007640 | |
| 2013-11-21 18:17 | Bart Polot | Note Added: 0007668 | |
| 2013-11-21 18:17 | Bart Polot | Status | assigned => feedback |
| 2013-11-21 18:19 | Bart Polot | Note Added: 0007669 | |
| 2013-11-22 18:42 | Florian Dold | Note Added: 0007677 | |
| 2013-11-22 18:42 | Florian Dold | Status | feedback => assigned |
| 2013-11-26 16:36 | Bart Polot | Note Added: 0007710 | |
| 2013-11-26 16:36 | Bart Polot | Status | assigned => resolved |
| 2013-11-26 16:36 | Bart Polot | Fixed in Version | => Git master |
| 2013-11-26 16:36 | Bart Polot | Resolution | open => fixed |
| 2013-12-08 23:56 | Christian Grothoff | Fixed in Version | Git master => 0.10.0 |
| 2013-12-24 20:54 | Christian Grothoff | Status | resolved => closed |
| 2014-05-09 18:34 | Christian Grothoff | Category | mesh service => cadet service |