View Issue Details

IDProjectCategoryView StatusLast Update
0002962GNUnetcadet servicepublic2013-12-24 20:54
ReporterFlorian Dold Assigned ToBart Polot  
PriorityhighSeveritymajorReproducibilityalways
Status closedResolutionfixed 
Product VersionGit master 
Target Version0.10.0Fixed in Version0.10.0 
Summary0002962: gnunet-set-profiler makes the mesh service trigger assertions in core and mesh
DescriptionRunning the set profiler with a relatively large number of non-common elements causes mesh to fail with assertions.

Note that running gnunet-set-profiler requires a running peer with the set service started.
Steps To Reproduce$ gnunet-arm -s && gnunet-arm -i set
$ gnunet-set-profiler -LINFO -A 100 -B 100 -C 100
Additional Information[...]
Jul 22 20:10:05-310633 mesh-29647 ERROR Assertion failed at core_api.c:1294.
Jul 22 20:10:05-311606 mesh-29647 ERROR Assertion failed at gnunet-service-mesh.c:3514.
Jul 22 20:10:05-311661 mesh-29647 ERROR Assertion failed at gnunet-service-mesh.c:3740.
Jul 22 20:10:05-312461 mesh-29647 ERROR Assertion failed at gnunet-service-mesh.c:3514.
Jul 22 20:10:05-312515 mesh-29647 ERROR Assertion failed at gnunet-service-mesh.c:3740.
Jul 22 20:10:05-313643 mesh-29647 ERROR Assertion failed at gnunet-service-mesh.c:3514.
Jul 22 20:10:05-313699 mesh-29647 ERROR Assertion failed at gnunet-service-mesh.c:3740.
Jul 22 20:12:03-264894 mesh-29647 ERROR Assertion failed at gnunet-service-mesh.c:1112.
TagsNo tags attached.

Activities

Bart Polot

2013-07-23 00:12

reporter   ~0007272

The lines did not correspond to assertions from the begining, and even now I can't reproduce the behavior. Does this still happen in svn HEAD?

Florian Dold

2013-07-23 00:17

developer   ~0007273

In r28256 I only get assertions from core, and messages are not delivered:

Jul 23 00:15:43-978249 mesh-20519 ERROR Assertion failed at core_api.c:1294

Bart Polot

2013-07-23 00:26

reporter   ~0007274

I can't reproduce even this behavior. Can I ask you to:
- change the GNUNET_break(0) to a GNUNET_abort() in core_api.c:1294
- make install in core and mesh
- rerun the testcase, collect the coredump
- post here a stacktrace

Florian Dold

2013-07-23 00:32

developer   ~0007275

Sure:
(gdb) bt
#0 0x00007f3091b98037 in __GI_raise (sig=sig@entry=6)
    at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
#1 0x00007f3091b9b698 in __GI_abort () at abort.c:90
#2 0x00007f3092556439 in GNUNET_abort () at common_logging.c:293
#3 0x00007f309233dc2c in GNUNET_CORE_notify_transmit_ready (handle=<optimized out>,
    cork=cork@entry=0, priority=priority@entry=0, maxdelay=...,
    target=target@entry=0x7fff4c1b7890, notify_size=notify_size@entry=84,
    notify=notify@entry=0x411470 <queue_send>, notify_cls=notify_cls@entry=0x1ddea60)
    at core_api.c:1279
#4 0x00000000004116d1 in queue_send (cls=0x1ddea60, size=<optimized out>, buf=0x1de06cc)
    at gnunet-service-mesh.c:3734
#5 0x00007f309233cb33 in transmit_message (cls=<optimized out>, size=232, buf=0x1de0670)
    at core_api.c:648
#6 0x00007f309255277c in client_notify (cls=0x1de12d0, size=232, buf=0x1de0670) at client.c:1135
#7 0x00007f309255b62e in process_notify (connection=0x1dde360) at connection.c:1183
#8 transmit_ready (cls=0x1dde360, tc=<optimized out>) at connection.c:1314
#9 0x00007f309257dbf9 in run_ready (ws=0x1defb60, rs=0x1defad0) at scheduler.c:597
#10 GNUNET_SCHEDULER_run (task=task@entry=0x7f3092585aa0 <service_task>,
    task_cls=task_cls@entry=0x7fff4c1b7b90) at scheduler.c:786
#11 0x00007f30925888f6 in GNUNET_SERVICE_run (argc=argc@entry=3, argv=argv@entry=0x7fff4c1b7f18,
    service_name=service_name@entry=0x413221 "mesh",
    options=options@entry=GNUNET_SERVICE_OPTION_NONE, task=task@entry=0x4026c0 <run>,
    task_cls=task_cls@entry=0x0) at service.c:1816
#12 0x0000000000402509 in main (argc=3, argv=0x7fff4c1b7f18) at gnunet-service-mesh.c:5946

Bart Polot

2013-07-23 00:43

reporter   ~0007276

Hmmmmm... .this is really strange. This function (queue_send) is the one called by transmit_ready to fill the buffer, it cannot be "called twice" from there.

I'll ask Christian tomorrow to see if there is some corner case hit here.

Bart Polot

2013-07-23 00:45

reporter   ~0007277

Wait... core_api.c:1279!?

That's plain impossible, the notify callback is hardcoded!!

Florian Dold

2013-07-23 00:47

developer   ~0007278

Sometimes gdb reports different line numbers than GNUNET_break, most probably due to optimizations. The assertion occurs in core_api.c:1294.

Bart Polot

2013-07-23 00:49

reporter   ~0007279

It helps to export
CFLAGS="-O0 $CFLAGS"
to avoid the optimizations in gdb.

Florian Dold

2013-07-23 00:57

developer   ~0007280

Doh, should have done this at configure time, will report back once compilation is done ...

However, I changed the GNUNET_abort to GNUNET_assert (0), and the line reported is 1294, but gdb still has 1279 in the stack trace.

Florian Dold

2013-07-23 01:16

developer   ~0007281

Now here's the stack trace without optimizations:

#0 0x00007fc558519037 in __GI_raise (sig=sig@entry=6)
    at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
56 ../nptl/sysdeps/unix/sysv/linux/raise.c: No such file or directory.
(gdb) bt
#0 0x00007fc558519037 in __GI_raise (sig=sig@entry=6)
    at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
#1 0x00007fc55851c698 in __GI_abort () at abort.c:90
#2 0x00007fc558edb964 in GNUNET_abort () at common_logging.c:293
#3 0x00007fc558cc2456 in GNUNET_CORE_notify_transmit_ready (handle=0xcd9dd0, cork=0, priority=0,
    maxdelay=..., target=0x7fff21e6bc40, notify_size=84, notify=0x40bec5 <queue_send>,
    notify_cls=0xcdaa60) at core_api.c:1294
#4 0x000000000040cbc4 in queue_send (cls=0xcdaa60, size=140, buf=0xcdc33c)
    at gnunet-service-mesh.c:3734
#5 0x00007fc558cbebe1 in transmit_message (cls=0xcd9dd0, size=232, buf=0xcdc2e0) at core_api.c:648
#6 0x00007fc558ed9f89 in client_notify (cls=0xcda520, size=232, buf=0xcdc2e0) at client.c:1135
#7 0x00007fc558ee56e4 in process_notify (connection=0xcda360) at connection.c:1183
#8 0x00007fc558ee620b in transmit_ready (cls=0xcda360, tc=0x7fff21e6bf10) at connection.c:1314
#9 0x00007fc558f0c371 in run_ready (rs=0xcebad0, ws=0xcebb60) at scheduler.c:597
#10 0x00007fc558f0cbba in GNUNET_SCHEDULER_run (task=0x7fc558f1a2be <service_task>,
    task_cls=0x7fff21e6c180) at scheduler.c:786
#11 0x00007fc558f1be2e in GNUNET_SERVICE_run (argc=3, argv=0x7fff21e6c4f8,
    service_name=0x4176ba "mesh", options=GNUNET_SERVICE_OPTION_NONE, task=0x414e5a <run>,
    task_cls=0x0) at service.c:1816
#12 0x0000000000415b2a in main (argc=3, argv=0x7fff21e6c4f8) at gnunet-service-mesh.c:5946

Bart Polot

2013-07-23 01:20

reporter   ~0007282

The only change I could think of is the one in r28257. Does that help at all?

Florian Dold

2013-07-23 01:26

developer   ~0007283

Still getting the assertion :(

Is there any more debug info that would help you?

Bart Polot

2013-07-23 01:31

reporter   ~0007284

Not really. But you might try with 28258...

Florian Dold

2013-07-23 01:35

developer   ~0007285

r28258 did it for the core assertion!

However, now I get the following assertions when running the set profiler with many non-common elements:

ERROR Assertion failed at gnunet-service-mesh.c:3840

Bart Polot

2013-07-23 01:39

reporter   ~0007286

If you are running with reliable tunnels that shouldn't worry you, it's just about dropped traffic, which should be retransmitted. The message is ugly, though...

I'll look into it tomorrow, as I have the feeling that there shouldn't be any dropped messages in a local test, but I'd have to see if I can reproduce it on the office desktop.

Christian Grothoff

2013-10-20 20:27

manager   ~0007543

what's the status on this?

Florian Dold

2013-10-20 20:34

developer   ~0007547

Oct 20 20:32:24-236468 mesh-16806 ERROR Assertion failed at gnunet-service-mesh.c:3111.

Seems like I can still reproduce at least one assertion in SVN HEAD, by running e.g.
$ gnunet-set-profiler -A 500 -B 100 -C 5
a few times (requires a running peer with set service started).
Not every run triggers the assertion ...

Florian Dold

2013-10-20 20:40

developer   ~0007548

Somehow got another one:
Oct 20 20:37:23-632411 mesh-16806 ERROR Assertion failed at gnunet-service-mesh.c:3099.

Bart Polot

2013-11-15 13:07

reporter   ~0007640

Needs re-testing.

Bart Polot

2013-11-21 18:17

reporter   ~0007668

Running set profiler finishes with a segfault in the mq code, cannot reproduce these.

Bart Polot

2013-11-21 18:19

reporter   ~0007669

Program terminated with signal 11, Segmentation fault.
#0 0x00007f8ca9798f24 in handle_client_set_error (cls=0x23c0600,
    error=GNUNET_MQ_ERROR_READ) at set_api.c:372
372 if (NULL != set->ops_head->result_cb)
(gdb) bt
#0 0x00007f8ca9798f24 in handle_client_set_error (cls=0x23c0600,
    error=GNUNET_MQ_ERROR_READ) at set_api.c:372
#1 0x00007f8ca99d814a in GNUNET_MQ_inject_error (mq=0x23e0870,
    error=GNUNET_MQ_ERROR_READ) at mq.c:220
#2 0x00007f8ca99d930a in connection_client_transmit_queued (cls=0x23e0870, size=0,
    buf=0x0) at mq.c:560
#3 0x00007f8ca99af5ac in client_notify (cls=0x23c1fe0, size=0, buf=0x0)
    at client.c:1120
#4 0x00007f8ca99bb774 in connect_error (cls=0x23e0720, tc=0x7fff43519f10)
    at connection.c:1232
#5 0x00007f8ca99e67eb in run_ready (rs=0x23be8d0, ws=0x23d44c0) at scheduler.c:593
#6 0x00007f8ca99e7079 in GNUNET_SCHEDULER_run (task=0x7f8ca99e16fc <program_main>,
    task_cls=0x7fff4351a1b0) at scheduler.c:808
#7 0x00007f8ca99e21be in GNUNET_PROGRAM_run2 (argc=7, argv=0x7fff4351a3d8,
    binaryName=0x4022e3 "gnunet-set-profiler", binaryHelp=0x4022de "help",
    options=0x4023a0 <options.11232>, task=0x401cfd <run>, task_cls=0x0,
    run_without_scheduler=0) at program.c:281
#8 0x00007f8ca99e22eb in GNUNET_PROGRAM_run (argc=7, argv=0x7fff4351a3d8,
    binaryName=0x4022e3 "gnunet-set-profiler", binaryHelp=0x4022de "help",
    options=0x4023a0 <options.11232>, task=0x401cfd <run>, task_cls=0x0)
    at program.c:317
#9 0x0000000000402117 in main (argc=7, argv=0x7fff4351a3d8)
    at gnunet-set-profiler.c:299

Florian Dold

2013-11-22 18:42

developer   ~0007677

Should be fixed in r30849.

Bart Polot

2013-11-26 16:36

reporter   ~0007710

Seems to no longer happen, probably was fixed along the way.

Issue History

Date Modified Username Field Change
2013-07-22 20:14 Florian Dold New Issue
2013-07-22 20:14 Florian Dold Status new => assigned
2013-07-22 20:14 Florian Dold Assigned To => Bart Polot
2013-07-23 00:12 Bart Polot Note Added: 0007272
2013-07-23 00:12 Bart Polot Status assigned => feedback
2013-07-23 00:17 Florian Dold Note Added: 0007273
2013-07-23 00:17 Florian Dold Status feedback => assigned
2013-07-23 00:26 Bart Polot Note Added: 0007274
2013-07-23 00:26 Bart Polot Status assigned => feedback
2013-07-23 00:32 Florian Dold Note Added: 0007275
2013-07-23 00:32 Florian Dold Status feedback => assigned
2013-07-23 00:43 Bart Polot Note Added: 0007276
2013-07-23 00:45 Bart Polot Note Added: 0007277
2013-07-23 00:47 Florian Dold Note Added: 0007278
2013-07-23 00:49 Bart Polot Note Added: 0007279
2013-07-23 00:57 Florian Dold Note Added: 0007280
2013-07-23 01:16 Florian Dold Note Added: 0007281
2013-07-23 01:20 Bart Polot Note Added: 0007282
2013-07-23 01:26 Florian Dold Note Added: 0007283
2013-07-23 01:31 Bart Polot Note Added: 0007284
2013-07-23 01:35 Florian Dold Note Added: 0007285
2013-07-23 01:39 Bart Polot Note Added: 0007286
2013-09-24 08:21 Christian Grothoff Product Version => Git master
2013-09-24 08:21 Christian Grothoff Target Version => 0.10.0
2013-10-20 20:27 Christian Grothoff Note Added: 0007543
2013-10-20 20:34 Florian Dold Note Added: 0007547
2013-10-20 20:40 Florian Dold Note Added: 0007548
2013-11-15 13:07 Bart Polot Note Added: 0007640
2013-11-21 18:17 Bart Polot Note Added: 0007668
2013-11-21 18:17 Bart Polot Status assigned => feedback
2013-11-21 18:19 Bart Polot Note Added: 0007669
2013-11-22 18:42 Florian Dold Note Added: 0007677
2013-11-22 18:42 Florian Dold Status feedback => assigned
2013-11-26 16:36 Bart Polot Note Added: 0007710
2013-11-26 16:36 Bart Polot Status assigned => resolved
2013-11-26 16:36 Bart Polot Fixed in Version => Git master
2013-11-26 16:36 Bart Polot Resolution open => fixed
2013-12-08 23:56 Christian Grothoff Fixed in Version Git master => 0.10.0
2013-12-24 20:54 Christian Grothoff Status resolved => closed
2014-05-09 18:34 Christian Grothoff Category mesh service => cadet service