View Issue Details

ID: 0003691
Project: GNUnet
Category: transport service
View Status: public
Last Update: 2019-01-28 18:54
Reporter: amatus
Assigned To: amatus
Priority: normal
Severity: crash
Reproducibility: have not tried
Status: feedback
Resolution: reopened
Product Version: Git master
Fixed in Version: Git master
Summary: 0003691: Assertion failed at scheduler.c. if (ret == GNUNET_SYSERR)
Description: Transport crashed on my node running svn rev 35298 and generated a core file. See the additional information section for the backtrace.
Additional Information:
Program terminated with signal SIGABRT, Aborted.
#0 0x00007f41d9b68107 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
56 ../nptl/sysdeps/unix/sysv/linux/raise.c: No such file or directory.
(gdb) bt full
#0 0x00007f41d9b68107 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
        resultvar = 0
        pid = 30654
        selftid = 30654
#1 0x00007f41d9b694e8 in __GI_abort () at abort.c:89
        save_stage = 2
        act = {__sigaction_handler = {sa_handler = 0xffffffffffffffff,
            sa_sigaction = 0xffffffffffffffff}, sa_mask = {__val = {0, 37543328, 139920824319495, 1, 0,
              1, 139920802028840, 139920803447012, 37543328, 1424870435542203, 139920824345381, 0,
              18446744073709551504, 0, 139920805824400, 139920805818464}}, sa_flags = -618154240,
          sa_restorer = 0x7f41d9ed4be0 <_IO_helper_jumps>}
        sigs = {__val = {32, 0 <repeats 15 times>}}
#2 0x00007f41da1f3799 in GNUNET_abort () at common_logging.c:289
No locals.
#3 0x00007f41da21efac in GNUNET_SCHEDULER_run (task=0x50fe9802954bb, task_cls=0x7f41db27b690)
    at scheduler.c:733
        rs = 0x23cdda0
        ws = 0x7f41d9c940e4
        ret = 0
        c = 0 '\000'
        __FUNCTION__ = "GNUNET_SCHEDULER_run"
#4 0x00007f41da2278d8 in GNUNET_SERVICE_run (argc=5, argv=0x7fffa91e9608,
    service_name=0x418fdf "transport", options=GNUNET_SERVICE_OPTION_NONE, task=0x7fffa91e9270,
    task_cls=0x23b9720) at service.c:1503
        err = 5
        ret = 0
        cfg_fn = 0x23b9700 "~/.config/gnunet.conf"
        opt_cfg_fn = 0x23b9850 "/var/lib/gnunet/.config/gnunet.conf"
        loglev = 0x23b9880 "ERROR"
        logfile = 0x0
        do_daemonize = 0
        i = 4296671
        skew_offset = 0
        skew_variance = 0
        sctx = {cfg = 0x23b9720, server = 0x23b9950, addrs = 0x0, service_name = 0x418fdf "transport",
          task = 0x403b90 <run>, task_cls = 0x0, v4_denied = 0x0, v6_denied = 0x0,
          v4_allowed = 0x23b9780, v6_allowed = 0x23cca20, my_handlers = 0x23cd360, addrlens = 0x0,
          lsocks = 0x23c8b50, shutdown_task = 0x0, timeout = {rel_value_us = 18446744073709551615},
          ret = 1, ready_confirm_fd = -1, require_found = 1, match_uid = 0, match_gid = 1,
          options = GNUNET_SERVICE_OPTION_NONE}
        cfg = 0x23b9720
        xdg = 0x0
        service_options = {{shortName = 99 'c', name = 0x7f41da230e63 "config",
            argumentHelp = 0x7f41da230e6a "FILENAME",
            description = 0x7f41da230f30 "use configuration file FILENAME", require_argument = 1,
            processor = 0x7f41da2107b0 <GNUNET_GETOPT_set_string>, scls = 0x7fffa91e91f0}, {
            shortName = 100 'd', name = 0x7f41da231c5b "daemonize", argumentHelp = 0x0,
            description = 0x7f41da231f48 "do daemonize (detach from terminal)", require_argument = 0,
            processor = 0x7f41da2107a0 <GNUNET_GETOPT_set_one>, scls = 0x7fffa91e91e4}, {
            shortName = 104 'h', name = 0x7f41da230e7e "help", argumentHelp = 0x0,
            description = 0x7f41da230e73 "print this help", require_argument = 0,
            processor = 0x7f41da210440 <GNUNET_GETOPT_format_help_>, scls = 0x0}, {shortName = 76 'L',
            name = 0x7f41da230e83 "log", argumentHelp = 0x7f41da230e87 "LOGLEVEL",
            description = 0x7f41da230f50 "configure logging to use LOGLEVEL", require_argument = 1,
            processor = 0x7f41da2107b0 <GNUNET_GETOPT_set_string>, scls = 0x7fffa91e91f8}, {
            shortName = 108 'l', name = 0x7f41da230e90 "logfile",
            argumentHelp = 0x7f41da22db34 "LOGFILE",
            description = 0x7f41da230f78 "configure logging to write logs to LOGFILE",
            require_argument = 1, processor = 0x7f41da2107b0 <GNUNET_GETOPT_set_string>,
            scls = 0x7fffa91e9200}, {shortName = 118 'v', name = 0x7f41da230e98 "version",
            argumentHelp = 0x0, description = 0x7f41da230ea0 "print the version number",
            require_argument = 0, processor = 0x7f41da210420 <GNUNET_GETOPT_print_version_>,
            scls = 0x7f41da230eb9}, {shortName = 0 '\000', name = 0x0, argumentHelp = 0x0,
            description = 0x0, require_argument = 0, processor = 0x0, scls = 0x0}}
        __FUNCTION__ = "GNUNET_SERVICE_run"
#5 0x0000000000403909 in main (argc=<optimized out>, argv=<optimized out>)
    at gnunet-service-transport.c:925
No locals.
Tags: No tags attached.

Activities

Christian Grothoff

2015-02-28 14:06

manager   ~0008928

This is totally strange: active_task is a static global and you're touching it for the first time. Also, this code has run like this fine for 5 years in every service.

Can you reproduce this? What does 'active_task' point to!? Hardware / compiler bug?

amatus

2015-02-28 16:25

developer   ~0008937

This has happened 3 times on 2 different machines. I saved a core file from each machine; unfortunately, they both show:
(gdb) p active_task
$1 = (struct GNUNET_SCHEDULER_Task *) 0x0

Christian Grothoff

2015-02-28 17:53

manager   ~0008938

I've never seen this, and this is very odd as discussed. Waiting for more input.

amatus

2015-03-03 05:18

developer   ~0008974

Last edited: 2015-03-03 05:26

I hit this again at rev 35318. It happened just as I shut down my peer.
Here's the improved backtrace (-O0 -g):

#0 0xb7584ddc in __GI_raise (sig=sig@entry=6)
    at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
#1 0xb7586463 in __GI_abort () at abort.c:89
#2 0xb772ba25 in GNUNET_abort_ () at common_logging.c:289
#3 0xb775fbb2 in GNUNET_SCHEDULER_run (task=0xb776a7f5 <service_task>,
    task_cls=0xbfbc36c0) at scheduler.c:804
#4 0xb776c393 in GNUNET_SERVICE_run (argc=7, argv=0xbfbc38a4,
    service_name=0x80648a8 "transport", options=GNUNET_SERVICE_OPTION_NONE,
    task=0x804c529 <run>, task_cls=0x0) at service.c:1503
#5 0x0804cb00 in main (argc=7, argv=0xbfbc38a4)
    at gnunet-service-transport.c:925

amatus

2015-03-03 05:22

developer   ~0008975

(gdb) p *rs
$5 = {nsds = 19, sds = {fds_bits = {393248, 0 <repeats 31 times>}}}
(gdb) p *ws
$6 = {nsds = 0, sds = {fds_bits = {0 <repeats 32 times>}}}
(gdb) p timeout
$7 = {rel_value_us = 0}
(gdb) p ready_count
$8 = 0

transport log:
Mar 03 04:06:16-263959 ats-scheduling-api-27403 ERROR ATS connection died (code 1), reconnecting
Mar 03 04:06:17-426518 transport-27403 ERROR Assertion failed at plugin_transport_tcp.c:3104.
Mar 03 04:06:17-428820 util-scheduler-27403 ERROR `select' failed at scheduler.c:791 with error: Bad file descriptor
Mar 03 04:06:17-428852 transport-27403 ERROR Assertion failed at scheduler.c:804
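
The `*rs` dump above can be decoded by hand: 393248 = 2^5 + 2^17 + 2^18, so descriptors 5, 17 and 18 were in the read set, which agrees with nsds = 19 (highest descriptor plus one). A minimal sketch of that decoding in C follows, assuming 32-bit fds_bits words, as the 32-entry array in this dump suggests. `decode_fds_word` and `fd_is_bad` are hypothetical helper names, not GNUnet functions, and `fd_is_bad` can only be used inside a live process, not against a core file.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: expand one 32-bit fds_bits word into the
   descriptor numbers it represents.  word_index is the position of
   the word inside fds_bits.  Returns how many descriptors were
   written to out (at most out_len). */
static size_t
decode_fds_word (uint32_t word, unsigned int word_index,
                 int *out, size_t out_len)
{
  size_t n = 0;

  assert (NULL != out);
  for (unsigned int bit = 0; bit < 32; bit++)
    if ( (0 != (word & (UINT32_C (1) << bit))) && (n < out_len) )
      out[n++] = (int) (word_index * 32 + bit);
  return n;
}

/* Hypothetical helper: returns 1 if fd is not an open descriptor in
   the calling process (fcntl fails with EBADF), 0 otherwise. */
static int
fd_is_bad (int fd)
{
  return (-1 == fcntl (fd, F_GETFD)) && (EBADF == errno);
}
```

Calling `decode_fds_word (393248, 0, fds, 8)` fills `fds` with 5, 17 and 18. Probing each of those with `fd_is_bad` right after `select` fails would show which one had already been closed.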

Christian Grothoff

2015-03-03 09:08

manager   ~0008977

Ok, this one I did see (a few days ago), but not anymore with SVN HEAD. Not that I know what specifically I might have done to fix it...

amatus

2015-03-09 15:28

developer   ~0009012

Saw this again at rev 35349

amatus

2015-03-17 15:54

developer   ~0009023

Still seeing this at rev 35365.

amatus

2015-06-28 01:35

developer   ~0009356

Saw this again at rev 35997

Christian Grothoff

2015-06-28 09:02

manager   ~0009359

Which plugins do you have enabled? And: does changing the subset of plugins change the occurrence of the bug?

amatus

2015-06-30 00:50

developer   ~0009375

PLUGINS = tcp udp http_server http_client https_server https_client

I haven't tried with a different set of plugins yet.
What plugins were you running when you saw it?

amatus

2015-07-21 01:02

developer   ~0009472

Saw this again at rev 36103

amatus

2015-08-25 21:03

developer   ~0009587

Saw this again at rev 36262

amatus

2015-10-23 16:12

developer   ~0009783

Saw this again at rev 36559

amatus

2017-06-17 15:59

developer   ~0012257

Saw this again at 576541cc166763231f822d47e87e6d536b4a4adf

amatus

2017-06-26 00:43

developer   ~0012272

Fixed in 556ccd6d483b3678867c3829e6979c307df04450

amatus

2017-11-16 00:25

developer   ~0012580

This is still happening.

cy1

2018-05-16 10:57

reporter   ~0012927

Can confirm this... made gnunet completely hang on my machine.

May 15 04:50:40-510031 cadet-27197 WARNING Sending BROKEN due to MQ going down
May 15 04:50:40-529036 core-32456 WARNING Failed checksum validation for a message from `487M'
May 15 04:50:40-662245 core-32456 WARNING Failed checksum validation for a message from `487M'
May 15 04:50:40-750807 core-32456 WARNING Failed checksum validation for a message from `487M'
May 15 04:50:40-839449 core-32456 WARNING Failed checksum validation for a message from `487M'
May 15 04:50:40-928354 core-32456 WARNING Failed checksum validation for a message from `487M'
May 15 04:50:41-016929 core-32456 WARNING Failed checksum validation for a message from `487M'
May 15 04:50:41-106089 core-32456 WARNING Failed checksum validation for a message from `487M'
May 15 04:50:42-453494 transport-udp-23763 WARNING Message `UDP could not transmit IPv6 message! Please check your network configuration and disable IPv6 if your connection does not have a global IPv6 address' repeated 12 times in the last 152 s
May 15 04:50:42-453494 transport-23763 WARNING It took us 3135 ms to send 39540/41328 bytes to DSTJ (1, udp)
May 15 04:50:42-461074 util-scheduler-23763 ERROR `select' failed at scheduler.c:2335 with error: Bad file descriptor
May 15 04:50:42-461098 transport-23763 ERROR Assertion failed at scheduler.c:2371.
May 15 04:50:42-463444 cadet-27197 WARNING Sending BROKEN due to MQ going down
May 15 04:50:42-463493 cadet-27197 WARNING Sending BROKEN due to MQ going down
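
The "`select' failed ... Bad file descriptor" line in both logs is what `select` reports when a descriptor in one of its sets has already been closed. A minimal sketch that reproduces just that condition outside GNUnet is shown here. It assumes a POSIX system, and `select_errno_with_stale_fd` is a hypothetical name for illustration. It says nothing about which part of the transport service closed the descriptor.

```c
#include <assert.h>
#include <errno.h>
#include <sys/select.h>
#include <unistd.h>

/* Hypothetical demo: put a descriptor into a read set, close it, then
   call select().  Returns the errno reported by select(), 0 if
   select() unexpectedly succeeded, or -1 if the pipe could not be
   created. */
static int
select_errno_with_stale_fd (void)
{
  int p[2];
  fd_set rs;
  struct timeval tv = { 0, 0 };   /* poll, do not block */
  int ret;
  int err;

  if (0 != pipe (p))
    return -1;
  assert (p[0] < FD_SETSIZE);
  FD_ZERO (&rs);
  FD_SET (p[0], &rs);
  close (p[0]);                   /* closed while still in the set */
  ret = select (p[0] + 1, &rs, NULL, NULL, &tv);
  err = (-1 == ret) ? errno : 0;  /* save errno before the next call */
  close (p[1]);
  return err;
}
```

On Linux this returns EBADF, the same error that precedes the assertion in the scheduler logs above.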

dvn

2019-01-28 04:58

developer   ~0013490

Hmm, this has been going on for years. Does it make sense to have this as a blocker for the 0.11.0 release?

Additionally, I imagine the next generation transport will either greatly affect this whole issue, or nullify it, no?

Christian Grothoff

2019-01-28 06:03

manager   ~0013494

Hard to say if it's a "blocker", as I've personally _not_ seen it, but maybe I wasn't watching my logs closely enough either...

But I agree that given that it seems to only affect the transport service, TNG should take care of this as well.

dvn

2019-01-28 18:54

developer   ~0013496

amatus expressed agreement with my previous comment on IRC.

I have moved it out of the 0.11.0 roadmap.

Issue History

Date Modified Username Field Change
2015-02-27 17:05 amatus New Issue
2015-02-27 17:05 amatus Status new => assigned
2015-02-27 17:05 amatus Assigned To => Matthias Wachs
2015-02-28 14:06 Christian Grothoff Note Added: 0008928
2015-02-28 14:06 Christian Grothoff Assigned To Matthias Wachs =>
2015-02-28 14:06 Christian Grothoff Status assigned => feedback
2015-02-28 14:06 Christian Grothoff Category transport service => util library
2015-02-28 16:25 amatus Note Added: 0008937
2015-02-28 16:25 amatus Status feedback => new
2015-02-28 17:53 Christian Grothoff Note Added: 0008938
2015-02-28 17:53 Christian Grothoff Assigned To => amatus
2015-02-28 17:53 Christian Grothoff Status new => feedback
2015-03-03 05:18 amatus Note Added: 0008974
2015-03-03 05:18 amatus Status feedback => assigned
2015-03-03 05:22 amatus Note Added: 0008975
2015-03-03 05:25 amatus Assigned To amatus => Christian Grothoff
2015-03-03 05:26 amatus Note Edited: 0008974
2015-03-03 09:08 Christian Grothoff Note Added: 0008977
2015-03-03 09:08 Christian Grothoff Category util library => transport service
2015-03-03 09:08 Christian Grothoff Assigned To Christian Grothoff =>
2015-03-03 09:08 Christian Grothoff Status assigned => feedback
2015-03-09 15:28 amatus Note Added: 0009012
2015-03-09 15:28 amatus Status feedback => new
2015-03-09 15:34 amatus Summary Assertion failed at scheduler.c:733. GNUNET_assert (NULL == active_task); => Assertion failed at scheduler.c. if (ret == GNUNET_SYSERR)
2015-03-17 15:54 amatus Note Added: 0009023
2015-03-21 00:35 Christian Grothoff Status new => acknowledged
2015-03-21 00:35 Christian Grothoff Target Version => 0.11.0
2015-06-28 01:35 amatus Note Added: 0009356
2015-06-28 09:02 Christian Grothoff Note Added: 0009359
2015-06-30 00:50 amatus Note Added: 0009375
2015-07-21 01:02 amatus Note Added: 0009472
2015-08-25 21:03 amatus Note Added: 0009587
2015-10-23 16:12 amatus Note Added: 0009783
2017-06-17 15:59 amatus Note Added: 0012257
2017-06-17 15:59 amatus Assigned To => amatus
2017-06-17 15:59 amatus Status acknowledged => assigned
2017-06-26 00:43 amatus Status assigned => resolved
2017-06-26 00:43 amatus Resolution open => fixed
2017-06-26 00:43 amatus Fixed in Version => Git master
2017-06-26 00:43 amatus Note Added: 0012272
2017-11-16 00:25 amatus Status resolved => feedback
2017-11-16 00:25 amatus Resolution fixed => reopened
2017-11-16 00:25 amatus Note Added: 0012580
2018-05-16 10:57 cy1 Note Added: 0012927
2019-01-28 04:58 dvn Note Added: 0013490
2019-01-28 06:03 Christian Grothoff Note Added: 0013494
2019-01-28 18:51 dvn Target Version 0.11.0 =>
2019-01-28 18:54 dvn Note Added: 0013496