View Issue Details
|ID||Project||Category||View Status||Date Submitted||Last Update|
|0005569||GNUnet||other||public||2019-02-14 23:57||2019-02-28 11:17|
|Reporter||ng0||Assigned To||Christian Grothoff|
|Priority||normal||Severity||minor||Reproducibility||have not tried|
|Product Version||SVN HEAD|
|Target Version||0.11.0||Fixed in Version||0.11.0|
|Summary||0005569: tests hang|
|Description||The tests pass on their own (I think), but when run on the latest commit with a simple gmake check in the root of the source directory, the run seemingly randomly gets stuck in statistics.|
CG: I'm changing the title, as it seems this happens randomly in _any_ test on shutdown, at least for me. But with very low probability overall.
|Additional Information||gmake: Entering directory '/home/ng0/src/gnunet/gnunet/src/statistics'|
gmake: Entering directory '/home/ng0/src/gnunet/gnunet/src/statistics'
|Tags||No tags attached.|
Wrong, they also hang when simply run in src/statistics.
No idea why I assumed a difference.
||It's confusing because in the python3.7 migration ticket I tested them successfully, without any hangs.|
I also see some tests randomly hang, with a process stuck like this:
#0 __lll_lock_wait_private () at ../sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:63
#1 0x00007f6ea4dfddcb in __unregister_atfork (dso_handle=0x7f6ea4eb5940 <atfork_lock>, dso_handle@entry=0x55a865ebd0b8) at register-atfork.c:80
#2 0x00007f6ea4d314a9 in __cxa_finalize (d=0x55a865ebd0b8) at cxa_finalize.c:107
#3 0x000055a865eba233 in __do_global_dtors_aux ()
#4 0x00007ffe871814b0 in ?? ()
#5 0x00007f6ea510f686 in _dl_fini () at dl-fini.c:138
Backtrace stopped: frame did not save the PC
This is VERY, very odd.
||Adding amatus to the monitors on this, as he might have a clue, being our expert on obscure issues.|
Likely fixed as of e98a4e07e..4611c473f. For posterity, here's Florian Weimer's diagnosis of this one:
* Christian Grothoff:
> I'm seeing some _very_ odd behavior with processes hanging on exit (?)
> with GNU libc 2.28-6 on Debian (amd64 threadripper). This seems to
> happen at random (for random tests, with very low frequency!) in the
> GNUnet (Git master) testsuite when a child process is about to exit.
It looks like you call exit from a signal handler, see
* Signal handler called for signals that should cause us to shutdown.
static char c;
int old_errno = errno; /* backup errno */
if (getpid () != my_pid)
exit (1); /* we have fork'ed since the signal handler was created,
* ignore the signal, see https://gnunet.org/vfork discussion */
&c, sizeof (c));
errno = old_errno;
In general, this results in undefined behavior because exit (unlike
_exit) is not an async-signal-safe function.
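To illustrate the point about exit versus _exit, here is a minimal sketch of a shutdown handler that restricts itself to async-signal-safe calls (getpid, _exit, write). It is not necessarily what the GNUnet fix does: shutdown_pipe and install_shutdown_handler are hypothetical names introduced for this example, and the self-pipe stands in for whatever mechanism wakes the main loop.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical names for this sketch, not GNUnet's actual symbols. */
pid_t my_pid;          /* PID recorded when the handler is installed */
int shutdown_pipe[2];  /* self-pipe: [0] is the read end, [1] the write end */

/* Shutdown handler that only uses async-signal-safe calls. */
static void
sighandler_shutdown (int sig)
{
  static const char c = 0;
  int old_errno = errno;   /* backup errno */
  ssize_t ret;

  (void) sig;
  if (getpid () != my_pid)
    _exit (1);             /* forked since installation: terminate at once,
                            * without running atexit handlers or destructors */
  ret = write (shutdown_pipe[1], &c, sizeof (c));
  (void) ret;              /* nothing safe to do about a failure in here */
  errno = old_errno;
}

/* Create the pipe and install the handler for SIGINT and SIGTERM.
 * Returns 0 on success, -1 on error. */
int
install_shutdown_handler (void)
{
  struct sigaction sa;

  if (0 != pipe (shutdown_pipe))
    return -1;
  my_pid = getpid ();
  memset (&sa, 0, sizeof (sa));
  sa.sa_handler = &sighandler_shutdown;
  sigemptyset (&sa.sa_mask);
  if (0 != sigaction (SIGINT, &sa, NULL))
    return -1;
  return sigaction (SIGTERM, &sa, NULL);
}
```

After install_shutdown_handler () succeeds, a delivered SIGINT or SIGTERM makes one byte readable on shutdown_pipe[0]; the main loop can then perform the actual shutdown, including any call to exit, outside of signal context.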
I suspect you either call the exit function while a fork is in progress,
or since you register this signal handler multiple times for different
sh->shc_int = GNUNET_SIGNAL_handler_install (SIGINT,
sh->shc_term = GNUNET_SIGNAL_handler_install (SIGTERM,
one call to exit might interrupt another call to exit if both signals
are delivered to the process.
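One possible way to keep one shutdown signal from interrupting the handler of the other is to install both with sigaction and list each signal in sa_mask. The sketch below is an assumption for illustration, not GNUnet code: install_masked_handlers and the counters are hypothetical, and the raise call inside the handler only simulates a second signal arriving while the first is still being handled.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>

/* Hypothetical bookkeeping for this sketch only. */
volatile sig_atomic_t depth;      /* current handler nesting level */
volatile sig_atomic_t max_depth;  /* deepest nesting level observed */
volatile sig_atomic_t calls;      /* number of handler invocations */

static void
sighandler_shutdown (int sig)
{
  depth++;
  if (depth > max_depth)
    max_depth = depth;
  calls++;
  if (SIGINT == sig)
    raise (SIGTERM);   /* simulate SIGTERM arriving while SIGINT is handled */
  depth--;
}

/* Install one handler for SIGINT and SIGTERM, blocking both signals while
 * the handler runs.  Returns 0 on success, -1 on error. */
int
install_masked_handlers (void)
{
  struct sigaction sa;

  memset (&sa, 0, sizeof (sa));
  sa.sa_handler = &sighandler_shutdown;
  sigemptyset (&sa.sa_mask);
  sigaddset (&sa.sa_mask, SIGINT);
  sigaddset (&sa.sa_mask, SIGTERM);
  if (0 != sigaction (SIGINT, &sa, NULL))
    return -1;
  return sigaction (SIGTERM, &sa, NULL);
}
```

With this mask, the simulated SIGTERM stays pending until the SIGINT handler has returned, so the two invocations run one after the other instead of nesting. Masking does not make exit safe to call from the handler; it only removes the handler-interrupts-handler case.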
The deadlock you see was introduced in commit
27761a1042daf01987e7d79636d0c41511c6df3c ("Refactor atfork handlers"),
first released in glibc 2.28. The fork deadlock will be gone (in the
single-threaded case) if Debian updates to the current
release/2.28/master branch because we backported commit
60f80624257ef84eacfd9b400bda1b5a5e8e7816 ("nptl: Avoid fork handler lock
for async-signal-safe fork [BZ #24161]") there.
But this will not help you. Even without the deadlock, I expect you
still experience some random corruption during exit, but it's going to
be difficult to spot.
|2019-02-14 23:57||ng0||New Issue|
|2019-02-14 23:59||ng0||OS||=> NetBSD|
|2019-02-14 23:59||ng0||OS Version||=> CURRENT|
|2019-02-14 23:59||ng0||Platform||=> amd64|
|2019-02-14 23:59||ng0||Product Version||=> SVN HEAD|
|2019-02-15 00:00||ng0||Note Added: 0013786|
|2019-02-15 00:06||ng0||Note Added: 0013787|
|2019-02-15 00:23||Christian Grothoff||Note Added: 0013788|
|2019-02-15 00:24||Christian Grothoff||Note Added: 0013789|
|2019-02-16 14:49||Christian Grothoff||Assigned To||=> Christian Grothoff|
|2019-02-16 14:49||Christian Grothoff||Status||new => assigned|
|2019-02-16 15:48||Christian Grothoff||Category||statistics service => other|
|2019-02-16 15:48||Christian Grothoff||Summary||statistics tests hang => tests hang|
|2019-02-16 15:48||Christian Grothoff||Description Updated||View Revisions|
|2019-02-16 15:48||Christian Grothoff||Additional Information Updated||View Revisions|
|2019-02-16 21:21||Christian Grothoff||Status||assigned => resolved|
|2019-02-16 21:21||Christian Grothoff||Resolution||open => fixed|
|2019-02-16 21:21||Christian Grothoff||Note Added: 0013833|
|2019-02-16 21:21||Christian Grothoff||Fixed in Version||=> 0.11.0|
|2019-02-16 21:21||Christian Grothoff||Target Version||=> 0.11.0|
|2019-02-28 11:17||Christian Grothoff||Status||resolved => closed|