View Issue Details

IDProjectCategoryView StatusLast Update
0003799GNUnetotherpublic2018-06-07 00:24
ReporterjahAssigned ToChristian Grothoff 
PrioritynormalSeverityminorReproducibilityalways
Status closedResolutionfixed 
Platformx86_64OSTrisquelOS Version7.0, Belenos
Product VersionSVN HEAD 
Target Version0.11.0pre66Fixed in Version0.11.0pre66 
Summary0003799: gnunet-publish client hangs after unindex error
DescriptionThe publish-service won't publish files (for some reason as yet undetermined) and the client receives a GNUNET_FS_STATUS_UNINDEX_START which causes it to hang until the user sends SIGINT.

Then the client receives GNUNET_FS_STATUS_UNINDEX_ERROR which again causes it to hang until a further SIGINT.

Then the client process ends with a non-zero exit status.
Steps To Reproduce$ gnunet-publish -s Music/some.mp3
Publishing `/path/to/Music/some.mp3' done.
URI is `gnunet://fs/chk/DF8D[snipped].5853556'.

$ gnunet-publish -V Music/some.mp3
Preprocessing complete.

Ctrl+c

^CError publishing: Insufficient space for publishing: Failed to receive status response from database..
Unexpected status: 30

Ctrl+c

^CUnexpected status: 34
Additional InformationI've attached the output from a gdb session: publish.gdb.out.

Gnunet is in multi-user mode and publishing is being attempted as a user in the gnunet group. I've no idea why publishing isn't working, but the problem isn't relevant to this issue with the client anyway...

In gnunet-publish-gtk the issue manifests as a progress bar stuck at 0%.
TagsNo tags attached.

Activities

jah

2015-05-29 02:09

reporter  

publish.gdb.out (10,627 bytes)

Christian Grothoff

2015-05-29 14:46

manager   ~0009170

I think the real problem here is that your user is not in the group 'gnunet', and that's why you are hitting some (admittedly buggy) error handling paths in the logic. So add your $USER to the 'gnunet' group so that you can access the datastore, re-login and hopefully this will 'go away'.

Christian Grothoff

2015-05-29 15:07

manager   ~0009179

Ok, tried to fix the error handling in SvN 35841, so if you didn't fix the 'basic' issue of your user not being in 'gnunet', I'd appreciate feedback on whether the fix works.

jah

2015-05-29 19:24

reporter   ~0009190

r35841 doesn't help much:-

$ gnunet-publish -V Music/various/08-luniz-i_got_5_on_it.mp3
Preprocessing complete.

<hangs> Ctrl-c

^CError publishing: Insufficient space for publishing: Failed to receive status response from database..
Starting cleanup after abort

<hangs> Ctrl-c

^CCleanup after abort failed.
Unexpected status: 36
$

By the way, the user already is in the gnunet group:-

$ id -G gnunet
130
$ id -G
1000 4 24 27 30 46 109 122 130

Christian Grothoff

2015-05-29 20:43

manager   ~0009191

Hmm. That suggests my initial diagnostic of where the issue came from was wrong. What do you get for

gnunet-statistics -s datastore

jah

2015-05-30 14:29

reporter  

publish.debug.log (880,249 bytes)

jah

2015-05-30 14:29

reporter   ~0009195

To better reproduce the issue: set GNUNET_RUNTIME_DIR in the system's config to something other than /tmp/gnunet-system-runtime (e.g. /tmp/gnunet).

Attached is publish.debug.log which was run through the following to remove config loading messages:-

grep -v 'DEBUG Asked' | grep -v 'DEBUG Retrieved' | grep -v 'DEBUG Expanded' | grep -v 'DEBUG Config' | grep -iv 'deserializing' | grep -iv tmp

SIGINTs were sent at 'May 30 13:13:01' and 'May 30 13:18:25', which roughly correspond to line 5362 and the end of the file.

jah

2015-05-30 14:38

reporter   ~0009196

The reason this issue happened for me was that I'd set GNUNET_RUNTIME_DIR incorrectly so that the client processes were trying to talk over the default named sockets, but the services were configured differently (and erroneously).

I would have realised this sooner if I'd run gnunet-publish with DEBUG logging.

jah

2015-05-30 15:19

reporter   ~0009197

Last edited: 2015-05-30 21:25

View 2 revisions

It looks as if the requests to the datastore only timed-out because of the SIGINTs:-

grep -i datastore publish.debug.log | grep -v 'Failed to connect'

...
May 30 13:13:01-381249 datastore-api-5012 DEBUG Timeout of request in datastore queue
...
May 30 13:18:25-436292 datastore-api-5012 DEBUG Timeout of request in datastore queue
...

Christian Grothoff

2015-05-31 12:58

manager   ~0009198

Ok, so that's still equivalent to not being in the 'gnunet' group: you didn't communicate because you had the wrong path, I suspected you had the wrong permissions. Still, the error handling should be fixed.

jah

2015-05-31 16:58

reporter   ~0009205

Is the expected behaviour of the client to attempt repeatedly attempt to transmit over a unix socket that has no listener, forever? Would it make more sense to detect the absence of anything listening at the other end?

Christian Grothoff

2015-05-31 17:15

manager   ~0009207

Well, the problem is that the listener may (re)start anytime. So in general GNUnet's client APIs all are supposed to always keep trying until it works. So yes, this is expected behavior.

jah

2015-05-31 18:10

reporter  

gnunet-publish_unistop.patch (479 bytes)
Index: gnunet-publish.c
===================================================================
--- gnunet-publish.c	(revision 35841)
+++ gnunet-publish.c	(working copy)
@@ -312,6 +312,9 @@
              _("Cleanup after abort failed.\n"));
     GNUNET_FS_unindex_stop (info->value.unindex.uc);
     return NULL;
+  case GNUNET_FS_STATUS_UNINDEX_STOPPED:
+    GNUNET_break (NULL == pc);
+    return NULL;
   default:
     FPRINTF (stderr,
              _("Unexpected status: %d\n"),

jah

2015-05-31 18:12

reporter   ~0009208

Understood, thank you. Attached is a patch to handle GNUNET_FS_STATUS_UNINDEX_STOPPED (see #c9190, above).

Christian Grothoff

2015-05-31 18:59

manager   ~0009209

Is the problem fixed with that patch? (Still need to look at it...)

jah

2015-05-31 19:28

reporter   ~0009210

My patch just handles one more status message, but please do check that it's the right thing to do.

As to your r35841, I will examine the difference it makes more closely and come back to you. Certainly the behaviour now is "Sending REMOVE request to DATASTORE service" (see publish.debug.log:5470) after sending the first of the two SIGINTs required to halt the client.

Christian Grothoff

2015-05-31 21:36

manager   ~0009213

Well, the line

+ GNUNET_break (NULL == pc);

is a bit unnatural in the patch, as theoretically the unindex could complete before the publish stop is finished. So I'll leave that one out. Also, we should fix it so we do not need two CTRL-C. I'll look into this, now that the diagnostic is a bit clearer.

Christian Grothoff

2015-05-31 21:45

manager   ~0009214

Reproduced it by simply not starting the peer.

Fixed the need for double-CTRL-C by finding the 2nd place where the unindex-on-error was initiated and then skipping unindex if we never stored anything in the DB (which is the case on connect failure).

Also improved the error message (SVN 35849).

Thanks for reporting!

Issue History

Date Modified Username Field Change
2015-05-29 02:09 jah New Issue
2015-05-29 02:09 jah File Added: publish.gdb.out
2015-05-29 14:07 Christian Grothoff Assigned To => Christian Grothoff
2015-05-29 14:07 Christian Grothoff Status new => assigned
2015-05-29 14:46 Christian Grothoff Note Added: 0009170
2015-05-29 15:07 Christian Grothoff Note Added: 0009179
2015-05-29 15:07 Christian Grothoff Status assigned => feedback
2015-05-29 19:24 jah Note Added: 0009190
2015-05-29 19:24 jah Status feedback => assigned
2015-05-29 20:43 Christian Grothoff Note Added: 0009191
2015-05-30 14:29 jah File Added: publish.debug.log
2015-05-30 14:29 jah Note Added: 0009195
2015-05-30 14:38 jah Note Added: 0009196
2015-05-30 15:19 jah Note Added: 0009197
2015-05-30 21:25 jah Note Edited: 0009197 View Revisions
2015-05-31 12:58 Christian Grothoff Note Added: 0009198
2015-05-31 16:58 jah Note Added: 0009205
2015-05-31 17:15 Christian Grothoff Note Added: 0009207
2015-05-31 18:10 jah File Added: gnunet-publish_unistop.patch
2015-05-31 18:12 jah Note Added: 0009208
2015-05-31 18:59 Christian Grothoff Note Added: 0009209
2015-05-31 19:28 jah Note Added: 0009210
2015-05-31 21:36 Christian Grothoff Note Added: 0009213
2015-05-31 21:45 Christian Grothoff Note Added: 0009214
2015-05-31 21:45 Christian Grothoff Status assigned => resolved
2015-05-31 21:45 Christian Grothoff Fixed in Version => 0.11.0pre66
2015-05-31 21:45 Christian Grothoff Resolution open => fixed
2015-05-31 21:45 Christian Grothoff Target Version => 0.11.0pre66
2018-06-07 00:24 Christian Grothoff Status resolved => closed