View Issue Details

IDProjectCategoryView StatusLast Update
0006225GNUnetfile-sharing servicepublic2020-06-05 05:25
Reportercy1Assigned ToChristian Grothoff 
PrioritynormalSeveritymajorReproducibilityalways
Status assignedResolutionopen 
PlatformGentooOSLinuxOS Version5.4.11
Product Version 
Target VersionFixed in Version 
Summary0006225: gnunet-download chokes while downloading a directory with subdirs each with a bunch of files
DescriptionI've been trying to get gnunet-download working again, and I think I've finally got a reproducible test case. It's got to be a race condition or something, because the conditions to make this bug occur are very touchy. But basically, when I publish a directory, that has a bunch of subdirectories, that each have a moderate number of moderately large files (and some images), I can't download it. Even if I download it directly after publishing. Even if the data is indexed, and gnunet-fs -i reports it indexed. Even if I can download the files all individually with no delay. It still chokes. The download will be moving very quickly, with no delays, then suddenly stop cold, and never get past that seemingly arbitrary point. Resuming the download makes it even worse, making the choking point even earlier (although oddly consistent).

I'll attach the program which generates the test data, as well as the test data. File contents don't seem to matter thankfully, only file size, file count, and subdirectory count, so I can fill every file with Q and xz compresses it mightily, rather than how xz screams like a little girl and cowers in the corner when finding randomized file contents.
Steps To Reproduce1. gcc -ggdb3 ~/code/make_dumb_directory_tree.c
2. ./a.out
3. gnunet-publish -n herp
<copy the final hash>
4. gnunet-download -V -o derp.gnd <that final hash>
...
Downloading `derp/derpc.gnd' at 443/443 (0 ms remaining, 51 b/s). Block took 0 ms to download
Downloading `derp/derpwckt.gnd' at 442/442 (0 ms remaining, 50 b/s). Block took 0 ms to download
Downloading `derp/derpbc.gnd' at 440/440 (0 ms remaining, 48 b/s). Block took 0 ms to download
Downloading `derp/derpws.gnd' at 440/440 (0 ms remaining, 48 b/s). Block took 0 ms to download
<then it hangs forever>
Additional InformationYou can omit the -n to index the larger files, but it still hangs. The test data that I cannot download from myself is at gnunet://fs/chk/KANEDFHMJNHFQYQP06XZ2KXBQRXT7JXN2DQX9F2GWR83QEQRAB8G9PVFSN3BB0YYMHFHY6J7MXGQ6P0RSS1QXRR8FRY61TSSGD94Y30.GM5FQF97RFEQ04DXW3PDV1J4ZP6S0C7GJCQ2S64YFF1YHZHP4KC6F7J5CEGTA6DJ9DJWCCVHEZQS5FA31XG5RPXWYBQSESQ2M5CB9C8.28248

No idea if it works to download it remotely, rather than downloading your own data.
TagsNo tags attached.

Activities

cy1

2020-04-30 09:23

reporter  

testdata.tar.xz (6,404 bytes)
make_dumb_directory_tree.c (2,690 bytes)

cy1

2020-04-30 09:25

reporter   ~0015812

Oh the -ggdb3 thing isn't needed. I just forgot to remove it.

schanzen

2020-05-06 18:10

manager   ~0015863

Assigning this to grothoff as fs is a black box to me atm.

cy1

2020-05-13 03:30

reporter   ~0015886

I've tried to follow fs, but the multiple cross socket jumps make it just impossible to debug. And it's 41,630 lines of code, so it's a bit much to understand. Maybe I'll take another look at it.

I was told (years ago) that the reason it was freezing was something about routing decisions that never got written as more than a hack, but these are local files, and indexed to boot, so they shouldn't get requested via any route at all.

cy1

2020-05-15 05:41

reporter   ~0015892

This is weird... the CHK for my test data changed. Did I specify the wrong CHK somehow?

gnunet://fs/chk/AJRGAG82WAENCXR4QPWQ7HCR3PQPFBK6J8BRBZV2TPAVKX1Z3631B7T27D2TGQQPCA61H7M4QR8RG92EVDZZCG3FXYR9NH3ZYW15HW8.AQ6D1QZ9EN6A7M1ZWCWFF88ZP3KBPCNDFJAS3NBWETEP27SVQNDSVKFB4S9028ETXCECTQ2PZP5XCW2EVVVTGPPSW2173GZG2SZR4E8.28249

That's the CHK to my test data. I just checked.

cy1

2020-05-17 19:27

reporter   ~0015929

For some reason, the download stopped freezing entirely, and just badly fails now, despite the fact that I haven't updated gnunet at all. It still freezes at the same point, but then always precisely 1 minute later, it resumes downloading. Unfortunately, for instance...

<code bash>
$ function doit() {
  rm -rf fail fail.gnd
  gnunet-download -V -V -V -R -o fail.gnd gnunet://fs/chk/AJRGAG82WAENCXR4QPWQ7HCR3PQPFBK6J8BRBZV2TPAVKX1Z3631B7T27D2TGQQPCA61H7M4QR8RG92EVDZZCG3FXYR9NH3ZYW15HW8.AQ6D1QZ9EN6A7M1ZWCWFF88ZP3KBPCNDFJAS3NBWETEP27SVQNDSVKFB4S9028ETXCECTQ2PZP5XCW2EVVVTGPPSW2173GZG2SZR4E8.28249
}

$ time doit | grep grep kdb
Starting download `fail/derpykdb.gnd'.
Starting download `fail/derpykdb/filebpkmcg.html'.
Downloading `fail/derpykdb.gnd' at 9118/9118 (0 ms remaining, 2816 b/s). Block took 0 ms to download
Downloading `fail/derpykdb/filebpkmcg.html' at 78750/78750 (0 ms remaining, 1808 b/s). Block took 0 ms to download
Downloading `fail/derpykdb/filebpkmcg.html' done (1808 b/s).
Downloading `fail/derpykdb.gnd' done (194 b/s).
doit 2>&1 0.06s user 0.01s system 0% cpu 1:00.08 total
grep kdb 0.00s user 0.00s system 0% cpu 1:00.08 total
</code>

Even in the full output, gnunet-download doesn't error out, and it says "Downloading `fail/derpykdb.gnd' done (194 b/s)." but using gnunet-directory, I can see that fail/derpykdb.gnd has many files in it, not just one. In the original data I uploaded...

<code bash>
$ ls herp/derpykdb
fileaq.html fileb.txt fileg.html fileixt.txt filepeqa.txt fileshupsa.txt fileyamfw.html
filea.txt filedef.txt filegmahf.html filel.txt filep.txt filet.txt fileyjfx.txt
filebpkmcg.html filedq.html filehkcij.txt filemi.txt fileqg.txt fileuj.html fileyprmeb.html
filebsosqw.html filed.txt filehxji.html filen.txt fileqw.txt fileuleaiv.txt filezrdvsi.txt
</code>
but...
<code bash>
$ ls fail/derpykdb
filebpkmcg.html
</code>
If it helps filebpkmcg.html is identical in both directories. Just the others are missing. Every subdirectory in "fail" has exactly one file in it, while many in "herp" have lots of files in them. I tried redoing gnunet-download, without deleting the existing files, but it didn't download any other files. And in every fail/*.gnd, the only file downloaded is also the first entry in gnunet-directory.

Incidentally in my test data, all the regular files are identical. 78,750 bytes of the letter Q. So if gnunet has one of them, it should have all of them, and none should silently fail to download. ...I should see what happens when I change the file size slightly.

cy1

2020-05-17 21:38

reporter   ~0015930

Oh, that would explain why gnunet-fs reports only one of them indexed, though, if they're all the same file.

cy1

2020-05-20 01:37

reporter   ~0015933

Here's a new edition of make_dumb_download.c that produces a slightly less degenerate set of test data, and freezes partway through despite all files being indexed.

And... the test data it generates. All files containing Q but in varying amounts for different file sizes. Oh and here's the log of gnunet-download. Well, of...

$ rm derp derp.gnd -rf; time gnunet-download -V -V -V -o derp.gnd -R gnunet://fs/chk/FFGTWZ4ZWPZ8Y9T4T8S79KE5YMX0898ZNE2HNCCWRY865HWTVGVS2ETQ0YA0JSPZ5Z601BWVNJHQ4KQ14JAXX0HMEEB2QQPZFTPGXR0.523SVBFHY1DTWM42YKV1FBVPT8QB2P9G3T3Z0A6J6EC6A7C7H4352JJ92XA01G8090WC6EKPXHVTW2VZKC8014K89V8S2S7Q2Q6MD88.78237 |& s6-tai64n | tee dl.log

The freeze happens at `derp/derpxjmfn/filehktb.txt' around @400000005ec469dd21d65862 where the next timestamp is @400000005ec46a9103d89815, approximately 12.9 minutes later.
Also the time command says this:
0.72s user 0.03s system 0% cpu 3:00.54 total

3 minutes is a lot less than 12.9 minutes, so I don't know what's special about 10 minutes that the "time" command would forget about them.

The data is correctly downloaded at least. (i.e. copied from where it's already indexed.) So there's a second "bug" where if multiple files in a directory are identical, gnunet-download will only recreate the first one. I'm not sure how likely that is to ever happen in a real world application, but it is there. The bug I'm concerned with is this freeze, since it makes no sense to freeze on downloading already indexed files, every time at exactly the same place for exactly 12.9 minutes, only 3 of which register as being used by the process.

Oh, and downloading again without deleting the data does not freeze it, once all the data has downloaded. I'm not sure which files I would have to remove to get the download to freeze again, other than a rm -rf of all of them.

make_dumb_directory_tree-2.c (2,896 bytes)
testdata.tar-2.xz (7,748 bytes)
dl.log (88,957 bytes)

cy1

2020-05-22 04:28

reporter   ~0015939

The website deleted my post, so retyping fast. I just tried again. Randomized file contents this time. Still a 12.9 minute freeze at @400000005ec731e92a82d395 posting logs

make_dumb_directory_tree-3.c (3,167 bytes)
dl2.log (138,999 bytes)

cy1

2020-05-22 06:11

reporter   ~0015940

Oh another problem with removing indexed files is you'll get endless error messages in your logs, trying and failing about once a second like:

May 22 04:01:51-376176 fs-28169 ERROR Assertion failed at gnunet-service-fs_indexing.c:303.
May 22 04:01:51-376211 fs-28169 ERROR Failed to find index M20F56KN

If you cannot recreate the file whose hash is M20F56KN, it just reports that error over and over again ad infinitum. For every file that is supposed to be indexed, but was deleted. I don't know if it ever gets tired of doing that. Removing idxinfo.lst doesn't seem to stop it from these endless attempts. Whatever calls GNUNET_FS_handle_on_demand_block does not even log what it's doing at the DEBUG level, so the above error messages are just back to back, without context, one after the other. I could attach gdb I suppose...

cy1

2020-06-05 05:25

reporter   ~0016231

Well this is interesting. ...I'm still clueless as to why, but while gnunet-download freezes on downloading that test data, even when indexed, gnunet-fs-gtk does NOT freeze.

gnunet-fs-gtk downloads the data instantly, so whatever's causing the unnecessary delay, it must be in gnunet-download, but not in gnunet-fs-gtk (which has download functionality).

Issue History

Date Modified Username Field Change
2020-04-30 09:23 cy1 New Issue
2020-04-30 09:23 cy1 File Added: testdata.tar.xz
2020-04-30 09:23 cy1 File Added: make_dumb_directory_tree.c
2020-04-30 09:25 cy1 Note Added: 0015812
2020-05-06 18:09 schanzen Assigned To => Christian Grothoff
2020-05-06 18:09 schanzen Status new => assigned
2020-05-06 18:10 schanzen Note Added: 0015863
2020-05-13 03:30 cy1 Note Added: 0015886
2020-05-15 05:41 cy1 Note Added: 0015892
2020-05-17 19:27 cy1 Note Added: 0015929
2020-05-17 21:38 cy1 Note Added: 0015930
2020-05-20 01:37 cy1 File Added: make_dumb_directory_tree-2.c
2020-05-20 01:37 cy1 File Added: testdata.tar-2.xz
2020-05-20 01:37 cy1 File Added: dl.log
2020-05-20 01:37 cy1 Note Added: 0015933
2020-05-22 04:28 cy1 File Added: make_dumb_directory_tree-3.c
2020-05-22 04:28 cy1 File Added: dl2.log
2020-05-22 04:28 cy1 Note Added: 0015939
2020-05-22 06:11 cy1 Note Added: 0015940
2020-06-01 00:49 Adminknox Issue cloned: 0006278
2020-06-05 05:25 cy1 Note Added: 0016231