View Issue Details
ID | Project | Category | View Status | Date Submitted | Last Update |
---|---|---|---|---|---|
0002420 | GNUnet | DHT service | public | 2012-06-12 13:23 | 2012-11-05 18:34 |
Reporter | diekmann | Assigned To | Christian Grothoff | ||
Priority | urgent | Severity | major | Reproducibility | always |
Status | closed | Resolution | fixed | ||
Platform | x86-64 | OS | Linux | OS Version | Ubuntu 12.04 |
Product Version | 0.9.3 | ||||
Target Version | 0.9.4 | Fixed in Version | 0.9.4 | ||
Summary | 0002420: Core eating up all memory | ||||
Description | With SVN revision 21321 and with 0.9.3, gnunet-service-core and gnunet-service-dht consume all resources. The log is flooded with 'core-7944 INFO Sending 8 message with 823 bytes to client interested in messages of type 148.' | ||||
Steps To Reproduce | Compile the GNUnet extension from the attachment (it builds against r21321 and, with minor changes, against 0.9.3): ./admin --configure; make; make install; make check. Then monitor memory and CPU usage. | ||||
Tags | No tags attached. | ||||
Attached Files | |||||
|
Doesn't compile for me, 'json.h' is missing. Is that supposed to be a system header, or did you forget to include it in the tgz?
make[4]: Entering directory `/home/grothoff/test/src/popitter'
gcc -DHAVE_CONFIG_H -I. -I../.. -I../../src/include -I../.. -I/home/grothoff//include -fno-strict-aliasing -Wall -g -Wall -MT json.o -MD -MP -MF .deps/json.Tpo -c -o json.o `test -f 'util/json.c' || echo './'`util/json.c
util/json.c:32:18: fatal error: json.h: No such file or directory
compilation terminated. |
|
Found a json.h to make it work. Now, the issue seems to not be core taking so much memory, but the DHT (I see it take 4 GB on my system):
15332 grothoff 27 7 3477m 3.3g 536 R 15 37.9 0:32.37 gnunet-service-dht -c /tmp/test-popitter-multipeer//0//gnunet-testing-configbkZLA0 -L DEBUG
15341 grothoff 27 7 1218m 1.1g 544 R 13 12.8 0:29.74 gnunet-service-dht -c /tmp/test-popitter-multipeer//1//gnunet-testing-configQeT70U -L DEBUG
15333 grothoff 27 7 31888 1256 500 S 12 0.0 0:31.41 gnunet-service-core -c /tmp/test-popitter-multipeer//0//gnunet-testing-configbkZLA0 -L DEBUG
15342 grothoff 27 7 31872 1240 504 R 11 0.0 0:34.30 gnunet-service-core -c /tmp/test-popitter-multipeer//1//gnunet-testing-configQeT70U -L DEBUG
15324 grothoff 27 7 31896 1376 612 S 6 0.0 0:30.12 gnunet-service-core -c /tmp/test-popitter-multipeer//2//gnunet-testing-configyuf9S8 -L DEBUG
15337 grothoff 27 7 25180 440 180 S 3 0.0 0:07.08 gnunet-service-statistics -c /tmp/test-popitter-multipeer//0//gnunet-testing-configbkZLA0
15346 grothoff 27 7 25180 448 180 S 3 0.0 0:07.20 gnunet-service-statistics -c /tmp/test-popitter-multipeer//1//gnunet-testing-configQeT70U |
|
The queue of 'struct P2PPendingMessage' in gnunet-service-dht_neighbours.c seems to grow without bound: (gdb) ba #0 __memset_sse2 () at ../sysdeps/x86_64/multiarch/../memset.S:518 #1 0x00002b9d51114d68 in GNUNET_xmalloc_unchecked_ (size=775, filename=0x40f180 "gnunet-service-dht_neighbours.c", linenumber=1445) at common_allocation.c:142 #2 0x00002b9d511149d6 in GNUNET_xmalloc_ (size=775, filename=0x40f180 "gnunet-service-dht_neighbours.c", linenumber=1445) at common_allocation.c:66 #3 0x000000000040ba33 in GDS_NEIGHBOURS_handle_reply (target=0x103e880, type=16, expiration_time=..., key=0x7fffc67fe2b0, put_path_length=0, put_path=0x7fffc67fe2f0, get_path_length=0, get_path=0x7fffc67fdf70, data=0x7fffc67fe2f0, data_size=647) at gnunet-service-dht_neighbours.c:1445 #4 0x000000000040dcda in process (cls=0x7fffc67fdf00, key=0x7fffc67fe2b0, value=0x103e880) at gnunet-service-dht_routing.c:213 #5 0x00002b9d51124e3b in GNUNET_CONTAINER_multihashmap_get_multiple (map=0x1008f40, key=0x7fffc67fe2b0, it=0x40daf0 <process>, it_cls=0x7fffc67fdf00) at container_multihashmap.c:485 #6 0x000000000040e031 in GDS_ROUTING_process (type=16, expiration_time=..., key=0x7fffc67fe2b0, put_path_length=0, put_path=0x7fffc67fe2f0, get_path_length=1, get_path=0x7fffc67fdf70, data=0x7fffc67fe2f0, data_size=647) at gnunet-service-dht_routing.c:295 #7 0x000000000040d879 in handle_dht_p2p_result (cls=0x0, peer=0x7fffc67fe248, message=0x7fffc67fe298, atsi=0x7fffc67fe288, atsi_count=2) at gnunet-service-dht_neighbours.c:1941 #8 0x00002b9d500d29e8 in main_notify_handler (cls=0x1038b50, msg=0x7fffc67fe240) at core_api.c:934 #9 0x00002b9d511128af in receive_task (cls=0x1039920, tc=0x7fffc67fe630) at client.c:584 #10 0x00002b9d51144afd in run_ready (rs=0x1004240, ws=0x10042d0) at scheduler.c:602 #11 0x00002b9d51145301 in GNUNET_SCHEDULER_run (task=0x2b9d511522b4 <service_task>, task_cls=0x7fffc67fe930) at scheduler.c:790 #12 0x00002b9d51153d76 in GNUNET_SERVICE_run (argc=5, argv=0x7fffc67feb98, 
service_name=0x40e6b0 "dht", options=GNUNET_SERVICE_OPTION_NONE, task=0x402f87 <run>, task_cls=0x0) at service.c:1785 #13 0x0000000000403140 in main (argc=5, argv=0x7fffc67feb98) at gnunet-service-dht.c:184 (gdb) print *pi->head->next->next->next->next->next->next->next->next->next->next->next->next->next->next->next->next->next->next->next->next->next->next->next->next->next->next->next->next->next->next->next->next->next->next->next->nex t->next->next->next->next->next->next->next->next->next->next->next->next->next->next->next->next->next->next->next->next->next->next->next->next->next->next->next->next->next->next->next->next->next->next->next->next->next->next->next- >next->next->next->next->next $10 = {next = 0x74ee4000, prev = 0x74ee5b90, importance = 0, timeout = {abs_value = 1339529969961}, msg = 0x74ee4338} |
|
Some statistics:
grothoff@spec:~/test/src/popitter$ gnunet-statistics -c /tmp/test-popitter-multipeer//0//gnunet-testing-config4Trqn7 -s dht
dht # P2P RESULTS received: 330772
dht # RESULT messages queued for transmission: 1984647
dht # Good REPLIES matched against routing table: 1984641
dht # REPLIES ignored for CLIENTS (no match): 330776
dht # Duplicate REPLIES to CLIENT request dropped: 330774
dht # Bytes transmitted to other peers: 358472834
dht # GET messages queued for transmission: 47
dht # Peers excluded from routing due to Bloomfilter: 42
dht # GET requests routed: 41
dht # GET requests from clients injected: 30
dht # Good RESULTS found in datacache: 7
dht # GET requests given to datacache: 10
dht # Entries added to routing table: 10
dht # P2P GET requests received: 10
dht # Peer selection failed: 17
dht # Bytes of bandwdith requested from core: 8816
dht # P2P GET requests ONLY routed: 2
dht # RESULTS queued for clients: 1
dht # GET requests received from clients: 2
dht # PUT messages queued for transmission: 2
dht # PUT requests routed: 1
dht # ITEMS stored in datacache: 1
dht # PUT requests received from clients: 1
dht # peers connected: 2
dht # Preference updates given to core: 1
dht # FIND PEER messages initiated: 1
grothoff@spec:~/test/src/popitter$ gnunet-statistics -c /tmp/test-popitter-multipeer//1/gnunet-testing-confignsYotd -s dht
dht # Good REPLIES matched against routing table: 5113141
dht # REPLIES ignored for CLIENTS (no match): 426108
dht # Duplicate REPLIES to CLIENT request dropped: 426105
dht # P2P RESULTS received: 426080
dht # RESULT messages queued for transmission: 5113094
dht # Bytes transmitted to other peers: 342182505
dht # GET messages queued for transmission: 30
dht # Peers excluded from routing due to Bloomfilter: 51
dht # GET requests routed: 48
dht # GET requests from clients injected: 30
dht # Peer selection failed: 35
dht # Bytes of bandwdith requested from core: 5125
dht # P2P FIND PEER requests processed: 2
dht # Entries added to routing table: 17
dht # P2P GET requests received: 17
dht # Good RESULTS found in datacache: 25
dht # GET requests given to datacache: 17
dht # PUT requests routed: 2
dht # ITEMS stored in datacache: 2
dht # P2P PUT requests received: 1
dht # FIND PEER requests ignored due to Bloomfilter: 1
dht # Preference updates given to core: 2
dht # peers connected: 2
dht # FIND PEER messages initiated: 1
dht # RESULTS queued for clients: 1
dht # GET requests received from clients: 2
dht # PUT requests received from clients: 1
grothoff@spec:~/test/src/popitter$ gnunet-statistics -c /tmp/test-popitter-multipeer//2/gnunet-testing-configdBsbBm -s dht
dht # RESULT messages queued for transmission: 4545415
dht # Good REPLIES matched against routing table: 4545402
dht # REPLIES ignored for CLIENTS (no match): 378793
dht # Duplicate REPLIES to CLIENT request dropped: 378791
dht # P2P RESULTS received: 378768
dht # Bytes transmitted to other peers: 245078032
dht # PUT messages queued for transmission: 2
dht # Peers excluded from routing due to Bloomfilter: 49
dht # PUT requests routed: 3
dht # ITEMS stored in datacache: 3
dht # RESULTS queued for clients: 2
dht # PUT requests received from clients: 2
dht # GET requests from clients injected: 30
dht # GET messages queued for transmission: 29
dht # GET requests routed: 47
dht # Good RESULTS found in datacache: 22
dht # GET requests given to datacache: 17
dht # Entries added to routing table: 16
dht # P2P GET requests received: 16
dht # Peer selection failed: 34
dht # Bytes of bandwdith requested from core: 4377
dht # P2P PUT requests received: 1
dht # P2P FIND PEER requests processed: 1
dht # Preference updates given to core: 2
dht # peers connected: 2
dht # FIND PEER messages initiated: 1
dht # GET requests received from clients: 2 |
|
So the problem seems to be this: as discussed with the reporter earlier, there is no check for duplicate results, and since we have 3 peers that have all asked each other for a result at some point, all three peers keep forwarding the same result among each other at high speed (no bandwidth limit applies in the test). So A gets a result and sends it to B and C; B gets a result and sends it to C and A; C gets a result and sends it to A and B. Each transmission consumes one message but queues two new ones, which causes n messages to be LEFT in the system after n transmissions. As none of the DHTs limit their queue sizes and as the duplicate check by the block library was not implemented, this ends up being rather unhealthy. Solutions: 1) limit per-peer queue size; 2) really require block plugins to do duplicate checks... |
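The growth described in this note can be reproduced with a toy model. The sketch below is not GNUnet code: `simulate_flood`, `NUM_PEERS`, and the round-robin delivery order are assumptions made purely for illustration. It counts how many messages remain queued after a given number of deliveries among three fully connected peers, with and without a duplicate check.

```c
#include <assert.h>
#include <stddef.h>

#define NUM_PEERS 3 /* hypothetical: matches the 3-peer test in this report */

/* Toy model of the forwarding loop (NOT GNUnet code).
 * Peer 0 starts with the result and sends it to every other peer.
 * Each delivery removes one queued message; unless it is dropped as a
 * duplicate, the receiver forwards it to all other peers.
 * Returns the number of messages still queued after at most
 * `deliveries` deliveries. */
unsigned long
simulate_flood (unsigned long deliveries, int dedup)
{
  unsigned long queued[NUM_PEERS] = { 0 }; /* messages waiting per receiver */
  int seen[NUM_PEERS] = { 0 };             /* has this peer seen the result? */
  unsigned long total = 0;
  unsigned long n;
  size_t p = 0;
  size_t q;

  assert (NUM_PEERS >= 2);
  seen[0] = 1;
  for (q = 1; q < NUM_PEERS; q++)
  {
    queued[q]++;
    total++;
  }
  for (n = 0; (n < deliveries) && (total > 0); n++)
  {
    /* pick the next peer that has something to receive */
    while (0 == queued[p])
      p = (p + 1) % NUM_PEERS;
    queued[p]--;
    total--;
    if (dedup && seen[p])
      continue; /* duplicate: drop instead of forwarding */
    seen[p] = 1;
    for (q = 0; q < NUM_PEERS; q++)
    {
      if (q == p)
        continue;
      queued[q]++;
      total++;
    }
    p = (p + 1) % NUM_PEERS;
  }
  return total;
}
```

In this model, every delivery without a duplicate check adds one net message to the system, so the backlog grows linearly with the number of transmissions. With the check enabled, each peer forwards the result at most once and the queues drain completely.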
|
Fixed unbounded message queue in DHT in SVN 21927 (I also documented the need to check for duplicates in the C tutorial and in the block chapter of the developer handbook, and added sample code to the block template). |
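For readers unfamiliar with the idea, a bounded per-peer queue could look roughly like the sketch below. This is not the change from SVN 21927: the names `peer_queue`, `pending_msg`, `peer_queue_add`, `peer_queue_pop`, the limit `MAX_PENDING_PER_PEER`, and the policy of rejecting new messages when the queue is full are all assumptions chosen for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define MAX_PENDING_PER_PEER 64 /* hypothetical limit, not from the fix */

/* One queued message (hypothetical layout, not struct P2PPendingMessage). */
struct pending_msg
{
  struct pending_msg *next;
  size_t size;
  unsigned char data[]; /* C99 flexible array member holding the payload */
};

/* Per-peer FIFO with a hard upper bound on its length. */
struct peer_queue
{
  struct pending_msg *head;
  struct pending_msg *tail;
  unsigned int count;    /* messages currently queued */
  unsigned long dropped; /* messages rejected (queue full or out of memory) */
};

/* Append a copy of `data` to the queue.
 * Returns 1 if queued, 0 if the message was dropped. */
int
peer_queue_add (struct peer_queue *pq, const void *data, size_t size)
{
  struct pending_msg *pm;

  assert (NULL != pq);
  if (pq->count >= MAX_PENDING_PER_PEER)
  {
    pq->dropped++;
    return 0;
  }
  pm = malloc (sizeof (*pm) + size);
  if (NULL == pm)
  {
    pq->dropped++;
    return 0;
  }
  pm->next = NULL;
  pm->size = size;
  if (size > 0)
    memcpy (pm->data, data, size);
  if (NULL == pq->tail)
    pq->head = pm;
  else
    pq->tail->next = pm;
  pq->tail = pm;
  pq->count++;
  return 1;
}

/* Remove and free the oldest message, e.g. after transmitting it.
 * Returns 1 if a message was removed, 0 if the queue was empty. */
int
peer_queue_pop (struct peer_queue *pq)
{
  struct pending_msg *pm;

  assert (NULL != pq);
  pm = pq->head;
  if (NULL == pm)
    return 0;
  pq->head = pm->next;
  if (NULL == pq->head)
    pq->tail = NULL;
  pq->count--;
  free (pm);
  return 1;
}
```

A zero-initialized `struct peer_queue` is an empty queue. With such a bound in place, memory use per peer is capped at roughly `MAX_PENDING_PER_PEER` messages no matter how fast duplicates arrive. The bound only limits the damage; the duplicate check is still what stops the loop.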
Date Modified | Username | Field | Change |
---|---|---|---|
2012-06-12 13:23 | diekmann | New Issue | |
2012-06-12 13:23 | diekmann | File Added: core_eating_my_mem.tar.gz | |
2012-06-12 19:23 | Christian Grothoff | Assigned To | => Christian Grothoff |
2012-06-12 19:23 | Christian Grothoff | Status | new => assigned |
2012-06-12 19:30 | Christian Grothoff | Note Added: 0006051 | |
2012-06-12 19:31 | Christian Grothoff | Assigned To | Christian Grothoff => |
2012-06-12 19:31 | Christian Grothoff | Status | assigned => feedback |
2012-06-12 19:31 | Christian Grothoff | Priority | normal => high |
2012-06-12 20:36 | Christian Grothoff | Assigned To | => Christian Grothoff |
2012-06-12 20:36 | Christian Grothoff | Status | feedback => assigned |
2012-06-12 20:36 | Christian Grothoff | Note Added: 0006052 | |
2012-06-12 20:36 | Christian Grothoff | Category | core service => DHT service |
2012-06-12 20:43 | Christian Grothoff | Note Added: 0006053 | |
2012-06-12 20:45 | Christian Grothoff | Note Added: 0006054 | |
2012-06-12 20:50 | Christian Grothoff | Note Added: 0006055 | |
2012-06-12 20:50 | Christian Grothoff | Target Version | => 0.9.4 |
2012-06-12 20:54 | Christian Grothoff | Priority | high => urgent |
2012-06-12 20:54 | Christian Grothoff | Severity | minor => major |
2012-06-12 21:24 | Christian Grothoff | Note Added: 0006056 | |
2012-06-12 21:24 | Christian Grothoff | Status | assigned => resolved |
2012-06-12 21:24 | Christian Grothoff | Fixed in Version | => 0.9.4 |
2012-06-12 21:24 | Christian Grothoff | Resolution | open => fixed |
2012-11-05 18:34 | Christian Grothoff | Status | resolved => closed |