View Issue Details

ID: 0002420
Project: GNUnet
Category: DHT service
View Status: public
Last Update: 2012-11-05 18:34
Reporter: diekmann
Assigned To: Christian Grothoff
Priority: urgent
Severity: major
Reproducibility: always
Status: closed
Resolution: fixed
Platform: x86-64
OS: Linux
OS Version: Ubuntu 12.04
Product Version: 0.9.3
Target Version: 0.9.4
Fixed in Version: 0.9.4
Summary: 0002420: Core eating up all memory
Description: Error in SVN revision 21321 and in 0.9.3


gnunet-service-core and gnunet-service-dht consume all resources.

Flood of
'core-7944 INFO Sending 8 message with 823 bytes to client interested in messages of type 148.'
Steps To Reproduce: Compile the GNUnet extension from the attachment. It builds against r21321 and, with minor changes, against 0.9.3.

./admin --configure
make
make install
make check
---
monitor memory and CPU usage
Tags: No tags attached.
Attached Files
core_eating_my_mem.tar.gz (101,739 bytes)

Activities

Christian Grothoff

2012-06-12 19:30

manager   ~0006051

Doesn't compile for me, 'json.h' is missing. Is that supposed to be a system header, or did you forget to include it in the tgz?

make[4]: Entering directory `/home/grothoff/test/src/popitter'
gcc -DHAVE_CONFIG_H -I. -I../.. -I../../src/include -I../.. -I/home/grothoff//include -fno-strict-aliasing -Wall -g -Wall -MT json.o -MD -MP -MF .deps/json.Tpo -c -o json.o `test -f 'util/json.c' || echo './'`util/json.c
util/json.c:32:18: fatal error: json.h: No such file or directory
compilation terminated.

Christian Grothoff

2012-06-12 20:36

manager   ~0006052

Found a json.h to make it work. Now, the issue seems to not be core taking so much memory, but the DHT (I see it take 4 GB on my system):

15332 grothoff 27 7 3477m 3.3g 536 R 15 37.9 0:32.37 gnunet-service-dht -c /tmp/test-popitter-multipeer//0//gnunet-testing-configbkZLA0 -L DEBUG
15341 grothoff 27 7 1218m 1.1g 544 R 13 12.8 0:29.74 gnunet-service-dht -c /tmp/test-popitter-multipeer//1//gnunet-testing-configQeT70U -L DEBUG
15333 grothoff 27 7 31888 1256 500 S 12 0.0 0:31.41 gnunet-service-core -c /tmp/test-popitter-multipeer//0//gnunet-testing-configbkZLA0 -L DEBUG
15342 grothoff 27 7 31872 1240 504 R 11 0.0 0:34.30 gnunet-service-core -c /tmp/test-popitter-multipeer//1//gnunet-testing-configQeT70U -L DEBUG
15324 grothoff 27 7 31896 1376 612 S 6 0.0 0:30.12 gnunet-service-core -c /tmp/test-popitter-multipeer//2//gnunet-testing-configyuf9S8 -L DEBUG
15337 grothoff 27 7 25180 440 180 S 3 0.0 0:07.08 gnunet-service-statistics -c /tmp/test-popitter-multipeer//0//gnunet-testing-configbkZLA0
15346 grothoff 27 7 25180 448 180 S 3 0.0 0:07.20 gnunet-service-statistics -c /tmp/test-popitter-multipeer//1//gnunet-testing-configQeT70U

Christian Grothoff

2012-06-12 20:43

manager   ~0006053

The queue of 'struct P2PPendingMessage' in gnunet-service-dht_neighbours.c seems to grow without bound:

(gdb) ba
#0 __memset_sse2 () at ../sysdeps/x86_64/multiarch/../memset.S:518
#1 0x00002b9d51114d68 in GNUNET_xmalloc_unchecked_ (size=775, filename=0x40f180 "gnunet-service-dht_neighbours.c", linenumber=1445) at common_allocation.c:142
#2 0x00002b9d511149d6 in GNUNET_xmalloc_ (size=775, filename=0x40f180 "gnunet-service-dht_neighbours.c", linenumber=1445) at common_allocation.c:66
#3 0x000000000040ba33 in GDS_NEIGHBOURS_handle_reply (target=0x103e880, type=16, expiration_time=..., key=0x7fffc67fe2b0, put_path_length=0, put_path=0x7fffc67fe2f0, get_path_length=0, get_path=0x7fffc67fdf70, data=0x7fffc67fe2f0,
    data_size=647) at gnunet-service-dht_neighbours.c:1445
#4 0x000000000040dcda in process (cls=0x7fffc67fdf00, key=0x7fffc67fe2b0, value=0x103e880) at gnunet-service-dht_routing.c:213
#5 0x00002b9d51124e3b in GNUNET_CONTAINER_multihashmap_get_multiple (map=0x1008f40, key=0x7fffc67fe2b0, it=0x40daf0 <process>, it_cls=0x7fffc67fdf00) at container_multihashmap.c:485
#6 0x000000000040e031 in GDS_ROUTING_process (type=16, expiration_time=..., key=0x7fffc67fe2b0, put_path_length=0, put_path=0x7fffc67fe2f0, get_path_length=1, get_path=0x7fffc67fdf70, data=0x7fffc67fe2f0, data_size=647)
    at gnunet-service-dht_routing.c:295
#7 0x000000000040d879 in handle_dht_p2p_result (cls=0x0, peer=0x7fffc67fe248, message=0x7fffc67fe298, atsi=0x7fffc67fe288, atsi_count=2) at gnunet-service-dht_neighbours.c:1941
#8 0x00002b9d500d29e8 in main_notify_handler (cls=0x1038b50, msg=0x7fffc67fe240) at core_api.c:934
#9 0x00002b9d511128af in receive_task (cls=0x1039920, tc=0x7fffc67fe630) at client.c:584
#10 0x00002b9d51144afd in run_ready (rs=0x1004240, ws=0x10042d0) at scheduler.c:602
#11 0x00002b9d51145301 in GNUNET_SCHEDULER_run (task=0x2b9d511522b4 <service_task>, task_cls=0x7fffc67fe930) at scheduler.c:790
#12 0x00002b9d51153d76 in GNUNET_SERVICE_run (argc=5, argv=0x7fffc67feb98, service_name=0x40e6b0 "dht", options=GNUNET_SERVICE_OPTION_NONE, task=0x402f87 <run>, task_cls=0x0) at service.c:1785
#13 0x0000000000403140 in main (argc=5, argv=0x7fffc67feb98) at gnunet-service-dht.c:184


(gdb) print *pi->head->next->next->next->next->next->next->next->next->next->next->next->next->next->next->next->next->next->next->next->next->next->next->next->next->next->next->next->next->next->next->next->next->next->next->next->nex
t->next->next->next->next->next->next->next->next->next->next->next->next->next->next->next->next->next->next->next->next->next->next->next->next->next->next->next->next->next->next->next->next->next->next->next->next->next->next->next-
>next->next->next->next->next
$10 = {next = 0x74ee4000, prev = 0x74ee5b90, importance = 0, timeout = {abs_value = 1339529969961}, msg = 0x74ee4338}

Christian Grothoff

2012-06-12 20:45

manager   ~0006054

Some statistics:

grothoff@spec:~/test/src/popitter$ gnunet-statistics -c /tmp/test-popitter-multipeer//0//gnunet-testing-config4Trqn7 -s dht
          dht # P2P RESULTS received: 330772
          dht # RESULT messages queued for transmission: 1984647
          dht # Good REPLIES matched against routing table: 1984641
          dht # REPLIES ignored for CLIENTS (no match): 330776
          dht # Duplicate REPLIES to CLIENT request dropped: 330774
          dht # Bytes transmitted to other peers: 358472834
          dht # GET messages queued for transmission: 47
          dht # Peers excluded from routing due to Bloomfilter: 42
          dht # GET requests routed: 41
          dht # GET requests from clients injected: 30
          dht # Good RESULTS found in datacache: 7
          dht # GET requests given to datacache: 10
          dht # Entries added to routing table: 10
          dht # P2P GET requests received: 10
          dht # Peer selection failed: 17
          dht # Bytes of bandwdith requested from core: 8816
          dht # P2P GET requests ONLY routed: 2
          dht # RESULTS queued for clients: 1
          dht # GET requests received from clients: 2
          dht # PUT messages queued for transmission: 2
          dht # PUT requests routed: 1
          dht # ITEMS stored in datacache: 1
          dht # PUT requests received from clients: 1
          dht # peers connected: 2
          dht # Preference updates given to core: 1
          dht # FIND PEER messages initiated: 1
grothoff@spec:~/test/src/popitter$ gnunet-statistics -c /tmp/test-popitter-multipeer//1/gnunet-testing-confignsYotd -s dht
          dht # Good REPLIES matched against routing table: 5113141
          dht # REPLIES ignored for CLIENTS (no match): 426108
          dht # Duplicate REPLIES to CLIENT request dropped: 426105
          dht # P2P RESULTS received: 426080
          dht # RESULT messages queued for transmission: 5113094
          dht # Bytes transmitted to other peers: 342182505
          dht # GET messages queued for transmission: 30
          dht # Peers excluded from routing due to Bloomfilter: 51
          dht # GET requests routed: 48
          dht # GET requests from clients injected: 30
          dht # Peer selection failed: 35
          dht # Bytes of bandwdith requested from core: 5125
          dht # P2P FIND PEER requests processed: 2
          dht # Entries added to routing table: 17
          dht # P2P GET requests received: 17
          dht # Good RESULTS found in datacache: 25
          dht # GET requests given to datacache: 17
          dht # PUT requests routed: 2
          dht # ITEMS stored in datacache: 2
          dht # P2P PUT requests received: 1
          dht # FIND PEER requests ignored due to Bloomfilter: 1
          dht # Preference updates given to core: 2
          dht # peers connected: 2
          dht # FIND PEER messages initiated: 1
          dht # RESULTS queued for clients: 1
          dht # GET requests received from clients: 2
          dht # PUT requests received from clients: 1
grothoff@spec:~/test/src/popitter$ gnunet-statistics -c /tmp/test-popitter-multipeer//2/gnunet-testing-configdBsbBm -s dht
          dht # RESULT messages queued for transmission: 4545415
          dht # Good REPLIES matched against routing table: 4545402
          dht # REPLIES ignored for CLIENTS (no match): 378793
          dht # Duplicate REPLIES to CLIENT request dropped: 378791
          dht # P2P RESULTS received: 378768
          dht # Bytes transmitted to other peers: 245078032
          dht # PUT messages queued for transmission: 2
          dht # Peers excluded from routing due to Bloomfilter: 49
          dht # PUT requests routed: 3
          dht # ITEMS stored in datacache: 3
          dht # RESULTS queued for clients: 2
          dht # PUT requests received from clients: 2
          dht # GET requests from clients injected: 30
          dht # GET messages queued for transmission: 29
          dht # GET requests routed: 47
          dht # Good RESULTS found in datacache: 22
          dht # GET requests given to datacache: 17
          dht # Entries added to routing table: 16
          dht # P2P GET requests received: 16
          dht # Peer selection failed: 34
          dht # Bytes of bandwdith requested from core: 4377
          dht # P2P PUT requests received: 1
          dht # P2P FIND PEER requests processed: 1
          dht # Preference updates given to core: 2
          dht # peers connected: 2
          dht # FIND PEER messages initiated: 1
          dht # GET requests received from clients: 2

Christian Grothoff

2012-06-12 20:50

manager   ~0006055

So the problem seems to be the following. As discussed with the reporter earlier, there is no check for duplicate results. Since we have 3 peers that have all asked each other for a result at some point, all three peers keep forwarding the same result to each other at high speed (no bandwidth limit applies in the test). A gets a result and sends it to B and C; B gets a result and sends it to C and A; C gets a result and sends it to A and B. Every received result thus triggers two new transmissions, which leaves roughly n messages in the system after n transmissions. As none of the DHTs limit their queue sizes, and as the duplicate check by the block library was not implemented, this ends up being rather unhealthy.
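
To illustrate the growth described in this note, here is a minimal simulation sketch. It is not GNUnet code: `simulate_flood`, `NUM_PEERS` and the per-peer `inbox` counters are hypothetical names, and it assumes that each delivered result is forwarded once to every other peer with no duplicate check and no queue limit.

```c
#include <assert.h>
#include <stddef.h>

#define NUM_PEERS 3 /* hypothetical: peers A, B and C from the note */

/* Deliver `transmissions` messages among NUM_PEERS peers that forward
 * every received result to all other peers. Returns the number of
 * messages still queued afterwards. */
static unsigned long
simulate_flood (unsigned long transmissions)
{
  unsigned long inbox[NUM_PEERS] = { 0 };
  unsigned long done = 0;
  unsigned long total = 0;

  assert (NUM_PEERS >= 3); /* with fewer peers the queues do not grow */

  /* Peer 0 ("A") obtains a result and sends it to every other peer. */
  for (size_t q = 1; q < NUM_PEERS; q++)
    inbox[q]++;

  while (done < transmissions)
  {
    for (size_t p = 0; p < NUM_PEERS && done < transmissions; p++)
    {
      if (0 == inbox[p])
        continue;
      inbox[p]--; /* one message is delivered to peer p ... */
      done++;
      for (size_t q = 0; q < NUM_PEERS; q++)
        if (q != p)
          inbox[q]++; /* ... and forwarded to every other peer */
    }
  }

  for (size_t p = 0; p < NUM_PEERS; p++)
    total += inbox[p];
  return total;
}
```

Under these assumptions each delivery removes one message and adds two, so `simulate_flood (n)` returns `n + 2`: the backlog grows linearly with the number of transmissions, which is consistent with the unbounded queue seen in the gdb session.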

Solutions: 1) limit per-peer queue size; 2) really require block plugins to do duplicate checks...
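
As a rough sketch of solution 1), a per-peer queue could refuse new messages once a fixed limit is reached. This is an illustration only, not the change that was committed: `struct PeerQueue`, `peer_queue_enqueue`, `peer_queue_dequeue` and the limit `MAX_PENDING_PER_PEER` are hypothetical names, and dropping the newest message is just one possible policy.

```c
#include <assert.h>
#include <stdlib.h>

#define MAX_PENDING_PER_PEER 4 /* hypothetical limit, small for illustration */

struct PendingMessage
{
  struct PendingMessage *next;
  int payload; /* stands in for the real message body */
};

struct PeerQueue
{
  struct PendingMessage *head;
  struct PendingMessage *tail;
  unsigned int pending_count;
  unsigned long dropped;
};

/* Append a message unless the queue is full.
 * Returns 1 if queued, 0 if dropped. */
static int
peer_queue_enqueue (struct PeerQueue *q, int payload)
{
  struct PendingMessage *pm;

  assert (NULL != q);
  if (q->pending_count >= MAX_PENDING_PER_PEER)
  {
    q->dropped++; /* queue full: drop instead of growing without bound */
    return 0;
  }
  pm = malloc (sizeof (*pm));
  if (NULL == pm)
  {
    q->dropped++; /* allocation failure is treated like a drop */
    return 0;
  }
  pm->next = NULL;
  pm->payload = payload;
  if (NULL == q->tail)
    q->head = pm;
  else
    q->tail->next = pm;
  q->tail = pm;
  q->pending_count++;
  return 1;
}

/* Remove the oldest message. Returns 1 and sets *payload, or 0 if empty. */
static int
peer_queue_dequeue (struct PeerQueue *q, int *payload)
{
  struct PendingMessage *pm;

  assert (NULL != q);
  assert (NULL != payload);
  pm = q->head;
  if (NULL == pm)
    return 0;
  q->head = pm->next;
  if (NULL == q->head)
    q->tail = NULL;
  q->pending_count--;
  *payload = pm->payload;
  free (pm);
  return 1;
}
```

With such a limit, memory use per peer is bounded by `MAX_PENDING_PER_PEER` messages regardless of how fast results arrive. Dropping the oldest entry instead of the newest would be an equally valid design; the `dropped` counter is there so the loss stays visible in statistics.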

Christian Grothoff

2012-06-12 21:24

manager   ~0006056

Fixed the unbounded message queue in the DHT in SVN 21927. I also documented the need to check for duplicates in the C tutorial and in the block chapter of the developer handbook, and added sample code to the block template.
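
For readers wondering what a duplicate check might look like in principle, here is a self-contained sketch. It is not the sample code from the block template and does not use the GNUnet block API: `struct SeenReplies`, `seen_replies_check_and_add`, `fnv1a64` and `SEEN_SLOTS` are hypothetical names, and the FNV-1a hash with a small fixed-size table is an assumption chosen only to keep the example short.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SEEN_SLOTS 64 /* hypothetical capacity per pending request */

struct SeenReplies
{
  uint64_t slots[SEEN_SLOTS];
  unsigned char used[SEEN_SLOTS];
  size_t count;
};

/* 64-bit FNV-1a hash over the reply body. */
static uint64_t
fnv1a64 (const void *data, size_t size)
{
  const unsigned char *p = data;
  uint64_t h = 14695981039346656037ULL;

  for (size_t i = 0; i < size; i++)
  {
    h ^= p[i];
    h *= 1099511628211ULL;
  }
  return h;
}

/* Returns 1 if the reply has not been seen before (and records it),
 * 0 if it is a duplicate or the table is full. */
static int
seen_replies_check_and_add (struct SeenReplies *s,
                            const void *data, size_t size)
{
  uint64_t h;

  assert (NULL != s);
  h = fnv1a64 (data, size);
  for (size_t probe = 0; probe < SEEN_SLOTS; probe++)
  {
    size_t idx = (size_t) ((h + probe) % SEEN_SLOTS); /* linear probing */

    if (! s->used[idx])
    {
      s->used[idx] = 1;
      s->slots[idx] = h;
      s->count++;
      return 1; /* first time this reply is seen */
    }
    if (s->slots[idx] == h)
      return 0; /* same hash already recorded: treat as duplicate */
  }
  return 0; /* table full: conservatively do not forward */
}
```

A peer would forward a result only when the check returns 1, which breaks the A, B, C forwarding loop after the first round. Two caveats of this sketch: a hash collision makes a distinct reply look like a duplicate, and a full table suppresses all further replies for that request.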

Issue History

Date Modified Username Field Change
2012-06-12 13:23 diekmann New Issue
2012-06-12 13:23 diekmann File Added: core_eating_my_mem.tar.gz
2012-06-12 19:23 Christian Grothoff Assigned To => Christian Grothoff
2012-06-12 19:23 Christian Grothoff Status new => assigned
2012-06-12 19:30 Christian Grothoff Note Added: 0006051
2012-06-12 19:31 Christian Grothoff Assigned To Christian Grothoff =>
2012-06-12 19:31 Christian Grothoff Status assigned => feedback
2012-06-12 19:31 Christian Grothoff Priority normal => high
2012-06-12 20:36 Christian Grothoff Assigned To => Christian Grothoff
2012-06-12 20:36 Christian Grothoff Status feedback => assigned
2012-06-12 20:36 Christian Grothoff Note Added: 0006052
2012-06-12 20:36 Christian Grothoff Category core service => DHT service
2012-06-12 20:43 Christian Grothoff Note Added: 0006053
2012-06-12 20:45 Christian Grothoff Note Added: 0006054
2012-06-12 20:50 Christian Grothoff Note Added: 0006055
2012-06-12 20:50 Christian Grothoff Target Version => 0.9.4
2012-06-12 20:54 Christian Grothoff Priority high => urgent
2012-06-12 20:54 Christian Grothoff Severity minor => major
2012-06-12 21:24 Christian Grothoff Note Added: 0006056
2012-06-12 21:24 Christian Grothoff Status assigned => resolved
2012-06-12 21:24 Christian Grothoff Fixed in Version => 0.9.4
2012-06-12 21:24 Christian Grothoff Resolution open => fixed
2012-11-05 18:34 Christian Grothoff Status resolved => closed