View Issue Details
ID | Project | Category | View Status | Date Submitted | Last Update |
---|---|---|---|---|---|
0003846 | GNUnet | cadet service | public | 2015-06-19 16:46 | 2018-06-07 00:25 |
Reporter | amatus | Assigned To | Christian Grothoff | ||
Priority | normal | Severity | crash | Reproducibility | always |
Status | closed | Resolution | fixed | ||
Platform | x86 | OS | Debian | OS Version | Jessie |
Product Version | Git master | ||||
Target Version | 0.11.0pre66 | Fixed in Version | 0.11.0pre66 | ||
Summary | 0003846: Assertion failed (GNUNET_CONTAINER_multihashmap_remove) in GCP_remove_connection (rev 35954) | ||||
Description | Program terminated with signal SIGABRT, Aborted. #0 0x04161ddc in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56 56 ../nptl/sysdeps/unix/sysv/linux/raise.c: No such file or directory. (gdb) bt #0 0x04161ddc in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56 #1 0x04163463 in __GI_abort () at abort.c:89 #2 0x0405d98a in GNUNET_abort_ () at common_logging.c:290 #3 0x08067f88 in GCP_remove_connection (peer=0x4a93df8, c=0x6062b58) at gnunet-service-cadet_peer.c:2181 #4 0x08057356 in unregister_neighbors (c=0x6062b58) at gnunet-service-cadet_connection.c:1582 #5 0x0805a230 in GCC_destroy (c=0x6062b58) at gnunet-service-cadet_connection.c:2804 #6 0x0805a0a0 in GCC_new (cid=0xbeb09648, t=0x0, path=0x5f40cc8, own_pos=2) at gnunet-service-cadet_connection.c:2766 #7 0x08057b55 in GCC_handle_create (cls=0x0, peer=0xbeb09624, message=0xbeb09644) at gnunet-service-cadet_connection.c:1811 #8 0x040cd726 in main_notify_handler (cls=0x4634440, msg=0xbeb09620) at core_api.c:967 #9 0x04059028 in receive_task (cls=0x46352b0, tc=0xbeb09814) at client.c:618 #10 0x040928b4 in run_ready (rs=0x4527358, ws=0x4527410) at scheduler.c:587 #11 0x04093142 in GNUNET_SCHEDULER_run (task=0x409dcd0 <service_task>, task_cls=0xbeb09a40) at scheduler.c:867 #12 0x0409f86e in GNUNET_SERVICE_run (argc=7, argv=0xbeb09c34, service_name=0x806d314 "cadet", options=GNUNET_SERVICE_OPTION_NONE, task=0x8069be5 <run>, task_cls=0x0) at service.c:1503 #13 0x08069e77 in main (argc=7, argv=0xbeb09c34) at gnunet-service-cadet.c:174 (gdb) f 3 #3 0x08067f88 in GCP_remove_connection (peer=0x4a93df8, c=0x6062b58) at gnunet-service-cadet_peer.c:2181 2181 gnunet-service-cadet_peer.c: No such file or directory. (gdb) p *peer $1 = {id = 11, last_contact = {abs_value_us = 1434723237915931}, path_head = 0x4a94280, path_tail = 0x4a94280, search_h = 0x0, search_delayed = 0x0, tunnel = 0x0, connections = 0x4a944a8, core_transmit = 0x0, tmt_time = {abs_value_us = 0}, queue_head = 0x0, queue_tail = 0x0, queue_n = 0, hello = 0x4e7c798} (gdb) p *peer->connections $2 = {map = 0x4a944f0, size = 1, map_length = 16, use_small_entries = 0, modification_counter = 1} (gdb) p *c $3 = {t = 0x0, fwd_fc = {c = 0x6062b58, queue_n = 0, queue_max = 11, last_pid_sent = 4294967295, last_pid_recv = 4294967295, recv_bitmap = 0, last_ack_sent = 0, last_ack_recv = 0, poll_task = 0x0, poll_time = { rel_value_us = 1000000}, poll_msg = 0x0, ack_msg = 0x0}, bck_fc = { c = 0x6062b58, queue_n = 0, queue_max = 11, last_pid_sent = 4294967295, last_pid_recv = 4294967295, recv_bitmap = 0, last_ack_sent = 0, last_ack_recv = 0, poll_task = 0x0, poll_time = {rel_value_us = 1000000}, poll_msg = 0x0, ack_msg = 0x0}, perf = 0x0, id = { bits = "\255\307\b\033+-\034\320?k\354\230\060\030{\273\032\067\377\024$\337$$'\245\241\300L\305_\376"}, path = 0x6056818, fwd_maintenance_task = 0x0, bck_maintenance_task = 0x0, maintenance_q = 0x0, next_peer = 0x4a915e0, prev_peer = 0x4a93df8, state = CADET_CONNECTION_NEW, own_pos = 2, pending_messages = 0, destroy = 2, create_retry = 0} gnunet-service-cadet was run under valgrind and no memory badness was reported when hitting this assertion. This happened after restarting gnunet-service-cadet after a previous crash. | ||||
Tags | No tags attached. | ||||
has duplicate | 0003873 | closed | Bart Polot | Assertion failed at gnunet-service-cadet_peer.c:2222. |
related to | 0003845 | closed | Bart Polot | Assertion failure (c->next_peer == peer) in unregister_neighbors (rev 35954) |
|
SVN rev 35978 still hits this assertion. New error messages are printed out: Jun 24 16:11:49-825775 cadet-p2p-7630 ERROR core wait time 1526348 µs Jun 24 16:11:57-208685 cadet-p2p-7630 ERROR core wait time 1810139 µs Jun 24 16:12:35-871996 cadet-p2p-7630 ERROR core wait time 3006385 µs Jun 24 16:13:06-559054 cadet-p2p-7630 ERROR core wait time 7381775 µs Jun 24 16:13:33-010045 cadet-p2p-7630 ERROR core wait time 5246260 µs Jun 24 16:13:39-973830 cadet-p2p-7630 ERROR core wait time 2006685 µs Jun 24 16:13:52-347106 cadet-p2p-7630 ERROR core wait time 2666171 µs Jun 24 16:14:01-146560 cadet-p2p-7630 ERROR core wait time 7768246 µs Jun 24 16:14:10-283293 cadet-p2p-7630 ERROR core wait time 4631520 µs Jun 24 16:14:33-392780 cadet-p2p-7630 ERROR core wait time 4526869 µs Jun 24 16:14:33-453537 cadet-p2p-7630 ERROR core wait time 4560109 µs Jun 24 16:14:35-003190 cadet-7630 ERROR Assertion failed at gnunet-service-cadet_peer.c:2199. Jun 24 16:17:30-335802 cadet-p2p-7637 ERROR core wait time 1109652 µs Jun 24 16:17:48-483554 cadet-p2p-7637 ERROR core wait time 11343884 µs Jun 24 16:17:50-826244 cadet-p2p-7637 ERROR core wait time 1415831 µs Jun 24 16:17:56-706457 cadet-p2p-7637 ERROR core wait time 3268081 µs Jun 24 16:18:09-540902 cadet-p2p-7637 ERROR core wait time 5441439 µs Jun 24 16:18:14-860535 cadet-p2p-7637 ERROR core wait time 4287178 µs Jun 24 16:18:20-694135 cadet-p2p-7637 ERROR core wait time 1796187 µs Jun 24 16:18:20-694288 cadet-p2p-7637 ERROR core wait time 1450599 µs Jun 24 16:18:37-436791 cadet-p2p-7637 ERROR core wait time 3483678 µs Jun 24 16:18:47-915556 cadet-p2p-7637 ERROR core wait time 24561400 µs Jun 24 16:19:03-961717 cadet-p2p-7637 ERROR core wait time 9317277 µs Jun 24 16:19:03-961910 cadet-p2p-7637 ERROR core wait time 8231655 µs Jun 24 16:19:07-275967 cadet-p2p-7637 ERROR core wait time 2231404 µs Jun 24 16:19:11-584880 cadet-p2p-7637 ERROR core wait time 1841162 µs Jun 24 16:19:19-633443 cadet-p2p-7637 ERROR core wait time 3378745 µs Jun 24 16:19:19-633620 cadet-p2p-7637 ERROR core wait time 3378896 µs Jun 24 16:19:26-867448 cadet-p2p-7637 ERROR core wait time 4963454 µs Jun 24 16:19:33-174265 cadet-p2p-7637 ERROR core wait time 3921485 µs Jun 24 16:19:33-174399 cadet-p2p-7637 ERROR core wait time 3816158 µs Jun 24 16:19:40-782140 cadet-p2p-7637 ERROR core wait time 1006007 µs Jun 24 16:19:51-215917 cadet-p2p-7637 ERROR core wait time 5232503 µs Jun 24 16:19:52-821536 cadet-p2p-7637 ERROR core wait time 2303988 µs Jun 24 16:19:55-298745 cadet-p2p-7637 ERROR core wait time 3048631 µs Jun 24 16:20:07-257621 cadet-p2p-7637 ERROR core wait time 12146145 µs Jun 24 16:20:19-863831 cadet-p2p-7637 ERROR core wait time 9055750 µs Jun 24 16:20:39-746406 cadet-p2p-7637 ERROR core wait time 3665470 µs Jun 24 16:20:39-746590 cadet-p2p-7637 ERROR core wait time 3638215 µs Jun 24 16:20:39-767277 cadet-p2p-7637 ERROR core wait time 3543392 µs Jun 24 16:20:43-496210 cadet-p2p-7637 ERROR core wait time 2619660 µs Jun 24 16:20:43-496355 cadet-p2p-7637 ERROR core wait time 2422327 µs Jun 24 16:20:43-496431 cadet-p2p-7637 ERROR core wait time 2386799 µs Jun 24 16:21:10-309795 cadet-p2p-7637 ERROR core wait time 25751076 µs Jun 24 16:21:11-272378 cadet-p2p-7637 ERROR core wait time 25411188 µs Jun 24 16:21:14-250999 cadet-p2p-7637 ERROR core wait time 2761715 µs Jun 24 16:21:32-164995 cadet-p2p-7637 ERROR core wait time 16183941 µs Jun 24 16:21:37-040363 cadet-p2p-7637 ERROR core wait time 5188508 µs Jun 24 16:21:37-058879 cadet-p2p-7637 ERROR core wait time 3831963 µs Jun 24 16:21:39-050295 cadet-p2p-7637 ERROR core wait time 6921646 µs Jun 24 16:21:50-005058 cadet-p2p-7637 ERROR core wait time 11864860 µs Jun 24 16:21:51-615700 cadet-p2p-7637 ERROR core wait time 11535796 µs Jun 24 16:21:51-615871 cadet-p2p-7637 ERROR core wait time 12357191 µs Jun 24 16:22:06-158601 cadet-p2p-7637 ERROR core wait time 13403850 µs Jun 24 16:22:33-593809 cadet-p2p-7637 ERROR core wait time 25050929 µs Jun 24 16:22:33-621438 cadet-p2p-7637 ERROR core wait time 4680544 µs Jun 24 16:22:44-272479 cadet-p2p-7637 ERROR core wait time 9446562 µs Jun 24 16:22:44-272642 cadet-p2p-7637 ERROR core wait time 35687409 µs Jun 24 16:22:44-291908 cadet-p2p-7637 ERROR core wait time 8111919 µs Jun 24 16:22:44-465022 cadet-p2p-7637 ERROR core wait time 35792148 µs Jun 24 16:23:03-892711 cadet-p2p-7637 ERROR core wait time 18432260 µs Jun 24 16:23:03-979330 cadet-p2p-7637 ERROR core wait time 18603614 µs Jun 24 16:23:04-022888 cadet-p2p-7637 ERROR core wait time 18391794 µs Jun 24 16:23:04-427159 cadet-p2p-7637 ERROR core wait time 18731278 µs Jun 24 16:23:17-497803 cadet-p2p-7637 ERROR core wait time 12441777 µs Jun 24 16:23:17-497955 cadet-p2p-7637 ERROR core wait time 12261175 µs Jun 24 16:23:17-498035 cadet-p2p-7637 ERROR core wait time 2837817 µs Jun 24 16:23:24-238451 cadet-p2p-7637 ERROR core wait time 5246310 µs Jun 24 16:23:40-920026 cadet-p2p-7637 ERROR core wait time 20741797 µs Jun 24 16:23:40-941020 cadet-p2p-7637 ERROR core wait time 20716256 µs Jun 24 16:23:53-981582 cadet-p2p-7637 ERROR core wait time 11698883 µs Jun 24 16:24:04-867139 cadet-p2p-7637 ERROR core wait time 21204620 µs Jun 24 16:24:04-886729 cadet-p2p-7637 ERROR core wait time 21089378 µs Jun 24 16:24:04-886854 cadet-p2p-7637 ERROR core wait time 9830636 µs Jun 24 16:24:06-520914 cadet-p2p-7637 ERROR core wait time 1463508 µs Jun 24 16:24:26-612671 cadet-p2p-7637 ERROR core wait time 13728272 µs Jun 24 16:24:26-693633 cadet-p2p-7637 ERROR core wait time 13863223 µs Jun 24 16:24:28-886638 cadet-p2p-7637 ERROR core wait time 22357393 µs Jun 24 16:24:29-017782 cadet-p2p-7637 ERROR core wait time 1013363 µs Jun 24 16:24:44-730806 cadet-p2p-7637 ERROR core wait time 13391372 µs Jun 24 16:24:44-730972 cadet-p2p-7637 ERROR core wait time 13284756 µs Jun 24 16:24:48-394109 cadet-p2p-7637 ERROR core wait time 1186329 µs Jun 24 16:24:48-394252 cadet-p2p-7637 ERROR core wait time 1056908 µs Jun 24 16:24:48-394329 cadet-p2p-7637 ERROR core wait time 1028680 µs Jun 24 16:25:27-148906 cadet-p2p-7637 ERROR core wait time 1202636 µs Jun 24 16:25:27-359345 cadet-p2p-7637 ERROR core wait time 1084818 µs Jun 24 16:26:03-514570 cadet-p2p-7637 ERROR core wait time 1252936 µs Jun 24 16:26:03-985843 cadet-p2p-7637 ERROR core wait time 35566135 µs Jun 24 16:26:17-827476 cadet-p2p-7637 ERROR core wait time 11867350 µs Jun 24 16:26:30-167180 cadet-p2p-7637 ERROR core wait time 23702592 µs Jun 24 16:26:45-710540 cadet-p2p-7637 ERROR core wait time 26625994 µs Jun 24 16:26:47-933963 cadet-p2p-7637 ERROR core wait time 16670937 µs Jun 24 16:26:57-323420 cadet-p2p-7637 ERROR core wait time 39494267 µs Jun 24 16:26:58-214701 cadet-p2p-7637 ERROR core wait time 9147146 µs Jun 24 16:27:09-533290 cadet-p2p-7637 ERROR core wait time 10237966 µs Jun 24 16:27:09-533464 cadet-p2p-7637 ERROR core wait time 20378605 µs Jun 24 16:27:15-500351 cadet-p2p-7637 ERROR core wait time 4643506 µs Jun 24 16:27:15-500502 cadet-p2p-7637 ERROR core wait time 4349448 µs Jun 24 16:27:50-960755 cadet-p2p-7637 ERROR core wait time 34168373 µs Jun 24 16:27:51-130636 cadet-p2p-7637 ERROR core wait time 5069534 µs Jun 24 16:27:57-167752 cadet-p2p-7637 ERROR core wait time 4683136 µs Jun 24 16:28:03-329494 cadet-p2p-7637 ERROR core wait time 6161592 µs Jun 24 16:28:09-318579 cadet-p2p-7637 ERROR core wait time 5988942 µs Jun 24 16:28:30-321834 cadet-7637 ERROR Assertion failed at gnunet-service-cadet_peer.c:2199. |
|
The new messages are most probably a CORE bug: Jun 24 16:27:50-960755 cadet-p2p-7637 ERROR core wait time 34168373 µs This means that CADET waited for 34 seconds before being able to send a message. All kinds of timeouts are possible in these conditions... |
|
I believe r35985 should have fixed this. |
|
Still happening in rev 35987 |
|
I'm pretty sure r35993 should finally kill this one. |
|
Looks fixed. |
|
Hit this again in SVN rev 36015 |
|
Are you sure it's not 0003845? |
|
Ok, I had it wrong, 3845 is fixed, this one is the one that still happens. Could you confirm that you stacktrace looks like this? #0 0xb77a9d40 in __kernel_vsyscall () #1 0xb7510307 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56 #2 0xb75119c3 in __GI_abort () at abort.c:89 #3 0xb7721ef3 in GNUNET_abort_ () at common_logging.c:290 #4 0x08074245 in GCP_remove_connection (peer=0x83a2290, c=0x83a5a10) at gnunet-service-cadet_peer.c:2219 #5 0x0805b8a9 in unregister_neighbors (c=0x83a5a10) at gnunet-service-cadet_connection.c:1588 #6 0x0805fdf0 in GCC_destroy (c=0x83a5a10) at gnunet-service-cadet_connection.c:2813 #7 0x08058a64 in conn_message_sent (cls=0x8393110, c=0x83a5a10, sent=1, type=258, pid=0, fwd=1, size=100, wait=...) at gnunet-service-cadet_connection.c:638 #8 0x080710fa in GCP_queue_destroy (queue=0x8396338, clear_cls=0, sent=1, pid=0) at gnunet-service-cadet_peer.c:1349 #9 0x0807090b in queue_send (cls=0x83a1f70, size=364, buf=0x83a79a0) at gnunet-service-cadet_peer.c:1229 #10 0xb76f44f8 in transmit_message (cls=0x8393360, size=420, buf=0x83a7968) at core_api.c:676 #11 0xb771e574 in client_notify (cls=0x83a7670, size=420, buf=0x83a7968) at client.c:1201 #12 0xb772b447 in process_notify (connection=0x8392ef0) at connection.c:1261 #13 0xb772be4e in transmit_ready (cls=0x8392ef0, tc=0xbfaaada4) at connection.c:1405 #14 0xb775963f in run_ready (rs=0x83a0368, ws=0x83a03f0) at scheduler.c:587 #15 0xb7759f42 in GNUNET_SCHEDULER_run (task=0xb7765f2b <service_task>, task_cls=0xbfaaafd0) at scheduler.c:867 #16 0xb7767bcb in GNUNET_SERVICE_run (argc=3, argv=0xbfaab1c4, service_name=0x807d0ba "\a\b\276j\a\b\312j\a\b\303k\a\b\326j\a\b\342j\a\b\356j\a\b\303k\a\b\372j\a\b\006k\a\b\022k\a\b\036k\a\b*k\a\b\303k\a\b\303k\a\b\303k\a\b6k\a\b\246j\a\bBk\a\b\303k\a\b\303k\a\bNk\a\bZk\a\b\303k\a\b\303k\a\b\303k\a\bfk\a\brk\a\b{k\a\b\204k\a\b\215k\a\b\226k\a\b\237k\a\b\250k\a\b\261k\a\b\272k\a\bgnunet-service-cadet.c", options=GNUNET_SERVICE_OPTION_NONE, task=0x8076c91 <shutdown_task+106>, task_cls=0x0) at service.c:1503 #17 0x08076fe3 in run (cls=0x8077170 <__udivdi3+128>, server=0x0, c=0x0) at gnunet-service-cadet.c:155 #18 0xb74fba63 in __libc_start_main (main=0x3, argc=134523120, argv=0x0, init=0x804a911 <_start+33>, fini=0x8076fab <run+617>, rtld_fini=0x3, stack_end=0xbfaab1c4) at libc-start.c:287 #19 0xb77cc000 in ?? () from /lib/ld-linux.so.2 The key part is in frame 7 (conn_message_sent), type is always 258 for me. |
|
This is the backtrace from rev 36015 Program terminated with signal SIGABRT, Aborted. #0 0xb6f1eddc in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56 56 ../nptl/sysdeps/unix/sysv/linux/raise.c: No such file or directory. (gdb) bt #0 0xb6f1eddc in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56 #1 0xb6f20463 in __GI_abort () at abort.c:89 #2 0xb7148f2d in GNUNET_abort_ () at common_logging.c:290 #3 0x0808ad46 in GCP_remove_connection (peer=0xb4109680, c=0xb2f01240) at gnunet-service-cadet_peer.c:2219 #4 0x080666f5 in unregister_neighbors (c=0xb2f01240) at gnunet-service-cadet_connection.c:1589 #5 0x0806b5b1 in GCC_destroy (c=0xb2f01240) at gnunet-service-cadet_connection.c:2824 #6 0x08065d4c in connection_timeout (c=0xb2f01240, fwd=1) at gnunet-service-cadet_connection.c:1420 #7 0x08065df2 in connection_fwd_timeout (cls=0xb2f01240, tc=0xbfcd03d0) at gnunet-service-cadet_connection.c:1441 #8 0xb71c4498 in run_ready (rs=0xb2b03370, ws=0xb2b032a0) at scheduler.c:587 #9 0xb71c51b7 in GNUNET_SCHEDULER_run (task=0xb71df0c5 <service_task>, task_cls=0xbfcd0930) at scheduler.c:867 #10 0xb71e2240 in GNUNET_SERVICE_run (argc=7, argv=0xbfcd0bb4, service_name=0x809a340 "cadet", options=GNUNET_SERVICE_OPTION_NONE, task=0x808f307 <run>, task_cls=0x0) at service.c:1503 #11 0x0808f59f in main (argc=7, argv=0xbfcd0bb4) at gnunet-service-cadet.c:174 |
|
Here is a backtrace from rev 36024 showing type=258: #0 0x00007fbef284e107 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56 #1 0x00007fbef284f4e8 in __GI_abort () at abort.c:89 #2 0x00007fbef3d0e96a in GNUNET_abort_ () at common_logging.c:290 #3 0x000000000042361e in GCP_remove_connection (peer=0x198bf90, c=0x19aa9b0) at gnunet-service-cadet_peer.c:2219 #4 0x0000000000411273 in unregister_neighbors (c=0x19aa9b0) at gnunet-service-cadet_connection.c:1589 #5 0x00000000004145ba in GCC_destroy (c=0x19aa9b0) at gnunet-service-cadet_connection.c:2824 #6 0x000000000040f0ac in conn_message_sent (cls=0x19a9b70, c=0x19aa9b0, sent=1, type=258, pid=0, fwd=1, size=100, wait=...) at gnunet-service-cadet_connection.c:643 #7 0x0000000000420e86 in GCP_queue_destroy (queue=0x198eee0, clear_cls=0, sent=1, pid=0) at gnunet-service-cadet_peer.c:1351 #8 0x00000000004208eb in queue_send (cls=0x1988110, size=332, buf=0x199ed38) at gnunet-service-cadet_peer.c:1240 #9 0x00007fbef38e1a2b in transmit_message (cls=0x19740e0, size=388, buf=0x199ed00) at core_api.c:676 #10 0x00007fbef3d0b48a in client_notify (cls=0x19a8e50, size=388, buf=0x199ed00) at client.c:1201 #11 0x00007fbef3d1754c in process_notify (connection=0x1974200) at connection.c:1261 #12 0x00007fbef3d17d21 in transmit_ready (cls=0x1974200, tc=0x7ffc3fee6550) at connection.c:1405 #13 0x00007fbef3d46c43 in run_ready (rs=0x19736e0, ws=0x1973770) at scheduler.c:587 #14 0x00007fbef3d474d7 in GNUNET_SCHEDULER_run ( task=0x7fbef3d52a4e <service_task>, task_cls=0x7ffc3fee68e0) at scheduler.c:867 #15 0x00007fbef3d54676 in GNUNET_SERVICE_run (argc=7, argv=0x7ffc3fee6b78, service_name=0x429998 "cadet", options=GNUNET_SERVICE_OPTION_NONE, task=0x42557f <run>, task_cls=0x0) at service.c:1503 #16 0x000000000042581e in main (argc=7, argv=0x7ffc3fee6b78) at gnunet-service-cadet.c:174 |
|
I got it too, with SVN HEAD of today: #0 0x00007f45778df107 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56 #1 0x00007f45778e04e8 in __GI_abort () at abort.c:89 #2 0x00007f4578da3fc0 in GNUNET_abort_ () at common_logging.c:290 #3 0x00000000004300c1 in GCP_remove_connection (peer=0x186d7b0, c=0x1874b20) at gnunet-service-cadet_peer.c:2219 #4 0x000000000041591c in unregister_neighbors (c=0x1874b20) at gnunet-service-cadet_connection.c:1586 #5 0x000000000041a447 in GCC_destroy (c=0x1874b20) at gnunet-service-cadet_connection.c:2824 #6 0x00000000004127cc in conn_message_sent (cls=0x1866a60, c=0x1874b20, sent=0, type=258, pid=0, fwd=0, size=100, wait=...) at gnunet-service-cadet_connection.c:643 #7 0x000000000042cb8f in GCP_queue_destroy (queue=0x1879e40, clear_cls=1, sent=0, pid=0) at gnunet-service-cadet_peer.c:1351 #8 0x000000000042d794 in GCP_queue_cancel (peer=0x186ec20, c=0x1874b20) at gnunet-service-cadet_peer.c:1517 #9 0x0000000000414684 in connection_cancel_queues (c=0x1874b20, fwd=0) at gnunet-service-cadet_connection.c:1249 #10 0x000000000041ad36 in GCC_notify_broken (c=0x1874b20, peer=0x186ec20) at gnunet-service-cadet_connection.c:3089 #11 0x0000000000429658 in notify_broken (cls=0x186ec20, key=0x7fff507e2b20, value=0x1874b20) at gnunet-service-cadet_peer.c:364 #12 0x00007f4578db5423 in GNUNET_CONTAINER_multihashmap_iterate (map=0x1870380, it=0x42958b <notify_broken>, it_cls=0x186ec20) at container_multihashmap.c:358 #13 0x0000000000429dbd in core_disconnect (cls=0x0, peer=0x1867040) at gnunet-service-cadet_peer.c:480 #14 0x00007f4578971d08 in disconnect_and_free_peer_entry (cls=0x184f9d0, key=0x7fff507e2dc8, value=0x1866ff0) at core_api.c:389 #15 0x00007f45789753ad in main_notify_handler (cls=0x184f9d0, msg=0x7fff507e2dc0) at core_api.c:913 #16 0x00007f4578d9e7db in receive_task (cls=0x184f360, tc=0x7fff507e2ec0) at client.c:618 #17 0x00007f4578dded4c in run_ready (rs=0x184c990, ws=0x184ca20) at scheduler.c:587 #18 0x00007f4578ddf657 in GNUNET_SCHEDULER_run (task=0x7f4578dec132 <service_task>, task_cls=0x7fff507e3250) at scheduler.c:867 #19 0x00007f4578dede6b in GNUNET_SERVICE_run (argc=3, argv=0x7fff507e34e8, service_name=0x43a12a "cadet", options=GNUNET_SERVICE_OPTION_NONE, task=0x432f1d <run>, task_cls=0x0) at service.c:1503 #20 0x000000000043328c in main (argc=3, argv=0x7fff507e34e8) at gnunet-service-cadet.c:174 (gdb) bt full #0 0x00007f45778df107 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56 resultvar = 0 pid = 27452 selftid = 27452 #1 0x00007f45778e04e8 in __GI_abort () at abort.c:89 save_stage = 2 act = {__sigaction_handler = {sa_handler = 0x28, sa_sigaction = 0x28}, sa_mask = {__val = {0, 140734543833008, 139936361745927, 140733193388033, 0, 1, 139936335072552, 16, 140734543833008, 4208400, 139936361771813, 1, 139936335935008, 0, 139936338868112, 139936338862176}}, sa_flags = 2034288384, sa_restorer = 0x7f45778f59fa <_IO_vfprintf_internal+22490>} sigs = {__val = {32, 0 <repeats 15 times>}} #2 0x00007f4578da3fc0 in GNUNET_abort_ () at common_logging.c:290 No locals. #3 0x00000000004300c1 in GCP_remove_connection (peer=0x186d7b0, c=0x1874b20) at gnunet-service-cadet_peer.c:2219 __FUNCTION__ = "GCP_remove_connection" #4 0x000000000041591c in unregister_neighbors (c=0x1874b20) at gnunet-service-cadet_connection.c:1586 peer = 0x186d7b0 __FUNCTION__ = "unregister_neighbors" #5 0x000000000041a447 in GCC_destroy (c=0x1874b20) at gnunet-service-cadet_connection.c:2824 __FUNCTION__ = "GCC_destroy" #6 0x00000000004127cc in conn_message_sent (cls=0x1866a60, c=0x1874b20, sent=0, type=258, pid=0, fwd=0, size=100, wait=...) at gnunet-service-cadet_connection.c:643 p = 0x0 fc = 0x1874b70 q = 0x1866a60 usecsperbyte = 1.9762625833649862e-322 forced = 1 __FUNCTION__ = "conn_message_sent" #7 0x000000000042cb8f in GCP_queue_destroy (queue=0x1879e40, clear_cls=1, sent=0, pid=0) at gnunet-service-cadet_peer.c:1351 wait_time = {rel_value_us = 110263569} peer = 0x186ec20 connection_destroyed = 0 __FUNCTION__ = "GCP_queue_destroy" #8 0x000000000042d794 in GCP_queue_cancel (peer=0x186ec20, c=0x1874b20) at gnunet-service-cadet_peer.c:1517 q = 0x1879e40 next = 0x1879e40 prev = 0x186e7a0 connection_destroyed = 0 __FUNCTION__ = "GCP_queue_cancel" #9 0x0000000000414684 in connection_cancel_queues (c=0x1874b20, fwd=0) at gnunet-service-cadet_connection.c:1249 fc = 0x1874b70 peer = 0x186ec20 __FUNCTION__ = "connection_cancel_queues" #10 0x000000000041ad36 in GCC_notify_broken (c=0x1874b20, peer=0x186ec20) at gnunet-service-cadet_connection.c:3089 hop = 0x186ec20 fwd = 1 __FUNCTION__ = "GCC_notify_broken" #11 0x0000000000429658 in notify_broken (cls=0x186ec20, key=0x7fff507e2b20, value=0x1874b20) at gnunet-service-cadet_peer.c:364 peer = 0x186ec20 c = 0x1874b20 __FUNCTION__ = "notify_broken" #12 0x00007f4578db5423 in GNUNET_CONTAINER_multihashmap_iterate (map=0x1870380, it=0x42958b <notify_broken>, it_cls=0x186ec20) at container_multihashmap.c:358 bme = 0x1871e70 nxt = 0x0 count = 1 i = 14 me = {sme = 0x1871e70, bme = 0x1871e70} kc = {bits = {1936134110, 3582693111, 677836696, 4254488402, 3265454343, 1598125268, 627729676, 1047629589, 0, 0, 0, 0, 0, 0, 0, 0}} __FUNCTION__ = "GNUNET_CONTAINER_multihashmap_iterate" #13 0x0000000000429dbd in core_disconnect (cls=0x0, peer=0x1867040) at gnunet-service-cadet_peer.c:480 p = 0x186ec20 direct_path = 0x1867180 own_id = "9QXF", '\000' <repeats 11 times> __FUNCTION__ = "core_disconnect" #14 0x00007f4578971d08 in disconnect_and_free_peer_entry (cls=0x184f9d0, key=0x7fff507e2dc8, value=0x1866ff0) at core_api.c:389 h = 0x184f9d0 th = 0x1867080 pr = 0x1866ff0 __FUNCTION__ = "disconnect_and_free_peer_entry" #15 0x00007f45789753ad in main_notify_handler (cls=0x184f9d0, msg=0x7fff507e2dc0) at core_api.c:913 h = 0x184f9d0 m = 0x4571fddfcc8e6223 cnm = 0x38 dnm = 0x7fff507e2dc0 ntm = 0x690442854600e800 em = 0x38 ---Type <return> to continue, or q <return> to quit--- smr = 0x403710 <_start> mh = 0x4b278564d25 init = 0x2a pr = 0x1866ff0 th = 0x0 hpos = 0 trigger = 0 msize = 40 et = 0 __FUNCTION__ = "main_notify_handler" #16 0x00007f4578d9e7db in receive_task (cls=0x184f360, tc=0x7fff507e2ec0) at client.c:618 client = 0x184f360 handler = 0x7f45789743d7 <main_notify_handler> cmsg = 0x1874570 handler_cls = 0x184f9d0 msize = 40 mbuf = 0x7fff507e2dc0 "" msg = 0x7fff507e2dc0 __FUNCTION__ = "receive_task" #17 0x00007f4578dded4c in run_ready (rs=0x184c990, ws=0x184ca20) at scheduler.c:587 p = GNUNET_SCHEDULER_PRIORITY_DEFAULT pos = 0x186be70 tc = {reason = GNUNET_SCHEDULER_REASON_TIMEOUT, read_ready = 0x184c990, write_ready = 0x184ca20} __FUNCTION__ = "run_ready" #18 0x00007f4578ddf657 in GNUNET_SCHEDULER_run (task=0x7f4578dec132 <service_task>, task_cls=0x7fff507e3250) at scheduler.c:867 rs = 0x184c990 ws = 0x184ca20 timeout = {rel_value_us = 0} ret = 0 shc_int = 0x1863e50 shc_term = 0x184dbf0 shc_quit = 0x184dd70 shc_hup = 0x184de30 shc_pipe = 0x184dcb0 last_tr = 34833 busy_wait_warning = 0 pr = 0x1862c00 c = 0 '\000' __FUNCTION__ = "GNUNET_SCHEDULER_run" #19 0x00007f4578dede6b in GNUNET_SERVICE_run (argc=3, argv=0x7fff507e34e8, service_name=0x43a12a "cadet", options=GNUNET_SERVICE_OPTION_NONE, task=0x432f1d <run>, task_cls=0x0) at service.c:1503 err = 0 ret = 3 cfg_fn = 0x184a7c0 "/home/grothoff/.config/gnunet.conf" opt_cfg_fn = 0x184a920 "/home/grothoff/.config/gnunet.conf" loglev = 0x0 logfile = 0x0 do_daemonize = 0 i = 4402973 skew_offset = 139936363926784 skew_variance = 140734543836208 clock_offset = 4208400 sctx = {cfg = 0x184a7f0, server = 0x184e150, addrs = 0x184c520, service_name = 0x43a12a "cadet", task = 0x432f1d <run>, task_cls = 0x0, v4_denied = 0x0, v6_denied = 0x0, v4_allowed = 0x184a810, v6_allowed = 0x1863fd0, my_handlers = 0x184e240, addrlens = 0x1863f30, lsocks = 0x0, shutdown_task = 0x184e1e0, timeout = {rel_value_us = 18446744073709551615}, ret = 1, ready_confirm_fd = -1, require_found = 1, match_uid = 1, match_gid = 1, options = GNUNET_SERVICE_OPTION_NONE} cfg = 0x184a7f0 xdg = 0x7fff507e3db0 "/home/grothoff/.config" service_options = {{shortName = 99 'c', name = 0x7f4578dfc6ed "config", argumentHelp = 0x7f4578dfc6f4 "FILENAME", description = 0x7f4578dfc700 "use configuration file FILENAME", require_argument = 1, processor = 0x7f4578dcc089 <GNUNET_GETOPT_set_string>, scls = 0x7fff507e3308}, {shortName = 100 'd', name = 0x7f4578dfc720 "daemonize", argumentHelp = 0x0, description = 0x7f4578dfc730 "do daemonize (detach from terminal)", require_argument = 0, processor = 0x7f4578dcc05c <GNUNET_GETOPT_set_one>, scls = 0x7fff507e32f4}, {shortName = 104 'h', name = 0x7f4578dfc754 "help", argumentHelp = 0x0, description = 0x7f4578dfc759 "print this help", require_argument = 0, processor = 0x7f4578dcbb27 <GNUNET_GETOPT_format_help_>, scls = 0x0}, {shortName = 76 'L', name = 0x7f4578dfc769 "log", argumentHelp = 0x7f4578dfc76d "LOGLEVEL", description = 0x7f4578dfc778 "configure logging to use LOGLEVEL", require_argument = 1, processor = 0x7f4578dcc089 <GNUNET_GETOPT_set_string>, scls = 0x7fff507e3300}, {shortName = 108 'l', name = 0x7f4578dfc79a "logfile", argumentHelp = 0x7f4578dfc7a2 "LOGFILE", description = 0x7f4578dfc7b0 "configure logging to write logs to LOGFILE", require_argument = 1, processor = 0x7f4578dcc089 <GNUNET_GETOPT_set_string>, scls = 0x7fff507e32f8}, { shortName = 118 'v', name = 0x7f4578dfc7db "version", argumentHelp = 0x0, description = 0x7f4578dfc7e3 "print the version number", require_argument = 0, processor = 0x7f4578dcbae1 <GNUNET_GETOPT_print_version_>, scls = 0x7f4578dfc7fc}, {shortName = 0 '\000', name = 0x0, argumentHelp = 0x0, description = 0x0, require_argument = 0, processor = 0x0, scls = 0x0}} __FUNCTION__ = "GNUNET_SERVICE_run" #20 0x000000000043328c in main (argc=3, argv=0x7fff507e34e8) at gnunet-service-cadet.c:174 |
|
With the latest instrumentation, I got this violation of the invariant, which seems interesting: (gdb) ba #0 0x00007f5c56909107 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56 #1 0x00007f5c5690a4e8 in __GI_abort () at abort.c:89 #2 0x00007f5c57dce480 in GNUNET_abort_ () at common_logging.c:290 #3 0x00000000004302f7 in GCP_check_connection (peer=0x208b980, c=0x2094630) at gnunet-service-cadet_peer.c:2213 #4 0x00000000004133c2 in check_neighbours (c=0x2094630) at gnunet-service-cadet_connection.c:806 #5 0x00000000004133ec in check_connection (cls=0x0, key=0x7ffea0de53a0, value=0x2094630) at gnunet-service-cadet_connection.c:824 #6 0x00007f5c57ddf8e3 in GNUNET_CONTAINER_multihashmap_iterate (map=0x2074c80, it=0x4133c4 <check_connection>, it_cls=0x0) at container_multihashmap.c:358 #7 0x0000000000413410 in check_connections () at gnunet-service-cadet_connection.c:835 #8 0x000000000041a0b6 in GCC_new (cid=0x7ffea0de54a0, t=0x2094ff0, path=0x2093b50, own_pos=0) at gnunet-service-cadet_connection.c:2838 #9 0x000000000040fc08 in GCT_use_path (t=0x2094ff0, p=0x2093b50) at gnunet-service-cadet_tunnel.c:3778 #10 0x000000000042ee4b in GCP_connect (peer=0x208b980) at gnunet-service-cadet_peer.c:1889 #11 0x000000000042b6e3 in search_handler (cls=0x208b980, path=0x208b900) at gnunet-service-cadet_peer.c:1032 #12 0x000000000043144f in dht_get_id_handler (cls=0x2094e90, exp=..., key=0x7ffea0de5820, get_path=0x7ffea0de5880, get_path_length=0, put_path=0x7ffea0de5860, put_path_length=1, type=GNUNET_BLOCK_TYPE_DHT_HELLO, size=90, data=0x7ffea0de5880) at gnunet-service-cadet_dht.c:192 #13 0x00007f5c57794e1d in process_reply (cls=0x7ffea0de5800, key=0x7ffea0de5820, value=0x2095400) at dht_api.c:740 #14 0x00007f5c57de04c3 in GNUNET_CONTAINER_multihashmap_get_multiple (map=0x2078570, key=0x7ffea0de5820, it=0x7f5c57794946 <process_reply>, it_cls=0x7ffea0de5800) at container_multihashmap.c:816 #15 0x00007f5c57795b0a in service_message_handler (cls=0x2078590, msg=0x7ffea0de5800) at dht_api.c:1016 #16 0x00007f5c57dc8c9b in receive_task (cls=0x20789f0, tc=0x7ffea0de59b0) at client.c:618 #17 0x00007f5c57e0aaf3 in run_ready (rs=0x2073c10, ws=0x2073ca0) at scheduler.c:587 #18 0x00007f5c57e0b3fe in GNUNET_SCHEDULER_run (task=0x7f5c57e17ed9 <service_task>, task_cls=0x7ffea0de5d40) at scheduler.c:867 #19 0x00007f5c57e19c12 in GNUNET_SERVICE_run (argc=3, argv=0x7ffea0de5fd8, service_name=0x43a56a "cadet", options=GNUNET_SERVICE_OPTION_NONE, task=0x43333f <run>, task_cls=0x0) at service.c:1503 #20 0x00000000004336ae in main (argc=3, argv=0x7ffea0de5fd8) at gnunet-service-cadet.c:174 |
|
Additionally, this one suggests that we probably need to add a few more places with check_connections(), as here the invariant is false at the beginning of a function: #0 0x00007f7cdce00107 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56 56 ../nptl/sysdeps/unix/sysv/linux/raise.c: No such file or directory. (gdb) ba #0 0x00007f7cdce00107 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56 #1 0x00007f7cdce014e8 in __GI_abort () at abort.c:89 #2 0x00007f7cde2c5480 in GNUNET_abort_ () at common_logging.c:290 #3 0x00000000004302f7 in GCP_check_connection (peer=0x2013750, c=0x202d550) at gnunet-service-cadet_peer.c:2213 #4 0x00000000004133a4 in check_neighbours (c=0x202d550) at gnunet-service-cadet_connection.c:805 #5 0x00000000004133ec in check_connection (cls=0x0, key=0x7ffdfe8f0e20, value=0x202d550) at gnunet-service-cadet_connection.c:824 #6 0x00007f7cde2d68e3 in GNUNET_CONTAINER_multihashmap_iterate (map=0x200cc80, it=0x4133c4 <check_connection>, it_cls=0x0) at container_multihashmap.c:358 #7 0x0000000000413410 in check_connections () at gnunet-service-cadet_connection.c:835 #8 0x0000000000416313 in GCC_handle_create (cls=0x0, peer=0x7ffdfe8f10c4, message=0x7ffdfe8f10e4) at gnunet-service-cadet_connection.c:1813 #9 0x00007f7cdde96a0c in main_notify_handler (cls=0x200d6f0, msg=0x7ffdfe8f10c0) at core_api.c:967 #10 0x00007f7cde2bfc9b in receive_task (cls=0x200cfd0, tc=0x7ffdfe8f1220) at client.c:618 #11 0x00007f7cde301af3 in run_ready (rs=0x200bc10, ws=0x200bca0) at scheduler.c:587 #12 0x00007f7cde3023fe in GNUNET_SCHEDULER_run (task=0x7f7cde30eed9 <service_task>, task_cls=0x7ffdfe8f15b0) at scheduler.c:867 #13 0x00007f7cde310c12 in GNUNET_SERVICE_run (argc=3, argv=0x7ffdfe8f1848, service_name=0x43a56a "cadet", options=GNUNET_SERVICE_OPTION_NONE, task=0x43333f <run>, task_cls=0x0) at service.c:1503 #14 0x00000000004336ae in main (argc=3, argv=0x7ffdfe8f1848) at gnunet-service-cadet.c:174 |
|
Gotcha: invariant violated at the END of core_connect, but passed at the beginning: #0 0x00007f20911a3107 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56 #1 0x00007f20911a44e8 in __GI_abort () at abort.c:89 #2 0x00007f2092668480 in GNUNET_abort_ () at common_logging.c:290 #3 0x00000000004305d3 in GCP_check_connection (peer=0xc7b6c0, c=0xca7fe0) at gnunet-service-cadet_peer.c:2251 #4 0x00000000004133f6 in check_neighbours (c=0xca7fe0) at gnunet-service-cadet_connection.c:814 #5 0x000000000041343e in check_connection (cls=0x0, key=0x7ffd368c1a80, value=0xca7fe0) at gnunet-service-cadet_connection.c:834 #6 0x00007f20926798e3 in GNUNET_CONTAINER_multihashmap_iterate (map=0xc61c80, it=0x413416 <check_connection>, it_cls=0x0) at container_multihashmap.c:358 #7 0x0000000000413470 in GCC_check_connections () at gnunet-service-cadet_connection.c:847 #8 0x000000000042a058 in core_connect (cls=0x0, peer=0xc9ff40) at gnunet-service-cadet_peer.c:448 #9 0x00007f2092238fa4 in main_notify_handler (cls=0xc626f0, msg=0x7ffd368c1cc0) at core_api.c:882 #10 0x00007f2092662c9b in receive_task (cls=0xc61fd0, tc=0x7ffd368c1dc0) at client.c:618 #11 0x00007f20926a4af3 in run_ready (rs=0xc60c10, ws=0xc60ca0) at scheduler.c:587 #12 0x00007f20926a53fe in GNUNET_SCHEDULER_run (task=0x7f20926b1ed9 <service_task>, task_cls=0x7ffd368c2150) at scheduler.c:867 #13 0x00007f20926b3c12 in GNUNET_SERVICE_run (argc=3, argv=0x7ffd368c23e8, service_name=0x43a8ba "cadet", options=GNUNET_SERVICE_OPTION_NONE, task=0x433693 <run>, task_cls=0x0) at service.c:1503 #14 0x0000000000433a02 in main (argc=3, argv=0x7ffd368c23e8) at gnunet-service-cadet.c:174 (gdb) |
|
AHA! How about this: we have an existing 'struct CadetConnection' A-US-B, but no connection to say 'A'. Then A connects to us (core_connect). At the beginning, the invariant passes because (NULL == peer->connections) for the non-connected peer. Then 'core_connect' *creates* peer->connections HT (BUT does NOT add existing CadetConnections to the new map). Then afterwards the check fails, as the CadetConnection is NOT in the map. If we then tear down the CadetConnection (i.e. because 'B' disconnects), we try to remove from 'A' but the remove fails (connection not in map!). I'm not sure how we can have a 'struct CadetConnection' from A-US-B without a connection A-US, but the fact that we check "NULL == peer->connections" (in GCP_remove_connection) suggests this at least plausible. |
|
From cadet_peer.c:2251 on assertion failure: (gdb) print *peer $1 = {id = 20, last_contact = {abs_value_us = 1436910759928036}, path_head = 0xc99ee0, path_tail = 0xca6c40, search_h = 0xc9ef60, search_delayed = 0x0, tunnel = 0xcaacb0, connections = 0xcab350, core_transmit = 0x0, tmt_time = {abs_value_us = 0}, queue_head = 0x0, queue_tail = 0x0, queue_n = 0, hello = 0xca9590} (gdb) print *c $2 = {t = 0x0, fwd_fc = {c = 0xca7fe0, queue_n = 0, queue_max = 11, last_pid_sent = 4294967295, last_pid_recv = 4294967295, recv_bitmap = 0, last_ack_sent = 0, last_ack_recv = 0, poll_task = 0x0, poll_time = {rel_value_us = 1000000}, poll_msg = 0x0, ack_msg = 0x0}, bck_fc = {c = 0xca7fe0, queue_n = 0, queue_max = 11, last_pid_sent = 4294967295, last_pid_recv = 4294967295, recv_bitmap = 0, last_ack_sent = 0, last_ack_recv = 0, poll_task = 0x0, poll_time = {rel_value_us = 1000000}, poll_msg = 0x0, ack_msg = 0x0}, perf = 0x0, id = {bits = "\t9I)\222\063A\004\326\363\353\226j\365j\366\226w\344\\h\220\rj2Z0l\267\262]\205"}, path = 0xca8140, fwd_maintenance_task = 0xca7160, bck_maintenance_task = 0x0, maintenance_q = 0x0, next_peer = 0xc7b6c0, prev_peer = 0xc7c570, state = CADET_CONNECTION_DESTROYED, own_pos = 3, pending_messages = 1, destroy = 1, create_retry = 0} (gdb) print *c->next_peer $3 = {id = 20, last_contact = {abs_value_us = 1436910759928036}, path_head = 0xc99ee0, path_tail = 0xca6c40, search_h = 0xc9ef60, search_delayed = 0x0, tunnel = 0xcaacb0, connections = 0xcab350, core_transmit = 0x0, tmt_time = {abs_value_us = 0}, queue_head = 0x0, queue_tail = 0x0, queue_n = 0, hello = 0xca9590} (gdb) print *c->prev_peer $4 = {id = 28, last_contact = {abs_value_us = 1436910759928010}, path_head = 0xc7aa70, path_tail = 0xc9b200, search_h = 0xc84370, search_delayed = 0x0, tunnel = 0xc86640, connections = 0xc7b9c0, core_transmit = 0xc81618, tmt_time = { abs_value_us = 1436910751616026}, queue_head = 0xca8ad0, queue_tail = 0xca8ad0, queue_n = 1, hello = 0xc9e170} => so 'c' was not in the map for c's 'next_peer' At cadet_peer.c:448 at the same time: (gdb) print *mp $5 = {id = 20, last_contact = {abs_value_us = 1436910759928036}, path_head = 0xc99ee0, path_tail = 0xca6c40, search_h = 0xc9ef60, search_delayed = 0x0, tunnel = 0xcaacb0, connections = 0xcab350, core_transmit = 0x0, tmt_time = {abs_value_us = 0}, queue_head = 0x0, queue_tail = 0x0, queue_n = 0, hello = 0xca9590} => so the peer that we established a connection to was '20' == c's 'next_peer' This confirms the previous theory that by initializing 'mp->connections' we cause the invariant to fail. |
|
So now the 1M $$ question: why did we have a connection via a 'next' peer which is NOT currently connected? It cannot (by design) be due to disconnect, on disconnect we clean up connections associated with the disconnecting peer. That leaves that we somehow *create* a connection via peers that we are not currently connected to. => Another invariant to check? Or Bart, do you have an explanation from the design? |
|
Ok, my stronger check of the invariant fails during disconnect -- a connection remains with a disconnected peer. Closer inspection easily explains this as well: the connection in 'GCC_notify_broken' is not fully destroyed as a last 'disconnect notification' is scheduled (send_broken()) for the OTHER peer that remains. While that one is in the queue, the comments say the connection object will live. Thus, if then the peer re-connects, the connection may still be around, but wouldn't be re-added to the peer's connection list. If it *then* gets killed, (which is normal, after all it will eventually transmit the notification...) the code will attempt to remove it from it's neighbour's queue and then this violates the connections-must-be-in-peer-list assertion. Possibly a good answer: remove it immediately from the disconnecting peer AND mark this in the data structure, so we don't try doing so again (so we can keep the invariant). |
|
This is not the only issue in this domain. Sometimes c->destroy is set to YES, the actual destruction is deferred; however, the connection is then anyway returned via the hashmap and in some cases the 'destroy' flag is not checked. This can then in particular cause havoc when a new (more or less identical) connection is to be created while the previous one is awaiting destruction. Very messy. |
|
Proposed fix in SVN 36070. Will leave at feedback for a bit to be sure... |
|
The fix caused a follow-up issue when connections were re-created. That's also now fixed (right after Bart's long commit series), so ~36083. |
|
At rev 36090 I'm now hitting your new GNUNET_assert (NULL != ( (fwd) ? c->next_peer : c->prev_peer) ); in GCC_notify_broken. Program terminated with signal SIGABRT, Aborted. #0 0xb756eddc in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56 56 ../nptl/sysdeps/unix/sysv/linux/raise.c: No such file or directory. (gdb) bt #0 0xb756eddc in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56 #1 0xb7570463 in __GI_abort () at abort.c:89 #2 0xb7740caa in GNUNET_abort_ () at common_logging.c:290 #3 0x0805b2d2 in GCC_notify_broken (c=0x99a2860, peer=0x9985318) at gnunet-service-cadet_connection.c:3276 #4 0x08064284 in notify_broken (cls=0x9985318, key=0xbffa25e0, value=0x99a2860) at gnunet-service-cadet_peer.c:364 #5 0xb774f851 in GNUNET_CONTAINER_multihashmap_iterate (map=0x9982d40, it=0x806425d <notify_broken>, it_cls=0x9985318) at container_multihashmap.c:358 #6 0x08064948 in core_disconnect (cls=0x0, peer=0x9992630) at gnunet-service-cadet_peer.c:483 #7 0xb77155dc in disconnect_and_free_peer_entry (cls=0x99809e8, key=0xbffa2828, value=0x9992600) at core_api.c:389 #8 0xb77181a7 in main_notify_handler (cls=0x99809e8, msg=0xbffa2820) at core_api.c:913 #9 0xb773c348 in receive_task (cls=0x9980040, tc=0xbffa28b4) at client.c:618 #10 0xb7777469 in run_ready (rs=0x998ee30, ws=0x998eeb8) at scheduler.c:587 #11 0xb7777cf7 in GNUNET_SCHEDULER_run (task=0xb7782885 <service_task>, task_cls=0xbffa2ae0) at scheduler.c:867 #12 0xb7784423 in GNUNET_SERVICE_run (argc=7, argv=0xbffa2cd4, service_name=0x806e5bc "cadet", options=GNUNET_SERVICE_OPTION_NONE, task=0x806ac6e <run>, task_cls=0x0) at service.c:1503 #13 0x0806af00 in main (argc=7, argv=0xbffa2cd4) at gnunet-service-cadet.c:174 (gdb) f 3 #3 0x0805b2d2 in GCC_notify_broken (c=0x99a2860, peer=0x9985318) at gnunet-service-cadet_connection.c:3276 3276 gnunet-service-cadet_connection.c: No such file or directory. (gdb) p *peer $1 = {id = 16, last_contact = {abs_value_us = 1436969489418527}, path_head = 0x0, path_tail = 0x0, search_h = 0x0, search_delayed = 0x0, tunnel = 0x0, connections = 0x9982d40, core_transmit = 0x0, tmt_time = { abs_value_us = 0}, queue_head = 0x0, queue_tail = 0x0, queue_n = 0, hello = 0x99856b8} (gdb) p *c $2 = {t = 0x0, fwd_fc = {c = 0x99a2860, queue_n = 0, queue_max = 11, last_pid_sent = 4294967295, last_pid_recv = 4294967295, recv_bitmap = 0, last_ack_sent = 0, last_ack_recv = 0, poll_task = 0x0, poll_time = { rel_value_us = 1000000}, poll_msg = 0x0, ack_msg = 0x0}, bck_fc = { c = 0x99a2860, queue_n = 0, queue_max = 11, last_pid_sent = 4294967295, last_pid_recv = 4294967295, recv_bitmap = 0, last_ack_sent = 0, last_ack_recv = 0, poll_task = 0x0, poll_time = {rel_value_us = 1000000}, poll_msg = 0x0, ack_msg = 0x0}, perf = 0x0, id = { bits = "\343\223\031\016\n\004\263^\275\231\224\365\272\361s $\253\061\066Q\234\233B\315L\320\251\236\340\355E"}, path = 0x999d9f0, fwd_maintenance_task = 0x999f218, bck_maintenance_task = 0x0, maintenance_q = 0x0, next_peer = 0x0, prev_peer = 0x0, state = CADET_CONNECTION_DESTROYED, own_pos = 3, pending_messages = 1, destroy = 1, was_removed = 1, create_retry = 0} (gdb) p *hop $3 = {id = 62, last_contact = {abs_value_us = 1436969489418517}, path_head = 0x99a0418, path_tail = 0x99a0418, search_h = 0x0, search_delayed = 0x0, tunnel = 0x999e770, connections = 0x99a04c8, core_transmit = 0x999e93c, tmt_time = {abs_value_us = 1436969488768256}, queue_head = 0x999ddf0, queue_tail = 0x999ef40, queue_n = 2, hello = 0x0} |
|
Yeah, after my last changes the assertion is just bogus. Removed in 36091. |
Date Modified | Username | Field | Change |
---|---|---|---|
2015-06-19 16:46 | amatus | New Issue | |
2015-06-19 16:46 | amatus | Status | new => assigned |
2015-06-19 16:46 | amatus | Assigned To | => Bart Polot |
2015-06-19 16:52 | amatus | Platform | amd64 => x86 |
2015-06-24 16:53 | Bart Polot | Status | assigned => feedback |
2015-06-24 19:09 | amatus | Note Added: 0009331 | |
2015-06-24 19:09 | amatus | Status | feedback => assigned |
2015-06-24 19:17 | Bart Polot | Note Added: 0009332 | |
2015-06-25 05:19 | Bart Polot | Relationship added | related to 0003845 |
2015-06-25 05:20 | Bart Polot | Note Added: 0009337 | |
2015-06-25 05:20 | Bart Polot | Status | assigned => feedback |
2015-06-25 06:29 | amatus | Note Added: 0009340 | |
2015-06-25 06:29 | amatus | Status | feedback => assigned |
2015-06-26 04:22 | Bart Polot | Note Added: 0009341 | |
2015-06-26 04:22 | Bart Polot | Status | assigned => feedback |
2015-06-26 13:35 | Christian Grothoff | Target Version | => 0.11.0pre66 |
2015-06-28 01:40 | amatus | Note Added: 0009357 | |
2015-06-28 01:40 | amatus | Status | feedback => assigned |
2015-06-28 03:48 | Bart Polot | Status | assigned => resolved |
2015-06-28 03:48 | Bart Polot | Fixed in Version | => 0.11.0pre66 |
2015-06-28 03:48 | Bart Polot | Resolution | open => fixed |
2015-06-28 17:40 | amatus | Note Added: 0009362 | |
2015-06-28 17:40 | amatus | Status | resolved => feedback |
2015-06-28 17:40 | amatus | Resolution | fixed => reopened |
2015-06-28 17:40 | amatus | Status | feedback => assigned |
2015-06-28 18:01 | amatus | Note Edited: 0009362 | |
2015-06-28 18:49 | Bart Polot | Note Added: 0009363 | |
2015-06-29 14:58 | Bart Polot | Note Added: 0009368 | |
2015-06-29 16:34 | amatus | Note Added: 0009369 | |
2015-06-30 00:55 | amatus | Note Added: 0009376 | |
2015-06-30 11:32 | Bart Polot | Relationship added | has duplicate 0003873 |
2015-06-30 11:36 | Christian Grothoff | Note Added: 0009381 | |
2015-07-13 23:29 | Christian Grothoff | Note Added: 0009430 | |
2015-07-13 23:31 | Christian Grothoff | Note Added: 0009431 | |
2015-07-14 23:56 | Christian Grothoff | Note Added: 0009432 | |
2015-07-15 00:01 | Christian Grothoff | Note Added: 0009433 | |
2015-07-15 00:04 | Christian Grothoff | Note Added: 0009434 | |
2015-07-15 00:06 | Christian Grothoff | Note Added: 0009435 | |
2015-07-15 00:28 | Christian Grothoff | Note Added: 0009436 | |
2015-07-15 01:03 | Christian Grothoff | Note Added: 0009437 | |
2015-07-15 01:34 | Christian Grothoff | Note Added: 0009438 | |
2015-07-15 01:45 | Christian Grothoff | Status | assigned => feedback |
2015-07-15 01:46 | Christian Grothoff | Assigned To | Bart Polot => Christian Grothoff |
2015-07-15 14:16 | Christian Grothoff | Note Added: 0009450 | |
2015-07-15 19:19 | amatus | Note Added: 0009451 | |
2015-07-15 19:19 | amatus | Status | feedback => assigned |
2015-07-15 19:41 | Christian Grothoff | Note Added: 0009452 | |
2015-07-16 11:15 | Christian Grothoff | Status | assigned => feedback |
2015-08-03 13:37 | Christian Grothoff | Status | feedback => resolved |
2015-08-03 13:37 | Christian Grothoff | Resolution | reopened => fixed |
2018-06-07 00:25 | Christian Grothoff | Status | resolved => closed |