View Issue Details
ID | Project | Category | View Status | Date Submitted | Last Update |
---|---|---|---|---|---|
0002701 | GNUnet | file-sharing service | public | 2012-12-13 17:11 | 2012-12-21 16:49 |
Reporter | LRN | Assigned To | LRN | ||
Priority | normal | Severity | crash | Reproducibility | have not tried |
Status | closed | Resolution | fixed | ||
Product Version | Git master | ||||
Target Version | 0.9.5 | Fixed in Version | 0.9.5 | ||
Summary | 0002701: Crash in fs service | ||||
Description | First crash i've had in, like, a week. fs service crashed. There were no warnings/errors from that process in the log. | ||||
Additional Information | Reading symbols from D:\Progs\GNUnet\lib\gnunet\libexec\gnunet-service-fs.exe...done. Continuing. Program received signal SIGSEGV, Segmentation fault. [Switching to Thread 13492.0x1d5c] GSF_peer_transmit_cancel_ (pth=0xdf0adba) at gnunet-service-fs_cp.c:1447 1447 if (GNUNET_SCHEDULER_NO_TASK != pth->timeout_task) (gdb) bt #0 GSF_peer_transmit_cancel_ (pth=0xdf0adba) at gnunet-service-fs_cp.c:1447 #1 0x0040ad07 in schedule_peer_transmission (cls=0x25e4d98, tc=0x28fd34) at gnunet-service-fs_pe.c:450 #2 0x62b7b673 in run_ready (ws=0xc8a000, rs=0xc88fe8) at scheduler.c:597 #3 GNUNET_SCHEDULER_run (task=task@entry=0x62b83ec4 <service_task>, task_cls=task_cls@entry=0x28fdb8) at scheduler.c:785 #4 0x62b86b39 in GNUNET_SERVICE_run (argc=3, argv=0x253be98, service_name=0x418188 <__register_frame_info+4293000> "fs", options=GNUNET_SERVICE_OPTION_NONE, task=0x401a20 <run>, task_cls=0x0) at service.c:1813 #5 0x004163a8 in main (argc=3, argv=0x253be98) at gnunet-service-fs.c:709 (gdb) p *pth Cannot access memory at address 0xdf0adba (gdb) up #1 0x0040ad07 in schedule_peer_transmission (cls=0x25e4d98, tc=0x28fd34) at gnunet-service-fs_pe.c:450 450 GSF_peer_transmit_cancel_ (pp->pth); | ||||
Tags | No tags attached. | ||||
Attached Files | 0001-Don-t-forget-to-cancel-old-tasks-in-FS.patch (1,786 bytes)
From 1b745661e4c5086e48ca5484814ad2c6220ba7b5 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?=D0=A0=C2=A0=D0=A1=D1=93=D0=A1=D0=83=D0=A0=C2=BB=D0=A0=C2=B0?= =?UTF-8?q?=D0=A0=D0=85=20=D0=A0=C2=98=D0=A0=C2=B6=D0=A0=C2=B1=D0=A1=D1=93=D0?= =?UTF-8?q?=A0=C2=BB=D0=A0=C2=B0=D0=A1=E2=80=9A=D0=A0=D1=95=D0=A0=D0=86?= <lrn1986@gmail.com> Date: Fri, 14 Dec 2012 15:06:20 +0400 Subject: [PATCH] Don't forget to cancel old tasks in FS --- src/fs/gnunet-service-fs_pe.c | 7 +++++++ 1 file changed, 7 insertions(+) diff --git a/src/fs/gnunet-service-fs_pe.c b/src/fs/gnunet-service-fs_pe.c index c7cac75..0992d21 100644 --- a/src/fs/gnunet-service-fs_pe.c +++ b/src/fs/gnunet-service-fs_pe.c @@ -391,6 +391,9 @@ transmit_message_callback (void *cls, size_t buf_size, void *buf) if (NULL == buf) { /* failed, try again... */ + if (GNUNET_SCHEDULER_NO_TASK != pp->task) + GNUNET_SCHEDULER_cancel (pp->task); + pp->task = GNUNET_SCHEDULER_add_now (&schedule_peer_transmission, pp); GNUNET_STATISTICS_update (GSF_stats, gettext_noop @@ -401,12 +404,16 @@ transmit_message_callback (void *cls, size_t buf_size, void *buf) rp = GNUNET_CONTAINER_heap_peek (pp->priority_heap); if (NULL == rp) { + if (GNUNET_SCHEDULER_NO_TASK != pp->task) + GNUNET_SCHEDULER_cancel (pp->task); pp->task = GNUNET_SCHEDULER_add_now (&schedule_peer_transmission, pp); return 0; } msize = GSF_pending_request_get_message_ (get_latest (rp), buf_size, buf); if (msize > buf_size) { + if (GNUNET_SCHEDULER_NO_TASK != pp->task) + GNUNET_SCHEDULER_cancel (pp->task); /* buffer to small (message changed), try again */ pp->task = GNUNET_SCHEDULER_add_now (&schedule_peer_transmission, pp); return 0; -- 1.7.11 | ||||
|
On every free of a GSF_PeerTransmitHandle timeout task is canceled |
|
timeout_task is only scheduled in on place: 1430 |
|
Is that true for r25337? Because that is what i'm currently running. |
|
Try to cancel an invalid request (Use after free)? Possible ... |
|
Places where transmit_request can be forgotten: GSF_block_peer_migration_ migration_pth -> Checked fs_pe.c schedule_peer_transmission -> checked GSF_push_stop_ -> checked |
|
Is that true for r25337? Because that is what i'm currently running. -> Last change in 24484 10/23/12 |
|
I think my gdb log does not describe the problem completely. The problem is that the cls (pp) given to schedule_peer_transmission() is freed before use (pp->pth pointer itself is poisoned -> pp contents are poisoned). So the only suspect is GSF_plan_notify_peer_disconnect_ not cancelling a schedule_peer_transmission() before freeing pp. Or something else in the code does things with pp, and loses track of it. |
|
Uploaded a patch. I think it would fix this issue. I _THINK_, because i have not been able to reproduce it. I did run my node with a modification of this patch that does not cancel tasks, but logs an error message instead, and proceeds normally. Despite a large number of error messages logged this way, my node did not crash for 14 hours straight. Apparently, the condition is rare. From what i can see in the code, schedule_peer_transmission() is almost always scheduled "now", which makes it unlikely that another function would have a chance to free the plan before scheduled schedule_peer_transmission() is called. |
|
I've reviewed your patch and found an execution path (GSF_plan_add_ called 2nd time and finds an existing 'pp', resulting in a call to 'plan' which happens to not have a 'task' but an active transmission request (NULL != pp->pth); that one may then be run before the task added in 'plan', and then may overwrite 'task' without checking (which is fixed by LRN's patch); if then the pp is destroyed before the dangling task is executed, we get the crash. Rare indeed. |
Date Modified | Username | Field | Change |
---|---|---|---|
2012-12-13 17:11 | LRN | New Issue | |
2012-12-13 17:31 | Matthias Wachs | Note Added: 0006700 | |
2012-12-13 17:32 | Matthias Wachs | Note Added: 0006701 | |
2012-12-13 17:33 | LRN | Note Added: 0006702 | |
2012-12-13 17:38 | Matthias Wachs | Note Added: 0006703 | |
2012-12-13 17:49 | Matthias Wachs | Note Edited: 0006703 | |
2012-12-13 17:53 | Matthias Wachs | Note Added: 0006704 | |
2012-12-13 17:55 | Matthias Wachs | Note Edited: 0006704 | |
2012-12-13 18:00 | Matthias Wachs | Note Added: 0006705 | |
2012-12-14 00:33 | LRN | Note Added: 0006706 | |
2012-12-14 12:21 | LRN | File Added: 0001-Don-t-forget-to-cancel-old-tasks-in-FS.patch | |
2012-12-14 12:27 | LRN | Note Added: 0006709 | |
2012-12-15 15:28 | Christian Grothoff | Note Added: 0006714 | |
2012-12-15 15:28 | Christian Grothoff | Status | new => resolved |
2012-12-15 15:28 | Christian Grothoff | Fixed in Version | => 0.9.5 |
2012-12-15 15:28 | Christian Grothoff | Resolution | open => fixed |
2012-12-15 15:28 | Christian Grothoff | Assigned To | => LRN |
2012-12-15 15:28 | Christian Grothoff | Product Version | => Git master |
2012-12-15 15:28 | Christian Grothoff | Target Version | => 0.9.5 |
2012-12-21 16:49 | Christian Grothoff | Status | resolved => closed |