View Issue Details
ID | Project | Category | View Status | Date Submitted | Last Update |
---|---|---|---|---|---|
0002727 | GNUnet | file-sharing service | public | 2013-01-09 16:17 | 2013-12-24 20:54 |
Reporter | Matthias Wachs | Assigned To | Christian Grothoff | ||
Priority | low | Severity | minor | Reproducibility | have not tried |
Status | closed | Resolution | fixed | ||
Product Version | 0.10.0 | ||||
Target Version | 0.10.0 | Fixed in Version | 0.10.0 | ||
Summary | 0002727: Segfaults in FS on sparcbot | ||||
Description | Kernel logs on sparcbot:
[2013-01-05 20:50:24] gnunet-service-[144230]: segfault at 8 ip 00000000702cf7e8 (rpc 000000007029328c) sp 00000000ffd0c9a0 error 30001 in libc-2.13.so[70240000+16c000]
[2013-01-05 20:57:58] gnunet-service-[148622]: segfault at 8 ip 00000000702cf7e8 (rpc 000000007029328c) sp 00000000ffeee9b0 error 30001 in libc-2.13.so[70240000+16c000]
[2013-01-05 20:58:25] gnunet-service-[149168]: segfault at 0 ip 000000007031f7e8 (rpc 00000000702e328c) sp 00000000ffa269a0 error 30001 in libc-2.13.so[70290000+16c000]
[2013-01-05 20:58:26] gnunet-service-[149205]: segfault at 0 ip 00000000702fb7e8 (rpc 00000000702bf28c) sp 00000000ffb249a0 error 30001 in libc-2.13.so[7026c000+16c000]
[2013-01-05 20:58:49] gnunet-service-[149395]: segfault at 0 ip 00000000702cf7e8 (rpc 000000007029328c) sp 00000000ffc069a0 error 30001 in libc-2.13.so[70240000+16c000]
[2013-01-05 23:04:16] gnunet-service-[206960]: segfault at 0 ip 000000000001eba8 (rpc 00000000700799a8) sp 00000000fff9b140 error 30001 in gnunet-service-fs[10000+1c000]
[2013-01-05 23:07:01] gnunet-service-[207101]: segfault at 0 ip 000000000001eba8 (rpc 00000000700799a8) sp 00000000ffef9140 error 30001 in gnunet-service-fs[10000+1c000]
[2013-01-06 03:25:23] gnunet-service-[10777]: segfault at 0 ip 000000000001eba8 (rpc 00000000700799a8) sp 00000000ff971140 error 30001 in gnunet-service-fs[10000+1c000]
[2013-01-07 14:23:57] gnunet-service-[198992]: segfault at 0 ip 000000000001eba8 (rpc 00000000700799a8) sp 00000000ffdcf140 error 30001 in gnunet-service-fs[10000+1c000]
[2013-01-07 17:47:08] gnunet-service-[4041]: segfault at 0 ip 000000000001eba8 (rpc 00000000700c59a8) sp 00000000ffe9f140 error 30001 in gnunet-service-fs[10000+1c000]
[2013-01-07 17:47:08] gnunet-service-[4042]: segfault at 0 ip 000000000001eba8 (rpc 00000000700799a8) sp 00000000ffba5140 error 30001 in gnunet-service-fs[10000+1c000]
[2013-01-08 11:33:31] gnunet-service-[136331]: segfault at 0 ip 000000000001eba8 (rpc 00000000700799a8) sp 00000000ff967140 error 30001 in gnunet-service-fs[10000+1c000]
[2013-01-08 17:04:24] gnunet-service-[202254]: segfault at 0 ip 000000000001eba8 (rpc 00000000700799a8) sp 00000000ffe1f140 error 30001 in gnunet-service-fs[10000+1c000]
[2013-01-08 17:07:12] gnunet-service-[202393]: segfault at 0 ip 000000000001eba8 (rpc 00000000700799a8) sp 00000000ffa53140 error 30001 in gnunet-service-fs[10000+1c000] | ||||
Additional Information | Correlates to buildbot fs builds https://gnunet.org/buildbot/builders/lenny-sparc64-wachs/builds/103/steps/tests%20fs/logs/stdio Jan 08 17:19:53-644221 | ||||
Tags | No tags attached. | ||||
|
Try to reproduce by running fs tests manually |
|
ran fs tests manually: passed without coredump |
|
found older coredump in fs dir, don't know if this is the problem:
root@mamasparc:~/buildbot/lenny-sparc64-wachs/build/src/fs# gdb .libs/gnunet-service-fs core
GNU gdb (GDB) 7.4.1-debian
Copyright (C) 2012 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying" and "show warranty" for details.
This GDB was configured as "sparc-linux-gnu".
For bug reporting instructions, please see: <http://www.gnu.org/software/gdb/bugs/>...
Reading symbols from /root/buildbot/lenny-sparc64-wachs/build/src/fs/.libs/gnunet-service-fs...done.
warning: core file may not match specified executable file.
[New LWP 202393]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/sparc-linux-gnu/libthread_db.so.1".
Core was generated by `/tmp/gnbuild/lib/gnunet/libexec/gnunet-service-fs -c /tmp/testbed4a10EF/1/confi'.
Program terminated with signal 11, Segmentation fault.
#0 put_migration_continuation (cls=0x1498f0, success=0, min_expiration=<error reading variable: Cannot access memory at address 0x0>, msg=0x0) at gnunet-service-fs_pr.c:965
965 {
(gdb) bt
#0 put_migration_continuation (cls=0x1498f0, success=0, min_expiration=<error reading variable: Cannot access memory at address 0x0>, msg=0x0) at gnunet-service-fs_pr.c:965
#1 0x7007aee4 in process_result_message (cls=0x4ba28, msg=0x0) at datastore_api.c:1169
#2 0x700799b0 in timeout_queue_entry (cls=0x828c8, tc=0xffa53378) at datastore_api.c:398
#3 0x7019b190 in run_ready (ws=0x3da98, rs=0x3da10) at scheduler.c:597
#4 GNUNET_SCHEDULER_run (task=<optimized out>, task_cls=<optimized out>) at scheduler.c:786
#5 0x701a6f38 in GNUNET_SERVICE_run (argc=3, argv=0xffa53714, service_name=0x28080 "fs", options=<optimized out>, task=<optimized out>, task_cls=<optimized out>) at service.c:1815
#6 0x00013008 in main (argc=3, argv=0xffa53714) at gnunet-service-fs.c:709
(gdb) |
|
could reproduce with ./perf_gnunet_service_fs_p2p_respect
Core was generated by `/tmp/gnbuild/lib/gnunet/libexec/gnunet-service-fs -c /tmp/testbedte0z5P/2/confi'.
Program terminated with signal 11, Segmentation fault.
#0 put_migration_continuation (cls=0x9d388, success=0, min_expiration=<error reading variable: Cannot access memory at address 0x0>, msg=0x0) at gnunet-service-fs_pr.c:965
965 {
(gdb) bt
#0 put_migration_continuation (cls=0x9d388, success=0, min_expiration=<error reading variable: Cannot access memory at address 0x0>, msg=0x0) at gnunet-service-fs_pr.c:965
#1 0xf7d82ee4 in process_result_message (cls=0x4ba18, msg=0x0) at datastore_api.c:1169
#2 0xf7d819b0 in timeout_queue_entry (cls=0x9be60, tc=0xff95d7f8) at datastore_api.c:398
#3 0xf7cdf190 in run_ready (ws=0x3da88, rs=0x3da00) at scheduler.c:597
#4 GNUNET_SCHEDULER_run (task=<optimized out>, task_cls=<optimized out>) at scheduler.c:786
#5 0xf7ceaf38 in GNUNET_SERVICE_run (argc=3, argv=0xff95db94, service_name=0x28080 "fs", options=<optimized out>, task=<optimized out>, task_cls=<optimized out>) at service.c:1815
#6 0x00013008 in main (argc=3, argv=0xff95db94) at gnunet-service-fs.c:709
(gdb) |
|
Very strange -- the line number 965 is above the beginning of the function, and EVERY dereference in the file (except for 'cls' derefs) is explicitly checked against NULL _already_ (since end of 2012!). So there cannot be a NULL-deref in the entire function! |
|
root@mamasparc:~/buildbot/lenny-sparc64-wachs/build/src/fs# ls -l core
-rw------- 1 root root 2080768 Jan 25 09:49 core
root@mamasparc:~/buildbot/lenny-sparc64-wachs/build/src/fs# gdb .libs/gnunet-service-fs core
GNU gdb (GDB) 7.4.1-debian
Copyright (C) 2012 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying" and "show warranty" for details.
This GDB was configured as "sparc-linux-gnu".
For bug reporting instructions, please see: <http://www.gnu.org/software/gdb/bugs/>...
Reading symbols from /root/buildbot/lenny-sparc64-wachs/build/src/fs/.libs/gnunet-service-fs...done.
warning: core file may not match specified executable file.
[New LWP 219160]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/sparc-linux-gnu/libthread_db.so.1".
Core was generated by `/tmp/gnbuild/lib/gnunet/libexec/gnunet-service-fs -c /tmp/testbedkuitAu/0/confi'.
Program terminated with signal 11, Segmentation fault.
#0 put_migration_continuation (cls=0x43a08, success=0, min_expiration=<error reading variable: Cannot access memory at address 0x0>, msg=0x0) at gnunet-service-fs_pr.c:965
965 {
(gdb) bt
#0 put_migration_continuation (cls=0x43a08, success=0, min_expiration=<error reading variable: Cannot access memory at address 0x0>, msg=0x0) at gnunet-service-fs_pr.c:965
#1 0x7007aee4 in process_result_message (cls=0x4b030, msg=0x0) at datastore_api.c:1169
#2 0x700799b0 in timeout_queue_entry (cls=0xaef18, tc=0xffdcf8d8) at datastore_api.c:398
#3 0x70147190 in run_ready (ws=0x3cd38, rs=0x3ccb0) at scheduler.c:597
#4 GNUNET_SCHEDULER_run (task=<optimized out>, task_cls=<optimized out>) at scheduler.c:786
#5 0x70152f38 in GNUNET_SERVICE_run (argc=3, argv=0xffdcfc74, service_name=0x28080 "fs", options=<optimized out>, task=<optimized out>, task_cls=<optimized out>) at service.c:1815
#6 0x00013008 in main (argc=3, argv=0xffdcfc74) at gnunet-service-fs.c:709
(gdb) |
|
Could reproduce with perf_gnunet_service_fs_p2p_index |
|
put_migration_continuation should never be called from process_result_message as the function has the WRONG SIGNATURE for that call location. So somehow datastore_api.c had some internal (memory) corruption, and then likely calls the wrong function with possibly some bogus closure and thus we crash. |
|
Stack trace with SVN HEAD:
Reading symbols from /tmp/gnbuild/lib/gnunet/libexec/gnunet-service-fs...done.
warning: core file may not match specified executable file.
[New LWP 103372]
Core was generated by `/tmp/gnbuild/lib/gnunet/libexec/gnunet-service-fs -c /tmp/testbedgdXR8u/0/confi'.
Program terminated with signal 11, Segmentation fault.
#0 put_migration_continuation (cls=0x43938, success=0, min_expiration=<error reading variable: Could not find type for DW_OP_GNU_const_type>, msg=0x0) at gnunet-service-fs_pr.c:940
940 {
(gdb) ba
#0 put_migration_continuation (cls=0x43938, success=0, min_expiration=<error reading variable: Could not find type for DW_OP_GNU_const_type>, msg=0x0) at gnunet-service-fs_pr.c:940
#1 0x7007efe4 in process_result_message (cls=0x4b850, msg=0x0) at datastore_api.c:1172
#2 0x7007dad0 in timeout_queue_entry (cls=0xfd2c8, tc=0xffc0387c) at datastore_api.c:398
#3 0x701467e0 in run_ready (ws=0x3c8d0, rs=0x3c848) at scheduler.c:593
#4 GNUNET_SCHEDULER_run (task=<optimized out>, task_cls=task_cls@entry=0xffc039b0) at scheduler.c:808
#5 0x70151a80 in GNUNET_SERVICE_run (argc=argc@entry=3, argv=argv@entry=0xffc03c14, service_name=service_name@entry=0x27d50 "fs", options=options@entry=GNUNET_SERVICE_OPTION_NONE, task=task@entry=0x138c0 <run>, task_cls=task_cls@entry=0x0) at service.c:1478
#6 0x00013148 in main (argc=3, argv=0xffc03c14) at gnunet-service-fs.c:721 |
|
The function 'put_migration_continuation' is called, but its signature does not match the callback type expected at datastore_api.c:1172. Clearly the h->queue_head.qc union (!) does not contain the type of pointer that was expected here. |
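To illustrate the failure mode described in this note, here is a minimal C sketch, assuming a queue entry whose context is a union of two continuation kinds. All names (`status_cb`, `result_cb`, `queue_ctx`, `deliver_result`, the `count_*` callbacks) are hypothetical and are not the identifiers used in datastore_api.c. The explicit `kind` tag is an addition for illustration only: it lets the dispatcher refuse an entry that holds the other kind of pointer instead of calling it with the wrong signature.

```c
#include <stddef.h>

/* Hypothetical callback types: a status continuation and a result
   processor take different arguments, so they are not interchangeable. */
typedef void (*status_cb) (void *cls, int success, const char *msg);
typedef void (*result_cb) (void *cls, const void *data, size_t size);

enum ctx_kind { CTX_STATUS, CTX_RESULT };

struct status_ctx { status_cb cont; void *cont_cls; };
struct result_ctx { result_cb proc; void *proc_cls; };

/* Both contexts share storage; 'kind' (an assumption of this sketch)
   records which member is currently valid. */
struct queue_ctx
{
  enum ctx_kind kind;
  union { struct status_ctx sc; struct result_ctx rc; } u;
};

/* Sample callbacks: 'cls' points to an int counter that is incremented. */
void count_status (void *cls, int success, const char *msg)
{
  (void) success; (void) msg;
  ++*(int *) cls;
}

void count_result (void *cls, const void *data, size_t size)
{
  (void) data; (void) size;
  ++*(int *) cls;
}

/* Deliver a result to the entry's processor.
   Returns 1 if the processor ran, 0 if the entry holds a status
   continuation and was therefore left untouched. */
int deliver_result (struct queue_ctx *qc, const void *data, size_t size)
{
  if (CTX_RESULT != qc->kind)
    return 0;                  /* wrong union member: do not call it */
  if (NULL != qc->u.rc.proc)
    qc->u.rc.proc (qc->u.rc.proc_cls, data, size);
  return 1;
}
```

Without such a tag, reading `u.rc.proc` from an entry that was filled in as `u.sc` yields a pointer to a function of the other type, and calling it is undefined behavior in C. That is consistent with a crash reported at the opening brace of the callee.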
|
timeout_queue_entry in datastore_api.c did not ensure that the queue slot for which the 'response_proc' was being invoked was at the head of the list. This caused a mismatch in which functions were called next. This only showed on sparc due to timing (it needed a timeout!). Fixed in SVN 30791. |
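The head-of-list condition in this note can be sketched as follows. This is not the patch from SVN 30791, only a minimal illustration under stated assumptions: the response handler always operates on the queue head, so a timeout for any other entry must not invoke it. The names `entry`, `queue`, `respond_to_head`, `unlink_entry` and `on_timeout` are hypothetical.

```c
#include <stddef.h>

struct entry
{
  struct entry *next;
  int id;
};

struct queue
{
  struct entry *head;
  int last_served;   /* id of the entry last handled, -1 if none */
};

/* Response handler that, as assumed here, always works on the head. */
static void respond_to_head (struct queue *q)
{
  struct entry *e = q->head;

  if (NULL == e)
    return;
  q->last_served = e->id;
  q->head = e->next;
}

/* Remove 'e' from the singly linked list without running any handler. */
static void unlink_entry (struct queue *q, struct entry *e)
{
  struct entry **pp = &q->head;

  while ( (NULL != *pp) && (*pp != e) )
    pp = &(*pp)->next;
  if (NULL != *pp)
    *pp = e->next;
}

/* Timeout for 'e': only run the head-based handler if 'e' really is
   the head; otherwise just drop the entry.
   Returns 1 if the handler ran, 0 if the entry was merely removed. */
int on_timeout (struct queue *q, struct entry *e)
{
  if (e == q->head)
  {
    respond_to_head (q);
    return 1;
  }
  unlink_entry (q, e);
  return 0;
}
```

With two queued entries, a timeout on the second one removes it and leaves the head untouched. Without the `e == q->head` check, the handler would have consumed the head's context on behalf of a different request.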
|
Fixed in SVN 30792. |
Date Modified | Username | Field | Change |
---|---|---|---|
2013-01-09 16:17 | Matthias Wachs | New Issue | |
2013-01-09 16:17 | Matthias Wachs | Status | new => assigned |
2013-01-09 16:17 | Matthias Wachs | Assigned To | => Christian Grothoff |
2013-01-09 16:18 | Matthias Wachs | Note Added: 0006766 | |
2013-01-09 16:42 | Matthias Wachs | Note Added: 0006767 | |
2013-01-09 16:43 | Matthias Wachs | Note Added: 0006768 | |
2013-01-09 19:57 | Matthias Wachs | Note Added: 0006769 | |
2013-01-24 17:18 | Christian Grothoff | Note Added: 0006793 | |
2013-01-25 10:13 | Matthias Wachs | Note Added: 0006794 | |
2013-01-25 10:53 | Matthias Wachs | Note Added: 0006795 | |
2013-01-25 13:32 | Christian Grothoff | Note Added: 0006796 | |
2013-01-31 10:32 | Christian Grothoff | Relationship added | related to 0002724 |
2013-01-31 10:33 | Christian Grothoff | Relationship deleted | related to 0002724 |
2013-07-10 23:46 | Christian Grothoff | Assigned To | Christian Grothoff => |
2013-07-10 23:46 | Christian Grothoff | Priority | normal => low |
2013-07-10 23:46 | Christian Grothoff | Status | assigned => confirmed |
2013-11-18 21:19 | Christian Grothoff | Assigned To | => Christian Grothoff |
2013-11-18 21:19 | Christian Grothoff | Status | confirmed => assigned |
2013-11-18 21:31 | Christian Grothoff | Note Added: 0007653 | |
2013-11-18 21:33 | Christian Grothoff | Note Added: 0007654 | |
2013-11-18 21:45 | Christian Grothoff | Note Added: 0007655 | |
2013-11-18 21:58 | Christian Grothoff | Note Added: 0007656 | |
2013-11-18 21:58 | Christian Grothoff | Status | assigned => resolved |
2013-11-18 21:58 | Christian Grothoff | Fixed in Version | => 0.10.0 |
2013-11-18 21:58 | Christian Grothoff | Resolution | open => fixed |
2013-11-18 21:58 | Christian Grothoff | Target Version | => 0.10.0 |
2013-12-24 20:54 | Christian Grothoff | Status | resolved => closed |