View Issue Details

ID: 0008978
Project: GNUnet
Category: util library
View Status: public
Last Update: 2024-07-08 23:33
Reporter: thejackimonster
Assigned To:
Status: new
Resolution: open
Product Version: Git master
Summary: 0008978: Scheduler: ready_count underflow
Description: While testing the new voice chat feature in messenger-gtk, I ran into a crash caused by the scheduler. An assertion aborted the application because the `ready_count` variable had underflowed. I'm not sure how this is possible, but it seems to happen during one of the decrements of `ready_count`.

It might be caused by multiple threads accessing the GNUnet APIs, but I'm not 100% sure about this because I haven't yet been able to capture a backtrace that points to that.
Steps To Reproduce: The upstream version of messenger-gtk occasionally runs into this issue during a voice call. The application crashes on the assertion that `ready_count` in the scheduler must be zero when all task queues are empty. However, it seems the counter can underflow to the value 4294967295.
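For illustration, the value 4294967295 is what an unsigned 32-bit counter holds after being decremented at zero. The following is a minimal sketch, assuming `ready_count` is an `unsigned int` that is 32 bits wide on the affected platform; the helper names `decrement_unchecked()` and `decrement_checked()` are hypothetical and not part of GNUnet.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical helper: decrement without a guard.
 * Unsigned arithmetic wraps, so 0 - 1 yields UINT_MAX
 * (4294967295 when unsigned int is 32 bits wide). */
static unsigned int
decrement_unchecked (unsigned int count)
{
  return count - 1;
}

/* Hypothetical helper: decrement with a guard.
 * The assertion fires at the faulty decrement itself,
 * not later when the queues are checked for emptiness. */
static unsigned int
decrement_checked (unsigned int count)
{
  assert (count > 0);
  return count - 1;
}
```

A guard like the one in `decrement_checked()` would move the crash to the decrement that is actually wrong, which might make the resulting backtrace more useful.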
Additional Information: Assertion that triggers the crash:

Locations of decrement:

Tags: No tags attached.



2024-07-02 00:23

developer   ~0022781

Okay, so I tried to rule out multiple threads as the cause by using the code from gnunet-gtk to combine the GNUnet scheduler with the g_main_loop from GTK. But I could still reproduce the issue and trigger the assertion. I'm not sure how this is possible in a single-threaded environment, but it looks like it is. So there's definitely something wrong in the scheduler logic.


2024-07-04 22:29

developer   ~0022787

I think I found the actual cause of `ready_count` being wrong. The internal function `remove_pass_end_marker()` removed a task from the ready queue without adjusting `ready_count`. So I patched this:

The scheduler test case also contained a workaround for this existing issue, treating a wrong `ready_count` as correct...
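The kind of bookkeeping bug described in this note can be sketched as follows. This is not the actual patch and not GNUnet code: `DemoTask`, `DemoQueue` and the `demo_queue_*` functions are hypothetical names, and the sketch only assumes that every removal from a queue has to be paired with a decrement of its counter.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical task node; a real scheduler task has many more fields. */
struct DemoTask
{
  struct DemoTask *next;
  struct DemoTask *prev;
};

/* Hypothetical ready queue that carries its own element counter. */
struct DemoQueue
{
  struct DemoTask *head;
  struct DemoTask *tail;
  unsigned int count;
};

/* Append a task at the tail and account for it in the counter. */
static void
demo_queue_append (struct DemoQueue *q, struct DemoTask *t)
{
  t->next = NULL;
  t->prev = q->tail;
  if (NULL != q->tail)
    q->tail->next = t;
  else
    q->head = t;
  q->tail = t;
  q->count++;
}

/* Unlink a task AND decrement the counter in the same place,
 * so no caller can remove a task without adjusting the count. */
static void
demo_queue_remove (struct DemoQueue *q, struct DemoTask *t)
{
  if (NULL != t->prev)
    t->prev->next = t->next;
  else
    q->head = t->next;
  if (NULL != t->next)
    t->next->prev = t->prev;
  else
    q->tail = t->prev;
  t->next = NULL;
  t->prev = NULL;
  assert (q->count > 0);
  q->count--;
}

/* Recount by walking the list; useful as a consistency check in tests. */
static unsigned int
demo_queue_walk_length (const struct DemoQueue *q)
{
  unsigned int n = 0;

  for (const struct DemoTask *t = q->head; NULL != t; t = t->next)
    n++;
  return n;
}
```

Comparing the stored counter against `demo_queue_walk_length()` in a test would flag a removal path that skips the decrement, instead of accepting the stale count.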


2024-07-05 02:08

developer   ~0022788

Even after those changes, I can still run into the same assertion crashing my application. So I'm still investigating.


2024-07-08 23:33

developer   ~0022799

From my testing today it seems like the issue was coming from GStreamer running on a different thread than the GNUnet scheduler. After synchronizing the pulled sample data with the scheduler thread, I was able to write it via messaging without reproducing the crash so far. I'll keep this issue open for now, but it seems like the cause has been found.
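One way such a hand-off between threads could look is sketched below. This is only an assumption about the general technique, not the code used in messenger-gtk: `DemoMailbox` and the `demo_*` functions are hypothetical, samples are plain `int` values, and the mechanism that wakes the scheduler thread is left out. The point is that the producer thread only touches a mutex-protected mailbox, and only the scheduler thread drains it and calls into scheduler-bound APIs.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define DEMO_CAPACITY 64

/* Hypothetical mailbox shared by a producer thread and the scheduler thread. */
struct DemoMailbox
{
  pthread_mutex_t lock;
  int samples[DEMO_CAPACITY];
  size_t used;
};

static void
demo_mailbox_init (struct DemoMailbox *box)
{
  pthread_mutex_init (&box->lock, NULL);
  box->used = 0;
}

/* Producer side: stores one sample, never calls scheduler functions.
 * Returns 1 on success, 0 if the mailbox is full. */
static int
demo_mailbox_push (struct DemoMailbox *box, int sample)
{
  int ok = 0;

  pthread_mutex_lock (&box->lock);
  if (box->used < DEMO_CAPACITY)
  {
    box->samples[box->used++] = sample;
    ok = 1;
  }
  pthread_mutex_unlock (&box->lock);
  return ok;
}

/* Scheduler side: copies up to out_len pending samples into out,
 * keeps the rest queued, and returns how many were copied. */
static size_t
demo_mailbox_drain (struct DemoMailbox *box, int *out, size_t out_len)
{
  size_t n;

  pthread_mutex_lock (&box->lock);
  n = (box->used < out_len) ? box->used : out_len;
  for (size_t i = 0; i < n; i++)
    out[i] = box->samples[i];
  for (size_t i = n; i < box->used; i++)
    box->samples[i - n] = box->samples[i];
  box->used -= n;
  pthread_mutex_unlock (&box->lock);
  return n;
}

/* Body of the hypothetical producer thread: pushes ten samples. */
static void *
demo_producer (void *cls)
{
  struct DemoMailbox *box = cls;

  for (int i = 0; i < 10; i++)
    demo_mailbox_push (box, i);
  return NULL;
}
```

In an application, `demo_mailbox_drain()` would be called from a task running on the scheduler thread, so that all further processing of the samples happens there.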

Issue History

Date Modified Username Field Change
2024-06-25 15:23 thejackimonster New Issue
2024-06-25 15:23 thejackimonster Additional Information Updated
2024-07-02 00:23 thejackimonster Note Added: 0022781
2024-07-04 22:29 thejackimonster Note Added: 0022787
2024-07-05 02:08 thejackimonster Note Added: 0022788
2024-07-08 23:33 thejackimonster Note Added: 0022799