Skip to content

Fix worker thread attrition#321

Merged
ZimbiX merged 4 commits intomasterfrom
fix-worker-thread-attrition
Dec 22, 2021
Merged

Fix worker thread attrition#321
ZimbiX merged 4 commits intomasterfrom
fix-worker-thread-attrition

Conversation

@ZimbiX
Copy link
Copy Markdown
Member

@ZimbiX ZimbiX commented Dec 22, 2021

The changes in #285 unfortunately introduced a race condition bug where the result from Worker#fetch_next_metajob (which just calls JobBuffer#shift) could be unexpectedly nil, and so the truthiness check in the Worker#work_loop's while loop could evaluate to false, causing the worker thread to stop. For details, see #285 (comment).

We've fixed this by handling nil separately from false, and in that case, restarting the loop to do the fetch again, which should return a proper result the next time. We realise it's not the cleanest of fixes, but we'd prefer to get the fix in, and refactor later if the need arises.

The test this adds covers the changes from: here, #285, and #318. It's a fixed version of the test submitted in a comment by @ebeigarts: #285 (comment)

smontanari and others added 4 commits December 22, 2021 17:54
This uses essentially a trinary return from `fetch_next_metajob` to determine if a race condition occured where the buffer was emptied between the check and the buffer shift.

Co-Authored-By: Thomas van der Pol <tvdp@hey.com>
Co-Authored-By: Brendan Weibrecht <brendan@weibrecht.net.au>
From a [comment](#285 (comment)) by @ebeigarts

Co-Authored-By: Edgars Beigarts <edgars.beigarts@gmail.com>
…hift deadlock test

Restores the concurrency and and job count per thread to those from the originally submitted test. With the deadlock fix removed, it can now produce a deadlock most of the time (83% of runs in my testing)
@ZimbiX ZimbiX merged commit ad02bb5 into master Dec 22, 2021
@ZimbiX ZimbiX deleted the fix-worker-thread-attrition branch December 22, 2021 07:17
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants