Skip to content

fix #500 to avoid potential hang and event loss#501

Merged
cfc4n merged 5 commits into
gojue:masterfrom
ruitianzhong:bug-fix
Mar 6, 2024
Merged

fix #500 to avoid potential hang and event loss#501
cfc4n merged 5 commits into
gojue:masterfrom
ruitianzhong:bug-fix

Conversation

@ruitianzhong
Copy link
Copy Markdown
Contributor

@ruitianzhong ruitianzhong commented Mar 3, 2024

  • add IfUsed() for worker to check if it is used.
  • add Get() for routine that generates event before using the worker and Put() when after using the worker. They should not be called concurently(only one producer). So it panics if unexpected behavior happen.
  • add clean up logic for ew.incoming and return only after both the ew.incoming is empty and ew is not used.

The reasoning for that is as follows.

When returned from ew.Close(), there are two possiblities:

  1. no routine can touch it.
  2. one routine can still touch ew because getWorkerByUUID()
    happen before ew.Close()

When no routine can touch it (i.e.,ew.IfUsed == false),
we just drain the ew.incoming and return.

When one routine can touch it (i.e.,ew.IfUsed == true), we ensure
that we only return after the routine can not touch it
(i.e.,ew.IfUsed == false). At this point, we can ensure that no
other routine will touch it and send events through the ew.incoming.
So, we return.

Because eworker has been deleted from workqueue after ew.Close()
(ordered by a workqueue lock), at this point, we can ensure that
no ew will not be touched even in the future. So the return is
safe.

Finally, to verify it, in ruitianzhong@e5bce57 , run go test -v ./pkg/event_processor/ -run TestHang , no hang happened.

Because the poc inject delay to reliably reproduce the hang, I do not add it as a unit test in bug-fix branch.

Signed-off-by: ruitianzhong <ruitian-zhong@outlook.com>
@cfc4n cfc4n added question Further information is requested test Tests and some Magic labels Mar 3, 2024
Signed-off-by: ruitianzhong <ruitian-zhong@outlook.com>
Signed-off-by: ruitianzhong <ruitian-zhong@outlook.com>
Comment thread pkg/event_processor/iworker.go
Signed-off-by: ruitianzhong <ruitian-zhong@outlook.com>
Signed-off-by: ruitianzhong <ruitian-zhong@outlook.com>
@ruitianzhong ruitianzhong requested a review from cfc4n March 5, 2024 16:50
Copy link
Copy Markdown
Member

@cfc4n cfc4n left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks.

Comment thread pkg/event_processor/iworker.go
@cfc4n cfc4n merged commit a286703 into gojue:master Mar 6, 2024
@ruitianzhong ruitianzhong deleted the bug-fix branch March 6, 2024 01:11
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

question Further information is requested test Tests and some Magic

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants