fix: deadlock if /change is called during /restart#321
Conversation
Summary of ChangesHello @toddbaert, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request resolves a critical deadlock scenario that could arise from the interaction between flag change operations and system restarts. By refactoring the file watcher mechanism to allow for a safe and controlled restart, the changes eliminate a race condition where the file watcher's waitgroup would never complete. This significantly enhances the system's robustness and prevents service disruptions during dynamic configuration updates. Highlights
🧠 New Feature in Public Preview: You can now enable Memory to help Gemini Code Assist learn from your team's feedback. This makes future code reviews more consistent and personalized to your project's style. Click here to enable Memory in your admin console. Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
4957628 to
f26749f
Compare
There was a problem hiding this comment.
Code Review
This pull request addresses a deadlock issue by replacing the pause/continue logic for the file watcher with a more robust restart mechanism. This is a solid approach to fixing the underlying problem. However, the new implementation introduces a couple of issues: a potential data race in RestartFileWatcher and an ignored error return value. My review includes suggestions to resolve these concerns.
Signed-off-by: Todd Baert <todd.baert@dynatrace.com>
f26749f to
490239a
Compare
If the
/changeendpoint was called while flagd was restarting (due to the/restartendpoint) the filewatcher was closed (because theallFlags.jsonwas deleted). Due to this, the waitgroup never stopped waiting.In the Java e2e suite, it looks like this:
https://github.com/open-feature/java-sdk-contrib/actions/runs/20754500347/job/59592648835?pr=1624
This change removes the
PauseFileWatcherandContinueFileWatcherfuncs, and instead just allows to thefilewatcherto be safely restarted, preventing this deadlock. I ran this against the Java suite locally and it fixed the intermittent failure caused by the deadlock