Skip to content

limit large glob operation in gunicorn config#3220

Merged
sgoggins merged 2 commits intochaoss:mainfrom
MoralCode:unused-glob
Jul 10, 2025
Merged

limit large glob operation in gunicorn config#3220
sgoggins merged 2 commits intochaoss:mainfrom
MoralCode:unused-glob

Conversation

@MoralCode
Copy link
Contributor

@MoralCode MoralCode commented Jul 9, 2025

Description
This change removes a line of code from gunicorn config that performs a recursive glob operation for seemingly no purpose since the variable it assigns the results to is unused

This might help with performance somewhat, and in worst cases, helps prevent instances from accidentally self-DOS ing if they happen to store the fascade repository clones in the path of this glob (therefore looking at and processing all jinja2 files in all repositories being analyzed by augur)

Notes for Reviewers
This change has been deployed in prod in the Red Hat augur instance for a while without causing issues. It was put in place by previous sysadmins and has therefore been thoroughly reviewed by at least one other experienced person in addition to my searching for uses of this variable in the codebase

Fixes #2958

Signed commits

  • Yes, I signed my commits.

sgoggins
sgoggins previously approved these changes Jul 9, 2025
Copy link
Member

@sgoggins sgoggins left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a good catch. We don't use Jinja files anymore, so we don't need to worry about this step. LGTM.

@Ulincsys
Copy link
Contributor

Ulincsys commented Jul 9, 2025

We do still use Jinja files, they are what the frontend is rendered with.

I think instead of removing the glob alltogether, we could probably just restrict it to the templates directory, which is where all of the jinja files are located for the frontend.

@Ulincsys
Copy link
Contributor

Ulincsys commented Jul 9, 2025

@MoralCode This line is not unused, the whole file is loaded by Gunicorn at startup to configure worker and background settings. You can see where it is loaded in backend.py lines 74, 97, 98:

gunicorn_location = os.getcwd() + "/augur/api/gunicorn_conf.py"
...
gunicorn_command = f"gunicorn -c {gunicorn_location} -b {host}:{port} augur.api.server:app --log-file {gunicorn_log_file}" 
server = subprocess.Popen(gunicorn_command.split(" "))

@Ulincsys
Copy link
Contributor

Ulincsys commented Jul 9, 2025

Here's my suggested patch:

diff --git a/augur/api/gunicorn_conf.py b/augur/api/gunicorn_conf.py
index 09c21161a..3e7bf5e3a 100644
--- a/augur/api/gunicorn_conf.py
+++ b/augur/api/gunicorn_conf.py
@@ -19,7 +19,17 @@ logger = logging.getLogger(__name__)
 workers = multiprocessing.cpu_count() * 2 + 1
 umask = 0o007
 reload = True
-reload_extra_files = glob(str(Path.cwd() / '**/*.j2'), recursive=True)
+
+augur_templates_dir = Path.cwd() / "augur/templates"
+
+if not augur_templates_dir.is_dir():
+    logger.critical("Could not locate templates in Gunicorn startup")
+    exit(-1)
+
+reload_extra_files = glob(str(augur_templates_dir.resolve() / '**/*.j2'), recursive=True)
+
+# Don't  want to leave extraneous variables in config scope
+del augur_templates_dir
 
 # set the log location for gunicorn    
 logs_directory = get_value('Logging', 'logs_directory')

@MoralCode
Copy link
Contributor Author

@Ulincsys

This line is not unused, the whole file is loaded by Gunicorn at startup to configure worker and background settings. You can see where it is loaded in backend.py lines 74, 97, 98:

I'm not sure that I see where the reload_extra_files variable is being used in the lines you quoted. is there some implicit behavior here that I'm not familiar with?

@Ulincsys
Copy link
Contributor

Ulincsys commented Jul 9, 2025

@MoralCode reload_extra_files is a Gunicorn config option. Gunicorn imports this file, and uses the globals defined in it as option definitions.

You can see the documentation for this option here: https://docs.gunicorn.org/en/stable/settings.html#reload-extra-files

@MoralCode
Copy link
Contributor Author

ah ok. I'll update this to use your patch then. Thanks for the info!

Is this reload feature likely to only be useful for development environments? Its been patched out of our prod instance for a while with seemingly no issues - maybe its worth putting behind an AUGUR_DEBUG or similar environment variable if one already exists?

…gur templates directory

Co-authored-by: Ulincsys <28362836a@gmail.com>
Signed-off-by: Adrian Edwards <adredwar@redhat.com>
@MoralCode MoralCode changed the title remove unused line that performs a large glob operation limit large glob operation in gunicorn config Jul 9, 2025
@Ulincsys
Copy link
Contributor

Ulincsys commented Jul 9, 2025

@MoralCode Yes, it's mostly a development feature, so I think putting it behind the development flag is probably appropriate.

We use the AUGUR_DEV environment variable to denote a development instance, so maybe adding a check for it in the file would do?

The only time it might be suitable for productions would be if the templates were to be programmatically updated during runtime somehow, but Augur does not do that, so it's not something that we necessarily need to support.

…environment variable

Signed-off-by: Adrian Edwards <adredwar@redhat.com>
Copy link
Member

@sgoggins sgoggins left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@sgoggins sgoggins merged commit eaa2fb9 into chaoss:main Jul 10, 2025
10 checks passed
@MoralCode MoralCode deleted the unused-glob branch November 19, 2025 14:30
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Recursive glob in gunicorn config can hang augur on large installs

3 participants