Conversation
This reverts commit ec2b007.
WalkthroughComprehensive infrastructure-as-code refactoring transitioning from staging to live environment with systematic variable naming updates (create_* to enable_*), Terraform version constraints relaxation, Django-centric parameter restructuring, NAT gateway conditional logic, and CI/CD workflow updates for ECS task orchestration. Changes
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Possibly related PRs
Suggested reviewers
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 inconclusive)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
📝 Coding Plan
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
1 issue found across 46 files
Confidence score: 3/5
- The example config in
infrastructure/live/terraform.tfvars.examplepointsdjango_settings_moduletosettings.Staging, which doesn’t match the actual module file and can trigger startup import errors if used. - This is a concrete runtime risk, so the merge carries some user-impacting uncertainty despite being limited to an example config.
- Pay close attention to
infrastructure/live/terraform.tfvars.example- incorrect module path may break deployments.
Prompt for AI agents (unresolved issues)
Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.
<file name="infrastructure/live/terraform.tfvars.example">
<violation number="1" location="infrastructure/live/terraform.tfvars.example:8">
P1: Use lowercase module path for `django_settings_module`; `settings.Staging` does not match the actual settings module file and can cause startup import errors.</violation>
</file>
Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## main #4258 +/- ##
=======================================
Coverage 99.48% 99.48%
=======================================
Files 520 520
Lines 16305 16305
Branches 2204 2254 +50
=======================================
Hits 16221 16221
Misses 56 56
Partials 28 28
Flags with carried forward coverage won't be shown. Click here to find out more. Continue to review full report in Codecov by Sentry.
🚀 New features to boost your workflow:
|
There was a problem hiding this comment.
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
infrastructure/bootstrap/main.tf (1)
84-102:⚠️ Potential issue | 🟠 MajorAdd
kms:CreateGrantto the CloudWatch Logs KMS policy statement.The
logs:AssociateKmsKeypermission has been added to the caller side, butinfrastructure/modules/kms/main.tfstill does not grantkms:CreateGranttologs.${data.aws_region.current.id}.amazonaws.com. CloudWatch Logs requires this permission to create grants on the CMK when the log-group modules (task, service, cache, networking) associate the KMS key. Without it, Terraform applies will fail during key association.🔧 Required change in
infrastructure/modules/kms/main.tfstatement { actions = [ + "kms:CreateGrant", "kms:Decrypt", "kms:Describe*", "kms:Encrypt", "kms:GenerateDataKey*", "kms:ReEncrypt*", ]🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@infrastructure/bootstrap/main.tf` around lines 84 - 102, The KMS key policy block that allows CloudWatch Logs (the statement with sid "CloudWatchLogsManagement" or the aws_kms_key policy allowing principal logs.${data.aws_region.current.id}.amazonaws.com) is missing the kms:CreateGrant action; update that policy statement to include "kms:CreateGrant" alongside existing kms actions so CloudWatch Logs can create grants when associating the CMK (ensure the principal remains logs.${data.aws_region.current.id}.amazonaws.com and the resource/statement scope matches the existing key policy).infrastructure/live/README.md (1)
3-7:⚠️ Potential issue | 🟠 MajorFix the first-run bootstrap prerequisite.
This section says
nest-staging-terraformis created automatically by bootstrap, but the first CI/CD run needs that role to already exist. As written, a new deployment will dead-end before bootstrap can update anything.Based on learnings: The
nest-staging-terraformIAM role is expected to be manually pre-created before the first CI/CD pipeline run. The bootstrap-staging-nest job in .github/workflows/run-ci-cd.yaml updates this role but does not handle initial creation, so the role must exist before build-staging-images runs.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@infrastructure/live/README.md` around lines 3 - 7, Update the README to state that the IAM role nest-staging-terraform must be manually created before the very first staging CI/CD run: clarify that the bootstrap-staging-nest job only updates/assumes the existing role (it does not create it on first run), and that build-staging-images (and the initial staging pipeline) will fail if nest-staging-terraform doesn't already exist; mention how to create the role or point to the terraform snippet/trust policy to pre-create it and note that subsequent bootstrap runs will update the role.
🧹 Nitpick comments (2)
infrastructure/live/.terraform.lock.hcl (1)
27-45: Consider declaring therandomprovider ininfrastructure/live/main.tf.The lockfile includes the
randomprovider (used by child modules likedatabaseforrandom_password), butinfrastructure/live/main.tfonly declares theawsprovider in itsrequired_providersblock. While Terraform aggregates provider requirements from child modules, explicitly declaring all providers in the root module improves clarity and makes version constraints more discoverable.💡 Suggested addition to infrastructure/live/main.tf
terraform { required_version = "~> 1.14.0" required_providers { aws = { source = "hashicorp/aws" version = "~> 6.36.0" } random = { source = "hashicorp/random" version = "~> 3.8.0" } } }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@infrastructure/live/.terraform.lock.hcl` around lines 27 - 45, The root terraform block's required_providers is missing the random provider; update the terraform { required_providers { ... } } block in main.tf to explicitly declare random with source "hashicorp/random" and a version constraint (e.g. "~> 3.8.0") alongside the existing aws entry so the root module documents the provider requirement used by child modules (like the database module that uses random_password).infrastructure/live/variables.tf (1)
132-140: Preventenvironment/Django config mismatches.
django_configurationanddjango_settings_moduleare now free-form alongsidevar.environment, soproduction+Staging/settings.stagingis currently a valid input set. Consider deriving them fromvar.environmentor enforcing the allowed pairs so bad tfvars fail fast.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@infrastructure/live/variables.tf` around lines 132 - 140, Ensure django_configuration and django_settings_module cannot diverge from var.environment by adding deterministic derivation or validation: either set them as computed locals based on var.environment (e.g., map environment -> configuration and settings module) and replace the free variables, or add terraform validation blocks on variable "django_configuration" and "django_settings_module" that check values against var.environment (using a mapping/local lookup) and fail fast on mismatch; reference the variable names django_configuration, django_settings_module and var.environment in the new logic so only allowed pairs (e.g., production -> Production/settings.production) are accepted.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@infrastructure/live/terraform.tfbackend.example`:
- Line 2: The backend "key" currently hardcodes "staging/terraform.tfstate";
change it to be environment-driven instead: replace the literal key with a
placeholder or interpolation that derives the state path from a variable (e.g.,
var.environment) or from terraform workspace (terraform.workspace) so each
environment uses its own "<environment>/terraform.tfstate"; update the template
comment to document that callers must set the environment variable or choose the
appropriate workspace.
In `@infrastructure/live/variables.tf`:
- Around line 147-150: The new default for backend_desired_count (and similarly
frontend_desired_count) is 1 which violates the module validation requiring
min_count <= desired_count when autoscaling is enabled; update the corresponding
min variables (backend_min_count and frontend_min_count) to 1 to keep defaults
compatible with autoscaling, or alternatively restore backend_desired_count and
frontend_desired_count to match the existing min values (e.g., 2); adjust
whichever pair you choose (backend_desired_count/backend_min_count and
frontend_desired_count/frontend_min_count) so desired_count >= min_count and
re-run validation.
---
Outside diff comments:
In `@infrastructure/bootstrap/main.tf`:
- Around line 84-102: The KMS key policy block that allows CloudWatch Logs (the
statement with sid "CloudWatchLogsManagement" or the aws_kms_key policy allowing
principal logs.${data.aws_region.current.id}.amazonaws.com) is missing the
kms:CreateGrant action; update that policy statement to include
"kms:CreateGrant" alongside existing kms actions so CloudWatch Logs can create
grants when associating the CMK (ensure the principal remains
logs.${data.aws_region.current.id}.amazonaws.com and the resource/statement
scope matches the existing key policy).
In `@infrastructure/live/README.md`:
- Around line 3-7: Update the README to state that the IAM role
nest-staging-terraform must be manually created before the very first staging
CI/CD run: clarify that the bootstrap-staging-nest job only updates/assumes the
existing role (it does not create it on first run), and that
build-staging-images (and the initial staging pipeline) will fail if
nest-staging-terraform doesn't already exist; mention how to create the role or
point to the terraform snippet/trust policy to pre-create it and note that
subsequent bootstrap runs will update the role.
---
Nitpick comments:
In `@infrastructure/live/.terraform.lock.hcl`:
- Around line 27-45: The root terraform block's required_providers is missing
the random provider; update the terraform { required_providers { ... } } block
in main.tf to explicitly declare random with source "hashicorp/random" and a
version constraint (e.g. "~> 3.8.0") alongside the existing aws entry so the
root module documents the provider requirement used by child modules (like the
database module that uses random_password).
In `@infrastructure/live/variables.tf`:
- Around line 132-140: Ensure django_configuration and django_settings_module
cannot diverge from var.environment by adding deterministic derivation or
validation: either set them as computed locals based on var.environment (e.g.,
map environment -> configuration and settings module) and replace the free
variables, or add terraform validation blocks on variable "django_configuration"
and "django_settings_module" that check values against var.environment (using a
mapping/local lookup) and fail fast on mismatch; reference the variable names
django_configuration, django_settings_module and var.environment in the new
logic so only allowed pairs (e.g., production -> Production/settings.production)
are accepted.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 6caccf2a-ca21-4ddf-8905-3298c63004b0
📒 Files selected for processing (46)
.github/workflows/run-ci-cd.yamlinfrastructure/README.mdinfrastructure/bootstrap/.terraform.lock.hclinfrastructure/bootstrap/main.tfinfrastructure/live/.terraform.lock.hclinfrastructure/live/README.mdinfrastructure/live/backend.tfinfrastructure/live/main.tfinfrastructure/live/outputs.tfinfrastructure/live/providers.tfinfrastructure/live/terraform.tfbackend.exampleinfrastructure/live/terraform.tfvars.exampleinfrastructure/live/variables.tfinfrastructure/modules/alb/.terraform.lock.hclinfrastructure/modules/alb/main.tfinfrastructure/modules/cache/.terraform.lock.hclinfrastructure/modules/cache/main.tfinfrastructure/modules/database/.terraform.lock.hclinfrastructure/modules/database/main.tfinfrastructure/modules/kms/.terraform.lock.hclinfrastructure/modules/kms/main.tfinfrastructure/modules/networking/.terraform.lock.hclinfrastructure/modules/networking/main.tfinfrastructure/modules/networking/modules/nacl/.terraform.lock.hclinfrastructure/modules/networking/modules/nacl/main.tfinfrastructure/modules/networking/modules/vpc-endpoint/.terraform.lock.hclinfrastructure/modules/networking/modules/vpc-endpoint/main.tfinfrastructure/modules/parameters/.terraform.lock.hclinfrastructure/modules/parameters/main.tfinfrastructure/modules/parameters/tests/parameters.tftest.hclinfrastructure/modules/parameters/variables.tfinfrastructure/modules/security/.terraform.lock.hclinfrastructure/modules/security/main.tfinfrastructure/modules/service/.terraform.lock.hclinfrastructure/modules/service/main.tfinfrastructure/modules/storage/.terraform.lock.hclinfrastructure/modules/storage/main.tfinfrastructure/modules/storage/modules/s3-bucket/.terraform.lock.hclinfrastructure/modules/storage/modules/s3-bucket/main.tfinfrastructure/modules/tasks/.terraform.lock.hclinfrastructure/modules/tasks/main.tfinfrastructure/modules/tasks/modules/task/.terraform.lock.hclinfrastructure/modules/tasks/modules/task/main.tfinfrastructure/staging/.terraform.lock.hclinfrastructure/state/.terraform.lock.hclinfrastructure/state/main.tf
💤 Files with no reviewable changes (2)
- infrastructure/live/backend.tf
- infrastructure/staging/.terraform.lock.hcl
There was a problem hiding this comment.
3 issues found across 11 files (changes from recent commits).
Prompt for AI agents (unresolved issues)
Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.
<file name="infrastructure/README.md">
<violation number="1" location="infrastructure/README.md:261">
P2: This command introduces an undocumented `jq` dependency. Either add `jq` to the prerequisites or use a jq-free example so the README works with the tools it already tells users to install.</violation>
</file>
<file name="infrastructure/modules/service/main.tf">
<violation number="1" location="infrastructure/modules/service/main.tf:148">
P3: Add a module test for the ECS service network configuration. This PR changes public-IP and subnet selection behavior, but the service module tests never assert those fields, so a networking regression here would still pass.
(Based on your team's feedback about adding or updating tests for new logic.) [FEEDBACK_USED]</violation>
</file>
<file name="infrastructure/modules/networking/main.tf">
<violation number="1" location="infrastructure/modules/networking/main.tf:52">
P1: Adding `count` changes this existing resource's state address to `aws_eip.nat[0]` (and the NAT gateway to `[0]` as well). Without `moved` blocks or a state migration, NAT-enabled stacks will plan a destroy/create of the NAT gateway and EIP.</violation>
</file>
Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (5)
infrastructure/README.md (1)
259-264: Consider more robust conditional syntax.The bash pattern
[ "$NAT_ENABLED" = "true" ] && echo "DISABLED" || echo "ENABLED"works but can be fragile if the first echo fails. While unlikely withecho, a more defensive approach would be:ASSIGN_PUBLIC_IP=$(if [ "$NAT_ENABLED" = "true" ]; then echo "DISABLED"; else echo "ENABLED"; fi)♻️ Proposed refactor for robustness
-ASSIGN_PUBLIC_IP=$([ "$NAT_ENABLED" = "true" ] && echo "DISABLED" || echo "ENABLED") +ASSIGN_PUBLIC_IP=$(if [ "$NAT_ENABLED" = "true" ]; then echo "DISABLED"; else echo "ENABLED"; fi)🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@infrastructure/README.md` around lines 259 - 264, The ASSIGN_PUBLIC_IP assignment uses a short-circuited conditional that can misbehave if the first command fails; replace the current command-substitution expression with an explicit if/then/else/fi inside the substitution so ASSIGN_PUBLIC_IP is set by evaluating NAT_ENABLED with a deterministic branch (use the NAT_ENABLED variable and the literal outputs "DISABLED" and "ENABLED" in the two branches) to avoid relying on &&/|| short-circuit behavior.infrastructure/modules/service/variables.tf (1)
123-126: Add a non-empty validation forsubnet_ids.Please fail early when no subnets are provided; otherwise this defers failure to provider/runtime behavior.
♻️ Suggested change
variable "subnet_ids" { description = "Subnet IDs for ECS tasks (can be public or private)." type = list(string) + + validation { + condition = length(var.subnet_ids) > 0 + error_message = "At least one subnet ID must be provided." + } }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@infrastructure/modules/service/variables.tf` around lines 123 - 126, Add a validation block to the variable "subnet_ids" to fail fast when no subnets are supplied: inside the variable "subnet_ids" (type = list(string)) add a validation { condition = length(var.subnet_ids) > 0 error_message = "subnet_ids must contain at least one subnet ID" } so Terraform will error at plan/apply time if the list is empty or not provided.infrastructure/live/main.tf (1)
57-57: Extract the repeated subnet selection into a local for maintainability.The same NAT-based subnet ternary is repeated three times.
♻️ Suggested cleanup
locals { assign_public_ip = var.enable_nat_gateway ? false : true + tasks_subnet_ids = var.enable_nat_gateway ? module.networking.private_subnet_ids : module.networking.public_subnet_ids common_tags = { Environment = var.environment ManagedBy = "Terraform" Project = var.project_name } @@ - subnet_ids = var.enable_nat_gateway ? module.networking.private_subnet_ids : module.networking.public_subnet_ids + subnet_ids = local.tasks_subnet_ids @@ - subnet_ids = var.enable_nat_gateway ? module.networking.private_subnet_ids : module.networking.public_subnet_ids + subnet_ids = local.tasks_subnet_ids @@ - subnet_ids = var.enable_nat_gateway ? module.networking.private_subnet_ids : module.networking.public_subnet_ids + subnet_ids = local.tasks_subnet_idsAlso applies to: 120-120, 218-218
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@infrastructure/live/main.tf` at line 57, Extract the repeated NAT-based subnet selection into a local by adding a local like local.subnet_ids = var.enable_nat_gateway ? module.networking.private_subnet_ids : module.networking.public_subnet_ids, then replace the ternary occurrences (where subnet_ids is currently set to var.enable_nat_gateway ? module.networking.private_subnet_ids : module.networking.public_subnet_ids) with local.subnet_ids in the modules/resources that use it (e.g., the blocks setting subnet_ids currently using that ternary).infrastructure/modules/service/tests/service.tftest.hcl (1)
18-18: Add explicit assertions for network wiring behavior.Good rename, but please add tests that assert ECS service
network_configurationusesvar.subnet_idsand honorsassign_public_ip(true/false paths). This protects the new NAT-driven behavior from regressions.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@infrastructure/modules/service/tests/service.tftest.hcl` at line 18, Add assertions in the Terraform test to validate the ECS service's network wiring: check that the ECS service resource's network_configuration.subnets equals var.subnet_ids and add two test cases (or parameterized checks) that assert network_configuration.assign_public_ip is "ENABLED" when assign_public_ip=true and "DISABLED" (or absent) when assign_public_ip=false; update the test around the ECS service block (the resource that references subnet_ids) to read var.subnet_ids and to verify assign_public_ip behavior for both true/false paths so NAT-driven behavior cannot regress.infrastructure/live/variables.tf (1)
132-140: Add validation guards for Django config inputs.These are now required knobs; lightweight validation will prevent accidental bad environment wiring.
♻️ Suggested change
variable "django_configuration" { description = "The name of the Django configuration to use (e.g., Staging, Production)." type = string + validation { + condition = contains(["Staging", "Production"], var.django_configuration) + error_message = "django_configuration must be either 'Staging' or 'Production'." + } } variable "django_settings_module" { description = "The location of the Django settings module to use (e.g., settings.staging, settings.production)." type = string + validation { + condition = can(regex("^settings\\.(staging|production)$", var.django_settings_module)) + error_message = "django_settings_module must be 'settings.staging' or 'settings.production'." + } }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@infrastructure/live/variables.tf` around lines 132 - 140, Add Terraform validation blocks to both variables to prevent empty or malformed values: for variable "django_configuration" add a validation that the value is non-empty (length > 0) and optionally restrict to an allowed set if you have known environments (e.g., Staging/Production/Dev); for variable "django_settings_module" add a validation that the value is non-empty and matches a simple module-name pattern (no spaces, only letters, numbers, underscores and dots) and provide clear error_message text for each validation so a bad input fails fast. Use the variable names "django_configuration" and "django_settings_module" to locate where to add these validation blocks.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In @.github/workflows/run-ci-cd.yaml:
- Around line 946-957: The aws ecs run-task call can succeed at HTTP level but
still return failures in the response; capture the JSON output of the aws ecs
run-task invocation into a variable (instead of piping straight to stdout), use
a JSON parser (jq) to check .failures | length > 0 or absence of .tasks[] and if
failures exist echo an error with the failures and exit non-zero; otherwise
continue to echo success. Update the block around ASSIGN_PUBLIC_IP / aws ecs
run-task and use the CLUSTER_NAME and TASK_DEFINITION variables when checking
the response so the job fails fast when the index-data task did not actually
start.
---
Nitpick comments:
In `@infrastructure/live/main.tf`:
- Line 57: Extract the repeated NAT-based subnet selection into a local by
adding a local like local.subnet_ids = var.enable_nat_gateway ?
module.networking.private_subnet_ids : module.networking.public_subnet_ids, then
replace the ternary occurrences (where subnet_ids is currently set to
var.enable_nat_gateway ? module.networking.private_subnet_ids :
module.networking.public_subnet_ids) with local.subnet_ids in the
modules/resources that use it (e.g., the blocks setting subnet_ids currently
using that ternary).
In `@infrastructure/live/variables.tf`:
- Around line 132-140: Add Terraform validation blocks to both variables to
prevent empty or malformed values: for variable "django_configuration" add a
validation that the value is non-empty (length > 0) and optionally restrict to
an allowed set if you have known environments (e.g., Staging/Production/Dev);
for variable "django_settings_module" add a validation that the value is
non-empty and matches a simple module-name pattern (no spaces, only letters,
numbers, underscores and dots) and provide clear error_message text for each
validation so a bad input fails fast. Use the variable names
"django_configuration" and "django_settings_module" to locate where to add these
validation blocks.
In `@infrastructure/modules/service/tests/service.tftest.hcl`:
- Line 18: Add assertions in the Terraform test to validate the ECS service's
network wiring: check that the ECS service resource's
network_configuration.subnets equals var.subnet_ids and add two test cases (or
parameterized checks) that assert network_configuration.assign_public_ip is
"ENABLED" when assign_public_ip=true and "DISABLED" (or absent) when
assign_public_ip=false; update the test around the ECS service block (the
resource that references subnet_ids) to read var.subnet_ids and to verify
assign_public_ip behavior for both true/false paths so NAT-driven behavior
cannot regress.
In `@infrastructure/modules/service/variables.tf`:
- Around line 123-126: Add a validation block to the variable "subnet_ids" to
fail fast when no subnets are supplied: inside the variable "subnet_ids" (type =
list(string)) add a validation { condition = length(var.subnet_ids) > 0
error_message = "subnet_ids must contain at least one subnet ID" } so Terraform
will error at plan/apply time if the list is empty or not provided.
In `@infrastructure/README.md`:
- Around line 259-264: The ASSIGN_PUBLIC_IP assignment uses a short-circuited
conditional that can misbehave if the first command fails; replace the current
command-substitution expression with an explicit if/then/else/fi inside the
substitution so ASSIGN_PUBLIC_IP is set by evaluating NAT_ENABLED with a
deterministic branch (use the NAT_ENABLED variable and the literal outputs
"DISABLED" and "ENABLED" in the two branches) to avoid relying on &&/||
short-circuit behavior.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 5609cb95-f61b-43ec-819e-9bf9785d8808
📒 Files selected for processing (11)
.github/workflows/run-ci-cd.yamlinfrastructure/README.mdinfrastructure/live/main.tfinfrastructure/live/outputs.tfinfrastructure/live/terraform.tfvars.exampleinfrastructure/live/variables.tfinfrastructure/modules/networking/main.tfinfrastructure/modules/networking/variables.tfinfrastructure/modules/service/main.tfinfrastructure/modules/service/tests/service.tftest.hclinfrastructure/modules/service/variables.tf
🚧 Files skipped from review as they are similar to previous changes (1)
- infrastructure/modules/service/main.tf
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@infrastructure/modules/networking/main.tf`:
- Around line 1-10: The change replaced exact pins with pessimistic constraints
in the terraform block (see required_version) and the aws provider block
(required_providers.aws.version); either obtain explicit team approval for
switching to pessimistic versioning and document that decision, or revert these
lines back to the team's required exact pins (use the previous exact values for
required_version and required_providers.aws.version) across all modules under
infrastructure/modules/ so the versioning policy remains consistent; ensure the
commit message or PR description records the team's consensus if you keep the
new "~>" constraints.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: c9a51eb2-af3c-4329-90ba-a5bc7fb2ad25
📒 Files selected for processing (18)
.github/workflows/run-ci-cd.yamlinfrastructure/live/main.tfinfrastructure/live/terraform.tfvars.exampleinfrastructure/live/variables.tfinfrastructure/modules/database/main.tfinfrastructure/modules/database/outputs.tfinfrastructure/modules/database/tests/database.tftest.hclinfrastructure/modules/database/variables.tfinfrastructure/modules/networking/main.tfinfrastructure/modules/networking/modules/vpc-endpoint/main.tfinfrastructure/modules/networking/modules/vpc-endpoint/tests/vpc-endpoint.tftest.hclinfrastructure/modules/networking/modules/vpc-endpoint/variables.tfinfrastructure/modules/networking/tests/networking.tftest.hclinfrastructure/modules/networking/variables.tfinfrastructure/modules/security/main.tfinfrastructure/modules/security/outputs.tfinfrastructure/modules/security/tests/security.tftest.hclinfrastructure/modules/security/variables.tf
✅ Files skipped from review due to trivial changes (1)
- infrastructure/modules/database/tests/database.tftest.hcl
There was a problem hiding this comment.
1 issue found across 8 files (changes from recent commits).
Prompt for AI agents (unresolved issues)
Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.
<file name="infrastructure/modules/networking/modules/vpc-endpoint/variables.tf">
<violation number="1" location="infrastructure/modules/networking/modules/vpc-endpoint/variables.tf:62">
P1: This unconditional check breaks valid S3-only usage of the module; `private_subnet_ids` is only required when an Interface endpoint is enabled.</violation>
</file>
Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.
infrastructure/modules/networking/modules/vpc-endpoint/variables.tf
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
.github/workflows/run-ci-cd.yaml (1)
900-918:⚠️ Potential issue | 🟡 MinorAdd explicit launch-response validation for the migrate ECS task to align with the index-data task.
The migrate task block currently captures the task ARN directly without validating that the ECS
run-taskAPI call succeeded. Unlike the index-data task below it, there is no check for.failuresor task presence. If the API returns an HTTP 200 with a failures array (which can happen when insufficient resources are available or role restrictions apply), the task step will emit an invalid/empty ARN, deferring the failure to downstream waiter/describe steps with poor diagnostics.Proposed fix
- name: Run ECS migrate task id: migrate-task env: CLUSTER_NAME: ${{ steps.tf-outputs.outputs.tasks_cluster_name }} NAT_GATEWAY_ENABLED: ${{ steps.tf-outputs.outputs.nat_gateway_enabled }} TASKS_SECURITY_GROUP_ID: ${{ steps.tf-outputs.outputs.tasks_security_group_id }} TASK_DEFINITION: nest-staging-migrate TASKS_SUBNET_IDS: ${{ steps.tf-outputs.outputs.tasks_subnet_ids }} run: | ASSIGN_PUBLIC_IP=$([ "$NAT_GATEWAY_ENABLED" = "true" ] && echo "DISABLED" || echo "ENABLED") - TASK_ARN=$(aws ecs run-task \ + RESPONSE=$(aws ecs run-task \ --cluster "$CLUSTER_NAME" \ --task-definition "$TASK_DEFINITION" \ --launch-type FARGATE \ --network-configuration "awsvpcConfiguration={subnets=[$TASKS_SUBNET_IDS],securityGroups=[$TASKS_SECURITY_GROUP_ID],assignPublicIp=$ASSIGN_PUBLIC_IP}" \ - --query 'tasks[0].taskArn' \ - --output text) + --output json) + FAILURE_COUNT=$(echo "$RESPONSE" | jq '.failures | length') + TASK_ARN=$(echo "$RESPONSE" | jq -r '.tasks[0].taskArn // empty') + if [ "$FAILURE_COUNT" -gt 0 ] || [ -z "$TASK_ARN" ]; then + echo "::error::Failed to start migrate task: $(echo "$RESPONSE" | jq -c '.failures')" + exit 1 + fi echo "task_arn=$TASK_ARN" >> $GITHUB_OUTPUT🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In @.github/workflows/run-ci-cd.yaml around lines 900 - 918, The migrate-task step currently writes TASK_ARN directly from the aws ecs run-task output without validating the response; change it to capture the full JSON response (e.g., TASK_RESPONSE), check for non-empty .failures and for presence of tasks[0].taskArn, and if failures exist or taskArn is missing emit a clear error and exit non-zero; only when validation passes set TASK_ARN and write it to GITHUB_OUTPUT—mirror the validation logic used by the index-data run-task step to ensure failures are surfaced immediately.
🧹 Nitpick comments (2)
infrastructure/README.md (2)
269-273: Consider automating AWS_REGION substitution.The manual
AWS_REGIONplaceholder appears in three separate commands (lines 282, 293, 304), which is error-prone. Users might forget to replace all occurrences or use inconsistent values.♻️ Proposed automated region extraction
If the AWS region is available in Terraform outputs, extract it alongside other variables:
CLUSTER=$(terraform output -raw tasks_cluster_name) SECURITY_GROUP=$(terraform output -raw tasks_security_group_id) SUBNETS=$(terraform output -json tasks_subnet_ids | jq -r 'join(",")') NAT_ENABLED=$(terraform output -raw nat_gateway_enabled) +REGION=$(terraform output -raw aws_region) ASSIGN_PUBLIC_IP=$([ "$NAT_ENABLED" = "true" ] && echo "DISABLED" || echo "ENABLED")Then update the commands to use
--region "$REGION"or--region $REGION.Alternatively, if region isn't in outputs, you could use:
REGION=$(aws configure get region)This ensures consistency across all three commands.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@infrastructure/README.md` around lines 269 - 273, The README currently uses a manual AWS_REGION placeholder in three commands which is error-prone; update the docs to capture the region once (e.g., assign a REGION variable by extracting from Terraform outputs or via `aws configure get region`) and then replace each `AWS_REGION` occurrence in the commands with `--region "$REGION"` (or `--region $REGION`) so all three commands use the same automated REGION variable; mention both options (Terraform outputs variable name and the aws configure fallback) so users can copy the exact one that applies.
256-260: Verify jq installation before use.Line 259 notes that
jqis required, but the script doesn't verify it's installed. The command on line 264 will fail with a cryptic error ifjqis missing.🔧 Proposed dependency check
Add a check before the variable extraction block:
# Verify required tools if ! command -v jq &> /dev/null; then echo "Error: jq is required but not installed. Install it with: sudo apt-get install jq" exit 1 fiAlternatively, add installation instructions in the Prerequisites section (lines 7-13) alongside Terraform and AWS CLI.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@infrastructure/README.md` around lines 256 - 260, Add a pre-check for the jq dependency before the Terraform output parsing block: verify jq is on PATH (e.g., via command -v jq) and if missing print a clear error message and exit non‑zero so the subsequent variable extraction doesn't fail cryptically; alternatively update the README's Prerequisites section to list installation instructions for jq alongside Terraform and AWS CLI so users install it before running the script. Include the check immediately before the "Set variables from Terraform outputs" extraction and ensure the error message mentions how to install jq.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@infrastructure/README.md`:
- Around line 256-267: Add validation guards after extracting Terraform outputs
(CLUSTER, SECURITY_GROUP, SUBNETS, NAT_ENABLED) that verify each variable is
non-empty and that SUBNETS is valid JSON/csv; if any check fails, print a clear
error and exit non-zero so downstream ECS commands don't run on bad data.
Replace the inline ASSIGN_PUBLIC_IP subshell with an explicit if/else that tests
NAT_ENABLED ("true"/"false") and sets ASSIGN_PUBLIC_IP accordingly, and ensure
all error messages reference the variable name (e.g., "Missing Terraform output:
CLUSTER") to fail fast and aid debugging.
---
Outside diff comments:
In @.github/workflows/run-ci-cd.yaml:
- Around line 900-918: The migrate-task step currently writes TASK_ARN directly
from the aws ecs run-task output without validating the response; change it to
capture the full JSON response (e.g., TASK_RESPONSE), check for non-empty
.failures and for presence of tasks[0].taskArn, and if failures exist or taskArn
is missing emit a clear error and exit non-zero; only when validation passes set
TASK_ARN and write it to GITHUB_OUTPUT—mirror the validation logic used by the
index-data run-task step to ensure failures are surfaced immediately.
---
Nitpick comments:
In `@infrastructure/README.md`:
- Around line 269-273: The README currently uses a manual AWS_REGION placeholder
in three commands which is error-prone; update the docs to capture the region
once (e.g., assign a REGION variable by extracting from Terraform outputs or via
`aws configure get region`) and then replace each `AWS_REGION` occurrence in the
commands with `--region "$REGION"` (or `--region $REGION`) so all three commands
use the same automated REGION variable; mention both options (Terraform outputs
variable name and the aws configure fallback) so users can copy the exact one
that applies.
- Around line 256-260: Add a pre-check for the jq dependency before the
Terraform output parsing block: verify jq is on PATH (e.g., via command -v jq)
and if missing print a clear error message and exit non‑zero so the subsequent
variable extraction doesn't fail cryptically; alternatively update the README's
Prerequisites section to list installation instructions for jq alongside
Terraform and AWS CLI so users install it before running the script. Include the
check immediately before the "Set variables from Terraform outputs" extraction
and ensure the error message mentions how to install jq.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 7eb29de0-9fe1-4fda-8674-2e00850b0c1d
📒 Files selected for processing (8)
.github/workflows/run-ci-cd.yamlinfrastructure/README.mdinfrastructure/modules/alb/variables.tfinfrastructure/modules/networking/modules/nacl/variables.tfinfrastructure/modules/networking/modules/vpc-endpoint/variables.tfinfrastructure/modules/service/variables.tfinfrastructure/modules/tasks/modules/task/variables.tfinfrastructure/modules/tasks/variables.tf
|
Running fork CI/CD. |
|
There was a problem hiding this comment.
1 issue found across 1 file (changes from recent commits).
Prompt for AI agents (unresolved issues)
Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.
<file name=".github/workflows/run-ci-cd.yaml">
<violation number="1">
P2: This gate removes PR/push validation for the deploy frontend image. `docker/frontend/Dockerfile` is now only built on manual/nightly staging runs or release, so release-image breakages can merge unnoticed.</violation>
</file>
Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.
| DOMAIN_NAME: ${{ vars.DOMAIN_NAME }} | ||
| DJANGO_CONFIGURATION: Staging | ||
| DJANGO_SETTINGS_MODULE: settings.staging | ||
| ENABLE_CRON_TASKS: false |
There was a problem hiding this comment.
I disabled cron tasks for staging please enable them here and in terraform.tfvars.example if this is not correct.
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (3)
infrastructure/README.md (1)
98-101: Clarify thestagingvsliveterminology in this section.This reads as “staging deployment,” but the path is
infrastructure/live/. Please add one explicit sentence thatinfrastructure/live/currently hosts the staging deployment config to avoid operator confusion during setup.Also applies to: 125-131
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@infrastructure/README.md` around lines 98 - 101, Add one explicit clarifying sentence to the NOTE block (the paragraph beginning with "Prerequisite: Create a `nest-staging` IAM user...") stating that the repository's infrastructure/live directory currently contains the staging deployment configuration (i.e., "Note: the infrastructure/live/ directory currently hosts the staging deployment configuration") so operators won't be confused by the "staging" wording; make the same exact clarification in the later section around the paragraph at lines 125-131 to keep both places consistent.infrastructure/modules/tasks/main.tf (1)
14-18: Gate the EventBridge IAM resources withenable_cron_taskstoo.When this flag is false, the schedules disappear, but
aws_iam_role.event_bridge_roleandaws_iam_policy.event_bridge_ecs_policyare still created withecs:RunTaskandiam:PassRole. That leaves unused privilege in environments that intentionally disable cron. Consider wrapping those resources with the same flag, or moving them into the scheduled-task path so they only exist when a schedule exists.Based on learnings: In the OWASP/Nest repository, all Terraform files under infrastructure/ (e.g., infrastructure/state/main.tf) must implement production-grade security hardening.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@infrastructure/modules/tasks/main.tf` around lines 14 - 18, The EventBridge IAM resources aws_iam_role.event_bridge_role and aws_iam_policy.event_bridge_ecs_policy are still created even when var.enable_cron_tasks is false, leaving unnecessary ecs:RunTask and iam:PassRole privileges; guard these resources with the same flag by wrapping their resource blocks (or moving them into the scheduled-task conditional/module) so they are only defined when var.enable_cron_tasks is true, ensuring all references that depend on these resources are likewise conditional or use count/for_each keyed on var.enable_cron_tasks to avoid dangling references.infrastructure/modules/tasks/tests/tasks.tftest.hcl (1)
20-64: Assert resource presence, not only local values.These tests prove the locals flip, but they will not catch a future wiring bug where a task module stops using those locals. As a stronger regression test, also assert that the child
aws_cloudwatch_event_rule.taskresources are absent/present for the disabled and enabled runs.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@infrastructure/modules/tasks/tests/tasks.tftest.hcl` around lines 20 - 64, Add assertions that verify the actual child resource aws_cloudwatch_event_rule.task is absent when enable_cron_tasks = false and present when enable_cron_tasks = true: in the "test_cron_tasks_disabled_removes_schedules" run add an assert that the count/length of resources for aws_cloudwatch_event_rule.task == 0, and in "test_cron_tasks_enabled_creates_schedules" assert that the count/length of resources for aws_cloudwatch_event_rule.task > 0; keep the existing local checks (local.sync_data_schedule_expression, local.update_project_health_metrics_schedule_expression, local.update_project_health_scores_schedule_expression) and add these resource-presence asserts to catch wiring regressions.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@infrastructure/modules/alb/variables.tf`:
- Around line 71-74: Update the validation for the variable public_subnet_ids so
it requires at least two entries (not just one) before plan/apply; modify the
validation condition in the validation block that references
var.public_subnet_ids (the validation block in variables.tf) to enforce
length(var.public_subnet_ids) >= 2 (or > 1) and update the error_message to
clearly state that at least two public_subnet_ids are required for ALB subnets
across Availability Zones.
---
Nitpick comments:
In `@infrastructure/modules/tasks/main.tf`:
- Around line 14-18: The EventBridge IAM resources
aws_iam_role.event_bridge_role and aws_iam_policy.event_bridge_ecs_policy are
still created even when var.enable_cron_tasks is false, leaving unnecessary
ecs:RunTask and iam:PassRole privileges; guard these resources with the same
flag by wrapping their resource blocks (or moving them into the scheduled-task
conditional/module) so they are only defined when var.enable_cron_tasks is true,
ensuring all references that depend on these resources are likewise conditional
or use count/for_each keyed on var.enable_cron_tasks to avoid dangling
references.
In `@infrastructure/modules/tasks/tests/tasks.tftest.hcl`:
- Around line 20-64: Add assertions that verify the actual child resource
aws_cloudwatch_event_rule.task is absent when enable_cron_tasks = false and
present when enable_cron_tasks = true: in the
"test_cron_tasks_disabled_removes_schedules" run add an assert that the
count/length of resources for aws_cloudwatch_event_rule.task == 0, and in
"test_cron_tasks_enabled_creates_schedules" assert that the count/length of
resources for aws_cloudwatch_event_rule.task > 0; keep the existing local checks
(local.sync_data_schedule_expression,
local.update_project_health_metrics_schedule_expression,
local.update_project_health_scores_schedule_expression) and add these
resource-presence asserts to catch wiring regressions.
In `@infrastructure/README.md`:
- Around line 98-101: Add one explicit clarifying sentence to the NOTE block
(the paragraph beginning with "Prerequisite: Create a `nest-staging` IAM
user...") stating that the repository's infrastructure/live directory currently
contains the staging deployment configuration (i.e., "Note: the
infrastructure/live/ directory currently hosts the staging deployment
configuration") so operators won't be confused by the "staging" wording; make
the same exact clarification in the later section around the paragraph at lines
125-131 to keep both places consistent.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 6bc13bf2-0fd0-429e-8bd2-95610c54bed2
📒 Files selected for processing (11)
.github/workflows/run-ci-cd.yamlinfrastructure/README.mdinfrastructure/live/main.tfinfrastructure/live/terraform.tfvars.exampleinfrastructure/live/variables.tfinfrastructure/modules/alb/variables.tfinfrastructure/modules/service/variables.tfinfrastructure/modules/tasks/main.tfinfrastructure/modules/tasks/modules/task/variables.tfinfrastructure/modules/tasks/tests/tasks.tftest.hclinfrastructure/modules/tasks/variables.tf
🚧 Files skipped from review as they are similar to previous changes (2)
- infrastructure/modules/tasks/modules/task/variables.tf
- infrastructure/live/terraform.tfvars.example
* refactor parameters module to expose Django configuration/settings module * update desired count * rename staging/ to live/ * post rename code changes * temporary changes to run CI/CD * fix quotes for variables * unpin terraform, providers, and modules versions * Revert "temporary changes to run CI/CD" This reverts commit ec2b007. * fix django settings name * add a flag for enabling NAT Gateway and disable it by default * update docs * rename create_ prefixed variables to enable_ * add/fix tests * bot comments * add flag for enabling cron tasks * Reapply "temporary changes to run CI/CD" This reverts commit 68f1d5d. * temporary changes to run CI/CD * Revert "temporary changes to run CI/CD" This reverts commit cdf9078. * Revert "Reapply "temporary changes to run CI/CD"" This reverts commit 64b5b04.



Proposed change
Resolves #4257
make updatecommand to update Terraform providers/versions, can be a good first issue or addressed later.create_prefixed variables to haveenable_prefix.desired_countto 1.Checklist
make check-testlocally: all warnings addressed, tests passed