Skip to content

Refactor Infrastructure Code#4258

Merged
arkid15r merged 20 commits intoOWASP:mainfrom
rudransh-shrivastava:feature/prepare-migrate-production
Mar 13, 2026
Merged

Refactor Infrastructure Code#4258
arkid15r merged 20 commits intoOWASP:mainfrom
rudransh-shrivastava:feature/prepare-migrate-production

Conversation

@rudransh-shrivastava
Copy link
Collaborator

@rudransh-shrivastava rudransh-shrivastava commented Mar 12, 2026

Proposed change

Resolves #4257

  • Refactor Infrastructure code to be more environment agnostic.
  • Unpin Terraform versions.
    • Did not add depandabot config, can be a good first issue or addressed later.
    • Did not update the make update command to update Terraform providers/versions, can be a good first issue or addressed later.
  • Rename create_ prefixed variables to have enable_ prefix.
  • Finalize staging and production differences.
    • Reduce backend and frontend task desired_count to 1.
    • Disable cron tasks for staging. (unsure if this is correct).
    • Disable NAT gateway for staging.

Checklist

  • Required: I followed the contributing workflow
  • Required: I verified that my code works as intended and resolves the issue as described
  • Required: I ran make check-test locally: all warnings addressed, tests passed
  • I used AI for code, documentation, tests, or communication related to this PR

@github-actions github-actions bot added docs Improvements or additions to documentation ci labels Mar 12, 2026
@coderabbitai
Copy link
Contributor

coderabbitai bot commented Mar 12, 2026

Walkthrough

Comprehensive infrastructure-as-code refactoring transitioning from staging to live environment with systematic variable naming updates (create_* to enable_*), Terraform version constraints relaxation, Django-centric parameter restructuring, NAT gateway conditional logic, and CI/CD workflow updates for ECS task orchestration.

Changes

Cohort / File(s) Summary
CI/CD Workflow & Backend State
.github/workflows/run-ci-cd.yaml, infrastructure/live/backend.tf, infrastructure/live/terraform.tfbackend.example
Terraform version bumped from 1.14.0 to ~> 1.14.0 for flexible patching. Backend state key introduced and wired into live configuration. TF_STATE_KEY added to environment. Backend artifact paths updated from staging to live directory.
Provider Version Bumps
infrastructure/bootstrap/main.tf, infrastructure/modules/alb/main.tf, infrastructure/modules/cache/main.tf, infrastructure/modules/database/main.tf, infrastructure/modules/kms/main.tf, infrastructure/modules/networking/main.tf, infrastructure/modules/parameters/main.tf, infrastructure/modules/security/main.tf, infrastructure/modules/service/main.tf, infrastructure/modules/storage/main.tf, infrastructure/state/main.tf, infrastructure/modules/tasks/...
AWS provider version updated from 6.22.0 to ~> 6.36.0 across all modules; Random provider updated to ~> 3.8.0 where applicable. All .terraform.lock.hcl files updated with corresponding hash entries.
Terraform Configuration & Lock Files
infrastructure/live/.terraform.lock.hcl, infrastructure/staging/.terraform.lock.hcl
New terraform lockfile introduced for live environment. Staging lockfile deleted as part of environment transition.
Variable Naming Refactoring (create_ → enable_*)*
infrastructure/modules/networking/variables.tf, infrastructure/modules/networking/main.tf, infrastructure/modules/networking/modules/vpc-endpoint/variables.tf, infrastructure/modules/networking/modules/vpc-endpoint/main.tf, infrastructure/modules/database/variables.tf, infrastructure/modules/database/main.tf, infrastructure/modules/security/variables.tf, infrastructure/modules/security/main.tf
Systematic rename: create_rds_proxy → enable_rds_proxy, create_vpc_cloudwatch_logs_endpoint → enable_vpc_cloudwatch_logs_endpoint (and similar for ECR API/DKR, S3, Secrets Manager, SSM). create_vpc_endpoint_rules → enable_vpc_endpoint_rules. Count expressions and validation messages updated accordingly.
NAT Gateway & Subnet Selection Logic
infrastructure/live/main.tf, infrastructure/live/outputs.tf, infrastructure/modules/networking/main.tf, infrastructure/modules/service/main.tf, infrastructure/modules/service/variables.tf
New enable_nat_gateway variable introduced with conditional logic for ECS network configuration. assign_public_ip behavior now depends on NAT gateway enablement. Subnet selection logic: private_subnet_ids when NAT enabled, public_subnet_ids otherwise. NAT EIP and gateway resources gated by enable_nat_gateway flag.
Parameters Module & Django Configuration
infrastructure/modules/parameters/variables.tf, infrastructure/modules/parameters/main.tf, infrastructure/modules/parameters/tests/parameters.tftest.hcl
Wholesale variable restructuring: legacy db_, redis_, server_, settings_ replaced with django_* prefix (django_db_host, django_redis_host, django_settings_module, etc.). Added next_server_csrf_url, next_server_graphql_url, nextauth_url, project_name. SSM parameter mappings updated to reference new variable names.
Live Infrastructure Configuration
infrastructure/live/variables.tf, infrastructure/live/terraform.tfvars.example, infrastructure/live/main.tf, infrastructure/live/README.md
New enable_nat_gateway, enable_rds_proxy, enable_cron_tasks, django_configuration, django_settings_module variables added. Removed create_rds_proxy, create_vpc_* flags, tasks_assign_public_ip. Module inputs restructured to pass new enable_* and django_* variables. Documentation updated to reference live directory.
ECS Tasks & Cron Configuration
infrastructure/modules/tasks/main.tf, infrastructure/modules/tasks/variables.tf, infrastructure/modules/tasks/tests/tasks.tftest.hcl, infrastructure/modules/tasks/modules/task/variables.tf
New enable_cron_tasks variable gates schedule expressions for sync_data, update_project_health_metrics, update_project_health_scores tasks. Task subnet_ids validation added. Test coverage expanded for cron task enable/disable scenarios.
Service Module Network Configuration
infrastructure/modules/service/variables.tf, infrastructure/modules/service/main.tf, infrastructure/modules/service/tests/service.tftest.hcl
Replaced private_subnet_ids with subnet_ids variable. New assign_public_ip variable (default false) controls ECS task public IP assignment. Network configuration now uses dynamic subnet and public IP values. Test updated to reference subnet_ids.
Subnet Validation & Documentation
infrastructure/modules/alb/variables.tf, infrastructure/modules/networking/modules/vpc-endpoint/variables.tf, infrastructure/modules/networking/tests/networking.tftest.hcl, infrastructure/README.md
Input validation blocks added to public_subnet_ids (ALB) and subnet_ids (VPC endpoint) requiring at least one subnet. Networking tests expanded with NAT gateway scenarios. Infrastructure README expanded with Run ECS Tasks section including environment setup and task execution commands.
CI/CD Workflow ECS Orchestration
.github/workflows/run-ci-cd.yaml
New ECS task logic: ASSIGN_PUBLIC_IP computed based on NAT_GATEWAY_ENABLED. TASKS_SUBNET_IDS and NAT_GATEWAY_ENABLED propagated from Terraform outputs. ECS run-task commands for migrate, load-data, index-data tasks updated with proper network configuration (subnets, security groups, public IP assignment). Added robust response validation for index-data task.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

  • PR #3238: Overlapping refactor of Terraform module variables (create_* to enable_*), enable_nat_gateway/enable_rds_proxy addition, ECS network configuration changes, and module input restructuring.
  • PR #3888: Concurrent changes to CI/CD workflow .github/workflows/run-ci-cd.yaml with Terraform plan/apply steps and infrastructure output propagation.
  • PR #4243: Related changes to networking, service, and tasks modules around NAT gateway enablement, public IP assignment, and subnet selection logic.

Suggested reviewers

  • kasya
  • arkid15r
🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 inconclusive)

Check name Status Explanation Resolution
Title check ❓ Inconclusive The title 'Refactor Infrastructure Code' is vague and generic, using non-descriptive terms that don't convey the specific nature of changes (variable renaming, version unpinning, NAT gateway configuration, etc.). Consider a more specific title like 'Refactor infrastructure to be environment-agnostic with enable_ prefixes and unpinned versions' to better reflect the key changes.
✅ Passed checks (4 passed)
Check name Status Explanation
Linked Issues check ✅ Passed The PR addresses all primary objectives from issue #4257: making infrastructure environment-agnostic, finalizing staging vs. production differences (NAT gateway, cron tasks, CPU/memory), unpinning Terraform versions, and renaming create_ to enable_ variables throughout the codebase.
Out of Scope Changes check ✅ Passed All changes are directly aligned with stated objectives: variable renaming (create_ to enable_), version unpinning across all modules, NAT gateway configuration logic, parameter restructuring for Django integration, and test updates. No unrelated changes detected.
Docstring Coverage ✅ Passed No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.
Description check ✅ Passed The PR description clearly relates to the changeset, describing refactoring of infrastructure code, unpinning Terraform versions, renaming variables from create_ to enable_, and finalizing staging/production differences.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
📝 Coding Plan
  • Generate coding plan for human review comments

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@rudransh-shrivastava rudransh-shrivastava changed the title Refactor Infrastructure Code [WIP] Refactor Infrastructure Code Mar 12, 2026
Copy link
Contributor

@cubic-dev-ai cubic-dev-ai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

1 issue found across 46 files

Confidence score: 3/5

  • The example config in infrastructure/live/terraform.tfvars.example points django_settings_module to settings.Staging, which doesn’t match the actual module file and can trigger startup import errors if used.
  • This is a concrete runtime risk, so the merge carries some user-impacting uncertainty despite being limited to an example config.
  • Pay close attention to infrastructure/live/terraform.tfvars.example - incorrect module path may break deployments.
Prompt for AI agents (unresolved issues)

Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="infrastructure/live/terraform.tfvars.example">

<violation number="1" location="infrastructure/live/terraform.tfvars.example:8">
P1: Use lowercase module path for `django_settings_module`; `settings.Staging` does not match the actual settings module file and can cause startup import errors.</violation>
</file>

Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.

@codecov
Copy link

codecov bot commented Mar 12, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 99.48%. Comparing base (a5af224) to head (ad7f5c3).
⚠️ Report is 4 commits behind head on main.

Additional details and impacted files

Impacted file tree graph

@@           Coverage Diff           @@
##             main    #4258   +/-   ##
=======================================
  Coverage   99.48%   99.48%           
=======================================
  Files         520      520           
  Lines       16305    16305           
  Branches     2204     2254   +50     
=======================================
  Hits        16221    16221           
  Misses         56       56           
  Partials       28       28           
Flag Coverage Δ
backend 99.83% <ø> (ø)
frontend 98.48% <ø> (ø)

Flags with carried forward coverage won't be shown. Click here to find out more.


Continue to review full report in Codecov by Sentry.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update a5af224...ad7f5c3. Read the comment docs.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
infrastructure/bootstrap/main.tf (1)

84-102: ⚠️ Potential issue | 🟠 Major

Add kms:CreateGrant to the CloudWatch Logs KMS policy statement.

The logs:AssociateKmsKey permission has been added to the caller side, but infrastructure/modules/kms/main.tf still does not grant kms:CreateGrant to logs.${data.aws_region.current.id}.amazonaws.com. CloudWatch Logs requires this permission to create grants on the CMK when the log-group modules (task, service, cache, networking) associate the KMS key. Without it, Terraform applies will fail during key association.

🔧 Required change in infrastructure/modules/kms/main.tf
   statement {
     actions = [
+      "kms:CreateGrant",
       "kms:Decrypt",
       "kms:Describe*",
       "kms:Encrypt",
       "kms:GenerateDataKey*",
       "kms:ReEncrypt*",
     ]
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@infrastructure/bootstrap/main.tf` around lines 84 - 102, The KMS key policy
block that allows CloudWatch Logs (the statement with sid
"CloudWatchLogsManagement" or the aws_kms_key policy allowing principal
logs.${data.aws_region.current.id}.amazonaws.com) is missing the kms:CreateGrant
action; update that policy statement to include "kms:CreateGrant" alongside
existing kms actions so CloudWatch Logs can create grants when associating the
CMK (ensure the principal remains
logs.${data.aws_region.current.id}.amazonaws.com and the resource/statement
scope matches the existing key policy).
infrastructure/live/README.md (1)

3-7: ⚠️ Potential issue | 🟠 Major

Fix the first-run bootstrap prerequisite.

This section says nest-staging-terraform is created automatically by bootstrap, but the first CI/CD run needs that role to already exist. As written, a new deployment will dead-end before bootstrap can update anything.

Based on learnings: The nest-staging-terraform IAM role is expected to be manually pre-created before the first CI/CD pipeline run. The bootstrap-staging-nest job in .github/workflows/run-ci-cd.yaml updates this role but does not handle initial creation, so the role must exist before build-staging-images runs.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@infrastructure/live/README.md` around lines 3 - 7, Update the README to state
that the IAM role nest-staging-terraform must be manually created before the
very first staging CI/CD run: clarify that the bootstrap-staging-nest job only
updates/assumes the existing role (it does not create it on first run), and that
build-staging-images (and the initial staging pipeline) will fail if
nest-staging-terraform doesn't already exist; mention how to create the role or
point to the terraform snippet/trust policy to pre-create it and note that
subsequent bootstrap runs will update the role.
🧹 Nitpick comments (2)
infrastructure/live/.terraform.lock.hcl (1)

27-45: Consider declaring the random provider in infrastructure/live/main.tf.

The lockfile includes the random provider (used by child modules like database for random_password), but infrastructure/live/main.tf only declares the aws provider in its required_providers block. While Terraform aggregates provider requirements from child modules, explicitly declaring all providers in the root module improves clarity and makes version constraints more discoverable.

💡 Suggested addition to infrastructure/live/main.tf
terraform {
  required_version = "~> 1.14.0"

  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 6.36.0"
    }
    random = {
      source  = "hashicorp/random"
      version = "~> 3.8.0"
    }
  }
}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@infrastructure/live/.terraform.lock.hcl` around lines 27 - 45, The root
terraform block's required_providers is missing the random provider; update the
terraform { required_providers { ... } } block in main.tf to explicitly declare
random with source "hashicorp/random" and a version constraint (e.g. "~> 3.8.0")
alongside the existing aws entry so the root module documents the provider
requirement used by child modules (like the database module that uses
random_password).
infrastructure/live/variables.tf (1)

132-140: Prevent environment/Django config mismatches.

django_configuration and django_settings_module are now free-form alongside var.environment, so production + Staging/settings.staging is currently a valid input set. Consider deriving them from var.environment or enforcing the allowed pairs so bad tfvars fail fast.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@infrastructure/live/variables.tf` around lines 132 - 140, Ensure
django_configuration and django_settings_module cannot diverge from
var.environment by adding deterministic derivation or validation: either set
them as computed locals based on var.environment (e.g., map environment ->
configuration and settings module) and replace the free variables, or add
terraform validation blocks on variable "django_configuration" and
"django_settings_module" that check values against var.environment (using a
mapping/local lookup) and fail fast on mismatch; reference the variable names
django_configuration, django_settings_module and var.environment in the new
logic so only allowed pairs (e.g., production -> Production/settings.production)
are accepted.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@infrastructure/live/terraform.tfbackend.example`:
- Line 2: The backend "key" currently hardcodes "staging/terraform.tfstate";
change it to be environment-driven instead: replace the literal key with a
placeholder or interpolation that derives the state path from a variable (e.g.,
var.environment) or from terraform workspace (terraform.workspace) so each
environment uses its own "<environment>/terraform.tfstate"; update the template
comment to document that callers must set the environment variable or choose the
appropriate workspace.

In `@infrastructure/live/variables.tf`:
- Around line 147-150: The new default for backend_desired_count (and similarly
frontend_desired_count) is 1 which violates the module validation requiring
min_count <= desired_count when autoscaling is enabled; update the corresponding
min variables (backend_min_count and frontend_min_count) to 1 to keep defaults
compatible with autoscaling, or alternatively restore backend_desired_count and
frontend_desired_count to match the existing min values (e.g., 2); adjust
whichever pair you choose (backend_desired_count/backend_min_count and
frontend_desired_count/frontend_min_count) so desired_count >= min_count and
re-run validation.

---

Outside diff comments:
In `@infrastructure/bootstrap/main.tf`:
- Around line 84-102: The KMS key policy block that allows CloudWatch Logs (the
statement with sid "CloudWatchLogsManagement" or the aws_kms_key policy allowing
principal logs.${data.aws_region.current.id}.amazonaws.com) is missing the
kms:CreateGrant action; update that policy statement to include
"kms:CreateGrant" alongside existing kms actions so CloudWatch Logs can create
grants when associating the CMK (ensure the principal remains
logs.${data.aws_region.current.id}.amazonaws.com and the resource/statement
scope matches the existing key policy).

In `@infrastructure/live/README.md`:
- Around line 3-7: Update the README to state that the IAM role
nest-staging-terraform must be manually created before the very first staging
CI/CD run: clarify that the bootstrap-staging-nest job only updates/assumes the
existing role (it does not create it on first run), and that
build-staging-images (and the initial staging pipeline) will fail if
nest-staging-terraform doesn't already exist; mention how to create the role or
point to the terraform snippet/trust policy to pre-create it and note that
subsequent bootstrap runs will update the role.

---

Nitpick comments:
In `@infrastructure/live/.terraform.lock.hcl`:
- Around line 27-45: The root terraform block's required_providers is missing
the random provider; update the terraform { required_providers { ... } } block
in main.tf to explicitly declare random with source "hashicorp/random" and a
version constraint (e.g. "~> 3.8.0") alongside the existing aws entry so the
root module documents the provider requirement used by child modules (like the
database module that uses random_password).

In `@infrastructure/live/variables.tf`:
- Around line 132-140: Ensure django_configuration and django_settings_module
cannot diverge from var.environment by adding deterministic derivation or
validation: either set them as computed locals based on var.environment (e.g.,
map environment -> configuration and settings module) and replace the free
variables, or add terraform validation blocks on variable "django_configuration"
and "django_settings_module" that check values against var.environment (using a
mapping/local lookup) and fail fast on mismatch; reference the variable names
django_configuration, django_settings_module and var.environment in the new
logic so only allowed pairs (e.g., production -> Production/settings.production)
are accepted.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 6caccf2a-ca21-4ddf-8905-3298c63004b0

📥 Commits

Reviewing files that changed from the base of the PR and between 0431ee7 and 68f1d5d.

📒 Files selected for processing (46)
  • .github/workflows/run-ci-cd.yaml
  • infrastructure/README.md
  • infrastructure/bootstrap/.terraform.lock.hcl
  • infrastructure/bootstrap/main.tf
  • infrastructure/live/.terraform.lock.hcl
  • infrastructure/live/README.md
  • infrastructure/live/backend.tf
  • infrastructure/live/main.tf
  • infrastructure/live/outputs.tf
  • infrastructure/live/providers.tf
  • infrastructure/live/terraform.tfbackend.example
  • infrastructure/live/terraform.tfvars.example
  • infrastructure/live/variables.tf
  • infrastructure/modules/alb/.terraform.lock.hcl
  • infrastructure/modules/alb/main.tf
  • infrastructure/modules/cache/.terraform.lock.hcl
  • infrastructure/modules/cache/main.tf
  • infrastructure/modules/database/.terraform.lock.hcl
  • infrastructure/modules/database/main.tf
  • infrastructure/modules/kms/.terraform.lock.hcl
  • infrastructure/modules/kms/main.tf
  • infrastructure/modules/networking/.terraform.lock.hcl
  • infrastructure/modules/networking/main.tf
  • infrastructure/modules/networking/modules/nacl/.terraform.lock.hcl
  • infrastructure/modules/networking/modules/nacl/main.tf
  • infrastructure/modules/networking/modules/vpc-endpoint/.terraform.lock.hcl
  • infrastructure/modules/networking/modules/vpc-endpoint/main.tf
  • infrastructure/modules/parameters/.terraform.lock.hcl
  • infrastructure/modules/parameters/main.tf
  • infrastructure/modules/parameters/tests/parameters.tftest.hcl
  • infrastructure/modules/parameters/variables.tf
  • infrastructure/modules/security/.terraform.lock.hcl
  • infrastructure/modules/security/main.tf
  • infrastructure/modules/service/.terraform.lock.hcl
  • infrastructure/modules/service/main.tf
  • infrastructure/modules/storage/.terraform.lock.hcl
  • infrastructure/modules/storage/main.tf
  • infrastructure/modules/storage/modules/s3-bucket/.terraform.lock.hcl
  • infrastructure/modules/storage/modules/s3-bucket/main.tf
  • infrastructure/modules/tasks/.terraform.lock.hcl
  • infrastructure/modules/tasks/main.tf
  • infrastructure/modules/tasks/modules/task/.terraform.lock.hcl
  • infrastructure/modules/tasks/modules/task/main.tf
  • infrastructure/staging/.terraform.lock.hcl
  • infrastructure/state/.terraform.lock.hcl
  • infrastructure/state/main.tf
💤 Files with no reviewable changes (2)
  • infrastructure/live/backend.tf
  • infrastructure/staging/.terraform.lock.hcl

coderabbitai[bot]
coderabbitai bot previously approved these changes Mar 12, 2026
Copy link
Contributor

@cubic-dev-ai cubic-dev-ai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

3 issues found across 11 files (changes from recent commits).

Prompt for AI agents (unresolved issues)

Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="infrastructure/README.md">

<violation number="1" location="infrastructure/README.md:261">
P2: This command introduces an undocumented `jq` dependency. Either add `jq` to the prerequisites or use a jq-free example so the README works with the tools it already tells users to install.</violation>
</file>

<file name="infrastructure/modules/service/main.tf">

<violation number="1" location="infrastructure/modules/service/main.tf:148">
P3: Add a module test for the ECS service network configuration. This PR changes public-IP and subnet selection behavior, but the service module tests never assert those fields, so a networking regression here would still pass.

(Based on your team's feedback about adding or updating tests for new logic.) [FEEDBACK_USED]</violation>
</file>

<file name="infrastructure/modules/networking/main.tf">

<violation number="1" location="infrastructure/modules/networking/main.tf:52">
P1: Adding `count` changes this existing resource's state address to `aws_eip.nat[0]` (and the NAT gateway to `[0]` as well). Without `moved` blocks or a state migration, NAT-enabled stacks will plan a destroy/create of the NAT gateway and EIP.</violation>
</file>

Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (5)
infrastructure/README.md (1)

259-264: Consider more robust conditional syntax.

The bash pattern [ "$NAT_ENABLED" = "true" ] && echo "DISABLED" || echo "ENABLED" works but can be fragile if the first echo fails. While unlikely with echo, a more defensive approach would be:

ASSIGN_PUBLIC_IP=$(if [ "$NAT_ENABLED" = "true" ]; then echo "DISABLED"; else echo "ENABLED"; fi)
♻️ Proposed refactor for robustness
-ASSIGN_PUBLIC_IP=$([ "$NAT_ENABLED" = "true" ] && echo "DISABLED" || echo "ENABLED")
+ASSIGN_PUBLIC_IP=$(if [ "$NAT_ENABLED" = "true" ]; then echo "DISABLED"; else echo "ENABLED"; fi)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@infrastructure/README.md` around lines 259 - 264, The ASSIGN_PUBLIC_IP
assignment uses a short-circuited conditional that can misbehave if the first
command fails; replace the current command-substitution expression with an
explicit if/then/else/fi inside the substitution so ASSIGN_PUBLIC_IP is set by
evaluating NAT_ENABLED with a deterministic branch (use the NAT_ENABLED variable
and the literal outputs "DISABLED" and "ENABLED" in the two branches) to avoid
relying on &&/|| short-circuit behavior.
infrastructure/modules/service/variables.tf (1)

123-126: Add a non-empty validation for subnet_ids.

Please fail early when no subnets are provided; otherwise this defers failure to provider/runtime behavior.

♻️ Suggested change
 variable "subnet_ids" {
   description = "Subnet IDs for ECS tasks (can be public or private)."
   type        = list(string)
+
+  validation {
+    condition     = length(var.subnet_ids) > 0
+    error_message = "At least one subnet ID must be provided."
+  }
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@infrastructure/modules/service/variables.tf` around lines 123 - 126, Add a
validation block to the variable "subnet_ids" to fail fast when no subnets are
supplied: inside the variable "subnet_ids" (type = list(string)) add a
validation { condition = length(var.subnet_ids) > 0 error_message = "subnet_ids
must contain at least one subnet ID" } so Terraform will error at plan/apply
time if the list is empty or not provided.
infrastructure/live/main.tf (1)

57-57: Extract the repeated subnet selection into a local for maintainability.

The same NAT-based subnet ternary is repeated three times.

♻️ Suggested cleanup
 locals {
   assign_public_ip = var.enable_nat_gateway ? false : true
+  tasks_subnet_ids = var.enable_nat_gateway ? module.networking.private_subnet_ids : module.networking.public_subnet_ids
   common_tags = {
     Environment = var.environment
     ManagedBy   = "Terraform"
     Project     = var.project_name
   }
@@
-  subnet_ids            = var.enable_nat_gateway ? module.networking.private_subnet_ids : module.networking.public_subnet_ids
+  subnet_ids            = local.tasks_subnet_ids
@@
-  subnet_ids          = var.enable_nat_gateway ? module.networking.private_subnet_ids : module.networking.public_subnet_ids
+  subnet_ids          = local.tasks_subnet_ids
@@
-  subnet_ids                    = var.enable_nat_gateway ? module.networking.private_subnet_ids : module.networking.public_subnet_ids
+  subnet_ids                    = local.tasks_subnet_ids

Also applies to: 120-120, 218-218

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@infrastructure/live/main.tf` at line 57, Extract the repeated NAT-based
subnet selection into a local by adding a local like local.subnet_ids =
var.enable_nat_gateway ? module.networking.private_subnet_ids :
module.networking.public_subnet_ids, then replace the ternary occurrences (where
subnet_ids is currently set to var.enable_nat_gateway ?
module.networking.private_subnet_ids : module.networking.public_subnet_ids) with
local.subnet_ids in the modules/resources that use it (e.g., the blocks setting
subnet_ids currently using that ternary).
infrastructure/modules/service/tests/service.tftest.hcl (1)

18-18: Add explicit assertions for network wiring behavior.

Good rename, but please add tests that assert ECS service network_configuration uses var.subnet_ids and honors assign_public_ip (true/false paths). This protects the new NAT-driven behavior from regressions.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@infrastructure/modules/service/tests/service.tftest.hcl` at line 18, Add
assertions in the Terraform test to validate the ECS service's network wiring:
check that the ECS service resource's network_configuration.subnets equals
var.subnet_ids and add two test cases (or parameterized checks) that assert
network_configuration.assign_public_ip is "ENABLED" when assign_public_ip=true
and "DISABLED" (or absent) when assign_public_ip=false; update the test around
the ECS service block (the resource that references subnet_ids) to read
var.subnet_ids and to verify assign_public_ip behavior for both true/false paths
so NAT-driven behavior cannot regress.
infrastructure/live/variables.tf (1)

132-140: Add validation guards for Django config inputs.

These are now required knobs; lightweight validation will prevent accidental bad environment wiring.

♻️ Suggested change
 variable "django_configuration" {
   description = "The name of the Django configuration to use (e.g., Staging, Production)."
   type        = string
+  validation {
+    condition     = contains(["Staging", "Production"], var.django_configuration)
+    error_message = "django_configuration must be either 'Staging' or 'Production'."
+  }
 }
 
 variable "django_settings_module" {
   description = "The location of the Django settings module to use (e.g., settings.staging, settings.production)."
   type        = string
+  validation {
+    condition     = can(regex("^settings\\.(staging|production)$", var.django_settings_module))
+    error_message = "django_settings_module must be 'settings.staging' or 'settings.production'."
+  }
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@infrastructure/live/variables.tf` around lines 132 - 140, Add Terraform
validation blocks to both variables to prevent empty or malformed values: for
variable "django_configuration" add a validation that the value is non-empty
(length > 0) and optionally restrict to an allowed set if you have known
environments (e.g., Staging/Production/Dev); for variable
"django_settings_module" add a validation that the value is non-empty and
matches a simple module-name pattern (no spaces, only letters, numbers,
underscores and dots) and provide clear error_message text for each validation
so a bad input fails fast. Use the variable names "django_configuration" and
"django_settings_module" to locate where to add these validation blocks.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In @.github/workflows/run-ci-cd.yaml:
- Around line 946-957: The aws ecs run-task call can succeed at HTTP level but
still return failures in the response; capture the JSON output of the aws ecs
run-task invocation into a variable (instead of piping straight to stdout), use
a JSON parser (jq) to check .failures | length > 0 or absence of .tasks[] and if
failures exist echo an error with the failures and exit non-zero; otherwise
continue to echo success. Update the block around ASSIGN_PUBLIC_IP / aws ecs
run-task and use the CLUSTER_NAME and TASK_DEFINITION variables when checking
the response so the job fails fast when the index-data task did not actually
start.

---

Nitpick comments:
In `@infrastructure/live/main.tf`:
- Line 57: Extract the repeated NAT-based subnet selection into a local by
adding a local like local.subnet_ids = var.enable_nat_gateway ?
module.networking.private_subnet_ids : module.networking.public_subnet_ids, then
replace the ternary occurrences (where subnet_ids is currently set to
var.enable_nat_gateway ? module.networking.private_subnet_ids :
module.networking.public_subnet_ids) with local.subnet_ids in the
modules/resources that use it (e.g., the blocks setting subnet_ids currently
using that ternary).

In `@infrastructure/live/variables.tf`:
- Around line 132-140: Add Terraform validation blocks to both variables to
prevent empty or malformed values: for variable "django_configuration" add a
validation that the value is non-empty (length > 0) and optionally restrict to
an allowed set if you have known environments (e.g., Staging/Production/Dev);
for variable "django_settings_module" add a validation that the value is
non-empty and matches a simple module-name pattern (no spaces, only letters,
numbers, underscores and dots) and provide clear error_message text for each
validation so a bad input fails fast. Use the variable names
"django_configuration" and "django_settings_module" to locate where to add these
validation blocks.

In `@infrastructure/modules/service/tests/service.tftest.hcl`:
- Line 18: Add assertions in the Terraform test to validate the ECS service's
network wiring: check that the ECS service resource's
network_configuration.subnets equals var.subnet_ids and add two test cases (or
parameterized checks) that assert network_configuration.assign_public_ip is
"ENABLED" when assign_public_ip=true and "DISABLED" (or absent) when
assign_public_ip=false; update the test around the ECS service block (the
resource that references subnet_ids) to read var.subnet_ids and to verify
assign_public_ip behavior for both true/false paths so NAT-driven behavior
cannot regress.

In `@infrastructure/modules/service/variables.tf`:
- Around line 123-126: Add a validation block to the variable "subnet_ids" to
fail fast when no subnets are supplied: inside the variable "subnet_ids" (type =
list(string)) add a validation { condition = length(var.subnet_ids) > 0
error_message = "subnet_ids must contain at least one subnet ID" } so Terraform
will error at plan/apply time if the list is empty or not provided.

In `@infrastructure/README.md`:
- Around line 259-264: The ASSIGN_PUBLIC_IP assignment uses a short-circuited
conditional that can misbehave if the first command fails; replace the current
command-substitution expression with an explicit if/then/else/fi inside the
substitution so ASSIGN_PUBLIC_IP is set by evaluating NAT_ENABLED with a
deterministic branch (use the NAT_ENABLED variable and the literal outputs
"DISABLED" and "ENABLED" in the two branches) to avoid relying on &&/||
short-circuit behavior.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 5609cb95-f61b-43ec-819e-9bf9785d8808

📥 Commits

Reviewing files that changed from the base of the PR and between b6f3308 and fead597.

📒 Files selected for processing (11)
  • .github/workflows/run-ci-cd.yaml
  • infrastructure/README.md
  • infrastructure/live/main.tf
  • infrastructure/live/outputs.tf
  • infrastructure/live/terraform.tfvars.example
  • infrastructure/live/variables.tf
  • infrastructure/modules/networking/main.tf
  • infrastructure/modules/networking/variables.tf
  • infrastructure/modules/service/main.tf
  • infrastructure/modules/service/tests/service.tftest.hcl
  • infrastructure/modules/service/variables.tf
🚧 Files skipped from review as they are similar to previous changes (1)
  • infrastructure/modules/service/main.tf

Copy link
Contributor

@cubic-dev-ai cubic-dev-ai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

0 issues found across 18 files (changes from recent commits).

Requires human review: Auto-approval blocked by 3 unresolved issues from previous reviews.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@infrastructure/modules/networking/main.tf`:
- Around line 1-10: The change replaced exact pins with pessimistic constraints
in the terraform block (see required_version) and the aws provider block
(required_providers.aws.version); either obtain explicit team approval for
switching to pessimistic versioning and document that decision, or revert these
lines back to the team's required exact pins (use the previous exact values for
required_version and required_providers.aws.version) across all modules under
infrastructure/modules/ so the versioning policy remains consistent; ensure the
commit message or PR description records the team's consensus if you keep the
new "~>" constraints.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: c9a51eb2-af3c-4329-90ba-a5bc7fb2ad25

📥 Commits

Reviewing files that changed from the base of the PR and between fead597 and c81ddfa.

📒 Files selected for processing (18)
  • .github/workflows/run-ci-cd.yaml
  • infrastructure/live/main.tf
  • infrastructure/live/terraform.tfvars.example
  • infrastructure/live/variables.tf
  • infrastructure/modules/database/main.tf
  • infrastructure/modules/database/outputs.tf
  • infrastructure/modules/database/tests/database.tftest.hcl
  • infrastructure/modules/database/variables.tf
  • infrastructure/modules/networking/main.tf
  • infrastructure/modules/networking/modules/vpc-endpoint/main.tf
  • infrastructure/modules/networking/modules/vpc-endpoint/tests/vpc-endpoint.tftest.hcl
  • infrastructure/modules/networking/modules/vpc-endpoint/variables.tf
  • infrastructure/modules/networking/tests/networking.tftest.hcl
  • infrastructure/modules/networking/variables.tf
  • infrastructure/modules/security/main.tf
  • infrastructure/modules/security/outputs.tf
  • infrastructure/modules/security/tests/security.tftest.hcl
  • infrastructure/modules/security/variables.tf
✅ Files skipped from review due to trivial changes (1)
  • infrastructure/modules/database/tests/database.tftest.hcl

Copy link
Contributor

@cubic-dev-ai cubic-dev-ai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

1 issue found across 8 files (changes from recent commits).

Prompt for AI agents (unresolved issues)

Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="infrastructure/modules/networking/modules/vpc-endpoint/variables.tf">

<violation number="1" location="infrastructure/modules/networking/modules/vpc-endpoint/variables.tf:62">
P1: This unconditional check breaks valid S3-only usage of the module; `private_subnet_ids` is only required when an Interface endpoint is enabled.</violation>
</file>

Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
.github/workflows/run-ci-cd.yaml (1)

900-918: ⚠️ Potential issue | 🟡 Minor

Add explicit launch-response validation for the migrate ECS task to align with the index-data task.

The migrate task block currently captures the task ARN directly without validating that the ECS run-task API call succeeded. Unlike the index-data task below it, there is no check for .failures or task presence. If the API returns an HTTP 200 with a failures array (which can happen when insufficient resources are available or role restrictions apply), the task step will emit an invalid/empty ARN, deferring the failure to downstream waiter/describe steps with poor diagnostics.

Proposed fix
       - name: Run ECS migrate task
         id: migrate-task
         env:
           CLUSTER_NAME: ${{ steps.tf-outputs.outputs.tasks_cluster_name }}
           NAT_GATEWAY_ENABLED: ${{ steps.tf-outputs.outputs.nat_gateway_enabled }}
           TASKS_SECURITY_GROUP_ID: ${{ steps.tf-outputs.outputs.tasks_security_group_id }}
           TASK_DEFINITION: nest-staging-migrate
           TASKS_SUBNET_IDS: ${{ steps.tf-outputs.outputs.tasks_subnet_ids }}
         run: |
           ASSIGN_PUBLIC_IP=$([ "$NAT_GATEWAY_ENABLED" = "true" ] && echo "DISABLED" || echo "ENABLED")
-          TASK_ARN=$(aws ecs run-task \
+          RESPONSE=$(aws ecs run-task \
             --cluster "$CLUSTER_NAME" \
             --task-definition "$TASK_DEFINITION" \
             --launch-type FARGATE \
             --network-configuration "awsvpcConfiguration={subnets=[$TASKS_SUBNET_IDS],securityGroups=[$TASKS_SECURITY_GROUP_ID],assignPublicIp=$ASSIGN_PUBLIC_IP}" \
-            --query 'tasks[0].taskArn' \
-            --output text)
+            --output json)
+          FAILURE_COUNT=$(echo "$RESPONSE" | jq '.failures | length')
+          TASK_ARN=$(echo "$RESPONSE" | jq -r '.tasks[0].taskArn // empty')
+          if [ "$FAILURE_COUNT" -gt 0 ] || [ -z "$TASK_ARN" ]; then
+            echo "::error::Failed to start migrate task: $(echo "$RESPONSE" | jq -c '.failures')"
+            exit 1
+          fi
           echo "task_arn=$TASK_ARN" >> $GITHUB_OUTPUT
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.github/workflows/run-ci-cd.yaml around lines 900 - 918, The migrate-task
step currently writes TASK_ARN directly from the aws ecs run-task output without
validating the response; change it to capture the full JSON response (e.g.,
TASK_RESPONSE), check for non-empty .failures and for presence of
tasks[0].taskArn, and if failures exist or taskArn is missing emit a clear error
and exit non-zero; only when validation passes set TASK_ARN and write it to
GITHUB_OUTPUT—mirror the validation logic used by the index-data run-task step
to ensure failures are surfaced immediately.
🧹 Nitpick comments (2)
infrastructure/README.md (2)

269-273: Consider automating AWS_REGION substitution.

The manual AWS_REGION placeholder appears in three separate commands (lines 282, 293, 304), which is error-prone. Users might forget to replace all occurrences or use inconsistent values.

♻️ Proposed automated region extraction

If the AWS region is available in Terraform outputs, extract it alongside other variables:

 CLUSTER=$(terraform output -raw tasks_cluster_name)
 SECURITY_GROUP=$(terraform output -raw tasks_security_group_id)
 SUBNETS=$(terraform output -json tasks_subnet_ids | jq -r 'join(",")')
 NAT_ENABLED=$(terraform output -raw nat_gateway_enabled)
+REGION=$(terraform output -raw aws_region)
 ASSIGN_PUBLIC_IP=$([ "$NAT_ENABLED" = "true" ] && echo "DISABLED" || echo "ENABLED")

Then update the commands to use --region "$REGION" or --region $REGION.

Alternatively, if region isn't in outputs, you could use:

REGION=$(aws configure get region)

This ensures consistency across all three commands.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@infrastructure/README.md` around lines 269 - 273, The README currently uses a
manual AWS_REGION placeholder in three commands which is error-prone; update the
docs to capture the region once (e.g., assign a REGION variable by extracting
from Terraform outputs or via `aws configure get region`) and then replace each
`AWS_REGION` occurrence in the commands with `--region "$REGION"` (or `--region
$REGION`) so all three commands use the same automated REGION variable; mention
both options (Terraform outputs variable name and the aws configure fallback) so
users can copy the exact one that applies.

256-260: Verify jq installation before use.

Line 259 notes that jq is required, but the script doesn't verify it's installed. The command on line 264 will fail with a cryptic error if jq is missing.

🔧 Proposed dependency check

Add a check before the variable extraction block:

# Verify required tools
if ! command -v jq &> /dev/null; then
  echo "Error: jq is required but not installed. Install it with: sudo apt-get install jq"
  exit 1
fi

Alternatively, add installation instructions in the Prerequisites section (lines 7-13) alongside Terraform and AWS CLI.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@infrastructure/README.md` around lines 256 - 260, Add a pre-check for the jq
dependency before the Terraform output parsing block: verify jq is on PATH
(e.g., via command -v jq) and if missing print a clear error message and exit
non‑zero so the subsequent variable extraction doesn't fail cryptically;
alternatively update the README's Prerequisites section to list installation
instructions for jq alongside Terraform and AWS CLI so users install it before
running the script. Include the check immediately before the "Set variables from
Terraform outputs" extraction and ensure the error message mentions how to
install jq.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@infrastructure/README.md`:
- Around line 256-267: Add validation guards after extracting Terraform outputs
(CLUSTER, SECURITY_GROUP, SUBNETS, NAT_ENABLED) that verify each variable is
non-empty and that SUBNETS is valid JSON/csv; if any check fails, print a clear
error and exit non-zero so downstream ECS commands don't run on bad data.
Replace the inline ASSIGN_PUBLIC_IP subshell with an explicit if/else that tests
NAT_ENABLED ("true"/"false") and sets ASSIGN_PUBLIC_IP accordingly, and ensure
all error messages reference the variable name (e.g., "Missing Terraform output:
CLUSTER") to fail fast and aid debugging.

---

Outside diff comments:
In @.github/workflows/run-ci-cd.yaml:
- Around line 900-918: The migrate-task step currently writes TASK_ARN directly
from the aws ecs run-task output without validating the response; change it to
capture the full JSON response (e.g., TASK_RESPONSE), check for non-empty
.failures and for presence of tasks[0].taskArn, and if failures exist or taskArn
is missing emit a clear error and exit non-zero; only when validation passes set
TASK_ARN and write it to GITHUB_OUTPUT—mirror the validation logic used by the
index-data run-task step to ensure failures are surfaced immediately.

---

Nitpick comments:
In `@infrastructure/README.md`:
- Around line 269-273: The README currently uses a manual AWS_REGION placeholder
in three commands which is error-prone; update the docs to capture the region
once (e.g., assign a REGION variable by extracting from Terraform outputs or via
`aws configure get region`) and then replace each `AWS_REGION` occurrence in the
commands with `--region "$REGION"` (or `--region $REGION`) so all three commands
use the same automated REGION variable; mention both options (Terraform outputs
variable name and the aws configure fallback) so users can copy the exact one
that applies.
- Around line 256-260: Add a pre-check for the jq dependency before the
Terraform output parsing block: verify jq is on PATH (e.g., via command -v jq)
and if missing print a clear error message and exit non‑zero so the subsequent
variable extraction doesn't fail cryptically; alternatively update the README's
Prerequisites section to list installation instructions for jq alongside
Terraform and AWS CLI so users install it before running the script. Include the
check immediately before the "Set variables from Terraform outputs" extraction
and ensure the error message mentions how to install jq.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 7eb29de0-9fe1-4fda-8674-2e00850b0c1d

📥 Commits

Reviewing files that changed from the base of the PR and between c81ddfa and 61cdcf5.

📒 Files selected for processing (8)
  • .github/workflows/run-ci-cd.yaml
  • infrastructure/README.md
  • infrastructure/modules/alb/variables.tf
  • infrastructure/modules/networking/modules/nacl/variables.tf
  • infrastructure/modules/networking/modules/vpc-endpoint/variables.tf
  • infrastructure/modules/service/variables.tf
  • infrastructure/modules/tasks/modules/task/variables.tf
  • infrastructure/modules/tasks/variables.tf

@rudransh-shrivastava
Copy link
Collaborator Author

Running fork CI/CD.

@sonarqubecloud
Copy link

Copy link
Contributor

@cubic-dev-ai cubic-dev-ai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

1 issue found across 1 file (changes from recent commits).

Prompt for AI agents (unresolved issues)

Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name=".github/workflows/run-ci-cd.yaml">

<violation number="1">
P2: This gate removes PR/push validation for the deploy frontend image. `docker/frontend/Dockerfile` is now only built on manual/nightly staging runs or release, so release-image breakages can merge unnoticed.</violation>
</file>

Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.

DOMAIN_NAME: ${{ vars.DOMAIN_NAME }}
DJANGO_CONFIGURATION: Staging
DJANGO_SETTINGS_MODULE: settings.staging
ENABLE_CRON_TASKS: false
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I disabled cron tasks for staging please enable them here and in terraform.tfvars.example if this is not correct.

@rudransh-shrivastava rudransh-shrivastava changed the title [WIP] Refactor Infrastructure Code Refactor Infrastructure Code Mar 13, 2026
@rudransh-shrivastava rudransh-shrivastava marked this pull request as ready for review March 13, 2026 12:52
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (3)
infrastructure/README.md (1)

98-101: Clarify the staging vs live terminology in this section.

This reads as “staging deployment,” but the path is infrastructure/live/. Please add one explicit sentence that infrastructure/live/ currently hosts the staging deployment config to avoid operator confusion during setup.

Also applies to: 125-131

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@infrastructure/README.md` around lines 98 - 101, Add one explicit clarifying
sentence to the NOTE block (the paragraph beginning with "Prerequisite: Create a
`nest-staging` IAM user...") stating that the repository's infrastructure/live
directory currently contains the staging deployment configuration (i.e., "Note:
the infrastructure/live/ directory currently hosts the staging deployment
configuration") so operators won't be confused by the "staging" wording; make
the same exact clarification in the later section around the paragraph at lines
125-131 to keep both places consistent.
infrastructure/modules/tasks/main.tf (1)

14-18: Gate the EventBridge IAM resources with enable_cron_tasks too.

When this flag is false, the schedules disappear, but aws_iam_role.event_bridge_role and aws_iam_policy.event_bridge_ecs_policy are still created with ecs:RunTask and iam:PassRole. That leaves unused privilege in environments that intentionally disable cron. Consider wrapping those resources with the same flag, or moving them into the scheduled-task path so they only exist when a schedule exists.

Based on learnings: In the OWASP/Nest repository, all Terraform files under infrastructure/ (e.g., infrastructure/state/main.tf) must implement production-grade security hardening.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@infrastructure/modules/tasks/main.tf` around lines 14 - 18, The EventBridge
IAM resources aws_iam_role.event_bridge_role and
aws_iam_policy.event_bridge_ecs_policy are still created even when
var.enable_cron_tasks is false, leaving unnecessary ecs:RunTask and iam:PassRole
privileges; guard these resources with the same flag by wrapping their resource
blocks (or moving them into the scheduled-task conditional/module) so they are
only defined when var.enable_cron_tasks is true, ensuring all references that
depend on these resources are likewise conditional or use count/for_each keyed
on var.enable_cron_tasks to avoid dangling references.
infrastructure/modules/tasks/tests/tasks.tftest.hcl (1)

20-64: Assert resource presence, not only local values.

These tests prove the locals flip, but they will not catch a future wiring bug where a task module stops using those locals. As a stronger regression test, also assert that the child aws_cloudwatch_event_rule.task resources are absent/present for the disabled and enabled runs.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@infrastructure/modules/tasks/tests/tasks.tftest.hcl` around lines 20 - 64,
Add assertions that verify the actual child resource
aws_cloudwatch_event_rule.task is absent when enable_cron_tasks = false and
present when enable_cron_tasks = true: in the
"test_cron_tasks_disabled_removes_schedules" run add an assert that the
count/length of resources for aws_cloudwatch_event_rule.task == 0, and in
"test_cron_tasks_enabled_creates_schedules" assert that the count/length of
resources for aws_cloudwatch_event_rule.task > 0; keep the existing local checks
(local.sync_data_schedule_expression,
local.update_project_health_metrics_schedule_expression,
local.update_project_health_scores_schedule_expression) and add these
resource-presence asserts to catch wiring regressions.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@infrastructure/modules/alb/variables.tf`:
- Around line 71-74: Update the validation for the variable public_subnet_ids so
it requires at least two entries (not just one) before plan/apply; modify the
validation condition in the validation block that references
var.public_subnet_ids (the validation block in variables.tf) to enforce
length(var.public_subnet_ids) >= 2 (or > 1) and update the error_message to
clearly state that at least two public_subnet_ids are required for ALB subnets
across Availability Zones.

---

Nitpick comments:
In `@infrastructure/modules/tasks/main.tf`:
- Around line 14-18: The EventBridge IAM resources
aws_iam_role.event_bridge_role and aws_iam_policy.event_bridge_ecs_policy are
still created even when var.enable_cron_tasks is false, leaving unnecessary
ecs:RunTask and iam:PassRole privileges; guard these resources with the same
flag by wrapping their resource blocks (or moving them into the scheduled-task
conditional/module) so they are only defined when var.enable_cron_tasks is true,
ensuring all references that depend on these resources are likewise conditional
or use count/for_each keyed on var.enable_cron_tasks to avoid dangling
references.

In `@infrastructure/modules/tasks/tests/tasks.tftest.hcl`:
- Around line 20-64: Add assertions that verify the actual child resource
aws_cloudwatch_event_rule.task is absent when enable_cron_tasks = false and
present when enable_cron_tasks = true: in the
"test_cron_tasks_disabled_removes_schedules" run add an assert that the
count/length of resources for aws_cloudwatch_event_rule.task == 0, and in
"test_cron_tasks_enabled_creates_schedules" assert that the count/length of
resources for aws_cloudwatch_event_rule.task > 0; keep the existing local checks
(local.sync_data_schedule_expression,
local.update_project_health_metrics_schedule_expression,
local.update_project_health_scores_schedule_expression) and add these
resource-presence asserts to catch wiring regressions.

In `@infrastructure/README.md`:
- Around line 98-101: Add one explicit clarifying sentence to the NOTE block
(the paragraph beginning with "Prerequisite: Create a `nest-staging` IAM
user...") stating that the repository's infrastructure/live directory currently
contains the staging deployment configuration (i.e., "Note: the
infrastructure/live/ directory currently hosts the staging deployment
configuration") so operators won't be confused by the "staging" wording; make
the same exact clarification in the later section around the paragraph at lines
125-131 to keep both places consistent.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 6bc13bf2-0fd0-429e-8bd2-95610c54bed2

📥 Commits

Reviewing files that changed from the base of the PR and between c81ddfa and ad7f5c3.

📒 Files selected for processing (11)
  • .github/workflows/run-ci-cd.yaml
  • infrastructure/README.md
  • infrastructure/live/main.tf
  • infrastructure/live/terraform.tfvars.example
  • infrastructure/live/variables.tf
  • infrastructure/modules/alb/variables.tf
  • infrastructure/modules/service/variables.tf
  • infrastructure/modules/tasks/main.tf
  • infrastructure/modules/tasks/modules/task/variables.tf
  • infrastructure/modules/tasks/tests/tasks.tftest.hcl
  • infrastructure/modules/tasks/variables.tf
🚧 Files skipped from review as they are similar to previous changes (2)
  • infrastructure/modules/tasks/modules/task/variables.tf
  • infrastructure/live/terraform.tfvars.example

@arkid15r arkid15r enabled auto-merge March 13, 2026 21:44
@arkid15r arkid15r added this pull request to the merge queue Mar 13, 2026
Merged via the queue into OWASP:main with commit fca9222 Mar 13, 2026
67 checks passed
aryanghai12 pushed a commit to aryanghai12/Nest that referenced this pull request Mar 14, 2026
* refactor parameters module to expose Django configuration/settings module

* update desired count

* rename staging/ to live/

* post rename code changes

* temporary changes to run CI/CD

* fix quotes for variables

* unpin terraform, providers, and modules versions

* Revert "temporary changes to run CI/CD"

This reverts commit ec2b007.

* fix django settings name

* add a flag for enabling NAT Gateway and disable it by default

* update docs

* rename create_ prefixed variables to enable_

* add/fix tests

* bot comments

* add flag for enabling cron tasks

* Reapply "temporary changes to run CI/CD"

This reverts commit 68f1d5d.

* temporary changes to run CI/CD

* Revert "temporary changes to run CI/CD"

This reverts commit cdf9078.

* Revert "Reapply "temporary changes to run CI/CD""

This reverts commit 64b5b04.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

ci docs Improvements or additions to documentation

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Refactor Infrastructure Code

2 participants