
feat: implement multi-factor member scoring algorithm#4264

Open
HarshitVerma109 wants to merge 6 commits into OWASP:main from HarshitVerma109:feature/improving-member-ranking

Conversation

@HarshitVerma109
Contributor

Proposed change

Resolves #4200

Introduces a multi-factor calculated_score for members that accounts for contribution volume, recency, breadth, type diversity, and leadership roles. Updates the API default ordering and Algolia index to use this new metric for more balanced member ranking.

Checklist

  • Required: I followed the contributing workflow
  • Required: I verified that my code works as intended and resolves the issue as described
  • Required: I ran make check-test locally: all warnings addressed, tests passed
  • I used AI for code, documentation, tests, or communication related to this PR

@coderabbitai
Contributor

coderabbitai bot commented Mar 13, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.


Walkthrough

Adds a calculated_score FloatField to User and Member APIs, implements a multi-factor scoring service, updates indexing and ranking rules, adds commands to compute and persist scores, changes API ordering to default by calculated_score, includes migration and comprehensive tests across models, services, commands, and API.

Changes

  • Scoring engine & tests (backend/apps/github/services/score.py, backend/tests/apps/github/services/score_test.py): Add a modular multi-factor scoring module (weights, helpers, compute_user_score, context) and extensive unit tests covering factors, edge cases, caps, and end-to-end aggregation.
  • Model, migration & model tests (backend/apps/github/models/user.py, backend/apps/github/migrations/0045_user_calculated_score.py, backend/tests/apps/github/models/user_test.py): Add a calculated_score FloatField (default 0, db_index) to User, a migration to add the column, and a test asserting the default value.
  • Indexing integration & index tests (backend/apps/github/index/registry/user.py, backend/apps/github/models/mixins/user.py, backend/tests/apps/github/models/mixins/user_test.py): Expose idx_calculated_score on the mixin, add the field to index fields and customRanking settings, and update tests to include the new index field.
  • API exposure & API tests (backend/apps/api/rest/v0/member.py, backend/tests/apps/api/rest/v0/member_test.py): Expose calculated_score in the Member schema; extend allowed ordering to calculated_score/-calculated_score; change default ordering to -calculated_score with id as tiebreaker; update tests.
  • Management commands & tests (backend/apps/github/management/commands/github_calculate_member_scores.py, backend/apps/github/management/commands/github_update_users.py, backend/tests/apps/github/management/commands/...): Add a new command to compute and bulk-save member scores (optional --user, batching, progress); update github_update_users to use the scoring context and compute/persist calculated_score; add comprehensive command tests and leadership-aggregation tests.
  • Local compose (docker-compose/local/compose.yaml): Rename the local Postgres volume from db-data to db-data-v2 (volume mapping update).
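The scoring module itself is not shown in this thread, but the walkthrough's description (weighted factors combined into one capped score) can be sketched in plain Python. The factor names and weights below are illustrative assumptions, not the actual values in backend/apps/github/services/score.py:

```python
# Hypothetical sketch of a weighted multi-factor member score.
# Factor names and weights are invented for illustration.

MAX_SCORE = 100.0

WEIGHTS = {
    "contributions": 0.4,
    "recency": 0.2,
    "breadth": 0.15,
    "type_diversity": 0.1,
    "leadership": 0.15,
}


def compute_score(factors: dict[str, float]) -> float:
    """Combine per-factor scores (each 0-100) into one weighted total."""
    total = sum(WEIGHTS[name] * factors.get(name, 0.0) for name in WEIGHTS)
    # Clamp so a future weight tweak cannot push the public score above 100.
    return round(min(MAX_SCORE, total), 2)
```

The clamp keeps the 0-100 range local to this function instead of depending on the weights continuing to sum to exactly 1.0.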

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~50 minutes

Suggested reviewers

  • arkid15r
  • kasya
🚥 Pre-merge checks: ✅ 5 passed

  • Title check: ✅ Passed. The title accurately summarizes the main change: implementing a multi-factor member scoring algorithm, which is the core feature across all the file modifications.
  • Description check: ✅ Passed. The PR description is directly related to the changeset, explaining the multi-factor calculated_score implementation and its integration with API ordering and Algolia indexing.
  • Linked Issues check: ✅ Passed. The code changes comprehensively implement all major objectives from issue #4200: a calculated_score field on models, multi-factor scoring (contributions, leadership, recency, breadth, type diversity, consistency), updated API ordering and Algolia index, and score recalculation during GitHub data sync.
  • Out of Scope Changes check: ✅ Passed. All code changes are directly aligned with the PR objectives. The docker-compose volume rename (db-data to db-data-v2) appears incidental but is a minor operational change unrelated to feature scope; all other changes implement the scoring feature as specified.
  • Docstring Coverage: ✅ Passed. Docstring coverage is 97.83%, which is sufficient; the required threshold is 80.00%.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.



@coderabbitai coderabbitai bot left a comment


Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
backend/tests/apps/api/rest/v0/member_test.py (1)

61-61: ⚠️ Potential issue | 🟡 Minor

Update the stale docstring for default ordering.

The docstring still says default '-created_at', but the test now validates '-calculated_score'.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/tests/apps/api/rest/v0/member_test.py` at line 61, The test docstring
still claims the default ordering is '-created_at' but the test asserts
'-calculated_score'; update the docstring to accurately say "-calculated_score"
(replace the string "'-created_at'" with "'-calculated_score'") in the test at
the top of the member list test so the description matches the assertion.
🧹 Nitpick comments (3)
backend/tests/apps/github/management/commands/github_calculate_member_scores_test.py (1)

36-37: Use tolerant float assertions to avoid brittle test behavior.

This test file has exact float equality comparisons on lines 36, 37, and 63. Replace these with pytest.approx(...) for stability and to satisfy static-analysis guidance.

Proposed test adjustment
 from unittest.mock import MagicMock, patch
 
+import pytest
+
 from apps.github.management.commands.github_calculate_member_scores import Command
@@
-        assert mock_user_1.calculated_score == 42.5
-        assert mock_user_2.calculated_score == 42.5
+        assert mock_user_1.calculated_score == pytest.approx(42.5)
+        assert mock_user_2.calculated_score == pytest.approx(42.5)
@@
-        assert mock_user.calculated_score == 55
+        assert mock_user.calculated_score == pytest.approx(55)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@backend/tests/apps/github/management/commands/github_calculate_member_scores_test.py`
around lines 36 - 37, Replace brittle exact float equality assertions in the
test by using pytest.approx: update the assertions that check
mock_user_1.calculated_score and mock_user_2.calculated_score (and the other
float assertion around line 63) to use pytest.approx(...) so they tolerate small
floating-point differences, and ensure pytest is imported in
github_calculate_member_scores_test.py if it isn't already; keep the rest of the
test logic unchanged.
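pytest.approx is the idiomatic fix here; for intuition, the same tolerance idea exists in the standard library as math.isclose. A minimal, self-contained demonstration of why exact float equality is brittle:

```python
# Exact float equality is brittle: accumulated floats rarely land on the
# "obvious" decimal value. math.isclose (stdlib) behaves like pytest.approx
# with a relative tolerance.
import math

score = 0.1 + 0.2  # actually 0.30000000000000004 under IEEE 754

assert score != 0.3              # exact comparison fails
assert math.isclose(score, 0.3)  # tolerant comparison passes
```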
backend/tests/apps/api/rest/v0/member_test.py (1)

71-71: Add secondary sort key to prevent non-deterministic ordering for tied scores.

The current implementation only sorts by -calculated_score, leaving users with identical scores in undefined order across requests. Add a deterministic tiebreaker using compound ordering: order_by("-calculated_score", "-updated_at", "login") to ensure consistent pagination.

This applies to both the default ordering (line 71) and the parameterized test cases (lines 78–79), which should also validate compound ordering.
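As a pure-Python illustration of the suggested compound ordering (the rows here are invented; the real test mocks a Django queryset): Python's sorted() is stable, so chaining sorts from least- to most-significant key reproduces a multi-key ORDER BY.

```python
# Sketch: applying order_by("-calculated_score", "-updated_at", "login")
# in memory. Data is made up for illustration.

members = [
    {"login": "alice", "calculated_score": 90.0, "updated_at": "2026-03-01"},
    {"login": "bob", "calculated_score": 90.0, "updated_at": "2026-03-02"},
    {"login": "carol", "calculated_score": 75.0, "updated_at": "2026-02-15"},
]


def order_members(rows):
    """In-memory equivalent of the compound order_by, least key first."""
    rows = sorted(rows, key=lambda m: m["login"])                        # login ASC
    rows = sorted(rows, key=lambda m: m["updated_at"], reverse=True)     # updated_at DESC
    return sorted(rows, key=lambda m: m["calculated_score"], reverse=True)  # score DESC
```

With only the score key, the two 90.0 rows would stay in whatever order they arrived; the extra keys make the result independent of input order.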

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/tests/apps/api/rest/v0/member_test.py` at line 71, The test's
ordering assertion uses only
mock_user_model.objects.order_by.assert_called_with("-calculated_score") which
allows nondeterministic ties; update the assertion(s) to expect the compound
deterministic ordering order_by("-calculated_score", "-updated_at", "login")
instead. Locate the assertions in member_test.py (the call to
mock_user_model.objects.order_by and the parameterized test cases) and replace
their expected arguments to the three-key tuple so both the default ordering
assertion and the parameterized checks validate order_by("-calculated_score",
"-updated_at", "login").
backend/apps/github/services/score.py (1)

99-139: Consider consolidating leadership count queries for efficiency.

The current implementation makes 3 separate filter().count() calls (lines 109-125). For a batch job, this is acceptable, but if performance becomes a concern, you could consolidate into a single annotated query.

♻️ Optional: Consolidate with annotations
from django.db.models import Count, Case, When, IntegerField

counts = leader_memberships.aggregate(
    project_leaders=Count(Case(
        When(entity_type=project_ct, role=EntityMember.Role.LEADER, then=1),
        output_field=IntegerField()
    )),
    chapter_leaders=Count(Case(
        When(entity_type=chapter_ct, role=EntityMember.Role.LEADER, then=1),
        output_field=IntegerField()
    )),
    committee_members=Count(Case(
        When(entity_type=committee_ct, role__in=[EntityMember.Role.LEADER, EntityMember.Role.MEMBER], then=1),
        output_field=IntegerField()
    )),
)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/apps/github/services/score.py` around lines 99 - 139, The three
separate leader_memberships.filter(...).count() calls (project_leader_count,
chapter_leader_count, committee_member_count) are inefficient; replace them with
a single aggregate on leader_memberships using Count + Case/When (or conditional
Sum) to compute project_leaders, chapter_leaders and committee_members in one DB
query (using the existing project_ct, chapter_ct, committee_ct and
EntityMember.Role checks), then use those aggregated counts to compute the three
min(... * POINTS..., MAX_..._POINTS) additions to score and return min(score,
MAX_SCORE).

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 55eb742c-18a6-406d-8134-783545e37158

📥 Commits

Reviewing files that changed from the base of the PR and between 504211c and ffd1ab2.

📒 Files selected for processing (14)
  • backend/apps/api/rest/v0/member.py
  • backend/apps/github/index/registry/user.py
  • backend/apps/github/management/commands/github_calculate_member_scores.py
  • backend/apps/github/migrations/0045_user_calculated_score.py
  • backend/apps/github/models/mixins/user.py
  • backend/apps/github/models/user.py
  • backend/apps/github/services/__init__.py
  • backend/apps/github/services/score.py
  • backend/tests/apps/api/rest/v0/member_test.py
  • backend/tests/apps/github/management/commands/github_calculate_member_scores_test.py
  • backend/tests/apps/github/models/mixins/user_test.py
  • backend/tests/apps/github/models/user_test.py
  • backend/tests/apps/github/services/__init__.py
  • backend/tests/apps/github/services/score_test.py

coderabbitai[bot]
coderabbitai bot previously approved these changes Mar 13, 2026

@cubic-dev-ai cubic-dev-ai bot left a comment


6 issues found across 14 files

Confidence score: 2/5

  • There is a high-confidence, high-impact issue in backend/apps/github/management/commands/github_calculate_member_scores.py: updated_users is never cleared after bulk_save(), which can cause unbounded memory growth and heavy write amplification during score recalculation runs.
  • backend/apps/api/rest/v0/member.py has non-deterministic pagination when ordering only by calculated_score; tied scores can shift between pages, creating concrete user-facing duplication/missing-member behavior across requests.
  • backend/apps/github/services/score.py currently swallows all exceptions around user.owasp_profile, which can hide real data/access failures and make scoring errors hard to detect; fixing these items should significantly reduce merge risk.
  • Pay close attention to backend/apps/github/management/commands/github_calculate_member_scores.py, backend/apps/api/rest/v0/member.py, and backend/apps/github/services/score.py - memory/performance amplification, pagination consistency, and masked runtime failures are the key risks.
Prompt for AI agents (unresolved issues)

Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="backend/apps/github/services/score.py">

<violation number="1" location="backend/apps/github/services/score.py:133">
P2: Broad exception handling silently catches all exceptions when accessing `user.owasp_profile`, hiding real profile-access failures like database errors, misconfigured relations, or attribute errors. Instead of scoring users correctly, the code silently returns a lower score when ANY exception occurs, making production data issues hard to detect. Catch only the specific `MemberProfile.DoesNotExist` exception for users without profiles.</violation>
</file>

<file name="backend/apps/github/management/commands/github_calculate_member_scores.py">

<violation number="1" location="backend/apps/github/management/commands/github_calculate_member_scores.py:50">
P2: The filter `contributions_count__gt=0` excludes users who may have meaningful scores from leadership roles. Users with zero contributions but leadership positions (project/chapter leaders, board members) will never be recalculated. Additionally, previously scored users whose contribution count drops to 0 will retain stale scores indefinitely.</violation>

<violation number="2" location="backend/apps/github/management/commands/github_calculate_member_scores.py:61">
P1: Batch buffer `updated_users` is never cleared after `bulk_save()`, causing quadratic write amplification and unbounded memory growth. After each checkpoint save, add `updated_users = []` to reset the buffer.</violation>
</file>
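The buffer fix from violation 2 is easiest to see in a stripped-down loop; compute_score and bulk_save are placeholder callables, not the command's real internals:

```python
# Sketch: flush-and-clear batching. Without `updated_users = []` after each
# flush, every checkpoint re-saves all previously saved users and the list
# grows without bound.

BATCH_SIZE = 2


def recalculate(users, compute_score, bulk_save):
    updated_users = []
    for user in users:
        user["calculated_score"] = compute_score(user)
        updated_users.append(user)
        if len(updated_users) >= BATCH_SIZE:
            bulk_save(updated_users)
            updated_users = []  # reset the buffer: the fix the review asks for
    if updated_users:  # flush the final partial batch
        bulk_save(updated_users)
```

Each checkpoint now writes only the users scored since the previous flush, so memory stays bounded and no row is saved twice.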

<file name="backend/tests/apps/github/management/commands/github_calculate_member_scores_test.py">

<violation number="1" location="backend/tests/apps/github/management/commands/github_calculate_member_scores_test.py:77">
P2: The test for 'user not found' scenario is incomplete - it only verifies that bulk_save was not called but fails to assert that the error message was written to stdout. The test should verify the observable error output to ensure proper error signaling behavior.</violation>
</file>

<file name="backend/apps/api/rest/v0/member.py">

<violation number="1" location="backend/apps/api/rest/v0/member.py:94">
P1: Default pagination ordering by `calculated_score` lacks a secondary sort key, making pagination non-deterministic. Members with identical scores can appear in different orders across paginated requests, causing duplicates or skipped items during pagination. Add a secondary sort key (e.g., `id` or `created_at`) to ensure stable ordering.</violation>
</file>
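The user-facing effect of violation 1 is easy to reproduce with a toy paginator over made-up rows: when two requests materialize tied rows in different internal orders, slicing into pages duplicates one member and drops another.

```python
# Sketch: non-deterministic pagination with tied scores. Two "requests"
# see the same rows in different arrival orders.

PAGE_SIZE = 1


def page(rows, number, key):
    """Order rows, then slice out one page of member ids."""
    ordered = sorted(rows, key=key)
    start = number * PAGE_SIZE
    return [m["id"] for m in ordered[start : start + PAGE_SIZE]]


request_a = [{"id": 1, "score": 90}, {"id": 2, "score": 90}]
request_b = list(reversed(request_a))  # same rows, different arrival order


def unstable(m):
    return -m["score"]  # ties keep whatever order the rows arrived in


def stable(m):
    return (-m["score"], m["id"])  # id breaks ties deterministically
```

Under the unstable key, page 0 of request A and page 0 of request B can disagree; under the compound key, every page is identical across requests.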

<file name="backend/tests/apps/github/services/score_test.py">

<violation number="1" location="backend/tests/apps/github/services/score_test.py:84">
P2: Test `test_project_leader` uses a single mock object for all ContentType lookups, making all entity types (Project, Chapter, Committee) indistinguishable. The side_effect returns count=1 for ANY entity_type matching the mock, so it cannot verify the actual scoring logic correctly counts only project leaders. Combined with the weak assertion `score >= 0`, this test would pass even if the implementation incorrectly counted all entity types or returned wrong per-role weights.</violation>
</file>

Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.


@cubic-dev-ai cubic-dev-ai bot left a comment


1 issue found across 9 files (changes from recent commits).

Prompt for AI agents (unresolved issues)

Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="docker-compose/local/compose.yaml">

<violation number="1" location="docker-compose/local/compose.yaml:70">
P1: Renaming the Postgres volume `db-data` to `db-data-v2` silently resets every developer's local database and orphans existing data. This change appears unrelated to the multi-factor member scoring algorithm feature described in the PR title. Developers will lose all local data without warning when they next run `make run`.</violation>
</file>

Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 2

🧹 Nitpick comments (3)
docker-compose/local/compose.yaml (1)

70-70: Avoid a one-off DB volume rename without a consistent isolation strategy.

Line 70 and Line 152 rename db-data to db-data-v2, which will silently create a new Postgres volume and leave prior local data orphaned. If the goal is branch isolation, apply a consistent suffix strategy to all named volumes in this file; otherwise keep the canonical db-data name and document the reset explicitly in dev notes/PR description.

Based on learnings: In the OWASP/Nest repository, feature branches use unique volume name suffixes in docker-compose files to prevent volume clashes across parallel development efforts.

Also applies to: 152-152

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docker-compose/local/compose.yaml` at line 70, The docker-compose change
renames the Postgres volume from db-data to db-data-v2 which will create a new
volume and orphan existing data; either revert the service volume back to the
canonical db-data or, if branch isolation is intended, apply a consistent suffix
strategy across all named volumes in this compose file (e.g., rename every
occurrence of db-data and related named volumes to use the same suffix) and
update the volumes: definitions accordingly and document the reset in the PR
notes; look for the db-data/db-data-v2 string in the compose.yaml (service
volume mounts and volumes: section) to make the change.
backend/tests/apps/github/services/score_test.py (1)

165-191: Add malformed contribution_data regression tests.

Please add cases where counts are non-numeric (None, "x", "3") to ensure _recency_score and _consistency_score do not raise and safely skip/normalize bad values.

Also applies to: 275-297

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/tests/apps/github/services/score_test.py` around lines 165 - 191, Add
regression tests that supply malformed contribution_data entries (e.g., values
None, "x", "3") to ensure calculator._recency_score and
calculator._consistency_score tolerate non-numeric counts without raising;
create test cases in the existing TestRecencyScore block that set
mock_user.contribution_data to dicts containing these bad values and assert the
call returns a numeric (int/float) score and does not raise, and mirror
equivalent tests in the TestConsistencyScore area (the tests referenced around
the 275-297 region) so both methods skip or normalize bad values safely.
backend/tests/apps/github/management/commands/github_calculate_member_scores_test.py (1)

35-40: Strengthen command tests by asserting bulk_save(..., fields=("calculated_score",)).

Current assertions only verify that bulk_save happened, not that only calculated_score is persisted.

✅ Suggested assertion hardening
+from unittest.mock import ANY, MagicMock, patch
@@
-        mock_user_model.bulk_save.assert_called()
+        mock_user_model.bulk_save.assert_any_call(ANY, fields=("calculated_score",))
@@
         mock_user_model.objects.filter.assert_called_once_with(login="testuser")
         assert mock_user.calculated_score == 55
+        mock_user_model.bulk_save.assert_any_call(ANY, fields=("calculated_score",))

Also applies to: 63-65

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@backend/tests/apps/github/management/commands/github_calculate_member_scores_test.py`
around lines 35 - 40, The test currently only checks that
mock_user_model.bulk_save was called; strengthen it to assert the call includes
the fields tuple limiting persistence to calculated_score by asserting
mock_user_model.bulk_save.assert_called_with(ANY, fields=("calculated_score",))
(use the appropriate first-arg matcher like ANY or the expected list of users),
and make the same change for the other assertion block around lines 63-65 so
both occurrences verify fields=("calculated_score",) on
mock_user_model.bulk_save.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@backend/apps/github/services/score.py`:
- Around line 171-184: contribution_data may contain non-numeric values so
before using counts in arithmetic (the loop over contribution_data, the
recent_90/recent_180/recent_365 accumulation and the weighted_sum calculation)
validate/coerce each count: check that count is an int (or numeric string), try
to coerce via int(count) inside a try/except (or skip if None/"n/a"/invalid),
and only add to recent_90/recent_180/recent_365 when coercion succeeds; ensure
the weighted_sum calculation uses these validated integers so TypeError is
avoided.
- Around line 50-55: The calculate() method currently calls _contribution_score,
_leadership_score, _recency_score, _breadth_score, _type_diversity_score, and
_consistency_score for each user which issues separate DB queries per component;
refactor so the scoring is computed from batched/prefetched data instead: add an
alternate calculate_batch(users, prefetch_data) or change calculate to accept
precomputed aggregates (e.g., contributions_by_user, leadership_by_user,
recency_by_user, breadth_metrics, type_diversity_metrics, consistency_metrics)
and update each component method (_contribution_score, _leadership_score,
_recency_score, _breadth_score, _type_diversity_score, _consistency_score) to
read from those in-memory structures rather than hitting the DB; ensure the
batch caller builds those aggregates with a single set of annotated
queries/prefetch_related calls so per-user scoring is purely in-memory.


ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 964b5b26-c13a-4ec9-a572-4782f5890606

📥 Commits

Reviewing files that changed from the base of the PR and between ffd1ab2 and 3f6b9b5.

📒 Files selected for processing (9)
  • backend/apps/api/rest/v0/member.py
  • backend/apps/github/management/commands/github_calculate_member_scores.py
  • backend/apps/github/migrations/0045_user_calculated_score.py
  • backend/apps/github/models/user.py
  • backend/apps/github/services/score.py
  • backend/tests/apps/api/rest/v0/member_test.py
  • backend/tests/apps/github/management/commands/github_calculate_member_scores_test.py
  • backend/tests/apps/github/services/score_test.py
  • docker-compose/local/compose.yaml
🚧 Files skipped from review as they are similar to previous changes (2)
  • backend/apps/github/management/commands/github_calculate_member_scores.py
  • backend/apps/github/migrations/0045_user_calculated_score.py


@coderabbitai coderabbitai bot left a comment


♻️ Duplicate comments (1)
backend/apps/github/services/score.py (1)

50-55: ⚠️ Potential issue | 🟠 Major

Batch scoring is still needed to avoid sync-time query explosion.

calculate() fan-outs into multiple ORM-backed methods per user, which will scale poorly in full recalculation runs. Please add a batched/aggregated path so scoring is mostly in-memory per user after bulk fetch/annotation.

💡 Refactor direction
+def calculate_batch(self, users):
+    aggregates = self._build_aggregates(users)
+    return {
+        user.id: self._calculate_from_aggregates(user, aggregates)
+        for user in users
+    }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/apps/github/services/score.py` around lines 50 - 55, The calculate()
path currently calls ORM-backed helpers (_contribution_score, _leadership_score,
_recency_score, _breadth_score, _type_diversity_score, _consistency_score) per
user causing N+1 DB queries; add a batched scoring path that bulk-fetches and
annotates all needed data once (e.g., activity, PRs, reviews, timestamps, types)
then computes each of those scores in-memory per user. Implement a new method
(e.g., calculate_batch(users) or an overload of calculate that accepts preloaded
annotations) which uses a single set of aggregated queries to build in-memory
maps keyed by user id, then call the existing _*_score helpers refactored to
accept the preloaded data instead of hitting the DB. Ensure the original
per-user API still works by routing it through the batched implementation for
single-user calls to avoid regressions.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Duplicate comments:
In `@backend/apps/github/services/score.py`:
- Around line 50-55: The calculate() path currently calls ORM-backed helpers
(_contribution_score, _leadership_score, _recency_score, _breadth_score,
_type_diversity_score, _consistency_score) per user causing N+1 DB queries; add
a batched scoring path that bulk-fetches and annotates all needed data once
(e.g., activity, PRs, reviews, timestamps, types) then computes each of those
scores in-memory per user. Implement a new method (e.g., calculate_batch(users)
or an overload of calculate that accepts preloaded annotations) which uses a
single set of aggregated queries to build in-memory maps keyed by user id, then
call the existing _*_score helpers refactored to accept the preloaded data
instead of hitting the DB. Ensure the original per-user API still works by
routing it through the batched implementation for single-user calls to avoid
regressions.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 6c5659b6-0815-479b-8cab-285cd4ca41e6

📥 Commits

Reviewing files that changed from the base of the PR and between 3f6b9b5 and 55c6d1d.

📒 Files selected for processing (1)
  • backend/apps/github/services/score.py

coderabbitai[bot]
coderabbitai bot previously approved these changes Mar 13, 2026
cubic-dev-ai[bot]
cubic-dev-ai bot previously approved these changes Mar 13, 2026

@cubic-dev-ai cubic-dev-ai bot left a comment


0 issues found across 1 file (changes from recent commits).


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🧹 Nitpick comments (1)
backend/apps/github/services/score.py (1)

76-86: Clamp the final weighted total to MAX_SCORE.

Right now the 0–100 guarantee depends on the current weights continuing to sum to exactly 1.0. Since this module is meant to be tuned over time, a later weight change can silently push the public API above 100. Capping the final value here keeps that invariant local.

Suggested change
-    return round(total, 2)
+    return round(min(MAX_SCORE, total), 2)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/apps/github/services/score.py` around lines 76 - 86, After computing
total from the weighted components (the variable total), clamp it to the allowed
range before returning: replace the direct return round(total, 2) with a clamped
value such as clamped = min(MAX_SCORE, max(0, total)) and then return
round(clamped, 2); reference the existing constants (e.g., WEIGHT_CONTRIBUTIONS,
WEIGHT_LEADERSHIP, MAX_SCORE) and perform the clamp immediately after the total
calculation.
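The clamp the prompt asks for can be sketched in isolation. `MAX_SCORE` is assumed to be the module's existing 100-point cap; the helper name is hypothetical.

```python
# Hypothetical helper showing the suggested clamp: keep the public score in
# [0, MAX_SCORE] regardless of how the component weights evolve.
MAX_SCORE = 100.0


def finalize_score(total: float) -> float:
    # Clamp immediately after the weighted total so a future weight tweak
    # can never push the API value above 100 or below 0.
    clamped = min(MAX_SCORE, max(0.0, total))
    return round(clamped, 2)
```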
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@backend/apps/github/management/commands/github_calculate_member_scores.py`:
- Around line 63-75: The repo filter in github_calculate_member_scores.py is
missing the exclusion of User.get_non_indexable_logins() used in
github_update_users; update the query that builds repo_data (the
RepositoryContributor.objects.exclude(owasp_repo_filter) call and the
owasp_repo_filter definition) to also exclude contributors whose user__login is
in User.get_non_indexable_logins(), and refactor the aggregation logic (the
repo_data / repo_data_map construction) into a shared helper function so both
github_calculate_member_scores and github_update_users call the same aggregation
to prevent future drift.

---

Nitpick comments:
In `@backend/apps/github/services/score.py`:
- Around line 76-86: After computing total from the weighted components (the
variable total), clamp it to the allowed range before returning: replace the
direct return round(total, 2) with a clamped value such as clamped =
min(MAX_SCORE, max(0, total)) and then return round(clamped, 2); reference the
existing constants (e.g., WEIGHT_CONTRIBUTIONS, WEIGHT_LEADERSHIP, MAX_SCORE)
and perform the clamp immediately after the total calculation.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 004188d4-32d5-4a87-8728-85c7f0e3aade

📥 Commits

Reviewing files that changed from the base of the PR and between 55c6d1d and 24de670.

📒 Files selected for processing (6)
  • backend/apps/github/management/commands/github_calculate_member_scores.py
  • backend/apps/github/management/commands/github_update_users.py
  • backend/apps/github/services/score.py
  • backend/tests/apps/github/management/commands/github_calculate_member_scores_test.py
  • backend/tests/apps/github/management/commands/github_update_users_test.py
  • backend/tests/apps/github/services/score_test.py
🚧 Files skipped from review as they are similar to previous changes (1)
  • backend/tests/apps/github/services/score_test.py


@cubic-dev-ai cubic-dev-ai bot left a comment


2 issues found across 6 files (changes from recent commits).

Prompt for AI agents (unresolved issues)

Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="backend/tests/apps/github/services/score_test.py">

<violation number="1" location="backend/tests/apps/github/services/score_test.py:422">
P2: The `test_weighted_calculation` test in `TestCalculateMemberScore` no longer verifies the actual weighted aggregation formula. The previous test stubbed each component score and asserted the exact weighted sum, which protected against regressions. The new test only checks `score > 0.0` and that the score is rounded, which would pass even if a weight was omitted, wrong weight used, or the new `WEIGHT_RELEASES` term forgotten. Consider adding a test that uses predictable inputs and asserts the exact expected weighted sum.</violation>
</file>

<file name="backend/tests/apps/github/management/commands/github_update_users_test.py">

<violation number="1" location="backend/tests/apps/github/management/commands/github_update_users_test.py:39">
P2: Tests for the updated command mock away leadership input and don't verify the new multi-factor `calculated_score` field. The `handle()` method's core scoring logic relies on combining contributions, repo_count, and leadership_data to compute `calculated_score`, but the tests stub `_get_leadership_data` to return `{}` and only assert legacy `contributions_count`. A regression in the new scoring feature would pass these tests.</violation>
</file>

Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.


@cubic-dev-ai cubic-dev-ai bot left a comment


2 issues found across 6 files (changes from recent commits).

Prompt for AI agents (unresolved issues)

Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="backend/tests/apps/github/management/commands/github_calculate_member_scores_test.py">

<violation number="1" location="backend/tests/apps/github/management/commands/github_calculate_member_scores_test.py:92">
P2: Test weakened from `assert_called_once_with` to `assert_any_call`, losing the guarantee that only one specific user query is performed when `--user` is specified</violation>
</file>

<file name="backend/tests/apps/github/management/commands/github_update_users_test.py">

<violation number="1" location="backend/tests/apps/github/management/commands/github_update_users_test.py:311">
P2: Test `test_handle_with_leadership_data` is a false positive: it claims to verify leadership data flows into calculated_score but mocks compute_user_score to return a fixed value and never verifies the mock was called with leadership context. The test would pass even if the implementation dropped leadership_data.</violation>
</file>

Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🧹 Nitpick comments (1)
backend/apps/github/services/score.py (1)

42-58: Consider encapsulating scoring inputs into a typed object.

calculate_member_score currently has a wide argument surface (14 params), which makes call sites and future changes brittle. A dataclass/TypedDict input would simplify evolution and reduce mismatch risk.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@backend/apps/github/services/score.py` around lines 42 - 58, The function
calculate_member_score has too many parameters; refactor it to accept a single
typed input object (e.g., a dataclass like MemberScoreInput or a TypedDict) that
encapsulates contributions_count, distinct_repository_count, release_count,
chapter_leader_count, project_leader_count, committee_member_count,
is_board_member, is_gsoc_mentor, is_owasp_staff, has_pull_requests, has_issues,
has_releases, has_contributions, and contribution_data; update the function
signature to def calculate_member_score(input: MemberScoreInput) -> float (or
accept Union[MemberScoreInput, legacy kwargs] temporarily for backward
compatibility), update all callers to construct and pass the new object,
preserve existing type hints for contribution_data (dict | None), and run/adjust
tests and type checks to ensure no behavior changes.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@backend/apps/github/services/score.py`:
- Around line 113-141: Negative input counts can produce negative scores; ensure
all count inputs are clamped to zero before applying point multipliers. In
_score_leadership, wrap chapter_leader_count, project_leader_count, and
committee_member_count with max(count, 0) when computing contributions (e.g.,
use min(max(project_leader_count, 0) * POINTS_PROJECT_LEADER,
MAX_PROJECT_LEADER_POINTS)) and keep the final min(score, MAX_SCORE) cap; apply
the same clamping approach in _score_breadth for any count parameters referenced
there so neither function can return a negative value.

---

Nitpick comments:
In `@backend/apps/github/services/score.py`:
- Around line 42-58: The function calculate_member_score has too many
parameters; refactor it to accept a single typed input object (e.g., a dataclass
like MemberScoreInput or a TypedDict) that encapsulates contributions_count,
distinct_repository_count, release_count, chapter_leader_count,
project_leader_count, committee_member_count, is_board_member, is_gsoc_mentor,
is_owasp_staff, has_pull_requests, has_issues, has_releases, has_contributions,
and contribution_data; update the function signature to def
calculate_member_score(input: MemberScoreInput) -> float (or accept
Union[MemberScoreInput, legacy kwargs] temporarily for backward compatibility),
update all callers to construct and pass the new object, preserve existing type
hints for contribution_data (dict | None), and run/adjust tests and type checks
to ensure no behavior changes.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: ea9d246c-53f2-430f-91b0-9aab57bfd789

📥 Commits

Reviewing files that changed from the base of the PR and between 24de670 and 69b2954.

📒 Files selected for processing (6)
  • backend/apps/github/management/commands/github_calculate_member_scores.py
  • backend/apps/github/management/commands/github_update_users.py
  • backend/apps/github/services/score.py
  • backend/tests/apps/github/management/commands/github_calculate_member_scores_test.py
  • backend/tests/apps/github/management/commands/github_update_users_test.py
  • backend/tests/apps/github/services/score_test.py
🚧 Files skipped from review as they are similar to previous changes (2)
  • backend/apps/github/management/commands/github_calculate_member_scores.py
  • backend/apps/github/management/commands/github_update_users.py

@sonarqubecloud

@cubic-dev-ai cubic-dev-ai bot left a comment


0 issues found across 4 files (changes from recent commits).

@HarshitVerma109

@arkid15r , please review it and let me know if any changes are required.



Development

Successfully merging this pull request may close these issues.

Improve OWASP Nest member ranking algorithm with calculated scores