feat: implement multi-factor member scoring algorithm #4264
HarshitVerma109 wants to merge 6 commits into OWASP:main from
Conversation
Note: Reviews paused. It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior in the settings. Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
Walkthrough
Adds a calculated_score FloatField to the User and Member APIs, implements a multi-factor scoring service, updates indexing and ranking rules, adds commands to compute and persist scores, changes the API default ordering to calculated_score, and includes a migration and comprehensive tests across models, services, commands, and the API.
Changes
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~50 minutes
Possibly related PRs
Suggested reviewers
🚥 Pre-merge checks | ✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings.
✨ Finishing Touches
🧪 Generate unit tests (beta)
📝 Coding Plan
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
backend/tests/apps/api/rest/v0/member_test.py (1)
Line 61: ⚠️ Potential issue | 🟡 Minor: Update the stale docstring for default ordering.
The docstring still says the default is '-created_at', but the test now validates '-calculated_score'.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/tests/apps/api/rest/v0/member_test.py` at line 61, The test docstring still claims the default ordering is '-created_at' but the test asserts '-calculated_score'; update the docstring to accurately say "-calculated_score" (replace the string "'-created_at'" with "'-calculated_score'") in the test at the top of the member list test so the description matches the assertion.
🧹 Nitpick comments (3)
backend/tests/apps/github/management/commands/github_calculate_member_scores_test.py (1)
Lines 36-37: Use tolerant float assertions to avoid brittle test behavior.
This test file has exact float equality comparisons on lines 36, 37, and 63. Replace these with pytest.approx(...) for stability and to satisfy static-analysis guidance.
Proposed test adjustment:

```diff
 from unittest.mock import MagicMock, patch
+import pytest
+
 from apps.github.management.commands.github_calculate_member_scores import Command
@@
-        assert mock_user_1.calculated_score == 42.5
-        assert mock_user_2.calculated_score == 42.5
+        assert mock_user_1.calculated_score == pytest.approx(42.5)
+        assert mock_user_2.calculated_score == pytest.approx(42.5)
@@
-        assert mock_user.calculated_score == 55
+        assert mock_user.calculated_score == pytest.approx(55)
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/tests/apps/github/management/commands/github_calculate_member_scores_test.py` around lines 36 - 37, Replace brittle exact float equality assertions in the test by using pytest.approx: update the assertions that check mock_user_1.calculated_score and mock_user_2.calculated_score (and the other float assertion around line 63) to use pytest.approx(...) so they tolerate small floating-point differences, and ensure pytest is imported in github_calculate_member_scores_test.py if it isn't already; keep the rest of the test logic unchanged.
backend/tests/apps/api/rest/v0/member_test.py (1)
Line 71: Add a secondary sort key to prevent non-deterministic ordering for tied scores.
The current implementation only sorts by -calculated_score, leaving users with identical scores in undefined order across requests. Add a deterministic tiebreaker using compound ordering: order_by("-calculated_score", "-updated_at", "login") to ensure consistent pagination.
This applies to both the default ordering (line 71) and the parameterized test cases (lines 78-79), which should also validate compound ordering.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/tests/apps/api/rest/v0/member_test.py` at line 71, The test's ordering assertion uses only mock_user_model.objects.order_by.assert_called_with("-calculated_score") which allows nondeterministic ties; update the assertion(s) to expect the compound deterministic ordering order_by("-calculated_score", "-updated_at", "login") instead. Locate the assertions in member_test.py (the call to mock_user_model.objects.order_by and the parameterized test cases) and replace their expected arguments to the three-key tuple so both the default ordering assertion and the parameterized checks validate order_by("-calculated_score", "-updated_at", "login").
backend/apps/github/services/score.py (1)
Lines 99-139: Consider consolidating leadership count queries for efficiency.
The current implementation makes 3 separate filter().count() calls (lines 109-125). For a batch job, this is acceptable, but if performance becomes a concern, you could consolidate into a single annotated query.
♻️ Optional: Consolidate with annotations

```python
from django.db.models import Case, Count, IntegerField, When

counts = leader_memberships.aggregate(
    project_leaders=Count(
        Case(
            When(entity_type=project_ct, role=EntityMember.Role.LEADER, then=1),
            output_field=IntegerField(),
        )
    ),
    chapter_leaders=Count(
        Case(
            When(entity_type=chapter_ct, role=EntityMember.Role.LEADER, then=1),
            output_field=IntegerField(),
        )
    ),
    committee_members=Count(
        Case(
            When(
                entity_type=committee_ct,
                role__in=[EntityMember.Role.LEADER, EntityMember.Role.MEMBER],
                then=1,
            ),
            output_field=IntegerField(),
        )
    ),
)
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/apps/github/services/score.py` around lines 99 - 139, The three separate leader_memberships.filter(...).count() calls (project_leader_count, chapter_leader_count, committee_member_count) are inefficient; replace them with a single aggregate on leader_memberships using Count + Case/When (or conditional Sum) to compute project_leaders, chapter_leaders and committee_members in one DB query (using the existing project_ct, chapter_ct, committee_ct and EntityMember.Role checks), then use those aggregated counts to compute the three min(... * POINTS..., MAX_..._POINTS) additions to score and return min(score, MAX_SCORE).
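The Django aggregate suggested above cannot run outside a project, but its one-pass intent can be sketched in plain Python. The entity-type and role strings below are stand-ins for the ContentType objects and EntityMember.Role values the review refers to:

```python
# One pass over memberships instead of three separate filter().count() calls.
memberships = [
    {"entity_type": "project", "role": "leader"},
    {"entity_type": "chapter", "role": "leader"},
    {"entity_type": "committee", "role": "member"},
    {"entity_type": "committee", "role": "leader"},
    {"entity_type": "project", "role": "member"},  # matches no bucket below
]

counts = {"project_leaders": 0, "chapter_leaders": 0, "committee_members": 0}
for m in memberships:
    if m["entity_type"] == "project" and m["role"] == "leader":
        counts["project_leaders"] += 1
    elif m["entity_type"] == "chapter" and m["role"] == "leader":
        counts["chapter_leaders"] += 1
    elif m["entity_type"] == "committee" and m["role"] in ("leader", "member"):
        counts["committee_members"] += 1

assert counts == {"project_leaders": 1, "chapter_leaders": 1, "committee_members": 2}
```

The ORM version does the same classification inside the database, so the three counts come back from one query instead of three.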
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Outside diff comments:
In `@backend/tests/apps/api/rest/v0/member_test.py`:
- Line 61: The test docstring still claims the default ordering is '-created_at'
but the test asserts '-calculated_score'; update the docstring to accurately say
"-calculated_score" (replace the string "'-created_at'" with
"'-calculated_score'") in the test at the top of the member list test so the
description matches the assertion.
---
Nitpick comments:
In `@backend/apps/github/services/score.py`:
- Around line 99-139: The three separate leader_memberships.filter(...).count()
calls (project_leader_count, chapter_leader_count, committee_member_count) are
inefficient; replace them with a single aggregate on leader_memberships using
Count + Case/When (or conditional Sum) to compute project_leaders,
chapter_leaders and committee_members in one DB query (using the existing
project_ct, chapter_ct, committee_ct and EntityMember.Role checks), then use
those aggregated counts to compute the three min(... * POINTS...,
MAX_..._POINTS) additions to score and return min(score, MAX_SCORE).
In `@backend/tests/apps/api/rest/v0/member_test.py`:
- Line 71: The test's ordering assertion uses only
mock_user_model.objects.order_by.assert_called_with("-calculated_score") which
allows nondeterministic ties; update the assertion(s) to expect the compound
deterministic ordering order_by("-calculated_score", "-updated_at", "login")
instead. Locate the assertions in member_test.py (the call to
mock_user_model.objects.order_by and the parameterized test cases) and replace
their expected arguments to the three-key tuple so both the default ordering
assertion and the parameterized checks validate order_by("-calculated_score",
"-updated_at", "login").
In
`@backend/tests/apps/github/management/commands/github_calculate_member_scores_test.py`:
- Around line 36-37: Replace brittle exact float equality assertions in the test
by using pytest.approx: update the assertions that check
mock_user_1.calculated_score and mock_user_2.calculated_score (and the other
float assertion around line 63) to use pytest.approx(...) so they tolerate small
floating-point differences, and ensure pytest is imported in
github_calculate_member_scores_test.py if it isn't already; keep the rest of the
test logic unchanged.
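As a standalone illustration of why the reviewer recommends pytest.approx over exact float equality (toy numbers, not the project's tests):

```python
import pytest

# The classic float-representation trap: 0.1 + 0.2 is not exactly 0.3
# under IEEE-754 doubles, so exact equality is brittle.
score = 0.1 + 0.2

assert score != 0.3                 # exact comparison fails: score is 0.30000000000000004
assert score == pytest.approx(0.3)  # approx compares within a small relative tolerance
```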
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 55eb742c-18a6-406d-8134-783545e37158
📒 Files selected for processing (14)
- backend/apps/api/rest/v0/member.py
- backend/apps/github/index/registry/user.py
- backend/apps/github/management/commands/github_calculate_member_scores.py
- backend/apps/github/migrations/0045_user_calculated_score.py
- backend/apps/github/models/mixins/user.py
- backend/apps/github/models/user.py
- backend/apps/github/services/__init__.py
- backend/apps/github/services/score.py
- backend/tests/apps/api/rest/v0/member_test.py
- backend/tests/apps/github/management/commands/github_calculate_member_scores_test.py
- backend/tests/apps/github/models/mixins/user_test.py
- backend/tests/apps/github/models/user_test.py
- backend/tests/apps/github/services/__init__.py
- backend/tests/apps/github/services/score_test.py
6 issues found across 14 files
Confidence score: 2/5
- There is a high-confidence, high-impact issue in backend/apps/github/management/commands/github_calculate_member_scores.py: updated_users is never cleared after bulk_save(), which can cause unbounded memory growth and heavy write amplification during score recalculation runs.
- backend/apps/api/rest/v0/member.py has non-deterministic pagination when ordering only by calculated_score; tied scores can shift between pages, creating concrete user-facing duplication/missing-member behavior across requests.
- backend/apps/github/services/score.py currently swallows all exceptions around user.owasp_profile, which can hide real data/access failures and make scoring errors hard to detect; fixing these items should significantly reduce merge risk.
- Pay close attention to backend/apps/github/management/commands/github_calculate_member_scores.py, backend/apps/api/rest/v0/member.py, and backend/apps/github/services/score.py: memory/performance amplification, pagination consistency, and masked runtime failures are the key risks.
Prompt for AI agents (unresolved issues)
Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.
<file name="backend/apps/github/services/score.py">
<violation number="1" location="backend/apps/github/services/score.py:133">
P2: Broad exception handling silently catches all exceptions when accessing `user.owasp_profile`, hiding real profile-access failures like database errors, misconfigured relations, or attribute errors. Instead of scoring users correctly, the code silently returns a lower score when ANY exception occurs, making production data issues hard to detect. Catch only the specific `MemberProfile.DoesNotExist` exception for users without profiles.</violation>
</file>
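The narrowing this violation asks for can be sketched with a stand-in exception class; MemberProfile.DoesNotExist is Django-specific and assumed from the review, not shown here:

```python
# Stand-in for Django's Model.DoesNotExist, to keep the sketch self-contained.
class ProfileDoesNotExist(Exception):
    pass

def get_profile(user):
    """Toy lookup that mimics a one-to-one relation access."""
    if user.get("profile") is None:
        raise ProfileDoesNotExist
    return user["profile"]

def leadership_bonus(user):
    # Catch only the "no profile" case; let real failures (DB errors,
    # attribute errors) propagate instead of silently scoring lower.
    try:
        profile = get_profile(user)
    except ProfileDoesNotExist:
        return 0.0
    return 5.0 if profile.get("is_leader") else 0.0

assert leadership_bonus({"profile": None}) == 0.0
assert leadership_bonus({"profile": {"is_leader": True}}) == 5.0
```

A bare `except Exception` in place of `except ProfileDoesNotExist` would also swallow the propagating failures the reviewer wants surfaced.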
<file name="backend/apps/github/management/commands/github_calculate_member_scores.py">
<violation number="1" location="backend/apps/github/management/commands/github_calculate_member_scores.py:50">
P2: The filter `contributions_count__gt=0` excludes users who may have meaningful scores from leadership roles. Users with zero contributions but leadership positions (project/chapter leaders, board members) will never be recalculated. Additionally, previously scored users whose contribution count drops to 0 will retain stale scores indefinitely.</violation>
<violation number="2" location="backend/apps/github/management/commands/github_calculate_member_scores.py:61">
P1: Batch buffer `updated_users` is never cleared after `bulk_save()`, causing quadratic write amplification and unbounded memory growth. After each checkpoint save, add `updated_users = []` to reset the buffer.</violation>
</file>
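Violation 2 describes the standard batched-flush pattern; a minimal sketch with an invented bulk_save callback shows the reset the review says is missing:

```python
def recalc_scores(users, bulk_save, batch_size=1000):
    """Score users and flush in batches; bulk_save is an injected callable."""
    updated = []
    for user in users:
        user["calculated_score"] = 1.0  # stand-in for real scoring
        updated.append(user)
        if len(updated) >= batch_size:
            bulk_save(list(updated))
            # The missing reset: without it, every later checkpoint re-saves
            # all earlier users too and the buffer grows without bound.
            updated = []
    if updated:
        bulk_save(list(updated))

saved_batches = []
recalc_scores([{"id": i} for i in range(5)], saved_batches.append, batch_size=2)
# Batches of 2, 2, and 1: each user is saved exactly once.
assert [len(batch) for batch in saved_batches] == [2, 2, 1]
```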
<file name="backend/tests/apps/github/management/commands/github_calculate_member_scores_test.py">
<violation number="1" location="backend/tests/apps/github/management/commands/github_calculate_member_scores_test.py:77">
P2: The test for 'user not found' scenario is incomplete - it only verifies that bulk_save was not called but fails to assert that the error message was written to stdout. The test should verify the observable error output to ensure proper error signaling behavior.</violation>
</file>
<file name="backend/apps/api/rest/v0/member.py">
<violation number="1" location="backend/apps/api/rest/v0/member.py:94">
P1: Default pagination ordering by `calculated_score` lacks a secondary sort key, making pagination non-deterministic. Members with identical scores can appear in different orders across paginated requests, causing duplicates or skipped items during pagination. Add a secondary sort key (e.g., `id` or `created_at`) to ensure stable ordering.</violation>
</file>
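The pagination-stability point can be illustrated without Django: a compound sort key is what makes ties deterministic. Field names mirror the review; the data is invented:

```python
# Sorting by score alone leaves tied users in arbitrary input order;
# a compound key makes the ranking reproducible across requests/pages.
users = [
    {"login": "carol", "calculated_score": 80.0, "updated_at": 3},
    {"login": "alice", "calculated_score": 90.0, "updated_at": 5},
    {"login": "bob",   "calculated_score": 90.0, "updated_at": 5},
]

# In-memory equivalent of order_by("-calculated_score", "-updated_at", "login"):
ranked = sorted(
    users,
    key=lambda u: (-u["calculated_score"], -u["updated_at"], u["login"]),
)

# alice and bob tie on score and recency; the login tiebreaker decides.
assert [u["login"] for u in ranked] == ["alice", "bob", "carol"]
```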
<file name="backend/tests/apps/github/services/score_test.py">
<violation number="1" location="backend/tests/apps/github/services/score_test.py:84">
P2: Test `test_project_leader` uses a single mock object for all ContentType lookups, making all entity types (Project, Chapter, Committee) indistinguishable. The side_effect returns count=1 for ANY entity_type matching the mock, so it cannot verify the actual scoring logic correctly counts only project leaders. Combined with the weak assertion `score >= 0`, this test would pass even if the implementation incorrectly counted all entity types or returned wrong per-role weights.</violation>
</file>
Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.
1 issue found across 9 files (changes from recent commits).
Prompt for AI agents (unresolved issues)
Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.
<file name="docker-compose/local/compose.yaml">
<violation number="1" location="docker-compose/local/compose.yaml:70">
P1: Renaming the Postgres volume `db-data` to `db-data-v2` silently resets every developer's local database and orphans existing data. This change appears unrelated to the multi-factor member scoring algorithm feature described in the PR title. Developers will lose all local data without warning when they next run `make run`.</violation>
</file>
Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.
Actionable comments posted: 2
🧹 Nitpick comments (3)
docker-compose/local/compose.yaml (1)
Lines 70, 152: Avoid a one-off DB volume rename without a consistent isolation strategy.
Line 70 and Line 152 rename db-data to db-data-v2, which will silently create a new Postgres volume and leave prior local data orphaned. If the goal is branch isolation, apply a consistent suffix strategy to all named volumes in this file; otherwise keep the canonical db-data name and document the reset explicitly in dev notes/PR description.
Based on learnings: In the OWASP/Nest repository, feature branches use unique volume name suffixes in docker-compose files to prevent volume clashes across parallel development efforts.
Also applies to: 152-152
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@docker-compose/local/compose.yaml` at line 70, The docker-compose change renames the Postgres volume from db-data to db-data-v2 which will create a new volume and orphan existing data; either revert the service volume back to the canonical db-data or, if branch isolation is intended, apply a consistent suffix strategy across all named volumes in this compose file (e.g., rename every occurrence of db-data and related named volumes to use the same suffix) and update the volumes: definitions accordingly and document the reset in the PR notes; look for the db-data/db-data-v2 string in the compose.yaml (service volume mounts and volumes: section) to make the change.
backend/tests/apps/github/services/score_test.py (1)
Lines 165-191: Add malformed contribution_data regression tests.
Please add cases where counts are non-numeric (None, "x", "3") to ensure _recency_score and _consistency_score do not raise and safely skip/normalize bad values.
Also applies to: 275-297
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/tests/apps/github/services/score_test.py` around lines 165 - 191, Add regression tests that supply malformed contribution_data entries (e.g., values None, "x", "3") to ensure calculator._recency_score and calculator._consistency_score tolerate non-numeric counts without raising; create test cases in the existing TestRecencyScore block that set mock_user.contribution_data to dicts containing these bad values and assert the call returns a numeric (int/float) score and does not raise, and mirror equivalent tests in the TestConsistencyScore area (the tests referenced around the 275-297 region) so both methods skip or normalize bad values safely.
backend/tests/apps/github/management/commands/github_calculate_member_scores_test.py (1)
Lines 35-40: Strengthen command tests by asserting bulk_save(..., fields=("calculated_score",)).
Current assertions only verify that bulk_save happened, not that only calculated_score is persisted.
✅ Suggested assertion hardening

```diff
+from unittest.mock import ANY, MagicMock, patch
@@
-        mock_user_model.bulk_save.assert_called()
+        mock_user_model.bulk_save.assert_any_call(ANY, fields=("calculated_score",))
@@
         mock_user_model.objects.filter.assert_called_once_with(login="testuser")
         assert mock_user.calculated_score == 55
+        mock_user_model.bulk_save.assert_any_call(ANY, fields=("calculated_score",))
```

Also applies to: 63-65
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/tests/apps/github/management/commands/github_calculate_member_scores_test.py` around lines 35 - 40, The test currently only checks that mock_user_model.bulk_save was called; strengthen it to assert the call includes the fields tuple limiting persistence to calculated_score by asserting mock_user_model.bulk_save.assert_called_with(ANY, fields=("calculated_score",)) (use the appropriate first-arg matcher like ANY or the expected list of users), and make the same change for the other assertion block around lines 63-65 so both occurrences verify fields=("calculated_score",) on mock_user_model.bulk_save.
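For reference, the ANY matcher used in the suggested hardening behaves like this (toy mock, unrelated to the project's fixtures):

```python
from unittest.mock import ANY, MagicMock

bulk_save = MagicMock()
users = [object(), object()]
bulk_save(users, fields=("calculated_score",))

# assert_called() only proves that *some* call happened;
# assert_any_call pins the keyword arguments while ANY wildcards
# the positional argument we don't care about.
bulk_save.assert_called()
bulk_save.assert_any_call(ANY, fields=("calculated_score",))
```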
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@backend/apps/github/services/score.py`:
- Around line 171-184: contribution_data may contain non-numeric values so
before using counts in arithmetic (the loop over contribution_data, the
recent_90/recent_180/recent_365 accumulation and the weighted_sum calculation)
validate/coerce each count: check that count is an int (or numeric string), try
to coerce via int(count) inside a try/except (or skip if None/"n/a"/invalid),
and only add to recent_90/recent_180/recent_365 when coercion succeeds; ensure
the weighted_sum calculation uses these validated integers so TypeError is
avoided.
- Around line 50-55: The calculate() method currently calls _contribution_score,
_leadership_score, _recency_score, _breadth_score, _type_diversity_score, and
_consistency_score for each user which issues separate DB queries per component;
refactor so the scoring is computed from batched/prefetched data instead: add an
alternate calculate_batch(users, prefetch_data) or change calculate to accept
precomputed aggregates (e.g., contributions_by_user, leadership_by_user,
recency_by_user, breadth_metrics, type_diversity_metrics, consistency_metrics)
and update each component method (_contribution_score, _leadership_score,
_recency_score, _breadth_score, _type_diversity_score, _consistency_score) to
read from those in-memory structures rather than hitting the DB; ensure the
batch caller builds those aggregates with a single set of annotated
queries/prefetch_related calls so per-user scoring is purely in-memory.
---
Nitpick comments:
In
`@backend/tests/apps/github/management/commands/github_calculate_member_scores_test.py`:
- Around line 35-40: The test currently only checks that
mock_user_model.bulk_save was called; strengthen it to assert the call includes
the fields tuple limiting persistence to calculated_score by asserting
mock_user_model.bulk_save.assert_called_with(ANY, fields=("calculated_score",))
(use the appropriate first-arg matcher like ANY or the expected list of users),
and make the same change for the other assertion block around lines 63-65 so
both occurrences verify fields=("calculated_score",) on
mock_user_model.bulk_save.
In `@backend/tests/apps/github/services/score_test.py`:
- Around line 165-191: Add regression tests that supply malformed
contribution_data entries (e.g., values None, "x", "3") to ensure
calculator._recency_score and calculator._consistency_score tolerate non-numeric
counts without raising; create test cases in the existing TestRecencyScore block
that set mock_user.contribution_data to dicts containing these bad values and
assert the call returns a numeric (int/float) score and does not raise, and
mirror equivalent tests in the TestConsistencyScore area (the tests referenced
around the 275-297 region) so both methods skip or normalize bad values safely.
In `@docker-compose/local/compose.yaml`:
- Line 70: The docker-compose change renames the Postgres volume from db-data to
db-data-v2 which will create a new volume and orphan existing data; either
revert the service volume back to the canonical db-data or, if branch isolation
is intended, apply a consistent suffix strategy across all named volumes in this
compose file (e.g., rename every occurrence of db-data and related named volumes
to use the same suffix) and update the volumes: definitions accordingly and
document the reset in the PR notes; look for the db-data/db-data-v2 string in
the compose.yaml (service volume mounts and volumes: section) to make the
change.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 964b5b26-c13a-4ec9-a572-4782f5890606
📒 Files selected for processing (9)
- backend/apps/api/rest/v0/member.py
- backend/apps/github/management/commands/github_calculate_member_scores.py
- backend/apps/github/migrations/0045_user_calculated_score.py
- backend/apps/github/models/user.py
- backend/apps/github/services/score.py
- backend/tests/apps/api/rest/v0/member_test.py
- backend/tests/apps/github/management/commands/github_calculate_member_scores_test.py
- backend/tests/apps/github/services/score_test.py
- docker-compose/local/compose.yaml
🚧 Files skipped from review as they are similar to previous changes (2)
- backend/apps/github/management/commands/github_calculate_member_scores.py
- backend/apps/github/migrations/0045_user_calculated_score.py
♻️ Duplicate comments (1)
backend/apps/github/services/score.py (1)
Lines 50-55: ⚠️ Potential issue | 🟠 Major: Batch scoring is still needed to avoid sync-time query explosion.
calculate() fans out into multiple ORM-backed methods per user, which will scale poorly in full recalculation runs. Please add a batched/aggregated path so scoring is mostly in-memory per user after bulk fetch/annotation.
💡 Refactor direction

```diff
+def calculate_batch(self, users):
+    aggregates = self._build_aggregates(users)
+    return {
+        user.id: self._calculate_from_aggregates(user, aggregates)
+        for user in users
+    }
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/apps/github/services/score.py` around lines 50 - 55, The calculate() path currently calls ORM-backed helpers (_contribution_score, _leadership_score, _recency_score, _breadth_score, _type_diversity_score, _consistency_score) per user causing N+1 DB queries; add a batched scoring path that bulk-fetches and annotates all needed data once (e.g., activity, PRs, reviews, timestamps, types) then computes each of those scores in-memory per user. Implement a new method (e.g., calculate_batch(users) or an overload of calculate that accepts preloaded annotations) which uses a single set of aggregated queries to build in-memory maps keyed by user id, then call the existing _*_score helpers refactored to accept the preloaded data instead of hitting the DB. Ensure the original per-user API still works by routing it through the batched implementation for single-user calls to avoid regressions.
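The batched direction can be sketched generically: build per-user aggregates once, then score each user from in-memory maps. Names and weights below are invented, not the service's real API:

```python
def build_aggregates(events):
    """One pass over (user_id, count) events, standing in for one annotated query."""
    totals = {}
    for user_id, count in events:
        totals[user_id] = totals.get(user_id, 0) + count
    return totals

def score_from_aggregates(user_id, totals):
    # Purely in-memory per-user scoring: no database access here.
    return min(100.0, totals.get(user_id, 0) * 0.5)

events = [(1, 40), (2, 300), (1, 10)]
totals = build_aggregates(events)  # fetched once for the whole batch
scores = {uid: score_from_aggregates(uid, totals) for uid in (1, 2, 3)}
assert scores == {1: 25.0, 2: 100.0, 3: 0.0}
```

The single-user path can route through the same helpers with a one-element batch, which is the regression-safety point the review makes.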
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Duplicate comments:
In `@backend/apps/github/services/score.py`:
- Around line 50-55: The calculate() path currently calls ORM-backed helpers
(_contribution_score, _leadership_score, _recency_score, _breadth_score,
_type_diversity_score, _consistency_score) per user causing N+1 DB queries; add
a batched scoring path that bulk-fetches and annotates all needed data once
(e.g., activity, PRs, reviews, timestamps, types) then computes each of those
scores in-memory per user. Implement a new method (e.g., calculate_batch(users)
or an overload of calculate that accepts preloaded annotations) which uses a
single set of aggregated queries to build in-memory maps keyed by user id, then
call the existing _*_score helpers refactored to accept the preloaded data
instead of hitting the DB. Ensure the original per-user API still works by
routing it through the batched implementation for single-user calls to avoid
regressions.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 6c5659b6-0815-479b-8cab-285cd4ca41e6
📒 Files selected for processing (1)
backend/apps/github/services/score.py
24de670
Actionable comments posted: 1
🧹 Nitpick comments (1)
backend/apps/github/services/score.py (1)
Lines 76-86: Clamp the final weighted total to MAX_SCORE.
Right now the 0–100 guarantee depends on the current weights continuing to sum to exactly 1.0. Since this module is meant to be tuned over time, a later weight change can silently push the public API above 100. Capping the final value here keeps that invariant local.
Suggested change

```diff
-    return round(total, 2)
+    return round(min(MAX_SCORE, total), 2)
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/apps/github/services/score.py` around lines 76 - 86, After computing total from the weighted components (the variable total), clamp it to the allowed range before returning: replace the direct return round(total, 2) with a clamped value such as clamped = min(MAX_SCORE, max(0, total)) and then return round(clamped, 2); reference the existing constants (e.g., WEIGHT_CONTRIBUTIONS, WEIGHT_LEADERSHIP, MAX_SCORE) and perform the clamp immediately after the total calculation.
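The clamp's effect is easy to demonstrate with weights that no longer sum to 1.0 (all numbers invented):

```python
MAX_SCORE = 100.0

def total_score(components, weights):
    total = sum(w * c for w, c in zip(weights, components))
    # Clamp so the public API stays within [0, MAX_SCORE] even if the
    # weights are later tuned and no longer sum to exactly 1.0.
    return round(min(MAX_SCORE, max(0.0, total)), 2)

# Weights summing to 1.1 would push an unclamped total to 110.0.
assert total_score([100.0, 100.0], [0.6, 0.5]) == 100.0
assert total_score([50.0, 80.0], [0.5, 0.5]) == 65.0
```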
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@backend/apps/github/management/commands/github_calculate_member_scores.py`:
- Around line 63-75: The repo filter in github_calculate_member_scores.py is
missing the exclusion of User.get_non_indexable_logins() used in
github_update_users; update the query that builds repo_data (the
RepositoryContributor.objects.exclude(owasp_repo_filter) call and the
owasp_repo_filter definition) to also exclude contributors whose user__login is
in User.get_non_indexable_logins(), and refactor the aggregation logic (the
repo_data / repo_data_map construction) into a shared helper function so both
github_calculate_member_scores and github_update_users call the same aggregation
to prevent future drift.
---
Nitpick comments:
In `@backend/apps/github/services/score.py`:
- Around line 76-86: After computing total from the weighted components (the
variable total), clamp it to the allowed range before returning: replace the
direct return round(total, 2) with a clamped value such as clamped =
min(MAX_SCORE, max(0, total)) and then return round(clamped, 2); reference the
existing constants (e.g., WEIGHT_CONTRIBUTIONS, WEIGHT_LEADERSHIP, MAX_SCORE)
and perform the clamp immediately after the total calculation.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 004188d4-32d5-4a87-8728-85c7f0e3aade
📒 Files selected for processing (6)
- backend/apps/github/management/commands/github_calculate_member_scores.py
- backend/apps/github/management/commands/github_update_users.py
- backend/apps/github/services/score.py
- backend/tests/apps/github/management/commands/github_calculate_member_scores_test.py
- backend/tests/apps/github/management/commands/github_update_users_test.py
- backend/tests/apps/github/services/score_test.py
🚧 Files skipped from review as they are similar to previous changes (1)
- backend/tests/apps/github/services/score_test.py
2 issues found across 6 files (changes from recent commits).
Prompt for AI agents (unresolved issues)
Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.
<file name="backend/tests/apps/github/services/score_test.py">
<violation number="1" location="backend/tests/apps/github/services/score_test.py:422">
P2: The `test_weighted_calculation` test in `TestCalculateMemberScore` no longer verifies the actual weighted aggregation formula. The previous test stubbed each component score and asserted the exact weighted sum, which protected against regressions. The new test only checks `score > 0.0` and that the score is rounded, which would pass even if a weight was omitted, wrong weight used, or the new `WEIGHT_RELEASES` term forgotten. Consider adding a test that uses predictable inputs and asserts the exact expected weighted sum.</violation>
</file>
<file name="backend/tests/apps/github/management/commands/github_update_users_test.py">
<violation number="1" location="backend/tests/apps/github/management/commands/github_update_users_test.py:39">
P2: Tests for the updated command mock away leadership input and don't verify the new multi-factor `calculated_score` field. The `handle()` method's core scoring logic relies on combining contributions, repo_count, and leadership_data to compute `calculated_score`, but the tests stub `_get_leadership_data` to return `{}` and only assert legacy `contributions_count`. A regression in the new scoring feature would pass these tests.</violation>
</file>
Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.
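A stricter variant of that weighted-sum test can stub each component to a known value and assert the exact aggregate. A minimal sketch — the weight constants and function below are stand-ins, not the service's actual names:

```python
# Hypothetical weights -- the real constants live in
# backend/apps/github/services/score.py and may differ.
WEIGHT_VOLUME = 0.4
WEIGHT_RECENCY = 0.3
WEIGHT_RELEASES = 0.3


def weighted_score(volume: float, recency: float, releases: float) -> float:
    """Stand-in for the weighted aggregation under test."""
    return round(
        volume * WEIGHT_VOLUME + recency * WEIGHT_RECENCY + releases * WEIGHT_RELEASES,
        2,
    )


def test_weighted_calculation_exact():
    # Pin the exact weighted sum: a dropped term or a wrong weight now
    # fails the test, unlike a bare `score > 0.0` check.
    assert weighted_score(10.0, 20.0, 30.0) == 19.0
```

With predictable inputs like these, forgetting the `WEIGHT_RELEASES` term changes the expected value and the assertion fails.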
2 issues found across 6 files (changes from recent commits).
Prompt for AI agents (unresolved issues)
Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.
<file name="backend/tests/apps/github/management/commands/github_calculate_member_scores_test.py">
<violation number="1" location="backend/tests/apps/github/management/commands/github_calculate_member_scores_test.py:92">
P2: Test weakened from `assert_called_once_with` to `assert_any_call`, losing the guarantee that only one specific user query is performed when `--user` is specified</violation>
</file>
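The difference matters in practice: `assert_any_call` only checks that the call appears somewhere in the call list, while `assert_called_once_with` also pins the total call count to one. A minimal sketch with a hypothetical query helper:

```python
from unittest import mock


def fetch_user(qs, login):
    # Hypothetical command helper: should issue exactly one filtered query.
    return qs.filter(login=login)


def test_single_user_lookup_is_exact():
    qs = mock.Mock()
    fetch_user(qs, "alice")
    # assert_any_call would still pass if extra filter() calls sneaked in;
    # assert_called_once_with also guarantees there was exactly one.
    qs.filter.assert_called_once_with(login="alice")
```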
<file name="backend/tests/apps/github/management/commands/github_update_users_test.py">
<violation number="1" location="backend/tests/apps/github/management/commands/github_update_users_test.py:311">
P2: Test `test_handle_with_leadership_data` is a false positive: it claims to verify leadership data flows into calculated_score but mocks compute_user_score to return a fixed value and never verifies the mock was called with leadership context. The test would pass even if the implementation dropped leadership_data.</violation>
</file>
Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.
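One way to close that gap is to assert on the scorer's call arguments rather than only on the stored return value. A sketch with hypothetical names (`sync_user` and the `compute_user_score` signature are assumptions, not the command's real API):

```python
from unittest import mock


def sync_user(user, leadership_data, compute_user_score):
    # Hypothetical stand-in for the command path under test: the scorer
    # must receive the leadership context, not just contribution counts.
    user["calculated_score"] = compute_user_score(
        contributions=user["contributions_count"],
        leadership_data=leadership_data,
    )


def test_leadership_data_reaches_scorer():
    scorer = mock.Mock(return_value=42.0)
    leadership = {"chapter_leader_count": 2}
    user = {"contributions_count": 10}
    sync_user(user, leadership, scorer)
    # Asserting on the call arguments means silently dropping
    # leadership_data in the implementation fails the test.
    scorer.assert_called_once_with(contributions=10, leadership_data=leadership)
    assert user["calculated_score"] == 42.0
```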
Actionable comments posted: 1
🧹 Nitpick comments (1)
backend/apps/github/services/score.py (1)
42-58: Consider encapsulating scoring inputs into a typed object.
calculate_member_score currently has a wide argument surface (14 params), which makes call sites and future changes brittle. A dataclass/TypedDict input would simplify evolution and reduce mismatch risk.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@backend/apps/github/services/score.py` around lines 42 - 58, The function calculate_member_score has too many parameters; refactor it to accept a single typed input object (e.g., a dataclass like MemberScoreInput or a TypedDict) that encapsulates contributions_count, distinct_repository_count, release_count, chapter_leader_count, project_leader_count, committee_member_count, is_board_member, is_gsoc_mentor, is_owasp_staff, has_pull_requests, has_issues, has_releases, has_contributions, and contribution_data; update the function signature to def calculate_member_score(input: MemberScoreInput) -> float (or accept Union[MemberScoreInput, legacy kwargs] temporarily for backward compatibility), update all callers to construct and pass the new object, preserve existing type hints for contribution_data (dict | None), and run/adjust tests and type checks to ensure no behavior changes.
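A sketch of what that refactor could look like — the field names mirror the prompt above, while the aggregation body is a placeholder, not the service's real weighting:

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class MemberScoreInput:
    """Bundled scoring inputs (field names assumed from the review prompt)."""

    contributions_count: int = 0
    distinct_repository_count: int = 0
    release_count: int = 0
    chapter_leader_count: int = 0
    project_leader_count: int = 0
    committee_member_count: int = 0
    is_board_member: bool = False
    is_gsoc_mentor: bool = False
    is_owasp_staff: bool = False
    has_pull_requests: bool = False
    has_issues: bool = False
    has_releases: bool = False
    has_contributions: bool = False
    contribution_data: dict | None = None


def calculate_member_score(data: MemberScoreInput) -> float:
    """Placeholder aggregation; the real weighting lives in the service."""
    base = data.contributions_count + 2 * data.release_count
    if data.is_owasp_staff:
        base += 10
    return round(float(base), 2)
```

Call sites then construct one object (`calculate_member_score(MemberScoreInput(contributions_count=3))`), so adding a field later doesn't ripple through every caller's positional arguments.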
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@backend/apps/github/services/score.py`:
- Around line 113-141: Negative input counts can produce negative scores; ensure
all count inputs are clamped to zero before applying point multipliers. In
_score_leadership, wrap chapter_leader_count, project_leader_count, and
committee_member_count with max(count, 0) when computing contributions (e.g.,
use min(max(project_leader_count, 0) * POINTS_PROJECT_LEADER,
MAX_PROJECT_LEADER_POINTS)) and keep the final min(score, MAX_SCORE) cap; apply
the same clamping approach in _score_breadth for any count parameters referenced
there so neither function can return a negative value.
---
Nitpick comments:
In `@backend/apps/github/services/score.py`:
- Around line 42-58: The function calculate_member_score has too many
parameters; refactor it to accept a single typed input object (e.g., a dataclass
like MemberScoreInput or a TypedDict) that encapsulates contributions_count,
distinct_repository_count, release_count, chapter_leader_count,
project_leader_count, committee_member_count, is_board_member, is_gsoc_mentor,
is_owasp_staff, has_pull_requests, has_issues, has_releases, has_contributions,
and contribution_data; update the function signature to def
calculate_member_score(input: MemberScoreInput) -> float (or accept
Union[MemberScoreInput, legacy kwargs] temporarily for backward compatibility),
update all callers to construct and pass the new object, preserve existing type
hints for contribution_data (dict | None), and run/adjust tests and type checks
to ensure no behavior changes.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: ea9d246c-53f2-430f-91b0-9aab57bfd789
📒 Files selected for processing (6)
- backend/apps/github/management/commands/github_calculate_member_scores.py
- backend/apps/github/management/commands/github_update_users.py
- backend/apps/github/services/score.py
- backend/tests/apps/github/management/commands/github_calculate_member_scores_test.py
- backend/tests/apps/github/management/commands/github_update_users_test.py
- backend/tests/apps/github/services/score_test.py
🚧 Files skipped from review as they are similar to previous changes (2)
- backend/apps/github/management/commands/github_calculate_member_scores.py
- backend/apps/github/management/commands/github_update_users.py
@arkid15r, please review it and let me know if any changes are required.



Proposed change
Resolves #4200
Introduces a multi-factor calculated_score for members that accounts for contribution volume, recency, breadth, type diversity, and leadership roles. Updates the API default ordering and Algolia index to use this new metric for more balanced member ranking.

Checklist
make check-test locally: all warnings addressed, tests passed