question detection refining #2443

Merged
arkid15r merged 3 commits into OWASP:feature/nestbot-ai-assistant from
Dishant1804:question_detector_refinement
Oct 22, 2025

Conversation

@Dishant1804
Contributor

Proposed change

Resolves #2309

  • implemented scalable solution to detect OWASP related questions

Checklist

  • I've read and followed the contributing guidelines.
  • I've run make check-test locally; all checks and tests passed.

@coderabbitai
Contributor

coderabbitai bot commented Oct 18, 2025

Summary by CodeRabbit

  • Improvements
    • Enhanced OWASP question detection to use intelligent context analysis instead of keyword-based matching, improving accuracy and relevance of question identification in Slack interactions.
    • Implemented retrieval-based context integration to provide better understanding of question context.
    • Improved error handling for detection failures with more reliable fallback behavior.

Walkthrough

Replaces keyword-based OWASP detection with a retrieval-augmented OpenAI classification flow: adds retriever usage, context formatting, prompt retrieval, and OpenAI error handling; introduces CHUNKS_RETRIEVAL_LIMIT, changes model/tokens, and updates tests to mock retriever/OpenAI paths.
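The flow described in the walkthrough can be sketched end-to-end. This is a simplified illustration rather than the PR's actual code: the retriever and the OpenAI classifier are stubbed as plain callables, and only the method names (`is_owasp_question`, `format_context_chunks`) and `CHUNKS_RETRIEVAL_LIMIT` come from the changeset.

```python
# Simplified sketch of the retrieval-augmented detection flow (illustrative only).
CHUNKS_RETRIEVAL_LIMIT = 10  # constant introduced by this PR


def format_context_chunks(context_chunks):
    """Render retrieved chunks into a prompt-friendly text block."""
    if not context_chunks:
        return "No context provided"
    blocks = []
    for i, chunk in enumerate(context_chunks):
        source = chunk.get("source_name", f"Unknown Source {i + 1}")
        blocks.append(f"Source Name: {source}\nContent: {chunk.get('text', '')}")
    return "\n\n".join(blocks)


def is_owasp_question(text, retrieve, classify):
    """Retrieve context for the text, then ask an LLM classifier for a verdict."""
    if not text or not text.strip():
        return False
    chunks = retrieve(query=text, limit=CHUNKS_RETRIEVAL_LIMIT)
    verdict = classify(text, format_context_chunks(chunks))  # True / False / None
    return bool(verdict)  # None (API or parse failure) falls back to False
```

In the real module, `retrieve` is `Retriever.retrieve()` called with a similarity threshold, and `classify` is the OpenAI call wrapped by `is_owasp_question_with_openai()`.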

Changes

Cohort / File(s) — Summary

Question Detector Implementation
backend/apps/slack/common/question_detector.py
Replaces keyword-only detection with a retrieval + OpenAI flow. Adds is_owasp_question_with_openai() and format_context_chunks(). Introduces CHUNKS_RETRIEVAL_LIMIT = 10. Changes CHAT_MODEL "gpt-4o" → "gpt-4o-mini" and MAX_TOKENS 50 → 10. Adds retriever usage, context formatting, prompt retrieval, and OpenAI error handling; removes is_question(), contains_owasp_keywords(), and the OWASP keyword state.

Question Detector Tests
backend/tests/apps/slack/common/question_detector_test.py
Adds a mocked retriever and a sample_context_chunks fixture; updates tests to mock is_owasp_question_with_openai() / OpenAI responses and to expect a prompt containing {context}. Adds tests for format_context_chunks() and updates constant expectations to match the implementation changes.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

  • Response improvements and refactoring #2407: Modifies the same question detector module and replaces system prompt usage with Prompt.get_slack_question_detector_prompt, indicating shared prompt and retrieval-related changes.

Suggested labels

backend

Suggested reviewers

  • kasya
  • arkid15r

Pre-merge checks and finishing touches

❌ Failed checks (1 inconclusive)
Title Check — ❓ Inconclusive
Explanation: The PR title "question detection refining" is related to the changeset's focus on OWASP question detection, and it accurately reflects that the PR modifies the question detection logic. However, the title is vague and generic; it does not convey the core technical change, which is the shift from keyword-based detection to a retrieval-based, context-aware, OpenAI-driven approach. A reader scanning the repository history would not understand from the title alone that the solution now uses retrieval patterns and LLM-based classification rather than simple keyword matching.
Resolution: Consider a more specific, descriptive title, such as "Replace keyword-based OWASP question detection with retrieval-based OpenAI approach" or "Implement retrieval-augmented OWASP question detection using context and OpenAI", so the purpose of the PR is immediately clear to reviewers and future maintainers.
✅ Passed checks (3 passed)
Out of Scope Changes Check — ✅ Passed
All code changes in this PR are directly scoped to implementing the OWASP question detection solution specified in issue #2309. The modifications are confined to backend/apps/slack/common/question_detector.py and its corresponding test file, all focused on replacing the keyword-based detection mechanism with a retrieval-augmented, OpenAI-driven approach. No unrelated features, documentation changes, or modifications to other modules are present in the changeset.

Description Check — ✅ Passed
The PR description directly relates to the changeset by explicitly stating "implemented scalable solution to detect OWASP related questions" and linking to issue #2309. It accurately captures the primary objective of the PR, which aligns with the code changes that replace simple keyword matching with retrieval-based, context-aware detection. The contributor also confirms following the contributing guidelines and running local tests successfully.

Docstring Coverage — ✅ Passed
Docstring coverage is 100.00%, above the required 80.00% threshold.


@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
backend/apps/slack/common/question_detector.py (1)

121-128: Fix fragile YES/NO substring matching (“ENOUGH” contains “NO”)

Substring checks can misclassify. Parse the first token or exact whole-word answer.

Apply this diff:

-            clean_answer = answer.strip().upper()
-
-            if "YES" in clean_answer:
-                return True
-            if "NO" in clean_answer:
-                return False
+            clean_answer = answer.strip()
+            if not clean_answer:
+                logger.error("OpenAI returned blank content")
+                return None
+            first = clean_answer.strip().split()[0].strip('.,!?:;"\'()[]{}').upper()
+            if first == "YES":
+                return True
+            if first == "NO":
+                return False
             logger.warning("Unexpected OpenAI response")
             return None
🧹 Nitpick comments (8)
backend/apps/slack/common/question_detector.py (5)

47-51: Docstring args out of sync with signature

is_owasp_question does not accept context_chunks, yet the docstring lists it. Remove the param from the docstring (or add the param), to avoid confusion.


88-101: Unify empty-context string to a single value

is_owasp_question_with_openai uses “No context available” while format_context_chunks returns “No context provided”. Use one phrase to simplify testing and prompts.

Apply this diff:

-        formatted_context = (
-            self.format_context_chunks(context_chunks)
-            if context_chunks
-            else "No context available"
-        )
+        formatted_context = (
+            self.format_context_chunks(context_chunks)
+            if context_chunks
+            else "No context provided"
+        )

102-111: Harden model output format; prefer deterministic, structured responses

To eliminate “MAYBE” and reduce parsing risk, set temperature to 0 and request structured output (e.g., JSON with { "answer": "YES" | "NO" }) or force a single-token response. If staying with text, at least set temperature=0.

Example (adjust per openai client version):

response = self.openai_client.chat.completions.create(
    model=self.CHAT_MODEL,
    messages=[...],
    temperature=0,
    max_tokens=self.MAX_TOKENS,
    # response_format={"type": "json_object"},  # if supported in your version
)

Optionally add request timeouts/retries on the client or call for resilience.
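If structured output is adopted, the parsing side could look like the following sketch. `parse_answer` is a hypothetical helper, not part of the PR; it returns None for anything that is not a clean YES/NO payload, matching the fallback convention above.

```python
import json


def parse_answer(content):
    """Parse a hypothetical {"answer": "YES"|"NO"} response; None on anything else."""
    try:
        # .get()/.upper() raise inside the try for non-dict or non-string payloads
        answer = json.loads(content).get("answer", "").upper()
    except (json.JSONDecodeError, AttributeError, TypeError):
        return None
    if answer == "YES":
        return True
    if answer == "NO":
        return False
    return None
```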


130-159: Accept Optional context, and serialize additional_context as JSON

  • Signature should accept None (tests call with None).
  • Printing dicts directly is noisy; JSON is clearer and stable for prompts.

Apply this diff:

-    def format_context_chunks(self, context_chunks: list[dict]) -> str:
+    def format_context_chunks(self, context_chunks: list[dict] | None) -> str:
@@
-        if not context_chunks:
+        if not context_chunks:
             return "No context provided"
@@
-            additional_context = chunk.get("additional_context", {})
+            additional_context = chunk.get("additional_context", {})
@@
-            if additional_context:
-                context_block = (
-                    f"Source Name: {source_name}\nContent: {text}\n"
-                    f"Additional Context: {additional_context}"
-                )
+            if additional_context:
+                import json
+                context_block = (
+                    f"Source Name: {source_name}\nContent: {text}\n"
+                    f"Additional Context: {json.dumps(additional_context, ensure_ascii=False, sort_keys=True)}"
+                )
             else:
                 context_block = f"Source Name: {source_name}\nContent: {text}"

Optional: truncate text snippets (e.g., first 500–1000 chars) to keep prompt size bounded.
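A minimal helper for the optional truncation above might look like this; the 500-character default is illustrative, not a value from the PR:

```python
def truncate_snippet(text, limit=500):
    """Cap a chunk's text so the assembled prompt stays bounded."""
    text = (text or "").strip()
    return text if len(text) <= limit else text[:limit].rstrip() + "..."
```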


63-71: Consider deterministic fallback when OpenAI fails

When OpenAI errors or returns None, you currently return False. To avoid zero recall during outages, consider a low-cost deterministic fallback (e.g., a minimal keyword/phrase classifier limited to OWASP terms). Keep it behind a feature flag.
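A sketch of what such a gated fallback could look like — the keyword list and the environment-variable flag name are invented for illustration:

```python
import os

# Illustrative keyword list; a real one would be curated for OWASP terms.
FALLBACK_KEYWORDS = ("owasp", "top 10", "zap", "asvs", "cheat sheet")


def keyword_fallback(text):
    """Cheap deterministic classifier used only when the OpenAI path fails."""
    lowered = (text or "").lower()
    return any(keyword in lowered for keyword in FALLBACK_KEYWORDS)


def detect_with_fallback(text, openai_verdict):
    """Prefer the OpenAI verdict; fall back to keywords only behind a flag."""
    if openai_verdict is not None:
        return openai_verdict
    # Hypothetical feature flag; unset means keep the current False fallback.
    if os.getenv("QUESTION_DETECTOR_KEYWORD_FALLBACK") == "1":
        return keyword_fallback(text)
    return False
```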

backend/tests/apps/slack/common/question_detector_test.py (3)

98-103: Strengthen “non-questions” test to ignore OpenAI result

Currently the mock returns False, so the comment “regardless of OpenAI result” isn’t actually verified. Force OpenAI path to return True and still expect False.

Apply this diff:

@@ def test_is_owasp_question_false_cases(self, detector, sample_context_chunks, monkeypatch):
-        # Mock OpenAI to return False for non-OWASP questions
-        mock_openai_method = MagicMock(return_value=False)
+        # Even if OpenAI says True, non-questions must be rejected by the gate
+        mock_openai_method = MagicMock(return_value=True)
         monkeypatch.setattr(detector, "is_owasp_question_with_openai", mock_openai_method)
@@
-        assert not detector.is_owasp_question("OWASP is great")
-        assert not detector.is_owasp_question("Security is important")
+        assert not detector.is_owasp_question("OWASP is great")
+        assert not detector.is_owasp_question("Security is important")

323-346: Add guard test for “NO” substring false positive (e.g., ‘ENOUGH’)

Ensure parsing doesn’t misclassify when ‘NO’ appears inside another word.

Apply this diff to add a new test:

+    def test_is_owasp_question_with_openai_no_substring_not_misclassified(self, sample_context_chunks):
+        """Ensure 'NO' substring inside another word doesn't flip to False."""
+        with (
+            patch.dict(os.environ, {"DJANGO_OPEN_AI_SECRET_KEY": "test-key"}),
+            patch(
+                "apps.slack.common.question_detector.Prompt.get_slack_question_detector_prompt",
+                return_value="System prompt with {context}",
+            ),
+            patch("openai.OpenAI") as mock_openai,
+        ):
+            mock_client = MagicMock()
+            mock_response = MagicMock()
+            mock_response.choices = [MagicMock()]
+            mock_response.choices[0].message.content = "ENOUGH context provided"
+            mock_client.chat.completions.create.return_value = mock_response
+            mock_openai.return_value = mock_client
+
+            detector = QuestionDetector()
+            result = detector.is_owasp_question_with_openai("What is OWASP?", sample_context_chunks)
+            assert result is None

193-209: Optional: cover is_owasp_question path when prompt is missing

After adding early prompt validation, add a test that is_owasp_question returns False (no exception) and avoids retrieval.

Example:

def test_is_owasp_question_missing_prompt_returns_false_and_skips_retrieval(monkeypatch):
    monkeypatch.setenv("DJANGO_OPEN_AI_SECRET_KEY", "test-key")
    # Force missing prompt
    monkeypatch.setattr(
        "apps.slack.common.question_detector.Prompt.get_slack_question_detector_prompt",
        lambda: None,
    )
    # Ensure retriever.retrieve is not called
    mock_retriever = MagicMock()
    mock_retriever.retrieve.side_effect = AssertionError("retrieve should not be called")
    monkeypatch.setattr(
        "apps.slack.common.question_detector.Retriever", MagicMock(return_value=mock_retriever)
    )
    detector = QuestionDetector()
    assert not detector.is_owasp_question("What is OWASP?")
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 28932a2 and 690488c.

📒 Files selected for processing (2)
  • backend/apps/slack/common/question_detector.py (3 hunks)
  • backend/tests/apps/slack/common/question_detector_test.py (13 hunks)
🧰 Additional context used
🧬 Code graph analysis (2)
backend/tests/apps/slack/common/question_detector_test.py (3)
backend/apps/ai/agent/tools/rag/retriever.py (1)
  • retrieve (184-245)
backend/apps/slack/common/question_detector.py (4)
  • QuestionDetector (20-159)
  • is_owasp_question (42-71)
  • is_owasp_question_with_openai (73-128)
  • format_context_chunks (130-159)
backend/tests/apps/slack/events/message_posted_test.py (1)
  • test_init (44-47)
backend/apps/slack/common/question_detector.py (3)
backend/apps/ai/agent/tools/rag/retriever.py (2)
  • Retriever (21-268)
  • retrieve (184-245)
backend/apps/core/models/prompt.py (2)
  • Prompt (14-161)
  • get_slack_question_detector_prompt (154-161)
backend/apps/ai/agent/tools/rag/rag_tool.py (1)
  • query (34-63)

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
backend/apps/slack/common/question_detector.py (1)

112-118: Guard against empty choices and parse deterministically.

  • choices may be empty; guard before indexing.
  • Substring checks misclassify “I DON’T KNOW” because “KNOW” contains “NO”. Parse only the first token as YES|NO.
-            answer = response.choices[0].message.content
+            choices = getattr(response, "choices", [])
+            if not choices or not getattr(choices[0], "message", None):
+                logger.error("OpenAI returned no choices")
+                return None
+            answer = choices[0].message.content
@@
-            clean_answer = answer.strip().upper()
-
-            if "YES" in clean_answer:
-                return True
-            if "NO" in clean_answer:
-                return False
-            logger.warning("Unexpected OpenAI response")
+            clean_answer = (answer or "").strip().upper()
+            # Accept only leading YES|NO to avoid matching inside words (e.g., KNOW)
+            import re  # move to top-level imports if preferred
+            m = re.match(r"^\s*(YES|NO)\b", clean_answer)
+            if m:
+                return m.group(1) == "YES"
+            logger.warning("Unexpected OpenAI response: %r", clean_answer[:120])
             return None
♻️ Duplicate comments (2)
backend/apps/slack/common/question_detector.py (2)

42-59: Add fast non-question gate + validate prompt before retrieval (avoid waste and false positives).

Gate obvious non-questions (no '?' / interrogatives) and check prompt existence before hitting the DB/vector search. This prevents classifying “OWASP is great” as True and saves cost when the prompt is misconfigured. This was previously flagged.

Apply this diff:

 def is_owasp_question(self, text: str) -> bool:
     """Check if the input text is an OWASP-related question.

     This is the main public method that orchestrates the detection logic.
-
-        Args:
-            text: The input text to check.
-            context_chunks: Retrieved context chunks from the user's query.
+        Args:
+            text: The input text to check.
     """
     if not text or not text.strip():
         return False
+
+    # Cheap fast-path: only proceed for likely questions/requests
+    if not self.looks_like_question(text):
+        return False
+
+    # Validate prompt early to avoid unnecessary retrieval/API work on misconfig
+    prompt = Prompt.get_slack_question_detector_prompt()
+    if not prompt or not prompt.strip():
+        logger.error("Prompt with key 'slack-question-detector-system-prompt' not found.")
+        return False
-
-    context_chunks = self.retriever.retrieve(
-        query=text,
-        limit=self.CHUNKS_RETRIEVAL_LIMIT,
-        similarity_threshold=DEFAULT_SIMILARITY_THRESHOLD,
-    )
+    try:
+        context_chunks = self.retriever.retrieve(
+            query=text,
+            limit=self.CHUNKS_RETRIEVAL_LIMIT,
+            similarity_threshold=DEFAULT_SIMILARITY_THRESHOLD,
+        )
+    except Exception:
+        logger.exception("Retriever error; continuing with empty context")
+        context_chunks = []

Add helper to the class (outside this region):

+    @staticmethod
+    def looks_like_question(text: str) -> bool:
+        s = (text or "").strip().lower()
+        if not s:
+            return False
+        if "?" in s:
+            return True
+        prefixes = (
+            "who", "what", "when", "where", "why", "how", "which",
+            "can", "could", "should", "would", "do", "does", "did",
+            "is", "are", "was", "were", "may", "might",
+        )
+        request_phrases = ("please", "explain", "help", "tell me", "recommend", "guide", "show me", "i need help")
+        return s.startswith(prefixes) or any(p in s for p in request_phrases)

84-89: Don’t raise on missing prompt; degrade gracefully and return None/False.

Raising ObjectDoesNotExist will bubble up and 500 the Slack flow. Log and return None so the caller returns False.

-        if not prompt or not prompt.strip():
-            error_msg = "Prompt with key 'slack-question-detector-system-prompt' not found."
-            raise ObjectDoesNotExist(error_msg)
+        if not prompt or not prompt.strip():
+            logger.error("Prompt with key 'slack-question-detector-system-prompt' not found.")
+            return None
🧹 Nitpick comments (4)
backend/apps/slack/common/question_detector.py (4)

98-107: Add request timeout to OpenAI call.

Prevent hanging request threads; set a sane timeout.

             response = self.openai_client.chat.completions.create(
                 model=self.CHAT_MODEL,
                 messages=[
                     {"role": "system", "content": system_prompt},
                     {"role": "user", "content": user_prompt},
                 ],
                 temperature=self.TEMPERATURE,
                 max_tokens=self.MAX_TOKENS,
+                request_timeout=10,
             )

Please confirm the correct parameter name for the client you’re using (request_timeout vs timeout). If it differs, adjust accordingly.


139-155: Trim chunk text and additional_context to cap prompt size.

Unbounded context can bloat prompts and costs. Trim text and context strings.

-        formatted_context = []
+        formatted_context = []
+        MAX_TEXT = 400
+        MAX_CTX = 800
         for i, chunk in enumerate(context_chunks):
             source_name = chunk.get("source_name", f"Unknown Source {i + 1}")
-            text = chunk.get("text", "")
-            additional_context = chunk.get("additional_context", {})
+            raw_text = chunk.get("text", "") or ""
+            text = (raw_text.replace("\r", " ").replace("\n", " "))[:MAX_TEXT]
+            additional_context = chunk.get("additional_context", {})
@@
-            if additional_context:
+            if additional_context:
+                ctx_str = str(additional_context)
+                if len(ctx_str) > MAX_CTX:
+                    ctx_str = ctx_str[:MAX_CTX] + "..."
                 context_block = (
                     f"Source Name: {source_name}\nContent: {text}\n"
-                    f"Additional Context: {additional_context}"
+                    f"Additional Context: {ctx_str}"
                 )
             else:
                 context_block = f"Source Name: {source_name}\nContent: {text}"

47-51: Fix docstring Args for is_owasp_question.

Docstring mentions context_chunks which isn’t a parameter. Clean up to avoid confusion.

-        Args:
-            text: The input text to check.
-            context_chunks: Retrieved context chunks from the user's query.
+        Args:
+            text: The input text to check.

8-10: Move regex import to top-level (if you adopt regex parsing).

If you use the regex-based YES/NO parsing, import re once at the top for clarity.

 import os
+import re
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 690488c and e687a7c.

📒 Files selected for processing (1)
  • backend/apps/slack/common/question_detector.py (3 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
backend/apps/slack/common/question_detector.py (2)
backend/apps/ai/agent/tools/rag/retriever.py (2)
  • Retriever (21-268)
  • retrieve (184-245)
backend/apps/core/models/prompt.py (2)
  • Prompt (14-161)
  • get_slack_question_detector_prompt (154-161)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
  • GitHub Check: Run frontend e2e tests
  • GitHub Check: Run frontend unit tests
  • GitHub Check: Run backend tests
  • GitHub Check: CodeQL (javascript-typescript)

@sonarqubecloud

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🧹 Nitpick comments (7)
backend/tests/apps/slack/common/question_detector_test.py (7)

26-32: Assert Retriever.retrieve is invoked with correct args (limit/threshold).

Great that Retriever is mocked; but none of the tests assert that is_owasp_question() calls retrieve() with CHUNKS_RETRIEVAL_LIMIT and DEFAULT_SIMILARITY_THRESHOLD. Add a focused test to lock this contract.

Apply this diff (adds a new test):

@@
 class TestQuestionDetector:
@@
     def test_init(self, detector):
         """Test QuestionDetector initialization."""
@@
         assert hasattr(detector, "retriever")
 
+    def test_is_owasp_question_invokes_retriever_with_expected_args(self, detector, monkeypatch):
+        """Ensure retriever.retrieve is called with limit and similarity threshold."""
+        retrieve_mock = MagicMock(return_value=[])
+        detector.retriever.retrieve = retrieve_mock
+        monkeypatch.setattr(detector, "is_owasp_question_with_openai", MagicMock(return_value=True))
+
+        q = "What is OWASP?"
+        assert detector.is_owasp_question(q) is True
+        retrieve_mock.assert_called_once()
+        _, kwargs = retrieve_mock.call_args
+        assert kwargs["query"] == q
+        assert kwargs["limit"] == detector.CHUNKS_RETRIEVAL_LIMIT
+        # Import constant from the module under test to avoid path coupling
+        from apps.slack.common.question_detector import DEFAULT_SIMILARITY_THRESHOLD
+        assert kwargs["similarity_threshold"] == DEFAULT_SIMILARITY_THRESHOLD

Also applies to: 74-79


98-101: Clarify misleading comment about “regardless of OpenAI result”.

The wrapper simply returns the OpenAI result (or False on failure). The current comment suggests otherwise and can confuse future maintainers.

Apply this diff:

-        # Test non-questions (should return False regardless of OpenAI result)
+        # Test non-questions; here we mock OpenAI to return False, so wrapper should return False

416-425: Unify empty-context phrasing across code paths (“No context provided” vs “No context available”).

format_context_chunks() returns "No context provided", while is_owasp_question_with_openai() uses "No context available" when chunks are falsy. This minor inconsistency complicates testing and log/search reliability.

Option A (preferred): Always call format_context_chunks(context_chunks) and rely on its "No context provided".

-        formatted_context = (
-            self.format_context_chunks(context_chunks)
-            if context_chunks
-            else "No context available"
-        )
+        formatted_context = self.format_context_chunks(context_chunks)

Option B: Change format_context_chunks() to return "No context available" and adjust tests accordingly.

Based on related production snippet.

Also applies to: 426-430, 431-434


227-248: Also assert OpenAI call parameters and message shape.

These tests validate return values but don’t assert the request payload. Add one representative test to lock model, temperature, max_tokens, and message contents (system+user with context).

Apply this diff:

@@
     def test_is_owasp_question_with_openai_success_yes(self, sample_context_chunks):
@@
-            result = detector.is_owasp_question_with_openai(
-                "What is OWASP?", sample_context_chunks
-            )
+            question = "What is OWASP?"
+            result = detector.is_owasp_question_with_openai(question, sample_context_chunks)
             assert result is True
+
+            # Assert OpenAI call parameters and message structure
+            mock_client.chat.completions.create.assert_called_once()
+            _, kwargs = mock_client.chat.completions.create.call_args
+            assert kwargs["model"] == detector.CHAT_MODEL
+            assert kwargs["temperature"] == detector.TEMPERATURE
+            assert kwargs["max_tokens"] == detector.MAX_TOKENS
+            assert isinstance(kwargs["messages"], list) and len(kwargs["messages"]) == 2
+            assert kwargs["messages"][0]["role"] == "system"
+            assert "System prompt with {context}" in kwargs["messages"][0]["content"]
+            assert kwargs["messages"][1]["role"] == "user"
+            assert "Question:" in kwargs["messages"][1]["content"]
+            assert question in kwargs["messages"][1]["content"]
+            assert "Context:" in kwargs["messages"][1]["content"]

Also applies to: 251-274, 368-391, 392-415


43-66: Cover formatter branch where additional_context is absent.

sample_context_chunks only contains entries with additional_context; add one without it to exercise the else-branch in format_context_chunks().

Apply this diff:

         return [
@@
             },
             {
                 "source_name": "XSS Prevention",
@@
                 "additional_context": {"risk_level": "high", "category": "injection"},
             },
+            {
+                "source_name": "Cheat Sheet Series",
+                "text": "A set of best practices for secure development by topic.",
+                # No additional_context on purpose to cover the else path
+            },
         ]

And assert it in test_format_context_chunks:

-        assert "Source Name: XSS Prevention" in formatted
+        assert "Source Name: XSS Prevention" in formatted
+        assert "Source Name: Cheat Sheet Series" in formatted
+        assert "Additional Context:" in formatted  # present for items that include it

184-184: Also assert CHUNKS_RETRIEVAL_LIMIT.

Since CHUNKS_RETRIEVAL_LIMIT is central to retrieval behavior, include it in class constants tests.

Apply this diff:

     def test_class_constants(self, detector):
         """Test that class constants are properly defined."""
         assert detector.MAX_TOKENS == 10
         assert detector.TEMPERATURE == 0.1
         assert detector.CHAT_MODEL == "gpt-4o-mini"
+        assert detector.CHUNKS_RETRIEVAL_LIMIT == 10

Also applies to: 186-186


74-79: Add a wrapper test for OpenAI None → False.

Ensure is_owasp_question() returns False when is_owasp_question_with_openai() returns None.

Apply this diff:

+    def test_is_owasp_question_returns_false_on_openai_none(self, detector, monkeypatch):
+        monkeypatch.setattr(detector, "is_owasp_question_with_openai", MagicMock(return_value=None))
+        assert detector.is_owasp_question("What is OWASP?") is False

Also applies to: 90-94

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between e687a7c and f6ed95b.

📒 Files selected for processing (1)
  • backend/tests/apps/slack/common/question_detector_test.py (14 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
backend/tests/apps/slack/common/question_detector_test.py (3)
backend/apps/ai/agent/tools/rag/retriever.py (1)
  • retrieve (184-245)
backend/apps/slack/common/question_detector.py (4)
  • QuestionDetector (20-155)
  • is_owasp_question (42-69)
  • is_owasp_question_with_openai (71-124)
  • format_context_chunks (126-155)
backend/tests/apps/slack/events/message_posted_test.py (1)
  • test_init (44-47)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: CodeQL (javascript-typescript)
  • GitHub Check: CodeQL (python)
🔇 Additional comments (1)
backend/tests/apps/slack/common/question_detector_test.py (1)

110-120: LGTM: solid, readable tests; nice coverage of cases and error paths.

Case-insensitivity, whitespace/special-chars, positive/negative OpenAI responses, and API error/fallback paths are well covered and easy to maintain.

Also applies to: 121-131, 132-137, 156-161, 169-172, 177-180, 233-234, 245-247, 257-258, 269-271, 293-295, 317-319, 363-364, 368-391, 392-415

@arkid15r arkid15r enabled auto-merge (squash) October 22, 2025 00:07
@arkid15r arkid15r merged commit 36fc06f into OWASP:feature/nestbot-ai-assistant Oct 22, 2025
25 checks passed
arkid15r pushed a commit that referenced this pull request Nov 16, 2025
* question detection refining

* sonar qube fixes

* fix tests
mirnumaan pushed a commit to mirnumaan/Nest that referenced this pull request Nov 16, 2025
* question detection refining

* sonar qube fixes

* fix tests
github-merge-queue bot pushed a commit that referenced this pull request Nov 19, 2025
* add the aria-label

* fixed the code style, fixed by the formatter

* Handled the undefined repositoryName in aria-label.

* fix the white spaces

* Add aria-labels to interactive elements for WCAG 2.1 compliance

* Sync www-repopsitories (#2164)

* spelling fixes and tests

* sonar and code rabbit suggestions implemented

* json chunking and suggestions implemented

* code rabbit and sonar qube suggestions

* code rabbit suggestions

* suggestions implemented

* github advance security addressed

* tests fixed

* fixed tests

* Clean up backend/test_commands.py

---------

Co-authored-by: Arkadii Yakovets <2201626+arkid15r@users.noreply.github.com>
Co-authored-by: Arkadii Yakovets <arkadii.yakovets@owasp.org>

* Update docker-compose/local.yaml

* Nestbot MVP (#2113)

* Sync www-repopsitories (#2164)

* spelling fixes and tests

* sonar and code rabbit suggestions implemented

* json chunking and suggestions implemented

* code rabbit and sonar qube suggestions

* code rabbit suggestions

* suggestions implemented

* github advance security addressed

* tests fixed

* fixed tests

* Clean up backend/test_commands.py

---------

Co-authored-by: Arkadii Yakovets <2201626+arkid15r@users.noreply.github.com>
Co-authored-by: Arkadii Yakovets <arkadii.yakovets@owasp.org>

* Consolidate code commits

* Update cspell/custom-dict.txt

* Update docker-compose/local.yaml

* local yaml worder volume fix

* instance check

* poetry file updated

---------

Co-authored-by: Arkadii Yakovets <2201626+arkid15r@users.noreply.github.com>
Co-authored-by: Arkadii Yakovets <arkadii.yakovets@owasp.org>

* fixed the duplication error for chunks and context (#2343)

* Fix slack and duplication errors (#2352)

* fix slack and duplication errors

* code rabbit suggestions

* integrity error solved

* using set

* Update code

---------

Co-authored-by: Arkadii Yakovets <arkadii.yakovets@owasp.org>

* Response improvements and refactoring (#2407)

* improvements and refactoring

* added prompt checks and tests for it

* question detection refining (#2443)

* question detection refining

* sonar qube fixes

* fix tests

* Agentic rag (#2432)

* agentic rag

* spelling fixes

* code rabbit and sonar qube suggestions

* code rabbit suggestions

* refining

* fix test

* refining

* added question detectoor to nestbot mentions (#2473)

* Merge NestBot AI Assistant feature branch

* Update docker-compose/local.yaml

* Update backend/Makefile

* fix the white spaces

* Add aria-labels to interactive elements for WCAG 2.1 compliance

* Fix equality checks with floating point values (#2658)

* Edit in test cases for the new aira-label update

* More edits to make the test more unbreakable and few regex edits .

* Revert one change regards two seperate Contributors

* Regex check fix

* Global 10-second timeout and increased other mobile timeout to fix the  Mobile Safari flaky tests

---------

Co-authored-by: Dishant Miyani <dishantmiyani1804@gmail.com>
Co-authored-by: Arkadii Yakovets <2201626+arkid15r@users.noreply.github.com>
Co-authored-by: Arkadii Yakovets <arkadii.yakovets@owasp.org>
Co-authored-by: Kate Golovanova <kate@kgthreads.com>
@coderabbitai coderabbitai bot mentioned this pull request Jan 30, 2026