question detection refining #2443

Merged
arkid15r merged 3 commits into OWASP:feature/nestbot-ai-assistant from
Dishant1804:question_detector_refinement
Oct 22, 2025

Conversation

@Dishant1804
Contributor

Proposed change

Resolves #2309

  • implemented scalable solution to detect OWASP related questions

Checklist

  • I've read and followed the contributing guidelines.
  • I've run make check-test locally; all checks and tests passed.

@coderabbitai
Contributor

coderabbitai bot commented Oct 18, 2025

Summary by CodeRabbit

  • Improvements
    • Enhanced OWASP question detection to use intelligent context analysis instead of keyword-based matching, improving accuracy and relevance of question identification in Slack interactions.
    • Implemented retrieval-based context integration to provide better understanding of question context.
    • Improved error handling for detection failures with more reliable fallback behavior.

Walkthrough

Replaces keyword-based OWASP detection with a retrieval-augmented OpenAI classification flow: adds retriever usage, context formatting, prompt retrieval, and OpenAI error handling; introduces CHUNKS_RETRIEVAL_LIMIT, changes model/tokens, and updates tests to mock retriever/OpenAI paths.
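The flow described in the walkthrough can be sketched end-to-end. This is a simplified illustration rather than the PR's actual code: the retriever and the OpenAI classifier are stubbed as plain callables, and only the method names (`is_owasp_question`, `format_context_chunks`) and `CHUNKS_RETRIEVAL_LIMIT` come from the changeset.

```python
# Simplified sketch of the retrieval-augmented detection flow (illustrative only).
CHUNKS_RETRIEVAL_LIMIT = 10  # constant introduced by this PR


def format_context_chunks(context_chunks):
    """Render retrieved chunks into a prompt-friendly text block."""
    if not context_chunks:
        return "No context provided"
    blocks = []
    for i, chunk in enumerate(context_chunks):
        source = chunk.get("source_name", f"Unknown Source {i + 1}")
        blocks.append(f"Source Name: {source}\nContent: {chunk.get('text', '')}")
    return "\n\n".join(blocks)


def is_owasp_question(text, retrieve, classify):
    """Retrieve context for the text, then ask an LLM classifier for a verdict."""
    if not text or not text.strip():
        return False
    chunks = retrieve(query=text, limit=CHUNKS_RETRIEVAL_LIMIT)
    verdict = classify(text, format_context_chunks(chunks))  # True / False / None
    return bool(verdict)  # None (API or parse failure) falls back to False
```

In the real module, `retrieve` is `Retriever.retrieve()` called with a similarity threshold, and `classify` is the OpenAI call wrapped by `is_owasp_question_with_openai()`.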

Changes

Cohort / File(s) — Summary

Question Detector Implementation
backend/apps/slack/common/question_detector.py
Replaces keyword-only detection with a retrieval + OpenAI flow. Adds is_owasp_question_with_openai() and format_context_chunks(). Introduces CHUNKS_RETRIEVAL_LIMIT = 10. Changes CHAT_MODEL "gpt-4o" → "gpt-4o-mini" and MAX_TOKENS 50 → 10. Adds retriever usage, context formatting, prompt retrieval, and OpenAI error handling; removes is_question(), contains_owasp_keywords(), and the OWASP keyword state.

Question Detector Tests
backend/tests/apps/slack/common/question_detector_test.py
Adds a mocked retriever and a sample_context_chunks fixture; updates tests to mock is_owasp_question_with_openai() / OpenAI responses and to expect a prompt containing {context}. Adds tests for format_context_chunks() and updates constant expectations to match the implementation changes.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

  • Response improvements and refactoring #2407: Modifies the same question detector module and replaces system prompt usage with Prompt.get_slack_question_detector_prompt, indicating shared prompt and retrieval-related changes.

Suggested labels

backend

Suggested reviewers

  • kasya
  • arkid15r

Pre-merge checks and finishing touches

❌ Failed checks (1 inconclusive)
Title Check — ❓ Inconclusive
Explanation: The PR title "question detection refining" is related to the changeset's focus on OWASP question detection, and it accurately reflects that the PR modifies the question detection logic. However, the title is vague and generic; it does not convey the core technical change, which is the shift from keyword-based detection to a retrieval-based, context-aware, OpenAI-driven approach. A reader scanning the repository history would not understand from the title alone that the solution now uses retrieval patterns and LLM-based classification rather than simple keyword matching.
Resolution: Consider a more specific, descriptive title, such as "Replace keyword-based OWASP question detection with retrieval-based OpenAI approach" or "Implement retrieval-augmented OWASP question detection using context and OpenAI", so the purpose of the PR is immediately clear to reviewers and future maintainers.
✅ Passed checks (3 passed)
Out of Scope Changes Check — ✅ Passed
All code changes in this PR are directly scoped to implementing the OWASP question detection solution specified in issue #2309. The modifications are confined to backend/apps/slack/common/question_detector.py and its corresponding test file, all focused on replacing the keyword-based detection mechanism with a retrieval-augmented, OpenAI-driven approach. No unrelated features, documentation changes, or modifications to other modules are present in the changeset.

Description Check — ✅ Passed
The PR description directly relates to the changeset by explicitly stating "implemented scalable solution to detect OWASP related questions" and linking to issue #2309. It accurately captures the primary objective of the PR, which aligns with the code changes that replace simple keyword matching with retrieval-based, context-aware detection. The contributor also confirms following the contributing guidelines and running local tests successfully.

Docstring Coverage — ✅ Passed
Docstring coverage is 100.00%, above the required 80.00% threshold.


@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
backend/apps/slack/common/question_detector.py (1)

121-128: Fix fragile YES/NO substring matching (“ENOUGH” contains “NO”)

Substring checks can misclassify. Parse the first token or exact whole-word answer.

Apply this diff:

-            clean_answer = answer.strip().upper()
-
-            if "YES" in clean_answer:
-                return True
-            if "NO" in clean_answer:
-                return False
+            clean_answer = answer.strip()
+            if not clean_answer:
+                logger.error("OpenAI returned blank content")
+                return None
+            first = clean_answer.strip().split()[0].strip('.,!?:;"\'()[]{}').upper()
+            if first == "YES":
+                return True
+            if first == "NO":
+                return False
             logger.warning("Unexpected OpenAI response")
             return None
🧹 Nitpick comments (8)
backend/apps/slack/common/question_detector.py (5)

47-51: Docstring args out of sync with signature

is_owasp_question does not accept context_chunks, yet the docstring lists it. Remove the param from the docstring (or add the param), to avoid confusion.


88-101: Unify empty-context string to a single value

is_owasp_question_with_openai uses “No context available” while format_context_chunks returns “No context provided”. Use one phrase to simplify testing and prompts.

Apply this diff:

-        formatted_context = (
-            self.format_context_chunks(context_chunks)
-            if context_chunks
-            else "No context available"
-        )
+        formatted_context = (
+            self.format_context_chunks(context_chunks)
+            if context_chunks
+            else "No context provided"
+        )

102-111: Harden model output format; prefer deterministic, structured responses

To eliminate “MAYBE” and reduce parsing risk, set temperature to 0 and request structured output (e.g., JSON with { "answer": "YES" | "NO" }) or force a single-token response. If staying with text, at least set temperature=0.

Example (adjust per openai client version):

response = self.openai_client.chat.completions.create(
    model=self.CHAT_MODEL,
    messages=[...],
    temperature=0,
    max_tokens=self.MAX_TOKENS,
    # response_format={"type": "json_object"},  # if supported in your version
)

Optionally add request timeouts/retries on the client or call for resilience.
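If structured output is adopted, the parsing side could look like the following sketch. `parse_answer` is a hypothetical helper, not part of the PR; it returns None for anything that is not a clean YES/NO payload, matching the fallback convention above.

```python
import json


def parse_answer(content):
    """Parse a hypothetical {"answer": "YES"|"NO"} response; None on anything else."""
    try:
        # .get()/.upper() raise inside the try for non-dict or non-string payloads
        answer = json.loads(content).get("answer", "").upper()
    except (json.JSONDecodeError, AttributeError, TypeError):
        return None
    if answer == "YES":
        return True
    if answer == "NO":
        return False
    return None
```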


130-159: Accept Optional context, and serialize additional_context as JSON

  • Signature should accept None (tests call with None).
  • Printing dicts directly is noisy; JSON is clearer and stable for prompts.

Apply this diff:

-    def format_context_chunks(self, context_chunks: list[dict]) -> str:
+    def format_context_chunks(self, context_chunks: list[dict] | None) -> str:
@@
-        if not context_chunks:
+        if not context_chunks:
             return "No context provided"
@@
-            additional_context = chunk.get("additional_context", {})
+            additional_context = chunk.get("additional_context", {})
@@
-            if additional_context:
-                context_block = (
-                    f"Source Name: {source_name}\nContent: {text}\n"
-                    f"Additional Context: {additional_context}"
-                )
+            if additional_context:
+                import json
+                context_block = (
+                    f"Source Name: {source_name}\nContent: {text}\n"
+                    f"Additional Context: {json.dumps(additional_context, ensure_ascii=False, sort_keys=True)}"
+                )
             else:
                 context_block = f"Source Name: {source_name}\nContent: {text}"

Optional: truncate text snippets (e.g., first 500–1000 chars) to keep prompt size bounded.
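A minimal helper for the optional truncation above might look like this; the 500-character default is illustrative, not a value from the PR:

```python
def truncate_snippet(text, limit=500):
    """Cap a chunk's text so the assembled prompt stays bounded."""
    text = (text or "").strip()
    return text if len(text) <= limit else text[:limit].rstrip() + "..."
```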


63-71: Consider deterministic fallback when OpenAI fails

When OpenAI errors or returns None, you currently return False. To avoid zero recall during outages, consider a low-cost deterministic fallback (e.g., a minimal keyword/phrase classifier limited to OWASP terms). Keep it behind a feature flag.
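A sketch of what such a gated fallback could look like — the keyword list and the environment-variable flag name are invented for illustration:

```python
import os

# Illustrative keyword list; a real one would be curated for OWASP terms.
FALLBACK_KEYWORDS = ("owasp", "top 10", "zap", "asvs", "cheat sheet")


def keyword_fallback(text):
    """Cheap deterministic classifier used only when the OpenAI path fails."""
    lowered = (text or "").lower()
    return any(keyword in lowered for keyword in FALLBACK_KEYWORDS)


def detect_with_fallback(text, openai_verdict):
    """Prefer the OpenAI verdict; fall back to keywords only behind a flag."""
    if openai_verdict is not None:
        return openai_verdict
    # Hypothetical feature flag; unset means keep the current False fallback.
    if os.getenv("QUESTION_DETECTOR_KEYWORD_FALLBACK") == "1":
        return keyword_fallback(text)
    return False
```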

backend/tests/apps/slack/common/question_detector_test.py (3)

98-103: Strengthen “non-questions” test to ignore OpenAI result

Currently the mock returns False, so the comment “regardless of OpenAI result” isn’t actually verified. Force OpenAI path to return True and still expect False.

Apply this diff:

@@ def test_is_owasp_question_false_cases(self, detector, sample_context_chunks, monkeypatch):
-        # Mock OpenAI to return False for non-OWASP questions
-        mock_openai_method = MagicMock(return_value=False)
+        # Even if OpenAI says True, non-questions must be rejected by the gate
+        mock_openai_method = MagicMock(return_value=True)
         monkeypatch.setattr(detector, "is_owasp_question_with_openai", mock_openai_method)
@@
-        assert not detector.is_owasp_question("OWASP is great")
-        assert not detector.is_owasp_question("Security is important")
+        assert not detector.is_owasp_question("OWASP is great")
+        assert not detector.is_owasp_question("Security is important")

323-346: Add guard test for “NO” substring false positive (e.g., ‘ENOUGH’)

Ensure parsing doesn’t misclassify when ‘NO’ appears inside another word.

Apply this diff to add a new test:

+    def test_is_owasp_question_with_openai_no_substring_not_misclassified(self, sample_context_chunks):
+        """Ensure 'NO' substring inside another word doesn't flip to False."""
+        with (
+            patch.dict(os.environ, {"DJANGO_OPEN_AI_SECRET_KEY": "test-key"}),
+            patch(
+                "apps.slack.common.question_detector.Prompt.get_slack_question_detector_prompt",
+                return_value="System prompt with {context}",
+            ),
+            patch("openai.OpenAI") as mock_openai,
+        ):
+            mock_client = MagicMock()
+            mock_response = MagicMock()
+            mock_response.choices = [MagicMock()]
+            mock_response.choices[0].message.content = "ENOUGH context provided"
+            mock_client.chat.completions.create.return_value = mock_response
+            mock_openai.return_value = mock_client
+
+            detector = QuestionDetector()
+            result = detector.is_owasp_question_with_openai("What is OWASP?", sample_context_chunks)
+            assert result is None

193-209: Optional: cover is_owasp_question path when prompt is missing

After adding early prompt validation, add a test that is_owasp_question returns False (no exception) and avoids retrieval.

Example:

def test_is_owasp_question_missing_prompt_returns_false_and_skips_retrieval(monkeypatch):
    monkeypatch.setenv("DJANGO_OPEN_AI_SECRET_KEY", "test-key")
    # Force missing prompt
    monkeypatch.setattr(
        "apps.slack.common.question_detector.Prompt.get_slack_question_detector_prompt",
        lambda: None,
    )
    # Ensure retriever.retrieve is not called
    mock_retriever = MagicMock()
    mock_retriever.retrieve.side_effect = AssertionError("retrieve should not be called")
    monkeypatch.setattr(
        "apps.slack.common.question_detector.Retriever", MagicMock(return_value=mock_retriever)
    )
    detector = QuestionDetector()
    assert not detector.is_owasp_question("What is OWASP?")
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 28932a2 and 690488c.

📒 Files selected for processing (2)
  • backend/apps/slack/common/question_detector.py (3 hunks)
  • backend/tests/apps/slack/common/question_detector_test.py (13 hunks)
🧰 Additional context used
🧬 Code graph analysis (2)
backend/tests/apps/slack/common/question_detector_test.py (3)
backend/apps/ai/agent/tools/rag/retriever.py (1)
  • retrieve (184-245)
backend/apps/slack/common/question_detector.py (4)
  • QuestionDetector (20-159)
  • is_owasp_question (42-71)
  • is_owasp_question_with_openai (73-128)
  • format_context_chunks (130-159)
backend/tests/apps/slack/events/message_posted_test.py (1)
  • test_init (44-47)
backend/apps/slack/common/question_detector.py (3)
backend/apps/ai/agent/tools/rag/retriever.py (2)
  • Retriever (21-268)
  • retrieve (184-245)
backend/apps/core/models/prompt.py (2)
  • Prompt (14-161)
  • get_slack_question_detector_prompt (154-161)
backend/apps/ai/agent/tools/rag/rag_tool.py (1)
  • query (34-63)

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
backend/apps/slack/common/question_detector.py (1)

112-118: Guard against empty choices and parse deterministically.

  • choices may be empty; guard before indexing.
  • Substring checks misclassify “I DON’T KNOW” because “KNOW” contains “NO”. Parse only the first token as YES|NO.
-            answer = response.choices[0].message.content
+            choices = getattr(response, "choices", [])
+            if not choices or not getattr(choices[0], "message", None):
+                logger.error("OpenAI returned no choices")
+                return None
+            answer = choices[0].message.content
@@
-            clean_answer = answer.strip().upper()
-
-            if "YES" in clean_answer:
-                return True
-            if "NO" in clean_answer:
-                return False
-            logger.warning("Unexpected OpenAI response")
+            clean_answer = (answer or "").strip().upper()
+            # Accept only leading YES|NO to avoid matching inside words (e.g., KNOW)
+            import re  # move to top-level imports if preferred
+            m = re.match(r"^\s*(YES|NO)\b", clean_answer)
+            if m:
+                return m.group(1) == "YES"
+            logger.warning("Unexpected OpenAI response: %r", clean_answer[:120])
             return None
♻️ Duplicate comments (2)
backend/apps/slack/common/question_detector.py (2)

42-59: Add fast non-question gate + validate prompt before retrieval (avoid waste and false positives).

Gate obvious non-questions (no '?' / interrogatives) and check prompt existence before hitting the DB/vector search. This prevents classifying “OWASP is great” as True and saves cost when the prompt is misconfigured. This was previously flagged.

Apply this diff:

 def is_owasp_question(self, text: str) -> bool:
     """Check if the input text is an OWASP-related question.

     This is the main public method that orchestrates the detection logic.
-
-        Args:
-            text: The input text to check.
-            context_chunks: Retrieved context chunks from the user's query.
+        Args:
+            text: The input text to check.
     """
     if not text or not text.strip():
         return False
+
+    # Cheap fast-path: only proceed for likely questions/requests
+    if not self.looks_like_question(text):
+        return False
+
+    # Validate prompt early to avoid unnecessary retrieval/API work on misconfig
+    prompt = Prompt.get_slack_question_detector_prompt()
+    if not prompt or not prompt.strip():
+        logger.error("Prompt with key 'slack-question-detector-system-prompt' not found.")
+        return False
-
-    context_chunks = self.retriever.retrieve(
-        query=text,
-        limit=self.CHUNKS_RETRIEVAL_LIMIT,
-        similarity_threshold=DEFAULT_SIMILARITY_THRESHOLD,
-    )
+    try:
+        context_chunks = self.retriever.retrieve(
+            query=text,
+            limit=self.CHUNKS_RETRIEVAL_LIMIT,
+            similarity_threshold=DEFAULT_SIMILARITY_THRESHOLD,
+        )
+    except Exception:
+        logger.exception("Retriever error; continuing with empty context")
+        context_chunks = []

Add helper to the class (outside this region):

+    @staticmethod
+    def looks_like_question(text: str) -> bool:
+        s = (text or "").strip().lower()
+        if not s:
+            return False
+        if "?" in s:
+            return True
+        prefixes = (
+            "who", "what", "when", "where", "why", "how", "which",
+            "can", "could", "should", "would", "do", "does", "did",
+            "is", "are", "was", "were", "may", "might",
+        )
+        request_phrases = ("please", "explain", "help", "tell me", "recommend", "guide", "show me", "i need help")
+        return s.startswith(prefixes) or any(p in s for p in request_phrases)

84-89: Don’t raise on missing prompt; degrade gracefully and return None/False.

Raising ObjectDoesNotExist will bubble up and 500 the Slack flow. Log and return None so the caller returns False.

-        if not prompt or not prompt.strip():
-            error_msg = "Prompt with key 'slack-question-detector-system-prompt' not found."
-            raise ObjectDoesNotExist(error_msg)
+        if not prompt or not prompt.strip():
+            logger.error("Prompt with key 'slack-question-detector-system-prompt' not found.")
+            return None
🧹 Nitpick comments (4)
backend/apps/slack/common/question_detector.py (4)

98-107: Add request timeout to OpenAI call.

Prevent hanging request threads; set a sane timeout.

             response = self.openai_client.chat.completions.create(
                 model=self.CHAT_MODEL,
                 messages=[
                     {"role": "system", "content": system_prompt},
                     {"role": "user", "content": user_prompt},
                 ],
                 temperature=self.TEMPERATURE,
                 max_tokens=self.MAX_TOKENS,
+                request_timeout=10,
             )

Please confirm the correct parameter name for the client you’re using (request_timeout vs timeout). If it differs, adjust accordingly.


139-155: Trim chunk text and additional_context to cap prompt size.

Unbounded context can bloat prompts and costs. Trim text and context strings.

-        formatted_context = []
+        formatted_context = []
+        MAX_TEXT = 400
+        MAX_CTX = 800
         for i, chunk in enumerate(context_chunks):
             source_name = chunk.get("source_name", f"Unknown Source {i + 1}")
-            text = chunk.get("text", "")
-            additional_context = chunk.get("additional_context", {})
+            raw_text = chunk.get("text", "") or ""
+            text = (raw_text.replace("\r", " ").replace("\n", " "))[:MAX_TEXT]
+            additional_context = chunk.get("additional_context", {})
@@
-            if additional_context:
+            if additional_context:
+                ctx_str = str(additional_context)
+                if len(ctx_str) > MAX_CTX:
+                    ctx_str = ctx_str[:MAX_CTX] + "..."
                 context_block = (
                     f"Source Name: {source_name}\nContent: {text}\n"
-                    f"Additional Context: {additional_context}"
+                    f"Additional Context: {ctx_str}"
                 )
             else:
                 context_block = f"Source Name: {source_name}\nContent: {text}"

47-51: Fix docstring Args for is_owasp_question.

Docstring mentions context_chunks which isn’t a parameter. Clean up to avoid confusion.

-        Args:
-            text: The input text to check.
-            context_chunks: Retrieved context chunks from the user's query.
+        Args:
+            text: The input text to check.

8-10: Move regex import to top-level (if you adopt regex parsing).

If you use the regex-based YES/NO parsing, import re once at the top for clarity.

 import os
+import re
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 690488c and e687a7c.

📒 Files selected for processing (1)
  • backend/apps/slack/common/question_detector.py (3 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
backend/apps/slack/common/question_detector.py (2)
backend/apps/ai/agent/tools/rag/retriever.py (2)
  • Retriever (21-268)
  • retrieve (184-245)
backend/apps/core/models/prompt.py (2)
  • Prompt (14-161)
  • get_slack_question_detector_prompt (154-161)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
  • GitHub Check: Run frontend e2e tests
  • GitHub Check: Run frontend unit tests
  • GitHub Check: Run backend tests
  • GitHub Check: CodeQL (javascript-typescript)

@sonarqubecloud

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🧹 Nitpick comments (7)
backend/tests/apps/slack/common/question_detector_test.py (7)

26-32: Assert Retriever.retrieve is invoked with correct args (limit/threshold).

Great that Retriever is mocked; but none of the tests assert that is_owasp_question() calls retrieve() with CHUNKS_RETRIEVAL_LIMIT and DEFAULT_SIMILARITY_THRESHOLD. Add a focused test to lock this contract.

Apply this diff (adds a new test):

@@
 class TestQuestionDetector:
@@
     def test_init(self, detector):
         """Test QuestionDetector initialization."""
@@
         assert hasattr(detector, "retriever")
 
+    def test_is_owasp_question_invokes_retriever_with_expected_args(self, detector, monkeypatch):
+        """Ensure retriever.retrieve is called with limit and similarity threshold."""
+        retrieve_mock = MagicMock(return_value=[])
+        detector.retriever.retrieve = retrieve_mock
+        monkeypatch.setattr(detector, "is_owasp_question_with_openai", MagicMock(return_value=True))
+
+        q = "What is OWASP?"
+        assert detector.is_owasp_question(q) is True
+        retrieve_mock.assert_called_once()
+        _, kwargs = retrieve_mock.call_args
+        assert kwargs["query"] == q
+        assert kwargs["limit"] == detector.CHUNKS_RETRIEVAL_LIMIT
+        # Import constant from the module under test to avoid path coupling
+        from apps.slack.common.question_detector import DEFAULT_SIMILARITY_THRESHOLD
+        assert kwargs["similarity_threshold"] == DEFAULT_SIMILARITY_THRESHOLD

Also applies to: 74-79


98-101: Clarify misleading comment about “regardless of OpenAI result”.

The wrapper simply returns the OpenAI result (or False on failure). The current comment suggests otherwise and can confuse future maintainers.

Apply this diff:

-        # Test non-questions (should return False regardless of OpenAI result)
+        # Test non-questions; here we mock OpenAI to return False, so wrapper should return False

416-425: Unify empty-context phrasing across code paths (“No context provided” vs “No context available”).

format_context_chunks() returns "No context provided", while is_owasp_question_with_openai() uses "No context available" when chunks are falsy. This minor inconsistency complicates testing and log/search reliability.

Option A (preferred): Always call format_context_chunks(context_chunks) and rely on its "No context provided".

-        formatted_context = (
-            self.format_context_chunks(context_chunks)
-            if context_chunks
-            else "No context available"
-        )
+        formatted_context = self.format_context_chunks(context_chunks)

Option B: Change format_context_chunks() to return "No context available" and adjust tests accordingly.

Based on related production snippet.

Also applies to: 426-430, 431-434


227-248: Also assert OpenAI call parameters and message shape.

These tests validate return values but don’t assert the request payload. Add one representative test to lock model, temperature, max_tokens, and message contents (system+user with context).

Apply this diff:

@@
     def test_is_owasp_question_with_openai_success_yes(self, sample_context_chunks):
@@
-            result = detector.is_owasp_question_with_openai(
-                "What is OWASP?", sample_context_chunks
-            )
+            question = "What is OWASP?"
+            result = detector.is_owasp_question_with_openai(question, sample_context_chunks)
             assert result is True
+
+            # Assert OpenAI call parameters and message structure
+            mock_client.chat.completions.create.assert_called_once()
+            _, kwargs = mock_client.chat.completions.create.call_args
+            assert kwargs["model"] == detector.CHAT_MODEL
+            assert kwargs["temperature"] == detector.TEMPERATURE
+            assert kwargs["max_tokens"] == detector.MAX_TOKENS
+            assert isinstance(kwargs["messages"], list) and len(kwargs["messages"]) == 2
+            assert kwargs["messages"][0]["role"] == "system"
+            assert "System prompt with {context}" in kwargs["messages"][0]["content"]
+            assert kwargs["messages"][1]["role"] == "user"
+            assert "Question:" in kwargs["messages"][1]["content"]
+            assert question in kwargs["messages"][1]["content"]
+            assert "Context:" in kwargs["messages"][1]["content"]

Also applies to: 251-274, 368-391, 392-415


43-66: Cover formatter branch where additional_context is absent.

sample_context_chunks only contains entries with additional_context; add one without it to exercise the else-branch in format_context_chunks().

Apply this diff:

         return [
@@
             },
             {
                 "source_name": "XSS Prevention",
@@
                 "additional_context": {"risk_level": "high", "category": "injection"},
             },
+            {
+                "source_name": "Cheat Sheet Series",
+                "text": "A set of best practices for secure development by topic.",
+                # No additional_context on purpose to cover the else path
+            },
         ]

And assert it in test_format_context_chunks:

-        assert "Source Name: XSS Prevention" in formatted
+        assert "Source Name: XSS Prevention" in formatted
+        assert "Source Name: Cheat Sheet Series" in formatted
+        assert "Additional Context:" in formatted  # present for items that include it

184-184: Also assert CHUNKS_RETRIEVAL_LIMIT.

Since CHUNKS_RETRIEVAL_LIMIT is central to retrieval behavior, include it in class constants tests.

Apply this diff:

     def test_class_constants(self, detector):
         """Test that class constants are properly defined."""
         assert detector.MAX_TOKENS == 10
         assert detector.TEMPERATURE == 0.1
         assert detector.CHAT_MODEL == "gpt-4o-mini"
+        assert detector.CHUNKS_RETRIEVAL_LIMIT == 10

Also applies to: 186-186


74-79: Add a wrapper test for OpenAI None → False.

Ensure is_owasp_question() returns False when is_owasp_question_with_openai() returns None.

Apply this diff:

+    def test_is_owasp_question_returns_false_on_openai_none(self, detector, monkeypatch):
+        monkeypatch.setattr(detector, "is_owasp_question_with_openai", MagicMock(return_value=None))
+        assert detector.is_owasp_question("What is OWASP?") is False

Also applies to: 90-94

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between e687a7c and f6ed95b.

📒 Files selected for processing (1)
  • backend/tests/apps/slack/common/question_detector_test.py (14 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
backend/tests/apps/slack/common/question_detector_test.py (3)
backend/apps/ai/agent/tools/rag/retriever.py (1)
  • retrieve (184-245)
backend/apps/slack/common/question_detector.py (4)
  • QuestionDetector (20-155)
  • is_owasp_question (42-69)
  • is_owasp_question_with_openai (71-124)
  • format_context_chunks (126-155)
backend/tests/apps/slack/events/message_posted_test.py (1)
  • test_init (44-47)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: CodeQL (javascript-typescript)
  • GitHub Check: CodeQL (python)
🔇 Additional comments (1)
backend/tests/apps/slack/common/question_detector_test.py (1)

110-120: LGTM: solid, readable tests; nice coverage of cases and error paths.

Case-insensitivity, whitespace/special-chars, positive/negative OpenAI responses, and API error/fallback paths are well covered and easy to maintain.

Also applies to: 121-131, 132-137, 156-161, 169-172, 177-180, 233-234, 245-247, 257-258, 269-271, 293-295, 317-319, 363-364, 368-391, 392-415

@arkid15r arkid15r enabled auto-merge (squash) October 22, 2025 00:07
@arkid15r arkid15r merged commit 36fc06f into OWASP:feature/nestbot-ai-assistant Oct 22, 2025
25 checks passed
arkid15r pushed a commit that referenced this pull request Nov 16, 2025
* question detection refining

* sonar qube fixes

* fix tests
mirnumaan pushed a commit to mirnumaan/Nest that referenced this pull request Nov 16, 2025
* question detection refining

* sonar qube fixes

* fix tests
github-merge-queue bot pushed a commit that referenced this pull request Nov 19, 2025
* add the aria-label

* fixed the code style, fixed by the formatter

* Handled the undefined repositoryName in aria-label.

* fix the white spaces

* Add aria-labels to interactive elements for WCAG 2.1 compliance

* Sync www-repopsitories (#2164)

* spelling fixes and tests

* sonar and code rabbit suggestions implemented

* json chunking and suggestions implemented

* code rabbit and sonar qube suggestions

* code rabbit suggestions

* suggestions implemented

* github advance security addressed

* tests fixed

* fixed tests

* Clean up backend/test_commands.py

---------

Co-authored-by: Arkadii Yakovets <2201626+arkid15r@users.noreply.github.com>
Co-authored-by: Arkadii Yakovets <arkadii.yakovets@owasp.org>

* Update docker-compose/local.yaml

* Nestbot MVP (#2113)

* Sync www-repopsitories (#2164)

* spelling fixes and tests

* sonar and code rabbit suggestions implemented

* json chunking and suggestions implemented

* code rabbit and sonar qube suggestions

* code rabbit suggestions

* suggestions implemented

* github advance security addressed

* tests fixed

* fixed tests

* Clean up backend/test_commands.py

---------

Co-authored-by: Arkadii Yakovets <2201626+arkid15r@users.noreply.github.com>
Co-authored-by: Arkadii Yakovets <arkadii.yakovets@owasp.org>

* Consolidate code commits

* Update cspell/custom-dict.txt

* Update docker-compose/local.yaml

* local yaml worder volume fix

* instance check

* poetry file updated

---------

Co-authored-by: Arkadii Yakovets <2201626+arkid15r@users.noreply.github.com>
Co-authored-by: Arkadii Yakovets <arkadii.yakovets@owasp.org>

* fixed the duplication error for chunks and context (#2343)

* Fix slack and duplication errors (#2352)

* fix slack and duplication errors

* code rabbit suggestions

* integrity error solved

* using set

* Update code

---------

Co-authored-by: Arkadii Yakovets <arkadii.yakovets@owasp.org>

* Response improvements and refactoring (#2407)

* improvements and refactoring

* added prompt checks and tests for it

* question detection refining (#2443)

* question detection refining

* sonar qube fixes

* fix tests

* Agentic rag (#2432)

* agentic rag

* spelling fixes

* code rabbit and sonar qube suggestions

* code rabbit suggestions

* refining

* fix test

* refining

* added question detectoor to nestbot mentions (#2473)

* Merge NestBot AI Assistant feature branch

* Update docker-compose/local.yaml

* Update backend/Makefile

* fix the white spaces

* Add aria-labels to interactive elements for WCAG 2.1 compliance

* Fix equality checks with floating point values (#2658)

* Edit in test cases for the new aira-label update

* More edits to make the test more unbreakable and few regex edits .

* Revert one change regards two seperate Contributors

* Regex check fix

* Global 10-second timeout and increased other mobile timeout to fix the  Mobile Safari flaky tests

---------

Co-authored-by: Dishant Miyani <dishantmiyani1804@gmail.com>
Co-authored-by: Arkadii Yakovets <2201626+arkid15r@users.noreply.github.com>
Co-authored-by: Arkadii Yakovets <arkadii.yakovets@owasp.org>
Co-authored-by: Kate Golovanova <kate@kgthreads.com>
@coderabbitai coderabbitai bot mentioned this pull request Jan 30, 2026