Skip to content

Chore/scan tunning parameters#174

Merged
agustingroh merged 5 commits intomainfrom
chore/scan-tunning-parameters
Feb 2, 2026
Merged

Chore/scan tunning parameters#174
agustingroh merged 5 commits intomainfrom
chore/scan-tunning-parameters

Conversation

@agustingroh
Copy link
Contributor

@agustingroh agustingroh commented Dec 17, 2025

What's Changed

Added

  • Added scan engine tuning parameters for snippet matching:
    • --min-snippet-hits - Minimum snippet hits required (0 defers to server config)
    • --min-snippet-lines - Minimum snippet lines required (0 defers to server config)
    • --ranking - Enable/disable result ranking (unset/true/false)
    • --ranking-threshold - Ranking threshold value (-1 to 99, -1 defers to server config)
    • --honour-file-exts - Honour file extensions during matching (unset/true/false)
  • Added file_snippet section to scanoss.json settings schema for configuring tuning parameters
  • Added ScanSettingsBuilder class for merging CLI and settings file configurations with priority: CLI > file_snippet > root settings

Summary by CodeRabbit

  • New Features
    • New snippet-tuning options (min-snippet-hits, min-snippet-lines, ranking, ranking-threshold, honour-file-exts) exposed in CLI and configurable via a new file_snippet settings section, including proxy/HTTP options.
    • Scan requests may include scan settings via an encoded header when tuning options are set.
  • Documentation
    • Added "Scan with Snippet Tuning Options" with CLI examples and scanoss.json guidance; minor formatting cleanup.
  • Tests
    • Added tests for settings merging and parsing; updated API test expectations for headers.

@agustingroh agustingroh force-pushed the chore/scan-tunning-parameters branch 4 times, most recently from 0e16a3e to 1bd454d Compare December 17, 2025 15:58
@github-actions
Copy link

SCANOSS SCAN Completed 🚀

  • Detected components: 1
  • Undeclared components: 0
  • Declared components: 1
  • Detected files: 90
  • Detected files undeclared: 0
  • Detected files declared: 90
  • Licenses detected: 1
  • Licenses detected with copyleft: 0
  • Policies: ✅ 1 pass (1 total)

View more details on SCANOSS Action Summary

@agustingroh agustingroh force-pushed the chore/scan-tunning-parameters branch from 1bd454d to c12833c Compare December 17, 2025 16:30
@github-actions
Copy link

SCANOSS SCAN Completed 🚀

  • Detected components: 1
  • Undeclared components: 0
  • Declared components: 1
  • Detected files: 90
  • Detected files undeclared: 0
  • Detected files declared: 90
  • Licenses detected: 1
  • Licenses detected with copyleft: 0
  • Policies: ✅ 1 pass (1 total)

View more details on SCANOSS Action Summary

@agustingroh agustingroh force-pushed the chore/scan-tunning-parameters branch from c12833c to 92ed9a1 Compare December 17, 2025 16:35
@github-actions
Copy link

SCANOSS SCAN Completed 🚀

  • Detected components: 1
  • Undeclared components: 0
  • Declared components: 1
  • Detected files: 90
  • Detected files undeclared: 0
  • Detected files declared: 90
  • Licenses detected: 1
  • Licenses detected with copyleft: 0
  • Policies: ✅ 1 pass (1 total)

View more details on SCANOSS Action Summary

@agustingroh agustingroh force-pushed the chore/scan-tunning-parameters branch from 92ed9a1 to ebe0ef3 Compare December 18, 2025 14:33
@coderabbitai
Copy link

coderabbitai bot commented Dec 18, 2025

📝 Walkthrough

Walkthrough

Adds snippet-tuning options configurable via CLI and scanoss.json, a ScanSettingsBuilder to merge CLI/file/root settings, threads merged settings through Scanner and ScanossApi (sent as a base64 scanoss-settings header), and updates schemas, docs, and tests.

Changes

Cohort / File(s) Summary
Settings schema & data
docs/source/_static/scanoss-settings-schema.json, src/scanoss/data/scanoss-settings-schema.json, scanoss.json, tests/data/scanoss.json
Add file_snippet block with nested proxy/http_config and snippet tuning fields (min_snippet_hits, min_snippet_lines, ranking*, ranking_threshold, honour_file_exts, skip_headers, skip_headers_limit); add hpfm/container schema entries. Verify defaults, ranges, and schema consistency.
Settings Builder
src/scanoss/scan_settings_builder.py
New ScanSettingsBuilder (and MAX_RANKING_THRESHOLD) that merges CLI, file_snippet, and root settings with defined precedence, validates/clamps numeric fields, normalizes booleans, and exposes fluent with_* methods. High logic density — review merging, validation, and defaults.
Settings accessors
src/scanoss/scanoss_settings.py
Add getters for file_snippet and root proxy/http_config and snippet tuning values (min/snippet/ranking/etc.) with file_snippet taking precedence where applicable. Check nullability and fallback semantics.
CLI wiring
src/scanoss/cli.py
Add CLI flags (--min-snippet-hits, --min-snippet-lines, --ranking, --ranking-threshold, --honour-file-exts), new helper get_scanoss_settings_from_args(), and thread scanoss_settings through scan and folder-hash flows. Inspect changed call signatures and propagation.
Scanner & scanning flow
src/scanoss/scanner.py, src/scanoss/scanpostprocessor.py
Rename constructor param scan_settingsscanoss_settings; accept and propagate snippet tuning params; add _merge_cli_with_settings merging logic; pass scanoss_settings to FileFilters, ScanPostProcessor, and scan components. Verify dependency-scan fallback and merged skip_headers behaviour.
API integration
src/scanoss/scanossapi.py
Extend constructor with snippet params; add _build_scan_settings_header() to produce a base64-encoded JSON of effective settings and attach it as scanoss-settings header in scan() requests. Validate encoding and header inclusion.
Docs & help
CHANGELOG.md, CLIENT_HELP.md
Add release note and client help entries documenting snippet tuning CLI flags and scanoss.json examples (CLIENT_HELP contains duplicated/inset block). Confirm examples match schema.
Tests
tests/test_scan_settings_builder.py, tests/scanossapi-test.py
Add comprehensive unit tests for ScanSettingsBuilder; update API header test to expect X-Session. Review coverage, mocked inputs, and expectations.

Sequence Diagram(s)

sequenceDiagram
    participant User
    participant CLI
    participant File as ScanossSettingsFile
    participant Builder as ScanSettingsBuilder
    participant Scanner
    participant API as ScanossApi
    participant Server

    User->>CLI: run scan with CLI flags
    CLI->>File: load scanoss.json
    CLI->>Builder: get_scanoss_settings_from_args(args, settings)
    Builder->>Builder: merge (priority: file_snippet > root > CLI)
    Builder-->>Scanner: resolved scanoss_settings + options
    Scanner->>Scanner: merge skip_headers/limits
    Scanner->>API: instantiate with snippet params
    API->>API: _build_scan_settings_header() -> base64(JSON)
    API->>Server: scan request with header "scanoss-settings"
    Server-->>API: response
    API-->>Scanner: results
    Scanner-->>CLI: present results
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

Suggested reviewers

  • eeisegn

Poem

🐰 Snippets tuned with carrot care and cheer,
CLI flags and file settings whisper near.
Builder blends values, neat and spry,
Encoded headers hop off to the sky.
Hooray—matching hops now clearer! 🥕

🚥 Pre-merge checks | ✅ 2 | ❌ 1
❌ Failed checks (1 inconclusive)
Check name Status Explanation Resolution
Title check ❓ Inconclusive The title 'Chore/scan tunning parameters' is vague and contains a typo ('tunning' instead of 'tuning'). While it references the general domain of scan tuning parameters, it lacks specificity about the main changes (CLI options, settings schema, ScanSettingsBuilder class, documentation). The title is overly broad and does not clearly convey what the primary change is. Clarify the title to be more specific about the main changes, e.g., 'Add scan engine tuning parameters (snippet matching, ranking, headers)' or 'Add CLI and config support for snippet tuning parameters'. Fix the typo: 'tunning' → 'tuning'.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Docstring Coverage ✅ Passed Docstring coverage is 97.40% which is sufficient. The required threshold is 80.00%.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch chore/scan-tunning-parameters

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@github-actions
Copy link

SCANOSS SCAN Completed 🚀

  • Detected components: 2
  • Undeclared components: 0
  • Declared components: 2
  • Detected files: 101
  • Detected files undeclared: 0
  • Detected files declared: 101
  • Licenses detected: 1
  • Licenses detected with copyleft: 0
  • Policies: ✅ 1 pass (1 total)

View more details on SCANOSS Action Summary

@agustingroh agustingroh marked this pull request as ready for review December 18, 2025 14:38
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (3)
src/scanoss/scanossapi.py (1)

58-80: Scan settings header logic is sound; update docs to match behaviour

Functionally this looks good:

  • Tuning fields are stored on the instance and _build_scan_settings_header() emits a base64-encoded JSON blob only when values are “meaningful” (!= 0 for snippet counts, != 'unset' for booleans, != -1 for ranking threshold).
  • scan() attaches the header as scanoss-settings, so the extra metadata rides with each request.

A few minor doc cleanups would help:

  • __init__ docstring still says the new params default to None, but honour_file_exts and ranking actually default to 'unset'.
  • _build_scan_settings_header docstring references x-scanoss-scan-settings; the actual header key is scanoss-settings.
  • Consider clarifying in the docstring that 0 and -1 are treated as “defer to server configuration” and intentionally omitted from the header.

No changes to runtime logic are needed.

Also applies to: 90-95, 173-180, 293-326

src/scanoss/scanner.py (2)

147-171: Skip-headers merge logic is correct; fix _merge_cli_with_settings docstring

The new logic to merge skip_headers and skip_headers_limit:

settings_skip_headers = file_snippet_settings.get('skip_headers')
settings_skip_headers_limit = file_snippet_settings.get('skip_headers_limit')

skip_headers = Scanner._merge_cli_with_settings(skip_headers, settings_skip_headers)
skip_headers_limit = Scanner._merge_cli_with_settings(skip_headers_limit, settings_skip_headers_limit)

combined with:

if settings_value is not None:
    return settings_value
return cli_value

correctly gives precedence to values from scanoss.json over CLI defaults, which is sensible.

However, the _merge_cli_with_settings docstring claims “Merged value with CLI taking priority over settings”, which is the opposite of what the function actually does (and what the code above relies on). I’d update the “Returns:” description to clearly state that settings take priority over CLI values.

Also applies to: 248-260


173-238: post_processor creation condition uses the wrong variable

Here:

scan_settings = (ScanSettingsBuilder(scanoss_settings)
    .with_proxy(proxy)
    .with_url(url)
    .with_ignore_cert_errors(ignore_cert_errors)
    .with_min_snippet_hits(min_snippet_hits)
    .with_min_snippet_lines(min_snippet_lines)
    .with_honour_file_exts(honour_file_exts)
    .with_ranking(ranking)
    .with_ranking_threshold(ranking_threshold)
)

...

self.post_processor = (
    ScanPostProcessor(scanoss_settings, debug=debug, trace=trace, quiet=quiet) if scan_settings else None
)

scan_settings is always a ScanSettingsBuilder instance, so this condition is effectively always true, even when scanoss_settings is None. That’s a behavioural change from the usual “only post-process when we actually have settings” pattern and risks constructing ScanPostProcessor with a None settings object.

If ScanPostProcessor assumes a non-None settings object, this could lead to unexpected errors; even if it handles None, the condition is misleading.

Suggest switching the condition to check scanoss_settings instead of the builder:

Proposed fix
-        self.post_processor = (
-            ScanPostProcessor(scanoss_settings, debug=debug, trace=trace, quiet=quiet) if scan_settings else None
-        )
+        self.post_processor = (
+            ScanPostProcessor(scanoss_settings, debug=debug, trace=trace, quiet=quiet)
+            if scanoss_settings
+            else None
+        )
🧹 Nitpick comments (4)
tests/test_scan_settings_builder.py (1)

33-359: Good coverage of ScanSettingsBuilder behaviour

The tests exercise initialization, helper methods, all with_* methods, priority rules, and chaining in a realistic way using tests/data/scanoss.json. This gives solid confidence in the builder logic.

If you want to harden things further, consider adding a couple of tests for sentinel values (min_snippet_hits == 0, ranking_threshold == -1) to reflect the semantics used when building the API header (i.e., “defer to server config” cases).

src/scanoss/cli.py (1)

190-222: Snippet-tuning CLI options wired correctly; consider tightening validation

The new --min-snippet-hits, --min-snippet-lines, --ranking, --ranking-threshold, and --honour-file-exts options are consistent with how Scanner and ScanSettingsBuilder consume them, including the “0 / -1 mean defer to server config” semantics.

Two optional polish items:

  • --ranking-threshold: you document -1..99 but don’t enforce it; a custom type or validator would catch out-of-range values early.
  • --honour-file-exts: help text is clear here, but its JSON-schema description currently says “Ignores file extensions”; aligning those descriptions would avoid confusion between positive/negative wording.
docs/source/_static/scanoss-settings-schema.json (1)

142-260: Schema additions align with code; tweak honour_file_exts description

The new settings.proxy, settings.http_config, settings.file_snippet, settings.hpfm, and settings.container blocks line up well with the new getters and builder logic in code (same field names and types, including ranking_*, min_snippet_*, honour_file_exts, skip header controls, etc.).

One small documentation inconsistency:

  • file_snippet.properties.honour_file_exts.description says “Ignores file extensions…”, but the field name and CLI option (--honour-file-exts) are phrased positively (“honour file extensions”).

To avoid confusion, it would be clearer if this description matched the positive semantics (or explicitly stated whether true means “honour” vs “ignore”).

src/scanoss/scanner.py (1)

342-346: Dependency auto-enable via settings: confirm intended behaviour

is_dependency_scan() now returns True either when the dependency bit is set in scan_options or when settings.file_snippet.dependency_analysis is true:

if self.scan_options & ScanType.SCAN_DEPENDENCIES.value:
    return True
file_snippet_settings = self.scanoss_settings.get_file_snippet_settings() if self.scanoss_settings else {}
return file_snippet_settings.get('dependency_analysis', False)

This means a settings file can implicitly turn on dependency scanning even if the user didn’t pass --dependencies/--dependencies-only. That’s reasonable, but it is a change in behaviour.

If this is intentional, it might be worth documenting somewhere that dependency_analysis in scanoss.json acts as a default “enable deps scan” toggle, overridden only by explicitly disabling dependency bits in scan_options.

📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 1e0f941 and ebe0ef3.

📒 Files selected for processing (12)
  • CHANGELOG.md (2 hunks)
  • CLIENT_HELP.md (1 hunks)
  • docs/source/_static/scanoss-settings-schema.json (1 hunks)
  • scanoss.json (1 hunks)
  • src/scanoss/__init__.py (1 hunks)
  • src/scanoss/cli.py (5 hunks)
  • src/scanoss/scan_settings_builder.py (1 hunks)
  • src/scanoss/scanner.py (9 hunks)
  • src/scanoss/scanoss_settings.py (1 hunks)
  • src/scanoss/scanossapi.py (6 hunks)
  • tests/data/scanoss.json (1 hunks)
  • tests/test_scan_settings_builder.py (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (3)
tests/test_scan_settings_builder.py (3)
src/scanoss/scan_settings_builder.py (12)
  • ScanSettingsBuilder (31-263)
  • _str_to_bool (220-226)
  • _merge_with_priority (203-209)
  • _merge_cli_with_settings (212-216)
  • with_proxy (69-83)
  • with_url (85-99)
  • with_ignore_cert_errors (101-119)
  • with_min_snippet_hits (121-134)
  • with_min_snippet_lines (136-149)
  • with_honour_file_exts (151-167)
  • with_ranking (169-184)
  • with_ranking_threshold (186-199)
src/scanoss/scanoss_settings.py (1)
  • ScanossSettings (76-433)
src/scanoss/scanner.py (1)
  • _merge_cli_with_settings (249-260)
src/scanoss/cli.py (1)
src/scanoss/scanoss_settings.py (4)
  • ScanossSettings (76-433)
  • load_json_file (103-138)
  • set_file_type (140-152)
  • set_scan_type (154-161)
src/scanoss/scan_settings_builder.py (2)
src/scanoss/scanoss_settings.py (6)
  • ScanossSettings (76-433)
  • get_file_snippet_settings (339-345)
  • get_file_snippet_proxy (419-425)
  • get_proxy (403-409)
  • get_http_config (411-417)
  • get_file_snippet_http_config (427-433)
src/scanoss/scanner.py (1)
  • _merge_cli_with_settings (249-260)
🪛 markdownlint-cli2 (0.18.1)
CHANGELOG.md

776-776: Link and image reference definitions should be needed
Duplicate link or image reference definition: "1.42.0"

(MD053, link-image-reference-definitions)

🔇 Additional comments (7)
src/scanoss/__init__.py (1)

25-25: LGTM!

Version bump to 1.43.0 appropriately reflects the new scan tuning parameters and ScanSettingsBuilder additions.

scanoss.json (1)

18-21: LGTM!

Adding scanoss-winnowing.py to the BOM includes is appropriate since it's a declared dependency used for fast winnowing functionality.

CHANGELOG.md (1)

11-20: LGTM!

The changelog entry for version 1.43.0 clearly documents the new scan engine tuning parameters and ScanSettingsBuilder functionality.

CLIENT_HELP.md (1)

262-304: LGTM!

Comprehensive documentation for the new snippet tuning options. The examples are clear, and the note about using = syntax for negative values (line 292) is a helpful detail for users.

tests/data/scanoss.json (1)

28-51: LGTM!

The test data structure effectively exercises the nested configuration hierarchy with distinct values at root and file_snippet levels (e.g., different proxy hosts and http_config settings), which enables proper testing of the ScanSettingsBuilder merge priority logic.

src/scanoss/cli.py (1)

1410-1437: scan / wfp / folder- settings wiring looks consistent*

The refactor to use scanoss_settings (instead of the old scan_settings) in wfp(), scan(), folder_hashing_scan(), and folder_hash() is coherent:

  • Settings files are loaded once into ScanossSettings, with legacy vs new file types handled in scan().
  • The same scanoss_settings instance is threaded into Scanner, ScannerHFH, and FolderHasher.
  • New snippet-tuning CLI options are only passed to Scanner for the scan command, which is appropriate given they affect server-side scanning, not fingerprint-only flows.

No functional issues stand out in these call sites.

Also applies to: 1518-1532, 1597-1641, 2716-2831

src/scanoss/scanoss_settings.py (1)

339-433: New settings accessors are straightforward and consistent

The added getters for file_snippet, proxy, and http_config (including min_snippet_*, ranking, honour_file_exts, skip headers, etc.) correctly mirror the underlying data structure and provide clear defaults (None/0/False as documented). They integrate cleanly with ScanSettingsBuilder and Scanner.

@agustingroh agustingroh force-pushed the chore/scan-tunning-parameters branch from ebe0ef3 to 59b57ba Compare December 18, 2025 14:55
@github-actions
Copy link

SCANOSS SCAN Completed 🚀

  • Detected components: 2
  • Undeclared components: 0
  • Declared components: 2
  • Detected files: 101
  • Detected files undeclared: 0
  • Detected files declared: 101
  • Licenses detected: 1
  • Licenses detected with copyleft: 0
  • Policies: ✅ 1 pass (1 total)

View more details on SCANOSS Action Summary

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

♻️ Duplicate comments (1)
CHANGELOG.md (1)

775-776: Fix duplicate link reference definition.

The duplicate link reference issue flagged in the previous review remains unresolved. Line 776 uses [1.42.0] as the link label, but it should be [1.43.0] to match the version and prevent Markdown rendering issues.

🔎 Apply this diff to fix the link label:
 [1.42.0]: https://github.com/scanoss/scanoss.py/compare/v1.41.1...v1.42.0
-[1.42.0]: https://github.com/scanoss/scanoss.py/compare/v1.42.0...v1.43.0
+[1.43.0]: https://github.com/scanoss/scanoss.py/compare/v1.42.0...v1.43.0
🧹 Nitpick comments (2)
src/scanoss/cli.py (1)

190-222: Fix inconsistent spacing in choices lists.

Lines 206 and 219 have inconsistent spacing in the choices parameter:

  • Line 206: choices=['unset' ,'true', 'false'] (space before comma)
  • Line 219: choices=['unset','true', 'false'] (no space after first comma)
🔎 Apply this diff for consistent formatting:
     p_scan.add_argument(
         '--ranking',
         type=str,
-        choices=['unset' ,'true', 'false'],
+        choices=['unset', 'true', 'false'],
         default='unset',
         help='Enable or disable ranking (optional - default: server configuration)',
     )
     p_scan.add_argument(
         '--honour-file-exts',
         type=str,
-        choices=['unset','true', 'false'],
+        choices=['unset', 'true', 'false'],
         default='unset',
         help='Honour file extensions during scanning. When not set, defers to server configuration (optional)',
     )
src/scanoss/scanossapi.py (1)

75-79: Fix spacing before comma in parameter default.

Line 77 has a space before the comma: honour_file_exts = 'unset' ,

🔎 Apply this diff:
-        honour_file_exts = 'unset' ,
+        honour_file_exts='unset',
📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between ebe0ef3 and 59b57ba.

📒 Files selected for processing (11)
  • CHANGELOG.md (2 hunks)
  • CLIENT_HELP.md (1 hunks)
  • scanoss.json (1 hunks)
  • src/scanoss/__init__.py (1 hunks)
  • src/scanoss/cli.py (5 hunks)
  • src/scanoss/scan_settings_builder.py (1 hunks)
  • src/scanoss/scanner.py (9 hunks)
  • src/scanoss/scanoss_settings.py (1 hunks)
  • src/scanoss/scanossapi.py (6 hunks)
  • tests/data/scanoss.json (1 hunks)
  • tests/test_scan_settings_builder.py (1 hunks)
✅ Files skipped from review due to trivial changes (1)
  • CLIENT_HELP.md
🚧 Files skipped from review as they are similar to previous changes (3)
  • src/scanoss/scan_settings_builder.py
  • tests/data/scanoss.json
  • tests/test_scan_settings_builder.py
🧰 Additional context used
🧬 Code graph analysis (1)
src/scanoss/cli.py (2)
src/scanoss/scanoss_settings.py (4)
  • ScanossSettings (76-433)
  • load_json_file (103-138)
  • set_file_type (140-152)
  • set_scan_type (154-161)
src/scanoss/inspection/utils/file_utils.py (1)
  • load_json_file (29-44)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: build
🔇 Additional comments (4)
src/scanoss/__init__.py (1)

25-25: LGTM!

Version bump is consistent with the new features introduced in this PR.

scanoss.json (1)

18-21: LGTM!

Adding the scanoss-winnowing.py library to the BOM is appropriate for dependency tracking.

src/scanoss/scanoss_settings.py (1)

339-433: LGTM!

The new getter methods provide clean access to file_snippet settings with consistent patterns, appropriate defaults, and clear documentation.

src/scanoss/scanner.py (1)

173-182: The ScanSettingsBuilder implementation does not match the stated priority order. The builder implements file_snippet > root settings > CLI, where CLI arguments are the lowest priority (used as fallback). This is the opposite of the PR objectives which require CLI > file_snippet > root settings. The priority order logic must be reversed in the merge methods to ensure CLI arguments take precedence when provided.

Likely an incorrect or invalid review comment.

@agustingroh agustingroh force-pushed the chore/scan-tunning-parameters branch from 59b57ba to 0fb75e9 Compare December 18, 2025 15:14
@agustingroh agustingroh self-assigned this Dec 18, 2025
@agustingroh agustingroh added the enhancement New feature or request label Dec 18, 2025
@github-actions
Copy link

SCANOSS SCAN Completed 🚀

  • Detected components: 2
  • Undeclared components: 0
  • Declared components: 2
  • Detected files: 101
  • Detected files undeclared: 0
  • Detected files declared: 101
  • Licenses detected: 1
  • Licenses detected with copyleft: 0
  • Policies: ✅ 1 pass (1 total)

View more details on SCANOSS Action Summary

@scanoss scanoss deleted a comment from github-actions bot Dec 18, 2025
@scanoss scanoss deleted a comment from github-actions bot Dec 18, 2025
@scanoss scanoss deleted a comment from github-actions bot Dec 18, 2025
@scanoss scanoss deleted a comment from github-actions bot Dec 18, 2025
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

♻️ Duplicate comments (1)
src/scanoss/scanner.py (1)

147-157: Priority order contradicts PR objectives.

Lines 153-155 use _merge_cli_with_settings which implements "settings > cli" priority (as seen in lines 258-260). However, the PR objectives explicitly state: "priority: CLI > file_snippet > root settings".

This means CLI arguments should override settings file values, but the current implementation does the opposite for skip_headers and skip_headers_limit.

🔎 Apply this diff to fix the priority order:
     @staticmethod
     def _merge_cli_with_settings(cli_value, settings_value):
-        """Merge CLI value with settings value (two-level priority: settings > cli).
+        """Merge CLI value with settings value (two-level priority: cli > settings).
 
         Args:
             cli_value: Value from CLI argument
             settings_value: Value from scanoss.json file_snippet settings
         Returns:
-            Merged value with CLI taking priority over settings
+            Merged value with CLI taking priority over settings file
         """
+        if cli_value is not None:
+            return cli_value
         if settings_value is not None:
             return settings_value
-        return cli_value
+        return None
🧹 Nitpick comments (3)
src/scanoss/scanossapi.py (1)

75-79: Consider type enforcement for boolean parameters.

The honour_file_exts and ranking parameters use Union[bool, str, None] to support both boolean values and the 'unset' sentinel string. While the ScanSettingsBuilder correctly converts string inputs to booleans before passing them to ScanossApi, direct instantiation of ScanossApi could lead to strings like 'true'/'false' being serialized as JSON strings rather than booleans.

Consider adding runtime validation in _build_scan_settings_header() to ensure these values are booleans when included:

🔎 Suggested fix
         # honour_file_exts: None = unset, don't send
         if self.honour_file_exts is not None and self.honour_file_exts != 'unset':
-            settings['honour_file_exts'] = self.honour_file_exts
+            settings['honour_file_exts'] = bool(self.honour_file_exts) if not isinstance(self.honour_file_exts, bool) else self.honour_file_exts

         # ranking: None = unset, don't send
         if self.ranking is not None and self.ranking != 'unset':
-            settings['ranking_enabled'] = self.ranking
+            settings['ranking_enabled'] = bool(self.ranking) if not isinstance(self.ranking, bool) else self.ranking
tests/test_scan_settings_builder.py (1)

36-38: Consider using setUpClass for shared fixtures.

The scan_settings fixture is created at class definition time. While this works, using setUpClass is the conventional approach for shared test fixtures in unittest:

🔎 Suggested refactor
 class TestScanSettingsBuilder(unittest.TestCase):
     """Tests for the ScanSettingsBuilder class."""

-    script_dir = os.path.dirname(os.path.abspath(__file__))
-    scan_settings_path = Path(script_dir, 'data', 'scanoss.json').resolve()
-    scan_settings = ScanossSettings(filepath=scan_settings_path)
+    @classmethod
+    def setUpClass(cls):
+        script_dir = os.path.dirname(os.path.abspath(__file__))
+        cls.scan_settings_path = Path(script_dir, 'data', 'scanoss.json').resolve()
+        cls.scan_settings = ScanossSettings(filepath=str(cls.scan_settings_path))
src/scanoss/cli.py (1)

190-222: Fix inconsistent spacing in choices lists.

Lines 206 and 219 have inconsistent spacing in the choices parameter:

  • Line 206: choices=['unset' ,'true', 'false'] has a space before the comma after 'unset'
  • Line 219: choices=['unset','true', 'false'] is missing a space after 'unset' comma
🔎 Apply this diff to fix the spacing:
     p_scan.add_argument(
         '--ranking',
         type=str,
-        choices=['unset' ,'true', 'false'],
+        choices=['unset', 'true', 'false'],
         default='unset',
         help='Enable or disable ranking (optional - default: server configuration)',
     )
     p_scan.add_argument(
         '--ranking-threshold',
         type=int,
         default=None,
         help='Ranking threshold value. Valid range: -1 to 99. A value of -1 defers to server configuration (optional)',
     )
     p_scan.add_argument(
         '--honour-file-exts',
         type=str,
-        choices=['unset','true', 'false'],
+        choices=['unset', 'true', 'false'],
         default='unset',
         help='Honour file extensions during scanning. When not set, defers to server configuration (optional)',
     )
📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 59b57ba and 0fb75e9.

📒 Files selected for processing (11)
  • CHANGELOG.md (2 hunks)
  • CLIENT_HELP.md (1 hunks)
  • scanoss.json (1 hunks)
  • src/scanoss/__init__.py (1 hunks)
  • src/scanoss/cli.py (5 hunks)
  • src/scanoss/scan_settings_builder.py (1 hunks)
  • src/scanoss/scanner.py (9 hunks)
  • src/scanoss/scanoss_settings.py (1 hunks)
  • src/scanoss/scanossapi.py (6 hunks)
  • tests/data/scanoss.json (1 hunks)
  • tests/test_scan_settings_builder.py (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (4)
  • scanoss.json
  • CHANGELOG.md
  • src/scanoss/scan_settings_builder.py
  • src/scanoss/init.py
🧰 Additional context used
🧬 Code graph analysis (3)
src/scanoss/scanossapi.py (2)
src/scanoss/scanossbase.py (1)
  • print_debug (58-63)
src/scanoss/spdxlite.py (1)
  • print_debug (61-66)
tests/test_scan_settings_builder.py (3)
src/scanoss/scan_settings_builder.py (12)
  • ScanSettingsBuilder (31-266)
  • _str_to_bool (223-229)
  • _merge_with_priority (203-209)
  • _merge_cli_with_settings (212-219)
  • with_proxy (69-83)
  • with_url (85-99)
  • with_ignore_cert_errors (101-119)
  • with_min_snippet_hits (121-134)
  • with_min_snippet_lines (136-149)
  • with_honour_file_exts (151-167)
  • with_ranking (169-184)
  • with_ranking_threshold (186-199)
src/scanoss/scanoss_settings.py (1)
  • ScanossSettings (76-433)
src/scanoss/scanner.py (1)
  • _merge_cli_with_settings (249-260)
src/scanoss/scanner.py (2)
src/scanoss/scan_settings_builder.py (10)
  • ScanSettingsBuilder (31-266)
  • _merge_cli_with_settings (212-219)
  • with_proxy (69-83)
  • with_url (85-99)
  • with_ignore_cert_errors (101-119)
  • with_min_snippet_hits (121-134)
  • with_min_snippet_lines (136-149)
  • with_honour_file_exts (151-167)
  • with_ranking (169-184)
  • with_ranking_threshold (186-199)
src/scanoss/scanoss_settings.py (1)
  • get_file_snippet_settings (339-345)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: build
🔇 Additional comments (14)
tests/data/scanoss.json (1)

29-51: LGTM!

The test data file is well-structured with a clear hierarchy between root settings and file_snippet settings, enabling comprehensive testing of the priority-based configuration merging logic.

src/scanoss/scanoss_settings.py (1)

339-433: LGTM!

The new accessor methods for file_snippet settings are well-implemented with:

  • Consistent patterns across all getters
  • Proper Optional return types
  • Clear docstrings documenting expected return values
  • Clean separation between root-level and file_snippet-level configuration access
src/scanoss/scanossapi.py (2)

176-179: LGTM!

The integration of the scanoss-settings header into the scan method is clean and correct. The header is only added when meaningful settings are configured.


293-326: LGTM!

The _build_scan_settings_header method correctly:

  • Excludes sentinel values (0, 'unset', -1, None) from the serialized payload
  • Uses base64 encoding for safe header transmission
  • Logs the settings for debugging before encoding
  • Returns None when no settings need to be sent
CLIENT_HELP.md (1)

262-304: LGTM!

The documentation for the new snippet tuning options is comprehensive and well-structured:

  • Clear explanations of each flag's purpose
  • Practical usage examples
  • Helpful note about = syntax for negative threshold values
  • Example showing how to combine multiple options
tests/test_scan_settings_builder.py (2)

44-117: LGTM!

Excellent test coverage for:

  • Initialization with None and with settings object
  • Static helper methods (_str_to_bool, _merge_with_priority, _merge_cli_with_settings)
  • Edge cases like all-None inputs and case-insensitive boolean parsing

119-358: LGTM!

Comprehensive test coverage for all with_* methods including:

  • CLI-only scenarios (no settings file)
  • Settings file values taking priority over CLI
  • Method chaining verification
  • All tuning parameters: proxy, url, ignore_cert_errors, min_snippet_hits, min_snippet_lines, honour_file_exts, ranking, ranking_threshold
src/scanoss/cli.py (3)

1410-1418: LGTM!

Settings loading is correctly refactored to use scanoss_settings instead of scan_settings, maintaining consistency with the rest of the codebase.


1518-1535: LGTM!

Settings loading logic correctly handles different scenarios (identify, ignore, and standard settings) while maintaining backward compatibility.


1631-1641: LGTM!

New snippet tuning parameters are correctly wired from CLI arguments to the Scanner constructor, enabling configuration of snippet matching behavior.

src/scanoss/scanner.py (4)

110-117: LGTM!

Constructor signature correctly updated to accept the new snippet tuning parameters and renamed scanoss_settings for clarity. The new parameters enable fine-grained control over snippet matching behavior.


186-206: LGTM!

The ScanossApi is correctly initialized with merged settings from ScanSettingsBuilder, properly wiring through the new snippet tuning parameters.


337-345: LGTM!

The is_dependency_scan method correctly falls back to checking dependency_analysis in the file_snippet settings when the explicit scan option is not set, providing flexible configuration.


173-184: Correct the stated priority order in review comment.

The review comment incorrectly states the PR objectives. The ScanSettingsBuilder class documentation specifies priority as: file_snippet section > settings section > CLI arguments, and the _merge_with_priority implementation correctly follows this priority order. The code is working as designed; the review comment's stated objectives are reversed.

Likely an incorrect or invalid review comment.

@agustingroh agustingroh force-pushed the chore/scan-tunning-parameters branch from 0fb75e9 to c62cf42 Compare December 18, 2025 15:26
@github-actions
Copy link

SCANOSS SCAN Completed 🚀

  • Detected components: 2
  • Undeclared components: 0
  • Declared components: 2
  • Detected files: 101
  • Detected files undeclared: 0
  • Detected files declared: 101
  • Licenses detected: 1
  • Licenses detected with copyleft: 0
  • Policies: ✅ 1 pass (1 total)

View more details on SCANOSS Action Summary

@agustingroh agustingroh force-pushed the chore/scan-tunning-parameters branch from 9adb13c to 1f204cb Compare January 15, 2026 17:18
@github-actions
Copy link

SCANOSS SCAN Completed 🚀

  • Detected components: 2
  • Undeclared components: 0
  • Declared components: 2
  • Detected files: 103
  • Detected files undeclared: 0
  • Detected files declared: 103
  • Licenses detected: 1
  • Licenses detected with copyleft: 0
  • Policies: ✅ 1 pass (1 total)

View more details on SCANOSS Action Summary

@agustingroh agustingroh force-pushed the chore/scan-tunning-parameters branch from 1f204cb to 5fa4487 Compare January 15, 2026 17:19
@github-actions
Copy link

SCANOSS SCAN Completed 🚀

  • Detected components: 2
  • Undeclared components: 0
  • Declared components: 2
  • Detected files: 103
  • Detected files undeclared: 0
  • Detected files declared: 103
  • Licenses detected: 1
  • Licenses detected with copyleft: 0
  • Policies: ✅ 1 pass (1 total)

View more details on SCANOSS Action Summary

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Fix all issues with AI agents
In `@src/scanoss/scan_settings_builder.py`:
- Line 32: Update the MAX_RANKING_THRESHOLD constant in scan_settings_builder.py
from 10 to 99 so it matches the documented valid range (-1 to 99) in the PR and
CHANGELOG; locate the MAX_RANKING_THRESHOLD symbol and set its value to 99 to
align code with the specification.
♻️ Duplicate comments (5)
src/scanoss/data/scanoss-settings-schema.json (1)

225-229: Fix misleading honour_file_exts description.

The description says "Ignores file extensions" but the field name honour_file_exts indicates the opposite behavior (honouring/respecting file extensions). Update the description to match the actual semantics.

src/scanoss/scan_settings_builder.py (1)

115-133: with_ignore_cert_errors precedence doesn't match docstring.

The docstring states "priority: CLI True > file_snippet > settings > False", implying CLI True should always win. However, _merge_with_priority returns file_snippet value first if present, meaning if the settings file explicitly sets ignore_cert_errors: false, the CLI flag --ignore-cert-errors will not override it.

If CLI True should genuinely override all settings, the logic needs adjustment:

🔧 Suggested fix if CLI True should always win
     def with_ignore_cert_errors(self, cli_value: bool = False) -> 'ScanSettingsBuilder':
-        """Set ignore_cert_errors with priority: CLI True > file_snippet > settings > False.
-
-        Note: CLI value only takes effect if True (flag present). False means
-        the flag was not provided, so settings file values are checked.
-        """
-        result = self._merge_with_priority(
-            cli_value if cli_value else None,
-            self._get_file_snippet_http_config_value('ignore_cert_errors'),
-            self._get_http_config_value('ignore_cert_errors')
-        )
-        self.ignore_cert_errors = result if result is not None else False
+        """Set ignore_cert_errors with priority: CLI True > file_snippet > settings > False."""
+        # CLI True always wins
+        if cli_value:
+            self.ignore_cert_errors = True
+            return self
+        # Otherwise fall back to settings
+        result = self._merge_with_priority(
+            None,
+            self._get_file_snippet_http_config_value('ignore_cert_errors'),
+            self._get_http_config_value('ignore_cert_errors')
+        )
+        self.ignore_cert_errors = bool(result) if result is not None else False
         return self
src/scanoss/cli.py (1)

206-225: Inconsistent formatting in choices and incorrect help text for ranking threshold.

This issue was flagged in a previous review and remains unaddressed:

  1. Line 209 has inconsistent spacing: choices=['unset' ,'true', 'false'] (space before comma)
  2. Line 217 states "Valid range: -1 to 10" but ScanSettingsBuilder clamps values to [-1, 99]
  3. Line 222 has inconsistent spacing: choices=['unset','true', 'false'] (no space after 'unset')
Suggested fix
     p_scan.add_argument(
         '--ranking',
         type=str,
-        choices=['unset' ,'true', 'false'],
+        choices=['unset', 'true', 'false'],
         default='unset',
         help='Enable or disable ranking (optional - default: server configuration)',
     )
     p_scan.add_argument(
         '--ranking-threshold',
         type=int,
         default=-1,
-        help='Ranking threshold value. Valid range: -1 to 10. A value of -1 defers to server configuration (optional)',
+        help='Ranking threshold value. Valid range: -1 to 99. A value of -1 defers to server configuration (optional)',
     )
     p_scan.add_argument(
         '--honour-file-exts',
         type=str,
-        choices=['unset','true', 'false'],
+        choices=['unset', 'true', 'false'],
         default='unset',
         help='Honour file extensions during scanning. When not set, defers to server configuration (optional)',
     )
src/scanoss/scanner.py (2)

231-233: Condition checks wrong variable.

This was flagged in a previous review: The condition checks scan_settings (the ScanSettingsBuilder result) but passes scanoss_settings (the ScanossSettings instance) to ScanPostProcessor. These are different objects.

Fix the condition
         self.post_processor = (
-            ScanPostProcessor(scanoss_settings, debug=debug, trace=trace, quiet=quiet) if scan_settings else None
+            ScanPostProcessor(scanoss_settings, debug=debug, trace=trace, quiet=quiet) if scanoss_settings else None
         )

243-255: Docstring has contradictory priority descriptions.

The docstring is internally inconsistent:

  • Line 245 states "two-level priority: settings > cli"
  • Line 251 states "Merged value with CLI taking priority over settings"

The implementation gives settings priority (returning settings_value if not None), which aligns with ScanSettingsBuilder._merge_cli_with_settings. However, the PR objectives state "priority: CLI > file_snippet > root settings".

Please clarify the intended behavior and update the docstring accordingly:

If settings should take priority (current behavior)
     `@staticmethod`
     def _merge_cli_with_settings(cli_value, settings_value):
         """Merge CLI value with settings value (two-level priority: settings > cli).

         Args:
             cli_value: Value from CLI argument
             settings_value: Value from scanoss.json file_snippet settings
         Returns:
-            Merged value with CLI taking priority over settings
+            Merged value with settings taking priority over CLI
         """
🧹 Nitpick comments (8)
src/scanoss/data/scanoss-settings-schema.json (2)

246-262: Consider aligning ranking_enabled type with file_snippet for consistency.

In file_snippet, ranking_enabled is ["boolean", "null"] with default null (allowing "unset" state), while here in hpfm it's just boolean without nullable support. If both sections need a "defer to server" option, consider making the types consistent.

♻️ Suggested change for consistency
         "ranking_enabled": {
-          "type": "boolean",
-          "description": "Enable ranking for HPFM"
+          "type": ["boolean", "null"],
+          "description": "Enable ranking for HPFM. When not set, defers to server configuration.",
+          "default": null
         },

263-266: Empty container object is a placeholder.

The container section has no properties defined. If this is intentional scaffolding for future work, consider adding a brief comment in the description (e.g., "Reserved for future container scanning options") to clarify its purpose.

📝 Suggested description improvement
         "container": {
           "type": "object",
-          "description": "Container scanning configuration"
+          "description": "Container scanning configuration (reserved for future options)"
         }
tests/test_scan_settings_builder.py (1)

36-38: Class-level test fixtures may cause issues with test isolation.

Using class-level attributes for scan_settings_path and scan_settings means all test methods share the same ScanossSettings instance. While this works for read-only access, consider using setUp or setUpClass methods for clearer test lifecycle management.

src/scanoss/scan_settings_builder.py (2)

79-81: Improve type hints for honour_file_exts and ranking.

Using any (lowercase) is not a valid Python type hint. Consider using Optional[Union[bool, str]] to accurately reflect that these can hold string values ('true', 'false', 'unset') before conversion or boolean values after.

♻️ Suggested fix
+from typing import TYPE_CHECKING, Optional, Union
+
 ...
-        self.honour_file_exts: Optional[any] = None
-        self.ranking: Optional[any] = None
+        self.honour_file_exts: Optional[Union[bool, str]] = None
+        self.ranking: Optional[Union[bool, str]] = None

194-196: Missing space before != operator.

Minor formatting issue on line 195.

🔧 Fix spacing
-        if self.honour_file_exts is not None and self.honour_file_exts!= 'unset':
+        if self.honour_file_exts is not None and self.honour_file_exts != 'unset':
src/scanoss/scanossapi.py (1)

293-326: Potential type inconsistency for honour_file_exts and ranking values.

When ScanOSSAPI is instantiated through ScannerScanSettingsBuilder, these values are correctly converted to booleans by ScanSettingsBuilder._str_to_bool(). However, if ScanOSSAPI is instantiated directly (e.g., in tests or custom integrations), string values like 'true' or 'false' could be serialized as JSON strings rather than booleans.

Consider adding explicit boolean conversion in _build_scan_settings_header() to ensure consistent JSON output regardless of instantiation path:

Suggested defensive conversion
         # honour_file_exts: None = unset, don't send
         if self.honour_file_exts is not None and self.honour_file_exts != 'unset':
-            settings['honour_file_exts'] = self.honour_file_exts
+            settings['honour_file_exts'] = self._to_bool(self.honour_file_exts)

         # ranking: None = unset, don't send
         if self.ranking is not None and self.ranking != 'unset':
-            settings['ranking_enabled'] = self.ranking
+            settings['ranking_enabled'] = self._to_bool(self.ranking)
+
+    `@staticmethod`
+    def _to_bool(value) -> bool:
+        """Convert value to boolean."""
+        if isinstance(value, bool):
+            return value
+        if isinstance(value, str):
+            return value.lower() == 'true'
+        return bool(value)
src/scanoss/scanner.py (2)

339-340: Consider guarding file_snippet access for legacy settings.

When using legacy settings files (SBOM.json format), accessing file_snippet settings may introduce unexpected behavior. While get_file_snippet_settings() returns an empty dict for missing sections (avoiding crashes), mixing new semantics into the legacy path could cause confusion.

Suggested guard
-        file_snippet_settings = self.scanoss_settings.get_file_snippet_settings() if self.scanoss_settings else {}
-        return file_snippet_settings.get('dependency_analysis', False)
+        if not self.scanoss_settings or self.scanoss_settings.is_legacy():
+            return False
+        file_snippet_settings = self.scanoss_settings.get_file_snippet_settings()
+        return file_snippet_settings.get('dependency_analysis', False)

142-151: Consider guarding file_snippet access for legacy settings (applies here too).

Similar to the is_dependency_scan method, accessing file_snippet_settings in the constructor should be guarded for legacy settings files to maintain consistent behavior:

Suggested guard
         # Get settings values for skip_headers options
-        file_snippet_settings = scanoss_settings.get_file_snippet_settings() if scanoss_settings else {}
+        file_snippet_settings = (
+            scanoss_settings.get_file_snippet_settings()
+            if scanoss_settings and not scanoss_settings.is_legacy()
+            else {}
+        )
         settings_skip_headers = file_snippet_settings.get('skip_headers')
         settings_skip_headers_limit = file_snippet_settings.get('skip_headers_limit')
📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 9adb13c and 5fa4487.

📒 Files selected for processing (14)
  • CHANGELOG.md
  • CLIENT_HELP.md
  • docs/source/_static/scanoss-settings-schema.json
  • scanoss.json
  • src/scanoss/__init__.py
  • src/scanoss/cli.py
  • src/scanoss/data/scanoss-settings-schema.json
  • src/scanoss/scan_settings_builder.py
  • src/scanoss/scanner.py
  • src/scanoss/scanoss_settings.py
  • src/scanoss/scanossapi.py
  • src/scanoss/scanpostprocessor.py
  • tests/data/scanoss.json
  • tests/test_scan_settings_builder.py
🚧 Files skipped from review as they are similar to previous changes (5)
  • tests/data/scanoss.json
  • src/scanoss/init.py
  • docs/source/_static/scanoss-settings-schema.json
  • CLIENT_HELP.md
  • src/scanoss/scanoss_settings.py
🧰 Additional context used
🧬 Code graph analysis (5)
src/scanoss/scan_settings_builder.py (3)
src/scanoss/scanossbase.py (2)
  • ScanossBase (28-107)
  • print_msg (51-56)
src/scanoss/scanoss_settings.py (5)
  • get_file_snippet_settings (339-345)
  • get_file_snippet_proxy (427-433)
  • get_proxy (411-417)
  • get_http_config (419-425)
  • get_file_snippet_http_config (435-441)
src/scanoss/scanner.py (1)
  • _merge_cli_with_settings (244-255)
src/scanoss/cli.py (2)
src/scanoss/scanoss_settings.py (4)
  • ScanossSettings (76-441)
  • load_json_file (103-138)
  • set_file_type (140-152)
  • set_scan_type (154-161)
src/scanoss/inspection/utils/file_utils.py (1)
  • load_json_file (29-44)
src/scanoss/scanpostprocessor.py (1)
src/scanoss/scanoss_settings.py (3)
  • is_legacy (315-317)
  • get_bom_remove (210-218)
  • get_bom_replace (220-228)
src/scanoss/scanner.py (2)
src/scanoss/scan_settings_builder.py (10)
  • ScanSettingsBuilder (35-311)
  • _merge_cli_with_settings (257-264)
  • with_proxy (83-97)
  • with_url (99-113)
  • with_ignore_cert_errors (115-133)
  • with_min_snippet_hits (135-156)
  • with_min_snippet_lines (158-179)
  • with_honour_file_exts (181-197)
  • with_ranking (199-214)
  • with_ranking_threshold (216-244)
src/scanoss/scanoss_settings.py (1)
  • get_file_snippet_settings (339-345)
src/scanoss/scanossapi.py (2)
src/scanoss/scanossbase.py (1)
  • print_debug (58-63)
src/scanoss/spdxlite.py (1)
  • print_debug (61-66)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: build
🔇 Additional comments (20)
src/scanoss/data/scanoss-settings-schema.json (5)

143-166: LGTM!

The proxy and http_config sections are well-structured with clear descriptions for API request configuration.


167-194: LGTM!

The nested proxy and http_config within file_snippet properly allow independent configuration for file snippet requests, enabling different endpoints or proxy settings from the root configuration.


195-206: Clarify ranking_threshold default value semantics.

The schema allows both null type and -1 integer value, with the description stating "-1 defers to server configuration." However, the default is 0, which is a valid threshold value. If the intent is to defer to server by default (not set), consider changing the default to null or -1 for consistency with the "defer to server" behavior.

Please verify whether a default of 0 is intentional (meaning: apply zero threshold by default) or if it should be null/-1 to defer to server configuration when not explicitly set.


207-224: LGTM!

The min_snippet_hits, min_snippet_lines, and snippet_range_tolerance properties are well-defined with appropriate constraints (non-negative integers with default 0 to defer to server per PR objectives).


230-244: LGTM!

The dependency_analysis, skip_headers, and skip_headers_limit properties are clearly defined with sensible defaults.

scanoss.json (1)

17-21: LGTM!

The addition of scanoss-winnowing.py to the BOM include list is correctly formatted and consistent with the existing entry structure.

CHANGELOG.md (2)

11-21: LGTM!

The changelog entry for version 1.44.0 is well-structured, follows the Keep a Changelog format, and clearly documents all the new scan engine tuning parameters and related additions.


786-787: LGTM!

Link references are correctly added for versions 1.43.1 and 1.44.0, properly chaining from the previous versions.

tests/test_scan_settings_builder.py (1)

336-358: LGTM!

Comprehensive test coverage for method chaining and the fluent API pattern. All builder methods are exercised and verified in a single chain.

src/scanoss/scan_settings_builder.py (2)

247-274: LGTM!

Static helper methods are well-implemented with clear, focused responsibilities. The _str_to_bool method properly handles None, boolean pass-through, and string conversion.


277-311: LGTM!

Private extraction methods properly handle null cases and safely navigate nested dictionary structures. Good defensive coding with early returns for missing scanoss_settings.

src/scanoss/scanpostprocessor.py (4)

81-100: LGTM!

Clean refactoring from scan_settings to scanoss_settings for naming consistency across the codebase. The parameter, docstring, and instance attribute are all correctly updated.


112-120: LGTM!

Renaming the local variable from result to entry improves clarity and avoids potential confusion with the outer result from the loop iteration.


129-136: LGTM!

Legacy detection now correctly uses self.scanoss_settings.is_legacy() and includes an early return to skip unsupported post-processing.


155-163: LGTM!

The _replace_purls method consistently uses self.scanoss_settings and the renamed entry variable throughout the replacement logic.

src/scanoss/cli.py (2)

1632-1639: LGTM!

The new snippet tuning parameters (min_snippet_hits, min_snippet_lines, ranking, ranking_threshold, honour_file_exts) are correctly propagated from CLI arguments to the Scanner constructor.


2822-2831: LGTM!

The get_scanoss_settings_from_args helper consolidates settings loading logic for folder_hashing_scan and folder_hash commands, following the same pattern used elsewhere in the CLI.

src/scanoss/scanossapi.py (2)

75-109: LGTM!

The new tuning parameters are properly added to the constructor signature with appropriate type hints (Union[bool, str, None] for honour_file_exts and ranking), and the docstring accurately documents the new parameters with their default values.


176-179: LGTM!

The scan settings header is cleanly integrated into the request flow, only adding the scanoss-settings header when tuning parameters are configured.

src/scanoss/scanner.py (1)

167-200: LGTM!

The ScanSettingsBuilder integration cleanly merges CLI arguments with settings file values and passes the merged configuration to ScanossApi. The builder pattern provides a clear, chainable API for configuration merging.

✏️ Tip: You can disable this entire section by setting review_details to false in your review settings.

@agustingroh agustingroh force-pushed the chore/scan-tunning-parameters branch 2 times, most recently from 59fd38f to 86725c2 Compare January 16, 2026 18:07
@github-actions
Copy link

SCANOSS SCAN Completed 🚀

  • Detected components: 2
  • Undeclared components: 0
  • Declared components: 2
  • Detected files: 103
  • Detected files undeclared: 0
  • Detected files declared: 103
  • Licenses detected: 1
  • Licenses detected with copyleft: 0
  • Policies: ✅ 1 pass (1 total)

View more details on SCANOSS Action Summary

@agustingroh agustingroh force-pushed the chore/scan-tunning-parameters branch from 86725c2 to 882f8ce Compare January 16, 2026 18:11
@github-actions
Copy link

SCANOSS SCAN Completed 🚀

  • Detected components: 2
  • Undeclared components: 0
  • Declared components: 2
  • Detected files: 103
  • Detected files undeclared: 0
  • Detected files declared: 103
  • Licenses detected: 1
  • Licenses detected with copyleft: 0
  • Policies: ✅ 1 pass (1 total)

View more details on SCANOSS Action Summary

@agustingroh agustingroh force-pushed the chore/scan-tunning-parameters branch from 882f8ce to 64e3b80 Compare January 16, 2026 18:13
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🤖 Fix all issues with AI agents
In `@docs/source/_static/scanoss-settings-schema.json`:
- Around line 200-205: The docs and main JSON schemas for ranking_threshold are
inconsistent with the runtime constant MAX_RANKING_THRESHOLD (which enforces a
max of 10) and clamping logic; update the ranking_threshold entries in both
schemas to use "type": ["integer","null"], "minimum": -1, "maximum": 10 and set
"default": -1 so defaults and constraints match the runtime (refer to
MAX_RANKING_THRESHOLD and the builder/clamping logic that clamps values >10).
Ensure both the docs schema and the main schema use these same values to remove
the contradiction.

In `@src/scanoss/cli.py`:
- Around line 231-234: Remove the duplicate argument registration for
'--wfp-output' that is being added a second time via the p_scan.add_argument
call; locate the redundant p_scan.add_argument(... '--wfp-output' ...) block and
delete it so only the original definition remains (keep the first declaration
and remove the later duplicate to avoid argparse conflicts).

In `@src/scanoss/scanner.py`:
- Around line 170-203: The ScanSettingsBuilder usage can cause CLI flags to be
overridden because the merge order currently may let file_snippet/root values
win; update the merge so CLI values take precedence by applying CLI overrides
last: when constructing scan_settings with ScanSettingsBuilder, ensure builder
is fed base settings first (root/file_snippet) and then call
.with_min_snippet_hits(min_snippet_hits),
.with_min_snippet_lines(min_snippet_lines), .with_ranking(ranking),
.with_ranking_threshold(ranking_threshold),
.with_honour_file_exts(honour_file_exts), .with_proxy(proxy), .with_url(url),
.with_ignore_cert_errors(ignore_cert_errors) only if the corresponding CLI
variable is not None (or always call them last to overwrite), so
scan_settings.min_snippet_hits/min_snippet_lines/ranking/ranking_threshold/honour_file_exts/proxy/url/ignore_cert_errors
reflect CLI overrides before passing scan_settings into ScanossApi.
♻️ Duplicate comments (9)
src/scanoss/scan_settings_builder.py (2)

35-43: Confirm merge precedence matches the stated CLI priority.

PR objectives say CLI should override file_snippet/root, but _merge_with_priority / _merge_cli_with_settings currently make settings win. This also means with_ignore_cert_errors can’t guarantee the “CLI True wins” behavior its docstring promises when settings are present. Please confirm the intended precedence; if CLI should win, swap the merge order and update tests/docs accordingly.

🔧 Possible fix if CLI should have highest priority
 def _merge_with_priority(cli_value, file_snippet_value, root_value):
-    if file_snippet_value is not None:
-        return file_snippet_value
-    if root_value is not None:
-        return root_value
-    return cli_value
+    if cli_value is not None:
+        return cli_value
+    if file_snippet_value is not None:
+        return file_snippet_value
+    return root_value

 def _merge_cli_with_settings(cli_value, settings_value):
-    if settings_value is not None:
-        return settings_value
-    return cli_value
+    if cli_value is not None:
+        return cli_value
+    return settings_value

Also applies to: 115-133, 246-264


32-32: Align ranking_threshold max with the intended spec.

MAX_RANKING_THRESHOLD is 10 and the docstring says “-1 to 10”, but the PR objectives state “-1 to 99”. Please confirm the intended max and align constant/docstrings/tests/docs accordingly.

🔧 Possible fix if max should be 99
-MAX_RANKING_THRESHOLD = 10
+MAX_RANKING_THRESHOLD = 99

-        Valid range is -1 to 10. Values outside this range will be clamped and logged.
+        Valid range is -1 to 99. Values outside this range will be clamped and logged.

Also applies to: 216-237

src/scanoss/data/scanoss-settings-schema.json (1)

201-205: Fix misleading honour_file_exts description.

The description says "Ignores file extensions" but the field name and CLI help describe it as "Honour file extensions". This inconsistency will confuse users. Update to: "Whether to honour file extensions during matching. When not set, defers to server configuration."

src/scanoss/scanossapi.py (1)

75-79: Missing snippet_range_tolerance parameter.

The schema and ScanossSettings.get_snippet_range_tolerance() support this parameter, but it's not wired into the API constructor or _build_scan_settings_header(). If this is intentional (server-only config), consider documenting why it's excluded; otherwise, add it for completeness.

docs/source/_static/scanoss-settings-schema.json (1)

219-223: Fix inverted honour_file_exts description.

Same issue as the main schema: description says "Ignores file extensions" but field name means "honour". Update to reflect actual semantics.

src/scanoss/cli.py (1)

211-230: Inconsistent spacing in choices and help text range mismatch.

  1. Line 214: choices=['unset' ,'true', 'false'] has extra space before comma
  2. Line 227: choices=['unset','true', 'false'] missing space after first item
  3. Line 222: Help says "Valid range: -1 to 10" but should match the actual implementation/schema range
src/scanoss/scanner.py (3)

144-153: CLI should override file_snippet for skip_headers merge.

The helper and inline comment still prefer settings over CLI, which conflicts with the stated priority (CLI > file_snippet > root). Flip the precedence and update the docstring/comment.

Also applies to: 245-257


233-235: Use the same variable in the post-processor guard.

The condition checks scan_settings but passes scanoss_settings, so the guard is ineffective when scanoss_settings is None.


341-342: Guard legacy settings before reading file_snippet.

If legacy settings lack file_snippet, this fallback can misbehave; consider skipping when settings are legacy.

🧹 Nitpick comments (1)
src/scanoss/scanossapi.py (1)

310-316: Ensure boolean serialization for honour_file_exts and ranking.

These values have type Union[bool, str, None] with a default of 'unset'. While ScanSettingsBuilder._str_to_bool() converts CLI strings to booleans, direct API instantiation could pass string values that would serialize incorrectly in JSON. Consider adding explicit boolean conversion:

♻️ Suggested defensive conversion
         # honour_file_exts: None = unset, don't send
         if self.honour_file_exts is not None and self.honour_file_exts != 'unset':
-            settings['honour_file_exts'] = self.honour_file_exts
+            settings['honour_file_exts'] = bool(self.honour_file_exts) if isinstance(self.honour_file_exts, bool) else self.honour_file_exts == 'true'

         # ranking: None = unset, don't send
         if self.ranking is not None and self.ranking != 'unset':
-            settings['ranking_enabled'] = self.ranking
+            settings['ranking_enabled'] = bool(self.ranking) if isinstance(self.ranking, bool) else self.ranking == 'true'
📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 59fd38f and 86725c2.

📒 Files selected for processing (14)
  • CHANGELOG.md
  • CLIENT_HELP.md
  • docs/source/_static/scanoss-settings-schema.json
  • scanoss.json
  • src/scanoss/__init__.py
  • src/scanoss/cli.py
  • src/scanoss/data/scanoss-settings-schema.json
  • src/scanoss/scan_settings_builder.py
  • src/scanoss/scanner.py
  • src/scanoss/scanoss_settings.py
  • src/scanoss/scanossapi.py
  • src/scanoss/scanpostprocessor.py
  • tests/data/scanoss.json
  • tests/test_scan_settings_builder.py
🚧 Files skipped from review as they are similar to previous changes (3)
  • scanoss.json
  • tests/data/scanoss.json
  • CHANGELOG.md
🧰 Additional context used
🧬 Code graph analysis (6)
src/scanoss/cli.py (1)
src/scanoss/scanoss_settings.py (3)
  • ScanossSettings (76-441)
  • load_json_file (103-138)
  • set_file_type (140-152)
src/scanoss/scanner.py (4)
src/scanoss/scan_settings_builder.py (10)
  • ScanSettingsBuilder (35-311)
  • _merge_cli_with_settings (257-264)
  • with_proxy (83-97)
  • with_url (99-113)
  • with_ignore_cert_errors (115-133)
  • with_min_snippet_hits (135-156)
  • with_min_snippet_lines (158-179)
  • with_honour_file_exts (181-197)
  • with_ranking (199-214)
  • with_ranking_threshold (216-244)
src/scanoss/scanoss_settings.py (1)
  • get_file_snippet_settings (339-345)
src/scanoss/scanossapi.py (1)
  • ScanossApi (52-341)
src/scanoss/scanpostprocessor.py (1)
  • ScanPostProcessor (76-289)
tests/test_scan_settings_builder.py (2)
src/scanoss/scan_settings_builder.py (12)
  • ScanSettingsBuilder (35-311)
  • _str_to_bool (268-274)
  • _merge_with_priority (248-254)
  • _merge_cli_with_settings (257-264)
  • with_proxy (83-97)
  • with_url (99-113)
  • with_ignore_cert_errors (115-133)
  • with_min_snippet_hits (135-156)
  • with_min_snippet_lines (158-179)
  • with_honour_file_exts (181-197)
  • with_ranking (199-214)
  • with_ranking_threshold (216-244)
src/scanoss/scanner.py (1)
  • _merge_cli_with_settings (246-257)
src/scanoss/scanossapi.py (2)
src/scanoss/scanossbase.py (1)
  • print_debug (58-63)
src/scanoss/spdxlite.py (1)
  • print_debug (61-66)
src/scanoss/scan_settings_builder.py (2)
src/scanoss/scanoss_settings.py (5)
  • get_file_snippet_settings (339-345)
  • get_file_snippet_proxy (427-433)
  • get_proxy (411-417)
  • get_http_config (419-425)
  • get_file_snippet_http_config (435-441)
src/scanoss/scanner.py (1)
  • _merge_cli_with_settings (246-257)
src/scanoss/scanpostprocessor.py (1)
src/scanoss/scanoss_settings.py (3)
  • is_legacy (315-317)
  • get_bom_remove (210-218)
  • get_bom_replace (220-228)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: build
🔇 Additional comments (13)
src/scanoss/__init__.py (1)

25-25: Version bump looks consistent.

CLIENT_HELP.md (1)

262-404: No additional issues to note here.

tests/test_scan_settings_builder.py (1)

33-359: No additional issues to note here.

src/scanoss/scanpostprocessor.py (1)

112-164: Good robustness improvements in post-processing.

Handling list-vs-dict entries and short-circuiting legacy settings is clean and avoids unnecessary processing.

src/scanoss/scanoss_settings.py (1)

338-441: LGTM! Well-structured getters for file_snippet settings.

The new accessor methods follow the established patterns in the class, with consistent docstrings, proper Optional type hints, and appropriate default values matching the schema (e.g., skip_headers_limit defaults to 0, skip_headers defaults to False). The separation of root-level and file_snippet-level proxy/http_config getters supports the priority-based configuration model described in the PR objectives.

src/scanoss/data/scanoss-settings-schema.json (1)

142-242: New schema blocks for file_snippet, hpfm, and container look good.

The schema additions properly define the new tuning parameters with appropriate types, ranges, and defaults. The nested proxy/http_config blocks within file_snippet align with the priority-based configuration approach.

src/scanoss/scanossapi.py (1)

176-179: Scan settings header implementation looks good.

The _build_scan_settings_header() method correctly:

  • Omits parameters with sentinel values (0 for counts, 'unset'/None for booleans, -1 for threshold)
  • Base64-encodes the JSON payload
  • Only attaches the header when settings are present

Also applies to: 293-326

docs/source/_static/scanoss-settings-schema.json (1)

142-238: New proxy, http_config, and file_snippet schema blocks are well-structured.

The hierarchical configuration with root-level and file_snippet-level proxy/http_config follows a sensible pattern for override semantics.

src/scanoss/cli.py (2)

1422-1448: LGTM! wfp() function properly wired with scanoss_settings.

The settings loading and Scanner initialization correctly use the renamed scanoss_settings variable throughout.


1641-1651: LGTM! New tuning parameters properly passed to Scanner.

The new snippet tuning parameters (min_snippet_hits, min_snippet_lines, ranking, ranking_threshold, honour_file_exts) are correctly propagated from CLI args to the Scanner constructor.

src/scanoss/scanner.py (3)

42-42: Good plumbing for scanoss_settings and tuning flags.

Signature updates and storing the settings reference align with the new builder wiring.

Also applies to: 73-114, 141-141


238-242: SBOM propagation looks consistent.

Using scanoss_settings here keeps post-init wiring centralized.


410-410: Consistent settings propagation into FileFilters.

Good to pass scanoss_settings through so filters can access file-based configuration.

Also applies to: 678-678, 993-993

✏️ Tip: You can disable this entire section by setting review_details to false in your review settings.

@github-actions
Copy link

SCANOSS SCAN Completed 🚀

  • Detected components: 2
  • Undeclared components: 0
  • Declared components: 2
  • Detected files: 103
  • Detected files undeclared: 0
  • Detected files declared: 103
  • Licenses detected: 1
  • Licenses detected with copyleft: 0
  • Policies: ✅ 1 pass (1 total)

View more details on SCANOSS Action Summary

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@src/scanoss/data/scanoss-settings-schema.json`:
- Around line 176-182: Update the MAX_RANKING_THRESHOLD constant in
scan_settings_builder.py from 10 to 99, update the docstring in
scan_settings_builder.py that currently says "Valid range is -1 to 10" to "Valid
range is -1 to 99", and update the documentation JSON schema
(scanoss-settings-schema.json under the docs static assets) to change the
"maximum" value from 10 to 99 so all code, docstring, and docs schema match the
schema maximum of 99.
♻️ Duplicate comments (6)
src/scanoss/scan_settings_builder.py (2)

32-32: MAX_RANKING_THRESHOLD should be 99 per PR objectives.

The PR description and CHANGELOG state the valid range is -1 to 99, but the constant is set to 10. This causes the docstring on line 219 ("Valid range is -1 to 10") to be internally consistent but incorrect relative to the documented behavior.

🔧 Suggested fix
-MAX_RANKING_THRESHOLD = 10
+MAX_RANKING_THRESHOLD = 99

Also update line 219:

-        Valid range is -1 to 10. Values outside this range will be clamped and logged.
+        Valid range is -1 to 99. Values outside this range will be clamped and logged.

115-134: with_ignore_cert_errors precedence doesn't match docstring.

The docstring states "CLI True > file_snippet > settings > False", implying that --ignore-cert-errors on the CLI should override settings file values. However, _merge_with_priority always prefers file_snippet then root over CLI. If a settings file sets ignore_cert_errors: false, the CLI flag won't override it.

🔧 Suggested fix to honour "CLI True wins" semantics
     def with_ignore_cert_errors(self, cli_value: bool = False) -> 'ScanSettingsBuilder':
         """Set ignore_cert_errors with priority: CLI True > file_snippet > settings > False.

         Note: CLI value only takes effect if True (flag present). False means
         the flag was not provided, so settings file values are checked.
         """
+        # Explicit CLI True should always win
+        if cli_value:
+            self.ignore_cert_errors = True
+            return self
+
+        # Otherwise, fall back to file_snippet then root, defaulting to False
         result = self._merge_with_priority(
-            cli_value if cli_value else None,
+            None,
             self._get_file_snippet_http_config_value('ignore_cert_errors'),
-            self._get_http_config_value('ignore_cert_errors')
+            self._get_http_config_value('ignore_cert_errors'),
         )
-        self.ignore_cert_errors = result if result is not None else False
+        self.ignore_cert_errors = bool(result) if result is not None else False
         return self
src/scanoss/data/scanoss-settings-schema.json (1)

201-205: Fix honour_file_exts description to match actual semantics.

The description says "Ignores file extensions" but the field name and CLI help describe it as "Honour file extensions". This is confusing for users.

🔧 Suggested fix
             "honour_file_exts": {
               "type": ["boolean", "null"],
-              "description": "Ignores file extensions. When not set, defers to server configuration.",
+              "description": "Whether to honour file extensions during matching. True = honour extensions, False = ignore them. When not set, defers to server configuration.",
               "default": true
             },
docs/source/_static/scanoss-settings-schema.json (2)

200-206: Schema inconsistency: ranking_threshold constraints differ from main schema.

The docs schema defines maximum: 10, default: -1, while the main schema (src/scanoss/data/) uses maximum: 99, default: 0. This creates confusion. Align both schemas to match the PR objectives (-1 to 99).

🔧 Suggested fix to align with main schema
             "ranking_threshold": {
               "type": ["integer", "null"],
               "description": "Ranking threshold value. A value of -1 defers to server configuration",
               "minimum": -1,
-              "maximum": 10,
-              "default": -1
+              "maximum": 99,
+              "default": 0
             },

219-223: Fix honour_file_exts description (same issue as main schema).

The description says "Ignores file extensions" but should reflect the actual behavior: True = honour extensions, False = ignore them.

🔧 Suggested fix
             "honour_file_exts": {
               "type": ["boolean", "null"],
-              "description": "Ignores file extensions. When not set, defers to server configuration.",
+              "description": "Whether to honour file extensions during matching. True = honour extensions, False = ignore them. When not set, defers to server configuration.",
               "default": true
             },
src/scanoss/cli.py (1)

211-230: Inconsistent spacing in choices and incorrect help text for ranking threshold.

  1. Line 214: choices=['unset' ,'true', 'false'] has inconsistent spacing (space before comma)
  2. Line 222: Help says "Valid range: -1 to 10" but should be "-1 to 99" per PR objectives
  3. Line 227: choices=['unset','true', 'false'] has inconsistent spacing
🔧 Suggested fixes
     p_scan.add_argument(
         '--ranking',
         type=str,
-        choices=['unset' ,'true', 'false'],
+        choices=['unset', 'true', 'false'],
         default='unset',
         help='Enable or disable ranking (optional - default: server configuration)',
     )
     p_scan.add_argument(
         '--ranking-threshold',
         type=int,
         default=-1,
-        help='Ranking threshold value. Valid range: -1 to 10. A value of -1 defers to server configuration (optional)',
+        help='Ranking threshold value. Valid range: -1 to 99. A value of -1 defers to server configuration (optional)',
     )
     p_scan.add_argument(
         '--honour-file-exts',
         type=str,
-        choices=['unset','true', 'false'],
+        choices=['unset', 'true', 'false'],
         default='unset',
         help='Honour file extensions during scanning. When not set, defers to server configuration (optional)',
     )
🧹 Nitpick comments (2)
src/scanoss/scan_settings_builder.py (2)

79-81: Use Any from typing instead of lowercase any.

any is not a valid type hint. Use Any from the typing module, or consider a more specific union type like Optional[bool | str] to reflect the actual 'unset' + boolean usage.

♻️ Suggested fix
-from typing import TYPE_CHECKING, Optional
+from typing import TYPE_CHECKING, Any, Optional
...
-        self.honour_file_exts: Optional[any] = None
-        self.ranking: Optional[any] = None
+        self.honour_file_exts: Optional[Any] = None
+        self.ranking: Optional[Any] = None

Or for stricter typing:

        self.honour_file_exts: Optional[bool | str] = None
        self.ranking: Optional[bool | str] = None

195-196: Minor: Missing space before != operator.

-        if self.honour_file_exts is not None and self.honour_file_exts!= 'unset':
+        if self.honour_file_exts is not None and self.honour_file_exts != 'unset':
📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 86725c2 and 64e3b80.

📒 Files selected for processing (8)
  • CHANGELOG.md
  • CLIENT_HELP.md
  • docs/source/_static/scanoss-settings-schema.json
  • src/scanoss/cli.py
  • src/scanoss/data/scanoss-settings-schema.json
  • src/scanoss/scan_settings_builder.py
  • tests/data/scanoss.json
  • tests/test_scan_settings_builder.py
🚧 Files skipped from review as they are similar to previous changes (4)
  • tests/data/scanoss.json
  • CHANGELOG.md
  • CLIENT_HELP.md
  • tests/test_scan_settings_builder.py
🧰 Additional context used
🧬 Code graph analysis (2)
src/scanoss/scan_settings_builder.py (2)
src/scanoss/scanoss_settings.py (5)
  • get_file_snippet_settings (339-345)
  • get_file_snippet_proxy (427-433)
  • get_proxy (411-417)
  • get_http_config (419-425)
  • get_file_snippet_http_config (435-441)
src/scanoss/scanner.py (1)
  • _merge_cli_with_settings (246-257)
src/scanoss/cli.py (2)
src/scanoss/scanoss_settings.py (4)
  • ScanossSettings (76-441)
  • load_json_file (103-138)
  • set_file_type (140-152)
  • set_scan_type (154-161)
src/scanoss/inspection/utils/file_utils.py (1)
  • load_json_file (29-44)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: build
🔇 Additional comments (8)
src/scanoss/scan_settings_builder.py (2)

246-265: LGTM!

The merge helper methods correctly implement the documented priority: file_snippet > root settings > CLI for _merge_with_priority and settings > CLI for _merge_cli_with_settings.


267-311: LGTM!

The utility and accessor methods handle null checks properly and correctly extract nested configuration values from the settings.

src/scanoss/data/scanoss-settings-schema.json (1)

142-242: LGTM on new settings sections.

The file_snippet, hpfm, and container sections are well-structured with appropriate types, defaults, and constraints for the new tuning parameters.

docs/source/_static/scanoss-settings-schema.json (1)

142-166: LGTM on root-level proxy and http_config additions.

The docs schema correctly adds root-level proxy and http_config sections that the main schema lacks. This appears intentional for documentation purposes.

src/scanoss/cli.py (4)

198-210: LGTM on --min-snippet-hits and --min-snippet-lines arguments.

The new CLI arguments for snippet tuning are well-defined with appropriate types, defaults, and help text.


1418-1447: LGTM on wfp() function updates.

The refactoring to use scanoss_settings is consistent with the new settings architecture.


1637-1648: LGTM on passing tuning parameters to Scanner.

The new snippet tuning parameters (min_snippet_hits, min_snippet_lines, ranking, ranking_threshold, honour_file_exts) are correctly threaded through to the Scanner constructor.


2828-2837: LGTM on get_scanoss_settings_from_args helper.

Clean extraction of settings loading logic into a reusable helper function for folder_hashing_scan and folder_hash commands.

✏️ Tip: You can disable this entire section by setting review_details to false in your review settings.

@agustingroh agustingroh force-pushed the chore/scan-tunning-parameters branch from 64e3b80 to 7f9ab07 Compare January 22, 2026 13:41
@github-actions
Copy link

SCANOSS SCAN Completed 🚀

  • Detected components: 3
  • Undeclared components: 0
  • Declared components: 3
  • Detected files: 103
  • Detected files undeclared: 0
  • Detected files declared: 103
  • Licenses detected: 1
  • Licenses detected with copyleft: 0
  • Policies: ✅ 1 pass (1 total)

View more details on SCANOSS Action Summary

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🤖 Fix all issues with AI agents
In `@CLIENT_HELP.md`:
- Around line 351-382: Delete the two subsections titled "Skip headers" and
"Skip headers limit" from CLIENT_HELP.md, removing the CLI examples that mention
--skip-headers and --skip-headers-limit and the corresponding scanoss.json
examples that show settings file_snippet.skip_headers and
file_snippet.skip_headers_limit; also search the document for any remaining
references to --skip-headers, --skip-headers-limit, skip_headers or
skip_headers_limit and remove or update them so the docs no longer describe
these unimplemented flags.

In `@src/scanoss/scanossapi.py`:
- Around line 146-149: The variable base_url is assigned but never used; change
the logic so self.url is set from base_url instead of only from url—i.e.,
compute base_url = url if url else SCANOSS_SCAN_URL and then assign self.url =
base_url (and keep the existing fallback to DEFAULT_URL2 when api_key is present
and no explicit URL env var), ensuring references to api_key/SCANOSS_API_KEY and
the os.environ.get('SCANOSS_SCAN_URL') check remain intact so lint stops
complaining about the unused base_url.
- Around line 25-35: The import block in scanossapi.py is unsorted; reorder
imports into standard groups (stdlib, third‑party, local) and alphabetize within
each group — e.g., keep base64, http.client (http_client), json, logging, os,
sys, time, uuid, from json.decoder import JSONDecodeError, from urllib.parse
import urlparse, urlunparse, and from typing import Optional, Union in the
stdlib group sorted alphabetically; optionally run isort to apply automatically
so CI lint passes.
♻️ Duplicate comments (11)
docs/source/_static/scanoss-settings-schema.json (1)

219-223: honour_file_exts description is inverted.

It still says “Ignores file extensions…”, which conflicts with the flag name and runtime semantics.

📝 Suggested fix
-              "description": "Ignores file extensions. When not set, defers to server configuration.",
+              "description": "Whether to honour file extensions during matching. True = honour extensions, False = ignore. When not set, defers to server configuration.",
src/scanoss/scan_settings_builder.py (1)

115-133: CLI --ignore-cert-errors doesn’t actually override settings.

_merge_with_priority always prefers file_snippet/root, so a CLI True is ignored when settings exist. Either update the docstring or (preferably) let CLI True win as documented.

🔧 Proposed fix
     def with_ignore_cert_errors(self, cli_value: bool = False) -> 'ScanSettingsBuilder':
         """Set ignore_cert_errors with priority: CLI True > file_snippet > settings > False.
@@
-        result = self._merge_with_priority(
-            cli_value if cli_value else None,
-            self._get_file_snippet_http_config_value('ignore_cert_errors'),
-            self._get_http_config_value('ignore_cert_errors')
-        )
-        self.ignore_cert_errors = result if result is not None else False
+        # Explicit CLI True should always win
+        if cli_value:
+            self.ignore_cert_errors = True
+            return self
+        result = self._merge_with_priority(
+            None,
+            self._get_file_snippet_http_config_value('ignore_cert_errors'),
+            self._get_http_config_value('ignore_cert_errors')
+        )
+        self.ignore_cert_errors = bool(result) if result is not None else False
         return self
src/scanoss/data/scanoss-settings-schema.json (2)

176-182: ranking_threshold schema range/default conflicts with runtime and docs.

Runtime clamps to 10 and docs/tests show “-1 to 10”, but this schema allows 99 and defaults to 0. Align all three sources to avoid invalid configs being accepted then silently clamped.

🧩 Suggested alignment (if 10 is intended)
             "ranking_threshold": {
               "type": ["integer", "null"],
               "description": "Ranking threshold value. A value of -1 defers to server configuration",
               "minimum": -1,
-              "maximum": 99,
-              "default": 0
+              "maximum": 10,
+              "default": -1
             },

201-205: honour_file_exts description is inverted.

The description says “Ignores file extensions…”, but the flag name/behavior is the opposite.

📝 Suggested fix
-            "honour_file_exts": {
-              "type": ["boolean", "null"],
-              "description": "Ignores file extensions. When not set, defers to server configuration.",
-              "default": true
-            },
+            "honour_file_exts": {
+              "type": ["boolean", "null"],
+              "description": "Whether to honour file extensions during matching. True = honour, False = ignore. When not set, defers to server configuration.",
+              "default": true
+            },
src/scanoss/scanossapi.py (1)

338-345: Normalize boolean settings before serializing.
Line 338-345: If ScanossApi is instantiated directly with 'true'/'false' strings, they serialize as strings instead of booleans. Coerce to proper booleans before JSON encoding.

🔧 Proposed normalization
-        # honour_file_exts: None = unset, don't send
-        if self.honour_file_exts is not None and self.honour_file_exts != 'unset':
-            settings['honour_file_exts'] = self.honour_file_exts
-
-        # ranking: None = unset, don't send
-        if self.ranking is not None and self.ranking != 'unset':
-            settings['ranking_enabled'] = self.ranking
+        def _normalize_bool(value):
+            if isinstance(value, str):
+                return value.lower() == 'true'
+            return bool(value)
+
+        # honour_file_exts: None = unset, don't send
+        if self.honour_file_exts is not None and self.honour_file_exts != 'unset':
+            settings['honour_file_exts'] = _normalize_bool(self.honour_file_exts)
+
+        # ranking: None = unset, don't send
+        if self.ranking is not None and self.ranking != 'unset':
+            settings['ranking_enabled'] = _normalize_bool(self.ranking)
src/scanoss/scanoss_settings.py (1)

339-345: Guard against legacy settings being a list.
Line 345: self.data.get(...) will raise if legacy settings are a list. Return {} when self.data isn’t a dict.

🔧 Proposed fix
-        return self.data.get('settings', {}).get('file_snippet', {})
+        data = self.data if isinstance(self.data, dict) else {}
+        return data.get('settings', {}).get('file_snippet', {})
src/scanoss/scanner.py (4)

144-153: CLI overrides are currently ignored.
Line 144-153 and Line 245-257: _merge_cli_with_settings returns settings first, so CLI flags can’t override file settings (contrary to PR objective). This directly affects skip_headers and skip_headers_limit.

🔧 Proposed fix
-    def _merge_cli_with_settings(cli_value, settings_value):
-        """Merge CLI value with settings value (two-level priority: settings > cli).
+    def _merge_cli_with_settings(cli_value, settings_value):
+        """Merge CLI value with settings value (two-level priority: cli > settings).
@@
-        if settings_value is not None:
-            return settings_value
-        return cli_value
+        if cli_value is not None:
+            return cli_value
+        return settings_value

Also applies to: 245-257


170-179: Builder merge order still makes CLI lowest priority.
Line 170-179: ScanSettingsBuilder (per current implementation) prefers file_snippet > root > CLI. That means CLI flags for min_snippet/ranking/honour-file-exts won’t override settings. Align builder priority with the PR objective (CLI > file_snippet > root) or apply CLI overrides after builder construction.


233-235: Condition should check scanoss_settings, not scan_settings.
Line 233-235: You pass scanoss_settings but gate creation on scan_settings, which is always truthy. This creates the post-processor even when no settings were supplied.

🔧 Proposed fix
-        self.post_processor = (
-            ScanPostProcessor(scanoss_settings, debug=debug, trace=trace, quiet=quiet) if scan_settings else None
-        )
+        self.post_processor = (
+            ScanPostProcessor(scanoss_settings, debug=debug, trace=trace, quiet=quiet) if scanoss_settings else None
+        )

341-342: Guard file_snippet access for legacy settings.
Line 341-342: If legacy settings load into a list, get_file_snippet_settings() can fail. Short‑circuit for legacy files.

🔧 Proposed fix
-        file_snippet_settings = self.scanoss_settings.get_file_snippet_settings() if self.scanoss_settings else {}
-        return file_snippet_settings.get('dependency_analysis', False)
+        if not self.scanoss_settings or self.scanoss_settings.is_legacy():
+            return False
+        file_snippet_settings = self.scanoss_settings.get_file_snippet_settings()
+        return file_snippet_settings.get('dependency_analysis', False)
src/scanoss/cli.py (1)

198-230: Fix CLI choices spacing and ranking-threshold help range.
Line 212-228: Spacing in choices is inconsistent, and the help text for --ranking-threshold doesn’t match the builder’s clamped range.

🔧 Suggested fix
-        choices=['unset' ,'true', 'false'],
+        choices=['unset', 'true', 'false'],
@@
-        help='Ranking threshold value. Valid range: -1 to 10. A value of -1 defers to server configuration (optional)',
+        help='Ranking threshold value. Valid range: -1 to 99. A value of -1 defers to server configuration (optional)',
@@
-        choices=['unset','true', 'false'],
+        choices=['unset', 'true', 'false'],

@github-actions
Copy link

SCANOSS SCAN Completed 🚀

  • Detected components: 4
  • Undeclared components: 0
  • Declared components: 4
  • Detected files: 103
  • Detected files undeclared: 0
  • Detected files declared: 103
  • Licenses detected: 2
  • Licenses detected with copyleft: 1
  • Policies: ✅ 1 pass (1 total)

View more details on SCANOSS Action Summary

@agustingroh agustingroh force-pushed the chore/scan-tunning-parameters branch from b9fdc26 to 0fc0c6a Compare February 2, 2026 09:56
@github-actions
Copy link

github-actions bot commented Feb 2, 2026

SCANOSS SCAN Completed 🚀

  • Detected components: 3
  • Undeclared components: 0
  • Declared components: 3
  • Detected files: 102
  • Detected files undeclared: 0
  • Detected files declared: 102
  • Licenses detected: 2
  • Licenses detected with copyleft: 1
  • Policies: ✅ 1 pass (1 total)

View more details on SCANOSS Action Summary

@agustingroh agustingroh force-pushed the chore/scan-tunning-parameters branch from 0fc0c6a to bb73de0 Compare February 2, 2026 09:58
@github-actions
Copy link

github-actions bot commented Feb 2, 2026

SCANOSS SCAN Completed 🚀

  • Detected components: 3
  • Undeclared components: 0
  • Declared components: 3
  • Detected files: 102
  • Detected files undeclared: 0
  • Detected files declared: 102
  • Licenses detected: 2
  • Licenses detected with copyleft: 1
  • Policies: ✅ 1 pass (1 total)

View more details on SCANOSS Action Summary

@agustingroh agustingroh merged commit 962bca9 into main Feb 2, 2026
6 checks passed
@agustingroh agustingroh deleted the chore/scan-tunning-parameters branch February 2, 2026 14:19
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

enhancement New feature or request

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants