Fix the repo path traversal check by JanKrivanek · Pull Request #525 · dotnet/skills

JanKrivanek · 2026-04-14T15:32:57Z

Followup of #524

Motivation

When traversal to paths outside the skill folder is allowed (for repo-specific skills that are not meant to be shareable), we should not check the depth of the path unless it is inside the current skill folder; otherwise the depth check would trip on those external references.

JanKrivanek · 2026-04-14T15:33:07Z

/evaluate

Copilot

Pull request overview

This PR adjusts the skill-validator's file reference depth/traversal checks so that, when repo traversal is explicitly allowed, parent-directory (..) references are no longer subjected to the "max 1 directory deep" rule, which is intended to keep portable skills self-contained.

Changes:

Updates SkillProfiler to suppress depth checking for references that include .. when AllowRepoTraversal is enabled.
Updates/extends unit tests to distinguish between deep external refs (allowed with repo traversal) vs deep internal refs (still rejected).
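
The rule described above can be sketched in isolation. This is a minimal illustration, not the SkillProfiler source: the class name `DepthRuleSketch`, its methods, and the `MaxDirectoryDepth` constant are hypothetical, and the sketch assumes the depth limit counts directory segments before the file name.

```csharp
using System;
using System.Linq;

// Hypothetical names throughout; a sketch of the described rule only.
public static class DepthRuleSketch
{
    // "Max 1 directory deep", as quoted in the overview above.
    private const int MaxDirectoryDepth = 1;

    private static string[] Segments(string reference) =>
        reference.Split(new[] { '/', '\\' }, StringSplitOptions.RemoveEmptyEntries);

    // True when any path segment is a parent-directory reference.
    public static bool ContainsParentSegment(string reference) =>
        Segments(reference).Contains("..");

    // Number of directory segments in front of the file name.
    public static int DirectoryDepth(string reference) =>
        Math.Max(Segments(reference).Length - 1, 0);

    // Depth check only; a real validator would report traversal
    // errors separately when repo traversal is not allowed.
    public static bool IsDepthViolation(string reference, bool allowRepoTraversal)
    {
        // With repo traversal allowed, ".." references skip the depth check.
        if (allowRepoTraversal && ContainsParentSegment(reference))
            return false;

        return DirectoryDepth(reference) > MaxDirectoryDepth;
    }
}
```

Under these assumptions, `../../../documentation/guides/setup.md` passes when `allowRepoTraversal` is true, while `refs/utils/foo/readme.md` is still reported as too deep.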

Show a summary per file

File	Description
eng/skill-validator/src/Check/SkillProfiler.cs	Changes traversal/depth validation flow to skip depth checks for `..` refs when repo traversal is allowed.
eng/skill-validator/tests/Check/SkillProfileTests.cs	Updates and adds tests for the updated repo-traversal behavior.

Copilot's findings


Comments suppressed due to low confidence (1)

eng/skill-validator/tests/Check/SkillProfileTests.cs:339

The new coverage verifies that ..-based external refs don't trigger depth errors, but it doesn't cover the important edge case where a path contains .. yet still resolves inside the skill directory (e.g. references/../refs/utils/foo/readme.md). Add a test that expects the depth error in that scenario (with AllowRepoTraversal = true) so the implementation can't accidentally allow deep internal refs by inserting .. segments.

    [Fact]
    public void AllowRepoTraversalAllowsDeepExternalRefs()
    {
        var content = "---\nname: test-skill\n---\n# Title\n1. Step\n```bash\necho\n```\nSee [ref](../../../documentation/guides/setup.md)\n" + new string('x', 4000);
        var options = new CheckOptions { AllowRepoTraversal = true };
        var profile = SkillProfiler.AnalyzeSkill(MakeSkill(content), options);
        Assert.DoesNotContain(profile.Errors, e => e.Contains("traversal") || e.Contains("directories deep"));
    }

    [Fact]
    public void AllowRepoTraversalStillChecksDepthForInternalRefs()
    {
        var content = "---\nname: test-skill\n---\n# Title\n1. Step\n```bash\necho\n```\nSee [ref](refs/utils/foo/readme.md)\n" + new string('x', 4000);
        var options = new CheckOptions { AllowRepoTraversal = true };
        var profile = SkillProfiler.AnalyzeSkill(MakeSkill(content), options);
        Assert.Contains(profile.Errors, e => e.Contains("directories deep"));
    }
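
The edge case raised in the suppressed comment depends on where a reference lands once its `..` segments are resolved. One way to decide that, sketched here under assumptions, is to normalize the path lexically before choosing whether the depth rule applies. The class `ReferenceNormalizerSketch` and its method are hypothetical and not part of the validator; the sketch treats the reference as relative to the skill directory and does not touch the file system.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper; illustrates lexical resolution of ".." segments.
public static class ReferenceNormalizerSketch
{
    // Returns false when the reference escapes the skill directory.
    // Otherwise returns true and reports the directory depth of the
    // resolved path (directory segments in front of the file name).
    public static bool TryGetInternalDepth(string reference, out int depth)
    {
        var resolved = new List<string>();

        foreach (var segment in reference.Split('/', '\\'))
        {
            if (segment.Length == 0 || segment == ".")
                continue;

            if (segment == "..")
            {
                // Nothing left to pop: the path leaves the skill folder.
                if (resolved.Count == 0)
                {
                    depth = -1;
                    return false;
                }
                resolved.RemoveAt(resolved.Count - 1);
            }
            else
            {
                resolved.Add(segment);
            }
        }

        depth = Math.Max(resolved.Count - 1, 0);
        return true;
    }
}
```

With this approach, `references/../refs/utils/foo/readme.md` resolves inside the skill directory at depth 3 and would still be subject to the depth rule, whereas `../../../documentation/guides/setup.md` is classified as external.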

Files reviewed: 2/2 changed files
Comments generated: 1

eng/skill-validator/src/Check/SkillProfiler.cs

JanKrivanek · 2026-04-14T15:47:13Z

/evaluate

github-actions · 2026-04-14T16:52:46Z

Skill Validation Results

Skill	Scenario	Quality	Skills Loaded	Overfit	Verdict
coverage-analysis	Project-wide coverage analysis with existing Cobertura data	3.0/5 → 5.0/5 🟢	✅ coverage-analysis; tools: skill, view, create	✅ 0.09	✅
coverage-analysis	Run coverage from scratch without existing data	4.0/5 → 5.0/5 🟢	✅ coverage-analysis; tools: skill	✅ 0.09	✅
coverage-analysis	Coverage plateau diagnosis	3.0/5 → 4.0/5 🟢	✅ coverage-analysis; tools: skill / ✅ coverage-analysis; tools: skill, create	✅ 0.09	✅
migrate-vstest-to-mtp	Migrate MSTest project from VSTest to Microsoft.Testing.Platform	5.0/5 → 5.0/5	✅ migrate-vstest-to-mtp; tools: skill / ✅ migrate-vstest-to-mtp; tools: report_intent, skill	✅ 0.10	❌ [1]
migrate-vstest-to-mtp	Migrate NUnit project from VSTest to Microsoft.Testing.Platform	2.0/5 → 5.0/5 🟢	✅ migrate-vstest-to-mtp; tools: skill	✅ 0.10	✅
migrate-vstest-to-mtp	Migrate xUnit.net v2 project from VSTest to Microsoft.Testing.Platform	2.0/5 → 4.0/5 🟢	✅ migrate-vstest-to-mtp; tools: skill, report_intent, view, bash / ✅ migrate-vstest-to-mtp; tools: skill	✅ 0.10	✅
migrate-vstest-to-mtp	Update Azure DevOps pipeline from VSTest task to MTP	2.0/5 → 5.0/5 🟢	✅ migrate-vstest-to-mtp; tools: skill / ✅ migrate-vstest-to-mtp; tools: report_intent, skill	✅ 0.10	✅
migrate-vstest-to-mtp	Migrate MSTest.Sdk project that explicitly uses VSTest	3.0/5 → 5.0/5 🟢	✅ migrate-vstest-to-mtp; tools: skill	✅ 0.10	✅
migrate-vstest-to-mtp	Translate dotnet test VSTest arguments to MTP equivalents	3.0/5 → 5.0/5 🟢	✅ migrate-vstest-to-mtp; tools: skill / ✅ migrate-vstest-to-mtp; tools: report_intent, skill	✅ 0.10	✅
migrate-vstest-to-mtp	Handle exit code 8 when migrating from VSTest to MTP	2.0/5 → 5.0/5 🟢	✅ migrate-vstest-to-mtp; tools: skill / ⚠️ NOT ACTIVATED	✅ 0.10	✅
migrate-vstest-to-mtp	Configure dotnet test MTP mode on .NET 10 SDK	2.0/5 → 5.0/5 🟢	✅ migrate-vstest-to-mtp; tools: skill	✅ 0.10	✅
migrate-vstest-to-mtp	Migrate xUnit.net VSTest filter syntax to MTP	1.0/5 → 5.0/5 🟢	✅ migrate-vstest-to-mtp; tools: skill / ⚠️ NOT ACTIVATED	✅ 0.10	✅
migrate-vstest-to-mtp	Full VSTest to MTP migration plan for MSTest solution	4.0/5 → 5.0/5 🟢	✅ migrate-vstest-to-mtp; tools: skill	✅ 0.10	✅
test-anti-patterns	Detect mixed severity anti-patterns in repository service tests	5.0/5 → 5.0/5	✅ test-anti-patterns; tools: skill, report_intent / ⚠️ NOT ACTIVATED	✅ 0.05	✅
test-anti-patterns	Detect flakiness indicators and test coupling	3.0/5 → 5.0/5 🟢	✅ test-anti-patterns; tools: skill, report_intent / ⚠️ NOT ACTIVATED	✅ 0.05	❌ [2]
test-anti-patterns	Detect duplicated tests and magic values	3.0/5 → 5.0/5 🟢	✅ test-anti-patterns; tools: report_intent, skill	✅ 0.05	✅
test-anti-patterns	Recognize well-written tests without inventing false positives	2.0/5 → 5.0/5 🟢	✅ test-anti-patterns; tools: report_intent, skill	✅ 0.05	✅
migrate-static-to-wrapper	Migrate DateTime.UtcNow to TimeProvider in a service class	5.0/5 → 5.0/5	✅ migrate-static-to-wrapper; tools: skill, bash / ✅ migrate-static-to-wrapper; tools: skill	✅ 0.07	❌ [3]
migrate-static-to-wrapper	Migrate only in scoped files, leaving others untouched	5.0/5 → 5.0/5	✅ migrate-static-to-wrapper; tools: skill, bash / ⚠️ NOT ACTIVATED	✅ 0.07	❌ [4]
migrate-static-to-wrapper	Decline migration when wrapper does not exist yet	4.0/5 → 5.0/5 🟢	✅ migrate-static-to-wrapper; tools: skill / ✅ migrate-static-to-wrapper; tools: skill, glob	✅ 0.07	❌ [5]
run-tests	Run tests in a VSTest MSTest project	4.0/5 → 4.0/5	✅ run-tests; tools: skill, glob / ✅ run-tests; tools: skill	✅ 0.19	✅
run-tests	Run tests with trx reporting on MTP project (SDK 9)	4.0/5 → 4.0/5	✅ run-tests; tools: skill / ✅ run-tests; tools: skill, glob	✅ 0.19	✅
run-tests	Run tests with blame-hang on MTP project (SDK 10)	1.0/5 ⏰ → 2.0/5 🟢	✅ run-tests; tools: skill	✅ 0.19	✅
run-tests	Run tests in a multi-TFM project targeting a specific framework	2.0/5 → 4.0/5 🟢	⚠️ NOT ACTIVATED	✅ 0.19	✅
run-tests	Filter MSTest tests by category on VSTest	5.0/5 → 5.0/5	⚠️ NOT ACTIVATED	✅ 0.19	❌ [6]
run-tests	Filter NUnit tests by class name on VSTest	4.0/5 → 5.0/5 🟢	✅ run-tests; tools: skill, glob / ⚠️ NOT ACTIVATED	✅ 0.19	✅
run-tests	Filter xUnit v3 tests by class on MTP	1.0/5 → 5.0/5 🟢	✅ run-tests; tools: skill, view / ⚠️ NOT ACTIVATED	✅ 0.19	✅
run-tests	Filter xUnit v3 tests by trait on MTP	1.0/5 → 5.0/5 🟢	✅ run-tests; tools: skill / ⚠️ NOT ACTIVATED	✅ 0.19	✅
run-tests	Filter TUnit tests by class using treenode-filter	4.0/5 → 4.0/5	✅ run-tests; tools: skill / ⚠️ NOT ACTIVATED	✅ 0.19	❌ [7]
run-tests	Combine multiple filter criteria on VSTest MSTest	4.0/5 → 4.0/5	⚠️ NOT ACTIVATED / ✅ run-tests; tools: skill	✅ 0.19	✅
run-tests	MTP project on SDK 9 must use -- separator for args	2.0/5 → 5.0/5 🟢	✅ run-tests; tools: skill / ⚠️ NOT ACTIVATED	✅ 0.19	❌ [8]
run-tests	MTP project on SDK 10 passes args directly	3.0/5 → 3.0/5	✅ run-tests; tools: skill / ⚠️ NOT ACTIVATED	✅ 0.19	❌ [9]
run-tests	Detect test platform from Directory.Build.props	2.0/5 → 5.0/5 🟢	✅ run-tests; tools: skill	✅ 0.19	✅
run-tests	Negative test: do not use MTP syntax for a VSTest project	4.0/5 → 5.0/5 🟢	✅ run-tests; tools: skill, view, glob / ⚠️ NOT ACTIVATED	✅ 0.19	❌ [10]
writing-mstest-tests	Write unit tests for a service class	5.0/5 → 4.0/5 🔴	✅ writing-mstest-tests; tools: skill / ⚠️ NOT ACTIVATED	🟡 0.27	❌
writing-mstest-tests	Write data-driven tests for a calculator	4.0/5 → 5.0/5 🟢	✅ writing-mstest-tests; tools: skill	🟡 0.27	✅
writing-mstest-tests	Write async tests with cancellation	2.0/5 → 5.0/5 🟢	✅ writing-mstest-tests; tools: skill	🟡 0.27	✅
writing-mstest-tests	Fix swapped Assert.AreEqual arguments	5.0/5 → 5.0/5	✅ writing-mstest-tests; tools: skill / ⚠️ NOT ACTIVATED	🟡 0.27	❌ [11]
writing-mstest-tests	Modernize legacy test patterns	4.0/5 → 5.0/5 🟢	✅ writing-mstest-tests; tools: skill	🟡 0.27	✅
writing-mstest-tests	Replace ExpectedException with Assert.Throws	3.0/5 → 3.0/5	✅ writing-mstest-tests; tools: skill / ✅ writing-mstest-tests; tools: report_intent, skill	🟡 0.27	✅
writing-mstest-tests	Use proper collection assertions	3.0/5 → 2.0/5 🔴	✅ writing-mstest-tests; tools: skill	🟡 0.27	❌
writing-mstest-tests	Use proper type assertions instead of casts	3.0/5 → 3.0/5	✅ writing-mstest-tests; tools: skill / ⚠️ NOT ACTIVATED	🟡 0.27	✅
writing-mstest-tests	Set up test lifecycle correctly	3.0/5 → 4.0/5 🟢	✅ writing-mstest-tests; tools: skill	🟡 0.27	✅
writing-mstest-tests	Use DynamicData with ValueTuples over object arrays	2.0/5 → 5.0/5 🟢	✅ writing-mstest-tests; tools: report_intent, skill / ⚠️ NOT ACTIVATED	🟡 0.27	✅
mtp-hot-reload	Suggest hot reload for failing test in MTP project (SDK 9)	1.0/5 → 2.0/5 ⏰ 🟢	✅ mtp-hot-reload; tools: skill	✅ 0.14	✅
mtp-hot-reload	Suggest hot reload for failing test in MTP project (SDK 10)	1.0/5 → 4.0/5 🟢	✅ mtp-hot-reload; tools: skill, bash, create	✅ 0.14	✅
mtp-hot-reload	Enable hot reload when package already installed	2.0/5 → 5.0/5 🟢	✅ mtp-hot-reload; tools: skill	✅ 0.14	✅
mtp-hot-reload	Suggest launchSettings.json configuration for hot reload	1.0/5 → 4.0/5 🟢	✅ mtp-hot-reload; tools: skill, bash	✅ 0.14	✅
mtp-hot-reload	Use dotnet run not dotnet test for hot reload	3.0/5 → 3.0/5	✅ mtp-hot-reload; tools: skill	✅ 0.14	✅
mtp-hot-reload	Negative: VSTest project cannot use MTP hot reload	1.0/5 → 3.0/5 🟢	✅ mtp-hot-reload; tools: skill, create	✅ 0.14	✅
mtp-hot-reload	Run specific failing test with hot reload filter	1.0/5 → 5.0/5 🟢	✅ mtp-hot-reload; tools: skill	✅ 0.14	✅
migrate-xunit-to-xunit-v3	Migrate basic xUnit.net v2 project to v3	2.0/5 → 5.0/5 🟢	✅ migrate-xunit-to-xunit-v3; tools: skill, glob / ✅ migrate-xunit-to-xunit-v3; tools: skill	✅ 0.05	✅
migrate-xunit-to-xunit-v3	Detect incompatible target framework and stop migration	1.0/5 → 5.0/5 🟢	✅ migrate-xunit-to-xunit-v3; tools: skill	✅ 0.05	✅
migrate-xunit-to-xunit-v3	Convert async void test methods to async Task	5.0/5 → 5.0/5	✅ migrate-xunit-to-xunit-v3; tools: skill, glob, create / ✅ migrate-xunit-to-xunit-v3; tools: skill	✅ 0.05	❌ [12]
migrate-xunit-to-xunit-v3	Convert string-based attribute constructors to typeof syntax	5.0/5 → 5.0/5	✅ migrate-xunit-to-xunit-v3; tools: skill, create / ✅ migrate-xunit-to-xunit-v3; tools: skill, web_fetch, create	✅ 0.05	❌ [13]
migrate-xunit-to-xunit-v3	Update custom FactAttribute to include source information parameters	5.0/5 → 5.0/5	✅ migrate-xunit-to-xunit-v3; tools: skill / ✅ migrate-xunit-to-xunit-v3; tools: skill, web_fetch	✅ 0.05	✅
migrate-xunit-to-xunit-v3	Update BeforeAfterTestAttribute overrides with IXunitTest parameter	5.0/5 → 4.0/5 🔴	✅ migrate-xunit-to-xunit-v3; tools: skill, create	✅ 0.05	❌
migrate-xunit-to-xunit-v3	Migrate project with YTest.MTP.XUnit2 to xUnit.net v3 preserving MTP	3.0/5 → 5.0/5 🟢	✅ migrate-xunit-to-xunit-v3; tools: skill, web_fetch / ✅ migrate-xunit-to-xunit-v3; tools: skill	✅ 0.05	✅
migrate-xunit-to-xunit-v3	Migrate Xunit.SkippableFact to xUnit.net v3 built-in skip APIs	5.0/5 → 5.0/5	✅ migrate-xunit-to-xunit-v3; tools: skill, create / ✅ migrate-xunit-to-xunit-v3; tools: skill, glob	✅ 0.05	❌ [14]
migrate-xunit-to-xunit-v3	Migrate xUnit v2 packages managed via Central Package Management	5.0/5 → 5.0/5	✅ migrate-xunit-to-xunit-v3; tools: skill, web_fetch / ✅ migrate-xunit-to-xunit-v3; tools: skill, create	✅ 0.05	❌ [15]
migrate-xunit-to-xunit-v3	Recognize project already on xUnit.net v3 — no migration needed	2.0/5 → 5.0/5 🟢	✅ migrate-xunit-to-xunit-v3; tools: skill / ✅ migrate-xunit-to-xunit-v3; tools: skill, glob	✅ 0.05	✅
migrate-xunit-to-xunit-v3	Consolidate xunit.extensibility packages and remove xunit.abstractions	3.0/5 → 3.0/5	✅ migrate-xunit-to-xunit-v3; tools: skill	✅ 0.05	✅
migrate-xunit-to-xunit-v3	Update Xunit.Combinatorial and Xunit.StaFact companion packages	5.0/5 → 4.0/5 🔴	✅ migrate-xunit-to-xunit-v3; tools: skill, create / ✅ migrate-xunit-to-xunit-v3; tools: skill, glob, create	✅ 0.05	✅
generate-testability-wrappers	Generate TimeProvider adoption for DateTime.UtcNow	3.0/5 → 5.0/5 🟢	✅ generate-testability-wrappers; tools: skill / ✅ generate-testability-wrappers; tools: glob, skill, bash	✅ 0.08	✅
generate-testability-wrappers	Generate custom Environment wrapper	3.0/5 → 5.0/5 🟢	✅ generate-testability-wrappers; tools: skill	✅ 0.08	✅
generate-testability-wrappers	Recommend System.IO.Abstractions for file system calls	2.0/5 → 5.0/5 🟢	✅ generate-testability-wrappers; tools: skill, report_intent, view / ✅ generate-testability-wrappers; tools: skill	✅ 0.08	✅
generate-testability-wrappers	Decline wrapper generation for already-abstracted code	2.0/5 → 5.0/5 🟢	✅ generate-testability-wrappers; tools: skill	✅ 0.08	✅
code-testing-agent	Generate tests for ContosoUniversity ASP.NET Core MVC app	3.0/5 → 3.0/5	✅ code-testing-agent; tools: skill, task, read_agent / ✅ code-testing-agent; tools: skill, task, read_bash, read_agent	✅ 0.02	❌ [16]
migrate-mstest-v1v2-to-v3	Migrate MSTest v1 project with assembly reference	3.0/5 → 5.0/5 🟢	✅ migrate-mstest-v1v2-to-v3; tools: skill, edit, bash / ✅ migrate-mstest-v1v2-to-v3; tools: skill	✅ 0.04	✅
migrate-mstest-v1v2-to-v3	Migrate MSTest v2 NuGet project to v3	3.0/5 → 3.0/5	✅ migrate-mstest-v1v2-to-v3; tools: skill	✅ 0.04	❌ [17]
migrate-mstest-v1v2-to-v3	Fix Assert.AreEqual object overload errors after v3 upgrade	4.0/5 → 5.0/5 🟢	✅ migrate-mstest-v1v2-to-v3; tools: skill / ⚠️ NOT ACTIVATED	✅ 0.04	✅
migrate-mstest-v1v2-to-v3	Migrate from .testsettings to .runsettings	4.0/5 → 4.0/5	✅ migrate-mstest-v1v2-to-v3; tools: skill, bash	✅ 0.04	❌ [18]
migrate-mstest-v1v2-to-v3	Fix DataRow type mismatch errors after v3 upgrade	3.0/5 → 3.0/5	✅ migrate-mstest-v1v2-to-v3; tools: skill	✅ 0.04	✅
migrate-mstest-v1v2-to-v3	Migrate to MSTest.Sdk project style	3.0/5 → 5.0/5 🟢	✅ migrate-mstest-v1v2-to-v3; tools: skill, bash	✅ 0.04	✅
migrate-mstest-v1v2-to-v3	Handle dropped target framework during v3 migration	5.0/5 → 5.0/5	⚠️ NOT ACTIVATED	✅ 0.04	❌ [19]
migrate-mstest-v1v2-to-v3	Migrate complex MSTest v2 project with testsettings, DataRow issues, and dropped TFM	3.0/5 → 5.0/5 🟢	✅ migrate-mstest-v1v2-to-v3; tools: skill	✅ 0.04	✅
migrate-mstest-v1v2-to-v3	Correctly identify MSTest v1 vs v2 and recommend different migration paths	4.0/5 → 5.0/5 🟢	✅ migrate-mstest-v1v2-to-v3; tools: skill	✅ 0.04	✅
detect-static-dependencies	Identify static dependencies in a multi-class project	4.0/5 → 4.0/5	✅ detect-static-dependencies; tools: skill, glob / ✅ detect-static-dependencies; tools: skill	✅ 0.06	❌ [20]
detect-static-dependencies	Detect time-related statics and recommend TimeProvider	5.0/5 → 5.0/5	✅ detect-static-dependencies; tools: skill	✅ 0.06	❌ [21]
detect-static-dependencies	Decline scan for non-C# project	5.0/5 → 5.0/5	ℹ️ not activated (expected)	✅ 0.06	✅ [22]
migrate-mstest-v3-to-v4	Migrate custom TestMethodAttribute from Execute to ExecuteAsync	2.0/5 → 3.0/5 🟢	✅ migrate-mstest-v3-to-v4; tools: skill	✅ 0.06	✅
migrate-mstest-v3-to-v4	Replace ExpectedExceptionAttribute with Assert.ThrowsExactly	3.0/5 → 4.0/5 🟢	✅ migrate-mstest-v3-to-v4; tools: skill	✅ 0.06	✅
migrate-mstest-v3-to-v4	Fix multiple v4 breaking changes: Assert, ClassCleanup, TestContext, Timeout	5.0/5 → 5.0/5	✅ migrate-mstest-v3-to-v4; tools: skill	✅ 0.06	✅
migrate-mstest-v3-to-v4	Handle net6.0 target framework dropped in MSTest v4	3.0/5 → 5.0/5 🟢	✅ migrate-mstest-v3-to-v4; tools: skill / ⚠️ NOT ACTIVATED	✅ 0.06	✅
migrate-mstest-v3-to-v4	Fix TestMethodAttribute CallerInfo constructor breaking change	5.0/5 → 5.0/5	✅ migrate-mstest-v3-to-v4; tools: skill	✅ 0.06	❌
migrate-mstest-v3-to-v4	Understand behavioral changes after MSTest v4 upgrade	3.0/5 → 5.0/5 🟢	✅ migrate-mstest-v3-to-v4; tools: skill	✅ 0.06	✅
migrate-mstest-v3-to-v4	Handle MSTest.Sdk and MTP changes in v4	2.0/5 → 3.0/5 🟢	✅ migrate-mstest-v3-to-v4; tools: report_intent, skill	✅ 0.06	✅
migrate-mstest-v3-to-v4	Full MSTest v3 to v4 migration with multiple breaking changes	2.0/5 → 5.0/5 🟢	✅ migrate-mstest-v3-to-v4; tools: skill	✅ 0.06	✅
migrate-mstest-v3-to-v4	Migrate MSTest.Sdk v3 project using ManagedType and TestTimeout	2.0/5 → 4.0/5 🟢	✅ migrate-mstest-v3-to-v4; tools: skill	✅ 0.06	✅
migrate-mstest-v3-to-v4	Correctly identify MSTest v3 project and recommend v4 migration	3.0/5 → 5.0/5 🟢	✅ migrate-mstest-v3-to-v4; tools: skill	✅ 0.06	✅
crap-score	Calculate CRAP score for a single method with partial coverage	4.0/5 → 4.0/5	✅ crap-score; tools: skill, glob / ✅ crap-score; tools: skill	✅ 0.11	✅
crap-score	Identify riskiest methods across a file	4.0/5 → 5.0/5 🟢	✅ crap-score; tools: skill	✅ 0.11	✅
crap-score	Generate coverage then compute CRAP score	3.0/5 → 4.0/5 🟢	✅ crap-score; tools: skill	✅ 0.11	✅
exp-mock-usage-analysis	Detect unused and unreachable mock setups	4.0/5 → 5.0/5 🟢	✅ exp-mock-usage-analysis; tools: skill	✅ 0.07	✅
exp-mock-usage-analysis	Detect redundant mock configurations duplicated across tests	3.0/5 → 3.0/5	✅ exp-mock-usage-analysis; tools: skill	✅ 0.07	❌ [23]
exp-mock-usage-analysis	Detect mocking of stable framework types	3.0/5 → 5.0/5 🟢	✅ exp-mock-usage-analysis; tools: skill	✅ 0.07	✅
exp-mock-usage-analysis	Analyze mock usage in NSubstitute tests	5.0/5 → 4.0/5 🔴	✅ exp-mock-usage-analysis; tools: skill	✅ 0.07	❌ [24]
exp-mock-usage-analysis	Analyze mock usage in FakeItEasy tests	4.0/5 → 5.0/5 🟢	✅ exp-mock-usage-analysis; tools: skill	✅ 0.07	✅
exp-mock-usage-analysis	Detect excessive mock configuration sprawl	3.0/5 → 4.0/5 🟢	✅ exp-mock-usage-analysis; tools: skill	✅ 0.07	✅
exp-test-tagging	Tag an untagged MSTest test suite	3.0/5 → 5.0/5 🟢	✅ exp-test-tagging; tools: skill / ✅ exp-test-tagging; tools: skill, read_bash	🟡 0.36	✅
exp-test-tagging	Tag an untagged xUnit test suite	4.0/5 → 5.0/5 🟢	✅ exp-test-tagging; tools: skill / ⚠️ NOT ACTIVATED	🟡 0.36	✅
exp-test-tagging	Tag an untagged NUnit test suite	3.0/5 → 4.0/5 🟢	✅ exp-test-tagging; tools: skill, glob / ⚠️ NOT ACTIVATED	🟡 0.36	✅
exp-test-tagging	Audit test distribution without modifying files	5.0/5 → 5.0/5	✅ exp-test-tagging; tools: skill	🟡 0.36	❌ [25]
exp-test-tagging	Decline request to write new tests	4.0/5 → 4.0/5	ℹ️ not activated (expected)	🟡 0.36	❌ [26]
exp-test-smell-detection	Detect multiple test smells in order processing test suite	4.0/5 → 5.0/5 🟢	✅ exp-test-smell-detection; tools: skill	✅ 0.05	✅
exp-test-smell-detection	Recognize well-written tests with no significant smells	3.0/5 → 5.0/5 🟢	✅ exp-test-smell-detection; tools: skill	✅ 0.05	✅
exp-test-smell-detection	Recognize integration tests and avoid false positives for external resources	5.0/5 → 5.0/5	✅ exp-test-smell-detection; tools: skill	✅ 0.05	❌ [27]
exp-test-smell-detection	Decline request to write new tests from scratch	5.0/5 → 5.0/5	ℹ️ not activated (expected)	✅ 0.05	❌ [28]
exp-test-gap-analysis	Find boundary mutation gaps in tiered discount and shipping logic	4.0/5 → 5.0/5 🟢	✅ exp-test-gap-analysis; tools: skill	✅ 0.10	✅
exp-test-gap-analysis	Find logic and null-check mutation gaps in access control code	5.0/5 → 5.0/5	✅ exp-test-gap-analysis; tools: skill	✅ 0.10	❌ [29]
exp-test-gap-analysis	Acknowledge well-tested code with few surviving mutations	4.0/5 → 4.0/5	✅ exp-test-gap-analysis; tools: skill, glob / ✅ exp-test-gap-analysis; tools: skill	✅ 0.10	✅
exp-test-gap-analysis	Decline request to write new tests from scratch	4.0/5 → 4.0/5	ℹ️ not activated (expected)	✅ 0.10	❌ [30]
exp-assertion-quality	Identify low assertion diversity in equality-dominated test suite	4.0/5 → 5.0/5 🟢	✅ exp-assertion-quality; tools: skill / ✅ exp-assertion-quality; tools: skill, glob	✅ 0.10	✅
exp-assertion-quality	Flag assertion-free tests and trivial-only assertions	3.0/5 → 4.0/5 🟢	✅ exp-assertion-quality; tools: skill / ⚠️ NOT ACTIVATED	✅ 0.10	✅
exp-assertion-quality	Recognize well-diversified assertion usage	3.0/5 → 5.0/5 🟢	✅ exp-assertion-quality; tools: skill	✅ 0.10	✅
exp-assertion-quality	Decline request to write new tests from scratch	2.0/5 ⏰ → 3.0/5 ⏰ 🟢	ℹ️ not activated (expected)	✅ 0.10	✅
exp-test-maintainability	Recommend data-driven patterns with display names for unclear parameters	4.0/5 → 4.0/5	⚠️ NOT ACTIVATED	✅ 0.08	❌ [31]
exp-test-maintainability	Recognize well-maintained tests that need minimal changes	4.0/5 → 5.0/5 🟢	⚠️ NOT ACTIVATED / ✅ exp-test-maintainability; tools: report_intent, skill	✅ 0.08	✅
exp-test-maintainability	Detect repeated object construction and setup across test methods	3.0/5 → 4.0/5 🟢	✅ exp-test-maintainability; tools: skill	✅ 0.08	✅
exp-test-maintainability	Recognize tests with minimal boilerplate that need no refactoring	4.0/5 → 5.0/5 🟢	✅ exp-test-maintainability; tools: skill	✅ 0.08	❌ [32]
exp-simd-vectorization	Optimize manual min/max with TensorPrimitives	1.0/5 → 5.0/5 🟢	✅ exp-simd-vectorization; tools: skill, glob, create	🟡 0.22	✅
exp-simd-vectorization	Optimize manual product with TensorPrimitives	1.0/5 → 5.0/5 🟢	✅ exp-simd-vectorization; tools: skill, glob, create, bash / ⚠️ NOT ACTIVATED	🟡 0.22	❌ [33]
exp-simd-vectorization	No optimization opportunity — dictionary-based lookup service	1.0/5 → 4.0/5 🟢	⚠️ NOT ACTIVATED	🟡 0.22	✅
exp-simd-vectorization	Optimize int array conditional increment with SIMD	3.0/5 → 4.0/5 🟢	✅ exp-simd-vectorization; tools: skill	🟡 0.22	✅
exp-simd-vectorization	Optimize byte buffer bit reversal with SIMD	1.0/5 ⏰ → 4.0/5 🟢	✅ exp-simd-vectorization; tools: skill, edit, bash / ✅ exp-simd-vectorization; tools: skill	🟡 0.22	❌ [34]

[1] (Plugin) Quality unchanged but weighted score is -7.9% due to: tokens (19082 → 59936), tool calls (0 → 2)
[2] (Plugin) Quality unchanged but weighted score is -2.0% due to: quality, tokens (20438 → 24549)
[3] (Isolated) Quality unchanged but weighted score is -12.3% due to: tokens (66796 → 161170), quality, tool calls (9 → 17), time (31.8s → 55.3s)
[4] (Isolated) Quality unchanged but weighted score is -6.7% due to: tokens (100103 → 162734), tool calls (11 → 19), time (40.8s → 69.8s)
[5] (Isolated) Quality improved but weighted score is -5.7% due to: tokens (71806 → 122970), time (33.3s → 57.8s)
[6] (Isolated) Quality unchanged but weighted score is -2.6% due to: tokens (31188 → 38509), tool calls (3 → 4), time (11.7s → 14.6s)
[7] (Isolated) Quality unchanged but weighted score is -0.7% due to: tokens (38306 → 59456), tool calls (4 → 5), time (26.3s → 31.8s)
[8] (Plugin) Quality unchanged but weighted score is -29.5% due to: quality, judgment, tokens (62063 → 74826)
[9] (Isolated) Quality unchanged but weighted score is -16.5% due to: completion (✓ → ✗), tokens (130577 → 340674), time (67.5s → 150.0s), tool calls (10 → 18)
[10] (Plugin) Quality unchanged but weighted score is -5.8% due to: tokens (30713 → 55560), tool calls (2 → 3), time (14.1s → 17.0s)
[11] (Isolated) Quality unchanged but weighted score is -9.3% due to: tokens (18444 → 38539), tool calls (0 → 1), time (11.5s → 19.8s)
[12] (Plugin) Quality unchanged but weighted score is -10.0% due to: tokens (156817 → 1018419), tool calls (19 → 45), time (94.2s → 293.2s)
[13] (Isolated) Quality unchanged but weighted score is -18.0% due to: judgment, quality, tokens (451439 → 529026)
[14] (Plugin) Quality unchanged but weighted score is -0.5% due to: tokens (189866 → 231027)
[15] (Isolated) Quality unchanged but weighted score is -12.4% due to: judgment, tokens (132558 → 150073)
[16] (Isolated) Quality unchanged but weighted score is -30.4% due to: quality, judgment, tokens (1018206 → 1633192), tool calls (69 → 132), time (366.9s → 533.2s)
[17] (Isolated) Quality unchanged but weighted score is -12.3% due to: completion (✓ → ✗), tokens (33153 → 51417), tool calls (3 → 5)
[18] (Isolated) Quality unchanged but weighted score is -17.6% due to: judgment, quality, tokens (59260 → 68176)
[19] (Plugin) Quality unchanged but weighted score is -1.6% due to: tokens (30748 → 38968)
[20] (Plugin) Quality unchanged but weighted score is -1.8% due to: tokens (73256 → 125415), time (31.7s → 65.4s)
[21] (Isolated) Quality unchanged but weighted score is -18.0% due to: quality, tokens (46245 → 109492), tool calls (5 → 11), time (28.9s → 72.1s)
[22] (Plugin) Quality dropped but weighted score is +0.9% due to: efficiency metrics
[23] (Isolated) Quality unchanged but weighted score is -1.2% due to: tokens (48445 → 70755), time (53.8s → 88.3s), tool calls (5 → 6)
[24] (Plugin) Quality unchanged but weighted score is -2.6% due to: tokens (28649 → 50226), time (43.0s → 78.0s), tool calls (3 → 4)
[25] (Isolated) Quality unchanged but weighted score is -19.2% due to: judgment, quality, tokens (45352 → 66031), tool calls (7 → 9)
[26] (Plugin) Quality unchanged but weighted score is -2.0% due to: tokens (41837 → 206912), tool calls (4 → 14), time (48.4s → 131.3s)
[27] (Plugin) Quality unchanged but weighted score is -8.1% due to: tokens (39926 → 71189), time (42.9s → 81.9s), tool calls (4 → 7)
[28] (Plugin) Quality unchanged but weighted score is -10.0% due to: tokens (40845 → 286462), tool calls (3 → 17), time (29.4s → 142.3s)
[29] (Isolated) Quality unchanged but weighted score is -9.7% due to: judgment, quality
[30] (Plugin) Quality unchanged but weighted score is -3.5% due to: tokens (185339 → 263698), time (78.7s → 104.6s), tool calls (14 → 17)
[31] (Plugin) Quality unchanged but weighted score is -0.2% due to: efficiency metrics
[32] (Plugin) Quality unchanged but weighted score is -6.6% due to: tokens (39395 → 66399), time (24.2s → 63.2s), tool calls (4 → 5)
[33] (Plugin) Quality unchanged but weighted score is -17.1% due to: judgment, quality
[34] (Plugin) Quality unchanged but weighted score is -7.5% due to: tokens (12219 → 27443), tool calls (2 → 4)

⏰ timeout — run(s) hit the (120s, 180s, 300s, 360s) scenario timeout limit; scoring may be impacted by aborting model execution before it could produce its full output (increase via timeout in eval.yaml)

Model: claude-opus-4.6 | Judge: claude-opus-4.6

🔍 Full Results - additional metrics and failure investigation steps

▶ Sessions Visualisation -- interactive replay of all evaluation sessions

Fix the repo path traversal check

7887228

JanKrivanek requested a review from ViktorHofer as a code owner April 14, 2026 15:32

Copilot AI review requested due to automatic review settings April 14, 2026 15:32

Copilot started reviewing on behalf of JanKrivanek April 14, 2026 15:33

JanKrivanek enabled auto-merge (squash) April 14, 2026 15:34

Copilot AI reviewed Apr 14, 2026

eng/skill-validator/src/Check/SkillProfiler.cs (resolved)

ViktorHofer approved these changes Apr 14, 2026

JanKrivanek merged commit e6f4168 into main Apr 14, 2026
41 checks passed

JanKrivanek deleted the dev/jankrivanek/fix-repo-refs-check branch April 14, 2026 16:52

github-actions bot added a commit that referenced this pull request Apr 14, 2026

Update PR token usage data (PR #525)

bdb8b38

github-actions bot added a commit that referenced this pull request Apr 14, 2026

Update session data (PR #525)

b0cba63

github-actions bot mentioned this pull request Apr 15, 2026

🏥 Repository Health Dashboard #288

Open

JanKrivanek merged 1 commit into main from dev/jankrivanek/fix-repo-refs-check

3 participants
