Skip to content

Exception handling revamp, parallel multi-host inflation#4398

Merged
HebaruSan merged 16 commits into
KSP-CKAN:masterfrom
HebaruSan:fix/exception-reporting
Jun 23, 2025
Merged

Exception handling revamp, parallel multi-host inflation#4398
HebaruSan merged 16 commits into
KSP-CKAN:masterfrom
HebaruSan:fix/exception-reporting

Conversation

@HebaruSan
Copy link
Copy Markdown
Member

@HebaruSan HebaruSan commented Jun 23, 2025

Motivations

  • We currently target .NET Framework 4.8 (in addition to .NET8). The final version of .NET Framework, 4.8.1, was released in August 2022. Updating at that time would have caused problems for users who hadn't upgraded, but now that it's been out for a while, it would be nice to use the latest version.
    https://dotnet.microsoft.com/en-us/download/dotnet-framework
  • Task.Run was introduced to replace the simplest usages of Task.Factory.StartNew.
    https://devblogs.microsoft.com/dotnet/task-run-vs-task-factory-startnew/
  • Some of our code emits log4net warnings when the tests are run, which can look like a problem but is supposed to happen.
  • Main DLC store is gone, add GOG and Epic CKAN-meta#3363 retired the broken resources.store links for the DLCs and added resources.gogstore and resources.epicstore, but we need some client updates to display them.
  • Netkan inflates each part of a multi-hosted mod sequentially, but they're completely independent up until the moment they're merged and so a good candidate for parallel processing
  • The Inflator has been emitting Index was outside the bounds of the array. errors for a while now, which cannot be reproduced locally. I have audited our code and not found any suspicious usages of array indexes. To figure out what is causing this, we need the bot to log more information from the exception.

Problems

  • If you override your stability settings to install a pre-release of a mod, trying to downgrade back to the latest stable release doesn't light up the Apply button. You can downgrade to an earlier version, then upgrade to the latest, but the direct downgrade to latest doesn't work.
  • @Clayell pointed out that in the Settings dialog, the Install update button is enabled in the latest stable release. It's supposed to be disabled because there is no newer stable version to install yet.

Causes

  • The changeset tried to represent the downgrade as a ModUpgrade, which then ran afoul of the upgradability-checking logic from Better version specific relationships at install and upgrade #4023.
  • The screenshot tells the tale; locally we have v1.36.0.25098, but from the GitHub API we get only our tag v1.36.0, and the extra build number piece makes these two in-fact equal versions compare as non-equal.
    image

Changes

  • Now our .NET Framework target is updated from 4.8 to 4.8.1
  • Now all calls to Task.Factory.StartNew are changed to Task.Run
  • Now tests that raise log4net warnings suppress their output
  • Now mod info will show the links to the GOG and Epic stores for DLC
  • Now the Inflator handles multi-hosted netkans in parallel with Tasks, which means that it will contact GitHub and SpaceDock simultaneously and process each of them independently, which should be overall quicker than doing it sequentially.
  • Now exception handling is substantially revamped
    • The Kraken.Message property now always holds the user-friendly string that should be displayed to explain the problem
    • Child classes of Kraken no longer override ToString, so we can use the standard implementation to print stack traces if needed
    • In CmdLine, ConsoleUI, GUI, and Netkan, we now always display Kraken.Message for Kraken exceptions rather than custom UI-specific messages, and Exception.ToString() for non-Krakens. This roughly corresponds to printing simple, user-friendly messages for "expected" exceptions and full stack traces for "unexpected" exceptions, and should allow us to identify the cause of the Index was outside the bounds of the array. errors.
  • Now you can downgrade a mod from a pre-release to the latest stable release, because we represent it as a remove operation plus an install operation instead of an upgrade
  • Now the Install update button in the Settings dialog will properly be disabled if you're on the latest stable release with dev builds disabled, and tests are created to cover this
  • Now a new bin/find_unused_resources.sh utility is created to identify unused i18n resources
  • All existing unused i18n resources are removed
  • Two straggling untranslated strings are now internationalized

@HebaruSan HebaruSan added Bug Something is not working as intended Enhancement New features or functionality Build Issues affecting the build system Spec Issues affecting the spec Netkan Issues affecting the netkan data Schema Issues affecting the schema AutoUpdate Issues affecting the automatic updating Tests Issues affecting the internal tests i18n Issues regarding the internationalization of CKAN and PRs that need translating Performance Something's slower than it should be labels Jun 23, 2025
@coveralls
Copy link
Copy Markdown

Pull Request Test Coverage Report for Build 15833375480

Details

  • 48 of 137 (35.04%) changed or added relevant lines in 30 files are covered.
  • 12 unchanged lines in 6 files lost coverage.
  • Overall coverage increased (+0.1%) to 42.236%

Changes Missing Coverage Covered Lines Changed/Added Lines %
Cmdline/Action/Replace.cs 0 1 0.0%
Cmdline/Action/Update.cs 0 1 0.0%
Core/Net/NetAsyncModulesDownloader.cs 9 10 90.0%
Core/Repositories/ReadProgressStream.cs 0 1 0.0%
Core/Types/CkanModule.cs 0 1 0.0%
GUI/Main/MainAutoUpdate.cs 0 1 0.0%
GUI/Main/MainDownload.cs 0 1 0.0%
GUI/Main/MainExport.cs 0 1 0.0%
GUI/Main/MainRecommendations.cs 0 1 0.0%
GUI/Model/GUIMod.cs 4 5 80.0%
Files with Coverage Reduction New Missed Lines %
Cmdline/Action/Show.cs 1 0.0%
GUI/Main/MainRepo.cs 1 0.0%
Netkan/Processors/QueueHandler.cs 1 0.95%
GUI/Main/MainInstall.cs 2 0.0%
Core/Types/Kraken.cs 3 58.29%
Cmdline/Action/Install.cs 4 34.53%
Totals Coverage Status
Change from base Build 15715543629: 0.1%
Covered Lines: 8334
Relevant Lines: 19442

💛 - Coveralls

@HebaruSan HebaruSan added the GUI Issues affecting the interactive GUI label Jun 23, 2025
@HebaruSan HebaruSan merged commit dd37ee2 into KSP-CKAN:master Jun 23, 2025
6 checks passed
@HebaruSan HebaruSan deleted the fix/exception-reporting branch June 23, 2025 19:43
@HebaruSan
Copy link
Copy Markdown
Member Author

HebaruSan commented Jun 25, 2025

  • The Inflator has been emitting Index was outside the bounds of the array. errors for a while now, which cannot be reproduced locally. I have audited our code and not found any suspicious usages of array indexes. To figure out what is causing this, we need the bot to log more information from the exception.

Well, I certainly did not anticipate this!

System.IndexOutOfRangeException: Index was outside the bounds of the array.
 at System.Collections.Generic.Dictionary`2.TryInsert(TKey key, TValue value, InsertionBehavior behavior)
 at CKAN.NetFileCache.GetFileHash(String filePath, String hashSuffix, Dictionary`2 cache, Func`1 getHashAlgo, IProgress`1 progress, Nullable`1 cancelToken)
 at CKAN.NetKAN.Transformers.DownloadAttributeTransformer.Transform(Metadata metadata, TransformOptions opts)+MoveNext()
 at System.Collections.Generic.LargeArrayBuilder`1.AddRange(IEnumerable`1 items)
 at System.Collections.Generic.SparseArrayBuilder`1.ReserveOrAdd(IEnumerable`1 items)
 at System.Linq.Enumerable.SelectManySingleSelectorIterator`2.ToArray()
 at System.Linq.Enumerable.Aggregate[TSource,TAccumulate](IEnumerable`1 source, TAccumulate seed, Func`3 func)
 at CKAN.NetKAN.Transformers.NetkanTransformer.Transform(Metadata metadata, TransformOptions opts)
 at CKAN.Extensions.EnumerableExtensions.<>c__DisplayClass9_1`2.<SelectManyTasks>b__2()
 at System.Threading.Tasks.Task`1.InnerInvoke()
 at System.Threading.ExecutionContext.RunFromThreadPoolDispatchLoop(Thread threadPoolThread, ExecutionContext executionContext, ContextCallback callback, Object state)
--- End of stack trace from previous location ---
 at System.Threading.ExecutionContext.RunFromThreadPoolDispatchLoop(Thread threadPoolThread, ExecutionContext executionContext, ContextCallback callback, Object state)
 at System.Threading.Tasks.Task.ExecuteWithThreadLocal(Task& currentTaskSlot, Thread t

So not only was it not about us using a bad index with an array, it wasn't even about an array at all (insert not-impressed-face meme here)! There is only one Dictionary in GetFileHash into which we could be Trying to Insert:

private string GetFileHash(string filePath,
string hashSuffix,
Dictionary<string, string> cache,
Func<HashAlgorithm> getHashAlgo,
IProgress<int>? progress,
CancellationToken? cancelToken)
{
string hashFile = $"{filePath}.{hashSuffix}";
if (cache.TryGetValue(filePath, out string? hash))
{
return hash;
}
else if (File.Exists(hashFile))
{
hash = File.ReadAllText(hashFile);
cache.Add(filePath, hash);
return hash;
}
else
{
using (FileStream fs = new FileStream(filePath, FileMode.Open, FileAccess.Read))
using (BufferedStream bs = new BufferedStream(fs))
using (HashAlgorithm hasher = getHashAlgo())
{
hash = BitConverter.ToString(hasher.ComputeHash(bs, progress, cancelToken)).Replace("-", "");
cache.Add(filePath, hash);
if (Path.GetDirectoryName(hashFile) == cachePath.FullName)
{
hash.WriteThroughTo(hashFile);
}
return hash;
}
}
}

I guess we need to switch that cache to a ConcurrentDictionary:

https://stackoverflow.com/questions/15095817/adding-to-a-generic-dictionary-causes-indexoutofrangeexception

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

AutoUpdate Issues affecting the automatic updating Bug Something is not working as intended Build Issues affecting the build system Enhancement New features or functionality GUI Issues affecting the interactive GUI i18n Issues regarding the internationalization of CKAN and PRs that need translating Netkan Issues affecting the netkan data Performance Something's slower than it should be Schema Issues affecting the schema Spec Issues affecting the spec Tests Issues affecting the internal tests

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants