Skip to content

Support for parallel target generation#863

Closed
asifmohd wants to merge 15 commits intoyonaskolb:masterfrom
asifmohd:parallelized_generation
Closed

Support for parallel target generation#863
asifmohd wants to merge 15 commits intoyonaskolb:masterfrom
asifmohd:parallelized_generation

Conversation

@asifmohd
Copy link
Copy Markdown
Contributor

@asifmohd asifmohd commented May 17, 2020

Background

  • In my current work project, we have around 55 targets being generated using Xcodegen. Out of which 24 targets are targets with no dependency on any other target.
  • After running the time profiler using my project.yml and a debug xcodegen binary, I found out that most of the time was being spent in the Glob.exploreDirectories section.
  • The 55 targets which we have migrated are just approximately 1/3rd of all the targets our app supports today, so the time taken to run Xcodegen would only increase proportionally to the number of targets we have in the future.
  • So the motivation was to be able to generate atleast these 24 targets in parallel to get a good speed boost because Glob can be executed in parallel for such targets.

Overview of changes to code in this PR

  • I started from bottom up, so the attempt was to make getSourceMatches in SourceGenerator.swift, be able to run asynchronously in multiple threads for each target
    • This required making all the functions in the calling chain accept completion handlers, instead of directly returning the generated result.
  • The changes heavily rely on GCD to dispatch to global queues, with QOS as userInitiated, along with heavy use of DispatchGroups, to block queue execution when successive code depend on a result from an async completion handler being executed.
  • The main queue does not perform any operations while the targets are being generated but rather is blocked waiting for the targets to be generated before proceeding to the Writing project phase. Instead of blocking the main queue, we could probably show progress to the user on how many targets have been generated in a separate PR.
    • Had to block the main queue because:
      • Main queue is serial and we want to generate multiple targets, so we need to use a global async queue.
      • I felt we should keep the main queue clear of any background/processing operations, so everything related to generating the targets, is done in a background thread.
      • The execute function in GenerateCommand.swift is a synchronous function, if we return early, the xcodegen command exits as well, so we have to make the main queue wait so that the command thinks, that processing is still going on.
  • Introduced the Atomic class inspired from an objc.io article
    • These are required to make some of the shared variables such as rootGroups in SourceGenerator.swift and more, be thread safe, ie reading and writing happens using a serial queue.
      • A future improvement/refactor would be try and make more and more of these variables independently stored in a struct for each target, have to explore the codebase more to understand if that is possible.
  • Moved the getAllDependenciesPlusTransitiveNeedingEmbedding from SourceGenerator.swift to Target.swift, so that the code in SourceGenerator.swift reduces a bit, and to be able to reuse this function to figure out the order of generating targets, by reusing and calling this function on each target itself, ex: target.getAllDependencies(usingProject:)
    • This change required moving shouldEmbedDependencies as well and marking it as a public function
    • Let me know if this change can be given in a separate PR to reduce the diff on this PR
  • Because of the introduction of dispatch groups, and dispatch queues, there is a lot of diff, with just indentation changes, so apologies for the large diff in advance 🙏

Overview of changes related to tests in this PR

  • To minimise the amount of code changes in the test targets, the helper functions such as generateXcodeProject in PBXProjGeneratorTests.swift have been modified to convert the completion handler based functions to be serial functions, which return a single value instead of accepting and calling the completion handlers
  • All the existing tests pass, but no new tests have been added yet. Will add new tests, after an initial round of review.

Results

  • For the 55 targets, on my MBP 2018 version, it used to take 14 seconds for the xcodegen command to finish.
  • Now after these changes it takes approx 7 seconds, a 50% improvement in project generation times
  • These improvements will vary depending on how many targets you have, and how many of them could be generated in parallel.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the part where I am blocking the main queue, so that the generateCommand gets time to generate all the targets, write the xcodeproj and then finish

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure about this assumption, please help review, we can probably get rid of this if check as well.

But this helps me parallelise a few targets which just depend on an existing Xcode project reference

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This while loops helps us parallelize more and more targets in each loop, starting with targets, which have no dependencies at all

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

generatedTargetNames.contains(dependency.reference), this is how we see if a target is ready to be generated

By checking each dependency of that target, and checking if it's already been generated, if so, then the target is ready to be generated, otherwise it stays in the pendingTargets array

Copy link
Copy Markdown
Contributor Author

@asifmohd asifmohd May 17, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because I couldn't find a way to make the DispatchQueue.global triggers follow a try pattern, I had to fall back to this, where i keep a track of all errors and then throw them once the completion block gets called

@asifmohd asifmohd force-pushed the parallelized_generation branch from b94ee28 to 97b8eea Compare May 17, 2020 15:15
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Moved to Target.swift as an extension on Target

@brentleyjones
Copy link
Copy Markdown
Collaborator

We have a project.yml file that has 276 targets, where 275 depend on a single other target. This change resulted in no speed up or slow down for us 😕.

@asifmohd
Copy link
Copy Markdown
Contributor Author

We have a project.yml file that has 276 targets, where 275 depend on a single other target. This change resulted in no speed up or slow down for us 😕.

I should be able to do something about this, basically I'll have to modify the parallelization logic to sort by the number of dependencies a target has, and then pick the one with the least dependencies and so on, so that more and more targets can be parallelized in each cycle. Right now I just pick the last target from the array, which is probably why you are not seeing an improvement. Will try to make this change this coming weekend 🙂

@yonaskolb
Copy link
Copy Markdown
Owner

@keith, would you mind testing this on the Lyft codebase?

@asifmohd
Copy link
Copy Markdown
Contributor Author

asifmohd commented May 30, 2020

We have a project.yml file that has 276 targets, where 275 depend on a single other target. This change resulted in no speed up or slow down for us

@brentleyjones with my most recent commit, ideally the dependency resolution logic should pick up the target with no dependencies, then targets with just 1 dependencies, and so on.
Ideally this should help in your project, assuming that single target which all other targets depend on, does not have any dependencies.
Therefore the single target will be built first, and then in the next cycle, we'll build all those targets which depend on this single target, and if there are no other dependencies in the targets in this cycle, all 275 of them should build in parallel 🤞

@asifmohd
Copy link
Copy Markdown
Contributor Author

There is a lot of scope in improving the dependency resolution of targets, like we could probably build functionality, which would parse all the targets, and find the most commonly depended target, build that and it's dependencies first, so that rest of the targets can be parallelized. Kinda like our own build system for generating targets.

@brentleyjones
Copy link
Copy Markdown
Collaborator

brentleyjones commented Jun 1, 2020

@asifmohd It's now about 10% slower than master for us. And when we make it so we instead have 275 independent targets we get into a deadlock and it never returns.

@keith
Copy link
Copy Markdown
Collaborator

keith commented Jun 2, 2020

I tested this and also hit the deadlock, here's the main thread in that case:

Call graph:
    2498 Thread_11217851   DispatchQueue_1: com.apple.main-thread  (serial)
    + 2498 start  (in libdyld.dylib) + 1  [0x7fff6cdda7fd]
    +   2498 main  (in xcodegen) + 194  [0x1019e5752]
    +     2498 XcodeGenCLI.execute(arguments:)  (in xcodegen) + 37  [0x1019eb9b5]
    +       2498 CLI.go()  (in xcodegen) + 119  [0x1019a4b47]
    +         2498 CLI.go(with:)  (in xcodegen) + 1352  [0x1019a5118]
    +           2498 protocol witness for Command.execute() in conformance ProjectCommand  (in xcodegen) + 9  [0x1019ead99]
    +             2498 ProjectCommand.execute()  (in xcodegen) + 1261  [0x1019eac8d]
    +               2498 GenerateCommand.execute(specLoader:projectSpecPath:project:)  (in xcodegen) + 6268  [0x1019e8d5c]
    +                 2498 _dispatch_group_wait_slow  (in libdispatch.dylib) + 43  [0x7fff6cd81f61]
    +                   2498 _dlock_wait  (in libdispatch.dylib) + 44  [0x7fff6cd81cd6]
    +                     2498 __ulock_wait  (in libsystem_kernel.dylib) + 10  [0x7fff6cf1c9be]

@asifmohd asifmohd force-pushed the parallelized_generation branch from 3180eae to 48d2153 Compare June 25, 2020 07:40
@asifmohd
Copy link
Copy Markdown
Contributor Author

Thanks for testing this out everyone 🙏
I will try to add some more tests related to parallel target generation, will try to include the scenarios where it's failing for you guys.
I have not been able to find time these past few weeks, and there are still a few other work related commitments i have to take care of. So progress might be slow, will ping back here, once the PR is ready for review again.

asifmohd added 10 commits July 25, 2022 10:07
to unblock the thread calling this function, which is generateXcodeProject
This was more evident in unit tests, without this the performanceTests
would end up being blocked
The alternative ie passing the completion handlers forward
would have required too many changes
in all uses of these helpers, around 80 when i checked
for atomic operations on global variables
Not using this causes various SIGABRT signals
to be triggered because of multiple threads accessing
and mutating the same global variable
and use the same helper function, to split targets into two arrays
one with no dependencies and one with dependencies
so that successive calls can reuse the cached data from a previous job

Without this, because of the parallelization, duplicate groups/file references get
added to the pbxproj
ie keep checking in a while loop, to see if there are targets which
can now be generated in a parallel manner, once the previous cycle
has lead to building a few targets

There is a fallback check to see if there are no targets to be generated
in this cycle, then just remove the last target from the pending
target list and process that target

So in the worst case, the while loop will fallback to generating
targets sequentially
and in the best cases, atleast >2 targets will be generated
parallely in the current while loop
asifmohd added 4 commits July 25, 2022 10:07
We are now generating each target in parallel as much as possible
We are not reusing the generateTargetGroups dispatchGroup for target generation
instead we are creating a new dispatch group to isolate the functionality
related to generating targets in parallel.
@asifmohd asifmohd force-pushed the parallelized_generation branch from 5370e9a to 3258e01 Compare July 25, 2022 05:06
@asifmohd asifmohd closed this Apr 6, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants