Conversation
🎉 nice!
("unet", "Unet"),
("unet_chunk1", "UnetChunk1"),
("unet_chunk2", "UnetChunk2"),
("control-unet", "ControledUnet"),
NIT: Could we please change this to ControlledUnet?
var destinationG = try vImage_Buffer(width: Int(width), height: Int(height), bitsPerPixel: 8 * UInt32(MemoryLayout<Float>.size))
var destinationB = try vImage_Buffer(width: Int(width), height: Int(height), bitsPerPixel: 8 * UInt32(MemoryLayout<Float>.size))
var minFloat: [Float] = [-1.0, -1.0, -1.0, -1.0]
The diff in this file looks unexpectedly large, could you please verify that the only changes are related to minFloat and maxFloat vars?
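For context on what bounds like minFloat are typically for in this part of the decoder: the decoded image channels live in [-1, 1] and get clamped to that range before being scaled to 8-bit pixel values. Here is a rough numpy sketch of that pattern; the function name, defaults, and shapes are illustrative and not taken from the Swift code:

```python
import numpy as np

def denormalize_channel(channel: np.ndarray,
                        min_val: float = -1.0,
                        max_val: float = 1.0) -> np.ndarray:
    """Clamp a decoded channel to [min_val, max_val], then map it to [0, 255]."""
    clipped = np.clip(channel, min_val, max_val)
    scaled = (clipped - min_val) / (max_val - min_val) * 255.0
    return scaled.astype(np.uint8)

channel = np.array([-2.0, -1.0, 0.0, 1.0, 3.0], dtype=np.float32)
print(denormalize_channel(channel))  # out-of-range values are clamped before scaling
```

The Swift code does the equivalent with vImage/vDSP on a per-channel vImage_Buffer, which is why the clamp bounds appear as four-element arrays (one entry per channel).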
for n in 0..<results.count {
    let result = results.features(at: n)
    if currentOutputs.count < results.count {
        let initOutput = result.featureNames.reduce(into: [String: MLMultiArray]()) { output, k in
Let's use MLShapedArray instead of MLMultiArray
let result = results.features(at: n)
if currentOutputs.count < results.count {
    let initOutput = result.featureNames.reduce(into: [String: MLMultiArray]()) { output, k in
        output[k] = MLMultiArray(
This would be a lot faster if we could pre-allocate the output with the expected size.
Is this suggesting that we should pre-allocate MLShapedArray with a specific shape in the output dictionary? If we do this before the model produces results, would we create an MLShapedArray filled with zero values?
Yes, create it with the right size and fill with zeros.
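The pre-allocate-and-fill pattern being discussed can be sketched language-agnostically. Here is a rough numpy analogue (names and shapes are mine, not from the actual Swift code): each output buffer is allocated once, zero-filled with the expected shape, and per-sample results are accumulated into it instead of growing containers inside the loop:

```python
import numpy as np

def accumulate_outputs(batch_results: list[dict[str, np.ndarray]]) -> dict[str, np.ndarray]:
    """Sum per-sample feature dictionaries into pre-allocated, zero-filled buffers."""
    outputs: dict[str, np.ndarray] = {}
    for result in batch_results:
        if not outputs:
            # Pre-allocate each output once, zero-filled, with the expected shape.
            outputs = {k: np.zeros_like(v) for k, v in result.items()}
        for k, v in result.items():
            outputs[k] += v  # accumulate in place, no reallocation per sample
    return outputs

batch = [{"noise_pred": np.ones((2, 2))}, {"noise_pred": np.full((2, 2), 2.0)}]
print(accumulate_outputs(batch)["noise_pred"])  # every element is 3.0
```

In Swift, the corresponding idea would be creating the MLShapedArray up front with the right shape (zero-initialized) and writing into it, rather than building MLMultiArrays per feature inside the loop.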
let fileName = model + ".mlmodelc"
return urls.controlNetDirURL.appending(path: fileName)
}
if (!controlNetURLs.isEmpty) {
Suggested change:
- if (!controlNetURLs.isEmpty) {
+ if !controlNetURLs.isEmpty {
let unetURL: URL, unetChunk1URL: URL, unetChunk2URL: URL

// if ControlNet available, Unet supports additional inputs from ControlNet
if (controlNet == nil) {
Suggested change:
- if (controlNet == nil) {
+ if controlNet == nil {
"timestep" : MLMultiArray(t),
"encoder_hidden_states": MLMultiArray(hiddenStates)
]
additionalResiduals?[$0.offset].forEach { (k, v) in
Suggested change:
- additionalResiduals?[$0.offset].forEach { (k, v) in
+ for (k, v) in additionalResiduals?[$0.offset] ?? [:] {
safetyCheckerURL = baseURL.appending(path: "SafetyChecker.mlmodelc")
vocabURL = baseURL.appending(path: "vocab.json")
mergesURL = baseURL.appending(path: "merges.txt")
controlNetDirURL = baseURL.appending(path: "Controlnet")
Since torch2coreml seems to export to the controlnet directory, it seems like a good idea to start with lower case here as well.
Thanks for your great contribution!
Thank you for your reviews! I'll check or fix them one by one. I'll also update the README to cover the new args.
@ryu38 I see that you have pushed some commits addressing the feedback. Please let me know when you would like me to re-review :)
Update: I am running the final tests and I will merge this PR when they pass. The latest commit seems to have addressed all the feedback but I will do one more visual pass just in case.
@atiorh I apologize for pushing a new commit just before the branch was merged. This commit addresses the remaining feedback and improves inference speed in ControlNet.swift.
Just wow! Well done all. Can't wait to dig into this! 🎉🎉🎉🎉
Just realized the extra commit, this is my bad too! I don't have concerns with the diff though. Thanks for the contribution @ryu38!
@atiorh Thank you for your confirmation!
Excuse me, I ran the following command:

python -m python_coreml_stable_diffusion.torch2coreml \
--convert-vae-decoder --convert-vae-encoder --convert-unet \
--unet-support-controlnet --convert-text-encoder \
--model-version runwayml/stable-diffusion-v1-5 \
--bundle-resources-for-swift-cli \
--quantize-nbits 6 \
--attention-implementation SPLIT_EINSUM_V2 \
-o ~/MochiDiffusion/models && \
python -m python_coreml_stable_diffusion.torch2coreml \
--convert-unet --unet-support-controlnet \
--model-version runwayml/stable-diffusion-v1-5 \
--bundle-resources-for-swift-cli \
--quantize-nbits 6 \
--attention-implementation SPLIT_EINSUM_V2 \
-o ~/MochiDiffusion/models

but only these files were generated, no Unet. If I want to get a runnable model that supports ControlNet, what commands should I run?
The files you ended up with are a working model when used along with a ControlNet model. But they won't work without a ControlNet model; that is, they won't work for regular inference or for Image2Image. To also get the … Note: I believe that you will also need to use the …
I added a ControlNet feature to model conversion and inference.
New Files
- controlnet.py
- ControlNet.swift
Main Changes
torch2coreml.py
- Added the --convert-controlnet option, e.g. --convert-controlnet lllyasviel/sd-controlnet-mlsd lllyasviel/sd-controlnet-canny; each converted model is saved like ControlNet_lllyasviel_sd-controlnet-mlsd.mlpackage
- Added the --unet-support-controlnet option, which converts Unet as *_control-unet.mlpackage with the additional ControlNet inputs
unet.py and UNet.swift
pipeline.py
- Added the --controlnet option (use models converted with the --convert-controlnet option in torch2coreml.py)
- Added the --controlnet-inputs option (inputs paired with --controlnet)
StableDiffusionCLI
- Added the --controlnet option (enter model file names in Resources/controlnet without extension)
- Added the --controlnet-inputs option