fix: Klein 2 Inpainting breaking when there is a reference image #8803
lstein merged 3 commits into invoke-ai:main
Conversation
Pfannkuchensack
left a comment
Works without error, but I don't quite understand this behavior.
A prompt for editing does not generate an edit in img2img with a ref image, because the base noise is the raster layer. I think the functionality still needs to be separated somehow.
I agree, it doesn't crash. I masked out a dog in the raster layer, uploaded a different dog as the reference image, moved the denoising up to 0.85, and provided the prompt: "Replace the dog with the dog in image 1". Sure enough, this replaced the dog in the raster image with the dog in the reference image, although there were visible inpainting artifacts. I am having some difficulty understanding what the expected behavior should be. The raster layer is being presented as the starting latent, but the prompt is operating on the reference image in Flux edit mode and can't "see" the raster layer (can it?). Edit mode works best when you tell the LLM what to "do", while in traditional img2img you instruct the encoder on what to "see." It seems to me that we should use the raster layer as reference image 1, treat any additional images as reference images 2, 3, and so forth, and mask the denoising process. Does that make sense?
But how can you do img2img with ref images without edit mode, if that should work too?
It is a bit finicky, but I cannot tell whether that is because of the implementation or because of the capabilities of the model itself. One of the use cases is that you select the BBOX area as the reference, and the masked area is the only area that gets filled by the inpaint. You can use an entirely different image as the reference, and as long as the model is capable of grasping the edit command from both, it will replace the inpaint area using the contents of the reference image. It works sometimes and doesn't work other times. The PR really just sets the latent dimensions correctly for concatenating the pasted image, which prevents the crash, which I think it should. As for the utility, well, that's another story. Hit and miss.
I can see passing the b-box as reference 1 (whether manually or via a toggle) helping in some instances, probably depending on the structure of the prompt. For an explicit edit request ("replace the dog with the dog from image X"), I would assume it works better with the starting image provided for context, but a more general reference prompt ("the dog from image 1 is standing in a park") should treat it as a new image (constrained by the strength of the starting latent, of course) and only need the single reference image? Either way, more surgical editing with masks on editing models is an extremely desirable feature, so thanks a ton to all for the work on this!
This PR fixes the crash, so I'm going to go ahead and accept it. The semantics of how img2img and edit should interact will have to wait until a later date.
Yep. This was one of those cases where it holds when the dimensions etc. are favorable. I've had some cases where it worked perfectly and some where it did not. The most common use case is providing the BBOX as the reference for localized editing with the power of the masking abilities; otherwise it'll leave sharp edges. I will look further into how we want this to work. It'll also set up how we handle editing models going forward.
Agreed.
…oke-ai#8803) Co-authored-by: Alexander Eichhorn <alex@eichhorn.dev> Co-authored-by: Lincoln Stein <lincoln.stein@gmail.com>
Summary
Flux 2 Klein inpainting did not work when a reference image and an inpaint mask were both present. We now split the reference latents from the generated latents before merging the generated latents back together.
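For readers who want a picture of the idea, here is a minimal toy sketch in plain Python, not the actual InvokeAI code. It assumes the generated tokens come first in the sequence and the reference tokens are appended after them, and that a mask value of 1 means "regenerate". The names `split_sequence`, `blend_with_mask`, and `inpaint_step` are hypothetical and used only for illustration.

```python
from typing import List, Tuple

Token = List[float]  # toy stand-in for one packed latent token


def split_sequence(seq: List[Token], num_generated: int) -> Tuple[List[Token], List[Token]]:
    """Split a concatenated sequence into (generated tokens, reference tokens).

    Assumption: generated tokens come first, reference tokens follow.
    """
    return seq[:num_generated], seq[num_generated:]


def blend_with_mask(generated: List[Token], init: List[Token], mask: List[float]) -> List[Token]:
    """Per-token inpaint merge: mask 1.0 keeps the generated token, 0.0 keeps the init token."""
    if not (len(generated) == len(init) == len(mask)):
        # This is the kind of mismatch that appears when reference tokens
        # are still attached to the generated sequence.
        raise ValueError(
            f"length mismatch: generated={len(generated)}, init={len(init)}, mask={len(mask)}"
        )
    return [
        [m * g + (1.0 - m) * i for g, i in zip(g_tok, i_tok)]
        for g_tok, i_tok, m in zip(generated, init, mask)
    ]


def inpaint_step(seq: List[Token], init: List[Token], mask: List[float]) -> List[Token]:
    """Split off the reference tokens, blend only the generated part, then re-attach."""
    generated, reference = split_sequence(seq, len(init))
    merged = blend_with_mask(generated, init, mask)
    return merged + reference


if __name__ == "__main__":
    init = [[0.0, 0.0], [0.0, 0.0]]       # raster-layer latents (2 tokens)
    generated = [[1.0, 1.0], [2.0, 2.0]]  # denoised latents (2 tokens)
    reference = [[9.0, 9.0]]              # reference-image latents (1 token)
    mask = [1.0, 0.0]                     # regenerate token 0, keep token 1
    print(inpaint_step(generated + reference, init, mask))
```

Calling `blend_with_mask` directly on the concatenated sequence raises the length mismatch, while `inpaint_step` blends only the generated part and leaves the reference tokens untouched. Whether the real fix re-attaches the reference tokens after merging is not stated in this PR, so treat that step as an assumption.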
QA Instructions
Merge Plan
Test and merge.
Checklist
What's New copy (if doing a release after this PR)