
Conversation

ashen-sensored

No description provided.

@ashen-sensored
Author

By shifting the guidance start time, we let the vanilla UNet lay out the foundation at high noise before ControlNet's correction is applied, which makes it possible to retain most of the information from the original generation.
[images attached]
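
Conceptually, the new setting gates when ControlNet's conditioning participates in the sampling loop. A minimal sketch of that gating, assuming the start/end values are compared against the normalized step index (the names below are illustrative, not the extension's actual code):

```python
# Illustrative sketch only, not the extension's actual code: decide whether
# ControlNet conditioning should be applied at a given sampling step.

def controlnet_active(step: int, total_steps: int,
                      guidance_start: float = 0.0,
                      guidance_end: float = 1.0) -> bool:
    progress = step / total_steps          # 0.0 at the first step, ~1.0 at the last
    return guidance_start <= progress <= guidance_end

# With Guidance Start = 0.19 and 20 steps, the first few steps are left to the
# vanilla UNet so the original composition can form before ControlNet corrects it.
steps = 20
active = [controlnet_active(s, steps, guidance_start=0.19) for s in range(steps)]
print(active.index(True))  # first controlled step: 4
```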

@ashen-sensored
Author

ashen-sensored commented Feb 26, 2023

Demo:
Guidance Start: 0 (default behavior)
[image]

Guidance Start: 0.19
[image]

@Mikubill
Owner

Looks good. Also need to make some changes in API handler

@ashen-sensored
Author

> Looks good. Also need to make some changes in API handler

The changes for the new parameter have been applied to api.py and xyz_grid_support.py.
I did a directory search and I think I covered all the related locations.
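
For API users, the net effect is one extra field per ControlNet unit. A hypothetical request sketch is below; the field name, unit structure, and route are assumptions for illustration and may not match what api.py actually exposes:

```python
# Hypothetical illustration only; check api.py for the real field names and route.
import requests

payload = {
    "prompt": "a photo of a hand",
    "steps": 20,
    "controlnet_units": [{          # assumed unit structure
        "module": "depth",
        "model": "control_sd15_depth",
        "weight": 1.0,
        "guidance_start": 0.19,     # the new parameter from this PR
    }],
}
requests.post("http://127.0.0.1:7860/controlnet/txt2img", json=payload)  # assumed route
```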

@aiton-sd
Contributor

aiton-sd commented Feb 27, 2023

The following change is probably required:

`params = [None] * 14` → `params = PARAM_COUNT`
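
Presumably the point is to stop hard-coding the per-unit argument count so the xyz grid support doesn't break every time a slider such as Guidance Start is added. A rough sketch of that intent (the constant's value and surrounding lines are assumptions, not the file's actual contents):

```python
# Sketch of the suggested direction, not xyz_grid_support.py's exact code.
PARAM_COUNT = 15  # assumed: one slot per ControlNet unit argument, including the new one

# before: breaks silently whenever another argument is added
params = [None] * 14

# after: the padding list tracks the shared constant
params = [None] * PARAM_COUNT
```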

gitadmin0608 and others added 2 commits February 26, 2023 23:33
Merge branch 'main' into feature/guidance-start

# Conflicts:
#	scripts/api.py
@Mikubill Mikubill merged commit 3752046 into Mikubill:main Feb 27, 2023
@ashen-sensored ashen-sensored deleted the feature/guidance-start branch February 27, 2023 08:51
@enn-nafnlaus

enn-nafnlaus commented Feb 27, 2023

I think when guidance was added (I updated the ControlNet extension for the first time last night), it broke the ability to use ControlNet in batches. I need to do some more testing, but when doing prompt-travel with ControlNet enabled, which used to work just fine, I was only seeing the effects of ControlNet on the first one or two images of the prompt travel. That would make sense given how Guidance works, if there's a counter that only gets reset when Generate is clicked.

I also think that the way these Guidance bars are laid out is... confusing and misleading. When messing around with them the first time, I was so confused about why I was getting radically different results between 0.16 and 0.17. It's not at all obvious from the name that the value is actually a percentage of your steps, which gets multiplied out and then converted to an integer. That isn't an intuitive way to do it, and it requires the user to do math to figure out what number to set versus how many steps they want it to run for. It also works differently from the bracket notation in AUTOMATIC1111 itself, where you specify the number of steps.

That said, it's undeniably a cool addition!

ED: Come to think of it, I only had the one guidance bar, for the guidance end. Looks like I need to update again and see if the first bug got fixed in the process of the second bar getting added...
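
For what it's worth, the jump between nearby values is easy to reproduce with the fraction-to-step conversion. A rough illustration, assuming the value is simply multiplied by the step count and truncated (the extension's exact rounding may differ), using a hypothetical 12-step run:

```python
# Rough illustration of the complaint; the extension's exact rounding may differ.
def first_guided_step(guidance_start: float, steps: int) -> int:
    return int(guidance_start * steps)   # fraction of steps, truncated to an integer

# At 12 sampling steps, 0.16 and 0.17 land on different integer steps:
print(first_guided_step(0.16, 12))  # 1 -> ControlNet skips only the first step
print(first_guided_step(0.17, 12))  # 2 -> ControlNet skips the first two steps
```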

@Magicalore

How do you even make it work? No matter what I do I get nothing good...

@enn-nafnlaus

How does ControlNet not work for you? It works fine for me, right out of the box. Try following a tutorial step-by-step and tell us which one, and at which step it goes wrong for you.

@Magicalore

> How does ControlNet not work for you? It works fine for me, right out of the box. Try following a tutorial step-by-step and tell us which one, and at which step it goes wrong for you.

ControlNet works fine. I'm trying to get better hands to appear in the image by merging a depth map of some hands and the openpose model together. I tried Guidance Start at 0 and at 0.19 and I get far worse results; even without openpose enabled, it cannot recognize from the depth map that these are hands.
[screenshots attached]

@catboxanon
Contributor

catboxanon commented Feb 27, 2023

I think the example here wasn't made very clear because it was broken up into several comments with little explanation. This is what it's doing.

The original generated image is below; no ControlNet is used:
[image]

The hand is obviously a mess. So, they took this image into Blender and created a hand depth pass as a guide. This is the depth pass they used:
[image]

They then used this depth pass with ControlNet. However, the default settings make ControlNet affect the output during the entire generation. This means that all of the empty information in the depth pass (the black area) is accounted for, which degrades the output so it no longer matches the original prompt. This is that output:
[image]

So, why don't we delay ControlNet from kicking in, let the original noise do its thing, and then use it to fix the hand? That's why this PR was created. Now, when ControlNet is delayed (in this case only a few steps in, since the value used is 0.19, which relative to 20 steps is not that much), the original composition can play out, but the hand can still be fixed. This is that image:
[image]

tl;dr: This seems to be a way to implement passes that only control certain elements, without destroying the original image. I don't think this will be very useful if you're generating something completely from scratch, i.e. trying to use the hand depth pass on a completely different seed; it requires knowledge of what is generated normally. A depth pass from Blender is also a bit overkill imo -- think of using a different module like scribble instead.

@Magicalore

Magicalore commented Feb 27, 2023

Oh, thank you! Yes, my bad, I misunderstood how this worked!

@aleksusklim

A quick and simple question for whoever has a deep understanding of the ControlNet structure:

– Why can't we have a spatial weight on it, i.e. a "mask" applied to the ControlNet itself?
Then we would be able to mask out everything except the hand on the depth map, and it theoretically would not mess with the other parts of the image.

Is this not physically possible? Isn't the weight applied to every pixel/latent independently (even controlling 8×8 squares would be great!)?
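
Purely as a sketch of the idea being asked about (this is not an existing feature of the extension): if the ControlNet residuals are additive feature maps, a single-channel mask could in principle be resized to the latent resolution and multiplied in before the residual is added to the UNet features. Tensor shapes and the residual interface below are assumptions.

```python
# Speculative sketch of a spatial mask on ControlNet residuals -- NOT a feature
# of the extension; tensor shapes and the residual interface are assumptions.
import torch
import torch.nn.functional as F

def masked_control_residual(residual: torch.Tensor, mask: torch.Tensor) -> torch.Tensor:
    """Scale a ControlNet residual (B, C, H, W) by a one-channel mask (1, 1, h, w)."""
    # Latents are ~8x smaller than the image, so each mask pixel already covers
    # roughly an 8x8 image-space square once resized to the residual resolution.
    m = F.interpolate(mask, size=residual.shape[-2:], mode="bilinear", align_corners=False)
    return residual * m

# Example: keep ControlNet influence only in the lower-right quadrant (e.g. the hand).
residual = torch.randn(1, 320, 64, 64)
mask = torch.zeros(1, 1, 64, 64)
mask[..., 32:, 32:] = 1.0
out = masked_control_residual(residual, mask)
```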

@AbyszOne

> I think the example here wasn't made very clear because it was broken up into several comments with little explanation. This is what it's doing. [...]

Although that does help, wouldn't it be enough to edit the depth map of the image itself? With the same settings you would generate the same image, only edited. It is also useful to leave the same image in img2img as a guide + pose + depth edit (or just scribble).
In any case, I've found much more unique uses for this feature. Thanks for the addition. 👍

@ashen-sensored ashen-sensored restored the feature/guidance-start branch February 27, 2023 20:42