Have you ever wished to generate multiple images of a person using just one image?
ControlNet reference only is the perfect tool to fulfill this desire. With this technique, you can easily modify the background, pose, clothing, and many other aspects.
In this article, we will guide you through the installation process, demonstrate how to use ControlNet reference only, present validation results, and provide detailed settings. By the end of this article, you’ll have a comprehensive understanding of ControlNet reference only.
- What does “ControlNet reference only” mean?
- Possible applications of ControlNet reference
- How to Use ControlNet Reference
- Converting clothing to a t-shirt with ControlNet reference only
- Anime adaptation with ControlNet reference only
- Explanation of the different settings for ControlNet reference only
What does “ControlNet reference only” mean?
“ControlNet reference only” is a technology that enables style transfer. To put it simply, it is a function that allows for the conversion or generation of images while preserving the features that are not specified in the given prompt. For instance, you can modify an image by specifying changes to the clothing or background while keeping the face unchanged. In other words, you can create a new image based on a single image, reflecting the conditions expressed in the prompt.
Distinctions from basic img2img
Similar tasks can be accomplished with img2img, but the difference lies in how accurately the original image’s characteristics are preserved. Facial features in particular are prone to change with img2img, whereas ControlNet reference only preserves them much more reliably.
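To make the contrast concrete, here is a minimal plain img2img call against the Web UI’s standard API (a sketch, assuming the server runs locally and was started with --api). Notice that the only lever for preserving the original is denoising_strength, which is exactly why faces drift:

```python
import base64

import requests

# Minimal img2img call against a locally running Web UI started with --api.
with open("reference.png", "rb") as f:
    init_image = base64.b64encode(f.read()).decode("utf-8")

payload = {
    "init_images": [init_image],
    "prompt": "1girl, a 20 years old pretty Japanese girl in classroom, t-shirts",
    # Higher denoising strength allows bigger changes to clothing/background,
    # but also lets the face drift away from the original image.
    "denoising_strength": 0.6,
    "steps": 20,
}

resp = requests.post("http://127.0.0.1:7860/sdapi/v1/img2img", json=payload, timeout=600)
resp.raise_for_status()
images = resp.json()["images"]  # base64-encoded results
```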
Possible applications of ControlNet reference
- Clothing transformation: ControlNet reference can be used to modify or transform the appearance of clothing in digital imagery.
- Background transformation: ControlNet reference offers the ability to alter the background or environment in which the subject is depicted.
- Pose transformation: ControlNet reference allows for the manipulation or adjustment of the pose or position of objects or individuals in an image.
- Live-action to animation conversion: ControlNet reference enables the conversion of live-action footage into animation by applying digital enhancements or modifications.
- …
The range of possibilities is vast and limited only by imagination. However, ControlNet reference is still a relatively new technology and can be unstable at times. This article therefore explores how well it performs in a variety of scenarios.
How to Use ControlNet Reference
Installing ControlNet
To utilize ControlNet Reference, you must first install ControlNet, an extension of the Stable Diffusion Web UI. If you have not yet installed ControlNet, please refer to the following article for installation instructions:
No Model Download Required for ControlNet Reference
Normally, when using ControlNet, you would need to download a model. However, with ControlNet Reference, there is no need to download any models.
Steps for Usage in the Web UI
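The exact UI layout varies between versions, but the basic flow is: open the ControlNet panel under txt2img, upload your reference image, tick Enable, select reference_only as the preprocessor (the model dropdown stays at None), then write your prompt and generate. The same flow can be scripted through the API. The sketch below assumes the sd-webui-controlnet extension is installed, the server was started with --api, and that this extension version accepts Style Fidelity via the threshold_a field (an assumption worth checking against your version’s docs):

```python
import base64

import requests

WEBUI = "http://127.0.0.1:7860"  # assumes the Web UI was launched with --api

# Load the single reference image and base64-encode it for the JSON payload.
with open("reference.png", "rb") as f:
    ref_image = base64.b64encode(f.read()).decode("utf-8")

payload = {
    "prompt": "1girl, a 20 years old pretty Japanese girl in classroom. blackboard, t-shirts",
    "negative_prompt": "",
    "steps": 20,
    "width": 512,
    "height": 512,
    # The ControlNet extension registers itself under alwayson_scripts.
    "alwayson_scripts": {
        "controlnet": {
            "args": [
                {
                    "enabled": True,
                    "module": "reference_only",  # preprocessor; no model download needed
                    "model": "None",             # reference mode uses no ControlNet model
                    "image": ref_image,
                    "weight": 1.0,
                    "control_mode": "Balanced",
                    # threshold_a carries Style Fidelity for reference preprocessors
                    # (an assumption about this extension version's API).
                    "threshold_a": 0.5,
                }
            ]
        }
    },
}

resp = requests.post(f"{WEBUI}/sdapi/v1/txt2img", json=payload, timeout=600)
resp.raise_for_status()

# The API returns generated images as base64 strings.
for i, img_b64 in enumerate(resp.json()["images"]):
    with open(f"output_{i}.png", "wb") as f:
        f.write(base64.b64decode(img_b64))
```

Each example below reuses this setup and changes only the prompt.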
Converting clothing to a t-shirt with ControlNet reference only
Using ControlNet reference only, I converted a school uniform into a t-shirt while keeping the facial features unchanged.
Prompt: 1girl, a 20 years old pretty Japanese girl in classroom. blackboard, t-shirts
Next, I changed the background to a street scene.
Prompt: 1girl, a 20 years old pretty Japanese girl on the street.school uniform
The image is provided below:
Next, I changed the background again, this time to a beach, and swapped the outfit for a white bikini.
Prompt: 1girl, a 20 years old pretty Japanese girl on the beach.white bikini
Finally, I changed the background, clothing, and pose all at once.
Prompt: 1girl, a 20 years old pretty Japanese girl on the beach.white bikini, arms up
Anime adaptation with ControlNet reference only
You can create an anime adaptation of a live-action image by changing not only the prompt but also the model.
Model: AnythingV5Ink_ink
The prompt has been changed to: A pretty Japanese girl, 20 years old, in a classroom wearing a school uniform, standing in front of a blackboard.
Let’s generate the image again with these modifications.
Even after the anime adaptation, some traces of the original image can still be seen.
Conversely, you can also turn an anime image into a realistic one (see the sketch after this list):
- Use the AnythingV5Ink_ink model to generate the original image and set it in ControlNet with the reference_only preprocessor.
- Switch the model to beautifulRealistic_brav5 and generate the image.
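Scripted, the two-step model swap looks roughly like this; /sdapi/v1/options is part of the standard Web UI API, and the checkpoint titles below are assumed to match how the models appear in your checkpoint dropdown:

```python
import requests

WEBUI = "http://127.0.0.1:7860"

def set_checkpoint(title: str) -> None:
    """Switch the active checkpoint via the Web UI API (triggers a model reload)."""
    resp = requests.post(
        f"{WEBUI}/sdapi/v1/options",
        json={"sd_model_checkpoint": title},
        timeout=600,
    )
    resp.raise_for_status()

# Step 1: generate the source image with the anime model.
set_checkpoint("AnythingV5Ink_ink")  # assumed to match your dropdown title
# ...run a plain txt2img call here and save the result...

# Step 2: switch to the realistic model, then run the reference_only
# txt2img call from the earlier sketch with the step-1 image attached.
set_checkpoint("beautifulRealistic_brav5")
```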
The generated images are shown below:
Although the images appear realistic, they may seem slightly unnatural. To address this, change the Control Mode to “My prompt is more important.”
The resulting images are as follows:
There doesn’t appear to be a significant change between the images.
Explanation of the different settings for ControlNet reference only
Style Fidelity
Style Fidelity refers to the level of faithfulness to the style. A higher Style Fidelity means that the reference image has more influence and the prompt has less influence. Conversely, a lower Style Fidelity means that the reference image has less influence and the prompt has more influence. This setting is only effective when the Control Mode is set to “Balanced”.
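In the API payload from the earlier sketch, this trade-off is a single number. The fragment below assumes Style Fidelity is carried in the ControlNet unit’s threshold_a field, which holds for recent sd-webui-controlnet versions but is worth verifying against your version:

```python
# Style Fidelity only takes effect when Control Mode is "Balanced".
controlnet_unit = {
    "enabled": True,
    "module": "reference_only",
    "control_mode": "Balanced",
    "threshold_a": 0.8,  # higher -> the reference image's style dominates the prompt
    # "image": <base64-encoded reference image>, as in the earlier sketch
}
```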
Preprocessor
The reference-only Preprocessor has three options:
- reference_only (attn)
- reference_adain (adain)
- reference_adain+attn (adain + attn)
The default option is reference_only. The reference_adain preprocessor was added based on the research presented in the following paper:
Arbitrary Style Transfer in Real-time with Adaptive Instance Normalization
According to the author’s recommendation:
- “reference_adain+attn, Style Fidelity=1.0” is the current state-of-the-art method, but it can be overly strong, so it is not the recommended default.
- It is still recommended to use “reference_only + Style Fidelity=0.5” as the default option because it is more robust.
Therefore, it is suggested to use the default option for stability. I have also tested it and found that using the default option produces good results.
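Via the API, switching preprocessors is just a different module string in the ControlNet unit; a fragment under the same threshold_a assumption as above:

```python
# None of the reference preprocessors needs a downloaded ControlNet model.
robust_default = {"module": "reference_only", "threshold_a": 0.5}        # recommended
adain_only = {"module": "reference_adain", "threshold_a": 0.5}
state_of_art = {"module": "reference_adain+attn", "threshold_a": 1.0}    # can be overly strong
```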
Preprocessor Comparison
Let’s compare the default option and the latest method using the previous examples.
Left: Reference Image, Middle: “reference_only + Style Fidelity=0.5”, Right: “reference_adain+attn, Style Fidelity=1.0”
The results are shown in the provided image.
While the latest method retains some features from the original image, it can produce results that diverge slightly more from it.