What is ControlNet? What Can It Do? A Comprehensive Guide to Installing ControlNet on Stable Diffusion Web UI

ControlNet offers ways to control the images produced by Stable Diffusion that go beyond what prompts alone can achieve.

However, ControlNet can be overwhelming for many users due to its extensive range of features and intricate settings.

To make ControlNet easier to grasp, this article provides a concise explanation of its key points.

Furthermore, the article introduces sd-webui-controlnet, an open-source extension that brings ControlNet to the Stable Diffusion Web UI. The extension is widely used, as its large number of GitHub stars attests. Detailed instructions for installation and usage are included below.

Mikubill/sd-webui-controlnet: WebUI extension for ControlNet (github.com)

What is ControlNet?

ControlNet is a neural network architecture that controls image generation by adding extra conditions to Stable Diffusion. The name can also refer to ControlNet for Stable Diffusion Web UI, the extension that exposes this functionality in the Web UI.

ControlNet constrains the generated image so that it does not deviate significantly from features extracted from a reference, such as poses and compositions. This makes it possible to generate new images built on those extracted features.

With ControlNet, for example, you can input an image and generate new images that keep its pose, or supply a line drawing and have it colored while its lines are preserved. This allows for a wide range of expressive possibilities.
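This article works entirely through the Web UI, but the underlying mechanism can be sketched in a few lines with Hugging Face's diffusers library. The model repositories named below are real; the pose image file, prompt, and settings are illustrative assumptions:

```python
# A minimal sketch of ControlNet conditioning with diffusers (assumes a CUDA GPU).
import torch
from diffusers import ControlNetModel, StableDiffusionControlNetPipeline
from diffusers.utils import load_image

# ControlNet weights load separately from the base Stable Diffusion model.
controlnet = ControlNetModel.from_pretrained(
    "lllyasviel/control_v11p_sd15_openpose", torch_dtype=torch.float16
)
pipe = StableDiffusionControlNetPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", controlnet=controlnet, torch_dtype=torch.float16
).to("cuda")

# The pose map acts as an extra condition alongside the text prompt.
pose_map = load_image("pose.png")  # hypothetical stick-figure pose image
result = pipe("a girl sitting on the floor", image=pose_map).images[0]
result.save("output.png")
```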

ControlNet: What can be achieved?

Here are some specific applications of ControlNet:

  • Specifying the pose of a generated image from a reference photo or a stick figure.
  • Altering illustration styles and textures.
  • Adding color to line drawings.

In the examples below, the image on the left serves as the reference, while the image on the right is the result generated using ControlNet.

Example 1 (ControlNet OpenPose):

Example 2 (ControlNet Segmentation):

Example 3 (ControlNet Lineart):

Introduction to ControlNet Feature Extraction Models

ControlNet pairs Stable Diffusion with various feature extraction models to achieve image control. A feature extraction model pulls features out of a reference image, and ControlNet steers the Stable Diffusion model so that the generated images retain those features.

In essence, the feature you extract determines which aspect of the image you control. The range of feature extraction models is extensive, covering almost every conceivable kind of control. Some notable models include:

  • OpenPose for Pose Control: OpenPose, originally developed for human pose estimation, is used in ControlNet for pose control. It extracts a pose from an image so that new images can be generated while preserving that pose. Describing a pose precisely through prompts is difficult, so this approach saves considerable effort. It can even work from stick figures, letting you reproduce an ideal pose without an existing photo.
  • Depth Model for Extracting Depth from Images: A depth model extracts depth information from an image, giving control over its spatial structure. Because the 3D structure is captured, the generated image keeps an accurate sense of depth. This is useful when changing only the texture of objects, such as furniture, within an image.
  • Canny, Soft Edge, and Scribble for Extracting Edges from Images: The Canny model extracts edges from an image, effectively turning it into a line drawing. This allows recoloring an illustration or coloring a monochrome line drawing while preserving the original linework. Soft Edge and Scribble are similar edge extractors that focus on the major lines; they are commonly used for tasks such as converting the texture of an illustration. (A minimal edge-extraction sketch follows this list.)
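To make the edge-extraction idea concrete, here is a minimal sketch of what a Canny-style preprocessor does, using OpenCV. The input file name and the two hysteresis thresholds are illustrative assumptions, not values from the article:

```python
# Minimal Canny edge extraction, similar to what the ControlNet
# Canny preprocessor produces: white lines on a black background.
import cv2

image = cv2.imread("input.png")                 # hypothetical input file
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)  # work on a single channel
edges = cv2.Canny(gray, 100, 200)               # low/high hysteresis thresholds
cv2.imwrite("canny_edges.png", edges)           # usable as a ControlNet input
```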

Example of feature extraction:

  1. Original image
  2. OpenPose
  3. Depth
  4. Canny
  5. Soft Edge
  6. Scribble

Introducing the ControlNet Extension for Stable Diffusion Web UI

The ControlNet feature can now be accessed through the Stable Diffusion Web UI by adding the ControlNet extension. This article provides a step-by-step guide on how to use ControlNet from the Stable Diffusion Web UI. Here is the general procedure:

Installation Steps

  1. Check if ControlNet is already installed
  2. Add the ControlNet extension to Stable Diffusion Web UI
  3. Download the feature extraction models

1. Confirming ControlNet Isn’t Installed

First, let’s verify that ControlNet isn’t already installed. It’s common for ControlNet to be inadvertently installed along with the Stable Diffusion Web UI or other extensions. Thus, it’s recommended to check its installation status.

If ControlNet is installed, you'll find the ControlNet menu at the top of the screen. Should you find ControlNet already in place, you can skip step 2 and proceed directly to step 3: "Download the feature extraction models".

2. Add the ControlNet extension to Stable Diffusion Web UI

To add the ControlNet extension “sd-webui-controlnet”, please follow these steps:

  1. Go to the Extensions tab.
  2. Switch to the Install from URL tab.
  3. Enter https://github.com/Mikubill/sd-webui-controlnet in the "URL for extension's git repository" field.
  4. Click Install.
  5. Switch to the Installed tab.
  6. Click "Apply and restart UI".

Once the UI restarts, if the ControlNet menu is displayed as shown below, the installation was successful.

For Windows users: if you encounter the error "ModuleNotFoundError: No module named 'pywintypes'" when loading ControlNet, run "pip install pypiwin32" in the command prompt to install the required package.

3. Downloading Feature Extraction Models

The ControlNet extension does not include feature extraction models. Therefore, it is necessary to download and place the feature extraction models in the appropriate folder.

You can download the feature extraction models from the Hugging Face repository below:

lllyasviel/ControlNet-v1-1 at main (huggingface.co)

The .pth files are the model weights, and the .yaml files are the model structure definitions. Please download both types of files.

You have the option to download all the models, but please note that they have large file sizes and may take a while to download. For efficiency, it is recommended to download them in the following order of priority:

  1. control_v11p_sd15_openpose.pth, control_v11p_sd15_openpose.yaml
  2. control_v11f1p_sd15_depth.pth, control_v11f1p_sd15_depth.yaml
  3. The Canny, Scribble, and Soft Edge models
  4. Others

Place the downloaded files under “stable-diffusion-webui/models/ControlNet”.
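If you prefer to script the download, here is a hedged sketch using the huggingface_hub package (assuming a recent version that supports local_dir). The repository and file names come from the article; the target folder assumes a default Web UI layout:

```python
# Download the OpenPose model and its structure definition into the
# folder the ControlNet extension reads from.
from huggingface_hub import hf_hub_download

for name in ["control_v11p_sd15_openpose.pth", "control_v11p_sd15_openpose.yaml"]:
    hf_hub_download(
        repo_id="lllyasviel/ControlNet-v1-1",
        filename=name,
        local_dir="stable-diffusion-webui/models/ControlNet",
    )
```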

Using ControlNet OpenPose

Now let's put the installed ControlNet to use. As a first step, we will try OpenPose. We will use a photo of a woman sitting in a formal kneeling position, taken from the free stock photo site Pakutaso, and then generate an image of a schoolgirl sitting in the same pose.


Woman sitting in a formal kneeling position with hands placed on the knees|Free material from Pakutaso (www.pakutaso.com)

Performing Feature Extraction (1/3)

Let's start with feature extraction:

  1. Open the ControlNet menu.
  2. Set the image.
  3. Choose OpenPose for the Control Type.
  4. Click the feature extraction button.


If the generated image looks like a stick figure as shown below, the feature extraction was successful.
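The Web UI runs this extraction for you. For reference, the same stick-figure pose map can be produced outside the UI with the controlnet_aux package; the input file name below is a hypothetical placeholder for the Pakutaso photo:

```python
# Extract an OpenPose pose map from a photo, as the WebUI preprocessor does.
from controlnet_aux import OpenposeDetector
from PIL import Image

openpose = OpenposeDetector.from_pretrained("lllyasviel/Annotators")
source = Image.open("kneeling_woman.png")  # hypothetical local copy of the reference
pose_map = openpose(source)                # returns a PIL image of the stick figure
pose_map.save("pose_map.png")
```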


Generating an Image from Extracted Features (2/3)

  1. Check "Enable" in the ControlNet menu.
  2. Configure the desired image generation settings, as in normal txt2img use:
    • Prompt: "Photo of Japanese girl sitting on floor in a classroom, school uniform"
    • Negative Prompt: "EasyNegative"
    • Width: 768, Height: 512, Batch size: 6
  3. Click the Generate button, just as you would for txt2img. (A scripted equivalent via the WebUI API is sketched below.)
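For completeness, the same generation can be driven through the Web UI's API (launch the Web UI with the --api flag). This is a sketch rather than the article's method; the ControlNet payload keys follow the sd-webui-controlnet extension's API and may vary between versions:

```python
# txt2img request with a ControlNet OpenPose unit via the WebUI API.
import base64
import requests

with open("kneeling_woman.png", "rb") as f:  # hypothetical reference photo
    input_image = base64.b64encode(f.read()).decode("utf-8")

payload = {
    "prompt": "Photo of Japanese girl sitting on floor in a classroom, school uniform",
    "negative_prompt": "EasyNegative",
    "width": 768,
    "height": 512,
    "batch_size": 6,
    "alwayson_scripts": {
        "controlnet": {
            "args": [{
                "input_image": input_image,
                "module": "openpose",
                "model": "control_v11p_sd15_openpose",
            }]
        }
    },
}
response = requests.post("http://127.0.0.1:7860/sdapi/v1/txt2img", json=payload)
images = response.json()["images"]  # base64-encoded result images
```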


Images Generated by OpenPose (3/3)

Input Image:

Prompt:

“Photo of Japanese girl sitting on the floor in a classroom, school uniform”

Result:

The generated images closely resemble the input image in terms of pose. By using ControlNet and OpenPose, we can extract poses and generate images in the same pose. I hope you find this explanation helpful.

For more detailed explanations, please refer to the following article.

How to Use ControlNet OpenPose.

Various Functions of ControlNet

Controlling Detailed Features (Preserving Facial Features, Clothing, and Atmosphere)

Learn how to use the ControlNet Tile function and explore specific examples of its application, such as converting between anime and live-action, image correction, and upscaling. This article provides a comprehensive guide to using ControlNet Tile, a feature that may be less familiar than OpenPose and Canny. Visit the link for more information: How to use ControlNet Tile and specific examples (mutual conversion between anime and live-action, correction, upscaling, etc.): ControlNet 1.1 New Features

Composition and Shape Control (Generating Images with Consistent Composition)

Discover how to use ControlNet Segmentation to generate images with the same composition. With ControlNet’s numerous functions, it can be challenging to determine which one to use. This article focuses on ControlNet Segmentation and provides practical examples to help you understand its usage. Read more here: How to use ControlNet Segmentation and explanation. Generating images with the same composition.

Maintain consistent composition and three-dimensional structure while generating images using ControlNet NormalMap. This article explores the application of ControlNet NormalMap and provides insight into its usage. Follow the link to learn more: ControlNet NormalMap. Generating images while maintaining composition and 3D structure.

Hand Correction with ControlNet Depth

Explore different methods of hand correction with this article, which covers various approaches and explains their effectiveness. Although these methods may not universally apply, understanding their specific situations can greatly enhance your hand correction process. Read more here: Results of verifying all 6 methods for hand correction…

Line Drawing Extraction (Useful for Coloring, Live-action Adaptation, and Illustration)

Learn how to use ControlNet Soft Edge to change colors, adapt images into a live-action style, create animations, and add colors to line drawings. This article dives into practical examples of using ControlNet Soft Edge. Find out more by visiting the link: ControlNet Soft Edge for color changing, live-actionization, animation, and coloring.

Discover how to use ControlNet Scribble for coloring, live-action adaptation, animation, and adding colors to line drawings. This article provides practical examples to help you understand the potential uses of ControlNet Scribble. Follow the link for more information: ControlNet Scribble for color changing, live-actionization, animation, and coloring.

Achieve high-quality image generation, coloring, and line drawing with ControlNet 1.1 Canny. This article focuses on ControlNet Canny, explaining its features and providing valuable insights into its usage. Click on the link below to learn more: High-quality image generation from Canny, coloring, and line drawing with ControlNet 1.1.

Learn about the new features of ControlNet 1.1 Lineart and Anime Lineart in this article. Gain a better understanding of how to utilize Lineart and Anime Lineart for high-quality image generation from coloring and line drawing. Visit the link for more details: New features of ControlNet 1.1 Lineart. High-quality image generation from coloring and line drawing.

Other Features

Discover how to use ControlNet Inpaint, a powerful feature introduced in ControlNet 1.1. This article provides a comprehensive guide on how to utilize ControlNet Inpaint effectively, comparing it to three other processors. Click on the link to learn more: How to use ControlNet Inpaint and comparison of 3 processors. ControlNet 1.1 New Features

Explore ControlNet Shuffle, a new feature introduced in ControlNet 1.1, and learn how to make the most of its capabilities. This article provides a detailed explanation and practical examples of ControlNet Shuffle. Visit the link for more information: How to use ControlNet Shuffle, a new feature in ControlNet 1.1.

Learn about ControlNet Instruct Pix2Pix, a new feature introduced in ControlNet 1.1, by trying it out. This article provides insights into how to use ControlNet Instruct Pix2Pix effectively. Click on the link below to find out more: Trying out the new feature "instruct pix2pix (ip2p)" of ControlNet 1.1.

Discover how to free yourself from pose-related challenges by using ControlNet OpenPose. This article explains how to make the most of ControlNet OpenPose and overcome difficulties in achieving complex poses. Follow the link for more information: How to use ControlNet OpenPose. Free yourself from pose-related troubles.
