Stable Diffusion: 8 Techniques for Clearer Faces in Full-Body Photos

Do faces come out distorted or unattractive when you generate a full-body image like the one shown below?

This problem frequently occurs when the prompt keyword “full body” is used to generate full-body images. It can lead to unnatural shapes and a lack of facial detail in the resulting image. Many users have encountered this issue, and it often remains unresolved even after trying different settings.

To address this challenge, we present eight specific solutions that can greatly assist in generating accurate person images. Here’s a breakdown of each solution, including its benefits, drawbacks, and evaluation:

| Method | Benefit | Drawback | Rating |
| --- | --- | --- | --- |
| Adjusting with Hires.fix | Convenient | Time-consuming when executed repeatedly | ★★ |
| Generating at low resolution and then increasing the resolution | Can be executed as needed and adjusted later | Significant changes compared to the original image | ★★ |
| Correction with Inpaint | Can be executed as needed, adjustable and offers high quality | Slightly time-consuming | |
| Correction with After Detailer (adetailer) | Convenient and delivers high quality | Time-consuming when executed repeatedly | ★★★ |
| Face distortion prevention prompt | Convenient | Image composition tends to focus on the face, making full body creation challenging | |
| Utilizing ControlNet for pose control | Easy to specify composition, convenient | Requires a reference image and introduction of ControlNet | |
| Upscaling with ControlNet Tile | Convenient and maintains original image state | Introduction of ControlNet required | ★★ |
| Model selection | Convenient | Has limitations | |

Additionally, we have included a section on hand correction in this article for your interest.

We have examined six methods for hand correction and share the results…

Why does the quality of the face decline when capturing the entire body?

Before discussing specific solutions, let’s examine why the face deteriorates when including the entire body in an image. One of the main challenges of creating full-body images using AI is the difficulty of showcasing intricate facial features within a limited resolution.

For instance, when depicting a full-body character in a 512×512 pixel image, the number of pixels allocated to the face decreases, resulting in inadequate representation of facial details. This issue becomes more pronounced in full-body portrayals where the face is relatively smaller compared to the overall image.
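A rough calculation makes the pixel budget concrete. The face fractions below are illustrative assumptions (a face spanning about 1/8 of the image height and 1/12 of its width in a standing full-body pose), not measured values:

```python
# Rough estimate of the pixel budget for the face in a full-body shot.
# Assumption (illustrative only): the face spans about 1/8 of the image
# height and 1/12 of its width in a typical standing full-body pose.

def face_pixels(width: int, height: int,
                face_h_frac: float = 1 / 8,
                face_w_frac: float = 1 / 12) -> int:
    """Approximate number of pixels covering the face."""
    return int(width * face_w_frac) * int(height * face_h_frac)

low = face_pixels(512, 512)     # about 42 x 64 = 2688 pixels for the face
high = face_pixels(1024, 1024)  # about four times that budget after 2x upscale
```

At 512×512, only a few thousand pixels are left to draw eyes, nose, and mouth, which is why fine facial detail breaks down.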

Moreover, the resolution problem is also linked to the selection of prompts. AI generates illustrations based on prompt information, so when attempting to incorporate both a full-body depiction and intricate facial representation, it becomes unclear which keywords should influence the facial details and which should affect the entire image. Consequently, accuracy is compromised.

In essence, it is crucial to define what elements should be emphasized and to create detailed features within that context. Striking the right balance is the key to successful image generation with AI.

Solution 1: Adjusting the face with Hires.fix

Let me introduce you to a feature called Hires.fix. This feature can be found on the txt2img screen and it performs high-resolution processing after generating an image from a prompt.

Hires.fix is a feature that is already built into the Stable Diffusion Web UI, and it is very easy to use.

To open the Hires.fix menu, simply click the black triangle next to the Hires.fix label. Note that this applies to the new version of the UI; in the old version, you need to check the Hires.fix checkbox instead.

Hires.fix menu

When the menu is open, Hires.fix is enabled and the generated image will automatically be adjusted to a higher resolution.

Comparison of images without and with Hires.fix

The image on the left is generated without applying Hires.fix, while the image on the right is generated with Hires.fix applied. You can see that the face is adjusted when Hires.fix is applied.

One drawback of this method is that the high-resolution pass runs every time an image is generated, which can be inefficient. In that case, I recommend the alternative Solution 2, which will be introduced next.

Adjusting Hires.fix parameters

There are two important parameters for Hires.fix: Upscaler and Denoising strength. The Upscaler selects the algorithm used to enlarge the image, while the accompanying upscale factor determines how much the resolution increases. If the face appears distorted, raising the upscale factor is recommended.

Hires.fix parameters

Adjusting Denoising strength

By increasing the Denoising strength, you can create finer details in the image. However, setting it too high can make the entire image over-sharp or introduce unnecessary details. On the other hand, reducing the Denoising strength makes the image smoother but blurrier.

Therefore, it is recommended to adjust this parameter in moderation, as too high or too low values may not give the desired result.

In the previous images, the face appeared too sharp and unnatural. So, let’s try reducing the Denoising strength to 0.2. The resulting image became slightly blurry, as shown in the left image below. Setting the Denoising strength too low can make the image appear blurred.

Let’s try setting the Denoising strength to 0.5. The resulting image has a good balance, as shown in the right image below. As the appropriate value may vary depending on the composition, it is recommended to adjust it by increasing if it appears too blurry and decreasing if it appears too sharp.

The left image has a Denoising strength of 0.2, and the right image has a Denoising strength of 0.5.
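If you drive the Web UI through the AUTOMATIC1111 API rather than the browser, the same Hires.fix settings map to fields of the `/sdapi/v1/txt2img` payload. The prompt text and values below are illustrative, a minimal sketch rather than a complete request:

```python
# Sketch of a txt2img payload with Hires.fix enabled, for the AUTOMATIC1111
# Web UI API (/sdapi/v1/txt2img). Prompt text and values are illustrative.

def txt2img_hires_payload(prompt: str, denoising: float = 0.5) -> dict:
    return {
        "prompt": prompt,
        "width": 512,
        "height": 512,
        "enable_hr": True,                # turn Hires.fix on
        "hr_scale": 2,                    # upscale factor
        "hr_upscaler": "Latent",          # upscaler algorithm
        "denoising_strength": denoising,  # 0.2 = blurry, 0.5 = balanced
    }

payload = txt2img_hires_payload("full body, 20-year-old woman, detailed face")
# requests.post("http://127.0.0.1:7860/sdapi/v1/txt2img", json=payload)
```

Sweeping `denoising` over a few values (0.2, 0.35, 0.5) is an easy way to reproduce the comparison above.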

Solution 2: Upscaling after generating at low resolution

The issue arises when attempting to draw the face accurately within the entire character image. The core solution involves increasing the resolution. Enhancing the overall image resolution is one method of improving facial details. Particularly, at around 1000px resolution, the facial features of the entire character often appear with a moderate texture.

To implement this solution, follow these steps:

  1. Generate the image using txt2img, then select the “🖼️” icon located below the generated image.
  2. The interface will switch to img2img, and the image and prompt will be transferred to the img2img tab. (In older versions, select “Send to img2img”.)
  3. Change the resolution setting at the bottom of img2img to “Resize by” (the simpler option).
  4. Set the Scale to 2.
  5. Click the “Generate” button to run img2img, and check the face in the resulting image.

As seen, the facial image has been enhanced.
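The same workflow can be scripted against the `/sdapi/v1/img2img` endpoint. The API takes explicit output dimensions, so the UI's “Resize by: 2” is expressed here by doubling width and height; the base64 image string is a placeholder:

```python
# Sketch of the low-res-then-upscale workflow as an img2img API call
# (AUTOMATIC1111 /sdapi/v1/img2img). "<base64 png>" is a placeholder for
# the real base64-encoded source image.

def img2img_upscale_payload(image_b64: str, prompt: str,
                            base_w: int = 512, base_h: int = 512,
                            scale: int = 2) -> dict:
    return {
        "init_images": [image_b64],
        "prompt": prompt,
        "width": base_w * scale,    # 512 -> 1024, matching "Resize by: 2"
        "height": base_h * scale,
        "denoising_strength": 0.5,  # moderate redraw to refine the face
    }

p = img2img_upscale_payload("<base64 png>", "full body, detailed face")
```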

Is it preferable to start with high-resolution image generation?

One may question whether it is more advantageous to generate the image at high resolution from the start, rather than generating at low resolution and upscaling afterwards. However, generating at high resolution from the outset has certain problems.

  • Reduced efficiency: High resolution necessitates greater rendering time and resources, consequently reducing efficiency. It proves more efficient to commence with a low resolution and subsequently upscale the desired composition.
  • Unstable composition: Generating a large image can result in unstable composition, such as two individuals appearing within the frame. This occurs because the Stable Diffusion model is trained on images of approximately 512×512 resolution.

This technique mitigates both issues: generate images at low resolution first, refine them into the desired outcome, and then adjust them to a higher resolution.

Solution 3: Correction with Inpaint – The Setup Method

In this section, we will demonstrate how to utilize Inpaint for image correction. Follow the steps below:

  1. Switch to the img2img mode.
  2. Select Inpaint.
  3. Fill in the face that needs correction.
  4. Ensure that “Only masked” is checked (this is crucial).
  5. Click on the Generate button.

  • 1/4: Send the image to Inpaint.
    After generating the image using txt2img, click the paintbrush icon located below the generated image to transfer the image to the Inpaint tab. To modify a previously generated image, switch to the img2img tab and open the Inpaint tab to select the image.

  • 2/4: Mask the area you wish to correct.
    Ensure that you are on the img2img tab and verify that the prompt is correctly set. Make any necessary adjustments. Then, switch to the Inpaint tab and mask the area that requires correction.
  • 3/4: Configure Inpaint.
    For face correction, set Inpaint Area to “Only masked”. By default, Inpaint processes the entire image and replaces the masked area; for face correction, it is more effective to extract the masked area and apply Inpaint exclusively to it. Also, adjust the resolution to match the original image.
  • 4/4: Enter the prompt and execute Inpaint.
    Finally, press Generate to apply Inpaint.

The resulting image will be displayed as follows:

By separating the overall and facial parts in this way, it is possible to strike a balance between the overall depiction of the body and the detailed expression of the face. By focusing on each part separately, it becomes feasible to maintain the overall balance and accurately depict the details.
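Via the API, the same four steps reduce to one `/sdapi/v1/img2img` request with a mask. This is a minimal sketch: the image and mask strings are placeholders, and `inpaint_full_res=True` corresponds to the crucial “Only masked” setting:

```python
# Sketch of the "Only masked" inpaint step via the AUTOMATIC1111 API
# (/sdapi/v1/img2img). The mask is a base64 image, white over the face.

def inpaint_face_payload(image_b64: str, mask_b64: str, prompt: str) -> dict:
    return {
        "init_images": [image_b64],
        "mask": mask_b64,
        "prompt": prompt,
        "inpaint_full_res": True,        # "Only masked" -- the crucial setting
        "inpaint_full_res_padding": 32,  # pixels of context kept around the mask
        "inpainting_fill": 1,            # start from the original pixels
        "denoising_strength": 0.5,
        "width": 512,                    # match the original image resolution
        "height": 512,
    }
```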

If you find this process cumbersome, you can use a Web UI extension called After Detailer (adetailer), which automates the same operation. Although correction with Inpaint allows for more precise and reliable changes, if it feels too laborious, we recommend trying the next method.

Solution 4: Introducing After Detailer (adetailer)

After Detailer (adetailer) is an extension for the Stable Diffusion Web UI, designed to correct imperfections in faces and hands. The concept behind adetailer is quite similar to the Inpaint approach discussed in Solution 3: it automatically detects the face and applies corrections using img2img. The images below demonstrate the effectiveness of adetailer’s corrections.
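When scripting, installed extensions are reached through the `alwayson_scripts` field of a txt2img request. The exact args layout depends on the ADetailer version you have installed, so treat this as a sketch; `face_yolov8n.pt` is the extension's default face-detection model:

```python
# Sketch of enabling ADetailer in a txt2img API call via alwayson_scripts.
# The args layout may differ across ADetailer versions -- check your install.

def txt2img_with_adetailer(prompt: str) -> dict:
    return {
        "prompt": prompt,
        "alwayson_scripts": {
            "ADetailer": {
                "args": [
                    {"ad_model": "face_yolov8n.pt"}  # detect and redraw faces
                ]
            }
        },
    }
```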

For detailed instructions and usage methods, please refer to the following comprehensive article:

ADetailer Installation and 5 Usage Methods (Correcting Face, Hands, and Body Deformities) and In-depth Explanation of the Mechanism

Solution 5: Prompt for Preventing Facial Distortions

To achieve precise facial features and prevent deformities in AI-generated characters, the prompt “detailed face” can be used effectively. Conversely, adding keywords such as “deformed face” to the negative prompt helps avoid undesirable distortions.

Recommended Keywords for the Prompt:

“detailed face”

Recommended Keywords for the Negative Prompt:

“deformed face, ugly, bad face, deformed eyes, bad anatomy”

Including prompts that relate to specific facial features, such as “cute face” or “brown eyes”, often produces visually appealing faces because it shifts emphasis onto the face. This method is particularly helpful for complex compositions involving many facial parts.

Now, let’s generate an image using the provided prompts:


(8k, RAW photo, best quality, masterpiece:1.2), (realistic, photo-realistic:1.4), (extremely detailed 8k wallpaper), cheerleader outfit,  full body, 20-year-old woman, (detailed face: 1.4)

Negative Prompt:

EasyNegative, (worst quality, low quality: 2.0), normal quality, ugly face, unclear eyes, bad mouth, bad anatomy, extra legs, beach, bad anatomy, deformed face, ugly, bad face, deformed eyes, bad anatomy

By doing this, we successfully eliminated facial distortions. However, even though we used the same seed, the resulting image does not include a full body.

It is crucial to maintain a balance when using prompts in drawing instructions. Focusing solely on facial details can result in neglecting the overall composition.

To generate a well-proportioned image, it is important to find the right balance between prompts for the overall depiction and detailed facial features.
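One practical way to keep that balance is to append the recommended face keywords to whatever base prompt you already use, rather than rewriting the prompt around the face. A small helper sketch (the `(keyword:1.4)` syntax is the Web UI's attention emphasis):

```python
# Helper that appends the article's recommended face keywords to a base
# prompt and negative prompt, keeping the full-body keywords in front.

FACE_POSITIVE = "(detailed face:1.4)"
FACE_NEGATIVE = "deformed face, ugly, bad face, deformed eyes, bad anatomy"

def build_prompts(base_prompt: str, base_negative: str = "") -> tuple[str, str]:
    prompt = f"{base_prompt}, {FACE_POSITIVE}"
    negative = f"{base_negative}, {FACE_NEGATIVE}".strip(", ")
    return prompt, negative

p, n = build_prompts("full body, 20-year-old woman", "EasyNegative")
# p -> "full body, 20-year-old woman, (detailed face:1.4)"
# n -> "EasyNegative, deformed face, ugly, bad face, deformed eyes, bad anatomy"
```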

Solution 6: Upscaling with ControlNet Tile

Now, let’s delve into the possibilities of using ControlNet Tile. If ControlNet has not been installed in the Stable Diffusion Web UI, please refer to the comprehensive guide provided in this article:

ControlNet: A Thorough Guide on Integrating with the Stable Diffusion Web UI and Practical Examples

While ControlNet Tile may appear similar to Solution 2, it possesses a crucial advantage. It allows for modifications while preserving the original image’s features. If you have concerns about changes to the background while using img2img, it is worth giving ControlNet Tile a try.
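In API terms, this is an img2img upscale with the ControlNet extension attached via `alwayson_scripts`. The model filename varies per install (check the names your Web UI lists); `control_v11f1e_sd15_tile` is the standard SD 1.5 tile model name, and the rest is a sketch:

```python
# Sketch of an img2img upscale guided by ControlNet Tile via the API.
# Tile keeps the output close to the original while new detail is added.

def tile_upscale_payload(image_b64: str, prompt: str, scale: int = 2) -> dict:
    return {
        "init_images": [image_b64],
        "prompt": prompt,
        "width": 512 * scale,
        "height": 512 * scale,
        "denoising_strength": 0.4,  # Tile tolerates higher values than plain img2img
        "alwayson_scripts": {
            "controlnet": {
                "args": [{
                    "module": "tile_resample",
                    "model": "control_v11f1e_sd15_tile",  # name varies per install
                    "weight": 1.0,
                }]
            }
        },
    }
```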

[Examples of Images]

To gain a better understanding of ControlNet Tile and its various applications (such as the interconversion of anime and live-action, corrections, upscaling, etc.), please refer to this guide:

ControlNet Tile 1.1 New Features: Guide and Specific Use Cases

Solution 7: Find the Perfect AI Model for You

The final solution I propose is model selection. With AI technology evolving rapidly, a wide variety of models is available. Each model, such as Beautiful Realistic Asians (well suited to slightly pulled-back, full-length compositions) or Chilloutmix (great for close-up compositions but not full-body), has its own strengths in composition and specializes in different styles of depiction. Therefore, it’s crucial to find the model that best matches your desired art style and portrayal.
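If you script generation, you can even switch checkpoints per request through `override_settings`. The checkpoint filename below is illustrative; use the names your own install reports (e.g. via `/sdapi/v1/sd-models`):

```python
# Sketch of switching the model checkpoint per request via the
# AUTOMATIC1111 API. The checkpoint filename is a placeholder example.

def payload_with_model(prompt: str, checkpoint: str) -> dict:
    return {
        "prompt": prompt,
        "override_settings": {"sd_model_checkpoint": checkpoint},
        "override_settings_restore_afterwards": True,  # revert after the call
    }

p = payload_with_model("full body, detailed face",
                       "beautifulRealistic_v7.safetensors")
```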

In-depth Explanation of Beautiful Realistic Asians! Compare all versions, find suitability for Japanese beauties [Part 3]

In-depth Explanation of ChilloutMix: Usage tips, pros and cons, comparison with other models [Part 4]

However, regardless of the chosen model, high resolution is essential for full-body character illustration. Working on a large canvas allows you to maintain a balance between the overall body structure and intricate facial details.