Open Maya and go to Windows > Settings/Preferences > Plug-in Manager.
Check “Loaded” next to “mandala.py”.
In the MEL command line, type “mandala” and press Enter to open the panel.
Select “Diffusion” in the drop-down list.
Now, click on the “Render” button. The first time you run it, the model has to be downloaded, which might take some time depending on your connection! After a while, a cat should appear in the Render View. This is the result of the default prompt. It doesn’t look very good, but that’s because the base model (Stable Diffusion 1.5) is a bit outdated.
Okay, so let’s try another model. Click on the Model Manager tab, then click on “Add from URL” and paste this: “darkstorm2150/Protogen_x5.8_Official_Release”.
Go back to the Image Generation tab and select the model we have just added in the list. Finally, hit “Render” again and wait until the model has finished downloading. Better?
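The string pasted above has the “owner/name” shape of a Hugging Face Hub repository id rather than a full URL. As a minimal sketch, assuming the plugin resolves it that way, here is how a pasted value (either a full page URL or a bare id) could be normalised. The helper `to_repo_id` is hypothetical and is not part of the plugin:

```python
from urllib.parse import urlparse


def to_repo_id(text):
    """Normalise a pasted Hub URL or 'owner/name' string to 'owner/name'.

    Hypothetical helper for illustration only; not the plugin's own code.
    """
    # Drop surrounding whitespace and straight or curly quotes from copy/paste.
    text = text.strip().strip('"“”')
    # For a full URL keep only the path part; a bare id is used as-is.
    path = urlparse(text).path if "://" in text else text
    parts = [p for p in path.split("/") if p]
    if len(parts) < 2:
        raise ValueError(f"expected 'owner/name', got {text!r}")
    return f"{parts[0]}/{parts[1]}"


print(to_repo_id("darkstorm2150/Protogen_x5.8_Official_Release"))
```

With this sketch, pasting the model page address or the bare id gives the same identifier.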
Tips
As demonstrated above, the model you choose is crucial to getting decent results.
Models are trained at a specific resolution. Because of this, generating large images (like 1920×1080) with a model trained on 512×512 images (like SD 1.5) will likely result in ugly repetitions. Similarly, using resolutions lower than 512×512 will result in cropped images. Note that more recent SD versions, such as 2.1, are trained on 768×768 images.
This problem is less apparent in some other models. For instance, Protogen 5.8 (used in the previous example) seems to suffer much less from it. Using conditioning (like Depth2Img or ControlNet) also mitigates it.
Still, you should use upscalers! Ideally, one of your image dimensions should match the model’s training size.
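To make this tip concrete, here is a rough sketch of how a base resolution and an upscale factor could be derived from a target size. The function `plan_generation` and its parameters are hypothetical, not part of the plugin. Two assumptions are baked in: by default the shorter side is matched to the training size (pass `match="long"` to match the longer one instead), and both dimensions are snapped to multiples of 8, which Stable Diffusion pipelines generally expect.

```python
def plan_generation(target_w, target_h, train_size=512, match="short", multiple=8):
    """Return (gen_w, gen_h, upscale_factor) for a generate-then-upscale render.

    Illustrative sketch only: names and defaults are assumptions.
    """
    if match not in ("short", "long"):
        raise ValueError("match must be 'short' or 'long'")
    # Pick the side that should land on the model's training size.
    ref = min(target_w, target_h) if match == "short" else max(target_w, target_h)
    if ref <= train_size:
        # The target is already at or below the training size: no upscaling.
        return target_w, target_h, 1.0
    scale = train_size / ref

    def snap(value):
        # Scale down, then round to the nearest multiple (at least one multiple).
        return max(multiple, int(round(value * scale / multiple)) * multiple)

    gen_w, gen_h = snap(target_w), snap(target_h)
    # Snapping can shift the aspect ratio slightly; the factor is taken on width.
    return gen_w, gen_h, target_w / gen_w


print(plan_generation(1024, 1024))  # (512, 512, 2.0)
```

For a 1920×1080 target and a 512×512 model, this sketch proposes generating at 912×512 and upscaling by a little over 2×.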
Model versions must match when using ControlNet. If a ControlNet is based on SD 1.5, you must use SD 1.5, or any other model based on it, as the base model. Otherwise, an error will pop up. The best approach is simply to try it: a mistake won’t crash the program, so you are free to experiment.
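The matching rule can be sketched as a simple lookup. Everything here is an assumption made for illustration: the `BASE_FAMILY` table is hand-written, the helper `check_controlnet` is hypothetical, and the plugin itself reports mismatches through its own error message rather than through this code.

```python
# Hand-written table of which base family each model belongs to.
# The entries are illustrative; extend it with the models you actually use.
BASE_FAMILY = {
    "runwayml/stable-diffusion-v1-5": "sd1.5",
    "stabilityai/stable-diffusion-2-1": "sd2.1",
    "lllyasviel/sd-controlnet-depth": "sd1.5",
}


def check_controlnet(base_model, controlnet):
    """Return (ok, message) for a base model / ControlNet pairing."""
    base = BASE_FAMILY.get(base_model)
    ctrl = BASE_FAMILY.get(controlnet)
    if base is None or ctrl is None:
        return False, "unknown model: add it to BASE_FAMILY first"
    if base != ctrl:
        return False, f"mismatch: base is {base}, ControlNet expects {ctrl}"
    return True, f"compatible ({base})"


print(check_controlnet("stabilityai/stable-diffusion-2-1",
                       "lllyasviel/sd-controlnet-depth"))
```

Here an SD 2.1 base paired with an SD 1.5 depth ControlNet is flagged as a mismatch, while the same ControlNet paired with SD 1.5 passes.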
To demonstrate the upscaler tip above:
Generated with SD 1.5 at 1024×1024 with no upscaler. Obvious, ugly repetitions.
Generated with SD 1.5 at 1024×1024 with an upscaler (ESRGAN). In this configuration, the image is generated at 512×512 and then upscaled. As you can see, no more ugly repetitions.