Generates an image based on your prompt + negative prompt. No inputs are used in this mode, and nothing from the 3D scene is actually rendered. You could use this mode, for instance, to generate a background landscape for your scene.
Example
So, let’s do just that. Let’s generate a Ghibli-style background:
First, let’s set up a Stable Diffusion model. We could use the base ones, but there is one on HuggingFace that is particularly adept at generating anime-style images: https://huggingface.co/andite/anything-v4.0. Do not download it from the website, though. Instead, add it to Mandala’s Model Manager.
Now paste the ID of the model: andite/anything-v4.0. HuggingFace models are always in the form <Author>/<modelName>.
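As an aside, this <Author>/<modelName> ID is the same identifier Python users would pass to the diffusers library. The snippet below is only an illustration of the ID format, assuming diffusers is available; it is not Mandala’s internal code:

```python
# Illustration of the "<Author>/<modelName>" repo ID format (assumes the diffusers library).
# Mandala's Model Manager performs the equivalent download for you.
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained("andite/anything-v4.0")
```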
Now select the “Diffusion” mode in the “Image Generation” tab and choose the model we just added. A warning appears saying the model isn’t downloaded yet; no problem, it will be downloaded before inference.
Now, let’s change the resolution to 1920×1080:
Next, choose an Upscaler, for example RealESRGAN with the 2x Anime model:
Finally, we need a prompt. Let’s type this in the prompt area and hit render: A fantasy open landscape, lush vegetation, Studio Ghibli background, colorful scenery, cliffs, waterfall, flowers, butterflies
After the model finishes downloading, the render starts and an image appears (it might look completely different on your hardware, though).
Now, let’s click on the image plane icon to put it in the 3D scene. |
Done! Of course, this is just an example; the prompt could use a lot more love.
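For the curious, here is a minimal sketch of roughly what the render step corresponds to, assuming the diffusers library. It mirrors the settings from this example but is not the plugin’s actual code; the negative prompt is a hypothetical addition, and the 2x upscaling pass is omitted:

```python
# Rough equivalent of the example above, using the diffusers library (not Mandala's internal code).
# With a 2x upscaler, diffusion runs at half the 1920x1080 target; raw Stable Diffusion needs
# dimensions that are multiples of 8, so 960x540 is rounded up to 960x544 here.
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "andite/anything-v4.0", torch_dtype=torch.float16
).to("cuda")

prompt = ("A fantasy open landscape, lush vegetation, Studio Ghibli background, "
          "colorful scenery, cliffs, waterfall, flowers, butterflies")
negative_prompt = "lowres, blurry, watermark"  # hypothetical; the walkthrough above leaves it empty

generator = torch.Generator("cuda").manual_seed(0)  # fixed seed for repeatable results

image = pipe(
    prompt,
    negative_prompt=negative_prompt,
    width=960,
    height=544,
    num_inference_steps=30,
    guidance_scale=7.5,
    generator=generator,
).images[0]
image.save("ghibli_background.png")
```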
Parameters
Resolution: The same as Maya’s render resolution. Ideally, the image dimensions should match the model’s training size: for Stable Diffusion 1.5-based models, the width or height should be 512, and 768 for Stable Diffusion 2; otherwise, cropping or repetition artefacts might occur. These artefacts are less noticeable when using conditioned generation such as depth2img or ControlNet. Additionally, when an upscaler is used, the diffusion happens on a downscaled image (and is therefore much faster). For example, a 1024×1024 image with a 2x upscaler is computed at 512×512, then upscaled to the final resolution.
enableInference: Disable this to only render the 3D passes and skip Stable Diffusion inference (useful to preview passes like normals, depth, etc.).
modelName: The name of the model to use for denoising inference. The model is downloaded automatically if ‘downloadModels’ is checked.
inferenceSteps: The number of denoising steps. More steps usually lead to a higher-quality image at the expense of slower inference. In general, values between 25 and 50 yield good results.
guidance: Guidance scale. A higher value encourages the model to generate images that are more closely linked to the prompt, usually at the expense of image quality.
seed: Random seed used for generating the initial noise with a Torch generator. Reusing the same seed with the same settings makes generation reproducible.
eta: Only applies to the DDIMScheduler and is ignored by other schedulers. Controls the amount of random scaled noise mixed into each denoising step: 0 adds no noise (deterministic sampling), 1.0 adds the most.
useLora: Load LoRA weights and patch the model. LoRA weights can be found on HuggingFace or Civitai. They are usually small and provide a convenient way to patch and fine-tune larger models. Currently, both HuggingFace LoRA weights and .safetensors LoRA files are supported (see the LoRA sketch after this parameter list).
loraModel: The name of the LoRA model to use in combination with the base diffusion model. Use a .safetensors file placed in a folder named after the model inside the .cache folder (check the Model Manager and make sure to assign it the ‘lora’ category). Alternatively, LoRA models from HuggingFace in the Author/Model format can also be used.
loraWeight: The weight of the LoRA model when applied to the pipeline, from 0 (no effect) to 1 (full effect).
upscaler: The type of upscaler to use after the diffusion process. Depending on the scale (x2 or x4), the image is computed at a lower resolution and then scaled up to match the output resolution. This reduces the memory footprint of the process. For example, with a 2x upscaler, generation happens on a half-sized image that is then scaled up to the final resolution.
upscaleInferenceSteps: The number of denoising steps used for latent upscaling only.
upscaleSeed: The random seed used by the latent upscaler. It has a minor impact on the results, so you can generally leave it at the default value of 0.
latentUpscalerModel: The name of the model to use for latent upscaling (see the sketch after this parameter list). The model must be downloaded first. Latent upscaling requires a lot of GPU memory; at least 11 GB is recommended. If you have less, try one of the other upscaling methods.
ESRGanModel: The model used for ESRGAN upscaling. The model must be downloaded first.
enhanceFaces: If enabled, a GFPGAN model is used to automatically detect and enhance faces in the image (note: this will be moved to another section in the next update).
GPFGanModel: The model used by the GFPGAN face enhancement module.
inpaintingModel: The inpainting checkpoint to use. Inpainting happens when a “render region” is triggered in the render view.
regionBlur: The size of the blur applied around the inpainting region, allowing for smoother transitions.
saveOutputImage: Save the output image, named and formatted according to the render global settings. This option is automatically activated when rendering an animation. It uses the file name definition from the render globals.
saveSettings: Save the image settings as PNG metadata in JSON format (see the sketch after this parameter list). Only available with the PNG format. When enabled, the settings can later be restored using Load PNG+Settings in the menu.
saveLatents: Experimental: Save the latents for further decoding.
savePasses: Save all the rendered passes (normals, depth…) along with the Stable Diffusion pass. A suffix will be added to the files.
extension: The output file extension. Currently only PNG can store metadata, so the settings are saved (as JSON metadata) only when PNG is selected. Alpha is not supported yet, but will be in future versions of the plugin.
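For illustration, here is a minimal sketch of how LoRA patching typically looks with the diffusers library, as referenced in the useLora, loraModel and loraWeight entries above. The model and LoRA names are placeholders, this is not Mandala’s internal code, and the exact scaling API depends on the diffusers version:

```python
# Sketch of LoRA patching with the diffusers library (illustration only, not Mandala's code).
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "andite/anything-v4.0", torch_dtype=torch.float16
).to("cuda")

# Either a Hugging Face repo in Author/Model form or a local .safetensors file can be passed here.
pipe.load_lora_weights("some-author/some-lora")  # placeholder LoRA ID

image = pipe(
    "A fantasy open landscape, Studio Ghibli background",
    cross_attention_kwargs={"scale": 0.7},  # roughly what loraWeight controls
).images[0]
image.save("lora_test.png")
```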
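Similarly, the latentUpscalerModel, upscaleInferenceSteps and upscaleSeed parameters map roughly onto the Stable Diffusion latent upscaler pipeline. A hedged sketch, assuming the stabilityai/sd-x2-latent-upscaler checkpoint (again, not the plugin’s actual code):

```python
# Sketch of 2x latent upscaling with diffusers (illustration only, not Mandala's code).
import torch
from diffusers import StableDiffusionPipeline, StableDiffusionLatentUpscalePipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "andite/anything-v4.0", torch_dtype=torch.float16
).to("cuda")
upscaler = StableDiffusionLatentUpscalePipeline.from_pretrained(
    "stabilityai/sd-x2-latent-upscaler", torch_dtype=torch.float16
).to("cuda")

prompt = "A fantasy open landscape, Studio Ghibli background"
generator = torch.Generator("cuda").manual_seed(0)

# Generate at the base resolution and keep the latents for the upscaler.
low_res_latents = pipe(prompt, width=512, height=512, generator=generator,
                       output_type="latent").images

image = upscaler(
    prompt=prompt,
    image=low_res_latents,
    num_inference_steps=20,  # corresponds to upscaleInferenceSteps
    guidance_scale=0,
    generator=generator,     # corresponds to upscaleSeed
).images[0]
image.save("upscaled_1024.png")
```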
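Finally, the saveSettings behaviour (settings embedded as JSON in PNG metadata) can be reproduced or inspected outside Maya with Pillow. The "settings" key name below is a guess, not necessarily the one Mandala writes:

```python
# Sketch: writing and reading JSON settings as PNG metadata with Pillow (illustration only).
import json
from PIL import Image
from PIL.PngImagePlugin import PngInfo

settings = {"prompt": "A fantasy open landscape", "inferenceSteps": 30, "guidance": 7.5, "seed": 0}

meta = PngInfo()
meta.add_text("settings", json.dumps(settings))  # the key name "settings" is hypothetical

img = Image.open("ghibli_background.png")
img.save("ghibli_background_with_settings.png", pnginfo=meta)

# Reading the metadata back later:
restored = json.loads(Image.open("ghibli_background_with_settings.png").text["settings"])
print(restored["prompt"])
```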