
V8: FP32 LoRAs are now loaded in BF16, then scaled down to FP8 for saving. This appears to resolve the "grid" artifacts and improves overall quality. Tweaked accelerator amounts. Significant NSFW LoRA tweaks (and a new SNOFS). Recommended samplers: euler_a/beta for 4-6 steps, lcm/normal for 7-8 steps.